Introduction

Outline

Camelot

PDF Structure

CopyPaste from PDF

Tabular

Camelot Excalibur

Camelot features

Installing Camelot

Demo

How to use Camelot

Excalibur table list

Excalibur table object

Exporting

Output

Export

Plotting

Grid Plot

Excalibur

Refresh

Excalibur UI

Exporting Data

Rules

Results

Future Improvements

Questions

Description:

This EuroPython 2019 conference talk covers techniques for extracting tabular data from PDFs with open-source Python tools. It explains why tables in PDFs are hard to work with and how Camelot, a Python library, and Excalibur, a web interface built on Camelot, address the problem. The talk walks through installing and using both tools to extract, process, and export tabular data, including how to define extraction rules and automate batch processing. Extracted tables are available as pandas DataFrames and can be exported to CSV, Excel, JSON, and HTML. It also shows how to keep sensitive PDF documents under your own control while extracting structured data for further analysis and processing.
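The extract-and-export workflow described above can be sketched in a few lines. This is a minimal illustration rather than code from the talk: the helper names `extract_tables` and `export_tables`, the file paths, and the default `pages` and `flavor` values are assumptions, and it presumes the `camelot-py` package is installed.

```python
# Export formats named in the description; Camelot's TableList.export()
# accepts each of these as its `f` argument.
EXPORT_FORMATS = ("csv", "excel", "json", "html")


def extract_tables(pdf_path, pages="1", flavor="lattice"):
    """Return every table Camelot finds in the PDF as a pandas DataFrame.

    Hypothetical helper. `flavor` is "lattice" for tables with ruled
    lines or "stream" for tables separated only by whitespace.
    """
    # Imported lazily so the module still loads where camelot-py is absent.
    import camelot

    tables = camelot.read_pdf(str(pdf_path), pages=pages, flavor=flavor)
    return [table.df for table in tables]


def export_tables(pdf_path, out_path, fmt="csv", pages="1"):
    """Export all tables found in the PDF and return how many were found.

    Hypothetical helper. The format is validated before any PDF work starts.
    """
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt!r}")

    import camelot

    tables = camelot.read_pdf(str(pdf_path), pages=pages)
    # Camelot writes one file per table, deriving each name from out_path.
    tables.export(str(out_path), f=fmt)
    return tables.n
```

Calling `export_tables("report.pdf", "report.csv")` with a hypothetical `report.pdf` would write one CSV file per detected table, while `extract_tables` keeps everything in memory for further processing with pandas.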

Extracting Tabular Data from PDFs with Camelot and Excalibur

EuroPython Conference

#Conference Talks #EuroPython #Programming #Programming Languages #Python #Data Science #Data Extraction #Data Processing #Batch Processing