1. Introduction
2. Outline
3. Camelot
4. PDF Structure
5. CopyPaste from PDF
6. Tabular
7. Camelot Excalibur
8. Camelot features
9. Installing Camelot
10. Demo
11. How to use Camelot
12. Excalibur table list
13. Excalibur table object
14. Exporting
15. Output
16. Export
17. Plotting
18. Grid Plot
19. Excalibur
20. Refresh
21. Excalibur UI
22. Exporting Data
23. Rules
24. Results
25. Future Improvements
26. Questions
Description:
Explore techniques for extracting tabular data from PDFs using open-source Python tools in this EuroPython 2019 conference talk. Learn about the challenges of working with PDF tables and discover how Camelot and Excalibur can provide efficient solutions. Gain hands-on experience with installing and using these tools to extract, process, and export tabular data from PDFs. Understand the process of defining extraction rules, automating batch processing, and exporting data in various formats including CSV, Excel, JSON, HTML, and pandas DataFrames. Discover how to maintain control over sensitive PDF documents while efficiently extracting structured data for further analysis and processing.
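As a rough illustration of the extract-and-export workflow described above, the following Python sketch reads tables from a PDF with Camelot and writes them out in one of the listed formats. The helper names (`export_path`, `extract_and_export`), the suffix mapping, and the example file names are hypothetical and not taken from the talk. The Camelot calls assume `camelot.read_pdf` with its `pages` and `flavor` arguments and `TableList.export` with its `f` argument, and that Camelot and its dependencies are installed.

```python
from pathlib import Path

# Export formats named in the description, mapped to the file suffix used here.
# This mapping is an assumption made for the sketch, not part of Camelot.
EXPORT_SUFFIXES = {"csv": ".csv", "excel": ".xlsx", "json": ".json", "html": ".html"}


def export_path(pdf_path, fmt, out_dir="."):
    """Build an output path such as out/q1.xlsx for a given PDF and format."""
    if fmt not in EXPORT_SUFFIXES:
        raise ValueError(f"unsupported export format: {fmt!r}")
    return Path(out_dir) / (Path(pdf_path).stem + EXPORT_SUFFIXES[fmt])


def extract_and_export(pdf_path, pages="1", flavor="lattice", fmt="csv", out_dir="."):
    """Extract tables from a PDF and export them; returns pandas DataFrames."""
    # Imported lazily so export_path stays usable without Camelot installed.
    import camelot

    # flavor="lattice" targets tables with ruled lines; "stream" targets
    # tables separated by whitespace.
    tables = camelot.read_pdf(str(pdf_path), pages=pages, flavor=flavor)

    # Camelot may derive per-table file names from this base path.
    target = export_path(pdf_path, fmt, out_dir)
    tables.export(str(target), f=fmt)

    # Each extracted table exposes its contents as a pandas DataFrame via .df.
    return [table.df for table in tables]
```

A call such as `extract_and_export("reports/q1.pdf", pages="1-3", fmt="json", out_dir="out")` would then run locally, which fits the description's point about keeping sensitive PDF documents under your own control.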

Extracting Tabular Data from PDFs with Camelot and Excalibur

EuroPython Conference