I have access to the official Adobe Acrobat site and I have a PDF file.
I would like to use Python to write a converter for tables from PDF to Excel format. I have already finished the interface; the main problem for me is writing a parser for this site or something similar. I will be grateful for your help.
Here is my interface code:
import sys

from PyQt5.QtWidgets import (
    QApplication, QFormLayout, QLabel, QLineEdit,
    QPushButton, QVBoxLayout, QWidget,
)


class MainWindow(QWidget):
    def __init__(self):
        super().__init__()
        self.setGeometry(700, 300, 330, 430)
        self.setWindowTitle('PdfRedactor')
        layout = QVBoxLayout()
        # Add a field to enter the file directory
        name_label = QLabel('Enter pdf file directory')
        self.file_pdf = QLineEdit(self)
        # Add a field to save the file
        player_label = QLabel('Specify the name of the excel file')
        self.file_xls = QLineEdit(self)
        # Adding a "Start" button
        start_button = QPushButton('Start data transfer', self)
        start_button.clicked.connect(self.start)
        # Placing elements on the form
        form_layout = QFormLayout()
        form_layout.addRow(name_label, self.file_pdf)
        form_layout.addRow(player_label, self.file_xls)
        layout.addLayout(form_layout)
        layout.addWidget(start_button)
        self.setLayout(layout)
        # Add style
        self.setStyleSheet("""
            QWidget {
                background-color: #f0f0f0;
            }
            QLabel {
                font-size: 14px;
                color: #333;
            }
            QLineEdit {
                padding: 5px;
                font-size: 12px;
                border: 1px solid #aaa;
                border-radius: 4px;
                background-color: #fff;
            }
            QRadioButton {
                font-size: 12px;
                color: #333;
            }
            QPushButton {
                background-color: #4CAF50;
                color: white;
                padding: 8px 16px;
                font-size: 14px;
                border: none;
                border-radius: 4px;
            }
            QPushButton:hover {
                background-color: #45a049;
            }
        """)

    def start(self):
        # There should be parsing here
        pass


if __name__ == '__main__':
    app = QApplication(sys.argv)
    ex = MainWindow()
    ex.show()
    sys.exit(app.exec_())
I tried many options for transferring tables from PDF files to Excel (all the well-known Python libraries), and this site gave the best result. Thanks for the help.
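Whatever ends up doing the actual conversion, the start handler first has to turn the text of the two input fields into usable paths. Below is a minimal sketch of that step only, not of the parsing itself. The helper name resolve_paths is hypothetical, and the choices to force an .xlsx suffix and to save a bare file name next to the source PDF are assumptions, not something the interface code above requires.

```python
from pathlib import Path
from typing import Tuple


def resolve_paths(pdf_text: str, xlsx_text: str) -> Tuple[Path, Path]:
    """Turn the two QLineEdit strings into a checked (input, output) pair.

    Hypothetical helper: the name and the rules below are assumptions.
    """
    pdf_path = Path(pdf_text.strip()).expanduser()
    if pdf_path.suffix.lower() != ".pdf":
        raise ValueError(f"not a PDF file: {pdf_path}")
    if not pdf_path.is_file():
        raise FileNotFoundError(pdf_path)

    name = xlsx_text.strip()
    if not name:
        raise ValueError("no Excel file name given")
    xlsx_path = Path(name)
    # Assumption: the output is always written as .xlsx
    if xlsx_path.suffix.lower() != ".xlsx":
        xlsx_path = xlsx_path.with_suffix(".xlsx")
    # Assumption: a relative name is saved next to the source PDF
    if not xlsx_path.is_absolute():
        xlsx_path = pdf_path.parent / xlsx_path
    return pdf_path, xlsx_path
```

Inside start, this could be called as resolve_paths(self.file_pdf.text(), self.file_xls.text()), with the two exceptions caught and shown to the user before any conversion is attempted.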
Maybe try including some things you tried, and why they didn’t work. Also the Russian at the beginning is a bit confusing.
Microsoft uses Acrobat for server-side conversion to PDF, and Adobe uses MS Office to convert to xlsx, so there is no better method. Look inside the result of converting a blank file and you will see that the added data is Microsoft defaults:
<a:font script="Uigh" typeface="Microsoft Uighur"/> </a:majorFont> <a:minorFont> <a:latin typeface="Calibri"/> <a:ea typeface=""/> <a:cs typeface=""/> <a:font script="Jpan" typeface="MS Pゴシック"/>
Office has plugins for conversion to Adobe's format, and Adobe will do the reverse. However, the two formats will never convert perfectly: PDF is an output format, and Excel is not an easy target for PDF input. Word is better for PDF input.
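To check the font defaults mentioned above for yourself, you can open a converted workbook as a zip archive and read its theme part. This is a rough sketch using only the standard library. The function name theme_fonts is made up for illustration, and xl/theme/theme1.xml is the usual location of the theme in an .xlsx file, so treat it as an assumption about the file you get back.

```python
import xml.etree.ElementTree as ET
import zipfile

# DrawingML namespace behind the a: prefix in the theme XML
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"


def theme_fonts(xlsx_file, part="xl/theme/theme1.xml"):
    """Return (script, typeface) pairs for every <a:font> in the theme part.

    xlsx_file may be a path or a binary file-like object.
    The default part name is an assumption about the workbook layout.
    """
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read(part))
    return [
        (font.get("script"), font.get("typeface"))
        for font in root.iter(f"{{{A_NS}}}font")
    ]
```

Calling theme_fonts("converted.xlsx") on a file produced by the site (the file name here is only a placeholder) lists the per-script fonts that were written into the workbook, which makes it easy to compare with a blank workbook saved from Excel.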