Parsing the Adobe Acrobat website to transfer tables from PDF to Excel [closed]

I am working with the official Adobe Acrobat site and have a PDF file.
I would like to use Python to write a converter that transfers tables from PDF to Excel format. I have already finished the interface; the main problem for me is writing a parser for this site, or something similar. I would be grateful for your help.
Here is my interface code:

import sys

from PyQt5.QtWidgets import (
    QApplication,
    QFormLayout,
    QLabel,
    QLineEdit,
    QPushButton,
    QVBoxLayout,
    QWidget,
)


class MainWindow(QWidget):
    def __init__(self):
        super(MainWindow, self).__init__()
        self.setGeometry(700, 300, 330, 430)
        self.setWindowTitle('PdfRedactor')
        layout = QVBoxLayout()

        # Field for the path to the source PDF file
        pdf_label = QLabel('Enter pdf file directory')
        self.file_pdf = QLineEdit(self)

        # Field for the name of the Excel file to save
        xls_label = QLabel('Specify the name of the excel file')
        self.file_xls = QLineEdit(self)

        # Adding a "Start" button
        start_button = QPushButton('Start data transfer', self)
        start_button.clicked.connect(self.start)

        # Placing elements on the form
        form_layout = QFormLayout()
        form_layout.addRow(pdf_label, self.file_pdf)
        form_layout.addRow(xls_label, self.file_xls)

        layout.addLayout(form_layout)
        layout.addWidget(start_button)
        self.setLayout(layout)

        # Add style
        self.setStyleSheet("""
                    QWidget {
                        background-color: #f0f0f0;
                    }

                    QLabel {
                        font-size: 14px;
                        color: #333;
                    }

                    QLineEdit {
                        padding: 5px;
                        font-size: 12px;
                        border: 1px solid #aaa;
                        border-radius: 4px;
                        background-color: #fff;
                    }

                    QRadioButton {
                        font-size: 12px;
                        color: #333;
                    }

                    QPushButton {
                        background-color: #4CAF50;
                        color: white;
                        padding: 8px 16px;
                        font-size: 14px;
                        border: none;
                        border-radius: 4px;
                    }

                    QPushButton:hover {
                        background-color: #45a049;
                    }
                """)

    def start(self):
        # TODO: the PDF-to-Excel conversion (the parsing) should go here
        pass


if __name__ == '__main__':
    app = QApplication(sys.argv)
    ex = MainWindow()
    ex.show()
    sys.exit(app.exec_())

I tried many options for transferring tables from PDF files to Excel (all the well-known Python libraries), and this site gave the best results. Thanks for the help.
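Whatever ends up doing the conversion, the empty `start` slot will probably need to check the two text fields first. Below is a minimal sketch of that step only, using the standard library. The helper name `resolve_paths` and its rules (force an `.xlsx` extension, save next to the PDF when only a bare name is given) are assumptions made for illustration, not part of the original code.

```python
from pathlib import Path


def resolve_paths(pdf_text, xlsx_text):
    """Validate the two text-field values and return (pdf_path, xlsx_path).

    Hypothetical helper: the name and the rules below are illustrative only.
    """
    pdf_path = Path(pdf_text.strip()).expanduser()
    if pdf_path.suffix.lower() != ".pdf":
        raise ValueError(f"Not a PDF file: {pdf_path}")
    if not pdf_path.is_file():
        raise FileNotFoundError(f"PDF file not found: {pdf_path}")

    name = xlsx_text.strip()
    if not name:
        raise ValueError("Excel file name is empty")
    xlsx_path = Path(name).expanduser()
    # Force an .xlsx extension (this replaces any other extension).
    if xlsx_path.suffix.lower() != ".xlsx":
        xlsx_path = xlsx_path.with_suffix(".xlsx")
    # A bare file name is saved next to the source PDF.
    if xlsx_path.parent == Path("."):
        xlsx_path = pdf_path.parent / xlsx_path.name
    return pdf_path, xlsx_path
```

Inside `start`, this could be called as `resolve_paths(self.file_pdf.text(), self.file_xls.text())`, with the two exceptions caught and shown to the user, for example in a `QMessageBox`.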

  • Maybe try including some of the things you tried and why they didn't work. Also, the Russian at the beginning is a bit confusing.

  • Microsoft uses Acrobat for server-side conversion to PDF, and Adobe uses MS Office for conversion to xlsx; there is no better method. Look inside the conversion of a blank file and you will see that the added data is Microsoft defaults: <a:font script="Uigh" typeface="Microsoft Uighur"/> </a:majorFont> <a:minorFont> <a:latin typeface="Calibri"/> <a:ea typeface=""/> <a:cs typeface=""/> <a:font script="Jpan" typeface="MS Pゴシック"/>. Office has plugins for Adobe conversion, and Adobe will do the reverse. However, the two formats will never convert perfectly, because PDF is an output format and Excel is not an easy input method for PDF; Word is better for PDF input.
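    The "look inside a conversion" step can be tried from Python, since an .xlsx file is a ZIP archive of XML parts. This is a rough sketch, assuming the theme parts sit under `xl/theme/` as they do in workbooks written by Excel; the function name `list_theme_typefaces` is made up for this example, and the regular expression is a shortcut rather than a real XML parse.

```python
import re
import zipfile


def list_theme_typefaces(xlsx_file):
    """Return the sorted, non-empty typeface names found in the theme parts.

    xlsx_file may be a path or a binary file-like object.
    """
    typefaces = set()
    with zipfile.ZipFile(xlsx_file) as archive:
        for name in archive.namelist():
            # Assumption: theme XML is stored as xl/theme/theme*.xml
            if name.startswith("xl/theme/") and name.endswith(".xml"):
                xml_text = archive.read(name).decode("utf-8")
                # Collect typeface="..." values, skipping empty ones
                typefaces.update(re.findall(r'typeface="([^"]+)"', xml_text))
    return sorted(typefaces)
```

    Calling it on a converted workbook, for instance `list_theme_typefaces("converted.xlsx")` (a hypothetical file name), lists the fonts the converter wrote into the theme.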
