How To Analyse PDF Documents With Amazon Textract In A Synchronous Way?

December 20, 2023

I want to extract tables from a batch of PDFs. To do this I am using an AWS Textract pipeline in Python. How can I do this without SNS and SQS? I want the processing to be synchronous.

Solution 1:

Textract currently cannot process PDF documents directly through its synchronous operations. From the Textract documentation:

Amazon Textract synchronous operations (DetectDocumentText and AnalyzeDocument) support the PNG and JPEG image formats. Asynchronous operations (StartDocumentTextDetection, StartDocumentAnalysis) also support the PDF file format.

A workaround is to convert each page of the PDF into an image in your code and then call the synchronous API operations on those images.
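A minimal sketch of that workaround follows. It assumes the third-party pdf2image package (which needs Poppler installed) for rendering pages and boto3 with configured AWS credentials for calling Textract; neither library is named in the original answer. The function names `analyze_pdf_sync` and `tables_from_blocks` are hypothetical helpers written for this illustration, and the rendered PNG for each page must stay within Textract's size limit for synchronous requests.

```python
import io


def tables_from_blocks(blocks):
    """Rebuild every TABLE block in a Textract response as rows of cell text."""
    by_id = {block["Id"]: block for block in blocks}

    def child_ids(block):
        # Textract links TABLE -> CELL -> WORD through CHILD relationships.
        for rel in block.get("Relationships", []):
            if rel["Type"] == "CHILD":
                yield from rel["Ids"]

    tables = []
    for block in blocks:
        if block["BlockType"] != "TABLE":
            continue
        cells = {}
        for cell_id in child_ids(block):
            cell = by_id[cell_id]
            if cell["BlockType"] != "CELL":
                continue
            words = [
                by_id[word_id]["Text"]
                for word_id in child_ids(cell)
                if by_id[word_id]["BlockType"] == "WORD"
            ]
            cells[(cell["RowIndex"], cell["ColumnIndex"])] = " ".join(words)
        n_rows = max((row for row, _ in cells), default=0)
        n_cols = max((col for _, col in cells), default=0)
        # RowIndex and ColumnIndex are 1-based in Textract responses.
        tables.append(
            [
                [cells.get((row, col), "") for col in range(1, n_cols + 1)]
                for row in range(1, n_rows + 1)
            ]
        )
    return tables


def analyze_pdf_sync(pdf_path, dpi=200):
    """Render each PDF page to PNG and send it to the synchronous AnalyzeDocument call.

    Returns one list of tables per page. The dpi default is an assumption;
    tune it to balance OCR quality against image size.
    """
    # Imported here so the parsing helper above works without these packages.
    import boto3
    from pdf2image import convert_from_path

    textract = boto3.client("textract")
    tables_per_page = []
    for page_image in convert_from_path(pdf_path, dpi=dpi):
        buffer = io.BytesIO()
        page_image.save(buffer, format="PNG")
        response = textract.analyze_document(
            Document={"Bytes": buffer.getvalue()},
            FeatureTypes=["TABLES"],
        )
        tables_per_page.append(tables_from_blocks(response["Blocks"]))
    return tables_per_page
```

Calling `analyze_pdf_sync("report.pdf")` (the file name is a placeholder) blocks until every page has been analysed, so no SNS topic or SQS queue is involved. Each page is one separate API call, so large documents will take proportionally longer than with the asynchronous operations.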
