
Textract

Textract is a machine learning service that automatically extracts text, forms, and tables from scanned documents. It simplifies the process of extracting valuable information from a variety of document types, enabling applications to quickly analyze and understand document content.

LocalStack allows you to mock Textract APIs in your local environment. The supported APIs are listed in the API Coverage section, which details the extent of Textract’s integration with LocalStack.

This guide is tailored for users new to Textract and assumes basic knowledge of the AWS CLI and our awslocal wrapper script.

Start your LocalStack container using your preferred method. We will demonstrate how to perform basic Textract operations, such as mocking text detection in a document.

You can use the DetectDocumentText API to identify and extract text from a document. Execute the following command:

awslocal textract detect-document-text \
    --document '{"S3Object":{"Bucket":"your-bucket","Name":"your-document"}}'
Output
{
    "DocumentMetadata": {
        "Pages": {
            "Pages": 389
        }
    },
    "Blocks": [],
    "DetectDocumentTextModelVersion": "1.0"
}
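
Hand-quoting the --document JSON is easy to get wrong when the bucket or object name comes from a variable. The following is a minimal sketch of a wrapper that builds the argument for you. The function name detect_text is hypothetical and not part of LocalStack or the AWS CLI, and the sketch assumes the bucket and object names contain no double quotes or backslashes.

```shell
# Hypothetical helper (not part of awslocal): builds the --document JSON
# from a bucket name and an object key, then calls DetectDocumentText.
# Assumption: neither value contains double quotes or backslashes.
detect_text() {
  local bucket="$1" key="$2"
  awslocal textract detect-document-text \
    --document "{\"S3Object\":{\"Bucket\":\"${bucket}\",\"Name\":\"${key}\"}}"
}

# Usage, with the placeholder names from the command above:
# detect_text your-bucket your-document
```

Calling detect_text your-bucket your-document should issue the same request as the command shown earlier.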

You can use the StartDocumentTextDetection API to asynchronously detect text in a document. Execute the following command:

awslocal textract start-document-text-detection \
    --document-location '{"S3Object":{"Bucket":"bucket","Name":"document"}}'
Output
{
    "JobId": "501d7251-1249-41e0-a0b3-898064bfc506"
}

Save the JobId value to use in the next command.
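
Rather than copying the JobId by hand, you can capture it in a shell variable. The sketch below relies on the standard AWS CLI options --query and --output, and assumes awslocal passes them through unchanged. The function name start_text_detection and the variable JOB_ID are hypothetical.

```shell
# Hypothetical helper: starts an asynchronous text detection job and
# prints only the JobId. --query and --output are standard AWS CLI
# options; this assumes awslocal forwards them to the CLI unchanged.
start_text_detection() {
  local bucket="$1" key="$2"
  awslocal textract start-document-text-detection \
    --document-location "{\"S3Object\":{\"Bucket\":\"${bucket}\",\"Name\":\"${key}\"}}" \
    --query 'JobId' --output text
}

# Usage:
# JOB_ID="$(start_text_detection bucket document)"
# awslocal textract get-document-text-detection --job-id "$JOB_ID"
```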

You can use the GetDocumentTextDetection API to retrieve the results of a document text detection job. Execute the following command:

awslocal textract get-document-text-detection \
    --job-id "501d7251-1249-41e0-a0b3-898064bfc506"

Replace 501d7251-1249-41e0-a0b3-898064bfc506 with the JobId value retrieved from the previous command.

Output
{
    "DocumentMetadata": {
        "Pages": {
            "Pages": 389
        }
    },
    "JobStatus": "SUCCEEDED",
    "Blocks": [],
    "DetectDocumentTextModelVersion": "1.0"
}
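
On AWS, an asynchronous job can report IN_PROGRESS before it reaches SUCCEEDED, FAILED, or PARTIAL_SUCCESS, so scripts usually poll for the result. The mocked response above already reports SUCCEEDED, so against LocalStack a loop like the one below would likely return on the first attempt. This is a sketch under those assumptions: the function name wait_for_text_detection is hypothetical, and TIMED_OUT is a local sentinel, not a Textract status.

```shell
# Hypothetical helper: polls GetDocumentTextDetection until the job is no
# longer IN_PROGRESS, or until max_attempts is reached.
# Prints the final status; returns 0 only when the status is SUCCEEDED.
wait_for_text_detection() {
  local job_id="$1" max_attempts="${2:-10}" delay="${3:-1}"
  local status attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    status="$(awslocal textract get-document-text-detection \
      --job-id "$job_id" --query 'JobStatus' --output text)"
    if [ "$status" != "IN_PROGRESS" ]; then
      echo "$status"
      [ "$status" = "SUCCEEDED" ]
      return
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  # TIMED_OUT is a sentinel of this sketch, not a value Textract returns.
  echo "TIMED_OUT"
  return 1
}

# Usage:
# wait_for_text_detection "501d7251-1249-41e0-a0b3-898064bfc506"
```

The second and third arguments (attempt limit and delay in seconds) are optional and default to 10 and 1.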