Class TextractParser

This class provides methods to process the information returned by Textract into a tree structure that is easier to work with.

A single instance of this class is automatically created for you and provided as the default export from this library.

Hierarchy

  • TextractParser

Properties

Private factory

factory: DocumentFactory

Methods

parseDetectTextCallback

  • Method that acts as a proxy for the standard Textract callback.

    This proxy processes the data returned by Textract and calls the provided callback with the processed information. It can be invoked as shown below, where myCallback is a function supplied by the user of the library.

    textract.detectDocumentText(request, textractParser.parseDetectTextCallback(myCallback))

    Parameters

    Returns TextractDetectTextCallback

    callback function that can be used with the AWS Textract invocation
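
To make the proxy pattern concrete, here is a minimal sketch of how a wrapping callback of this kind could be built. It is not the library's implementation: `parseBlocks` and `makeParsingCallback` are hypothetical names, the parsed shape (`{ lines }`) is invented for illustration, and the assumption that errors are passed through untouched in a Node-style `(err, data)` callback is mine rather than something the documentation above states.

```javascript
// Hypothetical stand-in for the library's parsing step:
// keeps only the text of LINE blocks from a raw Textract response.
function parseBlocks(response) {
  const lines = (response.Blocks || [])
    .filter(block => block.BlockType === 'LINE')
    .map(block => block.Text)
  return { lines }
}

// Wraps a user callback so that it receives parsed data instead of the raw response.
// Assumption: errors are forwarded unchanged and no parsing is attempted.
function makeParsingCallback(userCallback) {
  return (err, data) => {
    if (err) {
      userCallback(err, null)
      return
    }
    userCallback(null, parseBlocks(data))
  }
}

// Simulate the SDK invoking the proxied callback, first with data, then with an error.
const results = []
const proxied = makeParsingCallback((err, parsed) => results.push({ err, parsed }))
proxied(null, {
  Blocks: [
    { BlockType: 'LINE', Text: 'Hello' },
    { BlockType: 'WORD', Text: 'Hello' }
  ]
})
proxied(new Error('boom'), null)
```

The design point is that the caller keeps writing an ordinary callback while the wrapper decides what that callback sees.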

parseDetectTextResponse

  • Method that parses the Textract response synchronously.

    For example, it can be used when processing the result of a promise, as shown below.

    textract.detectDocumentText(request).promise()
    .then(data => textractParser.parseDetectTextResponse(data))
    .then(parsedData => console.log(parsedData))
    .catch(err => console.log(err))

    NOTE: If used to process a GetDocumentTextDetectionResponse, all data must be contained within a single response. If a NextToken is detected on the response, null is returned. See parseGetTextDetection for a helper method that aggregates the responses from the GetDocumentTextDetection operation.

    Parameters

    Returns Document | null

    Document that acts as the root node for the processed tree
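
Because the return type is Document | null, callers need to handle the null case explicitly. The sketch below illustrates that handling with a hypothetical stand-in, `parseSinglePage`, which mimics only the documented NextToken rule; the returned shape (`{ blockCount }`) and the helper `describeResult` are assumptions made for illustration, not part of the library.

```javascript
// Hypothetical stand-in for parseDetectTextResponse:
// returns null when the response is only one page of a larger result set.
function parseSinglePage(response) {
  if (response.NextToken) {
    return null
  }
  return { blockCount: (response.Blocks || []).length }
}

// Caller-side handling: treat null as "results are paginated".
function describeResult(response) {
  const parsed = parseSinglePage(response)
  if (parsed === null) {
    return 'paginated: use parseGetTextDetection with the job id instead'
  }
  return `parsed ${parsed.blockCount} block(s)`
}

console.log(describeResult({ Blocks: [{ BlockType: 'PAGE' }] }))
console.log(describeResult({ Blocks: [], NextToken: 'abc' }))
```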

parseGetTextDetection

  • parseGetTextDetection(textract: Textract, jobId: string): Promise<Document>
  • Method that retrieves the result of an asynchronous document text detection operation (which may require multiple requests to AWS) and produces a tree of the results.

    An example of how to use this method is shown below.

    const jobId = 'your-job-id'
    const client = new AWS.Textract()
    
    textractParser.parseGetTextDetection(client, jobId)
    .then(parsedData => console.log(parsedData))
    .catch(err => console.log(err))

    If the specified Textract job is not marked as SUCCEEDED or the AWS operations fail to return the results then the Promise will be rejected.

    NOTE: This method attempts to retrieve all the results for the Textract job and process them in memory. For extremely large documents, memory usage may become an issue.

    Parameters

    • textract: Textract
    • jobId: string

      the id of the Textract job for which we want to parse the results

    Returns Promise<Document>

    Promise for a document that acts as the root node for the processed tree
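
The "multiple requests to AWS" mentioned above come from NextToken pagination on the GetDocumentTextDetection operation. The following is a rough sketch of that aggregation loop, not the library's actual code: `collectAllBlocks` and `makeFakeClient` are hypothetical names, and the fake client only imitates the `.promise()` shape of the AWS SDK for JavaScript v2 so the sketch can run without credentials.

```javascript
// Collects the Blocks from every page of a GetDocumentTextDetection job.
// Rejects if any page reports a status other than SUCCEEDED.
async function collectAllBlocks(client, jobId) {
  const blocks = []
  let nextToken
  do {
    const params = { JobId: jobId }
    if (nextToken) {
      params.NextToken = nextToken
    }
    const page = await client.getDocumentTextDetection(params).promise()
    if (page.JobStatus !== 'SUCCEEDED') {
      throw new Error(`Textract job ${jobId} has status ${page.JobStatus}`)
    }
    for (const block of page.Blocks || []) {
      blocks.push(block)
    }
    nextToken = page.NextToken
  } while (nextToken)
  return blocks
}

// Hypothetical fake client: serves the given pages in order, one per request.
function makeFakeClient(pages) {
  let call = 0
  return {
    getDocumentTextDetection: () => ({
      promise: () => Promise.resolve(pages[call++])
    })
  }
}

const fakeClient = makeFakeClient([
  { JobStatus: 'SUCCEEDED', Blocks: [{ Id: '1' }, { Id: '2' }], NextToken: 'page-2' },
  { JobStatus: 'SUCCEEDED', Blocks: [{ Id: '3' }] }
])

collectAllBlocks(fakeClient, 'your-job-id')
  .then(blocks => console.log(`collected ${blocks.length} blocks`))
  .catch(err => console.log(err))
```

Every page is held in memory before anything is returned, which is the trade-off the NOTE above describes for very large documents.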
