Document Extraction with SAP Process Automation: the ‘Automated Template Detection’ Feature

INTRODUCTION

I’m writing this blog post with great pleasure and excitement. It is both an announcement and a how-to for a major feature release, ‘Automated Template Detection’, which our customers have been requesting for a while. It aims to simplify document processing and make bot creation easier. (It is, of course, just one of many great upcoming features!)

This feature was released in April 2022 and is available to all existing RPA customers and SAP Process Automation customers as an integration feature with the Document Information Extraction service.

But before we start, a few words on SAP Process Automation for those who aren’t yet familiar with it –

SAP Process Automation combines the capabilities of SAP Workflow Management and SAP Intelligent Robotic Process Automation into an intuitive no-code experience. This is a significant step towards simplifying process automation and enabling more people within the organization to participate in automating processes. You can read the news article published by.

Here is a short 3-minute video that gives a good overview of SAP Process Automation. I hope it piques your interest even further.

https://youtu.be/yhQ8oa3_WFQ

PREREQUISITES

If you are new to SAP Process Automation –

  • SAP Process Automation is available on SAP BTP under the CPEA or Pay-As-You-Go (PAYG) commercial models. If you don’t have access to either, you can get started with PAYG. Follow this SAP Developer Tutorial for step-by-step guidance on how to create a new PAYG account.

You can find more information on SAP Process Automation in the SAP Discovery Center, including DC availability, pricing, and further assets as they become available.

SAP Process Automation is now available to try free of charge on SAP BTP free tier.

  • To learn how to get started with and subscribe to SAP Process Automation, please refer to this blog.

For the existing RPA customers –

  • Installation as per the instructions in the Help Portal
  • Knowledge of projects and automations. Tutorials can be found under: Tutorials
  • To learn how to create and use templates, refer to this blog.

Where to find the Document Information Extraction activities –

  1. Select Dependencies from the combo box and click the Manage Dependencies button shown below.

  2. Click Add Dependencies.

  3. Look for “SAP Intelligent RPA Document Information Extraction SDK” and add it.

  4. All the Document Information Extraction service activities then appear, as shown in the image below.

What is Automated Template Detection?

It is one of the most awaited document extraction features. It enhances the ‘Extract Data (Template)’ activity so that it pre-screens and pre-extracts data from your input PDF or image file, decides which template under the selected schema is the best fit, and performs the extraction with that template.

Which problem does it solve for intelligent document processing?

Previously, you had to add a separate ‘Extract Data (Template)’ activity for each template, or build logic that routed each input document from one activity to the next until the correct template picked it up. This process was redundant and complicated. We received a great deal of feedback from customers during the Q4 2021 beta and wanted to make this part of bot building as easy as it should be.
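To make the old approach concrete, here is a minimal Python sketch of the kind of hand-written routing logic you had to maintain yourself. It is purely illustrative: the keyword table, the template names, and the function `pick_template_manually` are hypothetical and not part of any SAP SDK.

```python
from pathlib import Path

# Hypothetical routing table: filename keyword -> template name.
# Both the keywords and the template names are invented for this sketch.
TEMPLATE_BY_KEYWORD = {
    "acme": "Acme_Invoice_Template",
    "globex": "Globex_Invoice_Template",
}


def pick_template_manually(document_path: str) -> str:
    """Return the template for a document using hand-written rules."""
    file_name = Path(document_path).name.lower()
    for keyword, template in TEMPLATE_BY_KEYWORD.items():
        if keyword in file_name:
            return template
    # Every unknown layout needs a new rule before it can be processed.
    raise ValueError(f"No routing rule matches {document_path!r}")


print(pick_template_manually("invoices/ACME_2022_04.pdf"))
```

Every new document layout meant another rule like these, which is the upkeep this kind of logic brings with it.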

How is the document extraction logic created now?

As mentioned at the beginning, ‘Automated Template Detection’ has you covered, and template selection now happens automatically.

After selecting the ‘Extract Data (Template)’ activity, you simply select your schema, select ‘Detect Automatically’, and provide the document path. The Document Information Extraction service does the rest for you: the backend algorithm pre-screens the document and selects the right template based on the pre-extracted fields!
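The service’s actual detection algorithm is not described in this post, so the following is only a conceptual Python sketch of what “selecting a template based on pre-extracted fields” could look like. The scoring rule (the share of a template’s expected fields that were found), the function `detect_template`, and the template and field names are all assumptions made for illustration.

```python
def detect_template(pre_extracted: dict, templates: dict) -> str:
    """Pick the template whose expected fields best match the pre-extracted ones.

    pre_extracted: field name -> value found during pre-screening
    templates:     template name -> set of field names the template expects
    """
    if not templates:
        raise ValueError("The schema has no templates to choose from")

    # Only fields that actually yielded a non-empty value count as found.
    found = {name for name, value in pre_extracted.items() if value}

    def score(template_name: str) -> float:
        expected = templates[template_name]
        if not expected:
            return 0.0
        return len(expected & found) / len(expected)

    return max(templates, key=score)


# Hypothetical templates under one schema, and a hypothetical pre-extraction.
templates = {
    "Acme_Invoice_Template": {"documentNumber", "purchaseOrderNumber", "grossAmount"},
    "Globex_Invoice_Template": {"documentNumber", "taxId", "netAmount"},
}
pre_extracted = {
    "documentNumber": "4711",
    "taxId": "DE123",
    "netAmount": "100.00",
    "grossAmount": "",
}
print(detect_template(pre_extracted, templates))
```

In this toy example the second template wins because all three of its expected fields were found, while the first matches only one of three.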

Where do you find the ‘Automated Template Detection’ feature?

As shown in the screenshot below, this feature is part of the ‘Extract Data (Template)’ activity.

 

How-To –

Let us skip a few initial steps that were covered in previous blog posts and go straight to a document extraction example –

  1. Create an Automation Project ‘XYZ’

  2. Create a template artifact: choose an existing template or create a new one, then select your Document Type (or create a custom Document Type), and finally select an existing schema, pick an SAP global schema such as ‘SAP_invoice_schema’, or create your own custom schema.

For all the steps on how to do it, please refer to this blog.

Note – When you begin a project by creating a template artifact, the ‘Document Information Extraction SDK’ dependency is automatically added to your project along with the ‘PDF SDK’, so you do not have to add it manually.

  3. Repeat step 2 to create as many templates as you need.

Now to the steps for the topic at hand – ‘Automated Template Detection’

  4. Add the ‘Extract Data (Template)’ activity to the automation workspace.

  5. Click the activity to bring up the details panel on the right.

  6. Select your schema (e.g. SAP_invoice_schema).

  7. The ‘Detect Automatically’ option is selected by default.

  8. Provide your document path or a variable containing the document path.

  9. Click Save.

Pretty short and straightforward! Now let us move on to the execution part, which shows something interesting during testing.

  10. Search for “log” and drag and drop the ‘Log Message’ activity into the automation workspace.

  11. In the right panel, click the message text box and select ‘(1) ExtractedData’.

  12. Below the message text box, click the ‘type’ text box and select ‘Info’.

  13. Put a breakpoint on the ‘Log Message’ activity; an orange dot appears next to the activity.

  14. Click the Save button.

Now run the bot to test it –

  15. The bot pauses after the first step. Click the ‘Extract Data (Template)’ activity in the left debug panel and, in the right panel, expand the ‘schemaUid’ input parameter.

  16. When you check the ‘Identifier’, you will notice that the correct template has been identified for the input document; you can cross-check this here at the debug level.

  17. Continue running the test and let it finish. The extracted result appears in the Info panel below, where it can be expanded for a preview.

This concludes the example.
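If you want to post-process the logged result rather than just preview it, something like the Python sketch below could serve as a starting point. It assumes the extracted data is a JSON-like object with a `headerFields` list whose entries carry `name`, `value`, and `confidence`. That shape, the sample values, and the helper `summarize_header_fields` are assumptions for illustration; check the real structure of ‘ExtractedData’ in the debugger before relying on it.

```python
def summarize_header_fields(extracted: dict, min_confidence: float = 0.5) -> list:
    """Return one readable line per header field, flagging low-confidence values."""
    lines = []
    for field in extracted.get("headerFields", []):
        confidence = field.get("confidence", 0.0)
        flag = "" if confidence >= min_confidence else " (low confidence)"
        lines.append(f"{field['name']}: {field['value']}{flag}")
    return lines


# Hypothetical extraction result, shaped as assumed above.
sample_result = {
    "headerFields": [
        {"name": "documentNumber", "value": "4711", "confidence": 0.93},
        {"name": "grossAmount", "value": "119.00", "confidence": 0.41},
    ]
}

for line in summarize_header_fields(sample_result):
    print(line)
```

Flagging low-confidence values this way makes it easier to decide which documents need a manual review.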

CONCLUSION

In this blog post, you have learned about the new ‘Automated Template Detection’ feature, its significance in the overall process, and how to use it. I hope it has given you a good start for exploring the Document Information Extraction service activities within SAP Process Automation and for trying out template artifact creation.

Thanks for reading, and feel free to leave a comment with questions or feedback.

Stay tuned for more updates on Document Information Extraction Service in SAP Process Automation.

LINKS

Please refer to the following links for steps and further information –

  • For Enrichment Activities and all related sub-activities: Enrichment Data API – SAP Help Portal
  • For examples of using JSON script: Create Enrichment Data – SAP Help Portal
  • How to create schemas and templates: Schema – SAP Help Portal, Template – SAP Help Portal
  • For best practices on using activities related to the Document Information Extraction service: Best Practices – SAP Help Portal

 

For more information on SAP Process Automation: