Automatically Capture Document Information with SAP Intelligent RPA

This blog post introduces an end-to-end SAP Intelligent Robotic Process Automation (RPA) workflow that retrieves attachments from Outlook, submits them to Document Information Extraction, and sends extracted data to an external system of your choice.

Document Information Extraction is a powerful tool for extracting business-critical information from a set of digital documents. Recently, I used Document Information Extraction to retrieve energy usage data from electricity invoices, with the goal of sending the extracted data to SAP S/4HANA for emissions management.

However, one challenge I came across was the document submission process itself. Although submitting an invoice to Document Information Extraction is simple, manually repeating this whenever a new invoice arrives is not ideal. Thankfully, SAP Intelligent RPA provides an elegant solution for automating the entire process.

In this article, I will walk through the process of creating an SAP Intelligent RPA workflow that automatically submits files to Document Information Extraction. The end-to-end process starts by retrieving email attachments from Outlook and finishes by sending the extracted data to an external system. I use a Node.js server, but you can use any suitable API endpoint for your use case.

Configuring SAP Intelligent RPA

Before proceeding, you will need to install and configure SAP Intelligent RPA on your SAP Business Technology Platform subaccount and local machine. The complete documentation for setup and configuration is found here.

Create a Document Template in Cloud Studio

The first thing we will need to do is create a suitable template, which will be used as a basis for extracting data from future documents.

SAP Intelligent RPA 2.0 includes access to Document Information Extraction within Cloud Studio. To enable this, ensure the role ‘Document_Information_Extraction_UI_Templates_Admin’ is added to the IRPA role collection on SAP BTP.

Role collections in SAP Business Technology Platform to enable Document Information Extraction within SAP Intelligent RPA Cloud Studio.

Once the roles are configured, access your Cloud Studio project, and navigate to the ‘+’ on the left side.

Access options in Cloud Studio

Then, select Create, then Document Template.

Document Template

In the subsequent window, you have a few options for selecting pre-built templates and schemas. For today, proceed with creating a new template and select an appropriate file for training. I am using Sample Invoice 2 from this tutorial found on the SAP Developer Center.

Give your template a name, description and supply a sample document

Select the Document Type – in this case, it is an invoice.

Select Invoice as the document type.

Finally, proceed with using the pre-built schema. For custom documents, select Create New to create a schema for the header and line items you require. Once completed, select Add.

Choose an existing schema, or create a new one if using custom documents

In the following pop-up, select Open in new tab – this will open the Document Information Extraction UI.

Open the Document Information Extraction UI

Annotate your Template in the Document Information Extraction UI

The next step is to annotate the required fields in your template document, which can be done through the Document Information Extraction UI. Select your document and press Annotations to open the line items for your document.

Select the template document and choose Annotations.

Then press Edit to begin annotating the relevant fields in your document – drag and highlight a portion of the document and match it with a header or line item. Once completed, press Save and Activate to make the template available for your automations.

Press Activate to make this template available for use.

The template is now visible in Cloud Studio. Navigate to Document Templates to view all available templates and their annotations.

Access Document Templates to view your available templates.

Retrieving Attachments from Outlook

To start creating an automation, navigate to the ‘+’ icon and select Create, then Automation.

Create a new Automation

The first phase of the automation is to retrieve and save attachments from an Outlook email. To accomplish this in SAP Intelligent RPA, add the Outlook SDK as a project dependency (if this is your first time using the Outlook SDK, you can acquire it in the SAP Intelligent RPA Factory store).

Add the Outlook SDK to your project to work with emails and calendars

Insert the following Outlook workflow activities to open, filter and retrieve attachments.

Add the following workflow activities

We would like to retrieve attachments from emails that match a specific criterion. In this example, the invoice relates to a purchase from Apple, so we can narrow the search by using ‘Apple Store Invoice’ as the email subject.

In the ‘Search Email (Outlook)’ step, enter the following parameters in the ‘searchCriterionList’. Additional criteria can also be added to increase robustness (such as filtering by sender name or date received).

Add the following searchCriteria to find a specific email in your inbox – in this case, an invoice from Apple

With three simple steps, we can now search through Outlook and retrieve attachments from an email based on a criterion.
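To make the filtering idea concrete outside of Cloud Studio, here is a minimal sketch in plain JavaScript. It does not use the Outlook SDK: the email objects, the `matchesCriteria` helper, and the field names `subject`, `sender` and `receivedAt` are all hypothetical and only illustrate how combining several criteria narrows the result.

```javascript
// Hypothetical email records; the real Outlook SDK activities return their own structures.
const inbox = [
  { subject: 'Apple Store Invoice', sender: 'billing@example.com', receivedAt: new Date('2022-03-01') },
  { subject: 'Weekly newsletter', sender: 'news@example.com', receivedAt: new Date('2022-03-02') },
  { subject: 'Apple Store Invoice', sender: 'unknown@example.org', receivedAt: new Date('2022-03-03') },
];

// Returns true when the email satisfies every criterion that was supplied.
// A criterion that is left out is treated as "no restriction".
function matchesCriteria(email, criteria) {
  const subjectOk = !criteria.subject || email.subject.includes(criteria.subject);
  const senderOk = !criteria.sender || email.sender.toLowerCase() === criteria.sender.toLowerCase();
  const dateOk = !criteria.receivedAfter || email.receivedAt >= criteria.receivedAfter;
  return subjectOk && senderOk && dateOk;
}

// Filtering by subject alone keeps two emails; adding the sender keeps one.
const bySubject = inbox.filter((email) => matchesCriteria(email, { subject: 'Apple Store Invoice' }));
const bySubjectAndSender = inbox.filter((email) =>
  matchesCriteria(email, { subject: 'Apple Store Invoice', sender: 'billing@example.com' })
);

console.log(bySubject.length, bySubjectAndSender.length);
```

In the automation itself, the equivalent narrowing is done by adding more entries to the ‘searchCriterionList’ rather than by writing code.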

The next step is to iterate through those attachments and save them to a local folder. Navigate to Controls, select the For Each component, and insert this in the workflow. Within this step, set the iterable to be the attachmentNamesList from step three.

Add the For Each control to your workflow

Proceed with saving attachments by using the Save Mail Attachment (Outlook) activity. Enter a path to a local folder in the destinationPath and set the current attachment name to the attachmentFileName.

Save attachments to a local folder

Great! We have now extended our automation to save Outlook attachments to a local folder. The next phase is to retrieve these saved attachments and submit them to Document Information Extraction.

Submit Attachments to Document Information Extraction

The activity for calling Document Information Extraction requires a path to the file, and this path can be generated using a custom script. Insert a custom script control as step six and enter the following JavaScript code:

// Include the trailing path separator.
const path = '{enter your attachments folder path}';
return path + input;

Ensure you also add input and output parameters for the custom script.
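Because the script concatenates the folder and the file name directly, a missing trailing separator on the folder path produces an invalid file path. The following is a minimal sketch of a more forgiving join, written as a standalone function so it can be tried in plain Node.js. The name `buildAttachmentPath` and the default backslash separator (assuming a Windows machine) are my own assumptions, not part of SAP Intelligent RPA.

```javascript
// Hypothetical helper: joins a folder path and a file name with exactly one separator.
// The default separator assumes a Windows-style path; pass '/' for other systems.
function buildAttachmentPath(folder, fileName, separator = '\\') {
  // Drop any trailing slashes or backslashes from the folder.
  const trimmedFolder = folder.replace(/[\\/]+$/, '');
  // Drop any leading slashes or backslashes from the file name.
  const trimmedName = fileName.replace(/^[\\/]+/, '');
  return trimmedFolder + separator + trimmedName;
}

// Both calls produce the same path, with or without the trailing separator.
console.log(buildAttachmentPath('C:\\Invoices\\', 'invoice.pdf'));
console.log(buildAttachmentPath('C:\\Invoices', 'invoice.pdf'));
```

Inside a custom script, the body of this function could replace the simple concatenation, with the folder held in a constant and the file name taken from the script's input parameter.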

Add input and output parameters for the script

Once saved, update the input and output parameters.

Update the input and output parameters

Next, insert the Extract Data With Template activity as step seven and set the Document Template to be our previously created template. Update the documentPath parameter with the filePath generated by our JavaScript code.

Add the Extract Data with Template activity to submit the current attachment for processing

With this completed, our automation is updated with the capability of submitting saved attachments to Document Information Extraction.

Testing the Automation

To test the automation, email yourself a copy of the document you’d like to process (ensure the email subject conforms to your automation’s search criteria). Add a ‘Log Message’ activity in your workflow as step eight, with extractedData as the message. Click Save and then run the automation. The test console should update with the extracted fields.

Posting Data to an API Endpoint (Node.js Server Example)

The final phase of this automation is posting extracted data to an external system. Ideally, we’d like to do something extra with the data we’ve retrieved, such as storing it for reporting or performing some additional analytics. In SAP Intelligent RPA, we can accomplish this by using Call Web Service to send our data to an external system for further work.

To simulate an external service, I used a Node.js server built with Express (a web framework for Node.js whose application generator scaffolds the boilerplate code for a server). Within your Express application, add a new route file called irpa.js in the routes folder, and enter the following code:

var express = require('express');
var router = express.Router();

// Log the line items posted by the automation, then close the response.
router.post('/', function(req, res) {
  console.log(req.body.input.lineItemFields);
  res.end();
});

module.exports = router;

Then, add the following lines to app.js:

var irpa = require('./routes/irpa.js');
app.use('/irpa/', irpa);
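The route above assumes every request carries `body.input.lineItemFields`; if the body is missing or has a different shape, reading `req.body.input` throws. Below is a minimal sketch of a more defensive handler. The function name `handleIrpaPost` and the JSON response bodies are hypothetical choices of mine, and the handler is written as a plain function so it can be exercised without starting a server.

```javascript
// Hypothetical handler: checks the payload shape before logging it.
// Assumes the automation posts { input: { lineItemFields: ... } }, as in the route above.
function handleIrpaPost(req, res) {
  const input = req.body && req.body.input;

  if (!input || input.lineItemFields === undefined) {
    // Reject requests that do not carry the expected fields.
    res.status(400).json({ error: 'Expected body.input.lineItemFields' });
    return;
  }

  console.log(input.lineItemFields);
  res.status(200).json({ ok: true });
}

// Minimal stand-in for the Express response object, used only to try the handler locally.
function createMockResponse() {
  const response = { statusCode: null, body: null };
  response.status = (code) => { response.statusCode = code; return response; };
  response.json = (payload) => { response.body = payload; return response; };
  return response;
}

const rejected = createMockResponse();
handleIrpaPost({ body: {} }, rejected);

const accepted = createMockResponse();
handleIrpaPost({ body: { input: { lineItemFields: [] } } }, accepted);

console.log(rejected.statusCode, accepted.statusCode);
```

To use it in the route file, register it with `router.post('/', handleIrpaPost);`. Note that `req.body` is only populated when a JSON body parser such as `express.json()` is registered in app.js.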

Back in Cloud Studio, place a custom script as step nine and enter the following code:

const options = {
  responseType: 'json',
  url: 'http://localhost:3000/irpa',
  method: 'POST',
  headers: {
    'Accept': 'application/json',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({ input })
};
return options;

Ensure you update the input and output parameters for this step as well. Finally, add a Call Web Service activity that uses the generated options to perform a POST request to our Node server.

Add the Call Web Service activity to execute the POST request
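If you want to check the request options before running the automation, the custom script logic can be wrapped in a function and run in plain Node.js. This is only a sketch: the function name `buildRequestOptions` and the sample payload are hypothetical, and the URL parameter is added so the endpoint can be changed without editing the body of the script.

```javascript
// Hypothetical wrapper around the custom script logic so it can be tried outside Cloud Studio.
// The default URL matches the local Node.js server used in this post.
function buildRequestOptions(input, url = 'http://localhost:3000/irpa') {
  return {
    responseType: 'json',
    url,
    method: 'POST',
    headers: {
      Accept: 'application/json',
      'Content-Type': 'application/json',
    },
    // Wrapping the data as { input } is why the server reads req.body.input.
    body: JSON.stringify({ input }),
  };
}

// Sample payload with made-up values, used only to inspect the generated options.
const sampleOptions = buildRequestOptions({ lineItemFields: [{ name: 'description', value: 'Sample item' }] });
console.log(sampleOptions.method, sampleOptions.url);
console.log(JSON.parse(sampleOptions.body).input.lineItemFields.length);
```

Keeping the payload wrapped in an `input` property keeps the script and the Express route in agreement; if you rename it on one side, rename it on the other.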

With this step completed, the full end-to-end automation is ready to be tested!

Testing the Final Workflow (Node.js Server Example)

To test the complete workflow, open a command line session in your Node.js project directory. Run ‘npm install’ to install dependencies and ‘npm start’ to start the server. Then run the complete automation in Cloud Studio. If everything works as intended, the extracted data is printed in your command line session. From here, it’s up to you how you process the data further!

In this blog post, I have shown that SAP Intelligent RPA can simplify the Document Information Extraction process, with the ability to send extracted data to an external service (simulated with Node.js).

There are numerous applications for such a workflow. For example (based on my original use case), we can now automatically collect electricity usage data from invoices and send it to an emissions management platform, such as SAP Environment, Health, and Safety Management. This can help companies meet and exceed their sustainability goals.

Thank you for reading this blog post – please try the steps for creating this automation for yourself and leave a comment to share your experience, thoughts or to ask any questions.