Your SAP on Azure – Part 28 – Integrate SAP Data Warehouse Cloud with PowerBI and Azure Synapse

SAP Data Warehouse Cloud is a new offering that allows customers to centrally manage and analyse business data from multiple sources. Unlike the SAP Business Warehouse or BW/4HANA, the new service is only available in the cloud in the SaaS model. It offers a tight integration with other SAP products, which makes it an attractive solution if you don’t want to use the traditional data warehousing platforms. As I’m always interested in new ways of integrating SAP data, the new offering was an obvious choice for me to play around with.

I have to admit I was impressed with the product’s capabilities. I like the approach of differentiating technical data provisioning and modelling with the analytical layer focused on business use. And I immediately thought about using the data stored in the warehouse with Microsoft PowerBI to analyse them and discover insights or even importing to Azure Synapse to blend with data coming from other sources. In today’s blog, I will show you how to do it!

GETTING DATA FROM SOURCE SAP SYSTEM

We start with a quick data import from the source SAP system. I’m using the latest SAP S/4HANA edition, and I want to import information about sales orders. I decided to use the C_SALESDOCUMENTITEMDEX_1 CDS View. You can also use traditional SAPI extractors or replicate data directly from the ABAP tables.

You need to establish a communication channel to read data from the SAP system. The Data Provisioning Agent acts as a proxy between the system installed in a private network and the cloud service. The documentation covers the installation process nicely, so I won’t include these details here.
You can find more information here:

https://help.sap.com/docs/SAP_DATA_WAREHOUSE_CLOUD/9f804b8efa8043539289f42f372c4862/bffbd58c15784a62af0520f171018ded.html?locale=en-US&q=refresh

Once you installed the Data Provisioning Agent and defined the source system in the SAP Data Warehouse Cloud, adding a new entity to replicate data is a straightforward process. In the Data Builder section, choose Import Remote Tables from the Import menu:

The data import wizard simplifies the process of choosing the source SAP object. In the first step choose the system that will act as the data source:

In the next step, select the source object. You can replicate ABAP tables or use one of the ODP entities like CDS Views, SAPI Extractors or even SLT. Navigating through the object tree is a bit clumsy, and I think it could be simplified. After a short while, I could find the CSDSDOCITMPEDX1 entry, which is the SQL View Name of the chosen CDS View.

The last step allows you to customize the object target name and displays the import status. SAP Data Warehouse Cloud can read the description of the view (and all imported fields), which is a nice feature and makes it easier to work with the data. Confirm your changes by clicking Import and Deploy:

When the import is successful, you can close the window. So far, only metadata about the view has been imported – but that’s not a problem as the SAP Data Warehouse Cloud can reach the data when needed.

You can’t expose tables for consumption by external applications, so we need to build a view on top of the imported data. You can do it in the Data Builder section. Open the View tab and choose New Graphical View:

To create a simple view in a graphical editor, just drag and drop the table name from the menu to the main screen. By default, views are not exposed for external consumption. But you can easily change that behaviour. Turn on the Expose for Consumption switch in the object properties.

That’s it! The data is available for reading from external applications, but part of the configuration is still missing.

CONFIGURE DATA WAREHOUSE CLOUD FOR EXTERNAL ACCESS

We require database user credentials to access data stored in the SAP Data Warehouse Cloud. Open the Space Management and choose Create database user in the Database Access tab.

The username consists of the space name with the chosen suffix. I’m working in the space name BJ_SPACE, and I chose my name as the suffix. So the username I can use to log in is BJ_SPACE#BARTOSZ. Click Create to confirm changes.

Deploy changes by clicking the Deploy button in the top right corner of the screen. Then click the Info button next to the username to display user information and request a password.

The Info screen is handy, showing all the details you need to establish a connection. You can find the full username, hostname, and port number here. Click Request New Password. Please take a copy of all settings as we’ll need them later in PowerBI and Synapse.

All good so far! The last thing to change is the firewall rules to allow access from your computer. Open the Configuration section and go to the IP Allowlist tab. Click Add to provide new entries.

In the new window, provide the Public IP of the computer that initiates the connection to the SAP Data Warehouse Cloud. It may be the IP of your local computer if you’re working with PowerBI Desktop. But when using PowerBI Service or Azure Data Factory, you should use the Public IP of the server that runs the On-Premises Data Gateway or the Integration Runtime.

Save your settings. It may take a couple of minutes to complete. Equipped with access credentials, we can connect and read data from the PowerBI.

CONNECTING POWER BI

We use the ODBC protocol to connect PowerBI to SAP Data Warehouse Cloud. It requires SAP HANA database drivers installed on the computer that initiates the connection. For PowerBI Desktop, you have to install drivers on your local computer.

You can download drivers from here:

https://tools.eu1.hana.ondemand.com/#hanatools

Open the ODBC Data Source Administration console once you complete the drivers installation. Verify that you can find them on the Drivers tab:

Go to User DSN tab and click Add button. Choose HDBODBC drivers. Then provide connection details:

Data source name: DWC

Database type: SAP HANA Cloud

Host: <HostName>:443

If you haven’t already, you can read the hostname and port number from the Info view where you created the database user in the SAP Data Warehouse Cloud.

I strongly recommend running the connection test, as it can highlight any connectivity problems. Everything worked fine for me.

You can now open PowerBI. Click Load Data -> More… button and choose ODBC connection type.

Choose previously created data source name. You don’t have to maintain any advanced options.

When asked for user credentials, provide the generated username and password. Remember that the username consists of the space name and selected suffix delimited with the # sign. Click Connect

Now you can select views that you import to the PowerBI:

All worked fine, and all data is available for further transformations…

…and reporting!

Getting data from SAP Data Warehouse cloud into PowerBI was an easy exercise! Let’s now import the same data set to Data Lake.

IMPORTING DATA TO THE LAKE

On a high level, copying data to Synapse follows the same principle as importing them to PowerBI. You need to install the ODBC drivers, this time on the server running the Self-Hosted Integration Runtime, as it’s the one that initiates the connection. But you don’t have to maintain entries in the ODBC Data Source Administrator console, as we can provide connection string directly in the Linked Service defined in the Azure Synapse.

Open Synapse Workspace and define a new Linked Service of type ODBC. Provide the name and choose Self-Hosted Integration Runtime. The connection strings have the following structure:

DRIVER=HDBODBC;

SERVERNODE=<hostname>:<port>

Don’t maintain the username and password directly in the connection string, as you can type them into authentication fields. Or even better, use Key Vault to keep your credentials secure.

Test the connection and save your settings.

Now, define a new data set that represents data from the source system and link it with the Linked Service we’ve just created. You can choose the table you want to extract from the Table name selection box:

The last thing left is creating the pipeline with Copy Data activity. Maintain the source and sink tabs, and we’re ready to test.

Publish your changes. To run the pipeline, select Add Trigger -> Now. You can check the progress of the extraction job in the Monitoring section.

It only took a couple of seconds to complete the extraction. All records have been processed, and I can see they are available in the Data Lake:

Integrating data from SAP Data Warehouse Cloud is very easy as you can use the ODBC to connect. Within a few minutes, you can connect the chosen BI solution or use Azure data integration tools to bring information to the Data Lake and make it available to blend with data coming from other sources. However, it would make sense to analyse the use case in the second case, as the direct connection to the source SAP system may be beneficial. There is a new connector available that can consume data using the SAP Operational Data Provisioning framework. It supports extracting data from CDS Views, SAPI Extractors and SLT. One of the advantages is that it understands delta extraction, which is very helpful when working with large and frequently changed datasets.