Machine learning with SAP ECC 6.0 and SAP BW 7.50
What if you currently work on SAP ECC 6.0 and want to build a machine learning proof of concept, so that you are prepared when the time comes to migrate your ideas to an SAP landscape with the native machine learning capabilities of the SAP HANA Predictive Analysis Library?
I started working with SAP about 20 years ago and went through different stages of understanding business data in various implementation projects: master data, materials management, sales and distribution, plant maintenance, demand planning, business warehouse, integrated planning, and interfaces with satellite systems. A few years ago, machine learning started to flood the media. This caught my attention because I had seen so much data and I was curious about the revamped age of artificial intelligence that I had read about in science fiction books, whereas in reality all that had worked in the past were basic statistical models.
I completed various SAP courses on https://open.sap.com/ and at the same time started looking for a second opinion on the core solutions, so that I could break the concepts down into their basic elements. I found out that most of the data science ecosystem relies on Python and that the tech giants have democratized machine learning libraries. The most notable libraries are TensorFlow and Keras, both backed by Google.
Once you understand the basics, you want to understand the scale, and that inevitably leads you to Kaggle competitions. I took my chance by learning from and borrowing code shared in the Earthquake Kaggle competition.
A Kaggle competition is a fast start and gives you the real flavor of machine learning. The Kaggle playground works out of the box, and from the exploratory data analysis shared by the community you can piece together your own ideas. Once you are convinced that machine learning is real, you start asking whether it works because of that particular data or whether it might work with any data.
Probably one of the most common and important areas in a company for trying out machine learning is price prediction. Sales processes are probably the most detailed, with a lot of attributes in sales orders and sales invoices. Sales reporting relies heavily on standard reports, logical databases and profitability reports, consuming real-time data in ECC 6.0 or aggregated data in SAP BW 7.50.
I liked the SAP course Getting Started with Data Science (Edition 2021) because it points to so many public online sources that machine learning relies on, along with communities you can learn from and contribute to.
The hardest part is setting the roadmap: there are so many sources and libraries that you face the problem of overchoice. You know the quotation: “I can only show you the door, you have to walk through it”. 🙂
My roadmap for solving the problem of overchoice:
- The source data comes from reporting and contains categorical and numerical data.
- The data format can be TXT, CSV or XLS. I used XLS BEx workbooks from BW 7.50 because the data is convenient to update with the refresh functionality. If you intend to update the data continuously in a pipeline, you have to post the source data through an API.
- The environment for exploratory data analysis and training is Anaconda (anaconda.com) with Jupyter Notebook.
- The main library for data manipulation is pandas.
- The library for data preparation is scikit-learn (sklearn).
- The library for training is XGBoost. I like its built-in feature importance function.
- The library for graphing is plotly. I like how intuitively its figure objects are described with dictionaries.
- The library for the user interface is Voila, which serves ipywidgets. It is slow and sometimes refuses to start, but the advantage is that you don't need to adapt the notebook code. Otherwise I would use Streamlit, which is faster.
- The library for the API that SAP calls for predictions from an RFC is FastAPI.
- The ABAP interface for the HTTP request is IF_HTTP_CLIENT (see SE24).
- The ABAP class for JSON, the data-interchange format used with FastAPI, is /UI2/CL_JSON (see SE24).
- You can use the RFC for prediction in SAP ECC for real-time execution in custom reports.
- You can use the same RFC for prediction in an SAP BW process chain, in a routine of the data transfer process.
What is the point of all these steps, apart from building your knowledge with a proof of concept? Going deeper into the basic elements probably gives you more flexibility to combine different libraries and external API services to fill the gaps between different environments.
How did you solve the problem of overchoice in machine learning when checking out your ideas with a proof of concept?