Predicting bad SAP performance (Part 1)

After a pause of more than two years, I finally decided to start another blog, or rather a whole series. I have always wondered why Machine Learning is hardly (if at all) used in the SAP basis realm.

  • Who has always had lots of structured data available? SAP basis people
  • Who handled big data when Big Data wasn’t even a buzzword yet? SAP basis people
  • Who had to learn scripting for survival? SAP basis people
  • Who had to learn dynamic SQL and stuff to get things done efficiently? SAP basis people

There are also some obstacles, of course. You do not develop schemas or dependencies yourself, but simply reverse engineer what SAP throws at you. Or, if you are really lucky, you can read a manual or some SAP Notes on what is officially provided. And while SAP systems could be roughly summarized as a collection of (business) interfaces, hardly anyone ever thought about what kind of interfaces SAP basis people need!

Personally, I wanted to become a Data Engineer to get a better grip on such issues. And the most fascinating topic for a Data Engineer is Machine Learning. So I am trying to leverage my SAP basis know-how and provide input for a Machine Learning project. Which brings me to the fundamental question:

What is Machine Learning good for?

My biased answer would be: mostly for these topics:

  • Classification/Clustering
  • Anomaly Detection
  • Anomaly Prediction

I am sure people will disagree and highlight additional use cases. For simplicity, I leave out chatbots, image recognition and the like. In the past, I had already ventured a little into the Classification and Anomaly Detection areas but found no compelling use case for them in the SAP basis realm. This made me wonder: what would be a great use case for Machine Learning in the SAP basis area?

There have been a few projects in the Anomaly Prediction area, in which a Machine Learning model was supposed to predict unplanned downtimes. Not very surprisingly, these projects failed (at least the ones I know of). The basic reason was that there were too few examples to learn from. After all, unplanned downtimes of productive systems are what is to be avoided at (almost) all costs. And if you do not have enough examples to learn from, then a Machine Learning project is doomed to fail.

There are lots of potential reasons for unplanned downtimes, and most of them are already addressed, e.g., by redundant hardware setups and the like. So a painful unplanned downtime of an important ERP system happens rarely, which makes this use case unsuitable for Anomaly Prediction.

If I want to do a successful project, I need to dial it down a little. What happens more often than unplanned downtimes, but is also a severe nuisance? Bad performance.

So finally, I have a goal that might bring some tangible results. If it is possible to predict that an SAP system will experience a bad performance incident soon, then SAP basis people might have a chance to prevent the issue or at least reduce the overall impact.
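To make "will experience a bad performance incident soon" a bit more concrete, here is a minimal sketch of how such a goal could be framed as a labeling problem for supervised learning. It is only an illustration under assumptions: the function name, the horizon parameter and the sample data are hypothetical and do not come from a real SAP system, and this is not necessarily the approach the series will end up using.

```python
from typing import List


def label_upcoming_incidents(incident: List[bool], horizon: int) -> List[int]:
    """For each time step t, return 1 if an incident occurs
    in steps t+1 .. t+horizon, otherwise 0."""
    labels = []
    for t in range(len(incident)):
        # Look only at the future window, not at the current step.
        window = incident[t + 1 : t + 1 + horizon]
        labels.append(int(any(window)))
    return labels


# Hypothetical data: one flag per monitoring interval,
# True = bad performance was observed in that interval.
observed = [False, False, False, True, False, False, False, False, True, False]

labels = label_upcoming_incidents(observed, horizon=2)
print(labels)  # [0, 1, 1, 0, 0, 0, 1, 1, 0, 0]
```

A model trained on such labels would learn to raise a warning one or two intervals before an incident. The more often incidents occur, the more positive labels there are to learn from, which is exactly why bad performance looks more promising than unplanned downtimes.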

Summary: My goal is to train a Machine Learning model to predict an upcoming bad performance incident for an SAP system.

P.S.: This is just the beginning of a potentially long series, going into technical details, both for SAP performance and Machine Learning. I will try to keep the blog posts short and focused on one specific topic, each time progressing towards the goal. Since this is my first try at Anomaly Prediction, I am sure there will be better ways to do this, and I always welcome feedback.