How SAP Cloud Integration has improved over the last 3 years

I made a mistake, but it gave me some interesting information about SAP Cloud Integration (CPI). It seems like some parts have improved a lot over the last few years.

In April 2019 I was experimenting with recursion in SAP Cloud Integration (CPI). I wanted to see what was possible, and where the limits were, when an iFlow calls itself many times.

Fibonacci is one of the classic ways to explore a recursive algorithm, and I wrote a blog post back then on how to implement it in SAP CPI.

First off: it is a really bad idea to implement anything recursive this way. You are better off just writing a Groovy script for it or calculating the value directly.
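To show how little code the direct calculation takes, here is a minimal sketch in Java (close to what a Groovy script in CPI could do; this is an illustration, not the actual iFlow script). An iterative loop with BigInteger avoids both the recursion and integer overflow:

```java
import java.math.BigInteger;

public class Fib {
    // Direct iterative calculation: n additions, no recursion, no overflow
    static BigInteger fib(int n) {
        BigInteger a = BigInteger.ZERO, b = BigInteger.ONE;
        for (int i = 0; i < n; i++) {
            BigInteger next = a.add(b);  // F(i+1) + F(i)
            a = b;
            b = next;
        }
        return a;  // a ends up as F(n)
    }

    public static void main(String[] args) {
        System.out.println(fib(10));   // 55
        System.out.println(fib(100));  // 354224848179261915075
    }
}
```

Even F(100) finishes instantly this way, compared to millions of iFlow invocations.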

I was experimenting with parallel and sequential processing for this. I found that parallel performed worse, probably because it needed to keep a greater number of processes alive at any one time. As I recall, it was possible to run about N=20 for parallel and N=30 for sequential.

This week I was talking with a client about SAP Cloud Integration best practices, and we discussed calling iFlows using Process Direct. I remembered my nice Fibonacci iFlow and got the bright idea to call it with N=100 for the first time. It resulted in a lot of calls.

F(100) = F(99) + F(98)

F(99) = F(98) + F(97)

…

F(1) = 1

F(0) = 0
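That recurrence hides an exponential blow-up. Assuming each evaluation of F(n) for n ≥ 2 spawns two further iFlow calls (my reading of how the naive recursive iFlow fans out), the total number of invocations follows calls(n) = calls(n-1) + calls(n-2) + 1, which grows like the Fibonacci numbers themselves. A small Java sketch of the count:

```java
import java.math.BigInteger;

public class CallCount {
    // Invocations needed by a naive recursive F(n), assuming every call
    // for n >= 2 spawns two further calls:
    // calls(n) = calls(n-1) + calls(n-2) + 1, calls(0) = calls(1) = 1
    static BigInteger calls(int n) {
        BigInteger prev = BigInteger.ONE, curr = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            BigInteger next = curr.add(prev).add(BigInteger.ONE);
            prev = curr;
            curr = next;
        }
        return curr;
    }

    public static void main(String[] args) {
        System.out.println(calls(20));   // 21891
        System.out.println(calls(30));   // 2692537
        System.out.println(calls(100));  // 1146295688027634168201
    }
}
```

Under that assumption, N=100 would need roughly 1.1×10^21 invocations, so the nearly 6 million messages the run actually produced were nowhere near completing.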

After about 2 days, the team complained that we had nearly 6 million messages for the last week on our SAP CPI Neo tenant. It did have an impact on the performance of the system.

We could stop the processing by undeploying the iFlow, and that worked fine, but it does of course mean the iFlow is suddenly unable to call other iFlows.

It is now possible to have a huge fan-out into subsequent messages using Process Direct. That a process can keep that many iFlows active at one point in time is quite interesting; the state handling is a lot better than it used to be. Being able to string 6 million iFlow calls together in one flow does have a huge impact.


Monitor from Figaf

It also seems like the huge increase in processing has hardly any impact on the rest of the system. Here is a snapshot from the Figaf monitoring tool.

Notice: this was run on our partner tenant. I don’t know how this relates to customer productive instances.

The data covers the last 4 days. All tiles are scaled to the same size. The processed-messages tile normally sits below the others but was moved to the first screen because it was the most relevant here. The CPU and memory usage is fetched using private APIs, so we don’t really know how accurate it is.

The message processing stayed consistent over the full period.

The CPU load was around 40% during the processing, on one node, which means the Process Direct calls probably did not affect the other nodes.

Once the message processing stopped, it seems like we got some new CPI nodes. We have previously seen a new set of tenants being deployed, so something may have been behaving strangely on the system, leading to a new set of tenants being created. A few hours later another shift occurred and new nodes were assigned, probably as a result of a weekend patch.

I must say I’m quite impressed with how the system could perform and create new flows during this period, and that SAP has been able to make it much better performing and more reliable than it was 3 years ago, even for such a strange scenario as the one I was experimenting with.

There are 84 abandoned processes: messages that were not updated for a period of 24 hours and were then given up on. But that is all we see in terms of errors.

SAP Cloud Integration is not a good tool to perform math in. I’m using sum in XPath to calculate the values, and I’m sure it will cause some problems when adding such big numbers.
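The precision problem is concrete: XPath 1.0 numbers are IEEE 754 doubles, which carry only about 15-16 significant decimal digits, so a 21-digit result cannot survive the arithmetic intact. A Java illustration of the rounding (not the actual iFlow code):

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class Precision {
    // Exact F(n) with arbitrary-precision integers
    static BigInteger fib(int n) {
        BigInteger a = BigInteger.ZERO, b = BigInteger.ONE;
        for (int i = 0; i < n; i++) {
            BigInteger next = a.add(b);
            a = b;
            b = next;
        }
        return a;
    }

    public static void main(String[] args) {
        BigInteger exact = fib(100);
        // A double has a 53-bit mantissa; the 69-bit F(100) gets rounded
        // on the round trip through floating point
        BigInteger viaDouble = new BigDecimal(exact.doubleValue()).toBigInteger();
        System.out.println("Exact:          " + exact);
        System.out.println("Through double: " + viaDouble);
        System.out.println("Lossless? " + exact.equals(viaDouble)); // false
    }
}
```

So even if the recursion completed, the XPath sum would quietly return an approximation for large N.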

If you were wondering, F(100) is 354224848179261915075, a number with 21 digits.