Preventing SAP Customers from Leaking Secrets on GitHub

When I started writing this blog article, before typing a single word, I searched Google for “GitHub hardcoded credential data leak”. The very first result was a news story from the day before: “MEDICAL DATA LEAKS LINKED TO HARDCODED CREDENTIALS IN CODE”.

Yes, it really is that easy: an attacker can go to github.com, search for “Microsoft Office O365 SFTP password”, and find credentials giving access to 150,000 to 200,000 patient records from a health organization. It’s as simple as that!

And of course, this can happen to any company or organization in the world, even the most secure ones. Thousands of companies are currently exposed to exploitation of publicly hardcoded credentials without being aware of the risk. Why? Because they don’t know who is publishing their secrets on GitHub. Was it an employee who made a mistake and mixed up their corporate and private accounts? Was it a junior developer who had not yet taken the secure programming training and pushed code directly to GitHub in order to work from home? Was it an insider who decided to steal code from the internal corporate GitHub and share it publicly, as in the NVIDIA and Samsung cases? Or maybe a third-party cloud integrator with no security best practices and no guidelines for GitHub usage? There are hundreds of reasons why this kind of leak happens every day.

Of course, there are many scanners that can detect secrets hardcoded in source code. Even GitHub has an API key scanner. But most of them rely on regular expressions to identify standardized tokens (like API keys), and they miss non-standard tokens like passwords. The other problem with rule-based scanners is their very high false positive rate (estimated at 80% of all findings). Such a rate makes the remediation process unmanageable for developers.
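To make this limitation concrete, here is a minimal Python sketch of a rule-based scanner. The two regular expressions, the function name and the sample lines are assumptions made up for illustration; they are not the rules of GitHub's scanner or of any other real tool.

```python
import re

# Illustrative rules only; real rule-based scanners ship much larger rule sets.
# An AWS access key ID has a fixed, recognizable shape.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
# A password has no fixed shape, so a rule can only look at the variable name.
PASSWORD_ASSIGNMENT = re.compile(
    r"\bpassword\s*=\s*[\"']([^\"']*)[\"']", re.IGNORECASE
)

RULES = (("aws_key_id", AWS_KEY_ID), ("password", PASSWORD_ASSIGNMENT))


def scan(lines):
    """Return (line_number, rule_name, matched_text) for every rule hit."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for name, rule in RULES:
            match = rule.search(line)
            if match:
                findings.append((number, name, match.group(0)))
    return findings


SAMPLE = [
    'aws_key = "AKIAIOSFODNN7EXAMPLE"',   # structured token: found
    'password = "S3cure!Passphrase"',     # real hardcoded password: found
    'password = ""',                      # empty default: false positive
    'password = "<your-password-here>"',  # placeholder: false positive
    'db_secret = "S3cure!Passphrase"',    # same secret, other name: missed
]

for finding in scan(SAMPLE):
    print(finding)
```

On this sample the rules report lines 1 to 4 and stay silent on line 5. Half of the password findings are noise, and the one real password stored under an unexpected variable name goes undetected, which is exactly the trade-off described above.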

A high false positive rate is the worst enemy of an efficient source code scanner!

Two years ago we launched a brand new source code secret scanner called Credential Digger. What sets this scanner apart is its Machine Learning model, which can identify passwords and unstructured tokens in any source code. Today, Credential Digger is the only open source secret scanner that identifies unstructured tokens (passwords, passphrases, non-standard tokens, etc.) with a high precision rate, in addition to the standard keys (AWS, Azure, Google Cloud, AliCloud, etc.).

Credential Digger is now used by thousands of development teams to scan for and identify hardcoded secrets before they are published. Thanks to the pre-commit hook functionality and the automated Pull Request scans, integrating it into a secure development pipeline is very easy.
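For readers who have never written one, the idea behind a pre-commit check can be sketched in a few lines of Python. This is a generic illustration and not Credential Digger's own hook: the single regular expression is a hypothetical stand-in for a real scanner, and the function names are made up for this example.

```python
import re
import subprocess
import sys

# Hypothetical rule for illustration; a real hook would call a full scanner.
SECRET_RULE = re.compile(
    r"(password|secret|token)\s*[:=]\s*[\"'][^\"']+[\"']", re.IGNORECASE
)


def added_lines(diff_text):
    """Yield (path, text) for each line added in a unified diff.

    Simplification: an added line whose own content starts with "++ "
    would be mistaken for a file header.
    """
    path = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # "+++ b/src/app.py" -> "src/app.py"
            target = line[4:]
            path = target[2:] if target.startswith("b/") else target
        elif line.startswith("+"):
            yield path, line[1:]


def find_secrets(diff_text):
    """Return (path, stripped_line) for added lines that match the rule."""
    return [
        (path, text.strip())
        for path, text in added_lines(diff_text)
        if SECRET_RULE.search(text)
    ]


def main():
    """Scan the staged changes; return 1 if anything looks like a secret."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    findings = find_secrets(diff)
    for path, text in findings:
        print(f"possible hardcoded secret in {path}: {text}", file=sys.stderr)
    return 1 if findings else 0
```

Saved as an executable `.git/hooks/pre-commit` script that ends with `sys.exit(main())`, this would make Git abort the commit whenever the check returns a non-zero status, so the secret never reaches the repository history.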

Using tools like Credential Digger during the software development lifecycle will help you publish secure source code, free from hardcoded secrets. But when the leak comes from outside any corporate-driven process (as in the cases described above), scanning known projects is not sufficient.

For this reason, we decided to develop a real-time monitor that scans everything published on any GitHub platform. This monitor of course uses our Machine Learning model to identify all types of secrets. With the real-time GitHub monitor we can identify any published secret, even if it comes from a personal anonymous repository or an account a hacker uses to expose data. As soon as the secret becomes public, an alert is sent to the party concerned and the remediation process can start immediately.

All SAP customers are eligible for the Beta version of the Git Monitor, which detects in real time any of their secrets disclosed on any GitHub platform (public or corporate).

We are currently launching a Customer Pilot program on the SAP Cloud BTP for trying out the real-time monitoring service. In this pilot program, we offer each customer a dedicated instance that monitors their source code assets in real time, and we help them deploy and configure the open source version of the source code scanner.

You can reach out to me via my LinkedIn profile to get in touch with us and be part of the pilot program.

Now there is no excuse left for opening breaches in your cloud systems and public code!