Due to the feedback I have received and the task of setting up a new SAP Data Intelligence policy management, I thought a blueprint and a more elaborate description of the dipolicy script might be helpful. For an introduction to policy management please read my blog:
In a nutshell, these are the principles discussed in my previous blog that I followed when creating the blueprint.
Separation of “application” and “data” policies
I separate policies into
- “application” policies, which use only the resource type “application”, and
- “data” policies, which contain “connection” and “connectionContent” as resource types.
I prefer to combine these policies only when assigning them to users.
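To make the separation concrete, here is a minimal Python sketch. The dict layout with a `resources` list of `resourceType` entries is my illustrative assumption, not the exact SAP Data Intelligence policy schema:

```python
# Illustrative sketch (NOT the real DI policy schema): classify a policy
# as "application" or "data" by the resource types it contains.
APPLICATION_TYPES = {"application"}
DATA_TYPES = {"connection", "connectionContent"}

def classify_policy(policy: dict) -> str:
    """Return 'application', 'data' or 'mixed' for an assumed policy dict
    with a 'resources' list of {'resourceType': ...} entries."""
    types = {r["resourceType"] for r in policy.get("resources", [])}
    if types <= APPLICATION_TYPES:
        return "application"
    if types <= DATA_TYPES:
        return "data"
    return "mixed"

app_policy = {"id": "mycompany.basic.modeler",
              "resources": [{"resourceType": "application"}]}
data_policy = {"id": "mycompany.data.own",
               "resources": [{"resourceType": "connection"},
                             {"resourceType": "connectionContent"}]}
```

A “mixed” result is exactly what the separation principle tries to avoid at the basic-policy level.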
Avoid Resource Redundancies
Hierarchies are important for an intuitive understanding of how policies depend on each other. Nonetheless, I keep the very basic policies separate and inherit their resources only at the last level, which I use for group/user role definitions.
Create Role Policies
To keep all policy definitions in one place – the policy management – and not split them up between policies and user assignments, I create a role policy that eventually encompasses all policies that a user role or group needs. Only these role policies do I “expose” to enable the assignment to users. All other policies I flag as non-exposed.
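The role-policy idea can be sketched as follows; the field names (`exposed`, `inherits`) are made up for illustration and do not claim to match the real policy JSON:

```python
# Sketch with made-up field names: a role policy carries no resources of
# its own but pulls in base policies; only role policies are exposed.
policies = {
    "mycompany.basic.modeler":    {"exposed": False, "inherits": [],
                                   "resources": ["application:modeler"]},
    "mycompany.basic.connection": {"exposed": False, "inherits": [],
                                   "resources": ["application:connection"]},
    "mycompany.role.developer":   {"exposed": True,
                                   "inherits": ["mycompany.basic.modeler",
                                                "mycompany.basic.connection"],
                                   "resources": []},
}

def effective_resources(pid: str) -> set:
    """Collect the resources of a policy plus everything it inherits."""
    p = policies[pid]
    res = set(p["resources"])
    for parent in p["inherits"]:
        res |= effective_resources(parent)
    return res
```

Assigning the single exposed role policy to a user then grants all inherited resources at once.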
Application Policy Blueprint
The blueprint I am using as a starting point looks like the following:
Only the policies in the orange boxes are exposed and correspond to a user group. You can download these policies as a zip-file from my personal GitHub.
The basic idea is that I have 2 types of developers,
- developer and
- ml-developer,
where the ml-developer can use additional ml applications. Both roles can also use the Connection Management, and if a user is assigned the data policy “data.own”, she can create her own connections.
For the Metadata Explorer application I have split the role into 2 groups as well:
- catalog for the catalog and glossary management
- quality for the preparation and rulebook management
All users have the authorization to manage metadata. If you would like to have an additional role that has only “reading” rights, you need to create a new role with the policy “basic.metadata”. Metadata users cannot start the Connection Management and therefore cannot add new data sources. They depend on the data sources defined and assigned to them by the system administrator.
The 5th role is the omnipotent user that has unlimited rights to all applications.
There is not much to prepare because each customer has a different data source landscape. The basic data policies are
- *.data.own – allows users to read, write and manage the data of connections they have created on their own
- *.data.shared – encompasses all the data sources that every user has access to. Normally these are the central object stores like the DI_DATA_LAKE.
- *.data.all – Finally, we have the data system administrator who has access rights to all data sources. I would not use this role in a productive environment.
Installation and Configuration
As already outlined, I use a script when setting up a policy management. You can install the script with
pip install "diadmin>=0.0.24"
dipolicy --help
usage: dipolicy [-h] [-c CONFIG] [-g] [-d DOWNLOAD] [-u UPLOAD] [-m MYCOMPANY] [-z] [-f FILE] [-a]

Policy utility script for SAP Data Intelligence. Pre-requiste: vctl.

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG, --config CONFIG
                        Specifies yaml-config file
  -g, --generate        Generates config.yaml file
  -d DOWNLOAD, --download DOWNLOAD
                        Download specified policy. If wildcard '*' is used then policies are filtered or all downloaded.
  -u UPLOAD, --upload UPLOAD
                        Upload new policy (path). If path is directory all json-files uploaded. If path is a pattern like 'policies/mycompany.' all matching json-files are uploaded.
  -m MYCOMPANY, --mycompany MYCOMPANY
                        Replaces mycompany in policy name.
  -z, --zip             Zip policies
  -f FILE, --file FILE  File to analyse policy structure. If not given all policies are newly downloaded.
  -a, --analyse         Analyses the policy structure. Resource list is saved as 'resources.csv'.
The script needs a configuration file config.yaml that you can generate with

dipolicy -g

or create it manually:
URL : 'https://vsystem.ingress.xxxx.shoot.live.k8s-hana.ondemand.com'
TENANT: default
USER : user
PWD : 'userpwd123'
POLICIES_PATH : policies
POLICY_FILTER: mycompany   # regex match of policyID - use '.' for all. Used for analysis-option only
RESOURCE_CLASSES :
  connectionConfiguration: admin
  connection: admin
  connectionContent: data
  app.datahub-app-data.qualityDashboard: metadata
  app.datahub-app-core.connectionCredentials: metadata
  app.datahub-app-data.profile: metadata
  app.datahub-app-data.qualityRulebook: metadata
  app.datahub-app-data.system: metadata
  app.datahub-app-data.catalog: metadata
  app.datahub-app-data.glossary: metadata
  app.datahub-app-data.tagHierarchy: metadata
  app.datahub-app-data.qualityRule: metadata
  app.datahub-app-data.preparation: metadata
  app.datahub-app-data.publication: metadata
  application: application
  systemManagement: admin
  certificate: admin
  connectionCredential: admin
COLOR_MAP:
  admin: black
  metadata: green
  application: orange
  data: blue
  multiple: grey
The main use cases are to download/upload policies and to prepare them for further analysis.
For downloading policies you can specify with a regular expression which policies you would like to download, e.g.
dipolicy -d mycompany.basic -z -m pears
This downloads all policies that start with “mycompany.basic” (option: -d mycompany.basic) and saves them to the “policies” folder of the working directory. In addition, they are also zipped (option: -z). Note that existing files are overwritten without warning. The option -m replaces “mycompany” with the argument value “pears”.
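The filtering can be pictured as a simple regular-expression match against the policy IDs; this is a sketch of the idea, not the script's actual implementation:

```python
import re

def filter_policies(policy_ids, pattern):
    """Keep only the IDs that match the given regular expression
    from the beginning, like a '-d mycompany.basic' filter would."""
    rx = re.compile(pattern)
    return [pid for pid in policy_ids if rx.match(pid)]

ids = ["mycompany.basic.modeler", "mycompany.basic.metadata",
       "mycompany.role.developer", "sap.dh.admin"]
```

`filter_policies(ids, "mycompany.basic")` keeps only the first two entries; because the pattern is a regex, the dot matches any character.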
The uploading case is similar:
dipolicy -u ./policies/mycompany.basic -m pears
It uploads all policies starting with “mycompany.basic” to SAP Data Intelligence (option: -u). Due to the option -m the uploaded policies are renamed, that is, all “mycompany” occurrences are replaced by “pears”. Due to policy dependencies it could be that in the first run not all policies are uploaded; please have a look at the console warnings. The reason is that policies that inherit from other policies that do not yet exist will not be added. In this case you just have to run the command again.
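The retry behaviour can be sketched as a loop that keeps re-trying the failed policies until either everything is uploaded or no further progress is made. Here `upload_one` is a hypothetical stand-in for the actual upload call, not part of the diadmin package:

```python
def upload_until_done(policies, upload_one):
    """policies: iterable of policy IDs; upload_one(pid) -> bool,
    returning False while a parent policy is still missing.
    Returns the IDs that could not be uploaded at all."""
    pending = list(policies)
    while pending:
        failed = [pid for pid in pending if not upload_one(pid)]
        if len(failed) == len(pending):
            break  # no progress: a genuine dependency problem
        pending = failed
    return pending
```

Every pass uploads at least the policies whose parents already exist, so a dependency chain of depth n resolves after at most n passes.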
Graphical visualisations work far better for me than tabular ones. Therefore I added a simple network visualisation that you can call with
dipolicy -a -f policies.json
If the option -f (--file) is not given, all policies are downloaded from SAP Data Intelligence and saved to “policies.json” in the policies-path defined in config.yaml before the actual analysis starts. Note that with the config parameter “POLICY_FILTER” you can select the policies you would like to analyse, e.g. only your “mycompany.” policies.
The analysis-option produces 2 outcomes:
- A chart displaying the filtered policies with dependencies
- A resources.csv file for further analysis with e.g. Excel
For the blueprint policies, the chart of the mycompany policies looks like the following:
- “Diamond”: Exposed nodes
- “Circle”: Non-exposed nodes
Color coding for the policy classes:
- admin: black
- metadata: green
- application: orange
- data: blue
- multiple: grey
Number label: Policy number you find in the “resources.csv” file.
Both the classification of the resources and of the policies are configurable with “config.yaml”. There is also a threshold: if all of a policy's resources except at most the threshold number (and at least 90%) are of one type, then this nearly unique resource type is assigned to the policy. Otherwise the policy is labeled “multiple”.
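The described threshold rule could look roughly like this in Python; the parameter names and the exact rule are my reading of the behaviour, not the script's verified source:

```python
from collections import Counter

def classify_policy_type(resource_classes, threshold=1, min_share=0.9):
    """resource_classes: list of class labels ('admin', 'metadata', ...)
    for each resource of one policy. Assign the dominant class if at
    most `threshold` resources deviate AND the dominant class still
    covers at least 90% of the resources; otherwise 'multiple'."""
    counts = Counter(resource_classes)
    dominant, n = counts.most_common(1)[0]
    total = len(resource_classes)
    if total - n <= threshold and n / total >= min_share:
        return dominant
    return "multiple"
```

For example, nine “metadata” resources plus one “admin” resource would still be classified as “metadata”, while an even split would be “multiple”.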
This chart gives you a first glance at whether the policies look like you planned them or not. E.g. in my preparation I saw that I had a couple of nested policies I wanted to avoid.
I am aware that this blueprint is only a rough starting point for a company's user security, but hopefully it gives you at least some kind of kickstart.