Optimize DDL statements generated for type system update

I ran recently in the situation, where the type system update ran longer than the entire time allocated for the cutover and wanted to share our findings and solution with the community as I received echos from other projects facing the same problem.

How runs the type system update?

The type system, configured through the *-items.xml files, defines the data model, translated into a database schema during the initialization or update of the platform. This translation is performed by the DdlUtils library from the Apache DB project: it generates the DDL statements required for creating or updating the database schema associated to a data model. While the library design allows supporting virtually any database software or version (see o.a.d.Platform and o.a.d.p.SqlBuilder), the implementation uses ANSI SQL to support a maximum of database software versions. Consequently, advanced options or capabilities available for only particular database software versions are often not supported by the library. For instance, ANSI SQL only allows adding one field to a table per statement while MySQL allows adding multiple fields to a table within one statement. The MySQL support in the DdlUtils library (see o.a.d.p.m.MySqlBuilder class) uses ANSI SQL instead of the MySQL specific SQL, generating one statement per field to add instead of one statement per table to modify. Note also that the library project is retired and not further maintained. Consequently, any functionality added recently by database vendors will very likely not be supported by the library.

The type system update can be ran either via hAC or via Ant. When ran through Ant, couple options are available: dry-run, type system only update or full system update. Although these different ways use at the end the same mechanism for generating the DDL statements updating the database schema, they use different technologies, complicating the design and implementation of customizations. hAC uses Spring MVC while the Ant target updatesystem uses a custom Ant task for the dry-run and type system only update options (see d.h.a.t.i.TypeSystemUpdaterTask) and a macro-based task otherwise.

Why did the type system update take so long?

The execution of the DDL statements generated by the type system update took a very long time. In my particular situation, new fields were added to itemtypes stored in very large tables (50+ millions records) and the type system update generated ALTER TABLE ADD statements for each new field. The database used with SAP Commerce was MySQL 5.6, which handles by default the addition of new field to a table as following: it creates a new table based on the old table definition, add the new field, copies then all the records from the old table into the new table and finally remove the old table. The copy of all records can take a significant amount of time, especially for large tables. Moreover, the DdlUtils library generates one statement per new field. In my situation, multiple times, several fields were added to the same itemtype, causing at the end MySQL to copy over and over the same large tables.

The first optimization was to add all new fields to a table in one statement, so that MySQL would only copy once a very large table. For example, instead of generating the following DDL statements:

ALTER TABLE stocklevels ADD p_sapwarehouseid VARCHAR(255);
ALTER TABLE stocklevels ADD p_sapwarehousename VARCHAR(255);

It would generate the following DDL statement:

ALTER TABLE stocklevels ADD p_sapwarehouseid VARCHAR(255), p_sapwarehousename VARCHAR(255);

The second optimization after digging in the MySQL 5.6 documentation was to use the INPLACE instead of the COPY algorithm when adding a new field:

ALTER TABLE stocklevels ADD p_sapwarehouseid VARCHAR(255), p_sapwarehousename VARCHAR(255), ALGORITHM=INPLACE;

Note that the INPLACE algorithm cannot always be used. Fortunately, in our situation, it was possible to use for all new fields.

How to customize the generated DDL statements?

The proposed approach is the following:

  1. Customize the SqlBuilder implementation supporting the SAP Commerce database to generate the desired optimized DDL statements
  2. Expose the custom SqlBuilder via custom Platform and HybrisPlatformFactory classes
  3. Adapt SAP Commerce to ensure the custom HybrisPlatformFactory is invoked from hAC as well as Ant.

Let us see now more in details the code and configuration changes to perform for implementing this approach.

Customize the default SqlBuilder implementation

Instead of implementing from scratch the SqlBuilder class, it is preferable to extend the default implementation used by SAP Commerce for your database. The class d.h.b.d.HybrisPlatformFactory implements in the method createInstance the logic mapping SqlBuilder implementations to databases. Actually, it maps Platform implementations to databases but by navigating into the build method of the Platform implementation, you can easily find which SqlBuilder implementation is used.

As MySQL 5.6 was the database in our scenario, I extended the HybrisMySqlBuilder class as following:

public class MyHybrisMySqlBuilder extends HybrisMySqlBuilder { private final DatabaseSettings databaseSettings; public MyHybrisMySqlBuilder(final Platform platform, final DatabaseSettings databaseSettings) { super(platform, databaseSettings); this.databaseSettings = databaseSettings; } @Override public void processTableStructureChanges(final Database currentModel, final Database desiredModel, final Table sourceTable, final Table targetTable, final Map parameters, final List changes) throws IOException { final Iterator changesIterator = changes.iterator(); final List addColumnChanges = new ArrayList<>(changes.size()); while (changesIterator.hasNext()) { final Object change = changesIterator.next(); if (change instanceof AddColumnChange) { addColumnChanges.add((AddColumnChange) change); changesIterator.remove(); } } if (!addColumnChanges.isEmpty()) { final Map<Table, List> addColumnChangesByTables = addColumnChanges.stream().collect(Collectors.groupingBy(AddColumnChange::getChangedTable)); for (final Map.Entry<Table, List> entry : addColumnChangesByTables.entrySet()) { processAddColumnChanges(currentModel, desiredModel, entry.getKey(), entry.getValue()); } } super.processTableStructureChanges(currentModel, desiredModel, sourceTable, targetTable, parameters, changes); } protected void processAddColumnChanges(final Database currentModel, final Database desiredModel, final Table changedTable, final List changes) throws IOException { this.print("ALTER TABLE "); this.printlnIdentifier(this.getTableName(changedTable)); this.printIndent(); boolean isFirstAddColumnChange = true; for (final AddColumnChange change : changes) { if (isFirstAddColumnChange) { isFirstAddColumnChange = false; } else { this.print(", "); } this.print("ADD COLUMN "); this.writeColumn(change.getChangedTable(), change.getNewColumn()); if (change.getPreviousColumn() != null) { this.print(" AFTER "); this.printIdentifier(this.getColumnName(change.getPreviousColumn())); } else { this.print(" FIRST"); } change.apply(currentModel, this.getPlatform().isDelimitedIdentifierModeOn()); } final Optional algorithm = determineAlgorithmForAddColumnChanges(currentModel, desiredModel, changedTable, changes); if (algorithm != null) { this.print(", ALGORITHM=" + algorithm.get()); } final Optional lock = determineLockForAddColumnChanges(currentModel, desiredModel, changedTable, changes); if (lock != null) { this.print(", LOCK=" + lock.get()); } this.printEndOfStatement(); } protected Optional determineAlgorithmForAddColumnChanges(final Database currentModel, final Database desiredModel, final Table table, final List changes) { final List keys = Arrays.asList( String.format("mysql.dt.ddl.alterTable.%s.addColumn.algorithm", getTableName(table)), String.format("mysql.dt.ddl.alterTable.%s.algorithm", getTableName(table)), "mysql.dt.ddl.alterTable.addColumn.algorithm", "mysql.dt.ddl.alterTable.algorithm" ); for (final String key: keys) { final String algorithm = getDatabaseSettings().getProperty(key); if (algorithm != null) { return Optional.of(algorithm.toUpperCase(Locale.ROOT)); } } return Optional.empty(); } protected Optional determineLockForAddColumnChanges(final Database currentModel, final Database desiredModel, final Table table, final List changes) { final List keys = Arrays.asList( String.format("mysql.dt.ddl.alterTable.%s.addColumn.lock", getTableName(table)), String.format("mysql.dt.ddl.alterTable.%s.lock", getTableName(table)), "mysql.dt.ddl.alterTable.addColumn.lock", "mysql.dt.ddl.alterTable.lock" ); for (final String key: keys) { final String lock = getDatabaseSettings().getProperty(key); if (lock != null) { return Optional.of(lock.toUpperCase(Locale.ROOT)); } } return Optional.empty(); } protected DatabaseSettings getDatabaseSettings() { return databaseSettings; }
}

Notice that the ALGORITHM and LOCK options can be configured via properties to give more flexibility. By default, both options are not configured and won’t appear in the DDL statements. In our case, the following properties were added to the local.properties to make use of them:

mysql.dt.ddl.alterTable.lock=NONE
mysql.dt.ddl.alterTable.algorithm=INPLACE

Be aware that ALGORITHM=INPLACE and/or LOCK=NONE are not always possible. Check therefore your changes and your database configuration allow it before using the properties.

Expose the custom SqlBuilder implementation

The SqlBuilder is exposed through the Platform instance (see getSqlBuilder() method), which is instantiated by the the factory HybrisPlatformFactory (see createInstance method). The factory calls the static method build() from the Platform implementation class to get a new Platform instance. Ideally, it should be possible with AspectJ to intercept calls to the build() method and inject the custom SqlBuilder in the new Platform instance. Unfortunately, it is not that simple since the setSqlBuilder() method is protected. Moreover, as detailed later, it will be needed to get a Platform instance setup with the custom SqlBuilder without using AspectJ. The chosen approach is consequently to:

  1. Extend the default Platform implementation for the SAP Commerce database to inject the custom SqlBuilder
  2. Extend the HybrisPlatformFactory to return the custom Platform

Like SqlBuilder, check the implementation of the createInstance() method in the HybrisPlatformFactory class to find the default implementation used by SAP Commerce for the database. For MySQL 5.6, it is the HybrisMySqlPlatform. Unfortunately, the class has a private constructor and doesn’t allow to be extended, which forces to copy & paste its code to extend it. Notice that the implementations for other databases do not have this limitation and it might be therefore easier than for MySQL.

public class MyHybrisMySqlPlatform extends MySql50Platform implements HybrisPlatform { private static final String MYSQL_ALLOW_FRACTIONAL_SECONDS = "mysql.allow.fractional.seconds"; private final boolean isFractionalSecondsSupportEnabled; protected MyHybrisMySqlPlatform(boolean isFractionalSecondsSupportEnabled) { this.isFractionalSecondsSupportEnabled = isFractionalSecondsSupportEnabled; } public static HybrisPlatform build(DatabaseSettings databaseSettings) { Objects.requireNonNull(databaseSettings); boolean allowFractionaSeconds = Boolean.parseBoolean(databaseSettings.getProperty("mysql.allow.fractional.seconds", Boolean.TRUE.toString())); MyHybrisMySqlPlatform instance = new MyHybrisMySqlPlatform(allowFractionaSeconds); instance.provideCustomMapping(); instance.setSqlBuilder(new DTHybrisMySqlBuilder(instance, databaseSettings)); MySql50ModelReader reader = new MySql50ModelReader(instance); reader.setDefaultTablePattern(databaseSettings.getTablePrefix() + '%'); instance.setModelReader(reader); return instance; } private void provideCustomMapping() { PlatformInfo platformInfo = this.getPlatformInfo(); platformInfo.setMaxColumnNameLength(30); platformInfo.addNativeTypeMapping(-1, "TEXT"); platformInfo.addNativeTypeMapping(12002, "BIGINT", -5); platformInfo.addNativeTypeMapping(12000, "TEXT", -1); platformInfo.addNativeTypeMapping(12003, "LONGTEXT", -1); platformInfo.addNativeTypeMapping(12001, "TEXT", -1); platformInfo.addNativeTypeMapping(12, "VARCHAR", 12); platformInfo.setDefaultSize(12, 255); platformInfo.addNativeTypeMapping(6, "FLOAT{0}"); platformInfo.setHasPrecisionAndScale(6, true); if (this.isFractionalSecondsSupportEnabled) { platformInfo.setHasSize(93, true); platformInfo.setDefaultSize(93, 6); } } public String getColumnName(Column column) { return ((HybrisMySqlBuilder)this.getSqlBuilder()).getColumnName(column); } public String getTableName(Table table) { return this.getSqlBuilder().getTableName(table); }
}

Note that the copy & paste was done from the HybrisMySqlPlatform class present in the SAP Commerce 1808 release. I would highly recommend not to copy & paste this code but rather starts from the code of your release and adapt the line(s) setting the SqlBuilder.

The next step is to extend the standard HybrisPlatformFactory to expose the custom Platform, which exposes the custom SqlBuilder with the optimizations for the DDL statements.

public class MyHybrisPlatformFactory { private volatile static Platform platform; static { PlatformFactory.registerPlatform("MySQL5", MyHybrisMySqlPlatform.class); PlatformFactory.registerPlatform("MySQL", MyHybrisMySqlPlatform.class); } public static Platform createInstance(final DatabaseSettings databaseSettings, final DataSource dataSource) throws DdlUtilsException { final Platform platform = createInstance(databaseSettings); platform.setDataSource(dataSource); return platform; } public static Platform createInstance(final DatabaseSettings databaseSettings) throws DdlUtilsException { if (Boolean.valueOf(databaseSettings.getProperty("mysql.dt.ddl.enabled", Boolean.FALSE.toString())) && databaseSettings.getDataBaseProvider() == DataBaseProvider.MYSQL) { databaseSettings.setStatementDelimiter(";"); return (platform = MyHybrisMySqlPlatform.build(databaseSettings)); } else { return HybrisPlatformFactory.createInstance(databaseSettings); } } public static synchronized Platform getInstance() { if (platform == null) { return HybrisPlatformFactory.getInstance(); } else { return platform; } }
}

Adapt SAP Commerce to call the custom factory HybrisPlatformFactory

As mentioned earlier, the type system update and therefore the factory HybrisPlatformFactory are accessed through different channels, using different technologies. hAC is based on the SAP Commerce platform while the Ant target updatesystem uses Ant task and Ant macros.

Packaging all customizations in a JAR

Before digging into the required adaptations, it is important to notice that the custom HybrisPlatformFactory class as well as other custom classes have to be in the classpath of the SAP Commerce platform as well as in the Ant classpath in order to be invoked. Consequently, it is not possible to store these custom classes within the source folder of an extension as Ant could not access them. Therefore, it shall rather be packaged in a JAR and placed in the lib folder of an extension, which will automatically be part of the SAP Commerce platform classpath and can be referenced in Ant.

Find below the pom.xml of my Maven project used for building this JAR library.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> <modelVersion>4.0.0</modelVersion> <groupId>[...]</groupId> <artifactId>[...]</artifactId> <version>1.0</version> <properties> <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> <maven.compiler.source>1.8</maven.compiler.source> <maven.compiler.target>1.8</maven.compiler.target> </properties> <dependencies> <dependency> <groupId>org.apache.ddlutils</groupId> <artifactId>ddlutils</artifactId> <version>1.0</version> <scope>provided</scope> </dependency> <dependency> <groupId>org.aspectj</groupId> <artifactId>aspectjweaver</artifactId> <version>1.9.0</version> <scope>provided</scope> </dependency> <dependency> <groupId>org.apache.ant</groupId> <artifactId>ant</artifactId> <version>1.9.1</version> <scope>provided</scope> </dependency> <dependency> <groupId>de.hybris</groupId> <artifactId>ybootstrap</artifactId> <version>1808</version> <scope>system</scope> <systemPath>[...]/hybris/bin/platform/bootstrap/bin/ybootstrap.jar</systemPath> </dependency> <dependency> <groupId>de.hybris</groupId> <artifactId>yant</artifactId> <version>1808</version> <scope>system</scope> <systemPath>[...]/hybris/bin/platform/bootstrap/bin/yant.jar</systemPath> </dependency> </dependencies>
</project>

Ensure to adapt the versions of the dependencies based on your SAP Commerce release.

Adaptations for hAC

The call from hAC to the factory HybrisPlatformFactory is more or less hard-coded and AspectJ is consequently the easiest way to adapt SAP Commerce, so that  it uses the custom factory. The following aspect intercepts the calls made to the createInstance() method from the standard HybrisPlatformFactory and invokes the same method from the custom factory instead.

@Aspect
public class MyHybrisPlatformFactoryAspect { @Pointcut("execution(static * de.hybris.bootstrap.ddl.HybrisPlatformFactory.createInstance(de.hybris.bootstrap.ddl.DatabaseSettings))") public void createInstance() {} @Pointcut("execution(static * de.hybris.bootstrap.ddl.HybrisPlatformFactory.getInstance())") public void getInstance() {} @Around("createInstance()") public Object aroundCreateInstance(final ProceedingJoinPoint pjp) throws Throwable { final DatabaseSettings databaseSettings = (DatabaseSettings) pjp.getArgs()[0]; return MyHybrisPlatformFactory.createInstance(databaseSettings); } @Around("getInstance()") public Object aroundGetInstance(final ProceedingJoinPoint pjp) throws Throwable { return MyHybrisPlatformFactory.getInstance(); }
}

The aspect should then be configured using the following META-INF/aop.xml file.

<!DOCTYPE aspectj PUBLIC "-//AspectJ//DTD//EN" "http://www.eclipse.org/aspectj/dtd/aspectj.dtd">
<aspectj> <weaver> <include within="de.hybris.bootstrap.ddl.HybrisPlatformFactory"/> <include within="MyHybrisPlatformFactoryAspect"/> </weaver> <aspects> <aspect name="MyHybrisPlatformFactoryAspect"/> </aspects>
</aspectj>

The aspect will be needed for the Ant adaptations and should therefore be packaged with the customizations in the JAR, including the AspectJ configuration. The configuration should also be placed under the resources folder of an extension to enable the AspectJ support in SAP Commerce.

Adaptations for Ant

The details about the Ant target updatesystem can be found in the platform/build.xml file. It uses the macro updateTypeSystem on one side for the dry-run and type system only update options and the macro updatesystem on the other side for the full system update option.

The updatesystem macro is defined in platformadministration.xml and uses the yrun macro to trigger the system update. The yrun macro starts a JVM to run code and accepts additional JVM arguments as well as additional classpath entries. It is consequently possible to activate AspectJ, add the JAR containing thet customizations to the classpath and reuse the aspect implemented for hAC adaptations to get the custom factory HybrisPlatformFactory invoked.

<yrun additionalclasspath="${ext.myext.path}/lib/mylib.jar" jvmargs="-javaagent:${ext.core.path}/lib/${aspect.weaver.library} ${aspect.weaver.config}"> [...]
</yrun>

I recommend to copy the definition from the standard updatesystem macro from platformadministration.xml and paste it in the buildcallbacks.xml of an extension. Then rename the macro definition (e.g. myupdatesystem) and adapt the yrun macro calls as shown above.

The updateTypeSystem macro is defined in platformadministration.xml file and is based on the Ant task d.h.a.t.i.TypeSystemUpdaterTask. Since the Ant task is called immediately by Ant and Ant does not embed AspectJ, it is not possible this time to reuse the aspect. The solution is consequently to customize the standard TypeSystemUpdaterTask class as well as the intermediate class HybrisSchemaGenerator to call the custom HybrisPlatformFactory class.

Find below the code for the customized HybrisSchemaGenerator class calling the custom factory HybrisPlatformFactory.

public class MyHybrisSchemaGenerator extends HybrisSchemaGenerator { private final DataSourceCreator dataSourceCreator; public MyHybrisSchemaGenerator(PlatformConfig platformConfig, PropertiesLoader propertiesLoader, DataSourceCreator dataSourceCreator, boolean dryRun) throws Exception { super(platformConfig, propertiesLoader, dataSourceCreator, dryRun); this.dataSourceCreator = dataSourceCreator; } public MyHybrisSchemaGenerator(PlatformConfig platformConfig, PropertiesLoader propertiesLoader, DataSourceCreator dataSourceCreator, DbTypeSystemProvider dbTypeSystemProvider, OverridenItemsXml overridenItemsXml, boolean dryRun) throws Exception { super(platformConfig, propertiesLoader, dataSourceCreator, dbTypeSystemProvider, overridenItemsXml, dryRun); this.dataSourceCreator = dataSourceCreator; } @Override protected Platform createDDLUtilsPlatform() { return MyHybrisPlatformFactory.createInstance(this.getDatabaseSettings()); } @Override protected Platform createConnectedDDLUtilsPlatform() { return MyHybrisPlatformFactory.createInstance(this.getDatabaseSettings(), this.dataSourceCreator.createDataSource(this.getDatabaseSettings())); }
}

Find below the code for the customized TypeSystemUpdaterTask class calling the customized HybrisSchemaGenerator class.

public class MyTypeSystemUpdaterTask extends TypeSystemUpdaterTask { @Override public void execute() throws BuildException { try { final HybrisSchemaGenerator schemaGenerator = this.getSchemaGenerator(); this.setScriptNames(schemaGenerator); this.executeSchemaGenerator(schemaGenerator); } catch (final Exception e) { throw new BuildException(e); } } protected String getPlatformHome() { return this.getProject().getProperty("platformhome"); } protected HybrisSchemaGenerator getSchemaGenerator() throws Exception { final PlatformConfig platformConfig = PlatformConfig.getInstance(ConfigUtil.getSystemConfig(this.getPlatformHome())); final PropertiesLoader propertiesLoader = new StandalonePropertiesLoader(platformConfig, this.getTenantId()); final DataSourceCreator dataSourceCreator = new DBCPDataSourceCreator(); final HybrisSchemaGenerator schemaGenerator = new MyHybrisSchemaGenerator(platformConfig, propertiesLoader, dataSourceCreator, this.isDryRun()); return schemaGenerator; }
}

The custom code shall be packaged in the JAR with the other customizations, so that it can be added to the Ant task definition classpath.

Finally, the updateTypeSystem needs to be customized to first invoke the customized TypeSystemUpdaterTask class and second include the JAR in the classpath. I recommend copying the original updateTypeSystem macro definition from platformadministration.xml into the buildcallbacks.xml of an extension and adapt the task definition as following.

<macrodef name="updateTypeSystem" ...> [...] <taskdef name="yMyTypeSystemUpdate" classname="MyTypeSystemUpdaterTask"> [...] <pathelement path="${ext.myext.path}/lib/mylib.jar"/> </taskdef> [...]
</macrodef>

Finally, either the Ant target updatesystem needs to be adapted to call the new macros or a new Ant target shall be introduced. I recommend introducing a new Ant target by copying the definition of the original one in order to keep the standard Ant target.

Conclusion

Optimizing the DDL statements helped in my situation to cut down the type system update from over a day to couple hours and fit easily inside the maintenance window allocated for Go-Live. However, it took time to implement and validate as multiple executions of the system update were required to ensure the customizations work properly. My advise out this experience for other projects would be to check at the end of each development sprint the execution time of the DDL statements generated by the type system update. It’s easy to collect the DDL statements from hAC or Ant and measure the execution time on a pre-production database. It will help to detect early if an optimization of the DDL statements is needed and implement it during development phase.