Uploading large training files with Service Ticket Intelligence

For machine learning to accurately understand and curate actions that cater to business requirements, both the quality and the quantity of training data are key ingredients for training efficient and effective models.

Large volumes of good, well-represented and labelled data inevitably translate to larger training files. In Service Ticket Intelligence, this becomes a problem because the service only allows training file uploads of up to 75MB. Users are then either limited in the number of records they can upload for model training or forced to use the more complex OData protocol to upload larger training files.

To address this limitation, a bulk file upload feature was introduced with the 2107 release of Service Ticket Intelligence. Users can now perform a multi-part file upload so that the model learns from a larger training dataset, which can potentially improve model performance significantly.

Bulk file upload can be used for all scenarios (classification, recommendation and clustering) supported by Service Ticket Intelligence.

Bulk File Upload Overview

Bulk File Upload Process

The process of uploading large files via bulk file upload is as follows:

1. File splitting

Split the CSV file into multiple smaller CSV files.

  • Each file part should not exceed 75MB
  • Ensure that the column names and their respective data are present and consistent
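The splitting step can be scripted. The following Python sketch is one possible approach and not part of the service: the function name split_csv is hypothetical, and reading 75MB as 75 × 1024 × 1024 bytes of raw UTF-8 CSV (rather than of the Base64-encoded payload) is an assumption. It repeats the header row in every part so that the column names stay consistent.

```python
import csv
import io

# 75MB limit from the text; interpreting it as binary megabytes of raw CSV is an assumption.
MAX_PART_BYTES = 75 * 1024 * 1024


def _encode_row(row):
    """Serialize one CSV row to UTF-8 bytes, quoting fields where needed."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(row)
    return buf.getvalue().encode("utf-8")


def split_csv(csv_text, max_bytes=MAX_PART_BYTES):
    """Return a list of CSV parts (bytes), each starting with the header row."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header_row = next(reader, None)
    if header_row is None:
        raise ValueError("CSV input is empty")
    header = _encode_row(header_row)

    parts = []
    current = [header]
    size = len(header)
    for row in reader:
        encoded = _encode_row(row)
        if len(header) + len(encoded) > max_bytes:
            raise ValueError("a single row does not fit within max_bytes")
        if size + len(encoded) > max_bytes:
            # Close the current part and start a new one with the same header.
            parts.append(b"".join(current))
            current = [header]
            size = len(header)
        current.append(encoded)
        size += len(encoded)
    if len(current) > 1:
        parts.append(b"".join(current))
    return parts
```

Parsing with the csv module instead of splitting on line breaks keeps records intact when a quoted field contains a newline. For very large files, a streaming variant that writes each part to disk would avoid holding the whole dataset in memory.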

2. Indicate bulk upload operation

Indicate in the file upload endpoint (/sti/training/model) that this operation is a multi-part file upload.

  • bulk_file_upload is set to true
  • bulk_file_upload_complete is set to false
  • Sample request body as follows:
{
  "scenario": {
    "desc": "Training Classification travel data with priority and keywords (small)",
    "type": "classification",
    "language": "en",
    "business_object": "ticket"
  },
  "options": {
    "bulk_file_upload": true,
    "bulk_file_upload_complete": false
  },
  "mapping": {
    "input": [ "description" ],
    "output": [ "category" ],
    "keywords": [
      {
        "columns": [ "description" ],
        "text": [ "Name", "Surname" ],
        "value": "1"
      }
    ]
  },
  "training": {
    "file": "c3ViamVjdCxib2R5LGFydGljbGVfaWQsYXJ0aWNsZV9kZXNjLGRhdGFzb3VyY2UsYXJ0aWNsZV91cmwKIkknbSBhIHRyaXBsZSBDYXByaWNvcm4gKFN1biwgTW9vbiBhbmQgYXNjZW5kYW50IGluIENhcHJpY29ybikgV2hhdCBkb2VzIHRoaXMgc2F5IGFib3V0IG1lPyIsLDEscmVsaWdpb3VzIHNlbnRpbWVudHMsbWluZHRvdWNoLHd3dy50aW1lc29maW5kaWEuY29tCldoYXQgc2hvdWxkIEkgZG8gdG8gYmUgYSBncmVhdCBnZW9sb2dpc3Q"
  }
}
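The file value in the sample body is the CSV content encoded as Base64. As a minimal sketch, the body for the first file part could be assembled in Python as shown here. The helper name build_first_part_body and the example scenario and mapping values are hypothetical; the field names are taken from the sample body.

```python
import base64
import json


def build_first_part_body(part_bytes, scenario, mapping):
    """Build the request body for the first file part of a bulk upload."""
    return {
        "scenario": scenario,
        "options": {
            # First part: announce a bulk upload that is not yet complete.
            "bulk_file_upload": True,
            "bulk_file_upload_complete": False,
        },
        "mapping": mapping,
        "training": {
            # The CSV bytes are sent as a Base64 string, as in the sample body.
            "file": base64.b64encode(part_bytes).decode("ascii"),
        },
    }


# Illustrative values only; replace them with your own scenario and mapping.
body = build_first_part_body(
    b"description,category\nlost bag,baggage\n",
    scenario={
        "desc": "Example scenario",
        "type": "classification",
        "language": "en",
        "business_object": "ticket",
    },
    mapping={"input": ["description"], "output": ["category"]},
)
print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent as the JSON body of the request to /sti/training/model. Authentication and the HTTP call itself are left out because they depend on your service instance.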

3. Last file part?

If two or more file parts remain to be uploaded, proceed to Step 4. If only the final file part remains, skip to Step 5.

4. Indicate reference model id for subsequent upload

To upload the next file part, reference the model id generated by the previous request when calling the file upload endpoint (/sti/training/model). Once done, repeat Step 3.

  • bulk_file_upload_reference_model_id is set to the model id from the previous request
  • bulk_file_upload_complete is false
  • Sample request body as follows:
{
  "bulk_file_upload_reference_model_id": "90c6daa1894143dda96410bd1e1b70c7",
  "bulk_file_upload_complete": false,
  "training": {
    "file": "c3ViamVjdCxib2R5LGFydGljbGVfaWQsYXJ0aWNsZV9kZXNjLGRhdGFzb3VyY2UsYXJ0aWNsZV91cmwKIkknbSBhIHRyaXBsZSBDYXByaWNvcm4gKFN1biwgTW9vbiBhbmQgYXNjZW5kYW50IGluIENhcHJpY29ybikgV2hhdCBkb2VzIHRoaXMgc2F5IGFib3V0IG1lPyIsLDEscmVsaWdpb3VzIHNlbnRpbWVudHMsbWluZHRvdWNoLHd3dy50aW1lc29maW5kaWEuY29tCldoYXQgc2hvdWxkIEkgZG8gdG8gYmUgYSBncmVhdCBnZW9sb2dpc3Q"
  }
}

5. Indicate bulk file upload operation complete

For the final file part, indicate in the request to the file upload endpoint (/sti/training/model) that the multi-part file upload is complete.

  • bulk_file_upload_reference_model_id is set to the model id from the previous request
  • bulk_file_upload_complete is true
  • Sample request body as follows:
{
  "bulk_file_upload_reference_model_id": "90c6daa1894143dda96410bd1e1b70c7",
  "bulk_file_upload_complete": true,
  "training": {
    "file": "c3ViamVjdCxib2R5LGFydGljbGVfaWQsYXJ0aWNsZV9kZXNjLGRhdGFzb3VyY2UsYXJ0aWNsZV91cmwKIkknbSBhIHRyaXBsZSBDYXByaWNvcm4gKFN1biwgTW9vbiBhbmQgYXNjZW5kYW50IGluIENhcHJpY29ybikgV2hhdCBkb2VzIHRoaXMgc2F5IGFib3V0IG1lPyIsLDEscmVsaWdpb3VzIHNlbnRpbWVudHMsbWluZHRvdWNoLHd3dy50aW1lc29maW5kaWEuY29tCldoYXQgc2hvdWxkIEkgZG8gdG8gYmUgYSBncmVhdCBnZW9sb2dpc3Q"
  }
}
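Steps 3 to 5 differ only in the value of bulk_file_upload_complete. The Python sketch below builds the bodies for every file part after the first and sets the flag to true only on the last one. The helper name build_followup_bodies is hypothetical, and the top-level placement of the fields mirrors the sample bodies in Steps 4 and 5.

```python
import base64


def build_followup_bodies(parts, reference_model_id):
    """Build request bodies for file parts 2..n of a bulk upload.

    `parts` holds the remaining CSV parts as bytes, in upload order.
    Only the final body has bulk_file_upload_complete set to True.
    """
    bodies = []
    last_index = len(parts) - 1
    for index, part in enumerate(parts):
        bodies.append({
            "bulk_file_upload_reference_model_id": reference_model_id,
            "bulk_file_upload_complete": index == last_index,
            "training": {"file": base64.b64encode(part).decode("ascii")},
        })
    return bodies
```

Each body is then sent in order to /sti/training/model. The reference model id is the one returned by the request for the first file part.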

6. Perform get model status

Perform a get model status call (/sti/training/model/status?model_id={{model_id}}).

  • Model status will transition from DATA_UPLOAD_IN_PROGRESS to DATA_PREPROCESSING_PENDING to NEW
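Because the status passes through intermediate values before it reaches NEW, a client typically polls the status endpoint. The sketch below shows one way to do this in Python. The function wait_until_new and its get_status parameter are hypothetical: get_status stands for any callable you provide that calls the status endpoint and returns the status string, since the response format is not described here. Treating any other status as an error is also an assumption.

```python
import time

# Intermediate statuses named in the text above.
IN_PROGRESS_STATUSES = ("DATA_UPLOAD_IN_PROGRESS", "DATA_PREPROCESSING_PENDING")


def wait_until_new(get_status, model_id, interval_seconds=5.0, max_attempts=60):
    """Poll get_status(model_id) until it returns "NEW".

    Raises RuntimeError on a status not named in the text and
    TimeoutError if "NEW" is not reached within max_attempts polls.
    """
    for _ in range(max_attempts):
        status = get_status(model_id)
        if status == "NEW":
            return status
        if status not in IN_PROGRESS_STATUSES:
            raise RuntimeError(f"unexpected model status: {status}")
        time.sleep(interval_seconds)
    raise TimeoutError(f"model {model_id} did not reach NEW after {max_attempts} polls")
```

Passing the status lookup in as a callable keeps the polling logic independent of the HTTP client and makes it easy to test with a stub.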

7. Trigger model training

Model training can be triggered when model status is NEW.


Additional Resources

Postman collection: https://github.com/SAP-samples/service-ticket-intelligence-postman-collection

Bulk file upload API specs: https://help.sap.com/viewer/5088c3bb02144e7782959bb1529ca70e/SHIP/en-US/b0e6467e92c4455f83c0059b46fbe928.html

Bulk file upload examples: https://help.sap.com/viewer/5088c3bb02144e7782959bb1529ca70e/SHIP/en-US/f692293ec4ba4525b3755ef5f710dd3f.html

To try out other features (classification, recommendation and clustering scenarios), Service Ticket Intelligence is also available on SAP BTP as a free trial.

Find out how you can set up your own trial account with Service Ticket Intelligence here and give the service a try today!