Step 3: Real-Time Inference

In this part of our blog series, you'll learn how to make a real-time inference call with a request payload.

In Step 2, we learned how to train and serve the model using the Swagger API page. In this step, we'll learn how to make a real-time inference call using a clickstream. Here's a short video demonstrating how to do this:

Video Walk-through of Step 3

Inference Input and Output

To make the real-time inference call, navigate to the Inference section. There are three different inference endpoints; for this guide, we'll use the next-items endpoint. The details of each endpoint are described in the following documentation.

Inference APIs in the Swagger API Page

After you open the next-items dropdown, you must complete some actions similar to those during model training:

  1. Enter tenant name.
    You must use the same tenant name that you entered during the training process.
  2. Insert the payload.
    Here, you provide all the relevant inference input data in the payload. Each inference endpoint has its own requirements: for next_items, the items_ls parameter is required, while the other parameters are optional (they can be imputed). The items_ls parameter is a list of item_id values representing the user's past item interactions (the clickstream) from which the recommendations are generated.

For this parameter to be valid, each item_id in the list must meet at least one of the following requirements (a small validation sketch follows the list):

  • Correspond to an object entry in the item_catalogue training data used to train the model, or
  • Be provided as an entry in the metadata parameter as a cold start item, or
  • Be provided as a cold start item via the “metadata update” feature
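If you're preparing the payload programmatically, a quick client-side check against these requirements can save a failed call. Below is a minimal sketch; the catalogue and cold-start ID sets are placeholders standing in for your own training data, not anything the service exposes:

```python
# Hypothetical client-side check: the two sets below are placeholders for
# your own data; the service does not expose them directly.
known_catalogue_ids = {"2858", "1196", "260"}  # item_ids from your item_catalogue training data
cold_start_ids = set()                         # item_ids added via metadata / the metadata update feature

def unknown_items(items_ls):
    """Return the item_ids the model would not recognize."""
    valid = known_catalogue_ids | cold_start_ids
    return [item for item in items_ls if item not in valid]

print(unknown_items(["2858", "1234"]))  # -> ['1234']
```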

Taking an example from our sample dataset, insert a payload with the following content:
{ "items_ls": ["2858"] }

Request payload in the Swagger API Page

For more details of the payload input, refer to this documentation.
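Outside the Swagger page, you can make the same call with any HTTP client. Here's a minimal Python sketch using the requests library; the base URL, endpoint path, tenant parameter name, and token are placeholders, so copy the real values from your own Swagger API page and service key:

```python
import requests

# Placeholders -- copy the real values from your Swagger API page and service key.
BASE_URL = "https://<your-service-url>"        # hypothetical base URL
ENDPOINT = f"{BASE_URL}/inference/next-items"  # hypothetical path; check the Swagger page
TENANT = "my-tenant"                           # must match the tenant name used during training
TOKEN = "<your-bearer-token>"

payload = {"items_ls": ["2858"]}  # the sample payload from above

response = requests.post(
    ENDPOINT,
    params={"tenant_name": TENANT},  # hypothetical parameter name
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
)
print(response.status_code, response.text)
```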

After clicking Execute, you can expect one of the following responses (a short handler sketch follows the list):

  1. The training process has not finished yet. This returns a 404 code, stating that no model instances were found.

    Error Not Found in the API response

  2. The user entered an invalid payload. This returns a 400 code, stating that the model doesn't understand the payload request.

    Error Bad Request in the API response

  3. The model understands the request and successfully returns a set of recommendations. This returns a 200 code, listing the recommended items with their respective confidence scores.

    Recommendation result sample in the API response

  4. The user has exceeded their inference quota for the month. This returns a 403 code with a short message.

    Error Forbidden in the API response
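If you're scripting the call instead of using the Swagger page, the same four status codes can be handled in a few lines. A minimal sketch, reusing the response object from the request sketch above (the exact response bodies may differ from the Swagger examples):

```python
import requests

def handle_response(response: requests.Response) -> None:
    """Interpret the four status codes described above."""
    if response.status_code == 200:
        # Success: recommended items with their confidence scores.
        print(response.json())
    elif response.status_code == 400:
        print("Bad Request: the model could not understand the payload -- check items_ls.")
    elif response.status_code == 403:
        print("Forbidden: the monthly inference quota has been exceeded.")
    elif response.status_code == 404:
        print("Not Found: no model instances were found -- training may still be running.")
    else:
        print(f"Unexpected status code: {response.status_code}")
```

Call handle_response(response) with the response object from the earlier request sketch.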

Cheers

At this point, we have successfully made real-time inference calls. If you run into any issues, feel free to leave a comment below, and I'll help you out. Alternatively, check out the Q&A area in the community.

In the next step, you'll use your own data to train a model for your own use case. Feel free to follow my profile and stay tuned for the next steps. See you in the next blog!