Similar Products

Find Similar Products in a product catalog.

The Similar Products endpoint uses machine learning to search through a product catalog for products that are similar to each other. Additional parameters allow you to specify which product information similarity is based on, and which products to compare.

You can use the Similar Products endpoint to do the following:

  • Clean up similar products accidentally added to a project twice.
  • Improve search engine optimization by removing duplicated data.

General Concepts

To make a request, specify two product sets using ProductSelectors, define how to measure similarity, and initiate the search. Then poll the search status endpoint until the results are available.

Product Selection

Product selection is the stage of request processing in which product catalogs are filtered down to the products to be compared. The search compares two sets of products, and you specify how each set is selected using a ProductSelector object. By default, all products in a project are selected for comparison. Optionally, you can filter by a set of product IDs or a set of product type IDs.

Product Selection Examples

Compare every product in the project “sunrise” to the product with the ID {product_id} in the project {project_key}

...
"productSetSelectors": [
{ "projectKey": "{project_key}",
"productIds": ["{product_id}"] },
{ "projectKey": "sunrise" }
]
...

Compare every product with type {product_type_id_1} with products with type {product_type_id_2}

...
"productSetSelectors": [
{ "projectKey": "{project_key}",
"productTypeIds": ["{product_type_id_1}"] },
{ "projectKey": "{project_key}",
"productTypeIds": ["{product_type_id_2}"] }
]
...

Compare the staged version of product {product_id_1} with the current version of products {product_id_2} and {product_id_3}

...
"productSetSelectors": [
{ "projectKey": "{project_key}",
"productIds": ["{product_id_1}"],
"staged": true },
{ "projectKey": "{project_key}",
"productIds": ["{product_id_2}", "{product_id_3}"],
"staged": false }
]
...

Similarity Specification

The similarityMeasures attribute defines which aspects of a product to use to calculate product similarity, and how important each aspect is to the overall similarity score between products.

You can use the following attributes for comparisons:

  • name
  • description
  • price
  • variantCount
  • attribute

Similarity Measure Examples

Compare similarity using name, description and attribute, where the similarity of attribute is twice as important to overall similarity:

...
"similarityMeasures": {
"name": 1,
"description": 1,
"attribute": 2
}
...

Compare similarity based on price and attribute, where both are equally important to overall similarity:

...
"similarityMeasures": {
"price": 1,
"attribute": 1
}
...
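
The weights in these examples are relative, so it can help to see them as shares of the total. The following is a minimal sketch only: the helper name relative_importance is hypothetical, and how the service combines the weights internally is not described here. The requirement that at least one weight be greater than 0 is likewise an assumption.

```python
from typing import Dict

# Measure names listed in this section.
ALLOWED_MEASURES = {"name", "description", "price", "variantCount", "attribute"}


def relative_importance(measures: Dict[str, int]) -> Dict[str, float]:
    """Return each measure's weight as a fraction of the total weight."""
    for key, weight in measures.items():
        if key not in ALLOWED_MEASURES:
            raise ValueError(f"unknown similarity measure: {key}")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(weight, bool) or not isinstance(weight, int) or weight < 0:
            raise ValueError(f"weight for {key} must be a non-negative integer")
    total = sum(measures.values())
    if total == 0:
        # Assumption: a request with only zero weights is not meaningful.
        raise ValueError("at least one weight must be greater than 0")
    return {key: weight / total for key, weight in measures.items()}


shares = relative_importance({"name": 1, "description": 1, "attribute": 2})
print(shares)  # {'name': 0.25, 'description': 0.25, 'attribute': 0.5}
```

With the first example above, attribute accounts for half of the total weight, which matches "twice as important" relative to name or description.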

Asynchronous Requests

Requests for similarity searches are asynchronous. The number of products being searched for similarity affects the time required to perform the search. First, send a request to initiate a search and then poll the search status endpoint until the search is complete. A product similarity search can have two statuses:

  • PENDING: A search started and is awaiting completion.
  • SUCCESS: A search completed successfully.

If a search is unsuccessful, an error response is returned. After a search succeeds, you can access the results using the search status endpoint for one day.
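
The polling flow described above can be sketched as a small loop. This is an illustration under assumptions, not a client implementation: wait_for_result and its parameters are hypothetical names, the HTTP call is passed in as a function so the loop stays independent of any HTTP library, and the shape of an error response is not specified here, so anything other than PENDING or SUCCESS is treated as a failure.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll until the task state is SUCCESS, then return its result."""
    for _ in range(max_attempts):
        status = fetch_status()
        state = status.get("state")
        if state == "SUCCESS":
            return status.get("result")
        if state != "PENDING":
            # Assumption: a response without a known state is an error response.
            raise RuntimeError(f"unexpected response: {status}")
        sleep(interval_seconds)
    raise TimeoutError("search is still PENDING after the last attempt")


# Usage with canned responses instead of real HTTP calls.
responses = iter([
    {"state": "PENDING"},
    {"state": "PENDING"},
    {"state": "SUCCESS", "result": {"count": 0, "results": []}},
])
waits = []
result = wait_for_result(lambda: next(responses), interval_seconds=0.5, sleep=waits.append)
```

Larger product sets take longer to search, so the interval and attempt count are worth tuning to the size of the catalog.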

Representations

ProductSelector

A set of ProductData for comparison. If no optional attributes are specified, all current ProductData are selected for comparison. See Product Selection for more details and examples.

  • projectKey - String - Required
    The project containing the product set.
  • productIds - Array of String - Optional
    An array of Product IDs to compare. If unspecified, no Product ID filter is applied.
  • productTypeIds - Array of String - Optional
    An array of product type IDs. Only products with product types in this array are compared. If unspecified, no product type filter is applied.
  • staged - Boolean - Optional
    Specifies use of staged or current product data. Default: false (i.e. use current ProductData).
  • includeVariants - Boolean - Optional
    Specifies use of product variants. If set to true, all product variants are compared, not just the master variant. Default: false.
  • productSetLimit - Optional
    Maximum number of products to check. Default: 10000, which is also the maximum number of products that can be requested for a similarity search. If you need a higher limit, contact the Support Portal.

Default ProductSelector

Compare all products within a project.

[
{"projectKey": "{project_key}"},
{"projectKey": "{project_key}"}
]
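
A ProductSelector can also be assembled programmatically. The sketch below is illustrative only: make_product_selector is a hypothetical helper, it uses only the fields listed above, and the lower bound of 1 for productSetLimit is an assumption. Optional fields are omitted unless set, so the documented defaults apply.

```python
from typing import Any, Dict, List, Optional

MAX_PRODUCT_SET_LIMIT = 10000  # default and maximum stated in this section


def make_product_selector(
    project_key: str,
    product_ids: Optional[List[str]] = None,
    product_type_ids: Optional[List[str]] = None,
    staged: bool = False,
    include_variants: bool = False,
    product_set_limit: int = MAX_PRODUCT_SET_LIMIT,
) -> Dict[str, Any]:
    """Build one ProductSelector as a plain dict (hypothetical helper)."""
    if not project_key:
        raise ValueError("projectKey is required")
    # Assumption: a limit below 1 is not meaningful.
    if not 1 <= product_set_limit <= MAX_PRODUCT_SET_LIMIT:
        raise ValueError("productSetLimit must be between 1 and 10000")
    selector: Dict[str, Any] = {"projectKey": project_key}
    if product_ids:
        selector["productIds"] = list(product_ids)
    if product_type_ids:
        selector["productTypeIds"] = list(product_type_ids)
    if staged:
        selector["staged"] = True
    if include_variants:
        selector["includeVariants"] = True
    if product_set_limit != MAX_PRODUCT_SET_LIMIT:
        selector["productSetLimit"] = product_set_limit
    return selector


# Staged data of one product compared against every current product.
selectors = [
    make_product_selector("{project_key}", product_ids=["{product_id}"], staged=True),
    make_product_selector("{project_key}"),
]
```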

SimilarityMeasures

Specify which ProductData attributes to use for estimating similarity and how to weight them. An attribute’s weight can be any non-negative integer (0 or greater). The larger the integer, the higher its weight. See Similarity Specification for more details and examples.

  • name - Integer - Optional
    Importance of the name attribute in overall similarity. Default: 1.
  • description - Integer - Optional
    Importance of the description attribute in overall similarity. Default: 1.
  • attribute - Integer - Optional
    Importance of the product variant’s attribute values in overall similarity. Default: 1.
  • variantCount - Integer - Optional
    Importance of the number of product variants in overall similarity. Default: 0.
  • price - Integer - Optional
    Importance of the price attribute in overall similarity. Default: 0.

Default SimilarityMeasures

{
"name": 1,
"description": 1,
"attribute": 1
}

SimilarProductSearchRequest

  • limit - Number - Optional - Range: [1-20] - Default: 3
  • offset - Number - Optional
  • language - String - Default: en
    An IETF language tag used to prioritize language for text comparisons.
  • currency - String - Default: EUR
    The three-letter ISO 4217 currency code to compare prices in. When a product has multiple prices, all prices for the product are converted to the currency provided by the currency attribute and the median price is calculated for comparison. Currencies are converted using the ECB currency exchange rates at the time the request is made. Of the ISO 4217 currency codes, only currencies with exchange rates provided by the ECB are supported.
  • similarityMeasures - SimilarityMeasures - Optional
    Defines the attributes taken into account to measure product similarity. Default: Default SimilarityMeasures.
  • productSetSelectors - Array of length 2 of ProductSelector - Optional
    Default: Default ProductSelector.
  • confidenceMin - Float - Range [0.0 - 1.0] - Default: 0.01
    See Confidence Filtering.
  • confidenceMax - Float - Range [0.0 - 1.0] - Default: 1.0
    See Confidence Filtering.
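
The ranges listed above can be checked on the client before a request is sent. The following is a sketch under assumptions: make_search_request is a hypothetical helper, and the rules that offset is non-negative and that confidenceMin does not exceed confidenceMax are not stated in this section.

```python
from typing import Any, Dict, List, Optional


def make_search_request(
    selectors: Optional[List[Dict[str, Any]]] = None,
    measures: Optional[Dict[str, int]] = None,
    limit: int = 3,
    offset: Optional[int] = None,
    confidence_min: float = 0.01,
    confidence_max: float = 1.0,
) -> Dict[str, Any]:
    """Validate documented ranges and return a request body as a dict."""
    if not 1 <= limit <= 20:
        raise ValueError("limit must be in the range 1-20")
    if offset is not None and offset < 0:
        raise ValueError("offset must not be negative")  # assumption
    # Assumption: the minimum must not exceed the maximum.
    if not 0.0 <= confidence_min <= confidence_max <= 1.0:
        raise ValueError("confidence bounds must satisfy 0.0 <= min <= max <= 1.0")
    if selectors is not None and len(selectors) != 2:
        raise ValueError("productSetSelectors must contain exactly two selectors")
    body: Dict[str, Any] = {
        "limit": limit,
        "confidenceMin": confidence_min,
        "confidenceMax": confidence_max,
    }
    if offset is not None:
        body["offset"] = offset
    if selectors is not None:
        body["productSetSelectors"] = selectors
    if measures is not None:
        body["similarityMeasures"] = measures
    return body


body = make_search_request(
    selectors=[{"projectKey": "{project_key}"}, {"projectKey": "{project_key}"}],
    measures={"name": 1},
)
```

Passing the returned dict to json.dumps produces a body suitable for the -d argument of a curl call.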

SimilarProduct

One part of a SimilarProductPair. Refers to a specific ProductVariant.

  • product - Reference to a Product
  • variantId - Integer - Required
    ID of the ProductVariant that was compared.
  • meta - SimilarProductMeta - Optional
    Supplementary information about the data used for similarity estimation. This information helps you understand the estimated confidence score, but it should not be used to identify a product.

SimilarProductMeta

  • name - LocalizedString - Optional
    Localized product name used for similarity estimation.
  • description - LocalizedString - Optional
    Localized product description used for similarity estimation.
  • price - Money - Optional
    The product price in cents, in the currency defined in SimilarProductSearchRequest. If multiple prices exist, the median value is taken as a representative amount.
  • variantCount - Integer - Optional
    Total number of variants associated with the product.
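
To illustrate the median rule for price, here is a small worked example. It is a sketch only: the exchange rates are made up for the arithmetic and are not ECB data, representative_price_cents is a hypothetical name, and any rounding the service applies is not specified here.

```python
from statistics import median
from typing import Dict, List, Tuple

# Hypothetical conversion rates to EUR, for illustration only.
RATES_TO_EUR: Dict[str, float] = {"EUR": 1.0, "USD": 0.5, "GBP": 2.0}


def representative_price_cents(
    prices: List[Tuple[int, str]],
    rates: Dict[str, float] = RATES_TO_EUR,
) -> float:
    """Convert each (cent amount, currency) price and return the median."""
    converted = [cent_amount * rates[currency] for cent_amount, currency in prices]
    return median(converted)


# Converted amounts are 1000.0, 1500.0 and 800.0, so the median is 1000.0.
print(representative_price_cents([(1000, "EUR"), (3000, "USD"), (400, "GBP")]))  # 1000.0
```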

SimilarProductPair

A pair of SimilarProducts.

  • confidence - Float - Range [0.0 - 1.0].
    The probability of product similarity.
  • products - Array of two SimilarProduct
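
A list of SimilarProductPairs can be reduced to the fields most useful for review, for example when looking for likely duplicates. This is a minimal sketch: summarize_pairs and its threshold of 0.5 are hypothetical choices, and the filtering happens on the client, separately from the confidenceMin and confidenceMax request parameters.

```python
from typing import Any, Dict, List, Tuple


def summarize_pairs(
    pairs: List[Dict[str, Any]],
    confidence_min: float = 0.5,
) -> List[Tuple[float, str, str]]:
    """Return (confidence, first product ID, second product ID) rows."""
    rows = []
    for pair in pairs:
        if pair["confidence"] < confidence_min:
            continue  # skip pairs below the chosen threshold
        first, second = pair["products"]
        rows.append((pair["confidence"], first["product"]["id"], second["product"]["id"]))
    # Highest confidence first.
    return sorted(rows, key=lambda row: row[0], reverse=True)


pairs = [
    {"confidence": 0.2, "products": [
        {"product": {"id": "a", "typeId": "product"}, "variantId": 1},
        {"product": {"id": "b", "typeId": "product"}, "variantId": 1}]},
    {"confidence": 0.68427, "products": [
        {"product": {"id": "c", "typeId": "product"}, "variantId": 1},
        {"product": {"id": "d", "typeId": "product"}, "variantId": 1}]},
]
rows = summarize_pairs(pairs)
```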

SimilarProductSearchRequestMeta

Metadata about the search parameters.

TaskStatus

Response wrapper for an Asynchronous Request.

  • state - String
    One of PENDING or SUCCESS, as described in Asynchronous Requests.
  • expires - DateTime
    The expiry date of the result. When the expiry date is reached, the result is no longer accessible. Expiry is currently set at 1 day after the result first becomes available.
  • result - Any Type
    The data that represents the response to an asynchronous request. Only populated when the state is SUCCESS.
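
A client that stores a TaskStatus may want to check state and expires before using result. The sketch below makes two assumptions that this section does not confirm: expires is an ISO 8601 string (possibly ending in "Z"), and a value without a time zone is in UTC. The function name result_if_available is hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def result_if_available(status: Dict[str, Any], now: Optional[datetime] = None) -> Any:
    """Return the result of a TaskStatus, or None if pending or expired."""
    if status.get("state") != "SUCCESS":
        return None  # result is only populated once the state is SUCCESS
    expires = status.get("expires")
    if expires is not None:
        # Assumption: ISO 8601 text; a trailing "Z" is rewritten for fromisoformat.
        expiry = datetime.fromisoformat(expires.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)  # assumption: UTC
        if (now or datetime.now(timezone.utc)) >= expiry:
            return None
    return status.get("result")


fixed_now = datetime(2024, 1, 1, tzinfo=timezone.utc)
fresh = {"state": "SUCCESS", "expires": "2024-01-02T00:00:00.000Z", "result": {"count": 1}}
stale = {"state": "SUCCESS", "expires": "2023-12-31T00:00:00.000Z", "result": {"count": 1}}
```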

TaskToken

Representation of the URL path a client needs to poll to get the results of an Asynchronous Request.

  • taskId - String
    The ID for the task. Used to find the status of the task.
  • uriPath - String
    The URI path to poll for the status of the task.
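
The polling URL is the host followed by the path from the TaskToken. The sketch below is a hypothetical helper, not part of any SDK. It reads uriPath as described above and falls back to a location field, because the example response later in this document uses that name.

```python
from typing import Dict

EU_HOST = "https://ml-eu.europe-west1.gcp.commercetools.com"


def status_url(token: Dict[str, str], host: str = EU_HOST) -> str:
    """Join the host with the path carried by a TaskToken."""
    # Assumption: the path is in "uriPath", or in "location" as in the example response.
    path = token.get("uriPath") or token.get("location")
    if not path:
        raise ValueError("token has no path to poll")
    return host.rstrip("/") + "/" + path.lstrip("/")


token = {
    "taskId": "078b4eb3-8e29-1276-45b1-8964cf118707",
    "location": "/{project_key}/similarities/products/078b4eb3-8e29-1276-45b1-8964cf118707",
}
url = status_url(token)
```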

Endpoints

Initiation Endpoint

Host (Europe): https://ml-eu.europe-west1.gcp.commercetools.com
Host (United States): https://ml-us.europe-west1.gcp.commercetools.com
Endpoint: /{project_key}/similarities/products
Method: POST
OAuth2 Scopes: view_products:{project_key}
Response Representation: TaskToken
Request Representation: SimilarProductSearchRequest
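
As a sketch of how the initiation call might look from Python, the following builds the POST request with the standard library and stops short of sending it. The project key my-project and the token my-token are placeholders, and build_initiation_request is a hypothetical helper.

```python
import json
import urllib.request
from typing import Any, Dict

EU_HOST = "https://ml-eu.europe-west1.gcp.commercetools.com"


def build_initiation_request(
    project_key: str,
    access_token: str,
    body: Dict[str, Any],
    host: str = EU_HOST,
) -> urllib.request.Request:
    """Prepare the POST request that starts a similarity search."""
    url = f"{host}/{project_key}/similarities/products"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


request = build_initiation_request("my-project", "my-token", {"limit": 3})
# To send it: urllib.request.urlopen(request), then parse the body as a TaskToken.
```

Keeping request construction separate from sending makes it straightforward to inspect the URL, headers, and body before any network call is made.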

Status Endpoint

Host (Europe): https://ml-eu.europe-west1.gcp.commercetools.com
Host (United States): https://ml-us.europe-west1.gcp.commercetools.com
Endpoint: /{project_key}/similarities/products/status/{task_id}
Method: GET
OAuth2 Scopes: view_products:{project_key}
Response Representation: A TaskStatus whose result is a PagedQueryResult containing a results array of SimilarProductPairs, sorted by confidence score in descending order, and a meta field of type SimilarProductSearchRequestMeta.

Examples

Initiate a similarity search:

curl -X POST https://ml-eu.europe-west1.gcp.commercetools.com/{project_key}/similarities/products \
  -H "Content-Type: application/json" \
  -H 'Authorization: Bearer {access_token}' \
  -d '
{
  "limit": 3,
  "similarityMeasures": {
    "name": 1
  },
  "productSetSelectors": [
    {
      "projectKey": "{project_key}",
      "productTypeIds": [ "8b50b0b0-8091-8e32-4601-948a8b504606" ],
      "staged": true
    },
    {
      "projectKey": "{project_key}",
      "productTypeIds": [ "46068292-4a41-4601-948a-948a8b508b50" ],
      "staged": true
    }
  ]
}
'
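
Response to the initiation request:

{
  "taskId": "078b4eb3-8e29-1276-45b1-8964cf118707",
  "location": "/{project_key}/similarities/products/078b4eb3-8e29-1276-45b1-8964cf118707"
}

Poll for the status of the search using the returned path:

curl https://ml-eu.europe-west1.gcp.commercetools.com/{project_key}/similarities/products/078b4eb3-8e29-1276-45b1-8964cf118707 \
  -H 'Authorization: Bearer {access_token}'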

Response once the search has completed:

{
  "state": "SUCCESS",
  "result": {
    "count": 1,
    "total": 16,
    "offset": 0,
    "results": [
      {
        "confidence": 0.68427,
        "products": [
          {
            "product": {
              "id": "b0b08091-8e32-4601-948a-8b504606d3ac",
              "typeId": "product"
            },
            "variantId": 1,
            "meta": {
              "name": {
                "en": "White T-Shirt | Commercetools Hackathon Edition | Available in S/M/L"
              }
            }
          },
          {
            "product": {
              "id": "46014606-8b50-4606-8292-4a414601948a",
              "typeId": "product"
            },
            "variantId": 1,
            "meta": {
              "name": {
                "en": "Limited edition of the Commercetools T-Shirt - White Color - Now on Sale!"
              }
            }
          }
        ]
      },
      "... other similar products ..."
    ],
    "meta": {
      "similarityMeasures": {
        "name": 1
      }
    }
  }
}