Suppose you’re selling products with variations, such as t-shirts in different colors or smartphones with different memory capacities (and costs). Arranging your inventory data without ending up with duplicates can be tricky.

For example, say your site sells t-shirts and sweatshirts in many designs and colors. You can structure your data in two ways, with one record for each:

  • Product (product-level model).
  • Color variant (variant-level model).

This guide explains the main concepts of the variant-level model.

For more information about each model, see: Structure ecommerce product records.

Dataset example

The inventory contains two t-shirt designs (A and B) and two sweatshirt designs (C and D). Each design comes in several colors.

In your dataset, you can represent them by creating one record for each color variant of each item. Each record specifies the type, the design, the color, and the associated thumbnail. Here’s what your records look like:

json
[
  {
    "type": "t-shirt",
    "design": "B",
    "color": "blue",
    "thumbnail_url": "tshirt-B-blue.png"
  },
  {
    "type": "sweatshirt",
    "design": "C",
    "color": "red",
    "thumbnail_url": "sweatshirt-C-red.png"
  }
]

Going further, you could add to each record the other colors available for the same design. This lets your frontend display all the variants of a product (for example, as color swatches under the thumbnail), so users can discover them.

json
[
  {
    "type": "t-shirt",
    "design": "B",
    "color": "blue",
    "thumbnail_url": "tshirt-B-blue.png",
    "color_variants": ["orange", "teal", "yellow", "red", "green"]
  },
  {
    "type": "t-shirt",
    "design": "B",
    "color": "orange",
    "thumbnail_url": "tshirt-B-orange.png",
    "color_variants": ["blue", "teal", "yellow", "red", "green"]
  }
]
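If you generate records from a flat list of variants, the color_variants attribute can be computed rather than maintained by hand. The following is a minimal Python sketch, not an official tool: the function name add_color_variants is hypothetical, and it assumes that a product is identified by its type and design.

```python
from collections import defaultdict


def add_color_variants(records):
    """Return copies of the records, each with the colors of its sibling variants."""
    # Collect every color seen for each (type, design) pair, in first-seen order.
    colors_by_product = defaultdict(list)
    for record in records:
        key = (record["type"], record["design"])
        if record["color"] not in colors_by_product[key]:
            colors_by_product[key].append(record["color"])

    enriched = []
    for record in records:
        key = (record["type"], record["design"])
        # A record lists the other colors of its design, not its own color.
        siblings = [c for c in colors_by_product[key] if c != record["color"]]
        enriched.append({**record, "color_variants": siblings})
    return enriched


records = [
    {"type": "t-shirt", "design": "B", "color": "blue", "thumbnail_url": "tshirt-B-blue.png"},
    {"type": "t-shirt", "design": "B", "color": "orange", "thumbnail_url": "tshirt-B-orange.png"},
    {"type": "sweatshirt", "design": "C", "color": "red", "thumbnail_url": "sweatshirt-C-red.png"},
]

enriched = add_color_variants(records)
```

The input records are left unchanged, so you can run the function again whenever a color is added to or removed from the inventory.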

With this approach, every record represents a single variation, so the data you display for each result is always consistent. One record per variation also lets you add granular custom ranking attributes, like number_of_sales. In addition, you can use Algolia’s distinct feature to deduplicate designs: when someone searches for “t-shirt”, they get only one record per design.

Using the API

At indexing time

Before deduplicating items, restrict which attributes are searchable. You don’t want to search in thumbnail_url, which may be irrelevant and add noise, nor in color_variants, which could lead to false positives: a search for “red” would match a blue t-shirt that only lists red as one of its variants. Instead, set design, type, and color as searchableAttributes.

To use distinct, first set design as attributeForDistinct at indexing time. Only then can you set distinct to true to deduplicate your results. Setting distinct at indexing time is optional: you can set it at query time instead.
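As a sketch, the settings described above can be written as the following payload. The setting names (searchableAttributes, attributeForDistinct, distinct) come from this guide; the variable name index_settings is illustrative, and the call that sends the payload depends on the API client and version you use, so it isn't shown here.

```python
import json

# Index settings for the variant-level model described in this guide.
index_settings = {
    # Only these attributes are searched; thumbnail_url and color_variants are left out.
    "searchableAttributes": ["design", "type", "color"],
    # Records that share the same design are treated as one group.
    "attributeForDistinct": "design",
    # Optional at indexing time: you can enable it per query instead.
    "distinct": True,
}

# Print the payload as JSON, the form in which settings are sent to the API.
print(json.dumps(index_settings, indent=2))
```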

At query time

Once attributeForDistinct is set, you can enable deduplication for a query by setting distinct to true in its search parameters.

You can set distinct to true or 1 interchangeably.
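For illustration, here is what the search parameters could look like for a single request, in both accepted forms. This is a sketch: the variable names are hypothetical, and how you pass the parameters depends on your API client.

```python
import json

# Search parameters for one request, with deduplication enabled.
search_params = {"query": "t-shirt", "distinct": True}

# Equivalent integer form: 1 keeps one record per design, like true.
search_params_int = {"query": "t-shirt", "distinct": 1}

print(json.dumps(search_params))
print(json.dumps(search_params_int))
```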

Using the dashboard

You can also set your attribute for distinct and enable distinct in your Algolia dashboard.

  1. Go to your dashboard, open the Search product, and select your index.
  2. Click the Configuration tab.
  3. In the Searchable Attributes section, click the “Add a searchable attribute” button.
  4. Select the design, type, and color attributes in the drop-down menu one after another.
  5. Click the Deduplication and Grouping tab, which you can find under Search behavior.
  6. Set the Distinct option to true.
  7. Set the Attribute for Distinct option to design.
  8. Save your changes.

When distinct is true, you get one color for each design. To control which one, add an attribute that holds a business metric, such as number_of_sales, and use it for custom ranking.
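To make the interaction between custom ranking and distinct concrete, the following Python sketch reproduces the idea locally. It is only an illustration, not Algolia's implementation: the engine applies its textual relevance criteria before custom ranking, which this sketch ignores. The function name dedupe_by_design is hypothetical and the sales figures are invented sample values.

```python
# Setting that ranks records by sales, highest first.
custom_ranking_setting = {"customRanking": ["desc(number_of_sales)"]}


def dedupe_by_design(records):
    """Keep the best-selling record of each design (local illustration only)."""
    # Rank by the business metric, highest first, like a descending custom ranking.
    ranked = sorted(records, key=lambda r: r["number_of_sales"], reverse=True)
    seen_designs = set()
    results = []
    for record in ranked:
        # Keep only the first (best-ranked) record of each design.
        if record["design"] not in seen_designs:
            seen_designs.add(record["design"])
            results.append(record)
    return results


records = [
    {"type": "t-shirt", "design": "B", "color": "blue", "number_of_sales": 120},
    {"type": "t-shirt", "design": "B", "color": "orange", "number_of_sales": 340},
    {"type": "t-shirt", "design": "A", "color": "red", "number_of_sales": 75},
]

results = dedupe_by_design(records)
```

With these sample values, design B is represented by its orange variant because it has the most sales, followed by the only record of design A.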

Additionally, you can display all available colors for each item thanks to the color_variants attribute. This way, users can access all possible variants from the search results without the page being crowded with too many items.