Indexing long pages
How to structure long pages by splitting them into smaller chunks, by paragraph, page, or any other logical breaking point.
To ensure good performance, Algolia limits the size of each record. Long content, like a detailed Wikipedia page, might be too big to fit into one of these records.
To work around this, divide long pages into smaller “chunks”. This not only helps you stay within the size limit but also makes your search more relevant. Break the page into sections or even paragraphs, and store each as a separate record.
When splitting into chunks, organize them based on the page structure. For instance, if you’re dealing with a lengthy Wikipedia article, create separate records for each section like “Introduction” or “History”.
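As a rough illustration of the splitting described above, the following Python sketch turns one long page into one record per paragraph, each tagged with the section it came from. The function name `split_page_into_records`, the record attributes (`title`, `section`, `content`), and the `objectID` scheme are assumptions made for this example, not something Algolia prescribes.

```python
from typing import Dict, List


def split_page_into_records(page_id: str, title: str, sections: Dict[str, str]) -> List[dict]:
    """Build one record per paragraph, tagged with the section it belongs to."""
    records = []
    for section_name, section_text in sections.items():
        # Assumption: paragraphs within a section are separated by blank lines.
        paragraphs = [p.strip() for p in section_text.split("\n\n") if p.strip()]
        section_slug = section_name.lower().replace(" ", "-")
        for position, paragraph in enumerate(paragraphs):
            records.append({
                # Hypothetical ID scheme: page, section, and paragraph position.
                "objectID": f"{page_id}-{section_slug}-{position}",
                "title": title,
                "section": section_name,
                "content": paragraph,
            })
    return records


page_sections = {
    "Introduction": "First paragraph.\n\nSecond paragraph.",
    "History": "A single paragraph about history.",
}
records = split_page_into_records("wiki-1", "Example page", page_sections)
```

Keeping the section name on every record is what later lets you group or deduplicate results by section.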
If you’re using the Algolia Crawler and the record size exceeds the limit, use the helpers.splitContentIntoRecords() helper to split the page into smaller chunks.
Avoid duplicates
When you split a page, the same content might appear in multiple records. When you set the distinct parameter to true, Algolia shows only the most relevant of these duplicate records. You decide what counts as ‘distinct’ by choosing a meaningful attribute, like the title of a section.
Example
In the following example, you’ve structured your records for a long page. To make sure that search results show only one entry per section, you:
- Set distinct to true
- Choose section as your attributeForDistinct.
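The setup above can be sketched in Python as follows. The records are made up for illustration, and `dedupe_by_attribute` is a hypothetical helper that only imitates, on a list that is already ranked, the effect of distinct set to true. Algolia applies the real deduplication on its side when it answers a query.

```python
from typing import List

# The two settings from the list above.
settings = {"attributeForDistinct": "section", "distinct": True}

# Made-up records for one long page, ordered as if already ranked for a query.
ranked_hits = [
    {"objectID": "page-history-1", "section": "History", "content": "Founded in the 1800s."},
    {"objectID": "page-intro-0", "section": "Introduction", "content": "A short overview."},
    {"objectID": "page-history-0", "section": "History", "content": "Earlier events."},
]


def dedupe_by_attribute(hits: List[dict], attribute: str) -> List[dict]:
    """Keep only the first (best-ranked) hit for each value of `attribute`."""
    seen = set()
    kept = []
    for hit in hits:
        value = hit.get(attribute)
        if value is None:
            # Choice made for this sketch: hits without the attribute are kept.
            kept.append(hit)
            continue
        if value not in seen:
            seen.add(value)
            kept.append(hit)
    return kept


unique_hits = dedupe_by_attribute(ranked_hits, settings["attributeForDistinct"])
```

Here `unique_hits` holds one entry for the History section and one for the Introduction section.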
How to enable the distinct feature
You can enable distinct from Algolia’s dashboard or API.
Using the API
If you use the API to enable distinct, you can do it either at indexing time (when you add records to your indices) or at query time (when users search).
- Set an attribute, such as section, as the attributeForDistinct
- Set distinct to true to deduplicate your results.
At indexing time
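A minimal sketch of the indexing-time setup, using only the Python standard library. It builds, but does not send, a request that stores both parameters in the index settings. The endpoint path and header names are assumed from Algolia's REST API and should be checked against the current API reference. The function name and placeholder credentials are hypothetical, and an official API client would normally do this for you.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_set_settings_request(app_id: str, api_key: str, index_name: str) -> Request:
    """Prepare a request that saves the distinct configuration as index settings."""
    settings = {"attributeForDistinct": "section", "distinct": True}
    return Request(
        # Assumed endpoint: PUT /1/indexes/{indexName}/settings
        url=f"https://{app_id}.algolia.net/1/indexes/{quote(index_name, safe='')}/settings",
        data=json.dumps(settings).encode("utf-8"),
        method="PUT",
        headers={
            "X-Algolia-Application-Id": app_id,
            "X-Algolia-API-Key": api_key,
            "Content-Type": "application/json",
        },
    )


# Placeholder credentials and index name. Nothing is sent over the network here.
settings_request = build_set_settings_request("YOUR_APP_ID", "YOUR_ADMIN_API_KEY", "long_pages")
```

To apply the settings, you would pass `settings_request` to `urllib.request.urlopen` with real credentials.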
At query time
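A minimal sketch of the query-time variant, again only building the request. It assumes that attributeForDistinct is already saved in the index settings, because that parameter is configured on the index, and passes distinct with the individual search. The endpoint path and header names are assumed from Algolia's REST API. The function name, index name, and credentials are placeholders.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_search_request(app_id: str, api_key: str, index_name: str, query: str) -> Request:
    """Prepare a search request that turns on distinct for this query only."""
    search_params = {"query": query, "distinct": True}
    return Request(
        # Assumed endpoint: POST /1/indexes/{indexName}/query
        url=f"https://{app_id}.algolia.net/1/indexes/{quote(index_name, safe='')}/query",
        data=json.dumps(search_params).encode("utf-8"),
        method="POST",
        headers={
            "X-Algolia-Application-Id": app_id,
            "X-Algolia-API-Key": api_key,
            "Content-Type": "application/json",
        },
    )


# Placeholder credentials and index name. Nothing is sent over the network here.
search_request = build_search_request("YOUR_APP_ID", "YOUR_SEARCH_API_KEY", "long_pages", "history")
```

Setting distinct per query lets you keep deduplication off by default and enable it only for the searches that need it.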
Using the dashboard
- Go to the Algolia dashboard and select your Algolia application.
- On the left sidebar, select Search.
- Select your Algolia index.
- Click the Configuration tab.
- In the Search behavior section, select Deduplication and Grouping.
- Set the Distinct drop-down menu option to true.
- Select your attribute in the Attribute for Distinct drop-down menu.
- Save your changes.