Algolia on a multi-brand e-commerce site: variants, locales, and replica indexes
How I structured Algolia indexing for a multi-brand e-commerce site: exploded color variants, hierarchical categories, per-locale indexes, and brand-specific sort replicas.
A multi-brand e-commerce site with multiple locales and product variants creates specific search problems: each brand has different ranking logic, each market needs its own index, and a sweater available in 12 colors shouldn’t appear 12 times in search results.
I built the Algolia sync layer for a multi-brand frontend on Sanity + Shopify. Each brand had its own configuration, but shared the same Lambda infrastructure and utility library.
The color variant problem
On Shopify, a product with 5 colors × 3 sizes has 15 variants. If you index each variant as a separate document, a query for “grey sweater” returns 3 identical results — one per size.
The solution is to set distinct: true with attributeForDistinct: 'distinctId'. Algolia groups records that share the same distinctId and returns only the best-ranked record of each group.
The distinctId depends on the product configuration:
const distinctId = `${product._id}${
  explodedVariants && variantOption ? ` - ${variantOption}` : ''
}`;
- explodeColorVariantsInSearch: false means all colors collapse into one document; distinctId is just the product ID.
- explodeColorVariantsInSearch: true means each color gets its own document; distinctId is productId - color. The 3 grey sizes collapse, but grey and blue stay separate.
The flag is per product, not global. A multi-color dress makes sense to explode; a bag available in black and brown probably doesn’t.
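To make the two modes concrete, here is a small sketch of how the distinctId comes out for the same product under each setting. The buildDistinctId helper and the sample IDs are illustrative only; they wrap the template string shown above and are not part of the project code.

```typescript
// Hypothetical helper wrapping the template string from the article.
interface DistinctIdInput {
  productId: string;
  explodedVariants: boolean;
  variantOption?: string; // the color name, e.g. "Grey"
}

export const buildDistinctId = ({
  productId,
  explodedVariants,
  variantOption,
}: DistinctIdInput): string =>
  `${productId}${explodedVariants && variantOption ? ` - ${variantOption}` : ''}`;

// Collapsed: every variant shares the product ID, so the product yields one result.
export const collapsedId = buildDistinctId({
  productId: 'prod-1',
  explodedVariants: false,
  variantOption: 'Grey',
});

// Exploded: one group per color, so sizes collapse but colors stay apart.
export const greyId = buildDistinctId({
  productId: 'prod-1',
  explodedVariants: true,
  variantOption: 'Grey',
});
export const blueId = buildDistinctId({
  productId: 'prod-1',
  explodedVariants: true,
  variantOption: 'Blue',
});
```

Every size of the grey sweater gets the same greyId, which is what lets Algolia fold them into a single hit.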
The variant document
Each in-stock variant becomes an Algolia document. It contains product-level fields (title, slug, categories) and variant-level fields (SKU, inventory, price):
export const createSearchVariantDocument = ({
  product,
  variant,
  variantIndex,
  canonicalVariant,
  images,
}): Omit<ProductVariant, 'distinctId'> => {
  // Compute the discount once so hasDiscount can reuse it.
  const discount = calculateDiscountAmount({
    price: variant.price,
    compareAtPrice: variant.compareAtPrice,
  });
  return {
    objectID: variant.id,
    productId: product._id,
    price: priceInMinorUnits(variant.price),
    compareAtPrice: priceInMinorUnits(variant.compareAtPrice),
    discount,
    hasDiscount: discount > 0,
    slug: product.seo?.slug?.current,
    title: product.title,
    variantTitle: variant.title,
    sorting: variantIndex,
    categories: createCategoriesForSearch(product.categories),
    // canonicalVariant is assumed to be resolved by the caller.
    isCanonical: canonicalVariant?.id === variant.id,
  };
};
isCanonical matters: among all variants of a product, the canonical one is the variant with an active discount and available inventory. It’s the one shown in the search result card. customRanking surfaces it first.
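A minimal sketch of how the canonical variant could be picked, following the rule described above. The variant shape, the inventoryQuantity field, and the fallback to the first in-stock variant are assumptions made for illustration, not the project's actual code.

```typescript
// Assumed variant shape; field names are illustrative.
interface VariantLike {
  id: string;
  price: number;
  compareAtPrice?: number | null;
  inventoryQuantity: number;
}

const isInStock = (variant: VariantLike): boolean => variant.inventoryQuantity > 0;

const isDiscounted = (variant: VariantLike): boolean =>
  variant.compareAtPrice != null && variant.compareAtPrice > variant.price;

// Prefer an in-stock variant with an active discount;
// fall back to the first in-stock variant (assumed behavior).
export const pickCanonicalVariant = (
  variants: VariantLike[]
): VariantLike | undefined => {
  const inStock = variants.filter(isInStock);
  return inStock.find(isDiscounted) ?? inStock[0];
};
```

With a setting along the lines of customRanking: ['desc(isCanonical)'], the record flagged as canonical would then rank first within its distinct group.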
Prices are stored as integers in minor units (price × 100, rounded): 19.99 becomes 1999. Integer values avoid floating-point issues in numeric ranking and filtering.
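The conversion needs an explicit rounding step, because multiplying a decimal price by 100 in JavaScript does not always give a whole number. Below is one possible shape for priceInMinorUnits; the null handling and the acceptance of string input are assumptions, and the sketch only targets currencies with two decimal places.

```typescript
// Hypothetical implementation of priceInMinorUnits; assumes two-decimal currencies.
export const priceInMinorUnits = (
  price: number | string | null | undefined
): number | null => {
  if (price === null || price === undefined || price === '') return null;
  const value = Number(price);
  if (!Number.isFinite(value)) return null;
  // 19.99 * 100 evaluates to 1998.9999999999998, so round to get 1999.
  return Math.round(value * 100);
};
```

Returning null for a missing compareAtPrice keeps the field explicit in the record instead of indexing NaN.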
Hierarchical categories: lvl0/lvl1/lvl2
Algolia supports hierarchical navigation via fields categories.lvl0, categories.lvl1, categories.lvl2. The expected format is:
categories.lvl0: ["Women"]
categories.lvl1: ["Women > Clothing"]
categories.lvl2: ["Women > Clothing > Knitwear"]
The problem is that categories in Sanity are documents with a parent reference (parentPage). I wrote a function that walks the chain upward and builds the path:
export const createCategoriesForSearch = (categories) => {
  const searchCategories = {};
  for (const category of categories) {
    // Walk up the parentPage chain to build the path from root to leaf.
    const categoryPath = [category.title.trim()];
    let parentPage = category.parentPage;
    for (; parentPage?.title || parentPage?.parentPage; parentPage = parentPage?.parentPage) {
      categoryPath.unshift(parentPage?.title?.trim() || '');
    }
    // ['Women', 'Clothing'] becomes ['Women', 'Women > Clothing'].
    const categoryPathForSearch = categoryPath.map((_, i) =>
      categoryPath.slice(0, i + 1).join(' > ')
    );
    // Merge into lvl0, lvl1, ... and deduplicate paths shared by several categories.
    for (let index = 0; index < categoryPathForSearch.length; index++) {
      const lvl = `lvl${index}`;
      searchCategories[lvl] = [
        ...new Set([...(searchCategories[lvl] || []), categoryPathForSearch[index]]),
      ];
    }
  }
  return searchCategories;
};
The same logic applies to collections (collections.lvl0/1). Declaring categoriesId as filterOnly(categoriesId) in attributesForFaceting lets the client filter by category ID without Algolia computing facet values and counts for that attribute.
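As an illustration, the faceting part of the index settings might look like the sketch below. attributesForFaceting and the filterOnly() modifier are standard Algolia settings; the exact attribute list and the categoryFilter helper are assumptions based on the fields mentioned in this article.

```typescript
// Sketch of faceting settings; the attribute list is assumed, not the project's actual config.
export const facetingSettings = {
  attributesForFaceting: [
    'categories.lvl0',
    'categories.lvl1',
    'categories.lvl2',
    'collections.lvl0',
    'collections.lvl1',
    // Usable in filters, but no facet values or counts are computed for it.
    'filterOnly(categoriesId)',
  ],
};

// Hypothetical helper: builds the filter string a client would send for one category ID.
export const categoryFilter = (categoryId: string): string =>
  `categoriesId:"${categoryId}"`;
```

The hierarchical levels stay regular facets because the navigation UI needs their values and counts; the ID is only ever used as a filter.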
Color families
A product available in “powder blue”, “sky blue”, and “navy” has 3 color variants. Filtering by color only works if the user types the exact variant name — which they won’t.
In Sanity I have a Color document with a parent.name field (the family: Blue, Green, Red). At index time, each specific color maps to its family:
const colorFamilies = await getAllColorFamilies(sanityClient);
// { "powder blue": "Blue", "navy": "Blue", "cherry red": "Red", ... }
const colorFamilyVariants = uniqueArray(
  colorVariants.map(color => colorFamilies[color])
);
The indexed field is colorFamilyVariants: ["Blue"]. The client filters by family, not by exact name.
When color variants are exploded, colorFamilyVariants contains only the family of that variant’s color — not all families of the product.
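One detail worth guarding against is a color that has no family in the CMS yet, which would otherwise put undefined into the indexed array. The sketch below is a hedged variant of the mapping: toColorFamilies is a hypothetical name, uniqueArray is a stand-in for the project's helper, and the lowercase normalization assumes the map keys are lowercase as in the comment above.

```typescript
// Hypothetical stand-ins for the helpers used in the article.
type ColorFamilyMap = Record<string, string>;

const uniqueArray = <T>(items: T[]): T[] => Array.from(new Set(items));

export const toColorFamilies = (
  colorVariants: string[],
  colorFamilies: ColorFamilyMap
): string[] =>
  uniqueArray(
    colorVariants
      // Assumes map keys are lowercase, e.g. "powder blue".
      .map((color) => colorFamilies[color.trim().toLowerCase()])
      // Skip colors without a family instead of indexing `undefined`.
      .filter((family): family is string => Boolean(family))
  );

export const exampleFamilies = toColorFamilies(
  ['Powder Blue', 'navy', 'Cherry Red', 'mystery teal'],
  { 'powder blue': 'Blue', navy: 'Blue', 'cherry red': 'Red' }
);
```

For an exploded variant, the same function would be called with a single-element array containing only that variant's color.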
Multi-locale: one index per language
Each brand is sold in multiple markets (IT, DE, FR, EN). Algolia’s language-specific processing (stop words, plural handling, tokenization) is configured per index through settings such as indexLanguages, so mixing languages in a single index hurts relevance. Hence one index per language.
The index naming convention is {indexName}_{locale} for non-default languages:
export const getAlgoliaIndexLocalized = ({ indexName, locale }) => {
  const defaultLocale = getDefaultLocaleFromEnv();
  if (!locale || defaultLocale === locale) return indexName;
  return `${indexName}_${locale}`;
};
// "products" for it (default), "products_de" for de, "products_fr" for fr
During sync, the handler iterates over all available locales, filters Sanity products by language, and saves to the locale-specific index:
for (const locale of getLocalesFromEnv()) {
  const products = sanityProducts.filter(
    product => product?.i18nLang === locale ||
      (!product?.i18nLang && locale === getDefaultLocaleFromEnv())
  );
  const algoliaIndex = getAlgoliaIndexLocalized({ indexName, locale });
  const productIndex = getAlgoliaClient().initIndex(algoliaIndex);
  await syncProductsWithAlgolia({ productIndex, products });
  await productIndex.setSettings({
    ...indexSettings,
    indexLanguages: [getLanguageTag(locale)],
    replicas: replicas.map(name => getAlgoliaIndexLocalized({ indexName: name, locale })),
  });
}
Settings (indexLanguages) are updated on every sync, not just initial setup. This ensures that a configuration change doesn’t require a separate manual step.
Replica indexes for sorting
Algolia supports only one primary sort order per index. To support “sort by price ascending”, “sort by highest discount”, or “sort by rating”, I used replica indexes — copies of the main index with a different ranking configuration.
Each brand has its replicas configured based on frontend features:
// Fashion brand: price only
replicas: ['products_price_asc', 'products_price_desc'],
// Food brand: price + ratings + discount
replicas: [
  'products_price_asc',
  'products_price_desc',
  'products_higher_vote',
  'products_most_comments',
  'products_higher_discount',
],
// Baby products brand: price + discount
replicas: [
  'products_price_asc',
  'products_price_desc',
  'products_higher_discount',
],
Algolia creates the replicas automatically when you call setSettings with replicas on the primary index. Each replica still needs its own sort configuration (for example a ranking that starts with asc(price)), applied to the replica index itself. The frontend then queries the matching replica when the user selects a non-default sort order.
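Declaring a replica only creates it; its sort order is a setting on the replica itself. The sketch below shows one way to derive those settings from the replica name. The ranking criteria list is Algolia's documented default, while the name-to-attribute mapping and the rating and commentsCount field names are assumptions for illustration.

```typescript
// Algolia's default ranking criteria, in order.
const DEFAULT_RANKING: string[] = [
  'typo', 'geo', 'words', 'filters', 'proximity', 'attribute', 'exact', 'custom',
];

// Assumed mapping from replica name to sort attribute.
// `rating` and `commentsCount` are hypothetical field names.
const REPLICA_SORT: Record<string, string> = {
  products_price_asc: 'asc(price)',
  products_price_desc: 'desc(price)',
  products_higher_vote: 'desc(rating)',
  products_most_comments: 'desc(commentsCount)',
  products_higher_discount: 'desc(discount)',
};

export const replicaSettings = (replicaName: string): { ranking: string[] } => {
  const sortBy = REPLICA_SORT[replicaName];
  if (!sortBy) throw new Error(`Unknown replica: ${replicaName}`);
  // The sort attribute goes first so it takes precedence over textual relevance.
  return { ranking: [sortBy, ...DEFAULT_RANKING] };
};
```

During sync, each localized replica could then receive its settings with a call such as getAlgoliaClient().initIndex(localizedReplicaName).setSettings(replicaSettings(name)), using the same initIndex and setSettings methods as the primary index.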
The sync flow
The trigger is a Sanity webhook (POST with HMAC signature). The Lambda:
- Validates the signature
- Reads the _id from the body (if present, single-product sync; otherwise full sync)
- Queries Sanity via GraphQL with the is_draft: false filter
- For each locale: filters products, gets the localized index, saves objects, updates settings
const { _id: productId } = useJsonBody() || {};
// Reject unsigned requests before doing any work, including full syncs.
if (!isValidSignature()) {
  return error({ message: 'Invalid signature', statusCode: 401 });
}
const response = await client.request(GQL_QUERY, {
  where: {
    _: { is_draft: false },
    ...(productId && { _id: { eq: productId } }),
  },
});
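For readers who want to see what an HMAC check involves, here is a generic sketch using Node's crypto module. It assumes the signature is a hex-encoded HMAC-SHA256 of the raw request body; the real webhook's header name, encoding, and signing scheme may differ, and isValidHmacSignature is a hypothetical name, not the project's isValidSignature.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Hypothetical check: hex-encoded HMAC-SHA256 over the raw body.
export const isValidHmacSignature = (
  rawBody: string,
  signature: string,
  secret: string
): boolean => {
  const expected = createHmac('sha256', secret).update(rawBody, 'utf8').digest();
  const received = Buffer.from(signature, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
};

// Helper for the example: produces the signature a sender would attach.
export const signBody = (rawBody: string, secret: string): string =>
  createHmac('sha256', secret).update(rawBody, 'utf8').digest('hex');
```

The comparison runs on the raw body string rather than the parsed JSON, since re-serializing the payload can change its bytes and invalidate the signature.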
Full sync clears the index before reinserting (productIndex.clearObjects()). Single-product sync upserts without clearing — Algolia handles it via objectID.
Fetching all hits with loop protection
To read all records from an index (for reporting and reconciliation), I have a paginated utility with a loop guard:
const MAX_PAGES_PREVENT_LOOP = 50;
const allHits = [];
for (
  let currentPage = 0, totalPages = 1, loopIndex = 0;
  currentPage < totalPages; // pages are zero-based
  currentPage++, loopIndex++
) {
  // Bail out instead of looping forever if the API keeps reporting more pages.
  if (loopIndex >= MAX_PAGES_PREVENT_LOOP) {
    return Promise.reject(new Error(`Reached max pages: ${MAX_PAGES_PREVENT_LOOP}`));
  }
  const { hits, page, nbPages } = await fetchAlgoliaHits({ algoliaIndexName, page: currentPage, hitsPerPage: 1000 });
  allHits.push(...hits);
  totalPages = nbPages;
  currentPage = page;
}
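hitsPerPage: 1000 is Algolia’s maximum. With 50 pages as a ceiling, that covers up to 50,000 records, enough for most mid-sized e-commerce catalogs. Note that search pagination is also capped by the paginationLimitedTo setting (1,000 hits by default), so if fetchAlgoliaHits is built on search, that limit has to be raised; Algolia’s browse API is the purpose-built way to export a whole index.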
What I’d keep, what I’d change
The explodeColorVariantsInSearch flag works well. Giving each product control over how its variants appear avoids blanket compromises: products where color matters visually (clothing) can explode; those where it doesn’t stay collapsed.
Sanity is the right place for color families. Keeping them in the CMS means merchandisers can add new ones without touching code.
Per-locale indexes scale, but linearly. With 4 languages and 5 replicas, that’s 4 primary indexes plus 20 replicas: 24 indexes per brand. Across multiple brands, Algolia costs become significant.
Full sync with upfront clear is fragile. If the Lambda fails halfway, the index is empty until the next run. A safer approach is to diff existing IDs against incoming ones and only delete removed products — more logic, but no silent data loss.
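A minimal sketch of the diff-based approach, assuming the list of existing objectIDs has already been fetched (for example with the paginated utility above). findStaleObjectIds, syncWithoutClearing, and the IndexLike interface are hypothetical names; saveObjects and deleteObjects are the corresponding methods on the algoliasearch v4 index object.

```typescript
// Returns the objectIDs that exist in the index but are absent from the incoming batch.
export const findStaleObjectIds = (
  existingIds: string[],
  incomingIds: string[]
): string[] => {
  const incoming = new Set(incomingIds);
  return existingIds.filter((id) => !incoming.has(id));
};

// Minimal index surface needed by the sketch.
interface IndexLike {
  saveObjects(objects: Array<{ objectID: string }>): Promise<unknown>;
  deleteObjects(objectIDs: string[]): Promise<unknown>;
}

export const syncWithoutClearing = async (
  index: IndexLike,
  existingIds: string[],
  records: Array<{ objectID: string }>
): Promise<string[]> => {
  // Upsert first: if this step fails, the old records are still in place.
  await index.saveObjects(records);
  const stale = findStaleObjectIds(
    existingIds,
    records.map((record) => record.objectID)
  );
  // Only remove records that are no longer part of the catalog.
  if (stale.length > 0) await index.deleteObjects(stale);
  return stale;
};
```

The v4 client also offers replaceAllObjects, which reindexes through a temporary index and could be an alternative worth evaluating; whether it fits depends on how replicas and settings are managed in this setup.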