Squidex localization and Algolia

Hi Sebastian,

I’m doing some research on whether I should use one or multiple Algolia indexes for a multilingual website.

Currently I have 1 Algolia index. And only 1 language (Swedish) in Squidex.

From the Squidex docs:

{ 
    "id": "01",
    "created": "2017-02-25T19:56:35Z",
    "createdBy": "...",
    "lastModified": "2017-02-25T19:56:35Z",
    "lastModifiedBy": "...",
    "data": {
        "name": {
            "en": "Copenhagen",
            "sv": "Köpenhamn",
            "fi": "Kööpenhamina"
        },
        "population": {
            "iv": 1400000
        }
    }
}

If using the above structure all languages would end up in the same Algolia index as 1 record.

Algolia said to me a couple of minutes ago:

Check this: https://discourse.algolia.com/t/support-multiple-languages-in-search/3958

And then we ended the conversation by them saying:

Generally, the best is to have one index per language as this won’t interfere with your relevancy

So I’m kinda not sure what the best option would be here for combining Squidex + Algolia and making them play nicely, with 0 dupes and as little overhead as possible.

I don’t want any dupe records.
I want the same front-end.
I need English as master lang.
I need Swedish to be an optional lang.
And that’s it for now I think.

In the future, if I need to add more languages, this should be doable in one’s sleep.

Thanks in advance, and I hope you can give some useful tips.

Edit 1: Found this as well - just adding it for reference: https://www.algolia.com/doc/guides/managing-results/optimize-search-results/handling-natural-languages-nlp/how-to/multilingual-search/

Edit 2:

From the Algolia docs:

There are different ways to handle multiple languages in search. To determine the best solution for you, answer the following questions:

Does the ranking need to be different for each language? There are a couple of reasons the ranking might need to change depending on the language:

  • The price is not the same in different countries (and you want to sort by price) <=== I NEED THIS!
  • The object doesn’t have the same popularity in all regions of the world (and you have popularity scores per region) <=== I NEED THIS TOO!

So:

In other words, you’ll need two indices if:

  • You have a different ranking strategy depending on the language

So I guess I must use two indexes then.

But what about Squidex and the dupe data?

Hi,

I think your Edit 2 is not that helpful, because language != country. So you might end up with even more indices than languages.

For now, the best strategy would be to use a webhook and do the indexing manually.
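A minimal sketch of what that manual indexing step could look like, assuming the webhook hands you content in the shape of the docs excerpt quoted in the question. The function name `split_by_language`, the `products_` index prefix, and the fall-back-to-master behavior are illustrative assumptions, not Squidex or Algolia features. Sending the records to Algolia is left to whichever client you use.

```python
from typing import Any

INVARIANT = "iv"  # Squidex key for values that are not localized


def split_by_language(content: dict[str, Any], languages: list[str],
                      master: str = "en",
                      index_prefix: str = "products_") -> dict[str, dict[str, Any]]:
    """Return one flat record per language, keyed by a hypothetical index name."""
    records: dict[str, dict[str, Any]] = {}
    for lang in languages:
        # Algolia identifies a record by objectID; reuse the content id
        # so every index holds exactly one record per content item.
        record: dict[str, Any] = {"objectID": content["id"]}
        for field, values in content["data"].items():
            if INVARIANT in values:
                record[field] = values[INVARIANT]
            elif values.get(lang) is not None:
                record[field] = values[lang]
            else:
                # Assumption: optional languages fall back to the master language.
                record[field] = values.get(master)
        records[f"{index_prefix}{lang}"] = record
    return records


content = {
    "id": "01",
    "data": {
        "name": {"en": "Copenhagen", "sv": "Köpenhamn"},
        "population": {"iv": 1400000},
    },
}
records = split_by_language(content, ["en", "sv"])
```

The content stays a single item in Squidex; only the derived search records are split, so adding a language later means adding one entry to the `languages` list.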

The rule system could also be improved with JSON references like the ones used in Algolia:

From their Docs:

Custom payload can be any valid JSON value. To resolve a value from the original webhook payload use a JSON pointer wrapped with curly braces.

Example:
{
  "entityId": "{ /payload/sys/id }",
  "spaceId": "{ /payload/sys/space/sys/id }",
  "parameters": {
    "text": "Entity version: { /payload/sys/version }"
  }
}
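To make the placeholder idea concrete, here is a rough sketch of how such a template could be resolved against a webhook payload. It is not how Squidex or any other product implements it; `resolve_pointer` and `render` are hypothetical helpers, and keeping the original type when a string consists of a single placeholder is my own assumption.

```python
import re
from typing import Any

# Matches "{ /some/json/pointer }" with optional spaces inside the braces.
_PLACEHOLDER = re.compile(r"\{\s*(/[^}]*?)\s*\}")


def resolve_pointer(document: Any, pointer: str) -> Any:
    """Follow a JSON pointer (RFC 6901) such as /payload/sys/id."""
    current = document
    for token in pointer.split("/")[1:]:
        token = token.replace("~1", "/").replace("~0", "~")  # RFC 6901 escapes
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


def render(template: Any, source: dict) -> Any:
    """Recursively replace placeholders in a JSON-like template."""
    if isinstance(template, dict):
        return {key: render(value, source) for key, value in template.items()}
    if isinstance(template, list):
        return [render(value, source) for value in template]
    if isinstance(template, str):
        whole = _PLACEHOLDER.fullmatch(template)
        if whole:
            # Assumption: a lone placeholder keeps the resolved value's type.
            return resolve_pointer(source, whole.group(1))
        return _PLACEHOLDER.sub(
            lambda match: str(resolve_pointer(source, match.group(1))), template)
    return template


source = {"payload": {"sys": {"id": "abc", "version": 3,
                              "space": {"sys": {"id": "sp1"}}}}}
template = {
    "entityId": "{ /payload/sys/id }",
    "spaceId": "{ /payload/sys/space/sys/id }",
    "parameters": {"text": "Entity version: { /payload/sys/version }"},
}
result = render(template, source)
```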

There is also this task in the roadmap: https://trello.com/c/ok2oIK91/63-flatten-content-in-triggers

The idea is to use X-Flatten in the rule system. Both features combined could provide a solution.

Btw: The next major version (2.0) will also contain a new full-text search.