My understanding how autocomplete/search for text/item works at high level in any scalable product like Amazon eCommerce/Google at high level was :-
ES autocompletion can be achieved in two ways:
The first option is the poor man's completion feature. I'm mentioning it because it can be useful in certain situation but you should avoid it if you have a substantial amount of documents.
The second option uses the conventional ES indexing features, i.e. it will tokenize the text, all (edge-)ngrams will be indexed and then you can search for any prefix/infix/suffix that have been indexed.
The third option uses a different approach and is optimized for speed. Basically, when indexing a field of type completion
, ES will create a "finite state transducer" and store it in memory for ultra fast access.
A finite state transducer is close to a trie in terms of implementation. You can check this excellent article which shows how trie compares to finite state transducer
UPDATE (June 25th, 2019):
ES 7.2 introduced a new data type called search_as_you_type
that allows this kind of behavior natively. Read more at: https://www.elastic.co/guide/en/elasticsearch/reference/7.2/search-as-you-type.html