analyzer - Optimising ElasticSearch aggregated search suggestions
I'm working on implementing an autocomplete field where the suggestions contain the number of matching documents.
I have implemented this using a terms aggregation with an include filter.
For instance, given a user typing 'chrysler', the following query may be generated:
{
  "size": 0,
  "query": { "bool": { "must": [ ... ] } },
  "aggs": {
    "filtered": {
      "filter": { ... },
      "aggs": {
        "suggestions": {
          "terms": {
            "field": "preflabel",
            "include": "chry.*",
            "min_doc_count": 0
          }
        }
      }
    }
  }
}
This works fine, and I am able to get the data I need. However, I am concerned that it is not well optimised and that more could be done when the documents are indexed.
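As a rough illustration of how the query above could be generated from whatever the user has typed so far, here is a minimal Python sketch. The function name `build_suggestion_query` and the `match_all` placeholders are hypothetical; they stand in for the parts elided with "..." in the query. The sketch also assumes the typed prefix contains no regular-expression metacharacters, since it is appended to `.*` unescaped.

```python
import json


def build_suggestion_query(prefix, must_clauses, filter_clause):
    """Build the aggregation request body for a typed prefix.

    must_clauses and filter_clause are supplied by the caller; they
    correspond to the elided "..." parts of the original query.
    Assumption: prefix contains no regex metacharacters.
    """
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {"bool": {"must": must_clauses}},
        "aggs": {
            "filtered": {
                "filter": filter_clause,
                "aggs": {
                    "suggestions": {
                        "terms": {
                            "field": "preflabel",
                            "include": prefix + ".*",
                            "min_doc_count": 0,
                        }
                    }
                },
            }
        },
    }


# Hypothetical placeholders for the elided clauses.
body = build_suggestion_query("chry", [{"match_all": {}}], {"match_all": {}})
print(json.dumps(body, indent=2))
```

Each bucket returned under `suggestions` would then carry a candidate label and its `doc_count`, which is the number shown next to the suggestion.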
Currently I have the following mapping:
{
  ...
  "preflabel": {
    "type": "string",
    "index": "not_analyzed"
  }
}
And I am wondering whether to add an analysed field, like so:
{
  ...
  "preflabel": {
    "type": "string",
    "index": "not_analyzed",
    "copy_to": "searchlabel"
  },
  "searchlabel": {
    "type": "string",
    "analyzer": "???"
  }
}
So the question is: what is the optimal index-time analyser for this? (Or is this just crazy?)
I think an edge ngram tokenizer would speed things up:
curl -XPUT 'localhost:9200/test_ngram' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "suggester_analyzer": { "tokenizer": "ngram_tokenizer" }
      },
      "tokenizer": {
        "ngram_tokenizer": {
          "type": "edgeNGram",
          "min_gram": "2",
          "max_gram": "7",
          "token_chars": [ "letter", "digit" ]
        }
      }
    }
  },
  "mappings": {
    ...
    "searchlabel": {
      "type": "string",
      "index_analyzer": "suggester_analyzer",
      "search_analyzer": "standard"
    }
    ...
  }
}'
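To make it clearer what such a tokenizer would put in the index, here is a small Python sketch that approximates it. This is not Elasticsearch code: `edge_ngrams` is a hypothetical helper, and the word-splitting regex is only an approximation of `token_chars: [letter, digit]`.

```python
import re


def edge_ngrams(text, min_gram=2, max_gram=7):
    """Approximate the tokens an edge n-gram tokenizer would emit.

    Splits the text on anything that is not a letter or digit, then
    collects the leading substrings of each word with lengths from
    min_gram up to max_gram (or the word length, if shorter).
    """
    tokens = []
    for word in re.findall(r"[^\W_]+", text):
        for n in range(min_gram, min(max_gram, len(word)) + 1):
            tokens.append(word[:n])
    return tokens


print(edge_ngrams("chrysler"))
# ['ch', 'chr', 'chry', 'chrys', 'chrysl', 'chrysle']
```

With these settings a prefix lookup becomes an exact term match against a pre-computed gram instead of a regular-expression scan over the terms. One thing to watch, assuming the search side sends the typed text as a single token: with max_gram set to 7, an input of eight or more characters (such as the full word 'chrysler') has no matching gram in the index, so max_gram may need to be raised to cover the longest prefix you expect.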