Lucene query with special characters


I have trouble understanding how special characters are handled in Lucene.
My analyzer has no stopwords, so special characters are not removed:

CharArraySet stopwords = new CharArraySet(0, true);
return new GermanAnalyzer(stopwords);

Then I create documents like this:

doc.add(new TextField("tags", "23", Store.NO));
doc.add(new TextField("tags", "brüder-grimm-weg", Store.NO));

The query tags:brüder\-g works fine, but the fuzzy query tags:brüder\-g~ does not return anything. When the street name is eselgasse, the query tags:esel~ works fine.
I use Lucene 5.3.1.

Thanks for your help!

Fuzzy queries (as well as wildcard and regex queries) are not analyzed by the QueryParser.

If you are using StandardAnalyzer, for instance, "brüder-grimm-weg" will be indexed as three terms: "brüder", "grimm", and "weg". So, after analysis you have:

  • "tags:brüder\-g" --> tags:brüder tags:g
    This matches on tags:brüder.

  • "tags:brüder\-g~" --> tags:brüder-g~2
    Since it is not analyzed, it remains a single term, and you will have no matches, since there is no single term in the index like "brüder-g".
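The difference between the two query forms can be sketched in plain Java without Lucene. This is only a simplified simulation: the class `AnalysisSketch` and its `tokenize` method are hypothetical stand-ins for what an analyzer like StandardAnalyzer roughly does (lowercase, split on non-alphanumeric characters). It ignores stemming and fuzzy edit distance entirely. It only shows which term strings each query form ends up looking for.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Simplified simulation of analyzed vs. unanalyzed query terms.
// This is NOT Lucene code; all names here are hypothetical.
class AnalysisSketch {

    // Stand-in for an analyzer: lowercase, then split on anything
    // that is not a letter or a digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Stand-in for the index: the set of all terms produced by
    // tokenizing every field value.
    static Set<String> index(String... values) {
        Set<String> terms = new LinkedHashSet<>();
        for (String value : values) {
            terms.addAll(tokenize(value));
        }
        return terms;
    }

    // Analyzed query: the query text is tokenized first, and any
    // resulting token may match an indexed term.
    static boolean matchesAnalyzed(Set<String> terms, String queryText) {
        for (String token : tokenize(queryText)) {
            if (terms.contains(token)) {
                return true;
            }
        }
        return false;
    }

    // Unanalyzed query: the raw text is looked up as one single term.
    static boolean matchesUnanalyzed(Set<String> terms, String rawTerm) {
        return terms.contains(rawTerm);
    }

    public static void main(String[] args) {
        Set<String> terms = index("23", "brüder-grimm-weg");
        System.out.println(terms);
        System.out.println(matchesAnalyzed(terms, "brüder-g"));
        System.out.println(matchesUnanalyzed(terms, "brüder-g"));
    }
}
```

In this sketch the analyzed form finds "brüder" because the query text is split the same way as the indexed value, while the unanalyzed form looks for the literal string "brüder-g", which was never indexed as a term.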

