Solr wildcard search with a space in the middle


Folks,

We want a Solr wildcard search to work with a space in the middle of the term.

E.g. if we search for "please\ help*", it should retrieve documents containing "please help", followed by documents containing the individual words "please" or "help".

What we see instead is that searching for "please\ help*" returns only documents containing "please help"; it does not return results for the individual tokens "please" and "help".

Given below is the field definition we are using for indexing and searching:

<fieldType name="string_ci" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="2" max="100"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="true"/>
  </analyzer>
</fieldType>

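When you're using a wildcard search, the analysis stage of the query is not invoked. This means that "please help*" will not go through the shingle filter, etc., so the individual tokens "please" and "help" are never generated at query time, and therefore the query doesn't give hits for them.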

As mentioned in the comments to the question - use an EdgeNGramFilter in the indexing phase instead, and submit the query as "please help" (without the wildcard). This will retrieve documents where the field starts with "please help", as the filter creates several versions of the same token at index time (such as "p", "pl", "ple", "plea", "pleas", "please", "please ", "please h", etc.).

You'll have to adjust the sequence of filters to match your needs.
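A minimal sketch of what such a field type might look like. The name "text_prefix" and the gram sizes are assumptions made for illustration, not taken from the question. Because the example grams above include the space ("please ", "please h"), the sketch assumes the whole value is kept as a single token before it is n-grammed:

```xml
<!-- Hypothetical field type: the name and gram sizes are illustrative assumptions -->
<fieldType name="text_prefix" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer type="index">
    <!-- Keep the whole value as one token so the grams can span the space -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Emits leading prefixes: "p", "pl", ..., "please h", ... up to maxGramSize characters -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <!-- No n-grams at query time: the typed prefix is matched as-is against the indexed grams -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a setup like this, the query would be sent as a quoted phrase (or with the space escaped) so that the query parser does not split it on whitespace. Prefixes longer than maxGramSize would not match, so that value would need to fit the data.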

You can also use a KeywordTokenizer so that the complete input is indexed as a single token (with a LowerCaseFilter if you want case-insensitive matching), and use that field to match the wildcard search against one single token (as no other analysis needs to take place).
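As a rough sketch of this second option (the field type name "string_exact_ci" is made up for the example):

```xml
<!-- Hypothetical field type: the whole input becomes one lowercased token -->
<fieldType name="string_exact_ci" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <!-- No splitting on whitespace: "Please help me" is indexed as the single token "please help me" -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

A query such as please\ help* against a field of this type would then be compared with the single indexed token, so it matches values that start with "please help". It still would not return documents that only contain "please" or "help" somewhere else in the text; that would need a separate clause against a tokenized field.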

