mongodb - Combine full text with other index -


I have a full text index and an index on the created date.

My query on the date alone returns a nice, small set of 44 records (within a second):

> db.onemilliondocumentsindexed.count({created: {$lte: ISODate("2016-02-06T15:34:59.019Z")} })
44

However, if I combine it with a text search, the query is incredibly slow:

> db.onemilliondocumentsindexed.count({
      created: {$lte: ISODate("2016-02-06T15:34:59.019Z")},
      $text: { $search: "raven" }
  })

It appears not to use both indexes:

{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "test.onemilliondocumentsindexed",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "$and" : [
                {
                    "created" : {
                        "$lte" : ISODate("2016-02-06T15:34:59.019Z")
                    }
                },
                {
                    "$text" : {
                        "$search" : "raven",
                        "$language" : ""
                    }
                }
            ]
        },
        "winningPlan" : {
            "stage" : "FETCH",
            "filter" : {
                "created" : {
                    "$lte" : ISODate("2016-02-06T15:34:59.019Z")
                }
            },
            "inputStage" : {
                "stage" : "TEXT",
                "indexPrefix" : { },
                "indexName" : "$**_text",
                "parsedTextQuery" : { }
            }
        },
        "rejectedPlans" : [ ]
    },
    "serverInfo" : {
        "host" : "plod",
        "port" : 27017,
        "version" : "3.0.7",
        "gitVersion" : "6ce7cbe8c6b899552dadd907604559806aa2e9bd"
    },
    "ok" : 1
}

Shouldn't the created date search reduce the number of documents, speeding up the query?

Whilst the documents aren't tiny, they aren't massive either. Here's an example document:

{     "_id" : objectid("56b612a2b6c13d2bec221d22"),     "created" : isodate("2016-02-06t15:34:57.954z"),     "adoptability-integer" : 1885631649,     "impoverisher-double" : 0.78982932576436,     "auriga-short-string" : "unpunished",     "pistillate-long-string" : "raven nationalistic supergalaxies shit candidacy vengefulness baghla inharmony breviaries subcoracoid facet numbles achaian hyksos g¥ᄀtterdï¿¥ï¾ exsecant costliness assertively cufic neurotomy subfebrile reassess eruption calciphobous epithecium adipopectic eruption neurotomy impaste shrugging oxytone depredating abb¥ᄑ unfaithfulness clive amman meteorology dollond del cussed malversation determinateness wadset busher precedent warder lithest tuberculinize kythera swiping hyperopic installation otosclerosis costly joyance neenah saliently bicepses myograph blackmur. salable radiational copaiva seisure animism franglais chalkboard astride preaortic machinelike criseyde easternmost theological. goloshes amber assertively universalism pterylological abortifacient entrepï¾¢t nordic intricate canvasser unscholastic caria marginal prakritic gal tambur seascouting branchiform vaticide hysteroidal. vario chefoo permanganic solidillu lashings permanganic denatured chartres nonenergetically pabx coinheritance koulibiaca wrathless unrejoicing kodly confutable juru changelessness ratite pol lightener pansy portadown unpeg iontophoresis ruddily overcorrupt rondure midair mobocrat. rals sind teaser hussism definiteness piperidine septicity procryptic salicaceous catalpa stingy panegyrise baddie wodan preoccasioned ndebele sanitizing mulga grantedly selectman dep overscruple mealies subsellia noncompressible lepidoptera nonequilateral vï¿¥ï¾ racemiform carob preaccredit parramatta. piatigorsky unmanifest eulogized bolometric circumnavigating stare. 
prewitt branchiform canadianizing untinselled crossruf anthozoic del dragrope pronative foulness incessancy sultanate debunker guncotton reindictment uninstalled pieter buying prestwick anguish dicrotism permissible. nonscarcity labialising underswamp nondegradation incubating unwillable dealer rewinded jaggedness jasmine flatfootedness edgily choregraphic unpenetrating unwhited devotedly thornton irremediably reentry cordilleras inhospitable blenchingly hedgehop. nontribesman semiexhibitionist streetlike outgeneral spatiality hyacinthides prometheus tingly tenacious aerologist promonarchy nonsophistical uhuru unsprayable countrywoman proequality schickard. antagonize cart undocumented heteroplastic cyclostome keratin specification tombless lambie extricating feticide reacceded redwing autokinetic ferias underpart dupr¥ᄑ preexperimental besancon dvm riksm'' unharmonised bradykinetic unforeseeableness ryukyu rootstalk aquarial uredospore kame nondissenting pachyderm southeasterner comminute excitant torturing reasoningly restabilize isotopy emergency boathouses plowmanship decidedness skeptophylaxis kelebe clive furred abuttals variometer indamine wreathe. guymon rubinstein monotriglyph inaction. bedazzle foreordinated proportioner pursy beryl slogging forbearer abirritant concur. nonleprous veriax overservility mirza relitigate richness dipteroi mischarged. inquisitress nav unimpressibility teratoma brilliantined untensing vlaardingen theorbo shostakovich appia maximally fingered ashkenazim soap unpick isocheimenal gingili synonymical interannular patronising knaggiest cleaver lassie interwound osculated unobliging portobello boxer impactive.bladderwort wish aerothermodynamics lymphadenomata nonfundamental interdiffuse injector chaussure. 
polyphyletically irishising ayous sinecurist decant carbonized flickeringly stomatitic emily luteotrophin anginous syllabic permeameter carthal brachiator farinose justicelike azotized getaway electroencephalographically puglia unconfound appendiceal premedical vassal rubric overhearing conative heartaching shammer staphylorrhaphy bulgar spilikin phagocytosing adenitis syntypic dissertate collyrium sonless anoxia archil mimosis irreversibly unhabituated scholiast rcs portadown mishima preimport bonavist jointedly aspergillus farinose condemnation chough blanc descanter mephistopheles ongoing unsurgical unclassifiableness namtar corniest disbudding disklike zap wheyface teetotally nonsubmission delian enrober canadian nasi hypermetabolism animadversion unbantering recompile ineradicable blindly mren schorlaceous viperous latish unstationed decastylos catalpa beflagged pellicular demark gassendi. macmonnies deserve subsidizer generous reassess colorfully unsummonable clave hderlin borges aechmagoras misbegotten uncontradictory unfelicitous plunderage presynsacral backband amagasaki unsavorily proenzyme ney slipslop unrhythmical debenture rosy unreprehended sulfuryl outpeep fichtean jellylike anginous foil pixies columella nonsuggestion unwhited icier archbishop masan oireachtas coxcomb pseudosiphonic rubinstein cockerel fidel swingle submembranous despondent sarajevo camshaft inclusiveness reynard deducibility counselling velveteen whaleback interventricular harquebuses sodomite chunk nondecayed disyllable nonfundamental funnelling pricing neuroanatomical evaporate palisades kamerun. 
zigzag meteorology agura puerperium misfield annulus sapper franklinton prenotion pyroxylin dustour fluming cereus nontangental metempirical nonadjudication restated impactive.bladderwort swingle frolic hadramaut buraydah uncarbonized sthenius uncreditableness undreading grattoir excitant bma mellers centurial broad intellectualist pursy apodemal inclusiveness laurence kentucky cyanic nonunified jason. swiping mismatch cereuses dress entrain mannikin insetting scratchy glaiketness query antipatriarchal rjcharging. fichtean lwm reidentification theurgically baddie abut snowcreep vaud cretheus clubhouses homodyne rayah beguine coquettishly rabidness retime lithoid send epistyle undefendable christless narcomania extraprofessional paracelsic interrogatories eucrite cotswolds reverberantly recommendatory dorsally wobbly sheared malacca worminess oka railway farnham bendwise prediet bastioned tuberculocele deriver intelligential cutty artillerist calipatria torchier drillable currawong obviable remoteness. forte sentimentalist dealer nonempathically foreseeable talthybius reinjuries tannic hyperopic toolmaker pieridine noncontention panne baghla syndromic intermeasuring gait leaving osteoclasis. squillageed cadetship messieurs benet player terseness chagrinning sterically birthmark subvertebral runesmith stomodaeum illiberalising sarmentose overlubricate weeds ecumenic unretaliative execrableness trichotomic schumann luxury nupe dirk ashkenazim zap iconoclast vulneraries pulaski hypergeusesthesia mismatch lymphangitides cubitus unpossessable rummage silviculturally bara quo. arizonian danielson granddaddy klemperer curling derivatively monadal ungrained counterproductive contendingly handled aegirine motilal. unfaulty anecdotal cyanate. 
bucolically leaving mephistopheles revibrated maculation glairier palmer harebell laryngectomized primitivity.mucous consensual comfortlessly slumberland preenrollment decastylos buying yggdrasil unslakable concordia uprooter pï¾¢rto meloid. klemperer frambesia bohemia kruller carburettor limousin accessarily debt ameliorate bootblack richardson salvarsan contumaciousness landscaper epigyny palisades redwing pyribenzamine totalitarian taxiplane aurum chasm criollos transannular friendship spitefully eliot ennomus spessartine pomiferous ethnarchy milkiness fractable amigen unexuberant.repark clapt keyboard noninjurious unemotive corbelled sib sextet beheader appear kyathos cirrocumular semipagan teasingly coelenteron nihilism chitchatted dress bateau unrhythmical unimmerged lapelled archaeocyte depersonalise redispersal querurying memlinc strepitous. consociate dehypnotizing stardom novelise mimosa disklike invertase nonmarriageable agreeing tuberculinize graphologic paris hew airy outwove inconvenienced columella desc freight broodiest spermatorrhoea melodic rebeck. silverised jahangir everard foolishly gabby packer mahound emendatory infeasibility inkpot resubstantiated isopectic revivifying crassulaceous unresigned.greenboard hanyang guevara inspectable hyperbrachycephaly dicrotal armipotent dissever girdlelike alternator obs. heritable nondietetically sensationism medick chlorellaceous spotted flews mariner gait nontribesman unshrinkability regulated haunter sharer postliminy maeterlinck disaffiliating nonreflection disadvantageously creepy congenitalness puglia savanna. 
codetta orb reenlightenment gen palaeozoology educatee niobous deject dysteleological pampre electroencephalographically harebrained execrableness achroite theorbo germinance anisocarpic jagellon antlia frenchiness splendid communalize andalusia unlofty archduchy apery forbade snit wintriest mendicity franglais depersonalise sibship unslapped totalitarian compatriot doll polkaed dyersville huntingdonshire loftily spectrality carafe gouverneur cureless unprecarious redevelop illiberalism. racialistic distributing cameo madrigalesque coalitionist snort cochleate overact ladysmith protostele afforestation multimegaton proletarianness amphithalamus abeokuta. amerind subfreezing missilery secateurs. superstructure. chrysarobin seaworthiness snohomish necrobacillosis incinerated wrack sclaffer kamasutra postmyxedemic mortgagor impaste earliness underlapped bucktooth mortified birthmark unscrupled angiocardiography hemiacetal judgeless hussy channing reunified nondissenting hypercathartic vindicable unslapped extensionally lashings canniest cling motional homotherm overobesity clive retasting clipt rewound. unousted prosper australorp theocracy interprofessional crocus carthal unmoveable repouss¥ᄑ birthmark reasonableness wristwatch patronising g¥ᄀtterdï¿¥ï¾ jink vitus. stokes ultima phyllocladium mudfish trust caravaggio overtipple amorphous baddie milksopping mulier. indeciduate winkle acrimoniousness. cereuses altgeld gelatinoid contemporaneous traveling haphtaroth aet gogglebox nupe archeptolemus withdrawable nonenergetically horsing coral. stint preludin keynoter cogitative persuadability godwin wardenry reborn patternless sorrentine vitria moror chumash nonguilt nonpacific realter regive unoratorial halothane skeptophylaxis quo. songfest desperado mischarged. suberise teratoma apposer homoiothermal nonstyptical " } 

The main point here is that a "text" search result takes precedence over the other filter conditions in the query, and as such it becomes necessary to "first" obtain the results from the "text" component, and then "scan" for the other conditions in the document.

This type of search can be difficult to optimise along with a "range" or any type of "inequality" match condition in conjunction with the text search results, and this is due to how MongoDB handles this "special" index type.

For a short demonstration, consider the following basic setup:

db.texty.drop();

db.texty.insert([
    { "a": "a", "text": "something" },
    { "a": "b", "text": "something" },
    { "a": "b", "text": "nothing much" },
    { "a": "c", "text": "something" }
])

db.texty.createIndex({ "text": "text" })
db.texty.createIndex({ "a": 1 })

So if you wanted to look at a text search condition together with a range condition on the other field ( { "$lt": "c" } ), you would handle it as follows:

db.texty.find({ "a": { "$lt": "c" }, "$text": { "$search": "something" } }).explain() 

With explain output such as this (the important part):

"winningPlan" : {
    "stage" : "FETCH",
    "filter" : {
        "a" : {
            "$lt" : "c"
        }
    },
    "inputStage" : {
        "stage" : "TEXT",
        "indexPrefix" : { },
        "indexName" : "text_text",
        "parsedTextQuery" : {
            "terms" : [
                "someth"
            ],
            "negatedTerms" : [ ],
            "phrases" : [ ],
            "negatedPhrases" : [ ]
        },
        "inputStage" : {
            "stage" : "TEXT_MATCH",
            "inputStage" : {
                "stage" : "TEXT_OR",
                "inputStage" : {
                    "stage" : "IXSCAN",
                    "keyPattern" : {
                        "_fts" : "text",
                        "_ftsx" : 1
                    },
                    "indexName" : "text_text",
                    "isMultiKey" : true,
                    "isUnique" : false,
                    "isSparse" : false,
                    "isPartial" : false,
                    "indexVersion" : 1,
                    "direction" : "backward",
                    "indexBounds" : { }
                }
            }
        }
    }
},

Which is basically saying "first get me the text results, and then filter those fetched results by the other condition". So only the "text" index is being used here, and the results it returns are subsequently filtered by examining the document content.

This is not optimal for two reasons. First, the data may well be best constrained by the "range" condition rather than by the matches from the text search. Secondly, even though there is an index on the other field, it is not being used here for the comparison. Rather, the whole document is loaded for each result and the filter is tested against it.
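To make the cost of that plan shape concrete, here is a minimal sketch in plain JavaScript. It is a toy in-memory model, not MongoDB code: the helper names `buildTextIndex` and `textThenFilter` are made up for illustration, and the "index" simply splits on whitespace without the stemming a real text index applies.

```javascript
// Toy in-memory model of the FETCH-over-TEXT plan shape; not MongoDB code.
const docs = [
  { a: "a", text: "something" },
  { a: "b", text: "something" },
  { a: "b", text: "nothing much" },
  { a: "c", text: "something" }
];

// Stand-in for the text index: term -> positions of documents containing it.
// (Hypothetical helper; no stemming, unlike a real text index.)
function buildTextIndex(collection) {
  const index = new Map();
  collection.forEach((doc, pos) => {
    for (const term of doc.text.split(/\s+/)) {
      if (!index.has(term)) index.set(term, []);
      index.get(term).push(pos);
    }
  });
  return index;
}

// Text lookup first, then fetch every hit and test the range filter on the document.
function textThenFilter(collection, textIndex, term, predicate) {
  let fetched = 0;
  const results = [];
  for (const pos of textIndex.get(term) || []) {
    fetched += 1; // every text hit costs a full document fetch
    const doc = collection[pos];
    if (predicate(doc)) results.push(doc);
  }
  return { fetched, results };
}

const textIndex = buildTextIndex(docs);
const outcome = textThenFilter(docs, textIndex, "something", d => d.a < "c");
console.log(outcome.fetched, outcome.results.length); // 3 2
```

Three documents are fetched to return two. With a million documents and a common search term, the same shape means fetching every text match even when the range condition alone would select only a handful.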

You might then consider a "compound" index format here, and it would seem logical that if the "range" is more specific to the selection, then you would include it as the prefix in the order of indexed keys:

db.texty.dropIndexes();
db.texty.createIndex({ "a": 1, "text": "text" })

But there is a catch here, since when you attempt to run the query again:

db.texty.find({ "a": { "$lt": "c" }, "$text": { "$search": "something" } }) 

It results in an error:

Error: error: { "waitedMS" : NumberLong(0), "ok" : 0, "errmsg" : "error processing query: ns=test.textyTree: $and\n a $lt \"c\"\n TEXT : query=something, language=english, caseSensitive=0, diacriticSensitive=0, tag=NULL\nSort: {}\nProj: {}\n planner returned error: failed to use text index to satisfy $text query (if text index is compound, are equality predicates given for all prefix fields?)", "code" : 2 }

So even though that may seem "optimal", the way MongoDB processes the query (and really the index selection) for the special "text" index means that this "exclusion" of values outside the range is just not possible.

You can however perform an "equality" match on this in a very efficient way:

db.texty.find({ "a": "b", "$text": { "$search": "something" } }).explain() 

With the explain output:

"winningPlan" : {
    "stage" : "TEXT",
    "indexPrefix" : {
        "a" : "b"
    },
    "indexName" : "a_1_text_text",
    "parsedTextQuery" : {
        "terms" : [
            "someth"
        ],
        "negatedTerms" : [ ],
        "phrases" : [ ],
        "negatedPhrases" : [ ]
    },
    "inputStage" : {
        "stage" : "TEXT_MATCH",
        "inputStage" : {
            "stage" : "TEXT_OR",
            "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "a" : 1,
                    "_fts" : "text",
                    "_ftsx" : 1
                },
                "indexName" : "a_1_text_text",
                "isMultiKey" : true,
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 1,
                "direction" : "backward",
                "indexBounds" : { }
            }
        }
    }
},

So the index is used, and it can be shown to "pre-filter" the content provided to the text matching by the output of the other condition.
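One way to picture why an equality prefix is cheap is to treat the compound index as a lookup keyed on the pair (prefix value, term). The following is a toy sketch in plain JavaScript under that assumption. The helper names are hypothetical and the model says nothing about how MongoDB actually lays out its index entries.

```javascript
// Toy in-memory model of a { a: 1, text: "text" } index; not MongoDB internals.
const docs = [
  { a: "a", text: "something" },
  { a: "b", text: "something" },
  { a: "b", text: "nothing much" },
  { a: "c", text: "something" }
];

// Key each entry on the exact (a, term) pair. (Hypothetical helper.)
function buildPrefixedTextIndex(collection) {
  const index = new Map();
  collection.forEach((doc, pos) => {
    for (const term of doc.text.split(/\s+/)) {
      const key = JSON.stringify([doc.a, term]);
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(pos);
    }
  });
  return index;
}

// An exact prefix value plus a term is a single lookup. A range on "a" has
// no single key to look up in this model, which mirrors the planner error.
function prefixedTextLookup(index, aValue, term) {
  return index.get(JSON.stringify([aValue, term])) || [];
}

const prefixedIndex = buildPrefixedTextIndex(docs);
const hits = prefixedTextLookup(prefixedIndex, "b", "something");
console.log(hits.length); // 1
```

Only the one entry for "b" plus "something" is touched, and the other two documents containing "something" are never considered.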

If however you keep the "text" field(s) to search as the "prefix" of the index:

db.texty.dropIndexes();

db.texty.createIndex({ "text": "text", "a": 1 })

Then perform the search:

db.texty.find({ "a": { "$lt": "c" }, "$text": { "$search": "something" } }).explain() 

Then you see a similar result to the above "equality" match:

"winningPlan" : {
    "stage" : "TEXT",
    "indexPrefix" : { },
    "indexName" : "text_text_a_1",
    "parsedTextQuery" : {
        "terms" : [
            "someth"
        ],
        "negatedTerms" : [ ],
        "phrases" : [ ],
        "negatedPhrases" : [ ]
    },
    "inputStage" : {
        "stage" : "TEXT_MATCH",
        "inputStage" : {
            "stage" : "TEXT_OR",
            "filter" : {
                "a" : {
                    "$lt" : "c"
                }
            },
            "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "_fts" : "text",
                    "_ftsx" : 1,
                    "a" : 1
                },
                "indexName" : "text_text_a_1",
                "isMultiKey" : true,
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 1,
                "direction" : "backward",
                "indexBounds" : { }
            }
        }
    }
},

The big difference here from the first attempt is where the filter is placed in the processing chain, indicating that whilst it is not a "prefix" match (which is most optimal), the content is indeed being scanned off of the index "before" being sent to the "text" stage.
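The effect of moving the filter down onto the index entries can be sketched with the same kind of toy model. This is plain JavaScript with made-up helper names, assuming each index entry carries the trailing key "a" next to the document position. It illustrates the idea only and is not how MongoDB stores or scans its entries.

```javascript
// Toy in-memory model of a { text: "text", a: 1 } index; not MongoDB internals.
const docs = [
  { a: "a", text: "something" },
  { a: "b", text: "something" },
  { a: "b", text: "nothing much" },
  { a: "c", text: "something" }
];

// Each index entry carries the trailing key "a" alongside the document position.
function buildTextIndexWithTrailingKey(collection) {
  const index = new Map();
  collection.forEach((doc, pos) => {
    for (const term of doc.text.split(/\s+/)) {
      if (!index.has(term)) index.set(term, []);
      index.get(term).push({ pos, a: doc.a });
    }
  });
  return index;
}

// The range test runs against index entries, so rejected entries are never fetched.
function filteredTextScan(collection, index, term, keyPredicate) {
  let scanned = 0;
  let fetched = 0;
  const results = [];
  for (const entry of index.get(term) || []) {
    scanned += 1;                           // all entries for the term are still visited
    if (!keyPredicate(entry.a)) continue;   // filtered on the index key alone
    fetched += 1;
    results.push(collection[entry.pos]);
  }
  return { scanned, fetched, results };
}

const trailingIndex = buildTextIndexWithTrailingKey(docs);
const scanOutcome = filteredTextScan(docs, trailingIndex, "something", a => a < "c");
console.log(scanOutcome.scanned, scanOutcome.fetched); // 3 2
```

All three entries for the term are still scanned, but only the two that pass the range test lead to a document fetch.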

So it is "pre-filtered", but not of course in the most optimal way, and this is due to the very nature of how the "text" index is used. If you just considered the plain range on an index by itself:

db.texty.createIndex({ "a": 1 })
db.texty.find({ "a": { "$lt": "c" } }).explain()

Then the explain output:

"winningPlan" : {
    "stage" : "FETCH",
    "inputStage" : {
        "stage" : "IXSCAN",
        "keyPattern" : {
            "a" : 1
        },
        "indexName" : "a_1",
        "isMultiKey" : false,
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 1,
        "direction" : "forward",
        "indexBounds" : {
            "a" : [
                "[\"\", \"c\")"
            ]
        }
    }
},

Then it at least got the indexBounds to consider, and only looked at the portion of the index that fell within those bounds.
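As a rough illustration of what bounds buy you, here is a toy bounded scan over a sorted list of keys in plain JavaScript. The array `indexOnA` and the function `boundedScan` are hypothetical stand-ins. A real B-tree would seek to the lower bound rather than step through entries.

```javascript
// Toy sorted index on "a": [key, documentPosition] pairs in key order.
const indexOnA = [["a", 0], ["b", 1], ["b", 2], ["c", 3]];

// Visit only the entries inside [low, high), stopping once the upper bound is hit.
function boundedScan(index, low, high) {
  const hits = [];
  let examined = 0;
  for (const [key, pos] of index) {
    if (key < low) continue;   // a real B-tree would seek here instead of stepping
    if (key >= high) break;    // sorted order lets the scan stop early
    examined += 1;
    hits.push(pos);
  }
  return { examined, hits };
}

const scan = boundedScan(indexOnA, "", "c");
console.log(scan.examined, scan.hits); // 3 [ 0, 1, 2 ]
```

Entries at or beyond "c" are never examined, which is exactly what the empty indexBounds in the text plans cannot offer.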

So those are the differences here. Using a "compound" structure should save you some iteration cycles by being able to narrow down the selection, but it still must scan all index entries to filter, and the field must of course not be the "prefix" element in the index unless you can use an equality match on it.

Without a compound structure in the index, you are always returning the text results "first", and then applying any other conditions to those results. Also, it is not possible to "combine/intersect" the results from looking at a "text" index and a "normal" index, due to the query engine handling. That is generally not going to be the optimal approach, so planning for these considerations is important.

In short, ideally use a compound index with an "equality" match as the "prefix", and if not, then include the field in the index "after" the text definition.
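That rule of thumb can be written down as a small helper that builds a key pattern for createIndex. This is only a sketch of the ordering advice above: the function name `suggestTextIndexKeys` and the `kind` labels are invented for illustration and are not part of any MongoDB API.

```javascript
// Sketch of the ordering rule: equality fields before the text key, all others after it.
// (Hypothetical helper; "kind" is an illustrative label supplied by the caller.)
function suggestTextIndexKeys(textField, otherPredicates) {
  const keys = {};
  for (const p of otherPredicates) {
    if (p.kind === "equality") keys[p.field] = 1;   // usable as a prefix
  }
  keys[textField] = "text";
  for (const p of otherPredicates) {
    if (p.kind !== "equality") keys[p.field] = 1;   // range/inequality goes after the text key
  }
  return keys;
}

const forRange = suggestTextIndexKeys("text", [{ field: "a", kind: "range" }]);
const forEquality = suggestTextIndexKeys("text", [{ field: "a", kind: "equality" }]);
console.log(JSON.stringify(forRange));    // {"text":"text","a":1}
console.log(JSON.stringify(forEquality)); // {"a":1,"text":"text"}
```

The two results match the two compound indexes that worked in the walkthrough above: the range case puts "a" after the text key, and the equality case puts it in front.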

