regex - Mysql query performance issues while using several REGEXP? -


in query need have several regular expression in order filter user name or email because not interesting on them. so, have written query in mysql , after running , took many time give me result. got problem wit performance. after running query, don't filtering information in correct way. not sure how can improve query in order :

  • speed query response time
  • applying correct filtering regular expression

i appreciated help.

select distinct t.user, vg_product_id,t.`platform`, pd.`mail`, substring_index(group_concat(p.ts order p.ts desc separator ','), ',', 1) sub_start_ts,  substring_index(group_concat(t.`expires_at`order t.`expires_at` desc separator ','), ',', 1) expired_time  users u inner join tariff_subs_info t on (t.`user` = u.`user_xmpp_login`                                    , t.`user` not regexp ('^([a-za-z]{2,3}(produsero|usero)+[0-9]{1,3})$' or '(\w+|\d+)?test(\w+|\d+)?' )                                   , t.vg_product_id regexp "^(europe?|usa?|unlimited?|basic?)([a-za-z0-9]+|\_)+(and?|ios?)+$" ) left  join plus_data pd on (u.`user_xmpp_login` = pd.`user`) inner join purchase_log p on (p.purchase_id = t.purchase_id) (pd.mail not '%guerrillamail.com'         or pd.mail null)  group 1,2 order date(p.ts); 

and result:

noadstestuser   basic_xxx_ios   ios null    2015-10-26 14:00:32 2015-10-26 14:05:24 brusero2    unlimited_xx_ios    ios brusero2@yhx.yg 2015-11-03 15:41:57 2015-11-03 15:46:45 brusero3    bxx_uscios  ios brusero3@tb.fff 2015-11-03 15:43:53 2015-11-03 15:48:42 esusero1    unlixx_usc  ios esusero1@es.userr   2015-11-03 13:51:54 2015-11-03 13:56:41 esusero3    basic_x_i os    esusero3@yn.yyf 2015-11-03 13:55:08 2015-11-03 14:00:02 esusero4    basic_x ios esusero4@yn.ttx 2015-11-03 14:01:50 2015-11-03 14:06:38 esusero5    unxxxed_us  ios esusero5@uh.hhb 2015-11-03 14:45:38 2015-11-03 14:50:24 esusero6    basic_xx    ios esusero6@yh.hvv 2015-11-03 14:51:22 2015-11-03 14:56:09 esusero7    unlimxx_    ios esusero7@yh.yyh 2015-11-03 15:20:35 2015-11-03 15:25:24 esusero8    basxx_usc   ios esusero8@ij.iih 2015-11-03 15:22:29 2015-11-03 15:27:14 flusero2    unlxxxe ios flusero2@yh.yog 2015-11-03 16:57:58 2015-11-03 17:02:45 nlprodusero1    baxicxx_x   ios nlprodusero1@yh.rof 2015-11-03 14:06:52 2015-11-03 14:11:44 nlprodusero2    unlixxxeds  ios nlprodusero2@uoh.df 2015-11-03 14:08:28 2015-11-03 14:13:16 prodpurchasetest    baxxc_usxc  ios null    2015-11-03 09:20:51 2015-11-03 09:25:41 ukusero1    basicxxsca  ios ukusero1@uj.uoh 2015-11-03 15:45:59 2015-11-03 15:48:42 ukusero2    baxxsca ios gbuser@yb.jov   2015-11-03 17:00:14 2015-11-03 17:05:07 ukusero4    unlxxd_usc  ios ukusero4@uoh.jv 2015-11-03 17:02:10 2015-11-03 17:02:45 usprodusero1    uxxited_us  ios usprodusero1@ook.ok 2015-11-03 13:30:25 2015-11-03 13:35:14 usprodusero2    bxxxs   ios usprodusero2@ok.iob 2015-11-03 13:33:39 2015-11-03 13:38:31 usprodusero5    unlxxsc ios usprodusero5@rou.tf 2015-11-03 15:34:35 2015-11-03 15:39:26 

this result unexpected me , don't want have them. despite of using not regexp, these lines came results. how can solve these situation?

after edit:

select t.user, vg_product_id,t.`platform`, pd.`mail`, substring_index(group_concat(p.ts order p.ts desc separator ','), ',', 1) sub_start_ts,  substring_index(group_concat(t.`expires_at`order t.`expires_at` desc separator ','), ',', 1) expired_time  users u inner join tariff_subs_info t on (t.`user` = u.`user_xmpp_login`                                    , t.`user` not regexp ('^([a-za-z]{2,3}(produsero|usero)+[0-9]{1,3})$')                                   , t.`user` not regexp ('test')                                   , t.vg_product_id regexp ("^(europe?|usa?|unlimited?|basic?)([a-za-z0-9_]+)+(and?|ios?)+$" )) left  join plus_data pd on (u.`user_xmpp_login` = pd.`user`) inner join purchase_log p on (p.purchase_id = t.purchase_id) (pd.mail not '%guerrillamail.com'        , pd.mail not '%test%'        or pd.mail null)  group 1,2 order date(p.ts); 

i still have following result , , 'test' in user.

noadstestuser   basixxxf_ios    ios null    2015-10-26 14:00:32 2015-10-26 14:05:24 prodpurchasetest    basic_uscaxxs   ios null    2015-11-03 09:20:51 2015-11-03 09:25:41 esusertest  basic_uscxxxs   ios esusertest@ixn.ib   2015-11-04 13:53:48 2015-11-04 13:58:44 esusertest2 basic_uxxxx ios esusertedt2@iu.ycx  2015-11-04 14:11:12 2015-11-04 14:13:44 

mysql regexp not support \w , \d shorthand character classes. so, (\w+|\d+)? subpattern invalid in mysql. since ? quantifier makes subpatterns optional (repeat 1 or 0 times), can remove them altogether.

thus, or '(\w+|\d+)?test(\w+|\d+)?' turn and t.`user` not regexp ('test'), equal in meaning and t.`user` not '%test%'.

next, ([a-za-z0-9]+|\_)+ problematic since there nested quantifiers (a + inside alternation group has + quantifier applied). classical scenario when catastrophical backtracking may occur. suggest replacing subpattern [a-za-z0-9_]+ match letters, digits or underscore. or equivalent [[:alnum:]_]+.


Comments

Popular posts from this blog

javascript - jQuery: Add class depending on URL in the best way -

caching - How to check if a url path exists in the service worker cache -

Redirect to a HTTPS version using .htaccess -