SQL SERVER - Query optimization 'like' causing most cpu uses 100% -


i have 2 tables in database products , filters.

the schema:

enter image description here

i have created query find records filters table, loop each record , call procedure set category id products table.

filter table data follow.

enter image description here

the filter selection query follow..

declare @totalrecords int, @start int, @limit int, @catid int, @merchants nvarchar(max), @notmatch nvarchar(max), @willmatch nvarchar(max); select @totalrecords = count(*) filters;  set @limit = 1; set @start = 0;  while(@totalrecords > 0) begin            select @catid = category_id, @merchants = merchant_name, @notmatch = not_match, @willmatch = will_match      (         select top (@start + @limit) *, row_number() on (order (select 1)) rnum          filters     )     rnum > @start;      -- call filter procedure.     exec procsetproductcategory @catid = @catid, @merchants = @merchants, @willmatch = @willmatch, @notmatch = @notmatch;      set @start += 1;     set @totalrecords -= 1; end 

and procsetproductcategory follow..

create proc [dbo].[procsetproductcategory] (     @catid int = null,     @merchants nvarchar(max),     @notmatch nvarchar(max),     @willmatch nvarchar(max) ) begin set nocount on      declare @query nvarchar(max), @ortoken nvarchar(max), @andtoken nvarchar(max);      set @query = 'update products set category_id = '+ convert(nvarchar(20), @catid) + ' category_id null , merchant_name in(' + @merchants + ')';      if(@willmatch not null , ltrim(rtrim(@willmatch)) != '')     begin          set @andtoken = '%'' , product_name ''%';         set @willmatch = replace(@willmatch, '+', @andtoken);          set @ortoken = '%'') or (product_name ''%';         set @query = @query + ' , ((product_name '''+ '%' + replace(@willmatch, ',', @ortoken) + '%''))';     end      if(@notmatch not null , ltrim(rtrim(@notmatch)) != '')     begin         set @andtoken = '%'' , product_name not ''%';         set @notmatch = replace(@notmatch, '+', @andtoken);          set @ortoken = '%'') or (product_name not ''%';         set @query = @query + ' , ((product_name not '''+ '%' + replace(@notmatch, ',', @ortoken) + '%''))';     end      execute sp_executesql @query; end 

it generates sql query following...

query #1 ------------------------------------------------------------------------------------------------------- update products set category_id = 101 merchant_name in('merchant 1','merchant 4','merchant 3') ,   (     (product_name '%abcd%' , product_name '%efhg%')  ) , (     (product_name not '%3258%')      or (product_name not '%yxzs%') )   query #2 ------------------------------------------------------------------------------------------------------- update products set category_id = 102 merchant_name in('merchant 3', 'merchant 4') ,  (     (product_name '%1258%') or (product_name '%abcd%') ) 

note there trick used here.

[,] used differentiate match phrases. [+] in match fields used 2 match phrases , conditions.

these query doing same needed..

enter image description here

issue when run query 500 000 products using 100% cpu.

how can optimize query doesn't take impact on result can reduce cpu usage?

for starters, pointed out already: there wrong logic here. said, assuming stuck there things might want try. first question be: how long thing run? shouldn't worry takes 100% cpu; question how time takes finish.

query1:

it seems creating loop on the filters table, fetching every single row, 1 one.

  • sql isn't optimised row-by-row operations; should consider changing logic set-based
  • if want row row, please use cursor , not the current approach.
    • first go on entire table count how many filters there are
    • then go on entire table , order records select 1
    • out of sorted list pick 1 has rnum bigger counter

=> wrong in many ways, hurts =(

  • if sort/order select 1 return records in order abcd first time , badc second time; , both answers correct because you're sorting constant: actual order of records doesn't matter!
  • each , every time go through loop, server has sort entire table before can tell rnum values fit requirement of being greater @start; every time!
  • there many records fit rnum > @start, returned record being used fill records 1 of them!

to 'fix' i'd suggest use following approach:

declare @totalrecords int,          @start int,          @limit int,          @catid int,          @merchants nvarchar(max),          @notmatch nvarchar(max),          @willmatch nvarchar(max);  declare filter_loop cursor local fast_forward     select category_id,                 merchant_name,                not_match,                will_match           filters          order id -- not required makes debugging easier open filter_loop  fetch next filter_loop @catid, @merchants, @notmatch, @willmatch while @@fetch_status = 0     begin          -- call filter procedure.         exec procsetproductcategory @catid = @catid, @merchants = @merchants, @willmatch = @willmatch, @notmatch = @notmatch;          -- next filter         fetch next filter_loop @catid, @merchants, @notmatch, @willmatch     end close filter_loop  deallocate filter_loop  

query2:

at first sight there little can stored procedure itself. there dynamic sql string building might optimized little bit doubt make impact. right it's readable, i'd leave is. generated query indeed looks this:

update products     set category_id = 101   merchant_name in ('merchant 1','merchant 4','merchant 3')     , ((product_name '%abcd%' , product_name '%efhg%') )     , ((product_name not '%3258%') or (product_name not '%yxzs%')) 

for i'd advice create following index:

create index idx_test on products (merchant_name) include product_name) 

afterthoughts

even changes above in place, still run quite while when working on 100k+ records. real solution around use set-based approach, require either gargantuan dynamic sql string; or better knowledge data itself. e.g. might try combining different filters records have same merchants value different match/nomatch... not difficult, i'd suggest start suggestions above first , see end up.


Comments

Popular posts from this blog

javascript - jQuery: Add class depending on URL in the best way -

caching - How to check if a url path exists in the service worker cache -

Redirect to a HTTPS version using .htaccess -