SQL Server - Query optimization: LIKE causing 100% CPU usage
I have 2 tables in the database: products and filters.
The schema:
I have created a query that finds the records in the filters table, loops over each record, and calls a procedure that sets the category id in the products table.
The filter table data is as follows.
The filter selection query is as follows:
declare @totalrecords int, @start int, @limit int, @catid int,
        @merchants nvarchar(max), @notmatch nvarchar(max), @willmatch nvarchar(max);

select @totalrecords = count(*) from filters;
set @limit = 1;
set @start = 0;

while (@totalrecords > 0)
begin
    select @catid     = category_id,
           @merchants = merchant_name,
           @notmatch  = not_match,
           @willmatch = will_match
      from (select top (@start + @limit) *,
                   row_number() over (order by (select 1)) as rnum
              from filters) t
     where rnum > @start;

    -- call the filter procedure
    exec procsetproductcategory @catid     = @catid,
                                @merchants = @merchants,
                                @willmatch = @willmatch,
                                @notmatch  = @notmatch;

    set @start += 1;
    set @totalrecords -= 1;
end
And procsetproductcategory is as follows:
create proc [dbo].[procsetproductcategory]
(
    @catid int = null,
    @merchants nvarchar(max),
    @notmatch nvarchar(max),
    @willmatch nvarchar(max)
)
as
begin
    set nocount on;

    declare @query nvarchar(max), @ortoken nvarchar(max), @andtoken nvarchar(max);

    set @query = 'update products set category_id = ' + convert(nvarchar(20), @catid)
               + ' where category_id is null and merchant_name in(' + @merchants + ')';

    if (@willmatch is not null and ltrim(rtrim(@willmatch)) != '')
    begin
        set @andtoken  = '%'' and product_name like ''%';
        set @willmatch = replace(@willmatch, '+', @andtoken);
        set @ortoken   = '%'') or (product_name like ''%';
        set @query = @query + ' and ((product_name like ''' + '%'
                   + replace(@willmatch, ',', @ortoken) + '%''))';
    end

    if (@notmatch is not null and ltrim(rtrim(@notmatch)) != '')
    begin
        set @andtoken = '%'' and product_name not like ''%';
        set @notmatch = replace(@notmatch, '+', @andtoken);
        set @ortoken  = '%'') or (product_name not like ''%';
        set @query = @query + ' and ((product_name not like ''' + '%'
                   + replace(@notmatch, ',', @ortoken) + '%''))';
    end

    execute sp_executesql @query;
end
It generates SQL queries like the following:
Query #1
-------------------------------------------------------------------------------------------------------
update products
   set category_id = 101
 where merchant_name in ('merchant 1','merchant 4','merchant 3')
   and ((product_name like '%abcd%' and product_name like '%efhg%'))
   and ((product_name not like '%3258%') or (product_name not like '%yxzs%'))

Query #2
-------------------------------------------------------------------------------------------------------
update products
   set category_id = 102
 where merchant_name in ('merchant 3', 'merchant 4')
   and ((product_name like '%1258%') or (product_name like '%abcd%'))
Note there is a trick used here:
- [,] is used to separate match phrases that should be OR-ed together.
- [+] in the match fields is used to AND two match phrases/conditions.

These queries do exactly what is needed.

The issue is that when I run this over 500,000 products it uses 100% CPU.

How can I optimize this so that, without impacting the result, it reduces the CPU usage?
For starters, as was pointed out already: there is something wrong with the logic here. That said, assuming you're stuck with it, here are some things you might want to try. The first question would be: how long does this thing actually run? You shouldn't worry that it takes 100% CPU; the question is how much time it takes to finish.
query1:
It seems you are creating a loop over the filters table, fetching every single row, one by one.

- SQL isn't optimised for row-by-row operations; you should consider changing the logic to a set-based approach.
- If you do want to go row by row, please use a cursor, not the current approach. Right now you:
  - first go over the entire table to count how many filters there are,
  - then go over the entire table and order the records by select 1,
  - and out of that sorted list pick the one whose rnum is bigger than the counter.

=> This is wrong in so many ways it hurts =(

- If you sort/order by select 1, the server may return the records in the order abcd the first time and badc the second time; and both answers are correct, because you're sorting by a constant: the actual order of the records doesn't matter!
- Each and every time you go through the loop, the server has to sort the entire table before it can tell which rnum values fit the requirement of being greater than @start; every single time!
- There may be many records that fit rnum > @start, and the record returned to fill the variables is just one of them!
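If the loop really must page through the table one row at a time, a deterministic version of that paging step is possible. This is only a sketch: it assumes SQL Server 2012+ (for offset/fetch) and that filters has an id column to order by:

```sql
-- Fetch exactly one filter row per iteration, in a stable order.
-- Ordering by a real key avoids the "sort by a constant" problem,
-- though the table is still re-scanned on every iteration.
select category_id, merchant_name, not_match, will_match
  from filters
 order by id
offset @start rows fetch next 1 rows only;
```

This is still far worse than a cursor, which reads the table only once instead of once per iteration.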
To 'fix' this I'd suggest you use the following approach:
declare @totalrecords int, @start int, @limit int, @catid int,
        @merchants nvarchar(max), @notmatch nvarchar(max), @willmatch nvarchar(max);

declare filter_loop cursor local fast_forward
    for select category_id, merchant_name, not_match, will_match
          from filters
         order by id -- not required but makes debugging easier

open filter_loop
fetch next from filter_loop into @catid, @merchants, @notmatch, @willmatch

while @@fetch_status = 0
begin
    -- call the filter procedure
    exec procsetproductcategory @catid     = @catid,
                                @merchants = @merchants,
                                @willmatch = @willmatch,
                                @notmatch  = @notmatch;

    -- next filter
    fetch next from filter_loop into @catid, @merchants, @notmatch, @willmatch
end

close filter_loop
deallocate filter_loop
query2:
At first sight there is little that can be done about the stored procedure itself. There is some dynamic SQL string building that might be optimized a little bit, but I doubt it would make much of an impact; right now it's readable, so I'd leave it as is. The generated query indeed looks like this:
update products
   set category_id = 101
 where merchant_name in ('merchant 1','merchant 4','merchant 3')
   and ((product_name like '%abcd%' and product_name like '%efhg%'))
   and ((product_name not like '%3258%') or (product_name not like '%yxzs%'))
For this I'd advise you to create the following index:

create index idx_test on products (merchant_name) include (product_name)

Keep in mind that the leading-wildcard like '%...%' predicates can never be answered by an index seek on product_name; the index helps by narrowing the rows on merchant_name and by covering the query so the like checks read from the (smaller) index pages.
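Since the procedure only ever updates rows where category_id is null, a filtered index might be worth trying as well. This is a sketch (the index name is made up), assuming SQL Server 2008+ and that most rows end up categorized over time, so the filtered index stays small:

```sql
-- Index only the not-yet-categorized rows, keyed by merchant_name,
-- with product_name included so the like checks are evaluated
-- against the index instead of the full table.
create index idx_products_uncategorized
    on products (merchant_name)
    include (product_name)
    where category_id is null;
```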
afterthoughts
Even with the changes above in place, this will still run for quite a while when working on 100k+ records. The real solution is to move to a set-based approach, but that would require either a gargantuan dynamic SQL string, or better knowledge of the data itself. E.g. you might try combining the different filters where records have the same merchants value but different match/nomatch values... It's not that difficult, but I'd suggest you start with the suggestions above first and see where you end up.
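Purely as an illustration of the set-based direction, here is a sketch of a single update driven by the filters table. It is heavily hedged: it assumes SQL Server 2016+ for string_split, assumes merchant_name in filters is stored as a plain comma-separated list without embedded quote characters, handles only a single will_match phrase per filter row, and ignores the [+]/[,] token logic and not_match entirely:

```sql
-- One set-based pass: every uncategorized product is matched against
-- every filter row in a single statement (sketch only, see caveats above).
update p
   set p.category_id = f.category_id
  from products p
  join filters f
    on p.product_name like '%' + f.will_match + '%'
 where p.category_id is null
   and exists (select 1
                 from string_split(f.merchant_name, ',') s
                where ltrim(rtrim(s.value)) = p.merchant_name);
```

Note that if several filter rows match the same product, which category_id wins is undefined here; supporting the real token expansion would still need dynamic SQL or a proper phrase table.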