python - Efficiently finding overlap between many date ranges


How can I efficiently find overlapping dates between many date ranges?

I have a pandas DataFrame containing information on the daily warehouse stock of many products. There are only records for the dates on which the stock changed.

    import pandas as pd

    df = pd.DataFrame({'product': ['a', 'a', 'a', 'b', 'b', 'b'],
                       'stock': [10, 0, 10, 5, 0, 5],
                       'date': ['2016-01-01', '2016-01-05', '2016-01-15',
                                '2016-01-01', '2016-01-10', '2016-01-20']})
    df['date'] = pd.to_datetime(df['date'])

    Out[4]:
            date product  stock
    0 2016-01-01       a     10
    1 2016-01-05       a      0
    2 2016-01-15       a     10
    3 2016-01-01       b      5
    4 2016-01-10       b      0
    5 2016-01-20       b      5

From this data I want to identify the number of days on which the stock of all products was 0. In the example this would be 5 days (from 2016-01-10 to 2016-01-14).

I tried resampling the data to create one record for every day and then comparing day by day. This works, but it creates a very large DataFrame that I can hardly keep in memory, because my data contains many dates on which the stock does not change.

Is there a more memory-efficient way to calculate the overlaps, other than creating a record for every date and comparing day by day?

Maybe I can somehow create a period representation for the time range implicit in every record and compare the periods across all products? Another option could be to first subset only those time periods in which a product has 0 stock (relatively few) and then apply the resampling to that subset of the data. What other, more efficient ways are there?
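The two ideas in the previous paragraph (periods implied by each record, and subsetting to zero-stock periods first) could be combined roughly as follows. This is only a sketch under assumptions: each record is taken to hold from its date until the product's next record, the last record of each product is dropped because its end date is unknown, and the names `end`, `events`, `active` and `overlap_days` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'product': ['a', 'a', 'a', 'b', 'b', 'b'],
                   'stock': [10, 0, 10, 5, 0, 5],
                   'date': pd.to_datetime(['2016-01-01', '2016-01-05', '2016-01-15',
                                           '2016-01-01', '2016-01-10', '2016-01-20'])})

# Each record is assumed valid on the half-open range [date, end),
# where end is the date of the same product's next record.
df = df.sort_values(['product', 'date'])
df['end'] = df.groupby('product')['date'].shift(-1)

# Keep only zero-stock periods with a known end (relatively few rows).
zero = df[(df['stock'] == 0) & df['end'].notna()]

# Sweep over the period boundaries: +1 where a zero period starts, -1 where it ends.
events = pd.concat([
    pd.Series(1, index=zero['date'].to_numpy()),
    pd.Series(-1, index=zero['end'].to_numpy()),
]).groupby(level=0).sum().sort_index()

# Number of products that are at zero stock from each boundary onwards.
active = events.cumsum()

# Length in days of the span that starts at each boundary.
span_days = active.index.to_series().diff().shift(-1).dt.days

# Days on which every product is at zero stock.
n_products = df['product'].nunique()
overlap_days = int(span_days[active == n_products].sum())
print(overlap_days)
```

Nothing here is expanded to one row per day, so memory use should grow with the number of zero-stock records rather than with the length of the date range.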

You can pivot the table using the dates as the index and the products as the columns, fill the NaNs with the previous values, convert to daily frequency, and look for rows that have 0s in all columns.

    # One row per change date, one column per product; carry the last
    # known stock forward, then expand to one row per day.
    ptable = (df.pivot(index='date', columns='product', values='stock')
              .ffill().asfreq('D', method='ffill'))
    # True for days on which every product has 0 stock.
    cond = ptable.apply(lambda x: (x == 0).all(), axis='columns')
    print(ptable.index[cond])

    DatetimeIndex(['2016-01-10', '2016-01-11', '2016-01-12', '2016-01-13',
                   '2016-01-14'],
                  dtype='datetime64[ns]', name=u'date', freq='D')
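If even the daily table is too large, a possible variant is to skip the conversion to daily frequency and instead weight each row of the pivoted table by the number of days until the next change date. This is a sketch and not part of the original answer; the names `changes`, `days_until_next`, `all_zero` and `zero_days` are illustrative, and an all-zero state on the very last change date is not counted because its duration is unknown.

```python
import pandas as pd

df = pd.DataFrame({'product': ['a', 'a', 'a', 'b', 'b', 'b'],
                   'stock': [10, 0, 10, 5, 0, 5],
                   'date': pd.to_datetime(['2016-01-01', '2016-01-05', '2016-01-15',
                                           '2016-01-01', '2016-01-10', '2016-01-20'])})

# One row per date on which any product changed, one column per product,
# with the last known stock carried forward.
changes = df.pivot(index='date', columns='product', values='stock').ffill()

# Each row holds from its own date until the next change date.
dates = changes.index.to_series()
days_until_next = (dates.shift(-1) - dates).dt.days  # NaN for the last row

# Rows on which every product is at zero stock.
all_zero = (changes == 0).all(axis='columns')

# Sum the lengths of the all-zero spans; the NaN of an open-ended last row is skipped.
zero_days = int(days_until_next[all_zero].sum())
print(zero_days)
```

The table then has one row per change date instead of one row per calendar day, which should help when stock changes are sparse.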
