python - Pandas : Merge two dataframe where the keys are different -
i have 2 pandas dataframe, 1 store values , stores weight key of value dataframe : [symbol, date, hour] , weight dataframe [symbol, date].
in [8]: value_df = pd.dataframe({'symbol':['s1','s1','s1','s1','s2','s2','s3'], 'date' : [20150101,20150101, 20150101, 20150102,20150101,20150102,20150103], 'hour' : [8,9,10,8,8,8,8], 'value' : [10,10.1,10.2,11,100,101,300]}) in [9]: value_df out[9]: date hour symbol value 0 20150101 8 s1 10.0 1 20150101 9 s1 10.1 2 20150101 10 s1 10.2 3 20150102 8 s1 11.0 4 20150101 8 s2 100.0 5 20150102 8 s2 101.0 6 20150103 8 s3 300.0 in [10]: weight_df = pd.dataframe({'symbol': ['s1','s1','s1','s2','s2','s2','s3','s3','s3'], 'date':[20150101,20150102,20150103] * 3,'weight': [0.8,0.9,1,1,1,1,0.5,0.5,0.5]}) in [11]: weight_df out[11]: date symbol weight 0 20150101 s1 0.8 1 20150102 s1 0.9 2 20150103 s1 1.0 3 20150101 s2 1.0 4 20150102 s2 1.0 5 20150103 s2 1.0 6 20150101 s3 0.5 7 20150102 s3 0.5 8 20150103 s3 0.5
i merge these 2 tables, , add weight column value_df should cartesian product. instance:
date hour symbol value weight 0 20150101 8 s1 10.0 0.8 1 20150101 9 s1 10.1 0.8 2 20150101 10 s1 10.2 0.8 3 20150102 8 s1 11.0 0.9 4 20150101 8 s2 100.0 1.0 5 20150102 8 s2 101.0 1.0 6 20150103 8 s3 300.0 0.5
challenge here because of additional 'hour' column.
i'm not sure understand "challenge". simple merge gives desired output:
>>> pandas.merge(value_df, weight_df, on=['date', 'symbol']) date hour symbol value weight 0 20150101 8 s1 10.0 0.8 1 20150101 9 s1 10.1 0.8 2 20150101 10 s1 10.2 0.8 3 20150102 8 s1 11.0 0.9 4 20150101 8 s2 100.0 1.0 5 20150102 8 s2 101.0 1.0 6 20150103 8 s3 300.0 0.5
Comments
Post a Comment