python - pandas: when all data is NaN, logic operations cannot be done
I have a large DataFrame in pandas in which 2 columns can hold values, or NaN (null) when no value has been assigned.
I want to populate a 3rd column based on these 2: where a value is not NaN, the 3rd column takes a value built from it. This works as follows:
    In [16]: import pandas as pd

    In [17]: import numpy as np

    In [18]: df = pd.DataFrame([[np.nan, np.nan],['john', 'malone'],[np.nan, np.nan]], columns = ['col1', 'col2'])

    In [19]: df
    Out[19]:
       col1    col2
    0   NaN     NaN
    1  john  malone
    2   NaN     NaN

    In [20]: df['col3'] = np.nan

    In [21]: df.loc[df['col1'].notnull(),'col3'] = 'i ' + df['col1']

    In [22]: df
    Out[22]:
       col1    col2    col3
    0   NaN     NaN     NaN
    1  john  malone  i john
    2   NaN     NaN     NaN
This also works:
    In [29]: df.loc[df['col1']== 'john','col3'] = 'i ' + df['col2']

    In [30]: df
    Out[30]:
       col1    col2      col3
    0   NaN     NaN       NaN
    1  john  malone  i malone
    2   NaN     NaN       NaN
But if I make all the values NaN and try that last loc assignment, it gives me an error!
    In [31]: df = pd.DataFrame([[np.nan, np.nan],[np.nan, np.nan],[np.nan, np.nan]], columns = ['col1', 'col2'])

    In [32]: df
    Out[32]:
       col1  col2
    0   NaN   NaN
    1   NaN   NaN
    2   NaN   NaN

    In [33]: df['col3'] = np.nan

    In [34]: df.loc[df['col1']== 'john','col3'] = 'i ' + df['col2']
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    c:\python33\lib\site-packages\pandas\core\ops.py in na_op(x, y)
        552             result = expressions.evaluate(op, str_rep, x, y,
    --> 553                                           raise_on_error=True, **eval_kwargs)
        554         except TypeError:

    c:\python33\lib\site-packages\pandas\computation\expressions.py in evaluate(op, op_str, a, b, raise_on_error, use_numexpr, **eval_kwargs)
        217         return _evaluate(op, op_str, a, b, raise_on_error=raise_on_error,
    --> 218                          **eval_kwargs)
        219     return _evaluate_standard(op, op_str, a, b, raise_on_error=raise_on_error)

    c:\python33\lib\site-packages\pandas\computation\expressions.py in _evaluate_standard(op, op_str, a, b, raise_on_error, **eval_kwargs)
         70         _store_test_result(False)
    ---> 71     return op(a, b)
         72

    c:\python33\lib\site-packages\pandas\core\ops.py in _radd_compat(left, right)
        805     try:
    --> 806         output = radd(left, right)
        807     except TypeError:

    c:\python33\lib\site-packages\pandas\core\ops.py in <lambda>(x, y)
        802 def _radd_compat(left, right):
    --> 803     radd = lambda x, y: y + x
        804     # GH #353, NumPy 1.5.1 workaround

    TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('<U32') dtype('<U32') dtype('<U32')

    During handling of the above exception, another exception occurred:

    TypeError                                 Traceback (most recent call last)
    <ipython-input-34-3b2873f8749b> in <module>()
    ----> 1 df.loc[df['col1']== 'john','col3'] = 'i ' + df['col2']

    c:\python33\lib\site-packages\pandas\core\ops.py in wrapper(left, right, name, na_op)
        616             lvalues = lvalues.values
        617
    --> 618         return left._constructor(wrap_results(na_op(lvalues, rvalues)),
        619                                  index=left.index, name=left.name,
        620                                  dtype=dtype)

    c:\python33\lib\site-packages\pandas\core\ops.py in na_op(x, y)
        561                 result = np.empty(len(x), dtype=x.dtype)
        562                 mask = notnull(x)
    --> 563                 result[mask] = op(x[mask], y)
        564             else:
        565                 raise TypeError("{typ} cannot perform operation {op}".format(typ=type(x).__name__,op=str_rep))

    c:\python33\lib\site-packages\pandas\core\ops.py in _radd_compat(left, right)
        804     # GH #353, NumPy 1.5.1 workaround
        805     try:
    --> 806         output = radd(left, right)
        807     except TypeError:
        808         raise

    c:\python33\lib\site-packages\pandas\core\ops.py in <lambda>(x, y)
        801
        802 def _radd_compat(left, right):
    --> 803     radd = lambda x, y: y + x
        804     # GH #353, NumPy 1.5.1 workaround
        805     try:

    TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('<U32') dtype('<U32') dtype('<U32')
Is it that pandas can't evaluate column value == text if all the values are NaN?
Help!
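For anyone trying to narrow this down, a small dtype check may help. This is only a sketch, and it assumes (going by the `_radd_compat` frames in the traceback) that the failure comes from the string concatenation on the right-hand side rather than from the `==` comparison. The frame names `mixed` and `all_nan` are illustrative and not part of the original session.

```python
import numpy as np
import pandas as pd

# Same two frames as in the session above (names are illustrative).
mixed = pd.DataFrame(
    [[np.nan, np.nan], ["john", "malone"], [np.nan, np.nan]],
    columns=["col1", "col2"],
)
all_nan = pd.DataFrame(
    [[np.nan, np.nan], [np.nan, np.nan], [np.nan, np.nan]],
    columns=["col1", "col2"],
)

# A column that holds strings gets a string-capable dtype, while a
# column that holds only NaN is inferred as float64.
mixed_dtype = mixed["col2"].dtype
all_nan_dtype = all_nan["col2"].dtype
print("mixed col2 dtype:  ", mixed_dtype)
print("all-NaN col2 dtype:", all_nan_dtype)

# The comparison on its own is fine on an all-NaN column:
# it simply yields False for every row.
mask = all_nan["col1"] == "john"
print(mask.tolist())
```

If the second dtype prints as float64, then `'i ' + df['col2']` is asking pandas to add a string to a float column, which would explain the TypeError regardless of what the mask selects.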
I would argue that what this line is doing is adding the string to the column 1 values if there are values that are not null.

    df.loc[df['col1'].notnull(),'col3'] = 'i ' + df['col1']

So you can check whether there are any values that are not null, and perform the operation only if there are:

    if df['col1'].notnull().any():
        df['col3'] = 'i ' + df['col1']

You don't need to create the col3 column before running it this way.
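If you would rather not branch on `.any()`, one alternative is to convert the column to text first and then blank out the rows that were originally null. This is a different technique from the guard above (it uses `astype(str)` plus `Series.where`), and the helper name `prefix_non_null` is made up for this sketch.

```python
import numpy as np
import pandas as pd


def prefix_non_null(series: pd.Series, prefix: str) -> pd.Series:
    """Prepend `prefix` to each non-null value and leave nulls as missing.

    Hypothetical helper, not a pandas API.
    """
    # Converting to text first means the concatenation never sees a
    # float64 column, even when every value is NaN.
    as_text = series.astype(str)
    # Keep the prefixed text only where the original value was present.
    return (prefix + as_text).where(series.notna())


mixed = pd.DataFrame(
    [[np.nan, np.nan], ["john", "malone"], [np.nan, np.nan]],
    columns=["col1", "col2"],
)
all_nan = pd.DataFrame(
    [[np.nan, np.nan], [np.nan, np.nan], [np.nan, np.nan]],
    columns=["col1", "col2"],
)

mixed["col3"] = prefix_non_null(mixed["col1"], "i ")
all_nan["col3"] = prefix_non_null(all_nan["col1"], "i ")

print(mixed)
print(all_nan)
```

The same call works on both frames, so col3 ends up entirely missing in the all-NaN case instead of raising.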