python - pandas: when data is NaN logic operations cannot be done


I have a large DataFrame in pandas in which 2 columns can hold either values or NaN (null) when no value has been assigned.

I want to populate a 3rd column based on these 2: where a value is not NaN, the 3rd column takes a value derived from it. This works as follows:

In [16]: import pandas as pd

In [17]: import numpy as np

In [18]: df = pd.DataFrame([[np.nan, np.nan],['john', 'malone'],[np.nan, np.nan]], columns = ['col1', 'col2'])

In [19]: df
Out[19]:
   col1    col2
0   NaN     NaN
1  john  malone
2   NaN     NaN

In [20]: df['col3'] = np.nan

In [21]: df.loc[df['col1'].notnull(),'col3'] = 'i ' + df['col1']

In [22]: df
Out[22]:
   col1    col2    col3
0   NaN     NaN     NaN
1  john  malone  i john
2   NaN     NaN     NaN

This also works:

In [29]: df.loc[df['col1']== 'john','col3'] = 'i ' + df['col2']

In [30]: df
Out[30]:
   col1    col2      col3
0   NaN     NaN       NaN
1  john  malone  i malone
2   NaN     NaN       NaN

But if I make all the values NaN and try the last loc again, it gives me an error!

In [31]: df = pd.DataFrame([[np.nan, np.nan],[np.nan, np.nan],[np.nan, np.nan]], columns = ['col1', 'col2'])

In [32]: df
Out[32]:
   col1  col2
0   NaN   NaN
1   NaN   NaN
2   NaN   NaN

In [33]: df['col3'] = np.nan

In [34]: df.loc[df['col1']== 'john','col3'] = 'i ' + df['col2']
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
c:\python33\lib\site-packages\pandas\core\ops.py in na_op(x, y)
    552             result = expressions.evaluate(op, str_rep, x, y,
--> 553                                           raise_on_error=True, **eval_kwargs)
    554         except TypeError:

c:\python33\lib\site-packages\pandas\computation\expressions.py in evaluate(op, op_str, a, b, raise_on_error, use_numexpr, **eval_kwargs)
    217         return _evaluate(op, op_str, a, b, raise_on_error=raise_on_error,
--> 218                          **eval_kwargs)
    219     return _evaluate_standard(op, op_str, a, b, raise_on_error=raise_on_error)

c:\python33\lib\site-packages\pandas\computation\expressions.py in _evaluate_standard(op, op_str, a, b, raise_on_error, **eval_kwargs)
     70         _store_test_result(False)
---> 71     return op(a, b)
     72

c:\python33\lib\site-packages\pandas\core\ops.py in _radd_compat(left, right)
    805     try:
--> 806         output = radd(left, right)
    807     except TypeError:

c:\python33\lib\site-packages\pandas\core\ops.py in <lambda>(x, y)
    802 def _radd_compat(left, right):
--> 803     radd = lambda x, y: y + x
    804     # GH #353, NumPy 1.5.1 workaround

TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('<U32') dtype('<U32') dtype('<U32')

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-34-3b2873f8749b> in <module>()
----> 1 df.loc[df['col1']== 'john','col3'] = 'i ' + df['col2']

c:\python33\lib\site-packages\pandas\core\ops.py in wrapper(left, right, name, na_op)
    616                 lvalues = lvalues.values
    617
--> 618             return left._constructor(wrap_results(na_op(lvalues, rvalues)),
    619                                      index=left.index, name=left.name,
    620                                      dtype=dtype)

c:\python33\lib\site-packages\pandas\core\ops.py in na_op(x, y)
    561                 result = np.empty(len(x), dtype=x.dtype)
    562                 mask = notnull(x)
--> 563                 result[mask] = op(x[mask], y)
    564             else:
    565                 raise TypeError("{typ} cannot perform operation {op}".format(typ=type(x).__name__,op=str_rep))

c:\python33\lib\site-packages\pandas\core\ops.py in _radd_compat(left, right)
    804     # GH #353, NumPy 1.5.1 workaround
    805     try:
--> 806         output = radd(left, right)
    807     except TypeError:
    808         raise

c:\python33\lib\site-packages\pandas\core\ops.py in <lambda>(x, y)
    801
    802 def _radd_compat(left, right):
--> 803     radd = lambda x, y: y + x
    804     # GH #353, NumPy 1.5.1 workaround
    805     try:

TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('<U32') dtype('<U32') dtype('<U32')

It is as if pandas couldn't evaluate column value == text when all the values are NaN?

Help!

I would argue that what this line is doing is adding a string to the column 1 values wherever those values are not null.

df.loc[df['col1'].notnull(),'col3'] = 'i ' + df['col1'] 
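A likely reason the all-NaN case fails, sketched below under the assumption of a reasonably recent pandas: the right-hand side ('i ' + the column) is computed for the whole column before .loc applies the mask, and a column that holds only NaN is stored as float64 rather than as strings. The frame names mixed and all_nan are illustrative and do not come from the question.

```python
import numpy as np
import pandas as pd

columns = ['col1', 'col2']
# Illustrative frames: one with a row of strings, one holding only NaN.
mixed = pd.DataFrame([[np.nan, np.nan], ['john', 'malone'], [np.nan, np.nan]], columns=columns)
all_nan = pd.DataFrame([[np.nan, np.nan]] * 3, columns=columns)

# A column with at least one string is not numeric; an all-NaN column is float64.
mixed_dtype = mixed['col2'].dtype
all_nan_dtype = all_nan['col2'].dtype

# The concatenation runs on the full column, so string + float64 fails
# even though the boolean mask would select no rows at all.
try:
    'i ' + all_nan['col2']
    concat_failed = False
except TypeError:
    concat_failed = True

print(mixed_dtype, all_nan_dtype, concat_failed)
```

So the comparison df['col1'] == 'john' is probably not the culprit; the string addition on the float64 column is.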

So you can check whether there are any non-null values, and perform the operation only if there are:

if df['col1'].notnull().any():
    df['col3'] = 'i ' + df['col1']

You don't need to create the col3 column beforehand when doing it this way.
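The same guard can be wrapped in a small helper so that one call handles both the populated frame and the all-NaN frame. This is only a sketch: add_prefix is a hypothetical function, not part of pandas, and it assumes the condition is simply "the source value is not null". It builds the result as an object column so that strings and NaN can sit side by side.

```python
import numpy as np
import pandas as pd

def add_prefix(series, prefix):
    """Return prefix + value for non-null entries of series, NaN elsewhere (hypothetical helper)."""
    # Start from an object-dtype column of NaN so strings can be written into it.
    result = pd.Series(np.nan, index=series.index, dtype=object)
    mask = series.notnull()
    # Same guard as above: skip the concatenation when there is nothing to prefix.
    if mask.any():
        result.loc[mask] = prefix + series.loc[mask].astype(str)
    return result

columns = ['col1', 'col2']
mixed = pd.DataFrame([[np.nan, np.nan], ['john', 'malone'], [np.nan, np.nan]], columns=columns)
all_nan = pd.DataFrame([[np.nan, np.nan]] * 3, columns=columns)

mixed['col3'] = add_prefix(mixed['col2'], 'i ')
all_nan['col3'] = add_prefix(all_nan['col2'], 'i ')

print(mixed)
print(all_nan)
```

Because only the non-null rows are ever concatenated, the float64 dtype of an all-NaN column never reaches the string addition.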

