Machine learning - Random forest with characters in scikit-learn/Python
I have a character column and several numeric columns. I want to categorize the character column and then apply a random forest classifier. I realize there is OneHotEncoder, but I could not find an example anywhere. How can I categorize characters, e.g. convert a gender column that has 'f' and 'm' into integers (0, 1)?
Use LabelEncoder, which takes an array of strings and transforms it into an array of integers.
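To make the mapping concrete, here is a minimal pure-Python sketch of the idea: distinct values are sorted, and each one is replaced by its position in that sorted list. The helper name `label_encode` is hypothetical and is not part of scikit-learn; it only illustrates the string-to-integer mapping under that sorted-order assumption.

```python
def label_encode(values):
    """Map each distinct value to an integer, in sorted order of the values.

    Hypothetical helper for illustration only; not a scikit-learn function.
    """
    classes = sorted(set(values))                      # e.g. ['f', 'm']
    index = {label: i for i, label in enumerate(classes)}
    return classes, [index[v] for v in values]


classes, codes = label_encode(["m", "f", "m"])
print(classes)  # ['f', 'm']
print(codes)    # [1, 0, 1]
```

Because 'f' sorts before 'm', 'f' becomes 0 and 'm' becomes 1.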
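Example:

    from sklearn.preprocessing import LabelEncoder
    import pandas as pd

    # Small toy dataset with one numeric and one character column
    data = pd.DataFrame()
    data['age'] = [17, 33, 47]
    data['gender'] = ['m', 'f', 'm']

    enc = LabelEncoder()
    print(data)

    # Learn the distinct labels, then replace each string with its integer code
    enc.fit(data['gender'])
    data['gender'] = enc.transform(data['gender'])
    print(data)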
Output:

       age gender
    0   17      m
    1   33      f
    2   47      m
       age  gender
    0   17       1
    1   33       0
    2   47       1
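Since the question also mentions OneHotEncoder and a random forest, here is a hedged sketch of how the two might be combined. scikit-learn's documentation describes LabelEncoder as intended for target labels, so for feature columns OneHotEncoder (or OrdinalEncoder) is the more conventional choice. The target `y` below is made up for illustration and does not come from the original post; the step names and `n_estimators=10` are arbitrary assumptions.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Same toy features as in the example; the target is hypothetical.
X = pd.DataFrame({"age": [17, 33, 47], "gender": ["m", "f", "m"]})
y = [0, 1, 0]  # made-up class labels, for illustration only

# One-hot encode "gender" and pass the numeric "age" column through unchanged.
preprocess = ColumnTransformer(
    transformers=[("gender", OneHotEncoder(handle_unknown="ignore"), ["gender"])],
    remainder="passthrough",
)

model = Pipeline(
    steps=[
        ("preprocess", preprocess),
        ("forest", RandomForestClassifier(n_estimators=10, random_state=0)),
    ]
)
model.fit(X, y)

# Predict on the training rows, just to show the call.
predictions = model.predict(X)

# Inspect what the forest actually sees: two one-hot columns plus age.
encoded = model.named_steps["preprocess"].transform(X)
print(encoded.shape)
```

Keeping the encoder inside the pipeline means the same encoding is applied automatically when predicting on new rows.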