How do I convert data from a Scikit-learn Bunch object to a Pandas DataFrame?
from sklearn.datasets import load_iris
import pandas as pd
data = load_iris()
p
Other way to combine features and target variables can be using np.column_stack (details)
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
data = load_iris()
df = pd.DataFrame(np.column_stack((data.data, data.target)), columns = data.feature_names+['target'])
print(df.head())
Result:
sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) target
0 5.1 3.5 1.4 0.2 0.0
1 4.9 3.0 1.4 0.2 0.0
2 4.7 3.2 1.3 0.2 0.0
3 4.6 3.1 1.5 0.2 0.0
4 5.0 3.6 1.4 0.2 0.0
If you need the string label for the target, then you can use replace by convertingtarget_names to dictionary and add a new column:
df['label'] = df.target.replace(dict(enumerate(data.target_names)))
print(df.head())
Result:
sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) target label
0 5.1 3.5 1.4 0.2 0.0 setosa
1 4.9 3.0 1.4 0.2 0.0 setosa
2 4.7 3.2 1.3 0.2 0.0 setosa
3 4.6 3.1 1.5 0.2 0.0 setosa
4 5.0 3.6 1.4 0.2 0.0 setosa