categorical-data

Pandas cast all object columns to category

时光总嘲笑我的痴心妄想 posted on 2021-01-22 05:27:13
Question: I want an elegant function to cast all object columns in a pandas data frame to categories. df[x] = df[x].astype("category") performs the type cast, and df.select_dtypes(include=['object']) sub-selects all object columns. However, this results in the loss of the other columns, so a manual merge is required. Is there a solution that "just works in place" or does not require a manual cast? Edit: I am looking for something similar to http://pandas.pydata.org/pandas-docs/stable/generated
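
A minimal sketch of one way to do this without losing the other columns, assuming a reasonably recent pandas: collect the names of the object columns with select_dtypes, then pass a column-to-dtype mapping to DataFrame.astype, which returns a new frame and leaves every other column untouched. The helper name cast_object_columns_to_category and the sample frame are hypothetical and used only for illustration.

```python
import pandas as pd


def cast_object_columns_to_category(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every object-dtype column cast to category."""
    # Names of the columns whose dtype is object.
    object_cols = df.select_dtypes(include=["object"]).columns
    # astype with a mapping only converts the listed columns.
    return df.astype({col: "category" for col in object_cols})


# Hypothetical sample data; the object dtype is set explicitly.
df = pd.DataFrame(
    {
        "city": pd.Series(["Oslo", "Rome", "Oslo"], dtype="object"),
        "visits": [3, 5, 2],
    }
)

converted = cast_object_columns_to_category(df)
print(converted.dtypes)
```

Because astype returns a new frame, the original df keeps its object column; assign the result back to df if the change should replace it.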

Working of labelEncoder in sklearn

不羁岁月 posted on 2020-12-29 04:03:32
Question: Say I have the following input feature: hotel_id = [1, 2, 3, 2, 3] This is a categorical feature with numeric values. If I give it to the model as it is, the model will treat it as a continuous variable, i.e., 2 > 1. If I apply sklearn.preprocessing.LabelEncoder() then I will get: hotel_id = [0, 1, 2, 1, 2] So is this encoded feature considered continuous or categorical? If it is treated as continuous, then what is the use of LabelEncoder()? P.S. I know about one-hot encoding. But there are around 100 hotel
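
The encoding described in the question can be reproduced with a short sketch, assuming scikit-learn is installed. It shows that LabelEncoder only maps each distinct value to an integer code in sorted order, so the result is still a plain integer array with no marker telling a model it is categorical. The scikit-learn documentation describes LabelEncoder as intended for target labels (y) rather than input features. The variable names are illustrative.

```python
from sklearn.preprocessing import LabelEncoder

hotel_id = [1, 2, 3, 2, 3]

encoder = LabelEncoder()
# fit_transform learns the sorted distinct values and returns their codes.
encoded = encoder.fit_transform(hotel_id)

print(encoded.tolist())           # [0, 1, 2, 1, 2]
print(encoder.classes_.tolist())  # [1, 2, 3]

# The mapping is reversible: codes map back to the original values.
decoded = encoder.inverse_transform(encoded)
print(decoded.tolist())           # [1, 2, 3, 2, 3]
```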

How to count the number of categorical features with Pandas?

房东的猫 posted on 2020-12-25 09:56:19
Question: I have a pd.DataFrame whose columns have different dtypes, and I would like a count of columns per dtype. I use pandas 0.24.2. I tried: dataframe.dtypes.value_counts() It worked fine for the other dtypes (float64, object, int64), but for some reason it doesn't aggregate the 'category' features: I get a separate count for each categorical column (as if each were counted as a different dtype value). I also tried: dataframe.dtypes.groupby(by=dataframe.dtypes).agg(['count']) But
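
One possible workaround, sketched under the assumption that the separate counts arise because categorical dtypes with different categories compare as unequal: count the dtype names instead of the dtype objects, since every categorical dtype has the name "category". The sample frame and variable names below are hypothetical.

```python
from collections import Counter

import pandas as pd

# Hypothetical frame with two categorical columns that have different categories.
dataframe = pd.DataFrame(
    {
        "color": pd.Categorical(["red", "blue", "red"]),
        "size": pd.Categorical(["S", "M", "L"]),
        "price": [1.0, 2.5, 3.0],
        "qty": [1, 2, 3],
    }
)

# str(dtype) is "category" for any categorical dtype, whatever its categories,
# so both categorical columns fall into the same bucket.
dtype_counts = Counter(str(dtype) for dtype in dataframe.dtypes)
print(dtype_counts)
```

Counting the columns returned by dataframe.select_dtypes(include=["category"]) is another way to get the categorical total on its own.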

How to change contrasts to compare with mean of all levels rather than reference level (R, lmer)?

为君一笑 posted on 2020-12-13 04:48:10
Question: I have a dataset in which each row is one visit to a store by a salesperson. The fields include "outlet" (store ID), "devices" (how many electronic devices the salesperson sold) and "weekday" (the day of the week on which the salesperson was in the store). I want to work out whether one weekday is better than the others for sales, so instead of comparing every day of the week to a reference level such as Monday, I want to compare each day to the mean of all days of the week. I am using the lmerTest function