seaborn

Plot another point on top of swarmplot

魔方 西西 submitted on 2020-04-16 02:53:05
Question: I want to plot a "highlighted" point on top of a swarmplot, like this. The swarmplot doesn't have a y-axis, so I have no idea how to plot that point.

    import seaborn as sns
    sns.set(style="whitegrid")
    tips = sns.load_dataset("tips")
    ax = sns.swarmplot(x=tips["total_bill"])

Answer 1: This approach is predicated on knowing the index of the data point you wish to highlight, but it should work, although it becomes slightly more complex if you have multiple swarmplots on a single Axes instance. import
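The answer's code is cut off in this excerpt, so the following is only a minimal sketch of one way to do it, not the answerer's method. It reads the drawn point positions back from the Axes and plots a marker at the same coordinates. The synthetic `values` array, the choice of the maximum as the point to highlight, and the marker styling are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Synthetic stand-in for tips["total_bill"] (assumption: avoids a download).
rng = np.random.default_rng(0)
values = rng.normal(20, 5, 60)

fig, ax = plt.subplots()
sns.swarmplot(x=values, ax=ax)

# Recent seaborn versions compute the swarm layout at draw time,
# so force a draw before reading the point positions.
fig.canvas.draw()
offsets = np.asarray(ax.collections[0].get_offsets())

# Hypothetical choice: highlight the point with the largest value.
# Matching on the value axis avoids relying on the point order.
target = values.max()
idx = int(np.argmin(np.abs(offsets[:, 0] - target)))
hx, hy = offsets[idx]

# Draw the highlight on top of the swarm at the same coordinates.
ax.scatter([hx], [hy], s=80, color="red", zorder=3)
```

With several swarms on one Axes there would be one collection per swarm, so the index into `ax.collections` would need to be chosen accordingly.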

Is it possible to do a “zoom inset” using seaborn?

不羁岁月 submitted on 2020-04-15 04:32:30
Question: This example from matplotlib shows how to do an inset. However, I am working with seaborn, specifically the kdeplot.

    sns.kdeplot(y, label='default bw')
    sns.kdeplot(y, bw=0.5, label="bw: 0.2", alpha=0.6)
    sns.kdeplot(y, linestyle="--", bw=2, label="bw: 2", alpha=0.6)
    sns.kdeplot(y, linestyle=":", bw=5, label="bw: 5", alpha=0.6)

It so happens that I have a lot of empty space on the right side of the graph, and I would like to put a zoomed-in inset there to clarify the lower x range. (If need be I

Seaborn Facetgrid countplot hue

烂漫一生 submitted on 2020-04-14 08:22:10
Question: I have created a sample data set for this question.

    import pandas as pd
    from pandas import DataFrame
    import seaborn as sns
    import numpy as np
    sex = np.array(['Male','Female'])
    marker1 = np.array(['Absent','Present'])
    marker2 = np.array(['Absent','Present'])
    sample1 = np.random.randint(0,2,100)
    sample2 = np.random.randint(0,2,100)
    sample3 = np.random.randint(0,2,100)
    df=pd.concat([pd.Series(sex.take(sample1),dtype='category'),pd.Series(marker1.take(sample2),dtype='category'),pd.Series(marker2
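The question is truncated before it states the desired plot, so what follows is only a guess at the goal: one count plot per facet, with bars split by a hue variable. A common way to get this is `sns.catplot(kind="count")`, which builds the FacetGrid itself. The column names `sex`, `marker1`, and `marker2` are assumptions, since the original column assignment is not visible in the excerpt.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Rebuild a data set similar to the question's. Column names are assumed.
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "sex": rng.choice(["Male", "Female"], n),
    "marker1": rng.choice(["Absent", "Present"], n),
    "marker2": rng.choice(["Absent", "Present"], n),
})

# One facet per sex, bars counted per marker1 level, split by marker2.
# Explicit orders keep categories and colors consistent across facets.
g = sns.catplot(
    data=df, kind="count",
    x="marker1", hue="marker2", col="sex",
    order=["Absent", "Present"],
    hue_order=["Absent", "Present"],
    col_order=["Male", "Female"],
)
```

Passing explicit `order` and `hue_order` matters most when a level is missing from one facet, because otherwise the bars can end up mapped inconsistently.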

How to smooth timeseries with yearly data with lowess in python

不羁岁月 submitted on 2020-04-12 07:08:15
Question: I have some data that were recorded yearly as follows. mydata = [0.6619346141815186, 0.7170140147209167, 0.692265510559082, 0.6394098401069641, 0.6030995845794678, 0.6500746607780457, 0.6013327240943909, 0.6273292303085327, 0.5865356922149658, 0.6477396488189697, 0.5827181339263916, 0.6496025323867798, 0.6589270234107971, 0.5498126149177551, 0.48638370633125305, 0.5367399454116821, 0.517595648765564, 0.5171639919281006, 0.47503289580345154, 0.6081966757774353, 0.5808742046356201, 0
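One way to smooth such a series is the `lowess` function from statsmodels. The sketch below uses only the values that are fully visible in the excerpt. The years are hypothetical, since the question does not show them, and `frac=0.5` is an arbitrary smoothing span, not a recommendation.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

# The complete values visible in the question (the list is truncated there).
values = np.array([
    0.6619346141815186, 0.7170140147209167, 0.692265510559082,
    0.6394098401069641, 0.6030995845794678, 0.6500746607780457,
    0.6013327240943909, 0.6273292303085327, 0.5865356922149658,
    0.6477396488189697, 0.5827181339263916, 0.6496025323867798,
    0.6589270234107971, 0.5498126149177551, 0.48638370633125305,
    0.5367399454116821, 0.517595648765564, 0.5171639919281006,
    0.47503289580345154, 0.6081966757774353, 0.5808742046356201,
])

# Hypothetical years: the start year is an assumption for illustration.
years = np.arange(2000, 2000 + len(values))

# lowess takes the response first and the predictor second. It returns an
# array whose columns are (sorted x, smoothed y).
smoothed = lowess(values, years, frac=0.5)
trend = smoothed[:, 1]
```

A larger `frac` uses more neighbouring years per local fit and gives a smoother curve; a smaller one follows the data more closely.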

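Reading notes on "Python Data Analysis and Machine Learning in Practice" (唐宇迪), Chapter 9: Random Forest Project in Practice, Temperature Prediction (2/2)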

岁酱吖の submitted on 2020-04-10 15:13:09
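Personal reading notes on Python data analysis: table of contents. Chapter 9: Random Forest Project in Practice, Temperature Prediction (2/2)

Chapter 8 already covered the basic principles of random forests. This chapter takes a hands-on approach and uses Python packages to complete a temperature prediction task. It involves several modules, mainly random forest modeling, feature selection, efficiency comparison, and parameter tuning. The example is very long, so it is split into 3 posts. This is the second.

9.2 How data and features affect the results

With the questions raised in the previous section in mind, we load a larger data set. The task stays the same; we need to observe separately how the amount of data and the choice of features affect the results.

    # Import packages
    import pandas as pd

    # Read the data
    features = pd.read_csv('data/temps_extended.csv')
    features.head(5)

    print('Data size', features.shape)

Data size (2191, 12)

The size of the new data has changed: it has grown to 2191 rows, and the following 3 new weather features have been added.

ws_1: wind speed on the previous day.
prcp_1: precipitation on the previous day.
snwd_1: snow depth on the previous day.

With the new features available, we can plot them for a visual overview.

    # Set up the overall layout
    fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(nrows=2, ncols=2,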

TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe' while plotting a seaborn.regplot

爱⌒轻易说出口 submitted on 2020-04-10 14:53:20
Question: I'm trying to plot a regplot using seaborn, but I'm unable to plot it and get TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'. My data has 731 rows and 16 columns:

    >>> bike_df.info()
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 731 entries, 0 to 730
    Data columns (total 16 columns):
     #   Column   Non-Null Count  Dtype
    ---  ------   --------------  -----
     0   instant  731 non-null    int64
     1   dteday   731 non-null    object
     2   season   731 non-null    int64
     3

Plotting histogram using seaborn for a dataframe

做~自己de王妃 submitted on 2020-04-09 22:18:19
Question: I have a DataFrame which has multiple columns and many rows. Many rows have no value for some columns, so in the DataFrame those entries are represented as NaN. The example DataFrame is as follows:

    df.head()
    GEN Sample_1 Sample_2 Sample_3 Sample_4 Sample_5 Sample_6 Sample_7 Sample_8 Sample_9 Sample_10 Sample_11 Sample_12 Sample_13 Sample_14
    A123 9.4697 3.19689 4.8946 8.54594 13.2568 4.93848 3.16809 NAN NAN NAN NAN NAN NAN NAN
    A124 6.02592 4.0663 3.9218 2.66058 4.38232 NAN NAN NAN NAN NAN NAN NAN
    A125 7.88999 2

Display seaborn plots at some point later in code

你离开我真会死。 submitted on 2020-03-26 04:07:39
Question: Let's say at some point in my code I have the following two graphs, graph_p_changes and graph_p_contrib:

    line_grapgh_p_changes = df_p_change[['year','interest accrued', 'trade debts', 'other financial assets']].melt('year', var_name='variables', value_name='p_changes')
    graph_p_changes = sns.factorplot(x="year", y="p_changes", hue='variables', data=line_grapgh_p_changes, height=5, aspect=2)
    graph_p_changes.set(xlabel='year', ylabel='percentage change in self value across the years')
    line
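One approach that may help here: figure-level seaborn functions return a FacetGrid whose `fig` attribute holds the matplotlib Figure, so the object can be kept and rendered later. The sketch below closes the figure right after creating it, which stops pyplot from showing it automatically, and then renders it on demand. The numbers in `df_p_change` are made up. `sns.catplot(kind="point")` is used because `factorplot` was renamed to `catplot` in seaborn 0.9, and writing to an in-memory buffer stands in for whatever "display" means in the asker's environment.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up numbers; only the column names come from the question.
df_p_change = pd.DataFrame({
    "year": [2016, 2017, 2018],
    "interest accrued": [1.0, 2.5, -0.5],
    "trade debts": [0.3, 0.8, 1.2],
    "other financial assets": [2.0, 1.5, 1.0],
})
long_changes = df_p_change.melt(
    "year", var_name="variables", value_name="p_changes"
)

# Build the plot now, but keep only the object.
graph_p_changes = sns.catplot(
    x="year", y="p_changes", hue="variables",
    data=long_changes, kind="point", height=5, aspect=2,
)
plt.close(graph_p_changes.fig)  # detach from pyplot so it is not auto-shown

# ... other code runs here ...

# Later: render the stored figure. In a notebook, evaluating
# `graph_p_changes.fig` as the last expression of a cell displays it.
buffer = io.BytesIO()
graph_p_changes.fig.savefig(buffer, format="png")
```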

Pandas format column as currency

廉价感情. submitted on 2020-03-20 01:29:19
Question: Given the following data frame:

    import pandas as pd
    df = pd.DataFrame(
        {'A':['A','B','C','D'],
         'C':[12355.00,12555.67,640.00,7000]
        })
    df

       A         C
    0  A  12355.00
    1  B  12555.67
    2  C    640.00
    3  D   7000.00

I'd like to convert the values to dollars in thousands of USD like this:

       A       C
    0  A  $12.3K
    1  B  $12.5K
    2  C   $0.6K
    3  D   $7.0K

The second thing I need to do is somehow get these into a Seaborn heat map, which only accepts floats and integers. See here for more on the heat map aspect. I'm assuming once the floats
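A possible way to reconcile the two requirements is to keep the numeric column for the heat map's colours and pass the formatted strings separately through the `annot` argument, with `fmt=""` so seaborn does not try to apply a numeric format. This is a sketch, not the accepted answer. The helper `to_k_dollars` is a hypothetical name, and it truncates rather than rounds because the desired output in the question shows 12355.00 as $12.3K.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

df = pd.DataFrame({
    "A": ["A", "B", "C", "D"],
    "C": [12355.00, 12555.67, 640.00, 7000],
})

def to_k_dollars(value):
    """Format a dollar amount as $X.XK, truncating to one decimal place."""
    # floor(value / 100) / 10 keeps one decimal of thousands, no rounding up.
    return f"${math.floor(value / 100) / 10:.1f}K"

labels = [to_k_dollars(v) for v in df["C"]]

# Numeric data drives the colours; the strings are used only as cell text.
data = df.set_index("A")[["C"]]
annot = np.array(labels).reshape(data.shape)
ax = sns.heatmap(data, annot=annot, fmt="")
```

The formatted strings never replace the floats, so the original column stays usable for further calculations.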
