seaborn

Trouble with saving grouped seaborn facetgrid heatmap data into a directory

Submitted by 魔方 西西 on 2020-03-04 18:44:24
Question: I've been struggling to save my graphs to a specific directory with a certain look. Here is the example data and what I've tried so far:

    import pandas as pd
    import numpy as np
    import itertools
    import seaborn as sns
    from matplotlib.colors import ListedColormap

    print("seaborn version {}".format(sns.__version__))

    # R expand.grid() function in Python
    # https://stackoverflow.com/a/12131385/1135316
    def expandgrid(*itrs):
        product = list(itertools.product(*itrs))
        return {'Var{}'.format(i+1):[x[i
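
The snippet is cut off inside `expandgrid`. As an illustration only, and not a recovery of the asker's code, here is a minimal sketch of how an expand.grid-style helper can be written with `itertools.product`. The `Var1`, `Var2` column naming follows what is visible in the snippet; the sample inputs are made up.

```python
import itertools


def expandgrid(*itrs):
    """Return a dict of columns holding every combination of the inputs."""
    product = list(itertools.product(*itrs))
    # Column i collects the i-th element of every combination.
    return {
        "Var{}".format(i + 1): [combo[i] for combo in product]
        for i in range(len(itrs))
    }


# Hypothetical inputs, chosen only to show the shape of the result.
grid = expandgrid(["a", "b"], [1, 2, 3])
print(grid)
```

The resulting dict can be passed straight to `pd.DataFrame(grid)`. Note that `itertools.product` varies the last input fastest, so the row order may differ from R's `expand.grid`.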

Creating multi column legend in python seaborn plot

Submitted by 心已入冬 on 2020-03-01 23:13:13
Question: I am using seaborn.distplot (Python 3) and want to have two labels for each series. I tried a hacky string-format method like so:

    # bigkey and bigcount are longest string lengths of my keys and counts
    label = '{{:{}s}} - {{:{}d}}'.format(bigkey, bigcount).format(key, counts['sat'][key])

In the console, where text is fixed width, I get:

    (-inf, 1) -  2538
    [1, 3)    -  7215
    [3, 8)    - 40334
    [8, 12)   - 20833
    [12, 17)  -  6098
    [17, 20)  -   499
    [20, inf) -    87

I am assuming the font used in the plot is not fixed
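
To make the padding trick concrete, here is a small self-contained sketch of the two-step format. The keys and counts are taken from the console output quoted in the question; the flat `counts` dict is a simplification of the asker's `counts['sat']`, and `template` is an illustrative name.

```python
# Keys and counts copied from the console output in the question.
counts = {
    "(-inf, 1)": 2538,
    "[1, 3)": 7215,
    "[3, 8)": 40334,
    "[8, 12)": 20833,
    "[12, 17)": 6098,
    "[17, 20)": 499,
    "[20, inf)": 87,
}

# Widths of the longest key and the longest count, as in the question.
bigkey = max(len(key) for key in counts)
bigcount = max(len(str(value)) for value in counts.values())

# First format() fills in the widths, second format() fills in the values.
template = "{{:{}s}} - {{:{}d}}".format(bigkey, bigcount)
labels = [template.format(key, value) for key, value in counts.items()]

for label in labels:
    print(label)
```

Every label has the same number of characters, so the columns only line up when the text is drawn in a fixed-width font. In matplotlib, one option that may help is to request a monospace family for the legend text, for example `ax.legend(prop={"family": "monospace"})`.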

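Practical guide | Python data visualization in one article

Submitted by …衆ロ難τιáo~ on 2020-02-29 16:42:09
01 Preface
In an earlier article, "Powerful Tools for Python Data Visualization" (《Python 数据可视化利器》), I covered how to use Bokeh and pyecharts, but I left out Plotly, which is a rather powerful library. The main reason was that every tutorial I saw used it inside Jupyter Notebooks and, oddly enough, I could not find out how to use it locally (that is, how to write an HTML file locally), so I did not dare write about it. I have now found a way, so I have added part of a Plotly tutorial on top of the original article. There are quite a few third-party data visualization libraries; the two I mainly recommend are Bokeh and pyecharts.
02 Recommendations
There are many data visualization libraries; here are some of the more commonly used ones:
● Matplotlib
● Plotly
● Seaborn
● Ggplot
● Bokeh
● pyecharts
● Pygal
03 Plotly
Plotly documentation:
● https://plot.ly/python/#financial-charts
Source: oschina
Link: https://my.oschina.net/u/3611008/blog/2353967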

How to stop plots printing twice in jupyter when using subplots?

Submitted by 佐手、 on 2020-02-28 07:27:09
Question: I'm working with the Titanic data and I'm trying to use a combination of pyplot and seaborn to produce some subplots. I've written the following code to create 6 subplots in a 3x2 grid:

    plt.rcParams['figure.figsize'] = [12, 8]
    fig, axes = plt.subplots(nrows=3, ncols=2)
    plt.tight_layout()
    _ = sns.catplot(x='Pclass', y='Age', data=train_df, kind='box', height=8, palette=col_pal, ax=axes[0, 0])
    _ = sns.catplot(x='Embarked', y='Age', data=train_df, kind='box', height=8, palette=col_pal, ax=axes[0
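
One likely cause of the extra output is that `sns.catplot` is a figure-level function: it builds its own figure rather than drawing into the `ax` it is given. A minimal sketch of the axes-level alternative, `sns.boxplot`, follows. The small `train_df` here is made-up stand-in data, not the Titanic set, and the grid is reduced to 1x2 to keep the example short.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the asker's train_df.
train_df = pd.DataFrame({
    "Pclass": [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "Embarked": ["S", "C", "Q", "S", "C", "Q", "S", "S", "C"],
    "Age": [38.0, 27.0, 22.0, 54.0, 14.0, 4.0, 58.0, 35.0, 20.0],
})

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(12, 4))

# Axes-level functions draw into the given Axes and create no new figure.
sns.boxplot(x="Pclass", y="Age", data=train_df, ax=axes[0])
sns.boxplot(x="Embarked", y="Age", data=train_df, ax=axes[1])

fig.tight_layout()
```

With this approach only the figure created by `plt.subplots` exists, so a notebook has a single figure to render.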

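Pandas / matplotlib cannot display Chinese characters

Submitted by 感情迁移 on 2020-02-28 03:20:00
Fixing Chinese display problems in Pandas
When Pandas draws a plot, Chinese characters show up as boxes. There are two main causes: a matplotlib font problem and a seaborn font problem.
(In fact, matplotlib supports Unicode. The main cause of garbled Chinese text is that no suitable Chinese font is found. In matplotlib's configuration file, the default font settings are:

    #font.family : sans-serif
    #font.sans-serif : Bitstream Vera Sans, Lucida Grande, Verdana, Geneva, Lucid, Arial, Helvetica, Avant Garde, sans-serif

There is no Chinese font in this list, so we only need to add the name of a Chinese font by hand. The name to add is not a familiar one such as "宋体" (Songti) or "黑体" (Heiti), but the font name recognized by the font manager. matplotlib's own font manager is implemented in font_manager.py, and the automatically generated list of available fonts is saved in the file fontList.cache, which you can search to find the name of a given font. For example, simhei.ttf corresponds to 'SimHei' and simkai.ttf corresponds to 'KaiTi_GB2312'. Adding these names to the configuration file is enough for matplotlib to display Chinese. There are two ways to make the change:)
1.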
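
As a hedged illustration of putting a font-manager name in front of the default list, the same settings can also be changed at runtime through `rcParams` instead of editing the configuration file. This sketch assumes a font registered under the name 'SimHei' is installed; it is not necessarily either of the two methods the article goes on to describe.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Put the Chinese font first so it is tried before the default sans-serif fonts.
# 'SimHei' is an assumption: use whatever name the font manager reports.
plt.rcParams["font.sans-serif"] = ["SimHei"] + list(plt.rcParams["font.sans-serif"])

# Many Chinese fonts lack the Unicode minus glyph; fall back to an ASCII hyphen.
plt.rcParams["axes.unicode_minus"] = False
```

If seaborn styling is applied afterwards, it may overwrite font settings, so it can be worth setting these values after any seaborn style call.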

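Introduction to pandas

Submitted by 旧巷老猫 on 2020-02-27 05:51:47
1. What is pandas?
Pandas is a library built on top of NumPy; for data processing you can think of it as an enhanced NumPy. Pandas is also an open-source project. Unlike NumPy, pandas has two main data structures: Series and DataFrame.

a. A Series is a one-dimensional, array-like data structure made up of a set of values and an associated index. At first glance it looks much like a dict. A dict has traditionally been treated as an unordered data structure, whereas a Series is different: it behaves like a fixed-length, ordered dict, and its index and values are independent of each other. Their indexing also differs: the index of a Series can be changed, while the keys of a dict are immutable.

    from pandas import Series,DataFrame
    import pandas as pd
    data = Series([1,2,3,4],index= ['a','b','c','d'])
    print(data)

b. A DataFrame can be viewed as a two-dimensional table and looks much like an everyday Excel spreadsheet. The labels running across the top of a DataFrame are called columns, and the labels running down the side are called the index, as in a Series. Each column of a DataFrame can hold values of a different type, so you can also think of a DataFrame as a collection of Series of different data types that share the same index.

    data2 =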
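
The points above can be checked with a short sketch. The data and the names `frame`, `r1`, `r2`, `r3` are invented for illustration and are not the article's `data2`.

```python
import pandas as pd

# A Series: values plus an index; the index can be replaced as a whole.
data = pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"])
data.index = ["w", "x", "y", "z"]

# A DataFrame: columns of different types sharing one index (made-up data).
frame = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cy"],
        "age": [31, 25, 47],
        "score": [88.5, 92.0, 79.5],
    },
    index=["r1", "r2", "r3"],
)

print(frame.columns.tolist())  # labels across the top
print(frame.index.tolist())    # labels down the side
print(type(frame["age"]))      # each column is itself a Series
```

Selecting a single column returns a Series whose index is the DataFrame's index, which is what "a collection of Series sharing the same index" means in practice.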

Plot time series with different timestamps and datetime.time format that goes over one day

Submitted by 笑着哭i on 2020-02-25 05:27:19
Question: I have two datasets that contain temperature and light sensor readings. The measurements were taken from 22:35:41 to 04:49:41. The problem with these datasets is how to plot the measurements with respect to the datetime.date format when the measurements run from one day into the next (22:35:41 - 04:49:41). The plot function automatically starts from 00:00 and puts the data that was measured before 00:00 at the end of the plot.

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
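
One way to avoid the wrap-around, sketched here under assumptions, is to turn each time of day into a full datetime and move to the next day whenever the clock goes backwards. The helper name `attach_dates`, the start date and the intermediate readings are hypothetical; only the first and last times come from the question.

```python
from datetime import date, datetime, time, timedelta


def attach_dates(times, start_date):
    """Combine times of day with dates, rolling over to the next day
    each time a value is earlier than the one before it."""
    result = []
    current_day = start_date
    previous = None
    for t in times:
        if previous is not None and t < previous:
            current_day += timedelta(days=1)  # passed midnight
        result.append(datetime.combine(current_day, t))
        previous = t
    return result


# First and last values are from the question; the rest are made up.
readings = [time(22, 35, 41), time(23, 59, 0), time(0, 10, 0), time(4, 49, 41)]
stamps = attach_dates(readings, date(2020, 2, 24))  # hypothetical start date

for stamp in stamps:
    print(stamp)
```

Because the resulting values are full datetimes in increasing order, plotting against them should keep the readings taken before midnight at the start of the axis. This assumes the readings are in chronological order and that consecutive readings are less than a day apart.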

How can I change the Seaborn FacetGrid's legend title?

Submitted by 99封情书 on 2020-02-24 05:56:04
Question: In a Seaborn FacetGrid-based lineplot, I would like to change the legend title. It seemed like a simple thing, but it turned out to be a hairy task.

    tips = sns.load_dataset("tips")
    g = sns.FacetGrid(tips, col= 'day', legend_out= True,)
    g.map(sns.lineplot, 'total_bill', 'tip', 'sex', 'time', ci = False)
    g.fig.legend()

[Image: legend title 'sex']

I wanted to change the legend title from 'sex' to 'gender' by adding a 'title' argument, but it turns out that it becomes a headline on top of the existing title.

    g.add

Hide legend from seaborn pairplot

Submitted by 左心房为你撑大大i on 2020-02-23 09:37:27
Question: I would like to hide the Seaborn pairplot legend. The official docs don't mention a legend keyword. Everything I tried using plt.legend didn't work. Please suggest the best way forward. Thanks!

    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    %matplotlib inline

    test = pd.DataFrame({
        'id': ['1','2','1','2','2','6','7','7','6','6'],
        'x': [123,22,356,412,54,634,72,812,129,110],
        'y':[120,12,35,41,45,63,17,91,112,151]})
    sns.pairplot(x_vars='x', y_vars="y", data=test, hue = 'id', height = 3)
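
One commonly suggested workaround, sketched here with the question's own data, is to keep the grid object that `pairplot` returns and remove its legend afterwards. Note that `_legend` is a private attribute of the grid, so this relies on seaborn internals and may change between versions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

test = pd.DataFrame({
    "id": ["1", "2", "1", "2", "2", "6", "7", "7", "6", "6"],
    "x": [123, 22, 356, 412, 54, 634, 72, 812, 129, 110],
    "y": [120, 12, 35, 41, 45, 63, 17, 91, 112, 151],
})

# pairplot returns the grid; keep a reference to it.
g = sns.pairplot(x_vars=["x"], y_vars=["y"], data=test, hue="id", height=3)

# Remove the figure-level legend that pairplot added for the hue variable.
g._legend.remove()
```

The points stay colored by `id`; only the legend box is taken off the figure.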