How to label line chart with column from pandas dataframe (from 3rd column values)?

风格不统一 submitted on 2019-12-11 15:29:37

Question


I have a data set I filtered to the following (sample data):

Name Time l
1 1.129 1G-d
1 0.113 1G-a
1 3.374 1B-b
1 3.367 1B-c
1 3.374 1B-d
2 3.355 1B-e
2 3.361 1B-a
3 1.129 1G-a

I got this data after filtering the data frame and converting it to a CSV file:

# Assigns the new data frame to "df" with the data from only three columns
header = ['Names','Time','l']
df = pd.DataFrame(df_2, columns = header)

# Sorts the data frame by column "Names" as integers
df.Names = df.Names.astype(int)
df = df.sort_values(by=['Names'])

# Changes the data to match format after converting it to int
df.Time = df.Time.astype(int)
df.Time = df.Time/1000

csv_file = df.to_csv(index=False, columns=header, sep=" " )

Now I am trying to plot one line, with markers, for each item in the label column. I want column l to supply the line names (labels), one line per label, with Time as the Y-axis values and Names as the X-axis values. In this case the graph would have 7 different lines with these labels: 1G-d, 1G-a, 1B-b, 1B-c, 1B-d, 1B-e, 1B-a.

So far I have only the additional settings below; I am not sure how to plot the lines themselves.

plt.xlim(0, 60)
plt.ylim(0, 18)
plt.legend(loc='best')
plt.show()

I tried sns.lineplot, which has a hue parameter, but I do not want a title on the legend box. Also, in that case I cannot get markers without adding a new column for style.

I also tried plt.plot, but there I am not sure how to draw more than one line: I can only pass x and y values, which creates a single line.

If there's any other source, please let me know below.

Thanks

The final graph I want to have is like the following but with markers:


Answer 1:


You can apply a few tweaks to seaborn's lineplot. Using some created data since your sample isn't really long enough to demonstrate:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Create data
np.random.seed(2019)
categories = ['1G-d', '1G-a', '1B-b', '1B-c', '1B-d', '1B-e', '1B-a']
df = pd.DataFrame({'Name':np.repeat(range(1,11), 10),
              'Time':np.random.randn(100).cumsum(),
              'l':np.random.choice(categories, 100)
        })

# Plot
sns.lineplot(data=df, x='Name', y='Time', hue='l', style='l', dashes=False,
             markers=True, ci=None, err_style=None)

# Temporarily removing limits based on sample data
#plt.xlim(0, 60)
#plt.ylim(0, 18)

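# Remove seaborn legend title & set new title (if desired)
# Note: handles[1:] assumes the first legend entry is seaborn's title placeholder,
# as in the releases current when this was written; check the labels on newer releases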
ax = plt.gca()
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles=handles[1:], labels=labels[1:], title='New Title', loc='best')

plt.show()

  • To apply markers, you have to specify a style variable. This can be the same as hue.
  • You likely want to turn off dashes, ci, and err_style (dashes=False, ci=None, err_style=None).
  • To remove the seaborn legend title, you can get the handles and labels, then re-add the legend without the first handle and label. You can also specify the location here and set a new title if desired (or just remove title=... for no title).
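
If plain matplotlib is preferred over seaborn, one possible sketch is to group the frame by l and call ax.plot once per group, so that each label becomes its own line. This is an alternative I am assuming would fit the question, not the approach shown above; it uses the sample rows and column names from the question's table, and it does not call plt.show().

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows from the question (column names as shown in its table)
df = pd.DataFrame({
    'Name': [1, 1, 1, 1, 1, 2, 2, 3],
    'Time': [1.129, 0.113, 3.374, 3.367, 3.374, 3.355, 3.361, 1.129],
    'l': ['1G-d', '1G-a', '1B-b', '1B-c', '1B-d', '1B-e', '1B-a', '1G-a'],
})

fig, ax = plt.subplots()

# One ax.plot call per label: each group becomes its own line with markers
for label, group in df.groupby('l'):
    group = group.sort_values('Name')
    ax.plot(group['Name'], group['Time'], marker='o', label=label)

ax.set_xlabel('Name')
ax.set_ylabel('Time')
ax.legend(loc='best')  # no legend title unless one is passed explicitly
```

Add plt.show() at the end to display the figure. With so few sample rows most labels have a single point, so only the markers are visible for them.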

Edits per comments:

Filtering your data to only a subset of level categories can be done fairly easily via:

categories = ['1G-d', '1G-a', '1B-b', '1B-c', '1B-d', '1B-e', '1B-a']
df = df.loc[df['l'].isin(categories)]
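
As a quick check of what the isin filter keeps, here is a minimal sketch on a tiny made-up frame. The level '9Z-z' is hypothetical and only there to show a row being dropped.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': [1, 1, 2, 3],
    'Time': [1.129, 0.113, 3.355, 1.129],
    'l': ['1G-d', '1G-a', '1B-e', '9Z-z'],  # '9Z-z' is a made-up extra level
})

categories = ['1G-d', '1G-a', '1B-e']

# Keep only the rows whose label appears in categories
filtered = df.loc[df['l'].isin(categories)]
print(filtered['l'].tolist())
```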

markers=True will fail if there are too many levels. If you only want to mark the points for aesthetic purposes, you can repeat a single marker once per category (using the categories list you already created to filter your data): markers='o'*len(categories).

Alternatively, you can specify a custom dictionary to pass to the markers argument:

points = ['o', '*', 'v', '^']
mult = len(categories) // len(points) + (len(categories) % len(points) > 0)
markers = {key:value for (key, value) 
           in zip(categories, points * mult)}

This will return a dictionary of category-point combinations, cycling over the marker points specified until each item in categories has a point style.
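
The same cycling can also be sketched with itertools.cycle, which avoids computing the multiplier by hand. This variant is my assumption of an equivalent form, not part of the original answer.

```python
from itertools import cycle

categories = ['1G-d', '1G-a', '1B-b', '1B-c', '1B-d', '1B-e', '1B-a']
points = ['o', '*', 'v', '^']

# zip stops at the shorter input, so cycle(points) repeats only as often as needed
markers = dict(zip(categories, cycle(points)))
print(markers)
```

The resulting dictionary can be passed as markers=markers to sns.lineplot together with style='l'.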



Source: https://stackoverflow.com/questions/57117485/how-to-label-line-chart-with-column-from-pandas-dataframe-from-3rd-column-value
