The `hue` parameter in Seaborn.relplot() skips an integer when given numerical data?

刺人心 2020-12-09 11:36

The hue parameter skips one integer.

d = {'column1':[1,2,3,4,5], 'column2':[2,4,5,2,3], 'cluster':[0,1,2,3,4]}

df = pd.DataFrame(data=d)

sns.relplot(         


        
1 answer
  • 2020-12-09 12:07

    "Full" legend

    If the hue is in numeric format, seaborn will assume that it represents some continuous quantity and will decide to display what it thinks is a representative sample along the color dimension.

    You can circumvent this by using legend="full".

    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd
    
    df = pd.DataFrame({'column1':[1,2,3,4,5], 'column2':[2,4,5,2,3], 'cluster':[0,1,2,3,4]})
    sns.relplot(x='column2', y='column1', hue='cluster', data=df, legend="full")
    plt.show()
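
To check the difference without eyeballing the plot, one option is to read the legend labels back from the returned grid. This is only a sketch: it assumes a recent seaborn version in which the object returned by `relplot` exposes a public `legend` attribute, and `legend_labels` is a hypothetical helper name, not part of seaborn. The exact labels produced by `legend="brief"` depend on the seaborn version, so no expected output is shown.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame({'column1': [1, 2, 3, 4, 5],
                   'column2': [2, 4, 5, 2, 3],
                   'cluster': [0, 1, 2, 3, 4]})


def legend_labels(mode):
    """Hypothetical helper: draw the plot and return its legend label texts."""
    g = sns.relplot(x='column2', y='column1', hue='cluster', data=df, legend=mode)
    # Assumption: the returned grid exposes the matplotlib legend as `g.legend`
    labels = [text.get_text() for text in g.legend.get_texts()]
    plt.close("all")  # free the figure, we only wanted the labels
    return labels


brief_labels = legend_labels("brief")  # sampled entries, may skip values
full_labels = legend_labels("full")    # one entry per unique hue value

print("brief:", brief_labels)
print("full: ", full_labels)
```

With `legend="full"` every cluster value should appear among the labels, which is the behaviour the question was asking for.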
    

    Categoricals

    An alternative is to make sure the values are treated as categorical. Unfortunately, even if you pass the numbers as strings, they are converted back to numbers and fall back to the same mechanism described above. This may be seen as a bug.

    However, one choice you have is to use real categories, e.g. single letters.

    'cluster':list("ABCDE")
    

    works fine,

    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd
    
    d = {'column1':[1,2,3,4,5], 'column2':[2,4,5,2,3], 'cluster':list("ABCDE")}
    
    df = pd.DataFrame(data=d)
    
    sns.relplot(x='column2', y='column1', hue='cluster', data=df)
    
    plt.show()
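
If the cluster IDs already exist as integers, the letters do not have to be typed by hand. A minimal sketch, assuming the IDs start at 0 and there are at most 26 of them; the column name `cluster_label` is made up for illustration:

```python
import string

import pandas as pd

df = pd.DataFrame({'column1': [1, 2, 3, 4, 5],
                   'column2': [2, 4, 5, 2, 3],
                   'cluster': [0, 1, 2, 3, 4]})

# Map 0 -> "A", 1 -> "B", ... (assumes IDs in the range 0..25)
df["cluster_label"] = df["cluster"].map(lambda i: string.ascii_uppercase[i])

print(df[["cluster", "cluster_label"]])
```

The new column can then be passed as `hue='cluster_label'` in the `relplot` call above, leaving the original integer column untouched.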
    

    Strings with customized palette

    An alternative to the above is to use numbers converted to strings, and then make sure to use a custom palette with as many colors as there are unique hues.

    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd
    
    d = {'column1':[1,2,3,4,5], 'column2':[2,4,5,2,3], 'cluster':[1,2,3,4,5]}
    
    df = pd.DataFrame(data=d)
    df["cluster"] = df["cluster"].astype(str)
    
    sns.relplot(x='column2', y='column1', hue='cluster', data=df, 
                palette=["b", "g", "r", "indigo", "k"])
    
    plt.show()
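
Instead of listing the colors by hand, the palette can be sized from the data so that it always has one color per unique hue value. This is a sketch under assumptions: the `"tab10"` palette name is an arbitrary choice, and it only offers ten distinct colors.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame({'column1': [1, 2, 3, 4, 5],
                   'column2': [2, 4, 5, 2, 3],
                   'cluster': [1, 2, 3, 4, 5]})
df["cluster"] = df["cluster"].astype(str)

# One color per unique hue value; "tab10" is an assumed, arbitrary palette
n_levels = df["cluster"].nunique()
palette = sns.color_palette("tab10", n_colors=n_levels)

g = sns.relplot(x='column2', y='column1', hue='cluster', data=df,
                palette=palette)
plt.close("all")
```

In an interactive session, replace `plt.close("all")` with `plt.show()` and drop the `matplotlib.use("Agg")` line.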
    
