How to change outliers to some other colors in a scatter plot

耶瑟儿~ 2020-12-06 15:12

If I have a scatter plot like this

Is there any way to give the obvious outliers, such as the three at the top, a different color within the same plot?

2 answers
  • 2020-12-06 15:39

    First, you need a criterion for what counts as an "outlier". Once you have that, you can mask the points that meet it and plot them separately. Selecting a subset of an array based on a condition is easy in numpy: if a is a numpy array, a[a <= 1] returns only the elements that are less than or equal to 1, i.e. all values greater than 1 are "cut out".
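
    As a small illustration of the masking described above, here is a minimal sketch; the array values and the variable names mask, kept and removed are made up for the example. The boolean mask can be inverted with ~ to get the complementary subset, which is what makes it handy for splitting data into "normal" points and outliers.

```python
import numpy as np

# Example values chosen only for illustration
a = np.array([0.2, 1.5, -0.7, 3.0, 1.0])

mask = a <= 1        # boolean array, True where the condition holds
kept = a[mask]       # elements that are <= 1
removed = a[~mask]   # complementary subset: elements that are > 1

print(kept)
print(removed)
```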

    Plotting could then be done as follows

    import numpy as np
    import matplotlib.pyplot as plt
    
    num = 1000
    x = np.linspace(0, 100, num=num)
    y = np.random.normal(size=num)
    
    # boolean mask: True for points inside the distribution's width (|y| < 1)
    inside = np.abs(y) < 1
    
    fig = plt.figure()
    ax = fig.add_subplot(111)
    # plot points inside distribution's width
    ax.scatter(x[inside], y[inside], marker="s", color="#2e91be")
    # plot points outside distribution's width
    ax.scatter(x[~inside], y[~inside], marker="d", color="#d46f9f")
    plt.show()
    

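    This produces a scatter plot of points drawn from a standard normal distribution in which every point outside the distribution's width (|y| >= 1, i.e. at least one standard deviation from the mean) is drawn with a different color and marker.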
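
    The fixed threshold of 1 works here because the data are standard normal. For data of unknown scale, the criterion has to come from the data itself. One common choice (by no means the only one) is Tukey's rule, which flags points more than 1.5 interquartile ranges beyond the quartiles. The sketch below assumes that rule; the helper name iqr_outlier_mask, its parameter k, and the three injected outliers are hypothetical and only serve the demo.

```python
import numpy as np
import matplotlib.pyplot as plt


def iqr_outlier_mask(values, k=1.5):
    """Return a boolean mask, True for points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


rng = np.random.default_rng(0)
x = np.linspace(0, 100, num=200)
y = rng.normal(size=200)
y[[20, 90, 150]] += 8  # inject three artificial outliers for the demo

is_outlier = iqr_outlier_mask(y)

fig, ax = plt.subplots()
# regular points in one color, flagged points in another, on the same axes
ax.scatter(x[~is_outlier], y[~is_outlier], color="#2e91be", label="inliers")
ax.scatter(x[is_outlier], y[is_outlier], color="#d46f9f", label="outliers")
ax.legend()
plt.show()
```

    A larger k makes the rule more tolerant; which value is appropriate depends on the data.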

  • 2020-12-06 15:44

    ImportanceOfBeingErnest has a great answer. Here's a one-liner I use when I have an array of category labels (like an enum) for the data points. It is especially useful when visualizing data that is already divided into classes.

    import numpy as np
    import matplotlib.pyplot as plt
    
    num = 100
    x = np.random.rand(num)
    y = np.random.rand(num) * 2
    
    # A simple classification criterion: y lies in [0, 2), so rounding it gives the classes 0, 1 and 2
    classes = np.round(y)
    
    # Passing the classes as the "c" argument is very convenient
    plt.scatter(x, y, c=classes, cmap=plt.cm.Set1)
    plt.show()
    

    The resulting scatter plot divides the graph into 3 colored regions.
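
    The same "c" argument can also carry an outlier flag instead of a class index, so that a single scatter call colors outliers differently. The following is a sketch under assumptions not in the answer above: the two-standard-deviation criterion, the variable name labels, and the two colors are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

rng = np.random.default_rng(1)
x = rng.random(100)
y = rng.normal(size=100)

# 0 = regular point, 1 = outlier (assumed criterion: more than 2 standard deviations from the mean)
labels = (np.abs(y - y.mean()) > 2 * y.std()).astype(int)

# Two-entry colormap: label 0 maps to the first color, label 1 to the second
cmap = ListedColormap(["#2e91be", "#d46f9f"])

# vmin/vmax pin the mapping so it stays correct even if no point is flagged
plt.scatter(x, y, c=labels, cmap=cmap, vmin=0, vmax=1)
plt.show()
```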
