Fast way to map scalars to colors in python

长发绾君心 2021-01-25 15:24

I'm looking for a fast way to map scalars to hex colors in Python:

import matplotlib
import matplotlib.cm as cm
import matplotlib.colors as mcol

np.random.seed         


        
1 Answer
  • 2021-01-25 15:50

    The most expensive part of the code in the question is not to_rgba() but DataFrame.apply, because it calls the function once for each row individually.

    A comparison of different methods using matplotlib colormaps is given in my answer to this question: How do I map df column values to hex color in one go?

    The upshot is that using a lookup table (LUT) is indeed much faster (by a factor of 400 in the case investigated there).
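    As a rough illustration of the LUT idea with a matplotlib colormap, the sketch below converts the colormap to hex strings once and then indexes that table with an integer array. It is not the code from the linked answer; the helper names build_hex_lut and map_to_hex, the choice of the "Blues_r" colormap and the table size of 256 are assumptions made for this example, and matplotlib.colormaps requires matplotlib 3.5 or newer.

```python
import numpy as np
import matplotlib
from matplotlib.colors import to_hex


def build_hex_lut(cmap_name="Blues_r", n=256):
    """Sample a colormap at n points and convert each color to hex once.

    Hypothetical helper; the colormap name and n are illustrative choices.
    """
    cmap = matplotlib.colormaps[cmap_name]
    rgba = cmap(np.linspace(0.0, 1.0, n))  # array of shape (n, 4)
    return np.array([to_hex(c) for c in rgba])


def map_to_hex(values, lut):
    """Map scalars to hex strings by indexing the precomputed table."""
    v = np.asarray(values, dtype=float)
    span = v.max() - v.min()
    # Guard against constant input, where the span would be zero.
    norm = (v - v.min()) / span if span > 0 else np.zeros_like(v)
    idx = np.rint(norm * (len(lut) - 1)).astype(np.intp)
    return lut[idx]


lut = build_hex_lut()
hex_colors = map_to_hex(np.random.default_rng(0).random(10_000), lut)
```

    The per-color conversion to hex happens only n times when the table is built, so the cost per data point is reduced to one array lookup.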

    Note, however, that in the case of this question there is no need to use matplotlib at all. Since you already have a list of possible colors in hex format, converting the hex colors to a colormap and then back to hex colors is an unnecessary round trip.

    Instead, using the list of colors directly as the lookup table (LUT) is much faster. With a dataframe of 10000 entries (to keep it comparable to the other answer's timings), the code from this question takes 2.7 seconds.

    The following code takes 380 µs. This is a factor of 7000 improvement.
    Compared to the best method using matplotlib from the linked question's answer of 7.7 ms, it is still a factor of 20 better.

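    import numpy as np
    import pandas as pd
    
    np.random.seed(0)  # reproducible test data
    
    def create_df(n=10000):
        # One column of n uniform random values in [0, 1)
        return pd.DataFrame(np.random.rand(n, 1), columns=['some_value'])
    
    def apply(df):
        # The hex colors themselves serve as the lookup table
        colors = ["#084594", "#0F529E", "#1760A8", "#1F6EB3", "#2979B9", "#3484BE", "#3E8EC4",
                  "#4A97C9", "#57A0CE", "#64A9D3", "#73B2D7", "#83BBDB", "#93C4DE", "#A2CBE2",
                  "#AED1E6", "#BBD6EB", "#C9DCEF", "#DBE8F4", "#EDF3F9", "#FFFFFF"]
        colors = np.array(colors)
        v = df['some_value'].values
        # Rescale to [0, len(colors)-1] and truncate to integer indices
        v = ((v-v.min())/(v.max()-v.min())*(len(colors)-1)).astype(np.int16)
        # Index the LUT with the whole array at once
        return pd.Series(colors[v])
    
    df = create_df()
    # %timeit is an IPython magic; run this in IPython or Jupyter
    %timeit apply(df)
    
    # 376 µs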
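
    One detail of the index computation in apply() may be worth spelling out: because the scaled values are truncated, only the exact maximum maps to the last color. The sketch below compares that behaviour with an equal-width binning variant on a small array. The function names, the 5-color palette and the sample values are made up for this illustration, and the second variant is an alternative, not part of the answer's code.

```python
import numpy as np

# Hypothetical 5-color palette, dark to light (illustrative only).
colors = np.array(["#084594", "#4A97C9", "#93C4DE", "#DBE8F4", "#FFFFFF"])


def to_index_truncate(v, n):
    """Same scheme as apply(): scale to [0, n-1], then truncate."""
    scaled = (v - v.min()) / (v.max() - v.min()) * (n - 1)
    return scaled.astype(np.intp)


def to_index_equal_bins(v, n):
    """Alternative: n equal-width bins, with the maximum clipped into the last one."""
    scaled = (v - v.min()) / (v.max() - v.min()) * n
    return np.minimum(scaled.astype(np.intp), n - 1)


values = np.array([0.0, 0.1, 0.5, 0.9, 1.0])
idx_a = to_index_truncate(values, len(colors))
idx_b = to_index_equal_bins(values, len(colors))

print(idx_a)          # [0 0 2 3 4]
print(idx_b)          # [0 0 2 4 4]
print(colors[idx_a])  # one hex string per input value
```

    With truncation, 0.9 still lands on the second-to-last color, while equal-width binning assigns it to the last one. Which behaviour is preferable depends on how the palette is meant to be read.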
    