Pandas fill missing values in dataframe from another dataframe

遥遥无期 2020-11-28 12:10

I cannot find a pandas function (which I had seen before) to substitute the NaN's in a dataframe with values from another dataframe (assuming a common index which can be sp

5 answers
  •  挽巷 (original poster)
    2020-11-28 12:39

    DataFrame.combine_first() answers this question exactly.
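
    As a minimal sketch of that call, assuming two small hypothetical frames A and B that share the same index and columns (the data below is made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical example data; A has gaps, B is fully populated.
A = pd.DataFrame(
    {"x": [1.0, np.nan, 3.0], "y": [np.nan, 5.0, 6.0]},
    index=["a", "b", "c"],
)
B = pd.DataFrame(
    {"x": [10.0, 20.0, 30.0], "y": [40.0, 50.0, 60.0]},
    index=["a", "b", "c"],
)

# Keep A's values where they exist; fall back to B only where A is NaN.
filled = A.combine_first(B)
print(filled)
```

    Values are matched by index and column label, so the row order of the two frames does not need to agree.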

    However, sometimes you want to fill, replace, or overwrite some of the non-missing (non-NaN) values of DataFrame A with values from DataFrame B. That question brought me to this page, and the solution is DataFrame.mask():

    A = B.mask(condition, A)
    

    Where condition is True, the values from A are used; everywhere else, B's values are kept.

    For example, you could solve the OP's original question with mask such that when an element from A is non-NaN, use it, otherwise use the corresponding element from B.
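
    A rough sketch of that approach, using A.notna() as the condition. The frames are hypothetical example data, not from the original question:

```python
import numpy as np
import pandas as pd

# Hypothetical example data with identical index and columns.
A = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": [np.nan, 5.0, 6.0]})
B = pd.DataFrame({"x": [10.0, 20.0, 30.0], "y": [40.0, 50.0, 60.0]})

# Where A is non-NaN (condition True), B's cell is replaced by A's value;
# where A is NaN (condition False), B's own value is kept.
result = B.mask(A.notna(), A)
print(result)
```

    For frames with matching labels like these, the result should agree with A.combine_first(B).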

    But using DataFrame.mask() you could replace the values of A that fail to meet arbitrary criteria (less than zero? more than 100?) with values from B. So mask is more flexible, and overkill for this problem, but I thought it was worthy of mention (I needed it to solve my problem).
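
    To illustrate the arbitrary-criteria case, here is a small sketch that keeps A's values only when they lie between 0 and 100 and takes B's values otherwise. The range and the data are assumptions chosen for the example:

```python
import pandas as pd

# Hypothetical example data; some of A's values fall outside [0, 100].
A = pd.DataFrame({"x": [5.0, -2.0, 150.0], "y": [-1.0, 50.0, 99.0]})
B = pd.DataFrame({"x": [11.0, 12.0, 13.0], "y": [21.0, 22.0, 23.0]})

# True where A's value is acceptable.
in_range = (A >= 0) & (A <= 100)

# Use A where it is in range; keep B's value everywhere else.
result = B.mask(in_range, A)
print(result)
```

    Comparisons against NaN evaluate to False, so with this condition any NaN in A would also be replaced by B's value.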

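    It's also worth noting that the replacement values do not have to come from a DataFrame. DataFrame.combine_first() requires that B be a DataFrame. With mask() or where(), the object passed as the other argument can be something else: the documentation lists a scalar, a Series/DataFrame, or a callable, and in practice a NumPy array also works if its shape matches the calling frame. In A = B.mask(condition, A) the caller B must still be a DataFrame, so to take values from an array B you would write the equivalent A = A.where(condition, B).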
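
    A minimal sketch of the array case, written with DataFrame.where() so that the array is the other argument. The name B_values and the data are hypothetical, and passing a same-shaped ndarray as other is assumed to behave as it does in current pandas releases:

```python
import numpy as np
import pandas as pd

# Hypothetical example data: A is a DataFrame, the fill values are a bare array.
A = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": [np.nan, 5.0, 6.0]})
B_values = np.array([[10.0, 40.0], [20.0, 50.0], [30.0, 60.0]])  # same shape as A

# where() keeps A's value where the condition is True and
# takes the value at the same position in B_values where it is False.
result = A.where(A.notna(), B_values)
print(result)
```

    Because an array has no labels, values are matched purely by position, so the array's row and column order must line up with A's.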
