Add a new column to a data-frame based on a time window match in Python

Submitted by 前提是你 on 2021-01-29 10:31:51

Question


I have two data sets. The first (df_vitals) contains a set of vital signs (blood pressure, heart rate, respiratory rate, etc.), the time they were taken ['Obs_DTM'], and a unique ID for the patient ['VisitID'].

The second (df_WCC) contains white blood cell counts, the time they were reported ['ResultDtm'], and the same unique ID for the patient.

I want to add the WCC ['TextValue'] to a vital-signs row if it was reported within a time window (T), which needs to be variable, before (i.e. not after) the vital signs were taken.

This would appear to be a left merge on the first table, but how do I define the time window and use that in the merge?

I have tried adding a column ['before'] to df_vitals that holds the time a defined period (T) before ['Obs_DTM'], and then doing:

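WCC = []
for i in range(len(df_vitals)):
    x = df_vitals.VisitID[i]
    y = df_vitals.Obs_DTM[i]   # time the vital signs were taken
    z = df_vitals.before[i]    # start of the window (Obs_DTM minus T)
    match = None               # stays None when no WCC falls inside the window
    for v in range(len(df_WCC)):
        if df_WCC.VisitID[v] == x:
            # keep results reported after the window start and before the observation
            if z < df_WCC.ResultDtm[v] < y:
                match = df_WCC.TextValue[v]
    # append exactly one value per vitals row so the list lines up with the frame
    WCC.append(match)
df_vitals['WCC'] = WCC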

but this is not as efficient as I need it to be: the combined file size is 32 GB.

I am sure there is a simple solution using a merge, but I can't seem to find it!
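
One possible direction, offered as a sketch and not a confirmed answer from the original thread, is pandas' `merge_asof`, which performs a left join on the nearest key and accepts a `tolerance` for the maximum gap. The sample data and the helper name `add_wcc` below are hypothetical; only the column names come from the question. It assumes both time columns are already datetimes and that only the single most recent WCC inside the window is wanted for each vitals row. Note that with the default `allow_exact_matches=True`, a result reported at exactly the observation time also counts as a match.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
df_vitals = pd.DataFrame({
    "VisitID": [1, 1, 2],
    "Obs_DTM": pd.to_datetime(
        ["2020-01-01 12:00", "2020-01-02 12:00", "2020-01-01 09:00"]),
    "HeartRate": [80, 95, 70],
})
df_WCC = pd.DataFrame({
    "VisitID": [1, 1, 2],
    "ResultDtm": pd.to_datetime(
        ["2020-01-01 10:00", "2020-01-01 13:00", "2019-12-30 09:00"]),
    "TextValue": ["7.2", "11.5", "5.0"],
})


def add_wcc(vitals, wcc, window):
    """Attach the latest WCC reported within `window` before each observation.

    `add_wcc` is an illustrative helper, not part of pandas.
    """
    # merge_asof requires both frames to be sorted by their time key.
    vitals = vitals.sort_values("Obs_DTM")
    wcc = wcc.sort_values("ResultDtm")
    merged = pd.merge_asof(
        vitals,
        wcc[["VisitID", "ResultDtm", "TextValue"]],
        left_on="Obs_DTM",
        right_on="ResultDtm",
        by="VisitID",                     # only match rows for the same patient visit
        direction="backward",             # only results at or before the observation
        tolerance=pd.Timedelta(window),   # the variable time window T
    )
    return merged.rename(columns={"TextValue": "WCC"})


result = add_wcc(df_vitals, df_WCC, "6h")
print(result)
```

Rows with no WCC result inside the window keep a missing value in the new column, which matches left-merge behaviour. Changing the window is a matter of passing a different string, for example `add_wcc(df_vitals, df_WCC, "24h")`. Whether this scales to 32 GB depends on available memory; the sketch does not address chunked or out-of-core processing.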

Source: https://stackoverflow.com/questions/63597540/add-a-new-column-to-a-data-frame-based-on-a-time-window-match-in-python
