python hierarchy from manager and employee id

问题

I have a csv with two columns: employee id 'eid' and manager's employee id 'mid'. Trying to get python code that will, for each employee, add columns showing employee id's of the manager all the way to CEO. CEO has a an employee id of 1. Ultimately I want to write the result back to a csv.

So the data looks like:

eid,    mid
111,    112
113,    112
112,    114
114,    115
115,    1

I am expecting output that looks like this. Note that while no employee will have more than 4 levels of managers, but i would like to also learn python that names columns dynamically.

eid,    mid,    l2mid   l3mid   l4mid
111,    112,    114,    115,    1
113,    112,    114,    115,    1
112,    114,    115,    1   
114,    115,    1       
115,    1

I am very new to coding, and trying to teach myself but keep getting stuck. My questions: 1) I was trying to use a for statement that took mid in a given row, then found that that manager's manager, and so on, until i reached the CEO. I have been trying along these lines:

df = pd.read_csv('employee.csv') 
if mid =! 1 
for i in df:
    df.['l2mid'] = df.loc[df.eid == [i], [mid]]

Maybe I'm approaching this backwards and I should try grouping all employees by manager? How would that code be different?

I have seen solutions in C# and sql, and i've seen solutions that build trees and json. I really appreciate any help and encouragement.

Update: next step was to add a country column - see: entry here

回答1:

I believe there is a better solution, but this works. I filled empty with zeros.

a = []
for index, row in df.iterrows():
    res = df[df['eid']==row['mid']]['mid'].values
    a.append(0 if not res else res[0])
df['l2mid'] = a

a = []
for index, row in df.iterrows():
    res = df[df['eid']==row['l2mid']]['mid'].values
    a.append(0 if not res else res[0])
df['l3mid'] = a

a = []
for index, row in df.iterrows():
    res = df[df['eid']==row['l3mid']]['mid'].values
    a.append(0 if not res else res[0])
df['l4mid'] = a

df
# output :
# eid   mid l2mid   l3mid   l4mid
# 0 111 112 114 115 1
# 1 113 112 114 115 1
# 2 112 114 115 1   0
# 3 114 115 1   0   0
# 4 115 1   0   0   0

You can define a function for routines.

def search_manager(target_column, new_column):
    a = []
    for index, row in df.iterrows():
        res = df[df['eid']==row[target_column]]['mid'].values
        a.append(0 if not res else res[0])
    df[new_column] = a

search_manager('mid', 'l2mid')
search_manager('l2mid', 'l3mid')
search_manager('l3mid', 'l4mid')

来源：https://stackoverflow.com/questions/45929048/python-hierarchy-from-manager-and-employee-id

标签

python

hierarchy