Question
I am running into a small issue with pandas and the reset_index function while mapping three sheets of an Excel workbook.
Here is the code:
import pandas as pd

filename='C:\\HPTiB\\HPTib_Test_Cases\\template_276.xlsx'
data=pd.read_excel(filename,sheet_name=['INFORMATION SOURCE','INFORMATION RECEIVER','SERVICE PROVIDER'],dtype=str)
sequence=0
segments_276=[]
N_info_src=len(data['INFORMATION SOURCE'])
N_info_recv=len(data['INFORMATION RECEIVER'])
N_svc_prv=len(data['SERVICE PROVIDER'])
N_sub=len(data['SUBSCRIBER'])
for i in range(N_info_src):
    print("Value of i",i)
    #Currently iterating over the info source loop
    sequence=sequence+1
    source_parent=sequence
    #Write the HL segment
    segments_276.append('HL*'+str(sequence)+'**20*1')
    #Write all the loop segments for this row
    # loop_segments=Parser.build_loop('2100A',i,data['INFORMATION SOURCE'])
    # segments_276=segments_276+loop_segments
    #Get the KEY for this info source and related keys in the next table
    SOURCE_KEY=data['INFORMATION SOURCE'].loc[i,'SOURCE KEY']
    subset_info_recv=data['INFORMATION RECEIVER'][data['INFORMATION RECEIVER']['SOURCE KEY']==SOURCE_KEY]
    #Reset index to avoid key errors
    subset_info_recv.reset_index(drop=True,inplace=True)
    N_info_recv=len(subset_info_recv)
    for j in range(N_info_recv):
        print("value of j \n {} and value of subset_info_recv \n {}".format(j,subset_info_recv))
        #Currently iterating over the info recv loop
        sequence=sequence+1
        recv_parent=sequence
        #Write the HL segment
        # segments_276.append('HL*'+str(sequence)+'*'+str(source_parent)+'*21*1')
        #Write all the loop segments for this row
        # loop_segments=Parser.build_loop('2100B',j,subset_info_recv)
        # segments_276=segments_276+loop_segments
        # Get the KEY for this info receiver and related keys in the next table
        RECEIVER_KEY = data['INFORMATION RECEIVER'].loc[j, 'RECEIVER KEY']
        subset_info_provider = data['SERVICE PROVIDER'][data['SERVICE PROVIDER']['RECEIVER KEY'] == RECEIVER_KEY]
        # Reset index to avoid key errors
        subset_info_provider.reset_index(drop=True, inplace=True)
        N_svc_prv = len(subset_info_provider)
        print("Length of provider sheet", N_svc_prv)
        for k in range(N_svc_prv):
            print("value of k \n {} and value of subset_info_provider \n {}".format(k,subset_info_provider))
            # Currently iterating over the info Provider loop
            sequence = sequence + 1
            provider_parent = sequence
            # Write the HL segment
            segments_276.append('HL*' + str(sequence) + '*' + str(recv_parent) + '*19*1')
            #Write all the loop segments for this row
            #print("Value of k {} and \n subset_info_provider \n {} ".format(k,subset_info_provider))
            # loop_segments=Parser.build_loop('2100C',k,subset_info_provider)
            # segments_276=segments_276+loop_segments
#Print the result
#for segment in segments_276:
#    print(segment)
OUTPUT:
Value of i 0
value of j 0
and value of subset_info_recv
   SOURCE KEY  RECEIVER KEY  RECIEVER KEY TYPE  RECIEVER NAME  RECIEVER CODE
0           1             1             PERSON            CEO     A222222221
1           1             2             PERSON     CO-FOUNDER     A222222221
value of k 0
and value of subset_info_provider
   RECEIVER KEY  PROVIDER KEY  PROVIDER NAME  PROVIDER ID TYPE  PROVIDER ID
0             1             1          Tesla   Provider Number    123456789
1             1             2          Apple               TIN    123453234
value of k 1
and value of subset_info_provider
   RECEIVER KEY  PROVIDER KEY  PROVIDER NAME  PROVIDER ID TYPE  PROVIDER ID
0             1             1          Tesla   Provider Number    123456789
1             1             2          Apple               TIN    123453234
value of j 1
and value of subset_info_recv
   SOURCE KEY  RECEIVER KEY  RECIEVER KEY TYPE  RECIEVER NAME  RECIEVER CODE
0           1             1             PERSON            CEO     A222222221
1           1             2             PERSON     CO-FOUNDER     A222222221
value of k 0
and value of subset_info_provider
   RECEIVER KEY  PROVIDER KEY  PROVIDER NAME  PROVIDER ID TYPE  PROVIDER ID
0             2             3      Microsoft               NPI    123453756
Value of i 1
value of j 0
and value of subset_info_recv
   SOURCE KEY  RECEIVER KEY  RECIEVER KEY TYPE  RECIEVER NAME  RECIEVER CODE
0           2             3             PERSON            CFO     A222222221
value of k 0
and value of subset_info_provider
   RECEIVER KEY  PROVIDER KEY  PROVIDER NAME  PROVIDER ID TYPE  PROVIDER ID
0             1             1          Tesla   Provider Number    123456789
1             1             2          Apple               TIN    123453234
value of k 1
and value of subset_info_provider
   RECEIVER KEY  PROVIDER KEY  PROVIDER NAME  PROVIDER ID TYPE  PROVIDER ID
0             1             1          Tesla   Provider Number    123456789
1             1             2          Apple               TIN    123453234
EXPECTED OUTPUT:
Value of i 0
value of j 0
and value of subset_info_recv
   SOURCE KEY  RECEIVER KEY  RECIEVER KEY TYPE  RECIEVER NAME  RECIEVER CODE
0           1             1             PERSON            CEO     A222222221
1           1             2             PERSON     CO-FOUNDER     A222222221
value of k 0
and value of subset_info_provider
   RECEIVER KEY  PROVIDER KEY  PROVIDER NAME  PROVIDER ID TYPE  PROVIDER ID
0             1             1          Tesla   Provider Number    123456789
1             1             2          Apple               TIN    123453234
value of k 1
and value of subset_info_provider
   RECEIVER KEY  PROVIDER KEY  PROVIDER NAME  PROVIDER ID TYPE  PROVIDER ID
0             1             1          Tesla   Provider Number    123456789
1             1             2          Apple               TIN    123453234
value of j 1
and value of subset_info_recv
   SOURCE KEY  RECEIVER KEY  RECIEVER KEY TYPE  RECIEVER NAME  RECIEVER CODE
0           1             1             PERSON            CEO     A222222221
1           1             2             PERSON     CO-FOUNDER     A222222221
value of k 0
and value of subset_info_provider
   RECEIVER KEY  PROVIDER KEY  PROVIDER NAME  PROVIDER ID TYPE  PROVIDER ID
0             2             3      Microsoft               NPI    123453756
Value of i 1
value of j 0
and value of subset_info_recv
   SOURCE KEY  RECEIVER KEY  RECIEVER KEY TYPE  RECIEVER NAME  RECIEVER CODE
0           2             3             PERSON            CFO     A222222221
value of k 0
and value of subset_info_provider
   RECEIVER KEY  PROVIDER KEY  PROVIDER NAME  PROVIDER ID TYPE  PROVIDER ID
0             3             4         Google   Provider Number    675453756
As you can see in the OUTPUT, for the SERVICE PROVIDER sheet the last iteration does not print the row for RECEIVER KEY 3. Instead, the lookup seems to start over and prints the first two provider rows again.
Could you please help me pinpoint the issue? Am I not looping correctly?
Thanks!
Answer 1:
Change the line
RECEIVER_KEY = data['INFORMATION RECEIVER'].loc[j, 'RECEIVER KEY']
to
RECEIVER_KEY = subset_info_recv.loc[j, 'RECEIVER KEY']
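because the for loop over j runs over range(len(subset_info_recv)), so j is a row position in the filtered, re-indexed subset_info_recv, not in the full data['INFORMATION RECEIVER'] sheet. With the original line, j = 0 always picks the first receiver of the whole sheet (RECEIVER KEY 1), whichever source is being processed.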
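To see the difference in isolation, here is a minimal sketch that replaces the Excel sheets with small in-memory DataFrames. The frames `receivers` and `providers`, the helper `providers_per_receiver`, and its `use_subset` flag are hypothetical names made up for this illustration; the values are taken from the output shown in the question, and only the columns needed for the lookup are kept.

```python
import pandas as pd

# Hypothetical stand-ins for the INFORMATION RECEIVER and SERVICE PROVIDER
# sheets, using the key values visible in the question's output.
receivers = pd.DataFrame({
    "SOURCE KEY": ["1", "1", "2"],
    "RECEIVER KEY": ["1", "2", "3"],
})
providers = pd.DataFrame({
    "RECEIVER KEY": ["1", "1", "2", "3"],
    "PROVIDER NAME": ["Tesla", "Apple", "Microsoft", "Google"],
})


def providers_per_receiver(source_key, use_subset):
    """Return (receiver key, provider names) pairs for one source key.

    use_subset=False reproduces the original lookup on the full sheet;
    use_subset=True applies the fix and looks up j in the filtered subset.
    """
    subset = receivers[receivers["SOURCE KEY"] == source_key].reset_index(drop=True)
    frame = subset if use_subset else receivers
    result = []
    for j in range(len(subset)):
        # j counts rows of the subset, so it is only meaningful for `subset`.
        receiver_key = frame.loc[j, "RECEIVER KEY"]
        matches = providers["RECEIVER KEY"] == receiver_key
        result.append((receiver_key, providers.loc[matches, "PROVIDER NAME"].tolist()))
    return result


buggy = providers_per_receiver("2", use_subset=False)
fixed = providers_per_receiver("2", use_subset=True)
print(buggy)  # [('1', ['Tesla', 'Apple'])]
print(fixed)  # [('3', ['Google'])]
```

For source key 2 the lookup on the full frame lands on receiver 1 and returns Tesla and Apple, which matches the unexpected output in the question, while the lookup on the subset returns receiver 3 and Google.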
Source: https://stackoverflow.com/questions/61834405/how-to-reset-index-when-mapping-excel-sheet-in-a-loop-using-pandas-in-python-3-0