I have a list of (item, country statuses) tuples.
res = [('63(I)[PARA.8]','AFGHANISTAN Y ARGENTINA Y AUSTRALIA Y BELGIUM Y BOLIVIA Y BRAZIL N BYELORUSSIAN SSR Y CANADA
The hard part is to parse the list of countries and codes (A, N or Y).
First, write a function to convert each tuple to a pandas Series. The 'code' is A, N or Y. Anything else is (part of) the country name.
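import pandas as pd

def raw_data_to_series(xs):
    name, values = xs
    if values == 'No Data':
        # nothing to parse: return an empty Series that still carries the item name
        return pd.Series(dtype='object').rename(name)
    # split on any run of whitespace, so doubled spaces produce no empty tokens
    values = values.split()
    country = ''
    results = dict()
    for x in values:
        if x == 'GUATEMALA':
            # special case: GUATEMALA is mapped to '?'
            results[x] = '?'
            country = ''
        elif country == '':
            # first word of a new country name
            country = x
        elif x in {'A', 'N', 'Y'}:
            # a code closes the current country name
            results[country] = x
            country = ''
        else:
            # still inside a multi-word country name
            country = country + ' ' + x
    return pd.Series(results).rename(name)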
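To check the token loop on its own, here is a minimal pandas-free sketch of the same idea. The function name `parse_votes` and the sample string are made up for illustration, and the sketch leaves out the 'No Data' and GUATEMALA special cases handled above.

```python
CODES = {'A', 'N', 'Y'}

def parse_votes(text):
    """Split 'COUNTRY CODE COUNTRY CODE ...' into a {country: code} dict."""
    results = {}
    name_parts = []  # words of the country name collected so far
    for token in text.split():
        if token in CODES and name_parts:
            # a code after at least one name word closes the country
            results[' '.join(name_parts)] = token
            name_parts = []
        else:
            # first or subsequent word of a country name
            name_parts.append(token)
    return results

# hypothetical input with one multi-word country name
sample = 'BRAZIL N BYELORUSSIAN SSR Y CHINA A'
votes = parse_votes(sample)
print(votes)
```

Because a code only counts once a name has started, a country whose name begins with A, N or Y as a separate word would still parse correctly, but a single-letter word inside a name would be taken as a code.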
Now, we just pass each element of res to the function (using a list comprehension):
pd.concat( [raw_data_to_series(r) for r in res], axis=1)
# first 10 lines
                  63(I)[PARA.8]  63(I)[PARA.7]  63(I)[PARA.6]  99(I)  50(I)
AFGHANISTAN                   Y              Y              Y    NaN    NaN
ARGENTINA                     Y              Y              Y    NaN    NaN
AUSTRALIA                     Y              Y              Y    NaN    NaN
BELGIUM                       Y              Y              Y    NaN    NaN
BOLIVIA                       Y              Y              Y    NaN    NaN
BRAZIL                        N              N              N    NaN    NaN
BYELORUSSIAN SSR              Y              Y              Y    NaN    NaN
CANADA                        Y              Y              Y    NaN    NaN
CHILE                         Y              Y              Y    NaN    NaN
CHINA                         A              A              A    NaN    NaN
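The NaN cells come from the way pd.concat with axis=1 lines the Series up by index label: every country that appears in any Series gets a row, and a column has NaN wherever its Series had no entry for that country. A small sketch with two hand-made Series (the names `item_1`, `item_2` and the values are invented for illustration):

```python
import pandas as pd

# two hypothetical items that cover different sets of countries
a = pd.Series({'BRAZIL': 'N', 'CANADA': 'Y'}).rename('item_1')
b = pd.Series({'CANADA': 'A', 'CHILE': 'Y'}).rename('item_2')

# axis=1 makes each Series a column; rows are the union of the index labels
table = pd.concat([a, b], axis=1)
print(table)

# CANADA is in both Series; BRAZIL and CHILE each have one missing cell
print(table.isna().sum())
```

To show only the first rows of the real result, as in the listing above, `.head(10)` can be appended to the concat call.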