I want to join two data sources, orders and customers:
orders is an SQL Server table:
orderid | customerid | orderdate | ordercost
------- | ---------- | --------- | ---------
An empty DataFrame result from pd.merge means there are no matching values in the join column across the two frames. Have you checked the dtype of the data? Use
df1['customerid'].dtype
to check.
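A minimal sketch of that check, using made-up frames in place of the real orders and customers data (df1, df2 and their contents are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical stand-ins for the two sources: ids stored as integers in one
# frame and as strings in the other.
df1 = pd.DataFrame({"customerid": [1, 2, 3], "ordercost": [10.0, 20.0, 30.0]})
df2 = pd.DataFrame({"customerid": ["1", "2", "3"], "name": ["Ann", "Bob", "Cy"]})

# Inspect the dtype of the join column on each side.
print(df1["customerid"].dtype)
print(df2["customerid"].dtype)

# A programmatic version of the same check.
left_is_int = pd.api.types.is_integer_dtype(df1["customerid"])
right_is_int = pd.api.types.is_integer_dtype(df2["customerid"])
print(left_is_int, right_is_int)
```

If the two flags differ, the key columns are not directly comparable and should be converted to a common dtype before merging.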
As well as converting after importing (as suggested in the other answer), you can also tell pandas which dtype you want when you read the CSV:
df2 = pd.read_csv(customer_csv, dtype={'customerid': str})
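For illustration, here is a small sketch with an in-memory CSV standing in for customer_csv (the sample rows and the name column are invented). It compares what read_csv infers by default with what you get when the dtype is given explicitly:

```python
import io
import pandas as pd

# Invented sample data; the real file is not shown in the question.
csv_text = "customerid,name\n001,Ann\n002,Bob\n"

# Default inference parses the ids as integers, dropping the leading zeros.
default_df = pd.read_csv(io.StringIO(csv_text))

# Passing dtype keeps the ids exactly as written in the file.
typed_df = pd.read_csv(io.StringIO(csv_text), dtype={"customerid": str})

default_ids = default_df["customerid"].tolist()
typed_ids = typed_df["customerid"].tolist()
print(default_ids)
print(typed_ids)
```

Reading the column as str is usually the safer choice when ids may carry leading zeros or non-numeric characters.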
I think the problem is that the customerid column has different dtypes in the two DataFrames, so there is no match. You need to convert both columns to int or both to str:
df1['customerid'] = df1['customerid'].astype(int)
df2['customerid'] = df2['customerid'].astype(int)
Or:
df1['customerid'] = df1['customerid'].astype(str)
df2['customerid'] = df2['customerid'].astype(str)
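A short end-to-end sketch of the conversion followed by the merge, assuming toy frames in which one side holds integer ids and the other holds numeric strings (all data here is hypothetical):

```python
import pandas as pd

# Hypothetical data: integer ids on the orders side, string ids on the customers side.
df1 = pd.DataFrame({"customerid": [1, 2, 3], "ordercost": [10.0, 20.0, 30.0]})
df2 = pd.DataFrame({"customerid": ["2", "3", "4"], "name": ["Bob", "Cy", "Dee"]})

# Bring both key columns to the same dtype before merging.
# astype(int) only works when every value is a valid integer string.
df1["customerid"] = df1["customerid"].astype(int)
df2["customerid"] = df2["customerid"].astype(int)

merged = pd.merge(df1, df2, on="customerid", how="inner")
print(merged)
```

Only ids 2 and 3 appear in both frames, so those are the rows that survive the inner join. If some ids are not purely numeric, converting both sides with astype(str) is the alternative.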
It is also possible to omit how='inner', because it is the default for merge:
merged = pd.merge(df1, df2, on='customerid')
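To confirm that leaving out how gives the same result as an explicit inner join, a quick check on toy data (the frames below are assumptions, not the data from the question):

```python
import pandas as pd

# Hypothetical frames whose key columns already share the same dtype.
df1 = pd.DataFrame({"customerid": [1, 2, 3], "ordercost": [10.0, 20.0, 30.0]})
df2 = pd.DataFrame({"customerid": [2, 3, 4], "name": ["Bob", "Cy", "Dee"]})

default_merge = pd.merge(df1, df2, on="customerid")
inner_merge = pd.merge(df1, df2, on="customerid", how="inner")

# DataFrame.equals compares shape, values and dtypes.
same_result = default_merge.equals(inner_merge)
print(same_result)
```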