In the following, male_trips is a big pandas data frame and stations is a small pandas data frame. For each station id I'd like to know how many male trips took place. The
My answer below works in pandas 0.7.3; I have not checked it against newer releases.
This is what the pandas.Series.value_counts method is for:
count_series = male_trips.start_station_id.value_counts()
It should then be straightforward to inspect count_series using the values in stations['id']. However, if you insist on counting only those values in the first place, you could do the following:
count_series = (
    male_trips[male_trips.start_station_id.isin(stations.id.values)]
    .start_station_id
    .value_counts()
)
and this will only give counts for station IDs actually found in stations.id.
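
Note that neither approach reports stations that have no trips at all, since value_counts only lists values it actually saw. If you want a zero for those, one option is to align the counts to the station ids with reindex. The following is a minimal sketch rather than a tested recipe for 0.7.3: it assumes a reasonably recent pandas release, the toy data is invented purely for illustration, and counts_per_station is a name I made up.

```python
import pandas as pd

# Toy data, invented for illustration only.
male_trips = pd.DataFrame({"start_station_id": [1, 1, 2, 5, 5, 5, 9]})
stations = pd.DataFrame({"id": [1, 2, 3, 5]})

# Count trips per start station across all trips.
count_series = male_trips["start_station_id"].value_counts()

# Align the counts to the known station ids: stations without any trips
# get 0, and ids that are not in stations (9 here) are dropped.
counts_per_station = count_series.reindex(stations["id"].tolist(), fill_value=0)

print(counts_per_station)
```

With this toy data, stations 1, 2, 3 and 5 end up with counts 2, 1, 0 and 3. The result is ordered like stations.id rather than by descending count.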