Hi, I am working on a dataset that has a host_id column and two other columns: reviews_per_month and number_of_reviews. For every host_id, the majority of the values are present.
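
To make the setup concrete, here is a minimal sketch of how the share of present values per host_id could be checked. It assumes the data is loaded into a pandas DataFrame; the use of pandas and the sample rows are assumptions for illustration only, and just the three column names come from the dataset described above.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "host_id": [1, 1, 1, 2, 2],
    "reviews_per_month": [0.5, None, 0.7, 1.2, 1.1],
    "number_of_reviews": [10, 12, None, 30, 31],
})

cols = ["reviews_per_month", "number_of_reviews"]

# Mark each cell as present (True) or missing (False), then average
# per host_id to get the fraction of present values in each column.
present_share = df[cols].notna().groupby(df["host_id"]).mean()

print(present_share)
```

A value of 1.0 means every row for that host_id has the column filled in; anything lower shows how much is missing for that host.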