Python: check cosine similarity between mongoDB database documents

Submitted by 本小妞迷上赌 on 2021-02-19 04:02:48

Question


I am using Python, and I have a MongoDB collection in which every document has the following format:

{"_id": ObjectId("53590a43dc17421e9db46a31"),
 "latlng": {"type": "Polygon", "coordinates": [[[....],[....],[....],[....],[.....]]]},
 "self": {"school": 2, "home": 3, "hospital": 6}
}

The "self" field lists the venue types inside the polygon and the number of venues of each type. Different documents have different "self" fields, such as {"KFC":1,"building":2,"home":6} or {"shopping mall":1, "gas station":2}.

Now I need to calculate the cosine similarity between the "self" fields of two documents. Previously, all my documents were saved as dictionaries in a pickle file, and I used the following code to calculate the similarity:

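import scipy.spatial.distance
from sklearn.feature_extraction import DictVectorizer

# Vectorize both lists together so they share one feature (column) order.
vec = DictVectorizer()
total_arrays = vec.fit_transform(data + citymap).toarray()
vector_matrix = total_arrays[:len(data)]
citymap_base_matrix = total_arrays[len(data):]

def cos_cdist(matrix, vector):
    # Cosine distance (1 - cosine similarity) from each row of matrix to vector.
    v = vector.reshape(1, -1)
    return scipy.spatial.distance.cdist(matrix, v, 'cosine').reshape(-1)

for vector in vector_matrix:
    distance_result = cos_cdist(citymap_base_matrix, vector)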

Here, data and citymap are both lists of dictionaries, like [{"KFC":1,"building":2,"home":6},{"school":2,"home":3,"hospital":6},{"shopping mall":1, "gas station":2}]
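For reference, the quantity behind the code above can also be written directly on the count dictionaries, without building a dense matrix first. The following is only a minimal sketch: the function name cosine_similarity and the choice to return 0.0 for an empty dictionary are assumptions made for illustration. Note that scipy's 'cosine' metric returns a distance, which is 1 minus this similarity.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count dicts such as the "self" field."""
    # Only keys present in both dicts contribute to the dot product.
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention for empty dicts
    return dot / (norm_a * norm_b)

doc_a = {"school": 2, "home": 3, "hospital": 6}
doc_b = {"KFC": 1, "building": 2, "home": 6}
doc_c = {"shopping mall": 1, "gas station": 2}

sim_ab = cosine_similarity(doc_a, doc_b)  # the two dicts share only "home"
sim_ac = cosine_similarity(doc_a, doc_c)  # no shared venue types
```

Because venue types missing from a dictionary count as zero, two documents with no venue type in common always get a similarity of 0.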

But now I am using MongoDB, and I want to know whether MongoDB offers a way to calculate the similarity more directly. Any ideas?
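One possible starting point, which still does the similarity computation on the client rather than inside MongoDB, is to read only the "self" fields through pymongo and rebuild the same list of dictionaries that the DictVectorizer code expects. This is a hedged sketch: load_self_fields is a hypothetical helper, and the connection string, database name, and collection name in the usage comment are placeholders.

```python
def load_self_fields(collection):
    """Return (ids, dicts) for every document that has a "self" field."""
    ids, dicts = [], []
    # The projection {"self": 1} returns only _id and "self", skipping the polygon.
    for doc in collection.find({"self": {"$exists": True}}, {"self": 1}):
        ids.append(doc["_id"])
        dicts.append(doc["self"])
    return ids, dicts

# Hypothetical usage (all names below are placeholders):
# from pymongo import MongoClient
# collection = MongoClient("mongodb://localhost:27017")["mydb"]["venues"]
# ids, data = load_self_fields(collection)
```

The returned list has the same shape as data and citymap above, and the ids list keeps track of which row belongs to which document.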

Source: https://stackoverflow.com/questions/29914422/python-check-cosine-similarity-between-mongodb-database-documents
