Normalize ranking score with weights
Question

I am working on a document search problem: given a set of documents and a search query, I want to find the document closest to the query. The model I am using is based on TfidfVectorizer from scikit-learn. I created 4 different TF-IDF vectors for all the documents by using 4 different tokenizers. Each tokenizer splits the string into n-grams, where n is in the range 1 to 4. For example:

doc_1 = "Singularity is still a confusing phenomenon in physics"
doc_2 = "Quantum theory still