What are useful ranking algorithms for documents without links?

悲哀的现实 2020-12-24 08:38

I've looked at Algorithms of the Intelligent Web, which describes (page 55) an interesting algorithm, called DocRank, for creating a PageRank-like score for business docume

4 Answers
  •  予麋鹿 (original poster)
    2020-12-24 09:08

    I've done some additional research on the topic and found the Wikipedia entry for the Okapi BM25 algorithm. It also has a successor, BM25F, that takes document structure into account, but this appears to be more relevant to HTML/XML.

    BM25 incorporates:

    1. the average document length in the collection,
    2. the length of the particular document, and
    3. the term frequency.
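
To show how those three ingredients fit together, here is a minimal sketch of BM25 scoring in Python. It is an illustration under stated assumptions, not a reference implementation: the function name `bm25_scores` is hypothetical, documents are assumed to be already tokenized (here by simple whitespace splitting), `k1=1.5` and `b=0.75` are merely commonly used starting values, and the IDF uses the smoothed variant with `+ 1` inside the logarithm so that it never goes negative. It also assumes a non-empty collection.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Hypothetical helper for illustration; k1 and b are assumed defaults.
    """
    n_docs = len(docs)
    # Ingredient 1: average document length in the collection.
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for doc in docs:
        # Ingredient 3: term frequency within this document.
        term_freq = Counter(doc)
        # Ingredient 2: this document's length relative to the average.
        length_norm = 1 - b + b * len(doc) / avg_len
        score = 0.0
        for term in query:
            tf = term_freq[term]
            if tf == 0:
                continue  # terms absent from the document contribute nothing
            # Smoothed IDF; the "+ 1" keeps the value non-negative.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


# Made-up example documents, tokenized by whitespace.
docs = [
    "quarterly sales report for the retail division".split(),
    "sales forecast and sales targets for next quarter".split(),
    "employee handbook and vacation policy".split(),
]
scores = bm25_scores("sales report".split(), docs)
# Document indices ordered from most to least relevant.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Note that no link structure is needed: the score depends only on the query terms and on statistics of the text collection itself. For real workloads, stemming, stop-word removal, and tuning of `k1` and `b` would likely matter more than the formula.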

    Finally, the Wikipedia entry links to a Lucene implementation.

    Compared to @Doug's answers above, this appears to be a more complex algorithm to implement.
