Watson Retrieve and Rank - manual ranking

Submitted by 陌路散爱 on 2019-12-02 06:50:46

The training data is meant to train a learning-to-rank (L2R) algorithm. The L2R approach is to first take a list of candidate answers (e.g. documents in a search result page) that were generated in response to a query (aka question) and represent each query-answer pair as a set of features. Each feature hopefully captures some representation of how well that particular candidate answer matches the query. Each line in the training data represents the feature values belonging to one of these query-answer pairs.

Because the training data contains feature vectors from lots of different queries (and corresponding search results), the first column uses a query id to tie together different candidate answers that were generated in response to a single query.
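To make the role of the query id concrete, here is a minimal sketch of grouping training rows by their first column. The comma-separated layout, the ids `q1`/`q2`, and the numeric values are all made up for illustration; they are assumptions, not the service's documented format.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows: query id first, then the remaining columns of the row.
# All values here are invented for illustration.
SAMPLE = """\
q1,0.82,0.10,3.0
q1,0.41,0.05,1.0
q2,0.77,0.30,2.0
"""


def group_by_query(text):
    """Collect the rows that share a query id (the first column)."""
    groups = defaultdict(list)
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        query_id = row[0]
        values = [float(v) for v in row[1:]]
        groups[query_id].append(values)
    return dict(groups)


groups = group_by_query(SAMPLE)
for query_id, candidates in groups.items():
    print(query_id, "has", len(candidates), "candidate answers")
```

Each key in `groups` then corresponds to one query, and its list holds one entry per candidate answer returned for that query.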

As you said, the last column simply captures whether a human annotator believed that the answer was actually relevant to the question. The 0-4 scale is not mandatory. 0 always represents irrelevant, but beyond that you can use whatever scale makes sense for your use case (people often use a binary 0-1 scale when data is limited, since this reduces complexity).
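As a rough illustration of switching to a binary scale, the sketch below collapses graded labels into 0 or 1. The row layout (label in the last position), the helper names, and the choice to treat every non-zero grade as relevant are assumptions made for this example; a different cutoff may suit your data better.

```python
def to_binary(label):
    """Map a graded relevance label to 0 (irrelevant) or 1 (relevant)."""
    # Assumption: any non-zero grade counts as relevant.
    return 0 if int(label) == 0 else 1


def binarize_rows(rows):
    """Return copies of the rows with the last column made binary."""
    # The last element of each row is assumed to be the relevance label.
    return [row[:-1] + [to_binary(row[-1])] for row in rows]


# Hypothetical rows: query id, two invented feature values, graded label.
rows = [
    ["q1", 0.82, 0.10, 4],
    ["q1", 0.41, 0.05, 0],
    ["q2", 0.77, 0.30, 2],
]
binary_rows = binarize_rows(rows)
for row in binary_rows:
    print(row)
```

The original `rows` list is left unchanged, so both the graded and the binary versions remain available for comparison.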

The Python script available on the documentation page you referenced goes through the process of generating candidate answers and the corresponding feature vectors, given a file containing different queries. You may wish to step through the code in that script to get a better idea of how you might create your training data.
