Extract related articles in different languages using Wikidata Toolkit

随声附和 提交于 2020-01-23 17:57:07

问题


I'm trying to extract interlanguage related articles in Wikidata dump. After searching on the internet, I found out there is a tool named Wikidata Toolkit that helps to work with these type of data. But there is no information about how to find related articles in different languages. For example, the article: "Dresden" in the English language is related to the article: "Dresda" in the Italiano one. I mean the second one is the translated version of the first one. I tried to use the toolkit, but I couldn't find any solution. Please write some example about how to find this related article.


回答1:


you can use Wikidata dump [1] to get a mapping of articles among wikipedias in multiple language.

for example if you see the wikidata entry for Respiratory System[2] at the bottom you see all the articles referring to the same topic in other languages.

That mapping is available in the wikidata dump. Just download wikidata dump and get the mapping and then get the corresponding text from the wikipedia dump. You might encounter some other issues, like resolving wikipedia redirects.

[1] https://dumps.wikimedia.org/wikidatawiki/entities/ [2] https://www.wikidata.org/wiki/Q7891



来源:https://stackoverflow.com/questions/48386791/extract-related-articles-in-different-languages-using-wikidata-toolkit

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!