Getting Wikipedia page view statistics

Submitted by 守給你的承諾、 on 2019-12-12 03:05:53

Question


I'm trying to collect five years of time series data on Wikipedia page views for one particular page ("Bitcoin"). I found http://stats.grok.se useful for getting this data, but I have two issues:

  1. The website returns an "internal server error" whenever 2016 is selected as the year for which to obtain data.

  2. Is there an existing tool that can put this output into a more usable form, such as a .csv file?


Answer 1:


I can't speak for stats.grok.se, as it doesn't appear to be hosted on a Wikimedia production or Labs server. However, Wikimedia provides an API for page view statistics, with data starting in July 2015:

https://wikimedia.org/api/rest_v1/#!/Pageviews_data/get_metrics_pageviews_per_article_project_access_agent_article_granularity_start_end

E.g., daily page views to https://en.wikipedia.org/wiki/Bitcoin over the past year: https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia.org/all-access/all-agents/Bitcoin/daily/20151105/20161105

all-access = desktop+mobile-web+mobile-app

all-agents = user+spider+bot
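
Since the question asks for .csv output, here is a minimal Python sketch of how the per-article endpoint above could be queried and converted. It assumes the JSON response carries an `items` list whose entries have a `timestamp` field (in `YYYYMMDDHH` form) and a `views` field. The function names and the User-Agent string are illustrative placeholders, not part of any official client.

```python
import csv
import io
import json
import urllib.parse
import urllib.request

BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def build_url(article, start, end, project="en.wikipedia.org",
              access="all-access", agent="all-agents", granularity="daily"):
    """Build a per-article request URL; start and end are YYYYMMDD strings."""
    # Titles use underscores for spaces and must be percent-encoded.
    title = urllib.parse.quote(article.replace(" ", "_"), safe="")
    return f"{BASE}/{project}/{access}/{agent}/{title}/{granularity}/{start}/{end}"


def items_to_csv(items):
    """Convert the response's 'items' list into CSV text with date,views columns."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["date", "views"])
    for item in items:
        ts = item["timestamp"]  # assumed format: YYYYMMDDHH
        writer.writerow([f"{ts[:4]}-{ts[4:6]}-{ts[6:8]}", item["views"]])
    return buf.getvalue()


def fetch_items(url, user_agent="pageviews-example/0.1 (contact: you@example.com)"):
    """Fetch the URL and return the 'items' list (placeholder User-Agent)."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["items"]
```

To save the data, one could call `items_to_csv(fetch_items(build_url("Bitcoin", "20151105", "20161105")))` and write the returned string to a file such as `bitcoin_views.csv`. Replace the placeholder User-Agent with one that identifies your own script.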

Historical data can be downloaded from https://dumps.wikimedia.org/other/pagecounts-raw/




Answer 2:


I found an archive of page view statistics from 2007 to 2016 here: https://dumps.wikimedia.org/other/pagecounts-raw/

At the bottom of the page they list several other sources covering various time periods.
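
For the archive files, a rough Python sketch of extracting one page's count might look like the following. It assumes each dump is a gzip-compressed text file covering one hour, with one space-separated line per page in the form `project title count bytes` (for example `en Bitcoin 12 34567`). Check this against the format notes on the archive page before relying on it. The function names are hypothetical.

```python
import gzip


def views_in_lines(lines, title, project="en"):
    """Sum the view counts for an exact project/title match in dump lines."""
    total = 0
    for line in lines:
        parts = line.split(" ")
        if len(parts) != 4:
            continue  # skip lines that do not fit the assumed four-field layout
        if parts[0] == project and parts[1] == title:
            try:
                total += int(parts[2])
            except ValueError:
                continue  # skip lines with a non-numeric count
    return total


def views_in_dump(path, title, project="en"):
    """Open one downloaded .gz dump file and count views for the given page."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        return views_in_lines(fh, title, project)
```

Calling `views_in_dump` on each downloaded file and summing the results by day would give a daily series comparable to the API output. Note that this matches titles exactly, so differently encoded or differently capitalized variants of the same title are not merged.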



Source: https://stackoverflow.com/questions/40445113/getting-wikipedia-page-view-statistics
