Reach a string behind unknown value in JSON

╄→гoц情女王★ submitted on 2019-12-10 20:53:38

Question


I use Wikipedia's API to get information about a page. The API gives me JSON like this:

"query":{
  "pages":{
     "188791":{
        "pageid":188791,
        "ns":0,
        "title":"Vanit\u00e9",
        "langlinks":[
           {
              "lang":"bg",
              "*":"Vanitas"
           },
           {
              "lang":"ca",
              "*":"Vanitas"
           },
           ETC.
        ]
     }
  }
}

You can see the full JSON response.

I want to obtain all entries like:

{
   "lang":"ca",
   "*":"Vanitas"
}

but the number key ("188791") in the pages object is the problem.

I found "Find a value within nested json dictionary in python", which explains how to enumerate the values.

Unfortunately I get the following exception:

TypeError: 'dict_values' object does not support indexing

My code is:

json["query"]["pages"].values()[0]["langlinks"]

It's probably a dumb question but I can't find a way to pass in the page id value.


Answer 1:


As long as you're only querying one page at a time, Simeon Visser's answer will work. However, as a matter of good style, I'd recommend structuring your code so that you iterate over all the returned results, even if you know there should be only one:

for page in data["query"]["pages"].values():
    title = page["title"]
    langlinks = page["langlinks"]
    # do something with langlinks...

In particular, by writing your code this way, if you ever find yourself needing to run the query for multiple pages, you can do it efficiently with a single MediaWiki API request.
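
A minimal sketch of that loop over a response containing two pages. The sample data is made up for illustration (the second page id and its contents are hypothetical), and `.get` is used on the assumption that a page without language links may omit the `langlinks` key:

```python
# Hypothetical sample shaped like the response in the question;
# the second page entry is invented for illustration.
data = {
    "query": {
        "pages": {
            "188791": {
                "pageid": 188791,
                "title": "Vanit\u00e9",
                "langlinks": [
                    {"lang": "bg", "*": "Vanitas"},
                    {"lang": "ca", "*": "Vanitas"},
                ],
            },
            "12345": {"pageid": 12345, "title": "Example"},
        }
    }
}

links_by_title = {}
for page in data["query"]["pages"].values():
    # Assumption: a page may come back without a "langlinks" key.
    links_by_title[page["title"]] = page.get("langlinks", [])

for title, langlinks in links_by_title.items():
    print(title, [link["lang"] for link in langlinks])
```

The loop never needs to know the page ids, so the same code handles one page or many.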




Answer 2:


One solution is to use the indexpageids parameter, e.g.: http://fr.wikipedia.org/w/api.php?action=query&titles=Vanit%C3%A9&prop=langlinks&lllimit=500&format=jsonfm&indexpageids. It will add an array of pageids to the response. You can then use that to access the dictionary.
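
As a rough sketch, assuming the response then carries a `pageids` list of string ids next to `pages`, the lookup could go like this. The dictionary below is a hand-written, trimmed stand-in for the real response, not actual API output:

```python
# Hypothetical, trimmed response as it might look with indexpageids set.
# Assumption: "pageids" lists the keys of "pages" as strings.
response = {
    "query": {
        "pageids": ["188791"],
        "pages": {
            "188791": {
                "pageid": 188791,
                "title": "Vanit\u00e9",
                "langlinks": [
                    {"lang": "bg", "*": "Vanitas"},
                    {"lang": "ca", "*": "Vanitas"},
                ],
            }
        },
    }
}

# Use the first listed id as the key into the "pages" dictionary.
page_id = response["query"]["pageids"][0]
langlinks = response["query"]["pages"][page_id]["langlinks"]
print(page_id, langlinks)
```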




Answer 3:


You're using Python 3 and values() now returns a dict_values instead of a list. This is a view on the values of the dictionary.

That's why you're getting the error: a list can be indexed, but a view cannot.

To fix it:

list(json["query"]["pages"].values())[0]["langlinks"]
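
A small self-contained sketch of the difference, using a made-up `pages` dictionary in place of the real response (the name `pages` is an assumption; it also avoids shadowing the standard `json` module). `next(iter(...))` is shown as one alternative that takes the first value without building a list:

```python
# Hypothetical stand-in for json["query"]["pages"].
pages = {
    "188791": {
        "title": "Vanit\u00e9",
        "langlinks": [{"lang": "bg", "*": "Vanitas"}],
    }
}

values = pages.values()  # a dict_values view in Python 3, not a list

try:
    values[0]  # views do not support indexing
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# Option 1: copy the view into a list, then index it.
first_page = list(values)[0]

# Option 2: take the first item from an iterator over the view.
same_page = next(iter(values))

print(first_page["langlinks"])
```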



Answer 4:


If you really want just one page arbitrarily, do that the way Simeon Visser suggested.

But I suspect you want all langlinks in all pages, yes?

For that, you want a comprehension:

[page["langlinks"] for page in json["query"]["pages"].values()]

But of course that gives you a 2D list. If you want to iterate over each page's links, that's perfect. If you want to iterate over all of the langlinks at once, you want to flatten the list:

[langlink for page in json["query"]["pages"].values()
 for langlink in page["langlinks"]]

… or…

itertools.chain.from_iterable(page["langlinks"] 
                              for page in json["query"]["pages"].values())

(The latter gives you an iterator; if you need a list, wrap the whole thing in list. Conversely, for the first two, if you don't need a list, just any iterable, use parens instead of square brackets to get a generator expression.)
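
To make the three shapes concrete, here is a short sketch run against invented two-page data (the second page and its link are hypothetical, and `data` stands in for the parsed response):

```python
import itertools

# Hypothetical parsed response with two pages; contents are illustrative.
data = {
    "query": {
        "pages": {
            "188791": {
                "title": "Vanit\u00e9",
                "langlinks": [
                    {"lang": "bg", "*": "Vanitas"},
                    {"lang": "ca", "*": "Vanitas"},
                ],
            },
            "12345": {
                "title": "Example",
                "langlinks": [{"lang": "de", "*": "Beispiel"}],
            },
        }
    }
}
pages = data["query"]["pages"]

# 2D: one inner list of langlinks per page.
nested = [page["langlinks"] for page in pages.values()]

# Flat, via a nested comprehension.
flat = [langlink for page in pages.values()
        for langlink in page["langlinks"]]

# Flat, via itertools; wrapped in list() because chain yields an iterator.
chained = list(itertools.chain.from_iterable(
    page["langlinks"] for page in pages.values()))

print(len(nested), len(flat), flat == chained)
```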



Source: https://stackoverflow.com/questions/20010839/reach-a-string-behind-unknown-value-in-json
