Trouble extracting individual JSON values in Ruby

Posted by 家住魔仙堡 on 2019-12-12 02:53:52

Question


I'm in the process of trying to scrape reddit (API-free) and I've run into a brick wall. On reddit, every page has a JSON representation that can be seen simply by appending .json to the end, e.g. https://www.reddit.com/r/AskReddit.json.

I installed the NeatJSON gem (neatjson) and wrote a small chunk of code to clean the JSON up and print it:

require "rubygems"
require "json"
require "net/http"
require "uri"
require 'open-uri'
require 'neatjson'

url = ("https://www.reddit.com/r/AskReddit.json")

result = JSON.parse(open(url).read)

neatJS = JSON.neat_generate(result, wrap: 40, short: true, sorted: true, aligned: true, aroundColonN: 1)

puts neatJS

And it works fine:

(There's way more to that, it goes on for another few pages, the full JSON is here: http://pastebin.com/HDzFXqyU)

However, when I changed it to extract only the values I want:

url = ("https://www.reddit.com/r/AskReddit.json")

result = JSON.parse(open(url).read)

neatJS = JSON.neat_generate(result, wrap: 40, short: true, sorted: true, aligned: true, aroundColonN: 1)

neatJS.each do |data|
  puts data["title"]
  puts data["url"]
  puts data["id"]
end

It gave me an error:

  002----extractallaskredditthreads.rb:17:in `<main>': undefined method `each' for #<String:0x0055f948da9ae8> (NoMethodError)

I've been trying different variations of the extractor for about two days and none of them have worked. I feel like I'm missing something incredibly obvious. If anyone could point out what I'm doing wrong, that would be appreciated.

EDIT

It turns out I had the wrong variable name:

 neatSJ =/= neatJS

However, correcting this only changes the error I got:

 002----extractallaskredditthreads.rb:17:in `<main>': undefined method `each' for #<String:0x0055f948da9ae8> (NoMethodError)

And as I said, I have been attempting multiple ways of extracting the tags, which may have caused my typo.


Answer 1:


In this code:

result = JSON.parse(open(url).read)

neatJS = JSON.neat_generate(result, wrap: 40, short: true, sorted: true, aligned: true, aroundColonN: 1)

...result is a Ruby Hash, produced by JSON.parse from the JSON text. neatJS, on the other hand, is a String: it is what JSON.neat_generate returns when given the result Hash, and it is only useful for display. String does not define each, which is why you get the NoMethodError. To access the values inside the JSON structure, use the result object, not the neatJS string. Note that the posts are nested under "data" and then "children", and each child keeps its own fields under another "data" key:

children = result["data"]["children"]

children.each do |child|
  puts child["data"]["title"]
  puts child["data"]["url"]
  puts child["data"]["id"]
end
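
To make the Hash versus String distinction concrete, here is a minimal sketch that runs without network access. The sample JSON is a hypothetical, heavily trimmed stand-in for a reddit listing (the ids, titles and URLs are made up), and JSON.pretty_generate from the standard library stands in for JSON.neat_generate so that no extra gem is needed:

```ruby
require "json"

# Hypothetical, trimmed-down sample shaped like a reddit listing;
# the real response carries many more keys per post.
sample = <<~JSON
  {
    "kind": "Listing",
    "data": {
      "children": [
        { "kind": "t3", "data": { "id": "abc123", "title": "First post",  "url": "https://example.com/1" } },
        { "kind": "t3", "data": { "id": "def456", "title": "Second post", "url": "https://example.com/2" } }
      ]
    }
  }
JSON

result = JSON.parse(sample)            # a Hash: can be indexed and iterated
pretty = JSON.pretty_generate(result)  # a String: only good for printing

puts result.class
puts pretty.class

# Keep only the fields of interest from each post.
posts = result["data"]["children"].map do |child|
  child["data"].slice("title", "url", "id")
end

posts.each do |post|
  puts "#{post["id"]}: #{post["title"]} (#{post["url"]})"
end
```

The pretty-printed string can still be printed for inspection, but all extraction goes through result. Hash#slice requires Ruby 2.5 or later; on older versions, read the three keys individually as in the loop above.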



Answer 2:


Is it a typo?

neatJS = JSON.neat_generate
[...]
neatSJ.each do |data|


Source: https://stackoverflow.com/questions/35515733/trouble-extracting-individual-json-values-in-ruby
