Convert .json to .csv in Ruby

心在旅途 2020-12-24 09:42

I want to convert a .json file into a .csv file using Ruby. Please help me do this.

Please also suggest any tools that can do this.

7 answers
  •  南方客 (OP)
    2020-12-24 09:49

    Edit:

    The functionality described below is now available as a gem. After installing it with gem install json_converter, the following snippet generates a CSV from a valid JSON string or object:

    require 'json_converter'
    json_converter = JsonConverter.new
    
    # Assume json is a valid JSON string or object
    csv = json_converter.generate_csv json
    

    Original Answer:

    If your JSON data is relatively simple (no nesting or arrays), Alex's answer is probably the cleanest way of handling this problem.
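
For reference, the simple case can be handled with Ruby's standard library alone. Alex's answer is not reproduced here, so the following is only a minimal sketch of that idea and not necessarily his approach; the sample data is made up, and it assumes a flat array of objects that all share the same keys.

```ruby
require 'csv'
require 'json'

# Hypothetical sample data: a flat array of objects with identical keys
json = '[{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]'
rows = JSON.parse(json)

csv = CSV.generate do |out|
  out << rows.first.keys                 # header row taken from the first object
  rows.each { |row| out << row.values }  # one CSV line per object
end

puts csv
```

Anything nested would end up as a stringified hash or array in a single cell, which is why the rest of this answer flattens the data first.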

    However, if you do need to take arrays and nested objects into account, I've attempted to port a web version of such a converter to Ruby. It can be found here. The methods that handle the actual restructuring of data are array_from and flatten.

    The array_from method attempts to identify what a "row" of data looks like for a given dataset. It is not perfect, and you may want to tweak this part for different datasets.

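    # Attempt to identify what a "row" should look like:
    # breadth-first search for the first Array in the parsed JSON
    def array_from(json)
      queue = [json]

      # Looping on the queue (not on the current item) keeps a nil value
      # inside a hash from ending the search early
      until queue.empty?
        item = queue.shift
        return item if item.is_a?(Array)

        # Enqueue the child values so that shallower arrays are found first
        queue.concat(item.values) if item.is_a?(Hash)
      end

      # No array found: treat the whole document as a single row
      [json]
    end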
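
To make the behaviour concrete, here is a small run on made-up input. The snippet carries a condensed variant of array_from so that it runs on its own; the sample hashes are hypothetical.

```ruby
# Condensed variant of array_from, repeated here so the snippet is self-contained
def array_from(json)
  queue = [json]
  until queue.empty?
    item = queue.shift
    return item if item.is_a?(Array)
    queue.concat(item.values) if item.is_a?(Hash)
  end
  [json]
end

# Hypothetical inputs
wrapped = { 'meta' => { 'page' => 1 }, 'data' => { 'items' => [{ 'id' => 1 }, { 'id' => 2 }] } }
single  = { 'id' => 1, 'name' => 'x' }

rows_a = array_from(wrapped)  # the nested "items" array becomes the rows
rows_b = array_from(single)   # no array anywhere, so the whole object is one row

p rows_a
p rows_b
```

If a document contains several arrays, the first one found in this breadth-first order wins, which is the part you may need to tweak.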
    

    The flatten method recursively walks the JSON object(s) and builds a hash that maps each column header to its value. If an object is nested, the header for its column will be prefixed with its parent key(s), delimited by the / character.

    # The path argument is used to construct header columns for nested elements
    def flatten(object, path = '')
      # Integer covers what Fixnum used to (Fixnum was removed in Ruby 3.2);
      # Float is included so that decimal numbers are not silently dropped
      scalars = [String, Integer, Float, FalseClass, TrueClass]
      columns = {}

      case object
      when Hash
        object.each { |key, value| columns.merge!(flatten(value, "#{path}#{key}/")) }
      when Array
        # Array elements are keyed by their position in the array
        object.each_with_index { |value, index| columns.merge!(flatten(value, "#{path}#{index}/")) }
      when *scalars
        # Remove trailing slash from path
        columns[path.chomp('/')] = object
      end

      # Anything else (nil included) contributes no columns
      columns
    end
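
As a quick illustration of the header naming, the snippet below flattens one made-up row. It uses a condensed variant of flatten (hashes and plain values only) so that it runs standalone; the row contents are hypothetical.

```ruby
# Condensed variant of flatten: anything that is not a Hash is treated as a value
def flatten(object, path = '')
  return { path.chomp('/') => object } unless object.is_a?(Hash)

  object.each_with_object({}) do |(key, value), columns|
    columns.merge!(flatten(value, "#{path}#{key}/"))
  end
end

# Hypothetical row with two levels of nesting
row = {
  'Form' => { 'id' => 'x', 'Aggregate_metadata' => { 'id' => 'y' } },
  'User' => { 'username' => 'somedude@example.com' }
}

flat = flatten(row)
puts flat.keys.join(',')    # header line
puts flat.values.join(',')  # value line
```

Each key in the result is the full path from the top of the row down to the value, so two nested "id" fields stay distinct as Form/id and Form/Aggregate_metadata/id.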
    

    If there are any null values in the original JSON, you'll need to convert these to something other than nil before attempting the conversion. flatten skips nil values, so the column simply disappears from that row and you'll generally end up with uneven rows. The nils_to_strings method handles that:

    # Recursively convert all nil values of a hash to empty strings
    def nils_to_strings(hash)
      hash.each_with_object({}) do |(k,v), object|
        case v
        when Hash
          object[k] = nils_to_strings v
        when nil
          object[k] = ''
        else
          object[k] = v
        end
      end
    end
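
Note that nils_to_strings only descends into nested hashes, so a nil sitting inside an array is left as it is. If your data has those, a variant along the following lines might help. deep_nils_to_strings is a hypothetical name used for illustration; it is not part of the original answer or of the gem.

```ruby
# Hypothetical variant: walks hashes and arrays, replacing every nil with ''
def deep_nils_to_strings(value)
  case value
  when Hash  then value.transform_values { |v| deep_nils_to_strings(v) }
  when Array then value.map { |v| deep_nils_to_strings(v) }
  when nil   then ''
  else value
  end
end

# Made-up sample with nils at several depths
sample  = { 'a' => nil, 'b' => [nil, { 'c' => nil }], 'd' => 1 }
cleaned = deep_nils_to_strings(sample)

p cleaned
```

The input is not modified; a new structure is returned, as with nils_to_strings.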
    

    Here's a brief example of how this would be used:

    require 'csv'
    require 'json'

    json = JSON.parse(File.read('in.json'))
    in_array = array_from(json).map { |x| nils_to_strings x }
    out_array = in_array.map { |row| flatten row }

    CSV.open('out.csv', 'w') do |csv|
      # The header comes from the first row; this assumes every row has the same keys
      csv << out_array.first.keys unless out_array.empty?
      out_array.each { |row| csv << row.values }
    end
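
The loop above takes the header from the first row, which only works when every flattened row has the same keys. Where that is not guaranteed, one option is to build the header from the union of all keys and look each value up by header. A sketch under that assumption, with made-up rows standing in for out_array:

```ruby
require 'csv'

# Hypothetical flattened rows whose keys differ
rows = [
  { 'Form/id' => '1', 'User/username' => 'a@example.com' },
  { 'Form/id' => '2', 'Form/name' => 'Survey' }
]

# Union of all keys, in first-seen order
headers = rows.flat_map(&:keys).uniq

csv = CSV.generate do |out|
  out << headers
  # values_at returns nil for a missing key, which CSV writes as an empty field
  rows.each { |row| out << row.values_at(*headers) }
end

puts csv
```

This keeps the columns aligned at the cost of a second pass over the rows to collect the headers.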
    

    And finally, here's some example input/output:

    Input:

    {
      "Forms": [
        {
          "Form": {
            "id": "x",
            "version_id": "x",
            "name": "x",
            "category": "",
            "subcategory": null,
            "is_template": null,
            "moderation_status": "x",
            "display_status": "x",
            "use_ssl": "x",
            "modified": "x",
            "Aggregate_metadata": {
              "id": "x",
              "response_count": "x",
              "submitted_count": "x",
              "saved_count": "x",
              "unread_count": "x",
              "dropout_rate": "x",
              "average_completion_time": null,
              "is_uptodate": "x"
            }
          },
          "User": {
            "username": "somedude@example.com"
          }
        },
        {
          "Form": {
            "id": "x",
            "version_id": "x",
            "name": "x",
            "category": "",
            "subcategory": null,
            "is_template": null,
            "moderation_status": "x",
            "display_status": "x",
            "use_ssl": "x",
            "modified": "x",
            "Aggregate_metadata": {
              "id": "x",
              "response_count": "x",
              "submitted_count": "x",
              "saved_count": "x",
              "unread_count": "x",
              "dropout_rate": "x",
              "average_completion_time": null,
              "is_uptodate": "x"
            }
          },
          "User": {
            "username": "somedude@example.com"
          }
        }
      ]
    }
    

    Output:

    Form/id,Form/version_id,Form/name,Form/category,Form/subcategory,Form/is_template,Form/moderation_status,Form/display_status,Form/use_ssl,Form/modified,Form/Aggregate_metadata/id,Form/Aggregate_metadata/response_count,Form/Aggregate_metadata/submitted_count,Form/Aggregate_metadata/saved_count,Form/Aggregate_metadata/unread_count,Form/Aggregate_metadata/dropout_rate,Form/Aggregate_metadata/average_completion_time,Form/Aggregate_metadata/is_uptodate,User/username
    x,x,x,"","","",x,x,x,x,x,x,x,x,x,x,"",x,somedude@example.com
    x,x,x,"","","",x,x,x,x,x,x,x,x,x,x,"",x,somedude@example.com
    
