Loading contents of a JSON array in Redshift

Submitted by 徘徊边缘 on 2019-12-13 02:26:40

Question


I'm setting up Redshift and importing data from MongoDB. I have successfully used a JSONPaths file for a simple document, but I now need to import from a document that contains an array:

{
   "id":123,
   "things":[
      {
         "foo":321,
         "bar":654
      },
      {
         "foo":987,
         "bar":567
      }
   ]
}

How do I load the above into a table like this:

select * from things;

    id  | foo  | bar
--------+------+-------
   123  | 321  | 654
   123  | 987  | 567

Or is there some other way?

I can't just store the JSON array in a varchar(max) column, because the content of things can exceed 64K.


Answer 1:


Given the following document:

db.baz.insert({
   "myid":123,
   "things":[
      {
         "foo":321,
         "bar":654
      },
      {
         "foo":987,
         "bar":567
      }
   ]
});

The following query returns only the fields you want, still nested inside the things array:

db.baz.find({},{"things.foo":1,"things.bar":1} )

To flatten the result set into one row per array element, use aggregation like so:

db.baz.aggregate([
  // Emit one document per element of the things array
  { $unwind: "$things" },
  // Promote the nested fields to top-level columns and drop the ObjectId
  { $project: { _id: 0, id: "$myid", foo: "$things.foo", bar: "$things.bar" } }
]);
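The flattening step can also be sketched in plain JavaScript, outside the database, to make the intended row shape explicit. This is only an illustration under assumptions: `flattenThings` is a hypothetical helper name, and the one-object-per-line output is a layout that Redshift's COPY command with FORMAT AS JSON 'auto' should be able to read, provided the table columns are named id, foo, and bar.

```javascript
// Hypothetical helper (not part of MongoDB or Redshift): turn one document
// with an embedded "things" array into one flat row per array element.
function flattenThings(doc) {
  return doc.things.map((thing) => ({
    id: doc.myid,
    foo: thing.foo,
    bar: thing.bar,
  }));
}

// Sample document taken from the answer above.
const doc = {
  myid: 123,
  things: [
    { foo: 321, bar: 654 },
    { foo: 987, bar: 567 },
  ],
};

const rows = flattenThings(doc);

// Serialize as one JSON object per line.
const ndjson = rows.map((row) => JSON.stringify(row)).join("\n");
console.log(ndjson);
// {"id":123,"foo":321,"bar":654}
// {"id":123,"foo":987,"bar":567}
```

Because each array element becomes its own small row, no single column has to hold the whole array, which sidesteps the 64K limit mentioned in the question.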


Source: https://stackoverflow.com/questions/24512868/loading-contents-of-json-array-in-redshift
