暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

Vertica通过fjsonparser解析Json数据入表

原创 simonchiang 2024-11-26
104

针对具有多层嵌套的Json格式数据,可以在加载数据的时候使用fjsonparser(flatten_arrays=true,flatten_maps=true)将数组与map展开,展开后可直接通过key查询嵌套的数据。
1、案例数据

[dbadmin@v001 jsontest]$ cat 1.json { "_name_" : "Bobs pizzeria", "cuisine" : "Italian", "location" : {"city" : "Cambridge", "zip" : "02140"}, "menu" : [{"item" : "cheese pizza", "price" : "$8.25"}, {"item" : "chicken pizza", "price" : "$11.99"}, {"item" : "spinach pizza", "price" : "$10.50"}] }
复制

2、创建flex表

--Json最外层字段可直接解析入flex的字段中 create flex table jsontest.restaurant_flex( _name_ varchar(200), cuisine varchar(50) );
复制

3、加载数据到flex表

copy jsontest.restaurant_flex from '/home/dbadmin/jsontest/1.json' parser fjsonparser(flatten_arrays=true,flatten_maps=true);
复制

4、通过MaptoString函数检查__raw__字段中的json数据

select _name_, cuisine ,MaptoString(__raw__) from jsontest.restaurant_flex; -[ RECORD 1 ]------------------------------------------------------------------------------------------------------ _name_ | Bobs pizzeria cuisine | Italian MaptoString | { "_name_": "Bob's pizzeria", "cuisine": "Italian", "location.city": "Cambridge", "location.zip": "02140", "menu.0.item": "cheese pizza", "menu.0.price": "$8.25", "menu.1.item": "chicken pizza", "menu.1.price": "$11.99", "menu.2.item": "spinach pizza", "menu.2.price": "$10.50" }
复制

5、展开的数据可以使用select语句直接查询

select _name_,cuisine,"location.city","menu.0.item" from jsontest.restaurant_flex; -[ RECORD 1 ]-+--------------- _name_ | Bobs pizzeria cuisine | Italian location.city | Cambridge menu.0.item | cheese pizza
复制

5、可通过insert...select...直接将flex表需要的字段写入到普通表

create table jsontest.restaurant as select _name_, cuisine, "location.city" as location_city, "location.zip" as location_zip, "menu.0.item" as menu_0_item, "menu.0.price" as menu_0_price, "menu.1.item" as menu_1_item, "menu.1.price" as menu_1_price, "menu.2.item" as menu_2_item, "menu.2.price" as menu_2_price from jsontest.restaurant_flex; vmart=> \d jsontest.restaurant List of Fields by Tables Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key ------------+------------+---------------+----------------------+--------+---------+----------+-------------+------------- jsontest | restaurant | _name_ | varchar(200) | 200 | | f | f | jsontest | restaurant | cuisine | varchar(50) | 50 | | f | f | jsontest | restaurant | location_city | long varchar(130000) | 130000 | | f | f | jsontest | restaurant | location_zip | long varchar(130000) | 130000 | | f | f | jsontest | restaurant | menu_0_item | long varchar(130000) | 130000 | | f | f | jsontest | restaurant | menu_0_price | long varchar(130000) | 130000 | | f | f | jsontest | restaurant | menu_1_item | long varchar(130000) | 130000 | | f | f | jsontest | restaurant | menu_1_price | long varchar(130000) | 130000 | | f | f | jsontest | restaurant | menu_2_item | long varchar(130000) | 130000 | | f | f | jsontest | restaurant | menu_2_price | long varchar(130000) | 130000 | | f | f | (10 rows) select * from jsontest.restaurant ; _name_ | cuisine | location_city | location_zip | menu_0_item | menu_0_price | menu_1_item | menu_1_price | menu_2_item | menu_2_price ----------------+---------+---------------+--------------+--------------+--------------+---------------+--------------+---------------+-------------- Bobs pizzeria | Italian | Cambridge | 02140 | cheese pizza | $8.25 | chicken pizza | $11.99 | spinach pizza | $10.50 (1 row)
复制

6、建议先将表建好,再插入,使用CTAS创建表字段会自动定义为long varchar(130000),并不是最优

drop table jsontest.restaurant; CREATE TABLE jsontest.restaurant ( _name_ varchar(200), cuisine varchar(50), location_city varchar(200), location_zip varchar(200), menu_0_item varchar(200), menu_0_price varchar(200), menu_1_item varchar(200), menu_1_price varchar(200), menu_2_item varchar(200), menu_2_price varchar(200) ); insert into jsontest.restaurant select _name_, cuisine, "location.city" as location_city, "location.zip" as location_zip, "menu.0.item" as menu_0_item, "menu.0.price" as menu_0_price, "menu.1.item" as menu_1_item, "menu.1.price" as menu_1_price, "menu.2.item" as menu_2_item, "menu.2.price" as menu_2_price from jsontest.restaurant_flex; ERROR 2631: Column "location_city" is of type varchar but expression is of type long varchar HINT: You will need to rewrite or cast the expression insert into jsontest.restaurant select _name_, cuisine, "location.city"::varchar, "location.zip"::varchar, "menu.0.item"::varchar, "menu.0.price"::varchar, "menu.1.item"::varchar, "menu.1.price"::varchar, "menu.2.item"::varchar, "menu.2.price"::varchar from jsontest.restaurant_flex; OUTPUT -------- 1 (1 row) select * from jsontest.restaurant; _name_ | cuisine | location_city | location_zip | menu_0_item | menu_0_price | menu_1_item | menu_1_price | menu_2_item | menu_2_price ----------------+---------+---------------+--------------+--------------+--------------+---------------+--------------+---------------+-------------- Bobs pizzeria | Italian | Cambridge | 02140 | cheese pizza | $8.25 | chicken pizza | $11.99 | spinach pizza | $10.50 (1 row)
复制

7、数组数量未知的情况下,没有的数组会返回空值,不会报错

select _name_, cuisine, "location.city" as location_city, "location.zip" as location_zip, "menu.0.item" as menu_0_item, "menu.0.price" as menu_0_price, "menu.1.item" as menu_1_item, "menu.1.price" as menu_1_price, "menu.2.item" as menu_2_item, "menu.2.price" as menu_2_price, "menu.3.item" as menu_3_item, "menu.3.price" as menu_3_price from jsontest.restaurant_flex; -[ RECORD 1 ]-+--------------- _name_ | Bobs pizzeria cuisine | Italian location_city | Cambridge location_zip | 02140 menu_0_item | cheese pizza menu_0_price | $8.25 menu_1_item | chicken pizza menu_1_price | $11.99 menu_2_item | spinach pizza menu_2_price | $10.50 menu_3_item | menu_3_price |
复制

8、非多层嵌套数据可以直接使用fjsonparser()解析数据到普通表中

[dbadmin@v001 jsontest]$ cat /home/dbadmin/jsontest/m.json { "height" : "14000", "name" : "Kilimanjaro", "type" : "mountain" } { "height" : "29029", "hike_safety" : "34.1", "name" : "Everest", "type" : "mountain" } { "height" : "29029", "hike_safety" : "15.4", "name" : "Mt St Helens", "type" : "volcano" } { "height" : "17000", "hike_safety" : "12.2", "name" : "Denali", "type" : "mountain" } { "hike_safety" : "50.6", "name" : "Mt Washington", "type" : "mountain" } create table jsontest.mountains( name varchar(100), type varchar(100), height int, hike_safety numeric(5,1) ); copy jsontest.mountains from '/home/dbadmin/jsontest/m.json' parser fjsonparser(); select name,type,height,hike_safety from jsontest.mountains; name | type | height | hike_safety ---------------+----------+--------+------------- Denali | mountain | 17000 | 12.2 Everest | mountain | 29029 | 34.1 Kilimanjaro | mountain | 14000 | Mt St Helens | volcano | 29029 | 15.4 Mt Washington | mountain | | 50.6 (5 rows)
复制
最后修改时间:2024-11-26 16:51:45
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论