暂无图片
暂无图片
3
暂无图片
暂无图片
2
暂无图片

Oracle Csv格式文件导入-SQLLDR

原创 布衣 2024-05-09
3106

背景

  最近工作需要把Mysql数据导入到Oracle,于是想到了SQLLDR(SQLLDR是Oracle服务端自带的工具)。趁这次机会整理份操作文档方便后续查看。

SQL*Loader支持3种数据加载方式:

  • 传统路径加载(direct=false):等同于insert语句(默认)
  • 直接路径加载(direct=true):绕过SGA,把数据直接导入高水位线(HWM)以上,可设置并行加载,性能比传统路径加载更高,但限制也更多;
    no_index_errors:默认为false,如果为true表示在加载过程中不会把index设为unusable(在此方式可用);
  • 外部表加载:先为数据文件上创建一个外部表,然后再把数据从外部表insert到目标表中

常用参数介绍:

  • errors:最大允许错误的记录数
  • discardmax :最大允许丢弃的记录数
  • parallel:默认为false,如果为true表示启用并行加载
  • skip_index_maintenance:默认为false,如果为true表示在加载过程中停止索引的维护
  • rows:表示达到多少行将提交,默认(传统路径加载64,直接路径加载all)
  • skip:省略最前面的记录数
  • CHARACTERSET UTF8 :导出文件的字符集,以及导入目标字符集

数据加载对目标表处理方式:

  • SQL*Loader是以INSERT方式加载数据,目标表必须是空表,否则会报错;(默认)
  • APPEND:添加数据至源数据的后面
  • REPLACE:在添加数据之前,先执行DELETE FROM TABLE把表清空,注意:这里的REPLACE是表级别的替代,而不是行级别
  • TRUNCATE:在添加数据之前,先执行TRUNCATE TABLE table_name REUSE STORAGE

注:SQL*Loader不支持对已存在的行更新,如果你想更新已存在的行,则必须先把数据加载至一个空的中间表,然后在原表和中间表进行关联update

示例:
环境准备
  • 准备CSV数据:t.csv
"status","amt","create_time"
"A","90","2024-02-26 13:37:28"
"B","100","2024-03-26 13:37:28"
"B","123","2023-01-26 13:37:28"
"A","10","2024-01-26 13:37:28"
"A","1","2024-03-26 13:37:28"

复制
  • 创建测试表:t1、t2
create table t1 ( id NUMBER(16) not null, status VARCHAR2(2), amt number(5,2), comments VARCHAR2(1000), create_time TIMESTAMP(6), constraint pk_id primary key(id) ); create table t2 ( id NUMBER(16) not null, status VARCHAR2(2), amt number(5,2), comments VARCHAR2(1000), create_time TIMESTAMP(6), constraint pk_t2_id primary key(id) );
复制
  • 创建序列:t_id_seq
create sequence t_id_seq minvalue 1 maxvalue 99999 start with 1 increment by 1 cache 20;
复制
  • 配置:tns
cat /u01/oracle/11.2.0.3/product/network/admin/tnsnames.ora TEST = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = scan-rac.com)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = DB) ) )
复制
  • 执行导入:sqlldr
sqlldr userid=scott/'tiger'@TEST control=/home/oracle/csv/t.ctl log=/home/oracle/csv/t.log
复制
将t.csv数据导入t1:
  • t1.ctl
options(skip=1,BINDSIZE=20971520, ROWS=10000000, READSIZE=20971520, ERRORS=10)
LOAD DATA
CHARACTERSET UTF8          
INFILE 't.csv'             -- 指定csv文件
BADFILE 't.bad'            --符合条件但导入失败,坏数据文件
DISCARDFILE 't.dsc'        -- 指定 丢弃文件
INTO TABLE t1              -- 指定表名,insert:缺省;append:追加;replace:替换
FIELDS TERMINATED BY ','   -- 数据中分隔符 ","   
OPTIONALLY ENCLOSED BY '"' -- 数据中每个字段用 '"' 框起,比如字段中有 "," 分隔符时 
trailing nullcols          -- 表的字段没有对应的值时允许为空  
( 
 status varchvar,  
 amt    numeric(5,2),   
 create_time TIMESTAMP "YYYY-MM-DD HH24:MI:SS", --  TIMESTAMP 指定格式
 comments "'status :'||upper(:status)||'金额:'||:amt",-- 绑定变量对数据加工处理  
 id "t_id_seq.nextval"               -- 这一列直接取序列
 )  
复制
  • 执行导入:sqlldr
sqlldr userid=scott/'tiger'@TEST control=/home/oracle/csv/t1.ctl log=/home/oracle/csv/t1.log SQL*Loader: Release 11.2.0.3.0 - Production on Thu May 9 15:34:49 2024 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. Commit point reached - logical record count 5
复制
  • 查看导入日志:
[oracle@db csv]$ cat t1.log 
SQL*Loader: Release 11.2.0.3.0 - Production on Thu May 9 15:34:49 2024
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
Control File:   /home/oracle/csv/t1.ctl
Character Set UTF8 specified for all input.
Data File:      t.csv
  Bad File:     t.bad
  Discard File: t.dsc 
 (Allow all discards)
Number to load: ALL
Number to skip: 1
Errors allowed: 10
Bind array:     10000000 rows, maximum of 20971520 bytes
Continuation:    none specified
Path used:      Conventional
Table T1, loaded from every logical record.
Insert option in effect for this table: INSERT
TRAILING NULLCOLS option in effect
   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
STATUS                              FIRST    10   ,  O(") CHARACTER            
AMT                                  NEXT     *   ,  O(") CHARACTER            
CREATE_TIME                          NEXT     *   ,  O(") DATETIME YYYY-MM-DD HH24:MI:SS
COMMENTS                             NEXT     *   ,  O(") CHARACTER            
    SQL string for column : "'status :'||upper(:status)||'|AMT'||:amt"
ID                                   NEXT     *   ,  O(") CHARACTER            
    SQL string for column : "t_id_seq.nextval"
value used for ROWS parameter changed from 8323541 to 20087
Table T1:
  5 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.
Space allocated for bind array:               20970828 bytes(20087 rows)
Read   buffer bytes:20971520
Total logical records skipped:          1
Total logical records read:             5
Total logical records rejected:         0
Total logical records discarded:        0
Run began on Thu May 09 15:34:49 2024
Run ended on Thu May 09 15:34:49 2024
Elapsed time was:     00:00:00.12
CPU time was:         00:00:00.05
复制
  • 查看表T1数据
    image.png
将t.csv数据分别导入t1、t2:
  • t.ctl
options(skip=1,BINDSIZE=20971520, ROWS=10000000, READSIZE=20971520, ERRORS=10)
LOAD DATA
CHARACTERSET UTF8
INFILE 't.csv'
BADFILE 't.bad'
DISCARDFILE 't.dsc'
truncate               -- 对t1、t2 truncate 后,再导入
INTO TABLE t1
when st = "A"          -- when 子句选择导入符合条件的记录 :st 与 下面的列:st  对应,或 when (2) = "A"
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
trailing nullcols
(
 st filler POSITION(1), -- 过滤字段,不必在表中出现,主要用于when条件过滤,注意filler在position关键字前边
 status POSITION(1), -- WHEN子句从第一个记录的末尾继续处理。要重置回记录的开始,请为第一个字段设置位置 (1)。
 amt ,
 create_time TIMESTAMP "YYYY-MM-DD HH24:MI:SS",
 comments "'status:'||upper(:status)",
 id "t_id_seq.nextval"
 )
INTO TABLE t2
when st = "B"
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
trailing nullcols
(
 st filler position(1),
 status  position(1),
 amt ,
 create_time TIMESTAMP "YYYY-MM-DD HH24:MI:SS",
 comments "'status:'||upper(:status)",
 id "t_id_seq.nextval"
 )
复制
  • 执行导入:sqlldr
sqlldr userid=scott/'tiger'@TEST control=/home/oracle/csv/t.ctl log=/home/oracle/csv/t.log SQL*Loader: Release 11.2.0.3.0 - Production on Thu May 9 17:40:49 2024 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. Commit point reached - logical record count 5
复制
  • 查看部分日志:t.log
Table T1:
  3 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  2 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.

Table T2:
  2 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  3 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.

Space allocated for bind array:               20969208 bytes(6773 rows)
Read   buffer bytes:20971520

Total logical records skipped:          1
Total logical records read:             5
Total logical records rejected:         0
Total logical records discarded:        0

复制
  • 查看表T1、T2数据
    image.png

定时任务脚本

  • 注:执行sqlldr 命令前一定要加:exec ,否则命令不执行
cat csv_db.sh 
#!/bash/bin
export ORACLE_SID=db
export ORACLE_UNQNAME=db
export ORACLE_BASE=/u01/oracle
export ORACLE_HOME=/u01/oracle/11.2.0.3/product
export NLS_LANG=AMERICAN_AMERICA.AL32UTF8
export OGG_HOME=/u01/ogg
export PATH=$OGG_HOME:$ORACLE_HOME/bin:$PATH
export LD_LIBRARY_PATH=$OGG_HOME:$ORACLE_HOME/lib:$LD_LIBRARY_PATH
exec sqlldr userid=scott/'tiger'@TEST control=/home/oracle/csv/t.ctl log=/home/oracle/csv/t.log
复制

结语

  Oracle 数据库sqlldr使用功能还是很全面且强大的,需要花精力研究。

最后修改时间:2024-07-18 16:21:53
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
1人已赞赏
Z
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

杜伟
暂无图片
9月前
评论
暂无图片 0
Oracle Csv格式文件导入-SQLLDR
9月前
暂无图片 点赞
评论
星星之火
暂无图片
10月前
评论
暂无图片 0
控制好自己的心情,生活才会处处祥和。心情,虽不是人生的全部,却能左右人生的全部。 每天清晨给自己一份鼓励,让今天过得更有意义。生活不会偏爱任何一个不劳而获的人,也不会辜负每一个默默努力的人。 踏实过好每一天,我们才能创造属于自己的精彩人生。
10月前
暂无图片 点赞
评论