第 21 章 Apache Sqoop
目录
21.1. 安装 Sqoop
21.2. sqoop2-tool
21.2.1. verify
21.2.2. upgrade
21.3. sqoop2-shell
21.3.6.1. link
21.3.5.1. create job
21.3.5.2. show job
21.3.5.3. start job
21.3.5.4. status job
21.3.4.1. hdfs-connector
21.3.4.2. generic-jdbc-connector
21.3.2.1. server
21.3.2.2. 要设置可查看具体出错信息
21.3.1. show version
21.3.2. set
21.3.3. show connector
21.3.4. link
21.3.5. job
21.3.6. update
21.4. FAQ
21.4.1. Unable to load native-hadoop library for your platform
21.1. 安装 Sqoop
OSCM 一键安装
curl -s https://raw.githubusercontent.com/oscm/shell/master/database/apache-sqoop/sqoop-1.99.7-bin-hadoop200.sh | bash复制
复制
启动 Sqoop
/srv/apache-sqoop/bin/sqoop.sh server start复制
检查 Sqoop 线程
[hadoop@netkiller ~]$ jps 2512 SecondaryNameNode 23729 SqoopJettyServer 2290 DataNode 871 ResourceManager 23885 Jps复制
21.2. sqoop2-tool
21.2.1. verify 配置文件检验工具
[hadoop@iZj6ciilv2rcpgauqg2uuwZ ~]$ sqoop2-tool verify Setting conf dir: /srv/apache-sqoop/bin/../conf Sqoop home directory: /srv/apache-sqoop Sqoop tool executor: Version: 1.99.7 Revision: 435d5e61b922a32d7bce567fe5fb1a9c0d9b1bbb Compiled on Tue Jul 19 16:08:27 PDT 2016 by abefine Running tool: class org.apache.sqoop.tools.tool.VerifyTool 0 [main] INFO org.apache.sqoop.core.SqoopServer - Initializing Sqoop server. 6 [main] INFO org.apache.sqoop.core.PropertiesConfigurationProvider - Starting config file poller thread Verification was successful. Tool class org.apache.sqoop.tools.tool.VerifyTool has finished correctly.复制
21.3. sqoop2-shell
进入 sqoop2-shell
[hadoop@netkiller ~]$ sqoop2-shell Setting conf dir: /srv/apache-sqoop/bin/../conf Sqoop home directory: /srv/apache-sqoop Sqoop Shell: Type 'help' or '\h' for help. sqoop:000>复制
Sqoop client script:
sqoop2-shell /path/to/your/script.sqoop复制
21.3.1. show version
sqoop:000> show version client version: Sqoop 1.99.7 source revision 435d5e61b922a32d7bce567fe5fb1a9c0d9b1bbb Compiled by abefine on Tue Jul 19 16:08:27 PDT 2016复制
sqoop:000> show version --all client version: Sqoop 1.99.7 source revision 435d5e61b922a32d7bce567fe5fb1a9c0d9b1bbb Compiled by abefine on Tue Jul 19 16:08:27 PDT 2016 0 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable server version: Sqoop 1.99.7 source revision 435d5e61b922a32d7bce567fe5fb1a9c0d9b1bbb Compiled by abefine on Tue Jul 19 16:08:27 PDT 2016 API versions: [v1]复制
21.3.2. set
21.3.2.1. server
sqoop:000> set server --host master --port 12000 --webapp sqoop Server is set successfully复制
21.3.2.2. 要设置可查看具体出错信息
sqoop:000> set option --name verbose --value true Verbose option was changed to true复制
21.3.3. show connector
sqoop:000> show connector 0 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable +------------------------+---------+------------------------------------------------------------+----------------------+ | Name | Version | Class | Supported Directions | +------------------------+---------+------------------------------------------------------------+----------------------+ | generic-jdbc-connector | 1.99.7 | org.apache.sqoop.connector.jdbc.GenericJdbcConnector | FROM/TO | | kite-connector | 1.99.7 | org.apache.sqoop.connector.kite.KiteConnector | FROM/TO | | oracle-jdbc-connector | 1.99.7 | org.apache.sqoop.connector.jdbc.oracle.OracleJdbcConnector | FROM/TO | | ftp-connector | 1.99.7 | org.apache.sqoop.connector.ftp.FtpConnector | TO | | hdfs-connector | 1.99.7 | org.apache.sqoop.connector.hdfs.HdfsConnector | FROM/TO | | kafka-connector | 1.99.7 | org.apache.sqoop.connector.kafka.KafkaConnector | TO | | sftp-connector | 1.99.7 | org.apache.sqoop.connector.sftp.SftpConnector | TO | +------------------------+---------+------------------------------------------------------------+----------------------+ sqoop:000> sqoop list-databases --connect jdbc:mysql://192.168.1.1:3306/ --username root --password 123456复制
sqoop:000> show connector --all复制
21.3.4. link
21.3.4.1. hdfs-connector
sqoop:000> create link -connector hdfs-connector 0 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Creating link for connector with name hdfs-connector Please fill following values to create new link object Name: hdfs HDFS cluster URI: hdfs://127.0.0.1:9000 Conf directory: Additional configs:: There are currently 0 values in the map: entry# New link was successfully created with validation status OK and name hdfs sqoop:000>复制
sqoop:000> show link +------+----------------+---------+ | Name | Connector Name | Enabled | +------+----------------+---------+ | hdfs | hdfs-connector | true | +------+----------------+---------+复制
21.3.4.2. generic-jdbc-connector
sqoop:000> create link -connector generic-jdbc-connector Creating link for connector with name generic-jdbc-connector Please fill following values to create new link object Name: mysql Database connection Driver class: com.mysql.jdbc.Driver Connection String: jdbc:mysql://127.0.0.1:3306/test Username: test Password: **** Fetch Size: Connection Properties: There are currently 0 values in the map: entry# SQL Dialect Identifier enclose: New link was successfully created with validation status OK and name mysql复制
sqoop:000> show link +-------+------------------------+---------+ | Name | Connector Name | Enabled | +-------+------------------------+---------+ | mysql | generic-jdbc-connector | true | | hdfs | hdfs-connector | true | +-------+------------------------+---------+复制
21.3.5. job
21.3.5.1. create job
sqoop:000> create job -f "mysql" -t "hdfs" Creating job for links with from name mysql and to name hdfs Please fill following values to create new job object Name: from-mysql-to-hdfs Database source Schema name: test Table name: member SQL statement: Column names: There are currently 0 values in the list: element# Partition column: Partition column nullable: Boundary query: Incremental read Check column: Last value: Target configuration Override null value: Null value: File format: 0 : TEXT_FILE 1 : SEQUENCE_FILE 2 : PARQUET_FILE Choose: 0 Compression codec: 0 : NONE 1 : DEFAULT 2 : DEFLATE 3 : GZIP 4 : BZIP2 5 : LZO 6 : LZ4 7 : SNAPPY 8 : CUSTOM Choose: 0 Custom codec: Output directory: /sqoop/member Append mode: Throttling resources Extractors: Loaders: Classpath configuration Extra mapper jars: There are currently 0 values in the list: element# New job was successfully created with validation status OK and name from-mysql-to-hdfs复制
21.3.5.2. show job
sqoop:000> show job +----+--------------------+--------------------------------+-----------------------+---------+ | Id | Name | From Connector | To Connector | Enabled | +----+--------------------+--------------------------------+-----------------------+---------+ | 1 | from-mysql-to-hdfs | mysql (generic-jdbc-connector) | hdfs (hdfs-connector) | true | +----+--------------------+--------------------------------+-----------------------+---------+复制
21.3.5.3. start job
sqoop:000> start job -n from-mysql-to-hdfs sqoop:000> start job -n from-mysql-to-hdfs Submission details Job Name: from-mysql-to-hdfs Server URL: http://localhost:12000/sqoop/ Created by: hadoop Creation date: 2017-07-22 23:18:02 CST Lastly updated by: hadoop External ID: job_1499236611045_0001 http://iZj6ciilv2rcpgauqg2uuwZ:8088/proxy/application_1499236611045_0001/ 2017-07-22 23:18:02 CST: BOOTING - Progress is not available复制
启动后进入HDFS查看导入情况
[hadoop@netkiller ~]$ hdfs dfs -ls /sqoop [hadoop@netkiller ~]$ hdfs dfs -ls /member Found 10 items -rw-r--r-- 3 hadoop supergroup 0 2017-07-22 23:18 /member/310af608-5533-4bc2-bfb8-eaa45470b04d.txt -rw-r--r-- 3 hadoop supergroup 48 2017-07-22 23:18 /member/36bc39a5-bc73-4065-a361-ff2d61c4922c.txt -rw-r--r-- 3 hadoop supergroup 0 2017-07-22 23:18 /member/3e855400-84a9-422d-b50c-1baa9666a719.txt -rw-r--r-- 3 hadoop supergroup 140 2017-07-22 23:18 /member/3e8dad92-e0f1-4a74-a337-642cf4e6d634.txt -rw-r--r-- 3 hadoop supergroup 55 2017-07-22 23:18 /member/4a9f47f1-0413-4149-a93a-ed8b51efbc87.txt -rw-r--r-- 3 hadoop supergroup 0 2017-07-22 23:18 /member/4dc5bfe7-1cd9-4d9b-96a8-07e82ed79a71.txt -rw-r--r-- 3 hadoop supergroup 0 2017-07-22 23:18 /member/60dbcc60-61f2-4433-af39-1dfdfc048940.txt -rw-r--r-- 3 hadoop supergroup 0 2017-07-22 23:18 /member/6d02ed89-94d9-4d4b-87ed-d5da9d2bf9fe.txt -rw-r--r-- 3 hadoop supergroup 209 2017-07-22 23:18 /member/cf7b7185-3ab6-4077-943a-26228b769c57.txt -rw-r--r-- 3 hadoop supergroup 0 2017-07-22 23:18 /member/f2e0780d-ad33-4b35-a1c7-b3fbc23e303d.txt复制
21.3.5.4. status job
sqoop:000> status job -n from-mysql-to-hdfs复制
21.3.6. update
21.3.6.1. link
sqoop:000> update link -n mysql Updating link with name mysql Please update link: Name: mysql Database connection Driver class: com.mysql.jdbc.Driver Connection String: jdbc:mysql://127.0.0.1:3306/test Username: test Password: **** Fetch Size: Connection Properties: There are currently 0 values in the map: entry# SQL Dialect Identifier enclose: link was successfully updated with status OK复制
Donations (打赏)
We accept PayPal through:
https://www.paypal.me/netkiller
Wechat (微信) / Alipay (支付宝) 打赏:
http://www.netkiller.cn/home/donations.html
作者相关文章:
CentOS 7.3 + Server JRE 1.8 + Hadoop-2.8.0
Spring boot with Schedule (启用/禁用)
《Netkiller Virtualization 手札》Docker 卷管理
Struts2 S2-046, S2-045 Firewall(漏洞防火墙)
转载请注明出处与作者声明,扫描二维码关注作者公众好,不定期更新文章