Deploying and Using Sqoop

1. Download Sqoop

Sqoop has two major version lines, Sqoop 1 and Sqoop 2. This guide uses Sqoop 1. Sqoop 1 download link: http://www.apache.org/dyn/closer.lua/sqoop/1.4.7

2. Extract the archive

cd /data/

tar -zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz

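3. Edit the configuration file, mainly to set the Hadoop and Hive directories

cd sqoop-1.4.7.bin__hadoop-2.6.0/conf

cp sqoop-env-template.sh sqoop-env.sh

Set the Hadoop and Hive paths:

vim sqoop-env.sh

export HADOOP_COMMON_HOME=/path/to/hadoop   # Hadoop installation directory

export HADOOP_MAPRED_HOME=/path/to/hadoop   # Hadoop installation directory

export HIVE_HOME=/path/to/hive   # Hive installation directory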
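
Before moving on, it can help to confirm that each path set in sqoop-env.sh points at a real directory. The following is a minimal sketch, assuming a POSIX shell; check_home is a hypothetical helper, not part of Sqoop.

```shell
# Hypothetical helper (not part of Sqoop): check that a configured home
# directory actually exists.
# $1 = variable name, $2 = the directory it is set to
check_home() {
    if [ -d "$2" ]; then
        echo "OK $1=$2"
    else
        echo "MISSING $1=$2"
        return 1
    fi
}

# Example usage, run from the conf directory after editing sqoop-env.sh:
#   . ./sqoop-env.sh
#   check_home HADOOP_COMMON_HOME "$HADOOP_COMMON_HOME"
#   check_home HADOOP_MAPRED_HOME "$HADOOP_MAPRED_HOME"
#   check_home HIVE_HOME "$HIVE_HOME"
```

A MISSING line usually means a typo in the path or a variable that was left at its template value.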

4. Download the MySQL JDBC driver jar into Sqoop's lib directory

cd ../lib

wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.19/mysql-connector-java-8.0.19.jar

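5. Connect to MySQL and import a table

Run the import from Sqoop's bin directory. The JDBC URL is quoted so that the shell does not interpret the ? character, and each continued line ends with a backslash:

./sqoop import --connect 'jdbc:mysql://ip:3306/test?zeroDateTimeBehavior=CONVERT_TO_NULL' \
--username 'db_username' --password 'db_password' --table table_name \
--fields-terminated-by ',' \
--target-dir '/data/hive/test'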
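
Passing --password on the command line leaves the password in the shell history and the process list. Sqoop 1.4.x also accepts -P, which prompts for the password on the console. The sketch below is a dry-run wrapper that prints the command by default so it can be reviewed first. run and import_table are hypothetical helper names, and the host, user, and table arguments are placeholders.

```shell
# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the import from step 5.
# $1 = MySQL host, $2 = database user, $3 = table name
import_table() {
    run ./sqoop import \
        --connect "jdbc:mysql://$1:3306/test?zeroDateTimeBehavior=CONVERT_TO_NULL" \
        --username "$2" \
        -P \
        --table "$3" \
        --fields-terminated-by ',' \
        --target-dir '/data/hive/test'
}

# Dry run: prints the command instead of running it.
import_table ip db_username table_name
```

Setting DRY_RUN=0 before calling import_table runs the import for real; in that case the call has to be made from Sqoop's bin directory.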

 

6. View the imported data on HDFS

hdfs dfs -ls /data/hive/test
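
Sqoop's import runs as a map-only job, so the target directory normally contains files named like part-m-00000. To sanity-check the ',' delimiter set in step 5, one option is to count the fields on each line. This is a minimal sketch; count_fields is a hypothetical helper, and the awk pipeline assumes that no field value itself contains a comma.

```shell
# Hypothetical helper: read comma-delimited lines on stdin and print each
# distinct field count once. A single number in the output means every row
# has the same number of fields.
count_fields() {
    awk -F',' '{ print NF }' | sort -n | uniq
}

# Example usage against the imported files (requires a running HDFS):
#   hdfs dfs -cat '/data/hive/test/part-m-*' | count_fields
```

If more than one number is printed, some rows probably contain the delimiter inside a value, and a different --fields-terminated-by character may be a better choice.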

Error: the import in step 5 may fail with the following:

ERROR tool.ImportTool: Import failed: java.io.IOException: Generating splits for a textual index column allowed only in case of "-Dorg.apache.sqoop.splitter.allow_text_splitter=true" property passed as a parameter
	at org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat.getSplits(DataDrivenDBInputFormat.java:204)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:310)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:327)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588)
	at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:200)
	at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:173)
	at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:270)
	at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
	at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:127)

  

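Fix: add the -Dorg.apache.sqoop.splitter.allow_text_splitter=true parameter, which allows the primary key used for splitting to be a string. Because it is a generic Hadoop option, it must come directly after import, before the tool-specific arguments.

That is:

./sqoop import -Dorg.apache.sqoop.splitter.allow_text_splitter=true \
--connect 'jdbc:mysql://ip:3306/test?zeroDateTimeBehavior=CONVERT_TO_NULL' \
--username 'db_username' --password 'db_password' --table table_name \
--fields-terminated-by ',' \
--target-dir '/data/hive/test'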
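
Enabling the text splitter is not the only way around this error. Sqoop needs a column to divide the rows among parallel map tasks, and by default it uses the primary key. Two other options, sketched below, are to name a numeric column with --split-by, or to run a single map task with -m 1 so that no splitting is needed. split_args is a hypothetical helper and id is an assumed numeric column, not one taken from the table above.

```shell
# Hypothetical helper: print the extra import arguments for a split strategy.
# $1 = name of a numeric column to split on; empty means "use one map task".
split_args() {
    if [ -n "$1" ]; then
        echo "--split-by $1"
    else
        echo "-m 1"
    fi
}

# Example usage (the column name `id` is an assumption):
#   ./sqoop import \
#       --connect 'jdbc:mysql://ip:3306/test?zeroDateTimeBehavior=CONVERT_TO_NULL' \
#       --username 'db_username' -P --table table_name \
#       --fields-terminated-by ',' --target-dir '/data/hive/test' \
#       $(split_args id)
```

A single map task is the simplest choice for small tables, while --split-by keeps the import parallel when the table has a suitable numeric column.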