Original article: http://blog.csdn.net/nsrainbow/article/details/43002387 . Follow the original author's blog for the latest lessons and a better reading experience.
Disclaimer
- This article is based on CentOS 6.x + CDH 5.x
Preparing the data
Create a text file named a.txt with the following content:
1,terry
2,alex
3,jimmy
4,mike
5,kate
Then upload it to the /data/ directory on the Hive server.
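If you prefer, the sample file can be generated with a few lines of Java (the rows are exactly the ones listed above; the file is written to the current directory, and you still need to upload it to /data/ on the Hive server yourself):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class MakeSample {
    public static void main(String[] args) throws IOException {
        // The five sample rows from this article; key and value separated by a comma
        List<String> rows = Arrays.asList("1,terry", "2,alex", "3,jimmy", "4,mike", "5,kate");
        // Written to the current directory; upload the result to /data/ on the Hive server
        Path out = Paths.get("a.txt");
        Files.write(out, rows);
        System.out.println(Files.readAllLines(out).size());
    }
}
```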
Calling Hive via JDBC
Loading the driver
Load the driver (only the HiveServer2 JDBC driver is covered here):
Class.forName("org.apache.hive.jdbc.HiveDriver");
Connecting to the database
Connection con = DriverManager.getConnection("jdbc:hive2://host1:10000/default", "hive", "");
- host1 is the hostname of the Hive server
- 10000 is Hive's default port
- default is the default database
- hive is the default username, and the default password is empty
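The URL follows the pattern jdbc:hive2://<host>:<port>/<database>. A tiny helper makes the pieces explicit (the method name hive2Url is illustrative only, not part of the Hive API):

```java
public class HiveUrl {
    // Builds a HiveServer2 JDBC URL from its three parts
    static String hive2Url(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Reproduces the URL used throughout this article
        System.out.println(hive2Url("host1", 10000, "default"));
    }
}
```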
Database operations
Drop the table:
stmt.execute("drop table if exists " + tableName);
Create the table:
stmt.execute("create table " + tableName + " (key int, value string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054'");
Query data:
ResultSet res = stmt.executeQuery("select * from " + tableName);
Load data:
stmt.execute("load data local inpath '" + filepath + "' into table " + tableName);
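One detail worth noting: the '\054' in the CREATE TABLE statement is the octal escape for a comma, so the field delimiter matches the commas in a.txt. A quick standalone check:

```java
public class DelimiterCheck {
    public static void main(String[] args) {
        // '\054' is the octal escape for the comma character (ASCII 44)
        char delim = '\054';
        System.out.println(delim == ',');
        // Splitting one sample row on that delimiter yields key and value
        String[] parts = "1,terry".split(String.valueOf(delim));
        System.out.println(parts[0] + " " + parts[1]);
    }
}
```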
Example
Setting up the project
Open Eclipse and create a Maven project. The pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.crazycake</groupId>
<artifactId>play-hive</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>play-hive</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>0.14.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.2.0</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>2.0.2</version>
<configuration>
<source>1.6</source>
<target>1.6</target>
<encoding>UTF-8</encoding>
<optimize>true</optimize>
<compilerArgument>-nowarn</compilerArgument>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.3</version>
<configuration>
<transformers>
<transformer
implementation="org.apache.maven.plugins.shade.resource.ApacheLicenseResourceTransformer">
</transformer>
</transformers>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
The two blocks that matter most are:
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>0.14.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.2.0</version>
</dependency>
Everything else can be whatever you like.
Creating the table, loading and querying data
Create a class HiveJdbcClient:
package org.crazycake.play_hive;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/**
 * Tests the Hive client connection over JDBC.
 * @author alexxiyang (https://github.com/alexxiyang)
 */
public class HiveJdbcClient {

    /**
     * Note: hive-server2 uses the driver org.apache.hive.*, while the old
     * hive-server used org.apache.hadoop.hive.*
     */
    private static String driverName = "org.apache.hive.jdbc.HiveDriver";

    /**
     * @param args
     * @throws SQLException
     */
    public static void main(String[] args) throws SQLException {
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            // Without the driver on the classpath nothing else can work
            e.printStackTrace();
            System.exit(1);
        }
        // Hive's default port is 10000; to change it, edit the
        // hive.server2.thrift.port property in hive-site.xml.
        // The default username is hive and the default password is empty.
        Connection con = DriverManager.getConnection("jdbc:hive2://host1:10000/default", "hive", "");
        Statement stmt = con.createStatement();
        // The test table name
        String tableName = "testhivedrivertable";
        // Drop it if it already exists
        stmt.execute("drop table if exists " + tableName);
        // Create the table
        stmt.execute("create table " + tableName + " (key int, value string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054'");
        // Check that the table was created
        String sql = "show tables '" + tableName + "'";
        System.out.println("Running: " + sql);
        ResultSet res = stmt.executeQuery(sql);
        if (res.next()) {
            System.out.println(res.getString(1));
        }
        // Inspect the table schema
        sql = "describe " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
            System.out.println(res.getString(1) + "\t" + res.getString(2));
        }
        // Load data into the table.
        // NOTE: filepath is where the file lives on the Hive server, not on your
        // own machine! Upload the file to the server first; the path below is its
        // location on that server, here /data/a.txt
        String filepath = "/data/a.txt";
        sql = "load data local inpath '" + filepath + "' into table " + tableName;
        System.out.println("Running: " + sql);
        stmt.execute(sql);
        // select * query
        sql = "select * from " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
            System.out.println(String.valueOf(res.getInt(1)) + "\t" + res.getString(2));
        }
    }
}
The output is:
Running: show tables 'testhivedrivertable'
testhivedrivertable
Running: describe testhivedrivertable
key int
value string
Running: load data local inpath '/data/a.txt' into table testhivedrivertable
Running: select * from testhivedrivertable
1 terry
2 alex
3 jimmy
4 mike
5 kate
Calling Hive from Java is really this simple: you just execute, through JDBC, the same statements you would type in the hive shell. The statements run in the environment of the Hive server machine, so any path inside them is resolved starting from the Hive server host's filesystem. That is why a.txt has to be uploaded to the server before this code will run correctly.
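One more caveat about building these statement strings: identifiers such as table names cannot be bound as JDBC PreparedStatement parameters, so when a table name is concatenated straight into SQL it is worth validating it first. A sketch, with an illustrative helper name:

```java
public class LoadSqlBuilder {
    // Assembles the LOAD DATA statement. "local" means local to the HiveServer2
    // host, not to the machine running the JDBC client.
    static String buildLoadSql(String filepath, String tableName) {
        // Identifiers cannot be bound as ? parameters, so guard the concatenation
        if (!tableName.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("suspicious table name: " + tableName);
        }
        return "load data local inpath '" + filepath + "' into table " + tableName;
    }

    public static void main(String[] args) {
        System.out.println(buildLoadSql("/data/a.txt", "testhivedrivertable"));
    }
}
```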
References
- https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC
Alex's Hadoop Tutorial for Beginners, Lesson 11: Calling Hive from Java