Learning Hive: Database DDL


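    Hive provides a data definition language (DDL) similar to SQL's. For anyone familiar with SQL, Hive's DDL is very easy to learn, and even for those who have never touched SQL it is not hard. I have some knowledge of SQL but would not dare claim to be fluent in it, so I plan to study HiveQL in depth, following the usual SQL learning curve: DDL first, then DML (data manipulation language). Because some statements need to be demonstrated, a few others such as show and describe have to be used before they are formally introduced.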

Create/Drop/Alter Database

    The syntax for creating a database is:

CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
[COMMENT database_comment]
[LOCATION hdfs_path]
[WITH DBPROPERTIES (property_name=property_value, ...)];

    When creating a database you can specify its storage location on HDFS as well as its properties. For example:

hive> show databases;
OK
default
Time taken: 1.842 seconds, Fetched: 1 row(s)
hive> create database learning comment 'Learning Hive Database' with dbproperties('creator'='hadoop','date'='2014-06-04', 'test'='First database');
OK
Time taken: 5.274 seconds
hive> show databases;
OK
default
learning
Time taken: 0.022 seconds, Fetched: 2 row(s)
hive> describe database learning;
OK
learning Learning Hive Database   hdfs://hadoop:9000/user/hive/warehouse/learning.db hadoop
Time taken: 0.078 seconds, Fetched: 1 row(s)

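    As the describe output shows, a new database is stored under /user/hive/warehouse by default. This location is controlled by the hive.metastore.warehouse.dir parameter, whose default value is that directory. The owner of the database is the hadoop user. The alter statement, whose syntax is given later, can change the owner to hive: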

hive> alter database learning set owner user hive;
OK
Time taken: 0.255 seconds
hive> describe database learning;
OK
learning Learning Hive Database   hdfs://hadoop:9000/user/hive/warehouse/learning.db hive
Time taken: 0.015 seconds, Fetched: 1 row(s)
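
    The create statement shown earlier leaves out the IF NOT EXISTS and LOCATION clauses from the syntax. As a minimal sketch (the database name demo_db and the path /tmp/hive_demo/demo_db.db are made up for illustration and do not come from the session above), they could be used like this:

```sql
-- Hypothetical name and path, chosen only for illustration.
-- IF NOT EXISTS makes the statement a no-op when demo_db already exists.
CREATE DATABASE IF NOT EXISTS demo_db
COMMENT 'Demo database with an explicit location'
LOCATION '/tmp/hive_demo/demo_db.db'
WITH DBPROPERTIES ('creator'='hadoop');

-- EXTENDED also prints the DBPROPERTIES, which the plain
-- DESCRIBE DATABASE output does not include.
DESCRIBE DATABASE EXTENDED demo_db;
```

    With LOCATION given, the database directory should be created at that path rather than under hive.metastore.warehouse.dir.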

    The syntax for dropping a database is:

DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];

hive> drop database learning;
OK
Time taken: 0.933 seconds
hive> show databases;
OK
default
Time taken: 0.037 seconds, Fetched: 1 row(s)
hive> dfs -lsr /user/hive/warehouse;
drwxr-xr-x   - hadoop supergroup          0 2014-05-23 16:43 /user/hive/warehouse/page_view
drwxr-xr-x   - hadoop supergroup          0 2014-05-14 11:29 /user/hive/warehouse/pokes

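    The output above shows that a database containing no tables can be dropped directly, and that its directory is removed from the warehouse. What happens if the database does contain tables? The following example demonstrates this case (the learning database was evidently created again beforehand, although that step is not shown):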

hive> use learning;
OK
Time taken: 0.105 seconds
hive> create table how(name string);
OK
Time taken: 0.982 seconds
hive> dfs -lsr /user/hive/warehouse;
drwxr-xr-x   - hadoop supergroup          0 2014-06-04 11:11 /user/hive/warehouse/learning.db
drwxr-xr-x   - hadoop supergroup          0 2014-06-04 11:11 /user/hive/warehouse/learning.db/how
drwxr-xr-x   - hadoop supergroup          0 2014-05-23 16:43 /user/hive/warehouse/page_view
drwxr-xr-x   - hadoop supergroup          0 2014-05-14 11:29 /user/hive/warehouse/pokes
hive> drop database learning;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. InvalidOperationException(message:Database learning is not empty. One or more tables exist.)
hive> drop database learning restrict;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. InvalidOperationException(message:Database learning is not empty. One or more tables exist.)
hive> drop database learning cascade;
OK
Time taken: 3.151 seconds
hive> show databases;
OK
default
Time taken: 0.019 seconds, Fetched: 1 row(s)
hive> dfs -lsr /user/hive/warehouse;
drwxr-xr-x   - hadoop supergroup          0 2014-05-23 16:43 /user/hive/warehouse/page_view
drwxr-xr-x   - hadoop supergroup          0 2014-05-14 11:29 /user/hive/warehouse/pokes

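    A database that contains tables cannot be dropped directly: Hive reports that the database is not empty because one or more tables exist. In that case the CASCADE keyword drops the database together with its tables. The RESTRICT keyword is equivalent to the default behavior.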
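
    The IF EXISTS clause from the drop syntax is not exercised in the sessions above. A short sketch, using a hypothetical database name demo_db that is not part of the original example:

```sql
-- demo_db is a hypothetical name used only for illustration.
-- IF EXISTS suppresses the error that would otherwise be raised
-- when the database does not exist.
DROP DATABASE IF EXISTS demo_db;

-- Combined with CASCADE, any tables in the database are dropped as well.
DROP DATABASE IF EXISTS demo_db CASCADE;
```

    This form is convenient in cleanup scripts that may run more than once.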

    The syntax for altering a database is:

ALTER DATABASE database_name SET DBPROPERTIES (property_name=property_value, ...);
ALTER DATABASE database_name SET OWNER [USER|ROLE] user_or_role;
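
    The SET OWNER form was demonstrated earlier. As a sketch of the SET DBPROPERTIES form, assuming the learning database has been created again with the same properties as before (the new date value and the 'editor' property are made up for illustration):

```sql
-- Assumes a database named learning exists.
-- Overwrites the existing 'date' property and adds a hypothetical 'editor' property.
ALTER DATABASE learning SET DBPROPERTIES ('date'='2014-06-05', 'editor'='hive');

-- EXTENDED lists the database properties so the change can be checked.
DESCRIBE DATABASE EXTENDED learning;
```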
