Greenplum+Hadoop学习笔记-14-定义数据库对象之创建与管理表空间,hadoop-14-


6.2.创建与管理表空间

表空间建立在文件空间之上,文件空间建立在一系列文件系统之上。关于gpfilespace的所有说明如下所示:

 

[gpadmin@master gpfs]$ gpfilespace --help

COMMAND NAME: gpfilespace

 

Creates a filespace using a configuration file that defines per-segment file system locations. Filespaces describe the physical file system resources to be used by a tablespace.

*****************************************************

SYNOPSIS

*****************************************************

 

gpfilespace [<connection_option> ...] [-l <logfile_directory>]  [-o [<output_fs_config_file>]]

 

gpfilespace [<connection_option> ...] [-l <logfile_directory>]    -c <fs_config_file>

 

gpfilespace movetempfilespace {<filespace_name>|default}

 

gpfilespace movetransfilespace {<filespace_name>|default}

 

gpfilespace --showtempfilespace

 

gpfilespace --showtransfilespace

 

gpfilespace -v | -?

*****************************************************

DESCRIPTION

*****************************************************

 

A tablespace requires a file system location to store its database files. In Greenplum Database, the master and each segment (primary and mirror) needs its own distinct storage location. This collection of file system locations for all components in a Greenplum system is referred to as a filespace. Once a filespace is defined, it can be used by one or more tablespaces.

 

When used with the -o option, the gpfilespace utility looks up your system configuration information in the Greenplum Database catalog tables and prompts you for the appropriate file system locations needed to create the filespace. It then outputs a configuration file that can be used to create a filespace. If a file name is

not specified, a gpfilespace_config_<#> file will be created in the current directory by default. 

 

Once you have a configuration file, you can run gpfilespace with the -c option to create the filespace in Greenplum Database.

 

You will need to create a filespace before you can use the gpfilespace --movetempfilespace or --movetransfilespace option to move your temporary or transaction files to the new location.

 

Use either gpfilespace --showtempfilespace or --showtransfilespace options to show the name of the filespace currently associated with temporary or transaction files.

 

Note: If segments are down due to a power or nic failure, you may see inconsistencies during filespace creation. You may not be able to bring up the Greenplum Database.

 

 

*****************************************************

OPTIONS

*****************************************************

 

-c | --config <fs_config_file>

 

 A configuration file containing:

 * An initial line denoting the new filespace name. For example: filespace:myfs

 * One line each for the master, the primary segments, and the    mirror segments. A line describes a file system location that  a particular segment database instance should use as its data  directory location to store database files associated with a  tablespace. Each line is in the format of:

   <hostname>:<dbid>:/<filesystem_dir>/<seg_datadir_name>

 

 

-l | --logdir <logfile_directory>

 The directory to write the log file. Defaults to ~/gpAdminLogs.

-o | --output <output_file_name>

 The directory location and file name to output the generated  filespace configuration file. You will be prompted to enter a  name for the filespace, a master file system location, the primary segment file system locations, and the mirror segment file system locations. For example, if your configuration has  2 primary and 2 mirror segments per host, you will be prompted for a total of 5 locations (including the master). The file system locations must exist on all hosts in your system prior  to running the gpfilespace utility. The utility will designate  segment-specific data directories within the location(s) you  specify, so it is possible to use the same location for multiple  segments. However, primaries and mirrors cannot use the same  location. After the utility creates the configuration file, you  can manually edit the file to make any required changes to the  filespace layout before creating the filespace in Greenplum Database.

 

--movetempfilespace {<filespace_name>|default}

 

 Moves temporary files to a new filespace or to the default location.

 

--movetransfilespace {<filespace_name>|default}

 

 Moves transaction files to a new filespace or to the default location.

 

--showtempfilespace

 

 Show the name of the filespace currently associated with temporary  files. This option checks that all primary and mirror segments, master  and master standby are using the same filespace or temporary files.You will receive a warning message and an email if any inconsistencies  exist.

 

--showtransfilespace

 

 Show the name of the filespace currently associated with transaction  files. This option checks that all primary and mirror segments, master  and master standby are using the same filespace or transaction files. You will receive a warning message and an email if any inconsistencies exist.

 

-v | --version (show utility version)

 

 Displays the version of this utility.

 

 

-? | --help (help)

 

 Displays the utility usage and syntax.

 

****************************

CONNECTION OPTIONS

****************************

 

-h host | --host host

 

 The host name of the machine on which the Greenplum master  database server is running. If not specified, reads from  the environment variable PGHOST or defaults to localhost.

 

-p port | --port port

 

 The TCP port on which the Greenplum master database server is listening for connections. If not specified, reads from  the environment variable PGPORT or defaults to 5432.

 

 

-U username | --username superuser_name

 

 The database superuser role name to connect as. If not specified, reads from the environment variable PGUSER or  defaults to the current system user name. Only database  superusers are allowed to create filespaces.

 

 

-W | --password

 

 Force a password prompt.

 

 

*****************************************************

EXAMPLES

*****************************************************

 

Create a filespace configuration file. You will be prompted to enter a name for the filespace, a master file system location, the primary segment file system locations, and the mirror segment file system locations. For example, if your configuration has 2 primary and 2 mirror segments per host, you will be prompted for a total of 5 locations (including the master). The file system locations must exist on all hosts in your system prior to running the gpfilespace utility:

 

 $ gpfilespace -o .

 Enter a name for this filespace

 > fastdisk

 

 Checking your configuration:

 Your system has 2 hosts with 2 primary and 2 mirror segments per host.

 

 Configuring hosts: [sdw1, sdw2]

 

 Please specify 2 locations for the primary segments, one per line:

 primary location 1> /gp_pri_filespc

 primary location 2> /gp_pri_filespc

 

 Please specify 2 locations for the mirror segments, one per line:

 mirror location 1> /gp_mir_filespc

 mirror location 2> /gp_mir_filespc

 

 Enter a file system location for the master:

 master location> /gp_master_filespc

 

 

 ***************************************

 Example filespace configuration file:

 

 filespace:fastdisk

 mdw:1:/gp_master_filespc/gp-1

 sdw1:2:/gp_pri_filespc/gp0

 sdw1:3:/gp_mir_filespc/gp1

 sdw2:4:/gp_mir_filespc/gp0

 sdw2:5:/gp_pri_filespc/gp1

 

 

Execute the configuration file to create the filespace

in Greenplum Database:

 

 $ gpfilespace -c gpfilespace_config_1

 

 

*****************************************************

SEE ALSO

*****************************************************

 

CREATE FILESPACE, CREATE TABLESPACE

 

6.2.1.创建文件空间:在所有GP HOST主机上准备好逻辑文件系统

[root@master greenplum]# gpssh -f seg_hosts -e 'mkdir -p /gpfs2/seg';

[slave1] mkdir -p /gpfs2/seg

[slave2] mkdir -p /gpfs2/seg

[root@master greenplum]# gpssh -f seg_hosts -e 'chown gpadmin /gpfs2/seg';

[slave1] chown gpadmin /gpfs2/seg

[slave2] chown gpadmin /gpfs2/seg

[root@master greenplum]# mkdir -p /gpfs/master

[root@master greenplum]#  chown gpadmin /gpfs/master/

[root@master greenplum]# ls -ltr /gpfs/master

total 0

 

使用gpfilespace创建文件系统步骤如下:

  • 创建文件系统的配置文件:$gpfilespace –o gpfilespace_config
  • 根据提示输入文件空间名称和位置
  • 生成配置文件,检查配置内容
  • 基于配置文件创建文件空间:$gpfilespace –c gpfilespace_config

[gpadmin@master ~]$ gpfilespace -o gpfilespace_config

20150327:23:40:17:004809 gpfilespace:master:gpadmin-[INFO]:-

A tablespace requires a file system location to store its database

files. A filespace is a collection of file system locations for all components

in a Greenplum system (primary segment, mirror segment and master instances).

Once a filespace is created, it can be used by one or more tablespaces.

 

 

20150327:23:40:17:004809 gpfilespace:master:gpadmin-[INFO]:-getting config

Enter a name for this filespace

> fs01

 

Checking your configuration:

Your system has 2 hosts with 1 primary and 0 mirror segments per host.

Your system has 1 hosts with 0 primary and 0 mirror segments per host.

 

Configuring hosts: [slave1, slave2]

Please specify 1 locations for the primary segments, one per line:

primary location 1> /gpfs2/seg

 

Configuring hosts: [master]

 

Enter a file system location for the master

master location> /gpfs/master

20150327:23:42:31:004809 gpfilespace:master:gpadmin-[INFO]:-Creating configuration file...

20150327:23:42:31:004809 gpfilespace:master:gpadmin-[INFO]:-[created]

20150327:23:42:31:004809 gpfilespace:master:gpadmin-[INFO]:-

To add this filespace to the database please run the command:

   gpfilespace --config /home/gpadmin/gpfilespace_config

 

[gpadmin@master ~]$ cat gpfilespace_config

filespace:fs01

master:1:/gpfs/master/gpseg-1

slave1:2:/gpfs2/seg/gpseg0

slave2:3:/gpfs2/seg/gpseg1

[gpadmin@master ~]$ gpfilespace -c gpfilespace_config

20150327:23:43:32:005042 gpfilespace:master:gpadmin-[INFO]:-

A tablespace requires a file system location to store its database

files. A filespace is a collection of file system locations for all components

in a Greenplum system (primary segment, mirror segment and master instances).

Once a filespace is created, it can be used by one or more tablespaces.

 

 

20150327:23:43:32:005042 gpfilespace:master:gpadmin-[INFO]:-getting config

Reading Configuration file: 'gpfilespace_config'

20150327:23:43:32:005042 gpfilespace:master:gpadmin-[INFO]:-Performing validation on paths

..............................................................................

 

20150327:23:43:33:005042 gpfilespace:master:gpadmin-[INFO]:-Connecting to database

20150327:23:43:33:005042 gpfilespace:master:gpadmin-[INFO]:-Filespace "fs01" successfully created

 

6.2.2.转移临时文件或事务文件的位置

6.2.2.1.作用

改善DB的查询性能.备份性能.连续存储数据的性能;

6.2.2.2.关于临时文件和事务文件

  • 只有具备超级管理员权限的用户才可以移动位置,只有gpfilespace工具可以写该文件;
  • 只能为临时文件或事务文件指定一个文件空间;在生产系统中规划存储时,需要专门为临时文件或者事务文件创建固定的分区;
  • 如果文件空间被临时文件使用,不能删除;

6.2.2.3.使用gpfilespace移动临时文件

  •  确保文件空间存在,且与存储其他用户数据的文件空间不同;但如果空间有限也可以相同;
  • 停止GPDB;
  •  将GPDB启动为限制模式:$gpfilespace –movetempfilespace test_fs,限制模式是为了保证没有任何用户可以连接到GP数据库中;movetempfilespace参数会检查当前GP数据库是否为限制模式,如果是启动状态会重新启动到限制模式;如果是停止状态则启动到限制模式;
  •  注意:临时文件位置在Instance中配合共享内存使用,在创建、打开、删除临时文件用到;

[gpadmin@master gpfs]$ psql -d testdw             连接到任一数据库中并查看当前所有的文件空间

psql (8.2.15)

Type "help" for help.

 

testdw=# select * from pg_filespace;

  fsname   | fsowner

-----------+---------

 pg_system |      10

 fs01      |      10

(2 rows)

 

testdw=# \q

[gpadmin@master gpfs]$ gpfilespace --movetempfilespace fs01;

20150329:16:52:03:008118 gpfilespace:master:gpadmin-[INFO]:-

A tablespace requires a file system location to store its database

files. A filespace is a collection of file system locations for all components

in a Greenplum system (primary segment, mirror segment and master instances).

Once a filespace is created, it can be used by one or more tablespaces.

 

 

20150329:16:52:03:008118 gpfilespace:master:gpadmin-[INFO]:-Database was started in NORMAL mode

20150329:16:52:03:008118 gpfilespace:master:gpadmin-[INFO]:-Stopping Greenplum Database

20150329:16:52:06:008118 gpfilespace:master:gpadmin-[INFO]:-Starting Greenplum Database

20150329:16:52:12:008118 gpfilespace:master:gpadmin-[INFO]:-Stopping Greenplum Database

20150329:16:52:15:008118 gpfilespace:master:gpadmin-[INFO]:-Starting Greenplum Database in master only mode

20150329:16:52:16:008118 gpfilespace:master:gpadmin-[INFO]:-Checking if filespace fs01 exists

20150329:16:52:16:008118 gpfilespace:master:gpadmin-[INFO]:-Checking if filespace is same as current filespace

20150329:16:52:16:008118 gpfilespace:master:gpadmin-[INFO]:-Stopping Greenplum Database in master only mode

20150329:16:52:18:008118 gpfilespace:master:gpadmin-[INFO]:-Checking for connectivity

20150329:16:52:18:008118 gpfilespace:master:gpadmin-[INFO]:-Obtaining current filespace information

20150329:16:52:18:008118 gpfilespace:master:gpadmin-[INFO]:-Obtaining current filespace entries used by TEMPORARY_FILES

20150329:16:52:18:008118 gpfilespace:master:gpadmin-[INFO]:-Obtaining segment information ...

20150329:16:52:18:008118 gpfilespace:master:gpadmin-[INFO]:-Creating RemoteOperations list

20150329:16:52:18:008118 gpfilespace:master:gpadmin-[INFO]:-Moving TEMPORARY_FILES filespace from pg_system to fs01 ...

20150329:16:52:18:008118 gpfilespace:master:gpadmin-[INFO]:-Starting Greenplum Database

 

testdw=# \db              查看当前默认的文件空间                      

         List of tablespaces

    Name    |  Owner  | Filespae Name

------------+---------+---------------

 pg_default | gpadmin | pg_system

 pg_global  | gpadmin | pg_system

(2 rows)

 

6.2.2.4.使用gpfilespace移动事务文件

  •  确保文件空间存在,且与存储其他用户数据的文件空间不同
  • 停止GPDB
  •  将GPDB启动为限制模式:$gpfilespace –movetransfilespace test_fs
  •  注意:事务文件位置在Instance中配合共享内存使用,在创建.打开.删除事务文件时用到。

[gpadmin@master gpfs]$ gpfilespace --movetransfilespace fs01;

20150329:16:57:20:008777 gpfilespace:master:gpadmin-[INFO]:-

A tablespace requires a file system location to store its database files. A filespace is a collection of file system locations for all components

in a Greenplum system (primary segment, mirror segment and master instances).

Once a filespace is created, it can be used by one or more tablespaces.

20150329:16:57:20:008777 gpfilespace:master:gpadmin-[INFO]:-Database was started in NORMAL mode

20150329:16:57:20:008777 gpfilespace:master:gpadmin-[INFO]:-Stopping Greenplum Database

20150329:16:57:23:008777 gpfilespace:master:gpadmin-[INFO]:-Starting Greenplum Database

20150329:16:57:30:008777 gpfilespace:master:gpadmin-[INFO]:-Stopping Greenplum Database

20150329:16:57:33:008777 gpfilespace:master:gpadmin-[INFO]:-Starting Greenplum Database in master only mode

20150329:16:57:34:008777 gpfilespace:master:gpadmin-[INFO]:-Checking if filespace fs01 exists

20150329:16:57:34:008777 gpfilespace:master:gpadmin-[INFO]:-Checking if filespace is same as current filespace

20150329:16:57:34:008777 gpfilespace:master:gpadmin-[INFO]:-Stopping Greenplum Database in master only mode

20150329:16:57:36:008777 gpfilespace:master:gpadmin-[INFO]:-Checking for connectivity

20150329:16:57:36:008777 gpfilespace:master:gpadmin-[INFO]:-Obtaining current filespace information

20150329:16:57:36:008777 gpfilespace:master:gpadmin-[INFO]:-Obtaining current filespace entries used by TRANSACTION_FILES

20150329:16:57:36:008777 gpfilespace:master:gpadmin-[INFO]:-Obtaining segment information ...

20150329:16:57:36:008777 gpfilespace:master:gpadmin-[INFO]:-Creating RemoteOperations list

20150329:16:57:36:008777 gpfilespace:master:gpadmin-[INFO]:-Moving TRANSACTION_FILES filespace from pg_system to fs01 ...

20150329:16:57:44:008777 gpfilespace:master:gpadmin-[INFO]:-Starting Greenplum Database

 

6.2.3.创建表空间

  • 必须是由超级用户创建,且在Master节点进行创建;
  • 使用CREATETABLESPACE命令;
  • 创建完成后授予使用权限给普通用户;

testdw=# \h create tablespace

Command:     CREATE TABLESPACE

Description: define a new tablespace

Syntax:

CREATE TABLESPACE tablespace_name [OWNER username] FILESPACE filespace_name

testdw=# \db

         List of tablespaces

    Name    |  Owner  | Filespae Name

------------+---------+---------------

 pg_default | gpadmin | pg_system

 pg_global  | gpadmin | pg_system

(2 rows)

 

testdw=# CREATE TABLESPACE testspace FILESPACE fs01;

CREATE TABLESPACE

testdw=# \db

         List of tablespaces

    Name    |  Owner  | Filespae Name

------------+---------+---------------

 pg_default | gpadmin | pg_system

 pg_global  | gpadmin | pg_system

 testspace  | gpadmin | fs01

(3 rows)

testdw=# grant create on tablespace testspace to admin;

GRANT

 

6.2.4.使用表空间存储DB对象

  • 使用指定表空间创建对象;
  • 使用缺省表空间参数default_tablespace来设定:SETdefault_tablespace = testspace后创建表,此时不需要指定表空间;
  • 如果一个表空间与DB关联,那么其将存储所有该DB的系统日志.临时文件等;

testdw=# create table tb01(id int) tablespace testspace;

NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.

HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.

CREATE TABLE

testdw=# \d

                List of relations

 Schema |    Name     | Type  |  Owner  | Storage

--------+-------------+-------+---------+---------

 public | tb01        | table | gpadmin | heap

 public | tb1_test_01 | table | gpadmin | heap

(2 rows)

 

testdw=# set default_tablespace=testspace;

SET

testdw=# create table tb02(id int);

NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.

HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.

CREATE TABLE

testdw=# select * from pg_tables where tablespace='testspace';

 schemaname | tablename | tableowner | tablespace | hasindexes | hasrules | hastriggers

------------+-----------+------------+------------+------------+----------+-------------

 public     | tb01      | gpadmin    | testspace  | f          | f        | f

 public     | tb02      | gpadmin    | testspace  | f          | f        | f

(2 rows)

 

  • 查看现有的表空间和文件空间

2个缺省表空间和1个缺省文件空间:

pg_global:存储系统日志信息

pg_default:存储template1和template0模版DB

pg_system:系统初始化时使用的数据目录

获取文件空间的信息,查询语句如下所示:

SELECT spcname as tblspc, fsname as filespc, fsedbid as seg_dbid, fselocation as datadir

         FROM pg_tablespace pgts, pg_filespace pgfs, pg_filespace_entry pgfse

         WHERE pgts.spcfsoid=pgfse.fsefsoid

    AND pgfse.fsefsoid=pgfs.oid

    ORDER BY tblspc, seg_dbid;

  •  删除表空间和文件空间

对象被删除前,表空间不能被删除,表空间不能进行级联删除,需要先删除该表空间中的对象;

表空间被删除前,文件空间不能被删除;

通过DROP TABLESPACE命令删除表空间;

testdw-# \h DROP TABLESPACE

Command:     DROP TABLESPACE

Description: remove a tablespace

Syntax:

DROP TABLESPACE [ IF EXISTS ] tablespacename

通过DROP FILESPACE 命令删除文件空间=#DROP FILESPACE test_fs;

testdw-# \h drop filespace

Command:     DROP FILESPACE

Description: remove a filespace

Syntax:

DROP FILESPACE [IF EXISTS] filespacename

相关内容

    暂无相关文章