HDFS Snapshot Study Notes


Original post: http://blog.csdn.net/ashic/article/details/47068183
Official documentation: http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html

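Overview
An HDFS snapshot is a read-only, point-in-time copy of the file system. A snapshot can be taken of a subtree of the file system or of the entire file system. Snapshots are commonly used for data backup, protection against user errors, and disaster recovery.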

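HDFS snapshots are efficient:
Snapshot creation is instantaneous: the cost is O(1), excluding the inode lookup time.
Additional memory is used only when modifications are made relative to a snapshot: memory usage is O(M), where M is the number of modified files or directories.
Blocks in DataNodes are not copied: a snapshot records only the block list and the file size.
Snapshots do not adversely affect regular HDFS operations: modifications are recorded in reverse chronological order so that the current data can be accessed directly. Snapshot data is computed by subtracting the modifications from the current data.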
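
The bookkeeping described above can be sketched with a toy in-memory model. This is not HDFS code: the class `ToyDir` and its methods are hypothetical names made up for illustration, and a plain dict stands in for the directory contents. It only shows why creating a snapshot is O(1), why memory grows with the number of modified entries, and how a snapshot view is derived from the current data by undoing recorded changes, newest first.

```python
class ToyDir:
    """Toy model of one snapshottable directory (illustrative, not HDFS code)."""

    def __init__(self):
        self.current = {}    # name -> content; always the latest state
        self.snapshots = []  # snapshot names, oldest first
        self.diffs = {}      # snapshot name -> {name: old content, or None}

    def create_snapshot(self, name):
        # O(1): nothing is copied, only an empty diff record is added.
        self.snapshots.append(name)
        self.diffs[name] = {}

    def _record(self, name):
        # Save the pre-change value in the latest snapshot's diff, only once.
        if self.snapshots:
            diff = self.diffs[self.snapshots[-1]]
            if name not in diff:
                diff[name] = self.current.get(name)  # None: did not exist

    def write(self, name, content):
        self._record(name)
        self.current[name] = content

    def delete(self, name):
        self._record(name)
        self.current.pop(name, None)

    def read_snapshot(self, snap):
        # Start from the current data and undo diffs, newest first, back to snap.
        view = dict(self.current)
        start = self.snapshots.index(snap)
        for s in reversed(self.snapshots[start:]):
            for name, old in self.diffs[s].items():
                if old is None:
                    view.pop(name, None)
                else:
                    view[name] = old
        return view


d = ToyDir()
d.write("a", 1)
d.write("b", 2)
d.create_snapshot("s0")
d.write("a", 10)
d.delete("b")
d.write("c", 3)
d.create_snapshot("s1")
d.write("a", 100)
```

Reads of `d.current` never touch the diffs, which mirrors the point that regular operations see the latest data directly.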

Snapshottable Directories
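Snapshots can be taken only of directories that have been set as snapshottable. A snapshottable directory can hold up to 65,536 simultaneous snapshots. Administrators may set any directory to be snapshottable. If a snapshottable directory contains snapshots, it can be neither deleted nor renamed until all of those snapshots are deleted.
Nested snapshottable directories are not allowed: a directory cannot be set as snapshottable if one of its ancestors or descendants is already a snapshottable directory.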
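
The nesting rule can be expressed as a small path check. The sketch below is only an illustration of the rule as stated here, not how the NameNode implements it; the function names `is_strict_ancestor` and `can_allow_snapshot` are hypothetical.

```python
import posixpath


def is_strict_ancestor(ancestor, path):
    """True if `ancestor` is a proper ancestor directory of `path`."""
    ancestor = posixpath.normpath(ancestor)
    path = posixpath.normpath(path)
    if ancestor == path:
        return False
    if ancestor == "/":
        return True
    # Compare whole path components so that /a is not an ancestor of /ab.
    return path.startswith(ancestor + "/")


def can_allow_snapshot(path, snapshottable_dirs):
    """Reject `path` if any existing snapshottable dir is above or below it."""
    for existing in snapshottable_dirs:
        if is_strict_ancestor(existing, path) or is_strict_ancestor(path, existing):
            return False
    return True
```

For example, with `/a` already snapshottable, `/a/b` is rejected while the sibling `/ab` is accepted.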

Snapshot Paths
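For a snapshottable directory, the path component ".snapshot" is used to access its snapshots. Suppose /foo is a snapshottable directory, bar is a file or directory in /foo, and /foo has a snapshot s0. Then the path /foo/.snapshot/s0/bar refers to the snapshot copy of bar.
The usual API and CLI work with ".snapshot" paths. Some examples: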

    List all snapshots under a snapshottable directory:
    hdfs dfs -ls /foo/.snapshot

    List the files in snapshot s0:
    hdfs dfs -ls /foo/.snapshot/s0

    Copy a file from snapshot s0:
    hdfs dfs -cp /foo/.snapshot/s0/bar /tmp
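
When scripting around these commands, the ".snapshot" path convention is easy to capture in a helper. The function `snapshot_path` below is a hypothetical convenience, not part of any Hadoop API; it only builds the strings used in the examples above.

```python
import posixpath


def snapshot_path(snapshottable_dir, snapshot_name=None, relative=None):
    """Build a path under <dir>/.snapshot[/<snapshot>[/<relative>]]."""
    parts = [snapshottable_dir, ".snapshot"]
    if snapshot_name:
        parts.append(snapshot_name)
        if relative:
            # Strip a leading slash so join() does not restart at the root.
            parts.append(relative.lstrip("/"))
    return posixpath.join(*parts)
```

The results can be passed to `hdfs dfs -ls` or `hdfs dfs -cp` exactly like ordinary paths.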

Snapshot Operations
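The administrator operations below (allowSnapshot and disallowSnapshot) require superuser privilege. The remaining operations are user operations with their own permission requirements; the HDFS superuser can perform all of them.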

Allow Snapshots
Allow snapshots of a directory to be created. If the operation succeeds, the directory becomes snapshottable.

    hdfs dfsadmin -allowSnapshot <path>

        [root@gc2 oracle]# hdfs dfsadmin -allowSnapshot /snap
        Allowing snaphot on /snap succeeded

Disallow Snapshots
All snapshots of the directory must be deleted before snapshots can be disallowed.

    hdfs dfsadmin -disallowSnapshot <path>

        [root@gc2 oracle]# hdfs dfsadmin -disallowSnapshot /snap
        Disallowing snaphot on /snap succeeded

Create Snapshots
Create a snapshot of a snapshottable directory. This operation requires owner privilege on the directory.

    hdfs dfs -createSnapshot <path> [<snapshotName>]
        If snapshotName is omitted, a default name is generated from the timestamp using the format "'s'yyyyMMdd-HHmmss.SSS", for example "s20130412-151029.033".

        [root@gc2 oracle]# hdfs dfs -createSnapshot /snap
        Created snapshot /snap/.snapshot/s20150726-120414.379

        [root@gc2 oracle]# hdfs dfs -createSnapshot /snap s0
        Created snapshot /snap/.snapshot/s0

        [root@gc2 oracle]# hdfs dfs -ls -R /snap/.snapshot
        drwxr-xr-x   - root supergroup          0 2015-07-26 12:04 /snap/.snapshot/s0
        -rw-r--r--   1 root supergroup        831 2015-07-26 11:56 /snap/.snapshot/s0/hehe.ora
        -rw-r--r--   1 root supergroup         72 2015-07-26 11:55 /snap/.snapshot/s0/sum.sh
        -rw-r--r--   1 root supergroup        754 2015-07-26 11:56 /snap/.snapshot/s0/test.sh
        drwxr-xr-x   - root supergroup          0 2015-07-26 12:04 /snap/.snapshot/s20150726-120414.379
        -rw-r--r--   1 root supergroup        831 2015-07-26 11:56 /snap/.snapshot/s20150726-120414.379/hehe.ora
        -rw-r--r--   1 root supergroup         72 2015-07-26 11:55 /snap/.snapshot/s20150726-120414.379/sum.sh
        -rw-r--r--   1 root supergroup        754 2015-07-26 11:56 /snap/.snapshot/s20150726-120414.379/test.sh
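
The default name pattern "'s'yyyyMMdd-HHmmss.SSS" is a Java date format: a literal "s", the date, the time, and milliseconds. A rough Python equivalent is sketched below for scripts that need to generate or recognize such names; `default_snapshot_name` is a hypothetical helper, and the name HDFS actually assigns comes from the NameNode, not from this function.

```python
from datetime import datetime


def default_snapshot_name(now):
    """Format a datetime like the default snapshot name, e.g. s20130412-151029.033."""
    millis = now.microsecond // 1000  # SSS means milliseconds, zero-padded to 3 digits
    return now.strftime("s%Y%m%d-%H%M%S") + ".%03d" % millis


name = default_snapshot_name(datetime(2013, 4, 12, 15, 10, 29, 33000))
```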

Delete Snapshots

    hdfs dfs -deleteSnapshot <path> <snapshotName>

        [root@gc2 oracle]# hdfs dfs -deleteSnapshot /snap s0

        [root@gc2 oracle]# hdfs dfs -ls -R /snap/.snapshot
        drwxr-xr-x   - root supergroup          0 2015-07-26 12:04 /snap/.snapshot/s20150726-120414.379
        -rw-r--r--   1 root supergroup        831 2015-07-26 11:56 /snap/.snapshot/s20150726-120414.379/hehe.ora
        -rw-r--r--   1 root supergroup         72 2015-07-26 11:55 /snap/.snapshot/s20150726-120414.379/sum.sh
        -rw-r--r--   1 root supergroup        754 2015-07-26 11:56 /snap/.snapshot/s20150726-120414.379/test.sh

Rename Snapshots

    hdfs dfs -renameSnapshot <path> <oldName> <newName>

        [root@gc2 oracle]# hdfs dfs  -createSnapshot /snap s0
        Created snapshot /snap/.snapshot/s0

        [root@gc2 oracle]# hdfs dfs  -renameSnapshot /snap s0 s1

        [root@gc2 oracle]# hadoop fs -ls -R /snap/.snapshot
        drwxr-xr-x   - root supergroup          0 2015-07-26 12:10 /snap/.snapshot/s1
        -rw-r--r--   1 root supergroup        831 2015-07-26 11:56 /snap/.snapshot/s1/hehe.ora
        -rw-r--r--   1 root supergroup         72 2015-07-26 11:55 /snap/.snapshot/s1/sum.sh
        -rw-r--r--   1 root supergroup        754 2015-07-26 11:56 /snap/.snapshot/s1/test.sh
        drwxr-xr-x   - root supergroup          0 2015-07-26 12:04 /snap/.snapshot/s20150726-120414.379
        -rw-r--r--   1 root supergroup        831 2015-07-26 11:56 /snap/.snapshot/s20150726-120414.379/hehe.ora
        -rw-r--r--   1 root supergroup         72 2015-07-26 11:55 /snap/.snapshot/s20150726-120414.379/sum.sh
        -rw-r--r--   1 root supergroup        754 2015-07-26 11:56 /snap/.snapshot/s20150726-120414.379/test.sh

Get Snapshottable Directory Listing
List all snapshottable directories in which the current user has permission to take snapshots.

    hdfs lsSnapshottableDir

        [root@gc2 oracle]# hdfs lsSnapshottableDir
        drwxr-xr-x 0 root supergroup 0 2015-07-26 12:10 2 65536 /snap
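
To use this listing in a script, each line can be split into fields. The column meanings in the sketch below are an assumption based on the sample line above (permissions, replication, owner, group, size, modification date and time, number of snapshots, snapshot quota, path); check them against your Hadoop version before relying on them. `SnapshottableDir` and `parse_snapshottable_line` are hypothetical names.

```python
from collections import namedtuple

SnapshottableDir = namedtuple(
    "SnapshottableDir",
    "permissions replication owner group size date time snapshot_count quota path",
)


def parse_snapshottable_line(line):
    """Parse one output line of `hdfs lsSnapshottableDir` (assumed column order)."""
    # Split into at most 10 fields so a path containing spaces stays intact.
    f = line.split(None, 9)
    if len(f) != 10:
        raise ValueError("unexpected line format: %r" % line)
    return SnapshottableDir(
        f[0], int(f[1]), f[2], f[3], int(f[4]), f[5], f[6], int(f[7]), int(f[8]), f[9]
    )


entry = parse_snapshottable_line(
    "drwxr-xr-x 0 root supergroup 0 2015-07-26 12:10 2 65536 /snap"
)
```

Under this reading, the sample line says that /snap currently holds 2 snapshots out of a quota of 65536.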

Get Snapshots Difference Report
Get the differences between two snapshots. This operation requires read access to all files and directories in both snapshots.

    hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>

    Results:
    +   The file/directory has been created.
    -   The file/directory has been deleted.
    M   The file/directory has been modified.
    R   The file/directory has been renamed.

        Before running this experiment, delete the file /snap/sum.sh and then create snapshot s2 of /snap.
        [root@gc2 oracle]# hadoop fs -rm /snap/sum.sh
        15/07/26 12:16:20 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
        Moved: 'hdfs://localhost:9000/snap/sum.sh' to trash at: hdfs://localhost:9000/user/root/.Trash/Current
        The two lines above show that the file was not permanently deleted but moved to the trash, where it is kept for 1440 minutes.

        [root@gc2 oracle]# hdfs dfs -createSnapshot /snap s2
        Created snapshot /snap/.snapshot/s2

        [root@gc2 oracle]# hdfs snapshotDiff /snap s1 s2
        Difference between snapshot s1 and snapshot s2 under directory /snap:
        M       .
        -       ./sum.sh

        s1 has one more file than s2, namely sum.sh. A convenient way to read the report is s1 - xxx = s2: applying the listed changes to s1 yields s2.
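
The way the report is read (changes that turn the first snapshot into the second) can be illustrated with a small comparison of two file listings. This is a simplified sketch, not the HDFS algorithm: it assumes a single flat directory, uses file size as a stand-in for "modified", ignores renames (R), and marks the directory itself as modified whenever an entry was created or deleted, which matches the sample output above. `snapshot_diff` is a hypothetical name.

```python
def snapshot_diff(from_files, to_files):
    """Compare two {path: size} listings and return (marker, path) pairs."""
    entries = []
    for path in sorted(set(from_files) | set(to_files)):
        if path not in to_files:
            entries.append(("-", path))   # present in "from", gone in "to"
        elif path not in from_files:
            entries.append(("+", path))   # new in "to"
        elif from_files[path] != to_files[path]:
            entries.append(("M", path))   # same name, different size
    # Creating or deleting a child also changes the directory itself.
    if any(marker in "+-" for marker, _ in entries):
        entries.insert(0, ("M", "."))
    return entries


s1 = {"./hehe.ora": 831, "./sum.sh": 72, "./test.sh": 754}
s2 = {"./hehe.ora": 831, "./test.sh": 754}
report = snapshot_diff(s1, s2)
```

Swapping the arguments reverses the direction: `snapshot_diff(s2, s1)` reports `./sum.sh` as created instead of deleted.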

Snapshot information can also be viewed in the NameNode web UI:
http://192.168.255.169:50070/dfshealth.html#tab-snapshot

Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.
