Using Hadoop 2.x on Ubuntu, Part 12: Combining HDFS Cluster HA (QJM) with Federation
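A solution for scalability and fault tolerance
The Federation cluster is already in place, which covers the scalability side of a large Hadoop cluster. Each individual namenode server is still a single point of failure, though, so HA with QJM is needed on top of it to provide automatic failover.
I already have two servers, namenode1 and namenode2, each managing its own namespace. I now treat them as the active machines and clone two virtual machines to act as their standby machines.
QJM also needs at least 3 JournalNodes. To save machines, datanode1, datanode2 and datanode3 double as the JournalNodes for namenode1. Three more datanode servers are then created, and they double as the JournalNodes for namenode2.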
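The JournalNode layout described above can be sketched as the `qjournal://` URIs that `dfs.namenode.shared.edits.dir` takes. This is only an illustration under assumptions: the nameservice IDs `ns1` and `ns2`, the hostnames `datanode4` and `datanode5`, the use of the nameservice ID as journal ID, and the per-nameservice key suffix are not taken from the setup above, and 8485 is the default JournalNode RPC port.

```python
# Sketch: build the qjournal:// URI for each nameservice in the layout above.
# ns1/ns2, datanode4/datanode5 and the key suffix are assumptions, not the
# author's actual configuration.

JOURNAL_PORT = 8485  # default JournalNode RPC port


def qjournal_uri(journal_hosts, journal_id, port=JOURNAL_PORT):
    """Return qjournal://host1:port;host2:port;.../journal_id."""
    if len(journal_hosts) < 3:
        raise ValueError("QJM needs at least 3 JournalNodes")
    authority = ";".join(f"{host}:{port}" for host in journal_hosts)
    return f"qjournal://{authority}/{journal_id}"


def tolerated_failures(journal_count):
    """Edits must reach a majority of JournalNodes, so (N - 1) // 2 may fail."""
    return (journal_count - 1) // 2


# One JournalNode group per nameservice, hosted on the datanode machines.
layout = {
    "ns1": ["datanode1", "datanode2", "datanode3"],
    "ns2": ["datanode4", "datanode5", "datanode6"],
}

shared_edits = {
    f"dfs.namenode.shared.edits.dir.{ns}": qjournal_uri(hosts, ns)
    for ns, hosts in layout.items()
}

for key, value in shared_edits.items():
    print(key, "=", value)
print("JournalNode failures tolerated per group:", tolerated_failures(3))
```

With three JournalNodes per group, each nameservice keeps accepting edits as long as two of its three JournalNodes are reachable.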
Architecture diagram: [figure]
Configuration
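Adding 3 datanodes to the federation
Clone a virtual machine from datanode1, copy it to another physical host, install it there, and then clone it 2 more times.
Once that was done, I noticed something odd: each namenode could see only 3 datanode servers, and which three it saw changed from one check to the next.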
hduser@namenode1:~$ hdfs dfsadmin -printTopology
Rack: /168/1
   192.168.1.73:50010 (datanode1)
   192.168.1.74:50010 (datanode2)
   192.168.1.75:50010 (datanode3)

hduser@namenode1:~$ hdfs dfsadmin -printTopology
Rack: /168/1
   192.168.1.74:50010 (datanode2)
   192.168.1.75:50010 (datanode3)
   192.168.1.78:50010 (datanode6)
The view from namenode2 can differ from what namenode1 shows:
hduser@namenode2:~$ hdfs dfsadmin -printTopology
Rack: /168/1
   192.168.1.74:50010 (datanode2)
   192.168.1.75:50010 (datanode3)
   192.168.1.78:50010 (datanode6)
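To make the pattern in these listings easier to see, here is a small sketch that parses two `-printTopology` outputs and separates the datanodes that are always listed from the ones that come and go. The two sample strings are the namenode1 outputs shown above; the regular expression and the function name are my own.

```python
import re

# Matches lines such as "192.168.1.73:50010 (datanode1)".
NODE_PATTERN = re.compile(r"(\d+\.\d+\.\d+\.\d+):(\d+) \((\w+)\)")


def datanodes_in(topology_output):
    """Return the set of hostnames listed in printTopology output."""
    return {host for _ip, _port, host in NODE_PATTERN.findall(topology_output)}


run_a = """Rack: /168/1
   192.168.1.73:50010 (datanode1)
   192.168.1.74:50010 (datanode2)
   192.168.1.75:50010 (datanode3)"""

run_b = """Rack: /168/1
   192.168.1.74:50010 (datanode2)
   192.168.1.75:50010 (datanode3)
   192.168.1.78:50010 (datanode6)"""

first, second = datanodes_in(run_a), datanodes_in(run_b)
always_seen = sorted(first & second)    # present in both runs
intermittent = sorted(first ^ second)   # present in only one run

print("always listed:", always_seen)
print("intermittent: ", intermittent)
```

On the outputs above, datanode2 and datanode3 appear in both runs, while the third slot alternates between datanode1 and datanode6.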
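This may be a consequence of how HDFS is designed. It does not look like a datanode startup failure, because I checked the logs and did not find any error messages. I am noting it here and will look into it later.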
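One possible cause worth ruling out, since the new datanodes were cloned from datanode1: a clone also copies the identity that the datanode stores in the `current/VERSION` file under its `dfs.datanode.data.dir`. A namenode that receives registrations from several machines with the same identity may treat them as a single node, which would fit the listings where datanode1 and datanode6 take turns in the same slot. This is a hypothesis, not something confirmed on this cluster. The sketch below compares the identities from VERSION file contents; the sample file contents and ID values are made up, and whether the key is `storageID` or `datanodeUuid` depends on the Hadoop release.

```python
from collections import defaultdict


def node_identity(version_text):
    """Extract the datanode identity from the text of a VERSION file."""
    props = {}
    for line in version_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    # Assumption: newer releases use datanodeUuid, older ones storageID.
    return props.get("datanodeUuid") or props.get("storageID")


def shared_identities(version_by_host):
    """Map each identity held by more than one host to those hosts."""
    hosts_by_id = defaultdict(list)
    for host, text in version_by_host.items():
        identity = node_identity(text)
        if identity is not None:
            hosts_by_id[identity].append(host)
    return {i: sorted(h) for i, h in hosts_by_id.items() if len(h) > 1}


# Hypothetical VERSION contents, as if copied from each host by hand.
sample = {
    "datanode1": "#comment\nstorageID=DS-aaa\nstorageType=DATA_NODE\n",
    "datanode2": "storageID=DS-bbb\nstorageType=DATA_NODE\n",
    "datanode6": "storageID=DS-aaa\nstorageType=DATA_NODE\n",
}

duplicates = shared_identities(sample)
print(duplicates)
```

If real VERSION files do show a shared ID, the commonly suggested remedy for a freshly cloned, still empty datanode is to stop it, clear its data directory, and restart it so that it registers with a new identity.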
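I also specifically checked hdfs-site.xml to make sure that all datanodes are allowed to connect:
<property>
  <name>dfs.hosts</name>
  <value>/usr/local/hadoop/etc/hadoop/datanode-allow-list</value>
</property>
The file it points to is empty, so no datanode is being filtered out by the include list.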
The same can be checked with -report, which still shows only 3 datanode servers.
hduser@namenode1:~$ hdfs dfsadmin -report
Configured Capacity: 295283847168 (275.00 GB)
Present Capacity: 267733209088 (249.35 GB)
DFS Remaining: 267733045248 (249.35 GB)
DFS Used: 163840 (160 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 3 (3 total, 0 dead)

Live datanodes:
Name: 192.168.1.74:50010 (datanode2)
Hostname: datanode2
Rack: /168/1
Decommission Status : Normal
Configured Capacity: 98427949056 (91.67 GB)
DFS Used: 53248 (52 KB)
Non DFS Used: 9290870784 (8.65 GB)
DFS Remaining: 89137025024 (83.02 GB)
DFS Used%: 0.00%
DFS Remaining%: 90.56%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Tue Mar 18 14:51:05 UTC 2014

Name: 192.168.1.78:50010 (datanode6)
Hostname: datanode6
Rack: /168/1
Decommission Status : Normal
Configured Capacity: 98427949056 (91.67 GB)
DFS Used: 53248 (52 KB)
Non DFS Used: 9129762816 (8.50 GB)
DFS Remaining: 89298132992 (83.17 GB)
DFS Used%: 0.00%
DFS Remaining%: 90.72%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Tue Mar 18 14:47:10 UTC 2014

Name: 192.168.1.75:50010 (datanode3)
Hostname: datanode3
Rack: /168/1
Decommission Status : Normal
Configured Capacity: 98427949056 (91.67 GB)
DFS Used: 57344 (56 KB)
Non DFS Used: 9130004480 (8.50 GB)
DFS Remaining: 89297887232 (83.17 GB)
DFS Used%: 0.00%
DFS Remaining%: 90.72%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Last contact: Tue Mar 18 14:51:05 UTC 2014
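As a quick sanity check on the report above, the cluster-wide totals can be compared with the sum of the three listed datanodes. The numbers are copied from the report; the dictionary layout and field names are just for illustration.

```python
# Cluster-wide totals from the top of the report (bytes).
reported_total = {
    "configured": 295283847168,
    "remaining": 267733045248,
    "used": 163840,
}

# Per-datanode figures from the "Live datanodes" section (bytes).
per_node = {
    "datanode2": {"configured": 98427949056, "remaining": 89137025024, "used": 53248},
    "datanode6": {"configured": 98427949056, "remaining": 89298132992, "used": 53248},
    "datanode3": {"configured": 98427949056, "remaining": 89297887232, "used": 57344},
}

# Sum each field over the listed nodes and compare with the totals.
summed = {
    field: sum(node[field] for node in per_node.values())
    for field in reported_total
}
totals_match = summed == reported_total

print("sum over listed nodes:", summed)
print("matches cluster totals:", totals_match)
```

The totals equal the sum of exactly these three nodes, so the namenode is accounting for capacity from only three datanodes at that moment, not just listing three.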
To be continued...