Installing and Deploying Hadoop in Cluster Mode (Test Cluster)


1. Cluster Architecture

 

Install three CentOS virtual machines in VMware: server1, server2, and server3. server1 acts as the Hadoop cluster's NameNode and JobTracker; server2 and server3 act as DataNodes and TaskTrackers. For simplicity, DNS and NFS are also installed on server1.

 

2. Installing DNS

 

Install bind with yum:

[root@server1 admin]#  yum install bind*

 

After installation, check the packages:

[root@server1 admin]#  rpm -qa | grep  '^bind'

bind-dyndb-ldap-1.1.0-0.9.b1.el6_3.1.x86_64

bind-chroot-9.8.2-0.10.rc1.el6_3.6.x86_64

bind-libs-9.8.2-0.10.rc1.el6_3.6.x86_64

bind-sdb-9.8.2-0.10.rc1.el6_3.6.x86_64

bind-utils-9.8.2-0.10.rc1.el6_3.6.x86_64

bind-devel-9.8.2-0.10.rc1.el6_3.6.x86_64

bind-9.8.2-0.10.rc1.el6_3.6.x86_64

 

All required packages are present.

 

Modify the configuration files

Edit /etc/named.conf, changing 127.0.0.1 and localhost to any:

[root@server1 etc]# vim named.conf

 

options {

        listen-on port 53 { any; };

        listen-on-v6 port 53 { ::1; };

        directory       "/var/named";

        dump-file       "/var/named/data/cache_dump.db";

        statistics-file "/var/named/data/named_stats.txt";

        memstatistics-file "/var/named/data/named_mem_stats.txt";

        allow-query     { any; };

        recursion yes;

 

        dnssec-enable yes;

        dnssec-validation yes;

        dnssec-lookaside auto;

 

        

        bindkeys-file "/etc/named.iscdlv.key";

        managed-keys-directory "/var/named/dynamic";

};

 

Edit /etc/named.rfc1912.zones and add the following:

zone "myhadoop.com" IN {

        type master;

        file "myhadoop.com.zone";       

        allow-update { none; };

};

zone "1.168.192.in-addr.arpa" IN {

        type master;

        file "1.168.192.in-addr.zone";

        allow-update { none; };

};

 

In the directory /var/named, create the files myhadoop.com.zone and 1.168.192.in-addr.zone.

 

Edit myhadoop.com.zone:

 

$TTL 86400

@       IN SOA  server1.myhadoop.com. chizk.root.myhadoop.com. (
                                        0       ; serial
                                        1D      ; refresh
                                        1H      ; retry
                                        1W      ; expire
                                        3H )    ; minimum
@       IN NS   server1.myhadoop.com.
server1.myhadoop.com.   IN A    192.168.1.201
server2.myhadoop.com.   IN A    192.168.1.202
server3.myhadoop.com.   IN A    192.168.1.203

 

Edit 1.168.192.in-addr.zone:

$TTL 86400

@       IN SOA  server1.myhadoop.com. chizk.root.myhadoop.com. (
                                        0       ; serial
                                        1D      ; refresh
                                        1H      ; retry
                                        1W      ; expire
                                        3H )    ; minimum
@       IN NS   server1.myhadoop.com.
201     IN PTR  server1.myhadoop.com.
202     IN PTR  server2.myhadoop.com.
203     IN PTR  server3.myhadoop.com.

Change the owner of both files:

 

[root@server1 named]# chown root.named myhadoop.com.zone

[root@server1 named]# chown root.named 1.168.192.in-addr.zone

 

Add the following line to /etc/resolv.conf:

nameserver 192.168.1.201

 

Modify /etc/resolv.conf on server2 and server3 in the same way.
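Since the same line must be added on every node, an idempotent append avoids duplicate entries when the step is repeated. A minimal sketch, demonstrated on a scratch file rather than the real /etc/resolv.conf:

```shell
#!/bin/sh
# Append the nameserver entry only if it is not already present, so the
# step can be re-run safely. Uses a scratch file; on a real node the
# target would be /etc/resolv.conf.
RESOLV=./resolv.conf.demo
NS='nameserver 192.168.1.201'

touch "$RESOLV"
grep -qxF "$NS" "$RESOLV" || echo "$NS" >> "$RESOLV"
# Running the same line again is a no-op:
grep -qxF "$NS" "$RESOLV" || echo "$NS" >> "$RESOLV"

grep -cxF "$NS" "$RESOLV"    # prints 1: the entry appears exactly once
```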

 

Start the DNS service:

[root@server1 named]# service named start

Starting named:                                            [  OK  ]

 

Set it to start at boot:

[root@server1 admin]# chkconfig named on

 

Test DNS resolution:

[root@server1 admin]# nslookup server1.myhadoop.com

Server:               192.168.1.201

Address:  192.168.1.201#53

 

Name:      server1.myhadoop.com

Address: 192.168.1.201

 

[root@server1 admin]# nslookup server2.myhadoop.com

Server:               192.168.1.201

Address:  192.168.1.201#53

 

Name:      server2.myhadoop.com

Address: 192.168.1.202

 

[root@server1 admin]# nslookup server3.myhadoop.com

Server:               192.168.1.201

Address:  192.168.1.201#53

 

Name:      server3.myhadoop.com

Address: 192.168.1.203

 

All lookups succeed; the same queries also succeed on server2 and server3.

 

3. Installing NFS

 

Check whether the NFS and rpcbind packages are installed:

 

[root@server1 admin]# rpm -qa | grep nfs

nfs4-acl-tools-0.3.3-5.el6.x86_64

nfs-utils-1.2.2-7.el6.x86_64

nfs-utils-lib-1.1.5-1.el6.x86_64

[root@server1 admin]# rpm -qa | grep rpcbind

rpcbind-0.2.0-8.el6.x86_64

 

Everything is installed; if any package is missing, install it with yum.

 

Edit /etc/exports and add the following line:

/home/admin *(sync,rw)
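The * wildcard exports the directory to any host. If the cluster sits on a known subnet, the export can be restricted; a sketch assuming the 192.168.1.0/24 network used throughout this setup:

```
/home/admin 192.168.1.0/24(rw,sync)
```

After editing /etc/exports, running exportfs -ra re-reads the file without restarting the NFS service.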

 

Start the NFS service:

[root@server1 admin]# service nfs start

Starting NFS services:                                     [  OK  ]

Starting NFS quotas:                                       [  OK  ]

Starting NFS daemon:                                       [  OK  ]

Starting NFS mountd:                                       [  OK  ]

 

Set it to start at boot:

[root@server1 admin]# chkconfig nfs on

 

Start rpcbind:

[root@server1 admin]# service rpcbind start

Starting rpcbind:                                          [  OK  ]

 

Set it to start at boot:

[root@server1 admin]# chkconfig rpcbind on

 

List the exported mount points:

[root@server1 admin]# showmount -e localhost

Export list for localhost:

/home/admin *

 

Change the permissions of /home/admin; for convenience, set them to 777:

[root@server1 home]# chmod 777 /home/admin

 

On server2, mount /home/admin from server1:

 

[root@server2 home]# mount server1.myhadoop.com:/home/admin/ /home/admin_share/

 

Test access:

[root@server2 home]# cd admin_share/

[root@server2 admin_share]# cat test.txt

aaaa,111

bbbb,222

cccc,333

dddd,444

The share is accessible.

 

Edit /etc/fstab on server2 so the share mounts automatically at boot, appending the following line:

server1.myhadoop.com:/home/admin   /home/admin_share   nfs  defaults 1 1
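The last two fields are the dump flag and the fsck pass number. Network filesystems are normally skipped by fsck at boot, so 0 0 is the usual choice for an NFS entry; an equivalent line with that convention:

```
server1.myhadoop.com:/home/admin   /home/admin_share   nfs   defaults   0 0
```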

 

Likewise, mount /home/admin from server1 on server3 and test it.

 

 

4. Sharing the SSH Key File

 

Generate a login key pair for the admin user on each of server1, server2, and server3:

[admin@server1 ~]$ ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/home/admin/.ssh/id_rsa):

/home/admin/.ssh/id_rsa already exists.

Overwrite (y/n)? y

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /home/admin/.ssh/id_rsa.

Your public key has been saved in /home/admin/.ssh/id_rsa.pub.

The key fingerprint is:

46:56:64:8f:83:13:e0:f3:17:cb:b9:7d:d5:fc:9f:52 admin@server1

The key's randomart image is:

+--[ RSA 2048]----+

|      ....+      |

|     .   = o     |

|      o = + .    |

|       = o =   ..|

|        S =     +|

|       . . o   E.|

|          . . o .|

|             o  o|

|              ...|

+-----------------+

[admin@server2 ~]$ ssh-keygen -t rsa

[admin@server3 ~]$ ssh-keygen -t rsa

 

On server1, copy id_rsa.pub to authorized_keys:

[admin@server1 ~]$ cp .ssh/id_rsa.pub .ssh/authorized_keys

 

On server2 and server3, create symbolic links pointing the local authorized_keys at the shared copy:

[admin@server2 ~]$ ln -s /home/admin_share/.ssh/authorized_keys ~/.ssh/authorized_keys

[admin@server3 ~]$ ln -s /home/admin_share/.ssh/authorized_keys ~/.ssh/authorized_keys

 

Append the public keys of server2 and server3 to authorized_keys:

[admin@server2 ~]$ cat .ssh/id_rsa.pub >> .ssh/authorized_keys

[admin@server3 ~]$ cat .ssh/id_rsa.pub >> .ssh/authorized_keys
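One caveat about the 777 permissions set on /home/admin earlier: sshd with its default StrictModes yes refuses public-key login when the home directory, ~/.ssh, or authorized_keys is writable by group or others. A sketch of the tightening worth applying on every node, demonstrated on a scratch directory rather than the real home directory:

```shell
#!/bin/sh
# Tighten the permissions sshd checks before accepting a public key.
# Demonstrated on a scratch directory; on a real node replace
# ./home.demo with /home/admin.
H=./home.demo
mkdir -p "$H/.ssh"
touch "$H/.ssh/authorized_keys"

chmod 755 "$H"                      # home: no group/other write
chmod 700 "$H/.ssh"
chmod 600 "$H/.ssh/authorized_keys"

stat -c '%a %n' "$H/.ssh" "$H/.ssh/authorized_keys"
```

On server2 and server3 authorized_keys is a symlink to the NFS share, so the mode that matters is that of the shared file on server1.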

 

Test the setup:

[admin@server1 ~]$ ssh server1.myhadoop.com

The authenticity of host 'server1.myhadoop.com (192.168.1.201)' can't be established.

RSA key fingerprint is a9:f3:7f:55:56:3a:a7:d7:9e:23:1e:86:a5:eb:90:dc.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'server1.myhadoop.com,192.168.1.201' (RSA) to the list of known hosts.

Last login: Sun Jan 27 10:02:12 2013 from server1

 

Test the other machines the same way; all logins succeed.

 

 

5. Installing Hadoop

On server1, configure Hadoop's core-site.xml as follows:

 

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

   <property>

      <name>fs.default.name</name>

      <value>hdfs://server1.myhadoop.com:9000</value>

   </property>

</configuration>

 

Configure mapred-site.xml:

 

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

  <property>

    <name>mapred.job.tracker</name>

    <value>server1.myhadoop.com:9001</value>

  </property>

 <property>

     <name>mapred.tasktracker.map.tasks.maximum</name>

     <value>50</value>

 </property>

<property>

     <name>mapred.tasktracker.reduce.tasks.maximum</name>

     <value>50</value>

 </property>

</configuration>

 

Configure the masters file:

server1.myhadoop.com

 

Configure the slaves file:

server2.myhadoop.com

server3.myhadoop.com

 

Create a text file serverlist.txt listing the domain names of every machine Hadoop should be distributed to, here server2 and server3:

[admin@server1 ~]$ cat serverlist.txt

server2.myhadoop.com

server3.myhadoop.com

 

Generate the shell script that distributes Hadoop:

[admin@server1 ~]$ cat serverlist.txt | awk '{print "scp -rp /home/admin/hadoop-0.20.2/ admin@"$1":/home/admin/"}' > distributeHadoop.sh

 

Its contents:

[admin@server1 ~]$ cat ./distributeHadoop.sh

scp -rp /home/admin/hadoop-0.20.2/ admin@server2.myhadoop.com:/home/admin/

scp -rp /home/admin/hadoop-0.20.2/ admin@server3.myhadoop.com:/home/admin/

 

Make the script executable, then run it:

[admin@server1 ~]$ chmod +x distributeHadoop.sh

[admin@server1 ~]$ ./distributeHadoop.sh

 

Check server2 and server3; the copy succeeded.

 

Format the NameNode:

[admin@server1 logs]$ hadoop namenode -format

 

Start Hadoop:

[admin@server1 ~]$ ./hadoop-0.20.2/bin/start-all.sh

 

starting namenode, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-namenode-server1.out

server2.myhadoop.com: starting datanode, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-datanode-server2.out

server3.myhadoop.com: starting datanode, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-datanode-server3.out

server1.myhadoop.com: starting secondarynamenode, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-secondarynamenode-server1.out

starting jobtracker, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-jobtracker-server1.out

server2.myhadoop.com: starting tasktracker, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-tasktracker-server2.out

server3.myhadoop.com: starting tasktracker, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-tasktracker-server3.out

 

 

Check server1, server2, and server3; everything started successfully:

[admin@server1 logs]$ jps

6481 NameNode

6612 SecondaryNameNode

6681 JobTracker

6749 Jps

 

[admin@server2 logs]$ jps

14869 TaskTracker

14917 Jps

14795 DataNode

 

[admin@server3 logs]$ jps

16354 TaskTracker

16396 Jps

16280 DataNode
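A quick way to repeat the jps check on every slave is to loop over serverlist.txt. A minimal sketch, shown as a dry run that only prints the ssh commands (drop the echo to actually execute them):

```shell
#!/bin/sh
# Print the remote jps command for every host in serverlist.txt.
# Dry run only: remove the `echo` to execute over ssh.
# serverlist.txt is recreated here so the sketch is self-contained;
# on server1 the file already exists.
cat > serverlist.txt <<'EOF'
server2.myhadoop.com
server3.myhadoop.com
EOF

while read -r host; do
    echo ssh "admin@$host" jps
done < serverlist.txt
# prints:
#   ssh admin@server2.myhadoop.com jps
#   ssh admin@server3.myhadoop.com jps
```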