Hadoop Installation Summary

Install the JDK

1. Download JDK 1.6 or later and install it under /usr:

 

chmod u+x jdk-6u26-linux-i586.bin

./jdk-6u26-linux-i586.bin

 

2. Configure environment variables

vi /etc/profile

        

Find the following block:

for i in /etc/profile.d/*.sh ; do

    if [ -r "$i" ]; then

        . $i

    fi

done

 

After it, append:

    #java config 

JAVA_HOME=/usr/jdk1.6.0_26

export JAVA_HOME

PATH=$PATH:$JAVA_HOME/bin

export PATH

CLASSPATH=.:$JAVA_HOME/lib

export CLASSPATH
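The new variables only take effect for fresh login shells; to apply them to the current shell, reload the profile (a quick check, not part of the original steps):

source /etc/profile

echo $JAVA_HOME          --- should print /usr/jdk1.6.0_26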

 

        

3. Configure symlinks:

--- remove the old links

cd /usr/bin

rm -f java

rm -f javac

 

--- create the new links

ln -s /usr/jdk1.6.0_26/bin/java java

ln -s /usr/jdk1.6.0_26/bin/javac javac
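An optional sanity check that the links now point at the new JDK:

ls -l /usr/bin/java /usr/bin/javac      --- both should point into /usr/jdk1.6.0_26/bin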

 

4. Test the installation; version 1.6 should be reported:

       [root@localhost jdk1.6.0_26]# java -version

java version "1.6.0_26"

Java(TM) SE Runtime Environment (build 1.6.0_26-b03)

Java HotSpot(TM) Client VM (build 20.1-b02, mixed mode, sharing)

        

Create a Hadoop User

For easier administration, it is best to create a dedicated hadoop user to run Hadoop under.

 

groupadd hadoop                    --- create the hadoop group

useradd -g hadoop hadoop           --- create the hadoop user with hadoop as its primary group

passwd hadoop                      --- set its password

 

To set up ssh, the hadoop user needs to be added to the wheel group:

usermod -G wheel hadoop

 

There should be other ways to let the hadoop group use ssh; I have not looked into them yet.
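One alternative, sketched here but not tested in this setup: whitelist the group directly in the sshd configuration instead of relying on wheel. Note that AllowGroups is a whitelist, so every group that needs ssh access must be listed:

vi /etc/ssh/sshd_config
    AllowGroups wheel hadoop      --- only members of the listed groups may log in over ssh
service sshd restart              --- apply the change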

Configure SSH

As the hadoop user:

[hadoop@localhost ~]$ ssh-keygen -t rsa

[hadoop@localhost ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

[hadoop@localhost ~]$ chmod 600 ~/.ssh/authorized_keys

 

Test it:

ssh localhost

 

For a pseudo-distributed setup on a single machine, the steps above are enough. For a cluster, id_rsa.pub must be copied to each slave node and imported into its authorized keys.
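A sketch of one way to do this (the hostname slave1 is a placeholder):

scp ~/.ssh/id_rsa.pub hadoop@slave1:/tmp/master_rsa.pub
ssh hadoop@slave1 'mkdir -p ~/.ssh && cat /tmp/master_rsa.pub >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'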

 

Install HADOOP

1. Installation files

 

Download the Hadoop package from the official site (http://hadoop.apache.org/); version 0.20.203 is used here.

 

Upload it to the hadoop user's home directory, /home/hadoop, and unpack:

[hadoop@localhost ~]$ tar -zvxf hadoop-0.20.203.0rc1.tar.gz

 

2. Configure environment variables:

 

 [hadoop@localhost ~]$ vi /etc/profile

Below the java configuration, add the following (note that writing to /etc/profile requires root privileges):

export HADOOP_HOME=/home/hadoop/hadoop-0.20.203.0

export PATH=$PATH:$HADOOP_HOME/bin

 

Remember to reload the configuration!
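For example:

source /etc/profile

which hadoop             --- should resolve to /home/hadoop/hadoop-0.20.203.0/bin/hadoop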

 

3. Edit the Hadoop configuration file:

 

[hadoop@localhost conf]$ vi /home/hadoop/hadoop-0.20.203.0/conf/hadoop-env.sh

Change the JAVA_HOME setting:

# export JAVA_HOME=/usr/lib/j2sdk1.5-sun

  export JAVA_HOME=/usr/jdk1.6.0_26

 

4. Verify the installation:

         [hadoop@localhost ~]$ hadoop version

Hadoop 0.20.203.0

Subversion http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333

Compiled by oom on Wed May  4 07:57:50 PDT 2011

 

5. Configure the pseudo-distributed mode configuration files

 

[hadoop@localhost conf]$ vi core-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

     <property>

         <name>fs.default.name</name>

         <value>hdfs://localhost/</value>

     </property>

</configuration>

 

 

[hadoop@localhost conf]$ vi hdfs-site.xml

 

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

     <property>

         <name>dfs.replication</name>

         <value>1</value>

     </property>

</configuration>

 

 

[hadoop@localhost conf]$ vi mapred-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

     <property>

         <name>mapred.job.tracker</name>

         <value>localhost:8021</value>

     </property>

</configuration>

 

Reference: http://hadoop.apache.org/common/docs/current/single_node_setup.html

 

The configuration files can also be kept in any directory; just pass the --config option when starting the daemons, as sketched below.
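For example (the alternate conf directory here is purely illustrative):

start-dfs.sh --config /home/hadoop/conf-pseudo

start-mapred.sh --config /home/hadoop/conf-pseudo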

Run HADOOP

1. Format the HDFS filesystem

 

[hadoop@localhost bin]$ hadoop namenode -format

 

The execution log is shown below; it reveals the runtime parameters, whose meanings deserve closer study later:

[hadoop@localhost bin]$ hadoop namenode -format

11/08/13 12:52:56 INFO namenode.NameNode: STARTUP_MSG:

/************************************************************

STARTUP_MSG: Starting NameNode

STARTUP_MSG:   host = localhost.localdomain/127.0.0.1

STARTUP_MSG:   args = [-format]

STARTUP_MSG:   version = 0.20.203.0

STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May  4 07:57:50 PDT 2011

************************************************************/

11/08/13 12:52:57 INFO util.GSet: VM type       = 32-bit

11/08/13 12:52:57 INFO util.GSet: 2% max memory = 19.33375 MB

11/08/13 12:52:57 INFO util.GSet: capacity      = 2^22 = 4194304 entries

11/08/13 12:52:57 INFO util.GSet: recommended=4194304, actual=4194304

11/08/13 12:52:58 INFO namenode.FSNamesystem: fsOwner=hadoop

11/08/13 12:52:59 INFO namenode.FSNamesystem: supergroup=supergroup

11/08/13 12:52:59 INFO namenode.FSNamesystem: isPermissionEnabled=true

11/08/13 12:52:59 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100

11/08/13 12:52:59 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)

11/08/13 12:52:59 INFO namenode.NameNode: Caching file names occuring more than 10 times

11/08/13 12:52:59 INFO common.Storage: Image file of size 112 saved in 0 seconds.

11/08/13 12:52:59 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.

11/08/13 12:52:59 INFO namenode.NameNode: SHUTDOWN_MSG:

/************************************************************

SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1

************************************************************/

 

2. Start the daemons

 

[hadoop@localhost bin]$ start-dfs.sh

starting namenode, logging to /home/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-hadoop-namenode-localhost.localdomain.out

localhost: starting datanode, logging to /home/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-hadoop-datanode-localhost.localdomain.out

localhost: starting secondarynamenode, logging to /home/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-hadoop-secondarynamenode-localhost.localdomain.out

[hadoop@localhost bin]$ start-mapred.sh

starting jobtracker, logging to /home/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-hadoop-jobtracker-localhost.localdomain.out

localhost: starting tasktracker, logging to /home/hadoop/hadoop-0.20.203.0/bin/../logs/hadoop-hadoop-tasktracker-localhost.localdomain.out
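Once both scripts have run, a quick way to confirm all five daemons are up is jps, which ships with the JDK:

[hadoop@localhost bin]$ jps      --- should list NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker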

 

3. Stop the daemons

[hadoop@localhost bin]$ stop-dfs.sh

[hadoop@localhost bin]$ stop-mapred.sh

4. Monitoring UI

http://192.168.128.133:50070/dfshealth.jsp
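In this version the JobTracker also serves its own monitoring page, by default on port 50030 (same host as above):

http://192.168.128.133:50030/jobtracker.jsp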

 

-------------------------------------------

 

Author: CNZQS|JesseZhang  Personal blog: CNZQS (http://www.cnzqs.com)

Copyright notice: Unless otherwise noted, all articles are original and may be freely reproduced, provided the original source, author information, and this notice are indicated via hyperlink.

--------------------------------------------