Deploying Single-Node Hadoop on macOS (Pseudo-Distributed Mode)

Hadoop is a distributed framework, normally deployed on a cluster of machines. Without a cluster, you can still deploy a single-node Hadoop on your own machine in order to learn and debug Hadoop programs.

Hadoop download link: https://archive.apache.org/dist/hadoop/common/hadoop-3.2.0/hadoop-3.2.0.tar.gz

The version used here is 3.2.0.

The download is a .tar.gz archive. First, extract it into a directory of your choice; here it goes under /opt:

tar -zxvf hadoop-3.2.0.tar.gz -C /opt/

You can then see:

baidudeMacBook-Pro:~ jiazhuang01$ ls /opt
DuGuanJiaSvc	MacAptSvc	hadoop-3.2.0

The archive has been extracted into that directory.

The first step is to add Hadoop to the path, i.e. define a HADOOP_HOME variable and append it to PATH:

sudo vim /etc/profile

Append the following lines after the existing content:

export HADOOP_HOME=/opt/hadoop-3.2.0
export PATH=$PATH:$HADOOP_HOME/bin

Save the file, then source it so the change takes effect:

source /etc/profile
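A side note for newer macOS versions (Catalina onward): the default login shell is zsh, which does not read /etc/profile the way bash does. If the variables do not survive a new terminal, the same two lines can go into ~/.zprofile instead (an alternative location, not part of the original steps):

```shell
# Same exports as in /etc/profile, placed in the zsh login profile instead.
export HADOOP_HOME=/opt/hadoop-3.2.0
export PATH=$PATH:$HADOOP_HOME/bin
```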

Now check HADOOP_HOME:

baidudeMacBook-Pro:hadoop-3.2.0 jiazhuang01$ echo $HADOOP_HOME 
/opt/hadoop-3.2.0

The environment variable is in place. Next, a few of Hadoop's XML configuration files need to be edited.

Enter the hadoop-3.2.0 directory; under etc/hadoop there are many XML files. Four of them need to be modified:

core-site.xml

yarn-site.xml

mapred-site.xml

hdfs-site.xml

The modified contents are shown below (following the approach at http://www.hihubs.com/article/341). Note that the steps above only set HADOOP_HOME, so the files are referenced here as $HADOOP_HOME/etc/hadoop/...:

vim $HADOOP_HOME/etc/hadoop/core-site.xml

    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://localhost/</value>
        </property>
    </configuration>
    
vim $HADOOP_HOME/etc/hadoop/hdfs-site.xml

    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
    </configuration>

vim $HADOOP_HOME/etc/hadoop/mapred-site.xml

    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
        <property>
            <name>mapreduce.application.classpath</name>
            <value>$HADOOP_HOME/share/hadoop/mapreduce/*:$HADOOP_HOME/share/hadoop/mapreduce/lib/*</value>
        </property>
    </configuration>
    
vim $HADOOP_HOME/etc/hadoop/yarn-site.xml

    <configuration>
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
        <property>
            <name>yarn.nodemanager.env-whitelist</name>
            <value>HADOOP_HOME,JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
        </property>
    </configuration>

(Compared with the original snippets, the closing tags are corrected to </configuration>, and the stray spaces inside the <name> and <value> tags are removed; a property name like " dfs.replication " with leading spaces will not match dfs.replication and is silently ignored.)
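Before moving on, it is worth confirming that each file is well-formed XML, since a mismatched closing tag will make the daemons fail at startup with a parse error. A quick check with xmllint, which ships with macOS (a sketch, assuming the $HADOOP_HOME layout above):

```shell
# Validate the four edited config files; xmllint exits non-zero and
# prints the offending line if a tag is mismatched.
for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
    xmllint --noout "$HADOOP_HOME/etc/hadoop/$f" && echo "OK: $f"
done
```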

With the configs in place, set up passwordless ssh login; for details see:

https://dongkelun.com/2018/04/05/sshConf/

The steps are as follows:

ssh-keygen -t rsa

This generates an RSA key pair (press Enter at each prompt to accept the default path and an empty passphrase):

Generating public/private rsa key pair.
Enter file in which to save the key (/Users/jiazhuang01/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /Users/jiazhuang01/.ssh/id_rsa.
Your public key has been saved in /Users/jiazhuang01/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:YxUsVinjo54B86DZ9+bwuz0TFzT9zV8v0rZMaGqFfn4 jiazhuang01@baidudeMacBook-Pro.local
The key's randomart image is:
+---[RSA 2048]----+
|         oo. .   |
|        = o.o .  |
|       o +.. . o.|
|    +   o.  .   =|
|   + = .S. . +  +|
|  o . =. .o * = o|
|     o.+ . * = o |
|      ooo.* . E  |
|       o=+.=..   |
+----[SHA256]-----+

Then append the public key to authorized_keys:

baidudeMacBook-Pro:hadoop-3.2.0 jiazhuang01$ cd /Users/jiazhuang01/.ssh/
baidudeMacBook-Pro:.ssh jiazhuang01$ cat id_rsa.pub>>authorized_keys 
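If ssh still asks for a password after this, file permissions are the usual cause: sshd ignores an authorized_keys file that is group- or world-accessible. Tightening them is harmless and worth doing:

```shell
# sshd refuses to use keys stored in files with overly open permissions.
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```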

Once that is done, ssh to localhost to confirm that no password is required:

baidudeMacBook-Pro:hadoop-3.2.0 jiazhuang01$ ssh localhost
Last login: Fri Jul 19 18:04:02 2019

It works!

The first time, the NameNode must be formatted (in Hadoop 3.x, hdfs namenode -format replaces the deprecated hadoop namenode -format):

hdfs namenode -format

Next, from the Hadoop home directory, run the start-dfs.sh script in the sbin subdirectory:

sbin/start-dfs.sh

This prints:

Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [baidudeMacBook-Pro.local]
2019-07-19 16:54:20,914 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Check the running daemons with jps (the JVM Process Status tool):

55952 ResourceManager
61137 Jps
60642 NameNode
11013 
60742 DataNode
60879 SecondaryNameNode

NameNode, DataNode, and SecondaryNameNode are all running.
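One thing to note: start-dfs.sh starts only the HDFS daemons. Since yarn-site.xml and mapred-site.xml were configured above, the YARN daemons can be brought up as well (a sketch, assuming the same layout; stop-yarn.sh and stop-dfs.sh shut everything down again):

```shell
# Starts the ResourceManager and NodeManager daemons for YARN.
$HADOOP_HOME/sbin/start-yarn.sh
jps
```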

With that, the single-node Hadoop setup on this machine is complete!
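As a final smoke test, one can create a home directory in HDFS and list it back (the path below is just an illustration; any HDFS path works):

```shell
# Create the current user's HDFS home directory and list it.
hdfs dfs -mkdir -p /user/$(whoami)
hdfs dfs -ls /user
```

In Hadoop 3.x the NameNode web UI is also reachable at http://localhost:9870 once the daemons are up.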

2019-07-19 19:06:48