Hadoop Analysis, Part 3: Functions and Roles of the Classes in org.apache.hadoop.hdfs.server.namenode
This walkthrough uses Hadoop 0.21 as the example.
NameNode.java: mainly maintains the file system namespace and the file metadata. The description from the source code follows.
/**********************************************************
 * NameNode serves as both directory namespace manager and
 * "inode table" for the Hadoop DFS. There is a single NameNode
 * running in any DFS deployment. (Well, except when there
 * is a second backup/failover NameNode.)
 *
 * The NameNode controls two critical tables:
 *   1) filename->blocksequence (namespace)
 *   2) block->machinelist ("inodes")
 *
 * The first table is stored on disk and is very precious.
 * The second table is rebuilt every time the NameNode comes
 * up.
 *
 * 'NameNode' refers to both this class as well as the 'NameNode server'.
 * The 'FSNamesystem' class actually performs most of the filesystem
 * management. The majority of the 'NameNode' class itself is concerned
 * with exposing the IPC interface and the http server to the outside world,
 * plus some configuration management.
 *
 * NameNode implements the ClientProtocol interface, which allows
 * clients to ask for DFS services. ClientProtocol is not
 * designed for direct use by authors of DFS client code. End-users
 * should instead use the org.apache.nutch.hadoop.fs.FileSystem class.
 *
 * NameNode also implements the DatanodeProtocol interface, used by
 * DataNode programs that actually store DFS data blocks. These
 * methods are invoked repeatedly and automatically by all the
 * DataNodes in a DFS deployment.
 *
 * NameNode also implements the NamenodeProtocol interface, used by
 * secondary namenodes or rebalancing processes to get partial namenode's
 * state, for example partial blocksMap etc.
 **********************************************************/
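The two tables named in the comment above can be pictured with a small in-memory model. This is only a sketch: `NamespaceTablesSketch` and its methods are hypothetical names chosen for illustration, not Hadoop classes, and blocks and datanodes are reduced to plain IDs and strings. It shows why the first table has to be persisted while the second can be rebuilt from datanode reports after a restart.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: hypothetical names, not part of Hadoop.
class NamespaceTablesSketch {
    // Table 1: filename -> ordered block IDs (the part real HDFS keeps on disk).
    private final Map<String, List<Long>> fileToBlocks = new HashMap<>();
    // Table 2: block ID -> datanodes holding a replica (memory only).
    private final Map<Long, Set<String>> blockToMachines = new HashMap<>();

    // Appends a block to the end of a file's block sequence.
    void addBlock(String path, long blockId) {
        fileToBlocks.computeIfAbsent(path, k -> new ArrayList<>()).add(blockId);
    }

    // Records that a datanode reported holding a replica of the block.
    void reportReplica(long blockId, String datanode) {
        blockToMachines.computeIfAbsent(blockId, k -> new TreeSet<>()).add(datanode);
    }

    // Simulates a restart: table 2 is lost, table 1 survives.
    void restart() {
        blockToMachines.clear();
    }

    List<Long> blocksOf(String path) {
        return fileToBlocks.getOrDefault(path, new ArrayList<>());
    }

    Set<String> locationsOf(long blockId) {
        return blockToMachines.getOrDefault(blockId, new TreeSet<>());
    }

    public static void main(String[] args) {
        NamespaceTablesSketch tables = new NamespaceTablesSketch();
        tables.addBlock("/logs/a.txt", 1L);
        tables.addBlock("/logs/a.txt", 2L);
        tables.reportReplica(1L, "dn1");
        tables.reportReplica(1L, "dn2");
        tables.restart();
        // After the restart the file still knows its blocks,
        // but replica locations are unknown until datanodes report again.
        System.out.println(tables.blocksOf("/logs/a.txt"));
        System.out.println(tables.locationsOf(1L));
    }
}
```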
FSNamesystem.java: mainly maintains several tables: the mapping from file names to block lists; the set of valid blocks; the mapping from blocks to node lists; the mapping from nodes to block lists; and an LRU cache of nodes whose heartbeats were recently updated.
/***************************************************
 * FSNamesystem does the actual bookkeeping work for the
 * DataNode.
 *
 * It tracks several important tables.
 *
 * 1) valid fsname --> blocklist (kept on disk, logged)
 * 2) Set of all valid blocks (inverted #1)
 * 3) block --> machinelist (kept in memory, rebuilt dynamically from reports)
 * 4) machine --> blocklist (inverted #2)
 * 5) LRU cache of updated-heartbeat machines
 ***************************************************/
INode.java: HDFS abstracts both files and directories as INodes.
/**
 * We keep an in-memory representation of the file/block hierarchy.
 * This is a base INode class containing common fields for file and
 * directory inodes.
 */
FSImage.java: persists the INode information to the FSImage on disk.
/**
 * FSImage handles checkpointing and logging of the namespace edits.
 *
 */
FSEditLog.java: writes the edits file.
/**
 * FSEditLog maintains a log of the namespace modifications.
 *
 */
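BlockInfo.java: an INode mainly describes file and directory information, whereas the contents of a file are described by blocks. Suppose a file has length Size. Starting from offset 0, the file is divided sequentially into fixed-size pieces that are numbered in order; each piece is one block.
/**
 * Internal class for block metadata.
 */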
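The fixed-size division described above comes down to integer arithmetic. The sketch below is a hypothetical helper (`BlockMathSketch` is not a Hadoop class), and the 64 MB block size in the example is an assumed value, since the text does not state one. It computes how many blocks a file needs, which block contains a given offset, and how long each block is (only the last one can be shorter than the block size).

```java
// Illustrative only: BlockMathSketch is a hypothetical helper, not a Hadoop class.
class BlockMathSketch {
    // Number of blocks needed for a file of `size` bytes (ceiling division).
    static long blockCount(long size, long blockSize) {
        if (size < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("size must be >= 0 and blockSize > 0");
        }
        return (size + blockSize - 1) / blockSize;
    }

    // Zero-based index of the block that contains byte `offset`.
    static long blockIndex(long offset, long blockSize) {
        if (offset < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and blockSize > 0");
        }
        return offset / blockSize;
    }

    // Length in bytes of block `index`; only the last block may be short.
    static long blockLength(long size, long blockSize, long index) {
        if (index < 0 || index >= blockCount(size, blockSize)) {
            throw new IllegalArgumentException("block index out of range");
        }
        return Math.min(blockSize, size - index * blockSize);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long size = 150 * mb;      // example file length
        long blockSize = 64 * mb;  // assumed block size for this example
        System.out.println(blockCount(size, blockSize));            // prints 3
        System.out.println(blockIndex(100 * mb, blockSize));        // prints 1
        System.out.println(blockLength(size, blockSize, 2) / mb);   // prints 22
    }
}
```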
DatanodeDescriptor.java: represents a concrete storage node, that is, a DataNode as seen by the NameNode.
/**************************************************
 * DatanodeDescriptor tracks stats on a given DataNode,
 * such as available storage capacity, last update time, etc.,
 * and maintains a set of blocks stored on the datanode.
 *
 * This data structure is a data structure that is internal
 * to the namenode. It is *not* sent over-the-wire to the Client
 * or the Datnodes. Neither is it stored persistently in the
 * fsImage.
 **************************************************/
FSDirectory.java: represents all directories and their structural attributes in HDFS.
/*************************************************
 * FSDirectory stores the filesystem directory state.
 * It handles writing/loading values to disk, and logging
 * changes as we go.
 *
 * It keeps the filename->blockset mapping always-current
 * and logged to disk.
 *
 *************************************************/
EditLogOutputStream.java: every log record is written through an EditLogOutputStream. At instantiation time, the set of EditLogOutputStreams consists of several EditLogFileOutputStream instances and one EditLogBackupOutputStream.
/**
 * A generic abstract class to support journaling of edits logs into
 * a persistent storage.
 */
EditLogFileOutputStream.java: writes log records to edits or edits.new.
/**
 * An implementation of the abstract class {@link EditLogOutputStream}, which
 * stores edits in a local file.
 */
EditLogBackupOutputStream.java: sends log records over the network to the backup node.
/**
 * An implementation of the abstract class {@link EditLogOutputStream},
 * which streams edits to a backup node.
 *
 * @see org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol#journal
 * (org.apache.hadoop.hdfs.server.protocol.NamenodeRegistration,
 *  int, int, byte[])
 */
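The relationship among the three stream classes above is a fan-out: one logical edit is handed to every registered stream. The following is a minimal sketch of that pattern under simplifying assumptions. All class names are hypothetical stand-ins rather than the Hadoop API, the "file" stream buffers in memory instead of writing edits or edits.new, and the "backup" stream collects records in a list instead of sending them over the network.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the abstract EditLogOutputStream; not the Hadoop API.
abstract class JournalStreamSketch {
    abstract void write(byte[] record);
}

// Stand-in for a file-backed stream; buffers in memory to stay self-contained.
class FileJournalSketch extends JournalStreamSketch {
    final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    @Override
    void write(byte[] record) {
        buffer.write(record, 0, record.length);
    }
}

// Stand-in for a stream that would ship records to a backup node.
class BackupJournalSketch extends JournalStreamSketch {
    final List<byte[]> sent = new ArrayList<>();

    @Override
    void write(byte[] record) {
        sent.add(record.clone()); // copy so later caller changes do not leak in
    }
}

// Fans every edit out to all registered streams, in registration order.
class EditLogFanOutSketch {
    private final List<JournalStreamSketch> streams = new ArrayList<>();

    void addStream(JournalStreamSketch stream) {
        streams.add(stream);
    }

    void logEdit(byte[] record) {
        for (JournalStreamSketch stream : streams) {
            stream.write(record);
        }
    }

    public static void main(String[] args) {
        EditLogFanOutSketch log = new EditLogFanOutSketch();
        FileJournalSketch local = new FileJournalSketch();
        BackupJournalSketch backup = new BackupJournalSketch();
        log.addStream(local);
        log.addStream(backup);
        log.logEdit(new byte[] {1, 2, 3});
        System.out.println(local.buffer.size() + " bytes local, "
                + backup.sent.size() + " record(s) sent to backup");
    }
}
```

Keeping the writer unaware of where each record ends up is what lets several local copies and a remote backup sit behind the same call.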
BackupNode.java: the backup of the NameNode. The evolution path is: Secondary NameNode -> Checkpoint Node (periodically saves the metadata and creates checkpoints) -> Backup Node (keeps an in-memory image that is fully consistent with the NameNode; whenever the metadata changes, its own copy is updated too, so it can checkpoint from its own image without downloading anything from the NameNode) -> Standby Node (capable of hot standby).
/**
 * BackupNode.
 * <p>
 * Backup node can play two roles.
 * <ol>
 * <li>{@link NamenodeRole#CHECKPOINT} node periodically creates checkpoints,
 * that is downloads image and edits from the active node, merges them, and
 * uploads the new image back to the active.</li>
 * <li>{@link NamenodeRole#BACKUP} node keeps its namespace in sync with the
 * active node, and periodically creates checkpoints by simply saving the
 * namespace image to local disk(s).</li>
 * </ol>
 */
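The difference between the two roles can be illustrated with a toy model. This is a sketch under strong assumptions, not the Hadoop implementation: the namespace is reduced to a set of paths, an edit either creates or deletes one path, and `CheckpointSketch` and `Edit` are hypothetical names. In the CHECKPOINT role the image and the edits are merged in one batch; in the BACKUP role each edit is applied as it arrives, so the in-memory state already equals the merged result and only needs to be saved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: a toy model of checkpointing, not Hadoop code.
class CheckpointSketch {
    // One namespace modification: create or delete a single path.
    static final class Edit {
        final boolean create;
        final String path;

        Edit(boolean create, String path) {
            this.create = create;
            this.path = path;
        }
    }

    // Applies a single edit in place (what a BACKUP-role node does continuously).
    static void apply(Set<String> namespace, Edit edit) {
        if (edit.create) {
            namespace.add(edit.path);
        } else {
            namespace.remove(edit.path);
        }
    }

    // Replays all edits on a copy of the image (what a CHECKPOINT-role node does).
    static Set<String> merge(Set<String> image, List<Edit> edits) {
        Set<String> merged = new TreeSet<>(image);
        for (Edit edit : edits) {
            apply(merged, edit);
        }
        return merged;
    }

    public static void main(String[] args) {
        Set<String> image = new TreeSet<>();
        image.add("/a");
        List<Edit> edits = new ArrayList<>();
        edits.add(new Edit(true, "/b"));
        edits.add(new Edit(false, "/a"));
        System.out.println(merge(image, edits)); // prints [/b]
    }
}
```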
BackupStorage.java: creates the jspool directory under the Backup Node's backup directory, creates edits.new there, and points the output stream at edits.new.
/**
 * Load checkpoint from local files only if the memory state is empty.<br>
 * Set new checkpoint time received from the name-node.<br>
 * Move <code>lastcheckpoint.tmp</code> to <code>previous.checkpoint</code>.
 * @throws IOException
 */
TransferFsImage.java: responsible for fetching files from the NameNode.
/**
 * This class provides fetching a specified file from the NameNode.
 */
GetImageServlet.java: a subclass of HttpServlet that handles doGet requests.
/**
 * This class is used in Namesystem's jetty to retrieve a file.
 * Typically used by the Secondary NameNode to retrieve image and
 * edit file for periodic checkpointing.
 */