Hadoop Distributed Filesystem. HDFS is a filesystem designed for storing very large files with streaming data access patterns, running on clusters of commodity hardware.
Distributed File System
A distributed file system is needed because datasets are growing beyond the storage capacity of a single machine. But distributed file systems bring problems of their own: a higher chance of data loss due to machine failures, and the complexity of network programming; because it is a network-based file system, it is more complicated than a regular file system.
HDFS is designed around a specific set of requirements.
Where Is HDFS a Good Fit?
- Data Volume: store large datasets that may run into terabytes, petabytes, or more.
- Data Variety: store structured, semi-structured, and unstructured data.
- Economical: store data on commodity hardware.
Where Is HDFS Not a Good Fit?
- Low-latency data access (HBase is a better option).
- Huge numbers of small files (millions are fine, but billions are beyond the capacity of current hardware; the limiting factor is the NameNode, which holds all file and block metadata in memory).
- Random writes and updates (files are write-once; in-place modification is not supported. Hadoop does not support OLTP; an RDBMS is the best fit for OLTP workloads).
HDFS has a Master/Slave Architecture
Master ==> NameNode
Slave ==> DataNode
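Every client interaction starts at the master: the NameNode answers metadata requests, while the actual data lives on the DataNodes. Below is a minimal client sketch using the standard `FileSystem` API; the NameNode URI `hdfs://namenode:8020` is a placeholder, not a value from these notes.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.net.URI;

public class HdfsClientSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The client only needs the address of the master (NameNode);
        // the slaves (DataNodes) are discovered through it.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);

        // A metadata-only operation, served entirely by the NameNode.
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath() + "  replication=" + status.getReplication());
        }
        fs.close();
    }
}
```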
Block Placement
- Current strategy: one replica on the local node, the second replica on a remote rack, the third replica on the same remote rack, and any additional replicas (if the replication factor is greater than 3) placed randomly.
- Clients read from the nearest replica (see the sketch below).
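To see this placement in practice, a client can ask the NameNode where the replicas of each block live. A sketch using `getFileBlockLocations`; the path and NameNode URI are placeholder assumptions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.net.URI;

public class BlockPlacementSketch {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), new Configuration());
        FileStatus status = fs.getFileStatus(new Path("/data/example.txt")); // hypothetical file

        // One BlockLocation per block; each lists the DataNodes holding a replica,
        // which is the information a reader uses to pick the nearest copy.
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
            System.out.println("offset=" + block.getOffset()
                    + "  hosts=" + String.join(",", block.getHosts()));
        }
        fs.close();
    }
}
```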
Heartbeats
- DataNodes send heartbeats to the NameNode every 3 seconds.
- The NameNode uses heartbeats to detect DataNode failures.
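The 3-second period is a configurable default, not a hard-coded constant. A hedged sketch reading the relevant properties (`dfs.heartbeat.interval` and `dfs.namenode.heartbeat.recheck-interval` are standard HDFS settings; the dead-node formula in the comment is the usual rule of thumb, roughly 10.5 minutes with defaults).

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class HeartbeatConfigSketch {
    public static void main(String[] args) {
        Configuration conf = new HdfsConfiguration();

        // DataNode -> NameNode heartbeat period, in seconds (default 3).
        long heartbeatSeconds = conf.getLong("dfs.heartbeat.interval", 3);

        // How often the NameNode re-checks DataNode liveness, in milliseconds (default 5 minutes).
        long recheckMillis = conf.getLong("dfs.namenode.heartbeat.recheck-interval", 5 * 60 * 1000);

        // Rule of thumb for declaring a DataNode dead:
        // about 2 * recheck-interval + 10 * heartbeat-interval.
        long deadAfterMillis = 2 * recheckMillis + 10 * heartbeatSeconds * 1000;
        System.out.println("heartbeat=" + heartbeatSeconds + "s, dead after ~" + deadAfterMillis + "ms");
    }
}
```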
Replication Engine
When the NameNode detects a DataNode failure, it:
- Chooses new DataNodes for new replicas.
- Balances disk usage.
- Balances communication traffic to the DataNodes.
NameNode Failure
- The NameNode is a single point of failure.
- To protect the metadata, transaction logs are stored in multiple directories:
  - A directory on the local file system.
  - A directory on a remote file system (NFS/CIFS).
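A hedged illustration of the multiple-directory setup: `dfs.namenode.name.dir` (a standard HDFS property) takes a comma-separated list, and the NameNode writes its metadata to every directory in it. The paths below, including the NFS mount, are placeholders, and in a real deployment this would normally be set in hdfs-site.xml rather than in code.

```java
import org.apache.hadoop.conf.Configuration;

public class NameDirConfigSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Equivalent of configuring dfs.namenode.name.dir in hdfs-site.xml:
        // the FsImage and EditLog are written to every listed directory,
        // so a lost local disk does not mean lost metadata.
        conf.set("dfs.namenode.name.dir",
                 "/data/dfs/name,/mnt/nfs/dfs/name"); // placeholder local dir + NFS mount

        System.out.println(conf.get("dfs.namenode.name.dir"));
    }
}
```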
Data Pipelining
- Client retrieves the list of DataNodes on which to place replicas of a block.
- Client writes block to the first DataNode.
- The first DataNode forwards the data to the next DataNode in the pipeline.
- When all replicas are written, the client moves on to write the next block in the file (see the write sketch below).
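From the client's point of view the pipeline is hidden behind an ordinary output stream: the client library splits the stream into blocks, asks the NameNode for DataNodes, and pushes data to the first one. A minimal write sketch; the NameNode URI and output path are placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class PipelineWriteSketch {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), new Configuration());

        // create() obtains the target DataNodes from the NameNode; the stream then
        // sends data to the first DataNode, which forwards it along the pipeline.
        try (FSDataOutputStream out = fs.create(new Path("/data/pipeline-demo.txt"))) { // hypothetical path
            for (int i = 0; i < 1000; i++) {
                out.write(("record-" + i + "\n").getBytes(StandardCharsets.UTF_8));
            }
        } // close() returns once the last block has been acknowledged by the pipeline.
        fs.close();
    }
}
```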
Secondary NameNode
- Copies FsImage & Transaction Log from NameNode to a temporary directory.
- Merges FsImage and Transaction log into new FsImage in temporary directory.
- Uploads the new FsImage to the NameNode; the transaction log on the NameNode is then purged.
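How often this checkpoint runs is also configurable. A hedged sketch of the two standard triggers (`dfs.namenode.checkpoint.period` and `dfs.namenode.checkpoint.txns` are real HDFS properties; the defaults shown are the commonly documented ones).

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class CheckpointConfigSketch {
    public static void main(String[] args) {
        Configuration conf = new HdfsConfiguration();

        // A checkpoint (FsImage + EditLog merge) is triggered when either limit is reached:
        long periodSeconds = conf.getLong("dfs.namenode.checkpoint.period", 3600); // every hour, or
        long maxTxns = conf.getLong("dfs.namenode.checkpoint.txns", 1_000_000);    // after this many edits

        System.out.println("checkpoint every " + periodSeconds + "s or " + maxTxns + " transactions");
    }
}
```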
The HDFS namespace is stored by the NameNode. The NameNode uses a
transaction log called the EditLog to persistently record every change that
occurs to file system metadata. For example, creating a new file in HDFS causes
the NameNode to insert a record into the EditLog indicating this. Similarly,
changing the replication factor of a file causes a new record to be inserted
into the EditLog. The NameNode uses a file in its local host OS file system to
store the EditLog. The entire file system namespace, including the mapping of
blocks to files and file system properties, is stored in a file called the
FsImage. The FsImage is stored as a file in the NameNode’s local file system
too.
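Both examples above correspond directly to client API calls. In the minimal sketch below (placeholder NameNode URI and path), each of the two calls causes the NameNode to append a record to the EditLog.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.net.URI;

public class EditLogSketch {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), new Configuration());
        Path file = new Path("/data/editlog-demo.txt"); // hypothetical path

        // 1. Creating a file: the NameNode records the new file in the EditLog.
        fs.create(file).close();

        // 2. Changing the replication factor: another record is appended to the EditLog.
        fs.setReplication(file, (short) 2);

        fs.close();
    }
}
```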