
Rack Awareness in Hadoop and its Advantages

Hadoop Rack Awareness

This Hadoop tutorial is all about Rack Awareness in Hadoop. In this blog, we will cover Rack Awareness in HDFS in detail.

First, we will look at what the HDFS Rack Awareness property is and why Hadoop needs it. Then we will discuss replica placement via Rack Awareness in HDFS.

Finally, we will discuss the benefits of Rack Awareness in the Hadoop framework.

Introduction to HDFS Rack Awareness

Rack Awareness in Hadoop is the concept of choosing closer DataNodes based on rack information. By default, a Hadoop installation assumes that all nodes belong to the same rack.
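Because of this default, rack information has to be supplied by the administrator. One common way is a topology script, referenced from the `net.topology.script.file.name` property, which receives one or more host addresses as arguments and prints a rack ID for each. The following is only a minimal sketch of such a script: the addresses, the rack names, and the `resolve_racks` helper are made up for illustration and must be replaced with the real layout of your cluster.

```python
#!/usr/bin/env python3
"""Hypothetical topology script: prints one rack ID per host argument."""
import sys

# Assumed example mapping (not from any real cluster).
RACK_BY_HOST = {
    "10.0.1.11": "/rack1",
    "10.0.1.12": "/rack1",
    "10.0.2.21": "/rack2",
}

# Rack reported for hosts that are missing from the mapping.
DEFAULT_RACK = "/default-rack"


def resolve_racks(hosts):
    """Return the rack ID for each host, in the same order as the input."""
    return [RACK_BY_HOST.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # The addresses to resolve arrive as command-line arguments.
    for rack in resolve_racks(sys.argv[1:]):
        print(rack)
```

Hosts that are not listed fall back to a single default rack, which mirrors the "all nodes on the same rack" behaviour described above.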

To reduce network traffic while reading or writing HDFS files in large Hadoop clusters, the NameNode chooses DataNodes that are on the same rack as the client making the read/write request, or on a nearby rack. The NameNode obtains this rack information by maintaining the rack ID of each DataNode.
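To make "closer" concrete, here is a minimal Python sketch of how nodes could be ranked by distance once each one has a rack ID. It is not Hadoop's own code: the `/rack/node` path format, the function names, and the sample node names are assumptions for illustration. Distance is counted as the number of hops up to the closest common ancestor and back down.

```python
def distance(a, b):
    """Hops between two node paths such as '/rack1/node1'."""
    if a == b:
        return 0
    parts_a = a.strip("/").split("/")
    parts_b = b.strip("/").split("/")
    # Count the leading path components that both nodes share.
    common = 0
    for x, y in zip(parts_a, parts_b):
        if x != y:
            break
        common += 1
    # Steps up from a to the shared ancestor, then down to b.
    return (len(parts_a) - common) + (len(parts_b) - common)


def sort_by_distance(client, replicas):
    """Order replica locations from nearest to farthest from the client."""
    return sorted(replicas, key=lambda node: distance(client, node))


client = "/rack1/node1"
replicas = ["/rack2/node5", "/rack1/node2", "/rack1/node1"]
print(sort_by_distance(client, replicas))
# prints ['/rack1/node1', '/rack1/node2', '/rack2/node5']
```

With this scheme a replica on the same node has distance 0, one on the same rack has distance 2, and one on another rack has distance 4, so same-rack replicas are always preferred over remote ones.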

Why Rack Awareness?

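The main purpose of Rack Awareness is to keep data available when an entire rack fails and to reduce network traffic by serving read and write requests from DataNodes on the same or a nearby rack.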

Replica placement via Rack Awareness in Hadoop

The main purpose of the rack-aware replica placement policy is to improve data reliability, availability, and network bandwidth utilization.

A simple policy is to place replicas on unique racks. This prevents data loss when an entire rack fails and allows the use of bandwidth from multiple racks when reading a file.

On multi-rack clusters, block replication follows this policy:

No more than one replica is placed on any one node, and no more than two replicas are placed on the same rack. This imposes a constraint: the number of racks used for block replication should always be less than the total number of block replicas.

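For example, with a replication factor of 3, the replicas can be spread over two racks: two replicas on different nodes of one rack and the third on a node in another rack.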
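The rules of this policy can also be checked mechanically. The following is a minimal Python sketch, assuming a placement is given as a list of `(rack, node)` pairs with one pair per replica. The `policy_violations` function and the rack and node names are hypothetical and are not part of Hadoop. The third check applies the rack-count rule exactly as stated above and is shown here for a replication factor of 3.

```python
from collections import Counter


def policy_violations(placement):
    """Return the rules broken by a list of (rack, node) replica locations."""
    problems = []
    node_counts = Counter(placement)
    rack_counts = Counter(rack for rack, _ in placement)
    # Rule 1: no more than one replica on any one node.
    if any(count > 1 for count in node_counts.values()):
        problems.append("more than one replica on a node")
    # Rule 2: no more than two replicas on the same rack.
    if any(count > 2 for count in rack_counts.values()):
        problems.append("more than two replicas on a rack")
    # Rule 3: racks used must be fewer than the number of replicas.
    if len(rack_counts) >= len(placement):
        problems.append("racks used is not less than replica count")
    return problems


# Two replicas on rack2 (different nodes), one on rack1: follows all rules.
good = [("/rack1", "node1"), ("/rack2", "node3"), ("/rack2", "node4")]
print(policy_violations(good))  # prints []

# All three replicas on one rack: a rack failure would lose every copy.
bad = [("/rack1", "node1"), ("/rack1", "node2"), ("/rack1", "node3")]
print(policy_violations(bad))  # prints ['more than two replicas on a rack']
```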

Advantages of Rack Awareness in Hadoop

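The advantages of Rack Awareness in HDFS follow from the replica placement policy: spreading replicas across racks protects data against the failure of a whole rack, and reads can draw on the bandwidth of multiple racks.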

Conclusion

In conclusion, Rack Awareness is the concept of choosing closer DataNodes based on rack information. Its main purpose is to prevent data loss if an entire rack fails. It also improves network bandwidth utilization.

If you have any questions about Rack Awareness in Hadoop, please share them with us in the comment section. We will try our best to help you.
