Benchmarking

The Hadoop distribution includes a number of benchmarks that we can use.

1. Preparation
2. TestDFSIO

2.1 Run TestDFSIO in write mode and create data.
2.2 Run TestDFSIO in read mode.
2.3 Clean up the TestDFSIO data.

1. Preparation

Change directory to $HADOOP_INSTALL.

2. TestDFSIO

The Hadoop distribution includes an HDFS benchmark application called TestDFSIO, which is useful for testing the I/O performance of HDFS. The benchmark runs a MapReduce job in which separate map tasks read or write files. Each map task emits statistics about its own I/O, and the reduce tasks accumulate these statistics to produce a summary result.

See what the provided jobclient tests jar (hadoop-mapreduce-client-jobclient-2.7.1-tests.jar) can do.
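
One way to see what the tests jar offers is to run it through `hadoop jar` without naming a program, which makes the driver print the valid program names. The Python sketch below only builds that command line. It assumes the jar sits under share/hadoop/mapreduce inside $HADOOP_INSTALL (the usual place in Hadoop 2.x binary tarballs, not confirmed by this page), and the function name `list_programs_cmd` is made up for illustration.

```python
import os
import shlex

JAR_NAME = "hadoop-mapreduce-client-jobclient-2.7.1-tests.jar"


def list_programs_cmd(install_dir):
    """Build the command that lists the programs bundled in the tests jar."""
    # Assumption: the jar lives in share/hadoop/mapreduce under the install dir.
    jar = os.path.join(install_dir, "share", "hadoop", "mapreduce", JAR_NAME)
    hadoop = os.path.join(install_dir, "bin", "hadoop")
    # No program name after the jar: the driver then reports the valid names.
    return [hadoop, "jar", jar]


# Fall back to the current directory if HADOOP_INSTALL is not set.
install_dir = os.environ.get("HADOOP_INSTALL", ".")
print(shlex.join(list_programs_cmd(install_dir)))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run` on a machine where Hadoop is installed.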

2.1 Run TestDFSIO in write mode and create data.
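
A minimal sketch of the write-mode invocation, expressed as a Python helper that builds the argument list. It assumes the current directory is $HADOOP_INSTALL (as in the Preparation step) and that the jar is under share/hadoop/mapreduce. The helper name `write_cmd` and the example values (10 files of 100 MB) are arbitrary choices, not values from this page.

```python
import shlex

# Relative to $HADOOP_INSTALL; the share/hadoop/mapreduce location is an assumption.
TESTS_JAR = "share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.1-tests.jar"


def write_cmd(nr_files, file_size_mb):
    """Build the TestDFSIO write-mode command as an argument list."""
    return [
        "bin/hadoop", "jar", TESTS_JAR, "TestDFSIO",
        "-write",
        "-nrFiles", str(nr_files),        # number of files, one map task per file
        "-fileSize", str(file_size_mb),   # size of each file, in MB by default
    ]


# Example values only: 10 files of 100 MB each.
print(shlex.join(write_cmd(10, 100)))
```

To actually start the benchmark, the list can be handed to `subprocess.run(write_cmd(10, 100), check=True)` on a node with a running cluster.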

When the run finishes, the benchmark results are appended to a local file named TestDFSIO_results.log in the current directory and are also printed to the console.
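
Because results from successive runs accumulate in one file, a small parser can help compare them. The sketch below assumes the log consists of "key: value" lines, with each run introduced by a header line of the form "----- TestDFSIO ----- : write". This layout is an assumption based on Hadoop 2.x output and may differ between versions. The sample text and its numbers are invented for illustration and are not measured results.

```python
def parse_results(text):
    """Split TestDFSIO_results.log text into one dict per benchmark run."""
    runs = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not key:
            continue  # skip blank lines and lines without a "key: value" shape
        if key.startswith("-----"):
            # Header line (assumed format) starts a new run and names its mode.
            runs.append({"mode": value})
        elif runs:
            runs[-1][key] = value
    return runs


# Illustrative sample only; the numbers are made up.
SAMPLE = """\
----- TestDFSIO ----- : write
       Number of files: 10
Total MBytes processed: 1000
     Throughput mb/sec: 12.5
"""

for run in parse_results(SAMPLE):
    print(run["mode"], run["Throughput mb/sec"])
```

For a real log, the text would come from `open("TestDFSIO_results.log").read()` in the directory where the benchmark was started.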

2.2 Run TestDFSIO in read mode.
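
The read test reads back the files created by the write test, so it is normally run with the same file count and size. A sketch of the read-mode command follows, under the same assumptions as before: the current directory is $HADOOP_INSTALL, the jar is under share/hadoop/mapreduce, and `read_cmd` is a hypothetical helper name.

```python
import shlex

# Relative to $HADOOP_INSTALL; the share/hadoop/mapreduce location is an assumption.
TESTS_JAR = "share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.1-tests.jar"


def read_cmd(nr_files, file_size_mb):
    """Build the TestDFSIO read-mode command as an argument list."""
    return [
        "bin/hadoop", "jar", TESTS_JAR, "TestDFSIO",
        "-read",
        "-nrFiles", str(nr_files),        # should not exceed the number of files written
        "-fileSize", str(file_size_mb),   # in MB by default; match the write run
    ]


# Example values only; they mirror an earlier write run of 10 files of 100 MB.
print(shlex.join(read_cmd(10, 100)))
```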

The read results are appended to the same TestDFSIO_results.log file.

2.3 Clean up the TestDFSIO data.
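
Cleaning up uses the `-clean` option, which deletes the benchmark's working directory in HDFS (by default /benchmarks/TestDFSIO). It does not touch the local TestDFSIO_results.log file. The sketch below builds that command under the same assumptions as the earlier steps (current directory is $HADOOP_INSTALL, jar under share/hadoop/mapreduce). `clean_cmd` is a hypothetical helper name.

```python
import shlex

# Relative to $HADOOP_INSTALL; the share/hadoop/mapreduce location is an assumption.
TESTS_JAR = "share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.1-tests.jar"


def clean_cmd():
    """Build the TestDFSIO command that removes the benchmark data from HDFS."""
    return ["bin/hadoop", "jar", TESTS_JAR, "TestDFSIO", "-clean"]


print(shlex.join(clean_cmd()))
```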