MapReduce example from Apache site

Ranch Hand

Posts: 1609

posted 9 years ago

I was following up on this page from Apache.

After the compilation step of word count v1.0 it says

Assuming that:

/user/joe/wordcount/input - input directory in HDFS
/user/joe/wordcount/output - output directory in HDFS

What does "directory in HDFS" mean? Are these directories already created? And I see that

lists the two files inside the input directory. Even the normal "ls" command would have done that, so what is the significance of using bin/hdfs here?

Keep Smiling Always — My life is smoother when running silent. -paul
[FAQs] [Certification Guides] [The Linux Documentation Project]

arumugarani sundaram

Greenhorn

Posts: 9

posted 9 years ago

Hi,

HDFS is a distributed file system. In a clustered setup, the data is split into multiple blocks (chunks) and distributed across the nodes of the cluster. Using bin/hadoop dfs (or bin/hdfs dfs, as in your case) means that you are listing files from HDFS, not from the ordinary local file system, so a plain "ls" would not see them.

Hope you understand this.

The input directory is where the input files for the job are located, and the output directory is where the processed output files are written.
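To make the difference between the two listings concrete, here is a minimal sketch in Python. It assumes an `hdfs` executable is on your PATH (the tutorial calls it as bin/hdfs from the Hadoop install directory instead); the helper names `hdfs_ls_command`, `list_hdfs` and `list_local` are made up for this illustration and are not part of Hadoop.

```python
import os
import shutil
import subprocess


def hdfs_ls_command(path):
    """Build the argument list for `hdfs dfs -ls <path>` (hypothetical helper)."""
    return ["hdfs", "dfs", "-ls", path]


def list_hdfs(path):
    """Ask HDFS for a listing; return None when no `hdfs` binary is on PATH."""
    if shutil.which("hdfs") is None:
        return None
    result = subprocess.run(hdfs_ls_command(path), capture_output=True, text=True)
    return result.stdout


def list_local(path):
    """List a directory on the local file system, as plain `ls` would.

    Returns an empty list when the directory does not exist locally.
    """
    return sorted(os.listdir(path)) if os.path.isdir(path) else []
```

Calling `list_local("/user/joe/wordcount/input")` only looks at the local disk of the machine you are on, so it will normally find nothing there, while `list_hdfs` on the same path asks the cluster. The two namespaces are separate even though the paths look alike.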

Think of a file that contains the phone numbers of everyone in country X: the entries for people whose last name starts with A might be stored on server 1, B on server 2, and so on. In a Hadoop world, pieces of this phone book would be stored across the cluster. To stay available as components fail, HDFS replicates each of these smaller pieces onto two additional servers by default. This redundancy offers multiple benefits, the most obvious being higher availability. When you read the file from HDFS, the pieces held on the different servers are combined and reconstructed into a single file.
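The phone book idea can be tried out with a toy simulation. This is only a sketch under stated assumptions: the block size is tiny, the server names are invented, and replicas are placed round-robin, whereas real HDFS uses much larger blocks and its own placement policy. The functions `store` and `read` are hypothetical and do not exist in Hadoop.

```python
def store(data, nodes, block_size=16, replication=3):
    """Split data into blocks and copy each block onto `replication` nodes.

    Placement is simple round-robin (an assumption for illustration only).
    Returns (storage, number_of_blocks), where storage maps
    node name -> {block index: block bytes}.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    storage = {node: {} for node in nodes}
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for idx, block in enumerate(blocks):
        for r in range(replication):
            node = nodes[(idx + r) % len(nodes)]
            storage[node][idx] = block
    return storage, len(blocks)


def read(storage, num_blocks, failed=frozenset()):
    """Rebuild the file, taking each block from any node that has not failed."""
    parts = []
    for idx in range(num_blocks):
        holders = [n for n, held in storage.items()
                   if idx in held and n not in failed]
        if not holders:
            raise IOError("block %d has no live replica" % idx)
        parts.append(storage[holders[0]][idx])
    return b"".join(parts)


phone_book = b"Adams 555-0100\nBaker 555-0101\nClark 555-0102\n"
servers = ["server1", "server2", "server3", "server4"]
storage, num_blocks = store(phone_book, servers)

# The file is still readable even with two of the four servers down.
recovered = read(storage, num_blocks, failed={"server1", "server2"})
```

With three copies of every block, any two servers can fail and each block still has one live copy, which is the availability benefit described above.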

Hope this helps you to understand.

Thanks,
Arumugarani

Akhilesh Trivedi

Ranch Hand

Posts: 1609

posted 9 years ago

Thanks Arumugarani!

I am able to understand the concepts now and am working through the example.

