
Difficulty?

 
paul nisset
Ranch Hand
Posts: 219
Hi,
I don't actually work with Big Data, but I have a feeling that it will be coming my way at some point.
How difficult is it to learn, and become competent in, Hadoop compared with other NoSQL technologies?

Thanks,
Paul
 
Alex Holmes
Author
Greenhorn
Posts: 21
Hi,

There is definitely a learning curve with Hadoop, and it is probably steeper than for most NoSQL systems, which behave more like the real-time systems we are all accustomed to working with (such as relational databases). The additional learning time comes mainly from installation, cluster management, and understanding MapReduce as a framework and programming model.
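
To give a feel for the MapReduce programming model mentioned above, here is a minimal single-process sketch in plain Java. It does not use the Hadoop API; the class name `WordCountSketch` and its methods are hypothetical and only illustrate the map, group-by-key, and reduce steps that the framework runs across a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the map/reduce idea; not Hadoop code.
class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle and reduce phases combined: group pairs by key and sum the values.
    public static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Driver: run the map step on every line, then reduce all emitted pairs.
    public static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        return reduce(emitted);
    }

    public static void main(String[] args) {
        // Prints {big=2, cluster=1, data=1}
        System.out.println(wordCount(List.of("big data", "big cluster")));
    }
}
```

In a real cluster the map calls would run in parallel on different nodes and the grouping step would move data between them; the sketch keeps everything in one JVM so the shape of the model is easy to see.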

Having said that, I would argue that it's worthwhile to understand the Hadoop fundamentals. Even if you don't end up using the technology, it will help you understand the MapReduce concepts, which are also used inside some NoSQL solutions. Hadoop's emphasis on data locality is also a valuable lesson that we should all be aware of as good general practice in distributed system design, and it can help reinforce our own architectural and design decisions.

Thanks,
Alex
 
paul nisset
Ranch Hand
Posts: 219
Thanks Alex.
 
Mohamed El-Refaey
Ranch Hand
Posts: 119
Alex, can you please elaborate on what you mean by "Hadoop's emphasis on data locality is also a valuable lesson"?

Regards,
Mohamed
 
Alex Holmes
Author
Greenhorn
Posts: 21
Mohamed,

In distributed computing, it is much better to read data from local disk than over the network. This is known as data locality, and it is one of the key aspects of Hadoop. When MapReduce pushes work to the slave nodes, it does so in a way that favors reads from local disk over reads across the network.
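
As a rough illustration of the idea, the sketch below picks a node for a task by preferring one that already holds a copy of the data. This is not Hadoop's scheduler, and it is much simpler than the real thing; the class `LocalityScheduler`, the method `pickNode`, and the block and node names are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of locality-aware task placement; not Hadoop code.
class LocalityScheduler {

    // Choose a free node for a task that reads the given block.
    // Prefer a node that stores a replica of the block (local disk read);
    // otherwise fall back to the first free node (read over the network).
    public static String pickNode(String blockId,
                                  Map<String, Set<String>> replicaNodes,
                                  List<String> freeNodes) {
        Set<String> holders = replicaNodes.getOrDefault(blockId, Set.of());
        for (String node : freeNodes) {
            if (holders.contains(node)) {
                return node; // data-local placement
            }
        }
        return freeNodes.isEmpty() ? null : freeNodes.get(0); // remote placement
    }

    public static void main(String[] args) {
        Map<String, Set<String>> replicas = Map.of(
                "block-1", Set.of("node-a", "node-b"),
                "block-2", Set.of("node-c"));
        List<String> free = List.of("node-d", "node-b");

        System.out.println(pickNode("block-1", replicas, free)); // node-b (local)
        System.out.println(pickNode("block-2", replicas, free)); // node-d (remote)
    }
}
```

The point of the sketch is only the ordering of preferences: move the computation to where the data already sits, and pay the network cost only when no such node is available.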

Thanks,
Alex
 
Mohamed El-Refaey
Ranch Hand
Posts: 119
Thanks, Alex, for the clarification! Much appreciated.
 