using generics in mapper class

 
Greenhorn
Posts: 11
I am new to both Hadoop and Java.
While looking into the "WordCount" example, I got confused by the use of generics in the WordCountMapper class.
The class looks like this:



I know that generics are used here to specify the key and value types for the mapper's input and output. But what is the advantage of using generics to specify those types?
Is this the only way to specify the key and value types for a mapper class? Please explain in detail.

Thanks!
 
Greenhorn
Posts: 2
Java
Hi Jone,

The first and foremost reason is that the Mapper class in Apache Hadoop is itself declared with generics, as Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>, which 'technically' forces us to deal with them when we extend it. Either we supply concrete types in place of the generic placeholders, or we keep the type parameters as they are and carry them through in the extending class.
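To make the two options concrete, here is a minimal sketch. ToyMapper is a hypothetical stand-in I made up so the code compiles without Hadoop on the classpath; it only borrows the four type parameter names and is not Hadoop's real Mapper API. LineLengthMapper, IdentityMapper and ExtendDemo are likewise invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical stand-in for a generic mapper base class; NOT Hadoop's API.
abstract class ToyMapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
    abstract void map(KEYIN key, VALUEIN value, BiConsumer<KEYOUT, VALUEOUT> out);
}

// Option 1: replace every generic placeholder with a concrete type.
class LineLengthMapper extends ToyMapper<Long, String, String, Integer> {
    @Override
    void map(Long offset, String line, BiConsumer<String, Integer> out) {
        out.accept(line, line.length());
    }
}

// Option 2: stay generic and pass the type parameters through.
class IdentityMapper<K, V> extends ToyMapper<K, V, K, V> {
    @Override
    void map(K key, V value, BiConsumer<K, V> out) {
        out.accept(key, value);
    }
}

class ExtendDemo {
    // Runs both mappers once and records what each one emitted.
    static List<String> run() {
        List<String> seen = new ArrayList<>();
        new LineLengthMapper().map(0L, "hello", (k, v) -> seen.add(k + "=" + v));
        new IdentityMapper<String, Double>().map("pi", 3.14, (k, v) -> seen.add(k + "=" + v));
        return seen;
    }
}
```

In the first subclass the compiler knows the exact types inside map(), so no casts are needed. In the second, the choice of types is simply deferred to whoever instantiates the class.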



Secondly, Mapper has been made generic so that we are free to choose the types (classes) of the key and value objects in the mapper's key/value pairs. Apache Hadoop could very well have locked us into some predefined types (e.g. Text and IntWritable as the output key and output value types). But in that case a programmer would not be able to output anything from a Mapper other than <Text,IntWritable>. What if you want to use <FloatWritable,Text> instead? Or, to make it more complex, what if you want to define your own classes for them, such as <JoneKeyOutputType,JoneValueOutputType>? This is why Mapper uses generics: it gives us a great deal of freedom in choosing these types for mappers, reducers, etc. The whole point of using generics, here and anywhere else, is to separate logic from data type. There is no need for a separate definition of Mapper just because the user decides to go with a <key,value> combination other than the one the Hadoop API happened to define. The Mapper logic remains the same, yet the programmer can specify types of his own choice, and the compiler checks that they are used consistently.
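A rough sketch of that "same logic, different types" idea follows. Everything in it is hypothetical: SketchMapper is a simplified base class I invented for illustration, not Hadoop's Mapper, and plain Long, String, Integer and Float stand in for Hadoop's Writable types. The driver loop in runAll() is written once, while each subclass picks its own output types.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified framework code: written once for any key/value types.
abstract class SketchMapper<KI, VI, KO, VO> {
    private final List<Map.Entry<KO, VO>> output = new ArrayList<>();

    // Subclasses supply only the per-record logic.
    protected abstract void map(KI key, VI value);

    // Type-safe emit: the compiler rejects pairs of the wrong types.
    protected void write(KO key, VO value) {
        output.add(new SimpleEntry<>(key, value));
    }

    // Shared driver loop, independent of the concrete types.
    List<Map.Entry<KO, VO>> runAll(List<Map.Entry<KI, VI>> records) {
        for (Map.Entry<KI, VI> r : records) {
            map(r.getKey(), r.getValue());
        }
        return output;
    }
}

// Word count style: (offset, line) -> (word, 1)
class WordCountSketch extends SketchMapper<Long, String, String, Integer> {
    @Override
    protected void map(Long offset, String line) {
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) write(word, 1);
        }
    }
}

// Different output types, same framework: (offset, line) -> (length, line)
class LengthKeySketch extends SketchMapper<Long, String, Float, String> {
    @Override
    protected void map(Long offset, String line) {
        write((float) line.length(), line);
    }
}

class TypeFreedomDemo {
    // Two sample records keyed by a made-up byte offset.
    static List<Map.Entry<Long, String>> input() {
        List<Map.Entry<Long, String>> in = new ArrayList<>();
        in.add(new SimpleEntry<>(0L, "to be or"));
        in.add(new SimpleEntry<>(9L, "not to be"));
        return in;
    }
}
```

Calling new WordCountSketch().runAll(TypeFreedomDemo.input()) yields a list of <String,Integer> pairs, while new LengthKeySketch().runAll(TypeFreedomDemo.input()) yields <Float,String> pairs. Trying write(1, "word") inside WordCountSketch would fail at compile time, which is the practical benefit over a design based on Object and casts.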

Hope this clears things up.

Regards,
-V

 