
Hadoop key mismatch

 
Larry Homes
Greenhorn
Posts: 25
Hello,

I hope this is the correct forum for a Hadoop question.

I have a file with a bunch of lines like this:



It continues like that for all 50 states, then moves on to another word, e.g. politics:30 Virginia ... etc.

I want to do a distributed sort on this using MapReduce. I know MapReduce sorts by key between the map and reduce stages, so I just want to emit each record from map and then from reduce without any processing, but it is not working. Here are my map and reduce functions:



Here is my main class:



And here is the InputFormat class I wrote, since FileInputFormat would always fail:



Here is the error:




Thanks
 
Larry Homes
Greenhorn
Posts: 25
I thought I would post the solution I found. It was an incredibly dumb error on my part. In my main class, I named the Job instance sort, but when setting the map output key and value classes and the output key and value classes, I used the identifier job. That identifier belonged to a previous MapReduce job in the chain; I had copied and pasted the code without remembering to change the identifier.
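
For anyone who hits the same thing, here is a minimal sketch of how this kind of copy-paste slip produces a key mismatch. It is plain Java and deliberately does not use the Hadoop API: MiniJob, its Long default key type, and the message text are all hypothetical stand-ins chosen for illustration, not the real classes or the real error output. The point is only that a setter called on the wrong object leaves the new job on its defaults.

```java
// Hypothetical stand-in for a job object. This is NOT the Hadoop API;
// it only mimics "a job has a default key type until you override it".
class JobMixupDemo {

    static class MiniJob {
        final String name;
        // Assumed default key type, used when no setter is called on this job.
        Class<?> mapOutputKeyClass = Long.class;

        MiniJob(String name) {
            this.name = name;
        }

        void setMapOutputKeyClass(Class<?> keyClass) {
            this.mapOutputKeyClass = keyClass;
        }

        // Returns a mismatch message if the emitted key does not match the
        // configured key type, or null if the key is acceptable.
        String checkKey(Object key) {
            if (!mapOutputKeyClass.isInstance(key)) {
                return "Type mismatch in key from map (" + name + "): expected "
                        + mapOutputKeyClass.getSimpleName() + ", received "
                        + key.getClass().getSimpleName();
            }
            return null;
        }
    }

    // The slip: the new job is called sort, but the copied line still says job.
    static String buggy() {
        MiniJob job = new MiniJob("previous");   // earlier job in the chain
        MiniJob sort = new MiniJob("sort");      // the job that actually runs
        job.setMapOutputKeyClass(String.class);  // BUG: configures the old job
        return sort.checkKey("politics:30");     // sort still expects Long
    }

    // The fix: call the setter on the job that is being submitted.
    static String fixed() {
        MiniJob sort = new MiniJob("sort");
        sort.setMapOutputKeyClass(String.class);
        return sort.checkKey("politics:30");
    }

    public static void main(String[] args) {
        System.out.println("buggy: " + buggy());
        System.out.println("fixed: " + fixed());
    }
}
```

Because both identifiers are in scope and have the same type, the compiler has nothing to complain about. Building each job in its own method (or its own block) so the earlier identifier is no longer visible is one way to make this mistake a compile error instead.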