Hadoop key mismatch

Larry Homes

Joined: Jan 18, 2009
Posts: 25

I hope this is the correct forum for a Hadoop question.

I have a file with a bunch of lines like this:

It continues on for all 50 states, then there is another word like politics:30 Virginia ... etc.

I want to do a distributed sort on this using MapReduce. I know MapReduce sorts by key between the map and reduce stages, so I just want to emit each record from map, then emit it again from reduce without any processing, but it is not working. Here are my map and reduce functions:

Here is my main class:

And here is the InputFormat class I wrote, since FileInputFormat would always fail:

Here is the error:

Larry Homes

Joined: Jan 18, 2009
Posts: 25
Thought I would post the solution I found. It was an incredibly dumb error on my part. In my main class, I named the Job instance sort, but when setting the map output key, map output value, output key and output value classes, I used the identifier job. That identifier belonged to a previous MapReduce job in the chain; I had copied and pasted its setup code without remembering to change the job identifier. As a result, the sort job never received those settings and ran with its defaults.
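
For anyone who hits the same thing, here is a minimal, dependency-free sketch of the mix-up. It does not use Hadoop at all: MiniJob is a hypothetical stand-in for Hadoop's Job, with class names modelled as plain strings, and the only real Hadoop behaviour it imitates is that the key class defaults to LongWritable when nothing is set. The names buggySetup and fixedSetup are made up for illustration and are not taken from the original code.

```java
public class JobMixupSketch {

    /** Hypothetical stand-in for a job configuration; NOT Hadoop's Job class. */
    static class MiniJob {
        final String name;
        // Imitates Hadoop's default: the key class is LongWritable unless set.
        // Modelled as a plain string so the sketch needs no Hadoop jars.
        String mapOutputKeyClass = "LongWritable";

        MiniJob(String name) {
            this.name = name;
        }

        void setMapOutputKeyClass(String cls) {
            this.mapOutputKeyClass = cls;
        }

        /** Throws if the mapper emits a key type other than the configured one. */
        void checkMapperKey(String emittedKeyClass) {
            if (!mapOutputKeyClass.equals(emittedKeyClass)) {
                throw new IllegalStateException(
                    "Job '" + name + "': key mismatch, expected "
                    + mapOutputKeyClass + ", received " + emittedKeyClass);
            }
        }
    }

    /** Returns true if the job accepts the key type the mapper emits. */
    static boolean runsCleanly(MiniJob job, String emittedKeyClass) {
        try {
            job.checkMapperKey(emittedKeyClass);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    /** Copy-paste bug: the settings go to the earlier job, not to sort. */
    static boolean buggySetup() {
        MiniJob job = new MiniJob("earlier job in the chain");
        MiniJob sort = new MiniJob("sort");
        job.setMapOutputKeyClass("Text");   // wrong identifier
        return runsCleanly(sort, "Text");   // sort still has its default
    }

    /** Fix: configure the instance that is actually being submitted. */
    static boolean fixedSetup() {
        MiniJob sort = new MiniJob("sort");
        sort.setMapOutputKeyClass("Text");  // right identifier
        return runsCleanly(sort, "Text");
    }

    public static void main(String[] args) {
        System.out.println("buggy setup runs cleanly: " + buggySetup());
        System.out.println("fixed setup runs cleanly: " + fixedSetup());
    }
}
```

The compiler cannot catch this kind of slip because both identifiers are in scope and have the same type. Building each job in its own method, so that only one job variable is visible at a time, is one way to make it harder to repeat.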