Hadoop key mismatch

Larry Homes

Joined: Jan 18, 2009
Posts: 25

I hope this is the correct forum for a Hadoop question.

I have a file with a bunch of lines like this:

It continues on for all 50 states, then there is another word like politics:30 Virginia ... etc.

I want to do a distributed sort on this using MapReduce. I know MapReduce sorts by key between the map and reduce stages, so I just want to emit each record from map, then emit it again from reduce without any processing, but it is not working. Here are my map and reduce functions:
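The listing from the original post is not shown here. As a stand-in for the idea only, the following plain-Java sketch (no Hadoop dependency) mimics what the framework does between the two stages: map output is grouped by key, the keys are put in sorted order, and a pass-through reduce then writes every pair back out. The class name `IdentitySortSketch`, the method `run`, and the sample keys are all made up for illustration, and a `TreeMap` is only an approximation of the real shuffle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; this is not Hadoop code and not the poster's code.
class IdentitySortSketch {

    // Takes (key, value) pairs as emitted by an identity map step and returns
    // "key<TAB>value" lines in the order an identity reduce step would see them.
    static List<String> run(List<String[]> mapOutput) {
        // Stand-in for the shuffle: group values by key, keys kept sorted.
        TreeMap<String, List<String>> shuffled = new TreeMap<>();
        for (String[] kv : mapOutput) {
            shuffled.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }

        // Identity reduce: emit every value unchanged under its key.
        // Values keep insertion order here; a real job makes no such promise.
        List<String> reduced = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : shuffled.entrySet()) {
            for (String value : entry.getValue()) {
                reduced.add(entry.getKey() + "\t" + value);
            }
        }
        return reduced;
    }

    public static void main(String[] args) {
        // Made-up sample records, deliberately out of order.
        List<String[]> mapOutput = Arrays.asList(
                new String[] {"sports", "Ohio"},
                new String[] {"politics", "Virginia"},
                new String[] {"sports", "Alabama"});
        for (String line : run(mapOutput)) {
            System.out.println(line);
        }
    }
}
```

The point of the sketch is that neither step does any sorting itself; the ordering comes entirely from the grouping stage in between, which is why an identity map and identity reduce should be enough for a sort.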

Here is my main class:

And here is the InputFormat class I wrote, since FileInputFormat would always fail:

Here is the error:

Larry Homes

Joined: Jan 18, 2009
Posts: 25
I thought I would post the solution I found. It was an incredibly dumb error on my part. In my main class, I named the Job instance sort, but when setting the map output key and value classes and the output key and value classes, I used the identifier job. That identifier belonged to a previous MapReduce job in the chain; I had copied and pasted the code and forgotten to change the identifier. As a result, the sort job never received its type settings and ran with the defaults.
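For anyone who hits the same thing, here is a minimal sketch of the slip in plain Java. `FakeJob` is a hypothetical stand-in, not Hadoop's `Job` class; the only things carried over are the variable names `job` and `sort` from the description above and the idea that an unconfigured job falls back to a default key type (in Hadoop the fallback is `LongWritable`; `Long` plays that role here as an assumption). In real code the calls involved would presumably be `setMapOutputKeyClass`, `setMapOutputValueClass`, `setOutputKeyClass` and `setOutputValueClass`.

```java
// Hypothetical illustration of configuring the wrong job object.
class JobIdentifierMixup {

    // Stand-in for a job's type configuration; NOT the Hadoop API.
    static final class FakeJob {
        private final String name;
        // Assumed default used when nobody configures the job.
        private Class<?> mapOutputKeyClass = Long.class;

        FakeJob(String name) {
            this.name = name;
        }

        void setMapOutputKeyClass(Class<?> keyClass) {
            this.mapOutputKeyClass = keyClass;
        }

        // Rejects a map output key whose type differs from the configured one.
        void checkMapKey(Object key) {
            if (!mapOutputKeyClass.isInstance(key)) {
                throw new IllegalStateException("key type mismatch in job '" + name
                        + "': expected " + mapOutputKeyClass.getName()
                        + ", got " + key.getClass().getName());
            }
        }
    }

    // Returns "ok" when the sort job accepts a String key, else the error text.
    static String runSort(boolean configureCorrectJob) {
        FakeJob job = new FakeJob("previous step"); // left over from earlier in the chain
        FakeJob sort = new FakeJob("sort");         // the job that is actually run

        if (configureCorrectJob) {
            sort.setMapOutputKeyClass(String.class);
        } else {
            job.setMapOutputKeyClass(String.class); // copy-paste slip: wrong identifier
        }

        try {
            sort.checkMapKey("politics");
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("configured via 'job':  " + runSort(false));
        System.out.println("configured via 'sort': " + runSort(true));
    }
}
```

Both versions compile, because `job` is still a valid variable in scope. That is what makes the mistake easy to miss: the compiler has no way of knowing the settings landed on the wrong object.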