What I mean with this approach is: in the map, output each word together with its length. In the reduce, keep static variables for the max length and the matching word, and update them as the input comes in.
Finally, write the output once in the cleanup() method.
Those variables only give you the max for that particular task, assuming JVM reuse is left at its default of 1 (one task per JVM).
So you will still need a single reducer to get the actual max length and the word across all the mappers.
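A rough sketch of that flow, in plain Java with no Hadoop classes so it can be run on its own. It only imitates the map / reduce / cleanup() sequence. The class and method names (MaxWordInCleanup, MaxReducer, run) are made up for illustration, and instance fields stand in for the static variables mentioned above.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Plain-Java imitation of the map -> reduce -> cleanup() flow (no Hadoop classes).
// All names here are hypothetical.
class MaxWordInCleanup {

    // "map": emit (word, length) for every whitespace-separated token in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<>(word, word.length()));
            }
        }
        return out;
    }

    // Stand-in for a reducer: keeps the running max in fields (static variables
    // in the description above) and produces one record only in cleanup().
    static class MaxReducer {
        private String maxWord = null;
        private int maxLength = -1;

        void reduce(String word, int length) {
            if (length > maxLength) {   // on ties, the first word seen is kept
                maxLength = length;
                maxWord = word;
            }
        }

        // Returns the single output record, or null if no input was seen.
        Map.Entry<String, Integer> cleanup() {
            return maxWord == null ? null : new SimpleEntry<>(maxWord, maxLength);
        }
    }

    // Drives the simulated job: every mapped pair goes to one reducer.
    static Map.Entry<String, Integer> run(List<String> lines) {
        MaxReducer reducer = new MaxReducer();
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                reducer.reduce(kv.getKey(), kv.getValue());
            }
        }
        return reducer.cleanup();
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("the quick brown fox", "jumps over lazy dogs")));
    }
}
```

In a real job the same logic would sit in a Reducer subclass, with the single write happening in its cleanup() override.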
My approach would be to keep a global counter for the max length, starting with the length of the first word.
If a word is longer than the counter's current value, write the word as the key and its length as the value, and change the counter to the new length.
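To make the filtering idea concrete, here is a small plain-Java simulation (again no Hadoop classes, and the names RunningMaxFilter, mapTask and reduce are hypothetical). Each simulated map task keeps its own running max and emits a word only when it beats that value. Starting the value at -1 has the same effect as seeding it with the first word. This assumes the counter is tracked per task.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation of "emit only when the running max is beaten".
// All names here are hypothetical.
class RunningMaxFilter {

    // One simulated map task: a word is emitted only if it is longer than
    // every word this task has seen so far.
    static List<String> mapTask(List<String> words) {
        List<String> emitted = new ArrayList<>();
        int maxLength = -1;              // running max for this task
        for (String word : words) {
            if (word.length() > maxLength) {
                maxLength = word.length();
                emitted.add(word);
            }
        }
        return emitted;
    }

    // Single simulated reducer: longest word among everything the tasks emitted.
    static String reduce(List<List<String>> mapOutputs) {
        String best = null;
        for (List<String> output : mapOutputs) {
            for (String word : output) {
                if (best == null || word.length() > best.length()) {
                    best = word;
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> task1 = mapTask(Arrays.asList("a", "bb", "b", "cccc", "ccc"));
        List<String> task2 = mapTask(Arrays.asList("ddddd", "dd"));
        System.out.println(task1 + " " + task2 + " -> " + reduce(Arrays.asList(task1, task2)));
    }
}
```

Of the seven input words only four reach the reducer, which is where the saving comes from.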
Then, either:
Approach 1: Use a single reducer to get the max. The advantage is that the reducer has fewer records to process.
Approach 2: More complicated, but it will perform better. Use the length of the word as the key, then use a custom partitioner to send each range of lengths to its own reducer.
Each reducer then finds its own max, and the output of the last reducer (the one handling the longest range) will hold the max length and the word.
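A minimal sketch of Approach 2, simulated in plain Java. The getPartition name mirrors Hadoop's Partitioner method, but the signature here is simplified and the fixed rangeWidth is an assumption made for the example, as are the class name and longestWord helper. One detail the sketch makes explicit: if the last reducer receives no words, the answer is in the highest-numbered reducer that did.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation of partitioning by word-length range.
// All names and the fixed range width are hypothetical.
class LengthRangePartitioner {

    // Reducer 0 gets lengths 1..rangeWidth, reducer 1 the next range, and so on.
    // The last reducer also takes every length beyond its range.
    static int getPartition(int length, int numReducers, int rangeWidth) {
        int partition = (Math.max(length, 1) - 1) / rangeWidth;
        return Math.min(partition, numReducers - 1);
    }

    static String longestWord(List<String> words, int numReducers, int rangeWidth) {
        // localMax[p] is the longest word seen by simulated reducer p.
        String[] localMax = new String[numReducers];
        for (String word : words) {
            int p = getPartition(word.length(), numReducers, rangeWidth);
            if (localMax[p] == null || word.length() > localMax[p].length()) {
                localMax[p] = word;
            }
        }
        // The highest-numbered reducer with any output holds the global max.
        for (int p = numReducers - 1; p >= 0; p--) {
            if (localMax[p] != null) {
                return localMax[p];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("hi", "hello", "extraordinary", "cat");
        System.out.println(longestWord(words, 3, 4));
    }
}
```

Because the ranges are ordered, no reducer needs to see another reducer's data. Only the choice of ranges needs care, so that the work is spread reasonably evenly.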