I wrote this Hadoop job to preprocess files and write them back out to the output directory. I see files named part-000, 0001, and so on being created, but they are all empty. I use NullWritable for the key but Text for the value. I am not sure whether that is the cause.
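TextInputFormat already splits your input by line and hands your map function only one line at a time, so you don't need the while loop. Also, String.contains() takes a CharSequence, not a regular expression. Unless you are looking for the literal character sequence "[A-Za-z]", you want to use String.matches(). Note that matches() has to match the entire string, so to test whether a line contains a letter anywhere you need a pattern like ".*[A-Za-z].*", or Matcher.find() with the plain "[A-Za-z]" pattern.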
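To make the difference concrete, here is a small standalone sketch in plain Java (no Hadoop dependency). The class name `LineFilterDemo`, the helper `hasLetter`, and the sample strings are made up for illustration; I'm assuming the intent of your check was "does this line contain at least one letter".

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original job.
class LineFilterDemo {
    // Compile the pattern once instead of on every call.
    private static final Pattern HAS_LETTER = Pattern.compile("[A-Za-z]");

    // True if the line contains at least one ASCII letter anywhere.
    static boolean hasLetter(String line) {
        return HAS_LETTER.matcher(line).find();
    }

    public static void main(String[] args) {
        String line = "42 apples";

        // contains() looks for the literal text "[A-Za-z]".
        System.out.println(line.contains("[A-Za-z]"));     // false

        // matches() must match the whole string, so a one-letter pattern fails.
        System.out.println(line.matches("[A-Za-z]"));      // false

        // Padding the pattern lets matches() succeed on a substring.
        System.out.println(line.matches(".*[A-Za-z].*"));  // true

        // find() searches for the pattern anywhere in the input.
        System.out.println(hasLetter(line));               // true
        System.out.println(hasLetter("12345"));            // false
    }
}
```

Inside a mapper you would apply the same check to the single line held in the Text value passed to each map() call, rather than looping over the input yourself.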