I am pretty new to HDFS and am looking for opinions on some conflicting answers I have recently gotten.
1. Is it a good idea to compress the stream when writing a file out to Hadoop? One person told me they got a 10x benefit from doing this. Another told me that compressing is a bad idea because the MapReduce jobs that run on the file cannot be distributed when the file is compressed.
2. I read that MapReduce jobs running on Hadoop work best with files between 500 GB and terabyte size. Someone else told me that it works better with smaller files.
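
To make the concern in question 1 concrete, here is a rough sketch of how I understand it. The function name, the 128 MB split size, and the 10 GB file are all made up for illustration, and this is not how Hadoop actually computes splits. It only assumes that a file in a non-splittable compression format has to be read from start to finish by a single map task, while a splittable file can be divided into roughly one task per split.

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3


def estimated_map_tasks(file_size_bytes, split_size_bytes, splittable):
    """Rough estimate of how many map tasks can work on one file.

    Hypothetical helper for illustration only: a non-splittable file
    is assumed to go to exactly one task, a splittable file to about
    one task per split.
    """
    if not splittable:
        return 1
    return max(1, math.ceil(file_size_bytes / split_size_bytes))


# Assumed example: a 10 GB file with a 128 MB split size.
print(estimated_map_tasks(10 * GB, 128 * MB, splittable=True))   # 80
print(estimated_map_tasks(10 * GB, 128 * MB, splittable=False))  # 1
```

If that picture is right, the answer to question 1 would seem to depend on whether the compression format can be split, which is part of what I am hoping someone can confirm.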