I'm creating a sentiment analysis app using the SentiStrength tool.
The problem is that this tool looks up several files for comparison purposes. I have placed all the files the tool needs in HDFS.
When I try to open it from the map function, it says that it couldn't find the file.
This is the code
As you can see, I have to give the path to this file. I'm sure it is in HDFS, so why can't the tool find it?
An HDFS URL is not like a regular filesystem path.
A regular file path like "c:\data\myfile.txt" or "/var/lib/myfile.txt" can be read or written using file I/O APIs, because the OS's filesystem layer knows how to read/write them.
But an HDFS URL is understood only by HDFS itself (its daemons and the Hadoop client API that talks to them); the OS's filesystem layer does not recognize it as a file.
One simple solution is for your mapper to copy whatever files SentiStrength requires from HDFS onto the node's local filesystem, and then pass those local filesystem paths to SentiStrength.
You can use FileSystem.copyToLocalFile to do this.
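A minimal sketch of that copy step, assuming the newer org.apache.hadoop.mapreduce API. The class name, the HDFS path, and the key/value types are placeholders that are not from the question, so substitute your own:

```java
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SentimentMapper extends Mapper<LongWritable, Text, Text, Text> {

    // Hypothetical HDFS location of one SentiStrength data file.
    private static final String HDFS_FILE =
            "/user/hadoop/sentistrength_data/EmoticonLookupTable.txt";

    // Relative local name; it resolves against the task's working directory.
    private static final String LOCAL_FILE = "EmoticonLookupTable.txt";

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Obtain the file system named in the job configuration (HDFS on a cluster).
        FileSystem fs = FileSystem.get(context.getConfiguration());

        // Copy the file out of HDFS so that plain file I/O can open it.
        fs.copyToLocalFile(new Path(HDFS_FILE), new Path(LOCAL_FILE));

        // From here on, hand LOCAL_FILE (not the HDFS path) to SentiStrength.
    }
}
```

Doing the copy in setup() rather than map() means it runs once per task instead of once per record.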
A more efficient way to do the same thing is to add the files SentiStrength requires to the Job as cache files:
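The registration could look roughly like this, assuming Hadoop 2 or later, where Job.addCacheFile(URI) is available (older releases use DistributedCache.addCacheFile(uri, conf) instead). The helper class and the HDFS directory are hypothetical; only the file name comes from this answer:

```java
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.hadoop.mapreduce.Job;

public final class CacheFileSetup {

    private CacheFileSetup() {
    }

    // Call this from the driver before submitting the job.
    public static void addSentiStrengthFiles(Job job) throws URISyntaxException {
        // Hypothetical HDFS path. The part after '#' is the link name
        // under which the file appears in each task's working directory.
        job.addCacheFile(new URI(
                "/user/hadoop/sentistrength_data/EmoticonLookupTable.txt"
                        + "#EmoticonLookupTable.txt"));
    }
}
```

Repeat the addCacheFile call for each additional file SentiStrength needs.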
When the mapper runs, every node downloads this file automatically under the name "EmoticonLookupTable.txt" (i.e., whatever name follows the #).
Then use it in the mapper like any local file:
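As an illustration, the cached file can be opened by its plain name with ordinary Java file I/O. The LocalLookup helper below is hypothetical and only reads the lines; how SentiStrength itself consumes the file is not shown here:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

final class LocalLookup {

    private LocalLookup() {
    }

    // Reads every line of a file on the local filesystem.
    // A relative name such as "EmoticonLookupTable.txt" resolves against
    // the current working directory, which is where the cached file appears.
    static List<String> readLines(String localName) throws IOException {
        return Files.readAllLines(Paths.get(localName), StandardCharsets.UTF_8);
    }
}
```

In the mapper's setup() method this would be called as LocalLookup.readLines("EmoticonLookupTable.txt"), or the same name can be passed directly to SentiStrength as its data file path.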