I am fairly new to Hadoop, so my question might be a simple one, but I am hoping that someone can answer it for me.
My scenario is this: I have created a workflow that forks 5 Hive actions and then joins them again at the end (the join is purely for fork/join validity).
I have then created a Coordinator that launches this workflow every 10 minutes.
When I look at the Coordinator status, I see that there is 1 current action in a "RUNNING" status and an additional 4 future actions in a "WAITING" status.
When I look at the Workflow, I can see that it has spawned 5 Hive actions, each of which is in a "RUNNING" status.
My problem is that these 5 actions remain in a running state forever. None of the Hive actions ever completes or reaches SUCCEEDED.
Now the strange part: if I take this exact same scenario but change it to have only 3 Hive actions, everything works fine. The Coordinator launches the Workflow, the Workflow forks the 3 Hive actions, and all 3 Hive actions reach SUCCEEDED.
What could the problem be and, more importantly, what is the solution?
I have a feeling it might have to do with resources on the various datanodes, but I am not sure where to look or what to check.
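To make my resource suspicion concrete, here is a rough sketch of what I think might be happening. As I understand it, Oozie runs each action through a launcher job that holds a task slot while the real Hive job runs, so 5 forked actions could claim every slot and leave none for the Hive jobs themselves. This is only a simplified model and not a diagnosis: the function `can_make_progress` and the slot count of 4 are made up for illustration, and I have not confirmed the actual capacity of my cluster.

```python
def can_make_progress(total_slots: int, forked_actions: int) -> bool:
    """Simplified model: each forked action starts a launcher that holds one slot.

    Returns True if at least one slot is still free for the child Hive jobs
    after the launchers have claimed theirs, False if the launchers use them all.
    """
    # Launchers cannot claim more slots than the cluster has.
    launchers = min(forked_actions, total_slots)
    free_slots = total_slots - launchers
    return free_slots > 0


if __name__ == "__main__":
    hypothetical_slots = 4  # assumed value for illustration, not from my cluster
    for actions in (3, 5):
        ok = can_make_progress(hypothetical_slots, actions)
        print(f"{actions} forked actions, {hypothetical_slots} slots -> progress possible: {ok}")
```

If something like this is the cause, I would expect the job tracker or resource manager UI to show the launcher jobs running while the Hive jobs they submitted sit pending.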