
why Hive runs map reduce jobs only for Where clause statements not for normal select statements?

Monica. Shiralkar
Ranch Hand

Joined: Jul 07, 2012
Posts: 541
When I run a query in Hive such as "select * from tablename", no MapReduce job runs. But when I run "select * from tablename where -----", Hive starts a MapReduce job in the background. Why does it run MapReduce only when there is a WHERE clause? The plain query also returns faster than the one with the WHERE clause, presumably for the same reason. So what is the reason?
Thanks.
Tushar Sudake
Greenhorn

Joined: Jan 31, 2013
Posts: 2
Case 1: SELECT * FROM <table>;
In this case, all the table contents are delivered as they are. There isn't any precondition or filter of the kind a WHERE clause introduces.
Hive stores tables as files on HDFS and, AFAIK, in this case Hive simply streams out the contents of those files (similar to 'cat' in Linux).
This must be an optimization. Running an MR job and slowing the query down wouldn't make sense in this case.

Case 2: SELECT * FROM <table> WHERE <condition>;
In this case, the table contents must be processed through some kind of logic/filter to get the rows matching the condition.
As Hive is meant for huge data, this processing is done by taking advantage of the scalable, parallel Hadoop MapReduce framework.
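
To make the difference between the two cases concrete, here is a rough Python sketch. It is only an illustration, not Hive's actual implementation: the sample rows, the `SPLITS` layout, and the function names (`fetch_all`, `map_filter`, `select_where`, `is_adult`) are all made up for this example. Case 1 just streams the stored rows back, while Case 2 scans each file split with a filter, one task per split, which is the part Hive hands off to MapReduce.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical table data: each inner list stands in for one file split on HDFS.
SPLITS = [
    ["1,alice,30", "2,bob,17"],
    ["3,carol,45", "4,dave,12"],
]


def fetch_all(splits):
    """Case 1: no filter, so rows are streamed back unchanged (like `cat`)."""
    for split in splits:
        yield from split


def map_filter(split, predicate):
    """Case 2: one map task scans its split and keeps only matching rows."""
    return [row for row in split if predicate(row)]


def select_where(splits, predicate):
    """Run one map task per split in parallel, then concatenate the outputs."""
    with ThreadPoolExecutor() as pool:
        # pool.map returns results in the same order as the input splits.
        parts = list(pool.map(lambda s: map_filter(s, predicate), splits))
    return [row for part in parts for row in part]


def is_adult(row):
    # Hypothetical WHERE condition: third column (age) >= 18.
    return int(row.split(",")[2]) >= 18


if __name__ == "__main__":
    print(list(fetch_all(SPLITS)))         # SELECT * FROM <table>
    print(select_where(SPLITS, is_adult))  # SELECT * FROM <table> WHERE <condition>
```

Note that the filter only needs the map side here; nothing has to be grouped or combined across splits. The extra time in Case 2 comes mostly from starting and scheduling those tasks, not from the filter itself.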

Hope this solves your query.