aspose file tools*
The moose likes JDBC and the fly likes handling large datasets Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of Java 8 in Action this week in the Java 8 forum!
JavaRanch » Java Forums » Databases » JDBC
Bookmark "handling large datasets" Watch "handling large datasets" New topic
Author

handling large datasets

nobu taku
Greenhorn

Joined: Aug 06, 2003
Posts: 17
what is the best practice for presenting large datasets to users in a web application, say a 'product search'?
Do you obtain the entire dataset at once and then show 'x' at time?
Do you perform queries that only obtain 'x-sized' chunks of the dataset?
??
Adeel Ansari
Ranch Hand

Joined: Aug 15, 2004
Posts: 2874
first one is fine and right.
second one didn't make sense to me. sorry.
Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 60071
    
  65

The second option makes perfect sense with a database that supports it. Why send all the data if you are only going to throw away most of it? (In-memory caching is another issue).

PostgreSQL, for example, offers the keywords OFFSET and LIMIT with which you can easily obtain only the data necessary for the "current page".


[Asking smart questions] [Bear's FrontMan] [About Bear] [Books by Bear]
nilesh Katakkar
Ranch Hand

Joined: Oct 27, 2004
Posts: 35
Yeah agree with Bear Bibeault ! Second approach is the only correct approach !

If you have let's say 100,000 rows or 1 million rows and you are only showing lets say 25 rows at a time, you probably will end up overkilling your application by fetching all the data. Just caching all rows on middle tier wont make sense if your data is real time one. Once the data is send back to the client client may want to see next 25 records or provide another filter criteria to reduce the rows retrieved or whatever. Plus with web apps you will never know, how long it will take client to decide to click next. He may he may not. How long will you keep cached data on middle tier ? What happens when some other users modifies the data in the DB ? You will have stale copy of data. Now add 100 users simulteneously quering same data... # of copies on middle tier will increase linearly.

Bear in his message told the way you can do it in POSTgreSQL. In MS Sql server I think there's a TOP N clause. In oracle you can try using Analytical functions rank () etc. or.. other way is to use inline views..

SELECT * FROM
( SELECT rownum r, a.* r FROM
( <YOUR SQL Statement Here Including WHERE GROUP ORDER BY> A
WHERE rownum < :toRow)
WHERE r > :fromRow
[ November 08, 2004: Message edited by: nilesh Katakkar ]

nilesh<br />neilindallas@hotmail.com
Jeanne Boyarsky
internet detective
Marshal

Joined: May 26, 2003
Posts: 29275
    
140

Nobu,
Bear is definitely correct that it is best to get less rows.

Note that Nilesh's query still needs to do the whole query. It just returns less data. This can still be a big savings, but limit is often more efficient (if supported.)


[Blog] [JavaRanch FAQ] [How To Ask Questions The Smart Way] [Book Promos]
Blogging on Certs: SCEA Part 1, Part 2 & 3, Core Spring 3, OCAJP, OCPJP beta, TOGAF part 1 and part 2
Rudy Simon Yeung
Greenhorn

Joined: Jun 06, 2003
Posts: 15
Caching and memory are very expensive. If possible, I would also suggest you setting up validation rules to narrow down the search. This would reduce the number of records returned.

Regards,

Rudy
Adeel Ansari
Ranch Hand

Joined: Aug 15, 2004
Posts: 2874
Originally posted by adeel ansari:
first one is fine and right.
second one didn't make sense to me. sorry.


i think i was lost somewhere while replying this.
pardon. was really not in my senses though.

second one is far-better, as Bear and Jeanne said. i used to suggest the same thing, but dont know where were me. anyways thanks for all the comments to wake me up.
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: handling large datasets
 
Similar Threads
sectioning a recordset using jsp
Connection polling issue with MySQLNonTransientConnectionException and Java webapp?
Struts pagination review
Script Time Out
Enumeration vs Iterator