• Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

Fetch 100 rows at a time using RowHandler in IBatis(Oracle 11g database ) - Example

 
Vijay Kandaswamy
Greenhorn
Posts: 6
Hibernate Java Tomcat Server
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I need to fetch 2 million records from the oracle database(1 year data).
When I am retrieving it as list, it is taking 'n' minutes and it hangs.



So, I tried implementing IBatis `"RowHandler"` interface and I overrided the and I am able to get the result (One row at a time).

But I need to get 'n' rows at a time where n >= 1000. So I added fetchSize="1000" and resultSetType="FORWARD_ONLY" attribute to my select statement.

E.g:



But still I am getting only one row at a time in the "handleRow(Object obj)" method.



When the method is called during query execution, `"List Size ->"` is always incrementing by one. But I am expecting increment rate of 1000 (As I have given fetchSize = "1000")...

When I googled out, there is a property available (Driver.useCursorFetch) which can be used along with "fetchSize" and "resultSetType" attribute.
But I think it is only for MySQL Database.
Reference :
http://www.yaldex.com/mysql_manual/ch23s04.html or
http://stackoverflow.com/questions/3870500/ibatis-querywithrowhandler-still-seems-to-fetch-all-rows.

What is the equivalent property(`Driver.useCursorFetch`) for Oracle 11g database.
I need some configuration like below.



Thanks in advance.
 
Martin Vajsar
Sheriff
Posts: 3752
62
Chrome Netbeans IDE Oracle
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The first thing you should do would be to think really hard how to avoid fetching millions of records. Sometimes you really need to, but in many cases it would be feasible to move the processing to the database, either using a clever SQL statement, or to a stored procedure (in Oracle, you can create Java stored procedures, so you don't even need to know PL/SQL and perhaps could reuse some of the existing code). I'm pretty sure this would decrease the response time much more significantly than arriving at the optimal fetch size setting.

Regardless, the fetch size is a JDBC driver responsibility. When you use JDBC's recordsets (as all ORM and persistence frameworks invariably do at the end), you can access only one row at a time, but - behind the scenes - the driver fetches them in batches specified by the fetch size when he runs out of records from the previous batch. I therefore don't think that increases of the number of records processed by the framework are a good indication of the fetch size in effect.

I don't think there is substantial reason to assume that the fetch size is not being honored. It is possible to verify the fetch size by activating the SQL trace capability in the Oracle database. However, you need some privileges and knowledge to be able to do so. Another possible way to guess at that would be to record the rate of arrival of new records with fetch size set to 1 and 100 (repeat several times). If there is no clear difference between the two, probably the fetch size is not in effect. Make sure that there isn't other lengthy processing that would mask the time differences in the fetch operations, though.
 
chris webster
Bartender
Posts: 2407
33
Linux Oracle Postgres Database Python Scala
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
As Martin says, ask yourself if you really need to fetch 2 million records out of the database, across your network and into your application server. Can you get the final result you need via some clever SQL or PL/SQL in the database, which is designed to process large volumes of data efficiently, without moving them around your network?
 
Vijay Kandaswamy
Greenhorn
Posts: 6
Hibernate Java Tomcat Server
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
chris webster wrote:As Martin says, ask yourself if you really need to fetch 2 million records out of the database, across your network and into your application server. Can you get the final result you need via some clever SQL or PL/SQL in the database, which is designed to process large volumes of data efficiently, without moving them around your network?


Thanks for the suggestion chris.
But, As per the requirement I need to pull those records(2 million) through my application only. So I am looking for the above option(fetchSize) believing that might help. We already proposed PL/SQL approach but they need the records to be pulled via application.
 
Vijay Kandaswamy
Greenhorn
Posts: 6
Hibernate Java Tomcat Server
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Martin Vajsar wrote:The first thing you should do would be to think really hard how to avoid fetching millions of records. Sometimes you really need to, but in many cases it would be feasible to move the processing to the database, either using a clever SQL statement, or to a stored procedure (in Oracle, you can create Java stored procedures, so you don't even need to know PL/SQL and perhaps could reuse some of the existing code). I'm pretty sure this would decrease the response time much more significantly than arriving at the optimal fetch size setting.

Regardless, the fetch size is a JDBC driver responsibility. When you use JDBC's recordsets (as all ORM and persistence frameworks invariably do at the end), you can access only one row at a time, but - behind the scenes - the driver fetches them in batches specified by the fetch size when he runs out of records from the previous batch. I therefore don't think that increases of the number of records processed by the framework are a good indication of the fetch size in effect.

I don't think there is substantial reason to assume that the fetch size is not being honored. It is possible to verify the fetch size by activating the SQL trace capability in the Oracle database. However, you need some privileges and knowledge to be able to do so. Another possible way to guess at that would be to record the rate of arrival of new records with fetch size set to 1 and 100 (repeat several times). If there is no clear difference between the two, probably the fetch size is not in effect. Make sure that there isn't other lengthy processing that would mask the time differences in the fetch operations, though.


Thanks Martin.
I tried setting fetch size from 1 to 100 and repeated the process, but it shows no difference (fetch size is not in effect). It would be great if is any alternative approach available(Not necessarily fetchsize setting) to pull huge amount of records through IBatis.
 
Winston Gutkowski
Bartender
Pie
Posts: 10422
63
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Vijay Kandaswamy wrote:But, As per the requirement I need to pull those records(2 million) through my application only.

Why? That doesn't make sense.

So what you're really asking is: "Can I make pulling 2 million rows into my application any quicker?"

And the answer is: Probably not.

As Chris already said, databases are designed from the ground up to process large volumes of information; Java isn't.
So any solution that involves pulling vast quantities of raw data from your db into your program is unlikely to be optimal.

Winston
 
chris webster
Bartender
Posts: 2407
33
Linux Oracle Postgres Database Python Scala
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Vijay Kandaswamy wrote:
Thanks for the suggestion chris.
But, As per the requirement I need to pull those records(2 million) through my application only. So I am looking for the above option(fetchSize) believing that might help. We already proposed PL/SQL approach but they need the records to be pulled via application.

So what is the requirement? For example, if you're doing something like creating a flat file (e.g. CSV), you could do this via PL/SQL and the UTL_FILE package - write a file straight from the database to an Oracle directory on the server, then just copy the file to wherever it needs to go. Alternatively, it's part of an ETL process, maybe using a proper ETL tool would be a better approach e.g. OWB, ODI, Informatica if you have them, or the open source Talend if you don't already have an ETL tool.

Just curious - I've seen plenty of situations where people insist they have to move vast amounts of data around, only to discover that they don't really need to!
 
Vijay Kandaswamy
Greenhorn
Posts: 6
Hibernate Java Tomcat Server
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Thanks chris. Let me try with PL/SQL and the UTL_FILE package. Hope This would be Good solution for us.

chris webster wrote:
Vijay Kandaswamy wrote:
Thanks for the suggestion chris.
But, As per the requirement I need to pull those records(2 million) through my application only. So I am looking for the above option(fetchSize) believing that might help. We already proposed PL/SQL approach but they need the records to be pulled via application.

So what is the requirement? For example, if you're doing something like creating a flat file (e.g. CSV), you could do this via PL/SQL and the UTL_FILE package - write a file straight from the database to an Oracle directory on the server, then just copy the file to wherever it needs to go. Alternatively, it's part of an ETL process, maybe using a proper ETL tool would be a better approach e.g. OWB, ODI, Informatica if you have them, or the open source Talend if you don't already have an ETL tool.

Just curious - I've seen plenty of situations where people insist they have to move vast amounts of data around, only to discover that they don't really need to!
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic