
Looking for a better data-transfer mechanism

cao nima
Greenhorn

Joined: Nov 12, 2010
Posts: 1
We are currently developing a rich-client desktop application.
The client talks to the back-end server via a web service.
The web service is responsible for querying data from the back-end database (Oracle).
For function-like requests from clients, the web service simply submits a job to the Oracle DB.
The business logic resides in Oracle stored procedures,
and the data is in the same database.

When the client requests data, the database may have to retrieve 1,000 records out of, say, 1,000,000.
What we do now is make multiple web service calls, 100 records at a time (to keep each response payload small enough),
until no more records can be retrieved.

In detail, we append an ORDER BY clause to every query,
send over the starting index along with the number of records to retrieve,
use the JDBC result set (in the web service method) to re-position the cursor and get the needed rows,
and then put the data into an OracleCachedRowSet and send it back to the client.
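
Roughly, each web service call does something like the following (a simplified sketch with illustrative names, not our actual code):

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import oracle.jdbc.rowset.OracleCachedRowSet;

// Simplified sketch of what each call currently does: re-run the ordered query,
// re-position into the result set, and copy one page into an OracleCachedRowSet.
public class CurrentPagingCall {

    public OracleCachedRowSet fetchPage(Connection con, String orderedSql,
                                        int startIndex, int pageSize) throws SQLException {
        try (Statement stmt = con.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
                                                  ResultSet.CONCUR_READ_ONLY)) {
            stmt.setMaxRows(startIndex + pageSize - 1);          // last row of the requested page
            try (ResultSet rs = stmt.executeQuery(orderedSql)) { // query has ORDER BY appended
                OracleCachedRowSet page = new OracleCachedRowSet();
                page.populate(rs, startIndex);                   // copy rows from startIndex onward
                return page;                                     // serialized back to the client
            }
        }
    }
}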

The problem here seems significant.
The multiple web service calls are wasteful:
each call makes the database run the (inefficient) query again, and a whole bunch of data is thrown away.

In this querying scenario,
we do not want to separate the data into different pages (like the table in a traditional web app),
and we want to know the metadata for the query result (our client application has to make use of it).

Due to our inexperience, we cannot figure out an efficient mechanism for this querying scenario.
We thought socket programming might work for us, since once the connection is opened it stays established,
and then we would not have to waste multiple DB queries.
But a socket cannot get through the firewalls, and we would lose many of the benefits the web service gives us.

Is there an efficient way of doing this in the web service world?
Please enlighten us.
Any suggestion / criticism welcomed.
Thanks in advance.
Jim Akmer
Ranch Hand

Joined: Jul 06, 2010
Posts: 104
The problem you have comes from the fact that your web service is stateless: you pass the state via the parameters of each call.
There are ways to deal with this. For example:
- You send the initial query request to the web service, e.g. via RunQueryX().
- The web service runs the query and stores the entire result in memory.
- RunQueryX() returns as its response a unique id under which the web service has cached the result.
- Your client starts polling with another operation, e.g. GetResultsX(), passing the id previously returned by the web service.
- The web service responds with a portion of the result set.
On each call the web service knows which records to return next, because it can associate the client with the previously generated id.
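
A rough sketch of that idea (all names and types here are made up for illustration):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: run the query once, cache the full result under a generated id,
// and hand out one chunk per GetResultsX call.
public class StatefulQueryService {

    private static final int CHUNK_SIZE = 100;
    private final Map<String, List<Row>> cachedResults = new ConcurrentHashMap<>();
    private final Map<String, Integer> nextIndex = new ConcurrentHashMap<>();

    /** RunQueryX: runs the query once and returns the id the client will poll with. */
    public String runQueryX(String query) {
        List<Row> allRows = executeQuery(query);               // single trip to the database
        String id = UUID.randomUUID().toString();
        cachedResults.put(id, allRows);
        nextIndex.put(id, 0);
        return id;
    }

    /** GetResultsX: returns the next chunk for this id; an empty list means no more data. */
    public List<Row> getResultsX(String id) {
        List<Row> all = cachedResults.get(id);
        if (all == null) {
            return new ArrayList<>();                          // unknown or exhausted id
        }
        int from = nextIndex.get(id);
        int to = Math.min(from + CHUNK_SIZE, all.size());
        nextIndex.put(id, to);
        if (to == all.size()) {                                // last chunk: free the cache entry
            cachedResults.remove(id);
            nextIndex.remove(id);
        }
        return new ArrayList<>(all.subList(from, to));
    }

    private List<Row> executeQuery(String query) {
        return new ArrayList<>();                              // JDBC call omitted in this sketch
    }

    public static class Row { /* column values */ }
}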

Another method would be to have the client also act as a server, i.e.:
- You send the initial query request to the web service, e.g. via RunQueryX().
- The operation immediately returns a 202 (Accepted).
- The web service, acting as a web client to your client, starts posting the results back.
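
A rough sketch of that variant, again with made-up names:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch: accept the query, return immediately (think HTTP 202 Accepted),
// then push result chunks to a URL the client is listening on.
public class CallbackQueryService {

    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** RunQueryX: accepts the query and returns right away. */
    public void runQueryX(String query, String clientCallbackUrl) {
        worker.submit(() -> {
            List<Row> rows = executeQuery(query);                   // single database round trip
            for (int i = 0; i < rows.size(); i += 100) {
                int end = Math.min(i + 100, rows.size());
                postChunk(clientCallbackUrl, rows.subList(i, end)); // service acts as HTTP client
            }
        });
    }

    private List<Row> executeQuery(String query) {
        return new ArrayList<>();                                   // JDBC call omitted
    }

    private void postChunk(String url, List<Row> chunk) {
        // HTTP POST to the client's callback endpoint omitted in this sketch
    }

    public static class Row { }
}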

Hope this helps
Ivan Krizsan
Ranch Hand

Joined: Oct 04, 2006
Posts: 2198
    
Hi!
@Jim:
Wouldn't the technique you describe have a negative impact on scalability?

I suspect that in order to devise a good solution to this problem, one may have to analyze the query patterns, whether the data is mostly read, how often updates occur, etc.
Two ideas that come to mind, without knowing such details, are (both are sketched after idea 2 below):
1. Introduce a database cache, for instance memcached, that has access to large amounts of RAM and that supports distribution.
I don't know how well Oracle DB and memcached are integrated; perhaps there are other similar products available that fit better.
This way, the first query still has to go to the database and retrieve the data, but any subsequent queries will only hit the cache, as long as there are no updates.
However, the cache will only contain data that has already been queried for.

2. Introduce a querying service or some type of database query interceptor, in addition to the above cache.
This service/interceptor makes a guess at what data will be needed next and includes it in the query to the database.
For instance, if a client wants to query for rows 20-30 of some result set, then the query service queries for rows 10-40.
This way the cache will contain data that allows very fast retrieval of the previous page (rows 10-20) and the next page (rows 30-40).
The data delivered to the client will still only be rows 20-30.
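
To make idea 1 a little more concrete, here is a minimal cache-aside sketch; a plain in-memory map stands in for memcached, and in a real setup the get/put calls would go through a memcached client instead:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Minimal cache-aside sketch. The map stands in for memcached; entries would need a TTL
// or explicit invalidation whenever the underlying data is updated.
public class CachedQueryService {

    private final ConcurrentMap<String, List<Row>> cache = new ConcurrentHashMap<>();

    public List<Row> query(String sql, Object... params) {
        String key = sql + Arrays.toString(params);            // cache key = query + parameters
        List<Row> rows = cache.get(key);
        if (rows == null) {                                    // miss: one trip to the database
            rows = runAgainstDatabase(sql, params);
            cache.put(key, rows);
        }
        return rows;                                           // hit: no database round trip
    }

    private List<Row> runAgainstDatabase(String sql, Object... params) {
        return new ArrayList<>();                              // JDBC query omitted in this sketch
    }

    public static class Row { }
}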
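
And a small sketch of idea 2, using the same 10-row prefetch margin as the example above (all names are illustrative):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the interceptor: widen the requested range before hitting the database,
// cache rows by their position, and hand the client only the rows it asked for.
// A real cache would also evict old entries and be invalidated on updates.
public class PrefetchingQueryService {

    private static final int MARGIN = 10;                      // prefetch margin from the example
    private final Map<String, Map<Integer, Row>> rowCache = new ConcurrentHashMap<>();

    /** Client asks for rows from..to (inclusive); the database sees a wider query on a miss. */
    public List<Row> rows(String query, int from, int to) {
        Map<Integer, Row> cached =
                rowCache.computeIfAbsent(query, q -> new ConcurrentHashMap<>());
        if (!covers(cached, from, to)) {                       // miss: fetch a wider window
            int wideFrom = Math.max(1, from - MARGIN);
            int wideTo = to + MARGIN;
            cached.putAll(fetchFromDatabase(query, wideFrom, wideTo));
        }
        List<Row> result = new ArrayList<>();
        for (int i = from; i <= to && cached.containsKey(i); i++) {
            result.add(cached.get(i));                         // only the requested rows go out
        }
        return result;
    }

    private boolean covers(Map<Integer, Row> cached, int from, int to) {
        for (int i = from; i <= to; i++) {
            if (!cached.containsKey(i)) {
                return false;
            }
        }
        return true;
    }

    private Map<Integer, Row> fetchFromDatabase(String query, int from, int to) {
        return new ConcurrentHashMap<>();                      // paginated JDBC query omitted
    }

    public static class Row { }
}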

The above suggestions will be transparent to clients of the original services, which can still remain stateless and thus easily scalable.
Please point out any weaknesses in the above ideas - I would be very interested in hearing how well suited they are for the problem at hand!
Best wishes!

My free books and tutorials: http://www.slideshare.net/krizsan
samarjit samanta
Greenhorn

Joined: Nov 07, 2010
Posts: 9
Have you ever heard of anyone downloading a big file over web services?

cao nima wrote: "But a socket cannot get through the firewalls"

You would need to add a firewall exception for some other port, or run your socket on port 80; I would not do that. But this line of yours hints that your code may be somewhat tightly coupled to the database.

cao nima wrote: "and we would lose many of the benefits the web service gives us"

Whether that matters will depend on which optimizations you prefer and need.

cao nima wrote: "use the JDBC result set (in the web service method) to re-position the cursor and get the needed rows"

You are fetching the 1,000 records roughly 10 times over, OVER THE NETWORK (JDBC uses sockets to get data from Oracle).
Re-positioning the JDBC cursor? Are you querying all the million records into the ResultSet every time?
Another point: you are forming a bulky HTTP response to send all of it to the client.
So I visualize your code as:
while (more records remain) { [big Oracle query] -> [big ResultSet] } -> [big HTTP response with the 1,000 records sent to the client from the app server]

Remove that while loop: [big Oracle query] -> [big ResultSet] -> [big HTTP response sent to the client from the app server]. Now it looks like the old-times DB query and display on a web page, but see, there are no bottlenecks.

If you want to remove those BIGs from the line above, resort to pagination.

Use Oracle pagination queries if you are new to pagination. I could not find the original Oracle article regarding pagination, but the resulting query looks something like the sketch below; I added the Oracle hint, which is missing from the links.
Two links that I quickly found are http://www.dba-oracle.com/t_sql_pagination.htm and http://www.oracle.com/technetwork/issue-archive/2006/06-sep/o56asktom-086197.html
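
Roughly like this (table and column names are just placeholders; the pattern is the nested inline views with ROWNUM and the FIRST_ROWS hint described in those links):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Classic Oracle pagination: nested inline views with ROWNUM plus the FIRST_ROWS hint.
// Only the rows of one page cross the network.
public class OraclePagination {

    private static final String PAGE_QUERY =
        "SELECT * FROM ( "
      + "  SELECT /*+ FIRST_ROWS(100) */ a.*, ROWNUM rnum "
      + "  FROM ( SELECT col1, col2 FROM my_table ORDER BY col1 ) a "
      + "  WHERE ROWNUM <= ? "                                 // last row of the page
      + ") WHERE rnum >= ?";                                   // first row of the page

    public ResultSet fetchPage(Connection con, int firstRow, int lastRow) throws SQLException {
        PreparedStatement ps = con.prepareStatement(PAGE_QUERY);
        ps.setInt(1, lastRow);
        ps.setInt(2, firstRow);
        return ps.executeQuery();
    }
}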


As Ivan said above, especially in his second point, some kind of pre-fetching / page-scrolling optimization is something I would look into.
If you want to give the client the impression of looking at a single page, I would look at something like Google Images search. I do not have experience with this; I can only suggest searching Google for continuous scrolling / scroll pagination, etc.
@Ivan, @Jim: That was my humble opinion; I did not want to disregard your opinions, I just wanted to suggest that it might be better to keep things simple.
 