Performance issue while uploading 10000 or more records

adil qureshi
Ranch Hand

Joined: Jul 11, 2008
Posts: 48
I want to upload 10000 records in parallel. Earlier I was doing it sequentially. How can I do this using a multithreaded approach?


SCJP 1.5,SCWCD 1.5
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18570
    

Is the server you are uploading to designed to receive records in a random sequence, or is it designed to receive records starting at the first and going sequentially to the last? Or was your question intended to include rewriting the server if necessary?
adil qureshi
Ranch Hand

Joined: Jul 11, 2008
Posts: 48
Paul Clapham wrote:Is the server you are uploading to designed to receive records in a random sequence, or is it designed to receive records starting at the first and going sequentially to the last? Or was your question intended to include rewriting the server if necessary?


No, it's not that complex; it's a simple insertion into a database. But I need to divide this insertion process across multiple threads, as my vendor will also change the hardware accordingly. Please suggest. Thanks in advance.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41874
    
What does hardware have to do with multi-threading (which happens within a single JVM)? Which kind of hardware are you talking about, anyway?

Are you familiar with the classes of the java.util.concurrent package? There's a decent tutorial that covers its basics: http://java.sun.com/docs/books/tutorial/essential/concurrency/index.html. You could use an ExecutorService to run 10 upload threads in parallel.
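
As a rough illustration of that suggestion, here is a minimal sketch, assuming Java 8 or later (for the lambda) and a hypothetical uploadRecord method standing in for the real database insert. The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ParallelUploadSketch {

    // Hypothetical stand-in for the real database insert; it only counts the call.
    static void uploadRecord(String record, AtomicInteger uploaded) {
        uploaded.incrementAndGet();
    }

    // Submits one task per record to a fixed-size pool and returns how many were processed.
    static int uploadAll(List<String> records, int threadCount) {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        AtomicInteger uploaded = new AtomicInteger();
        for (String record : records) {
            pool.submit(() -> uploadRecord(record, uploaded));
        }
        pool.shutdown(); // accept no new tasks; already queued tasks still run
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow(); // give up on whatever is still pending
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return uploaded.get();
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            records.add("record-" + i);
        }
        System.out.println("Uploaded " + uploadAll(records, 10) + " records");
    }
}
```

Whether 10 threads actually speeds things up depends on how many concurrent inserts the database and its connection pool can handle.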


Ping & DNS - my free Android networking tools app
adil qureshi
Ranch Hand

Joined: Jul 11, 2008
Posts: 48
Ulf Dittmer wrote:What does hardware have to do with multi-threading (which happens within a single JVM)? Which kind of hardware are you talking about, anyway?

Are you familiar with the classes of the java.util.concurrent package? There's a decent tutorial that covers its basics: http://java.sun.com/docs/books/tutorial/essential/concurrency/index.html. You could use an ExecutorService to run 10 upload threads in parallel.


Thanks, Ulf. I think you have shown me the way. Just one more thing: I want to make the number of threads configurable, e.g. 10 or 20 or any arbitrary value read from some config file. Would you like to comment on that?
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41874
    
I want to make the number of threads configurable, e.g. 10 or 20 or any arbitrary value read from some config file ...

That seems a good idea. You could use a properties file for that, or a context parameter in the web.xml file if it's a web app.
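
A minimal sketch of the properties-file variant, assuming a hypothetical key named upload.threads and a default of 10 when the key is missing or invalid. The example reads from a StringReader so that it runs on its own; a real application would read a file or classpath resource instead.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ConfigurablePoolSketch {

    static final int DEFAULT_THREADS = 10; // assumed fallback value

    // Reads the hypothetical key "upload.threads"; falls back to the default
    // when the key is missing, not a number, or not positive.
    static int readThreadCount(Reader config) {
        Properties props = new Properties();
        try {
            props.load(config);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String raw = props.getProperty("upload.threads");
        if (raw == null) {
            return DEFAULT_THREADS;
        }
        try {
            int n = Integer.parseInt(raw.trim());
            return n > 0 ? n : DEFAULT_THREADS;
        } catch (NumberFormatException e) {
            return DEFAULT_THREADS;
        }
    }

    public static void main(String[] args) {
        int threads = readThreadCount(new StringReader("upload.threads=20\n"));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        System.out.println("Pool created with " + threads + " threads");
        pool.shutdown();
    }
}
```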
aditee sharma
Ranch Hand

Joined: Jul 22, 2008
Posts: 182
adil qureshi wrote:I want to upload 10000 records in parallel. Earlier I was doing it sequentially. How can I do this using a multithreaded approach?


I have a question about this as well. In the old days (pre-Java 5), the JSR specs dictated that spawning your own threads in an application server was a no-no because, among other reasons, it can interfere with the container's resource management.
A usual way of handling multiple records without affecting scalability was to publish these records to a JMS Topic via a queue.

Has the situation changed from Java 5 onwards? Has the concurrency package removed that restriction?

I have used the Java 5 concurrency package for a read-only application (on WebLogic) that accessed data from multiple sources, and I did not face any "resource management" problems as such.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41874
    
I think creating threads is one of the things that shouldn't be done in EJBs because those are managed by the app server. That's likely still the case.

Creating threads elsewhere (in servlets, for example) should be no problem (and has never been one in my experience).
adil qureshi
Ranch Hand

Joined: Jul 11, 2008
Posts: 48
Ulf Dittmer wrote:I think creating threads is one of the things that shouldn't be done in EJBs because those are managed by the app server. That's likely still the case.

Creating threads elsewhere (in servlets, for example) should be no problem (and has never been one in my experience).


Ulf, can you please tell me how I can uniquely identify each thread in a thread pool? I have only one big list, and I want each thread to work on different indexes of it in parallel.
Chris Hurst
Ranch Hand

Joined: Oct 26, 2003
Posts: 416
    

EJBs are single-threaded, but in a J2EE container you can request concurrency via a work manager: http://java.sun.com/j2ee/1.4/docs/api/javax/resource/spi/work/WorkManager.html. You don't create a thread as such; you say "I have this unit of work that you could execute in parallel, deal with it". Effectively, the container allocates the threads, though it's possible to control the thread pool configuration.

You might want to look at something like JCA.


"Eagles may soar but weasels don't get sucked into jet engines" SCJP 1.6, SCWCD 1.4, SCJD 1.5,SCBCD 5
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41874
    
Why do you need to identify threads? If you obtain them from the pool you can be assured that nobody else is using them; what more do you need to know?

As a side note, object creation has become much less costly in recent JVMs, and consequently thread pools have lost much of their justification. Are you sure it helps to use one in your situation?
adil qureshi
Ranch Hand

Joined: Jul 11, 2008
Posts: 48
Ulf Dittmer wrote:Why do you need to identify threads? If you obtain them from the pool you can be assured that nobody else is using them; what more do you need to know?

As a side note, object creation has become much less costly in recent JVMs, and consequently thread pools have lost much of their justification. Are you sure it helps to use one in your situation?


Sir, I actually have only one list with thousands of records. I want each thread to work on different records of the same list instance (please correct me if I'm wrong, but I think this is one of the ways I can increase performance). That's why I wanted to know which thread is running and which records it will have to work on.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41874
    
Here's a snippet of a worker class you might use:

and here's some code that uses it:

For a more fully-fledged solution, use a Callable instead of a Runnable, and do something with the Future objects that the calls to ExecutorService.submit return.
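
A minimal sketch of what such a worker and its calling code might look like in the Callable/Future form (this is not the original snippet). The names UploadWorker and uploadInChunks are made up for the example, and the database insert is a placeholder. Each worker receives its own non-overlapping slice of the one shared list, so no thread needs to be identified by name or number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkedUploadSketch {

    // Worker that handles one slice of the shared list and reports how many records it processed.
    static class UploadWorker implements Callable<Integer> {
        private final List<String> slice;

        UploadWorker(List<String> slice) {
            this.slice = slice;
        }

        @Override
        public Integer call() {
            int count = 0;
            for (String record : slice) {
                // Placeholder: the real code would insert 'record' into the database here.
                count++;
            }
            return count;
        }
    }

    // Splits the list into at most threadCount index ranges and processes them in parallel.
    static int uploadInChunks(List<String> records, int threadCount) {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            int chunkSize = (records.size() + threadCount - 1) / threadCount; // ceiling division
            List<Future<Integer>> futures = new ArrayList<>();
            for (int from = 0; from < records.size(); from += chunkSize) {
                int to = Math.min(from + chunkSize, records.size());
                // subList returns a view of the same list; workers only read it,
                // and the index ranges never overlap.
                futures.add(pool.submit(new UploadWorker(records.subList(from, to))));
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get(); // blocks until that worker has finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A worker failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            records.add("record-" + i);
        }
        System.out.println("Processed " + uploadInChunks(records, 10) + " records");
    }
}
```

The Future objects also carry any exception thrown inside a worker, which is one reason to prefer Callable over Runnable here.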
aditee sharma
Ranch Hand

Joined: Jul 22, 2008
Posts: 182
Ulf Dittmer wrote:I think creating threads is one of the things that shouldn't be done in EJBs because those are managed by the app server. That's likely still the case.

Creating threads elsewhere (in Servlets, for example) should be no problem (and has never been one in my experience).


DISCLAIMER: I am not arguing for the sake of it; I really want to know.

Servlets are also managed by application server only.
Loosely put, the application server has a thread which listens on a socket.
When it receives a request, it queues the request to a work queue serviced by a pool of worker threads.

If every Servlet spawned several threads of its own, that could easily hurt, or kill, the whole application (at least theoretically).
It would certainly make it difficult for the application server to make sensible global resource management decisions.

This is something that has always puzzled me. Why the heck would JSR provide an API that they forbid us to use? Maybe for applications that are not client/server?
Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336


Servlets are also managed by application server only.

Servlets are managed by the Servlet Container. The programming restrictions in this container are simpler than those in the EJB container.


JavaRanch FAQ HowToAskQuestionsOnJavaRanch
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41874
    
aditee sharma wrote:If every Servlet spawned several threads of its own, that could easily hurt, or kill, the whole application (at least theoretically).

The developer could absolutely do that. But there are many ways for a developer to shoot himself (and the app server) in the foot, and no API can force him not to do that. So, if you have a servlet that you expect to scale to thousands of simultaneous users, then you had better not create 10 threads for each request in its code.

Why the heck would JSR provide an API that they forbid us to use? Maybe for applications that are not client/server?

Time to take a step back and look at the bigger picture. Threads have been an integral part of Java since Java 1.0, and are not discouraged in any way. In fact, in these days of multi-core CPUs and multi-CPU machines, I consider concurrency to be one of the major features of Java.

What *is* discouraged is the creation of threads within EJBs, because EJBs are handled (created and destroyed) by the app server, and bad things might happen if the app server were to destroy an object that spawned a child thread that is beyond the knowledge and management of said app server.

It's in some respects similar to the applet sandbox that prohibits the Java code from doing certain things in the name of security.

Note that EJB has various methods for running background and asynchronous tasks: MDBs (since EJB 2.0), Timers (since EJB 2.1), Connectors (which can handle threads) and asynchronous session beans (since EJB 3.1), so it's not as if that kind of thing cannot be done at all.
aditee sharma
Ranch Hand

Joined: Jul 22, 2008
Posts: 182
Ulf Dittmer wrote:What *is* discouraged is the creation of threads within EJBs, because EJBs are handled (created and destroyed) by the app server, and bad things might happen if the app server were to destroy an object that spawned a child thread that is beyond the knowledge and management of said app server.

Yeah, you are correct. I just confirmed from the old J2EE 2.0 specs that only the EJBs should restrict their use of threads.
I should've known and said that firmly yesterday, when the interviewer told me that I did wrong by using the concurrent API in my Servlet.


Thanks to you and Paul for taking the time to clear up my doubt.

adil qureshi
Ranch Hand

Joined: Jul 11, 2008
Posts: 48
After following the whole discussion, I have one more doubt.
For example, I start a thread (only one). This thread calls a function, and this function itself calls many functions. Since it's a single thread, all the functions get called one after another (if I am not wrong). Now suppose I replace any one of these functions with multithreaded code, which may use a thread pool or many threads.
What will happen then?
Also, is it possible in any way for the thread pool to complete first, so that control goes back to the previous thread only afterwards? Can the first thread be joined with this thread pool in some way?
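
One way to get this behaviour, sketched under the assumption of Java 8 or later and with made-up names (parallelStep, a pool of 4 threads): ExecutorService.invokeAll blocks the calling thread until every submitted task has completed, so the caller continues only after the pool's work is done.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class WaitForPoolSketch {

    // Called from a single thread; returns only after every pooled task has finished.
    static List<String> parallelStep(int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int id = i;
                tasks.add(() -> "task-" + id + " done");
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks the calling thread until all tasks have completed
            // and returns the futures in the same order as the tasks.
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the pool", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("before the parallel step");
        List<String> results = parallelStep(5); // the calling thread waits here
        System.out.println(results.size() + " tasks finished");
        System.out.println("after the parallel step");
    }
}
```

The functions called before and after parallelStep still run one after another on the original thread; only the work inside that one step is spread across the pool.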


 
 