Validity of binaries

Muthukrishnan Manoharan
Ranch Hand

Joined: Aug 27, 2008
Posts: 91

Hi people,

I have a requirement to check the validity of binaries over the internet. I have a database of the URLs of the binaries, along with their respective sizes and MD5 checksums. I need to check frequently whether these binaries are still valid (i.e. whether they still exist at the specified URL, and whether their size or MD5 has changed). Since I am dealing with a large number of URLs, checking all of them takes a lot of time. At the moment I download each binary via java.net.URLConnection and compute its size and MD5. Can you suggest any other approach, or a way to optimize this and reduce the processing time?

Thanks
Muthu
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14074

The bottleneck in a program like that is most likely the speed of your Internet connection. If the program takes too long to run, you should add some log statements to it to see how long the different operations take, or use a profiler to find out where the performance bottleneck is.
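
A minimal sketch of the log-statement approach, assuming Java 8 or later. The helper name `timed` and the phase labels are hypothetical, and the lambdas in `main` are only stand-ins for the real download and MD5 steps, which have not been posted in this thread.

```java
import java.util.function.Supplier;

class PhaseTimer {
    // Runs the task, prints how long it took under the given label,
    // and returns whatever the task returned.
    static <T> T timed(String label, Supplier<T> task) {
        long start = System.nanoTime();
        try {
            return task.get();
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(label + " took " + elapsedMs + " ms");
        }
    }

    public static void main(String[] args) {
        // Stand-in for the download step (assumption: real code would read from the network).
        byte[] data = timed("download", () -> new byte[1024 * 1024]);

        // Stand-in for the checksum step (assumption: real code would compute an MD5).
        int sum = timed("checksum", () -> {
            int s = 0;
            for (byte b : data) {
                s += b;
            }
            return s;
        });

        System.out.println("bytes=" + data.length + ", sum=" + sum);
    }
}
```

If the download phase dominates the total, the network is the likely limit; if the checksum or database phase dominates, the program itself is worth a closer look.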

But if your Internet connection speed is indeed the limiting factor, then there's nothing you can do about your program; you should just get a faster Internet connection.

Muthukrishnan Manoharan
Ranch Hand

Joined: Aug 27, 2008
Posts: 91

Thanks Jesper Young,

I don't have any problem with my internet connection. I found some improvement in the processing time when I used BufferedInputStream in place of

as I used previously. But I fear BufferedInputStream would have some effects as said here

Or is it all right to go with BufferedInputStream, since the problem occurs only with BufferedReader?
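
For reference, a sketch of reading binary data through byte streams only (no Reader involved) while computing the size and MD5 in a single pass. The class and method names are made up for illustration; this is not the code from my application. In real use the stream would presumably come from `URLConnection.getInputStream()`; here an in-memory stream stands in for it.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class StreamChecksum {
    // Simple holder for the two values being verified.
    static final class Result {
        final long size;
        final String md5Hex;

        Result(long size, String md5Hex) {
            this.size = size;
            this.md5Hex = md5Hex;
        }
    }

    // Reads the stream to the end as raw bytes, counting them and feeding
    // each chunk into the MD5 digest. Closes the stream when done.
    static Result sizeAndMd5(InputStream raw) throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        long size = 0;
        byte[] buffer = new byte[8192];
        try (InputStream in = new BufferedInputStream(raw)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                md5.update(buffer, 0, n);
                size += n;
            }
        }
        // Convert the 16 digest bytes to lowercase hex.
        StringBuilder hex = new StringBuilder();
        for (byte b : md5.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return new Result(size, hex.toString());
    }

    public static void main(String[] args) throws Exception {
        // In-memory stand-in for a downloaded binary.
        byte[] sample = "abc".getBytes(StandardCharsets.US_ASCII);
        Result r = sizeAndMd5(new ByteArrayInputStream(sample));
        System.out.println(r.size + " bytes, MD5 " + r.md5Hex);
    }
}
```

As far as I can tell, the buffering matters mostly when the stream is read one byte at a time; with a large byte[] buffer like the one above, the extra BufferedInputStream wrapper probably adds little.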
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41092
But I fear BufferedInputStream would have some effects as said here

What do you mean? That topic talks about the problems of using Readers and Writers for binary data; there's nothing in it about problems using Streams.


Muthukrishnan Manoharan
Ranch Hand

Joined: Aug 27, 2008
Posts: 91

Oh, sorry, I misunderstood it.
Muthukrishnan Manoharan
Ranch Hand

Joined: Aug 27, 2008
Posts: 91

Will threading be of any use in improving the processing time of this application?
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41092
It sounds as if the processing of each file is independent of all the others, so yes, the files could be checked in concurrent threads, which would likely improve the throughput.
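
A sketch of that idea using a fixed-size thread pool from java.util.concurrent. `checkAll` and the `check` function are hypothetical names; `check` stands in for whatever downloads one URL and compares its size and MD5, and the example.com addresses are placeholders. The pool size is an assumption to be tuned by measurement, not a recommendation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class ConcurrentChecker {
    // Submits one independent task per URL to a fixed-size pool and
    // collects the results in the original order. Assumes the URLs are distinct.
    static <R> Map<String, R> checkAll(List<String> urls, int threads, Function<String, R> check)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<R>> pending = new LinkedHashMap<>();
            for (String url : urls) {
                Callable<R> task = () -> check.apply(url);
                pending.put(url, pool.submit(task));
            }
            Map<String, R> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<R>> entry : pending.entrySet()) {
                // get() blocks until that URL's task has finished.
                results.put(entry.getKey(), entry.getValue().get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> urls = Arrays.asList(
                "http://example.com/a.bin",
                "http://example.com/b.bin",
                "http://example.com/c.bin");
        // Stand-in check: a real one would download the binary and verify size and MD5.
        Map<String, Integer> results = checkAll(urls, 2, String::length);
        results.forEach((url, value) -> System.out.println(url + " -> " + value));
    }
}
```

Each task touches only its own URL, so no shared state needs locking here; a real check that also updates a database would need its own care around connections.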
Muthukrishnan Manoharan
Ranch Hand

Joined: Aug 27, 2008
Posts: 91

Thanks Ulf Dittmer,

For a typical scenario of 43 URLs, the application with 5 threads takes a little more time than the single-threaded version of the application.

Am I going wrong somewhere in choosing the number of threads?

Also, please tell me which of the following scenarios to choose:

1. A thread fetches the URL, downloads the binary and then updates the database. Similarly, many threads deal with different URLs simultaneously.

or

2. A separate thread for downloading and a separate thread for updating (following the producer-consumer pattern).

I am presently following scenario no. 1.

-Muthu
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41092
Approach #2 makes no sense. It's impossible to say where the problem might be without seeing the code.
 
 