wood burning stoves 2.0*
The moose likes Performance and the fly likes Compare IP address in a range Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of OCA/OCP Java SE 7 Programmer I & II Study Guide this week in the OCPJP forum!
JavaRanch » Java Forums » Java » Performance
Bookmark "Compare IP address in a range" Watch "Compare IP address in a range" New topic
Author

Compare IP address in a range

Sudhanshu Umalkar
Greenhorn

Joined: Nov 19, 2002
Posts: 12
I have a table holding the range of IP addresses (from and to) and the region. I need to find out the region for the input IP address.
Presently, I am firing a query to get the range - something like SELECT RANGE FROM TABLE WHERE INPUT_IP BETWEEN FROM_IP AND TO_IP.

Now, the table has millions of entries and the input IP addresses count is also pretty big. So it doesn't make sense to fire a query for each input IP address.

Any better way to do this?

Thanks.
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    
    6

If you have a decent DB on decent hardware, and properly indexed tables, then a single query against several million rows shouldn't be a problem. If you are able to do multiple queries at once, you could use PreparedStatement and batches to possibly improve performance.

The next step after that would probably be caching of some sort, whether a home-grown solution or something of the shelf like ehcache or memcahced.

Are you actually experiencing a performance problem that you've measured and determined is caused by these queries? Or are you just guessing at what might be a problem?
Sudhanshu Umalkar
Greenhorn

Joined: Nov 19, 2002
Posts: 12
There's certainly a performance issue. Cannot execute millions of queries one by one. Please note that the table has a lot of other information that I require when the IP address matches the range.
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    
    6

Sud Umm wrote:There's certainly a performance issue.


So, you've actually measured this in a scenario that's a reasonable approximation of your expected production environment and load?

Cannot execute millions of queries one by one.


Of course we can. The question is how quickly can we do it, and how quickly do we need to do it? Do you have concrete numbers for those two questions?

Please note that the table has a lot of other information that I require when the IP address matches the range.


That statement by itself doesn't really say anything useful.

To summarize:

1. Decent hardware.
2. Decent DB software.
3. Proper indices.
4. PreparedStatement + batching.
5. Measure.
6. Caching.
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18712
    
    8

Sud Umm wrote:Now, the table has millions of entries and the input IP addresses count is also pretty big. So it doesn't make sense to fire a query for each input IP address.


Well, that "pretty big" phrase is kind of imprecise. But later on in the thread you stated that you would be running "millions" of queries if you did one query for each IP address. So that would mean that your database has millions of different IP addresses in it. Is that correct? Or did "pretty big" mean something like "a couple of thousand"?
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 8052
    
  22

Sud Umm wrote:Now, the table has millions of entries and the input IP addresses count is also pretty big. So it doesn't make sense to fire a query for each input IP address.

Any better way to do this?

Well, assuming:
(a) Your db table does not contain overlapping ranges.
(b) You can insert all the IP addresses to be queried into a temporary table.
then what you're doing is an "inner join" between your monster table and the temp table.

You should probably look up the exact syntax for your db, since different versions of SQL may have different ways of doing it (don't you just love it? The most important data retrieval language on the planet still not standardized after 30+ years ).

However, as others have said, indexing is going to be very important in terms of performance. I'd say, at the very least:
1. If your db table contains START_IP and END_IP columns, then at least one of them should be indexed.
2. If your temp table is likely to contain more than 50 addresses or so, then it should also be indexed.

Winston

Isn't it funny how there's always time and money enough to do it WRONG?
Articles by Winston can be found here
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: Compare IP address in a range