Win a copy of Think Java: How to Think Like a Computer Scientist this week in the Java in General forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

Compare IP address in a range

 
Sudhanshu Umalkar
Greenhorn
Posts: 12
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I have a table holding the range of IP addresses (from and to) and the region. I need to find out the region for the input IP address.
Presently, I am firing a query to get the range - something like SELECT RANGE FROM TABLE WHERE INPUT_IP BETWEEN FROM_IP AND TO_IP.

Now, the table has millions of entries and the input IP addresses count is also pretty big. So it doesn't make sense to fire a query for each input IP address.

Any better way to do this?

Thanks.
 
Jeff Verdegan
Bartender
Posts: 6109
6
Android IntelliJ IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
If you have a decent DB on decent hardware, and properly indexed tables, then a single query against several million rows shouldn't be a problem. If you are able to do multiple queries at once, you could use PreparedStatement and batches to possibly improve performance.

The next step after that would probably be caching of some sort, whether a home-grown solution or something of the shelf like ehcache or memcahced.

Are you actually experiencing a performance problem that you've measured and determined is caused by these queries? Or are you just guessing at what might be a problem?
 
Sudhanshu Umalkar
Greenhorn
Posts: 12
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
There's certainly a performance issue. Cannot execute millions of queries one by one. Please note that the table has a lot of other information that I require when the IP address matches the range.
 
Jeff Verdegan
Bartender
Posts: 6109
6
Android IntelliJ IDE Java
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Sud Umm wrote:There's certainly a performance issue.


So, you've actually measured this in a scenario that's a reasonable approximation of your expected production environment and load?

Cannot execute millions of queries one by one.


Of course we can. The question is how quickly can we do it, and how quickly do we need to do it? Do you have concrete numbers for those two questions?

Please note that the table has a lot of other information that I require when the IP address matches the range.


That statement by itself doesn't really say anything useful.

To summarize:

1. Decent hardware.
2. Decent DB software.
3. Proper indices.
4. PreparedStatement + batching.
5. Measure.
6. Caching.
 
Paul Clapham
Sheriff
Posts: 21107
32
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Sud Umm wrote:Now, the table has millions of entries and the input IP addresses count is also pretty big. So it doesn't make sense to fire a query for each input IP address.


Well, that "pretty big" phrase is kind of imprecise. But later on in the thread you stated that you would be running "millions" of queries if you did one query for each IP address. So that would mean that your database has millions of different IP addresses in it. Is that correct? Or did "pretty big" mean something like "a couple of thousand"?
 
Winston Gutkowski
Bartender
Pie
Posts: 10417
63
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Sud Umm wrote:Now, the table has millions of entries and the input IP addresses count is also pretty big. So it doesn't make sense to fire a query for each input IP address.

Any better way to do this?

Well, assuming:
(a) Your db table does not contain overlapping ranges.
(b) You can insert all the IP addresses to be queried into a temporary table.
then what you're doing is an "inner join" between your monster table and the temp table.

You should probably look up the exact syntax for your db, since different versions of SQL may have different ways of doing it (don't you just love it? The most important data retrieval language on the planet still not standardized after 30+ years ).

However, as others have said, indexing is going to be very important in terms of performance. I'd say, at the very least:
1. If your db table contains START_IP and END_IP columns, then at least one of them should be indexed.
2. If your temp table is likely to contain more than 50 addresses or so, then it should also be indexed.

Winston
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic