File APIs for Java Developers
Manipulate DOC, XLS, PPT, PDF and many others from your application.
The moose likes JDBC and Relational Databases and the fly likes indexing issue Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login

Win a copy of Head First Android this week in the Android forum!
JavaRanch » Java Forums » Databases » JDBC and Relational Databases
Bookmark "indexing issue" Watch "indexing issue" New topic

indexing issue

Brian Percival
Ranch Hand

Joined: Jun 23, 2004
Posts: 163
I need a clarification from the experts. I have a db with more than a million records. I have a area code field (text) and phone number field( text). And then a different table with a phone field (includes area code also, the field is text).

I need to do some filtering, basically spitout records from table2 which do not have phone numbers (including area code)from the table1. since table1 has over a million records( table2 has like tens of thousands. I know in this case it is good to do filtering based on indexes.

My dilemma is, which would be faster? in both tables create a temporary field (int) which will have full number (areacode+text), create two indexes on the new fields in two tables, and then do a sql statement? or should I just go on with how it is right now, make the full numbers (in table 1) on the fly and search against numbers in table2? this si going to be a text against text search, without indexing, so it would be slow too. Trade off here is time to create new fields and indexes vs just doing it as I see it.

What do you suggest? I cann't experiment myself since this is a access database on a speed/memorywise challenged machine and it takes forever to even do one trail.

Please advise...

Jeanne Boyarsky
author & internet detective

Joined: May 26, 2003
Posts: 32301

I suspect the answer to which is faster highly depends on the database. You can still try it both ways though.

Add a where clause to limit the phone #s searching to a small subset (maybe 100 rows.) Then try it both ways and see which is faster.

[OCA 8 book] [Blog] [JavaRanch FAQ] [How To Ask Questions The Smart Way] [Book Promos]
Blogging on Certs: SCEA Part 1, Part 2 & 3, Core Spring 3, OCAJP, OCPJP beta, TOGAF part 1 and part 2
Joel McNary

Joined: Aug 20, 2001
Posts: 1824

Does access even support indexes? And why would you create the new fields as ints?

I don't even think that creating indexes would necessarily work, even on a system that supports indexes. The reason is that typically if more that 10% of the table would be returned, the system determines that a full table-scan would be better. So you wind up with the un-indexed performance. Now, I can't say with authority that this is what would happen here, and you might get a little bit of improvement, but I would be suprised if you got great performance.

What do I suggest? It's hard to say, but if you create the new field, can you change the logic so that the new field is populated? If not, then it's probably better to just do it as-is. Obnoxious and annoying, but probably is less headache.

Of course, I'd probably complain that you can't expect good performance from an access database anyway and then implement the logic in Java.

Piscis Babelis est parvus, flavus, et hiridicus, et est probabiliter insolitissima raritas in toto mundo.
Brian Percival
Ranch Hand

Joined: Jun 23, 2004
Posts: 163
Thanks for your replies. Yes I would have to try it out with limited number of records. And yes, access supports indexes. The reason I am preferring int fields is because (I believe) a text comparision of numbers is slower than number comparison. Planning to use indexes because indexes (hopefully) would aid me in speedingup comparisions between the tables. Only I want to make sure the overhead of creating and maintaining indexes will not negate their hepfulness.
stu derby
Ranch Hand

Joined: Dec 15, 2005
Posts: 333
I don't know about Access, but many databases allow function-based indexes; if Access supports them, you could have a function-based indx that is the integer result of your two text columns, thereby avoiding the "temporary" column.
Maximilian Xavier Stocker
Ranch Hand

Joined: Sep 20, 2005
Posts: 381
Originally posted by Joel McNary:
Does access even support indexes?

Balazs Borbely
Ranch Hand

Joined: Oct 11, 2004
Posts: 33
A few possible solutions (I don't which would be applicable, depends on your business logic, and constraints)

-make those fields as number not text, since the area code is a 3 digit number and the phone number a 6 digits number.(number field + index=> good performance)
-if not you can use indexes on text fields as well

-use a professional database as Oracle (free for developers), or PostgreSQL (open source)

- use the same primary key for both tables , and that primary key should be a number, (the search on primary key is very fast)

'Make everything as simple as possible, but not simpler.' --Albert Einstein
It is sorta covered in the JavaRanch Style Guide.
subject: indexing issue
It's not a secret anymore!