This week's book giveaway is in the OCAJP 8 forum. We're giving away four copies of OCA Java SE 8 Programmer I Study Guide and have Edward Finegan & Robert Liguori on-line! See this thread for details.
I need a clarification from the experts. I have a db with more than a million records. I have a area code field (text) and phone number field( text). And then a different table with a phone field (includes area code also, the field is text).
I need to do some filtering, basically spitout records from table2 which do not have phone numbers (including area code)from the table1. since table1 has over a million records( table2 has like tens of thousands. I know in this case it is good to do filtering based on indexes.
My dilemma is, which would be faster? in both tables create a temporary field (int) which will have full number (areacode+text), create two indexes on the new fields in two tables, and then do a sql statement? or should I just go on with how it is right now, make the full numbers (in table 1) on the fly and search against numbers in table2? this si going to be a text against text search, without indexing, so it would be slow too. Trade off here is time to create new fields and indexes vs just doing it as I see it.
What do you suggest? I cann't experiment myself since this is a access database on a speed/memorywise challenged machine and it takes forever to even do one trail.
Does access even support indexes? And why would you create the new fields as ints?
I don't even think that creating indexes would necessarily work, even on a system that supports indexes. The reason is that typically if more that 10% of the table would be returned, the system determines that a full table-scan would be better. So you wind up with the un-indexed performance. Now, I can't say with authority that this is what would happen here, and you might get a little bit of improvement, but I would be suprised if you got great performance.
What do I suggest? It's hard to say, but if you create the new field, can you change the logic so that the new field is populated? If not, then it's probably better to just do it as-is. Obnoxious and annoying, but probably is less headache.
Of course, I'd probably complain that you can't expect good performance from an access database anyway and then implement the logic in Java.
Piscis Babelis est parvus, flavus, et hiridicus, et est probabiliter insolitissima raritas in toto mundo.
Joined: Jun 23, 2004
Thanks for your replies. Yes I would have to try it out with limited number of records. And yes, access supports indexes. The reason I am preferring int fields is because (I believe) a text comparision of numbers is slower than number comparison. Planning to use indexes because indexes (hopefully) would aid me in speedingup comparisions between the tables. Only I want to make sure the overhead of creating and maintaining indexes will not negate their hepfulness.
I don't know about Access, but many databases allow function-based indexes; if Access supports them, you could have a function-based indx that is the integer result of your two text columns, thereby avoiding the "temporary" column.
A few possible solutions (I don't which would be applicable, depends on your business logic, and constraints)
-make those fields as number not text, since the area code is a 3 digit number and the phone number a 6 digits number.(number field + index=> good performance) -if not you can use indexes on text fields as well
-use a professional database as Oracle (free for developers), or PostgreSQL (open source)
- use the same primary key for both tables , and that primary key should be a number, (the search on primary key is very fast)
'Make everything as simple as possible, but not simpler.' --Albert Einstein