*
The moose likes JDBC and the fly likes use of LIKE Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of Murach's Java Servlets and JSP this week in the Servlets forum!
JavaRanch » Java Forums » Databases » JDBC
Bookmark "use of LIKE" Watch "use of LIKE" New topic
Author

use of LIKE

Brian Percival
Ranch Hand

Joined: Jun 23, 2004
Posts: 163
Ok this is nothing related to JDBC but more a database design q.

I have say, a 'news' table with 'sites' as one if its column,

a news item can be assigned to more than one site and so 'sites' column contains a 'list' of all site ids that this news item is assigned to.

so whenever there is a query to get all itesm related to a site, we need to do a 'like' and use regexp such as like 'sid,%' or '%,sid,%' or '%,sid' etc.


I know this is a bad design, that the news-site association should be stored in a different table etc etc... but that implies all 'getnews' queries then will have to be 'join's between news and site tables.

And if there are any such dependencies, say another field called 'categories' then teh ideal design would be to add another 'categories' table in the join to get news items for a site and a category...

And so on...

Can anybody convince me that NO MATTER WHAT, storing the ids as a list in a single column and using 'like' for searches is ALWAYS a bad idea and always a performance hit compared to the multiple table joins solution???
Scott Selikoff
Saloon Keeper

Joined: Oct 23, 2005
Posts: 3700
    
    5

Easily.

Since string searching has to be done on every record in the table, there's no way to optimize this. In every query you could be searching through huge amounts of data which may be spread across a large drive.

The correct solution is to create a table with two columns, (nid,sid), where records in the table may have multiple news ids but (nid,sid) are unique primary keys. Then you can build a tree index on sid and retrieve results extremely fast (logn).

The thing you're forgetting is that more complex data structures can be managed with intelligent uses of indexes where as overly simplistic, denormalized data, is extremely limited. The database can't build an index to intelligently search through a field with LIKE-style syntax.


My Blog: Down Home Country Coding with Scott Selikoff
Scott Selikoff
Saloon Keeper

Joined: Oct 23, 2005
Posts: 3700
    
    5

BTW a second solution, although it'd practically be an anti-pattern since maintenance would be horrible, is if you know there is a limit on the number of pairs such as 5, you could add 5 columns in the original table instead of a list. Its better than a list is than it has faster performance but you now have to build index on 5 different fields, and as I said, maintenance would be hell.
Brian Percival
Ranch Hand

Joined: Jun 23, 2004
Posts: 163
Thanks.. just wanted a confirmation, that no matter how bulky joins might get, it is always to go ..
Paul Campbell
Ranch Hand

Joined: Oct 06, 2007
Posts: 338
Originally posted by Brian Percival:
Thanks.. just wanted a confirmation, that no matter how bulky joins might get, it is always to go ..


Joins are not a bad thing.

A query (full of joins) against a well-designed star schema will always out perform a normalized data structure representing the same data.
Purushoth Thambu
Ranch Hand

Joined: May 24, 2003
Posts: 425
Performance comes into consideration only if you know how much volume of data is stored in the table. I am not in favor of creating multiple tables and have joins - DB's are getting faster and storage is cheap. But using LIKE will definitely result in (a) FTS (Full table scan) or (b) Full Index Scan if table is IOT.

By the way you didn't say which database you are using. If you happen to use Oracle then you can look at InterMedia/Text Indexes to improve the performances and avoid FTS. It offers additional features as well. Using Text index depends on how many inserts you anticipate in a second.
Paul Campbell
Ranch Hand

Joined: Oct 06, 2007
Posts: 338
Originally posted by Purushothaman Thambu:
Performance comes into consideration only if you know how much volume of data is stored in the table. I am not in favor of creating multiple tables and have joins - DB's are getting faster and storage is cheap. But using LIKE will definitely result in (a) FTS (Full table scan) or (b) Full Index Scan if table is IOT.

By the way you didn't say which database you are using. If you happen to use Oracle then you can look at InterMedia/Text Indexes to improve the performances and avoid FTS. It offers additional features as well. Using Text index depends on how many inserts you anticipate in a second.


I fail to understand the resistance to utilizing joins. What is the basis for the point of view... it isn't something resisted in the BI or data warehousing world. Is this something that causes a problem for JDBC?
Purushoth Thambu
Ranch Hand

Joined: May 24, 2003
Posts: 425
Didn't mean that joins are bad. But we can void them if possible. I don't see why I should be creating multiple tables ,normalize, index , fk constraint, non mergeable views and stuff if join doesn't promise huge performance/scalability improvement.
Paul Campbell
Ranch Hand

Joined: Oct 06, 2007
Posts: 338
Originally posted by Purushothaman Thambu:
Didn't mean that joins are bad. But we can void them if possible. I don't see why I should be creating multiple tables ,normalize, index , fk constraint, non mergeable views and stuff if join doesn't promise huge performance/scalability improvement.


That's the point. Your query and performance in presenting the query results is completely dependent on the design of your data base. The payoff of doing the ground work now... and honestly... how hard is it to write multiple create DDL statments... though in an ideal world it would be generated from your logical data model.
 
wood burning stoves
 
subject: use of LIKE
 
Similar Threads
inserting a JLabel to a JTable
Hibernate object mapping... design issue?
Check constraints
Timezone Coneversion Problem
can I ask a big favor?