
What is the second-best way to quote a string for a MySQL query through JDBC?

 
Ranch Hand
Posts: 334
2
Netbeans IDE Tomcat Server Java
I don't think there's a way to use a prepared statement for this, and I can't find the Java equivalent of mysql_real_escape_string().

The problem is that I'm constructing a complicated WHERE clause. The query has a variable number of optional fields to make it easy for the end user to search a 20-million-row table.

So I'd like to quote individual strings as I build the query.

I'd rather not write my own string quoting function but that is an option.

What am I missing?

Thanks,
Joe
 
Sheriff
Posts: 67746
173
Mac Mac OS X IntelliJ IDE jQuery TypeScript Java iOS
What prevents you from using a PreparedStatement?
 
Joe Areeda
Ranch Hand
Posts: 334
2
Netbeans IDE Tomcat Server Java

Bear Bibeault wrote:What prevents you from using a PreparedStatement?


Hi Bear,

My problem with a prepared statement is that the WHERE clause is so variable: it can match any number of fields depending on the user's request, with lots of ANDs and ORs, some numeric comparisons, and some LIKEs. I suppose an alternative is to build the statement template dynamically, but even that isn't straightforward because fields can be strings, ints, floats, and soon timestamps.

I tried simplifying the process, but the users revolted and said a 10+ second query that is easy for them to construct is better than making the request harder to specify.

If I can quote the strings the algorithm works, but all these unquoted strings make me really, really nervous.
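For illustration, a minimal sketch of the usual workaround: grow the SQL text and a matching parameter list side by side, so every user-supplied value is still bound through the PreparedStatement rather than concatenated into the query. The table and column names (channels, name, sample_freq) are made up.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Sketch only -- the table and column names are hypothetical.
public class ChannelQueryHelper {

    public static PreparedStatement buildSearch(Connection conn,
                                                String nameFragment,
                                                Double minFreq) throws SQLException {
        StringBuilder sql = new StringBuilder("SELECT * FROM channels WHERE 1=1");
        List<Object> params = new ArrayList<Object>();

        if (nameFragment != null) {              // optional substring condition
            sql.append(" AND name LIKE ?");
            params.add("%" + nameFragment + "%");
        }
        if (minFreq != null) {                   // optional numeric comparison
            sql.append(" AND sample_freq >= ?");
            params.add(minFreq);
        }

        PreparedStatement ps = conn.prepareStatement(sql.toString());
        for (int i = 0; i < params.size(); i++) {
            ps.setObject(i + 1, params.get(i));  // JDBC parameters are 1-based
        }
        return ps;
    }
}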

Joe

 
Bartender
Posts: 543
4
Netbeans IDE Redhat Java
You could make a prepared statement with named parameters using a wrapper class (a Google search will help with this), with something like

"select * from table where (:param1 is null or table.paramName=:param1) and (:param2 is null or table.param2Name=:param2) and ..."

Do it for all possible fields and you're in. If you don't want to use the wrapper class, you could just apply each of your params twice.
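Without the wrapper class, the same idea with plain positional parameters might look roughly like this; each value is bound twice, a null filter simply disables its condition, and the column names here are made up.

// conn, nameFilter and clusterFilter are assumed to be in scope
String sql = "SELECT * FROM channels"
           + " WHERE (? IS NULL OR name = ?)"
           + "   AND (? IS NULL OR cluster = ?)";

try (PreparedStatement ps = conn.prepareStatement(sql)) {
    ps.setString(1, nameFilter);     // once for the IS NULL test...
    ps.setString(2, nameFilter);     // ...and once for the actual comparison
    ps.setString(3, clusterFilter);
    ps.setString(4, clusterFilter);
    try (ResultSet rs = ps.executeQuery()) {
        // process rows...
    }
}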

For this kind of thing, though, JPA would be better suited. It's up to you whether you want to move to JPA or keep using JDBC.
 
Joe Areeda
Ranch Hand
Posts: 334
2
Netbeans IDE Tomcat Server Java
Thanks Dieter,

I'll have to dig into JPA; at this point I barely understand the acronym.

I'll toy with possibly-null named parameters, but the number of combinations is going to be large.

Maybe I am just going about this all wrong. So allow me to describe the problem I'm trying to solve instead of the programming corner I've painted myself into.

The rows in this table represent data streams stored on disk or tape in one or more computing clusters. We call a data stream a Channel; it may have been acquired by a real-time system or calculated from one or more acquired channels.
A channel has a name auto-generated from a model with several fields, for example "L1:IMC-X_MC3M3_OUT_DQ.mean". It also has a sampling frequency that runs from about 1/60 Hz to 65536 Hz. These are the fields that are giving me headaches. There are others, but we do exact matches on them.

For the name field, the user specifies a set of mandatory substrings it must contain, or a set of alternative substrings. In the above example, IMC and MC3 might be required, plus one of X, Y, or Z. To complicate matters, the order in which these substrings occur is not always easy to infer, and the delimiters (:, -, _, and .) do not always break the name into fields that have to match exactly.

The match string a user might enter could look like "L1 mc3 imc x | L1 mc3 imc z", and that would get translated into a WHERE clause like:
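Roughly, and assuming the column is simply called name, something along these lines:

WHERE (name LIKE '%L1%' AND name LIKE '%mc3%' AND name LIKE '%imc%' AND name LIKE '%x%')
   OR (name LIKE '%L1%' AND name LIKE '%mc3%' AND name LIKE '%imc%' AND name LIKE '%z%')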



For the sample frequency, they may select a value and a comparison operator (<=, <, =, >, >=), which would then make the WHERE clause look like:
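Again roughly, assuming a sampleFrequency column and a user who picked >= and 2048:

AND sampleFrequency >= 2048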



The web app that uses this as step 1 is meant to find and retrieve the data for a specific time interval and plot it, or transform and plot it. It turns out that many of our users just use it to do this search, to figure out which channels to use in their off-line analysis. I seem to be the only one who cares how $#@%! UGLY and inefficient the DB queries are.

Some numbers:
If we ignore how many calculated channels are derived from an acquired channel, and how many different data formats and sample rates were used over the years, there are about 1 million channels.
Add those variants and it's about 10 million.
Add in exact duplicates available from different clusters and it's about 25 million.

Just for the shock-and-awe factor: the data kept online at the main cluster is a bit over 2 PB (2,000 TB), and when the instruments come online after a major upgrade at the end of this year or the beginning of 2015, new data will be acquired at a rate of about 1 PB/year. At that point, hopefully, the size of the channel list will stop growing, or at least slow to a crawl.

Right now we are searching the 25-million-row table, but soon I'll have the search working with the 1-million-row table. There are multiple indexes and alternate name-matching strategies (like glob and regex) that work much more efficiently. However, they require a priori knowledge of the order of the substrings, which is not always available.
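For example, strategies like the fragments below only work when the order of the substrings is already known in advance (the patterns are illustrative, not real channel names):

WHERE name LIKE 'L1:IMC-X%'            -- glob-style prefix match, can use an index on name
WHERE name REGEXP '^L1:IMC-[XYZ]_'     -- anchored regex with the field order fixed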

I've thought about using the full-text search feature, which is a bit weird on char(64) fields, but the big problem is that the strings can't be reliably broken into "words" to search on.

Perhaps there is a way to write a custom string comparison function that I don't know about. Something like String.containsAllSubstrings(String name, String[] substrings) would deal with the ugly syntax and increase efficiency a bit.
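On the Java side, such a helper is only a few lines; the method below is hypothetical (it is not part of java.lang.String) and would mainly help for post-filtering candidate rows pulled out with a cheaper, broader query:

// Hypothetical helper: true if every required substring occurs somewhere
// in the channel name, ignoring case.
public static boolean containsAllSubstrings(String name, String[] substrings) {
    String haystack = name.toLowerCase();
    for (String s : substrings) {
        if (!haystack.contains(s.toLowerCase())) {
            return false;   // a required substring is missing
        }
    }
    return true;            // all required substrings were found
}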

I do appreciate the Ranch as the best place to describe these kinds of problems. I always seem to get a good discussion of alternatives.

Best,
Joe
 
Bartender
Posts: 3648
16
Android Mac OS X Firefox Browser Java
Looking at your problem, your SQL query's WHERE clause is dynamic.

Have you ever thought of using the Builder pattern to build the query, or at least the WHERE clause?

An example of how to use the pattern can be found here
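A minimal sketch of what such a builder might look like (not the linked example): it accumulates conditions and parameter values so the final statement is still a PreparedStatement, and the column names are made up.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class WhereBuilder {
    private final List<String> conditions = new ArrayList<String>();
    private final List<Object> params = new ArrayList<Object>();

    public WhereBuilder nameContains(String fragment) {
        conditions.add("name LIKE ?");
        params.add("%" + fragment + "%");
        return this;                                   // fluent chaining
    }

    public WhereBuilder freqAtLeast(double hz) {
        conditions.add("sample_freq >= ?");
        params.add(hz);
        return this;
    }

    public PreparedStatement build(Connection conn, String baseSql) throws SQLException {
        StringBuilder sql = new StringBuilder(baseSql);
        for (int i = 0; i < conditions.size(); i++) {
            sql.append(i == 0 ? " WHERE " : " AND ").append(conditions.get(i));
        }
        PreparedStatement ps = conn.prepareStatement(sql.toString());
        for (int i = 0; i < params.size(); i++) {
            ps.setObject(i + 1, params.get(i));        // bind the collected values in order
        }
        return ps;
    }
}

Usage would then be something like new WhereBuilder().nameContains("IMC").freqAtLeast(2048).build(conn, "SELECT * FROM channels").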


Of course, the JPA route is also good, but it seems you are not familiar with it.
 