• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

how to avoid special character in database?

 
Ranch Hand
Posts: 3852
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Sorry for my ignorance about SQL and Database.

Is there any provision in SQL or in Database or in JDBC API to avoid special characters in tables/database?

Thanks.
 
Bartender
Posts: 10336
Hibernate Eclipse IDE Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
What do you mean by "special character"?
 
ankur rathi
Ranch Hand
Posts: 3852
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Paul Sturrock:
What do you mean by "special character"?




A computer-representable character that is not alphabetic, numeric, or blank.

Non-alphabetic or non-numeric character, such as @, #, $, %, &, * and +.



I hope you get it now.

 
Rancher
Posts: 43081
77
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
While there are some ways of doing this in the database, it would be a much cleaner design not to store them there in the first place. What prevents you from removing those characters before you store them?
 
ankur rathi
Ranch Hand
Posts: 3852
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Ulf Dittmer:
While there are some ways of doing this in the database, it would be a much cleaner design not to store them there in the first place. What prevents you from removing those characters before you store them?



Definitely, I would like to remove them before storing. But is there any ohter way of doing this apart from using RegEx? I mean, say any special kind of statement or something of that nature.

Thanks.
 
Ulf Dittmer
Rancher
Posts: 43081
77
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

But is there any ohter way of doing this apart from using RegEx? I mean, say any special kind of statement or something of that nature.



No. Anyone's definition of 'special characters' probably differs from everyone else's, so it would be hard to know what to remove.
 
(instanceof Sidekick)
Posts: 8791
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
What's the real problem you're trying to solve? What happens when "bad" characters get into the database? Believe me, I've hit dozens of ways that unexpected user input can crash an application, so I appreciate the problem. But just removing them may not be the answer.
 
ankur rathi
Ranch Hand
Posts: 3852
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Stan James:
What's the real problem you're trying to solve? What happens when "bad" characters get into the database? Believe me, I've hit dozens of ways that unexpected user input can crash an application, so I appreciate the problem. But just removing them may not be the answer.



The thing is, we don't want to make user flexible enough to enter *anything* he/she likes.

I guess, I should do that test both at client side and server side with my code only (JavaScript & Java respectively).
 
Paul Sturrock
Bartender
Posts: 10336
Hibernate Eclipse IDE Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
What does happen when "bad" characters enter the database? My understanding is it doesn't really matter what ends up in a text filed of a database, unless you use the value in some other technology unchecked. You mention JavaScript - is this the real problem?
 
Stan James
(instanceof Sidekick)
Posts: 8791
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Altering what the user enters seems downright rude. I'd be more inclined to properly escape strings on whatever platform I have to handle them. The first rookie error we hit was building an insert statement as a String and somebody entered a name of O'Malley and the ' terminated the value in the SQL string and Malley gave a syntax error. After we managed to get ' in and out of the database, it blew up some JavaScript written by a JSP the same way - terminated a string.

See the Commons StringEscapeUtils for a start on this journey.
 
Ranch Hand
Posts: 338
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Really the question is not how to avoid entering special characters in a database... it is how to avoid upserting characters that are not supported by the database's configuration. It isn't just special characters, carriage returns, etc. If the character is not supported by your database's configuration, it becomes unprintable, unreadable, and most of all... it makes that column virtually useless in a where clause, causes odd results in group by, order by, etc. As a result, if you don't eliminate the bad characters prior to committing them to the database, you (or somebody) will have to clean them up later. I was the person that had to clean up the bad data the other day when I had to create a package to remove unprintable/unreadable characters not supported by an Oracle installation using US7ASCII... in this case the user had loaded thousands of rows of data containing "curly" quotes from an excel spreadsheet to the database.

So it is a valid concern and no, I don't think you can do it in a single statement. In my case, the package is now part of the upsert sql and strips the characters as part of the sql statement.
 
Stan James
(instanceof Sidekick)
Posts: 8791
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Interesting notes about upsetting the database itself. I doing my first direct hands-on time work with databases since the 80s right now. Using a prepared statement with setString() I'm able to insert and retrieve all the characters on the keyboard, which is more than the last system I worked on could do. I'll have to play with copy paste from Word and Excel and see what happens!
 
Paul Sturrock
Bartender
Posts: 10336
Hibernate Eclipse IDE Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Paul Campbell:
Really the question is not how to avoid entering special characters in a database... it is how to avoid upserting characters that are not supported by the database's configuration. It isn't just special characters, carriage returns, etc. If the character is not supported by your database's configuration, it becomes unprintable, unreadable, and most of all... it makes that column virtually useless in a where clause, causes odd results in group by, order by, etc. As a result, if you don't eliminate the bad characters prior to committing them to the database, you (or somebody) will have to clean them up later. I was the person that had to clean up the bad data the other day when I had to create a package to remove unprintable/unreadable characters not supported by an Oracle installation using US7ASCII... in this case the user had loaded thousands of rows of data containing "curly" quotes from an excel spreadsheet to the database.

So it is a valid concern and no, I don't think you can do it in a single statement. In my case, the package is now part of the upsert sql and strips the characters as part of the sql statement.



Rather than "cleaning up the data", why did you not convert your Oracle instance to support Unicode? Is it a pre-8i version or something? OK, migration from one character set to another is non-trivial, but if users are trying to store characters outside your current character set, is this not something you need to do? Surely your character stripping package is corrupting the data?
 
Paul Campbell
Ranch Hand
Posts: 338
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
It is 10g/RAC... all our databases are 10g or being converted to 10g. Not all are RAC. 8i has been out of support for a long time, 9i is in extended support by Oracle. You never want to be in an out-of-support situation with your corporate data stores.

Well for one. It isn't my call. Even if it was... it isn't the right decision. That is an architecture decision driven by the corporation as a whole. We are currently going through a core application conversion and consolidation.

One of the "features" of upgrading the character code used by your database is that some systems (most notably the ETL jobs) are dependent on the particular character returned to the job in order to determine "no data found". When you change character codes on your database, you change the character that will be returned when "no data found" occurs for these jobs.

Your suggestion is valid. In this situation, there just isn't enough ROI to support a decision to convert/upgrade hundreds of Oracle databases (all our databases are configured similarly) and testing/modification of the thousand+ ETL jobs that use these databases because a single excel spreadsheet upsert process inserted unsupported characters into a single database.

Also, once the unsupported characters are inserted to the database, they are lost. Converting to different character code will not make that data usable.

I'm writing in a hurry as I go to work so if I didn't make a lot of sense, please let me know.

[ October 08, 2007: Message edited by: Paul Campbell ]

[ October 08, 2007: Message edited by: Paul Campbell ]

[ October 08, 2007: Message edited by: Paul Campbell ]
[ October 08, 2007: Message edited by: Paul Campbell ]
 
Hoo hoo hoo! Looks like we got a live one! Here, wave this tiny ad at it:
a bit of art, as a gift, the permaculture playing cards
https://gardener-gift.com
reply
    Bookmark Topic Watch Topic
  • New Topic