This week's book giveaway is in the Design forum.
We're giving away four copies of Design for the Mind and have Victor S. Yocco on-line!
See this thread for details.
Win a copy of Design for the Mind this week in the Design forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

Fit or not fit (Oracle varchar2 problem)

 
D Rog
Ranch Hand
Posts: 472
Linux Objective C Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I met a strange problem, although a field is defined as varchar2(4000), it can fit different number of character, because Unicode, UTF-8 or whatever. I guess Oracle field size is defined in bytes, not characters. So is any way to figure out upfront if certain string will fit varchar2 field without actual executing INSERT or UPDATE? Another problem is, if I know that a string can't fit and going to truncate it, then the process seems should be iterative. So an ideal solution for me can be just call some function which with minimum spending of computer resources can tell me how to truncate a string to fit a particular varchar2 field.Or perhaps you can suggest something clever?
 
Scott Selikoff
author
Saloon Keeper
Posts: 4010
18
Eclipse IDE Flex Google Web Toolkit
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
When you get over 2,000 characters, encoding becomes a big issue (been there). It might be better to declare objects that large as BLOBs (or in the case of Oracle, CLOBs)
 
Paul Campbell
Ranch Hand
Posts: 338
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by D Rog:
I met a strange problem, although a field is defined as varchar2(4000), it can fit different number of character, because Unicode, UTF-8 or whatever. I guess Oracle field size is defined in bytes, not characters. So is any way to figure out upfront if certain string will fit varchar2 field without actual executing INSERT or UPDATE? Another problem is, if I know that a string can't fit and going to truncate it, then the process seems should be iterative. So an ideal solution for me can be just call some function which with minimum spending of computer resources can tell me how to truncate a string to fit a particular varchar2 field.Or perhaps you can suggest something clever?


Actually your problem is not that strange, but it is likely the result of how UTF-8 characters are encoded. Characters beyond U+FFFF are stored as four bytes. This storage can be even more confusing if your Oracle installation is using AL32UTF8 as its Oracle character set. AL32UTF8 stores these same (greater than U+FFFF) characters as two UTF-16 surrogates encoded in UTF-8 that take up 6 bytes per surrogate character.

As previously stated, a CLOB is likely a better way for you to go.
 
D Rog
Ranch Hand
Posts: 472
Linux Objective C Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Thanks for responses. Yes, CLOB is what I am currently mostly use. My concern was that CLOB can perform not quite well.

Another concern, how effectively Oracle build index against CLOB column?
[ October 10, 2008: Message edited by: D Rog ]
 
Paul Campbell
Ranch Hand
Posts: 338
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by D Rog:
Thanks for responses. Yes, CLOB is what I am currently mostly use. My concern was that CLOB can perform not quite well.

Another concern, how effectively Oracle build index against CLOB column?

[ October 10, 2008: Message edited by: D Rog ]



All LOB columns have special segments created for them by Oracle. These columns require a special type of indexing.

To define a LOB index:

http://www.dba-oracle.com/t_table_blob_lob_storage.htm

You can also create a text index if the situation warants it:

http://www.dba-oracle.com/oracle_tips_like_sql_index.htm
 
Scott Selikoff
author
Saloon Keeper
Posts: 4010
18
Eclipse IDE Flex Google Web Toolkit
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You're right that CLOBs may not perform well but when you go over 2,000 characters, you don't have many options.
 
Avi Abrami
Ranch Hand
Posts: 1141
1
Java Oracle
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Rog,
Have you read the Oracle documentation?
VARCHAR2 columns can be defined as "bytes" (the default) or "characters".
The documentation also details the size calculations Oracle makes for storing VARCHAR2 column data.

Good Luck,
Avi.
 
Paul Campbell
Ranch Hand
Posts: 338
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Avi Abrami:
Rog,
Have you read the Oracle documentation?
VARCHAR2 columns can be defined as "bytes" (the default) or "characters".
The documentation also details the size calculations Oracle makes for storing VARCHAR2 column data.

Good Luck,
Avi.


That is true... you can override the default length semantics defined by the NLS_LENGTH_SEMANTICS parameter which defaults to byte(i.e., if you don't specify VARCHAR2(4000) is 4000 bytes).

You can set it to VARCHAR2(CHAR 4000) and Oracle will store the 4000 characters regardless of the number of bytes used to store the 4000 characters.

Your maximum size limit for VARCHAR2 is 4000 regardless of character or bytes.
[ October 15, 2008: Message edited by: Paul Campbell ]
 
Valentin Jacquemin
Ranch Hand
Posts: 35
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Paul Campbell wrote:
You can set it to VARCHAR2(CHAR 4000) and Oracle will store the 4000 characters regardless of the number of bytes used to store the 4000 characters.

Your maximum size limit for VARCHAR2 is 4000 regardless of character or bytes.


Just to precise to correct syntax is: VARCHAR2(4000 CHAR).
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic