I presume by UTF you are talking about the widely used UTF-8. UTF-8 uses 1 to 4 bytes (8 to 32 bits) per Unicode character, depending on the character. There are other UTF formats, such as UTF-16, that represent the data differently. UTF-16 uses 16 bits for characters in the Basic Multilingual Plane and 32 bits (a surrogate pair) for characters outside it. UTF-16 has the advantage that the most common characters are all the same size, while UTF-8 usually takes less space, at least if most of the characters can be encoded in a single byte, like the displayable ASCII characters. UTF-8 is more likely to be used when it is known that the data will be accessed serially, as when it is sent across the network. UTF-16 is used when the data might be accessed in random order, as in a file. For example, Windows NT/2000 use UTF-16 to store Unicode data on disk. For an introduction to Unicode and encodings, you might look at The Unicode Standard: A Technical Introduction.
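You can see the variable-width behavior directly from Java's standard charset APIs. This little sketch (using only `String.getBytes` with `StandardCharsets`; UTF-16BE is chosen over plain UTF-16 so no byte-order mark is prepended) encodes a few characters of increasing code point and prints how many bytes each encoding needs:

```java
import java.nio.charset.StandardCharsets;

public class UtfSizes {
    public static void main(String[] args) {
        // "A" (U+0041), "é" (U+00E9), "€" (U+20AC), and U+1D11E (musical G clef, outside the BMP)
        String[] samples = {
            "A",
            "\u00E9",
            "\u20AC",
            new String(Character.toChars(0x1D11E))
        };
        for (String s : samples) {
            System.out.printf("U+%05X  UTF-8: %d bytes  UTF-16: %d bytes%n",
                    s.codePointAt(0),
                    s.getBytes(StandardCharsets.UTF_8).length,
                    s.getBytes(StandardCharsets.UTF_16BE).length);
        }
    }
}
```

Running it shows UTF-8 growing from 1 to 4 bytes as the code point grows, while UTF-16 stays at 2 bytes until the supplementary character forces a 4-byte surrogate pair.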
subject: How many bits are there for UTF characters?