My assignment states that all text values contain only 8-bit characters. The encoding of these characters is US-ASCII.
Reading this reminded me of a quote in the "Complete Java 2 Certification, Fifth Edition" book:
"The strings that denote encoding names are determined by standards committees, so they are not especially obvious or informative. For example, the U.S. ASCII encoding name is not USASCII as you might expect, but rather ISO8859-1."
I looked through the posts on this forum, but I can't find a clear answer.
What I read was that US-ASCII was originally a 7-bit encoding and didn't contain European characters. The standard was later extended to 8 bits, with support for European characters, and this extended version was named ISO-8859-1.
But when I read the posts, people said they are using US-ASCII anyway, while to me it seems that ISO-8859-1 is the way to go.
So can someone give me a clear answer on which one to use?
I find it difficult to say whether your approach is correct. But what I do know is that if you do not define your encoding, the platform default encoding is used. That may be fine for developers in the US, but I'm from somewhere else, so I can't depend on that.
So I use the new String(bytes, encoding) constructor! I am just not sure which encoding to use.
I hope that someone can give me a clear answer.
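For illustration, here is a minimal sketch of that constructor with an explicit charset (assuming Java 7+, where java.nio.charset.StandardCharsets provides constants for both encodings; the byte values are just a sample, not from the assignment):

```java
import java.nio.charset.StandardCharsets;

public class DecodeBytes {
    public static void main(String[] args) {
        // Bytes of the text "Hello" as they would appear in a US-ASCII data file
        byte[] bytes = {72, 101, 108, 108, 111};

        // Decode with an explicit charset rather than the platform default
        String usAscii = new String(bytes, StandardCharsets.US_ASCII);
        String latin1  = new String(bytes, StandardCharsets.ISO_8859_1);

        // For pure ASCII data both charsets decode to the same string
        System.out.println(usAscii); // Hello
        System.out.println(latin1);  // Hello
    }
}
```

Because every US-ASCII byte decodes identically under ISO-8859-1, either constant gives the same result for data that really is 7-bit ASCII.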
For the assignment, either US-ASCII or ISO-8859-1 is fine; both will work equally well.
Hey Marinus! I did not get why my approach would be wrong. Why should it be?
The requirements say that the character encoding is US-ASCII. I read bytes and never read characters; when I want a char, I read a byte and cast it to char. I remember that the byte-to-int conversion fills up the most significant bytes, and the int-to-char conversion then takes care of getting the Unicode character, so I do get the right character code. When writing back, I write only a byte, so that also takes care of writing only the least significant byte, which is the US-ASCII code. Can anyone else shed some light on whether my approach is incorrect?
I do agree that the implementation is fragile if the encoding ever changes. But I made a decision and documented it in choices.txt; that should not get me deductions.
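A small sketch of the byte-to-char cast discussed above (the class and variable names are mine, not from the assignment): the cast is safe for the ASCII range 0..127, but for byte values above 127 Java's sign extension produces the wrong character unless the byte is masked first, which is one reason the approach only holds as long as the data really is US-ASCII:

```java
public class ByteCastExample {
    public static void main(String[] args) {
        // In the ASCII range (0..127) a byte-to-char cast is safe:
        byte ascii = 65;
        char c = (char) ascii;
        System.out.println(c); // A

        // Above 127 a Java byte is negative; the widening conversion
        // sign-extends to int before the narrowing conversion to char,
        // which keeps the low 16 bits and yields a character far
        // outside Latin-1:
        byte high = (byte) 0xE9;     // 0xE9 is 'é' in ISO-8859-1
        char wrong = (char) high;
        System.out.println((int) wrong); // 65513, not 233

        // Masking with 0xFF recovers the unsigned byte value:
        char right = (char) (high & 0xFF);
        System.out.println((int) right); // 233
    }
}
```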
I really have no idea whether your approach is wrong. It seems to me that you do not really take the US-ASCII encoding into account, but maybe someone else can confirm this.
I am also not satisfied, because I still have no idea what the difference between US-ASCII and ISO-8859-1 is. I still hope someone can clarify this.
ASCII uses 7 bits, so it has valid characters between 0 and 127 only. ISO-8859-1 uses 8 bits, and consequently has 256 characters. Luckily, the first 128 characters of both encodings are identical.
Getting back to your original post, there really is no 8-bit ASCII encoding (see above). But if you know that something is encoded in ASCII, then you can safely decode it as ISO-8859-1 as well, because both are identical for all ASCII characters.
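A quick sketch that checks this claim (assuming Java 7+ for the StandardCharsets constants): the first 128 byte values decode identically under both charsets, while a byte above 127 is where they part ways:

```java
import java.nio.charset.StandardCharsets;

public class CompareCharsets {
    public static void main(String[] args) {
        // All 128 valid ASCII byte values
        byte[] ascii = new byte[128];
        for (int i = 0; i < 128; i++) {
            ascii[i] = (byte) i;
        }

        // Both charsets decode the ASCII range to exactly the same string
        String a = new String(ascii, StandardCharsets.US_ASCII);
        String b = new String(ascii, StandardCharsets.ISO_8859_1);
        System.out.println(a.equals(b)); // true

        // A byte above 127 has no US-ASCII mapping and decodes to
        // U+FFFD (the replacement character), while ISO-8859-1 maps
        // byte 0xE9 to U+00E9 ('é')
        byte[] high = {(byte) 0xE9};
        System.out.println((int) new String(high, StandardCharsets.US_ASCII).charAt(0));   // 65533
        System.out.println((int) new String(high, StandardCharsets.ISO_8859_1).charAt(0)); // 233
    }
}
```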