
ASCII Conversion Problem - emdash sign

 
Gihan Pandigamage
Ranch Hand
Dear Friends,
I'm trying to add ASCII values to a 2D array, but when I try to add character 151 to the string array, it is stored as a '?' instead of an em dash. I couldn't figure out why. It works perfectly for other ASCII values.


How can I add the em dash to a string array without any issue?
 
Jesper de Jong
Java Cowboy
Saloon Keeper
151 is not a valid code for an ASCII character; ASCII characters have codes between 0 and 127.

What your code does is take Unicode character U+0097 (97 hex = 151 decimal) and convert that to a string. I've looked up Unicode character U+0097 on www.unicode.org and it's some sort of control character, so it's not surprising that it is shown as "?". It's certainly not an "emdash" character.

I looked up the Windows-1252 encoding and indeed, in that encoding the character 0x97 looks like a long dash. But Windows-1252 is not ASCII and not Unicode.

If you want to specify characters according to the Windows-1252 encoding, you'll need to decode from that encoding, instead of directly interpreting the characters as Unicode characters. Something like this should work, although I haven't tested it:
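A minimal sketch of that approach (not the original snippet; the class name `Cp1252Decode` and the helper `decodeCp1252` are illustrative). It assumes the JDK provides the `windows-1252` charset, which standard desktop JDKs do, although the Java specification does not guarantee it:

```java
import java.nio.charset.Charset;

public class Cp1252Decode {
    // Hypothetical helper: treats the given value (0-255) as a single
    // Windows-1252 byte and decodes it into a Java String.
    public static String decodeCp1252(int code) {
        byte[] bytes = { (byte) code };
        // Assumption: the "windows-1252" charset is available in this JDK.
        return new String(bytes, Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        String[][] table = new String[1][1];
        table[0][0] = decodeCp1252(151);
        // Print the code point rather than the glyph, because the console
        // encoding may not be able to display the character itself.
        System.out.printf("U+%04X%n", (int) table[0][0].charAt(0)); // prints U+2014
    }
}
```

If the console still shows "?" when the string itself is printed, the string is probably fine and the output encoding is the limitation, which is why the sketch prints the code point instead.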

 
Steve Luke
Bartender
Character codes in Java are Unicode codes, not ASCII codes. The Unicode code for the em dash is 8212 in decimal (U+2014), I think. There isn't an ASCII code for the em dash as far as I could find... The only association between 151 and the em dash is the Windows insertion code, but I don't know what reference they use to get that number.
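To illustrate the difference, here is a small sketch that builds the string from the Unicode code directly. The class name `EmDashCodePoint` and the helper `fromCodePoint` are made up for this example:

```java
public class EmDashCodePoint {
    // Hypothetical helper: converts a Unicode code point to a String.
    public static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String[][] table = new String[1][2];
        table[0][0] = fromCodePoint(8212);         // U+2014, the em dash
        table[0][1] = String.valueOf((char) 151);  // U+0097, a control character

        // Compare code points instead of printing glyphs, since the
        // console encoding may not support the em dash.
        System.out.println((int) table[0][0].charAt(0)); // prints 8212
        System.out.println((int) table[0][1].charAt(0)); // prints 151
    }
}
```

Writing the literal `"\u2014"` in the source should give the same string as `fromCodePoint(8212)`.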
 
Gihan Pandigamage
Ranch Hand
Jesper de Jong wrote: 151 is not a valid code for an ASCII character; ASCII characters have codes between 0 and 127.

What your code does is take Unicode character U+0097 (97 hex = 151 decimal) and convert that to a string. I've looked up Unicode character U+0097 on www.unicode.org and it's some sort of control character, so it's not surprising that it is shown as "?". It's certainly not an "emdash" character.

I looked up the Windows-1252 encoding and indeed, in that encoding the character 0x97 looks like a long dash. But Windows-1252 is not ASCII and not Unicode.

If you want to specify characters according to the Windows-1252 encoding, you'll need to decode from that encoding, instead of directly interpreting the characters as Unicode characters. Something like this should work, although I haven't tested it:




Thanks Jesper, it's working. Bravo! I spent half a day on this. Thanks again.
 
Gihan Pandigamage
Ranch Hand
Steve Luke wrote: Character codes in Java are Unicode codes, not ASCII codes. The Unicode code for the em dash is 8212 in decimal (U+2014), I think. There isn't an ASCII code for the em dash as far as I could find... The only association between 151 and the em dash is the Windows insertion code, but I don't know what reference they use to get that number.


Thanks Steve, problem solved. Jesper's code snippet is working.
 