ASCII - Space characters

 
Ranch Hand
Posts: 209
Hi,

There are two space characters, 160 (non-breaking space) and 32 (space). Both represent a space, but they are distinct characters.

Is there any library that can convert 160 (non-breaking space) characters into 32 (space) characters? This is for string comparison purposes.

Thanks in Advance,

Niall
 
Saloon Keeper
Posts: 23871
Probably the easiest way to do this is to use String's replace() method to replace all nbsp characters with space characters before comparing.
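A minimal sketch of that suggestion, assuming plain Java strings. The class name `NbspReplace` and the helper `normalizeSpaces` are made up for illustration; only `String.replace(char, char)` and `String.equals` come from the standard library.

```java
class NbspReplace {
    // Hypothetical helper: swaps every no-break space (U+00A0, decimal 160)
    // for an ordinary space (U+0020, decimal 32).
    static String normalizeSpaces(String s) {
        return s.replace('\u00A0', ' ');
    }

    public static void main(String[] args) {
        String fromWeb = "hello\u00A0world"; // contains character 160
        String typed = "hello world";        // contains character 32

        // Direct comparison fails because the two space characters differ.
        System.out.println(fromWeb.equals(typed));

        // Comparison succeeds once both sides are normalized.
        System.out.println(normalizeSpaces(fromWeb).equals(normalizeSpaces(typed)));
    }
}
```

Normalizing both sides before the comparison keeps the original strings untouched, which matters if the non-breaking spaces are still wanted for display.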
 
Bartender
Posts: 10780

Niall Loughnane wrote:is there any library that can be used to convert 160 (non breaking space character) into 32 (space) characters - this is for string comparison purposes,


Erm?
Don't look for complexity where none exists.

Winston
 
Marshal
Posts: 26690

Niall Loughnane wrote:there are space characters 160 (non breaking space character) and 32 (space) - these represent a space characters but are distinct different individual characters,



That isn't correct. ASCII only defines characters in the range from 0 to 127. Now, Unicode does declare 160 as a non-breaking space character, but then it declares a whole lot of other characters as space characters as well. Here is a document which lists twenty of them but there could be others. As far as I can see the canonical Unicode normalization forms (NFC and NFD) leave the non-breaking space alone, although the compatibility forms (NFKC and NFKD) do map it, and several of the other space characters, to the ordinary space. And speaking of normalization, have you built that into your specialized string comparison?
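If more than character 160 needs to be handled, one possible approach is to treat anything Java classifies as a Unicode space character as a plain space. The sketch below assumes that is acceptable for the comparison in question. The class `UnicodeSpaces` and the helper `foldSpaces` are hypothetical names; `Character.isSpaceChar` and `java.text.Normalizer` are standard library. For comparison, it also runs character 160 through NFC and NFKC normalization.

```java
import java.text.Normalizer;

class UnicodeSpaces {
    // Hypothetical helper: replaces every character for which
    // Character.isSpaceChar is true (Unicode space, line and paragraph
    // separators) with the ordinary space, U+0020.
    static String foldSpaces(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            sb.append(Character.isSpaceChar(c) ? ' ' : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String nbsp = "a\u00A0b"; // no-break space (160)
        String em = "a\u2003b";   // em space, another Unicode space character

        System.out.println(foldSpaces(nbsp).equals("a b"));
        System.out.println(foldSpaces(em).equals("a b"));

        // Canonical normalization (NFC) keeps character 160 as it is;
        // compatibility normalization (NFKC) maps it to character 32.
        System.out.println(Normalizer.normalize(nbsp, Normalizer.Form.NFC).equals("a b"));
        System.out.println(Normalizer.normalize(nbsp, Normalizer.Form.NFKC).equals("a b"));
    }
}
```

Note that `foldSpaces` also turns line and paragraph separators into spaces, and that NFKC changes many characters besides spaces, so either choice may be broader than a given comparison wants.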
 
Tim Holloway
Saloon Keeper
Posts: 23871
Paul is correct. The original American Standard Code for Information Interchange is a 7-bit code. The 8th bit was reserved as a parity bit for devices such as Teletype™ machines. The classic old modem settings reflect that: "8N1" indicates 8 data bits, no parity bit and 1 stop bit (2 stop bits were needed for some slower devices), while "7E2" would be 7 data bits with even parity and 2 stop bits.

When the IBM PC became popular, a new de facto standard took hold: 8-bit "extended ASCII" character sets, such as the PC's code page 437, which assigned uses to the characters with the 8th bit set. These covered graphics, accented text (including umlauts, etc.) and the non-breaking space for typesetting.
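To make the 7-bit versus 8-bit distinction concrete, here is a small sketch. It assumes ISO 8859-1 (Latin-1) as one example of an 8-bit character set that places the non-breaking space at 160. The class `SpaceBytes` and the helper `isAscii` are illustrative names only.

```java
import java.nio.charset.StandardCharsets;

class SpaceBytes {
    // Hypothetical helper: true if every char fits in the 7-bit ASCII range 0..127.
    static boolean isAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 127) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String space = " ";     // character 32, inside the ASCII range
        String nbsp = "\u00A0"; // character 160, outside the ASCII range

        System.out.println(isAscii(space));
        System.out.println(isAscii(nbsp));

        // In ISO 8859-1 the non-breaking space is a single byte with the 8th bit set.
        byte[] latin1 = nbsp.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(latin1.length + " byte, value " + (latin1[0] & 0xFF));
    }
}
```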
 