SAX Parser Issue

 
Rancher
Posts: 1776
Hi,
I am reading an XML file using XMLReader and splitting it into smaller XML files. I read the XML with UTF-8 encoding, like below.
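The snippet for this step is not shown in the post. As a rough sketch only, and not the poster's actual code, a UTF-8 read through an XMLReader might look like the following. The class name `Utf8ReadSketch` and the method `readText` are hypothetical, and the sketch assumes the input arrives as a byte stream.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: names are illustrative, not from the original post.
class Utf8ReadSketch {
    // Parses UTF-8 encoded XML bytes and returns the collected character data.
    static String readText(byte[] xmlBytes) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        StringBuilder text = new StringBuilder();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        InputSource source = new InputSource(new ByteArrayInputStream(xmlBytes));
        // Tell the parser how to decode the byte stream.
        source.setEncoding("UTF-8");
        reader.parse(source);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // \u2014 is the em dash, written as an escape so the source file encoding does not matter.
        byte[] xml = "<r>A \u2014 B</r>".getBytes(StandardCharsets.UTF_8);
        System.out.println(readText(xml).equals("A \u2014 B"));
    }
}
```

Note that setting the encoding on the InputSource only controls how the input is decoded. It has no effect on how the output files or console text are encoded.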

When the parser encounters an entity like — or ©, I use the endEntity method like below to output it to the buffer.

My issue is that when an entity like — is encountered, the parser writes extra unknown symbols into the output XML, like below.

A Delayed Fix ——for the Marriage Penalty ——
and for the copy entity the output is "©Â©"


The above issue arises only if I run the parser application from Cygwin shell scripts on my local desktop (or similar sh scripts in Production). The issue does not occur on my local desktop when I run the parser as a standalone application from Eclipse.
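One possible explanation, offered as an assumption rather than a diagnosis: the JVM's default charset may differ between the Eclipse launch and the shell script launch, so UTF-8 bytes end up decoded or printed with another charset. The sketch below shows how that kind of mismatch produces symbols like the "Â©" above, and how writing with an explicit charset avoids depending on the default. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: names are illustrative, not from the original post.
class CharsetMismatchSketch {
    // Encodes text as UTF-8, then decodes those bytes with a different charset.
    static String misdecode(String text, Charset wrong) {
        return new String(text.getBytes(StandardCharsets.UTF_8), wrong);
    }

    // Writes text with an explicit UTF-8 writer instead of the platform default.
    static byte[] writeUtf8(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Compare this value between the Eclipse run and the shell script run.
        System.out.println("default charset: " + Charset.defaultCharset());

        String copy = "\u00A9"; // the copyright sign
        String garbled = misdecode(copy, StandardCharsets.ISO_8859_1);
        // The two UTF-8 bytes C2 A9 read as ISO-8859-1 become U+00C2 U+00A9.
        System.out.println(garbled.equals("\u00C2\u00A9"));
        System.out.println(writeUtf8(copy).length); // two bytes in UTF-8
    }
}
```

If the two environments report different default charsets, passing an explicit charset to every reader and writer (or launching the script with the same `-Dfile.encoding` value Eclipse uses) would be the first thing to try.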
I have narrowed the issue down to the characters method of the SAX parser, shown below.

The println calls show an extra character in the chars[] array, as below:

Chars -> ùfor the Marriage Penalty
Chara after -> ùfor the Marriage Penalty
ΓÇöfor the Marriage Penalty
Chars -> ùfor the Marriage Penalty
Chara after -> ù
ΓÇö


My question is: since I already output the entity in the endEntity() method, can I stop it from being printed again by the SAX parser's characters() method?
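A minimal sketch of one way this could be done, assuming the parser reports entity boundaries through LexicalHandler (SAX makes this optional): keep a counter that is raised in startEntity() and lowered in endEntity(), and have characters() skip its output while the counter is non-zero. The class name `EntityEchoSketch`, the method `rewrite`, and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical sketch: names are illustrative, not from the original post.
class EntityEchoSketch {
    // Copies character data to a buffer, writing entity references back as
    // "&name;" and suppressing the expanded text the parser reports for them.
    static String rewrite(String xml) throws Exception {
        StringBuilder out = new StringBuilder();
        DefaultHandler2 handler = new DefaultHandler2() {
            private int entityDepth = 0; // greater than 0 while inside an entity

            @Override
            public void startEntity(String name) {
                if (isGeneralEntity(name)) entityDepth++;
            }

            @Override
            public void endEntity(String name) {
                if (!isGeneralEntity(name)) return;
                entityDepth--;
                // Emit the reference once, when the outermost entity ends.
                if (entityDepth == 0) out.append('&').append(name).append(';');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Skip the replacement text that arrives between start and end.
                if (entityDepth == 0) out.append(ch, start, length);
            }

            // Parameter entities start with '%'; "[dtd]" marks the external DTD subset.
            private boolean isGeneralEntity(String name) {
                return !name.startsWith("%") && !name.equals("[dtd]");
            }
        };
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        // startEntity/endEntity are only delivered if a LexicalHandler is registered.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE r [<!ENTITY mdash \"&#8212;\">]><r>A&mdash;B</r>";
        System.out.println(rewrite(xml));
    }
}
```

The counter rather than a boolean is there because an entity's replacement text can itself reference another entity. This only removes the duplicate text; if the extra symbols come from a charset mismatch, that needs to be fixed separately.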
 
John Jai
Rancher
Posts: 1776

John Jai wrote: Hi,
When the parser encounters an entity like — or ©, I use the endEntity method like below to output it to the buffer.


Actually, the entities I specified in the above post have been changed. They were & mdash ; and & copy ; (without spaces).
 