difference between #PCDATA and CDATA?

Vivek Saxena
Ranch Hand

Joined: Apr 24, 2002
Posts: 58
Hi,
I am new to XML and its related technologies, and I have started preparing for the XML certification.
I have the following questions. Please help me.
Q#1
What is the difference between #PCDATA and CDATA? What does the parser have to do with them?

Q#2
Why can't I use the following?
<!ELEMENT test (#CDATA)>
and
<!ATTLIST test name #PCDATA #REQUIRED>
Any help would be appreciated.
Thanks
vivek
[ January 26, 2003: Message edited by: Vivek Saxena ]
Roseanne Zhang
Ranch Hand

Joined: Nov 14, 2000
Posts: 1953
Go to http://www.w3.org/TR/REC-xml
Search for the terms you have questions about. You will not only find your answer, but also learn far more than that.
If it is a little over your head, buy or find a book, or look for free books/chapters online.
If you give me a fish, you feed me for a day. If you teach me how to fish, you feed me for life.
Roseanne Zhang
Ranch Hand

Joined: Nov 14, 2000
Posts: 1953
Sorry, the best place to start:
http://www.w3schools.com/default.asp
The place where I started.
Mohan Panigrahi
Ranch Hand

Joined: Sep 28, 2001
Posts: 142
Hi Vivek,
#PCDATA for elements = CDATA for attributes.
When you specify #PCDATA for your element or CDATA for your attribute, it means that you cannot put any markup inside the element content or the attribute value, respectively.
Going by this, your second question is also answered.
Thanks
Mohan
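To make the answer to Q#2 concrete, here is a minimal sketch that feeds the declarations from the first post to a parser. It assumes Python's standard xml.etree.ElementTree, which is built on a non-validating parser: it checks only the syntax of the declarations, not whether the document obeys them. The helper name declarations_parse and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET


def declarations_parse(internal_subset: str) -> bool:
    """Return True if a document carrying these DTD declarations parses."""
    doc = f'<!DOCTYPE test [{internal_subset}]><test name="x">hello</test>'
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# #PCDATA belongs in an element's content model, CDATA in an attribute type.
valid = '<!ELEMENT test (#PCDATA)> <!ATTLIST test name CDATA #REQUIRED>'
# The two declarations from Q#2 swap the keywords.
bad_element = '<!ELEMENT test (#CDATA)>'
bad_attlist = '<!ATTLIST test name #PCDATA #REQUIRED>'

for label, decls in [("valid", valid),
                     ("bad element", bad_element),
                     ("bad attlist", bad_attlist)]:
    print(label, declarations_parse(decls))
```

The swapped forms are rejected as syntax errors because the DTD grammar has no #CDATA keyword for content models and does not list #PCDATA among the attribute types.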
Vivek Saxena
Ranch Hand

Joined: Apr 24, 2002
Posts: 58
HI,
I was looking at the tutorial at http://www.w3schools.com/dtd/dtd_building.asp and I found the following definitions for PCDATA and CDATA. I really couldn't understand the technicalities of the statements with respect to the parser.
For PCDATA
PCDATA is text that will be parsed by a parser. Tags inside the text will be treated as markup and entities will be expanded.
For CDATA
CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.

Could someone please explain this to me? I would really appreciate it!
Thanks
vivek
[ January 27, 2003: Message edited by: Vivek Saxena ]
Roseanne Zhang
Ranch Hand

Joined: Nov 14, 2000
Posts: 1953
<h1>Title</h1>
<h1> is a tag here and will be treated as markup. However, if the same thing is put inside a CDATA section, then it is no longer a tag or markup; it is just text.
<![CDATA[
<h1>Title</h1>
]]>
My suggestion: read their HTML/XHTML tutorials before the DTD/XML ones.
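The same contrast can be checked with a parser. This is only a sketch, assuming Python's standard xml.etree.ElementTree; the wrapper element doc and the trailing "&amp; more" text are added purely for illustration.

```python
import xml.etree.ElementTree as ET

# Parsed character data: <h1> becomes a child element and &amp; is expanded.
parsed = ET.fromstring("<doc><h1>Title</h1> &amp; more</doc>")

# CDATA section: the same characters are passed through as plain text.
literal = ET.fromstring("<doc><![CDATA[<h1>Title</h1> &amp; more]]></doc>")

print(len(parsed), parsed[0].tag, repr(parsed[0].tail))
print(len(literal), repr(literal.text))
```

In the first document the parser reports one child element named h1 followed by the text " & more". In the second there are no child elements at all, and the text still contains the literal characters <h1> and &amp;.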
Vivek Saxena
Ranch Hand

Joined: Apr 24, 2002
Posts: 58
Roseanne Zhang,
Thanks for your kind advice. I will give it some thought. I think I either didn't explain my problem properly or it was misunderstood.
I am trying to find out the following:
What exactly does #PCDATA mean when we define element content as #PCDATA?
<!ELEMENT name (#PCDATA)>
And what exactly does CDATA mean when we define an attribute type as CDATA?
<!ATTLIST name lastname CDATA "saxena">
I understand that both represent a string or text, but there are some differences that I don't understand. I was reading a book and I found the following definitions:

“Keyword PCDATA specifies that the element must contain parsable character data – that is, any text except the characters less-than (<), greater-than (>), ampersand (&), quote (') and double quote (").”
and
“Attribute types are classified as either strings (CDATA), tokenized or enumerated. String (CDATA) attribute types do not impose any constraint on attribute values – other than disallowing the <, >, &, ' and " characters. Entity references must be used for these characters.”
So the one thing I understand for sure is that #PCDATA (as element content) imposes some constraints that CDATA (as an attribute type) does not, beyond disallowing the <, >, &, ' and " characters.
I need help identifying those constraints. Maybe I am too dumb to understand, and maybe that is why I am looking for help.
One more thing I found:
“The CDATA keyword in an attribute declaration has a different meaning than the CDATA section in an XML document. In a CDATA section all characters are legal (including the <, >, &, ' and " characters) except the “]]>” end tag.”
So what you explained to me has nothing to do with my problem. That much I understand.
Looking forward to get some light here.
Thanks
vivek
[ January 28, 2003: Message edited by: Vivek Saxena ]
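One way to test the quoted character rules is to hand small samples to a parser and see which ones it accepts. The sketch below assumes Python's standard xml.etree.ElementTree (a non-validating parser that only checks well-formedness); the helper name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = [
    # Element content (#PCDATA)
    "<p>a > b 'c' \"d\"</p>",       # literal >, ' and "
    "<p>a < b</p>",                 # literal <
    "<p>a & b</p>",                 # literal &
    "<p>a &lt; b &amp; c</p>",      # escaped < and &
    # Attribute values (CDATA type)
    '<n a="x > y"/>',               # literal >
    '<n a="it\'s"/>',               # the other kind of quote
    '<n a="x < y"/>',               # literal <
    '<n a="x & y"/>',               # literal &
    '<n a="say "hi""/>',            # the delimiting quote itself
    '<n a="say &quot;hi&quot;"/>',  # escaped delimiting quote
]

for text in samples:
    print(is_well_formed(text), text)
```

With this parser, only a literal < or & is rejected in either place, plus the quote character that delimits an attribute value. The book's lists therefore look stricter than what well-formedness requires: >, ' and " are accepted in element content, and > and the non-delimiting quote are accepted in attribute values.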
Roseanne Zhang
Ranch Hand

Joined: Nov 14, 2000
Posts: 1953
You have found excellent quotes; I could not explain it better than they do. However, I can reinforce them:
Originally posted by vivek:
CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.
Keyword PCDATA specifies that the element must contain parsable character data – that is, any text except the characters less-than ( < ), greater-than ( > ), ampersand ( & ), quote ( ' ) and double quote ( " ).
Attribute types are classified as either strings (CDATA), tokenized or enumerated. String (CDATA) attribute types do not impose any constraint on attribute values – other than disallowing the <, >, &, ' and " characters. Entity references must be used for these characters.

Good summary! It helped me. It will help others who have similar problems. Thanks!