XML One liners for the exam....

Madhav Lakkapragada
Ranch Hand

Joined: Jun 03, 2000
Posts: 5040
This is in line with the tradition followed in other Cert forums: an attempt to have a collective thread that contains the very important stuff we must remember for the exam. I am personally quoting from the Specs. Feel free to correct any (unintentional) mistakes. It's possible that this has been done in the past, but spelling them out again helps me, so I am starting them again.
I request that separate discussion threads be used to discuss any specific topics.

  • Parsed data is made up of character data and markup.
  • No case folding is performed to match strings.
  • XML Document has both logical and physical structure.
  • Physically, the document is composed of units called entities.
  • Logically it is composed of declarations, elements, comments, character references and processing instructions (PIs). All of these are indicated by explicit markup.
  • There is exactly one element, called the root, or document element, no part of which appears in the content of any other element.
  • The elements, delimited by start and end tags, must nest properly within each other.
  • All XML Processors must accept UTF-8 and UTF-16 encodings.
  • A name is a token beginning with a letter, '_' or ':' and continuing with letters, digits, hyphens, underscores, colons or full stops.
  • Names beginning with 'xml' (any combination of case) are reserved.
  • Comments are not part of the document's character data.
  • The string '--' must not occur within comments.
  • Comments do not nest.
  • Parameter Entity (PE) ref.'s are not recognized within comments.
  • PI's allow documents to contain instructions for applications.
  • PI's are not part of document's character data.
  • PI's begin with a targetName.
  • Target names must not start with 'xml' (any case).
  • Parameter Entity (PE) ref.'s are not recognized within PI's.
  • CDATA sections may occur anywhere character data may occur. They are used to escape text containing chars which would otherwise be recognized as markup.
  • CDATA sections begin with <![CDATA[ and end with the CDEnd markup ]]>.
  • The only markup recognized inside a CDATA section is the CDEnd markup, i.e. ]]>.
  • CDATA sections do not nest.
  • A DTD is used to define constraints on the logical structure and to support predefined storage units.
  • The DTD must appear before the first element in the document.
  • A DTD contains or points to markup declarations that provide a grammar for a class of documents.
  • A DTD can point to an external subset, contain an internal subset, or both.
  • A special attribute xml:lang may be inserted in documents to specify the language used in the contents and attribute values of an element in an XML document.
  • The xml:lang attribute value is considered to apply to all attributes and content of the element where it is specified unless overridden.
  • Each element has a type identified by name.
  • Each attribute specification has a name and a value.
  • The order of attribute specification in the start tag or empty-element tag is not significant.
  • No attribute name may appear more than once in the same start-tag or empty element tag.
  • To be valid an attribute must have been declared and the value must be of the type declared.
  • Attribute values cannot contain direct or indirect references to external entities. (WFC)
  • The replacement text of any entity referred to directly or indirectly in an attribute value must not contain a '<'.
  • The text between the start-tag and the end-tag is called the element's content.
  • An element with no content is called an empty element.
  • An element type declaration constrains the element's content.
  • No element type may be declared more than once. (VC)
  • The optional character following the name or list governs whether the element or the content particles in the list may occur one or more (+), zero or more (*), or zero or one (?) times.
  • The absence of such an operator means that the element or content particle must appear exactly once.
  • An element has mixed content when elements of that type may contain character data, optionally interspersed with child elements.
  • ...more!
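Several of the one-liners above are well-formedness rules, so any conforming parser has to reject documents that break them. Here is a minimal sketch in Python, assuming the standard library's expat-based, non-validating parser; the sample documents and the helper name is_well_formed are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the stdlib (non-validating) parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, one per rule from the list above.
SAMPLES = {
    "properly nested": "<a><b>text</b></a>",
    "end tag differs in case": "<a><b>text</B></a>",
    "overlapping elements": "<a><b></a></b>",
    "two root elements": "<a/><b/>",
    "'--' inside a comment": "<a><!-- one -- two --></a>",
    "attribute name repeated": '<a x="1" x="2"/>',
    "'<' in an attribute value": '<a x="1 < 2"/>',
    "markup chars inside CDATA": "<a><![CDATA[<b> & </c>]]></a>",
}

results = {label: is_well_formed(text) for label, text in SAMPLES.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")

# Markup characters inside a CDATA section come back as plain character data.
cdata_text = ET.fromstring(SAMPLES["markup chars inside CDATA"]).text
```

Only the first and the last sample should be accepted. Validity constraints (the ones marked VC) are a different matter: a non-validating parser like this one does not check them.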

  • regds.
    - madhav
    ps: All copyrights belong to respective owners.
    [ February 09, 2003: Message edited by: Madhav Lakkapragada ]

    Take a Minute, Donate an Hour, Change a Life
    http://www.ashanet.org/workanhour/2006/?r=Javaranch_ML&a=81
    Madhav Lakkapragada
    Ranch Hand

    Joined: Jun 03, 2000
    Posts: 5040
  • Attributes may appear only within start-tags and empty-element tags.
  • Attributes are used to associate name-value pairs with elements.
  • When more than one definition is provided for the same attribute of a given element type, the first declaration is binding and later declarations are ignored.
  • XML Attributes are of three kinds: a string type; a set of tokenized types, and enumerated types.
  • ID values must uniquely identify the elements which bear them.
  • No element type may have more than one ID attribute specified. (VC)
  • IDREF values must match the value of some ID attribute. (VC)
  • No Element Type may have more than ONE NOTATION attribute specified. (VC)
  • An attribute of type NOTATION must not be declared on an element declared EMPTY.
  • With the keyword INCLUDE, the contents of a conditional section are part of the DTD.
  • With the keyword IGNORE, the contents of the conditional section are not logically part of the DTD.
  • General entities are entities for use within the document content.
  • A Parsed entity's contents are referred to as its replacement text and this text is considered an integral part of the document.
  • An unparsed entity is a resource whose contents may or may not be text.
  • Parameter entities are parsed entities for use in the DTD.
  • A parameter entity (PE) and a general entity with the same name are two distinct entities.
  • An entity reference refers to the content of a named entity.
  • References to parsed general entities use & and ; as delimiters; PE references use % and ;.
  • The declaration of a PE must precede any reference to it.
  • Unparsed entities may be referred to only in attribute values declared to be of type ENTITY or ENTITIES.
  • A parsed entity cannot contain a recursive reference to itself either directly or indirectly.
  • An internal entity is a parsed entity.
  • The literal entity value in an internal entity declaration may contain character, PE and general entity references.
  • The replacement text of such an internal entity contains the replacement text of any PEs referred to and the characters referred to by any character references; however, general-entity references must be left as-is, unexpanded.
  • Those general-entity references are expanded when a reference to the internal entity appears in the document content or in an attribute value.
  • Validating and non-validating processors must report violations of the XML Spec's well-formedness constraints in the content of the document entity and any other parsed entities that they read.
  • Validating processors must, at user option, report violations of constraints expressed by declarations in the DTD.
  • Non-validating processors are required to check only the document entity, including the entire internal DTD subset, for well-formedness.
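The entity rules above can be tried out with a small internal DTD subset. This is a rough sketch, assuming Python's standard-library parser (non-validating, but it does read the internal subset); the entity names who and greeting and the helper parses are invented for the example.

```python
import xml.etree.ElementTree as ET

# Two internal (hence parsed) general entities. The literal value of
# "greeting" contains a general-entity reference, which is only expanded
# when &greeting; is itself referenced in content or an attribute value.
DOC = """<!DOCTYPE note [
  <!ENTITY who "moose">
  <!ENTITY greeting "Hello, &who;!">
]>
<note to="&who;">&greeting;</note>"""

note = ET.fromstring(DOC)
content = note.text        # replacement text used in element content
to_value = note.get("to")  # replacement text used in an attribute value
print(content, "/", to_value)


def parses(text):
    """Return True if the parser accepts the document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A parsed entity must not refer to itself, directly or indirectly.
RECURSIVE = """<!DOCTYPE r [
  <!ENTITY a "&b;">
  <!ENTITY b "&a;">
]>
<r>&a;</r>"""

recursive_ok = parses(RECURSIVE)
undeclared_ok = parses("<r>&nope;</r>")  # reference to an undeclared entity
print(recursive_ok, undeclared_ok)
```

Both of the last two documents should be rejected as not well-formed, even though the parser never validates anything.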


  • There is much more.....but.......
    - madhav
    Vivek Saxena
    Ranch Hand

    Joined: Apr 24, 2002
    Posts: 58
    I would like to add my contribution here; some of them may be duplicates.
  • Using the same element name in more than one element type declaration is an error.
  • Having a root element whose name differs from the name given in the document type declaration is an error.
  • When declaring mixed content, not listing #PCDATA as the first item is an error.
  • Child elements of an element declared as type ANY must have their own element type declarations.
  • Using the same value for multiple ID attributes is an error.
  • An ID attribute value that does not begin with a letter, underscore (_) or colon (:) is an error.
  • Providing more than one attribute type for an element is an error.
  • Giving an attribute of type ENTITY a value that is not the name of an external unparsed entity is an error.
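Most of the errors listed above are validity errors, so a non-validating parser will not report them. Python's standard library has no validating parser, so the sketch below checks just one of the rules by hand: the root element name against the name in the document type declaration. The function name root_matches_doctype and the sample documents are assumptions for illustration only.

```python
from xml.dom import minidom


def root_matches_doctype(text):
    """Compare the DOCTYPE name with the root element name (checked by hand)."""
    doc = minidom.parseString(text)
    if doc.doctype is None:
        return True  # no document type declaration, nothing to compare
    return doc.doctype.name == doc.documentElement.tagName


GOOD = "<!DOCTYPE herd [<!ELEMENT herd (#PCDATA)>]><herd>moose</herd>"
BAD = "<!DOCTYPE herd [<!ELEMENT herd (#PCDATA)>]><flock>geese</flock>"

good_ok = root_matches_doctype(GOOD)
bad_ok = root_matches_doctype(BAD)
print(good_ok, bad_ok)
```

Note that the second document still parses without complaint, because it is well-formed; only the extra comparison reveals the validity problem.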

  • Thanks
    [ February 13, 2003: Message edited by: Vivek Saxena ]
    Axel Janssen
    Ranch Hand

    Joined: Jan 08, 2001
    Posts: 2164
    I am going to put my xml-schema explorations here:
    (good thing needs time)
    This will be heavily based on the tutorial on xfront.com by Costello. I found it the best resource on this topic, better than the Skonnard chapters, wrox.xml.pro2 or even the XML Schema Part 0 spec (which is second best).
    schema vs. dtd:
  • more datatypes: 44+ vs. 10, can create your own datatypes
  • xml syntax.
  • Object-oriented'ish (extend or restrict a type)
  • specify element content as being unique (keys on content) and uniqueness within a region
  • multiple elements with the same name, but with different content
  • define elements with nil content
  • define substitutable elements

  • what a schema can be used for, because it is XML:
    • validating XML documents
    • automatic GUI generation
    • Semantic Web???
    • smart editors
    • automatic API generation


  • elements:
    • An element declaration can have a type attribute or an anonymous, inlined complexType child element, but it cannot have both.
    • Facets: "and" or "or"? Patterns and enumerations are "or"ed together; all other facets are "and"ed together.
    • Elements with simple content can be declared using:
      • a built-in type
      • a named user-defined simple type
      • an anonymous, inlined user-defined simple type
    • Elements with child elements as content can be declared by:
      • defining the child elements inline
      • defining a named complex type and using it in the element
      • extending another complex type
      • restricting another complex type


  • annotations
    • The <annotation> element is used for documenting the schema: <documentation> is for humans, <appinfo> is for programs. <annotation> has no effect on schema validation.
    • global components: annotations may occur before and after any global component
    • non-global components: annotations may occur only at the beginning of non-global components
    • we can always put <annotation> right after the opening <xsd:element>, as in <xsd:element><xsd:annotation/></xsd:element>
    • attributes of the documentation element: source (URL for additional documentation) and xml:lang; appinfo element: only source


  • regular expressions, too much
  • overview
  • test applet

  • subclassing complexType definitions
    • derive by extension -> extend the parent complexType with more elements. Uses the <xsd:extension base="theBase"> element
    • derive by restriction -> create a type which is a subset of the base type. Uses the <xsd:restriction base="theBase"> element
    • with derive by restriction, all elements of the base type must be repeated, except of course when an element should be omitted!
    • with derive by restriction, the number of occurrences of an element can be changed (e.g. the author element should appear at most once, maxOccurs="1", instead of unbounded)
    • with derive by restriction, an element can only be omitted if its minOccurs in the base complexType is 0
    • derive by restriction makes sense in the context of type substitutability
    • derivations of a type can be prohibited with the attribute final=(#all|extension|restriction)
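To make the two derivation styles concrete, here is a small hypothetical schema (the type names Publication, Book and SingleAuthorPublication are invented) wrapped in a Python sketch. The standard library cannot validate against a schema, so the code only checks that the schema is well-formed XML and lists which type derives from which.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="Publication">
    <xsd:sequence>
      <xsd:element name="title" type="xsd:string"/>
      <xsd:element name="author" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="date" type="xsd:gYear" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="Book">
    <xsd:complexContent>
      <xsd:extension base="Publication">
        <xsd:sequence>
          <xsd:element name="isbn" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
  <xsd:complexType name="SingleAuthorPublication">
    <xsd:complexContent>
      <xsd:restriction base="Publication">
        <xsd:sequence>
          <xsd:element name="title" type="xsd:string"/>
          <xsd:element name="author" type="xsd:string"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>"""

schema = ET.fromstring(SCHEMA)

# Map each derived type to (derivation method, base type).
derivations = {}
for ctype in schema.findall(XSD + "complexType"):
    content = ctype.find(XSD + "complexContent")
    if content is None:
        continue  # not derived from another complex type
    derivation = content[0]  # the xsd:extension or xsd:restriction child
    method = derivation.tag.replace(XSD, "")
    derivations[ctype.get("name")] = (method, derivation.get("base"))

print(derivations)
```

The extension only lists the added isbn element. The restriction repeats title and author (now limited to one occurrence) and leaves out date, which is allowed because date has minOccurs="0" in the base type.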
    Terminology
  • Declaration vs. Definition
    • declared components have a representation in an XML instance document (like elements and attributes)
    • defined components have no representation in an XML instance document, just in the schema (like simple and complex types, attribute group definitions, model group definitions)
  • Global versus Local
    • global element declarations/type definitions are immediate children of <schema>
    • local element declarations/type definitions are nested within other elements/types
    • only global elements/types can be referenced (i.e., reused)


  • element substitution
    • substitutionGroup example: <xsd:element name="subway" type="xsd:string"/><xsd:element name="T" substitutionGroup="subway" type="xsd:string"/>. The first element is the head of the substitution group.
    • Note that existing elements do not have to be changed. Just a new element is added.
    • It's possible to use derived types(!) of the head element's type for the elements in the substitution group. If the type is the same as the head element's, the type attribute may be omitted. In the example above, type="xsd:string" can be omitted.
    • elements in the substitution group must be declared as global elements
    • element substitution can be blocked with the block="substitution" attribute. No error pops up when the substitution group is declared in the schema. It does pop up when you try to substitute in an instance document.
    • <out_of_topic>The WSAD.5-beta XML tools do not support substitutionGroup in code completion. The tool is useful.</out_of_topic>
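As a rough illustration of substitution groups, the sketch below reuses the subway/T example and adds a hypothetical transport element that references the head. Python's standard library does not validate schemas, so the code merely reads the substitutionGroup attributes and works out which element names could stand in for the head. It ignores block, abstract and chained groups, which a real schema processor would honour.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# "transport" is a made-up container element for this example.
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="subway" type="xsd:string"/>
  <xsd:element name="T" substitutionGroup="subway"/>
  <xsd:element name="transport">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="subway"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>"""

schema = ET.fromstring(SCHEMA)

# Collect the members of each substitution group, keyed by head element.
members = {}
for element in schema.findall(XSD + "element"):  # global elements only
    head = element.get("substitutionGroup")
    if head is not None:
        members.setdefault(head, []).append(element.get("name"))


def accepted_names(head):
    """Element names that may appear where the head element is referenced."""
    return [head] + members.get(head, [])


names = accepted_names("subway")
print(names)
```

With this schema, both <transport><subway>Red Line</subway></transport> and <transport><T>Red Line</T></transport> would be acceptable instance documents. T omits the type attribute and so takes the head's type.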


  • Attributes
    • attributes can only have simple types (derived or built-in). That's pretty clear: they can have no child elements.
    • attributes of the attribute element: name, type, use=(required|optional|prohibited), default or fixed.
    • the use attribute must be optional if we use the default attribute; default and fixed cannot both be present.
    • attributes can be inlined as local components inside an element declaration or separately defined in a global attributeGroup.
    • Attribute declarations always come last, after the element declarations.
    • Elements with simple content and attributes must be declared with a complex type.
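The last point is easiest to see in a schema fragment: an element with simple content plus an attribute needs a complexType with simpleContent. The price element and its attributes below are hypothetical, and the Python around it only parses the schema and lists the attribute declarations; it does not validate anything.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="price">
    <xsd:complexType>
      <xsd:simpleContent>
        <xsd:extension base="xsd:decimal">
          <xsd:attribute name="currency" type="xsd:string" use="required"/>
          <xsd:attribute name="taxed" type="xsd:boolean" default="true"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>"""

schema = ET.fromstring(SCHEMA)

# The simple content type is the base of the extension.
base = schema.find(f".//{XSD}extension").get("base")

# "use" defaults to optional when it is not written out.
uses = {
    attr.get("name"): attr.get("use", "optional")
    for attr in schema.iter(XSD + "attribute")
}
print(base, uses)
```

An instance such as <price currency="EUR">9.99</price> would fit this declaration. The taxed attribute has a default, so it leaves use at optional.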


  • group element
  • for grouping together element declarations, no attribute declarations
  • groups must be defined (or declared) as global components. Can be referenced by local elements.
  • syntax: <xsd:group (name|ref)/>


  • Creating Lists
  • use xsd:list type
  • number of items of list, datatype, data-range of list can be restricted (xsd:length value="?")
  • we cannot create a list of lists or a list of complexTypes
  • in instance document list-items are to be separated by whitespace
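A list type with a length facet might look like the hypothetical LotteryNumbers type below. Since Python's standard library has no schema validator, the check is emulated by hand: read the length facet from the schema, split the instance value on whitespace and test each item. The item check is a simplification of xsd:positiveInteger.

```python
import re
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="LotteryNumbers">
    <xsd:restriction>
      <xsd:simpleType>
        <xsd:list itemType="xsd:positiveInteger"/>
      </xsd:simpleType>
      <xsd:length value="6"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>"""

schema = ET.fromstring(SCHEMA)
required_length = int(schema.find(f".//{XSD}length").get("value"))


def looks_valid(value):
    """Hand-rolled emulation of the list type, not real schema validation."""
    items = value.split()  # list items are separated by whitespace
    if len(items) != required_length:
        return False
    # Simplified positiveInteger check: ASCII digits, value greater than zero.
    return all(re.fullmatch(r"[0-9]+", i) and int(i) > 0 for i in items)


six_ok = looks_valid("12 49 37 99 20 67")
two_ok = looks_valid("12 49")
zero_ok = looks_valid("12 49 37 99 20 0")
print(six_ok, two_ok, zero_ok)
```

For a list type, the length facet counts list items, not characters, which is why the second value fails.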


  • further details about elements
  • xsd:choice is an exclusive-or
  • elements with fixed/default parameter can be left empty. Validating parser will insert value.
  • xsd:all means that included elements can appear in any order.
  • maxOccurs value inside xsd:all elements must be "1". minOccurs can be "0" or "1"
  • If a complexType uses <all> and it extends another type, then that parent type must have empty content.
  • The <all> element cannot be nested within either <sequence>, <choice>, or another <all>
  • The contents of <all> must be just elements. It cannot contain <sequence> or <choice>
  • Union of simple types can be achieved with xsd:union
  • The <any> element enables the instance document author to extend document with elements not specified by the schema
  • The <anyAttribute> element enables the instance document author to extend document with attributes not specified by the schema.
  • schemas which contain <xsd:any> or <xsd:anyAttribute> are called extensible, others are called fixed.

  • Another term for extensible schemas is open content model. Extensibility adds flexibility, which can have good or bad effects. A schema can support a range of openness, from instance documents where new elements can be inserted anywhere (global openness) to instance documents where new elements can be inserted only at specific locations (localized openness). In a rapidly changing marketplace there may be an urgent need for open schemas: they let schema users add innovative elements to their instance documents, and those changes can be included in the next version of the schema.
  • xsd:any and xsd:anyAttribute have a namespace attribute that restricts the extension elements or attributes to certain namespaces. The default is ##any. Other possible values: ##other, or a list of namespace URIs, ##targetNamespace and ##local.


  • [ March 02, 2003: Message edited by: Axel Janssen ]