Is the XML declaration node mandatory?

I had a discussion with a colleague of mine about the XML declaration node (I'm talking about this => <?xml version="1.0" encoding="UTF-8"?> ).

I believe that for something to be called "valid XML", it requires an XML declaration node.

My colleague states that the XML declaration node is optional, since the default encoding is UTF-8 and the version is always 1.0. This makes sense, but what does the standard say?

In short, given the following file:

<books>
  <book id="1"><title>Title</title></book>
</books>

Can we say that:

  • It is valid XML?
  • It is a valid XML node?
  • It is a valid XML document?

Thank you very much.


    This:

    <?xml version="1.0" encoding="UTF-8"?>
    

    is not a processing instruction - it is the XML declaration. Its purpose is to configure the XML parser correctly before it starts reading the rest of the document.

    It looks like a processing instruction, but unlike a real processing instruction it will not be part of the DOM the parser creates.
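A minimal sketch of that difference, using Python's standard `xml.dom.minidom`. The sample document, the `style.xsl` name and the helper `top_level_node_names` are made up for illustration; the point is only that the real processing instruction shows up as a node in the DOM while the XML declaration does not.

```python
from xml.dom import minidom

# Hypothetical sample: an XML declaration, a real processing instruction,
# then the root element.
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<?xml-stylesheet type="text/xsl" href="style.xsl"?>'
    b'<books/>'
)


def top_level_node_names(xml_bytes):
    """Return the node names of the document's direct children."""
    document = minidom.parseString(xml_bytes)
    return [node.nodeName for node in document.childNodes]


# The processing instruction and the root element are listed;
# the XML declaration is consumed by the parser and is not a node.
print(top_level_node_names(SAMPLE))  # ['xml-stylesheet', 'books']
```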

    It is not necessary for "valid" XML. "Valid" means "represents a well-defined document type, as described in a DTD or a schema". Without a schema or DTD the word "valid" has no meaning.

    Many people mis-use "valid" when they really mean "well-formed". A well-formed XML document is one that obeys the basic syntax rules of XML.

    There is no XML declaration necessary for a document to be well-formed, either, since there are defaults for both version and encoding (1.0 and UTF-8/UTF-16, respectively). If a Unicode BOM (Byte Order Mark) is present in the file, it determines the encoding. If there is no BOM and no XML declaration, UTF-8 is assumed.
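As a rough illustration of these defaults, the sketch below feeds the same declaration-free document to Python's standard `xml.etree.ElementTree` twice: once as plain UTF-8 bytes and once as UTF-16 bytes with a BOM. The sample text and the helper `parse_title` are assumptions made for the example, and the behaviour shown is that of the Expat-based parser bundled with Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with a non-ASCII character and no XML declaration.
TEXT = "<title>Caf\u00e9</title>"


def parse_title(raw_bytes):
    """Parse raw bytes and return the text of the root element."""
    return ET.fromstring(raw_bytes).text


utf8_bytes = TEXT.encode("utf-8")    # no BOM, no declaration: read as UTF-8
utf16_bytes = TEXT.encode("utf-16")  # Python's "utf-16" codec prepends a BOM

print(parse_title(utf8_bytes))
print(parse_title(utf16_bytes))
```

Both calls should yield the same text, because the parser works out the encoding from the bytes themselves when no declaration is present.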

    Here is a canonical thread on how encoding declaration and detection work in XML files: "How default is the default encoding (UTF-8) in the XML Declaration?"


    To your questions:

  • It is valid XML?
    This cannot be answered without a DTD or a schema. It is well-formed, though.
  • It is a valid XML node?
    A node is a concept that is related to an in-memory representation of a document (a DOM). This snippet can be parsed into a node, since it is well-formed.
  • It is a valid XML document?
    Same as for the first question: validity cannot be judged without a DTD or a schema, but the document is well-formed.

    You are confusing a few XML concepts here (not to worry, this confusion is common and stems partly from the fact that the concepts overlap and the names are misused rather often).

  • It all starts with structured data consisting of names, values and attributes that is organized as a tree.
  • XML means, most basically, a syntax to represent this structured data in textual form (it's a "Markup Language"). It is what you get when you serialize the tree into a string of characters and it can be used to de-serialize a string of characters into a tree again.
  • Document usually refers to a string of characters that represent a serialized tree. It can be stored in a file, sent over the network or created in-memory.
  • The rules of serialization and de-serialization are very strictly defined. A document (a "string of characters") that can successfully be de-serialized into a tree is said to be well-formed.
  • The semantics of such a tree (allowed elements, element count and order, namespaces, any number of complex rules, really) can be defined in what is called a DTD or a schema. If a tree obeys a certain set of well-defined semantics, it is said to be valid.
  • The term Document Object Model (DOM) refers to the standardized in-memory representation of structured data. It is the name of a well-defined API to access this tree with standardized methods.
  • A node is the basic data structure of a Document Object Model.
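The well-formedness part of this list can be checked mechanically. Here is a minimal sketch with Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the two sample strings are invented for the example. Note that this parser does not validate against a DTD or schema, so it says nothing about validity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text can be de-serialized into a tree."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Neither sample has an XML declaration.
GOOD = '<books><book id="1"><title>Title</title></book></books>'
BAD = '<books><book id="1"><title>Title</title></book></book>'  # mismatched end tag

print(is_well_formed(GOOD))  # True
print(is_well_formed(BAD))   # False
```

Deciding whether a well-formed document is also valid would need a validating parser plus the DTD or schema that describes the document type.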

    According to the Extensible Markup Language (XML) 1.0 (Fifth Edition), W3C Recommendation 26 November 2008, section http://www.w3.org/TR/2008/REC-xml-20081126/#sec-prolog-dtd:
    without an XML declaration, it is not valid (even though it is well-formed and complete).


    The specification states:

    Definition: XML documents SHOULD begin with an XML declaration which specifies the version of XML being used.

    Also, for a document to be valid, it should have a document type declaration associated with it. The snippet you show here seems to be a well-formed node, but in no way a valid document.
