1. Document Type Definition (DTD)
A Document Type Definition (DTD) describes the objects in an XML document (such as elements, attributes and entities) and the relationships between them. It specifies a set of constraints and establishes which document trees are acceptable.
A DTD can be declared inside an XML document (i.e., inline), or referenced as an external file.
An inline DTD is wrapped in a DOCTYPE declaration, and has the following syntax:
    <!DOCTYPE root-element [
      declarations
    ]>
A DTD can also be stored in an external file. An XML document can reference an external DTD via the following syntax:
    <!DOCTYPE root-element SYSTEM "DTD-filename">
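As a rough illustration of the inline form, the sketch below parses a small document with Python's standard library. The document content (note, to, body) is made up for the example. Keep in mind that the standard-library parser is non-validating: it reads the DOCTYPE but does not enforce the declarations inside it.

```python
from xml.dom import minidom

# Hypothetical document: the element names are illustrative only.
INLINE_DTD_DOC = """<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note><to>Alice</to><body>Hello</body></note>"""

doc = minidom.parseString(INLINE_DTD_DOC)

# The name in the DOCTYPE is the root element the DTD applies to.
doctype_name = doc.doctype.name
root_name = doc.documentElement.tagName
print(doctype_name, root_name)
```

A validating parser would additionally reject a document whose root element or children do not match these declarations.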
DTD Syntax
A DTD has its own syntax, which differs from XML's. It consists of declarations (for elements, attributes and so on) such as:
- Document Type declaration:
    <!DOCTYPE ...>
- Element declaration:
    <!ELEMENT element-name (element-content)>
    <!-- OR -->
    <!ELEMENT element-name category>
  Examples:
    <!ELEMENT title (#PCDATA)>   <!-- contains parsed character data -->
    <!ELEMENT name (first_name, last_name)>   <!-- contains child elements -->
    <!ELEMENT person (title, name+, born, died?, nationality*)>
    <!-- no indicator (exactly one), ? (zero or one), + (one or more), * (zero or more) -->
    <!ELEMENT linebreak EMPTY>   <!-- an empty element -->
    <!ELEMENT message ANY>   <!-- any combination of declared elements and text -->
  Content:
  o #PCDATA (Parsed Character Data): text that will be examined for entity references and tags;
  o EMPTY: empty element (for leaf elements only);
  o ANY: unrestricted content;
  Occurrence indicators:
  o "+": one or more occurrences;
  o "*": zero or more occurrences;
  o "?": zero or exactly one occurrence;
  o No occurrence indicator: exactly one;
  Connectors:
  o ",": sequence; the child elements must appear in this order;
  o "|": choice; exactly one of the alternatives appears.
- Attribute declaration: the attributes allowed in an element's start-tag are declared in the DTD like:
    <!-- Declaring attributes in a DTD -->
    <!ATTLIST element-name
      attribute-1-name attribute-1-type default
      attribute-2-name attribute-2-type default
      ...
    >
    <!-- where default is one of: -->
    <!-- default-value | #REQUIRED | #IMPLIED | #FIXED value -->
  Examples:
    <!ATTLIST person ID CDATA #REQUIRED>
    <!ATTLIST trade action (buy|sell) #REQUIRED>   <!-- enumeration type -->
    <!ATTLIST person
      ID    CDATA #REQUIRED
      SSNUM CDATA #IMPLIED
    >
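  Attribute types:
  o CDATA (Character Data): a plain text string that is not parsed for tags (entity references in the value are still expanded).
  o ID: a unique identifier.
  o IDREF, IDREFS: reference(s) to an ID defined elsewhere in the document.
  o ENTITY, ENTITIES: name(s) of unparsed external entities.
  o NMTOKEN, NMTOKENS: name token(s), i.e. word(s) not containing spaces.
  o Enumeration: a list of NMTOKENs separated by "|".
  Default:
  o #REQUIRED: the attribute must be provided in the document.
  o #IMPLIED: the attribute is optional and the DTD supplies no default; the application decides what a missing value means.
  o #FIXED value: the attribute, if present, must have this value.
  o A literal default value: used when the attribute is omitted.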
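To see how the default keywords behave in practice, here is a minimal sketch using Python's standard expat bindings. The trades document and its currency, market and note attributes are invented for the illustration. expat is a non-validating parser, so it will not complain about a missing #REQUIRED attribute, but it does fill in literal and #FIXED defaults declared in the inline DTD.

```python
from xml.parsers import expat

# Hypothetical document; the currency, market and note attributes are made up.
TRADES_DOC = """<!DOCTYPE trades [
  <!ELEMENT trades (trade*)>
  <!ELEMENT trade EMPTY>
  <!ATTLIST trade
    action   (buy|sell) #REQUIRED
    currency CDATA      "USD"
    market   CDATA      #FIXED "NYSE"
    note     CDATA      #IMPLIED
  >
]>
<trades>
  <trade action="buy"/>
  <trade action="sell" currency="EUR"/>
</trades>"""


def trade_attributes(text):
    """Return the attributes the parser reports for each <trade> element."""
    reported = []

    def on_start(name, attrs):
        if name == "trade":
            reported.append(dict(attrs))

    parser = expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(text, True)
    return reported


trades = trade_attributes(TRADES_DOC)
for attrs in trades:
    print(attrs)
```

The first trade picks up the declared currency and market values even though it only spells out action, while the #IMPLIED note attribute simply stays absent.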
- Entity declaration: an "entity" is a variable that defines replacement text or special characters. An entity reference of the form &entity-name; is replaced by the value of the variable. Entities can be declared inline or as external:
    <!-- Inline entity declaration -->
    <!ENTITY entity-name "entity-value">
    <!-- External entity declaration -->
    <!ENTITY entity-name SYSTEM "url">
  Examples:
    <!ENTITY author "Antonio Andreo">   <!-- referenced in XML documents as &author; -->
    <!ENTITY mywebsite SYSTEM "http://www.google.com">
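The inline entity above can be tried out with a short sketch. It uses Python's standard ElementTree parser; the surrounding book and by elements are hypothetical. Only the inline form is covered here, since this parser does not load external entities.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper elements around the author entity from the text.
BOOK_DOC = """<!DOCTYPE book [
  <!ENTITY author "Antonio Andreo">
]>
<book><by>&author;</by></book>"""

root = ET.fromstring(BOOK_DOC)

# The parser substitutes the entity's replacement text for &author;.
written_by = root.findtext("by")
print(written_by)
```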
- Define a notation for an external entity:
    <!NOTATION ...>
Usage and Limitations of DTD
A DTD defines the structure of XML documents, which facilitates the exchange of documents between services. However, DTD has some limitations:
- DTD has its own syntax (inherited from SGML DTD) and requires a dedicated processing tool. It does not use XML syntax, so it cannot be handled like an ordinary XML document.
- DTD does not support object-oriented concepts such as hierarchies and inheritance.
- DTD's only data type is the text string; it does not support other data types such as numbers or dates.
- DTD does not support namespaces.
- DTD's occurrence indicators are limited to 0, 1 and many; they cannot express a specific number such as 8.
2. XML Schema Definition (XSD)
XML Schema, published by the W3C as a Recommendation in May 2001, is a description language that defines the structure and content types of an XML document. It overcomes the limitations of DTD and is meant to replace it for checking the validity of XML documents. In brief, an XML Schema:
- is a well-formed XML document, which uses XML syntax;
- is object-oriented and supports concepts like inheritance;
- supports namespaces;
- supports more data types;
- offers more element occurrence indicators.
Note: XSD 1.1, the current version at the time of writing (September 2012), became a W3C Recommendation in April 2012.
So the purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD. The steps to write an XSD document are:
- Use XML syntax
- Define the rules:
  - define elements that can appear in a document
  - define attributes that can appear in a document
  - define which elements are child elements
  - define the order of child elements
  - define the number of child elements
  - define whether an element is empty or can include text
  - define data types for elements and attributes
  - define default and fixed values for elements and attributes
- Save the document with the extension ".xsd"
- Reference the XSD document in the XML document like:
    <cars xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="car.xsd">
XSD Syntax
This article outlines XML Schema; it is not a complete reference to the syntax of the language.
We assume the following configuration:
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
- Elements:
  - The most common types are xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time; for example:
      <xs:element name="age" type="xs:integer"/>
  - define a default value (default):
      <xs:element name="country" type="xs:string" default="USA"/>
  - define a fixed value (fixed):
      <xs:element name="country" type="xs:string" fixed="France"/>
- Attributes:
  - only complex elements can have attributes;
  - attributes with a default or fixed value are declared the same way as elements:
      <xs:attribute name="firstname" type="xs:string" default="Huseyin"/>
      <xs:attribute name="lastname" type="xs:string" fixed="OZVEREN"/>
  - define a mandatory attribute (required):
      <xs:attribute name="email" type="xs:string" use="required"/>
  - define an optional attribute (optional):
      <xs:attribute name="email" type="xs:string" use="optional"/>
- Restrictions
  It is possible:
  - to place restrictions on attributes or elements;
  - to define a range of values (minInclusive and maxInclusive):
      <xs:minInclusive value="minimum"/>
      <xs:maxInclusive value="maximum"/>
  - to define a list of values (enumeration):
      <xs:enumeration value="a_value"/>
  - to define a pattern (pattern):
      <xs:pattern value="[A-Z][A-Z][A-Z]"/>
      <xs:pattern value="([a-z])*"/>
  - to define how white space is treated (whiteSpace):
    o spaces are kept (preserve):
      <xs:whiteSpace value="preserve"/>
    o each tab, line feed and carriage return is replaced by a space (replace):
      <xs:whiteSpace value="replace"/>
    o after replacement, runs of spaces are reduced to a single space and leading and trailing spaces are removed (collapse):
      <xs:whiteSpace value="collapse"/>
  - to define a length (length, minLength and maxLength):
      <xs:length value="8"/>
      <xs:minLength value="5"/>
      <xs:maxLength value="8"/>
  - to restrict decimals with fractionDigits and totalDigits.
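To make the restriction facets more concrete, the sketch below reads a few of them out of a small schema and applies them by hand in Python. This is not a real XSD validator, only an illustration of what the facets mean. The type names (ageType, codeType, actionType) and the bounds are assumptions made for the example, and Python's re module stands in for the XSD regular expression dialect, which is only safe for simple patterns like this one.

```python
import re
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical simple types; names and values are illustrative only.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="ageType">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="120"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="codeType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z][A-Z][A-Z]"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="actionType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="buy"/>
      <xs:enumeration value="sell"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>"""


def facets(type_name):
    """Return (facet name, value) pairs declared for a named simple type."""
    root = ET.fromstring(SCHEMA)
    for simple_type in root.findall(f"{XS}simpleType"):
        if simple_type.get("name") == type_name:
            restriction = simple_type.find(f"{XS}restriction")
            return [(f.tag[len(XS):], f.get("value")) for f in restriction]
    raise KeyError(type_name)


def satisfies(value, type_name):
    """Hand-rolled check of a few facets; assumes one pattern per type."""
    found = facets(type_name)
    allowed = [v for name, v in found if name == "enumeration"]
    if allowed and value not in allowed:
        return False
    for name, limit in found:
        if name == "minInclusive" and int(value) < int(limit):
            return False
        if name == "maxInclusive" and int(value) > int(limit):
            return False
        # XSD patterns must match the whole value, hence fullmatch.
        if name == "pattern" and re.fullmatch(limit, value) is None:
            return False
    return True


print(satisfies("42", "ageType"), satisfies("130", "ageType"))
print(satisfies("ABC", "codeType"), satisfies("hold", "actionType"))
```

In a real project a schema-aware library would do this work; the point here is only that each facet narrows the set of values the base type accepts.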
- Complex elements
  - use the xs:complexType tag;
  - a complex element can be defined directly on the element itself or by referring to a named complex type (which lets multiple elements share the same complex type);
  - a complex type can extend another complex type (extension):
      <xs:extension base="basic_type">
  - a complex type can also restrict another (restriction):
      <xs:restriction base="xs:integer">
  - free text can be mixed with tags (mixed), like:
      my web site <blogname>JAVA BLOG</blogname>
    which is allowed by:
      <xs:complexType mixed="true">
- Indicators
  - control how the elements are used;
  - Order indicators:
    o xs:all: the sub-elements may appear in any order;
    o xs:choice: exactly one of the sub-elements appears;
    o xs:sequence: the sub-elements must appear in the specified order;
  - Occurrence indicators (how many times an element can appear):
    o maxOccurs: maximum number (default 1). For unlimited use: maxOccurs="unbounded";
    o minOccurs: minimum number (default 1);
  - Group indicators:
    o xs:group: groups elements logically;
    o xs:attributeGroup: groups attributes logically;
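The occurrence indicators can be illustrated with a small sketch that reads minOccurs and maxOccurs from a schema and counts the children of an instance document. It checks counts only, not order or types, so it is far from a full validator. The person schema, including the phone element limited to 8 occurrences, is a made-up example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema; element names and limits are illustrative only.
PERSON_SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="name" type="xs:string" maxOccurs="unbounded"/>
        <xs:element name="died" type="xs:date" minOccurs="0"/>
        <xs:element name="phone" type="xs:string" minOccurs="0" maxOccurs="8"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def occurrence_limits(schema_text):
    """Map each child element name to (minOccurs, maxOccurs or None)."""
    root = ET.fromstring(schema_text)
    sequence = root.find(f"{XS}element/{XS}complexType/{XS}sequence")
    limits = {}
    for decl in sequence:
        low = int(decl.get("minOccurs", "1"))    # default is 1
        high = decl.get("maxOccurs", "1")        # default is 1
        limits[decl.get("name")] = (low, None if high == "unbounded" else int(high))
    return limits


def counts_ok(instance_text, limits):
    """Check only how often each child appears, ignoring order and types."""
    counts = Counter(child.tag for child in ET.fromstring(instance_text))
    if any(tag not in limits for tag in counts):
        return False
    for name, (low, high) in limits.items():
        n = counts[name]
        if n < low or (high is not None and n > high):
            return False
    return True


limits = occurrence_limits(PERSON_SCHEMA)
print(limits)
print(counts_ok("<person><title>Dr</title><name>A</name><name>B</name></person>", limits))
```

Note that a limit such as maxOccurs="8" has no DTD equivalent, since DTD only offers "?", "+" and "*".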
- Extension
  - the xs:any tag allows the document to contain elements beyond those explicitly defined in the schema:
      <xs:any minOccurs="0"/>
  - the xs:anyAttribute tag allows attributes not specified in the schema;
  - the substitutionGroup attribute lets one element stand in for another, so that the same schema applies to XML documents whose tags do not carry the same name:
      <name/>
      <nom/>
Usage and Limitations of XSD
XSD is a description language that defines the structure and content types of an XML document. It overcomes the limitations of DTD and is meant to replace it for checking the validity of XML documents.
XSD allows the creation of standards (Internet languages such as XHTML, RSS, WSDL, etc.), helps ensure data integrity, and allows much more precise validation than DTD. However, XSD schemas are long to write for complex structures.
XSD Validate pattern http://www.codesynthesis.com/projects/xsstl