Atomik Roundtrip 2.1: Working with Roundtrip > Chapter 20 A brief guide to XML

20.7 Other things you might encounter

Comments

If you want to document an XML file or a DTD, you can add comments to it. Comments are pieces of text that are completely ignored; they are there only to explain parts of the XML to people reading through it. A comment is enclosed as follows:

<!-- This is a comment -->

You should note that a comment cannot contain another comment, and the character sequence -- must not appear anywhere inside one. Comments can, however, include XML markup, so they can be used to temporarily remove a line from an XML file without deleting it.
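As a minimal sketch of using a comment to remove a line temporarily: the enclosing Review element below is a hypothetical container, not taken from the Roundtrip tutorial DTD, while GameTitle and Standfirst are the elements used in this chapter's examples.

```xml
<?xml version="1.0"?>
<Review>
  <GameTitle>The Neverland</GameTitle>
  <!-- Standfirst held back until the copy is approved:
  <Standfirst>The excesses of life are sometimes too much to handle for the Princess.</Standfirst>
  -->
</Review>
```

Everything between the opening and closing comment markers is skipped, so the Standfirst element is treated as if it were not there. Whether the remaining document is still valid depends on whether your DTD allows Standfirst to be omitted.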

Whitespace

‘Whitespace’ refers to areas within the XML content which are blank: carriage returns, double space characters, tabs, soft returns or any other such formatting character. XML does not attempt to carry any formatting information at all, not even carriage returns to separate paragraphs. This has a practical purpose, as whitespace characters are often represented differently on different platforms (try opening a Mac text file on a PC: all the carriage returns have become two squares, and the text is squashed up onto one line). For XML to work properly in a cross-platform environment, therefore, whitespace is not a suitable way to structure content, and XML markup is used instead. The most common example of this can be seen throughout the Roundtrip tutorial documents, in which the ‘ReviewText’ element contains one or more ‘Paragraph’ child elements:

<!ELEMENT ReviewText (Paragraph)+>

This child element represents the separation of the text into paragraphs without relying on characters which can be interpreted differently on different platforms.
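A document instance matching that declaration might look like the sketch below. The paragraph text is placeholder copy, and it assumes that Paragraph is declared to hold plain text (#PCDATA), which this section does not show.

```xml
<ReviewText>
  <Paragraph>First paragraph of the review.</Paragraph>
  <Paragraph>Second paragraph of the review.</Paragraph>
</ReviewText>
```

The line breaks and indentation between the Paragraph elements are there only to make the file easier to read. Because ReviewText is declared to contain only child elements, that whitespace carries no content, and the paragraph boundaries come entirely from the markup.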

It is common for whitespace characters to be stripped from the XML as it is interpreted, so if you do put any whitespace characters within your XML file, they will normally be ignored. (Strictly speaking, the XML parser passes whitespace through unchanged, and it is the application reading the XML that decides to discard it.) Consequently, it is wise to use child elements, rather than whitespace characters, to separate content.

Additionally, more than one space character in a row will normally be reduced to a single space. For example:

<GameTitle>The     Neverland</GameTitle>

will be interpreted as:

<GameTitle>The Neverland</GameTitle>

It is possible, depending on your XML interpreter, to preserve any whitespace within your text using the xml:space attribute. This attribute can be applied to any element, and is part of the reserved xml namespace.

<GameTitle xml:space="preserve">The     Neverland</GameTitle>
<Standfirst xml:space="preserve">The excesses of life are sometimes
too much to handle for the Princess.</Standfirst>

The example above will not have its whitespace (the carriage returns and repeated space characters) removed. For the document to remain valid, however, this special attribute must be declared in the DTD:

<!ELEMENT GameTitle (#PCDATA)>
<!ATTLIST GameTitle xml:space (preserve) #FIXED "preserve">

If the xml:space attribute is specified in neither the DTD nor the XML, the value “default” is assumed. This means the application's normal whitespace handling applies, which usually means whitespace will be removed.
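As a sketch of how the two values can work together, the self-contained document below declares xml:space on Standfirst with both permitted values and “default” as the fallback, then overrides it on one element. The internal DTD subset is used only to keep the example in a single file; applying the declaration to Standfirst is an assumption made for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE Standfirst [
  <!ELEMENT Standfirst (#PCDATA)>
  <!-- Either value is allowed; "default" applies when the attribute is omitted -->
  <!ATTLIST Standfirst xml:space (default|preserve) "default">
]>
<Standfirst xml:space="preserve">The excesses of life are sometimes
too much   to handle for the Princess.</Standfirst>
```

With this declaration, a Standfirst element written without the attribute is handled in the application's usual way, while the one above asks for its line break and repeated spaces to be kept. An interpreter that does not support xml:space will ignore the request.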

You should note that the Roundtrip interpreter does not currently support xml:space.