Atomik Roundtrip 2.1: Working with Roundtrip

Chapter 13 Working with XML

Atomik Roundtrip can read any well-formed or valid XML file, import its content onto the QuarkXPress page, and export it back out again, along with any copy-edit changes.

Roundtrip differentiates between a well-formed and a valid XML file in the same way as any other XML-aware application, by the presence of a DOCTYPE declaration. The DOCTYPE tells Roundtrip what sort of document you’re working with, and identifies the DTD which defines this format.

There are two forms of DOCTYPE declaration, SYSTEM and PUBLIC. A SYSTEM declaration refers to a DTD which is present on, or directly accessible by, the file system of the machine on which the XML is being read. A PUBLIC declaration identifies the DTD by a public identifier and also gives a reference, normally an internet URL, at which the DTD is available.

The type of declaration does not matter to Roundtrip. It simply extracts the DTD filename from the end of the file reference or URL, and attempts to locate that file in the ‘Default DTD Folder’ specified in the preferences. You must ensure that a copy of the correct DTD for the XML which you wish to use with Atomik Roundtrip is in the default DTD location.
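The lookup described above can be illustrated with a short Python sketch. This is not Roundtrip's own code: the function names, the regular expression, and the two example DOCTYPE declarations (including their paths and public identifier) are assumptions made for illustration only. It shows how the same filename is obtained from either form of declaration.

```python
import re
from pathlib import Path
from typing import Optional

# Hypothetical example declarations; the paths and identifiers are invented.
SYSTEM_EXAMPLE = '<!DOCTYPE article SYSTEM "file:///Volumes/Shared/dtds/article.dtd">'
PUBLIC_EXAMPLE = ('<!DOCTYPE article PUBLIC "-//Example//DTD Article 1.0//EN" '
                  '"http://www.example.com/dtds/article.dtd">')

# Matches both forms: SYSTEM "reference" and PUBLIC "public id" "reference".
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+[^\s\[>]+\s+'
    r'(?:SYSTEM|PUBLIC\s+(?:"[^"]*"|\'[^\']*\'))\s+'
    r'(?:"(?P<dq>[^"]*)"|\'(?P<sq>[^\']*)\')'
)

def dtd_filename(xml_text: str) -> Optional[str]:
    """Return the DTD filename named by the DOCTYPE, or None if there is none."""
    match = DOCTYPE_RE.search(xml_text)
    if match is None:
        return None
    reference = match.group("dq")
    if reference is None:
        reference = match.group("sq")
    # Keep only the last segment of the file reference or URL.
    name = reference.replace("\\", "/").rstrip("/").rsplit("/", 1)[-1]
    return name or None

def expected_dtd_path(xml_text: str, dtd_folder: Path) -> Optional[Path]:
    """Return where the DTD would be looked for inside the default DTD folder."""
    name = dtd_filename(xml_text)
    return dtd_folder / name if name else None
```

With either example declaration, the sketch looks for a file called article.dtd in the folder it is given, regardless of the location named in the XML itself.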

System administrators should note that the default DTD location does not have to be on a user’s hard disk; it can equally well be on a network file server. In that case, only one copy of the DTD needs to be maintained.

If a DTD cannot be located, Atomik Roundtrip will still load the XML file, but will treat it as well-formed XML as opposed to valid XML. The major disadvantage of well-formed XML compared with valid XML is that you cannot create rulesets for XML which has no DTD, so you cannot automatically style the content as it is brought onto the QuarkXPress page. If the DTD could not be located, you will see a warning to this effect in the Error tab of the Roundtrip XML palette.
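The fallback described above amounts to a simple decision, sketched here in Python. The function name, the mode strings, and the wording of the warning are hypothetical and are not taken from Roundtrip. The sketch also assumes that a file with no DOCTYPE at all produces no warning, which the text above does not state.

```python
from pathlib import Path
from typing import List, Optional, Tuple

def choose_import_mode(dtd_name: Optional[str], dtd_folder: Path) -> Tuple[str, List[str]]:
    """Decide how a file is treated, given the DTD filename from its DOCTYPE.

    Returns the mode ("valid" or "well-formed") and a list of warnings.
    """
    warnings: List[str] = []
    if dtd_name is None:
        # No DOCTYPE: the file can only be treated as well-formed XML.
        # Assumption: no warning is raised in this case.
        return "well-formed", warnings
    if (dtd_folder / dtd_name).is_file():
        # The DTD was found in the default DTD folder.
        return "valid", warnings
    # DOCTYPE present but the DTD is missing: load anyway, with a warning.
    warnings.append(
        f"DTD '{dtd_name}' not found in {dtd_folder}; importing as well-formed XML"
    )
    return "well-formed", warnings
```

The point of the sketch is that a missing DTD never blocks the import; it only changes the mode and adds a warning.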

What about XML Schema?

XML Schema is a more advanced alternative to the DTD which allows for more precise descriptions of the content. The most common reason for using XML Schema instead of DTDs to specify your XML document format is that it has advanced support for data-typing. QuarkXPress has no requirement for data-typing, because all it can do with the data is display it for printing; there’s no functionality in QuarkXPress for calculating with or sorting these values, so they need only be considered as text.

A DTD defines as much as QuarkXPress requires to work with this data; the additional information provided by an XML Schema is irrelevant to QuarkXPress. However, if your XML files are designed to conform to an XML Schema rather than a DTD, you will also need to create a DTD which describes the document format. Most good XML editors can do this for you: open the XML Schema file and convert it to a DTD. Whilst data-typing information will be lost, this is not a problem for Roundtrip, as that data was not relevant.

If this DTD is given the same name as the XML Schema document and placed in the ‘Default DTD location’ specified in the Roundtrip preferences, the XML will validate against that DTD instead of the XML Schema document, and will be imported as valid. No change needs to be made to your XML in order to facilitate this.

What if I want to automatically style my well-formed XML?

If you know the names of the elements which are likely to appear in your well-formed XML file, then simply make a ‘fake’ DTD like so (where element1 etc. represent the names of the XML elements):

<!ELEMENT root (element1 | element2 | element3)*>
<!ELEMENT element1 (#PCDATA)>
<!ELEMENT element2 (#PCDATA)>
<!ELEMENT element3 (#PCDATA)>

Whilst this DTD won’t accurately describe the structure of your well-formed XML file, if you specify it when creating a ruleset, the ruleset which you create can be used with your well-formed XML file, and the styles which you have specified will be applied. This works because the XML is valid against this general DTD.
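If you have many element names to list, a ‘fake’ DTD of this kind can be generated from a sample file. The following Python sketch is one way to do this outside Roundtrip. The function name and the sample XML are hypothetical, and it assumes that the XML does not use namespaces. It follows the same flat pattern as the DTD above: every element other than the root is declared as text-only and allowed anywhere inside the root.

```python
import xml.etree.ElementTree as ET

def fake_dtd(xml_text: str) -> str:
    """Build a flat 'fake' DTD listing every element name found in xml_text."""
    root = ET.fromstring(xml_text)
    names = []
    # root.iter() visits the root and then all descendants in document order.
    for element in root.iter():
        if element.tag != root.tag and element.tag not in names:
            names.append(element.tag)
    if not names:
        # A root with no child elements is declared as plain text.
        return f"<!ELEMENT {root.tag} (#PCDATA)>"
    lines = [f"<!ELEMENT {root.tag} ({' | '.join(names)})*>"]
    lines += [f"<!ELEMENT {name} (#PCDATA)>" for name in names]
    return "\n".join(lines)

# Hypothetical sample of well-formed XML with no DOCTYPE.
SAMPLE = "<story><headline>Hi</headline><body>Text</body><headline>Again</headline></story>"
print(fake_dtd(SAMPLE))
```

The printed text can be saved under a filename of your choice and placed in the default DTD location, then selected when creating a ruleset.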

This method can also be used to create placeholders, but as well-formed XML files do not conform to a particular structure, using placeholders may not be appropriate with XML which is not valid.

What about SGML and HTML?

Atomik Roundtrip is designed to work with XML and XML only. As XML is similar in structure to both SGML and HTML, Atomik Roundtrip can have some limited success interpreting these files and importing their content into QuarkXPress. However, we do not make any representation that Roundtrip will work with either SGML or HTML.