6.8 Output tab

6.8.1 Encoding

UTF-8: The preferred standard for e-mail and web pages. It uses between one and four bytes per character. Note: UTF-8 can represent all the characters currently supported by DTP software that is not specific to East Asian languages.
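The variable byte length can be seen with a short Python sketch. It is only an illustration and is not part of Atomik Xport; the helper name `utf8_length` and the sample characters are chosen for this example.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 needs for a single character."""
    return len(ch.encode("utf-8"))

# Illustrative sample characters: a basic Latin letter, an accented letter,
# the euro sign, and an emoji from outside the Basic Multilingual Plane.
samples = ["A", "é", "€", "😀"]
lengths = {ch: utf8_length(ch) for ch in samples}

for ch, n in lengths.items():
    print(f"U+{ord(ch):04X} takes {n} byte(s) in UTF-8")
# U+0041 takes 1 byte(s) in UTF-8
# U+00E9 takes 2 byte(s) in UTF-8
# U+20AC takes 3 byte(s) in UTF-8
# U+1F600 takes 4 byte(s) in UTF-8
```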

In addition to UTF-8, Atomik Xport supports three other output encoding schemes:

UTF-16: The standard used by many systems that require double-byte character support. UTF-16 is very commonly used in systems that need to store text in East Asian languages and character sets.
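As a rough illustration of what "double-byte" means in practice, the Python sketch below counts the bytes UTF-16 uses for a few characters. The big-endian form without a byte order mark is used here only to keep the counts simple; which byte order Atomik Xport writes, and whether it adds a byte order mark, is not stated in this section. The helper name `utf16_length` is hypothetical.

```python
def utf16_length(ch: str) -> int:
    """Return the number of bytes UTF-16 needs for a single character.

    "utf-16-be" is used so that no byte order mark is added to the count.
    """
    return len(ch.encode("utf-16-be"))

print(utf16_length("A"))   # 2 (basic Latin letter)
print(utf16_length("日"))  # 2 (common East Asian character)
print(utf16_length("😀"))  # 4 (outside the Basic Multilingual Plane)
```

Most characters in everyday use take two bytes; characters outside the Basic Multilingual Plane take four.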

ISO8859-1: The most common standard for character encoding on Windows-based systems, including many databases and content management systems. Any characters that ISO8859-1 cannot represent (especially on the Macintosh) are written as Unicode numeric entities.

US-ASCII: The simplest encoding scheme. US-ASCII represents only the American Standard Code for Information Interchange character set. Any character outside this set (and that’s pretty much anything that isn’t an unaccented letter, a digit, or simple punctuation) is represented as a Unicode numeric entity.
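The numeric entity fallback described for ISO8859-1 and US-ASCII can be sketched in Python using the standard `xmlcharrefreplace` error handler. This is an illustration of the general technique, not Atomik Xport's own code. Python writes decimal entities such as `&#8364;`; whether Atomik Xport writes decimal or hexadecimal entities is not stated here, so treat the exact entity form as an assumption. The sample text is made up for this example.

```python
# Sample text with an accented letter, an en dash, and a euro sign.
text = "Café – 5 €"

# ISO 8859-1 can store "é" directly, but not the en dash or the euro sign.
latin1_out = text.encode("iso-8859-1", errors="xmlcharrefreplace")

# US-ASCII cannot store any of the three, so all become numeric entities.
ascii_out = text.encode("us-ascii", errors="xmlcharrefreplace")

print(latin1_out)  # b'Caf\xe9 &#8211; 5 &#8364;'
print(ascii_out)   # b'Caf&#233; &#8211; 5 &#8364;'
```

An XML parser reading either result recovers the same text, because a numeric entity stands for the Unicode character with that code point.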

You can choose the output encoding in the ‘Output’ tab of the ‘Edit Ruleset’ dialog. The chosen encoding applies to all XML files created from this ruleset. Different rulesets can use different output encodings.