Atomik Xport SE: Reference > Chapter 6 Working with Rulesets

6.4 Defining rules

Rules are in essence mappings between typographical styles on the QuarkXPress page and the XML elements defined by the DTD. These mappings enable the Atomik Xport Automated Matching System (AMS) to automate the extraction of XML from the QuarkXPress project.

The AMS and the mappings from styles in QuarkXPress to the XML elements in the DTD are the central features of Atomik Xport that make it such a powerful and flexible solution.

Once you have loaded a DTD into Atomik Xport and named your ruleset (as described above), you are ready to create the rules that define the ruleset. Rules in Atomik Xport range from very simple correlations between QuarkXPress style sheets and XML elements to much more complex rules. Rules can match content at the paragraph level or at the character level (content within a paragraph), so very granular content, such as an author’s name within a paragraph of text, can be identified and extracted. Additionally, Atomik Xport can match content using style sheets, styling attributes (e.g. font, size and colour), or a combination of both.

You will see there are two lists of names within the Ruleset Editor. The left-hand list has the title ‘Document Styles’ and contains a list of all of the paragraph and character style sheets used in the currently open QuarkXPress project. As with the QuarkXPress style sheet palette, the paragraph and character style sheets are indicated as follows:

 
Paragraph style sheet 
 
Character style sheet 

The list to the right of this has the title ‘DTD Elements’. This contains a list of the elements that Atomik Xport has imported from the DTD that you selected when creating the ruleset. What you now need to do is to create a rule or multiple rules for each element into which you would like to extract content.

There are two ways to create rules for the DTD elements, both of which are explained below. Note that these methods are not mutually exclusive; you can use them in combination.

Creating mappings from QuarkXPress style sheets

Many publications use QuarkXPress style sheets to achieve consistent styling throughout their QuarkXPress projects, as well as increased productivity. In this case you can create a rule that maps the QuarkXPress style sheet directly to the specified DTD element. This is achieved by dragging and dropping the style name onto an element as follows:

  1. Within the ‘Document Styles’ box on the left-hand side of the Ruleset Editor, select your style sheet
  2. Drag and drop this style sheet onto the element for which you would like to create a rule in the ‘DTD Elements’ list (the right-hand list of the two lists)

Mapping a style sheet to a DTD element

Once you have dragged and dropped the style sheet onto the DTD element you will see that a rule is created on the right-hand side of the Ruleset Editor to reflect the action you have just performed. The rule indicates whether it is based on a paragraph or character style sheet and lists the character attributes that will be used by Atomik Xport in the matching process. See below for examples of some different types of rules.

Note you must be on the Rules tab in the Ruleset Editor to be able to see the list of rules. You can switch between Rules, Boxes, DTD and Output simply by clicking the corresponding tabs in the Ruleset Editor. Atomik Xport will display the Rules tab by default when you create a new ruleset.

It is possible to set further matching criteria for a rule. This is explained later in this chapter.

This process can be repeated for all of the style sheets and DTD elements for which you wish to extract content as XML. There is no known limit to the number of rules that can be created for any single DTD element.

  3. Once you have completed your mapping rules, click ‘OK’ to save and close your ruleset
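As a purely conceptual illustration of what the mappings created in the steps above represent, the following Python sketch treats a ruleset as a lookup from style sheet names to DTD element names and builds XML from styled paragraphs. It is not Atomik Xport's implementation or API; the style sheet names, the element names and the `article` root element are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical ruleset: style sheet name -> DTD element name.
# Two style sheets map to the same element, which the Ruleset Editor allows.
rules = {"Headline": "title", "Subhead": "title", "Body Text": "para"}

# Hypothetical page content as (style sheet name, text) pairs.
paragraphs = [
    ("Headline", "Storm hits coast"),
    ("Body Text", "Heavy rain fell overnight."),
]

root = ET.Element("article")  # assumed root element, for illustration only
for style, text in paragraphs:
    element = rules.get(style)
    if element is not None:  # only mapped styles produce XML in this sketch
        ET.SubElement(root, element).text = text

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Each drag-and-drop in the Ruleset Editor corresponds, in this simplified picture, to adding one entry to `rules`.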

Creating mappings direct from the QuarkXPress page

The flexibility of Atomik Xport means that you do not have to rely solely on style sheets to create mapping rules. You can also create a rule by dragging and dropping a sample of the text for which you wish to create a mapping rule directly from the QuarkXPress page onto the chosen DTD element. This works equally well whether or not the text has a style sheet associated with it. Here’s how you do it:

  1. Highlight a sample of the text for which you wish to create a mapping.
  2. Press and hold down Shift and Control and, while holding these keys down, drag and drop the highlighted text onto the DTD element (the right-hand list) to which you wish to map this text style.

Dragging and dropping text directly from the page onto a DTD element

  3. If your text has been styled with both a character and a paragraph style, you will now be asked to choose which type to base the rule on.

    Once you have dragged and dropped the highlighted text onto the DTD element you will see that a rule is created on the right-hand side of the Ruleset Editor to reflect the action you have just performed. The rule indicates whether it is based on paragraph or character style sheets and lists the character attributes that will be used by Atomik Xport in the matching process.

    Note you must be on the ‘Rules’ tab in the Ruleset Editor to be able to see the list of rules. You can switch between Rules, Boxes, DTD and Output simply by clicking the corresponding tab in the Ruleset Editor. Atomik Xport will display the ‘Rules’ tab by default when you create a new ruleset.

    If the text you highlighted and dragged and dropped contains more than one style (e.g. some of the text is bold and some is italics), Atomik Xport will create the rule based on the style of the first character in the text and ignore the other styles.

    It is possible to set further matching criteria for a rule. This is explained later in this chapter.

    This process can be repeated for all of the various styles in your project for which you wish to extract content as XML. Remember, you can create any number of rules for a DTD element.

    4. Once you have completed your mapping rules, click ‘OK’ to save and close your ruleset
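The first-character behaviour described in the steps above (a mixed-style selection yields a rule based on the style of its first character) can be pictured with the following minimal Python sketch. The `StyledRun` type and `style_for_rule` function are hypothetical names used only for illustration and are not part of Atomik Xport.

```python
from typing import List, NamedTuple


class StyledRun(NamedTuple):
    """A stretch of selected text that shares one font face (hypothetical model)."""
    text: str
    font_face: str


def style_for_rule(selection: List[StyledRun]) -> str:
    """Return the font face of the first character in the selection."""
    for run in selection:
        if run.text:  # skip empty runs so the first real character decides
            return run.font_face
    raise ValueError("selection contains no text")


# A selection that starts in bold and continues in italics.
selection = [
    StyledRun("Jane Smith", "bold"),
    StyledRun(" reports from Paris", "italic"),
]
print(style_for_rule(selection))  # bold
```

In practice this means it is worth highlighting a sample that begins with the style you actually want the rule to capture.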

    Examples of rules and their content

    A rule in Atomik Xport can hold information on five key properties of the typography of text in QuarkXPress. The five properties are the style sheet name, the font name, font size, font face and font colour. Any combination of these can be used to identify content. The section Rule Properties below explains how each of these aspects can be controlled.
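One way to picture a rule that uses any combination of the five properties is the Python sketch below. It is a conceptual model only, under the assumption that a property left unset is simply ignored during matching; the class names, field names and element names are hypothetical and do not reflect Atomik Xport's internal design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TextStyle:
    """Typography of a piece of text on the page (hypothetical model)."""
    style_sheet: Optional[str]
    font_name: str
    font_size: float
    font_face: str      # e.g. "plain", "bold", "italic"
    font_colour: str


@dataclass(frozen=True)
class Rule:
    """Maps a combination of the five properties to a DTD element.

    Assumption: a property left as None is not used in matching.
    """
    element: str
    style_sheet: Optional[str] = None
    font_name: Optional[str] = None
    font_size: Optional[float] = None
    font_face: Optional[str] = None
    font_colour: Optional[str] = None

    def matches(self, text: TextStyle) -> bool:
        checks = [
            (self.style_sheet, text.style_sheet),
            (self.font_name, text.font_name),
            (self.font_size, text.font_size),
            (self.font_face, text.font_face),
            (self.font_colour, text.font_colour),
        ]
        # Only the properties that the rule actually sets are compared.
        return all(want == got for want, got in checks if want is not None)


headline = TextStyle("Headline", "Helvetica", 24.0, "bold", "black")
by_sheet = Rule(element="title", style_sheet="Headline")
by_attrs = Rule(element="author", font_name="Times", font_face="italic")

print(by_sheet.matches(headline))  # True
print(by_attrs.matches(headline))  # False
```

Here `by_sheet` matches on the style sheet name alone, while `by_attrs` ignores style sheets and matches on font name and face.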

    Here are examples of three types of rules with an explanation of their different component parts.

    Example of multiple rules

    Indicates the rule is based on a paragraph style sheet 
    Indicates the rule is based on a character style sheet 

    Excluding Styles

    If you do not create a rule for a document style, paragraphs that have the style applied will not typically be extracted (exceptions to this are captions, copyright and footnotes), and the text will be listed in the proofing log.

    However, there may be cases where you do not want non-extracted text to be listed in the proofing log. For instance, if your running header style was unmapped, your proofing log would include the running header from every page. It becomes harder to check the proofing log for content which was unintentionally excluded when it lists potentially hundreds of repeated headers.

    To cater for this situation, Atomik Xport includes the option of excluding styles. To specify that a style sheet should be excluded, right-click the style sheet name in the Document Styles list in the Xport Ruleset Editor. A context menu will appear offering the option ‘Exclude Style from Extraction’.

    If you select this option, the text with this style will not be considered for extraction and will not be listed in the proofing log. Further, a cross will appear against the style in the Document Styles list, indicating its status as an excluded style.

    To change the style’s status back so that it is considered for extraction (and included in the proofing log if unmapped), right-click the style sheet name and de-select ‘Exclude Style from Extraction’ to toggle the setting off.
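The relationship between mapped, unmapped and excluded styles can be summarised with the following Python sketch. It is a simplified model rather than Atomik Xport's actual logic: the `process` function and its arguments are hypothetical, the special handling of captions, copyright and footnotes is ignored, and exclusion is assumed to take priority over any mapping.

```python
from typing import Dict, List, Set, Tuple


def process(paragraphs: List[Tuple[str, str]],
            rules: Dict[str, str],
            excluded: Set[str]):
    """Split paragraphs into extracted content and proofing-log entries.

    paragraphs: (style sheet name, text) pairs
    rules:      style sheet name -> DTD element name
    excluded:   style sheets marked for exclusion
    """
    extracted, proofing_log = [], []
    for style, text in paragraphs:
        if style in excluded:
            continue  # neither extracted nor listed in the proofing log
        if style in rules:
            extracted.append((rules[style], text))
        else:
            proofing_log.append((style, text))  # unmapped: reported for checking
    return extracted, proofing_log


paragraphs = [
    ("Running Header", "Daily News"),
    ("Body Text", "Heavy rain fell overnight."),
    ("Pull Quote", "It came without warning."),
]
extracted, proofing_log = process(
    paragraphs,
    rules={"Body Text": "para"},
    excluded={"Running Header"},
)
print(extracted)     # [('para', 'Heavy rain fell overnight.')]
print(proofing_log)  # [('Pull Quote', 'It came without warning.')]
```

The running header disappears from both outputs, so the proofing log lists only the unmapped pull quote.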

    Visual Indicators

    When you create a rule, various visual indicators automatically appear in the two lists (Document Styles and DTD Elements). Here is an explanation of each icon and what it is indicating.

    Explanation of icons

     
    Single tick icon - These appear next to the style sheets in the Document Styles list, and against elements in the DTD Elements list. A single tick indicates that the adjacent style sheet or element has a single rule associated with it. 
     
    Double tick icon - This is similar to the single tick icon but indicates that the corresponding style sheet or element has multiple rules associated with it.  
     
    Arrow icon - In the DTD Elements list, these indicate the elements that have been associated with the currently selected style sheet. By clicking on different style sheets you can very quickly see which elements each style sheet has been mapped to. Similarly, where the arrows appear in the Document Styles list, they indicate the style sheets associated with the currently selected element. 
     
    Cross icon - These appear next to styles which have been marked for exclusion. Text which has had an excluded style applied will not be listed in the proofing log. 
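The icon logic described above can be expressed as a small Python sketch. This is only an illustration of the stated behaviour for the Document Styles list; the function names and the representation of rules as (style sheet, element) pairs are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Each rule is a (style sheet, DTD element) pair; the names are hypothetical.
Rule = Tuple[str, str]


def style_icons(rules: List[Rule], excluded: Set[str]) -> Dict[str, str]:
    """Return the icon shown next to each style in the Document Styles list."""
    counts = Counter(style for style, _ in rules)
    icons = {style: "single tick" if n == 1 else "double tick"
             for style, n in counts.items()}
    for style in excluded:
        icons[style] = "cross"  # excluded styles are marked with a cross
    return icons


def arrow_elements(rules: List[Rule], selected_style: str) -> Set[str]:
    """Elements that would show an arrow when a style sheet is selected."""
    return {element for style, element in rules if style == selected_style}


rules = [("Headline", "title"), ("Body Text", "para"), ("Body Text", "intro")]
icons = style_icons(rules, excluded={"Running Header"})
print(icons["Headline"])        # single tick
print(icons["Body Text"])       # double tick
print(icons["Running Header"])  # cross
print(sorted(arrow_elements(rules, "Body Text")))  # ['intro', 'para']
```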

    Handy Hint 1: You can map more than one style or style sheet from a QuarkXPress project onto a single XML element. This means if you have, for example, multiple headline styles, you can still map them all onto a single DTD element if you need to.

    Handy Hint 2: If you have a style or style sheet that is used for more than one type of content, you can map it to more than one DTD element. This is a perfectly valid approach; however, it can create problems in the automated matching process if the mappings and DTDs are not constructed very carefully. Therefore, we strongly recommend that you avoid creating mappings from one style to multiple DTD elements if possible.

    The AMS will use the structure of the DTD to work out which XML element the style should be mapped to, using the context of the neighbouring XML elements to determine the appropriate target XML element for the mapping. If there is ambiguity in the mapping, an error will occur. In order to minimise the likelihood of this occurring, follow Handy Hint 3 below.

    Handy Hint 3: Try to avoid creating mappings from one style or style sheet to multiple XML elements. Where possible, style each different type of content with a separate style or style sheet in QuarkXPress, even if the appearance (i.e. the typographical properties) of the different styles is the same. This will enable Atomik Xport to extract the content to the appropriate XML element with less possibility of error. Alternatively, if you find you frequently have to map a single style to multiple XML elements, you might want to consider simplifying the structure of your DTD so that there are fewer ‘target’ elements.
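To see why mapping one style to several elements can lead to ambiguity, consider the following Python sketch of context-based resolution. It is a deliberately simplified stand-in for the AMS, whose actual algorithm is not described here: the content model is reduced to a hypothetical table of which element may follow which, and all element names are invented for illustration.

```python
from typing import Dict, List, Optional, Set


def resolve(candidates: List[str],
            previous: Optional[str],
            allowed_after: Dict[Optional[str], Set[str]]) -> str:
    """Pick the single candidate element that the content model allows next."""
    permitted = [c for c in candidates if c in allowed_after.get(previous, set())]
    if len(permitted) != 1:
        raise ValueError(f"ambiguous or impossible mapping: {permitted}")
    return permitted[0]


# Hypothetical content model: a title, then an optional subtitle, then paragraphs.
# The key None stands for "nothing extracted yet".
allowed_after: Dict[Optional[str], Set[str]] = {
    None: {"title"},
    "title": {"subtitle", "para"},
    "subtitle": {"para"},
    "para": {"para"},
}

# One style mapped to two elements is resolved by what came before it.
print(resolve(["title", "subtitle"], None, allowed_after))     # title
print(resolve(["title", "subtitle"], "title", allowed_after))  # subtitle

# A style mapped to both 'subtitle' and 'para' cannot be resolved after a title.
try:
    resolve(["subtitle", "para"], "title", allowed_after)
except ValueError as err:
    print(err)
```

The last call fails because both candidates are valid in that position, which is the kind of situation that separate styles or a simpler DTD would avoid.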