
6.5.1 Matching options

Match Style Name and Attributes

This is the strictest option: Atomik Xport will only identify content on the page if it has been formatted with the specified style and its character attributes also match those specified in the style definition. This means that any text that has a style applied but has since had its underlying character attributes changed will not be matched.

Match Style Name Only

With this option, Atomik Xport will match any text that has the specified style applied, even if its character attributes differ from those in the style definition.

Match Attributes Only

With this option set for a rule, Atomik Xport will ignore the style name and identify content purely on the basis of its character attributes. This is ideal for situations where you have text that, for example, looks like a headline but doesn’t have the headline style applied.

Match Additional Attributes Only

This is an extremely powerful option that enables Atomik Xport to look only for text that has had additional attributes (e.g. Italic or All Caps) applied to it. Suppose, for example, that some text has a ‘01. Heading’ style applied to it. The underlying character attribute for this style is Bold, but you need to identify any text within the heading that has had Italic applied in addition to the Bold. With this option set, Atomik Xport identifies only the text that has this additional Italic attribute, ignoring the rest of the bold text. This rule has to be set up slightly differently from the others:

  1. First create an example on the page: some text that has just the additional attribute that you would like to match.

     In the example above this would be some text that is styled Italic (but not Bold or any other face).

  2. Press and hold down Shift and Control and, while holding these keys down, drag and drop this example text onto the required DTD Element (the right-hand list).
  3. Select the rule you have just created.
  4. Change the rule setting to ‘Match Additional Attributes Only’.

This should automatically cause the ‘Face’ attribute to be ticked and the other three attributes to be unticked. Next to the Face attribute you should see only the additional attribute that Atomik Xport is required to match.
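The four matching options can be compared side by side with a small Python model. This is only an illustrative sketch, not Atomik Xport's implementation: the `Run` and `Rule` classes, the mode names, the `STYLES` table and the ‘Body’ style are all hypothetical, and the reading of ‘additional attributes’ as ‘attributes applied on top of the style definition’ is an assumption drawn from the description above.

```python
from dataclasses import dataclass

# Hypothetical style definitions: style name -> character attributes.
STYLES = {
    "01. Heading": frozenset({"Bold"}),
    "Body": frozenset(),
}

@dataclass(frozen=True)
class Run:
    """A stretch of text with one style and one set of applied attributes."""
    text: str
    style: str
    attrs: frozenset

@dataclass(frozen=True)
class Rule:
    mode: str
    style: str = ""                 # used by the two style-name modes
    attrs: frozenset = frozenset()  # used by the two attribute modes

def rule_matches(rule: Rule, run: Run) -> bool:
    # Attributes the run's style would give it if nothing had been overridden.
    base = STYLES.get(run.style, frozenset())
    if rule.mode == "name_and_attributes":
        return run.style == rule.style and run.attrs == base
    if rule.mode == "name_only":
        return run.style == rule.style
    if rule.mode == "attributes_only":
        return run.attrs == rule.attrs
    if rule.mode == "additional_attributes_only":
        # Assumption: only attributes added on top of the style count.
        extra = run.attrs - base
        return bool(rule.attrs) and rule.attrs <= extra
    raise ValueError(f"unknown mode: {rule.mode}")

def select(rule: Rule, runs: list) -> list:
    """Return the text of every run the rule matches."""
    return [run.text for run in runs if rule_matches(rule, run)]

RUNS = [
    Run("Annual Report", "01. Heading", frozenset({"Bold"})),
    Run("in brief", "01. Heading", frozenset({"Bold", "Italic"})),
    Run("Looks like a heading", "Body", frozenset({"Bold"})),
]

bold = frozenset({"Bold"})
italic = frozenset({"Italic"})

print(select(Rule("name_and_attributes", style="01. Heading"), RUNS))
# ['Annual Report']
print(select(Rule("name_only", style="01. Heading"), RUNS))
# ['Annual Report', 'in brief']
print(select(Rule("attributes_only", attrs=bold), RUNS))
# ['Annual Report', 'Looks like a heading']
print(select(Rule("additional_attributes_only", attrs=italic), RUNS))
# ['in brief']
```

In this model, only the last mode picks out the italicised words inside the heading while leaving the rest of the bold heading text unmatched.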

Wildcards and regular expressions

If you have multiple styles in a document that need to map to the same element, previous versions of Atomik Xport required you to make a separate rule for each style. Atomik Xport 4 allows ‘wildcard’ rules, in which you specify a portion of the style name to be matched rather than the whole name.

For example, if you had multiple styles, named as follows:

News Title

News Text

News Text first paragraph

In Brief Title

In Brief Text

Feature Title

Feature Text

Boxout Title

Boxout Text

You could make a ruleset with nine separate rules, one for each style, or you could make a ruleset with just two wildcard rules.

In this example, the styles Boxout Title, Feature Title, News Title and In Brief Title are all mapped to the ‘Title’ element by a single rule that matches any style name ending with the text “Title”. This is achieved by checking the ‘match’ checkbox, choosing the ‘Wildcard’ button, and typing “*Title” in the editable text box that becomes available. The asterisk effectively means ‘any character or characters’, so this rule matches any characters followed by the text ‘Title’.

The second rule is slightly more complex. It matches any style name containing the word ‘Text’, regardless of where ‘Text’ appears in the style name; this is done by entering ‘*Text*’ for the style name.
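The effect of the two wildcard rules can be checked outside Atomik Xport with a short Python sketch. It uses Python's standard `fnmatch` module as a stand-in for the wildcard matcher, which is an assumption: `fnmatch` also gives special meaning to `?` and `[...]`, whereas only the asterisk is described here, and case-sensitive matching is assumed. The ‘Text’ element name and the `element_for` helper are hypothetical.

```python
from fnmatch import fnmatchcase

STYLE_NAMES = [
    "News Title",
    "News Text",
    "News Text first paragraph",
    "In Brief Title",
    "In Brief Text",
    "Feature Title",
    "Feature Text",
    "Boxout Title",
    "Boxout Text",
]

# (wildcard pattern, target element); 'Text' is a hypothetical element name.
WILDCARD_RULES = [
    ("*Title", "Title"),  # any characters followed by 'Title'
    ("*Text*", "Text"),   # 'Text' anywhere in the style name
]

def element_for(style_name: str):
    """Return the element of the first rule whose pattern matches, else None."""
    for pattern, element in WILDCARD_RULES:
        if fnmatchcase(style_name, pattern):
            return element
    return None

mapping = {name: element_for(name) for name in STYLE_NAMES}
for name, element in mapping.items():
    print(f"{name!r} -> {element}")
```

With these two patterns, all nine style names are mapped: the four ending in ‘Title’ by the first rule and the five containing ‘Text’ by the second.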

Wildcard rules allow some fairly complex matching of style names, but if you need to be even more specific, you can specify a regular expression instead of a wildcard. You do this by selecting the ‘Regex’ button and entering an appropriate regular expression. The regular expression support is an implementation of the IEEE 1003.2 “POSIX” standard for string handling expressions. For further information on regular expressions, open a Mac OS X terminal prompt and type “man regex”, take a look at a UNIX text book (most will have a description of regular expression handling), or ask your systems administrator.
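As an illustration of a match that a single wildcard cannot express, the sketch below selects only the News and Feature titles from the style names listed earlier. It uses Python's `re` module, which is not a POSIX implementation; the pattern is limited to anchors, grouping and alternation, which behave the same way in POSIX extended regular expressions. Whether Atomik Xport expects the extended or the basic POSIX syntax is not stated here, so treat the pattern itself as an assumption.

```python
import re

STYLE_NAMES = [
    "News Title",
    "News Text",
    "News Text first paragraph",
    "In Brief Title",
    "In Brief Text",
    "Feature Title",
    "Feature Text",
    "Boxout Title",
    "Boxout Text",
]

# Match 'News Title' or 'Feature Title' exactly, and nothing else.
PATTERN = re.compile(r"^(News|Feature) Title$")

matched = [name for name in STYLE_NAMES if PATTERN.search(name)]
print(matched)  # ['News Title', 'Feature Title']
```

The wildcard ‘*Title’ would also have matched In Brief Title and Boxout Title; the alternation in the regular expression excludes them.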