
6.5.3 Additional rule options

No Other Style

The ‘No Other Style’ checkbox in the Atomik Xport ruleset allows for changes in character style to be identified, even when their surrounding paragraph style has not changed.

This is useful, for example, in a newspaper or magazine ‘News In Brief’ section, where all the text is styled using the same paragraph style sheet but the headings are picked out with a character style sheet.

For example, with this text:

This is a heading and this is the rest of the body copy which goes with it.

This is another heading and this is more body copy.

Atomik Xport would normally return the result:

<Story> 
<Heading>This is a heading</Heading>
<Body>and this is the rest of the body copy which goes
with it. This is another heading and this is more body copy.</Body>
</Story>

If the ‘No Other Style’ checkbox in the ruleset is checked, however, the same text is output as:

<Story> 
<Heading>This is a heading</Heading>
<Body>and this is the rest of the body copy which goes with it.</Body>
</Story>
<Story>
<Heading>This is another heading</Heading>
<Body> and this is more body copy.</Body>
</Story>

The reason for the default result is that when Atomik Xport encounters the second heading, the text is still in the same paragraph style sheet as the body copy, so it matches the rule already being applied to the body copy. Atomik Xport therefore simply appends the new heading and its following body copy to the end of the existing body element.

Atomik Xport normally continues the existing element for as long as the text matches the rules for that element, and only creates a new element when the text no longer obeys these rules. Checking the ‘No Other Style’ checkbox changes this behaviour: the rules for all sibling elements are considered every time a style change is encountered, even when the current text could still be matched to the current element.
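The matching behaviour described above can be illustrated with a small simulation. This is only a sketch of the idea, not Atomik Xport code: the `Run` and `Rule` classes, the `extract` function, the style names ‘NIB’ and ‘NIB Heading’, and the assumption that the first matching rule wins are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Run:
    """A stretch of text with one paragraph style and one character style."""
    text: str
    para_style: str
    char_style: Optional[str] = None  # None: no character style applied


@dataclass
class Rule:
    """Hypothetical rule mapping a style combination to an output element."""
    element: str
    para_style: str
    char_style: Optional[str] = None  # None: any character style is accepted

    def matches(self, run: Run) -> bool:
        if run.para_style != self.para_style:
            return False
        return self.char_style is None or run.char_style == self.char_style


def extract(runs: List[Run], rules: List[Rule],
            no_other_style: bool = False) -> List[Tuple[str, str]]:
    """Return (element name, text) pairs for a sequence of styled runs."""
    elements: List[Tuple[str, str]] = []
    current: Optional[Rule] = None
    for run in runs:
        if current is not None and current.matches(run) and not no_other_style:
            # Default: stay in the current element while its rule still matches.
            chosen = current
        else:
            # Otherwise consider every sibling rule (assumed: first match wins).
            chosen = next((r for r in rules if r.matches(run)), None)
        if chosen is None:
            continue  # no rule applies; this sketch simply skips the text
        if chosen is current:
            name, text = elements[-1]
            elements[-1] = (name, text + run.text)
        else:
            elements.append((chosen.element, run.text))
            current = chosen
    return elements


RULES = [Rule("Heading", "NIB", "NIB Heading"), Rule("Body", "NIB")]
RUNS = [
    Run("This is a heading", "NIB", "NIB Heading"),
    Run(" and this is the rest of the body copy which goes with it. ", "NIB"),
    Run("This is another heading", "NIB", "NIB Heading"),
    Run(" and this is more body copy.", "NIB"),
]

default = extract(RUNS, RULES)
checked = extract(RUNS, RULES, no_other_style=True)
```

With the option off, the second heading is absorbed into the first Body element. With it on, a second Heading and Body pair is produced. The sketch does not model how those elements are grouped into separate Story elements.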

New Box - New Element

When Atomik Xport is processing text, if it finds identically styled text in separate frames which are extracted sequentially, then it will consider the contents of all of these boxes to be part of the same element.

Imagine the picture above being extracted by Atomik Xport, where the text labels are all separate pieces of text in separate boxes. The XML export you would get from Xport with ‘New Box - New Element’ unticked would be something like:

<Graphic> 
<Image imgName="laptop.jpg"/>
<label>CDKeyboardScreen</label>
</Graphic>

This is not usually a desirable way of exporting the content. The Atomik Xport 4 ruleset has a new control, ‘New Box - New Element’, which does exactly what it says: when Xport is extracting content from the page and consecutive boxes with identically styled content are extracted, it creates a new element for the content of each box rather than continuing to place the content in the current element. For example:

<Graphic> 
<Image imgName="laptop.jpg"/>
<label>CD</label>
<label>Keyboard</label>
<label>Screen</label>
</Graphic>
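The difference between the two settings can be sketched in a few lines. This is an illustration only, under stated assumptions: the `extract_boxes` function, the style name ‘Label’, and the representation of a box as a (style, text) pair are hypothetical and are not part of Atomik Xport.

```python
from typing import List, Tuple


def extract_boxes(boxes: List[Tuple[str, str]],
                  new_box_new_element: bool = False) -> List[str]:
    """Return the text of each output element for boxes given as (style, text)."""
    elements: List[str] = []
    previous_style = None
    for style, text in boxes:
        same_style = bool(elements) and style == previous_style
        if same_style and not new_box_new_element:
            # Default: identically styled consecutive boxes share one element.
            elements[-1] += text
        else:
            # A style change, or the option being ticked, starts a new element.
            elements.append(text)
        previous_style = style
    return elements


LABEL_BOXES = [("Label", "CD"), ("Label", "Keyboard"), ("Label", "Screen")]

merged = extract_boxes(LABEL_BOXES)
split = extract_boxes(LABEL_BOXES, new_box_new_element=True)
```

Here `merged` holds a single run-together label, as in the first XML sample, while `split` holds one entry per box, as in the second.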

New Paragraph - New Element

By default, Atomik Xport extracts consecutive paragraphs of identically styled text to a single element. When the ‘New Paragraph - New Element’ checkbox is ticked, Atomik Xport instead creates a new element for every subsequent paragraph to which the same rule applies. This is useful, for instance, when extracting lists, as in the following case:

HIGHLIGHTS

• Luxury hotel

• Wide range of leisure and dining facilities

• Unique hosted wine hour

• Complimentary spa

With the ‘New Paragraph - New Element’ option left in its default setting (unticked), you would expect to see the following XML structure extracted:

<h4>HIGHLIGHTS</h4> 
<ul>
<li>• Luxury hotel • Wide range of leisure and dining facilities • Unique hosted wine hour • Complimentary spa</li>
</ul>

But by mapping the list style to an element (in the XML below, <li>) and ticking ‘New Paragraph - New Element’, the individual items are split into separate elements at each point where a paragraph mark is encountered:

<h4>HIGHLIGHTS</h4> 
<ul>
<li>• Luxury hotel</li>
<li>• Wide range of leisure and dining facilities</li>
<li>• Unique hosted wine hour</li>
<li>• Complimentary spa</li>
</ul>
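As a rough illustration of the two outcomes, the following sketch builds the list element either way using Python's standard `xml.etree.ElementTree` module. The `build_list` function and its `new_paragraph_new_element` flag are hypothetical names chosen for this example, and joining the paragraphs with a single space in the default case is an assumption based on the sample output above.

```python
import xml.etree.ElementTree as ET
from typing import List


def build_list(paragraphs: List[str],
               new_paragraph_new_element: bool = False) -> str:
    """Serialise identically styled paragraphs as a <ul> of <li> elements."""
    ul = ET.Element("ul")
    if new_paragraph_new_element:
        # One <li> per paragraph mark.
        for paragraph in paragraphs:
            ET.SubElement(ul, "li").text = paragraph
    elif paragraphs:
        # Default: all consecutive paragraphs end up in a single <li>.
        ET.SubElement(ul, "li").text = " ".join(paragraphs)
    return ET.tostring(ul, encoding="unicode")


HIGHLIGHTS = [
    "• Luxury hotel",
    "• Wide range of leisure and dining facilities",
    "• Unique hosted wine hour",
    "• Complimentary spa",
]

single = build_list(HIGHLIGHTS)
per_item = build_list(HIGHLIGHTS, new_paragraph_new_element=True)
```

`single` contains one <li> holding all four items, and `per_item` contains four <li> elements, one for each paragraph.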