Atomik Xport SE: Reference > Chapter 12 Extracting Images, Captions and Copyright Information

12.2 The caption and copyright element declarations

Publishers typically want to extract captions and copyright information along with their images. You can designate any sibling of your image element as the caption or copyright element by assigning its element role in the Ruleset Editor.

Because captions and copyright text apply to specific images, it is usually important to retain that association in your XML. To do this, structure your DTD so that the image, caption and copyright information are siblings within a single parent element. In the example below, the ‘media’ element collects the image (‘media-ref’), followed by the caption (‘media-caption’) and copyright (‘media-producer’):

<!ELEMENT media (media-ref, media-caption, media-producer)> 
<!ELEMENT media-ref EMPTY>
<!ELEMENT media-caption (#PCDATA |B|I|sup|sub)*>
<!ELEMENT media-producer (#PCDATA |B|I|sup|sub)*>
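
As a rough illustration, an instance that follows these declarations might look like the sketch below. The caption and copyright text are placeholders, not output from the product. The excerpt declares no attributes for ‘media-ref’, so it appears here as a bare empty element. The inline ‘I’ element is assumed to be declared elsewhere in the DTD.

```xml
<!-- Hypothetical instance; all text content is placeholder -->
<media>
  <!-- The image itself: declared EMPTY, so it has no content -->
  <media-ref/>
  <!-- Caption: mixed content, so text may contain B, I, sup or sub -->
  <media-caption>Placeholder caption with <I>italic</I> text</media-caption>
  <!-- Copyright information for the same image -->
  <media-producer>Placeholder copyright line</media-producer>
</media>
```

Because all three children share the ‘media’ parent, the caption and copyright line stay attached to their image wherever the ‘media’ element is moved or reused.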