4. Basic Document Editing

This chapter teaches you the essentials of writing documents in ecromedos Markup Language (ECML).

4.1. How to Structure Your Documents

In general, you should start new documents from an empty document template, as described in section 3.1.2. Such a template will contain a complete skeleton for a document of a given document class.

4.1.1. General Structure

For a book or report the overall structure of a document can be represented by the following tree:

report
  |-head
  |-legal
  |-make-toc
  |-preface
  .
  .
  |-chapter
  |  |-title
  |  |-<BLOCK ELEMENTS>
  |  |-section
  |  .  |-title
  |  .  |-<BLOCK ELEMENTS>
  |     |-subsection
  |     .  |-title
  |     .  |-<BLOCK ELEMENTS>
  |        |-subsubsection
  |        .  |-title
  |        .  |-<BLOCK ELEMENTS>
  .
  .
  |-appendix
  |  |-title
  |  |-<BLOCK ELEMENTS>
  |  |-section
  |  .
  |  .
  .
  .
  |-glossary
  |-biblio
  |-index
  .
  .

This tree is greatly simplified and incomplete: any sectioning element except subsubsection can contain multiple subordinate sections. When using the article class, the preface element is replaced by the abstract element, and the elements legal, glossary and index are not available. Also note that the top-level sectioning element in an article is the section.

Block elements may be figures, equations or tables; or simply paragraphs of text, which are set with the p element.
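
As a rough sketch of how the tree above translates into markup, a minimal report might look like the following. The element names are taken from the tree; the titles and paragraph text are placeholders, the contents of head are left out, and any attributes that report or make-toc may require are omitted. A generated template remains the authoritative starting point.

<report>
    <head>
        <!-- header elements, see section 4.1.2 -->
    </head>
    <make-toc/>
    <preface>
        <title>Preface</title>
        <p>Introductory text.</p>
    </preface>
    <chapter>
        <title>First Chapter</title>
        <p>A paragraph at chapter level.</p>
        <section>
            <title>First Section</title>
            <p>A paragraph at section level.</p>
        </section>
    </chapter>
</report>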

4.1.2. Document Header And Legal Section

The very first child element of a document's root element is always the document header section, which has the following structure:

<head>
    <subject>Special Subject, e.g. Ph.D. Thesis</subject>
    <title>Document Main Title</title>
    <subtitle>Subtitle</subtitle>
    <author>Author 1</author>
    <author>Author 2</author>
    ...
    <date>Date of Publication</date>
    <publisher>Name of Publisher</publisher>
</head>

In contrast to HTML, where the order of the header elements can be arbitrary, in ECML the order is fixed. In books and reports the header can be followed by an optional legal section, which consists of plain paragraphs of text and which is meant to hold copyright information. The legal section should generally fit on one single page of whatever paper format you have chosen for printable output. Have a look at the sources of this manual for an example.
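
A legal section could be sketched as follows, assuming that the legal element wraps ordinary p elements directly, as the description above suggests. The copyright text is a placeholder.

<legal>
    <p>Copyright (C) YEAR Author 1. All rights reserved.</p>
    <p>No part of this document may be reproduced without permission.</p>
</legal>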

4.1.3. The preface Element

In books and reports you may use the preface element to set an arbitrary number of introductory sections right after the document header and the table of contents. The title of a preface will not be numbered and it will not appear in the table of contents when generating printable output. A preface may contain paragraphs of text, as well as other block elements, such as figures and tables. It must not contain any deeper sections except for minisections. If you feel that you need to divide your preface, you should consider making it a chapter.

4.1.4. Minisections

A minisection is a special kind of section that can occur anywhere in the section hierarchy. The title of a minisection is set in smaller letters than any regular section title; it is not numbered and is not listed in the table of contents.
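
The following sketch shows a preface that is divided with a minisection. It assumes that minisection takes a title child like the other sectioning elements; the titles and text are placeholders.

<preface>
    <title>Preface</title>
    <p>Why this report was written.</p>
    <minisection>
        <title>Acknowledgements</title>
        <p>Thanks to everyone who reviewed early drafts.</p>
    </minisection>
</preface>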

4.1.5. Paragraphs

A paragraph is set with the p element and is the simplest block element available, containing only formatted text and inline elements. Paragraphs can also optionally have a title, which will be set inline. See section 4.4 for an example of how this can be useful.

4.2. Formatting Inline Text

ecromedos gives you some control over how text is formatted and rendered and you may indeed discover that ECML isn't purely semantic in this area. If that bothers you, just pretend i stood for emphasis and discard the other elements. Alternatively, you can use these features to develop your own, specialized markup language on top of ECML, which is very much encouraged!

4.2.1. Formatting Elements

From your word processor you may be used to being able to emphasize text by setting it in bold or italic letters or by underlining it. With ecromedos you can achieve this by enclosing the span of text to be formatted inside the tags b for bold print, i for italics or u for underlining. You may also combine these arbitrarily. In addition, you may use the tt tag to make text appear in typewriter letters, which is useful for setting, for example, internet addresses or code fragments.

Table 4.1: Using text-formatting elements
 
Markup                                            Resulting Output
<u>underlined text</u>                            underlined text
<i>text in italics</i>                            text in italics
<b>bold-faced letters</b>                         bold-faced letters
<b><i>bold face and italics</i></b>               bold face and italics
<tt>text in typewriter letters</tt>               text in typewriter letters
Super<sup>script</sup>                            Superscript
Sub<sub>script</sub>                              Subscript
<xx-small>text in XXS</xx-small>                  text in XXS
<x-small>text in XS</x-small>                     text in XS
<small>small letters</small>                      small letters
<medium>regular size</medium>                     regular size
<large>large letters</large>                      large letters
<x-large>text in XL</x-large>                     text in XL
<xx-large>text in XXL</xx-large>                  text in XXL
<color rgb="#880000">text in red letters</color>  text in red letters

For the sake of completeness, there are the seven elements xx-small, x-small, small, medium, large, x-large and xx-large that let you control the font size. However, there should hardly ever be a reason to change the font size explicitly. Use the elements sup and sub to set text in superscript or subscript.

You can color text with the color element as shown in table 4.1. The rgb attribute expects a color value in CSS-style hexadecimal notation.

4.2.2. Controlling Hyphenation

In printable output, text is set justified over the entire width of the page's text area. In order to avoid large gaps of white space between words, LaTeX uses a clever algorithm and language-specific patterns to hyphenate words automatically. However, sometimes the hyphenation algorithm fails and in rare cases it cannot hyphenate certain words at all. You can provide hints, telling LaTeX where a word may be broken up, by inserting y tags in the right spots. For example, in order to tell LaTeX that it may split the word “bibliography” only in between biblio and graphy, you would write biblio<y/>graphy in your markup.

4.2.3. Manually Inserting Line or Page Breaks

In general, you should not have to worry about where a line breaks or where a new page begins, because it is the job of the formatting engine (i.e. LaTeX or your web browser) to take care of this. In rare cases, however, you may have to intervene manually. You can use the br element to break the current line or pagebreak to start a new page. You should not use multiple line or page breaks in a row.

When you need to prevent line breaks in certain places, you can either use the non-breaking space (see section 4.7) or protect the specific stretch of text with the nobr tag. For example, a title or academic degree should not be separated from the name that follows it. Consequently, you should write Dr.&nbsp;Pepper or <nobr>Dr. Pepper</nobr> to prevent the formatting engine from possibly breaking the line right before Pepper.
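
A short sketch combining a forced line break with a protected name is shown below. It assumes that br is written as an empty element; the address text is a placeholder.

<p>
    Please send corrections to<br/>
    <nobr>Dr. Pepper</nobr>,<br/>
    Department of Examples.
</p>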

4.3. Working with Cross-References

ecromedos allows you to cross-reference locations in the same document or to create hyperlinks to external resources on the Web.

4.3.1. References in the same Document

Sometimes you will want to refer to another section in your manuscript, e.g. you may write something like, “[...] you will find out more about this on page XYZ”. However, at the time of writing your markup, you cannot tell on which page the section you are referring to will actually be printed. The solution is to label the locations you wish to reference and let ecromedos take care of filling in the correct number whenever it encounters a reference to a label in your document(1).

The syntax for the definition of cross-references has changed slightly in ecromedos version 2. To label a certain spot in the text, use the label tag. This tag has a single, mandatory id attribute, whose value must be unique among all elements that carry an id attribute. Take a look at the following example:

<chapter>
    <title>The Show about Nothing</title>
    <p>
        Seinfeld<label id="seinfeld"/> is the best
        sitcom of all times.
    </p>
</chapter>

You can now use the ref element to obtain the section number and pageref to get the page number like this:

<chapter>
    <title>About Myself</title>
    <p>
        I really enjoy watching Seinfeld. You can read more
        about Seinfeld in section <ref idref="seinfeld"/> on
        page <pageref idref="seinfeld"/>.
    </p>
</chapter>

The ref and pageref elements can also point to any other object with an id attribute, such as a figure or a numbered equation. In that case ref will resolve to the corresponding object counter instead of the section counter.


(1) Depending on the target format, ecromedos may actually delegate the task of filling in cross-references to the formatting subsystem, such as is the case for LaTeX output.

4.3.2. Hyperlinks to External Resources

You can insert hyperlinks into your document with the link element, which has a single, mandatory attribute url, which is used in exactly the same way as the href attribute on HTML anchors:

Click this <link url="mailto:bobburnquist@example.com">link</link> to
send a mail to a non-existent e-mail address or visit
<link url="http://www.shredordie.com/">shredordie.com</link> for some
cool skate videos.

4.4. Automatic Counters

You can create new object counters with the counter element. For example, you may decide to create an “example” environment with its own object counter. An instance of such an “example” might look like this:

<p>
    <title>Example <counter group="example"
        simple="no" id="ex:counterhowto"/>:</title>
    <i>
    This is an example on how to use the <tt><b>counter</b></tt> element ...
    </i>
</p>

By giving the counter an id, you can cross-reference the counter using the ref and pageref elements (see section 4.3). If you set the optional simple attribute to yes, the section count will be omitted. In the rare event that you need to start counting from zero, set the optional base attribute to 0.
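
As a sketch of the simple attribute together with a cross-reference, the following markup numbers an example without the section count and then refers to it. The id ex:simple and the text are hypothetical.

<p>
    <title>Example <counter group="example"
        simple="yes" id="ex:simple"/>:</title>
    This example is numbered without the section count.
</p>
<p>
    See example <ref idref="ex:simple"/> on
    page <pageref idref="ex:simple"/>.
</p>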

4.5. Footnotes

Footnotes are inserted via the footnote element into the running text:

ecromedos supports the target formats XHTML and &latex;, where
the latter can be compiled into high-quality PostScript and PDF
via the &tex;<footnote>More information on &tex; and &latex;
can be found on the <i>Comprehensive &tex; Archive Network</i> at
<link url="http://www.ctan.org"><tt>http://www.ctan.org</tt></link>.
</footnote> typesetting system.

You may remember this footnote from section 1.1.

4.6. Inline and Block Quotes

Unless you are setting your text in typewriter letters, you will not be able to enter the correct quotation marks for your language directly with your keyboard. You could use XML character entities to access the glyphs, but that is tedious. Instead you should use the tags q and qq for single and double quoting, respectively.

When quoting large portions of text, consider using the blockquote element, which acts as a block element and may contain multiple paragraphs of text. Block quotes will be indented left and right to set them off from the rest of the text.
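
The quoting elements might be used as in the sketch below. It assumes that q may be nested inside qq and that blockquote contains ordinary p elements; the quoted text is a placeholder.

<p>
    The reviewer wrote: <qq>The term <q>semantic</q> is used
    loosely here.</qq>
</p>
<blockquote>
    <p>First paragraph of a longer quotation.</p>
    <p>Second paragraph of the same quotation.</p>
</blockquote>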

4.7. Useful Pre-Defined Entities

ecromedos defines a small set of entities that may come in handy occasionally. Table 4.2 shows the available entity names and what they stand for. The zero-width space is particularly useful for making long path names or Internet addresses break across lines without introducing hyphens or spaces.

Table 4.2: Pre-Defined Entities
 
Entity      Resolves to
&tex;       TeX
&latex;     LaTeX
&xetex;     XƎTeX
&xelatex;   XƎLaTeX
&nbsp;      The non-breaking space
&zwsp;      The zero-width space
&endash;    The en dash
&emdash;    The em dash
&dots;      ...
&check;     A check mark
&cross;     A cross mark

For direct access to these entities, you must include the following document type declaration at the top of your document:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE report SYSTEM "http://www.ecromedos.net/dtd/3.0/ecromedos.dtd">

If you start your documents by generating a template, as described in section 3.1.2, the document type declaration will already be in place. Entities can also be accessed by name, through the entity element, without including the document type declaration. For example, you can insert an em-dash into the text by writing <entity name="emdash"/>.
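
The two ways of accessing an entity can be compared in the following sketch. The first paragraph uses named references and therefore needs the document type declaration. The second uses the entity element and assumes that the name endash is accepted in the same way as the emdash example above. The path is hypothetical.

<p>
    See pages 10&endash;12. The settings are stored in
    <tt>/usr/&zwsp;local/&zwsp;share/&zwsp;example/&zwsp;settings.conf</tt>.
</p>
<p>
    See pages 10<entity name="endash"/>12.
</p>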