Formatting

Dictionaries served from Dict servers are typically formatted as plain text suitable for fixed-width terminals. Such text tends to be hard to read, especially for extended periods. Often the text has a regular structure and can be converted into HTML, which is much easier to read.

For this purpose Dikt has a text reformatting engine. This document describes how to write a formatting configuration.

Example

Formatting pays off most for dictionaries with longer descriptions, but for the purpose of explanation it is better to take the simplest example, such as Freedict.

An entry from the English-Norwegian Freedict dictionary:

poem [pouim] n

    dikt

The first line contains the headword, the pronunciation, and the abbreviation. The rest of the entry is the description. The pronunciation is inside the square brackets, the headword comes before it, and the abbreviation comes after it.
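
As an illustration only, the layout described above can be captured with a single regular expression. The following Python sketch is not how Dikt does it (Dikt applies the rules from format.conf); the helper name split_entry and the pattern are assumptions made for this example.

```python
import re

# Hypothetical helper for illustration; not part of Dikt.
# First line layout: headword, then [pronunciation], then abbreviation.
FIRST_LINE = re.compile(r"^([^\[\n]+)\[([^\]\n]+)\]([^\n]*)")

def split_entry(entry):
    """Split a Freedict-style entry into its four parts."""
    first, _, rest = entry.partition("\n")
    match = FIRST_LINE.match(first)
    if match is None:
        raise ValueError("expected: headword [pronunciation] abbreviation")
    headword, pronunciation, abbreviation = (g.strip() for g in match.groups())
    # Everything after the first line is the description.
    return headword, pronunciation, abbreviation, rest.strip()

print(split_entry("poem [pouim] n\n\n    dikt"))
# ('poem', 'pouim', 'n', 'dikt')
```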

HTML markup

In HTML this becomes:

<div class="entry">
<dl>
<dt><span class="hword">poem</span>
<span class="pronc">[pouim]</span>
<span class="abrev">n</span>
<dd class="descr">dikt
</dl>
</div>

And it might be rendered like:

poem [pouim] n
dikt
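
To make the mapping between the parts of the entry and the markup explicit, here is a minimal Python sketch that assembles the HTML shown above from the four parts. The function entry_to_html is hypothetical and exists only for this example; Dikt itself produces the markup with the rules from format.conf.

```python
from html import escape

def entry_to_html(headword, pronunciation, abbreviation, description):
    """Build the markup for one entry; class names match the example above."""
    return "\n".join([
        '<div class="entry">',
        '<dl>',
        f'<dt><span class="hword">{escape(headword)}</span>',
        f'<span class="pronc">{escape(pronunciation)}</span>',
        f'<span class="abrev">{escape(abbreviation)}</span>',
        f'<dd class="descr">{escape(description)}',
        '</dl>',
        '</div>',
    ])

print(entry_to_html("poem", "[pouim]", "n", "dikt"))
```

Escaping the text parts keeps characters such as < or & in a dictionary entry from being read as markup.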

Configuration

Configuration files are installed in share/apps/dikt. The formatting rules are in format.conf, and the styles are in dikt.css. Copy these files to your local configuration and make your changes there.

An entry in format.conf:

[dict]
1-rule=old,new
...

This rule will replace every occurrence of old with new in dictionary dict.

The first line is the header in square brackets. It is a substring of the database name or of the dictionary title, and it is used to match the formatting rules with the dictionary.

The formatting rules follow the header. They are key-value pairs separated by an equals sign (=). Each key begins with a number that defines the order in which the rules are applied. The string part of the key is optional; it simply serves as a comment. The value is a regular expression and its replacement, separated by a comma (,).
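
The way numbered rules are applied one after another can be sketched in Python. This is an illustration under stated assumptions, not Dikt's implementation: parse_rules and apply_rules are hypothetical names, the first comma is assumed to separate the pattern from the replacement, and Python's re module stands in for Qt's regular expressions, which differ in some details.

```python
import re

def parse_rules(lines):
    """Turn 'N-comment=pattern,replacement' lines into an ordered rule list."""
    rules = []
    for line in lines:
        key, _, value = line.partition("=")
        # The leading digits give the order; the rest of the key is a comment.
        order = int(re.match(r"\d+", key).group())
        # Assumption: the first comma separates pattern from replacement.
        pattern, _, replacement = value.partition(",")
        rules.append((order, pattern, replacement))
    return sorted(rules, key=lambda rule: rule[0])

def apply_rules(rules, text):
    """Apply each rule in turn to the result of the previous one."""
    for _, pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text

# The rules are listed out of order on purpose; the numbers decide.
rules = parse_rules(["2-second=b,c", "1-first=a,b"])
print(apply_rules(rules, "banana"))
# ccncnc
```

Because each rule sees the output of the previous one, the order matters: with the two rules swapped, the same input would give cbnbnb.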

Freedict entry for the example above:

[freedict]
1-pronc=(\\\\[[^\\\\n]+\\\\]),<span class="pronc">\\\\1</span>
2-hword=^([^\\\\n\\\\[]+),<dt><span class="hword">\\\\1</span>
3-abrev=(^.+\\\\])([^\\\\n]+),\\\\1<span class="abrev">\\\\2</span>
4-descr=\\\\n\\\\s+(\\\\S+),<dd class="descr">\\\\1

Note

For a backslash to make it through to the regular expression, it has to be entered four times.

For help on regular expressions, see the Qt RegExp documentation.

The styles used in this example can be configured further in the Dikt settings. In dikt.css they contain substitution numbers instead of literal text. Each number marks the place where a value from the settings is inserted.

For help on HTML and CSS, see the Qt HTML documentation.

Debugging

There are two shortcuts to help with debugging. Page Source shows the page in plain text, and can help to find out where the tags are getting placed. Reformat reloads the formatting configuration and reformats the page.

These two shortcuts are accessible from the context menu, but by default there are no keys bound to them, to spare regular users from accidentally triggering them. You can assign them any keys you like in the shortcuts editor.

...

At first all this might seem much more complicated than it is. The best way to start is to open the format.conf file in an editor, edit it, and see how the changes affect the rendering.