7. UML annotations - line_diversion¶

The Python module line_diversion extracts marked annotation lines from text files. The extraction result is generally a PlantUML diagram. Future extensions to extract other material (e.g. dot(1) graphs) are possible and probable.

While writing source code, the PlantUML annotations can be easily added near the source code lines. (See section Emacs support). The idea of this concept is to minimize the distance between source code and documentation (see section 8, Relevance of Documentation)[1].

7.1. Emacs support¶

A set of Emacs commands facilitates code annotation, reformatting of annotations and diagram preview. The commands are all prefixed with umlx- and bound with the prefix command C-# u. See C-# u C-h for key bindings:

C-# u a               umlx-mark-activity
C-# u b               umlx-re-mark-buffer
C-# u c               umlx-convert-old
C-# u g               umlx-grep-find
C-# u k               umlx-unmark
C-# u m               umlx-mark
C-# u o               umlx-occur
C-# u t               umlx-symbol-tag-delimiter
C-# u u               umlx-re-mark
C-# u x               umlx-extract

In addition to the C-# u <char> binding, some commands are also directly bound to C-# <char>:

C-# C-a               umlx-mark-activity
C-# C-b               umlx-re-mark-buffer
C-# C-k               umlx-unmark
C-# RET               umlx-mark
C-# C-t               umlx-symbol-tag-delimiter
C-# C-u               umlx-re-mark

Note

If C-# does not work (e.g. in a terminal), the prefix command C-c # can be used instead.

7.2. Annotation tags and markers¶

An annotation marker consists of

- an annotation tag, which consists of
  - a comment start sequence,
  - followed directly (without whitespace) by a tag symbol, which consists of
    - a line_diversion type
    - and a diagram number,
- optional type parameters,
- optional text
- and an optional comment end sequence.

ANNOTATION-MARKER:

<ANNOTATION-TAG> [<SPACE> TYPE-PARAMETERS ..] [<SPACE> <TAIL-TEXT>] [[<SPACE>] <COMMENT-END>]

ANNOTATION-TAG:

<COMMENT-START> <ANNOTATION-TAG-SYMBOL>

ANNOTATION-TAG-SYMBOL:

<LINE_DIVERSION-TYPE> <DIAGRAM-NUMBER>

E.g.:

#a55
a #a55 :;
b #a55 :; #red
#a55 ' (yes)

An annotated line is defined as:

[[<SPACE>] <COMMENT-START> <SPACE> <TEXT>] <ANNOTATION-MARKER>
| [<SPACE>] <KEYWORD> <ANNOTATION-MARKER>
| <TEXT> <ANNOTATION-MARKER> <SPACE> :[;]

7.2.1. Comment start regular expression¶

The regular expression for matching a comment start is defined in module line_diversion:

COMMENT_TYPE_RX='(?://+|/\\*+|;+|@:u?[bl]?comm_?@|--|<!--|#+|@[bl]?comm_?@|[.][.])'
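
To get a feeling for what this expression accepts, the following sketch applies it with Python's standard `re` module. The pattern is copied from the definition above (written as a raw string); the helper `comment_start` and the sample lines are illustrative assumptions and not part of line_diversion.

```python
import re

# Pattern copied from the definition above, written as a raw string.
COMMENT_TYPE_RX = r'(?://+|/\*+|;+|@:u?[bl]?comm_?@|--|<!--|#+|@[bl]?comm_?@|[.][.])'

def comment_start(line):
    """Return the comment start sequence at the beginning of line, or None.

    Hypothetical helper for illustration only.
    """
    mo = re.match(r'\s*(' + COMMENT_TYPE_RX + r')', line)
    return mo.group(1) if mo else None

# Sample lines in a few comment styles; the last one has no comment start.
for sample in ('# shell or Python', '// C++', '/* C */', ';; Lisp',
               '-- SQL', '<!-- HTML -->', '.. reST', 'plain text'):
    print(repr(sample), '->', comment_start(sample))
```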

7.2.2. Line diversion types¶

The line diversion types recognized are also defined in module line_diversion:

Marker	Diagram
#c[0-9]+	class diagram
#o[0-9]+	object diagram
#p[0-9]+	component diagram
#d[0-9]+	deployment diagram
#u[0-9]+	use case diagram
#a[0-9]+	activity diagram
#s[0-9]+	state machine diagram
#m[0-9]+	sequence diagram
#t[0-9]+	timing diagram

Please refer to line_diversion.py --help for authoritative information.
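
The table can be read as a simple lookup from type letter to diagram kind. The sketch below encodes it as a dictionary and splits a tag symbol into its two parts. `DIAGRAM_TYPES`, `TAG_SYMBOL_RX` and `describe_tag_symbol` are hypothetical names chosen for this illustration; they are not taken from line_diversion.

```python
import re

# Type letters and diagram names taken from the table above.
DIAGRAM_TYPES = {
    'c': 'class diagram',
    'o': 'object diagram',
    'p': 'component diagram',
    'd': 'deployment diagram',
    'u': 'use case diagram',
    'a': 'activity diagram',
    's': 'state machine diagram',
    'm': 'sequence diagram',
    't': 'timing diagram',
}

# Hypothetical helper pattern: one type letter followed by digits.
TAG_SYMBOL_RX = re.compile(r'([copduasmt])([0-9]+)$')

def describe_tag_symbol(symbol):
    """Split a tag symbol such as 'a55' into (diagram name, number string)."""
    mo = TAG_SYMBOL_RX.match(symbol)
    if mo is None:
        return None
    # The number is deliberately kept as a string, not converted to int.
    return DIAGRAM_TYPES[mo.group(1)], mo.group(2)

print(describe_tag_symbol('a55'))
print(describe_tag_symbol('m0'))
print(describe_tag_symbol('x1'))
```

Keeping the diagram number as a string means that symbols such as a0 and a00 stay distinct.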

7.3. Practical annotation¶

Most, but not all, annotations are comments. line_diversion also recognizes some annotated code lines like class, if, fi, … Arbitrary code lines can be marked as activity.

The first thing to do is to define an annotation tag symbol (in the example it is for an activity diagram). For activity diagrams, use a0, a1, a2, … However, the diagram numbers are strictly informational and can be arbitrary. The numbers are not interpreted as integers, i.e. a0 and a00 are different annotation tag symbols.

The tagging commands C-# m, C-# a, etc. use the last specified ANNOTATION-TAG-SYMBOL. When invoked with a prefix argument, e.g. C-u C-# m, they ask for the ANNOTATION-TAG-SYMBOL to be used.

Note that line_diversion.py displays diagrams in the order they are first encountered in the text, not sorted by diagram numbers.
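
The ordering rule can be pictured with a small sketch that groups annotated lines by tag symbol while preserving the order of first occurrence. This is only an illustration under simplifying assumptions: `TAG_RX` is a reduced, hypothetical matcher that knows only `#` as comment start, and `diagrams_in_order` is not a function of line_diversion.

```python
import re

# Simplified, hypothetical tag matcher: '#', a type letter, digits.
TAG_RX = re.compile(r'#([copduasmt][0-9]+)(?:\s|$)')

def diagrams_in_order(lines):
    """Group annotated lines by tag symbol, in order of first occurrence."""
    diagrams = {}  # dicts preserve insertion order (Python 3.7+)
    for line in lines:
        mo = TAG_RX.search(line)
        if mo:
            diagrams.setdefault(mo.group(1), []).append(line)
    return diagrams

source = [
    '# start   #a20',
    '# class A #c1',
    '# stop    #a20',
    '# start   #a0',
    '# start   #a00',
]
print(list(diagrams_in_order(source)))
# ['a20', 'c1', 'a0', 'a00']
```

Here a20 comes first because it is encountered first, and a0 and a00 end up as two separate diagrams.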

Annotation lines can be distributed anywhere throughout the source text. The annotation text is the same as regular PlantUML commands, with a few exceptions:

- @startuml and @enduml are implicitly added by line_diversion.py and must not be explicitly defined,
- the prefix and postfix of actions in activity diagrams (e.g. :activity ;) can be written after the PlantUML marker.

Here is an unannotated Python example:
```
# display `hello world`
printf ('hello world')
```
So instead of adding :, adding ; and marking the annotation with C-# m:
```
# :display `hello world`;                                         #a0
printf ('hello world')
```
the annotation can be marked with C-# a:
```
# display `hello world`                                           #a0 :;
printf ('hello world')
```
For a quick and dirty session, code statements can be marked directly as activities:
```
printf ('hello world')                                            #a0 :;
```
However, that is not the correct way to document anything, since it is sufficient to read the source code in such a case. There is simply no need to generate documentation which repeats the source code word for word. The exception is for a quick overview of unknown code, where the annotations are just temporary.

PlantUML elements other than actions in an activity diagram are not enclosed in special delimiters : and ;:

# start                                                           #a20
a = 5                                                             #a20 :;
b = a                                                             #a20 :;
if (a==b):                                                        #a20 (yes)
   a = 12                                                         #a20 :;
else:                                                             #a20
   b = 15                                                         #a20 :;
# show values of **a** and **b**                                  #a20 :;
print(a)
print(b)
# endif                                                           #a20
# stop                                                            #a20

The extraction command:

line_diversion.py --match '^a20$' README-uml-annotations-line-diversion.txt

results in:

@startuml
skinparam padding 1
start
:a = 5;
:b = a;
if (a==b) then (yes)
   :a = 12;
else
   :b = 15;
:show values of **a** and **b**;
endif
stop
@enduml
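
To make the transformation from annotated lines to PlantUML text more tangible, here is a strongly simplified sketch of the idea. It is not the implementation used by line_diversion.py: it is an assumption-laden stand-in that only understands `#` comments, the `:;` action marker and the keywords if and else, and it does not emit the skinparam line. `MARKER_RX` and `extract` are hypothetical names.

```python
import re

# Simplified marker: whitespace, '#', tag symbol, optional tail text.
MARKER_RX = re.compile(r'\s*#([copduasmt][0-9]+)(?:\s+(.*))?$')

def extract(lines, wanted):
    """Return PlantUML text for the annotation lines tagged with `wanted`."""
    out = ['@startuml']
    for line in lines:
        mo = MARKER_RX.search(line)
        if mo is None or mo.group(1) != wanted:
            continue                                  # not part of this diagram
        tail = (mo.group(2) or '').strip()
        left = line[:mo.start()]
        indent = left[:len(left) - len(left.lstrip())]
        body = re.sub(r'^#+\s*', '', left.strip())    # drop a leading comment start
        if tail == ':;':                              # action: wrap in ':' ... ';'
            out.append(indent + ':' + body + ';')
        elif re.match(r'if\b', body):                 # condition keyword
            cond = body[2:].strip().rstrip(':').strip()
            out.append((indent + 'if ' + cond + ' then ' + tail).rstrip())
        elif re.match(r'else\b', body):
            out.append(indent + 'else')
        else:                                         # plain PlantUML command
            out.append(indent + body)
    out.append('@enduml')
    return '\n'.join(out)

source = """\
# start                             #a20
a = 5                               #a20 :;
b = a                               #a20 :;
if (a==b):                          #a20 (yes)
   a = 12                           #a20 :;
else:                               #a20
   b = 15                           #a20 :;
# show values of **a** and **b**    #a20 :;
print(a)
print(b)
# endif                             #a20
# stop                              #a20
"""
print(extract(source.splitlines(), 'a20'))
```

Splitting each line at the marker first keeps the sketch short; the real module instead matches whole lines against several dedicated line parsers (see section Details).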

This output can be rendered and previewed in place with C-c u u v, which is bound to the command x-plantuml-preview-current-block.

The command umlx-extract, which is bound to C-# u x, combines extraction, rendering and previewing. Without a prefix argument, the diagram for the active default marker in the current file is displayed.

With a prefix of 0, the diagram marker can be specified.

With a negative prefix argument, the diagrams are only extracted. No preview is generated.

With a positive prefix argument (> 0 and < 16), the diagram with the corresponding sequence number is rendered and displayed. As mentioned earlier, the extraction sequence number (1, 2, 3, ...) of diagrams is unrelated to the DIAGRAM-NUMBER of the ANNOTATION-TAG-SYMBOL.

The last example in the text file README-uml-annotations-line-diversion.txt is extracted and rendered with C-u 0 C-# u x a20 RET as:

A prefix number < 0 (e.g., C-- C-# u x and C-0 C-# u x) shows the extracted PlantUML diagram definition instead of a preview. It is then very simple to walk through the output and preview diagrams with C-c u u v, which is bound to the command x-plantuml-preview-current-block. This is also convenient when debugging errors.

7.4. Activity Diagrams (extracted)¶

Activity diagrams for extracting UML diagrams from annotated source code:

7.4.1. Extract UML diagrams¶

7.4.2. Process Matching Line¶

7.5. Class Diagram (extracted)¶

Class diagram:

7.6. Command/Module Documentation¶

line_diversion.py - extract UML diagram from source code

usage:	line_diversion.py [OPTIONS] [file ..]
or	import line_diversion

7.6.1. Options¶

-m, --match RX only extract diagrams matching RX

-b, --file-base=BASE Write output to files: BASE + ID + ".puml". E.g. --file-base="file-base-" => file-base-a0.puml

--show-bases show base classes for class diagrams (default)

--hide-bases hide base classes for class diagrams

--ignore-bases CLASS ignore base class for class diagrams (can be specified multiple times) (default: object, PyJsMo)

-r, --replace REP register replacement REP. REP is a symbol and a replacement separated by whitespace.

--verbatim do not perform standard replacements

-q, --quiet suppress warnings

-v, --verbose verbose test output

-d, --debug[=NUM] show debug information

-h, --help display this help message

--template list show available templates.

--eide[=COMM] Emacs IDE template list (implies --template list).

--template[=NAME] extract named template to standard output. Default NAME is -.

--extract[=DIR] extract adhoc files to directory DIR (default: .)

--explode[=DIR] explode script with adhoc in directory DIR (default __adhoc__)

--setup[=install] explode script into temporary directory and call python setup.py install

--implode implode script with adhoc

-t, --test run doc tests
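
How the match and file-base options interact can be pictured with the following sketch. It assumes that the match expression is applied to the diagram ID with search semantics (which is why the example anchors it with ^ and $); whether line_diversion.py does exactly this is an assumption. `select_diagrams` and `output_name` are hypothetical helpers, not functions of the module.

```python
import re

def select_diagrams(ids, rx):
    """Keep the diagram IDs matching the regular expression rx.

    Assumption: search semantics, so unanchored patterns match anywhere.
    """
    return [diagram_id for diagram_id in ids if re.search(rx, diagram_id)]

def output_name(base, diagram_id):
    """Build the output file name as BASE + ID + '.puml'."""
    return base + diagram_id + '.puml'

ids = ['a0', 'a20', 'c1']
for diagram_id in select_diagrams(ids, '^a20$'):
    print(output_name('file-base-', diagram_id))
# file-base-a20.puml
```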

7.6.2. Module¶

These are the UML diagrams and type codes supported by PlantUML:

Marker	Diagram
#c[0-9]+	class diagram
#o[0-9]+	object diagram
#p[0-9]+	component diagram
#d[0-9]+	deployment diagram
#u[0-9]+	use case diagram
#a[0-9]+	activity diagram
#s[0-9]+	state machine diagram
#m[0-9]+	sequence diagram
#t[0-9]+	timing diagram

Structure Diagrams	Type	Behavior Diagrams	Type
Class Diagram	c	Use Case Diagram	u
Object Diagram	o	Activity Diagram	a
Component Diagram	p	State Machine Diagram	s
Deployment Diagram	d	Message Sequence Diagram	m
		Timing Diagram	t

After the action marker, a format ~"" or ~** can be specified.

This is the inheritance hierarchy of UML diagrams supported by PlantUML (clickable in HTML):

7.6.3. Automatic Exports¶

>>> for ex in __all__: printf(sformat('from {0} import {1}', __name__, ex))
from line_diversion import check_line_parse
from line_diversion import LineParser
from line_diversion import DiagramTiming
from line_diversion import DiagramMessageSequence
from line_diversion import DiagramInteraction
from line_diversion import DiagramStateMachine
from line_diversion import DiagramActivity
from line_diversion import DiagramUseCase
from line_diversion import DiagramBehavior
from line_diversion import DiagramDeployment
from line_diversion import DiagramComponent
from line_diversion import DiagramObject
from line_diversion import DiagramClass
from line_diversion import DiagramStructure
from line_diversion import Diagram
from line_diversion import LineDiversion

7.6.4. Explicit Exports¶

>>> if '__all_internal__' in globals():
...   for ex in __all_internal__:
...     printf(sformat('from {0} import {1}', __name__, ex))

7.6.5. Details¶

See check_line_parse() for a comprehensive list of line matchers.

7.6.5.1. Prefix Match¶

>>> printf(PREFIX_LP)
{
    "rx": [
        "^(\\s*)(?://+|/\\*+|;+|@:u?[bl]?comm_?@|--|#+|@[bl]?comm_?@|[.][.])([copduasmt][0-9]+)(?:\\s|$)",
        "0"
    ],
    "groups": {
        "whitespace": 1,
        "id": 2,
        "text": null,
        "keyword": null,
        "condition": null
    }
}

>>> mo, _id = PREFIX_LP.match('#''a3 if (test) then (yes)')
>>> printf(mo.groups())
('', 'a3')

7.6.5.2. Condition Match¶

>>> printf(COND_LP)
{
    "rx": [
        "^(\\s*)(break|class|def|done|elif|else|elseif|elsif|fi|for|if|while|\\})(?::?|\\s)\\s*(.*)\\s*(?://+|/\\*+|;+|@:u?[bl]?comm_?@|--|#+|@[bl]?comm_?@|[.][.])([copduasmt][0-9]+)(?:\\s(.*)|\\s*)$",
        "0"
    ],
    "groups": {
        "whitespace": 1,
        "id": 4,
        "text": 5,
        "keyword": 2,
        "condition": 3
    }
}

>>> mo, _id = COND_LP.match('        if remove_count:  #''a1 then (yeah)')
>>> printf(mo.groups())
('        ', 'if', 'remove_count:  ', 'a1', 'then (yeah)')

>>> mo, _id = COND_LP.match('} #''a1')
>>> printf(mo.groups())
('', '}', '', 'a1', None)

7.6.5.3. Action Match¶

>>> printf(ACT_LP)
{
    "rx": [
        "^(\\s*)(?:(?://+|/\\*+|;+|@:u?[bl]?comm_?@|--|#+|@[bl]?comm_?@|[.][.]) )?(.*)\\s*(?://+|/\\*+|;+|@:u?[bl]?comm_?@|--|#+|@[bl]?comm_?@|[.][.])([copduasmt][0-9]+) (:[];|<>/}]?|[];|<>/}.-])\\s*(?:(#[0-9A-Za-z]+|backwards)(?:\\s+|$))?(?:\\s(.*)|\\s*)$",
        "0"
    ],
    "groups": {
        "whitespace": 1,
        "id": 3,
        "text": 6,
        "keyword": 4,
        "condition": 2,
        "cond_pfx": 5
    }
}

>>> mo, _id = ACT_LP.match('# * call `process_parts #''a3 :| #red')
>>> printf(mo.groups())
('', '* call `process_parts ', 'a3', ':|', '#red', None)

>>> mo, _id = ACT_LP.match('        something = more #''a1 :')
>>> printf(mo.groups())
('        ', 'something = more ', 'a1', ':', None, None)

>>> mo, _id = ACT_LP.match('        something = more #''a1 ::')
>>> if mo: printf(mo.groups())

>>> mo, _id = ACT_LP.match('        something = more #''a1 :-')
>>> if mo: printf(mo.groups())

>>> mo, _id = ACT_LP.match('        something = more #''a1 :;')
>>> printf(mo.groups())
('        ', 'something = more ', 'a1', ':;', None, None)

>>> mo, _id = ACT_LP.match('rm -f "${top_dir}/.hgignore.new" #''a1 :')
>>> printf(mo.groups())
('', 'rm -f "${top_dir}/.hgignore.new" ', 'a1', ':', None, None)

>>> printf(ACT_RX.pattern)
^(\s*)(?:(?://+|/\*+|;+|@:u?[bl]?comm_?@|--|#+|@[bl]?comm_?@|[.][.]) )?(.*)\s*(?://+|/\*+|;+|@:u?[bl]?comm_?@|--|#+|@[bl]?comm_?@|[.][.])([copduasmt][0-9]+) (:[];|<>/}]?|[];|<>/}.-])\s*(?:(#[0-9A-Za-z]+|backwards)(?:\s+|$))?(?:\s(.*)|\s*)$

>>> printf(ACT_LP.groups)
OrderedDict([('whitespace', 1), ('id', 3), ('text', 6), ('keyword', 4), ('condition', 2), ('cond_pfx', 5)])
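
The pattern and the group mapping printed above can also be tried without importing line_diversion. The sketch below copies both verbatim and returns the matched parts by name; `named_parts` is a hypothetical helper for this illustration. As in the doctests, the marker is split across two string literals so that the example line itself is not picked up as an annotation.

```python
import re
from collections import OrderedDict

# Pattern and group numbers copied from the output above.
ACT_RX = re.compile(
    r'^(\s*)(?:(?://+|/\*+|;+|@:u?[bl]?comm_?@|--|#+|@[bl]?comm_?@|[.][.]) )?'
    r'(.*)\s*(?://+|/\*+|;+|@:u?[bl]?comm_?@|--|#+|@[bl]?comm_?@|[.][.])'
    r'([copduasmt][0-9]+) (:[];|<>/}]?|[];|<>/}.-])\s*'
    r'(?:(#[0-9A-Za-z]+|backwards)(?:\s+|$))?(?:\s(.*)|\s*)$')
ACT_GROUPS = OrderedDict([('whitespace', 1), ('id', 3), ('text', 6),
                          ('keyword', 4), ('condition', 2), ('cond_pfx', 5)])

def named_parts(line):
    """Return the regex groups of an action line by name, or None."""
    mo = ACT_RX.match(line)
    if mo is None:
        return None
    return {name: mo.group(number) for name, number in ACT_GROUPS.items()}

print(named_parts('# * call `process_parts #' 'a3 :| #red'))
print(named_parts('        something = more #' 'a1 ::'))
```

Note that, following the mapping, the text in front of the marker is stored under the name condition, while text holds whatever follows the action marker and the optional colour.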

line_diversion.check_line_parse(*exprs)[source]¶

Returns:

prints

>>> check_line_parse('# s/^+//''p') #doctest: +ELLIPSIS
# --------------------------------------------------
#  ||:exp:||  # s/^+//p
# --------------------------------------------------
#    :DBG:    ACT_LP           : ]--[ ]()[
#    :DBG:    ATTRIB_LP        : ]--[ ]()[
#    :DBG:    COND_LP          : ]--[ ]()[
#    :DBG:    PREFIX_LP        : ]--[ ]()[
#    :DBG:    PRINTE_SFORMAT_LP: ]--[ ]()[
#    :DBG:    PYJSMO_ATTRIB_LP : ]--[ ]()[
#    :DBG:    SUFFIX_LP        : ]--[ ]()[

>>> check_line_parse('# start #''a99') #doctest: +ELLIPSIS
# --------------------------------------------------
#  ||:exp:||  # start #a...99
# --------------------------------------------------
#    :DBG:    ACT_LP           : ]--[ ]()[
#    :DBG:    ATTRIB_LP        : ]--[ ]()[
#    :DBG:    COND_LP          : ]--[ ]()[
#    :DBG:    PREFIX_LP        : ]--[ ]()[
#    :DBG:    PRINTE_SFORMAT_LP: ]--[ ]()[
#    :DBG:    PYJSMO_ATTRIB_LP : ]--[ ]()[
#    :DBG:    SUFFIX_LP        : ]a99[ ]('', 'start ', 'a99')[

>>> check_line_parse('.. start ..''a99') #doctest: +ELLIPSIS
# --------------------------------------------------
#  ||:exp:||  .. start ..a...99
# --------------------------------------------------
#    :DBG:    ACT_LP           : ]--[ ]()[
#    :DBG:    ATTRIB_LP        : ]--[ ]()[
#    :DBG:    COND_LP          : ]--[ ]()[
#    :DBG:    PREFIX_LP        : ]--[ ]()[
#    :DBG:    PRINTE_SFORMAT_LP: ]--[ ]()[
#    :DBG:    PYJSMO_ATTRIB_LP : ]--[ ]()[
#    :DBG:    SUFFIX_LP        : ]a99[ ]('', 'start ', 'a99')[

>>> check_line_parse('start #''a99') #doctest: +ELLIPSIS
# --------------------------------------------------
#  ||:exp:||  start #a...99
# --------------------------------------------------
#    :DBG:    ACT_LP           : ]--[ ]()[
#    :DBG:    ATTRIB_LP        : ]--[ ]()[
#    :DBG:    COND_LP          : ]--[ ]()[
#    :DBG:    PREFIX_LP        : ]--[ ]()[
#    :DBG:    PRINTE_SFORMAT_LP: ]--[ ]()[
#    :DBG:    PYJSMO_ATTRIB_LP : ]--[ ]()[
#    :DBG:    SUFFIX_LP        : ]--[ ]()[

>>> check_line_parse(' shell command #a0 :;') #doctest: +ELLIPSIS
# --------------------------------------------------
#  ||:exp:||   shell command #a0 :;
# --------------------------------------------------
#    :DBG:    ACT_LP           : ]a0[ ](' ', 'shell command ', 'a0', ':;', None, None)[
#    :DBG:    ATTRIB_LP        : ]--[ ]()[
#    :DBG:    COND_LP          : ]--[ ]()[
#    :DBG:    PREFIX_LP        : ]--[ ]()[
#    :DBG:    PRINTE_SFORMAT_LP: ]--[ ]()[
#    :DBG:    PYJSMO_ATTRIB_LP : ]--[ ]()[
#    :DBG:    SUFFIX_LP        : ]--[ ]()[

class line_diversion.LineParser(*args, **kwargs)[source]¶

>>> lp = LineParser()
>>> printf(lp)
{
    "rx": [
        null,
        "0"
    ],
    "groups": {
        "whitespace": null,
        "id": null,
        "text": null,
        "keyword": null,
        "condition": null
    }
}

>>> printf(lp._pyjsmo_x_rx)
None

>>> lp.rx = re.compile('some', re.I | re.M | re.U)
>>> printf(lp)
{
    "rx": [
        "some",
        "re.IGNORECASE|re.MULTILINE|re.UNICODE"
    ],
    "groups": {
        "whitespace": null,
        "id": null,
        "text": null,
        "keyword": null,
        "condition": null
    }
}

>>> printf(trans_rx_repr(lp._pyjsmo_x_rx)) #doctest: +ELLIPSIS
re.compile('some', re.IGNORECASE|re.MULTILINE|re.UNICODE)

>>> lp.rx = ("some where", "re.IGNORECASE|re.UNICODE")
>>> printf(lp)
{
    "rx": [
        "some where",
        "re.IGNORECASE|re.UNICODE"
    ],
    "groups": {
        "whitespace": null,
        "id": null,
        "text": null,
        "keyword": null,
        "condition": null
    }
}

>>> printf(trans_rx_repr(lp._pyjsmo_x_rx)) #doctest: +ELLIPSIS
re.compile('some where', re.IGNORECASE|re.UNICODE)

assemble(part_map, diagram)[source]¶

Returns:	assembled line.

init(rx, groups)[source]¶

Returns:	self for chaining.

match(line)[source]¶

Match against line.

Returns:	match object or None.

rx¶: Programmatic property.

split(line, mo, diagram)[source]¶

Returns:	part map with the entries indent, text, kw, cond_pfx, condr, rest, mapped_kw, kw_sep, cond, kw_cont_sep, mapped_kw_cont

split_(line, mo, diagram)[source]¶

Returns:	part map

class line_diversion.DiagramTiming(*args, **kwargs)[source]¶: LineDiversion for Timing Diagram.

class line_diversion.DiagramMessageSequence(*args, **kwargs)[source]¶: LineDiversion for Message Sequence Diagram.

class line_diversion.DiagramInteraction(*args, **kwargs)[source]¶: LineDiversion for Interaction Diagram.

class line_diversion.DiagramStateMachine(*args, **kwargs)[source]¶: LineDiversion for State Machine Diagram.

class line_diversion.DiagramActivity(*args, **kwargs)[source]¶: LineDiversion for Activity Diagram.

class line_diversion.DiagramUseCase(*args, **kwargs)[source]¶: LineDiversion for Use Case Diagram.

class line_diversion.DiagramBehavior(*args, **kwargs)[source]¶: LineDiversion for Behavior Diagram.

class line_diversion.DiagramDeployment(*args, **kwargs)[source]¶: LineDiversion for Deployment Diagram.

class line_diversion.DiagramComponent(*args, **kwargs)[source]¶: LineDiversion for Component Diagram.

class line_diversion.DiagramObject(*args, **kwargs)[source]¶: LineDiversion for Object Diagram.

class line_diversion.DiagramClass(*args, **kwargs)[source]¶: LineDiversion for Class Diagram.

class line_diversion.DiagramStructure(*args, **kwargs)[source]¶: LineDiversion for Structure Diagram.

class line_diversion.Diagram(*args, **kwargs)[source]¶

close_partition()[source]¶

Returns:	self for chaining.

open_partition(text)[source]¶

Returns:	self for chaining.

process_parts(part_map)[source]¶

Returns:	new/altered part_map.

>>> check_def = 'SomeClass ( with,multiple, inheritance )'
>>> mo = re.match(CLASS_DEF_RX, check_def)
>>> if mo: printf(sformat("{0}//{1}//", mo.groups(), check_def[mo.end(0):]))
('SomeClass', 'with,multiple, inheritance ')////

>>> class Check():
...     pass

>>> check_def = 'SomeClass ()'
>>> mo = re.match(CLASS_DEF_RX, check_def)
>>> if mo: printf(sformat("{0}//{1}//", mo.groups(), check_def[mo.end(0):]))
('SomeClass', '')////

>>> check_def = 'SomeClass'
>>> mo = re.match(CLASS_DEF_RX, check_def)
>>> if mo: printf(sformat("{0}//{1}//", mo.groups(), check_def[mo.end(0):]))
('SomeClass', None)////

class line_diversion.LineDiversion(*args, **kwargs)[source]¶

close_partition()[source]¶

Returns:	self for chaining.

finish()[source]¶

Returns:	self for chaining.

open_partition(text)[source]¶

Returns:	self for chaining.

[1]

Normally, PlantUML code is placed before the source code of a script, class, function etc. or in a separate text file. In this case it is necessary to have two windows, one for documenting the source code as PlantUML and the other for coding. This can lead to the phenomenon of big differences between the source code and its documentation, because of laziness. To prevent this from happening, it is useful to use UML annotations.