The Python module line_diversion extracts marked annotation lines from text files. The extraction result is generally a PlantUML diagram. Future extensions to extract other material (e.g. dot(1) graphs) are possible and probable.
While writing source code, PlantUML annotations can easily be added next to the source code lines (see section Emacs support). The idea is to minimize the distance between source code and documentation (see section 8, Relevance of Documentation)[1].
A set of Emacs commands facilitates code annotation, reformatting of annotations and diagram preview. The commands are all prefixed with umlx- and bound with the prefix command C-# u. See C-# u C-h for key bindings:
C-# u a umlx-mark-activity
C-# u b umlx-re-mark-buffer
C-# u c umlx-convert-old
C-# u g umlx-grep-find
C-# u k umlx-unmark
C-# u m umlx-mark
C-# u o umlx-occur
C-# u t umlx-symbol-tag-delimiter
C-# u u umlx-re-mark
C-# u x umlx-extract
In addition to the C-# u <char> binding, some commands are also directly bound to C-# <char>:
C-# C-a umlx-mark-activity
C-# C-b umlx-re-mark-buffer
C-# C-k umlx-unmark
C-# RET umlx-mark
C-# C-t umlx-symbol-tag-delimiter
C-# C-u umlx-re-mark
Note
If C-# does not work (e.g. in a terminal), the prefix command C-c # can be used instead.
An annotation marker consists of
ANNOTATION-MARKER:
<ANNOTATION-TAG> [<SPACE> TYPE-PARAMETERS ..] [<SPACE> <TAIL-TEXT>] [[<SPACE>] <COMMENT-END>]
ANNOTATION-TAG:
<COMMENT-START> <ANNOTATION-TAG-SYMBOL>
ANNOTATION-TAG-SYMBOL:
<LINE_DIVERSION-TYPE> <DIAGRAM-NUMBER>
E.g.:
#a55
a #a55 :;
b #a55 :; #red
#a55 ' (yes)
An annotated line is defined as:
[[<SPACE>] <COMMENT-START> <SPACE> <TEXT>] <ANNOTATION-MARKER>
| [<SPACE>] <KEYWORD> <ANNOTATION-MARKER>
| <TEXT> <ANNOTATION-MARKER> <SPACE> :[;]
The regular expression for matching a comment start is defined in module line_diversion:
COMMENT_TYPE_RX='(?://+|/\\*+|;+|@:u?[bl]?comm_?@|--|<!--|#+|@[bl]?comm_?@|[.][.])'
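As a rough illustration of what this pattern accepts, the following sketch applies it to the start of a few sample lines. The pattern is copied from COMMENT_TYPE_RX above (written as a raw string); the helper comment_start() and the sample lines are assumptions made for this example and are not part of line_diversion.

```python
import re

# Pattern copied from COMMENT_TYPE_RX above, written as a raw string.
COMMENT_TYPE_RX = r'(?://+|/\*+|;+|@:u?[bl]?comm_?@|--|<!--|#+|@[bl]?comm_?@|[.][.])'


def comment_start(line):
    """Return the comment start found at the beginning of line, or None.

    Hypothetical helper for illustration only.
    """
    mo = re.match(r'\s*(' + COMMENT_TYPE_RX + r')', line)
    return mo.group(1) if mo else None


samples = ['# python', '// c++', '/* c */', ';; lisp',
           '-- sql', '<!-- html -->', '.. rst', 'x = 1']
for sample in samples:
    print(repr(sample), '->', repr(comment_start(sample)))
```

The last sample has no comment start, so the helper returns None for it.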
The line diversion types recognized are also defined in module line_diversion:
Marker | Diagram
---|---
#c[0-9]+ | class diagram
#o[0-9]+ | object diagram
#p[0-9]+ | component diagram
#d[0-9]+ | deployment diagram
#u[0-9]+ | use case diagram
#a[0-9]+ | activity diagram
#s[0-9]+ | state machine diagram
#m[0-9]+ | sequence diagram
#t[0-9]+ | timing diagram
Please refer to line_diversion.py --help for authoritative information.
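The marker table can be read as a lookup from the type letter to the diagram kind. The sketch below splits an ANNOTATION-TAG-SYMBOL into its type and diagram number; the names DIAGRAM_TYPES and describe() are hypothetical and only mirror the table, they are not taken from the module.

```python
import re

# Type letters and diagram kinds as listed in the marker table.
DIAGRAM_TYPES = {
    'c': 'class diagram',
    'o': 'object diagram',
    'p': 'component diagram',
    'd': 'deployment diagram',
    'u': 'use case diagram',
    'a': 'activity diagram',
    's': 'state machine diagram',
    'm': 'sequence diagram',
    't': 'timing diagram',
}

TAG_SYMBOL_RX = re.compile(r'([copduasmt])([0-9]+)')


def describe(tag_symbol):
    """Return (diagram kind, diagram number) or None for an unknown symbol.

    The diagram number is kept as a string, it is not converted to int.
    """
    mo = TAG_SYMBOL_RX.fullmatch(tag_symbol)
    if mo is None:
        return None
    return DIAGRAM_TYPES[mo.group(1)], mo.group(2)


print(describe('a55'))
print(describe('m0'))
print(describe('x1'))
```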
Most, but not all, annotations are comments. line_diversion also recognizes some annotated code lines like class, if, fi, …
Arbitrary code lines can be marked as activity.
The first thing to do is to define an annotation tag symbol (in the example it is for an activity diagram). For activity diagrams, use a0, a1, a2, … However, the diagram numbers are strictly informational and can be arbitrary. The numbers are not interpreted as integers, i.e. a0 and a00 are different annotation tag symbols.
The tagging commands C-# m, C-# a, etc. use the last specified ANNOTATION-TAG-SYMBOL. They will ask for the ANNOTATION-TAG-SYMBOL to be used, when invoked with a prefix argument, e.g. C-u C-# m.
Note that line_diversion.py displays diagrams in the order they are first encountered in the text, not sorted by diagram numbers.
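Both points (tag symbols are compared as strings, and diagrams appear in first-encounter order) can be modelled with an insertion-ordered dictionary. This is a minimal sketch under the assumption that only # comment starts occur; TAG_RX and diagram_order() are hypothetical names and not the module's API.

```python
import re

# Simplified tag pattern: '#' followed by the tag symbol (assumption: only
# '#' comment starts are considered in this sketch).
TAG_RX = re.compile(r'#([copduasmt][0-9]+)(?:\s|$)')


def diagram_order(lines):
    """Return the tag symbols in the order they are first encountered."""
    diagrams = {}
    for line in lines:
        mo = TAG_RX.search(line)
        if mo:
            # Keys are strings, so 'a2' and 'a02' stay separate diagrams.
            diagrams.setdefault(mo.group(1), []).append(line)
    return list(diagrams)


lines = [
    '# start #a10',
    'x = 1 #a2 :;',
    '# start #a02',
    '# stop #a10',
    '# stop #a2',
]
print(diagram_order(lines))
```

The result lists a10 before a2 because a10 is seen first, and a02 is kept as a third diagram.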
Annotation lines can be distributed anywhere throughout the source text. The annotation text is the same as regular PlantUML commands, with a few exceptions:
@startuml and @enduml are implicitly added by line_diversion.py and must not be explicitly defined,
the prefix and postfix of actions in activity diagrams (e.g. :activity ;) can be written after the PlantUML marker.
Here is an unannotated Python example:
# display `hello world`
printf ('hello world')
So instead of adding :, adding ; and marking the annotation with C-# m:
# :display `hello world`; #a0
printf ('hello world')
the annotation can be marked with C-# a:
# display `hello world` #a0 :;
printf ('hello world')
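The two notations are meant to yield the same PlantUML action. The following sketch illustrates the equivalence with two simplified patterns; it only handles # comments and the :; type parameter, and the names PLAIN_RX, ACTIVITY_RX and plantuml_line() are assumptions for this example, not the module's implementation.

```python
import re

# Marked with C-# m: the delimiters ':' and ';' are part of the text.
PLAIN_RX = re.compile(r'^\s*#+ (.*?)\s*#+([copduasmt][0-9]+)\s*$')
# Marked with C-# a: ':;' after the tag symbol supplies the delimiters.
ACTIVITY_RX = re.compile(r'^\s*#+ (.*?)\s*#+([copduasmt][0-9]+) :;\s*$')


def plantuml_line(line):
    """Return the PlantUML text for an annotated comment line, or None."""
    mo = ACTIVITY_RX.match(line)
    if mo:
        return ':' + mo.group(1) + ';'
    mo = PLAIN_RX.match(line)
    if mo:
        return mo.group(1)
    return None


print(plantuml_line('# :display `hello world`; #a0'))
print(plantuml_line('# display `hello world` #a0 :;'))
```

Both calls print the same action line.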
For a quick and dirty session, code statements can be marked directly as activities:
printf ('hello world') #a0 :;
However, that is not the correct way to document anything, since it is sufficient to read the source code in such a case. There is simply no need to generate a documentation which repeats the source code word for word. The exception is for a quick overview of unknown code, where the annotations are just temporary.
PlantUML elements other than actions in an activity diagram are not enclosed in the special delimiters : and ;:
# start #a20
a = 5 #a20 :;
b = a #a20 :;
if (a==b): #a20 (yes)
a = 12 #a20 :;
else: #a20
b = 15 #a20 :;
# show values of **a** and **b** #a20 :;
print(a)
print(b)
# endif #a20
# stop #a20
The extraction command:
line_diversion.py --match '^a20$' README-uml-annotations-line-diversion.txt
results in:
@startuml
skinparam padding 1
start
:a = 5;
:b = a;
if (a==b) then (yes)
:a = 12;
else
:b = 15;
:show values of **a** and **b**;
endif
stop
@enduml
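To make the mapping from annotated lines to PlantUML text concrete, here is a strongly simplified sketch of the extraction for this one example. It is not the implementation used by line_diversion.py: it only knows # comments, the :; type parameter and the keywords if and else, and it omits the skinparam line. All names (TAG, ACTIVITY_RX, KEYWORD_RX, COMMENT_RX, render, extract) are hypothetical.

```python
import re

# Assumption: only '#' comment starts, as in the example above.
TAG = r'#+(?P<id>[copduasmt][0-9]+)'
ACTIVITY_RX = re.compile(r'^\s*(?:#+ )?(?P<text>.*?)\s*' + TAG + r' :;\s*$')
KEYWORD_RX = re.compile(
    r'^\s*(?P<kw>if|else)\b\s*(?P<cond>.*?):?\s*' + TAG
    + r'(?:\s+(?P<tail>.*?))?\s*$')
COMMENT_RX = re.compile(r'^\s*#+ (?P<text>.*?)\s*' + TAG + r'\s*$')

SOURCE = '''\
# start #a20
a = 5 #a20 :;
b = a #a20 :;
if (a==b): #a20 (yes)
    a = 12 #a20 :;
else: #a20
    b = 15 #a20 :;
# show values of **a** and **b** #a20 :;
print(a)
print(b)
# endif #a20
# stop #a20
'''.splitlines()


def render(rx, mo):
    """Turn a match of one of the three patterns into a PlantUML line."""
    if rx is ACTIVITY_RX:
        return ':' + mo.group('text') + ';'
    if rx is KEYWORD_RX:
        parts = [mo.group('kw')]
        if mo.group('cond'):
            parts.append(mo.group('cond'))
        if mo.group('tail'):
            parts.extend(['then', mo.group('tail')])
        return ' '.join(parts)
    return mo.group('text')


def extract(lines, diagram_id):
    """Collect the lines annotated with diagram_id between @startuml/@enduml."""
    out = ['@startuml']
    for line in lines:
        for rx in (ACTIVITY_RX, KEYWORD_RX, COMMENT_RX):
            mo = rx.match(line)
            if mo and mo.group('id') == diagram_id:
                out.append(render(rx, mo))
                break
    out.append('@enduml')
    return out


print('\n'.join(extract(SOURCE, 'a20')))
```

Lines without an annotation marker, such as print(a), are simply skipped.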
This output can be rendered and previewed in place with C-c u u v, which is bound to the command x-plantuml-preview-current-block.
The command umlx-extract, which is bound to C-# u x, combines extraction, rendering and previewing. Without a prefix argument, the diagram for the active default marker in the current file is displayed.
With a prefix of 0, the diagram marker can be specified.
With a negative prefix argument, the diagrams are only extracted. No preview is generated.
With a positive prefix argument > 0 (and < 16), the diagram with the corresponding sequence number is rendered and displayed. As mentioned earlier, the extraction sequence number (1, 2, 3, ..) of diagrams is unrelated to the DIAGRAM-NUMBER of the ANNOTATION-TAG-SYMBOL.
The last example in the text file README-uml-annotations-line-diversion.txt is extracted and rendered with C-u 0 C-# u x a20 RET as:
A prefix number < 0 (e.g., C-- C-# u x and C-0 C-# u x) shows the extracted PlantUML diagram definition instead of a preview. It is then very simple to walk through the output and preview diagrams with C-c u u v, which is bound to the command x-plantuml-preview-current-block. This is also convenient when debugging errors.
Activity diagrams for extracting UML diagrams from annotated source code:
line_diversion.py - extract UML diagram from source code
usage: line_diversion.py [OPTIONS] [file ..]
or     import line_diversion
-m, --match RX         only extract diagrams matching RX
-b, --file-base=BASE   Write output to files: BASE + ID + ".puml". E.g. --file-base="file-base-" => file-base-a0.puml
--show-bases           show base classes for class diagrams (default)
--hide-bases           hide base classes for class diagrams
--ignore-bases CLASS   ignore base class for class diagrams (can be specified multiple times) (default: object, PyJsMo)
-r, --replace REP      register replacement REP. REP is a symbol and a replacement separated by whitespace.
--verbatim             do not perform standard replacements
-q, --quiet            suppress warnings
-v, --verbose          verbose test output
-d, --debug[=NUM]      show debug information
-h, --help             display this help message
--template list        show available templates.
--eide[=COMM]          Emacs IDE template list (implies --template list).
--template[=NAME]      extract named template to standard output. Default NAME is -.
--extract[=DIR]        extract adhoc files to directory DIR (default: .)
--explode[=DIR]        explode script with adhoc in directory DIR (default __adhoc__)
--setup[=install]      explode script into temporary directory and call python setup.py install
--implode              implode script with adhoc
-t, --test             run doc tests
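The effect of --match and --file-base can be pictured with two small helpers. This is only a sketch: select_diagrams() and output_name() are hypothetical names, and it is an assumption that the RX is applied with search semantics (with an anchored pattern such as ^a20$ this makes no difference).

```python
import re


def select_diagrams(diagram_ids, match_rx):
    """Keep the diagram IDs matching match_rx (assumed: re.search semantics)."""
    return [did for did in diagram_ids if re.search(match_rx, did)]


def output_name(file_base, diagram_id):
    """Build the output file name as BASE + ID + '.puml'."""
    return file_base + diagram_id + '.puml'


ids = ['a2', 'a20', 'a200', 'c20']
print(select_diagrams(ids, '^a20$'))
print([output_name('file-base-', did) for did in select_diagrams(ids, '^a')])
```

Anchoring the pattern with ^ and $ avoids selecting a200 together with a20.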
These are the UML diagrams and type codes supported by PlantUML ..
Marker | Diagram
---|---
#c[0-9]+ | class diagram
#o[0-9]+ | object diagram
#p[0-9]+ | component diagram
#d[0-9]+ | deployment diagram
#u[0-9]+ | use case diagram
#a[0-9]+ | activity diagram
#s[0-9]+ | state machine diagram
#m[0-9]+ | sequence diagram
#t[0-9]+ | timing diagram

Type | Type
---|---
c | u
o | a
p | s
d | m
t |
This is the inheritance hierarchy of UML diagrams supported by PlantUML (clickable in HTML):
>>> for ex in __all__: printf(sformat('from {0} import {1}', __name__, ex))
from line_diversion import check_line_parse
from line_diversion import LineParser
from line_diversion import DiagramTiming
from line_diversion import DiagramMessageSequence
from line_diversion import DiagramInteraction
from line_diversion import DiagramStateMachine
from line_diversion import DiagramActivity
from line_diversion import DiagramUseCase
from line_diversion import DiagramBehavior
from line_diversion import DiagramDeployment
from line_diversion import DiagramComponent
from line_diversion import DiagramObject
from line_diversion import DiagramClass
from line_diversion import DiagramStructure
from line_diversion import Diagram
from line_diversion import LineDiversion
>>> if '__all_internal__' in globals():
... for ex in __all_internal__:
... printf(sformat('from {0} import {1}', __name__, ex))
See check_line_parse() for a comprehensive list of line matchers.
>>> printf(PREFIX_LP)
{
"rx": [
"^(\\s*)(?://+|/\\*+|;+|@:u?[bl]?comm_?@|--|#+|@[bl]?comm_?@|[.][.])([copduasmt][0-9]+)(?:\\s|$)",
"0"
],
"groups": {
"whitespace": 1,
"id": 2,
"text": null,
"keyword": null,
"condition": null
}
}
>>> mo, _id = PREFIX_LP.match('#''a3 if (test) then (yes)')
>>> printf(mo.groups())
('', 'a3')
>>> printf(COND_LP)
{
"rx": [
"^(\\s*)(break|class|def|done|elif|else|elseif|elsif|fi|for|if|while|\\})(?::?|\\s)\\s*(.*)\\s*(?://+|/\\*+|;+|@:u?[bl]?comm_?@|--|#+|@[bl]?comm_?@|[.][.])([copduasmt][0-9]+)(?:\\s(.*)|\\s*)$",
"0"
],
"groups": {
"whitespace": 1,
"id": 4,
"text": 5,
"keyword": 2,
"condition": 3
}
}
>>> mo, _id = COND_LP.match(' if remove_count: #''a1 then (yeah)')
>>> printf(mo.groups())
(' ', 'if', 'remove_count: ', 'a1', 'then (yeah)')
>>> mo, _id = COND_LP.match('} #''a1')
>>> printf(mo.groups())
('', '}', '', 'a1', None)
>>> printf(ACT_LP)
{
"rx": [
"^(\\s*)(?:(?://+|/\\*+|;+|@:u?[bl]?comm_?@|--|#+|@[bl]?comm_?@|[.][.]) )?(.*)\\s*(?://+|/\\*+|;+|@:u?[bl]?comm_?@|--|#+|@[bl]?comm_?@|[.][.])([copduasmt][0-9]+) (:[];|<>/}]?|[];|<>/}.-])\\s*(?:(#[0-9A-Za-z]+|backwards)(?:\\s+|$))?(?:\\s(.*)|\\s*)$",
"0"
],
"groups": {
"whitespace": 1,
"id": 3,
"text": 6,
"keyword": 4,
"condition": 2,
"cond_pfx": 5
}
}
>>> mo, _id = ACT_LP.match('# * call `process_parts #''a3 :| #red')
>>> printf(mo.groups())
('', '* call `process_parts ', 'a3', ':|', '#red', None)
>>> mo, _id = ACT_LP.match(' something = more #''a1 :')
>>> printf(mo.groups())
(' ', 'something = more ', 'a1', ':', None, None)
>>> mo, _id = ACT_LP.match(' something = more #''a1 ::')
>>> if mo: printf(mo.groups())
>>> mo, _id = ACT_LP.match(' something = more #''a1 :-')
>>> if mo: printf(mo.groups())
>>> mo, _id = ACT_LP.match(' something = more #''a1 :;')
>>> printf(mo.groups())
(' ', 'something = more ', 'a1', ':;', None, None)
>>> mo, _id = ACT_LP.match('rm -f "${top_dir}/.hgignore.new" #''a1 :')
>>> printf(mo.groups())
('', 'rm -f "${top_dir}/.hgignore.new" ', 'a1', ':', None, None)
>>> printf(ACT_RX.pattern)
^(\s*)(?:(?://+|/\*+|;+|@:u?[bl]?comm_?@|--|#+|@[bl]?comm_?@|[.][.]) )?(.*)\s*(?://+|/\*+|;+|@:u?[bl]?comm_?@|--|#+|@[bl]?comm_?@|[.][.])([copduasmt][0-9]+) (:[];|<>/}]?|[];|<>/}.-])\s*(?:(#[0-9A-Za-z]+|backwards)(?:\s+|$))?(?:\s(.*)|\s*)$
>>> printf(ACT_LP.groups)
OrderedDict([('whitespace', 1), ('id', 3), ('text', 6), ('keyword', 4), ('condition', 2), ('cond_pfx', 5)])
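The groups mapping assigns names to the positional groups of the pattern. The sketch below rebuilds the pattern printed by ACT_RX.pattern above with plain re and applies the mapping shown for ACT_LP.groups; the helper named_groups() is hypothetical and only illustrates how the index mapping can be read.

```python
import re

# Comment-start alternatives as they appear inside ACT_RX.pattern above.
C = r'(?://+|/\*+|;+|@:u?[bl]?comm_?@|--|#+|@[bl]?comm_?@|[.][.])'
# Pattern assembled to equal the ACT_RX.pattern output above.
ACT_PATTERN = (r'^(\s*)(?:' + C + r' )?(.*)\s*' + C + r'([copduasmt][0-9]+) '
               r'(:[];|<>/}]?|[];|<>/}.-])\s*'
               r'(?:(#[0-9A-Za-z]+|backwards)(?:\s+|$))?(?:\s(.*)|\s*)$')
# Name -> group index, as printed for ACT_LP.groups above.
ACT_GROUPS = {'whitespace': 1, 'id': 3, 'text': 6,
              'keyword': 4, 'condition': 2, 'cond_pfx': 5}


def named_groups(line):
    """Return a dict of named groups for line, or None if it does not match."""
    mo = re.match(ACT_PATTERN, line)
    if mo is None:
        return None
    return {name: mo.group(index) for name, index in ACT_GROUPS.items()}


print(named_groups('b = a #a20 :;'))
print(named_groups('# * call `process_parts #a3 :| #red'))
print(named_groups('x #a1 ::'))
```

Note that, with this mapping, the annotated text before the marker is found under condition and the type parameter (e.g. :;) under keyword.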
line_diversion.check_line_parse(*exprs)
Prints, for each expression, the match result of every line parser:
>>> check_line_parse('# s/^+//''p') #doctest: +ELLIPSIS
# --------------------------------------------------
# ||:exp:|| # s/^+//p
# --------------------------------------------------
# :DBG: ACT_LP : ]--[ ]()[
# :DBG: ATTRIB_LP : ]--[ ]()[
# :DBG: COND_LP : ]--[ ]()[
# :DBG: PREFIX_LP : ]--[ ]()[
# :DBG: PRINTE_SFORMAT_LP: ]--[ ]()[
# :DBG: PYJSMO_ATTRIB_LP : ]--[ ]()[
# :DBG: SUFFIX_LP : ]--[ ]()[
>>> check_line_parse('# start #''a99') #doctest: +ELLIPSIS
# --------------------------------------------------
# ||:exp:|| # start #a...99
# --------------------------------------------------
# :DBG: ACT_LP : ]--[ ]()[
# :DBG: ATTRIB_LP : ]--[ ]()[
# :DBG: COND_LP : ]--[ ]()[
# :DBG: PREFIX_LP : ]--[ ]()[
# :DBG: PRINTE_SFORMAT_LP: ]--[ ]()[
# :DBG: PYJSMO_ATTRIB_LP : ]--[ ]()[
# :DBG: SUFFIX_LP : ]a99[ ]('', 'start ', 'a99')[
>>> check_line_parse('.. start ..''a99') #doctest: +ELLIPSIS
# --------------------------------------------------
# ||:exp:|| .. start ..a...99
# --------------------------------------------------
# :DBG: ACT_LP : ]--[ ]()[
# :DBG: ATTRIB_LP : ]--[ ]()[
# :DBG: COND_LP : ]--[ ]()[
# :DBG: PREFIX_LP : ]--[ ]()[
# :DBG: PRINTE_SFORMAT_LP: ]--[ ]()[
# :DBG: PYJSMO_ATTRIB_LP : ]--[ ]()[
# :DBG: SUFFIX_LP : ]a99[ ]('', 'start ', 'a99')[
>>> check_line_parse('start #''a99') #doctest: +ELLIPSIS
# --------------------------------------------------
# ||:exp:|| start #a...99
# --------------------------------------------------
# :DBG: ACT_LP : ]--[ ]()[
# :DBG: ATTRIB_LP : ]--[ ]()[
# :DBG: COND_LP : ]--[ ]()[
# :DBG: PREFIX_LP : ]--[ ]()[
# :DBG: PRINTE_SFORMAT_LP: ]--[ ]()[
# :DBG: PYJSMO_ATTRIB_LP : ]--[ ]()[
# :DBG: SUFFIX_LP : ]--[ ]()[
line_diversion.LineParser(*args, **kwargs)
>>> lp = LineParser()
>>> printf(lp)
{
"rx": [
null,
"0"
],
"groups": {
"whitespace": null,
"id": null,
"text": null,
"keyword": null,
"condition": null
}
}
>>> printf(lp._pyjsmo_x_rx)
None
>>> lp.rx = re.compile('some', re.I | re.M | re.U)
>>> printf(lp)
{
"rx": [
"some",
"re.IGNORECASE|re.MULTILINE|re.UNICODE"
],
"groups": {
"whitespace": null,
"id": null,
"text": null,
"keyword": null,
"condition": null
}
}
>>> printf(trans_rx_repr(lp._pyjsmo_x_rx)) #doctest: +ELLIPSIS
re.compile('some', re.IGNORECASE|re.MULTILINE|re.UNICODE)
>>> lp.rx = ("some where", "re.IGNORECASE|re.UNICODE")
>>> printf(lp)
{
"rx": [
"some where",
"re.IGNORECASE|re.UNICODE"
],
"groups": {
"whitespace": null,
"id": null,
"text": null,
"keyword": null,
"condition": null
}
}
>>> printf(trans_rx_repr(lp._pyjsmo_x_rx)) #doctest: +ELLIPSIS
re.compile('some where', re.IGNORECASE|re.UNICODE)
rx
Programmatic property.
line_diversion.DiagramMessageSequence(*args, **kwargs)
LineDiversion for Message Sequence Diagram.
line_diversion.DiagramInteraction(*args, **kwargs)
LineDiversion for Interaction Diagram.
line_diversion.DiagramStateMachine(*args, **kwargs)
LineDiversion for State Machine Diagram.
line_diversion.DiagramDeployment(*args, **kwargs)
LineDiversion for Deployment Diagram.
line_diversion.DiagramComponent(*args, **kwargs)
LineDiversion for Component Diagram.
line_diversion.DiagramStructure(*args, **kwargs)
LineDiversion for Structure Diagram.
line_diversion.Diagram(*args, **kwargs)
process_parts(part_map)
Returns: new/altered part_map.
>>> check_def = 'SomeClass ( with,multiple, inheritance )'
>>> mo = re.match(CLASS_DEF_RX, check_def)
>>> if mo: printf(sformat("{0}//{1}//", mo.groups(), check_def[mo.end(0):]))
('SomeClass', 'with,multiple, inheritance ')////
>>> class Check():
... pass
>>> check_def = 'SomeClass ()'
>>> mo = re.match(CLASS_DEF_RX, check_def)
>>> if mo: printf(sformat("{0}//{1}//", mo.groups(), check_def[mo.end(0):]))
('SomeClass', '')////
>>> check_def = 'SomeClass'
>>> mo = re.match(CLASS_DEF_RX, check_def)
>>> if mo: printf(sformat("{0}//{1}//", mo.groups(), check_def[mo.end(0):]))
('SomeClass', None)////
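The definition of CLASS_DEF_RX is not shown here. The following sketch uses a hypothetical stand-in pattern, CLASS_DEF_SKETCH_RX, that reproduces the three results above, and then splits the base class list while dropping the bases that --ignore-bases ignores by default (object, PyJsMo). The helper class_bases() is an assumption for illustration and not part of the module.

```python
import re

# Hypothetical stand-in for CLASS_DEF_RX; the real pattern may differ.
CLASS_DEF_SKETCH_RX = r'(\w+)\s*(?:\(\s*([^)]*)\))?'
# Defaults named in the --ignore-bases help text.
IGNORED_BASES = ('object', 'PyJsMo')


def class_bases(class_def, ignored=IGNORED_BASES):
    """Return (class name, list of base classes that are not ignored)."""
    mo = re.match(CLASS_DEF_SKETCH_RX, class_def)
    if mo is None:
        return None
    name, base_list = mo.groups()
    bases = [base.strip() for base in (base_list or '').split(',')]
    return name, [base for base in bases if base and base not in ignored]


print(class_bases('SomeClass ( with,multiple, inheritance )'))
print(class_bases('SomeClass (object)'))
print(class_bases('SomeClass'))
```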
[1] Normally, PlantUML code is placed before the source code of a script, class, function etc. or in a separate text file. In this case it is necessary to have two windows, one for documenting the source code as PlantUML and the other for coding. Out of laziness, this can lead to big differences between the source code and its documentation. UML annotations help to prevent this from happening.