24. rst-mode etags Support

24.1. Resources

  • README-rst-mode-etags-support.txt
  • bin/diffmap.py
  • bin/rst_tags.py
  • _static/rst-TAGS linked to TAGS

listing 24.1 workspace register setup
(workspace-set-locations '(
 ("/home/sw/project/documentation/README-rst-mode-etags-support.txt" 5122 "README-rst-mode-etags-support.txt" 49 nil)
 ("/home/ws/project/documentation/bin/rst_tags.py" 15145 "rst_tags.py" 50 nil)
 ("/home/ws/project/documentation/bin/diffmap.py" 3482 "diffmap.py" 51 nil)
 ("/home/sw/project/documentation/_static/rst-TAGS" 1 "rst-TAGS" 55 nil)
) nil t t)

After evaluating the expression in listing 24.1 (C-x C-e at end of expression, or copy expression, M-:, yank expression, RET), the workspace registers are assigned as follows:

+-------+-------+-------+
|       |       |   9   |
|       |   8   |       |
|   7 _____________________ • /home/sw/project/documentation/_static/rst-TAGS
+-------+-------+-------+
|       |       |   6   |
|       |   5   |       |
|   4   |       |       |
+-------+-------+-------+
|       |       |   3 _____ • /home/ws/project/documentation/bin/diffmap.py
|       |   2 _____________ • /home/ws/project/documentation/bin/rst_tags.py
|   1 _____________________ • /home/sw/project/documentation/README-rst-mode-etags-support.txt
+-------+-------+-------+
|       |
|   0   |
|       |
+-------+

24.2. etags Interface

A special include entry is useful to combine several files:

^L
documentation/TAGS,include
^L
administration/TAGS,include
^L
ws-admin/TAGS,include
^L
ws-admin/tagideasy/TAGS,include
^L
ws-bme/TAGS,include

This file should be installed in directory "$HOME/project" as TAGS with:

test -n "$( dpkg -l coreutils 2>/dev/null | grep ^ii )" || sudo apt-get install coreutils
printf 'DApkb2N1bWVudGF0aW9uL1RBR1MsaW5jbHVkZQoMCmFkbWluaXN0cmF0aW9uL1RBR1MsaW5jbHVkZQoMCndzLWFkbWluL1RBR1MsaW5jbHVkZQoMCndzLWFkbWluL3RhZ2lkZWFzeS9UQUdTLGluY2x1ZGUKDAp3cy1ibWUvVEFHUyxpbmNsdWRlCg==' | base64 -d >TAGS
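
The include file can also be generated programmatically. The following is a minimal Python sketch and not part of the project; the function name build_include_tags is hypothetical. It emits one include entry per TAGS file, each introduced by a form feed on a line of its own:

```python
def build_include_tags(tags_files):
    """Return the text of a TAGS file that only includes other TAGS files.

    Each entry is a form feed (octal 014) on its own line, followed by
    a line of the form "<path>,include".
    """
    return "".join("\x0c\n{0},include\n".format(path) for path in tags_files)


# Hypothetical usage with the sub-project TAGS files listed above.
INCLUDES = [
    "documentation/TAGS",
    "administration/TAGS",
    "ws-admin/TAGS",
    "ws-admin/tagideasy/TAGS",
    "ws-bme/TAGS",
]
print(repr(build_include_tags(INCLUDES)))
```

Writing the returned text to "$HOME/project/TAGS" should give the same result as the printf/base64 command, assuming the list of sub-projects is unchanged.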

M-x t-r-t-t expands to tags-reset-tags-tables(), which allows changing the active TAGS file.

The Emacs rst tags support is supplied by find-tag-default-rst(), which is already integrated in package emacs-init.

Etags generation is provided by rst_tags.py, which is also integrated in the snippets framework:

sda make tags

For this to work, a project update (sda update) might be needed, or the tags rule can be extracted from the snippet with:

snc -m mak '^mak_sphinx.makefile-footer$' | sed '/^tags:/p;1,/tags:/d' | sed '/^ *$/,$d'

which at the time of this rendering was:


24.2.1. TAGS File Structure

A TAGS file has one or more file entries, each describing the tags in one file; see e.g. "Understanding the ‘ctags -e’ file format (ctags for emacs)" on Stack Overflow.

An etags file entry consists of:

<\014> file name "," size of following tag entries

An etags tag entry has the following structure:

line text <\177> tag <\001> line no (1-based) "," char position (0-based)
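
As an illustration of the two structures, the following Python sketch formats a tag entry and a file entry. It is not taken from rst_tags.py; the function names are hypothetical, and it assumes that the size field counts the bytes of the following tag entries including their newlines, which agrees with the value 57 in the first entry of the sample TAGS file in section 24.2.2.

```python
def format_tag_entry(line_text, tag, line_no, char_pos):
    """Format one tag entry: line text, \\177, tag, \\001, line no, ",", char position."""
    return "{0}\x7f{1}\x01{2},{3}\n".format(line_text, tag, line_no, char_pos)


def format_file_entry(file_name, tag_entries):
    """Format a file entry: \\014 on its own line, then file name, ",", size, then the tags."""
    body = "".join(tag_entries)
    # Assumption: the size is the byte length of all tag entries, newlines included.
    size = len(body.encode("utf-8"))
    return "\x0c\n{0},{1}\n{2}".format(file_name, size, body)


entry = format_tag_entry(
    ".. _`sec:Sample TAGS File`:", "sec:Sample TAGS File", 21, 6337)
print(repr(format_file_entry("README-rst-mode-etags-support.txt", [entry])))
```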

24.2.2. Sample TAGS File

The TAGS file as seen in Emacs:

^L
README-rst-mode-etags-support.txt,57
.. _`sec:Sample TAGS File`:^?sec:Sample TAGS File^A21,6337
^L
../administration/README-redistribution-of-info-items.txt,30
.. _`sec:Info tags`^?sec:Info tags^A370,11867

The TAGS file with special characters represented as octal escape sequences in brackets:

<\014>
README-rst-mode-etags-support.txt,57
.. _`sec:Sample TAGS File`:<\177>sec:Sample TAGS File<\001>21,6337
<\014>
../administration/README-redistribution-of-info-items.txt,30
.. _`sec:Info tags`<\177>sec:Info tags<\001>370,11867
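
Reading such a file back is straightforward. The sketch below is a hypothetical helper, not part of the project: it splits the text at the form feed lines and parses only the explicit tag form shown in this document (line text, \177, tag, \001, line number, character position). It reports the size field as text and does not rely on it to delimit entries.

```python
import re

# One tag entry: line text, \177, tag, \001, line number, ",", char position.
TAG_RE = re.compile(r"^(.*)\x7f(.*)\x01(\d+),(\d+)$")


def parse_tags(text):
    """Return a list of (file_name, size_field, tags) tuples.

    tags is a list of (line_text, tag, line_no, char_pos) tuples.
    For include entries, size_field is the string "include".
    """
    result = []
    for section in text.split("\x0c\n")[1:]:
        header, _, body = section.partition("\n")
        file_name, _, size_field = header.rpartition(",")
        tags = []
        for line in body.split("\n"):
            match = TAG_RE.match(line)
            if match:
                tags.append((match.group(1), match.group(2),
                             int(match.group(3)), int(match.group(4))))
        result.append((file_name, size_field, tags))
    return result
```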

24.3. diffmap

diffmap.py - generate line mapping from target to source from diff(1) output

usage: diffmap.py [OPTIONS] source-or-diff-output [target]
or import diffmap

24.3.1. Options

-q, --quiet    suppress warnings
-v, --verbose  verbose test output
-d, --debug    show debug information
-h, --help     display this help message
-t, --test     run doc tests

24.3.2. Description

The diff(1) output for a source file and its transformed (target) file specifies exactly which line numbers of the two files correspond to each other.

diff -u /home/ws/project/administration/README-redistribution-of-info-items.txt /home/ws/project/administration/doc/redistribution-of-info-items.rst.auto
@@ -1,7 +1,18 @@

The hunk header specifies that

  • the block of lines in source document consisting of 7 lines starting at line 1, matches with
  • the block of lines in target document consisting of 18 lines starting at line 1

See "text processing - How shall I understand the unified format of diff output?" on Unix & Linux Stack Exchange.
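
A hunk header can be taken apart with a regular expression. The following Python sketch is illustrative only and not the code of diffmap.py; the names HUNK_RE and parse_hunk_header are hypothetical. In the unified format a missing line count stands for a count of 1.

```python
import re

# "@@ -s_from[,s_count] +t_from[,t_count] @@"
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return (s_from, s_count, t_from, t_count), or None if line is no hunk header."""
    match = HUNK_RE.match(line)
    if match is None:
        return None
    s_from, s_count, t_from, t_count = match.groups()
    return (
        int(s_from),
        int(s_count) if s_count is not None else 1,  # omitted count means 1
        int(t_from),
        int(t_count) if t_count is not None else 1,
    )


print(parse_hunk_header("@@ -1,7 +1,18 @@"))
```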

The correlation can then be determined from the hunk information alone, without parsing the input files:

@startditaa -E
skinparam defaultFontName Verdana

+--------+--------+
| s_from | t_from |
+--------+--------+
| 1      | 1 -> 1 |
+--------+--------+
| 2      | 2 -> 2 |
+--------+--------+
| 3      | 3 -> 3 |
+--------+--------+
| 4      | 4 -> 4 |
+--------+--------+

----

+--------+--------+
| s_from | t_from |
+--------+--------+
| 1      | 1 -> 1 |
+--------+--------+
| 2      | 2 -> 2 |
+--------+--------+
         | 3 -> 2 |
         +--------+
         | 4 -> 2 |
         +--------+

----

+--------+--------+
| s_from | t_from |
+--------+--------+
| 1      | 1 -> 1 |
+--------+--------+
| 2      | 2 -> 2 |
+--------+--------+
| 3      |
+--------+
| 4      |
+--------+

@endditaa

figure 24.1 Correlate Line Numbers from Diff Output

Note

The mapping differs depending on whether a side of the hunk contains lines or is empty (line count 0)!

>>> text = '\n'.join(('line ' + str(_i) for _i in xrange(15)))
>>> def write_file(file, line_nos):
...     with open(file, 'w') as _fh:
...         lines = list(text.splitlines())
...         for _i in line_nos:
...             printf(lines[_i], file=_fh)
>>> def dump_file(file):
...     printf('\n'.join([sformat("{0:02d} | {1}", _i + 1, _l)
...                       for _i, _l in
...                       enumerate(open(file, 'r').read().splitlines())]))

24.3.2.1. Line mapping, when source file hunk is completely removed

>>> write_file("t1", [1, 2, 3, 7, 8])
>>> write_file("t2", [1, 2, 3, 4, 5, 6, 7, 8])
01 | line 1   <-----   01 | line 1
02 | line 2   <-----   02 | line 2
03 | line 3   <-----   03 | line 3
               \----   04 | line 4   \  These are between line 03
                \---   05 | line 5    > and line 04 of source file
                 \--   06 | line 6   /
04 | line 7   <-----   07 | line 7
05 | line 8   <-----   08 | line 8
--- t1...
+++ t2...
@@ -3,0 +4,3 @@
+line 4
+line 5
+line 6

Lines 00 - 06 of target file map to lines 00 - 03 of source file. Therefore, lines 04 - 06 all map to line 03. Lines starting at 07 are mapped with an offset of -3 (07 -> 04, 08 -> 05).

>>> dm = DiffMap("t1", "t2")
>>> print(dm)
#    :INF:    s_line         : ]4[
#    :INF:    t_line         : ]7[
#    :INF:    s_line - t_line: ]-3[
#    0 -    6 (   3) =>    0 -    3

24.3.2.2. Line mapping, when target file hunk is completely removed

>>> write_file("t1", [1, 2, 3, 4, 5, 6, 7, 8])
>>> write_file("t2", [1, 2, 3, 7, 8])
01 | line 1   <-----   01 | line 1
02 | line 2   <-----   02 | line 2
03 | line 3   <-----   03 | line 3
04 | line 4   <---/                  \  These are between line 03
05 | line 5   <--/                    > and line 04 of target file
06 | line 6   <-/                    /
07 | line 7   <-----   04 | line 7
08 | line 8   <-----   05 | line 8
--- t1...
+++ t2...
@@ -4,3 +3,0 @@
-line 4
-line 5
-line 6

Lines 00 - 03 of target file map to lines 00 - 03 of source file. Lines starting at 04 are mapped with an offset of +3 (04 -> 07, 05 -> 08).

>>> dm = DiffMap("t1", "t2")
>>> print(dm)
#    :INF:    s_line         : ]7[
#    :INF:    t_line         : ]4[
#    :INF:    s_line - t_line: ]3[
#    0 -    3 (   3) =>    0 -    3

24.3.2.3. Line mapping, when source and target file hunk overlap

>>> write_file("t1", [1, 2, 3, 4, 5, 6, 7, 8])
>>> write_file("t2", [1, 2, 3, 3, 3, 7, 8])
01 | line 1   <-----   01 | line 1
02 | line 2   <-----   02 | line 2
03 | line 3   <-----   03 | line 3
04 | line 4   <-----   04 | line 3   \  These lines overlap between
05 | line 5   <-----   05 | line 3    > source and target file
06 | line 6   <-/                    /
07 | line 7   <-----   06 | line 7
08 | line 8   <-----   07 | line 8
--- t1...
+++ t2...
@@ -4,3 +4,2 @@
-line 4
-line 5
-line 6
+line 3
+line 3

Lines 00 - 05 of target file map to lines 00 - 05 of source file. Lines starting at 06 are mapped with an offset of +1 (06 -> 07, 07 -> 08).

>>> dm = DiffMap("t1", "t2")
>>> print(dm)
#    :INF:    s_line         : ]7[
#    :INF:    t_line         : ]6[
#    :INF:    s_line - t_line: ]1[
#    0 -    5 (   5) =>    0 -    5
>>> os.unlink('t1')
>>> os.unlink('t2')

24.3.3. Module

24.3.4. Automatic Exports

>>> for ex in __all__: printf(sformat('from {0} import {1}', __name__, ex))
from diffmap import DiffMap

24.3.5. Explicit Exports

>>> if '__all_internal__' in globals():
...   for ex in __all_internal__:
...     printf(sformat('from {0} import {1}', __name__, ex))

24.3.6. Details

class diffmap.DiffMap(source_or_diff_output=None, target=None)

@startuml
class "DiffMap" {
  +s_line : integer
  +t_line : integer
  +t_to_s_map : dict
  ====
  +__str__()
  +ranges()
  +parse_diff(diff_output)
  ----
  +parse(source_file, target_file)
  +lookup(lookup_t_line) : mapped_s_line
}
@enduml

figure 24.2 Correlate Line Numbers from Diff Output

>>> _debug = sys.modules["__main__"]._debug
>>> text = '\n'.join(('line ' + str(_i) for _i in xrange(15)))
>>> def write_file(file, line_nos):
...     with open(file, 'w') as _fh:
...         lines = list(text.splitlines())
...         for _i in line_nos:
...             printf(lines[_i], file=_fh)
>>> def dump_file(file):
...     printf('\n'.join([sformat("{0:02d} | {1}", _i + 1, _l)
...                       for _i, _l in
...                       enumerate(open(file, 'r').read().splitlines())]))
>>> write_file('/tmp/diffmap-source.txt', list(xrange(15))[1:])
>>> write_file('/tmp/diffmap-target.txt', [1, 2, 3, 4, 4, 5, 5, 6, 7, 11, 12, 13])
>>> dump_file('/tmp/diffmap-source.txt') #doctest: +ELLIPSIS
01 | line 1
02 | line 2
03 | line 3
04 | line 4
05 | line 5
06 | line 6
07 | line 7
08 | line 8
09 | line 9
10 | line 10
11 | line 11
12 | line 12
13 | line 13
14 | line 14
>>> dump_file('/tmp/diffmap-target.txt') #doctest: +ELLIPSIS
01 | line 1
02 | line 2
03 | line 3
04 | line 4
05 | line 4
06 | line 5
07 | line 5
08 | line 6
09 | line 7
10 | line 11
11 | line 12
12 | line 13
>>> printf(os.popen("diff --unified=0 '/tmp/diffmap-source.txt' '/tmp/diffmap-target.txt'", 'r').read(), end='') #doctest: +ELLIPSIS
--- /tmp/diffmap-source.txt...
+++ /tmp/diffmap-target.txt...
@@ -4,0 +5,2 @@
+line 4
+line 5
@@ -8,3 +9,0 @@
-line 8
-line 9
-line 10
@@ -14 +12,0 @@
-line 14
>>> dm = DiffMap('/tmp/diffmap-source.txt', '/tmp/diffmap-target.txt')
>>> printf(dm)
#    :INF:    s_line         : ]15[
#    :INF:    t_line         : ]13[
#    :INF:    s_line - t_line: ]2[
#    0 -    6 (   4) =>    0 -    4
#    7 -    9 (   9) =>    5 -    7
#   10 -   12 (  12) =>   11 -   13
>>> for _item in ditems(dm):
...     printf(_item)
(0, 0)
(1, 1)
(2, 2)
(3, 3)
(4, 4)
(5, 4)
(6, 4)
(7, 5)
(8, 6)
(9, 7)
(10, 11)
(11, 12)
(12, 13)
>>> os.unlink('/tmp/diffmap-source.txt')
>>> os.unlink('/tmp/diffmap-target.txt')

clear()

Clear data.

lookup(t_line)

Lookup target line.

Returns: mapped source line for target line.

parse(source_file, target_file)

Diff source and target file, parse diff output.

parse_diff(diff_output)

Parse hunks in diff output.

listing 24.2 check for diff edge case empty file
printf "%s\n%s\n%s\n" 'hallo' 'hallo' 'hallo' | diff -u /dev/null -

listing 24.3 check for diff edge case empty file, diff output
--- /dev/null    2022-10-29 11:10:30.081667219 +0200
+++ -    2023-03-08 17:53:12.495842805 +0100
@@ -0,0 +1,3 @@
+hallo
+hallo
+hallo

@startuml

start

:t_to_s_map = dict()
s_line = 0
t_line = 0;

' |:here:|

while (for each hunk) is (do)
  :_s_frm = number
  _s_cnt = default: 1
  _t_frm = number
  _t_cnt = default: 1
  _s_ofs = (_s_cnt == 0) + 0
  _t_ofs = (_t_cnt == 0) + 0;

  while (s_line < _s_frm?) is (do)
    if  (t_line < _t_frm + _t_ofs?) then (yes)
      :t_to_s_map[t_line] = s_line
      t_line += 1;
    endif
    :s_line += 1;
  endwhile

  if (_s_ofs?) then (yes)
    if (t_line < _t_frm + _t_ofs?) then (yes)
      :t_to_s_map[t_line] = s_line
      t_line += 1;
    endif
  endif

  while (t_line < _t_frm + _t_ofs?) is (do)
    :t_to_s_map[t_line] = s_line
    t_line += 1;
  endwhile

  while (for _indx in range(min(_s_cnt, _t_cnt))) is (do)
    :t_to_s_map[t_line] = s_line
    s_line += 1
    t_line += 1;
  endwhile

  while (_t_cnt < _s_cnt?) is (do)
    :s_line += 1
    _t_cnt += 1;
  endwhile

  while (_s_cnt < _t_cnt ?) is (do)
    :t_to_s_map[t_line] = s_line
    t_line += 1
    _s_cnt += 1;
  endwhile

  if (_s_ofs?) then (yes)
    :s_line += 1;
  endif

endwhile

://mapped_s_line// for
  //lookup_t_line >= _t_line_sup//
is
  lookup_t_line + _s_line_sup - _t_line_sup;

' |:here:|

stop

@enduml

figure 24.3 Correlate Line Numbers from Diff Output
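
The activity diagram can be transcribed into Python almost step by step. The following is a sketch under that assumption and not the actual source of diffmap.py; the function names map_target_to_source and lookup_source_line are hypothetical, and hunks are passed as already parsed (s_frm, s_cnt, t_frm, t_cnt) tuples.

```python
def map_target_to_source(hunks):
    """Map target line numbers to source line numbers.

    hunks is an iterable of (s_frm, s_cnt, t_frm, t_cnt) tuples from
    unified diff hunk headers.  Returns (t_to_s_map, s_line, t_line),
    where s_line and t_line are the line counters after the last hunk.
    """
    t_to_s_map = {}
    s_line = 0
    t_line = 0
    for s_frm, s_cnt, t_frm, t_cnt in hunks:
        s_ofs = int(s_cnt == 0)  # 1 if the source side of the hunk is empty
        t_ofs = int(t_cnt == 0)  # 1 if the target side of the hunk is empty
        t_stop = t_frm + t_ofs

        # Lines in front of the hunk are unchanged: map them one to one.
        while s_line < s_frm:
            if t_line < t_stop:
                t_to_s_map[t_line] = s_line
                t_line += 1
            s_line += 1

        # Empty source side: s_frm is the line before the insertion,
        # which still has a one-to-one partner in the target file.
        if s_ofs and t_line < t_stop:
            t_to_s_map[t_line] = s_line
            t_line += 1

        # Remaining target lines in front of the hunk share the source line.
        while t_line < t_stop:
            t_to_s_map[t_line] = s_line
            t_line += 1

        # Overlapping part of the hunk: map line by line.
        for _ in range(min(s_cnt, t_cnt)):
            t_to_s_map[t_line] = s_line
            s_line += 1
            t_line += 1

        # Surplus source lines have no target line: skip them.
        while t_cnt < s_cnt:
            s_line += 1
            t_cnt += 1

        # Surplus target lines all map to the current source line.
        while s_cnt < t_cnt:
            t_to_s_map[t_line] = s_line
            t_line += 1
            s_cnt += 1

        if s_ofs:
            s_line += 1
    return t_to_s_map, s_line, t_line


def lookup_source_line(t_to_s_map, s_line, t_line, lookup_t_line):
    """Return the source line for a target line, using the offset past the last hunk."""
    if lookup_t_line >= t_line:
        return lookup_t_line + s_line - t_line
    return t_to_s_map[lookup_t_line]
```

With the hunks of the doctest above, (4, 0, 5, 2), (8, 3, 9, 0) and (14, 1, 12, 0), this sketch ends with s_line 15 and t_line 13 and produces the same (target, source) pairs as listed for ditems(dm).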

ranges()

Returns: list of mapped ranges