{rslt} ← {larg} (fun ##.ticks) rarg         ⍝ Sample Dfn execution clock ticks.

During  code execution, ⎕AI's "compute time" clock ticks every so-many millisec-
onds. Operator [ticks] records which line in its operand [fun]ction is executing
as each tick occurs and reports the percentage of ticks that fall on each line.

In order to obtain a significant tick-sample, the function is applied repeatedly
to (or between) its argument(s) until the clock has ticked 1,000 times. Repeated
application means that any side effects (⎕NAPPEND, etc.) will also be perpetrated
repeatedly, which may be undesirable.
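
For example, a function with a side effect such as the following (an illustrat-
ive sketch) writes its message once per application, and so many times in the
course of a single sampling run:

    noisy←{
        ⎕←'sampling ...'    ⍝ side effect: repeated on each application.
        +/⍳⍵
    }
    noisy ticks 1000        ⍝ the message appears once per repetition.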

At  the  end of the time period, the function source is displayed in the session
together with, for each line:

- The number of times the line was "visited".
- The percentage of clock ticks recorded for the line.

    ┌────────────── visits
    │  ┌─────────── percent
    │  │  ┌──────── source
    │  │  │
    5 14  sieve←{             ⍝ Sieve of Eratosthenes.
    5 15      ⍺←⍬             ⍝ Default no primes yet.
    5 14      nxt←1↑⍵         ⍝ Next prime, and
    5 14      msk←×nxt|⍵      ⍝ ... mask of non-multiples.
    5 17      ∧/1↓msk:⍺,⍵     ⍝ All non multiples - finished.
    4 26      (⍺,nxt)∇ msk/⍵  ⍝ Sieve remainder.
    0  0  }

Where:

- Visits:  This is (intended to be) an accurate count of the number of visits to
  each  line. A line containing a local function is deemed to be visited both at
  definition  and application time. This behaviour differs from that of operator
  →mdf←,  which  reports  only  the  definition count in such cases. See Bugs[6]
  below.

- Percent:  Approximate  indication of the proportion of clock ticks occurring in
  each  line.  Owing to timing granularity, the column total may not be 100; see
  below.

- Source:  As  much of the subject function's source code as will fit across the
  screen.

Unlike  ⎕MONITOR  used with traditional functions, the time attributed to a line
_excludes_ that spent in called subfunctions. This is a _good_thing_.
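
For example, in a sketch such as the following, ticks that occur while [sum] is
running are credited to the line on which its body appears, rather than to the
line that calls it:

    outer←{
        sum←{+/+\⍳⍵}        ⍝ ticks accrue here, on the subfunction's line,
        sum ⍵               ⍝ ... not on this calling line.
    }
    outer ticks 100000      ⍝ most ticks reported against the sum line.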

⎕AI's  clock resolution varies: on some systems, the clock ticks every 10 milli-
seconds  and on others, every 16 (or so). In any case, many lines of code may be
executed  between  ticks. The best we can do is to run the function for a period
of time and notice on which lines the ticks occur.

The  following  diagram illustrates the problem: each ┼────┼ represents the real
and varying execution time of successive lines of code, and ↑tick shows the reg-
ular clock ticks occurring.


    ─┼───┼──┼─┼────┼────┼───┼─┼──┼─┼─┼────────┼──────────┼──────┼─────┼─┼───┼
    ↑                       ↑                       ↑                       ↑
    tick                    tick                    tick                    tick

To  interpret  the  result as an indication of relative processing time for each
line,  we  must suppose that, in general, lines that take longer to execute will
sustain proportionally more ticks.
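
For example, at 10 milliseconds per tick, a 1,000-tick sample spans roughly ten
seconds of compute time, so a line credited with 14% of the ticks accounts for
about 1.4 seconds of that sample.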

Bugs:

[1] The  time taken to execute the monitoring code often vastly exceeds the time
    taken  to execute the code it is monitoring. This would be OK if the monitor
    took  _exactly_ the same time to execute on each occasion, in which case, we
    could  calculate and subtract the overhead. However, in Dyalog APL, the heap
    manager may take a different number of cycles to satisfy a request depending
    on  the  state  of  fragmentation  of  the heap. This means that attempts to
    account  for  the execution time overhead of the monitoring code can be only
    approximate.

[2] If  the  (copy  of the) subject function, with its embedded monitoring code,
    happens  to  run  in  a time that is an exact multiple of the system clock's
    resolution,  all  bets are off as ticks would tend to accumulate in the same
    few lines.

[3] The generated monitoring function (⍙_f) will fail with a SYNTAX ERROR if any
    local function call does not produce a result. For example:

        loop←{
            0<⍵:∇ ⍵-1
        }
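
    A possible workaround (not part of [ticks] itself) is to arrange for every
    call to produce a result, even a dummy one:

        loop←{
            0<⍵:∇ ⍵-1
            ⍵               ⍝ dummy result for the ⍵=0 case.
        }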

[4] During  execution  of  the  subject function, tail-calling is disabled. This
    means  that a "highly iterative" function such as the following will produce
    misleading results:

        loop←{
            ⍵=0:⍵           ──── comparison with 0 _should_ be
            ∇ ⍵-1           ──── commensurate with subtraction of 1.
        }

        loop ticks 1e3
    1001 15  loop←{
    1001 14      ⍵=0:⍵
    1000 71      ∇ ⍵-1
       0  0  }

[5] As  the monitoring code is merged into a copy of the subject function, there
    is potential for a name clash. The presence of any of the following names in
    the subject function will lead to unpredictable results:

        ⍙_a   ⍙_e   ⍙_f   ⍙_l
        ⍙_m   ⍙_n   ⍙_q   ⍙_s
        ⍙_t   ⍙_x   ⍙_z
        ⍙_    _⍙

[6] The  line  visits counter is incremented whenever the entry monitor code de-
    tects  that the line number (⎕LC) has changed. Such is the case when a local
    function  is invoked from a line other than its definition line. Compare the
    following examples:


    1 ··  dup1←{
    3 ··      dup←{⍵ ⍵}         ──── 3 visits: 1 definition +
    1 ··      a←dup ⍵           ────           1 call +
    1 ··      dup a             ────           1 more call.
    0 ··  }


    1 ··  dup2←{
    1 ··      dup←{⍵ ⍵} ⋄ a←dup ⍵ ⋄ dup a       ─── 1 visit to line.
    0 ··  }


    1 ··  dup3←{
    2 ··      dup←{⍵ ⍵}         ──── 2 visits: 1 definition +
    1 ··      dup dup ⍵         ────           1 (double) call.
    0 ··  }


    1 ··  dup4←{
    1 ··      {⍵ ⍵}{⍵ ⍵}⍵       ──── 1 visit to line.
    0 ··  }


    1 ··  dup5←{
    2 ··      {                 ──── 2 visits: 1 definition +
    1 ··          ⍵ ⍵           ────           1 call.
    1 ··      }{
    1 ··          ⍵ ⍵
    0 ··      }⍵
    0 ··  }


    1 ··  dup6←{
    1 ··      {⍵ ⍵}¨⍵ ⍵         ──── 1 visit to line.
    0 ··  }


[7] When  the execution path through the subject function is determined by fact-
    ors  other  than  its arguments, the first (visits) column of the result may
    contain  non-integers.  This  might  be the case, for example, if ⎕TS or ⎕RL
    were used within the function, as with:

        maze ticks 10 20        ⍝ reports non-integral line visits count.

    Such a result might be interpreted as an _average_ number of line visits.

Technical notes:

The  coding  of  [ticks] is fairly ghastly for the same reason as →mdf←. See the
technical notes in →mdf← for details.

Monitoring is achieved in several stages:

- A copy of the subject function is injected with entry (→⍵) and exit (⍺←) mark-
  ers  at  the  start of each line and at the entry and exit points of each sub-
  function. These  markers are subsequently replaced with calls to the entry and
  exit monitoring sequences:

    ⍙_ ⍵: ⋄ ··· ⍝ entry sequence.
    ⍺ _⍙ ···    ⍝ exit sequence,

  where  ⍺  and ⍵ are replaced with the number of the function line in which the
  sequence  appears. Monitoring function ⍙_ returns 0 (so that its guard is ign-
  ored) and _⍙ returns its right argument as the result of the (sub-) function.
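
  For illustration (an approximation, not the operator's exact rewriting), a
  doctored source line might emerge from this stage looking something like:

    ⍙_ 3: ⋄ msk←×nxt|⍵      ⍝ entry sequence spliced onto a source line.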

- A small loop is executed to determine ⎕AI's clock resolution and therefore the
  time needed to endure 1,000 clock ticks.
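
  A rough sketch of such a measurement (not the operator's actual code) might be:

    res←{                       ⍝ rough ⎕AI compute-clock resolution (ms).
        t←2⊃⎕AI                 ⍝ compute time now;
        {⍵=t:∇ 2⊃⎕AI ⋄ ⍵-t}2⊃⎕AI    ⍝ spin until the clock ticks.
    }
    res ⍬                       ⍝ typically 10 or 16 (or so).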

- The  doctored  function  is  executed  repeatedly for this time period, during
  which sampling information is accumulated in the operator's local variables:

    ⍙_t     time accumulated per line.
    ⍙_n     number of times line is visited (first column of resulting display).
    ⍙_e     number of times entry monitor is called.
    ⍙_x         ..      ..  exit    ..      ..

  The  copy  (⍙_f)  of  the subject function is applied under ⍎ in order to defy
  Dfns' "strictly local to the capsule" rule that would otherwise put the accum-
  ulating variables beyond its scope.

- An  attempt is made to calculate the time overhead spent within the monitoring
  code  itself.  This  is  effected by timing loops with embedded entry and exit
  sequences. Note that these loops are "compiled" by using ⍎, so that they bene-
  fit  from the same interpreter performance improvements, if any (such as idiom
  recognition), as the monitored subject function.

- Finally, this approximate monitoring overhead is subtracted from the monitored
  times  and  the  result displayed using ⎕←. The result of the subject function
  application is returned as a shy result, so that [ticks] may easily be embedd-
  ed within any expression; see the tube.dws example below.

Examples:

⍝ Ackermann's  function  is a challenging case, as the processing time is
⍝ _negligible_ compared with the monitoring overhead. Repeated monitoring
⍝ of this function often shows quite disparate timing.

      3 ack ticks 3
2432 23  ack←{
2432 22      ⍺=0:⍵+1
1244 14      ⍵=0:(⍺-1)∇ 1
1187 41          (⍺-1)∇ ⍺ ∇ ⍵-1
   0  0  }

      3 ack ticks 3
2432 22  ack←{
2432 24      ⍺=0:⍵+1
1244 16      ⍵=0:(⍺-1)∇ 1
1187 38          (⍺-1)∇ ⍺ ∇ ⍵-1
   0  0  }

⍝ However, for more substantial functions, the display can be illuminating. Here
⍝ is the output from a graph searching function in tube.dws. Inserting a call on
⍝ ticks  into  the  trip function shows that around 80% of its time is spent in
⍝ determining which vertices are free.
⍝
⍝   trip[19]  find←⍺.graph{⍺⍺ path ticks ⍺ ⍵}       ⍝ find one leg of the route.
⍝                                  ¯¯¯¯¯

    london trip 'Hammersmith' 'Mornington'
  1  0  path←{                      ⍝ Shortest path from/to ⍵ in graph ⍺.
  1  0      graph←⍺                 ⍝ ⍺ is graph vector.
  1  0      fm to←⍵                 ⍝ starting and target vertex(ices).
  1  1      tree←¯2+(⍳⍴⍺)∊fm        ⍝ initial spanning tree.
573 80      free←{(¯2=tree[⍵])/⍵}   ⍝ free vertices in ⍵.
 20  1      ⍬{                      ⍝ path accumulator.
 19  1          ⍵<0:(⍵=¯2)↓⍺        ⍝ root or unvisited vertex: finished.
 18  3          (⍵,⍺)∇ ⍵⊃tree       ⍝ otherwise: prefix previous (parent) vertex
 18  1      }{                      ⍝ find partial spanning tree:
 18  1          ⍵≡⍬:¯1              ⍝ no vertices left: stitched.
 18  1          1∊to∊⍵:1↑⍵∩to       ⍝ found target: finished.
 17  1          next←free¨graph[⍵]  ⍝ next vertices to visit.
 17  4          back←↑,/⍵+0×next    ⍝ back links.
 17  1          wave←↑,/next        ⍝ vertex wave front.
 17  1          tree[wave]←back     ⍝ set back links in tree.
 17  3          ∇∪wave              ⍝ advance wave front.
  0  0      }fm                     ⍝ from starting vertex.
  0  0  }
Hammersmith
    Hammersmith         Piccadilly
    Barons Court        Piccadilly
    Earl's Court        Piccadilly
    Gloucester Road     Piccadilly
    South Kensington    Piccadilly
    Knightsbridge       Piccadilly
    Hyde Park Corner    Piccadilly
    Green Park          Piccadilly
Green Park
    Green Park          Victoria
    Oxford Circus       Victoria
    Warren Street       Victoria
    Euston              Victoria
Euston
    Euston              Northern
    Mornington Crescent Northern
Mornington Crescent

⍝ This example uses Hungarian Assignment on a 100×100 matrix. Note that just
⍝ four of the lines account for around 60% of processing time:

      assign ticks ?100 100⍴1000
  1  0  assign←{⎕ML←0                               ⍝ Hungarian method cost assi
  1  0
  2  0      step0←{step1(⌽⌈\⌽⍴⍵)↑⍵}                 ⍝ 0: at least as many rows a
  1  0
  2  0      step1←{step2↑(↓⍵)-⌊/⍵}                  ⍝ 1: reduce rows by minimum
  1  0
  2  0      step2←{                                 ⍝ 2: mark independent zeros.
  4  0          stars←{                             ⍝ independent zeros.
  3  0              ~1∊⍵:⍺                          ⍝ no more zeros: done.
  2  0              next←<\<⍀⍵                      ⍝ more independent zeros.
  2  0              mask←(rows next)∨cols next      ⍝ mask of dependent rows and
  2  0              (⍺∨next)∇ ⍵>mask                ⍝ ⍺-accumulated star matrix.
  0  0          }                                   ⍝
  2  0          zeros←{⍵+0 stars ⍵}⍵=0              ⍝ 1=>zero, 2=>independent ze
  1  0          step3 ⍵ zeros                       ⍝ next step: 3.
  0  0      }
  1  0
 41  0      step3←{costs zeros←⍵                    ⍝ 3: cover cols with starred
 40  0          stars←zeros=2                       ⍝ starred zeros.
 40  0          covers←2×cols stars                 ⍝ covered cols.
 40  0          ~0∊covers:stars                     ⍝ all cols covered: solution
 39  0          step4 costs zeros covers            ⍝ next step: 4.
  0  0      }
  1  0
828  1      step4←{costs zeros covers←⍵             ⍝ 4: adjust covering lines.
827  3          mask←covers=0                       ⍝ mask of uncovered elements
827  5          open←1=mask×zeros                   ⍝ uncovered zeros.
827  5          ~1∊open:(⌊/(,mask)/,costs)step6 ⍵   ⍝ no uncovered zeros, next s
767  1          prime←first open                    ⍝ choose first uncovered zer
767  1          prow←rows prime                     ⍝ row containing prime.
767  6          star←2=zeros×prow                   ⍝ star in row containing pri
806  6          ~1∊star:prime step5{                ⍝ no star in row, next step
 39  0              costs ⍵ prime                   ⍝ adjusted zeros matrix,
  0  0          }zeros+2×prime                      ⍝ new primed zero (3).
728  1          cnext←covers+prow-2×cols star       ⍝ adjusted covers.
728  6          znext←zeros⌈3×prime                 ⍝ primed zero.
728  4          ∇ costs znext cnext                 ⍝ adjusted zeros and covers
  0  0      }
  1  0
109  0      step5←{costs zeros prime←⍵              ⍝ 5: exchange starred zeros.
108  1          star←(cols prime)∧zeros=2           ⍝ next star.
147  0          ~1∊star:step3 ⍺{                    ⍝ no stars: next step :3.
117  1              {costs ⍵}{⍵-2×⍵=3}⍵-⍺∧⍵>1       ⍝ unstarred stars; starred p
  0  0          }zeros                              ⍝ adjusted zero markers.
 69  0          pnext←(rows star)∧zeros=3           ⍝ next prime.
 69  0          (⍺∨pnext∨star)∇ costs zeros pnext   ⍝ ⍺-accumulated prime-star-·
  0  0      }
  1  0
 61  0      step6←{costs zeros covers←⍵             ⍝ 6: adjust cost matrix.
 60 20          cnext←costs+⍺ׯ1 1+.×0 3∘.=covers   ⍝ add and subtract minimum v
 60  1          znext←zeros+(×costs)-×cnext         ⍝ amended zeros marker.
 60  0          step4 cnext znext covers            ⍝ next step: 4.
  0  0      }
  1  0
839 15      rows←{(⍴⍵)⍴(⊃⌽⍴⍵)/∨/⍵}                  ⍝ row propagation.
879 12      cols←{(⍴⍵)⍴∨⌿⍵}                         ⍝ column propagation.
768 12      first←{(⍴⍵)⍴<\,⍵}                       ⍝ first 1 in bool matrix.
  1  0
  1  0      (⍴⍵)↑step0 ⍵                            ⍝ start with step 0.
  0  0  }

See also: mdf cmpx time

Back to: contents

Back to: Workspaces