Dyalog APL - Strongly connected components of directed graph ⍵.

C ← {trace←0} ##.scc G      ⍝ Strongly connected components of directed graph ⍵.

Argument vector G is a graph (see →Graphs←).  Result C is a vector of  component
numbers, indicating which vertices share the same strongly connected components.

[scc] uses Tarjan's algorithm. See:
http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm

In short:  the algorithm performs a depth-first search of the graph, maintaining
a vector of the path between the starting (root) and current vertex. If a vertex
is encountered that is already in the path,  this constitutes a cycle, the vert-
ices of which are thus strongly connected. The items of this cycle are marked as
belonging to the same component, and the process resumed until all vertices have
been visited.

The  algorithm  is "linear", of order O(≢⍵,∊⍵):  the number of vertices plus the
number of edges.

Here are three representations of the same graph:  pictorial; schematic; and the
nested vector, suitable as an argument, when ⎕IO=1, for [scc]:

                                   1 → 2
    1------>2------>3<----->4      2 → 3 5 6     ┌→┬─────┬───┬───┬───┬─┬─┬───┐
    ∧      /|       |       ∧      3 → 4 7       │2│3 5 6│4 7│3 8│1 6│7│6│4 7│
    |    /  |       |       |      4 → 3 8       └→┴~───→┴~─→┴~─→┴~─→┴→┴→┴~─→┘
    |  /    |       |       |      5 → 1 6
    |└      ∨       ∨       ∨      6 → 7
    5------>6<----->7<------8      7 → 6
                                   8 → 4 7

The strongly connected components are vertices (1 2 5)(3 4 8)(6 7) as represent-
ed by the result: 1 1 2 2 1 3 3 2

    1 2 3 4 5 6 7 8  ← vertex
    1 1 2 2 1 3 3 2  ← strongly connected component number

[scc] is origin-independent, so that it can accept graphs in either origin.

Technical notes:

Tarjan's algorithm is often presented in a procedural style, using an inner  and
outer loop to mutate the state of some global values or of properties associated
with each vertex of the subject graph.

In contrast, as an exercise in pure functional programming, function [scc] main-
tains no state.  Comparing the code with that in the Wikipedia article, propert-
ies associated with vertices: "v.index" and "v.lowlink", together with the stack
"S" and a counter for the next index value "index", are hosted as  a  vector-of-
vectors "5-tuple" T, which is passed between subfunctions. The remaining item of
this tuple: "C" is the accumulator for the result: a simple vector of the vertex
indices of strongly connected components.

    Tuple "fields"                                          Wikipedia equivalent
    --------------                                          --------------------
    C :: a vector of component group indices (the result)   (comment-only)
    L :: a vector of low-link values per vertex ·   ·   ·   v.lowlink.
    X :: vector of index number per vertex  ·   ·   ·   ·   v.index
    x :: scalar: next index number  ·   ·   ·   ·   ·   ·   index
    S :: stack of indices of unconnected vertices   ·   ·   S

The items of the 5-tuple vector are accessed with named indices: C L X x S, est-
ablished at initialisation time:

    T ← ...             ⍝ initial value for tuple T ...
    C L X x S ← ⍳⍴T     ⍝ access names for T's items.

Coding in this way is significantly more complex than its procedural  equivalent
(purity comes at a price).  Extensive use is made of primitive operator @ which,
given an argument T0, returns a successor tuple T1.

    Wikipedia               [scc]
    ---------               -----
    S                       S⊃T0
    index := index + 1      T1←1+at x⊢T0
    v.lowlink := index      T2←v L∆ T1
                            where: L∆←x put L
                            where: put←{(⍺⍺⊃⍵)⊣at(⊂⍵⍵ ⍺)⊢⍵}
(muse:

    The up-side of these contortions is that  function  [scc]  is  effectively a
    single expression, which makes it amenable to agressive  code-transformation
    techniques such as might be used by an internal code optimiser.

    (muse:
        It would be interesting to speculate about language mechanisms to assist
        with this technique of replacing the modification of global  state  with
        passing everything around as argument and result tuples.  The goal would
        be to make it pleasant to denote a successor  tuple  in  which  specfied
        items and their numerically indexed  elements differ  from  the argument
        tuple.  It is JMS's guess that this is the motivation behind "monads" in
        Haskell (but he's been wrong before).

        Passing namespace "refs" with dottable field names doesn't help  because
        the name assignement in this case is procedural and so can not easily be
        embedded in a larger expression.  We would have to resort  to  something
        like:

            T1 ← T0.{⎕this⊣X[⍵]←x}v

        which doesn't seem any more friendly than:

            T1 ← (x⊃T0)⊣@(⊂X v)⊢T0
    )

    Notice that, where a procedural treatment would tend to use a statement sep-
    arator, this more functional approach uses ⊢ to compose transformations:

        ... ⋄ I+←1 ⋄ J⌊←k ⋄ S,⍨←v ⋄ ...     ⍝ procedural
        ... ⊢ 1+@I ⊢ k⌊@J ⊢ v,¨@S ⊢ ...     ⍝ functional
)

Requires: →dsp← for optional tracing.

An alternative coding
---------------------
Nick Nikolov provides this alternative one-liner, which uses the transitive
closure of the adjacency matrix (see →Graphs←).

    scc←{(∪⍳⊢)↓∧∘⍉⍨∨.∧⍨⍣≡i∘.∊⍵,¨i←⍳≢⍵}
    ⍝    ·     ·   ·     ·   ·  i←⍳≢⍵   vertex indices
    ⍝    ·     ·   ·     ·   ⍵,¨i   ·   consider each vertex a SCC by itself
    ⍝    ·     ·   ·     i∘.∊   ·   ·   neighbour lists to adjacency matrix
    ⍝    ·     ·   ∨.∧⍨⍣≡   ·   ·   ·   transitive closure: g[x;y] ←→ path x → y
    ⍝    ·     ∧∘⍉⍨ ·   ·   ·   ·   ·   ... and from y to x
    ⍝    (∪⍳⊢)↓ ·   ·   ·   ·   ·   ·   renumbering of component numbers

The version is very good for small graphs but its space and time requirements
grow rapidly as the size increases.

Condensation
------------
Contracting  each  strongly  connected  component  into a single vertex yields a
directed acyclic graph (DAG), the "condensation of G".  The  following  function
returns a 2-vector pair:  the condensation graph,  together with a corresponding
vector of condensed vertices from the original graph.

    con←{               ⍝ Condensation of graph ⍵.
        c←scc ⍵         ⍝ strongly connected components
        v←{⊂⍵}⌸ c       ⍝ component-grouped vertex indices
        e←c{⊂⍵}⌸ ⍵      ⍝   ..      ..      edges
        x←∪¨(∊¨e)~¨v    ⍝ out-of-component edges
        m←↓∨/¨x∘.∊v     ⍝ masks of remote vertices
        g←m/¨⊂⍳⍴v       ⍝ condensed DAG
        g v             ⍝ ... and contracted vertices
    }

Examples:

⍝   1------>2------>3<------>4          ⍝ Wikipedia
⍝   ∧      /|       |        ∧
⍝   |    /  |       |        |
⍝   |  /    |       |        |
⍝   |└      ∨       ∨        ∨
⍝   5------>6<----->7<-------8

    ⊢ g ← 2(3 5 6)(4 7)(3 8)(1 6)7 6(4 7)   ⍝ directed (origin-1) graph.
┌─┬─────┬───┬───┬───┬─┬─┬───┐
│2│3 5 6│4 7│3 8│1 6│7│6│4 7│
└─┴─────┴───┴───┴───┴─┴─┴───┘
    scc g                               ⍝ strongly connected components.
1 1 2 2 1 3 3 2
                                        ⍝ Using function [con] from above:
    con g                               ⍝ condensation and contracted vertices.
┌────────┬─────────────────┐
│┌───┬─┬┐│┌─────┬─────┬───┐│
││2 3│3││││1 2 5│3 4 8│6 7││
│└───┴─┴┘│└─────┴─────┴───┘│
└────────┴─────────────────┘

    isdag ← scc ≡ ⍳∘≢                   ⍝ is-a-DAG (contains no cycles)
    isdag g                             ⍝ g contains at least one cycle
0
    isdag (⍳10),⊂⍬                      ⍝ graph is acyclic
1

⍝   1<------2<------3<------>4          ⍝ Wikipedia
⍝   |      ┐∧       ∧        ∧
⍝   |    /  |       |        |
⍝   |  /    |       |        |
⍝   ∨/      |       |        |
⍝   5<------6<----->7<-------8<--.
⍝                            '---'

    ⊢ w ← 5 1(2 4)3 2(2 5 7)(3 6)(4 7 8)
┌─┬─┬───┬─┬─┬─────┬───┬─────┐
│5│1│2 4│3│2│2 5 7│3 6│4 7 8│
└─┴─┴───┴─┴─┴─────┴───┴─────┘
    scc w
1 1 2 2 1 3 3 4

    con w                               ⍝ condensation and contracted vertices.
┌────────────┬─────────────────┐
│┌┬─┬───┬───┐│┌─────┬───┬───┬─┐│
│││1│1 2│2 3│││1 2 5│3 4│6 7│8││
│└┴─┴───┴───┘│└─────┴───┴───┴─┘│
└────────────┴─────────────────┘

    ⍝ in the following trace, notice that the stack grows to the left:

    1 scc w                             ⍝ left arg is trace option
1│0 0 0 0 0 0 0 0│0 0 0 0 0 0 0 0│0 0 0 0 0 0 0 0│1│
5│0 0 0 0 0 0 0 0│1 0 0 0 0 0 0 0│1 0 0 0 0 0 0 0│2│1
2│0 0 0 0 0 0 0 0│1 0 0 0 2 0 0 0│1 0 0 0 2 0 0 0│3│5 1
3│1 1 0 0 1 0 0 0│1 1 0 0 1 0 0 0│1 3 0 0 2 0 0 0│4│
4│1 1 0 0 1 0 0 0│1 1 4 0 1 0 0 0│1 3 4 0 2 0 0 0│5│3
6│1 1 2 2 1 0 0 0│1 1 4 4 1 0 0 0│1 3 4 5 2 0 0 0│6│
7│1 1 2 2 1 0 0 0│1 1 4 4 1 6 0 0│1 3 4 5 2 6 0 0│7│6
8│1 1 2 2 1 3 3 0│1 1 4 4 1 6 6 0│1 3 4 5 2 6 7 0│8│
1 1 2 2 1 3 3 4

See also: Graphs dsp

Back to: contents

Back to: Workspaces