Dyalog APL - ⍺-combination matrix of ⍳⍵.

mat ← m ##.cmat n                           ⍝ ⍺-combination matrix of ⍳⍵.

Returns  an  m!n-row,  m-column, matrix of the m-item combination vectors of ⍳n.
Both  rows  and  row  items  of  the result are in ascending order, thus for any
⍺ ⍵≥0:

        {⍵≡⍳⍴⍵}⍋⍺ cmat ⍵            ⍝ Rows in ascending order.

        ∧/{{⍵≡⍳⍴⍵}⍋⍵}¨↓⍺ cmat ⍵     ⍝ Items in each row in ascending order.

Technical notes:

Notice that a combination matrix can be divided into sections, some of which are
themselves, combination matrices. For example:

                             ┌─┬───┐              ┌───┐
                  1 2 3      │1│2 3│              │1 2│
                  1 2 4      │1│2 4│              │1 3│
                  1 2 5      │1│2 5│      1 , 1 + │1 4│     1 , 1 + 2 cmat 4
                  1 3 4      │1│3 4│              │2 3│
    3 cmat 5  =>  1 3 5  =>  │1│3 5│  =>    ⍪     │2 4│  =>   ⍪
                  1 4 5      │1│4 5│              │3 4│
                  2 3 4      ├─┴───┤              └───┘
                  2 3 5      │2 3 4│              ┌─────┐
                  2 4 5      │2 3 5│          1 + │1 2 3│       1 + 3 cmat 4
                  3 4 5      │2 4 5│              │1 2 4│
                             │3 4 5│              │1 3 4│
                             └─────┘              │2 3 4│
                                                  └─────┘

This observation leads to the recurrence relationship:

    ⍺ cmat ⍵ ←→ (1,1+(⍺-1)cmat ⍵-1)⍪1+⍺ cmat ⍵-1

Successive transformations of the expression on the right, produce:

        Expression                              Transformation
        ----------                              --------------
        (1,1+(⍺-1)cmat ⍵-1)⍪1+⍺ cmat ⍵-1        [0] cmat → ∇
                  ¯¯¯¯          ¯¯¯¯
    →   (1,1+(⍺-1)∇⍵-1)⍪1+⍺ ∇⍵-1                [1] ⍺⍪⍵ → ↑⍪/⍺ ⍵
                       ¯
    →   ↑⍪/(1,1+(⍺-1)∇ ⍵-1)(1+⍺ ∇ ⍵-1)          [2] ⍺ ⍵ → ⌽ ⍵ ⍺
            ¯¯¯¯¯¯¯¯¯¯¯¯¯¯  ¯¯¯¯¯¯¯¯¯
    →   ↑⍪/⌽(1+⍺ ∇ ⍵-1)(1,1+(⍺-1)∇⍵-1)          [3] ⍺({}⍵) → {}\⍺ ⍵
                        ¯¯
    →   ↑⍪/⌽{1,⍵}\(1+⍺ ∇ ⍵-1)(1+(⍺-1)∇⍵-1)      [4] (1+⍺)(1+⍵) → 1+⍺ ⍵
                   ¯¯         ¯¯
    →   ↑⍪/⌽{1,⍵}\1+(⍺ ∇ ⍵-1)((⍺-1)∇⍵-1)        [5] (⍺{}⍵)(ß{}⍵) → ⍺ ß{}¨⊂⍵
                     ¯¯¯      ¯¯¯¯¯¯
    →   ↑⍪/⌽{1,⍵}\1+(⍺-0 1)¨∇⍵-1                [6]

Notice  transformation[3]:  ⍺({}⍵) → {}\⍺ ⍵,  which  applies  a  function to the
_second_  item of a 2-vector. Using this technique, we can avoid temporary local
names in situations such as:

    lft rgt←pair ⋄ lft({···⍵}rgt)           ⍝ clumsier.

    {···⍵}\pair                             ⍝ neater.

The base cases for the recursion are:

    0 cmat ⍵ → 1 0⍴0
    ⍺ cmat 0 → 0 ⍺⍴0

or:
    0∊⍺ ⍵:(⍺=0)⍺⍴0

The (index-origin independent) code is therefore:

    cmat←{                              ⍝ ⍺-combination matrix of ⍳⍵.
        0∊⍺ ⍵:(⍺=0)⍺⍴0                  ⍝ done if zero ⍺ or ⍵, otherwise,
        ↑⍪/⌽{⎕IO,⍵}\1+(⍺-0 1)∇¨⍵-1      ⍝ catenate sub-combinations.
    }

Notice  however,  that  the  algorithm is inefficient in the same way as a naïve
coding  of  →fibonacci←, in that the function is applied to the _same_ arguments
many  times  over. If we display the arguments for the function calls as a tree,
by injecting:

        ⎕←(,↑(⍴⎕LC)⍴⊂'·   '),'[',(⍕⍺ ⍵),']'

as the first line of the function, we see the following "dependency tree", where
[⍺ ⍵] stands for (⍺ cmat ⍵):

      2 cmat 4
·   [2 4]
·   ·   [2 3]
·   ·   ·   [2 2]
·   ·   ·   ·   [2 1]
·   ·   ·   ·   ·   [2 0]
·   ·   ·   ·   ·   [1 0]
·   ·   ·   ·   [1 1]
·   ·   ·   ·   ·   [1 0]
·   ·   ·   ·   ·   [0 0]
·   ·   ·   [1 2]
·   ·   ·   ·   [1 1]
·   ·   ·   ·   ·   [1 0]
·   ·   ·   ·   ·   [0 0]
·   ·   ·   ·   [0 1]
·   ·   [1 3]
·   ·   ·   [1 2]
·   ·   ·   ·   [1 1]
·   ·   ·   ·   ·   [1 0]
·   ·   ·   ·   ·   [0 0]
·   ·   ·   ·   [0 1]
·   ·   ·   [0 2]

This is interpreted:

    [2 4] depends on
    ·   [2 3] which depends on
    ·   ·   [2 2] which depends on
    ·   ·   ···
        and
    ·   ·   [1 2] which depends on     ⍝ [1 2] evaluated here.
    ·   ·   ···
    and
    ·   [1 3] which depends on
    ·   ·   [1 2] which depends on     ⍝ [1 2] evaluated again here.
    ·   ·    ···
··· and so on.

Notice  how [1 2], together with its whole sub tree, is evaluated twice. In this
case,  [2 4] generates 21 function calls. This number increases very rapidly and
in general, [⍵ ⍵] generates (¯1+2*⍵+1) calls.
                                                              ┌─┐     ┌───┐
The dependencies are better represented as a _lattice_, where │↑│ and │←--│ mean
"depends on":                                                 │|│     └───┘
                                                              └─┘
    [2 4]←--[2 3]←--[2 2]←--[2 1]←--[2 0]
    ↑       ↑       ↑       ↑
    |       |       |       |
    [1 3]←--[1 2]←--[1 1]←--[1 0]
    ↑       ↑       ↑
    |       |       |
    [0 2]   [0 1]   [0 0]

This expresses exactly the same recurrence relationship as the tree, except that
in  this representation, the nodes are _shared_. Notice how the lattice has only
12,  as opposed to the tree's 21 nodes. If we can find a way to evaluate each of
its nodes only once, a significant performance benefit should result.

In  fact,  we can do a little better than this. We chose [⍺ 0] and [0 ⍵] as base
cases for the recursion, because we can "conjure these results out of thin air".
Looking  at  the  lattice,  we see a column at the base of the "ramp", which has
values  [2 2] [1 1] [0 0].  As  we  know that [⍵ ⍵] = (1 ⍵⍴⍳⍵) for all ⍵, we can
equally  easily  use  [⍵ ⍵]  and [0 ⍵] as base cases, as long as we make special
provision  for starting values from _inside_ the ramp where ⍺>⍵. This enables us
to trim the lattice to:

    [2 4]←--[2 3]←--[2 2]
    ↑       ↑
    |       |
    [1 3]←--[1 2]←--[1 1]
    ↑       ↑
    |       |
    [0 2]   [0 1]

A slightly larger example (4 cmat 8), shows more clearly what's going on. Notice
that  this lattice has 24 nodes, while its corresponding dependency _tree_ would
have 325.

    [4 8]←--[4 7]←--[4 6]←--[4 5]←--[4 4]
    ↑       ↑       ↑       ↑
    |       |       |       |
    [3 7]←--[3 6]←--[3 5]←--[3 4]←--[3 3]
    ↑       ↑       ↑       ↑
    |       |       |       |
    [2 6]←--[2 5]←--[2 4]←--[2 3]←--[2 2]
    ↑       ↑       ↑       ↑
    |       |       |       |
    [1 5]←--[1 4]←--[1 3]←--[1 2]←--[1 1]
    ↑       ↑       ↑       ↑
    |       |       |       |
    [0 4]   [0 3]   [0 2]   [0 1]

The  diagram  suggests  using  right-to-left  or  bottom-to-top  _reduction_  to
accumulate values towards the top left corner, which is the required result. For
reasons  that  will  become apparent, reduction from the right turns out to be a
slightly better bet. Using ⎕ML←2, so that "↑" means "first":

    ↑↑{⍺···⍵}/ [0 4] ··· [0 2] [0 1] ,⊂ [4 4] ··· [2 2] [1 1]

Now,  as  the  [0 ⍵]'s  all have the same value: (1 0⍴0), we can substitute this
value for ⍺ _inside_ the reduction's operand and ignore the value that is passed
into  the  operand function as left argument. This means that we may replace the
vector [0 4] ··· [0 2] [0 1] by _any_ vector providing it has the same number of
items:  (⍳⍵-⍺) will suffice.

    ↑↑{(1 0⍴0)···⍵}/ (⍳⍵-⍺),⊂ [4 4] ··· [2 2] [1 1]

Finally,  to  squeeze out the last drop of refinement, we can compute the little
array  (1 0⍴0)  just  once and bind it as an operand to the reduction's operand,
where it becomes "⍺⍺".

    ↑↑(1 0⍴0){⍺⍺···⍵}/ (⍳⍵-⍺),⊂ [4 4] ··· [2 2] [1 1]

The base cases: [4 4] ··· [2 2] [1 1], evaluate to ({1 ⍵⍴⍳⍵}¨+\⍺⍴1), noting that
(+\⍺⍴1) is an origin independent expression for "1 to ⍺", which gives:

    ↑↑(1 0⍴0){⍺⍺···⍵}/ (⍳⍵-⍺),⊂{1 ⍵⍴⍳⍵}¨+\⍺⍴1

At this point, we can cover the "special provision" for the null case where ⍺>⍵.
The result is a matrix of shape (0 ⍺), so the reshape becomes: (⍺≤⍵){⍺ ⍵⍴⍳⍵}.

    ↑↑(1 0⍴0){⍺⍺···⍵}/ (⍳⍵-⍺),⊂(⍺≤⍵){⍺ ⍵⍴⍳⍵}¨+\⍺⍴1

Now  all  that  remains,  is  to  code  the workings of the reduction's operand:
{⍺⍺···⍵}. Each application of the operand will take a _column_ of the lattice as
right argument and a nominal [0 ⍵] as left operand.

Transposing  the  rightmost  column  of  the  lattice above, the operand will be

applied to:    ⍵: [4 4]   [3 3]   [2 2]   [1 1]  ⍺⍺: [0 1]
                  ↓       ↓       ↓       ↓          ↓
with result:      [4 5]←--[3 4]←--[2 3]←--[1 2]←-----'

To  achieve  this  result,  we need to do a further, but this time, _cumulative_
reduction. Without accumulation, the result of the "inner" reduction in the case
above  would be [4 5], whereas we need the whole vector [4 5] ··· [1 2], but ex-
cluding the initial base case ⍺⍺:[0 1], to pass back to the outer reduction. The
template for such a reduction is:

    ¯1↓↑{(⍺ ··· ↑⍵),⍵}/(⊂···),···,(⊂···),(⊂⊂···)
      ││     │  │  │    │          │      └────── _Extra_ rightmost enclosure,
      ││     │  │  │    │          │                      for accumulation.
      ││     │  │  │    └──────────┴───────────── Items of reduction argument.
      ││     │  │  └─── Prefix new result item to accumulation.
      ││     │  └────── First item of accumulated result.
      ││     └───────── Operation between successive left and right args.
      │└─────────────── Disclose vector result of vector reduction.
      └──────────────── Discard initial base case ⍺⍺:[0 1].

Substituting our specific cases:

    ¯1↓↑{(⍺ ··· ↑⍵),⍵}/⍵,⊂⊂⍺⍺

The  remaining "···" is exactly the expression developed right back at the start
for the "tree" algorithm: ⍪/⌽{⎕IO,⍵}\1+···, so the complete operand function is:

    ¯1↓↑{(⍪/⌽{⎕IO,⍵}\1+⍺(↑⍵)),⍵}/⍵,⊂⊂⍺⍺

A last tweak is to "factor out" (⊂⊂⍺⍺) to where ⍺⍺ is generated, so that:

        ···(1 0⍴0){···{···}/⍵,⊂⊂⍺⍺}/···
becomes:    __                ¯¯
        ···(⊂⊂1 0⍴0){···{···}/⍵,⍺⍺}/···

The final code looks like this. Notice that apart from the specification of ⎕ML,
it's  a one-liner. More to the point, there are no assignments, tests or control
branches; it is a single expression:

    cmat←{⎕ML←2                             ⍝ ⍺-combination matrix of ⍳⍵.
        ↑↑(⊂⊂1 0⍴0){                        ⍝ base-case: [0 ⍵].
            ¯1↓↑{                           ⍝ removing base case from,
                (⍪/⌽{⎕IO,⍵}\1+⍺(↑⍵)),⍵      ⍝ accumulation of,
            }/⍵,⍺⍺                          ⍝ sub- (⍺ ⍵-1) combinations,
        }/(⍳0⌈⍵-⍺),⊂(⍺≤⍵){⍺ ⍵⍴⍳⍵}¨⌽+\⍺⍴1    ⍝ with base cases [⍺ 0].
    }

Alternative codings
-------------------
John  Niss  Hansen  suggests  the  following  much  simpler coding. Effectively,
the  function  generates  all binary numbers of width ⍵ and selects those with ⍺
bits set. Each binary "mask" is then used to select a different ⍺ combination of
numbers  from  0  to ⍵-1.  This  works  well  for small values of ⍵ but the sub-
expression  (⍳⍵⍴2)  uses  an  exponential  amount  of space (and time) as ⍵ gets
larger. The function assumes ⎕io=0.

      cmat←{↑⌽((⍺=+/¨t)/t←,⍳⍵⍴2)/¨⊂⍳⍵}

VMJ suggests the following one-line speedy alternative for small-to-medium sized
results.

      cmat←{⎕ML←3 ⋄ ⊃⊃{{(</2↑[⎕IO+1]⊃⍵)/⍵},⍺∘.,⍵}/(⍳⍺)+⊂(-⎕IO)+⍳⍵-⍺-1}

and also:

      cmat←{
           (⎕IO ⎕ML)←1 3 ⋄ wd←⍺ ⋄ mx←⍵
           ⊃↑2{⍺>wd:⍵
               a←2⊃⍵
               b←mx-a+wd-⍺ ⋄ q←∊a+⍳¨b
               (⍺+1)∇((b/↑⍵),¨q)q
           }2⍴⊂⍳⍵-⍺-1
       }

This  version  is based on Ken Gradney's code, which was published in Vector Vol
18.1:

      cmat←{
          ⎕ML←3 ⋄ n←1+⍵-⍺ ⋄ c←(⍺-1)+⍳n

          c{⎕IO≥↑⍺:↑⍵
              m←⌽+\⌽↑1↓⍵
              (⍺-1)∇((m/⍺-1),(↑⍵)[∊((↑m)-m)+⍳¨m;])m
          }((n,1)⍴c)(n⍴1)
      }


Roger Hui proposes this short coding:

      cmat←{⊖⊃⍪/{k,¨⍪\1+⍵}⍣⍺⊢(⊂⍉⍪⍬),d⍴⊂0 0⍴k←⌽⍳1+d←⍵-⍺}

which  is  faster than the others except on really large arguments (when it is a
bit slower).

Roger says:  The ⊖ is needed because Dyalog does not have "suffix scan" (⊖f⍀⊖⍵).
I am led to doing the regular APL scan on the reversed argument,  then reversing
the whole thing at the end.

Examples:

      display 0 cmat 5
┌⊖┐
↓0│
└~┘

      display 1 cmat 5
┌→┐
↓1│
│2│
│3│
│4│
│5│
└~┘

      display 2 cmat 5
┌→──┐
↓1 2│
│1 3│
│1 4│
│1 5│
│2 3│
│2 4│
│2 5│
│3 4│
│3 5│
│4 5│
└~──┘

      display 3 cmat 5
┌→────┐
↓1 2 3│
│1 2 4│
│1 2 5│
│1 3 4│
│1 3 5│
│1 4 5│
│2 3 4│
│2 3 5│
│2 4 5│
│3 4 5│
└~────┘

      display 4 cmat 5
┌→──────┐
↓1 2 3 4│
│1 2 3 5│
│1 2 4 5│
│1 3 4 5│
│2 3 4 5│
└~──────┘

      display 5 cmat 5
┌→────────┐
↓1 2 3 4 5│
└~────────┘

      display 6 cmat 5
┌→──────────┐
⌽0 0 0 0 0 0│
└~──────────┘

      ⍉↑{⍴⍵ cmat 5}¨0 to 6                      ⍝ Row and column sizes.
1 5 10 10 5 1 0
0 1  2  3 4 5 6

      2{⍵[⍺ cmat⍴⍵]}'scissors' 'paper' 'stone'  ⍝ 2-combos of nested vector.
 scissors  paper
 scissors  stone
 paper     stone

      ↓3{⍵[⍺ cmat⍴⍵]}'abcde'                    ⍝ 3-combos of char vector.
 abc  abd  abe  acd  ace  ade  bcd  bce  bde  cde

      display¨0 1 2 3∘.cmat 0 1 2 3             ⍝ Combos of small vectors.
 ┌⊖┐      ┌⊖┐      ┌⊖┐      ┌⊖┐
 ↓0│      ↓0│      ↓0│      ↓0│
 └~┘      └~┘      └~┘      └~┘
 ┌→┐      ┌→┐      ┌→┐      ┌→┐
 ⌽0│      ↓1│      ↓1│      ↓1│
 └~┘      └~┘      │2│      │2│
                   └~┘      │3│
                            └~┘
 ┌→──┐    ┌→──┐    ┌→──┐    ┌→──┐
 ⌽0 0│    ⌽0 0│    ↓1 2│    ↓1 2│
 └~──┘    └~──┘    └~──┘    │1 3│
                            │2 3│
                            └~──┘
 ┌→────┐  ┌→────┐  ┌→────┐  ┌→────┐
 ⌽0 0 0│  ⌽0 0 0│  ⌽0 0 0│  ↓1 2 3│
 └~────┘  └~────┘  └~────┘  └~────┘

      ⎕io←0 ⋄ 2 cmat 4                          ⍝ Code is "origin sensitive".
0 1
0 2
0 3
1 2
1 3
2 3

      ⍉ ∘.{⍬⍴⍴⍺ cmat ⍵}⍨ 0 to 12                ⍝ Pascal's triangle.
1  0  0   0   0   0   0   0   0   0  0  0 0
1  1  0   0   0   0   0   0   0   0  0  0 0
1  2  1   0   0   0   0   0   0   0  0  0 0
1  3  3   1   0   0   0   0   0   0  0  0 0
1  4  6   4   1   0   0   0   0   0  0  0 0
1  5 10  10   5   1   0   0   0   0  0  0 0
1  6 15  20  15   6   1   0   0   0  0  0 0
1  7 21  35  35  21   7   1   0   0  0  0 0
1  8 28  56  70  56  28   8   1   0  0  0 0
1  9 36  84 126 126  84  36   9   1  0  0 0
1 10 45 120 210 252 210 120  45  10  1  0 0
1 11 55 165 330 462 462 330 165  55 11  1 0
1 12 66 220 495 792 924 792 495 220 66 12 1

See also: pmat fibonacci rr

Back to: contents

Back to: Workspaces