mat ← m ##.cmat n ⍝ ⍺-combination matrix of ⍳⍵.
Returns an m!n-row, m-column, matrix of the m-item combination vectors of ⍳n.
Both rows and row items of the result are in ascending order, thus for any
⍺ ⍵≥0:
{⍵≡⍳⍴⍵}⍋⍺ cmat ⍵ ⍝ Rows in ascending order.
∧/{{⍵≡⍳⍴⍵}⍋⍵}¨↓⍺ cmat ⍵ ⍝ Items in each row in ascending order.
Technical notes:
Notice that a combination matrix can be divided into sections, some of which are
themselves, combination matrices. For example:
┌─┬───┐ ┌───┐
1 2 3 │1│2 3│ │1 2│
1 2 4 │1│2 4│ │1 3│
1 2 5 │1│2 5│ 1 , 1 + │1 4│ 1 , 1 + 2 cmat 4
1 3 4 │1│3 4│ │2 3│
3 cmat 5 => 1 3 5 => │1│3 5│ => ⍪ │2 4│ => ⍪
1 4 5 │1│4 5│ │3 4│
2 3 4 ├─┴───┤ └───┘
2 3 5 │2 3 4│ ┌─────┐
2 4 5 │2 3 5│ 1 + │1 2 3│ 1 + 3 cmat 4
3 4 5 │2 4 5│ │1 2 4│
│3 4 5│ │1 3 4│
└─────┘ │2 3 4│
└─────┘
This observation leads to the recurrence relationship:
⍺ cmat ⍵ ←→ (1,1+(⍺-1)cmat ⍵-1)⍪1+⍺ cmat ⍵-1
Successive transformations of the expression on the right, produce:
Expression Transformation
---------- --------------
(1,1+(⍺-1)cmat ⍵-1)⍪1+⍺ cmat ⍵-1 [0] cmat → ∇
¯¯¯¯ ¯¯¯¯
→ (1,1+(⍺-1)∇⍵-1)⍪1+⍺ ∇⍵-1 [1] ⍺⍪⍵ → ↑⍪/⍺ ⍵
¯
→ ↑⍪/(1,1+(⍺-1)∇ ⍵-1)(1+⍺ ∇ ⍵-1) [2] ⍺ ⍵ → ⌽ ⍵ ⍺
¯¯¯¯¯¯¯¯¯¯¯¯¯¯ ¯¯¯¯¯¯¯¯¯
→ ↑⍪/⌽(1+⍺ ∇ ⍵-1)(1,1+(⍺-1)∇⍵-1) [3] ⍺({}⍵) → {}\⍺ ⍵
¯¯
→ ↑⍪/⌽{1,⍵}\(1+⍺ ∇ ⍵-1)(1+(⍺-1)∇⍵-1) [4] (1+⍺)(1+⍵) → 1+⍺ ⍵
¯¯ ¯¯
→ ↑⍪/⌽{1,⍵}\1+(⍺ ∇ ⍵-1)((⍺-1)∇⍵-1) [5] (⍺{}⍵)(ß{}⍵) → ⍺ ß{}¨⊂⍵
¯¯¯ ¯¯¯¯¯¯
→ ↑⍪/⌽{1,⍵}\1+(⍺-0 1)¨∇⍵-1 [6]
Notice transformation[3]: ⍺({}⍵) → {}\⍺ ⍵, which applies a function to the
_second_ item of a 2-vector. Using this technique, we can avoid temporary local
names in situations such as:
lft rgt←pair ⋄ lft({···⍵}rgt) ⍝ clumsier.
{···⍵}\pair ⍝ neater.
The base cases for the recursion are:
0 cmat ⍵ → 1 0⍴0
⍺ cmat 0 → 0 ⍺⍴0
or:
0∊⍺ ⍵:(⍺=0)⍺⍴0
The (index-origin independent) code is therefore:
cmat←{ ⍝ ⍺-combination matrix of ⍳⍵.
0∊⍺ ⍵:(⍺=0)⍺⍴0 ⍝ done if zero ⍺ or ⍵, otherwise,
↑⍪/⌽{⎕IO,⍵}\1+(⍺-0 1)∇¨⍵-1 ⍝ catenate sub-combinations.
}
Notice however, that the algorithm is inefficient in the same way as a naïve
coding of →fibonacci←, in that the function is applied to the _same_ arguments
many times over. If we display the arguments for the function calls as a tree,
by injecting:
⎕←(,↑(⍴⎕LC)⍴⊂'· '),'[',(⍕⍺ ⍵),']'
as the first line of the function, we see the following "dependency tree", where
[⍺ ⍵] stands for (⍺ cmat ⍵):
2 cmat 4
· [2 4]
· · [2 3]
· · · [2 2]
· · · · [2 1]
· · · · · [2 0]
· · · · · [1 0]
· · · · [1 1]
· · · · · [1 0]
· · · · · [0 0]
· · · [1 2]
· · · · [1 1]
· · · · · [1 0]
· · · · · [0 0]
· · · · [0 1]
· · [1 3]
· · · [1 2]
· · · · [1 1]
· · · · · [1 0]
· · · · · [0 0]
· · · · [0 1]
· · · [0 2]
This is interpreted:
[2 4] depends on
· [2 3] which depends on
· · [2 2] which depends on
· · ···
and
· · [1 2] which depends on ⍝ [1 2] evaluated here.
· · ···
and
· [1 3] which depends on
· · [1 2] which depends on ⍝ [1 2] evaluated again here.
· · ···
··· and so on.
Notice how [1 2], together with its whole sub tree, is evaluated twice. In this
case, [2 4] generates 21 function calls. This number increases very rapidly and
in general, [⍵ ⍵] generates (¯1+2*⍵+1) calls.
┌─┐ ┌───┐
The dependencies are better represented as a _lattice_, where │↑│ and │←--│ mean
"depends on": │|│ └───┘
└─┘
[2 4]←--[2 3]←--[2 2]←--[2 1]←--[2 0]
↑ ↑ ↑ ↑
| | | |
[1 3]←--[1 2]←--[1 1]←--[1 0]
↑ ↑ ↑
| | |
[0 2] [0 1] [0 0]
This expresses exactly the same recurrence relationship as the tree, except that
in this representation, the nodes are _shared_. Notice how the lattice has only
12, as opposed to the tree's 21 nodes. If we can find a way to evaluate each of
its nodes only once, a significant performance benefit should result.
In fact, we can do a little better than this. We chose [⍺ 0] and [0 ⍵] as base
cases for the recursion, because we can "conjure these results out of thin air".
Looking at the lattice, we see a column at the base of the "ramp", which has
values [2 2] [1 1] [0 0]. As we know that [⍵ ⍵] = (1 ⍵⍴⍳⍵) for all ⍵, we can
equally easily use [⍵ ⍵] and [0 ⍵] as base cases, as long as we make special
provision for starting values from _inside_ the ramp where ⍺>⍵. This enables us
to trim the lattice to:
[2 4]←--[2 3]←--[2 2]
↑ ↑
| |
[1 3]←--[1 2]←--[1 1]
↑ ↑
| |
[0 2] [0 1]
A slightly larger example (4 cmat 8), shows more clearly what's going on. Notice
that this lattice has 24 nodes, while its corresponding dependency _tree_ would
have 325.
[4 8]←--[4 7]←--[4 6]←--[4 5]←--[4 4]
↑ ↑ ↑ ↑
| | | |
[3 7]←--[3 6]←--[3 5]←--[3 4]←--[3 3]
↑ ↑ ↑ ↑
| | | |
[2 6]←--[2 5]←--[2 4]←--[2 3]←--[2 2]
↑ ↑ ↑ ↑
| | | |
[1 5]←--[1 4]←--[1 3]←--[1 2]←--[1 1]
↑ ↑ ↑ ↑
| | | |
[0 4] [0 3] [0 2] [0 1]
The diagram suggests using right-to-left or bottom-to-top _reduction_ to
accumulate values towards the top left corner, which is the required result. For
reasons that will become apparent, reduction from the right turns out to be a
slightly better bet. Using ⎕ML←2, so that "↑" means "first":
↑↑{⍺···⍵}/ [0 4] ··· [0 2] [0 1] ,⊂ [4 4] ··· [2 2] [1 1]
Now, as the [0 ⍵]'s all have the same value: (1 0⍴0), we can substitute this
value for ⍺ _inside_ the reduction's operand and ignore the value that is passed
into the operand function as left argument. This means that we may replace the
vector [0 4] ··· [0 2] [0 1] by _any_ vector providing it has the same number of
items: (⍳⍵-⍺) will suffice.
↑↑{(1 0⍴0)···⍵}/ (⍳⍵-⍺),⊂ [4 4] ··· [2 2] [1 1]
Finally, to squeeze out the last drop of refinement, we can compute the little
array (1 0⍴0) just once and bind it as an operand to the reduction's operand,
where it becomes "⍺⍺".
↑↑(1 0⍴0){⍺⍺···⍵}/ (⍳⍵-⍺),⊂ [4 4] ··· [2 2] [1 1]
The base cases: [4 4] ··· [2 2] [1 1], evaluate to ({1 ⍵⍴⍳⍵}¨+\⍺⍴1), noting that
(+\⍺⍴1) is an origin independent expression for "1 to ⍺", which gives:
↑↑(1 0⍴0){⍺⍺···⍵}/ (⍳⍵-⍺),⊂{1 ⍵⍴⍳⍵}¨+\⍺⍴1
At this point, we can cover the "special provision" for the null case where ⍺>⍵.
The result is a matrix of shape (0 ⍺), so the reshape becomes: (⍺≤⍵){⍺ ⍵⍴⍳⍵}.
↑↑(1 0⍴0){⍺⍺···⍵}/ (⍳⍵-⍺),⊂(⍺≤⍵){⍺ ⍵⍴⍳⍵}¨+\⍺⍴1
Now all that remains, is to code the workings of the reduction's operand:
{⍺⍺···⍵}. Each application of the operand will take a _column_ of the lattice as
right argument and a nominal [0 ⍵] as left operand.
Transposing the rightmost column of the lattice above, the operand will be
applied to: ⍵: [4 4] [3 3] [2 2] [1 1] ⍺⍺: [0 1]
↓ ↓ ↓ ↓ ↓
with result: [4 5]←--[3 4]←--[2 3]←--[1 2]←-----'
To achieve this result, we need to do a further, but this time, _cumulative_
reduction. Without accumulation, the result of the "inner" reduction in the case
above would be [4 5], whereas we need the whole vector [4 5] ··· [1 2], but ex-
cluding the initial base case ⍺⍺:[0 1], to pass back to the outer reduction. The
template for such a reduction is:
¯1↓↑{(⍺ ··· ↑⍵),⍵}/(⊂···),···,(⊂···),(⊂⊂···)
││ │ │ │ │ │ └────── _Extra_ rightmost enclosure,
││ │ │ │ │ │ for accumulation.
││ │ │ │ └──────────┴───────────── Items of reduction argument.
││ │ │ └─── Prefix new result item to accumulation.
││ │ └────── First item of accumulated result.
││ └───────── Operation between successive left and right args.
│└─────────────── Disclose vector result of vector reduction.
└──────────────── Discard initial base case ⍺⍺:[0 1].
Substituting our specific cases:
¯1↓↑{(⍺ ··· ↑⍵),⍵}/⍵,⊂⊂⍺⍺
The remaining "···" is exactly the expression developed right back at the start
for the "tree" algorithm: ⍪/⌽{⎕IO,⍵}\1+···, so the complete operand function is:
¯1↓↑{(⍪/⌽{⎕IO,⍵}\1+⍺(↑⍵)),⍵}/⍵,⊂⊂⍺⍺
A last tweak is to "factor out" (⊂⊂⍺⍺) to where ⍺⍺ is generated, so that:
···(1 0⍴0){···{···}/⍵,⊂⊂⍺⍺}/···
becomes: __ ¯¯
···(⊂⊂1 0⍴0){···{···}/⍵,⍺⍺}/···
The final code looks like this. Notice that apart from the specification of ⎕ML,
it's a one-liner. More to the point, there are no assignments, tests or control
branches; it is a single expression:
cmat←{⎕ML←2 ⍝ ⍺-combination matrix of ⍳⍵.
↑↑(⊂⊂1 0⍴0){ ⍝ base-case: [0 ⍵].
¯1↓↑{ ⍝ removing base case from,
(⍪/⌽{⎕IO,⍵}\1+⍺(↑⍵)),⍵ ⍝ accumulation of,
}/⍵,⍺⍺ ⍝ sub- (⍺ ⍵-1) combinations,
}/(⍳0⌈⍵-⍺),⊂(⍺≤⍵){⍺ ⍵⍴⍳⍵}¨⌽+\⍺⍴1 ⍝ with base cases [⍺ 0].
}
Alternative codings
-------------------
John Niss Hansen suggests the following much simpler coding. Effectively,
the function generates all binary numbers of width ⍵ and selects those with ⍺
bits set. Each binary "mask" is then used to select a different ⍺ combination of
numbers from 0 to ⍵-1. This works well for small values of ⍵ but the sub-
expression (⍳⍵⍴2) uses an exponential amount of space (and time) as ⍵ gets
larger. The function assumes ⎕io=0.
cmat←{↑⌽((⍺=+/¨t)/t←,⍳⍵⍴2)/¨⊂⍳⍵}
VMJ suggests the following one-line speedy alternative for small-to-medium sized
results.
cmat←{⎕ML←3 ⋄ ⊃⊃{{(</2↑[⎕IO+1]⊃⍵)/⍵},⍺∘.,⍵}/(⍳⍺)+⊂(-⎕IO)+⍳⍵-⍺-1}
and also:
cmat←{
(⎕IO ⎕ML)←1 3 ⋄ wd←⍺ ⋄ mx←⍵
⊃↑2{⍺>wd:⍵
a←2⊃⍵
b←mx-a+wd-⍺ ⋄ q←∊a+⍳¨b
(⍺+1)∇((b/↑⍵),¨q)q
}2⍴⊂⍳⍵-⍺-1
}
This version is based on Ken Gradney's code, which was published in Vector Vol
18.1:
cmat←{
⎕ML←3 ⋄ n←1+⍵-⍺ ⋄ c←(⍺-1)+⍳n
c{⎕IO≥↑⍺:↑⍵
m←⌽+\⌽↑1↓⍵
(⍺-1)∇((m/⍺-1),(↑⍵)[∊((↑m)-m)+⍳¨m;])m
}((n,1)⍴c)(n⍴1)
}
Roger Hui proposes this short coding:
cmat←{⊖⊃⍪/{k,¨⍪\1+⍵}⍣⍺⊢(⊂⍉⍪⍬),d⍴⊂0 0⍴k←⌽⍳1+d←⍵-⍺}
which is faster than the others except on really large arguments (when it is a
bit slower).
Roger says: The ⊖ is needed because Dyalog does not have "suffix scan" (⊖f⍀⊖⍵).
I am led to doing the regular APL scan on the reversed argument, then reversing
the whole thing at the end.
Examples:
display 0 cmat 5
┌⊖┐
↓0│
└~┘
display 1 cmat 5
┌→┐
↓1│
│2│
│3│
│4│
│5│
└~┘
display 2 cmat 5
┌→──┐
↓1 2│
│1 3│
│1 4│
│1 5│
│2 3│
│2 4│
│2 5│
│3 4│
│3 5│
│4 5│
└~──┘
display 3 cmat 5
┌→────┐
↓1 2 3│
│1 2 4│
│1 2 5│
│1 3 4│
│1 3 5│
│1 4 5│
│2 3 4│
│2 3 5│
│2 4 5│
│3 4 5│
└~────┘
display 4 cmat 5
┌→──────┐
↓1 2 3 4│
│1 2 3 5│
│1 2 4 5│
│1 3 4 5│
│2 3 4 5│
└~──────┘
display 5 cmat 5
┌→────────┐
↓1 2 3 4 5│
└~────────┘
display 6 cmat 5
┌→──────────┐
⌽0 0 0 0 0 0│
└~──────────┘
⍉↑{⍴⍵ cmat 5}¨0 to 6 ⍝ Row and column sizes.
1 5 10 10 5 1 0
0 1 2 3 4 5 6
2{⍵[⍺ cmat⍴⍵]}'scissors' 'paper' 'stone' ⍝ 2-combos of nested vector.
scissors paper
scissors stone
paper stone
↓3{⍵[⍺ cmat⍴⍵]}'abcde' ⍝ 3-combos of char vector.
abc abd abe acd ace ade bcd bce bde cde
display¨0 1 2 3∘.cmat 0 1 2 3 ⍝ Combos of small vectors.
┌⊖┐ ┌⊖┐ ┌⊖┐ ┌⊖┐
↓0│ ↓0│ ↓0│ ↓0│
└~┘ └~┘ └~┘ └~┘
┌→┐ ┌→┐ ┌→┐ ┌→┐
⌽0│ ↓1│ ↓1│ ↓1│
└~┘ └~┘ │2│ │2│
└~┘ │3│
└~┘
┌→──┐ ┌→──┐ ┌→──┐ ┌→──┐
⌽0 0│ ⌽0 0│ ↓1 2│ ↓1 2│
└~──┘ └~──┘ └~──┘ │1 3│
│2 3│
└~──┘
┌→────┐ ┌→────┐ ┌→────┐ ┌→────┐
⌽0 0 0│ ⌽0 0 0│ ⌽0 0 0│ ↓1 2 3│
└~────┘ └~────┘ └~────┘ └~────┘
⎕io←0 ⋄ 2 cmat 4 ⍝ Code is "origin sensitive".
0 1
0 2
0 3
1 2
1 3
2 3
⍉ ∘.{⍬⍴⍴⍺ cmat ⍵}⍨ 0 to 12 ⍝ Pascal's triangle.
1 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0 0 0 0 0 0 0
1 3 3 1 0 0 0 0 0 0 0 0 0
1 4 6 4 1 0 0 0 0 0 0 0 0
1 5 10 10 5 1 0 0 0 0 0 0 0
1 6 15 20 15 6 1 0 0 0 0 0 0
1 7 21 35 35 21 7 1 0 0 0 0 0
1 8 28 56 70 56 28 8 1 0 0 0 0
1 9 36 84 126 126 84 36 9 1 0 0 0
1 10 45 120 210 252 210 120 45 10 1 0 0
1 11 55 165 330 462 462 330 165 55 11 1 0
1 12 66 220 495 792 924 792 495 220 66 12 1
See also: pmat fibonacci rr
Back to: contents
Back to: Workspaces