rslt ← {larg)(op ##.avl) rarg           ⍝ Adelson-Velskii, Landis (AVL) trees.
     T∆ ← T ∪ avl (key val)     ⍝ tree ⍺ with key=val ⍵.
     T∆ ← T ~ avl key           ⍝ tree ⍺ without key ⍵.
     T∆ ← T ⍎ avl key           ⍝ value for key ⍵ in tree ⍺.
    fmt ←   ⍕ avl T             ⍝ format of tree ⍵.
    vec ←   ∊ avl T             ⍝ enlist of tree ⍵.
    chk ←   ? avl T             ⍝ stats for tree ⍵: ok size mean_depth height.
See →BST←
An  AVL  tree is a binary search tree in which subtree heights differ by no more
than 1.
AVL  trees  do not guarantee minimal height. In the following diagram, the nodes
are  marked  with  their  height  (maximum distance to a leaf). For example, B↓2
signifies  a node with search key 'B' and height 2. Notice that sibling subtrees
differ in height by at most 1:
    Height 4 AVL tree       Shallower reordering (also an AVL tree)
           E↓4                      D↓3
          /   \                    /   \
         /     \                  /     \
        C↓3     F↓2              B↓2     F↓2
       /   \     \              /  \    /  \
      B↓2   D↓1   G↓1        A↓1  C↓1  E↓1  G↓1
     /
    A↓1
If we remove node G↓1 from the diagram above left, the resulting tree is said to
"overbalance" as F↓1 differs in height by 2 from its sibling C↓3:
           E↓4
          /  \
         /    \
        C↓3     F↓1         overbalanced: C↓3 >> F↓1
       /   \
      B↓2   D↓1
     /
    A↓1
We  can  maintain  the tree without keeping explicit track of the height of each
node's subtrees.  Instead, we record only the _difference_ in the heights of its
subtrees.
By keeping only the difference in subtree heights, addition and removal of nodes
requires adjustment to only local regions of the tree.
Further, in a correctly balanced AVL tree, this difference must be one of ¯1 0 1
which  means that only two bits per node are required to store balancing inform-
ation.
(
    We  could reduce this to just one bit per node by devolving each node's bal-
    ance bits to its left and right subtree. This works because leaf nodes, with
    only  null  subtrees, are (clearly) balanced and so need no balance bits. We
    can see this scheme in  operation by looking at  the output from ⍕avl, where
    the balance bits are represented by arrows supporting the subtrees.
)
  < left subtree higher ·   ·  ¯1
    subtrees of equal height·   0
  > right subtree higher    ·   1
           <E
           / \
          /   \
        <C     F>
        / \     \
      <B   D     G
      /
     A
Worst-case AVL trees
--------------------
Height:  0   1       2       3       4           5
Size:    0   1       2       4       7          12
         ·   ·       ·       ·       ·           ·
         .   A       B       C       E           H
                    /       / \     / \         / \
                   A       B   D   C   F       /   \
                          /       / \   \     /     \
                         A       B   D   G   E       J
                                /           / \     / \
                               A           C   F   I   K
                                          / \   \       \
                                         B   D   G       L
                                        /
                                       A
We  see  that successive worst cases consist of a new node with the previous two
cases as subtrees. This means that the number of nodes in the worst case follows
the →fibonacci← sequence and for an AVL tree of height ⍵ is:
    {¯1+fibonacci ⍵+2}      ⍝ number of nodes in worst case tree of height ⍵.
An approximation for the ⍵th Fibonacci number implies that the worst-case height
for an ⍵-node AVL tree is 1.44×⍟⍵ (search the Internet for "AVL 1.44").
See →avl_worst← for more.
Balancing
---------
During  insertion or deletion, nodes may overbalance and need to be corrected by
rotation. In the following examples node A overbalances to the left <<A. If node
A were to overbalance right A>>, the mirror-image transformations would be used.
To  avoid  coding  separate  mirror-image rotation functions, the implementation
parameterises the direction of rotation, see Technical notes below.
Depending  on  case, rebalancing decreases or leaves the height of the node as a
whole  unaltered. A decrease in height may cause the node's parent to become un-
balanced and so itself to require rebalancing, and so on.
On  deletion of a node, a worst-case of ⌈2⍟⍵ rotations may be needed to preserve
the tree's balance; on insertion of a new node, at most a single rotation is re-
quired.
To simplify the following diagrams, a little additional notation is used:
    <<N     node N overbalances left
     <N>    node N balances
    /n n\   arbitrary left and right subtrees of node N
     ⌿ ⍀    edge about to be rotated
     |      edge to parent node (if any).
     |=     height unchanged
     |-     height decreased
Case 1: <<A's left subtree <B hangs left:
         |              |-                          rotation reduces height
       <<A      =>     <B>
        ⌿ \            / \                          Bal(A) = Bal(B)-r
      <B   a\        /b  <A>
      / \            /   / \
    /b   b\        /bb  b\  a\
    /
  /bb
Case 2: <<A's left subtree <B> balances:
         |              |=                          height unaltered
       <<A      =>      B>
        ⌿ \            / \                          Bal(A) = Bal(B)-r
      <B>  a\        /b  <A
      / \                / \
    /b   b\             b\  a\
Case 3: <<A's left subtree B> hangs right:
         |              |              |-           rotations reduce height
       <<A      =>    <<A      =>     <C>
        / \            ⌿ \            / \
       B>  a\        <C?  a\         /   \          Bal(A) ← 0⌈-Bal(C)
      / ⍀            / \           <B?   ?A>        Bal(B) ← 0⌊-Bal(C)
    /b  ?C?        <B?  c\         / \   / \
        / \        / \           /b  /c c\  a\
      /c   c\    /b  /c
Balancing  is  perhaps better understood by considering the effects of change on
nodes and edges:
   ┌────────────────────────────────────────────────────────────────────────┐
   │A node  signals a height increment (i) of ¯1 0 1 to its supporting edge.│
   │An edge signals a balancing moment (b) of ¯1 0 1 to its parent node.    │
   └────────────────────────────────────────────────────────────────────────┘
For  example,  inserting  a  new  leaf  imparts  a  height increment of 1 to its
supporting  edge.  If  this  edge  is a right edge (say), it imparts a balancing
moment  of  1 to its parent node. If this node is already balanced (0), then its
balance  would be changed to 1 and it in turn would pass a height increment of 1
to its supporting edge, and so on.
Height Increments
-----------------
Case  analysis  yields  a  simple rule for determining the height increment of a
node  following  insertion  or deletion in one of its subtrees. In the following
diagrams,  without  loss  of generality, we can assume that the height increment
(or decrement) is transmitted by the tree's right subtree:
Insertion, 3 cases:
    ┌───────────────┬───────────────┬───────────────┐
    │   |       |=  │   |       |+  │  |        |=  │
    │  <A  =>  <A>  │  <A>  =>  A>  │  A>  =>  <b>  │
    │  / +     / \  │    +       \  │   \      / \  │
    │               │               │    b    A     │
    │               │       INCR    │     +         │
    └───────────────┴───────────────┴───────────────┘
leading to the rule:
    ┌──────────────────────────────────────────────────────────────────────────┐
    │ A height <increment> is transmitted iff the <initial> node was balanced. │
    └──────────────────────────────────────────────────────────────────────────┘
Deletion, 9 cases:
    ┌───────────────────┬───────────────────┬───────────────────┐
    │    |        |-    │    |        |=    │    |        |-    │
    │   <A  =>   <B>    │   <A>  =>   A>    │    A>  =>  <A>    │
    │   / ⍀      / \    │   / \      / \    │   / \      / \    │
    │ <B            A   │ <B   c   <B   c   │ <B   c   <B   c   │
    │ /                 │ /   ⌿    /        │ /   / \  /   / \  │
    │                   │                   │    d        d     │
    │           DECR    │                   │   ⌿       DECR    │
    ├───────────────────┼───────────────────┼───────────────────┤
    │    |        |=    │    |        |=    │    |        |-    │
    │   <A   =>   B>    │   <A>  =>  <A     │    A>  =>  <A>    │
    │   / ⍀      / \    │   / ⍀      /      │   / \      / \    │
    │ <B>          <A   │ <B>      <B>      │ <B>  c   <B>  c   │
    │ / \          /    │                   │     ⌿             │
    │    c        c     │                   │                   │
    │                   │                   │           DECR    │
    ├───────────────────┼───────────────────┼───────────────────┤
    │    |        |-    │    |         |=   │    |        |-    │
    │   <A  =>   <c>    │   <A>  =>   <A    │    A>  =>  <A>    │
    │   / ⍀      / \    │   / \       / \   │   / \      / \    │
    │  B>       B   A   │  B>  c     B>  c  │  B>  c    B>  c   │
    │   \               │   \ ⌿       \     │   \   \    \   \  │
    │    c              │                   │        d        d │
    │           DECR    │                   │       ⌿   DECR    │
    └───────────────────┴───────────────────┴───────────────────┘
leading to the rule:
   ┌───────────────────────────────────────────────────────────────────────────┐
   │ A height <decrement> is transmitted iff the <resulting> node is balanced. │
   └───────────────────────────────────────────────────────────────────────────┘
Operations on AVL trees
-----------------------
[avl]  uses  the standard →BST← methods for search, insertion, and node removal.
In  the case of insertion a single rotation, and in the case of removal, several
rotations, may be necessary to maintain the tree's balance.
Statistics
----------
An invocation of [avl] with a left operand of ? reports a 4-vector of statistics
(ok size depth height) for its tree right argument, where:
        ok:  boolean -> balance = difference in subtree heights, for all nodes.
      size:  number of (non-null) nodes.
     depth:  mean distance from the root of all nodes.
    height:  maximum distance from root to each leaf; a null tree has height 0.
Technical notes
---------------
These  AVL  functions  are implemented using a recursive triple, with 0 standing
for the null tree:
    (key val) bal (lft rgt)
where:
    key : numeric scalar or character vector
    val : any array value
    bal : balance ¯1 0 1
    lft : left subtree or 0
    rgt : right subtree or 0
The AVL tree functions: insert/replace, delete, search, ... are derived from the
single  operator  [avl]  by binding an appropriate left operand. This is so that
common auxiliary functions such as balancing and rotation, which have little use
outside  the context of AVL trees, may be hidden within the capsule (encapsulat-
ed). It is conceivable that an implementation of D: could retain a partial eval-
uation of such bindings to speed up subsequent application of the derived funct-
ions.
Parameterising direction:
In  order to avoid having to code mirror-image subfunctions for left/right trav-
ersal and left/right rotation, the direction is parameterised using subfunction:
     wise←{(⍺=1)⌽⍵}             ⍝ subtrees ⍵ in -⍺, +⍺ order.
     1 wise 4 5
5 4
    ¯1 wise 4 5
4 5
Similarly, a complete tree may be presented in either +1 or -1 orientation:
     proj←{(⍺=0 0 ¯1)⌽¨⍵}       ⍝ ⍺-projection of node ⍵.
     1 proj 2 3(4 5)
2 3  4 5
    ¯1 proj 2 3(4 5)
2 3  5 4
(muse:
    In  general,  parameterisation saves duplication at the expense of abstract-
    ion.  In  other  words, it produces more concentrated code, which is smaller
    but harder to understand.
)
Formatting
----------
Derived  function ⍕avl transposes the tree and pushes the balance indicators out
to either end of the subtree  branches.  Unlike the diagrams above, sibling sub-
trees of differing height are _each_ preceded by a '>' or '<' indicator. For ex-
ample:
    Notes Diagram               Equivalent ⍕avl
           <E
           / \          =>                   ┌>A=A
          /   \                         ┌>B=B┘
        <C     F>                  ┌>C=C┤
        / \     \                  │    └<D=D
      <B   D     G              E=E┤
      /                            └<F=F┐
     A                                  └>G=G
References:
[1] An algorithm for the organization of information,
    G. M. Adelson-Velskii and E. M. Landis (1962).
          ¯       ¯                 ¯
[2] Knuth: The Art of Computer Programming, Vol 2, Ch. 6.2.3, "Balanced Trees".
[3] The Internet: Search for "AVL trees".
[4] http://en.wikipedia.org/wiki/AVL_trees
Examples:
⍝ derivation of AVL tree functions:
    chk ← ? avl     ⍝ stats for tree ⍵              :: ∇ t → y s d h
    tt ← 0                          ⍝ null tree
    tt ← tt ∪avl 'one' 1            ⍝ single node tree: (key val) bal (lft rgt)
    tt                              ⍝ nested structure of single tree node.
┌───────┬─┬───┐
│┌───┬─┐│0│0 0│
││one│1││ │   │
│└───┴─┘│ │   │
└───────┴─┴───┘
    ⊢ tt ← tt ∪avl 'two' 2          ⍝ insert of second
┌───────┬─┬───────────────────┐
│┌───┬─┐│1│┌─┬───────────────┐│
││one│1││ ││0│┌───────┬─┬───┐││
│└───┴─┘│ ││ ││┌───┬─┐│0│0 0│││
│       │ ││ │││two│2││ │   │││
│       │ ││ ││└───┴─┘│ │   │││
│       │ ││ │└───────┴─┴───┘││
│       │ │└─┴───────────────┘│
└───────┴─┴───────────────────┘
⍝ NB: In the following examples, the _values_ are numeric: 1 2 3 ...
⍝     but the _keys_ are alphabetic: 'one' 'two' 'three' ...
⍝     this means that 'one' precedes 'two' but that 'four' follows 'five'.
    tt ← tt ∪avl 'three' 3          ⍝ insert third node.
    disp tt
┌─────────┬─┬─────────────────────────────────┐
│┌─────┬─┐│0│┌───────────────┬───────────────┐│
││three│3││ ││┌───────┬─┬───┐│┌───────┬─┬───┐││
│└─────┴─┘│ │││┌───┬─┐│0│0 0│││┌───┬─┐│0│0 0│││
│         │ ││││one│1││ │   ││││two│2││ │   │││
│         │ │││└───┴─┘│ │   │││└───┴─┘│ │   │││
│         │ ││└───────┴─┴───┘│└───────┴─┴───┘││
│         │ │└───────────────┴───────────────┘│
└─────────┴─┴─────────────────────────────────┘
    ⍕avl tt                         ⍝ format of tree.
       ┌─one=1
three=3┤
       └─two=2
    tt ← tt ∪avl foldl ('four' 4) ('five' 5) ('six' 6) ('seven' 7)
    ⍕avl tt                         ⍝ 7-node tree.
             ┌>five=5
     ┌<four=4┘
one=1┤
     │               ┌>seven=7
     │        ┌>six=6┘
     └>three=3┤
              └<two=2
    disp ∊avl tt                    ⍝ enlist of tree.
┌────────┬────────┬───────┬─────────┬───────┬─────────┬───────┐
│┌────┬─┐│┌────┬─┐│┌───┬─┐│┌─────┬─┐│┌───┬─┐│┌─────┬─┐│┌───┬─┐│
││five│5│││four│4│││one│1│││seven│7│││six│6│││three│3│││two│2││
│└────┴─┘│└────┴─┘│└───┴─┘│└─────┴─┘│└───┴─┘│└─────┴─┘│└───┴─┘│
└────────┴────────┴───────┴─────────┴───────┴─────────┴───────┘
    chk tt                          ⍝ tree stats: ok size mean_depth height.
1 7 2 4
    ⍕avl tt ∪avl 'six' 666          ⍝ tree with value of 'six' replaced.
             ┌>five=5
     ┌<four=4┘
one=1┤
     │                 ┌>seven=7
     │        ┌>six=666┘
     └>three=3┤
              └<two=2
    ⍕avl tt ~avl 'one'              ⍝ tree with key 'one' removed.
             ┌─five=5
     ┌─four=4┤
     │       └─seven=7
six=6┤
     └─three=3┐
              └>two=2
    tt ~avl foldl ⊃¨∊avl tt         ⍝ tree with all keys removed.
0
⍝ Notice how insertion may require local rotation to maintain balance:
    ⍪ ⍕avl∘(0∘(∪avl foldl))¨,\6↑⎕a  ⍝ insert of A B .. F
┌─────────────┐
│A=A          │
├─────────────┤
│A=A┐         │
│   └>B=B     │
├─────────────┤
│   ┌─A=A     │
│B=B┤         │
│   └─C=C     │
├─────────────┤
│   ┌<A=A     │
│B=B┤         │
│   └>C=C┐    │
│        └>D=D│
├─────────────┤
│   ┌<A=A     │
│B=B┤         │
│   │    ┌─C=C│
│   └>D=D┤    │
│        └─E=E│
├─────────────┤
│        ┌─A=A│
│   ┌─B=B┤    │
│   │    └─C=C│
│D=D┤         │
│   └─E=E┐    │
│        └>F=F│
└─────────────┘
⍝ Removal may also need rotation to maintain balance:
    A_F←0 ∪avl foldl 'ABCDEF'                   ⍝ tree A B .. F
    ⍪ ⍕avl¨ A_F∘(~avl foldl)¨,\' ABCDEF'        ⍝ removal of A B .. F
┌─────────────┐
│        ┌─A=A│
│   ┌─B=B┤    │
│   │    └─C=C│
│D=D┤         │
│   └─E=E┐    │
│        └>F=F│
├─────────────┤
│   ┌─B=B┐    │
│   │    └>C=C│
│D=D┤         │
│   └─E=E┐    │
│        └>F=F│
├─────────────┤
│   ┌<C=C     │
│D=D┤         │
│   └>E=E┐    │
│        └>F=F│
├─────────────┤
│   ┌─D=D     │
│E=E┤         │
│   └─F=F     │
├─────────────┤
│E=E┐         │
│   └>F=F     │
├─────────────┤
│F=F          │
├─────────────┤
└─────────────┘
    ⍪ ⍕avl¨ A_F∘(~avl foldl)¨,\' DEBCFA'    ⍝ repeated removal of root node
┌─────────────┐
│        ┌─A=A│
│   ┌─B=B┤    │
│   │    └─C=C│
│D=D┤         │
│   └─E=E┐    │
│        └>F=F│
├─────────────┤
│        ┌─A=A│
│   ┌>B=B┤    │
│   │    └─C=C│
│E=E┤         │
│   └<F=F     │
├─────────────┤
│   ┌<A=A     │
│B=B┤         │
│   │    ┌>C=C│
│   └>F=F┘    │
├─────────────┤
│   ┌─A=A     │
│C=C┤         │
│   └─F=F     │
├─────────────┤
│   ┌>A=A     │
│F=F┘         │
├─────────────┤
│A=A          │
├─────────────┤
└─────────────┘
See also: BST alists sbst splay redblack foldl fibonacci avl_worst
Back to: contents
Back to: Workspaces