rslt ← {larg} (op ##.sbst) rarg ⍝ Simple Binary Search Trees. T∆ ← T ∪ sbst (key val) ⍝ tree ⍺ with key=val ⍵. T∆ ← T ~ sbst key ⍝ tree ⍺ without key ⍵. T∆ ← T ⍎ sbst key ⍝ value for key ⍵ in tree ⍺. T∆ ← = sbst T ⍝ balanced tree. fmt ← ⍕ sbst T ⍝ format of tree ⍵. vec ← ∊ sbst T ⍝ enlist of tree ⍵. chk ← ? sbst T ⍝ stats for tree ⍵: ok size mean_depth height. See →BST← [sbst] implements the simplest form of binary search tree (BST) with no attempt to keep the tree balanced. This means that it performs well only if the keys are inserted in random order or the tree is periodically rebalanced. Technical notes: In contrast with the other BST operators, [sbst] includes a balancing function =sbst, which uses the DSW algorithm [0] and works as follows. First, the tree is stripped into a degenerate left list, or "vine". A vine may be pictured as a stem supporting (initially depth-0) leaves, each of which is a complete tree. Starting at the (leftmost) leaf and working back up the vine, alternate sections are "compressed" by rotation, merging adjacent n-leaves into (n+1)-leaves. The process is repeated until, after ⌈2⍟⍵ iterations the vine is a balanced tree. left vine of 0-leaves ··· / O ⌿ N / M ⌿ ← alternate ⌿ edges are rotated. L / K ⌿ J / 1-leaves → 2-leaves → 3-leaves → ... I ··· ··· ··· ⌿ / / / H N L H / ⌿ \ ⌿ \ / \ G L O ⌿ \ / \ ⌿ / \ ⌿ \ / \ F J M H N / \ / ⌿ \ / \ / \ / \ E H K / \ M 0 / \ ⌿ / \ / \ / \ D F I D J D L / ⌿ \ / \ / \ / \ / \ C D G / \ I K / \ / \ ⌿ / \ / \ / \ / \ B B E B F B F J N / / \ / \ / \ / \ / \ / \ / \ A A C A C E G A C E G I K M O If the vine starts with exactly ¯1+2*⍵ nodes, for some ⍵, the result is a perfectly balanced tree. Otherwise, a pre-pass is required to reduce the length by folding those adjacent 0-leaves over and above ¯1+2*⍵ to form 1-leaves. This produces a vine with exactly ¯1+2*⍵ (0 or 1)-leaves, which becomes a tree that is perfectly balanced except for extra nodes located at an incomplete deepest level. For example, a 10-node vine is reduced to a 7-node vine with 3 rotations: J ← vine with (3 + ¯1+2*3) 0-leaves. ⌿ I → I ← vine with ¯1+2*3 (0 1)-leaves. / ⌿ \ H G J → G ← compression to (1 2)-leaves. ⌿ / \ / \ G E H ⌿ \ / ⌿ \ / \ F D F D I → D ← complete tree ⌿ / / \ / \ / \ with 3 extra E C B E H J / \ nodes F H J / ⌿ / \ \ / \ at deepest D B A C F / \ level. / / / \ C A / \ / / \ B B G / / \ / \ A / \ / \ / \ / \ A C E I \ / \ F H J A left-hanging vine, which uses right rotation, is chosen because [sbst] has a hard-coded right rotation subfunction [rrot], used during node removal. B → A rrot←{ ⍝ right rotation. / \ / \ B((A(p q))r)←⍵ ⍝ unpack nodes. A r p B A(p(B(q r))) ⍝ repack nodes. / \ / \ } ⍝ :: t ← ∇ t p q q r Clearly, right vines and left rotations would work just as well. Here is the DSW code. Notice how the size of the tree is discovered by the enlisting function. bal←{ ⍝ dsw-balancing. vine size←0 0 list ⍵ ⍝ vine of 0-leaves and size. log←⌊2⍟size+1 ⍝ largest complete tree ≤ ⍵. rem←1+size-2*log ⍝ no of surplus nodes. cmps←¯2+2*1+⍳log ⍝ compression vector. ↑cmp/(1↓cmps,2×rem),⊂vine ⍝ compression reduction → balanced tree. } ⍝ :: t ← ∇ t cmp←{ ⍝ compress of alternate vine sections. ⍺=0:⍵ ⍝ far enough: terminal leaf. inf(lft rgt)←⍵ ⍝ parts of node. lev←(⍺-1)∇ lft ⍝ leftmost vine leaf. 2|⍺:inf(lev rgt) ⍝ copying of alternate vine sections. rrot inf(lev rgt) ⍝ rotation of alternate vine sections. } ⍝ :: t ← n ∇ v list←{ ⍝ list (0-vine) from tree ⍵. / 0≡⍵:⍺ ⍝ null: accumulated vine. / C inf(lft rgt)←⍵ ⍝ node info & subtrees. B → / lev s←⍺ ∇ lft ⍝ left vine & size, / \ B (inf(lev 0))(s+1)∇ rgt ⍝ ++ right vine. A C / } ⍝ :: v s ← v ∇ t A An alternative "brute force" method of rebalancing would be to take the vector of key=value pairs returned by ∊sbst and re-insert them sequentially into a null tree in an order that guarantees a good balance. bal←{ ⍝ rebalance of tree ⍵. pairs←∊sbst ⍵ ⍝ vector of key=value pairs. order←{⍋⌽⍉((⌈2⍟⍵)/2)⊤⍳⍵}⍴pairs ⍝ binary insert order. ⊃{⊃⍵ ∪sbst ⍺}/pairs[order],0 ⍝ new balanced tree. } ⍝ :: t ← ∇ t However, this method performs poorly compared with DSW, as it uses O(2∘⍟) comp- arisons for each of ⍵ insertions: tt ← 0 ∪sbst foldl 1000?1000 ⍝ 1,000-node tree. cmpx'bal tt' 'bal2 tt' ⍝ compare with DSW. bal tt 2.2E¯1 0% ⎕⎕⎕⎕⎕ bal2 tt 1.8E0 +715% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕ References: [0] 'DSW' stands for Day-Stout-Warren: [1] Day A.C., Balancing a binary tree, Computer Journal 19.4, Nov 1976, 360-361. [2] Stout Q.F. Warren B.L., Tree Rebalancing in Optimal Time and Space, Comms ACM 29.9, Sept 1986, 902-908. Examples: ⍕sbst 0 ∪sbst foldl⍳7 ⍝ no balancing if keys inserted in order. 1=1┐ └2=2┐ └3=3┐ └4=4┐ └5=5┐ └6=6┐ └7=7 ⍕sbst =sbst 0 ∪sbst foldl⍳7 ⍝ explicit rebalancing. ┌1=1 ┌2=2┤ │ └3=3 4=4┤ │ ┌5=5 └6=6┤ └7=7 ⍝ vector of key=value pairs: pairs ← ('one'1) ('two'2) ('three'3) ('four'4) ('five'5) ('six'6) ('seven'7) tt←0 ∪sbst foldl pairs ⍝ pairs folded into tree. ⍕sbst tt ⍝ format of tree. ┌five=5 ┌four=4┘ one=1┤ │ ┌seven=7 │ ┌six=6┘ │ ┌three=3┘ └two=2┘ tt ← =sbst tt ⍝ balanced tree. ⍕sbst tt ┌five=5 ┌four=4┤ │ └one=1 seven=7┤ │ ┌six=6 └three=3┤ └two=2 'six'⍎sbst tt ⍝ value for key 'six' 6 tt ← tt ~sbst'four' ⍝ key 'four' removed. ⍕sbst tt ┌five=5 ┌one=1┘ seven=7┤ │ ┌six=6 └three=3┤ └two=2 ∊sbst tt ⍝ vector of key=value pairs. ┌────────┬───────┬─────────┬───────┬─────────┬───────┐ │┌────┬─┐│┌───┬─┐│┌─────┬─┐│┌───┬─┐│┌─────┬─┐│┌───┬─┐│ ││five│5│││one│1│││seven│7│││six│6│││three│3│││two│2││ │└────┴─┘│└───┴─┘│└─────┴─┘│└───┴─┘│└─────┴─┘│└───┴─┘│ └────────┴───────┴─────────┴───────┴─────────┴───────┘ ?sbst tt ⍝ tree stats: ok size mean_depth height. 1 6 2 3 ⍕sbst =sbst 0 ∪sbst foldl ⎕a ⍝ balance of (11+¯1+2*4)-node tree. ┌A=A ┌B=B┤ │ └C=C ┌D=D┤ │ │ ┌E=E┐ │ │ │ └F=F │ └G=G┤ │ │ ┌H=H │ └I=I┤ │ └J=J K=K┤ │ ┌L=L │ ┌M=M┤ │ │ └N=N │ ┌O=O┤ │ │ │ ┌P=P │ │ └Q=Q┤ │ │ └R=R └S=S┤ │ ┌T=T │ ┌U=U┤ │ │ └V=V └W=W┤ │ ┌X=X └Y=Y┤ └Z=Z ?sbst tt ⍝ tree stats: ok size mean_depth height. 1 6 2 3 See also: BST alists avl splay redblack Back to: contents Back to: Workspaces