⍝ Lempel-Ziv-Welch compression: packZ'mississippi' ⍝ compress char vector ┌──┬─────────────────┬────┐ │11│0 0 0 0 1 1 0 0 0│misp│ │ │0 0 1 1 0 1 1 1 0│ │ │ │0 1 0 0 1 1 1 1 1│ │ └──┴─────────────────┴────┘ {⍵≡0 packZ packZ ⍵}'mississippi' ⍝ expand recovers original 1 2⊥2⊃packZ'mississippi' ⍝ ... bool matrix decoded 0 1 2 2 5 7 3 3 1 ¯8 packZ'mississippi' ⍝ final compression dictionary ┌─┬─┬─┬─┬──┬──┬──┬──┬───┬───┬──┬──┐ │m│i│s│p│mi│is│ss│si│iss│sip│pp│pi│ └─┴─┴─┴─┴──┴──┴──┴──┴───┴───┴──┴──┘ segs←{(¯12 packZ ⍵)[⎕io+2⊥2⊃packZ ⍵]} ⍝ dictionary entry per code word. segs'mississippi' ⍝ segments for string. ┌─┬─┬─┬─┬──┬──┬─┬─┬─┐ │m│i│s│s│is│si│p│p│i│ └─┴─┴─┴─┴──┴──┴─┴─┴─┘ {⍵≡↑,/segs ⍵}'mississippi' ⍝ enlist of segs recovers original 1 segs 40⍴'a' ⍝ segs for highly repetitive string ┌─┬──┬───┬────┬─────┬──────┬───────┬────────┬────┐ │a│aa│aaa│aaaa│aaaaa│aaaaaa│aaaaaaa│aaaaaaaa│aaaa│ └─┴──┴───┴────┴─────┴──────┴───────┴────────┴────┘ segs 40⍴'abcd' ⍝ segs for less repetitive string ┌─┬─┬─┬─┬──┬──┬───┬──┬──┬───┬───┬───┬────┬─────┬────┬───┐ │a│b│c│d│ab│cd│abc│da│bc│dab│cda│bcd│abcd│abcda│bcda│bcd│ └─┴─┴─┴─┴──┴──┴───┴──┴──┴───┴───┴───┴────┴─────┴────┴───┘ packZ'hello world' ⍝ compressed char vector ┌──┬─────────────────────┬────────┐ │11│0 0 0 0 0 1 1 0 1 0 1│helo wrd│ │ │0 0 1 1 1 0 0 1 1 1 1│ │ │ │0 1 0 0 1 0 1 1 0 0 1│ │ └──┴─────────────────────┴────────┘ packZ 11 ⎕dr'hello world' ⍝ compressed bool vector ┌──┬───────────────────────────────────────────────────────────────┬───┐ │88│0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 1 0│0 1│ │ │0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1 0 1 1 0 1 1 0 0 1 0 0 1 1 1 0│ │ │ │0 0 0 0 0 1 0 1 1 1 0 0 1 1 1 1 1 0 0 0 1 0 0 0 1 0 1 0 0 1 1 0│ │ │ │0 0 0 1 0 1 1 0 0 0 1 0 1 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 0 1 0 0│ │ │ │0 1 1 0 0 0 0 0 1 0 1 0 1 0 0 0 1 1 1 1 1 0 1 0 1 0 0 1 0 0 0 0│ │ └──┴───────────────────────────────────────────────────────────────┴───┘ packZ 83 ⎕DR'hello world' ⍝ compressed integer vector ┌──┬─────────────────────┬──────────────────────────────┐ │11│0 0 0 0 0 1 1 0 1 0 1│104 101 108 111 32 119 114 100│ │ │0 0 1 1 1 0 0 1 1 1 1│ │ │ │0 1 0 0 1 0 1 1 0 0 1│ │ └──┴─────────────────────┴──────────────────────────────┘ size←{⍬⍴⎕size'⍵'} ⍝ function for size in bytes. :If 32={2×⎕size'⍵'}⍬ ⍝ 32-bit interpreter: size 100 100⍴⍳7 ⍝ size of numeric matrix. 10020 size packZ 100 100⍴⍳7 ⍝ compressed size. 512 :Else ⍝ 64-bit interpreter: size 100 100⍴⍳7 ⍝ size of numeric matrix. 10040 size packZ 100 100⍴⍳7 ⍝ compressed size. 600 :EndIf mat←'Scissors' 'Stone' 'Paper'[2 3⍴1 2 3 2 3 1] mat ⍝ nested matrix ┌────────┬─────┬────────┐ │Scissors│Stone│Paper │ ├────────┼─────┼────────┤ │Stone │Paper│Scissors│ └────────┴─────┴────────┘ :If 32={2×⎕size'⍵'}⍬ ⍝ 32-bit interpreter: size mat ⍝ matrix size. 188 size packZ ⍉mat ⍝ compressed size. 172 :Else ⍝ 64-bit interpreter: size mat ⍝ matrix size. 328 size packZ ⍉mat ⍝ compressed size. 320 :EndIf packZ mat ⍝ compressed matrix ┌───┬─────────┬──────────────────────┐ │2 3│0 0 0 1 0│┌────────┬─────┬─────┐│ │ │0 0 1 0 0││Scissors│Stone│Paper││ │ │0 1 0 0 0│└────────┴─────┴─────┘│ └───┴─────────┴──────────────────────┘ tokens ⊃⎕nr'packZ' ⍝ tokens in first line of packZ ┌─────┬─┬─┬───┬─┬───┬─┬─┬─┬─┬──────────────────────────┬───────────────────────────────┐ │packZ│←│{│⎕IO│ │⎕ML│←│0│ │1│ │⍝ Lempel-Ziv-Welch compression.│ └─────┴─┴─┴───┴─┴───┴─┴─┴─┴─┴──────────────────────────┴───────────────────────────────┘ chk←{⍺←⊢ ⋄ ⍵≡0 packZ ⍺ packZ ⍵} ⍝ check unpack pack round trip. toks←↑,/tokens¨⎕nr'packZ' ⍝ nested vector of packZ's tokens. chk toks ⍝ nested pack: full circle. 1 chk 'hello world' ⍝ char pack: full circle. 1 chk 11 ⎕dr'hello world' ⍝ bool pack: full circle. 1 packZ{⍵/⍵}⍳8 ┌──┬───────────────────────────────────────────┬───────────────┐ │36│0 0 0 0 0 0 0 0 0 1 1 0 1 1 0 1 1 0 0 1 1 1│1 2 3 4 5 6 7 8│ │ │0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1│ │ │ │0 0 0 0 0 0 1 0 1 0 0 1 0 1 1 1 1 1 1 0 0 0│ │ │ │0 0 0 1 1 1 0 1 0 0 0 0 1 0 1 1 1 1 1 1 1 1│ │ │ │0 1 1 0 1 1 1 1 0 0 0 1 1 0 0 0 1 0 1 0 1 0│ │ └──┴───────────────────────────────────────────┴───────────────┘ ⎕ml←⍴'VMJ' ⍝ Check OK in ⎕ML 3. 'OK'≡0 packZ packZ'OK' 1 ⍝∇ packZ tokens Back to: code Back to: Workspaces