expr ← vars ##.unify expr expr ··· ⍝ Unification of expressions. Returns the unification of a vector of expressions, with respect to variables listed in its left argument. For example, if (1×2)+y and x+(3×4) represent the same expression, then it foll- ows that x = 1×2 and y = 3×4, and both expressions must be equal to (1×2)+(3×4). The new expression is said to be the "unification" of the previous ones with re- spect to variables x and y. Technical notes: Unify takes a _vector_ of expressions to be unified and uses primitive reduction to apply Robinson's algorithm cumulatively, two expressions at a time. The algorithm examines the two expressions in parallel, looking for the first pair of terms that "disagree". For example, in expressions: (1+2) × y ÷ 3 - (z + 5) u × (6+7) ÷ 3 - (z + w) the disagreement pairs are: (1+2) u y (6+7) 5 w If just one of the items in the pair is a variable, it is substituted in both of the original expressions with the corresponding value of the pair, and the whole process repeated until the expressions match. If neither of the items is a vari- able, then unification fails and a: "Can't unify" error is generated. A small complication arises with repeated variables. The expressions: z × z (1+y) × (x+2) yield two disagreement pairs: z (1+y) z (x+2) In this case, the _values_ of the two disagreement pairs (1+y) (x+2) are them- selves unified recursively to (1+2), before being substituted for z, yielding (1+2)×(1+2) as the unified expression. This coding includes an "occurs check" to guard against non-termination caused by an attempt to substitute for a variable with a value containing one or more occurrences of that variable. Using the notation: [⍺/⍵]∊ to mean "substitute ⍺ for occurrences of ⍵ in expression ∊" or more succinctly "⍺ for ⍵ in ∊". An ex- ample of such non-termination might be: [(x+1)/x] 2+x => [(x+1)/x] 2+(x+1) => [(x+1)/x] 2+((x+1)+1) => [(x+1)/x] 2+(((x+1)+1)+1) => ... If the process that generates the expressions to be unified can guarantee that such cases do not occur, this check may be omitted. References: Robinson, J.A. "A machine-oriented logic based on the resolution principle", Journal of the ACM, 12:23-41, 1965. Brown, J.A. and Guerreiro, R. "APL2 Implementations of Unification", Conference Proceedings APL87. APL Quote Quad, V17.4, 1987. A slightly different version of unify is used by max.dws for the unification of type expressions during type inference. For a discussion of polymorphic types and some more references, see max.dws/Implementation. Examples: ⍝ Here is a parser for simple parenthesised expressions. px←{ ⍝ Single char token parser. 1=⍴⍵:⊃⍵ ⍝ atom: return scalar. depth←ׯ1⌽-⌿+\'()'∘.=⍵ ⍝ within parens. lpars←0 1⍷depth ⍝ left parens. smask←lpars≥depth ⍝ segmentation mask. drops←1 ¯1×⊂smask/lpars ⍝ parens drop. ∇¨↑↓¨/drops,⊂smask⊂⍵ ⍝ process each segment. } ⍝ and a corresponding de-parser: fx←{ ⍝ Format parse tree. 0=≡⍵:⍕⍵ ⍝ atom: done. ↑,/{ ⍝ join branches. 0=≡⍵:⍵ ⍝ atom: done. '(',⍵,')' ⍝ parenthesise. }∘∇¨⍵ ⍝ each branch. } ⍝ Expressions in x and y: lex rex←px¨ '(1×2)+y' 'x+(3×4)' ⍕disp¨ lex rex ┌───┬─┬─┐ ┌─┬─┬───┐ │1×2│+│y│ │x│+│3×4│ └───┴─┴─┘ └─┴─┴───┘ ⍝ Unification of expressions: 'xy'unify lex rex ┌───┬─┬───┐ │1×2│+│3×4│ └───┴─┴───┘ ⍝ ... formatted. fx 'xy'unify lex rex (1×2)+(3×4) ⍝ More complex expressions: xa←' a + (b÷n) - (b × a) + a ' xb←'(1-c) + (c÷2) - d + e ' xc←' f + g - h + (i-3)' vars←lcase ⎕a ⍝ vars: a b ··· z fx vars unify px¨xa xb xc~¨' ' (1-3)+(3÷2)-(3×(1-3))+(1-3) ⍝ Unify can be used to resolve "logical syllogisms", such as: ⍝ ⍝ Socrates is a man; all men are mortal; ergo, Socrates is mortal. s1←'Socrates' 'is' 'man' s2←'Man' 'is' 'mortal' 'man' 'Man'unify s1 s2 ┌────────┬──┬──────┐ │Socrates│is│mortal│ └────────┴──┴──────┘ ⍝ If both items in the disagreement pair are variables, the (lexically) ⍝ higher is mapped to the lower. In this example, M→C, M→D, C→A, D→B→A: ⎕a unify 'A+B' 'C+D' 'M+M' ⍝ matching variables. A+A ⍝ Unify works on any conformable arrays, including those of higher rank. ⍝ In the following matrices, constant '*' propogates through all variables ⍝ A-Z: a←2 13⍴2/13↑⎕a b←2 13⍴13↓⎕a (⊃⌽,b)←'*' a b ┌─────────────┬─────────────┐ │AABBCCDDEEFFG│NOPQRSTUVWXYZ│ │GHHIIJJKKLLMM│NOPQRSTUVWXY*│ └─────────────┴─────────────┘ ⍝ * matches M matches Y matches F matches X ... ⎕a unify a b ************* ************* ⍝ Expression unification is used to infer type in some functional languages. ⍝ Here is the BNF for a tiny right-associative type expression: ⍝ ⍝ texp := func | atom ⍝ func := atom→texp | (texp)→texp ⍝ atom := tcon | tvar ⍝ tcon := # ⍝ tvar := ⍺ | ∆ | ∊ | ⍳ | ⍵ ⍝ ⍝ Type variables are ⍺ ∆ ∊ ⍳ ⍵, ⍝ # is the only type constant and ⍝ → is the only type operator. ⍝ Examples of expressions: ⍝ ⍝ #, ⍺, ⍺→⍵, (⍺→∆)→∊→# ⍝ Parser for such expressions: pt←{ ⍝ Parse type expression. ⊃⍬{ ⍝ Parse tree and remainder. 0=⍴⍵:⍺ ⍵ ⍝ done: parse tree and remainder. hd tl←(⊃⍵)(1↓⍵) ⍝ head and tail of expr. '('=hd:↑∇/⍬ ∇ tl ⍝ parenthesised sub-expr: ')'=hd:⍺ tl ⍝ complete sub-expression: '→'=hd:⌽⍺{⍺⍺ hd ⍵}\⌽⍬ ∇ tl ⍝ function: hd ∇ tl ⍝ atom: }⍵~' ' ⍝ ignoring blanks. } ⍝ and a corresponding de-parser: ft←{ ⍝ Format parse tree. 0=≡⍵:⍕⍵ ⍝ atom: done. 0=≡⊃⍵:∊∇¨⍵ ⍝ atomic left branch: no parens. '(',(∇⊃⍵),')',∊∇¨1↓⍵ ⍝ parenthesised left branch. } ⍝ Parse trees for type expressions: pt'⍺→∊→⍳→⍵' ┌─┬─┬─────────┐ │⍺│→│┌─┬─┬───┐│ │ │ ││∊│→│⍳→⍵││ │ │ │└─┴─┴───┘│ └─┴─┴─────────┘ pt'(⍺→∊)→⍳→⍵' ┌───┬─┬───┐ │⍺→∊│→│⍳→⍵│ └───┴─┴───┘ ⍝ Formatted parse tree. Note how redundant parentheses are ignored: ft pt'(⍺→∊)→(⍳→⍵)' (⍺→∊)→⍳→⍵ ⍝ Vars: ⍺ ∆ ∊ ⍳ ⍵ vars←'⍺∆∊⍳⍵' ⍝ Some type expressions: xa←'(⍺→⍺)→∊ ' xb←' ∆→∆ ' xc←' ⍳→(⍵→#)' ⍝ Unified type expressions: ft vars unify pt¨xa xb xc (#→#)→#→# ⍝ The following expressions can't be unified: clash←pt¨ '(⍺→⍺)→⍵' '⍺→(⍵→⍵)' ft vars unify clash Can't unify ft vars unify clash ∧ ⍝ However, if the intention is that the variables ⍝ in the expressions are independent, they can be ⍝ distinguished by: dv←{ ⍝ Distinguish Variables ⍺ in exprs ⍵. vlft vrgt←⍺∘∩∘∊¨⍵ ⍝ vars in left and right exprs (vlft∩vrgt){ ⍝ duplicate vars. (⊂⍵)∊⍺⍺:(⍺⍺⍳⊂⍵)⊃⍵⍵ ⍝ var in old: subs new. 0=≡⍵:⍵ ⍝ atom: pass through. ∇¨⍵ ⍝ tree: traverse. }(⍺~vlft)\⍵ ⍝ translated right expr. } ⍝ The left argument of dv is the "pool" of ⍝ variables, available for substitution. ft¨clash (⍺→⍺)→⍵ ⍺→⍵→⍵ ⍝ Distinguish variables: ft¨vars dv clash (⍺→⍺)→⍵ ∆→∊→∊ ⍝ Unify succeeds: ft vars unify vars dv clash (⍺→⍺)→∊→∊ See max.dws/Implementation. See also: tokens parse Back to: contents Back to: Workspaces