z←x ##.xtimes y ⍝ Fast multi-digit product using FFT. From Roger Hui: 0. Introduction FFT is used to provide an O(n×⍟n) algorithm for the convolution of two vectors (polynomial multiplication), which is then used to implement a fast multiplic- ation for integers with an arbitrary number of digits. The ideas and code were presented by Henry Rich in the Jwiki essay FFT on 2010-12-27, here translated from J to Dyalog APL. See: http://www.jsoftware.com/jwiki/Essays/FFT In these notes, ⎕pp←6 and ⎕io←0 . 1. Convolution Convolution (polynomial multiplication) can be coded succinctly: convolve←{+⌿(-⍳⍴⍺)⌽⍺∘.×⍵,0×1↓⍺} x←9 3 5 8 1 0 5 y←6 2 3 7 4 x convolve y 54 36 63 130 94 73 109 49 19 35 20 convolve works by summing the antidiagonals of x∘.×y , that is, summing the ele- ments of x∘.×y having like indices in (⍳⍴x)∘.+⍳⍴y , resulting in a vector with ¯1+(⍴x)+⍴y elements. (x∘.×y) ' ' ((⍳⍴x)∘.+⍳⍴y) 54 18 27 63 36 0 1 2 3 4 18 6 9 21 12 1 2 3 4 5 30 10 15 35 20 2 3 4 5 6 48 16 24 56 32 3 4 5 6 7 6 2 3 7 4 4 5 6 7 8 0 0 0 0 0 5 6 7 8 9 30 10 15 35 20 6 7 8 9 10 convolve takes time of order (⍴x)×(⍴y) , making it impractical on large argu- ments (longer than 1e5 , say). 2. FFT A faster convolution obtains by use of FFT (fast Fourier transform). FFT con- verts a polynomial represented by its coefficients into its value at n points, here the n roots of unity, and the inverse FFT converts from the n points back into coefficients. The important properties are that ⍺ convolve ⍵ ←→ iFFT (FFT ⍺) × (FFT ⍵) and that FFT and iFFT are computed in O(n×⍟n) time (justifying the moniker "fast"). The key computation in FFT (and iFFT) is: floop←{(⊣/⍺)∇⍣(×m)⊢(+⌿⍵),[m-0.5]⍺×[⍳m←≢⍴⍺]-⌿⍵} The right argument is a hypercube initially constructed from the argument ⍵ to FFT as (r⍴2)⍴⍵ where r←2⍟⍬⍴⍴⍵ . floop applies to the hypercube r times, to each of the axes in order. Let p←¯1*2÷2*r be a primitive root of unity of order 2*r. The left argument is the hypercube p*(2*k)×s⍴⍳×/s←2⍴⍨(r-1)-k on the k-th applic- ation of floop . Equivalently, the initial left argument is p*s⍴⍳×/s←(r-1)⍴2 and the next left argument obtains from the current one by ⊣/⍺ . x 9 3 5 8 1 0 5 y 6 2 3 7 4 iFFT (FFT 16↑x) × (FFT 16↑y) 54J¯9.32587E¯15 36J¯3.51108E¯15 63J¯6.38378E¯15 ... x rconvolve y 54J¯9.32587E¯15 36J¯3.51108E¯15 63J¯6.38378E¯15 ... ⌊0.5+9○ x rconvolve y 54 36 63 130 94 73 109 49 19 35 20 x convolve y 54 36 63 130 94 73 109 49 19 35 20 Because rconvolve is mathematically (if not numerically) equivalent to convolve, the application of 9○ to its result is justified when the arguments are real and of ⌊0.5+ when the they are integral. 3. Integer Multiplication The convolution of x and y are the digits of their product when x and y are in- terpreted as vectors of digits. The convention here is that the digits are list- ed from most significant to least significant. For example, the number 3142 is represented by 3 1 4 2 . Base 10 is used here but other bases are possible. It remains to convert the result into standard form, with each element less than the base. If carry performs one step of propagating the carries, then the limit of its application is the desired computation. x 9 3 5 8 1 0 5 y 6 2 3 7 4 ⎕←t←⌊0.5+9○ x rconvolve y 54 36 63 130 94 73 109 49 19 35 20 carry 0,t 5 7 12 16 9 11 13 13 10 12 7 0 carry carry 0,t 5 8 3 6 10 2 4 4 1 2 7 0 carry carry carry 0,t 5 8 3 7 0 2 4 4 1 2 7 0 carry carry carry carry 0,t 5 8 3 7 0 2 4 4 1 2 7 0 carry⍣≡ 0,t 5 8 3 7 0 2 4 4 1 2 7 0 x xtimes y 5 8 3 7 0 2 4 4 1 2 7 0 ' '~⍨⍕x xtimes y 583702441270 {⎕pp←18 ⋄ ⍕⍵} 9358105 × 62374 583702441270 s t ← ↓ ? 2 1.5e4⍴10 s xtimes time t 00.296 xtimes has the potential to multiply numbers with over a million digits in under a minute. The function as written is constrained by the fact that in Dyalog APL the maximum rank of an array is 15, so that the maximum product can have only 2*15 or 32768 digits. 4. Collected Definitions The presentation here favors terseness and clarity over efficiency. Some poss- ible improvements are described in the Jwiki essay. (For example, computing the roots of unity just once rather than four times.) convolve←{+⌿(-⍳⍴⍺)⌽⍺∘.×⍵,0×1↓⍺} xtimes←{⎕io←0 roots ← {×\1,1↓(⍵÷2)⍴¯1*2÷⍵} cube ← {⍵⍴⍨2⍴⍨2⍟⍴⍵} extend ← {(2*⌈2⍟¯1+(⍴⍺)+⍴⍵)↑¨⍺ ⍵} floop ← {(⊣/⍺)∇⍣(×m)⊢(+⌿⍵),[m-0.5]⍺×[⍳m←≢⍴⍺]-⌿⍵} FFT ← { ,(cube roots ⍴⍵)floop cube ⍵} iFFT ← {(⍴⍵)÷⍨,(cube +roots ⍴⍵)floop cube ⍵} rconvolve ← {(¯1+(⍴⍺)+⍴⍵)↑iFFT ⊃×/FFT¨⍺ extend ⍵} carry ← {1↓+⌿1 0⌽0,0 10⊤⍵} (+/∧\0=t)↓t←carry⍣≡0,⌊0.5+9○ ⍺ rconvolve ⍵ } Ref: http://www.jsoftware.com/jwiki/Essays/FFT See also: nats big xpower Back to: contents Back to: Workspaces