## A better way to write convolution

Normally discrete convolution is written as follows:

`(f ** g)(n) = sum_(k=0)^n f(k)g(n-k)`

It is not immediately obvious from this that f ** g = g ** f. Consider this alternate representation:

`(f ** g)(n) = sum_(j+k=n) f(j)g(k)`

This formula is obviously symmetric in f and g, and it makes an important invariant explicit: the indices always sum to n. For example, to multiply two polynomials (or formal power series) we convolve their coefficients. This looks like:

Let `P(x) = sum_k a_k x^k` and `Q(x) = sum_k b_k x^k`, then

`P(x)Q(x) = sum_n sum_(j+k=n) a_j b_k x^n`

This emphasizes that the nth coefficient is the sum of the products of coefficients whose degrees sum to n. (You may recognize this as the convolution theorem.)
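
To make this concrete, here’s a minimal sketch of polynomial multiplication as a convolution of coefficient lists (the name `polyMul` is just for illustration):

```
-- Illustrative sketch: polynomial multiplication as convolution of coefficient
-- lists, least-significant coefficient first. The nth output coefficient sums
-- c * d over all pairs of input coefficients whose degrees satisfy j + k = n.
polyMul :: Num r => [r] -> [r] -> [r]
polyMul as bs = [ sum [ c * d
                      | (j, c) <- zip [0 ..] as
                      , (k, d) <- zip [0 ..] bs
                      , j + k == n ]
                | n <- [0 .. length as + length bs - 2] ]

-- e.g. polyMul [1, 1] [1, 1] == [1, 2, 1]   -- (1 + x)^2 = 1 + 2x + x^2
```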

There is one subtle difference. In this alternate representation, it very much matters what the domains of j and k are. If j and k are naturals, we get what we started with. But if they are integers, then we have:

`sum_(j+k=n) f(j)g(k) = sum_(k=-oo)^oo f(k)g(n-k)`

In many combinatorial situations this makes no difference and may even be preferable. If, however, you are doing something like a short-time Fourier transform, then you probably don’t want to include an infinite number of entries (but you probably aren’t indexing by `bbbZ`). Personally, I view this as a feature; you should be clear about the domain of your variables.
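
For the integer-indexed case, a minimal sketch (the name `zconvolve` is just for illustration) is to represent f and g by their finite supports, so the formally infinite sum over j + k = n collapses to finitely many terms:

```
-- Illustrative sketch: f and g given by their finite supports as association
-- lists of (index, value) pairs with indices ranging over the integers.
-- Only the finitely many pairs with j + k == n contribute to the sum.
zconvolve :: Num a => [(Integer, a)] -> [(Integer, a)] -> Integer -> a
zconvolve f g n = sum [ x * y | (j, x) <- f, (k, y) <- g, j + k == n ]

-- e.g. zconvolve [(-1, 2), (0, 3)] [(1, 5)] 0 == 10
```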

Let’s prove that ** is associative, i.e. `((f ** g) ** h)(n) = (f ** (g ** h))(n)`. Expanding the left hand side we get:

`sum_(m+k=n) sum_(i+j=m) f(i)g(j)h(k) = sum_(i+j+k=n) f(i)g(j)h(k) = sum_(i+m=n) sum_(j+k=m) f(i)g(j)h(k)`

and we’re done. It’s clear that the associativity of convolution comes from the associativity of addition and distributivity. In fact, the commutativity also comes from the commutativity of addition. This suggests a simple but broad generalization: replace the + in the indexing with any binary operation, i.e.:

`(f ** g)(n) = sum_(j o+ k=n) f(j)g(k)`

Usually `o+` is taken to be an operation of a group, and this makes sense when you need to support the equivalent of n-k, but there’s really nothing keeping it from being a monoid, and the unit isn’t doing anything so why not a semigroup, and we only really need associativity if we want the convolution to be associative, and nothing requires n to be the same type as j or k or for them to be the same type as each other for that matter… Of course, having a group structure reduces the often sticky problem of factoring n into all the pairs of j and k such that `j o+ k = n` to the problem of “simply” enumerating the group.

Here’s some code for the free monoid (i.e. lists) case:

```
import Data.Monoid
import Data.List

-- A commutative monoid is a monoid whose mappend is commutative; there is no
-- extra method, only an extra law.
class Monoid m => CommutativeMonoid m
instance Num a => CommutativeMonoid (Sum a)

-- Overly generalized convolve: given a "multiplication" into a commutative
-- monoid and a function enumerating the factorizations of n, combine
-- mul (f j) (g k) over every factorization (j,k) of n.
ogconvolve :: (CommutativeMonoid m) => (a -> b -> m) -> (n -> [(j,k)]) -> (j -> a) -> (k -> b) -> n -> m
ogconvolve mul factor f g n = mconcat (map (uncurry (\j k -> mul (f j) (g k))) (factor n))

-- Specialize the commutative monoid to sums of products for numeric results.
gconvolve :: Num m => (n -> [(n,n)]) -> (n -> m) -> (n -> m) -> n -> m
gconvolve factor f g n = getSum (ogconvolve (\a b -> Sum (a*b)) factor f g n)

-- The free monoid case: a factorization of a list is a split into a prefix
-- and a suffix.
lconvolve :: Num m => ([a] -> m) -> ([a] -> m) -> [a] -> m
lconvolve = gconvolve (\as -> zip (inits as) (tails as))
```
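
As a quick usage sketch (the names `inLang` and `catCount` are just for illustration, assuming the `lconvolve` defined above): if f and g are indicator functions of two finite languages, then `lconvolve f g` counts the ways a string factors as a word of the first language followed by a word of the second.

```
-- Indicator function of a finite language.
inLang :: [String] -> String -> Integer
inLang ws w = if w `elem` ws then 1 else 0

-- Number of ways to write a string as a word of the first language followed
-- by a word of the second, by convolving the indicators over list splits.
catCount :: [String] -> [String] -> String -> Integer
catCount ws1 ws2 = lconvolve (inLang ws1) (inLang ws2)

-- catCount ["a", "ab"] ["ba", "a"] "aba" == 2   -- "a" ++ "ba" and "ab" ++ "a"
```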

You may have noticed that we never actually use `o+` anywhere. Its only significance is to define the valid factorizations. I.e. there’s an implicit constraint in the above code that

``all (\(j,k) -> n == j `oplus` k) (factor n)``

and also that

``factor n == nub (factor n)``

i.e. no duplicates. (Conceptually, factor spits out a set, namely the graph of the relation `j o+ k = n`, i.e. the set of pairs (j,k) with `j o+ k = n`.) The commutative monoid constraint and `mul` function were just for the heck of it, but why a commutative monoid and not an arbitrary monoid? Well, we don’t want the result to depend on how `factor` spits out the factors. In other words, if it actually did return a set then we can’t depend on the order of the elements in the set if we want a well-defined function.

Here’s where I tell you that `lconvolve` is an important function for something. I suspect it or a variant probably is, but I don’t know. Here’s another monoid case, commutative this time, that definitely is important and interesting: Dirichlet convolution. Here’s the typical bad definition:

`(f *** g)(n) = sum_(d|n) f(d)g(n/d)`

where d|n means d evenly divides n. These are the coefficients of the product of two Dirichlet series. For example, if

`F(s) = sum_(n=1)^oo f(n)n^(-s) \quad G(s) = sum_(n=1)^oo g(n)n^(-s)`

then

`F(s)G(s) = sum_(n=1)^oo (f *** g)(n)n^(-s)`

We again see a situation where the correspondence doesn’t just jump out at you (or at least it didn’t to me originally) until you realize the above sum can be written:

`(f *** g)(n) = sum_(ab=n) f(a)g(b)`

Then it’s pretty easy to see that it is right.

We can start doing a lot of number theory with the above combined with the Euler product formula:

`sum_(n=1)^oo f(n)n^(-s) = prod_(p\ "prime") sum_(n=0)^oo f(p^n)p^(-ns)`

where f is multiplicative, which means f(ab) = f(a)f(b) when a and b are coprime. (The real significance of f being multiplicative is that it is equivalent to f being completely determined by its values on prime powers.) It turns out it is surprisingly easy and fun to derive various arithmetic functions and identities between them.

Here’s a brief teaser. 1(n) = 1 is a multiplicative function and this gives:

`zeta(s) = sum_(n=1)^oo n^(-s) = prod_(p\ "prime") 1/(1-p^(-s))`

where we’ve used the formula for the sum of a geometric series. So

`1/(zeta(s)) = prod_(p\ "prime") (1-p^(-s)) = sum_(n=1)^oo mu(n)n^(-s)`

The question now is, what is mu? Well, all we need to do is say what it is on prime powers, and comparing the Euler product formula to the above we see that, for prime p, mu(p) = -1 and mu(p^n) = 0 for n > 1. This is the Möbius function. Because zeta(s) * (1/(zeta(s))) = 1, we have (1 *** mu)(n) = sum_(d|n) mu(d) = 0 for n > 1. Continuing in this vein leads to analytic number theory and the Riemann hypothesis, though you may want to pick up the Mellin transform along the way.
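
To tie this back to the code above, here’s one last sketch (the names `dconvolve` and `mu` are just for illustration, assuming the `gconvolve` defined earlier): Dirichlet convolution is `gconvolve` with `factor` enumerating divisor pairs, and defining the Möbius function by its values on prime powers lets you spot-check the identity `sum_(d|n) mu(d) = 0` for n > 1.

```
-- Dirichlet convolution: the factorizations of n are the divisor pairs (d, n/d).
dconvolve :: Num m => (Integer -> m) -> (Integer -> m) -> Integer -> m
dconvolve = gconvolve (\n -> [ (d, n `div` d) | d <- [1 .. n], n `mod` d == 0 ])

-- The Möbius function, determined by its values on prime powers:
-- mu(1) = 1, mu(p) = -1, mu(p^k) = 0 for k > 1, and mu is multiplicative.
mu :: Integer -> Integer
mu 1 = 1
mu n
  | n `mod` (p * p) == 0 = 0                     -- repeated prime factor
  | otherwise            = negate (mu (n `div` p))
  where
    p = head [ d | d <- [2 ..], n `mod` d == 0 ]  -- smallest prime factor of n

-- Spot check of (1 *** mu)(n) = sum_(d|n) mu(d) = 0 for n > 1:
-- all (\n -> dconvolve (const 1) mu n == 0) [2 .. 100]   -- should be True
```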