A better way to write convolution

Normally discrete convolution is written as follows:

#(f ** g)(n) = sum_(k=0)^n f(k)g(n-k)#

It is not immediately obvious from this that #f ** g = g ** f#. Consider this alternate representation:

#(f ** g)(n) = sum_(j+k=n) f(j)g(k)#

This formula is obviously symmetric in #f# and #g# and it clarifies an important invariant. For example, to multiply two polynomials (or formal power series) we convolve their coefficients. This looks like:

Let #P(x) = sum_k a_k x^k# and #Q(x) = sum_k b_k x^k# then

#P(x)Q(x) = sum_n (sum_(j+k=n) a_j b_k) x^n#

This emphasizes that the #n#th coefficient is the sum of the products of coefficients whose degrees sum to #n#. (You may recognize this as the convolution theorem.)
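To make the indexing concrete, here’s a minimal Haskell sketch (polyMul is a made-up name, not anything standard) that multiplies two coefficient lists by summing over all #j + k = n#:

```haskell
-- A direct transcription of sum_(j+k=n) a_j b_k: coefficient n of the
-- product sums every product of coefficients whose indices add up to n.
polyMul :: Num a => [a] -> [a] -> [a]
polyMul as bs =
  [ sum [ a * b | (j, a) <- zip [0 ..] as
                , (k, b) <- zip [0 ..] bs
                , j + k == n ]
  | n <- [0 .. length as + length bs - 2] ]
```

For instance, polyMul [1, 1] [1, 1] yields [1, 2, 1], the coefficients of #(1+x)^2#.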

There is one subtle difference. In this alternate representation, it very much matters what the domains of #j# and #k# are. If #j# and #k# are naturals, we get what we started with. But if they are integers, then we have:

#sum_(j+k=n) f(j)g(k) = sum_(k=-oo)^oo f(k)g(n-k)#

In many combinatorial situations this makes no difference and may even be preferable. If, however, you are doing something like a short-time Fourier transform, then you probably don’t want to include an infinite number of entries (but you probably aren’t indexing by #bbbZ#). Personally, I view this as a feature; you should be clear about the domain of your variables.
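For finitely supported sequences the integer-indexed sum is harmless; here’s a minimal sketch (zconvolve is a made-up name) representing such sequences as maps so the support, and hence the domain, is explicit:

```haskell
import qualified Data.Map as Map

-- Convolution with indices ranging over the integers, for sequences of
-- finite support represented as maps; negative indices are fine.
zconvolve :: Num a => Map.Map Integer a -> Map.Map Integer a -> Map.Map Integer a
zconvolve f g = Map.fromListWith (+)
  [ (j + k, a * b) | (j, a) <- Map.toList f, (k, b) <- Map.toList g ]
```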

Let’s prove that #**# is associative, i.e. #((f ** g) ** h)(n) = (f ** (g ** h))(n)#. Expanding the left hand side we get:

#sum_(m+k=n) sum_(i+j=m) f(i)g(j)h(k)# #= sum_(i+j+k=n) f(i)g(j)h(k)# #= sum_(i+m=n) sum_(j+k=m) f(i)g(j)h(k)#

and we’re done. It’s clear that the associativity of convolution comes from the associativity of addition (together with distributivity). In fact, the commutativity also came from the commutativity of addition. This suggests a simple but broad generalization: replace the #+# in the indexing with any binary operation, i.e.:

#(f ** g)(n) = sum_(j o+ k=n) f(j)g(k)#

Usually #o+# is taken to be an operation of a group, and this makes sense when you need to support the equivalent of #n-k#, but there’s really nothing keeping it from being a monoid, and the unit isn’t doing anything so why not a semigroup, and we only really need associativity if we want the convolution to be associative, and nothing requires #n# to be the same type as #j# or #k# or for them to be the same type as each other for that matter… Of course having a group structure reduces the often sticky problem of factoring #n# into all the pairs of #j# and #k# such that #j o+ k = n# to the problem of “simply” enumerating the group.
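To make that last point vivid, here’s a sketch (convolveBy is my name for it, not anything standard) in which the operation appears only through its factorization function, and the three index types are allowed to differ:

```haskell
-- The general shape: the binary operation never appears directly; it only
-- shows up through factor, which enumerates all (j, k) with j o+ k = n.
-- Note that n, j, and k are three unrelated type variables.
convolveBy :: Num r => (n -> [(j, k)]) -> (j -> r) -> (k -> r) -> n -> r
convolveBy factor f g n = sum [ f j * g k | (j, k) <- factor n ]
```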

Here’s some code for the free monoid (i.e. lists) case:
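(A sketch reconstructed from the names the surrounding text uses, lconvolve, factor, and mul; Haskell’s Monoid stands in for the commutative monoid constraint discussed below, with commutativity left as an unchecked assumption.)

```haskell
import Data.List (inits, tails)

-- Convolution over the free monoid: a list n factors as every way of
-- splitting it into a prefix j and a suffix k with j ++ k == n. The
-- result type is assumed to be a *commutative* monoid (not enforced by
-- the Monoid constraint), and mul supplies the multiplication.
lconvolve :: Monoid m => (m -> m -> m) -> ([a] -> m) -> ([a] -> m) -> [a] -> m
lconvolve mul f g n = mconcat [ f j `mul` g k | (j, k) <- factor n ]
  where
    factor xs = zip (inits xs) (tails xs)
```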

You may have noticed that we never actually use #o+# anywhere. Its only significance is to define the valid factorizations. That is, there’s an implicit constraint in the above code that every pair #(j, k)# returned by factor satisfies #j o+ k = n#, and also that factor returns each such pair exactly once, i.e. no duplicates. (Conceptually, factor spits out a set, indeed the graph of the relation #(j,k) |-> j o+ k = n#.)

The commutative monoid constraint and the mul function were just for the heck of it, but why a commutative monoid and not an arbitrary monoid? Well, we don’t want the result to depend on the order in which factor spits out the factors. In other words, if factor actually did return a set, then we couldn’t depend on the order of its elements if we want a well-defined function.

Here’s where I tell you that lconvolve is an important function for something. I suspect it or a variant probably is, but I don’t know. Here’s another monoid case, commutative this time, that definitely is important and interesting: Dirichlet convolution.

Here’s the typical bad definition:

#(f *** g)(n) = sum_(d|n) f(d)g(n/d)# where #d|n# means #d# evenly divides #n#.
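As code, this is the same factor-driven shape as lconvolve, now over the positive integers under multiplication; a minimal sketch (dconvolve and the naive divisor scan are mine, purely for illustration):

```haskell
-- Dirichlet convolution, transcribing sum_(d|n) f(d) g(n/d) with a
-- naive divisor scan; not optimized.
dconvolve :: Num r => (Integer -> r) -> (Integer -> r) -> Integer -> r
dconvolve f g n = sum [ f d * g (n `div` d) | d <- [1 .. n], n `mod` d == 0 ]
```

Here the implicit factor enumerates the pairs #(d, n/d)#, i.e. the multiplicative factorizations of #n#.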

These are the coefficients of the product of two Dirichlet series. For example,

#F(s) = sum_(n=1)^oo f(n)n^(-s) \quad G(s) = sum_(n=1)^oo g(n)n^(-s)# #\quad F(s)G(s) = sum_(n=1)^oo (f *** g)(n)n^(-s)#

We again see a situation where the correspondence doesn’t just jump out at you (or at least it didn’t for me originally) until you realize the above sum can be written:

#(f *** g)(n) = sum_(ab=n) f(a)g(b)#

then it’s pretty easy to see that it is right. We can start doing a lot of number theory with the above combined with the Euler product formula:

#sum_(n=1)^oo f(n)n^(-s) = prod_(p\ "prime") sum_(n=0)^oo f(p^n)p^(-ns)#

where #f# is multiplicative, which means #f(ab) = f(a)f(b)# when #a# and #b# are coprime. (The real significance of #f# being multiplicative is that it is equivalent to #f# being completely determined by its values on prime powers.) It turns out it is surprisingly easy and fun to derive various arithmetic functions and identities between them. Here’s a brief teaser.

#1(n) = 1# is a multiplicative function and this gives:

#zeta(s) = sum_(n=1)^oo n^(-s) = prod_(p\ "prime") 1/(1-p^(-s))#

where we’ve used the formula for the sum of a geometric series.

So #1/(zeta(s)) = prod_(p\ "prime") (1-p^(-s)) = sum_(n=1)^oo mu(n)n^(-s)#. The question now is: what is #mu#? Well, all we need to do is say what it is on prime powers, and comparing the Euler product formula to the above we see that, for prime #p#, #mu(p) = -1# and #mu(p^n) = 0# for #n > 1#. This is the Möbius function. Because #zeta(s) * (1/(zeta(s))) = 1#, we have #(1 *** mu)(n) = sum_(d|n) mu(d) = 0# for #n > 1#. Continuing in this vein leads to analytic number theory and the Riemann hypothesis, though you may want to pick up the Mellin transform along the way.
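To see this concretely, here’s a small sketch (all names here, primePowers, mu, checkMu, are made up for illustration): define #mu# on prime powers, extend it multiplicatively, and check that the divisor sums vanish for #n > 1#.

```haskell
-- Naive factorization of n into (prime, exponent) pairs; fine for small n.
primePowers :: Integer -> [(Integer, Int)]
primePowers = go 2
  where
    go _ 1 = []
    go p n
      | p * p > n      = [(n, 1)]
      | n `mod` p == 0 = let (e, m) = strip 0 n in (p, e) : go (p + 1) m
      | otherwise      = go (p + 1) n
      where
        strip e m
          | m `mod` p == 0 = strip (e + 1) (m `div` p)
          | otherwise      = (e, m)

-- mu from its values on prime powers, extended multiplicatively;
-- mu 1 = 1 falls out as the empty product.
mu :: Integer -> Integer
mu n = product [ if e == 1 then -1 else 0 | (_, e) <- primePowers n ]

-- (1 *** mu)(n) = sum_(d|n) mu(d): this should be 1 at n = 1 and 0 after.
checkMu :: Integer -> Integer
checkMu n = sum [ mu d | d <- [1 .. n], n `mod` d == 0 ]
```

For example, map checkMu [1 .. 10] gives [1, 0, 0, 0, 0, 0, 0, 0, 0, 0], as the identity predicts.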