## Introduction

Complex-step differentiation is a simple and effective technique for numerically differentiating a(n analytic) function. Discussing it is a neat combination of complex analysis, numerical analysis, and ring theory. We’ll see that it is very closely connected to forward-mode automatic differentiation (FAD). For better or worse, while widely applicable, the scenarios where complex-step differentiation is the best solution are a bit rare. To apply complex-step differentiation, you need a version of your desired function that operates on complex numbers. If you have that, then you can apply complex-step differentiation immediately. Otherwise, you need to adapt the function to complex arguments. This can be done essentially automatically using the same techniques as automatic differentiation, but at that point you might as well use automatic differentiation. Adapting the code to complex numbers or AD takes about the same amount of effort, however, the AD version will be more efficient, more accurate, and easier to use.

Nevertheless, this serves as a simple example to illustrate several theoretical and practical ideas.

## Numerical Differentiation

The problem we’re solving is given a function $f : \mathbb R \to \mathbb R$ which is differentiable around a point $x_0$, we’d like to compute its derivative $f’$ at $x_0$. In many cases, $f$ is real analytic at the point $x_0$ meaning $f$ has a Taylor series which converges to $f$ in some open interval containing $x_0$.

The most obvious way of numerically differentiating $f$ is to approximate the limit in the definition of the derivative, $f’(x) = \lim_{h\to 0} [f(x + h) - f(x)] / h$, by simply choosing a small value for $h$ rather than taking the limit. When $f$ is real analytic at $x$, we can analyze the quality of this approximation by expanding $f(x + h)$ in a Taylor series at $x$. This produces $[f(x + h) - f(x)]/h = f’(x) + O(h)$. A slight tweak produces a better result with the same number of evaluations of $f$. Specifically, the Taylor series of $f(x + h) - f(x - h)$ at $x$ is equal to the odd part the Taylor series of $f(x + h)$ at $x$. This leads to the Central Differences formula:

$$f'(x) + O(h^2) = \frac{f(x + h) - f(x - h)}{2h}$$

The following interactively illustrates this using the function $f(x) = x^{9/2}$ evaluated at $x_0 =$ . The correct answer to $17$ digits is $f’($$) {}={}$. The slider ranges from $h=10^{-2}$ to $h=10^{-20}$.

$h$:
$f’($$)$:
error:

If you play with the slider using the first example, you’ll see that the error decreases until around $10^{-5}$ after which it starts increasing until $10^{-15}$ where it is off by more than $1$. At $10^{-16}$ the estimated derivative is $0$ which is, of course, completely incorrect. Even at $10^{-5}$ the error is on the order of $10^{-9}$ which is much higher than the double precision floating point machine epsilon of approximately $10^{-16}$.

There are two issues here. First, we have the issue that if $x_0 \neq 0$, then $x_0 + h = x_0$ for sufficiently small $h$. This happens when $x_0/h$ has a magnitude of around $10^{16}$ or more.

The second issue here is known as catastrophic cancellation. For simplicity, let’s say $f(x)=1$. (It’s actually about $6.2$ for the first example.) Let’s further say for some small value of $h$, $f(x+h) = 1.00000000000020404346$. The value we care about is the $0.00000000000020404346$, but given limited precision, we might have $f(x + h) = 1.000000000000204$, meaning we only have three digits of precision for the value we care about. Indeed, as $h$ becomes smaller we’ll lose more and more precision in our desired value until we lose all precision which happens when $f(x + h) = f(x)$. It is generally a bad idea numerically to calculate a small value by subtracting two larger values for this reason.

We would not have the first issue if $x_0 = 0$ as in the second and fourth examples ($f(x) = e^x$). We would not have the second issue if $f(x) = 0$ as in the second and third examples ($f(x) = \sin(x)$ with $x_0 = \pi$). We have neither issue in the second example of $f(x) = \sin(x)$ with $x_0 = 0$. This will become important later.

We have a dilemma. For the theory, we want as small a value of $h$ as possible without being zero. In practice, we start losing precision as $h$ gets smaller, and generally larger values of $h$ are going to be less impacted by this.

Let’s set this aside for now and look at other ways of numerically computing the derivative in the hopes that we can avoid this problem.

## Cauchy’s Residue Theorem

If we talk about functions $f : \mathbb C \to \mathbb C$, the analogue of real analyticity is holomorphicity or complex analyticity. A complex function is holomorphic if it satisfies the Cauchy-Riemann equations. (See the Appendix for more details about where the Cauchy-Riemann equations come from.) A complex function is complex analytic if it has a Taylor series which converges to the function. It can be proven that these two descriptions are equivalent, though this isn’t a trivial fact. We can also talk about functions that are holomorphic or complex analytic on an open subset of $\mathbb C$ and at a point by considering an open subset around that point. The typical choice of open subset is some suitably small open disk in the complex plane about the point. (Other common domains are ellipses, infinite strips, and areas bounded by Hankel contours and variations such as sideways opening parabolas.)

A major fact about holomorphic functions is the Cauchy integral theorem. If $f$ is a holomorphic function inside a (suitably nice) closed curve $\Gamma$ in the complex plane, then $\oint_\Gamma f(z)\mathrm dz = 0$. Again, $\Gamma$ will typically be chosen to be some circle. (Integrals like this in the complex plane are often called contour integrals and the curves we’re integrating along are called contours.)

Things get really interesting when we generalize to meromorphic functions which are complex functions that are holomorphic except at an isolated set of points. These take the form of poles which are points $z_0$ such that $1/f(z_0) = 0$, i.e. poles are where a function goes to infinity as, e.g., $1/z$ does at $0$. The generalization of Cauchy’s integral theorem is Cauchy’s Residue Theorem. This theorem is surprising and is one of the most useful theorems in all of mathematics both theoretically and practically.

We’ll only need a common special case of it. Let $f$ be a holomorphic function, then $f(z)/(z - z_0)^n$ is a meromorphic function with a single pole of order $n$ at $z_0$. If $\Gamma$ is a positively oriented, simple closed curve containing $z_0$, then
$$f^{(n-1)}(z_0) = \frac{(n-1)!}{2 \pi i}\oint_{\Gamma} \frac{f(z)\mathrm dz}{(z - z_0)^n}$$
In this case, $f^{(n-1)}(z_0)/(n-1)!$ is the residue of $f(z)/(z - z_0)^n$ at $z_0$. More generally, if there are multiple poles in the area bounded by $\Gamma$, then we will sum up their residues.

This formula provides us a means of calculating the $(n-1)$-st Taylor coefficient of a complex analytic function at any point. For our particular purposes, we’ll only need the $n=2$ case, $f’(z_0) = \frac{1}{2 \pi i}\oint_{\Gamma} \frac{f(z)\mathrm dz}{(z - z_0)^2}$.

For the remainder of this section I want to give some examples of how Cauchy’s Residue Theorem is used both theoretically and practically. This whole article will itself be another practical example of Cauchy’s Residue Theorem. This is not exhaustive by any means.

To start illustrating some of the surprising properties of this theorem, we can take the $n=1$ case which states that we can evaluate a holomorphic function at any point via $f(z_0) = \frac{1}{2 \pi i}\oint_{\Gamma} \frac{f(z)\mathrm dz}{z - z_0}$ where $\Gamma$ is any contour which bounds an area containing $z_0$. This leads to an interesting discreteness. Not only can we evaluate a (holomorphic) function (or any of its derivatives) at a point via the values of the function on a contour, the only significant constraint on that contour is that it bound an area containing the desired point. In other words, no matter how we deform the contour the integral is constant except when we deform the contour so as not to bound an area containing the point being evaluated, at which point the integral’s value is $0$1. It may seem odd to use an integral to evaluate a function at a point, but it can be useful when there are numerical issues with evaluating the function near the desired point2. In fact, these results show that if we know the values of a holomorphic function on the boundary of a given open subset of the complex plane, then we know the value of the function everywhere. In this sense, holomorphic functions (and analytic functions in general) are extremely rigid.

This leads to the notion of analytic continuation where we try to compute an analytic function beyond its overt domain of definition. This is the basis of most “sums of divergent series”. For example, there is the first-year calculus fact that the sum of the infinite series $\sum_{n=0}^\infty x^n$ is $1/(1-x)$ converging on the interval $x \in (-1,1)$. In fact, the proof of convergence only needs $\$x\$< 1$ so we can readily generalize to complex $z$ with $\$z\$< 1$, i.e. $z$ contained in the open unit disk. However, $1/(1-z)$ is a meromorphic function that is holomorphic everywhere except for $z=1$, therefore there is a unique analytic function defined everywhere except $z=1$ that agrees with the infinite sum on the unit disk, namely $1/(1-z)$ itself. Choosing $z=-1$ leads to the common example of “summing a divergent series” with “$\sum_{n=0}^\infty (-1)^n = 1/2$” which really means “the value at $-1$ of the unique complex analytic function which agrees with this infinite series when it converges”.

Sticking with just evaluation, applying the Cauchy Residue theorem to quadrature, i.e. numerical integration, leads to an interesting connection to a rational approximation problem. Say we want to compute $\int_{-1}^1 f(x) \mathrm dx$, we can use the Cauchy integral to evaluate $f(x)$ leading to
$$\int_{-1}^1 f(x) \mathrm dx = \int_{-1}^1 \frac{1}{2\pi i}\oint \frac{f(z)\mathrm dz}{z - x}\mathrm dx = \frac{1}{2\pi i}\oint f(z)\int_{-1}^1 \frac{\mathrm dx}{z - x}\mathrm dz = \frac{1}{2\pi i}\oint f(z)\log\left(\frac{z+1}{z-1}\right)\mathrm dz$$
A quadrature formula looks like $\int_{-1}^1 f(x) \mathrm dx \approx \sum_{k=1}^N w_k f(x_k)$. The sum can be written as a Cauchy integral of $\oint f(z)\sum_{k=1}^N \frac{1}{2\pi i}\frac{w_k\mathrm dz}{z - x_k}$. We thus have
$$\left$\frac{1}{2\pi i}\oint f(z)\left[\log\left(\frac{z+1}{z-1}\right) - \sum_{k=1}^N \frac{w_k}{z - x_k}\right]\mathrm dz\right$$$
as the error of the approximation. The sum is a rational function (in partial fraction form) and thus the error is minimized by points ($x_k$) and weights ($w_k$) that lead to better rational approximations of $\log((z+1)/(z-1))$3.

The ability to calculate coefficients of the Taylor series of a holomorphic function is, by itself, already a valuable tool in both theory and practice. In particular, the coefficients of a generating function or a Z-transform can be computed with Cauchy integrals. This has applications in probability theory, statistics, finance, combinatorics, recurrences, differential equations, and signal processing. Indeed, when $z_0 = 0$ and $\Gamma$ is the unit circle, then the Cauchy integral is a component of the Fourier series of $f$. Approximating these integrals with the Trapezoid Rule (which we’ll discuss in a bit) produces the Discrete Fourier Transform.

Let $p$ be a polynomial and, for simplicity, assume all its zeroes are of multiplicity one. Then $1/p(z)$ is a meromorphic function that’s holomorphic everywhere except for the roots of $p$. The Cauchy integral $\frac{1}{2\pi i}\oint_{\Gamma} \frac{p’(z)\mathrm dz}{p(z)}$ counts the number of roots of $p$ are contained in the area bounded by $\Gamma$. If we know there is only one root of $p$ within the area bounded by $\Gamma$, then we can compute that root with $\frac{1}{2\pi i}\oint_{\Gamma} \frac{z p’(z)\mathrm dz}{p(z)}$. A better approach is to use the formulas $\left(\oint \frac{z\mathrm dz}{p(z)}\right)/\left(\oint \frac{\mathrm dz}{p(z)}\right)$. Similar ideas can be used to adapt this to counting and finding multiple roots. See Numerical Algorithms based on Analytic Function Values at Roots of Unity by Austin, Kravanja, and Trefethen (2014) which is a good survey in general.

Another very common use of Cauchy’s Residue Theorem is to sum (convergent) infinite series. $\tan(\pi z)/\pi$ has a zero at $z = k$ for each integer $k$ and a non-zero derivative at those points. In fact, the derivative is $1$. Alternatively, we could use $\sin(\pi z)/\pi$ which has a zero at $z = k$ for each integer $k$ but has derivative $(-1)^k$ at those points. Therefore, $\pi\cot(\pi z) = \pi/\tan(\pi z)$ has a (first-order) pole at $z = k$ for each integer $k$ with residue $1$. In particular, if $f$ is a holomorphic function (at least near the real axis), then the value of the Cauchy integral of $f(z)\pi\cot(\pi z)$ along a Hankel contour will be $2\pi i \sum_{k=0}^\infty f(k)$. Along an infinite strip around the real axis we’d get $2 \pi i \sum_{k=-\infty}^\infty f(k)$. As an example, we can consider the famous sum, $\sum_{k=1}^\infty 1/k^2$. It can be shown that if $f$ is a meromorphic function whose poles are not located at integers, then $\oint f(z)\pi \cot(\pi z)\mathrm dz = 0$. We thus have that $\sum_{k=-\infty}^{\infty} f(k) = -\sum_j \mathrm{Res}(f(z)\pi\cot(\pi z); z_j)$ where $z_j$ are the poles of $f$. In particular, $f(z) = \frac{1}{z^2 + a^2}$ has (first-order) poles at $\pm ai$. This gives us simply
$$\sum_{k=-\infty}^{\infty} \frac{1}{k^2 + a^2} = -\pi\frac{\cot(\pi a i)-\cot(-\pi a i)}{2ai} = \frac{\pi}{a}\coth(\pi a)$$
where I’ve used $\coth(x) = i\cot(xi)$ and the fact that $\coth$ is an odd function. Exploiting the symmetry of the sum gives us
$$\sum_{k=1}^{\infty} \frac{1}{k^2 + a^2} = \frac{\pi}{2a}\coth(\pi a) - \frac{1}{2a^2}$$
By expanding $\coth$ in a Laurent series, we see that the limit of the right-hand side as $a$ approaches $0$ is $\frac{\pi^2}{6}$. While contour integration is quite effective for coming up with analytic solutions to infinite sums, numerically integrating the contour integrals is also highly effective as illustrated in Talbot quadratures and rational approximations by Trefethen, Weideman, and Schmelzer (2006), for example.

## Computing the Integrals

We’ve seen in the previous section that $f’(z_0) = \frac{1}{2\pi i}\oint_{\Gamma} \frac{f(z)\mathrm dz}{(z-z_0)^2}$. This doesn’t much help us if we don’t have a way to compute the integrals. From this point forward, fix $\Gamma$ as a circle of radius $h$ centered on $z_0$.

Before that, let’s consider numerical integration in general. Say we want to integrate the real function $f$ from $0$ to $b$, i.e. we want to calculate $\int_0^b f(x)\mathrm dx$. The most obvious way to go about it is to approximate the Riemann sums that define the (Riemann) integral. This would produce a formula like $\int_0^b f(x)\mathrm dx \approx \frac{b}{N}\sum_{k=0}^{N-1} f(bk/N)$ corresponding to summing the areas of rectangles whose left points are the values of $f$. As before with central differences, relatively minor tweaks will give better approximations. In particular, we get the two roughly equivalent approximations of the Midpoint Rule $\int_0^b f(x)\mathrm dx \approx \frac{b}{N}\sum_{k=0}^{N-1} f\left(\frac{b(k+(1/2))}{N}\right)$ where we take the midpoint rather than the left or right point, and the Trapezoid Rule $\int_0^b f(x)\mathrm dx \approx \frac{b}{2N}\sum_{k=0}^{N-1}[f(b(k+1)/N) + f(bk/N)]$ where we average the left and right Riemann sums. While both of these perform substantially better than the left/right Riemann sums, they are still rather basic quadrature rules; the error decreases as $O(1/N^2)$.

Something special happens when $f$ is a periodic function. First, the Trapezoid rule reduces to $\frac{b}{N}\sum_{k=0}^{N-1} f(bk/N)$. More importantly, the Midpoint rule and the Trapezoid rule both start converging geometrically rather than quadratically. Furthermore, for the particular case we’re interested in, namely integrating analytic functions along a circle in the complex plane, these quadrature rules are optimal. Let $\zeta$ be the $2N$-th root of unity. The Trapezoid rule corresponds to sum the values of $f$ at the even powers of $\zeta$ scaled by the radius $h$ and translated by $z_0$, and the Midpoint rule corresponds to the sum of the odd powers.

We now have two parameters for approximating a Cauchy integral via the Trapezoid or Midpoint rules: the radius $h$ and the number of points $N$.

Complex-Step Differentiation corresponds to approximating the Cauchy integral for the derivative using the extreme case of the Midpoint rule with $N=2$ and very small radii (i.e. values of $h$). Meanwhile, Central Differences corresponds to the extreme case of using the Trapezoid rule with $N=2$ and very small radii. To spell this out a bit more, we perform the substitution $z - z_0 = he^{\theta i}$ which leads to $\mathrm dz = hie^{\theta i}\mathrm d\theta$ and
$$\frac{1}{2\pi i}\oint_{$z - z_0$ = h} \frac{f(z)\mathrm dz}{(z - z_0)^2} = \frac{1}{2 \pi h}\int_0^{2\pi} f(z_0 + he^{\theta i})e^{-\theta i}\mathrm d\theta$$

Applying the Trapezoid rule to the right hand side of this corresponds to picking $\theta = 0, \pi$, while applying the Midpoint rule corresponds to picking $\theta = \pm \pi/2$. $e^{\theta i} = \pm 1$ for $\theta = 0, \pi$, and $e^{\theta i} = \pm i$ for $\theta = \pm \pi/2$. For the Trapezoid rule, this leads to $f’(z_0) \approx \frac{1}{2h}[f(z_0 + h) - f(z_0 - h)]$ which is Central Differences. For the Midpoint rule, this leads to $f’(z_0) \approx \frac{1}{2hi}[f(z_0 + hi) - f(z_0 - hi)]$. This is Complex-Step Differentiation when $z_0$ is real.

## Complex-Step Differentiation

As just calculated, Complex-Step Differentiation computes the derivative at the real number $x_0$ via the formula:
$$f'(x_0) \approx \frac{1}{2hi} [f(x_0 + hi) - f(x_0 - hi)]$$
Another perspective on this formula is that it is just the Central Differences formula along the imaginary axis instead of the real axis.

When $f$ is complex analytic and real-valued on real arguments, then we have $f(\overline z) = \overline{f(z)}$ where $\overline z$ is the complex conjugate of $z$, i.e. it maps $a + bi$ to $a - bi$ or $re^{\theta i}$ to $re^{-\theta i}$. This leads to $f(x_0 + hi) - f(\overline{x_0 + hi}) = f(x_0 + hi) - \overline{f(x_0 + hi)} = 2i\operatorname{Im}(f(x_0 + hi))$. This lets us simplify Complex-Step Differentiation to $f’(x_0) \approx \operatorname{Im}(f(x_0 + h))/h$.

Here is the earlier interactive example but now using Complex-Step Differentiation. As $h$ decreases in magnitude, the error steadily decreases until there is no error at all.

$h$:
$f’($$)$:
error:

This formula using $\operatorname{Im}$ avoids catastrophic cancellation simply by not doing a subtraction. However, it turns out for real $x_0$ (which is necessary to derive the simplified formula), there isn’t a problem either way. Using the first form of the Complex-Step Differentiation formula is also numerically stable. The key here is that the imaginary part of $x_0$ and $f(x_0)$ are both $0$ and so we don’t get catastrophic cancellation for the same reason we wouldn’t get it with Central Differences if $f(x_0) = 0$. This suggests that if we wanted to evaluate $f’$ at some non-zero point on the imaginary axis, Complex-Step Differentiation would perform poorly while Central Differences would perform well. Further, if we wanted to evaluate $f’$ at some point not on either the real or imaginary axes, neither approach would perform well. In this case, choosing different values for $N$ and the radius would be necessary4.

A third perspective on Complex-Step Differentiation comes when we think about which value of $h$ should we use. The smaller $f’(x_0)$ is, the smaller we’d want $h$ to be. Unlike Central Differences, there is little stopping us from having $h$ be very small and values like $h=10^{-100}$ are typical. In fact, around $h=10^{-155}$ in double precision floating point arithmetic, $h$ gets the theoretically useful property that $h^2 = 0$ due to underflow. In this case, $x_0 + hi$ behaves like $x_0 + \varepsilon$ where $\varepsilon^2 = 0$. This is the defining property of the ring of dual numbers. Dual numbers are exactly what are used in forward-mode automatic differentiation.

## Forward-Mode Automatic Differentiation

The ring of dual numbers has numbers of the form $a + b\varepsilon$ where $a, b \in \mathbb R$. This behaves just like the complex numbers except that instead of $i^2 = -1$ we have $\varepsilon^2 = 0$. The utility of dual numbers for our purposes can be seen by expanding $f(x_0 + \varepsilon)$ in a Taylor series about $x_0$. We get $f(x_0 + \varepsilon) = f(x_0) + f’(x_0)\varepsilon$. All higher power terms of the Taylor series are zero because $\varepsilon^2 = 0$. We can thus get the derivative of $f$ simply by computing $f(x + \varepsilon)$ and then looking at the coefficient of $\varepsilon$ in the result.

In this example there is no interactivity as we are not estimating the derivative in the AD case but instead calculating it in parallel. There is no $h$ parameter.

$f’($$)$:
error:

As the end of the previous section indicated, Complex-Step Differentiation approximates this (often exactly) by using $hi$ as $\varepsilon$. Nevertheless, this is not ideal. Often the complex versions of a function will be more costly than their dual number counterparts. For example, $(a + bi)(c + di) = (ac - bd) + (ad + bc)i$ involves four real multiplications and two additions. $(a + b\varepsilon)(c + d\varepsilon) = ac + (ad + bc)\varepsilon$ involves three real multiplications and one addition on the other hand.

Using Complex Variables to Estimate Derivatives of Real Functions by Squire and Trapp (1998) is the first(?) published paper specifically about the idea of complex-step differentiation. It’s a three page paper and the authors are not claiming any originality but just demonstrating the effectiveness of ideas from the ’60s that the authors found to be underappreciated.

The Complex-Step Derivative Approximation by Martins, Sturdza, and Alonso (2003) does a much deeper dive into the theory behind complex-step differentiation and its connections to automatic differentiation.

You may have noticed the name “Trefethen” in many of the papers cited. Nick Trefethen and his collaborators have been doing amazing work for the past couple of decades, most notably in the Chebfun project. Looking at Trefethen’s book Approximation Theory and Approximation Practice (and lectures) recently reintroduced me to Trefethen’s work. This particular article was prompted by a footnote in the paper The Exponentially Convergent Trapezoidal Rule which I highly recommend. In fact, I highly recommend Chebfun as well as nearly all of Trefethen’s work. It is routinely compelling, interesting, and well presented.

## Appendix

Using the language of Geometric Calculus, we can write a very general form of the Fundamental Theorem of Calculus. Namely, $\int_{\mathcal M} \mathrm d^m\mathbf x \nabla f(\mathbf x) = \oint_{\partial \mathcal M}\mathrm d^{m-1}\mathbf x f(\mathbf x)$ where $\mathcal M$ is an $m$-dimensional manifold. Here $f$ is a multivector-valued vector function. If $m=2$ and $\nabla f = 0$, then this would produce a formula very similar to the Cauchy integral formula.

Writing $f(x + yi) = u(x, y) + v(x, y)i$, the Cauchy-Riemann equations are $\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}$ and $\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$. However, $\nabla f = 0$ leads to the slightly different equations $\frac{\partial u}{\partial x} = -\frac{\partial v}{\partial y}$ and $\frac{\partial u}{\partial y} = \frac{\partial v}{\partial x}$.

The resolution of this discrepancy is found by recognizing that we don’t want $f$ to be a vector-valued vector function but rather a spinor-valued spinor function. It is most natural to identify complex numbers with the even subalgebra of the 2D geometric algebra. If $\mathbf e_1$ and $\mathbf e_2$ are the two orthonormal basis vectors of the 2D space, then the pseudoscalar $I \equiv \mathbf e_1\wedge \mathbf e_2 = \mathbf e_1 \mathbf e_2$ satisfies $I^2 = -1$. For the 2D case, a spinor is a multivector of the form $a + bI$.

We can generalize the vector derivative, $\nabla$, to a multivector derivative $\nabla_X$ where $X$ is a multivector variable by using the generic formula for the directional derivative in a linear space and then defining $\nabla_X$ to be a linear combination of directional derivatives. Given any $\mathbb R$-linear space $V$ and an element $v \in V$, we can define the directional derivative of $f : V \to V$ in the direction $v$ via $\frac{\partial f}{\partial v}(x) \equiv \frac{\mathrm d f(x + \tau v)}{\mathrm d\tau}$. In our case, we have the basis vectors $\{1, \mathbf e_1, \mathbf e_2, I\}$ though we only care about the even subalgebra corresponding to the basis vectors $\{1, I\}$. Define $\partial_1 f(x) \equiv \frac{\mathrm d f(x + \tau)}{\mathrm d \tau}$ and $\partial_I f(x) \equiv \frac{\mathrm d f(x + \tau I)}{\mathrm d \tau}$ assuming $f$ is a spinor-valued function5. We can then define $\nabla_{\mathbf z} \equiv \partial_1 + I\partial_I$. We now have $\nabla_{\mathbf z} f = 0$ is the equivalent to the Cauchy-Riemann equations where $f$ is now a spinor-valued spinor function, i.e. a function of $\mathbf z$.

1. In terms of the general theory of partial differential equations, we are saying that $z^{-1}$ is a Green’s function for $\nabla$. We can then understand everything that is happening here in terms of general results. In particular, it is the two-dimensional case of the results described in Multivector Functions by Hestenes.↩︎
2. See Numerical Algorithms based on Analytic Function Values at Roots of Unity by Austin, Kravanja, and Trefethen (2014) for an example. Also, with some minor tweaks, we can have that “point” be a matrix and these integrals can be used to calculate functions of matrices, e.g. the square root, exponent, inverse, and log of a matrix. See Computing $A^\alpha$, $\log(A)$, and Related Matrix Functions by Contour Integrals by Hale, Higham, and Trefethen (2009) for details.↩︎
4. While focused on issues that mostly come up with very high-order derivatives, e.g. $100$-th derivatives and higher, Accuracy and Stability of Computing High-Order Derivatives of Analytic Functions by Cauchy Integrals by Bornemann (2009) nevertheless has a good discussions of the concerns here.↩︎