Complex-step differentiation is a simple and effective technique for numerically differentiating a(n analytic) function.
Discussing it is a neat combination of complex analysis, numerical analysis, and ring theory. We’ll see that it is very
closely connected to forward-mode automatic differentiation (FAD). For better or worse, while widely applicable, the scenarios
where complex-step differentiation is the *best* solution are a bit rare. To apply complex-step differentiation, you need
a version of your desired function that operates on complex numbers. If you have that, then you can apply complex-step
differentiation immediately. Otherwise, you need to adapt the function to complex arguments. This can be done essentially
automatically using the same techniques as automatic differentiation, but at that point you might as well use automatic
differentiation. Adapting the code to complex numbers or AD takes about the same amount of effort, however, the AD version
will be more efficient, more accurate, and easier to use.

Nevertheless, this serves as a simple example to illustrate several theoretical and practical ideas.

The problem we’re solving is given a function |f : \mathbb R \to \mathbb R| which is differentiable around a point |x_0|, we’d like to compute its derivative |f’| at |x_0|. In many cases, |f| is real analytic at the point |x_0| meaning |f| has a Taylor series which converges to |f| in some open interval containing |x_0|.

The most obvious way of numerically differentiating |f| is to approximate the limit in the definition of
the derivative, \[f’(x) = \lim_{h\to 0} [f(x + h) - f(x)] / h\] by simply choosing a small value for |h| rather
than taking the limit. When |f| is real analytic at |x|, we can analyze the quality of this approximation by expanding |f(x + h)|
in a Taylor series at |x|. This produces \[[f(x + h) - f(x)]/h = f’(x) + O(h)\] A slight tweak produces a better result with the
same number of evaluations of |f|. Specifically, the Taylor series of |f(x + h) - f(x - h)| at |x| is equal to the
odd part the Taylor series of |f(x + h)| at |x|. This leads to the **Central Differences** formula:

$$f'(x) + O(h^2) = \frac{f(x + h) - f(x - h)}{2h}$$

The following interactively illustrates this using the function |f(x) = x^{9/2}| evaluated at |x_0 =| . The correct answer to |17| digits is |f’(||) {}={}|. The slider ranges from |h=10^{-2}| to |h=10^{-20}|.

|h|:

|f’(||)|:

error:

If you play with the slider using the first example, you’ll see that the error decreases until around |10^{-5}| after which it starts increasing until |10^{-15}| where it is off by more than |1|. At |10^{-16}| the estimated derivative is |0| which is, of course, completely incorrect. Even at |10^{-5}| the error is on the order of |10^{-9}| which is much higher than the double precision floating point machine epsilon of approximately |10^{-16}|.

There are two issues here. First, we have the issue that if |x_0 \neq 0|, then |x_0 + h = x_0| for sufficiently small |h|. This happens when |x_0/h| has a magnitude of around |10^{16}| or more.

The second issue here is known as catastrophic cancellation. For simplicity, let’s say |f(x)=1|. (It’s actually about |6.2| for the first example.) Let’s further say for some small value of |h|, |f(x+h) = 1.00000000000020404346|. The value we care about is the |0.00000000000020404346|, but given limited precision, we might have |f(x + h) = 1.000000000000204|, meaning we only have three digits of precision for the value we care about. Indeed, as |h| becomes smaller we’ll lose more and more precision in our desired value until we lose all precision which happens when |f(x + h) = f(x)|. It is generally a bad idea numerically to calculate a small value by subtracting two larger values for this reason.

We would not have the first issue if |x_0 = 0| as in the second and fourth examples (|f(x) = e^x|). We would not have the second issue if |f(x) = 0| as in the second and third examples (|f(x) = \sin(x)| with |x_0 = \pi|). We have neither issue in the second example of |f(x) = \sin(x)| with |x_0 = 0|. This will become important later.

We have a dilemma. For the theory, we want as small a value of |h| as possible without being zero. In practice, we start losing precision as |h| gets smaller, and generally larger values of |h| are going to be less impacted by this.

Let’s set this aside for now and look at other ways of numerically computing the derivative in the hopes that we can avoid this problem.

If we talk about functions |f : \mathbb C \to \mathbb C|, the analogue of real analyticity
is holomorphicity or complex analyticity.
A complex function is **holomorphic** if it satisfies the Cauchy-Riemann equations.
(See the Appendix for more details about where the Cauchy-Riemann equations come from.)
A complex function is **complex analytic** if it has a Taylor series which converges to the function. It can be proven
that these two descriptions are equivalent, though this isn’t a trivial fact. We can also talk about functions that
are holomorphic or complex analytic on an open subset of |\mathbb C| and at a point by considering an open subset around
that point. The typical choice of open subset is some suitably small open disk in the complex plane about the point.
(Other common domains are ellipses, infinite strips, and areas bounded by Hankel contours
and variations such as sideways opening parabolas.)

A major fact about holomorphic functions is the Cauchy integral theorem.
If |f| is a holomorphic function inside a (suitably nice) closed curve |\Gamma| in the complex plane, then |\oint_\Gamma f(z)\mathrm dz = 0|.
Again, |\Gamma| will typically be chosen to be some circle. (Integrals like this in the complex plane are often called
**contour integrals** and the curves we’re integrating along are called **contours**.)

Things get really interesting when we generalize to meromorphic functions
which are complex functions that are holomorphic except at an isolated set of points. These take the form of **poles**
which are points |z_0| such that |1/f(z_0) = 0|, i.e. poles are where a function goes to infinity as, e.g., |1/z| does at |0|.
The generalization of Cauchy’s integral theorem is Cauchy’s Residue Theorem. *This theorem
is surprising and is one of the most useful theorems in all of mathematics both theoretically and practically*.

We’ll only need a common special case of it. Let |f| be a holomorphic function, then |f(z)/(z - z_0)^n| is a meromorphic function
with a single pole of order |n| at |z_0|. If |\Gamma| is a positively oriented, simple closed curve containing |z_0|,
then $$f^{(n-1)}(z_0) = \frac{(n-1)!}{2 \pi i}\oint_{\Gamma} \frac{f(z)\mathrm dz}{(z - z_0)^n}$$ In this case, |f^{(n-1)}(z_0)/(n-1)!|
is the **residue** of |f(z)/(z - z_0)^n| at |z_0|. More generally, if there are multiple poles in the area bounded by |\Gamma|, then we will
sum up their residues.

This formula provides us a means of calculating the |(n-1)|-st Taylor coefficient of a complex analytic function at any point. For our particular purposes, we’ll only need the |n=2| case, \[f’(z_0) = \frac{1}{2 \pi i}\oint_{\Gamma} \frac{f(z)\mathrm dz}{(z - z_0)^2}\]

For the remainder of this section I want to give some examples of how Cauchy’s Residue Theorem is used both theoretically and practically. This whole article will itself be another practical example of Cauchy’s Residue Theorem. This is not exhaustive by any means.

To start illustrating some of the surprising properties of this theorem, we can take the |n=1| case which states that we can evaluate
a holomorphic function at any point via |f(z_0) = \frac{1}{2 \pi i}\oint_{\Gamma} \frac{f(z)\mathrm dz}{z - z_0}| where |\Gamma|
is any contour which bounds an area containing |z_0|. This leads to an interesting discreteness. Not only can we evaluate
a (holomorphic) function (or any of its derivatives) at a point via the values of the function on a contour, the only significant
constraint on that contour is that it bound an area containing the desired point. In other words, no matter how we deform the contour the
integral is constant except when we deform the contour so as not to bound an area containing the point being evaluated, at which point
the integral’s value is |0|^{1}. It may seem odd to use an integral to evaluate
a function at a point, but it can be useful when there are numerical issues with evaluating the function near the desired point^{2}. In fact, these results show that if we know the values of a holomorphic function on the boundary of a given open subset of
the complex plane, then we know the value of the function *everywhere*. In this sense, holomorphic functions (and analytic
functions in general) are extremely rigid.

This leads to the notion of analytic continuation
where we try to compute an analytic function beyond its overt domain of definition. This is the basis of most “sums of divergent series”.
For example, there is the first-year calculus fact that the sum of the infinite series |\sum_{n=0}^\infty x^n| is |1/(1-x)| converging
on the interval |x \in (-1,1)|. In fact, the proof of convergence only needs |\|x\| < 1| so we can readily generalize to
complex |z| with |\|z\| < 1|, i.e. |z| contained in the open unit disk. However, |1/(1-z)| is a meromorphic function that is holomorphic
everywhere except for |z=1|, therefore there is a *unique* analytic function defined everywhere except |z=1| that agrees with
the infinite sum on the unit disk, namely |1/(1-z)| itself. Choosing |z=2| leads to the common example of “summing a divergent series”
with “|\sum_{n=0}^\infty 2^n = -1|” which really means “the value at |2| of the unique complex analytic function which agrees
with this infinite series when it converges”.

Sticking with just evaluation, applying the Cauchy Residue theorem to quadrature, i.e. numerical integration, leads to an interesting
connection to a rational approximation problem. Say we want to compute |\int_{-1}^1 f(x) \mathrm dx|, we can use the Cauchy
integral to evaluate |f(x)| leading to $$\int_{-1}^1 f(x) \mathrm dx
= \int_{-1}^1 \frac{1}{2\pi i}\oint \frac{f(z)\mathrm dz}{z - x}\mathrm dx
= \frac{1}{2\pi i}\oint f(z)\int_{-1}^1 \frac{\mathrm dx}{z - x}\mathrm dz
= \frac{1}{2\pi i}\oint f(z)\log\left(\frac{z+1}{z-1}\right)\mathrm dz$$
A quadrature formula looks like |\int_{-1}^1 f(x) \mathrm dx \approx \sum_{k=1}^N w_k f(x_k)|. The sum
can be written as a Cauchy integral of |\oint f(z)\sum_{k=1}^N \frac{1}{2\pi i}\frac{w_k\mathrm dz}{z - x_k}|. We thus have
$$\left|\frac{1}{2\pi i}\oint f(z)\left[\log\left(\frac{z+1}{z-1}\right) - \sum_{k=1}^N \frac{w_k}{z - x_k}\right]\mathrm dz\right|$$ as the error of
the approximation. The sum is a rational function (in partial fraction form) and thus the error is minimized by points (|x_k|)
and weights (|w_k|) that lead to better rational approximations of |\log((z+1)/(z-1))|^{3}.

The ability to calculate coefficients of the Taylor series of a holomorphic function is, by itself, already a valuable tool in both theory and practice. In particular, the coefficients of a generating function or a Z-transform can be computed with Cauchy integrals. This has applications in probability theory, statistics, finance, combinatorics, recurrences, differential equations, and signal processing. Indeed, when |z_0 = 0| and |\Gamma| is the unit circle, then the Cauchy integral is a component of the Fourier series of |f|. Approximating these integrals with the Trapezoid Rule (which we’ll discuss in a bit) produces the Discrete Fourier Transform.

Let |p| be a polynomial and, for simplicity, assume all its zeroes are of multiplicity one. Then |1/p(z)| is a meromorphic function that’s holomorphic everywhere except for the roots of |p|. The Cauchy integral |\frac{1}{2\pi i}\oint_{\Gamma} \frac{p’(z)\mathrm dz}{p(z)}| counts the number of roots of |p| contained in the area bounded by |\Gamma|. If we know there is only one root of |p| within the area bounded by |\Gamma|, then we can compute that root with |\frac{1}{2\pi i}\oint_{\Gamma} \frac{z p’(z)\mathrm dz}{p(z)}|. A better approach is to use the formulas |\left(\oint \frac{z\mathrm dz}{p(z)}\right)/\left(\oint \frac{\mathrm dz}{p(z)}\right)|. Similar ideas can be used to adapt this to counting and finding multiple roots. See Numerical Algorithms based on Analytic Function Values at Roots of Unity by Austin, Kravanja, and Trefethen (2014) which is a good survey in general.

Another very common use of Cauchy’s Residue Theorem is to sum (convergent) infinite series. |\tan(\pi z)/\pi| has a zero at |z = k| for each integer |k| and a non-zero derivative at those points. In fact, the derivative is |1|. Alternatively, we could use |\sin(\pi z)/\pi| which has a zero at |z = k| for each integer |k| but has derivative |(-1)^k| at those points. Therefore, |\pi\cot(\pi z) = \pi/\tan(\pi z)| has a (first-order) pole at |z = k| for each integer |k| with residue |1|. In particular, if |f| is a holomorphic function (at least near the real axis), then the value of the Cauchy integral of |f(z)\pi\cot(\pi z)| along a Hankel contour will be |2\pi i \sum_{k=0}^\infty f(k)|. Along an infinite strip around the real axis we’d get |2 \pi i \sum_{k=-\infty}^\infty f(k)|. As an example, we can consider the famous sum, |\sum_{k=1}^\infty 1/k^2|. It can be shown that if |f| is a meromorphic function whose poles are not located at integers and |\vert zf(z)\vert| is bounded for sufficiently large |\vert z\vert|, then |\oint f(z)\pi \cot(\pi z)\mathrm dz = 0|. We thus have that \[\sum_{k=-\infty}^{\infty} f(k) = -\sum_j \mathrm{Res}(f(z)\pi\cot(\pi z); z_j)\] where |z_j| are the poles of |f|. In particular, |f(z) = \frac{1}{z^2 + a^2}| has (first-order) poles at |\pm ai|. This gives us simply $$\sum_{k=-\infty}^{\infty} \frac{1}{k^2 + a^2} = -\pi\frac{\cot(\pi a i)-\cot(-\pi a i)}{2ai} = \frac{\pi}{a}\coth(\pi a)$$ where I’ve used |\coth(x) = i\cot(xi)| and the fact that |\coth| is an odd function. Exploiting the symmetry of the sum gives us $$\sum_{k=1}^{\infty} \frac{1}{k^2 + a^2} = \frac{\pi}{2a}\coth(\pi a) - \frac{1}{2a^2}$$ By expanding |\coth| in a Laurent series, we see that the limit of the right-hand side as |a| approaches |0| is |\frac{\pi^2}{6}|. While contour integration is quite effective for coming up with analytic solutions to infinite sums, numerically integrating the contour integrals is also highly effective as illustrated in Talbot quadratures and rational approximations by Trefethen, Weideman, and Schmelzer (2006), for example.

We’ve seen in the previous section that |f’(z_0) = \frac{1}{2\pi i}\oint_{\Gamma} \frac{f(z)\mathrm dz}{(z-z_0)^2}|. This doesn’t much help us if we don’t have a way to compute the integrals. From this point forward, fix |\Gamma| as a circle of radius |h| centered on |z_0|.

Before that, let’s consider numerical integration in general. Say we want to integrate the real function |f| from |0| to |b|,
i.e. we want to calculate |\int_0^b f(x)\mathrm dx|. The most obvious way to go about it is to approximate the Riemann
sums that define the (Riemann) integral. This would produce a formula like
|\int_0^b f(x)\mathrm dx \approx \frac{b}{N}\sum_{k=0}^{N-1} f(bk/N)| corresponding to summing the areas of rectangles
whose left points are the values of |f|. As before with central differences, relatively minor tweaks will give better approximations. In particular,
we get the two roughly equivalent approximations of the **Midpoint Rule**
\[\int_0^b f(x)\mathrm dx \approx \frac{b}{N}\sum_{k=0}^{N-1} f\left(\frac{b(k+(1/2))}{N}\right)\] where we take the midpoint rather than the
left or right point, and the **Trapezoid Rule**
\[\int_0^b f(x)\mathrm dx \approx \frac{b}{2N}\sum_{k=0}^{N-1}[f(b(k+1)/N) + f(bk/N)]\] where we average the left and right Riemann
sums. While both of these perform substantially better than the left/right Riemann sums, they are still rather basic
quadrature rules; the error decreases as |O(1/N^2)|.

Something special happens when |f| is a periodic function. First, the Trapezoid rule reduces to |\frac{b}{N}\sum_{k=0}^{N-1} f(bk/N)|. More importantly, the Midpoint rule and the Trapezoid rule both start converging geometrically rather than quadratically. Furthermore, for the particular case we’re interested in, namely integrating analytic functions along a circle in the complex plane, these quadrature rules are optimal. Let |\zeta| be the |2N|-th root of unity. The Trapezoid rule corresponds to sum the values of |f| at the even powers of |\zeta| scaled by the radius |h| and translated by |z_0|, and the Midpoint rule corresponds to the sum of the odd powers.

We now have two parameters for approximating a Cauchy integral via the Trapezoid or Midpoint rules: the radius |h| and the number of points |N|.

Complex-Step Differentiation corresponds to approximating the Cauchy integral for the derivative using the extreme case of the Midpoint rule with |N=2| and very small radii (i.e. values of |h|). Meanwhile, Central Differences corresponds to the extreme case of using the Trapezoid rule with |N=2| and very small radii. To spell this out a bit more, we perform the substitution |z - z_0 = he^{\theta i}| which leads to |\mathrm dz = hie^{\theta i}\mathrm d\theta| and $$\frac{1}{2\pi i}\oint_{|z - z_0| = h} \frac{f(z)\mathrm dz}{(z - z_0)^2} = \frac{1}{2 \pi h}\int_0^{2\pi} f(z_0 + he^{\theta i})e^{-\theta i}\mathrm d\theta$$

Applying the Trapezoid rule to the right hand side of this corresponds to picking |\theta = 0, \pi|, while applying the Midpoint rule corresponds to picking |\theta = \pm \pi/2|. |e^{\theta i} = \pm 1| for |\theta = 0, \pi|, and |e^{\theta i} = \pm i| for |\theta = \pm \pi/2|. For the Trapezoid rule, this leads to \[f’(z_0) \approx \frac{1}{2h}[f(z_0 + h) - f(z_0 - h)]\] which is Central Differences. For the Midpoint rule, this leads to \[f’(z_0) \approx \frac{1}{2hi}[f(z_0 + hi) - f(z_0 - hi)]\] This is Complex-Step Differentiation when |z_0| is real.

As just calculated, **Complex-Step Differentiation** computes the derivative at the *real* number |x_0| via the formula:
$$f'(x_0) \approx \frac{1}{2hi} [f(x_0 + hi) - f(x_0 - hi)]$$ Another perspective on this formula is that it is just the
Central Differences formula along the imaginary axis instead of the real axis.

When |f| is complex analytic and real-valued on real arguments, then we have |f(\overline z) = \overline{f(z)}| where |\overline z| is the complex conjugate of |z|, i.e. it maps |a + bi| to |a - bi| or |re^{\theta i}| to |re^{-\theta i}|. This leads to |f(x_0 + hi) - f(\overline{x_0 + hi}) = f(x_0 + hi) - \overline{f(x_0 + hi)} = 2i\operatorname{Im}(f(x_0 + hi))|. This lets us simplify Complex-Step Differentiation to |f’(x_0) \approx \operatorname{Im}(f(x_0 + hi))/h|.

Here is the earlier interactive example but now using Complex-Step Differentiation. As |h| decreases in magnitude, the error steadily decreases until there is no error at all.

|h|:

|f’(||)|:

error:

This formula using |\operatorname{Im}| avoids catastrophic cancellation simply by not doing a subtraction. However, it turns out
for real |x_0| (which is necessary to derive the simplified formula), there isn’t a problem either way. Using the first form of
the Complex-Step Differentiation formula is also numerically stable. The key here is that the imaginary part of |x_0| and |f(x_0)| are
both |0| and so we don’t get catastrophic cancellation for the same reason we wouldn’t get it with Central Differences if |f(x_0) = 0|.
This suggests that if we wanted to evaluate |f’| at some non-zero point on the imaginary axis, Complex-Step Differentiation would
perform poorly while Central Differences would perform well. Further, if we wanted to evaluate |f’| at some point not on either
the real or imaginary axes, neither approach would perform well. In this case, choosing different values for |N| and the radius
would be necessary^{4}.

A third perspective on Complex-Step Differentiation comes when we think about which value of |h| should we use. The smaller |f’(x_0)|
is, the smaller we’d want |h| to be. Unlike Central Differences, there is little stopping us from having |h| be *very* small and
values like |h=10^{-100}| are typical. In fact, around |h=10^{-155}| in double precision floating point arithmetic, |h| gets the
theoretically useful property that |h^2 = 0| due to underflow. In this case, |x_0 + hi| behaves like |x_0 + \varepsilon| where
|\varepsilon^2 = 0|. This is the defining property of the ring of dual numbers.
Dual numbers are exactly what are used in forward-mode automatic differentiation.

The ring of dual numbers has numbers of the form |a + b\varepsilon| where |a, b \in \mathbb R|. This behaves just like the complex numbers except that instead of |i^2 = -1| we have |\varepsilon^2 = 0|. The utility of dual numbers for our purposes can be seen by expanding |f(x_0 + \varepsilon)| in a Taylor series about |x_0|. We get |f(x_0 + \varepsilon) = f(x_0) + f’(x_0)\varepsilon|. All higher power terms of the Taylor series are zero because |\varepsilon^2 = 0|. We can thus get the derivative of |f| simply by computing |f(x + \varepsilon)| and then looking at the coefficient of |\varepsilon| in the result.

In this example there is no interactivity as we are not estimating the derivative in the AD case but instead calculating it in parallel. There is no |h| parameter.

|f’(||)|:

error:

As the end of the previous section indicated, Complex-Step Differentiation approximates this (often exactly) by using |hi| as |\varepsilon|. Nevertheless, this is not ideal. Often the complex versions of a function will be more costly than their dual number counterparts. For example, |(a + bi)(c + di) = (ac - bd) + (ad + bc)i| involves four real multiplications and two additions. |(a + b\varepsilon)(c + d\varepsilon) = ac + (ad + bc)\varepsilon| involves three real multiplications and one addition on the other hand.

Using Complex Variables to Estimate Derivatives of Real Functions by Squire and Trapp (1998)
is the first(?) published paper *specifically* about the idea of complex-step differentiation. It’s a three page paper and the authors
are not claiming any originality but just demonstrating the effectiveness of ideas from the ’60s that the authors found to be underappreciated.

The Complex-Step Derivative Approximation by Martins, Sturdza, and Alonso (2003) does a much deeper dive into the theory behind complex-step differentiation and its connections to automatic differentiation.

You may have noticed the name “Trefethen” in many of the papers cited. Nick Trefethen and his collaborators have been doing amazing work for the past couple of decades, most notably in the Chebfun project. Looking at Trefethen’s book Approximation Theory and Approximation Practice (and lectures) recently reintroduced me to Trefethen’s work. This particular article was prompted by a footnote in the paper The Exponentially Convergent Trapezoidal Rule which I highly recommend. In fact, I highly recommend Chebfun as well as nearly all of Trefethen’s work. It is routinely compelling, interesting, and well presented.

Using the language of Geometric Calculus, we can write a very general form of the Fundamental Theorem of Calculus. Namely, \[\int_{\mathcal M} \mathrm d^m\mathbf x \cdot \nabla f(\mathbf x) = \oint_{\partial \mathcal M}\mathrm d^{m-1}\mathbf x f(\mathbf x)\] where |\mathcal M| is an |m|-dimensional manifold. Here |f| is a multivector-valued vector function. If |m=2| and |\nabla f = 0|, then this would produce a formula very similar to the Cauchy integral formula.

Writing |f(x + yi) = u(x, y) + v(x, y)i|, the Cauchy-Riemann equations are |\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}| and |\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}|. However, |\nabla f = 0| leads to the slightly different equations |\frac{\partial u}{\partial x} = -\frac{\partial v}{\partial y}| and |\frac{\partial u}{\partial y} = \frac{\partial v}{\partial x}|.

The resolution of this discrepancy is found by recognizing that we don’t want |f| to be a vector-valued vector function but rather a spinor-valued spinor function. It is most natural to identify complex numbers with the even subalgebra of the 2D geometric algebra. If |\mathbf e_1| and |\mathbf e_2| are the two orthonormal basis vectors of the 2D space, then the pseudoscalar |I \equiv \mathbf e_1\wedge \mathbf e_2 = \mathbf e_1 \mathbf e_2| satisfies |I^2 = -1|. For the 2D case, a spinor is a multivector of the form |a + bI|.

We can generalize the vector derivative, |\nabla|, to a multivector derivative |\nabla_X| where |X| is a multivector variable
by using the generic formula for the directional derivative in a linear space and then defining |\nabla_X| to be a linear
combination of directional derivatives. Given any |\mathbb R|-linear space |V| and an element |v \in V|, we can
define the directional derivative of |f : V \to V| in the direction |v| via
|\frac{\partial f}{\partial v}(x) \equiv \frac{\mathrm d f(x + \tau v)}{\mathrm d\tau}|. In our case,
we have the basis vectors |\{1, \mathbf e_1, \mathbf e_2, I\}| though we only care about the even subalgebra
corresponding to the basis vectors |\{1, I\}|. Define |\partial_1 f(x) \equiv \frac{\mathrm d f(x + \tau)}{\mathrm d \tau}|
and |\partial_I f(x) \equiv \frac{\mathrm d f(x + \tau I)}{\mathrm d \tau}| assuming |f| is a spinor-valued function^{5}.
We can then define |\nabla_{\mathbf z} \equiv \partial_1 + I\partial_I|. We now have |\nabla_{\mathbf z} f = 0| is
the equivalent to the Cauchy-Riemann equations where |f| is now a spinor-valued spinor function, i.e. a function of |\mathbf z|.

In terms of the general theory of partial differential equations, we are saying that |z^{-1}| is a Green’s function for |\nabla|. We can then understand everything that is happening here in terms of general results. In particular, it is the two-dimensional case of the results described in Multivector Functions by Hestenes.↩︎

See Numerical Algorithms based on Analytic Function Values at Roots of Unity by Austin, Kravanja, and Trefethen (2014) for an example. Also, with some minor tweaks, we can have that “point” be a matrix and these integrals can be used to calculate functions of matrices, e.g. the square root, exponent, inverse, and log of a matrix. See Computing |A^\alpha|, |\log(A)|, and Related Matrix Functions by Contour Integrals by Hale, Higham, and Trefethen (2009) for details.↩︎

See Is Gauss Quadrature Better than Clenshaw-Curtis? by Trefethen (2008) for more details.↩︎

While focused on issues that mostly come up with very high-order derivatives, e.g. |100|-th derivatives and higher, Accuracy and Stability of Computing High-Order Derivatives of Analytic Functions by Cauchy Integrals by Bornemann (2009) nevertheless has a good discussions of the concerns here.↩︎

If we allowed arbitrary multivector-valued functions, then we’d need to add a projection producing the tangential derivative.↩︎