<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://www.kuniga.me/feed.xml" rel="self" type="application/atom+xml" /><link href="https://www.kuniga.me/" rel="alternate" type="text/html" /><updated>2026-05-10T20:12:37+00:00</updated><id>https://www.kuniga.me/feed.xml</id><title type="html">NP-Incompleteness</title><subtitle>Kunigami&apos;s Technical Blog</subtitle><author><name>Guilherme Kunigami</name></author><entry><title type="html">Sobolev Spaces</title><link href="https://www.kuniga.me/blog/2026/05/02/sobolev-spaces.html" rel="alternate" type="text/html" title="Sobolev Spaces" /><published>2026-05-02T00:00:00+00:00</published><updated>2026-05-02T00:00:00+00:00</updated><id>https://www.kuniga.me/blog/2026/05/02/sobolev-spaces</id><content type="html" xml:base="https://www.kuniga.me/blog/2026/05/02/sobolev-spaces.html"><![CDATA[<!-- This needs to be define as included html because variables are not inherited by Jekyll pages -->

<figure class="image_float_left">
  <img src="https://www.kuniga.me/resources/blog/2026-05-02-sobolev-spaces/sobolev.png" alt="Thumbnail of Sergei Lvovich Sobolev" />
</figure>

<p>Continuing with my exploration of understanding physics from first math principles (see previous <a href="https://www.kuniga.me/blog/2026/04/26/functionals.html">post on functionals</a>), I wanted to learn more about Sobolev spaces.</p>

<p>These are a type of vector space named after the Soviet mathematician Sergei Lvovich Sobolev (1908-1989), featured on the thumbnail.</p>

<!--more-->

<h2 id="motivation">Motivation</h2>

<p>Usually when we discuss differentiability, we have functions that are either differentiable or not.</p>

<p>In the real world however, we often need to handle functions that are not strictly differentiable everywhere but are almost there. A simple example is the function $f(x) = \abs{x}$ which is differentiable everywhere except at $x = 0$.</p>

<p>There’s a more relaxed version of differentiability called <em>weak differentiability</em>, which we’ll cover soon. The idea is that a Sobolev space is a vector space of weakly differentiable functions.</p>

<p>Before we go there, let’s lay down some nomenclature.</p>

<h2 id="function-spaces">Function Spaces</h2>

<p>A vector space in which the elements are functions is generally called a <strong>function space</strong>. There’s a special notation for a select group of function spaces, which are associated with the properties of the functions that are part of it.</p>

<p>The set of continuous functions with domain $\Omega$ is denoted by $C(\Omega)$. We add a superscript $k$ indicating how many times these functions can be differentiated, so $C^1(\Omega)$ is the set of functions that have a continuous derivative. The set of smooth functions (infinitely differentiable) is denoted by $C^\infty(\Omega)$.</p>

<p>A special property we’ll need later is called <strong>compact support</strong>. This basically means that there exists some compact set $K \subset \Omega$ outside of which the function is 0. One example of such a function is:</p>

\[\begin{equation}
  \varphi(x)=\left\{
  \begin{array}{@{}ll@{}}
    \exp(-\frac{1}{1 - x^2}) &amp; \text{if } \abs{x} \lt 1 \\
    0, &amp; \text{otherwise}
  \end{array}\right.
\end{equation}\]

<p>These functions have the subscript $_c$, e.g. $C^\infty_c(\Omega)$.</p>

<p>The set of functions whose $p$-th power is (Lebesgue) integrable is denoted by $L^p(\Omega)$. More formally:</p>

\[L^p(\Omega) = \curly{ f: \Omega \rightarrow \mathbb{F} \mid \int_{\Omega} \abs{f(x)}^p dx \lt \infty}\]

<p>A more relaxed version is that the function only needs to be integrable on compact subsets of the domain. This is denoted by $L_{LOC}$:</p>

\[L_{LOC}^p(\Omega) = \curly{ f: \Omega \rightarrow \mathbb{F} \mid \int_{K} \abs{f(x)}^p dx \lt \infty, \quad \forall \mbox{ compact set } K \subset \Omega}\]

<p>Note that $L^p(\Omega) \subseteq L_{LOC}^p(\Omega)$. One example showing that the inclusion can be strict is $f(x) = 1$ for $\Omega = \mathbb{R}$. In this case every compact set is contained in some bounded closed interval $[a, b]$, so it suffices to check those. We have:</p>

\[\int_{a}^{b} f(x) dx = b - a \lt \infty\]

<p>So $f \in L_{LOC}^p(\Omega)$. But over the whole domain $\Omega = \mathbb{R}$ the integral diverges, so $f \not \in L^p(\Omega)$.</p>

<p>We also have $W^{k, p}$ which is a vector space of the functions in $L^p$ such that all their <em>weak</em> derivatives up to order $k$ are also in $L^p$:</p>

\[W^{k, p}(\Omega) = \curly{ f \in L^p : f\mbox{'s } n\mbox{-th weak derivative} \in L^p \mbox{ for } n \le k}\]

<p>We haven’t defined weak derivatives yet though, but we’ll do so shortly. For now it suffices to say this vector space is what we call a <strong>Sobolev space</strong>.</p>

<h2 id="weak-derivatives">Weak Derivatives</h2>

<p>Let’s start with the formal definition. Let $\Omega$ be an open subset of $\mathbb{R}^n$ and $u \in L_{LOC}^1(\Omega)$. We say that $v \in L_{LOC}^1(\Omega)$ is <strong>a weak derivative of $u$ in the $i$-th direction</strong> (denoted as $v = \partial_i u$) if:</p>

\[(1) \quad \int_{\Omega} u(x) \partial_i \varphi(x) dx = -\int_{\Omega} v(x) \varphi(x) dx, \quad \forall \varphi \in C^{\infty}_c(\Omega)\]

<p>Let’s explain what each term means. The integral is over the set $\Omega \subseteq \mathbb{R}^n$, with $dx$ being the infinitesimal box in $\mathbb{R}^n$. The function $\varphi(x)$ is called a <em>test function</em>. Because it has compact support, it’s only non-zero inside a compact set, so we can interpret it as a bitmask in programming or a microscope that “focuses” on a region of a function $f$ when we multiply it by $\varphi$.</p>

<p>The notation $\partial_i$ is the partial derivative on the dimension $i$ (i.e. with respect to $x_i$) and is a shorthand for $\partial_i f = \partial f / \partial x_i$. We get a more compact notation if we omit the function argument:</p>

\[\int_{\Omega} u \partial_i \varphi dx = -\int_{\Omega} v \varphi dx\]

<h3 id="intuition">Intuition</h3>

<p>I still don’t have a good intuition behind why weak derivatives are useful. The least unsatisfying one is this: pointwise derivatives are too strict. The weak derivative averages out $u$ locally (in the compact support of $\varphi$) by multiplying it with $\partial_i \varphi$. Since $\varphi$ is smooth, it helps get rid of the kinks in $u$.</p>

<p>This reminds me of the use of convolution in image processing to smooth out pixels by taking the average of the surroundings.</p>

<p>The use of $\partial_i \varphi$ instead of just $\varphi$ is so that we get what would correspond to $\partial_i u$ on the other side of the equality.</p>

<h3 id="example">Example</h3>

<p>To get an idea on how to use this formulation, consider the function $f(x) = \abs{x}$ with $x \in \mathbb{R}$, for which we want to find the derivative. It’s not differentiable at $x = 0$ because of the “edge”.</p>

<p>For the rest of the domain we have $f'(x) = 1$ for $x \gt 0$ and $f'(x) = -1$ for $x \lt 0$. This is the function we’ll tentatively take as $v$. We just need to show it satisfies $(1)$. Note we didn’t specify what $f'(0)$ is, but it can be anything. We’ll call this function $\mbox{sgn}(x)$.</p>

<p><strong>Lemma 1.</strong> $\mbox{sgn}(x)$ is the weak derivative of $\abs{x}$</p>

<proof>
We need to show that, for every test function $\varphi \in C^{\infty}_c(\mathbb{R})$:

$$
\int_{-\infty}^{\infty} \abs{x} \varphi'(x) dx = -\int_{-\infty}^{\infty} \mbox{sgn}(x) \varphi(x) dx
$$

We can work on the left side first by splitting the integral into two:

$$
\int_{-\infty}^{\infty} \abs{x} \varphi'(x) dx = \int_{-\infty}^{0} -x \varphi'(x) dx + \int_{0}^{\infty} x \varphi'(x) dx
$$

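Integration by parts gives us the following (the boundary terms vanish because $x \varphi(x)$ is $0$ at $x = 0$ and $\varphi$ has compact support):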

$$
= \int_{-\infty}^{0} \varphi(x) dx - \int_{0}^{\infty} \varphi(x) dx
$$

Replacing $\mbox{sgn}(x) = -1$ in the first and $\mbox{sgn}(x) = 1$ in the second:

$$
= - \int_{-\infty}^{0} \mbox{sgn}(x) \varphi(x) dx - \int_{0}^{\infty} \mbox{sgn}(x) \varphi(x) dx = - \int_{-\infty}^{\infty} \mbox{sgn}(x) \varphi(x) dx
$$

So $\mbox{sgn}$ is indeed a weak derivative of $\abs{x}$.
</proof>

<p>So in practice we do not need to list all the test functions. We just use the specific property that they have compact support but otherwise don’t make any assumptions about them.</p>

<h3 id="derivation">Derivation</h3>

<p>Let’s now understand where this formula comes from. Suppose for now that $u$ has a derivative. Since $\varphi$ is a smooth function, it also has a derivative. Suppose we multiply them and want to take the derivative on dimension $i$. We can use the product rule:</p>

\[\partial_i (u \varphi) = (\partial_i u) \varphi + u (\partial_i \varphi)\]

<p>Here we’re omitting the $(x)$ parameter for simplicity. Now integrate over the entire domain $\Omega$:</p>

\[\int_\Omega \partial_i (u \varphi) dx = \int_\Omega (\partial_i u) \varphi dx + \int_\Omega u (\partial_i \varphi) dx\]

<p>We can think of $\partial_i$ as a gradient if we multiply by $e_i$ (the basis vector):</p>

\[\partial_i f = \nabla f \cdot e_i\]

<p>or in our case</p>

\[\partial_i (u \varphi) = \nabla (u \varphi) \cdot e_i\]

<p>so</p>

\[\int_\Omega \partial_i (u \varphi) dx = \int_\Omega  \nabla (u \varphi) \cdot e_i dx = \left(\int_\Omega  \nabla (u \varphi) dx \right) \cdot e_i\]

<p>By the <a href="https://www.kuniga.me/docs/math/integral.html">divergence theorem</a>, we have that</p>

\[\int_\Omega \nabla (u \varphi) dx = \int_{\partial \Omega} (u \varphi) \mathbf{n} \, dS\]

<p>But because $\varphi$ has compact support in $\Omega$, it is $0$ outside that compact set. This includes the boundary $\partial \Omega$, so the integrand on the right-hand side is $0$ and thus:</p>

\[\int_\Omega \nabla (u \varphi) dx = 0\]

<p>and:</p>

\[\int_\Omega \partial_i (u \varphi) dx = 0\]

<p>and finally:</p>

\[\int_\Omega u (\partial_i \varphi) dx = -\int_\Omega (\partial_i u) \varphi dx\]

<p>which has exactly the same form as $(1)$ with $v = \partial_i u$. The only catch is that we assumed that $u$ is differentiable. So instead of requiring that $u$ is differentiable, we just require a function $v$ to exist that satisfies $(1)$. For a differentiable function, $v$ coincides with $\partial_i u$.</p>

<p>The idea being “If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck”. We’ll look at this from a different angle next.</p>

<h3 id="representation-an-analogy">Representation, an analogy</h3>

<p>Recall that in the post about functionals we covered the <em>Riesz Representation Theorem</em> which basically says that if $H$ is a Hilbert space then for any continuous linear functional $f$ on $H$ there exists some element $y \in H$ such that $f$ can be <em>represented</em> as the inner product $f(x) = \langle x, y\rangle$ for all $x$.</p>

<p>We have an analogous case for weak derivatives. For a fixed function $u$, define the functional:</p>

\[L(\varphi) = \int_\Omega u \partial_i \varphi dx\]

<p>We can show this functional is linear. So an analogous result to Riesz’s would be that there exists a function $v$ such that $L(\varphi)$ can be represented as:</p>

\[L(\varphi) = -\int_\Omega v \varphi dx\]

<p>for all $\varphi$ (the minus sign is there to match $(1)$). The difference is that Riesz’s theorem guarantees such a $y$ exists, but here $v$ only exists if $u$ has a <em>weak derivative</em>. Also, for Hilbert spaces $y$ belongs to the same space as the argument $x$ of $f$, while here $v$ does not necessarily belong to the same space as the argument $\varphi$ of $L$.</p>
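
<p>To see that the existence of $v$ is not guaranteed, here is a worked example (added for illustration; the step function $H$ is a name introduced just for this sketch). Take $\Omega = \mathbb{R}$ and let $H(x) = 1$ for $x \gt 0$ and $H(x) = 0$ otherwise. For any test function $\varphi$:</p>

\[\int_{-\infty}^{\infty} H(x) \varphi'(x) dx = \int_{0}^{\infty} \varphi'(x) dx = -\varphi(0)\]

<p>By $(1)$, a weak derivative $v$ of $H$ would have to satisfy:</p>

\[\int_{-\infty}^{\infty} v(x) \varphi(x) dx = \varphi(0), \quad \forall \varphi \in C^{\infty}_c(\mathbb{R})\]

<p>Picking test functions whose support does not contain $0$ forces $v$ to be $0$ almost everywhere, but then the left side is always $0$, which fails for any $\varphi$ with $\varphi(0) \ne 0$. So the functional is well defined and linear, yet no function in $L_{LOC}^1$ represents it.</p>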

<p>Ok, so now we know where the equation comes from and that it’s tied to integration by parts and test functions, but why choose this specific property among all possible properties? We’ll now cover examples to help us see why it’s useful.</p>

<h3 id="properties">Properties</h3>

<p>In the example, we saw that $f = \mbox{sgn}$ is a weak derivative of $\abs{x}$ but that we can set any value for $f(0)$. At first glance this suggests that weak derivatives are not unique like derivatives are. But there’s a slightly weaker result that shows they’re the same almost everywhere (more formally, everywhere except on a set of measure zero), as shown in <em>Lemma 2</em>:</p>

<p><strong>Lemma 2.</strong> Let $v_1$ and $v_2$ be weak derivatives of $u$ in the $i$-th direction. Then $v_1 = v_2$ except on a set of measure $0$.</p>

<proof>
Since both $v_1$ and $v_2$ satisfy $(1)$ for the same $u$, we have that:

$$
\int_\Omega v_1 \varphi dx = \int_\Omega v_2 \varphi dx
$$

so

$$
\int_\Omega (v_1 - v_2) \varphi dx = 0
$$

Let $g = v_1 - v_2$. We want to prove that

$$
\int_\Omega g \varphi dx = 0, \quad \forall \varphi \in C^\infty_c(\Omega)
$$

implies $g = 0$ almost everywhere. Let's prove by contradiction. Suppose that for some $\epsilon \gt 0$ the set

$$
E = \curly{ x \in \Omega: g(x) \gt \epsilon}
$$

has positive measure. It's possible to show that there exists a compact set $K \subset E$ with $\abs{K} \gt 0$ (this follows from the inner regularity of the Lebesgue measure, a result from measure theory, which I haven't studied). Suppose we can pick a test function $\varphi \ge 0$ with $\varphi(x) = 1$ on $K$ and $\varphi(x) = 0$ outside $E$ (in general we can only approximate such a function by test functions, but the idea is the same), so

$$
\int_\Omega g \varphi dx \gt \epsilon \int_\Omega \varphi dx \ge \epsilon \int_K 1 dx = \epsilon \abs{K} \gt 0
$$

which is a contradiction. The same argument applies to the set where $g \lt -\epsilon$.
</proof>

<p>To say weak derivatives are <em>stable under limits</em> means that if we have a sequence of functions that converges to $u$ and their corresponding weak derivatives converge to $g$, then $g$ is a weak derivative of $u$, as shown in <em>Lemma 3</em>:</p>

<p><strong>Lemma 3.</strong> Let $u_n$ be a sequence of functions in $L^2$ with weak derivatives ($\partial_i u_n$) also in $L^2$, converging in $L^2$ as follows:</p>

\[\lim_{n \rightarrow \infty} u_n = u \\
\lim_{n \rightarrow \infty} \partial_i u_n = g\]

<p>Then $\partial_i u = g$ (weakly).</p>

<proof>
Since $\partial_i u_n$ is the weak derivative of $u_n$ we have:

$$
\int_\Omega u_n \partial_i \varphi = - \int_\Omega (\partial_i u_n) \varphi
$$

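We have that $\lim_{n \rightarrow \infty} u_n - u = 0$ in $L^2$, and $\partial_i \varphi$ is also in $L^2$ (it is smooth with compact support, hence bounded and zero outside a compact set). By the Cauchy-Schwarz inequality, $\abs{\int_\Omega (u_n - u) \partial_i \varphi} \le \norm{u_n - u}_{L^2} \norm{\partial_i \varphi}_{L^2}$, so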

$$
\lim_{n \rightarrow \infty} \int_\Omega (u_n - u) \partial_i \varphi = 0
$$

thus:

$$
\lim_{n \rightarrow \infty} \int_\Omega u_n \partial_i \varphi = \int_\Omega u \partial_i \varphi
$$

a similar line of argument gives us:

$$
\lim_{n \rightarrow \infty} \int_\Omega \partial_i u_n \varphi = \int_\Omega g \varphi
$$

so, taking the limit on both sides of the first equation:

$$
\int_\Omega u \partial_i \varphi = - \int_\Omega g \varphi
$$

which is $(1)$ with $v = g$.
</proof>

<p>This property does not hold for ordinary derivatives because a sequence of differentiable functions does not necessarily converge to a differentiable one. An example is:</p>

\[u_n(x) = \sqrt{x^2 + \frac{1}{n}}\]

<p>which is differentiable, but it converges to $u(x) = \abs{x}$ which is not.</p>
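
<p>As a sanity check of <em>Lemma 3</em> on this example (a worked computation added for illustration, looking at pointwise rather than $L^2$ convergence), the ordinary derivative of $u_n$ is:</p>

\[u_n'(x) = \frac{x}{\sqrt{x^2 + \frac{1}{n}}}\]

<p>For any fixed $x \ne 0$ the term $\frac{1}{n}$ vanishes as $n \rightarrow \infty$, so:</p>

\[\lim_{n \rightarrow \infty} u_n'(x) = \frac{x}{\abs{x}} = \mbox{sgn}(x)\]

<p>which is the weak derivative of $\abs{x}$ from <em>Lemma 1</em>. The functions themselves get uniformly close to $\abs{x}$:</p>

\[0 \le u_n(x) - \abs{x} = \frac{\frac{1}{n}}{\sqrt{x^2 + \frac{1}{n}} + \abs{x}} \le \frac{1}{\sqrt{n}}\]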

<h3 id="higher-order">Higher-order</h3>

<p>So far we’ve only defined the (partial) weak derivative of first order in a single dimension. We can however extend this to multiple dimensions and higher orders. We define a multi-index as:</p>

\[\alpha = (\alpha_1, \alpha_2, \cdots, \alpha_n), \quad \abs{\alpha} = \sum_{i=1}^n \alpha_i\]

<p>and the derivative $D^\alpha$ as:</p>

\[(2) \quad D^\alpha f = \frac{\partial^{\abs{\alpha}} f}{\partial {x_1}^{\alpha_1}\partial {x_2}^{\alpha_2} \cdots \partial {x_n}^{\alpha_n}}\]

<p>the generalization of weak derivatives is then:</p>

\[\int_\Omega D^\alpha f \varphi = (-1)^{\abs{\alpha}} \int_\Omega f D^\alpha \varphi , \quad \forall \varphi\]

<proof>
This formula can be obtained by repeated application of integration by parts. Suppose we know $u$ has weak derivatives for dimensions $i$ and $j$ and we want to determine

$$
\int_\Omega u \partial_i \partial_j \varphi dx
$$

Since $\varphi$ is infinitely differentiable, $\partial_j \varphi$ is a valid test function so we can apply $(1)$ to obtain:

$$
\int_\Omega u \partial_i (\partial_j \varphi) dx = -\int_{\Omega} \partial_i u (\partial_j \varphi) dx
$$

now we need to assume the function $\partial_i u$ also has a weak derivative at $j$ so we can do:

$$
\int_\Omega \partial_i u (\partial_j \varphi) dx = -\int_{\Omega} (\partial_j \partial_i u) \varphi dx
$$

putting it all together:

$$
\int_\Omega u \partial_i \partial_j \varphi dx = (-1)^2 \int_{\Omega} (\partial_j \partial_i u) \varphi dx
$$

Note that the order of partials got reversed ($i, j \rightarrow j, i$). However, partial derivatives of a smooth function such as $\varphi$ commute, so we can write:

$$
\int_\Omega u \partial_i \partial_j \varphi dx = \int_\Omega u \partial_j \partial_i \varphi dx
$$

which will preserve the ordering. Note that each time we apply a partial derivative we multiply the result by $-1$, that's the reason for the factor $(-1)^{\abs{\alpha}}$.

</proof>
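
<p>As a concrete instance of the notation above (an illustrative choice of $\alpha$, not from the derivation itself), take $n = 2$ and $\alpha = (2, 1)$. Then $\abs{\alpha} = 3$ and:</p>

\[D^\alpha f = \frac{\partial^3 f}{\partial {x_1}^2 \partial x_2}\]

<p>Since $(-1)^{\abs{\alpha}} = (-1)^3 = -1$, the weak derivative $D^\alpha f$ is the function satisfying:</p>

\[\int_\Omega (D^\alpha f) \varphi = -\int_\Omega f \frac{\partial^3 \varphi}{\partial {x_1}^2 \partial x_2}, \quad \forall \varphi \in C^\infty_c(\Omega)\]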

<h2 id="sobolev-space">Sobolev Space</h2>

<p>Now that we’ve explored weak derivatives, it’s time to make it a proper vector space. The elements of this space are the weakly differentiable functions of order $k$ and in $L^p$. As we introduced in <em>Function Spaces</em> above:</p>

\[W^{k, p}(\Omega) = \curly{ f \in L^p : f\mbox{'s } m\mbox{-th weak derivative} \in L^p \mbox{ for } m \le k}\]

<p>When we say $m$-th derivative here we mean $D^\alpha$ for $\abs{\alpha} = m$. We can define the norm of a function in this space as:</p>

\[\norm{f} = \left(\sum_{\abs{\alpha} \le k} \norm{D^\alpha f}^p \right)^{1/p}\]

<p>Since norms are dependent on specific vector spaces we can include it to make it clearer:</p>

\[\norm{f}_{W^{k, p}} = \left(\sum_{\abs{\alpha} \le k} \norm{D^\alpha f}_{L^p}^p \right)^{1/p}\]

<p>So in English, the norm of a function in a Sobolev space is the “length” of the vector formed by the $L^p$ norms of all its partial weak derivatives of order up to $k$ (including the function itself, for $\alpha = 0$). For example, if $p = 2$, then it becomes the Euclidean norm of that vector.</p>
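
<p>As a small worked example (the choice of function and domain is just for illustration), take $f(x) = \abs{x}$ on $\Omega = (-1, 1)$ with $k = 1$ and $p = 2$. From <em>Lemma 1</em>, restricted to this interval, its weak derivative is $\mbox{sgn}(x)$, so:</p>

\[\norm{f}_{L^2}^2 = \int_{-1}^{1} x^2 dx = \frac{2}{3}, \qquad \norm{\mbox{sgn}}_{L^2}^2 = \int_{-1}^{1} 1 dx = 2\]

\[\norm{f}_{W^{1, 2}} = \left(\frac{2}{3} + 2\right)^{1/2} = \sqrt{\frac{8}{3}}\]

<p>So $\abs{x}$ belongs to $W^{1, 2}((-1, 1))$ even though it is not differentiable at $0$.</p>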

<p>With this norm, <em>Lemma 4</em> shows that Sobolev spaces are Banach spaces.</p>

<p><strong>Lemma 4.</strong> Sobolev spaces are Banach spaces.</p>

<proof>
To show this, we need to show that every <a href="https://www.kuniga.me/docs/math/sequence.html">Cauchy sequence</a> converges in this space. In other words, let $(f_j)$ be a Cauchy sequence of functions in $W^{k, p}$, that is, for all $\epsilon \gt 0$, there is $N$ such that for all $j, m \ge N$:

$$
\norm{f_j - f_m}_{W^{k,p}} \lt \epsilon
$$

we want to then show that there exists $f \in W^{k, p}$ such that

$$
\lim_{j \rightarrow \infty} \norm{f_j - f}_{W^{k,p}} = 0
$$

using the linearity of $D^\alpha$ we have that

$$
(4.1) \quad \norm{f_j - f_m}_{W^{k,p}} = \left(\sum_{\abs{\alpha} \le k} \norm{D^\alpha f_j - D^\alpha f_m}_{L^p}^p \right)^{1/p}
$$

for a fixed $\beta$ we have:

$$
\norm{D^\beta f_j - D^\beta f_m}_{L^p}^p \le \sum_{\abs{\alpha} \le k} \norm{D^\alpha f_j - D^\alpha f_m}_{L^p}^p
$$

and

$$
\norm{D^\beta f_j - D^\beta f_m}_{L^p} \le \left(\sum_{\abs{\alpha} \le k} \norm{D^\alpha f_j - D^\alpha f_m}_{L^p}^p\right)^{1/p}
$$

so we have

$$
\norm{D^\beta f_j - D^\beta f_m}_{L^p} \le \norm{f_j - f_m}_{W^{k,p}}
$$

So $(D^\beta f_j)$ is a Cauchy sequence in $L^p$ and, because $L^p$ is a Banach space, it converges to some limit $v_\beta$ in there. In particular, if $\beta = 0$, $f_j \rightarrow v_0$. Let's call it $f = v_0$. We have that $f \in L^p$, but to show it is in $W^{k, p}$ we need to show it has all the partial weak derivatives $D^\alpha f$ for $\abs{\alpha} \le k$ and that they are in $L^p$.
<br /><br />
By the definition of the higher order weak derivative we have

$$
\int_\Omega D^\alpha f_j \varphi = (-1)^{\abs{\alpha}} \int_\Omega f_j D^\alpha \varphi
$$

we know that $D^\alpha f_j \rightarrow v_\alpha$ and $f_j \rightarrow f$ in $L^p$, so we can take the limit of the expression above (as in <em>Lemma 3</em>) to obtain:

$$
\int_\Omega v_\alpha \varphi = (-1)^{\abs{\alpha}} \int_\Omega f D^\alpha \varphi
$$

which by definition means $v_\alpha$ is the weak derivative $D^\alpha f$. This means $D^\alpha f$ is in $L^p$ and thus $f \in W^{k,p}$. By $(4.1)$ we have:

$$
\norm{f_j - f_m}_{W^{k,p}}^p = \sum_{\abs{\alpha} \le k} \norm{D^\alpha f_j - D^\alpha f_m}_{L^p}^p
$$

since $f$ is in $W^{k,p}$ we can do:

$$
\norm{f_j - f}_{W^{k,p}}^p = \sum_{\abs{\alpha} \le k} \norm{D^\alpha f_j - D^\alpha f}_{L^p}^p
$$

taking $j \rightarrow \infty$, we know that $D^\alpha f_j \rightarrow D^\alpha f = v_\alpha$ since that's how we defined $v_\alpha$, so

$$
\lim_{j \rightarrow \infty} \norm{f_j - f}_{W^{k,p}}^p = \lim_{j \rightarrow \infty} \sum_{\abs{\alpha} \le k} \norm{D^\alpha f_j - D^\alpha f}_{L^p}^p = 0
$$

so every Cauchy sequence in $W^{k,p}$ converges.

</proof>

<p>For $W^{k, 2}$ we can define the inner product as:</p>

\[\langle u, v \rangle = \sum_{\abs{\alpha} \le k} \int_\Omega D^\alpha u(x) D^\alpha v(x) dx\]

<p>The norm induced via $\langle u, u \rangle$ is:</p>

\[\langle u, u \rangle = \norm{u}^2 = \sum_{\abs{\alpha} \le k} \int_\Omega (D^\alpha u(x))^2 dx\]

<p>Each integral is now the squared $L^2$ norm of $D^\alpha u$, so</p>

\[\norm{u} = \left(\sum_{\abs{\alpha} \le k} \norm{D^\alpha u}^2_{L^2} \right)^{1/2}\]

<p>which is consistent with the norm defined before. Together with the completeness from <em>Lemma 4</em>, this is enough to show that $W^{k, 2}$ is a Hilbert space. In this context it’s common to use the notation $H^k = W^{k,2}$.</p>

<h2 id="conclusion">Conclusion</h2>

<p>I spent a lot of time (with ChatGPT) trying to get an intuition for weak derivatives and am still not truly satisfied. I do get the idea that we relax the conditions for a derivative by defining it via one of its properties instead.</p>

<p>The same idea is used in vector spaces: Hilbert spaces are those with an inner product. Inner products can induce a norm, but norms can be defined independently of inner products. Thus, Banach spaces only require a norm, not an inner product, being thus more general than Hilbert spaces.</p>

<p>The same idea appears in topology e.g. defining things in terms of open sets instead of Euclidean distance. The part I don’t understand is why this property specifically. Why was it chosen over any other property? Maybe once I learn more about its applications in physics I’ll get a better sense.</p>

<p>I’ve seen Sobolev spaces mentioned on several occasions, especially when reading about physics, but I had no idea what they meant, except that they sound cool. I’m glad to finally understand them a little better.</p>

<h2 id="references">References</h2>

<ul>
  <li>[1] ChatGPT</li>
  <li>[<a href="https://www.kuniga.me/blog/2026/04/26/functionals.html">2</a>] NP-Incompleteness - Functionals</li>
</ul>]]></content><author><name>Guilherme Kunigami</name></author><category term="blog" /><category term="functional analysis" /><summary type="html"><![CDATA[Continuing with my exploration of understanding physics from first math principles (see previous post on functionals), I wanted to learn more about Sobolev spaces. These are a type of vector space named after the Soviet mathematician Sergei Lvovich Sobolev (1908-1989), featured on the thumbnail.]]></summary></entry><entry><title type="html">[Book] Stream Processing with Apache Flink</title><link href="https://www.kuniga.me/blog/2026/04/28/book-flink.html" rel="alternate" type="text/html" title="[Book] Stream Processing with Apache Flink" /><published>2026-04-28T00:00:00+00:00</published><updated>2026-04-28T00:00:00+00:00</updated><id>https://www.kuniga.me/blog/2026/04/28/book-flink</id><content type="html" xml:base="https://www.kuniga.me/blog/2026/04/28/book-flink.html"><![CDATA[<!-- This needs to be define as included html because variables are not inherited by Jekyll pages -->

<figure class="image_float_left">
  <img src="https://www.kuniga.me/resources//books/flink.jpg" alt="Book cover." />
</figure>

<p>In this post I’ll share my notes on the book <em>Stream Processing with Apache Flink</em> by Fabian Hueske and Vasiliki Kalavri.</p>

<p>This book covers many aspects of the popular open-source Apache Flink, a stream processing engine.</p>

<p><br /><br /></p>

<!--more-->

<h2 class="no_toc" id="book-summary">Book Summary</h2>

<p>I read the 1st edition of the book. It has 292 pages including the appendix and index, divided into 11 chapters. I skimmed most of the discussions about configuration since I’m mostly interested in Flink’s high-level architecture (I don’t use Flink day-to-day).</p>

<p><strong>Stream Processing.</strong> <em>Chapters 1</em> and <em>2</em> provide an introduction to general stream processing concepts. The contents overlap a bit with the book <a href="https://www.kuniga.me/blog/2022/07/26/review-streaming-systems.html">Streaming Systems</a> by Akidau et al.</p>

<p><strong>Architecture.</strong> <em>Chapter 3</em> covers the architecture of Flink at a high level, though the chapters introducing specific features go into more depth on some components.</p>

<p><strong>Features.</strong> <em>Chapters 5</em> to <em>8</em> cover the <strong>many</strong> different features Flink offers, including the Scala API.</p>

<p><strong>Operations.</strong> <em>Chapters 4, 9</em> and <em>10</em> are mostly for operations and configuration. I skipped most of the contents.</p>

<h2 id="table-of-contents">Table of Contents</h2>

<ol id="markdown-toc">
  <li><a href="#table-of-contents" id="markdown-toc-table-of-contents">Table of Contents</a></li>
  <li><a href="#overview" id="markdown-toc-overview">Overview</a></li>
  <li><a href="#architecture" id="markdown-toc-architecture">Architecture</a>    <ol>
      <li><a href="#components" id="markdown-toc-components">Components</a></li>
      <li><a href="#data-model" id="markdown-toc-data-model">Data Model</a></li>
      <li><a href="#data-transfer" id="markdown-toc-data-transfer">Data Transfer</a></li>
      <li><a href="#watermark" id="markdown-toc-watermark">Watermark</a></li>
      <li><a href="#state" id="markdown-toc-state">State</a></li>
      <li><a href="#checkpointing" id="markdown-toc-checkpointing">Checkpointing</a></li>
      <li><a href="#e2e-exactly-once-semantics" id="markdown-toc-e2e-exactly-once-semantics">E2E Exactly-Once Semantics</a></li>
    </ol>
  </li>
  <li><a href="#api" id="markdown-toc-api">API</a>    <ol>
      <li><a href="#stateless-transformations" id="markdown-toc-stateless-transformations">Stateless Transformations</a></li>
      <li><a href="#keyed-transformations" id="markdown-toc-keyed-transformations">Keyed Transformations</a></li>
      <li><a href="#windowed-transformations" id="markdown-toc-windowed-transformations">Windowed Transformations</a></li>
      <li><a href="#multistream-transformations" id="markdown-toc-multistream-transformations">Multistream Transformations</a></li>
      <li><a href="#join-transformations" id="markdown-toc-join-transformations">Join Transformations</a></li>
      <li><a href="#distribution-transformations" id="markdown-toc-distribution-transformations">Distribution Transformations</a></li>
      <li><a href="#late-event-handling" id="markdown-toc-late-event-handling">Late Event Handling</a></li>
      <li><a href="#operator-state" id="markdown-toc-operator-state">Operator State</a></li>
    </ol>
  </li>
  <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ol>

<h2 id="overview">Overview</h2>

<p><strong>Terminology.</strong> A <strong>job</strong> is the highest level unit an end user interfaces with. So for example when users write Scala code or a SQL query, it gets mapped to a single job.</p>

<p>A job gets compiled into an execution graph, a DAG of <strong>operators</strong>. Each operator is in theory executed by multiple <strong>tasks</strong> (hosts × threads), but in practice operators get fused so the same task ends up processing a chunk of the DAG (a sub-DAG).</p>

<h2 id="architecture">Architecture</h2>

<h3 id="components">Components</h3>

<p><strong>JobManager.</strong> This is a process that runs in one of the hosts/nodes of the cluster. For fault-tolerance, standby instances of it might run in other hosts. This component is responsible for many things, such as compiling a query into a DAG, deciding how to split it into tasks, deciding where tasks run, monitoring task health and rebalancing work across tasks.</p>

<p><strong>TaskManager.</strong> This is a process running on each host (it’s possible to have more than one task manager per host, but let’s assume it’s 1:1). Each task manager has a thread pool (task slots) that is able to handle work from different jobs, assigned by the job manager.</p>

<p>Tasks also need to send data to one another if they’re processing the DAG of the same job and they don’t live in the same host.</p>

<p><strong>ResourceManager.</strong> Like JobManager, it’s logically a single process. This one acts as a broker between JobManager asking for resources and TaskManagers providing them. It knows how many task slots each host has and can return that list of hosts to the JobManager.</p>

<p>It can also pass along requests by the JobManager for more task slots, by then requesting the underlying system (e.g. Kubernetes) to add more workers to the cluster.</p>

<h3 id="data-model">Data Model</h3>

<p>Operator</p>

<h3 id="data-transfer">Data Transfer</h3>

<p>Data can be transferred within the same thread, across threads and across hosts. The intra-thread case is when operators are fused and data is passed between operators. The inter-thread (intra-host) case is when an operator has to be split. The inter-host case is when tasks need to communicate with multiple other tasks, such as in shuffling operations.</p>

<h3 id="watermark">Watermark</h3>

<p>The watermark is a timestamp that should be interpreted as “no event with timestamp lower than this watermark will appear”. We studied watermarks via the paper <em>Watermarks in Stream Processing</em> in a previous <a href="https://www.kuniga.me/blog/2022/12/29/watermarks.html">post</a>. The watermark is just a heuristic: we cannot predict for sure whether all events up to that timestamp have been seen. Events that arrive with a timestamp lower than the watermark are called <em>late events</em>.</p>

<p>In Flink watermarks are emitted alongside normal events. Operators can transform watermarks in different ways. If an operator has multiple upstream operators, the way it merges watermarks is by taking the minimum between them and re-emitting it.</p>

<p>Other operators, such as windowed or stateful ones, make use of watermarks. They’re used to determine when to close windows and emit events.</p>

<p>One downside of emitting watermarks alongside events is that if one of the tasks is not receiving events, it doesn’t know which watermark to emit, since watermarks are event-dependent, and watermark progression can get stuck.</p>
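
<p>The merging rule and the “stuck” case can be sketched in a few lines. This is a simplified model and not Flink’s API: <em>WatermarkMerge</em>, <em>update</em> and <em>merged</em> are hypothetical names, and the operator is assumed to remember only the latest watermark per input.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala">// Simplified model, not Flink's API: an operator remembers the latest
// watermark seen on each input and emits the minimum across inputs.
object WatermarkMerge {
  // Record a watermark from one input; watermarks never move backwards.
  def update(latest: Map[String, Long], input: String, wm: Long): Map[String, Long] =
    latest.updated(input, math.max(latest.getOrElse(input, Long.MinValue), wm))

  // The watermark the operator can safely re-emit downstream.
  def merged(latest: Map[String, Long]): Long =
    latest.values.min
}</code></pre></figure>

<p>With two inputs, if the first one reports watermarks 100 and then 250 while the second one only ever reports 70, <em>merged</em> stays at 70: the quiet input holds back the progress of the whole operator.</p>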

<h3 id="state">State</h3>

<p>There are two types of state: key-based (<em>keyed state</em>) or operator-based (<em>operator state</em>). The keyed state as the name implies is like a key-value store. When an operator is key-based, Flink guarantees key-affinity (i.e. the same keys are always processed by the same task) by introducing a shuffle operation beforehand.</p>
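
<p>Key-affinity can be illustrated with plain hash partitioning. This is a minimal sketch under the assumption that a key is routed by its hash modulo the parallelism; it is not Flink’s actual assignment logic, and <em>KeyAffinity</em>, <em>taskFor</em> and <em>shuffle</em> are hypothetical names.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala">// Simplified model: route each key to a task by hashing it. This is NOT
// Flink's real assignment; it only shows why equal keys share a task.
object KeyAffinity {
  def taskFor(key: String, parallelism: Int): Int =
    java.lang.Math.floorMod(key.hashCode, parallelism)

  // Group (key, value) events by the task that would own their key.
  def shuffle(events: Seq[(String, Int)], parallelism: Int): Map[Int, Seq[(String, Int)]] =
    events.groupBy { case (key, _) =&gt; taskFor(key, parallelism) }
}</code></pre></figure>

<p>Because <em>taskFor</em> depends only on the key, every event for a given key ends up in the same partition, which is what allows the state for that key to live in a single task.</p>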

<p>The overall way state is implemented is that Flink uses a local key-value store like RocksDB to persist the state and periodically snapshots it to an external storage. We’ll cover more about this flow in <em>Checkpointing</em>.</p>

<p><strong>Keyed state.</strong> The value of the state can be a scalar (<em>ValueState</em>), a list (<em>ListState</em>) or a map (<em>MapState</em>). The reason to expose these as opposed to just working with a strongly typed object is due to the different optimizations they afford.</p>

<p>The scalar is the most general since everything can be serialized to one. But suppose you have a map value and you want to just update one of its keys. If using a scalar, Flink would need to read the entire map, deserialize, update one key, serialize and write back. By using a map state, Flink might decide to store the entries of the map as separate entries in the key-value store.</p>
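
<p>The cost difference can be sketched with a toy key-value store. Everything here is an assumption made for illustration (the <em>StateLayouts</em> object, the string encoding and the row naming are hypothetical and do not reflect how Flink or RocksDB lay out data); the returned number is how many map entries had to be re-serialized.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala">import scala.collection.mutable

// Toy key-value store standing in for a backend like RocksDB.
object StateLayouts {
  // Scalar layout: the whole map is one serialized blob under one key.
  def updateAsScalar(store: mutable.Map[String, String], stateKey: String,
                     field: String, value: String): Int = {
    val decoded = store.getOrElse(stateKey, "")
      .split(";").filter(_.nonEmpty)
      .map { kv =&gt; val parts = kv.split("=", 2); parts(0) -&gt; parts(1) }
      .toMap
    val updated = decoded.updated(field, value)
    store(stateKey) = updated.map { case (k, v) =&gt; s"$k=$v" }.mkString(";")
    updated.size // every entry was read, decoded and written back
  }

  // Map layout: each map entry is its own row, so an update touches one row.
  def updateAsMap(store: mutable.Map[String, String], stateKey: String,
                  field: String, value: String): Int = {
    store(s"$stateKey/$field") = value
    1
  }
}</code></pre></figure>

<p>For a map with three entries, updating one field costs three entries in the scalar layout and one in the map layout, and the gap grows with the size of the map.</p>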

<p><strong>Operator state.</strong> It has 3 types: List, Union and Broadcast. These mostly affect the semantics on repartitioning. The list state declares that the entries on the list are largely independent. Once repartitioning happens, Flink might decide to split the list across multiple tasks.</p>

<p>The Union is also a list, but upon restart Flink will union the lists across all tasks and provide each task with this unioned view. The broadcast state is typically sent by an upstream broadcast operator, and read-only for the downstream. On restart each task receives a copy of the broadcast state.</p>
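
<p>The difference between the list and union semantics can be sketched as two redistribution functions. This is a simplified model, not Flink’s implementation: the names are hypothetical and the round-robin split for list state is an assumption of this sketch.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala">// What each task sees after a restart with a new parallelism.
object OperatorStateRedistribution {
  // List state: entries are independent, so they can be split across tasks.
  def listState[A](perTask: Seq[Seq[A]], newParallelism: Int): Seq[Seq[A]] = {
    val all = perTask.flatten
    (0 until newParallelism).map { task =&gt;
      all.zipWithIndex.collect { case (entry, i) if i % newParallelism == task =&gt; entry }
    }
  }

  // Union state: every task receives the union of all entries.
  def unionState[A](perTask: Seq[Seq[A]], newParallelism: Int): Seq[Seq[A]] =
    Seq.fill(newParallelism)(perTask.flatten)
}</code></pre></figure>

<p>Starting from two tasks holding the entries (1, 2) and (3, 4), scaling to three tasks with list state hands each entry to exactly one task, while union state gives all four entries to every task and leaves it to the application to pick what it needs.</p>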

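<p><strong>State Evolution.</strong> Flink supports some schema evolution if the application uses a backward-compatible serialization such as Avro. It also supports topology evolution as long as operator identifiers are stable (so that their state can be matched back to them). Applications can explicitly assign an identifier to an operator via <code class="language-plaintext highlighter-rouge">uid()</code> (whereas <code class="language-plaintext highlighter-rouge">name()</code> only sets a display name), for example:</p>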

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="n">stream</span>
  <span class="o">.</span><span class="py">map</span><span class="o">(...)</span>
  <span class="o">.</span><span class="py">uid</span><span class="o">(</span><span class="s">"user-enrichment"</span><span class="o">)</span>
  <span class="o">.</span><span class="py">name</span><span class="o">(</span><span class="s">"User Enrichment Step"</span><span class="o">)</span></code></pre></figure>

<h3 id="checkpointing">Checkpointing</h3>

<p>The JobManager periodically decides to checkpoint the state of the application. It sends a message to all TaskManagers that then inject a checkpoint barrier, a special message like the watermark.</p>

<p>Upon receiving the barrier, each operator takes a snapshot of its state (the ones discussed above, or offsets for source operators). They have been continuously writing to a local RocksDB and now they need to upload that state to an external storage such as S3. This process is asynchronous.</p>

<p>Once a TaskManager determines all its operators have uploaded their state, it informs the JobManager. Once the JobManager determines that all tasks have uploaded their state, it can mark the checkpoint as completed.</p>

<p><strong>State exactly-once semantics.</strong> This 2-phase checkpointing allows for exactly-once semantics of the state (caveat: it assumes operations do not have side effects outside the state). Suppose it checkpointed successfully at time T, then it processed some data and the process crashed. Now it needs to rewind to the old checkpoint, meaning each operator will restore its state from the external storage, then it reprocesses the data.</p>

<p>From the state perspective, no data has been processed more than once. Note that by default Flink doesn’t coordinate flushes to the sink with checkpoints: it constantly flushes data downstream, so the exactly-once semantics applies only to state. It’s possible to achieve end-to-end exactly-once semantics, but the sink must support it.</p>
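
<p>Here’s a small plain-Scala simulation of this rewind-and-replay behavior, as a sketch only. It models a per-key counter as the state, a checkpoint as the pair (source offset, state) and a sink that is written to eagerly. All names are hypothetical and no Flink API is involved.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala">// Plain-Scala model of checkpoint + replay (hypothetical names, no Flink API).
object CheckpointReplay {
  // A checkpoint captures the source offset together with the operator state.
  final case class Checkpoint(offset: Int, counts: Map[String, Int])

  // Processes events in [from, until), updating the per-key counts and eagerly
  // appending one record per event to the sink (no coordination with checkpoints).
  def process(events: Vector[String], from: Int, until: Int,
              counts: Map[String, Int], sink: Vector[String]): (Map[String, Int], Vector[String]) =
    events.slice(from, until).foldLeft((counts, sink)) { case ((c, s), key) =&gt;
      val updated = c.updated(key, c.getOrElse(key, 0) + 1)
      (updated, s :+ s"$key=${updated(key)}")
    }

  def runWithCrash(events: Vector[String], checkpointAt: Int, crashAt: Int): (Map[String, Int], Vector[String]) = {
    // Process up to the checkpoint and snapshot (offset, state).
    val (c1, s1) = process(events, 0, checkpointAt, Map.empty, Vector.empty)
    val checkpoint = Checkpoint(checkpointAt, c1)
    // Keep processing, then crash: the in-flight state is lost, the sink writes are not.
    val (_, s2) = process(events, checkpointAt, crashAt, c1, s1)
    // Recover: restore state and offset from the checkpoint, then replay.
    process(events, checkpoint.offset, events.length, checkpoint.counts, s2)
  }
}</code></pre></figure>

<p>Running <code class="language-plaintext highlighter-rouge">runWithCrash(Vector("a", "b", "a", "a"), 2, 3)</code> ends with the counts <code class="language-plaintext highlighter-rouge">a=3</code> and <code class="language-plaintext highlighter-rouge">b=1</code>, the same as a run without a crash, but the sink has received the record <code class="language-plaintext highlighter-rouge">a=2</code> twice. That’s the gap between exactly-once state and end-to-end exactly-once.</p>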

<h3 id="e2e-exactly-once-semantics">E2E Exactly-Once Semantics</h3>

<p>Note: for this section, it’s worth being familiar with <a href="https://www.kuniga.me/blog/2026/01/17/book-kafka.html">how Kafka works</a>.</p>

<p>One common way to achieve end-to-end exactly-once semantics in Flink is by using Kafka as both source and sink in exactly-once mode. An important aspect is that Flink does not rely on Kafka to store the topic offset but stores it as part of its own state.</p>

<p>The sink must have exactly-once mode enabled and downstream consumers must only read committed messages (see <em>Exactly-Once Semantics</em> in <a href="https://www.kuniga.me/blog/2026/01/17/book-kafka.html">Kafka: The Definitive Guide</a>).</p>

<p>The flow is as follows. Suppose Flink just performed a checkpoint of its state. It then starts a new transaction on the Kafka sink. As it reads data from the Kafka source and processes it, it writes the output to the transaction. Once it successfully checkpoints again, it commits the Kafka transaction.</p>

<p>If the process crashes any time before the transaction is closed, it will simply abort the transaction and replay the data from the previously stored offset. One corner case is if Flink crashes after completing its checkpoint but before committing the sink transaction.</p>

<p>To support this case, the transaction handle is stored as part of the checkpointed state. So upon recovery, the first thing it needs to check is whether there is a pending transaction, in which case it would commit it. Then it doesn’t need to reset to the previous checkpoint.</p>

<h2 id="api">API</h2>

<p>Let’s start with a simple example reading data from a socket, transforming it and printing to stdout:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="c1">// Read data</span>
<span class="k">val</span> <span class="nv">input</span><span class="k">:</span> <span class="kt">DataStream</span><span class="o">[</span><span class="kt">String</span><span class="o">]</span> <span class="k">=</span> <span class="nv">env</span><span class="o">.</span><span class="py">socketTextStream</span><span class="o">(</span><span class="s">"localhost"</span><span class="o">,</span> <span class="mi">9999</span><span class="o">)</span>

<span class="c1">// Deserialize</span>
<span class="k">val</span> <span class="nv">sensorData</span><span class="k">:</span> <span class="kt">DataStream</span><span class="o">[</span><span class="kt">SensorReading</span><span class="o">]</span> <span class="k">=</span> <span class="n">input</span>
  <span class="o">.</span><span class="py">map</span> <span class="o">{</span> <span class="n">line</span> <span class="k">=&gt;</span>
    <span class="k">val</span> <span class="nv">parts</span> <span class="k">=</span> <span class="nv">line</span><span class="o">.</span><span class="py">split</span><span class="o">(</span><span class="s">","</span><span class="o">)</span>
    <span class="nc">SensorReading</span><span class="o">(</span>
      <span class="nf">parts</span><span class="o">(</span><span class="mi">0</span><span class="o">).</span><span class="py">trim</span><span class="o">,</span> <span class="c1">// id</span>
      <span class="nf">parts</span><span class="o">(</span><span class="mi">1</span><span class="o">).</span><span class="py">trim</span><span class="o">.</span><span class="py">toLong</span><span class="o">,</span> <span class="c1">// timestamp</span>
      <span class="nf">parts</span><span class="o">(</span><span class="mi">2</span><span class="o">).</span><span class="py">trim</span><span class="o">.</span><span class="py">toDouble</span> <span class="c1">// temperature</span>
    <span class="o">)</span>
  <span class="o">}</span>

<span class="c1">// Transform</span>
<span class="k">val</span> <span class="nv">avgTemp</span><span class="k">:</span> <span class="kt">DataStream</span><span class="o">[</span><span class="kt">SensorReading</span><span class="o">]</span> <span class="k">=</span> <span class="n">sensorData</span>
  <span class="o">.</span><span class="py">map</span><span class="o">(</span> <span class="n">r</span> <span class="k">=&gt;</span> <span class="o">{</span>
    <span class="k">val</span> <span class="nv">celsius</span> <span class="k">=</span> <span class="o">(</span><span class="nv">r</span><span class="o">.</span><span class="py">temperature</span> <span class="o">-</span> <span class="mi">32</span><span class="o">)</span> <span class="o">*</span> <span class="o">(</span><span class="mf">5.0</span> <span class="o">/</span> <span class="mf">9.0</span><span class="o">)</span>
    <span class="nc">SensorReading</span><span class="o">(</span><span class="nv">r</span><span class="o">.</span><span class="py">id</span><span class="o">,</span> <span class="nv">r</span><span class="o">.</span><span class="py">timestamp</span><span class="o">,</span> <span class="n">celsius</span><span class="o">)</span>
  <span class="o">})</span>
  <span class="o">.</span><span class="py">keyBy</span><span class="o">(</span><span class="nv">_</span><span class="o">.</span><span class="py">id</span><span class="o">)</span>
  <span class="o">.</span><span class="py">timeWindow</span><span class="o">(</span><span class="nv">Time</span><span class="o">.</span><span class="py">seconds</span><span class="o">(</span><span class="mi">5</span><span class="o">))</span>
  <span class="o">.</span><span class="py">apply</span><span class="o">(</span><span class="k">new</span> <span class="nc">TemperatureAverager</span><span class="o">)</span>

<span class="c1">// Print</span>
<span class="nv">avgTemp</span><span class="o">.</span><span class="py">print</span><span class="o">()</span></code></pre></figure>

<p>This example covers several components of a pipeline: reading a serialized message from source, deserializing, transforming and writing somewhere (in this case stdout).</p>

<p>Next we consider the API in more detail.</p>

<h3 id="stateless-transformations">Stateless Transformations</h3>

<p><strong>Map.</strong> A method of a datastream that takes a function object which transforms one message into another message. Example:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="nv">stream</span><span class="o">.</span><span class="py">map</span><span class="o">(</span> <span class="n">r</span> <span class="k">=&gt;</span> <span class="nv">r</span><span class="o">.</span><span class="py">id</span> <span class="o">)</span></code></pre></figure>

<p><strong>Filter.</strong> Same idea as <code class="language-plaintext highlighter-rouge">map()</code>. Takes one message and returns true/false. If false, the message is filtered out. Example:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="nv">stream</span><span class="o">.</span><span class="py">filter</span><span class="o">(</span> <span class="n">r</span> <span class="k">=&gt;</span> <span class="nv">r</span><span class="o">.</span><span class="py">temperature</span> <span class="o">&gt;=</span> <span class="mi">25</span> <span class="o">)</span></code></pre></figure>

<p><strong>FlatMap.</strong> A more general version of map: it takes one message but can return any number of messages (including none). It can be used to implement both <code class="language-plaintext highlighter-rouge">map()</code> and <code class="language-plaintext highlighter-rouge">filter()</code>. Example:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="nv">stream</span><span class="o">.</span><span class="py">flatMap</span> <span class="o">{</span> <span class="n">r</span> <span class="k">=&gt;</span>
  <span class="nf">if</span> <span class="o">(</span><span class="nv">r</span><span class="o">.</span><span class="py">temperature</span> <span class="o">&gt;=</span> <span class="mi">25</span><span class="o">)</span> <span class="nc">List</span><span class="o">(</span><span class="nv">r</span><span class="o">.</span><span class="py">id</span><span class="o">)</span>
  <span class="k">else</span> <span class="nv">List</span><span class="o">.</span><span class="py">empty</span>
<span class="o">}</span></code></pre></figure>
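
<p>The claim that flat map subsumes the other two can be checked on plain Scala collections, whose <code class="language-plaintext highlighter-rouge">flatMap</code> has the same shape. This is just a sketch on <code class="language-plaintext highlighter-rouge">List</code>, not on a Flink stream, and the helper names are made up:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala">// map and filter expressed via flatMap, on plain Scala lists (hypothetical helper names).
object FlatMapSketch {
  // map: always emit exactly one output per input.
  def mapVia[A, B](xs: List[A])(f: A =&gt; B): List[B] =
    xs.flatMap(x =&gt; List(f(x)))

  // filter: emit the input itself, or nothing.
  def filterVia[A](xs: List[A])(p: A =&gt; Boolean): List[A] =
    xs.flatMap(x =&gt; if (p(x)) List(x) else Nil)
}</code></pre></figure>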

<p><strong>Custom Functions.</strong> Instead of passing callbacks to the mentioned APIs, it’s also possible and sometimes necessary to implement interfaces and pass function objects instead. For example, for the <code class="language-plaintext highlighter-rouge">.filter()</code> method we can implement:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">class</span> <span class="nc">FlinkFilter</span> <span class="k">extends</span> <span class="nc">FilterFunction</span><span class="o">[</span><span class="kt">String</span><span class="o">]</span> <span class="o">{</span>
  <span class="k">override</span> <span class="k">def</span> <span class="nf">filter</span><span class="o">(</span><span class="n">value</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span><span class="k">:</span> <span class="kt">Boolean</span> <span class="o">=</span> <span class="o">{</span>
    <span class="nv">value</span><span class="o">.</span><span class="py">contains</span><span class="o">(</span><span class="s">"flink"</span><span class="o">)</span>
  <span class="o">}</span>
<span class="o">}</span>
<span class="o">...</span>
<span class="k">var</span> <span class="n">flinkTweets</span> <span class="k">=</span> <span class="nv">tweets</span><span class="o">.</span><span class="py">filter</span><span class="o">(</span><span class="k">new</span> <span class="nc">FlinkFilter</span><span class="o">)</span></code></pre></figure>

<p>There are corresponding interfaces <code class="language-plaintext highlighter-rouge">MapFunction</code> and <code class="language-plaintext highlighter-rouge">FlatMapFunction</code> for the other 2 APIs. You can also pass parameters to the function upon construction:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">var</span> <span class="n">flinkTweets</span> <span class="k">=</span> <span class="nv">tweets</span><span class="o">.</span><span class="py">filter</span><span class="o">(</span><span class="k">new</span> <span class="nc">KeywordFilter</span><span class="o">(</span><span class="s">"flink"</span><span class="o">))</span></code></pre></figure>

<p>But these are “compile” time parameters that are resolved when the DAG is constructed. For runtime initialization, one can use the <em>Rich</em>- versions, e.g. <code class="language-plaintext highlighter-rouge">RichFilterFunction</code>, which have the <code class="language-plaintext highlighter-rouge">open()</code> and <code class="language-plaintext highlighter-rouge">close()</code> methods.</p>

<h3 id="keyed-transformations">Keyed Transformations</h3>

<p><strong>KeyBy.</strong> This method on a normal <code class="language-plaintext highlighter-rouge">DataStream</code> transforms it into a <code class="language-plaintext highlighter-rouge">KeyedStream</code> which is logically partitioned by a key function provided, e.g.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="nv">stream</span><span class="o">.</span><span class="py">keyBy</span><span class="o">(</span><span class="n">r</span> <span class="k">=&gt;</span> <span class="nv">r</span><span class="o">.</span><span class="py">id</span><span class="o">)</span></code></pre></figure>

<p><strong>Rolling Aggregations.</strong> A rolling aggregation doesn’t require a window of time: the values are accumulated forever, and an event is emitted for every input row. For example:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="n">stream</span>
  <span class="o">.</span><span class="py">keyBy</span><span class="o">(</span><span class="n">r</span> <span class="k">=&gt;</span> <span class="nv">r</span><span class="o">.</span><span class="py">key</span><span class="o">)</span>
  <span class="o">.</span><span class="py">sum</span><span class="o">(</span><span class="s">"value"</span><span class="o">)</span></code></pre></figure>

<p>For each event <code class="language-plaintext highlighter-rouge">r</code>, it will add <code class="language-plaintext highlighter-rouge">r.value</code> to the corresponding sum for <code class="language-plaintext highlighter-rouge">r.key</code> and emit the result.</p>

<p><strong>Reduce.</strong> This API allows providing custom aggregation functions. The major restriction is that the accumulator type must be the same as the record type. For example:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="n">stream</span>
  <span class="o">.</span><span class="py">keyBy</span><span class="o">(</span><span class="n">r</span> <span class="k">=&gt;</span> <span class="nv">r</span><span class="o">.</span><span class="py">key</span><span class="o">)</span>
  <span class="o">.</span><span class="py">reduce</span><span class="o">((</span><span class="n">acc</span><span class="o">,</span> <span class="n">r</span><span class="o">)</span> <span class="k">=&gt;</span> <span class="nc">Record</span><span class="o">(</span><span class="nv">acc</span><span class="o">.</span><span class="py">key</span><span class="o">,</span> <span class="nv">acc</span><span class="o">.</span><span class="py">value</span> <span class="o">+</span> <span class="nv">r</span><span class="o">.</span><span class="py">value</span><span class="o">))</span></code></pre></figure>

<p>Note that this API does not have an “initializer” step. The variable <code class="language-plaintext highlighter-rouge">acc</code> is set with the first event it sees.</p>
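
<p>To make the semantics concrete (one output per input, accumulator seeded by the first event of each key), here’s a plain-Scala model of a rolling reduce. It’s only a sketch: <code class="language-plaintext highlighter-rouge">rollingReduce</code> and <code class="language-plaintext highlighter-rouge">Record</code> are names I made up, and it runs on an in-memory sequence rather than a Flink stream.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala">// Plain-Scala model of a keyed rolling reduce (hypothetical names, no Flink API).
object RollingReduce {
  final case class Record(key: String, value: Int)

  // Emits one output per input event: the running reduction for that event's key.
  // The first event of each key seeds the accumulator (there is no initial value).
  def rollingReduce[K, V](events: Seq[V])(key: V =&gt; K)(f: (V, V) =&gt; V): Seq[V] = {
    val acc = scala.collection.mutable.Map.empty[K, V]
    events.map { e =&gt;
      val k = key(e)
      val next = acc.get(k).map(prev =&gt; f(prev, e)).getOrElse(e)
      acc(k) = next
      next
    }
  }
}</code></pre></figure>

<p>With the events <code class="language-plaintext highlighter-rouge">(a, 1), (b, 10), (a, 2)</code> and the sum function from the example above, the emitted events are <code class="language-plaintext highlighter-rouge">(a, 1), (b, 10), (a, 3)</code>.</p>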

<p><strong>KeyedProcessFunction.</strong> The most general functions on <code class="language-plaintext highlighter-rouge">KeyedStream</code> are those implementing <code class="language-plaintext highlighter-rouge">KeyedProcessFunction</code>. The book provides a complete example, but for brevity, we mention its two main methods: <code class="language-plaintext highlighter-rouge">processElement()</code> and <code class="language-plaintext highlighter-rouge">onTimer()</code>.</p>

<p>In <code class="language-plaintext highlighter-rouge">processElement()</code> we receive one message and decide what to do with it. This method has access to the (keyed) state, so it can update the state based on this message. It can emit messages downstream through the <code class="language-plaintext highlighter-rouge">Collector</code> it receives, and it can also register a timer that, when fired, calls the <code class="language-plaintext highlighter-rouge">onTimer()</code> method, which can emit messages as well.</p>

<p>The way to use a custom <code class="language-plaintext highlighter-rouge">KeyedProcessFunction</code> is passing it to the <code class="language-plaintext highlighter-rouge">process()</code> method:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">val</span> <span class="nv">alerts</span> <span class="k">=</span>
  <span class="n">readings</span>
    <span class="o">.</span><span class="py">keyBy</span><span class="o">(</span><span class="nv">_</span><span class="o">.</span><span class="py">id</span><span class="o">)</span>
    <span class="o">.</span><span class="py">process</span><span class="o">(</span><span class="k">new</span> <span class="nc">MyKeyedProcessFunction</span><span class="o">)</span></code></pre></figure>

<p><strong>Stateful Map Functions.</strong> It’s possible to use keyed state in the generic flat map function, in particular via the <code class="language-plaintext highlighter-rouge">RichFlatMapFunction</code>. In this case the state is managed by the function itself and has no timer semantics - the only “hook” point is when <code class="language-plaintext highlighter-rouge">flatMap()</code> is called.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">class</span> <span class="nc">TemperatureAlertFunction</span><span class="o">()</span>
  <span class="k">extends</span> <span class="nc">RichFlatMapFunction</span><span class="o">[</span><span class="kt">SensorReading</span>, <span class="o">(</span><span class="kt">String</span>, <span class="kt">Double</span>, <span class="kt">Double</span><span class="o">)]</span> <span class="o">{</span>

  <span class="o">...</span>
  <span class="k">override</span> <span class="k">def</span> <span class="nf">open</span><span class="o">(</span><span class="n">parameters</span><span class="k">:</span> <span class="kt">Configuration</span><span class="o">)</span><span class="k">:</span> <span class="kt">Unit</span> <span class="o">=</span> <span class="o">{</span>
    <span class="k">var</span> <span class="n">lastTempDesc</span> <span class="k">=</span> <span class="k">new</span> <span class="nc">ValueStateDescriptor</span><span class="o">[</span><span class="kt">Double</span><span class="o">](</span><span class="s">"lastTemp"</span><span class="o">,</span> <span class="n">classOf</span><span class="o">[</span><span class="kt">Double</span><span class="o">])</span>
    <span class="n">lastTempState</span> <span class="k">=</span> <span class="nv">getRuntimeContext</span><span class="o">.</span><span class="py">getState</span><span class="o">[</span><span class="kt">Double</span><span class="o">](</span><span class="n">lastTempDesc</span><span class="o">)</span>
  <span class="o">}</span>

  <span class="o">...</span>
  <span class="k">override</span> <span class="k">def</span> <span class="nf">flatMap</span><span class="o">(</span><span class="n">reading</span><span class="k">:</span> <span class="kt">SensorReading</span><span class="o">,</span> <span class="n">out</span><span class="k">:</span> <span class="kt">Collector</span><span class="o">[(</span><span class="kt">String</span>, <span class="kt">Double</span>, <span class="kt">Double</span><span class="o">)])</span><span class="k">:</span> <span class="kt">Unit</span> <span class="o">=</span> <span class="o">{</span>
    <span class="o">...</span>
    <span class="k">val</span> <span class="nv">lastTemp</span> <span class="k">=</span> <span class="nv">lastTempState</span><span class="o">.</span><span class="py">value</span><span class="o">()</span>
    <span class="o">...</span>
  <span class="o">}</span>
<span class="o">}</span></code></pre></figure>

<p>The state used is of type <em>ValueState</em>. As discussed in <em>Architecture &gt; State &gt; Keyed State</em>, we can also have <em>ListState</em> or <em>MapState</em>. The API will be similar, for example <code class="language-plaintext highlighter-rouge">ValueStateDescriptor</code> becomes <code class="language-plaintext highlighter-rouge">ListStateDescriptor</code> and <code class="language-plaintext highlighter-rouge">getRuntimeContext.getState()</code> becomes <code class="language-plaintext highlighter-rouge">getRuntimeContext.getListState()</code>.</p>

<h3 id="windowed-transformations">Windowed Transformations</h3>

<p>A special case of keyed transformations is the windowed transformations, on top of <code class="language-plaintext highlighter-rouge">WindowedStream</code>. This type of stream is returned once we call methods such as <code class="language-plaintext highlighter-rouge">window()</code>. The <code class="language-plaintext highlighter-rouge">ProcessWindowFunction</code> interface has the method <code class="language-plaintext highlighter-rouge">process()</code> which among other things takes the key and a list of messages within a window.</p>

<p>It’s a layer of abstraction above the <code class="language-plaintext highlighter-rouge">KeyedProcessFunction</code>. In this case we don’t have to deal with timers explicitly. The method <code class="language-plaintext highlighter-rouge">window()</code> takes an object that describes the window (a window assigner), for example, for a sliding window of 1h and step 15min:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">val</span> <span class="nv">alerts</span> <span class="k">=</span> <span class="n">readings</span>
  <span class="o">.</span><span class="py">keyBy</span><span class="o">(</span><span class="nv">_</span><span class="o">.</span><span class="py">id</span><span class="o">)</span>
  <span class="o">.</span><span class="py">window</span><span class="o">(</span><span class="nv">SlidingEventTimeWindows</span><span class="o">.</span><span class="py">of</span><span class="o">(</span><span class="nv">Time</span><span class="o">.</span><span class="py">hours</span><span class="o">(</span><span class="mi">1</span><span class="o">),</span> <span class="nv">Time</span><span class="o">.</span><span class="py">minutes</span><span class="o">(</span><span class="mi">15</span><span class="o">)))</span>
  <span class="o">.</span><span class="py">process</span><span class="o">(</span><span class="k">new</span> <span class="nc">MyProcessWindowFunction</span><span class="o">)</span></code></pre></figure>
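
<p>One consequence of sliding windows is that a single event belongs to several windows at once. The plain-Scala sketch below computes which windows contain a given timestamp. It’s my own model of the arithmetic, assuming windows aligned to time 0 (no offset) and non-negative timestamps; <code class="language-plaintext highlighter-rouge">SlidingWindows.assign</code> is a made-up name, not a Flink API.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala">// Plain-Scala model of sliding window assignment (hypothetical names, no Flink API).
object SlidingWindows {
  // Returns the [start, end) windows that contain `timestamp`, for windows of
  // length `size` starting every `slide` time units, aligned to time 0.
  // Assumes timestamp &gt;= 0 and all arguments in the same time unit.
  def assign(timestamp: Long, size: Long, slide: Long): Seq[(Long, Long)] = {
    // Latest window start that is not after the timestamp.
    val lastStart = timestamp - (timestamp % slide)
    Iterator.iterate(lastStart)(_ - slide)
      .takeWhile(start =&gt; start &gt; timestamp - size) // window must still cover the timestamp
      .map(start =&gt; (start, start + size))
      .toList
  }
}</code></pre></figure>

<p>Using minutes as the unit, <code class="language-plaintext highlighter-rouge">assign(100, 60, 15)</code> returns the windows starting at minutes 90, 75, 60 and 45, so each event lands in size / slide = 4 windows.</p>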

<p>Since <code class="language-plaintext highlighter-rouge">ProcessWindowFunction</code> receives all the messages of a window at once, this can load a lot of data in memory. An alternative is to use either <code class="language-plaintext highlighter-rouge">ReduceFunction</code> or <code class="language-plaintext highlighter-rouge">AggregateFunction</code>. The former works more like the <code class="language-plaintext highlighter-rouge">reduce()</code> for keyed streams, where it receives the accumulation for a particular window and a new element to be added to the window. The latter is a bit more general: the accumulator and the event can have different types, but it’s more complex to implement.</p>
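
<p>To illustrate the incremental style with a separate accumulator type, here’s a plain-Scala sketch that averages temperatures using a (sum, count) accumulator. The trait <code class="language-plaintext highlighter-rouge">Aggregate</code> is a local stand-in I wrote with the same four method names as Flink’s <code class="language-plaintext highlighter-rouge">AggregateFunction</code>; the rest of the names are made up and nothing here depends on Flink.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala">// Plain-Scala model of an incremental window aggregation (hypothetical names, no Flink API).
object WindowAverage {
  // Local stand-in mirroring the method names of Flink's AggregateFunction[IN, ACC, OUT].
  trait Aggregate[IN, ACC, OUT] {
    def createAccumulator(): ACC
    def add(value: IN, acc: ACC): ACC
    def getResult(acc: ACC): OUT
    def merge(a: ACC, b: ACC): ACC
  }

  // Average: the accumulator (sum, count) differs from the input and output types.
  object Avg extends Aggregate[Double, (Double, Long), Double] {
    def createAccumulator(): (Double, Long) = (0.0, 0L)
    def add(value: Double, acc: (Double, Long)): (Double, Long) = (acc._1 + value, acc._2 + 1)
    // Returning 0.0 for an empty window is an arbitrary choice for this sketch.
    def getResult(acc: (Double, Long)): Double = if (acc._2 == 0) 0.0 else acc._1 / acc._2
    def merge(a: (Double, Long), b: (Double, Long)): (Double, Long) = (a._1 + b._1, a._2 + b._2)
  }

  // Folds the events of one window one at a time, keeping only the accumulator around.
  def aggregate[IN, ACC, OUT](events: Iterable[IN], agg: Aggregate[IN, ACC, OUT]): OUT =
    agg.getResult(events.foldLeft(agg.createAccumulator())((acc, e) =&gt; agg.add(e, acc)))
}</code></pre></figure>

<p>The point is that only the accumulator is kept per window, not the list of events, which is what makes this cheaper than collecting everything for a <code class="language-plaintext highlighter-rouge">ProcessWindowFunction</code>.</p>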

<p>It’s also possible to customize what is passed to the <code class="language-plaintext highlighter-rouge">.window()</code> API by implementing a <code class="language-plaintext highlighter-rouge">WindowAssigner</code>. This determines to which window an event is assigned. One of the methods in <code class="language-plaintext highlighter-rouge">WindowAssigner</code> is <code class="language-plaintext highlighter-rouge">getDefaultTrigger()</code> which returns a <code class="language-plaintext highlighter-rouge">Trigger</code>.</p>

<p>The trigger tells us when to emit events from the window, and it’s possible to provide a custom trigger that overrides the default for the window, e.g.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">val</span> <span class="nv">alerts</span> <span class="k">=</span> <span class="n">readings</span>
  <span class="o">.</span><span class="py">keyBy</span><span class="o">(</span><span class="nv">_</span><span class="o">.</span><span class="py">id</span><span class="o">)</span>
  <span class="o">.</span><span class="py">window</span><span class="o">(</span><span class="k">new</span> <span class="nc">MyWindow</span><span class="o">)</span>
  <span class="o">.</span><span class="py">trigger</span><span class="o">(</span><span class="k">new</span> <span class="nc">MyTrigger</span><span class="o">)</span>
  <span class="o">.</span><span class="py">process</span><span class="o">(</span><span class="k">new</span> <span class="nc">MyProcessWindowFunction</span><span class="o">)</span></code></pre></figure>

<h3 id="multistream-transformations">Multistream Transformations</h3>

<p><strong>Union.</strong> Merges data of two or more streams of the same type. Events are processed in FIFO order and all events are emitted as they arrive. Example:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">var</span> <span class="n">unioned</span> <span class="k">=</span> <span class="n">stream1</span>
  <span class="o">.</span><span class="py">union</span><span class="o">(</span><span class="n">stream2</span><span class="o">,</span> <span class="n">stream3</span><span class="o">)</span></code></pre></figure>

<p><strong>Connect.</strong> It combines two streams into a special type of stream called <code class="language-plaintext highlighter-rouge">ConnectedStreams[T1, T2]</code>. The streams are still technically separated, and are handled by different functions, but the key is that they’re processed by the same operator, so they can share state. Example:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">var</span> <span class="n">connected</span> <span class="k">=</span> <span class="nv">stream1</span><span class="o">.</span><span class="py">connect</span><span class="o">(</span><span class="n">stream2</span><span class="o">)</span>

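<span class="nv">connected</span><span class="o">.</span><span class="py">map</span><span class="o">(</span><span class="k">new</span> <span class="nc">MyCoMapFunction</span><span class="o">)</span></code></pre></figure>

<p>Here <code class="language-plaintext highlighter-rouge">MyCoMapFunction</code> stands for a class implementing <code class="language-plaintext highlighter-rouge">CoMapFunction</code>, whose <code class="language-plaintext highlighter-rouge">map1()</code> method handles events from <code class="language-plaintext highlighter-rouge">stream1</code> and whose <code class="language-plaintext highlighter-rouge">map2()</code> method handles events from <code class="language-plaintext highlighter-rouge">stream2</code>. So if we decide to store events from <code class="language-plaintext highlighter-rouge">stream1</code> in a state inside <code class="language-plaintext highlighter-rouge">map1()</code>, then <code class="language-plaintext highlighter-rouge">map2()</code> would have access to it when handling <code class="language-plaintext highlighter-rouge">stream2</code>. Typically we want to shard events by key:</p>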

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">var</span> <span class="n">connected</span> <span class="k">=</span> <span class="n">stream1</span>
  <span class="o">.</span><span class="py">keyBy</span><span class="o">(</span><span class="n">keyFunc1</span><span class="o">)</span>
  <span class="o">.</span><span class="py">connect</span><span class="o">(</span><span class="nv">stream2</span><span class="o">.</span><span class="py">keyBy</span><span class="o">(</span><span class="n">keyFunc2</span><span class="o">))</span></code></pre></figure>

<p>This makes sure events with the same key are processed by the same task, since state is task-scoped.</p>

<h3 id="join-transformations">Join Transformations</h3>

<p><strong>Interval Join.</strong> This joins matching events from different streams that are within a period of time of each other:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">var</span> <span class="n">connected</span> <span class="k">=</span> <span class="nv">stream1</span><span class="o">.</span><span class="py">keyBy</span><span class="o">(</span><span class="nv">_</span><span class="o">.</span><span class="py">id</span><span class="o">)</span>
  <span class="o">.</span><span class="py">intervalJoin</span><span class="o">(</span><span class="nv">stream2</span><span class="o">.</span><span class="py">keyBy</span><span class="o">(</span><span class="nv">_</span><span class="o">.</span><span class="py">id2</span><span class="o">))</span>
  <span class="o">.</span><span class="py">between</span><span class="o">(</span><span class="nv">Time</span><span class="o">.</span><span class="py">seconds</span><span class="o">(-</span><span class="mi">5</span><span class="o">),</span> <span class="nv">Time</span><span class="o">.</span><span class="py">seconds</span><span class="o">(</span><span class="mi">10</span><span class="o">))</span>
  <span class="o">.</span><span class="py">process</span><span class="o">{</span> <span class="o">(</span><span class="n">r</span><span class="o">,</span> <span class="n">a</span><span class="o">)</span> <span class="k">=&gt;</span> <span class="o">(</span><span class="n">r</span><span class="o">,</span> <span class="n">a</span><span class="o">)</span> <span class="o">}</span></code></pre></figure>

<p>Where <code class="language-plaintext highlighter-rouge">.process()</code> takes a function (lambda or a function of type <code class="language-plaintext highlighter-rouge">ProcessJoinFunction</code>) which takes a pair of matching events. The detail is that this join keeps a buffer of the left and right events that are within range. When a new event $e$ arrives, say from the first stream, it emits all pairs containing $e$ and events from the second stream in the buffer.</p>
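
<p>The buffering behavior described above can be modeled in a few lines of plain Scala. This is a sketch under simplifying assumptions: the buffers are never evicted (a real implementation would drop events once they can no longer match), the names are made up, and arrivals from the two streams are given as one interleaved sequence using <code class="language-plaintext highlighter-rouge">Left</code> for the first stream and <code class="language-plaintext highlighter-rouge">Right</code> for the second.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala">// Plain-Scala model of an interval join (hypothetical names, no Flink API).
object IntervalJoinSketch {
  final case class Event(key: String, ts: Long, payload: String)

  // Pairs events with equal keys such that
  //   left.ts + lower &lt;= right.ts &lt;= left.ts + upper.
  // Each arriving event is buffered and matched against the other side's buffer.
  def join(arrivals: Seq[Either[Event, Event]], lower: Long, upper: Long): Seq[(Event, Event)] = {
    val leftBuf = scala.collection.mutable.ArrayBuffer.empty[Event]
    val rightBuf = scala.collection.mutable.ArrayBuffer.empty[Event]

    def matches(l: Event, r: Event): Boolean =
      l.key == r.key &amp;&amp; r.ts &gt;= l.ts + lower &amp;&amp; r.ts &lt;= l.ts + upper

    arrivals.flatMap {
      case Left(l) =&gt;
        leftBuf += l
        rightBuf.filter(r =&gt; matches(l, r)).map(r =&gt; (l, r)).toList
      case Right(r) =&gt;
        rightBuf += r
        leftBuf.filter(l =&gt; matches(l, r)).map(l =&gt; (l, r)).toList
    }
  }
}</code></pre></figure>

<p>With the bounds from the example above (-5 and 10, in seconds), a left event at time 100 matches right events with the same key at times 95 and 110, but not one at time 111.</p>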

<p><strong>Window Join.</strong> This combines two streams into a (custom) window and performs a join between them. For example:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">var</span> <span class="n">connected</span> <span class="k">=</span> <span class="nv">stream1</span><span class="o">.</span><span class="py">keyBy</span><span class="o">(</span><span class="nv">_</span><span class="o">.</span><span class="py">id</span><span class="o">)</span>
  <span class="o">.</span><span class="py">join</span><span class="o">(</span><span class="nv">stream2</span><span class="o">.</span><span class="py">keyBy</span><span class="o">(</span><span class="nv">_</span><span class="o">.</span><span class="py">id2</span><span class="o">))</span>
  <span class="o">.</span><span class="py">where</span><span class="o">(</span><span class="nv">_</span><span class="o">.</span><span class="py">id</span><span class="o">)</span>
  <span class="o">.</span><span class="py">equalTo</span><span class="o">(</span><span class="nv">_</span><span class="o">.</span><span class="py">id2</span><span class="o">)</span>
  <span class="o">.</span><span class="py">window</span><span class="o">(</span><span class="nv">TumblingEventTimeWindows</span><span class="o">.</span><span class="py">of</span><span class="o">(</span><span class="nv">Time</span><span class="o">.</span><span class="py">seconds</span><span class="o">(</span><span class="mi">10</span><span class="o">)))</span>
  <span class="o">.</span><span class="py">apply</span><span class="o">{</span> <span class="o">(</span><span class="n">r</span><span class="o">,</span> <span class="n">a</span><span class="o">)</span> <span class="k">=&gt;</span> <span class="o">(</span><span class="n">r</span><span class="o">,</span> <span class="n">a</span><span class="o">)</span> <span class="o">}</span></code></pre></figure>

<p>Where <code class="language-plaintext highlighter-rouge">.apply()</code> takes a function (lambda or a function of type <code class="language-plaintext highlighter-rouge">JoinFunction</code>) that is invoked once for each pair in the cross product of the elements from the first stream with those from the second that share the same key and fall within the same window.</p>
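
<p>To make these semantics concrete, here is a minimal sketch in plain Scala (no Flink involved) that mimics a tumbling window join over two in-memory lists. The <code class="language-plaintext highlighter-rouge">Event</code> type, the <code class="language-plaintext highlighter-rouge">windowJoin</code> helper and the sample data are made up for illustration; the real operator works incrementally and in a distributed fashion.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala">object WindowJoinSketch {
  // Hypothetical event type: a key and an event-time timestamp in milliseconds.
  case class Event(id: String, ts: Long)

  // Index of the tumbling window an event falls into.
  def windowOf(e: Event, sizeMs: Long): Long = e.ts / sizeMs

  // For every (key, window) bucket, emit the cross product of left and right events.
  def windowJoin(left: Seq[Event], right: Seq[Event], sizeMs: Long): Seq[(Event, Event)] = {
    val rightByBucket = right.groupBy(e =&gt; (e.id, windowOf(e, sizeMs)))
    for {
      l &lt;- left
      r &lt;- rightByBucket.getOrElse((l.id, windowOf(l, sizeMs)), Seq.empty)
    } yield (l, r)
  }

  def main(args: Array[String]): Unit = {
    val left  = Seq(Event("a", 1000L), Event("a", 4000L), Event("b", 12000L))
    val right = Seq(Event("a", 2000L), Event("a", 11000L), Event("b", 15000L))
    // 10 second tumbling windows
    windowJoin(left, right, 10000L).foreach(println)
  }
}</code></pre></figure>

<p>With this sample data three pairs are emitted: both left <code class="language-plaintext highlighter-rouge">a</code> events pair with the right <code class="language-plaintext highlighter-rouge">a</code> event of the first window, and the two <code class="language-plaintext highlighter-rouge">b</code> events meet in the second window. The right <code class="language-plaintext highlighter-rouge">a</code> event at 11s has no partner in its window and is dropped.</p>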

<h3 id="distribution-transformations">Distribution Transformations</h3>

<p>By default, when using <code class="language-plaintext highlighter-rouge">keyBy()</code>, Flink will distribute data to the right tasks based on the provided key (expression). There are ways to customize that using APIs such as <code class="language-plaintext highlighter-rouge">shuffle()</code>, <code class="language-plaintext highlighter-rouge">rebalance()</code> and <code class="language-plaintext highlighter-rouge">rescale()</code>.</p>
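
<p>A rough sketch of how these look in code, assuming the Scala DataStream API from the <code class="language-plaintext highlighter-rouge">org.apache.flink.streaming.api.scala</code> package (the one used in the book). The input data and the job name are placeholders:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala">import org.apache.flink.streaming.api.scala._

object PartitioningSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Placeholder input, just to have something to partition.
    val numbers: DataStream[Int] = env.fromElements(1, 2, 3, 4, 5, 6)

    // shuffle: each record goes to a randomly chosen downstream task.
    numbers.shuffle.map(_ * 2).print()
    // rebalance: records are spread round-robin over all downstream tasks.
    numbers.rebalance.map(_ * 2).print()
    // rescale: round-robin as well, but only over a subset of the downstream tasks.
    numbers.rescale.map(_ * 2).print()

    env.execute("partitioning sketch")
  }
}</code></pre></figure>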

<h3 id="late-event-handling">Late Event Handling</h3>

<p>Because of watermarks, some events will be considered late. There are several options for handling them: the simplest is to discard them, but an alternative is to send them to a different sink (dead letter queue).</p>

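<p>This can be done via <code class="language-plaintext highlighter-rouge">.sideOutputLateData()</code>, which takes an <code class="language-plaintext highlighter-rouge">OutputTag</code> identifying the side output that the late events are routed to:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="c1">// The tag name and the SensorReading element type are illustrative</span>
<span class="k">val</span> <span class="nv">lateTag</span> <span class="k">=</span> <span class="k">new</span> <span class="nc">OutputTag</span><span class="o">[</span><span class="kt">SensorReading</span><span class="o">](</span><span class="s">"late-events"</span><span class="o">)</span>
<span class="k">val</span> <span class="nv">processed</span> <span class="k">=</span> <span class="n">stream</span>
  <span class="o">.</span><span class="py">keyBy</span><span class="o">(</span><span class="nv">_</span><span class="o">.</span><span class="py">id</span><span class="o">)</span>
  <span class="o">.</span><span class="py">timeWindow</span><span class="o">(</span><span class="nv">Time</span><span class="o">.</span><span class="py">seconds</span><span class="o">(</span><span class="mi">10</span><span class="o">))</span>
  <span class="o">.</span><span class="py">sideOutputLateData</span><span class="o">(</span><span class="n">lateTag</span><span class="o">)</span>
  <span class="o">.</span><span class="py">process</span><span class="o">(</span><span class="k">new</span> <span class="nc">MyProcessFunc</span><span class="o">)</span>
<span class="c1">// The late events form their own stream, which can be sent to a separate sink</span>
<span class="k">val</span> <span class="nv">lateEvents</span> <span class="k">=</span> <span class="nv">processed</span><span class="o">.</span><span class="py">getSideOutput</span><span class="o">(</span><span class="n">lateTag</span><span class="o">)</span></code></pre></figure>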

<p>When customizing <code class="language-plaintext highlighter-rouge">ProcessFunction</code> we can also handle late events since this function has access to the watermark.</p>
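
<p>A minimal sketch of that idea, assuming the Flink Scala API: the function compares each event’s timestamp with the current watermark and routes late events to a side output. The <code class="language-plaintext highlighter-rouge">SensorReading</code> fields, the class name and the tag name are assumptions made for this example.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala">import org.apache.flink.streaming.api.functions.ProcessFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

// Hypothetical event type; only the timestamp matters for this sketch.
case class SensorReading(id: String, timestamp: Long, temperature: Double)

class LateReadingsFilter extends ProcessFunction[SensorReading, SensorReading] {
  // Tag identifying the side output for late readings (the name is illustrative).
  val lateReadingsTag = new OutputTag[SensorReading]("late-readings")

  override def processElement(
      r: SensorReading,
      ctx: ProcessFunction[SensorReading, SensorReading]#Context,
      out: Collector[SensorReading]): Unit = {
    if (r.timestamp &lt; ctx.timerService().currentWatermark()) {
      // The watermark has already passed this timestamp: treat the event as late.
      ctx.output(lateReadingsTag, r)
    } else {
      // On-time events go to the regular output.
      out.collect(r)
    }
  }
}</code></pre></figure>

<p>The late events can then be read from the result of <code class="language-plaintext highlighter-rouge">.process()</code> via <code class="language-plaintext highlighter-rouge">getSideOutput()</code> with the same tag.</p>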

<h3 id="operator-state">Operator State</h3>

<p>For non-keyed state, we discussed a few options in <em>Architecture</em> &gt; <em>State</em> &gt; <em>Operator State</em>: List, Union and Broadcast. A flat map function can use list state by mixing in the trait <code class="language-plaintext highlighter-rouge">ListCheckpointed</code>, which requires implementing the <code class="language-plaintext highlighter-rouge">restoreState</code> and <code class="language-plaintext highlighter-rouge">snapshotState</code> methods. For example:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">class</span> <span class="nc">MyStatefulFunction</span><span class="o">()</span>
  <span class="k">extends</span> <span class="nc">RichFlatMapFunction</span><span class="o">[</span><span class="kt">SensorReading</span>, <span class="o">(</span><span class="kt">Long</span><span class="o">)]</span>
  <span class="k">with</span> <span class="nc">ListCheckpointed</span><span class="o">[</span><span class="kt">java.lang.Long</span><span class="o">]</span> <span class="o">{</span>

  <span class="o">...</span>
  <span class="k">override</span> <span class="k">def</span> <span class="nf">restoreState</span><span class="o">(</span><span class="n">state</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Long</span><span class="o">])</span> <span class="k">=</span> <span class="o">{</span> <span class="o">...</span> <span class="o">}</span>
  <span class="o">...</span>
  <span class="k">override</span> <span class="k">def</span> <span class="nf">snapshotState</span><span class="o">(</span><span class="n">chkpntId</span><span class="k">:</span> <span class="kt">Long</span><span class="o">,</span> <span class="n">ts</span><span class="k">:</span> <span class="kt">Long</span><span class="o">)</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">Long</span><span class="o">]</span> <span class="k">=</span> <span class="o">{</span> <span class="o">...</span> <span class="o">}</span>
<span class="o">}</span></code></pre></figure>

<p>To use other types of state such as union, we must use the more general trait, <code class="language-plaintext highlighter-rouge">CheckpointedFunction</code>:</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="k">class</span> <span class="nc">MyStatefulFunction</span><span class="o">()</span>
  <span class="k">extends</span> <span class="nc">RichFlatMapFunction</span><span class="o">[</span><span class="kt">SensorReading</span>, <span class="o">(</span><span class="kt">Long</span><span class="o">)]</span>
  <span class="k">with</span> <span class="nc">CheckpointedFunction</span> <span class="o">{</span>

  <span class="o">...</span>
  <span class="k">override</span> <span class="k">def</span> <span class="nf">initializeState</span><span class="o">(</span><span class="n">ctx</span><span class="k">:</span> <span class="kt">FunctionInitializationContext</span><span class="o">)</span><span class="k">:</span> <span class="kt">Unit</span> <span class="o">=</span> <span class="o">{</span>
    <span class="k">val</span> <span class="nv">desc</span> <span class="k">=</span> <span class="k">new</span> <span class="nc">ListStateDescriptor</span><span class="o">[</span><span class="kt">String</span><span class="o">](</span><span class="s">"rules"</span><span class="o">,</span> <span class="n">classOf</span><span class="o">[</span><span class="kt">String</span><span class="o">])</span>
    <span class="n">unionState</span> <span class="k">=</span>
      <span class="nv">ctx</span><span class="o">.</span><span class="py">getOperatorStateStore</span><span class="o">.</span><span class="py">getUnionListState</span><span class="o">(</span><span class="n">desc</span><span class="o">)</span>
  <span class="o">}</span>
  <span class="o">...</span>
  <span class="k">override</span> <span class="k">def</span> <span class="nf">snapshotState</span><span class="o">(</span><span class="n">ctx</span><span class="k">:</span> <span class="kt">FunctionSnapshotContext</span><span class="o">)</span><span class="k">:</span> <span class="kt">Unit</span> <span class="o">=</span> <span class="o">{</span> <span class="o">...</span> <span class="o">}</span>
<span class="o">}</span></code></pre></figure>

<h2 id="conclusion">Conclusion</h2>

<p>I’ve been working with stream processing for about 4 years and never took the time to learn about the most popular open source stream processing system.</p>

<p>One thing that I found surprising is how expressive Flink is, and how many different levels of abstractions are supported in the APIs. This also means there are many ways to do the same thing, which I tend to not be a big fan of.</p>]]></content><author><name>Guilherme Kunigami</name></author><category term="blog" /><category term="distributed systems" /><summary type="html"><![CDATA[In this post I’ll share my notes on the book Stream Processing with Apache Flink by Fabian Hueske and Vasiliki Kalavri. This book covers many aspects of the popular open-source Apache Flink, a stream processing engine.]]></summary></entry><entry><title type="html">Functionals</title><link href="https://www.kuniga.me/blog/2026/04/26/functionals.html" rel="alternate" type="text/html" title="Functionals" /><published>2026-04-26T00:00:00+00:00</published><updated>2026-04-26T00:00:00+00:00</updated><id>https://www.kuniga.me/blog/2026/04/26/functionals</id><content type="html" xml:base="https://www.kuniga.me/blog/2026/04/26/functionals.html"><![CDATA[<!-- This needs to be define as included html because variables are not inherited by Jekyll pages -->

<figure class="image_float_left">
  <img src="https://www.kuniga.me/resources/blog/2026-04-26-functionals/riesz.png" alt="Thumbnail of Frigyes Riesz" />
</figure>

<p>I started reading the book <em>The Theoretical Minimum</em> by Leonard Susskind and George Hrabovsky. A lot of the math from the early chapters looked familiar, but in <em>Chapter 6: The Principle of Least Action</em>, they describe and derive the <em>Euler-Lagrange equation</em>, which I don’t recall seeing before.</p>

<p>I wanted to explore these equations and their derivation, but from a more mathematical point of view. This led me to a short rabbit hole around functionals and Sobolev spaces and since I like learning things from first principles, I decided to cover functionals first.</p>

<!--more-->

<p>The thumbnail features the Hungarian mathematician Frigyes Riesz. He was first featured in the post <a href="https://www.kuniga.me/blog/2025/12/14/subharmonic-functions.html">Subharmonic Functions</a> and is back here because he’s considered one of the founders of functional analysis and we’ll cover one theorem named after him, the <em>Riesz Representation Theorem</em>.</p>

<h2 id="functionals">Functionals</h2>

<p>In the branch of math called functional analysis the core object is the <em>functional</em>. Without qualifiers it’s implicitly assumed that a functional is a linear one. In this post however, since we want to cover functionals in general, we’ll qualify linear functionals explicitly and assume the general case when saying <em>functionals</em>.</p>

<p>A <strong>functional</strong> is essentially a function $f$ where the domain is a vector space $H$ and the codomain is a field, either the reals or the complex numbers, generically denoted by $\mathbb{F}$:</p>

\[f : H \rightarrow \mathbb{F}\]

<p>More intuitively, a functional is a function that takes a function as input and returns a scalar, much like functional programming, which operates on functions as objects.</p>

<p>One may ask: what do functions have to do with vector spaces? In a vector space the <em>vector</em> is a concept more broad than, say, a tuple of scalars like $\mathbb{R}^3$. It just means it’s some object that satisfies a set of axioms (e.g. addition, scalar multiplication). An example of a function vector space is the set of continuous functions.</p>

<p>If this function (or map) satisfies additivity and scalar multiplication, then it’s called a <strong>linear functional</strong>. In other words if $x, y$ are members of a vector space $V$, and $\lambda \in \mathbb{F}$ and $f$ a linear map $f: V \rightarrow \mathbb{F}$:</p>

\[f(x + y) = f(x) + f(y) \\
f(\lambda x) = \lambda f(x)\]
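
<p>The functional programming analogy can be made literal. Below is a small sketch in Scala where the “vectors” are real functions on $[0, 1]$ and the functionals are higher-order functions returning a <code class="language-plaintext highlighter-rouge">Double</code>. All names are made up for this example, and the integral is only a numerical approximation (a midpoint Riemann sum).</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala">object FunctionalsSketch {
  // "Vectors" here are real functions on [0, 1].
  type Fn = Double =&gt; Double

  // Evaluation functional: maps a function x to the scalar x(0.5).
  val evalAtHalf: Fn =&gt; Double = x =&gt; x(0.5)

  // Integration functional: midpoint Riemann sum approximating the integral of x over [0, 1].
  def integral(x: Fn, n: Int = 1000): Double =
    (0 until n).map(i =&gt; x((i + 0.5) / n)).sum / n

  // Vector space operations on functions, defined pointwise.
  def add(x: Fn, y: Fn): Fn = t =&gt; x(t) + y(t)
  def scale(lambda: Double, x: Fn): Fn = t =&gt; lambda * x(t)

  def main(args: Array[String]): Unit = {
    val x: Fn = t =&gt; t * t
    val y: Fn = t =&gt; math.sin(t)
    println(evalAtHalf(x)) // the value of x at 0.5
    println(integral(x))   // close to 1/3
    // Additivity and scalar multiplication hold up to floating point error:
    println(math.abs(integral(add(x, y)) - (integral(x) + integral(y))) &lt; 1e-9)
    println(math.abs(integral(scale(3.0, x)) - 3.0 * integral(x)) &lt; 1e-9)
  }
}</code></pre></figure>

<p>Both <code class="language-plaintext highlighter-rouge">evalAtHalf</code> and <code class="language-plaintext highlighter-rouge">integral</code> are linear functionals. Something like <code class="language-plaintext highlighter-rouge">x =&gt; x(0.5) * x(0.5)</code> would still be a functional, just not a linear one.</p>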

<p>We’ll now cover some related concepts and properties of functionals.</p>

<h2 id="properties">Properties</h2>

<h3 id="norm">Norm</h3>

<p>The <strong>norm</strong> or <em>operator norm</em> of a functional $f$, denoted by $\norm{f}$, is defined as:</p>

\[\norm{f} = \sup_{\norm{x} \ne 0} \frac{\abs{f(x)}}{\norm{x}}\]

<p>That is, it’s the supremum of the absolute value of $f$ normalized by the size of its input. Note that it doesn’t make sense to talk about $\abs{f}$, even though its image is a scalar, since it requires a specific input for it to spit out a scalar.</p>
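
<p>As a quick sanity check of the definition, here’s a small worked example (the functional is picked arbitrarily for illustration). Take $H = \mathbb{R}^2$ with the Euclidean norm and $f(x) = x_1 + 2 x_2$, which can be written as the inner product of $x$ with $(1, 2)$. By Cauchy-Schwarz:</p>

\[\abs{f(x)} = \abs{\langle x, (1, 2) \rangle} \le \norm{(1, 2)} \norm{x} = \sqrt{5} \norm{x}\]

<p>so $\norm{f} \le \sqrt{5}$. The bound is attained at $x = (1, 2)$, where $\abs{f(x)} / \norm{x} = 5 / \sqrt{5} = \sqrt{5}$, hence $\norm{f} = \sqrt{5}$.</p>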

<h3 id="continuity">Continuity</h3>

<p>For continuity to make sense for functionals, the domain must be a <a href="https://www.kuniga.me/blog/2022/11/03/topological-equivalence.html">topological space</a>, i.e. it must have the notion of open sets, because <a href="https://www.kuniga.me/docs/math/topology.html#continuity">continuity</a> depends on these.</p>

<p>In a more specific case, if we assume the domain is a normed vector space, i.e. it has the notion of distance between its elements, then we can use the $\epsilon-\delta$ definition of continuity for the functional, that is, a functional is <strong>continuous</strong> at $x_0$ if for every $\epsilon \gt 0$, there exists $\delta \gt 0$ such that:</p>

\[\norm{x - x_0} \lt \delta \implies \abs{f(x) - f(x_0)} \lt \epsilon\]

<p>For a linear functional $f$ in particular, there’s an alternative characterization: if there’s an upper bound on how much bigger $f$ is compared to its input, then it’s continuous:</p>

<p><strong>Lemma 1.</strong> The linear functional $f: H \rightarrow \mathbb{F}$ is continuous if and only if</p>

\[\abs{f(x)} \le C \norm{x}\]

<p>for all $x \in H$ and some constant $C$.</p>

<proof>

Assume first that $f$ is bounded, that is, $\abs{f(x)} \le C \norm{x}$ for all $x \in H$. Because $f$ is linear, we have $f(0) = 0$. Consider continuity at $0$. We need to show that for every $\epsilon \gt 0$, there exists $\delta \gt 0$ such that:

$$
\norm{x} \lt \delta \implies \abs{f(x)} \lt \epsilon
$$

Since we have $\abs{f(x)} \le C \norm{x}$, we just choose $\delta = \epsilon/C$ if $C \gt 0$. Otherwise $f(x) = 0$ for all $x$, and the zero functional is continuous.

<br /><br />
Now for a general point $x_0$, we use the linearity of $f$ to show:

$$
f(x) - f(x_0) = f(x - x_0)
$$

using the hypothesis:

$$
\abs{f(x) - f(x_0)} = \abs{f(x - x_0)} \le C \norm{x - x_0}
$$

If $C \gt 0$ we again take $\delta = \epsilon/C$. So if $\norm{x - x_0} \lt \delta$, then:

$$
\abs{f(x) - f(x_0)} \le C \norm{x - x_0} \lt C \delta = \epsilon
$$

so $f$ is also continuous at $x_0$.
<br /><br />
Now consider the other direction, that assumes $f$ is continuous. Consider the case at 0, that gives us, for all $\epsilon \gt 0$, there exists $\delta \gt 0$:

$$
\norm{x} \lt \delta \implies \norm{f(x)} \lt \epsilon
$$

we can then take $\epsilon = 1$ and let $\delta_1$ be the corresponding constant. Let $x$ be any nonzero vector from the domain (for $x = 0$ the bound holds trivially since $f(0) = 0$). We can re-scale it as:

$$
y = \frac{\delta_1}{2 \norm{x}} x
$$

and we have that $\norm{y} = \delta_1 / 2$. So now $y$ is a point in the neighborhood of $0$ (defined by $\norm{y} \lt \delta_1$) and thus $\abs{f(y)} \lt 1$. By linearity we have:

$$
f(y) = f\left(\frac{\delta_1}{2 \norm{x}} x \right) = \frac{\delta_1}{2 \norm{x}} f(x)
$$

and since $\abs{f(y)} \lt 1$:

$$
\abs{\frac{\delta_1}{2 \norm{x}} f(x)} \lt 1 \implies \abs{f(x)} \lt \frac{2}{\delta_1} \norm{x}
$$

we can thus choose $C = 2 / \delta_1$ to show that

$$
\abs{f(x)} \le C \norm{x}
$$

so intuitively $C$ is the scaling factor that brings every point $x$ in the domain inside the neighborhood of $0$ where continuity holds.
</proof>
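
<p>To see that such a bound can actually fail, here’s a standard counterexample (the space and the sequence are chosen for illustration). Let $X$ be the space of continuously differentiable functions on $[0, 1]$ with the norm $\norm{x} = \sup_{t \in [0, 1]} \abs{x(t)}$, and let $f(x) = x'(0)$, which is a linear functional. For $x_n(t) = \sin(nt)/n$ with $n \ge 1$ we have:</p>

\[\norm{x_n} \le \frac{1}{n}, \qquad f(x_n) = \cos(0) = 1\]

<p>so $\abs{f(x_n)} / \norm{x_n} \ge n$ and no constant $C$ can satisfy $\abs{f(x)} \le C \norm{x}$ for all $x$. Accordingly $f$ is not continuous: $x_n \rightarrow 0$ in this norm but $f(x_n) = 1 \ne 0 = f(0)$.</p>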

<h3 id="differentiation">Differentiation</h3>

<p>Let $X, Y$ be vector spaces equipped with a norm and $f : X \rightarrow Y$. We say that $f$ is <strong>Fréchet differentiable</strong> at $x \in X$ if there exists a bounded linear map $A: X \rightarrow Y$ (the Fréchet derivative of $f$ at $x$) such that:</p>

\[(1) \quad \lim_{\norm{h} \rightarrow 0} \frac{\norm{f(x + h) - f(x) - A(h)}}{\norm{h}} = 0\]

<p>For $h \in X$. Note that $f$ is not necessarily a functional (it is one only if $Y = \mathbb{R}$ or $Y = \mathbb{C}$), and even if it is, it’s not necessarily linear. However if $f$ is a functional, then the Fréchet derivative is a linear functional because it’s a linear map from a vector space to a field.</p>

<p>This generalizes the derivative we see in real analysis. If we take $X = Y = \mathbb{R}$, then we can simplify $(1)$ to:</p>

\[(2) \quad \lim_{h \rightarrow 0} \frac{f(x + h) - f(x) - f'(x)h}{h} = 0\]

<p>By having $A(h) = f'(x)h$. If we add $f'(x)h/h = f'(x)$ to both sides of the equation, we get the more familiar:</p>

\[(3) \quad f'(x) = \lim_{h \rightarrow 0} \frac{f(x + h) - f(x)}{h}\]

<p>One might ask why we use this other form. The expression $f(x + h) - f(x)$ is an element of $Y$ and $h \in X$, so in order for $(3)$ to make sense, we’d need to define multiplication or division between $X$ and $Y$.</p>

<p>The Fréchet derivative also generalizes the <a href="https://www.kuniga.me/blog/2023/12/21/holomorphic-functions.html">complex derivatives</a>, the one that defines holomorphic functions and underpins complex analysis. We can also have the form $(2)$ but the implicit assumption is that in $f'(x)h$ the multiplication operator is the complex one.</p>

<p>Note how the Fréchet derivative goes one abstraction layer above by “wrapping” $f’(x)h$ as some function $A(h)$. This is similar to how topological spaces abstract normed spaces by working with open sets instead of norms (open set is a higher object than norms because they can be defined from norms but not the other way around).</p>
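
<p>As a sketch of the definition at work, consider a real Hilbert space $X$ (an assumption made here so that the inner product is symmetric) and the non-linear functional $f(x) = \norm{x}^2 = \langle x, x \rangle$. Expanding:</p>

\[f(x + h) - f(x) = \langle x + h, x + h \rangle - \langle x, x \rangle = 2 \langle x, h \rangle + \norm{h}^2\]

<p>Taking $A(h) = 2 \langle x, h \rangle$, which is linear in $h$ and bounded by Cauchy-Schwarz, the quotient in $(1)$ becomes $\norm{h}^2 / \norm{h} = \norm{h}$, which goes to $0$. So the Fréchet derivative of $f$ at $x$ is the linear functional $h \mapsto 2 \langle x, h \rangle$, matching the familiar $(x^2)' = 2x$ when $X = \mathbb{R}$.</p>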

<h2 id="linear-functionals">Linear Functionals</h2>

<p>Now we focus on properties that are only applicable if the functional is linear.</p>

<h3 id="kernel">Kernel</h3>

<p>Let $X, Y$ be vector spaces and $f$ a linear map between them. The <strong>kernel</strong> is the subspace of $X$ defined as:</p>

\[\ker f = \curly{x \in X : f(x) = 0}\]

<p>In other words, all elements in the domain that map to $0$ in the image. Note that $0$ here is not necessarily the scalar number $0$, but the $0$ element in the vector space $Y$.</p>

<p>We can verify that the kernel is indeed a subspace of $X$. It contains the zero element because $f$ is linear and thus $f(0) = 0$. If $x, y \in \ker f$, then $x + y \in \ker f$ again because $f$ is linear and $f(x + y) = f(x) + f(y) = 0 + 0$. If $x \in \ker f$ and $\lambda$ is a scalar, then $\lambda x \in \ker f$ because $f(\lambda x) = \lambda f(x) = 0$.</p>

<p>If $f$ is continuous, then $\ker f$ is closed. We can show this by using one of the topological definitions of <a href="https://www.kuniga.me/docs/math/topology.html#continuity">continuity</a>: A function $f: X \rightarrow Y$ is continuous if and only if for every $U$ that is a closed set in $Y$, $f^{-1}(U)$ is a closed set in $X$. Since $\curly{0}$ is a closed set in $Y$ and $\ker f$ is the pre-image of $\curly{0}$, $\ker f$ is closed.</p>

<p>So we know the kernel is a subspace of $X$, but it doesn’t necessarily inherit the same properties of the vector space $X$. If the domain is a <a href="https://www.kuniga.me/blog/2021/06/26/hilbert-spaces.html">Hilbert space</a> and $f$ is continuous however, then $\ker f$ is a closed subspace and it’s possible to show that it is also a Hilbert space.</p>

<h3 id="dual-space">Dual Space</h3>

<p>Intuitively things that have linear properties can form a vector space, because a lot of its axioms are about linear combination of vectors. Since linear functionals have linear properties, they can also form a vector space! This vector space is called the <strong>dual</strong> of the domain vector space of the functionals.</p>

<p>More specifically, the dual space of a vector space $V$ is the set of all linear functionals that have $V$ as domain, denoted with a superscript asterisk:</p>

\[V^{*} = \curly{f : V \rightarrow \mathbb{F}}\]

<p>where $f$ is a linear functional. The intuition here is that $f$ associates a measure to the vectors of $V$. For Hilbert spaces in particular, we have a nice identity between a vector space and its dual. Before we show that, we need the following result:</p>

<p><strong>Theorem 2.</strong> (Riesz Representation Theorem) Let $H$ be a Hilbert space and $f: H \rightarrow \mathbb{F}$ a continuous linear functional. Then, there exists a unique vector $y \in H$ such that:</p>

\[f(x) = \langle x, y\rangle \quad \forall x \in H\]

<p>and with $\norm{f} = \norm{y}$.</p>

<proof>

We first handle the case where $f(x) = 0$ for all $x \in H$. In that case we choose $y = 0$ and we're done. Otherwise, consider the kernel of $f$, $\ker f$, which we've seen is a vector space and closed because $f$ is continuous. Note that there is $z$ for which $f(z) \ne 0$ and thus $z \notin \ker f$, so $\ker f \ne H$.
<br /><br />
Recall <a href="https://www.kuniga.me/blog/2021/06/26/hilbert-spaces.html">The Projection Theorem</a>, which shows that given a closed subspace $S$ of a Hilbert space $H$ we can decompose $H$ into $S$ and $S^\perp$. More specifically, for each $x \in H$, we can find $x_S \in S$ and $x_{S^{\perp}} \in S^\perp$ such that $x = x_S + x_{S^{\perp}}$ and that $x_S, x_{S^{\perp}}$ are orthogonal, i.e. $\langle x_S, x_{S^{\perp}} \rangle = 0$. This is denoted as $H = S \oplus S^\perp$.
<br /><br />
Since $\ker f$ is a closed subspace of $H$, we can use such decomposition. We wish to show that $\ker f^\perp$ is one-dimensional, which means it has a basis of size 1 (not that it has a single element). Since $\ker f \ne H$, $\ker f^\perp$ contains nonzero vectors. Now choose nonzero $u, v \in \ker f^\perp$. Define $w = \alpha u - \beta v$, with $\alpha = f(v)$ and $\beta = f(u)$ as scalars. Since this is a linear combination and $\ker f^\perp$ is a vector space, $w \in \ker f^\perp$.
<br /><br />
Now do $f(w) = \alpha f(u) - \beta f(v)$ and replace the scalars:  $f(w) = f(v) f(u) - f(u) f(v) = 0$. Thus $w \in \ker f$. The only element that can belong to both a subspace and its orthogonal complement is $0$. This means that $w = 0$ and thus: $\alpha u = \beta v$. Both $\alpha$ and $\beta$ are nonzero, because a nonzero vector in $\ker f^\perp$ cannot also be in $\ker f$, so $u$ and $v$ are the same vector up to a scalar and $\ker f^\perp$ has a basis of size 1. Let's call that basis vector $u$.
<br /><br />
So from the projection theorem, every $x \in H$ can be written as $x = k + \alpha u$ where $k \in \ker f$. Applying $f(x)$ gives us $f(x) = f(k) + \alpha f(u)$. Since $k \in \ker f$, $f(k) = 0$ and thus $f(x) = \alpha f(u)$.
<br /><br />
We have that $\langle x, u \rangle = \langle k + \alpha u, u \rangle =  \langle k, u \rangle + \alpha \langle u, u \rangle$. $k$ and $u$ are orthogonal by definition and $\langle u, u \rangle = \norm{u}^2$, so $\langle x, u \rangle = \alpha \norm{u}^2$ or that $\alpha = \langle x, u \rangle / \norm{u}^2$.
<br /><br />
Replacing $\alpha$ in $f(x) = \alpha f(u)$ gives us

$$
f(x) = \frac{f(u)}{\norm{u}^2} \langle x, u \rangle
$$

Now choose

$$
y = \frac{\overline{f(u)}}{\norm{u}^2} u
$$

The fraction above is just a scalar factor so $y \in H, y \in \ker f^\perp$. We choose the conjugate of $f(u)$ because the identity for inner product is $\langle x , \alpha y \rangle = \overline{\alpha} \langle x , y \rangle$. We have that:

$$
\langle x, y \rangle = \left\langle x, \frac{\overline{f(u)}}{\norm{u}^2} u \right\rangle = \frac{f(u)}{\norm{u}^2} \langle x, u \rangle = f(x)
$$

So we found $y \in H$ such that $f(x) = \langle x, y \rangle$. Since our choice does not depend on $x$, it holds for all $x \in H$.
<br /><br />
Now we need to show this is unique. Suppose we have $y_1, y_2$ such that $\langle x, y_1 \rangle = \langle x, y_2 \rangle$ for all $x \in H$. Then $\langle x, y_1 - y_2 \rangle = 0$ for all $x$, including the case where $x = y_1 - y_2$. The only case in which $\langle x, x \rangle = 0$ is for $x = 0$, so $y_1 = y_2$.
<br /><br />
Finally we show that $\norm{f} = \norm{y}$. First by definition:

$$
\norm{f} = \sup_{x \ne 0} \frac{\abs{f(x)}}{\norm{x}}
$$

we have $f(x) = \langle x, y \rangle$ so:

$$
\norm{f} = \sup_{x \ne 0} \frac{\abs{\langle x, y \rangle}}{\norm{x}}
$$

by Cauchy-Schwarz: $\abs{\langle x, y \rangle} \le \norm{x} \norm{y}$ and thus:

$$
\frac{\abs{\langle x, y \rangle}}{\norm{x}} \le \norm{y}
$$

hence $\norm{f} \le \norm{y}$. For the other direction, if $y \ne 0$ we have:

$$
\frac{\abs{f(y)}}{\norm{y}} = \frac{\abs{\langle y, y \rangle}}{\norm{y}} = \frac{\norm{y}^2}{\norm{y}} = \norm{y}
$$

Since $\norm{f}$ is the supremum of $\abs{f(x)}/\norm{x}$ over all $x \ne 0$, and the value at $x = y$ is $\norm{y}$, we get $\norm{f} \ge \norm{y}$. Combined with $\norm{f} \le \norm{y}$, this gives $\norm{f} = \norm{y}$.
</proof>

<p>What this theorem is saying is that for any continuous linear functional $f$ over a Hilbert domain, there’s exactly one element in that domain that “encodes” $f$ as an inner product with that element.</p>
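
<p>In finite dimensions this can be checked numerically. The sketch below works in $\mathbb{R}^3$ with the usual dot product and treats a made-up linear functional <code class="language-plaintext highlighter-rouge">f</code> as a black box; it recovers the representing vector from the values of <code class="language-plaintext highlighter-rouge">f</code> on the standard basis, $y_i = f(e_i)$. The names and the specific functional are assumptions for illustration only.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala">object RieszSketch {
  type Vec = Vector[Double]

  def dot(x: Vec, y: Vec): Double = x.zip(y).map { case (a, b) =&gt; a * b }.sum
  def norm(x: Vec): Double = math.sqrt(dot(x, x))

  // A hypothetical linear functional on R^3, treated as a black box.
  val f: Vec =&gt; Double = x =&gt; 3 * x(0) - x(1) + 2 * x(2)

  // In R^n the representing vector has coordinates y_i = f(e_i).
  def representer(f: Vec =&gt; Double, n: Int): Vec =
    Vector.tabulate(n)(i =&gt; f(Vector.tabulate(n)(j =&gt; if (i == j) 1.0 else 0.0)))

  def main(args: Array[String]): Unit = {
    val y = representer(f, 3)
    val x = Vector(0.5, -2.0, 4.0)
    println(y) // Vector(3.0, -1.0, 2.0)
    // f(x) and the inner product of x with y agree up to floating point error:
    println(math.abs(f(x) - dot(x, y)) &lt; 1e-12)
    // The ratio |f(x)| / ||x|| evaluated at x = y equals ||y||:
    println(math.abs(f(y) / norm(y) - norm(y)) &lt; 1e-12)
  }
}</code></pre></figure>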

<p>We can now claim that Hilbert spaces are isomorphic to their dual, $H \cong H^*$, that is, there exists a bijection between these two sets. In this case the function is defined by:</p>

\[T(y) = f_y, \quad f_y(x) = \langle x, y \rangle\]

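<p>which is a bijection since for each $f \in H^*$ there’s a unique $y$ for which $f = f_y$. Conversely each $y$ defines a unique function $f_y$. Further since $\norm{f} = \norm{y}$ this is a “length-preserving” bijection, which leads to the stronger notion of <em>isometric isomorphism</em>. One caveat: when $\mathbb{F} = \mathbb{C}$ the map $T$ is conjugate-linear rather than linear, since $f_{\alpha y} = \overline{\alpha} f_y$.</p>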

<p>This is a special identity for Hilbert spaces. It does not hold for example for the slightly more general <a href="https://www.kuniga.me/blog/2021/06/26/hilbert-spaces.html">Banach space</a>.</p>

<h2 id="conclusion">Conclusion</h2>

<p>I thought this would be my first post on functional analysis, but I had forgotten I wrote about Hilbert spaces <a href="https://www.kuniga.me/blog/2021/06/26/hilbert-spaces.html">before</a>.</p>

<p>This was another topic for which I relied entirely on ChatGPT, and I really liked the interactive process. It’s very gratifying to start with a blurry view of it and gradually build a cohesive and more intuitive picture.</p>

<p>One of the most amusing moments was when I asked ChatGPT what happens if we go recursive and build a vector space of functionals, and then it pointed out it’s basically the dual space, which I had already studied. It then “clicked”.</p>

<p>It was also nice to connect with other parts I have studied in the past such as analysis and <a href="https://www.kuniga.me/docs/math/topology.html">topology</a>.</p>]]></content><author><name>Guilherme Kunigami</name></author><category term="blog" /><category term="functional analysis" /><summary type="html"><![CDATA[I started reading the book The Theoretical Minimum by Leonard Susskind and George Hrabovsky. A lot of the math from the early chapters looked familiar, but in Chapter 6: The Principle of Least Action, they describe and derive the Euler-Lagrange equation, which I don’t recall seeing before. I wanted to explore these equations and their derivation, but from a more mathematical point of view. This led me to a short rabbit hole around functionals and Sobolev spaces and since I like learning things from first principles, I decided to cover functionals first.]]></summary></entry><entry><title type="html">[Book] Complex Analysis</title><link href="https://www.kuniga.me/blog/2026/04/05/book-complex-analysis.html" rel="alternate" type="text/html" title="[Book] Complex Analysis" /><published>2026-04-05T00:00:00+00:00</published><updated>2026-04-05T00:00:00+00:00</updated><id>https://www.kuniga.me/blog/2026/04/05/book-complex-analysis</id><content type="html" xml:base="https://www.kuniga.me/blog/2026/04/05/book-complex-analysis.html"><![CDATA[<!-- This needs to be define as included html because variables are not inherited by Jekyll pages -->

<figure class="image_float_left">
  <img src="https://www.kuniga.me/resources//books/alfhors.png" alt="Book cover." />
</figure>

<p>In this post I’ll share a summary on the book <em>Complex Analysis</em> by Lars V. Ahlfors, and my journey in studying it.</p>

<p>I’ll start with the journey because I think it’s the more interesting part. The second part is basically a list of links to all the posts I wrote while studying this book.</p>

<p><br /></p>

<!--more-->

<h2 id="the-journey">The Journey</h2>

<p>I’ve wanted to study Complex Analysis since I read Jay Cummings’ <a href="https://www.kuniga.me/blog/2023/04/21/review-real-analysis.html">Real Analysis: A Long-Form Mathematics Textbook</a>. The book is very approachable and I finally understood many bits I had learned from Calculus. I still recall the “wow” moment when I saw the formal definition of integral.</p>

<p>The opportunity came when I was visiting Taichung in Taiwan and visited a bookstore called <em>Mollie Used Books</em>. There were only a few books in English and I was baffled to find a math textbook among them, so I <em>had</em> to get it. The book was only $4 if I recall correctly.</p>

<figure class="center_children">
    <img src="https://www.kuniga.me/resources/blog/2026-04-05-book-complex-analysis/mollie.jpg" alt="See caption" />
    <figcaption>Mollie Used Books in Taichung, Taiwan.</figcaption>
</figure>

<p>In retrospect, that was one of the biggest “penny wise, pound foolish” mistakes of mine. Complex analysis is a hard subject and Ahlfors is not beginner friendly.</p>

<p>I should have started with Tristan Needham’s <em>Visual Complex Analysis</em>, but I fell prey to the sunk cost fallacy: I ended up buying Needham’s book but stuck with Ahlfors as the main guide. I did learn about <a href="https://www.kuniga.me/blog/2024/12/15/multi-valued-functions.html">multi-valued functions</a> from <em>Visual Complex Analysis</em> because I couldn’t grok Ahlfors’ explanations.</p>

<p>I had to complement my readings online with <a href="https://math.stackexchange.com/">Mathematics</a> on StackExchange and scattered lecture notes. But because these notes approach subjects from different angles and in a different order, many times it wasn’t easy to make use of them to understand the book. A notable example was when studying <a href="https://www.kuniga.me/blog/2024/11/02/poles.html">Zeros and Poles</a>. <a href="https://terrytao.wordpress.com/2016/10/11/math-246a-notes-4-singularities-of-holomorphic-functions/">Terence Tao’s</a> notes were much more intuitive than the book, but he relies on a different foundation so I had to study a bunch of things such as Laurent series out of order.</p>

<p>As ChatGPT appeared, I started using it more and more and it largely replaced my other complementary sources. I was using it inefficiently though: I would still try to follow Ahlfors’ proofs and ask ChatGPT about a leap in an argument, but oftentimes ChatGPT wouldn’t be able to help because it was something very specific to the proof and my edition.</p>

<p>Only more recently I found that I should rely on ChatGPT for the entire proof, which might be completely different from the one in Ahlfors. One major benefit of using ChatGPT is that the initial proof it provides is very high level and intuitive but skims over details, which I can then dig into.</p>

<p>It took me roughly two and a half years to finish this book. I’d estimate I spent 2h on average per week working through it, so about 260 hours.</p>

<h2 id="summary">Summary</h2>

<h3 id="chapter-1-complex-numbers">Chapter 1: Complex Numbers</h3>

<p>Covers the most basic properties of complex numbers.</p>

<p>Posts from this chapter:</p>

<ul>
  <li><a href="https://www.kuniga.me/blog/2023/09/16/cardinality-of-complex.html">The Cardinality of Complex Numbers</a></li>
  <li><a href="https://www.kuniga.me/blog/2023/10/02/complex-geometry.html">Complex Numbers and Geometry</a></li>
  <li><a href="https://www.kuniga.me/blog/2023/11/03/gauss-lucas-theorem.html">The Gauss-Lucas Theorem</a></li>
</ul>

<h3 id="chapter-2-complex-functions">Chapter 2: Complex Functions</h3>

<p>Introduces functions of complex numbers. In particular holomorphic functions which are complex differentiable functions. Ahlfors calls holomorphic functions analytic, which seems to be outdated terminology. The two have different definitions, but it can be shown that analytic functions are the same as holomorphic functions.</p>

<p>Posts from this chapter:</p>

<ul>
  <li><a href="https://www.kuniga.me/blog/2023/12/21/holomorphic-functions.html">Holomorphic Functions</a></li>
</ul>

<h3 id="chapter-3-analytic-functions-as-mappings">Chapter 3: Analytic Functions as Mappings</h3>

<p>This chapter reviews point set topology, defines conformal maps (angle-preserving transformations) and Möbius transformations which he calls linear transformations.</p>

<p>Posts from this chapter:</p>

<ul>
  <li><a href="https://www.kuniga.me/blog/2023/12/30/conformal-maps.html">Conformal Maps</a></li>
  <li><a href="https://www.kuniga.me/blog/2024/01/08/mobius-transformation.html">Möbius Transformation</a></li>
  <li><a href="https://www.kuniga.me/blog/2024/01/13/cross-ratio.html">Cross Ratio</a></li>
  <li><a href="https://www.kuniga.me/blog/2024/01/20/circles-of-apollonius.html">Circles of Apollonius</a></li>
  <li><a href="https://www.kuniga.me/blog/2024/02/03/circles-symmetry.html">Symmetry Points of a Circle</a></li>
  <li><a href="https://www.kuniga.me/blog/2024/02/10/bipolar-coordinates.html">Bipolar Coordinates and Möbius Transformations</a></li>
</ul>

<h3 id="chapter-4-complex-integration">Chapter 4: Complex Integration</h3>

<p>As the name suggests, complex integration is defined here. Most of the time we’re interested in integration over a 1d-curve embedded in the 2d-complex plane (line or contour integrals). It seems common to denote contour integrals as $\oint$ but I used Ahlfors’ notation of the simple integral symbol $\int$ with the contour being implied by the subscript path, e.g. $\int_\gamma$.</p>

<p>Here’s where Cauchy’s name starts popping up everywhere. This seems to be the longest and most information-dense chapter of the book.</p>

<p>Posts from this chapter:</p>

<ul>
  <li><a href="https://www.kuniga.me/blog/2024/04/05/complex-integration.html">Complex Integration</a></li>
  <li><a href="https://www.kuniga.me/blog/2024/04/13/path-independent-line-integrals.html">Path-Independent Line Integrals</a></li>
  <li><a href="https://www.kuniga.me/blog/2024/04/26/cachy-integral-theorem.html">Cauchy Integral Theorem</a></li>
  <li><a href="https://www.kuniga.me/blog/2024/05/09/the-winding-number.html">The Winding Number</a></li>
  <li><a href="https://www.kuniga.me/blog/2024/06/06/cauchy-integral-formula.html">Cauchy’s Integral Formula</a></li>
  <li><a href="https://www.kuniga.me/blog/2024/08/31/removable-singularities.html">Removable Singularities</a></li>
  <li><a href="https://www.kuniga.me/blog/2024/11/02/poles.html">Zeros and Poles</a></li>
  <li><a href="https://www.kuniga.me/blog/2024/12/24/open-map.html">The Open Mapping Theorem</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/01/18/max-principle.html">The Maximum Principle</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/03/15/general-cauchy.html">The General Form of Cauchy’s Theorem</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/04/16/residue-theorem.html">The Residue Theorem</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/08/01/harmonic-functions.html">Harmonic Functions</a></li>
</ul>

<h3 id="chapter-5-series-and-product-development">Chapter 5: Series and Product Development</h3>

<p>This covers infinite series and products. Here’s where a bunch of surprising connections start to appear, especially with the <em>Weierstrass Factorization Theorem</em> and <em>The Riemann Zeta Function</em>. This was my favorite chapter.</p>

<p>Posts from this chapter:</p>

<ul>
  <li><a href="https://www.kuniga.me/blog/2025/05/31/runge-theorem.html">Runge’s Theorem</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/06/17/mittag-leffler-theorem.html">Mittag-Leffler’s Theorem</a></li>
  <li><a href="https://www.kuniga.me/blog/2024/07/02/holomorphic-functions-are-analytic.html">Holomorphic Functions are Analytic</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/07/02/weierstrass-factorization-theorem.html">Weierstrass Factorization Theorem</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/07/19/gamma-function.html">The Gamma Function</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/08/30/hadamard-theorem.html">Hadamard Factorization Theorem</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/10/25/riemann-zeta-function.html">The Riemann Zeta Function</a></li>
</ul>

<h3 id="chapter-6-conformal-mapping-dirichlets-problem">Chapter 6: Conformal Mapping. Dirichlet’s Problem</h3>

<p>Here the book discusses Dirichlet’s Problem: how to find a harmonic function in a region if we only know its values on the boundary. The conformal mapping part helps reduce the problem to simpler regions but I found it a bit tedious. Perron’s method to solve a special case of Dirichlet’s Problem was very interesting.</p>

<p>This was my least favorite chapter. Posts from this chapter:</p>

<ul>
  <li><a href="https://www.kuniga.me/blog/2025/12/14/subharmonic-functions.html">Subharmonic Functions</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/12/31/perron-method.html">The Perron Method</a></li>
</ul>

<h3 id="chapter-7-elliptic-functions">Chapter 7: Elliptic Functions</h3>

<p>This chapter felt a bit more digestible than the previous one, going back to some elementary ideas such as periodic functions. I got a glimpse of the connection with number theory and it tempted me to study number theory in more depth.</p>

<p>Posts from this chapter:</p>

<ul>
  <li><a href="https://www.kuniga.me/blog/2026/01/30/elliptic-functions.html">Elliptic Functions</a></li>
  <li><a href="https://www.kuniga.me/blog/2026/04/04/the-weiertrass-p-function.html">The Weierstrass ℘ Function</a></li>
</ul>

<h3 id="chapter-8-global-analytic-functions">Chapter 8: Global Analytic Functions</h3>

<p>I skipped this chapter entirely. The gist seems to be that all the results we studied assume the functions are (locally) single-valued, so we often have to restrict the domain to specific branches, which makes it impossible to have a single function that works for the whole domain.</p>

<p>This chapter discusses things like germs and sheaves that help us lift the single-valued constraint and allow defining a single function for the entire domain.</p>

<p>I skipped it because this feels like an entirely new beast with new terminology and ideas, and because I’m tired. I believe this is very important in fields like algebraic geometry, so if I ever muster the courage to study it, I’m hoping I’ll learn about this properly.</p>

<h2 id="final-thoughts">Final Thoughts</h2>

<p>Despite the time sunk and the frustration of feeling dumb for not being able to understand the book many times, I am very appreciative of what I have learned. I think this is hard stuff and I’m proud of myself for sticking with it for so long.</p>

<p>It feels good to be able to experience this knowledge that only a few people (most of whom, I presume, are math majors) have. It’s like going on a very long hike and getting very far, seeing many beautiful things along the way, while most people only explore the beginning and miss most of them.</p>

<p>Would any of this be useful? I have no idea. But I do hope I now have a stronger foundation to understand more practical fields such as AI and physics.</p>]]></content><author><name>Guilherme Kunigami</name></author><category term="blog" /><category term="analysis" /><summary type="html"><![CDATA[In this post I’ll share a summary on the book Complex Analysis by Lars V. Ahlfors, and my journey in studying it. I’ll start with the journey because I think it’s the more interesting. The second part is basically a link to all the posts I wrote from studying this book.]]></summary></entry><entry><title type="html">The Weierstrass ℘ Function</title><link href="https://www.kuniga.me/blog/2026/04/04/the-weiertrass-p-function.html" rel="alternate" type="text/html" title="The Weierstrass ℘ Function" /><published>2026-04-04T00:00:00+00:00</published><updated>2026-04-04T00:00:00+00:00</updated><id>https://www.kuniga.me/blog/2026/04/04/the-weiertrass-p-function</id><content type="html" xml:base="https://www.kuniga.me/blog/2026/04/04/the-weiertrass-p-function.html"><![CDATA[<!-- This needs to be define as included html because variables are not inherited by Jekyll pages -->

<figure class="image_float_left">
  <img src="https://www.kuniga.me/resources/blog/2026-04-04-the-weiertrass-p-function/wp.jpeg" alt="CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=1196795" />
</figure>

<p>In our post on Elliptic functions [2] we started with the simply periodic functions such as $\sin z$. We noted that $e^{2\pi i z / w}$ is the simplest of the periodic functions with period $w$ and that every simply periodic function $f(z)$ of period $w$ can be written as a function of it: $f(z) = g(e^{2\pi i z / w})$.</p>

<p>Then we introduced doubly periodic functions, also known as elliptic functions. One may ask if there is, analogously, the simplest elliptic function and whether it’s possible to write all elliptic functions as a function of it. The answer is yes! And this function is known as the Weierstrass ℘ function which we’ll study in this post.</p>

<!--more-->

<h2 id="fun-trivia">Fun Trivia</h2>

<p>The symbol ℘ denoting the Weierstrass elliptic function seems to be the only Unicode character that represents a very specific mathematical object.</p>

<p>Even symbols you might associate with specific meanings, such as π, are just Greek letters, so they’re not exclusively used to represent a constant. Even glyphs like ℝ, often associated with the set of real numbers, are just conventions.</p>

<p>Physics has its counterpart, ℏ, denoting the reduced Planck constant.</p>

<h2 id="the-simplest-function">The Simplest Function</h2>

<p>In our post on elliptic functions [2], we showed that an elliptic function without poles is a constant and that an elliptic function cannot have a single simple pole. Thus the simplest interesting function is one that has order 2. In this case it either has 2 distinct poles or it has a single pole with order 2, which is also known as a double pole. We’ll show that the Weierstrass $\wp$ function is an elliptic function with a double pole.</p>

<p>Let $M$ be a period module, that is, the lattice of periods of an elliptic function. We define the Weierstrass $\wp$ function as</p>

\[(1) \quad \wp(z) = \frac{1}{z^2} + \sum_{w \in M, w \ne 0} \left( \frac{1}{(z - w)^2} - \frac{1}{w^2} \right)\]

<p>We start by showing (<em>Lemma 1</em>) that the series converges, and thus that $\wp$ exists.</p>

<p><strong>Lemma 1.</strong> The series in $(1)$ converges uniformly on every compact set $K \subset \mathbb{C} \setminus M$.</p>

<proof>

Since $K$ is compact, then there is a finite

$$
R = \max_{z \in K} \abs{z}
$$

Assume for now that $w \in M$ satisfies $\abs{w} \gt 2R$. We can write

$$
\frac{1}{(z - w)^2} - \frac{1}{w^2} = \frac{2wz - z^2}{w^2(z - w)^2}
$$

From $\abs{w} \gt 2R$ we have that $\abs{z} \le \abs{w} / 2$. So $\abs{z - w} = \abs{w - z} \ge \abs{w} - \abs{z} \ge \abs{w} - \abs{w} / 2 = \abs{w}/2$. We have:

$$
\abs{\frac{2wz - z^2}{w^2(z - w)^2}} = \frac{\abs{2wz - z^2}}{\abs{w}^2 \abs{(z - w)}^2}
$$

using $\abs{w - z}^2 \ge (\abs{w}/2)^2$, the denominator is at least $\abs{w}^4 / 4$, so:

$$
\le \frac{4 \abs{2wz - z^2}}{\abs{w}^4} \le \frac{4 \left(\abs{2wz} + \abs{z}^2\right)}{\abs{w}^4}
$$

since $\abs{z} \le R \lt \abs{w} / 2$:

$$
\le \frac{4 \left(2 R \abs{w} + R^2\right)}{\abs{w}^4} \le \frac{12 R \abs{w}}{\abs{w}^4} = \frac{C'}{\abs{w}^3}
$$

for the constant $C' = 12 R$, which does not depend on $z$ or $w$, so we just need to show that

$$
(1.1) \quad \sum_{w \in M, w \ne 0} \frac{1}{\abs{w}^3}
$$

converges. The key observation is that the number of $w \in M$ in a disk of radius $r$ grows at an $O(r^2)$ rate because they're vertices of a lattice. That is, there exists a constant $\alpha$ such that $\# \curly{w \in M : \abs{w} \le r} \le \alpha r^2$.
<br /><br />
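Now consider the annulus $n \le \abs{w} \lt n + 1$ and let $a_n$ denote the number of points of $M$ in it. To estimate $a_n$ we use an area argument. Let $d \gt 0$ be the minimum distance between two distinct points of $M$. The disks of radius $d/2$ centered at the points of $M$ have disjoint interiors, and those centered at points in the annulus are contained in the larger annulus $n - d/2 \le \abs{z} \le n + 1 + d/2$. Comparing areas (for $n \ge d/2$):

$$
a_n \pi \left(\frac{d}{2}\right)^2 \le \pi \left(n + 1 + \frac{d}{2}\right)^2 - \pi \left(n - \frac{d}{2}\right)^2 = \pi (1 + d)(2n + 1)
$$

and hence $a_n \le 4 (1 + d)(2n + 1) / d^2 \le \beta n$ for some constant $\beta$ (the finitely many $n \lt d/2$ can be absorbed into $\beta$).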

So the number of $w$ in the annulus is $O(n)$. We can now tie $\abs{w}$ to the count of $w$ by rewriting the sum as a sum over "rings" of integer radii (the finitely many $w$ with $\abs{w} \lt 1$, if any, don't affect convergence):

$$
\sum_{w \in M, \abs{w} \ge 1} \frac{1}{\abs{w}^3} = \sum_{n = 1}^\infty \sum_{\substack{w \in M \\ n \le \abs{w} \lt n + 1}} \frac{1}{\abs{w}^3}
$$

since $n \le \abs{w} \lt n + 1$, $1 / \abs{w} \le 1 / n$:

$$
\le \sum_{n = 1}^\infty \sum_{\substack{w \in M \\ n \le \abs{w} \lt n + 1}} \frac{1}{n^3} = \sum_{n = 1}^\infty \frac{a_n}{n^3} \le \beta \sum_{n = 1}^\infty \frac{1}{n^2}
$$

which is a convergent series, so $(1.1)$ converges and the series in $(1)$ converges absolutely and uniformly on $K$ by the Weierstrass M-test.

Note that we assumed $\abs{w} \gt 2R$. However since there are only finitely many $w \in M$ with $\abs{w} \le 2R$, and the corresponding terms are holomorphic on $K$, the convergence results hold.

</proof>
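
<p>As a sanity check of the counting argument in the proof of <em>Lemma 1</em>, here is a minimal Python sketch. It assumes the square lattice with $w_1 = 1$ and $w_2 = i$ (a choice made only for this illustration), counts the lattice points in each ring $n \le \abs{w} \lt n + 1$ and accumulates partial sums of $(1.1)$. The function names <code>ring_counts</code> and <code>partial_sum</code> are hypothetical helpers.</p>

```python
import math

def ring_counts(n_max):
    """For the square lattice w = m + k*i (an assumption of this sketch),
    return counts[n] = number of lattice points whose modulus has
    integer part n, for n = 1..n_max."""
    counts = [0] * (n_max + 1)
    for m in range(-n_max - 1, n_max + 2):
        for k in range(-n_max - 1, n_max + 2):
            # math.isqrt gives floor(|w|) exactly, avoiding float rounding.
            n = math.isqrt(m * m + k * k)
            if n in range(1, n_max + 1):
                counts[n] += 1
    return counts

def partial_sum(n_max):
    """Partial sum of (1.1) over nonzero lattice points with |w| below n_max + 1."""
    total = 0.0
    for m in range(-n_max - 1, n_max + 2):
        for k in range(-n_max - 1, n_max + 2):
            r2 = m * m + k * k
            if r2 != 0 and (n_max + 1) ** 2 > r2:
                total += r2 ** -1.5  # 1 / |w|^3
    return total

counts = ring_counts(30)
# The ratio a_n / n stays bounded, as the O(n) estimate predicts.
print(max(counts[n] / n for n in range(1, 31)))
# The partial sums increase but level off as the radius grows.
print([round(partial_sum(n), 4) for n in (10, 20, 40, 80)])
```

<p>For other lattices the loop bounds over $m$ and $k$ would need to be adapted, since the ranges used here rely on $\abs{m + ki} \ge \max(\abs{m}, \abs{k})$.</p>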

<p>Now we need to show that $\wp$ is an elliptic function with a double pole. We note that the only poles of</p>

\[(2) \quad \frac{1}{(z - w)^2} - \frac{1}{w^2}\]

<p>are when $z = w$, that is, at the periods in $M$. So in a compact set that avoids the lattice points induced by $M$, each term $(2)$ is holomorphic. The Weierstrass convergence theorem states that if each $f_n(z)$ is holomorphic and the series $\sum_{n} f_n(z)$ converges uniformly on compact sets, then the sum is also holomorphic.</p>

<p>So, by <em>Lemma 1</em>, the series in $(1)$ is holomorphic on $\mathbb{C} \setminus M$. Near $z = 0$ the only singular term is $1/z^2$, so $\wp$ is a meromorphic function with a double pole at $z = 0$.</p>

<p><strong>Corollary 2.</strong> The function $\wp$ has a double pole at 0.</p>

<p>With that information we just need to show:</p>

<p><strong>Lemma 3.</strong> $\wp(z) = \wp(z + \lambda)$ for any  $\lambda \in M$</p>

<proof>

We wish to prove $\wp(z) = \wp(z + \lambda)$ for any  $\lambda \in M$. We start by replacing in the definition:

$$
\wp(z + \lambda) = \frac{1}{(z + \lambda)^2} + \sum_{w \in M, w \ne 0} \left( \frac{1}{(z + \lambda - w)^2} - \frac{1}{w^2} \right)
$$

First we split the series into two:

$$ = \frac{1}{(z + \lambda)^2} + \sum_{w \in M, w \ne 0} \frac{1}{(z + \lambda - w)^2} - \sum_{w \in M, w \ne 0} \frac{1}{w^2}
$$

We can reindex the first series via a change of variable $w' = w - \lambda$ (note that by construction $w' \in M$ too):

$$ = \frac{1}{(z + \lambda)^2} + \sum_{w' \in M, w' \ne -\lambda} \frac{1}{(z - w')^2}  - \sum_{w \in M, w \ne 0} \frac{1}{w^2}
$$

Consider the first series. It covers almost the exact same elements as the one from before the change of variable, except that the new one includes $w = 0$ and excludes $w = -\lambda$, so we can equate the series via:

$$
\sum_{w' \in M, w' \ne -\lambda} \frac{1}{(z - w')^2} = \sum_{w \in M, w \ne 0} \left( \frac{1}{(z - w)^2} \right) - \frac{1}{(z + \lambda)^2} + \frac{1}{z^2}
$$

Let $P(z)$ be the series from $\wp(z)$, so that we have:

$$
\wp(z) = \frac{1}{z^2} + P(z)
$$

and we've shown that

$$
\wp(z + \lambda) = \frac{1}{(z + \lambda)^2} + P(z) - \frac{1}{(z + \lambda)^2} + \frac{1}{z^2}
$$

cancelling terms:

$$
\wp(z + \lambda) = P(z) + \frac{1}{z^2} = \wp(z)
$$

So $\wp$ is a function with period module $M$.
</proof>

<p>to show $\wp$ is elliptic.</p>

<p>So we’ve defined the Weierstrass function and have shown it’s elliptic with a double pole at 0. Now we consider some other properties.</p>
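
<p>To make <em>Lemma 3</em> concrete, here is a minimal numerical sketch in Python. It assumes the square lattice with base $(1, i)$ and truncates the series $(1)$ to the points $m w_1 + k w_2$ with $\abs{m}, \abs{k} \le N$. Both the lattice and the helper name <code>wp</code> are choices made for this illustration. Since the sum is truncated, periodicity only holds approximately, with an error that shrinks as $N$ grows.</p>

```python
def wp(z, N=100, w1=1.0, w2=1j):
    """Truncated version of series (1) for the lattice spanned by w1, w2.
    The defaults (square lattice, N = 100) are assumptions of this sketch."""
    total = 1 / z**2
    for m in range(-N, N + 1):
        for k in range(-N, N + 1):
            if m == 0 and k == 0:
                continue  # the w = 0 term is the leading 1/z^2
            w = m * w1 + k * w2
            total += 1 / (z - w)**2 - 1 / w**2
    return total

z = 0.3 + 0.2j
# Both differences should be small compared to |wp(z)|.
print(abs(wp(z)))
print(abs(wp(z + 1) - wp(z)))
print(abs(wp(z + 1j) - wp(z)))
```

<p>Increasing <code>N</code> should make the two differences smaller, which is consistent with the exact function being periodic.</p>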

<h2 id="properties">Properties</h2>

<p>Because $\wp$ has only one double pole at 0, we can write its <a href="https://www.kuniga.me/blog/2024/11/02/poles.html">Laurent Series</a> expansion as:</p>

\[(3) \quad \wp(z) = c_{-2} z^{-2} + \sum_{n = 0}^\infty c_nz^n\]

<p>We can then use <em>Lemma 4</em> to show that $\wp$ is even:</p>

<p><strong>Lemma 4.</strong> Let $f(z)$ be an elliptic function with period module $M$ whose only poles are the points of $M$, and whose Laurent expansion around $0$ is as in $(3)$. Then $f(z) = f(-z)$.</p>

<proof>
Let $g(z) = f(z) - f(-z)$. Since $z^{k} = (-z)^{k}$ for even $k$, the corresponding terms in the Laurent series expansion cancel out, leaving us with only the odd terms:

$$
g(z) = 2 c_1 z + 2 c_3 z^3 + 2 c_5 z^5 + \cdots
$$

This means $g(z)$ has no pole at 0 and is thus holomorphic around 0; in fact $g(0) = 0$ by replacing $z = 0$ above. $g(z)$ also has every $w \in M$ as a period, like $f(z)$, because $g(z + w) = f(z + w) - f(-z - w) = f(z) - f(-z) = g(z)$. The only other poles $g(z)$ might have are at the points $w \in M$, because these are the only poles of $f(z)$, but by periodicity $g$ behaves around $w$ as it does around $0$, where it is holomorphic. We conclude $g$ is an elliptic function without poles, which means $g$ is constant, and since $g(0) = 0$ this constant is 0. This means that $f(z) = f(-z)$ everywhere.

</proof>

<p>Another conclusion from the Laurent series is that $\wp$ has residue 0 [4], because by definition it’s the coefficient $c_{-1}$ of the Laurent series.</p>

<p><strong>Corollary 5.</strong> The function $\wp$ has residue 0.</p>

<p>We now explore the anti-derivative of the Weierstrass function.</p>

<h2 id="weierstrass-zeta-function">Weierstrass Zeta Function</h2>

<p>Recall that the anti-derivative $g$ of a function $f$ is one such that $g' = f$. For the Weierstrass function, its anti-derivative is known as the <em>Weierstrass Zeta Function</em> and denoted by $\zeta$, but conventionally we work with the negative of the anti-derivative, so $\zeta'(z) = -\wp(z)$.</p>

<p>We can compute it from $(1)$ by integrating over the terms of the series, as shown in <em>Lemma 6.</em></p>

<p><strong>Lemma 6.</strong> The Weierstrass Zeta Function can be written as</p>

\[(4) \quad \zeta(z) = \frac{1}{z} + \sum_{w \in M, w \ne 0} \left(\frac{1}{z-w} + \frac{1}{w} + \frac{z}{w^2}\right)\]

<proof>
We integrate each term of $(1)$ individually. The integral of $1/z^2$ is $-1/z$ (plus a constant). The integral of $1/{(z-w)}^2$ is $-1/(z - w)$ and that of $-1/w^2$ is $-z/w^2$ (plus a constant that may depend on $w$). So if we negate the result, and denote the negated constants by $C_1$ and $C_2$, we have

$$
\frac{1}{z} + \sum_{w \in M, w \ne 0} \left(\frac{1}{z-w} + \frac{z}{w^2} + C_2 \right) + C_1
$$

We can normalize $C_1$ to 0, but we can't do that for $C_2$ because in that case the series would not converge. To see why, consider the Taylor expansion of $1/(z-w)$ around 0 (which is holomorphic since $w \ne 0$):

$$
\frac{1}{z - w} = -\frac{1}{w} - \frac{z}{w^2} - \frac{z^2}{w^3} - \cdots
$$

If we set $C_2 = 0$, then the term inside the series becomes

$$
\frac{1}{z - w} + \frac{z}{w^2} = - \frac{1}{w} - \frac{z^2}{w^3} - \cdots
$$

which is dominated by $1/w$ and the series becomes the non-convergent harmonic series. If we set $C_2 = 1/w$, then the term inside the series becomes

$$
\frac{1}{z - w} + \frac{1}{w} + \frac{z}{w^2} = - \frac{z^2}{w^3} - \cdots = O(1/w^3)
$$

and the corresponding series now converges. This proves the result we are interested in.

</proof>
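
<p>The relation $\zeta'(z) = -\wp(z)$ can be checked numerically with a small Python sketch. It assumes the square lattice with base $(1, i)$, truncates both $(1)$ and $(4)$ to the same finite set of lattice points, and approximates the derivative with a central difference. The helper names <code>lattice</code>, <code>wp</code> and <code>zeta</code> are hypothetical.</p>

```python
def lattice(N, w1=1.0, w2=1j):
    """Nonzero points m*w1 + k*w2 with |m|, |k| at most N
    (square lattice by default, an assumption of this sketch)."""
    return [m * w1 + k * w2
            for m in range(-N, N + 1)
            for k in range(-N, N + 1)
            if (m, k) != (0, 0)]

def wp(z, pts):
    """Truncated version of series (1)."""
    return 1 / z**2 + sum(1 / (z - w)**2 - 1 / w**2 for w in pts)

def zeta(z, pts):
    """Truncated version of series (4)."""
    return 1 / z + sum(1 / (z - w) + 1 / w + z / w**2 for w in pts)

pts = lattice(50)
z, h = 0.3 + 0.2j, 1e-4
# Central difference approximation of zeta'(z).
dzeta = (zeta(z + h, pts) - zeta(z - h, pts)) / (2 * h)
# If zeta' = -wp, this should be close to 0.
print(abs(dzeta + wp(z, pts)))
```

<p>Because the two truncated series are term-by-term derivatives of each other (up to sign), the remaining discrepancy comes only from the finite difference step <code>h</code> and floating point rounding.</p>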

<p>The Laurent series of $\zeta$ around 0 has the principal part $1/z$ and the rest, $h(z)$, is a holomorphic function:</p>

\[\zeta(z) = \frac{1}{z} + h(z)\]

<p>This means this function has residue 1 at 0. <em>Lemma 7</em> generalizes this to all other periods in $M$.</p>

<p><strong>Lemma 7.</strong> The function $\zeta$ has residue 1 at every $w \in M$.</p>

<proof>
We can rewrite $(4)$ as:

$$
\zeta(z) = \frac{1}{z} + \left(\frac{1}{z-\lambda} + \frac{1}{\lambda} + \frac{z}{\lambda^2}\right) + \sum_{w \in M, w \ne 0, w \ne \lambda} \left(\frac{1}{z-w} + \frac{1}{w} + \frac{z}{w^2}\right)
$$

so around $\lambda$ the only pole is $1/(z - \lambda)$. This gives us the residue 1 around any $\lambda \in M$.
</proof>

<p>From <em>Lemma 40</em> in [2], which claims that the sum of the residues of an elliptic function is 0, this implies that $\zeta$ is not an elliptic function: it has a single pole with residue 1 in each period parallelogram. It is however, as <em>Lemma 8</em> shows, a <em>quasi-periodic</em> function. A <strong>quasi-periodic</strong> function $f$ is one for which $f(z + w) = f(z) + \eta_w$ for some constant $\eta_w \in \mathbb{C}$ dependent on $w$.</p>

<p>Note that a quasi-periodic function is a weaker version of periodic functions, since the latter are quasi-periodic functions with $\eta_w = 0$.</p>

<p><strong>Lemma 8.</strong> The function $\zeta$ is quasi-periodic.</p>

<proof>

Let

$$
g_\lambda(z) = \zeta(z + \lambda) - \zeta(z)
$$

for some $\lambda \in M$. Differentiating:

$$
g_\lambda'(z) = -\wp(z + \lambda) + \wp(z)
$$

Since $\wp$ has period $\lambda$, $g_\lambda'(z) = 0$. For $z \not \in M$, we have that $\zeta(z)$ and $\zeta(z + \lambda)$ have no poles and thus $g_\lambda(z)$ is holomorphic, and then $g_\lambda'(z) = 0$ implies $g_\lambda(z)$ is a constant on $\mathbb{C} \setminus M$.
<br /><br />
We wish to show that $g_\lambda$ is holomorphic even around $z \in M$. We've seen in the proof of <i>Lemma 7</i> that for any $a \in M$ we have:

$$
(8.1) \quad \zeta(z) = \frac{1}{z-a} + h(z)
$$

for some holomorphic function $h(z)$. Since $a + \lambda$ is also in $M$, we can also write:

$$
\zeta(z) = \frac{1}{z-(a + \lambda)} + k(z)
$$

for another holomorphic function $k(z)$. Doing this one for $z + \lambda$:

$$
(8.2) \quad  \zeta(z + \lambda) = \frac{1}{(z + \lambda) - (a + \lambda)} + k(z + \lambda) = \frac{1}{z - a} + k(z + \lambda)
$$

so both $(8.1)$ and $(8.2)$ have a single pole when $z$ is around $a$. If we subtract one from another:

$$
g_\lambda(z) = \zeta(z + \lambda) - \zeta(z) = k(z + \lambda) - h(z)
$$

we are left with a holomorphic function, even around points in $M$. Thus $g_\lambda(z)$ is entire and thus must be a constant. We then have

$$
\zeta(z + \lambda) = \zeta(z) + \eta_\lambda
$$

which is the definition of a quasi-periodic function.
</proof>
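
<p>Here is a minimal Python sketch illustrating <em>Lemma 8</em>. It assumes the square lattice with base $(1, i)$ and truncates $(4)$ to the points $m w_1 + k w_2$ with $\abs{m}, \abs{k} \le N$ (the helper name <code>zeta</code> and these choices are mine, for illustration only). It evaluates $\zeta(z + \lambda) - \zeta(z)$ at two unrelated points $z$ to see that the difference does not depend on $z$.</p>

```python
def zeta(z, N=100, w1=1.0, w2=1j):
    """Truncated version of series (4) for the lattice spanned by w1, w2.
    The defaults (square lattice, N = 100) are assumptions of this sketch."""
    total = 1 / z
    for m in range(-N, N + 1):
        for k in range(-N, N + 1):
            if m == 0 and k == 0:
                continue  # the w = 0 term is the leading 1/z
            w = m * w1 + k * w2
            total += 1 / (z - w) + 1 / w + z / w**2
    return total

for z in (0.3 + 0.2j, -0.1 + 0.45j):
    base = zeta(z)
    eta1 = zeta(z + 1) - base   # estimate of the constant for the period 1
    eta2 = zeta(z + 1j) - base  # estimate of the constant for the period i
    print(z, eta1, eta2)
```

<p>For this particular lattice the two estimates should come out close to $\pi$ and $-\pi i$ respectively, regardless of the point $z$ chosen, up to the truncation error.</p>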

<p>There’s a nice identity between the base of the period module and the corresponding quasi-period constants for the zeta function, known as <strong>Legendre’s Relation</strong>. <em>Lemma 9</em> has more details:</p>

<p><strong>Lemma 9.</strong> Let $(w_1, w_2)$ be a base for the period module $M$, ordered so that $\mbox{Im}(w_2 / w_1) \gt 0$. Let $\eta_1, \eta_2$ be the constants in $\zeta(z + w_1) = \zeta(z) + \eta_1$ and $\zeta(z + w_2) = \zeta(z) + \eta_2$. Then:</p>

\[\eta_1 w_2 - \eta_2 w_1 = 2 \pi i\]

<proof>

Let $\delta P$ be the boundary of the parallelogram of the lattice induced by $(w_1, w_2)$ at the origin, that is with vertices $0, w_1, w_1 + w_2, w_2$. Let $\gamma$ be $\delta P$ shifted by a small complex amount $\epsilon$ (for example $\epsilon = -t (w_1 + w_2)$ for a small $t \gt 0$), so that $0$ is the only point of $M$ inside this closed curve and none lies on it. The vertices of $\gamma$ are then $\epsilon, w_1 + \epsilon, w_1 + w_2 + \epsilon, w_2 + \epsilon$, and the assumption $\mbox{Im}(w_2 / w_1) \gt 0$ means that visiting them in this order traverses $\gamma$ counterclockwise.
<br /><br />
To prove the lemma we'll compute

$$\int_{\gamma} \zeta(z) dz$$

with two different methods to obtain an equality. First we compute using residues. The only pole of $\zeta(z)$ inside $\gamma$ is the one at 0, which has residue 1 and winding number $n(\gamma, 0) = 1$, so by Cauchy's residue theorem we have:

$$
\frac{1}{2 \pi i} \int_{\gamma} \zeta(z) dz = n(\gamma, 0) \mbox{Res}_{z=0} \zeta(z) = 1
$$

so our first method gives:

$$
\int_{\gamma} \zeta(z) dz = 2 \pi i
$$

The other method is by integrating it via paths. The boundary $\gamma$ can be decomposed into 4 segments: $\epsilon$ to $w_1 + \epsilon$, $w_1 + \epsilon$ to $w_1 + w_2 + \epsilon$, and so forth. Let's consider the first segment, $\epsilon \rightarrow w_1 + \epsilon$:

$$
I_1 = \int_{\epsilon}^{w_1 + \epsilon} \zeta(z) dz
$$

now we do the second $w_1 + \epsilon \rightarrow w_1 + w_2 + \epsilon$:

$$
I_2 = \int_{w_1 + \epsilon}^{w_1 + w_2 + \epsilon} \zeta(z) dz
$$

We do a change of variable $z = w + w_1$ to get:

$$
I_2 = \int_{\epsilon}^{w_2 + \epsilon} \zeta(w + w_1) dw
$$

Using the quasi-periodicity: $\zeta(w + w_1) = \zeta(w) + \eta_1$ so

$$
I_2 = \int_{\epsilon}^{w_2 + \epsilon} \zeta(w) dw + w_2 \eta_1
$$

the third segment, $w_1 + w_2 + \epsilon \rightarrow w_2 + \epsilon$, gives us:

$$
I_3 = \int_{w_1 + w_2 + \epsilon}^{w_2 + \epsilon} \zeta(z) dz
$$

Now we use $z = w + w_2$ and $\zeta(w + w_2) = \zeta(w) + \eta_2$. Since this segment is traversed from $w_1 + \epsilon$ back to $\epsilon$, the constant $\eta_2$ is integrated over a displacement of $-w_1$:

$$
I_3 = \int_{w_1 + \epsilon}^{\epsilon} \zeta(w) dw - w_1 \eta_2
$$

finally the fourth segment, $w_2 + \epsilon \rightarrow \epsilon$, is

$$
I_4 = \int_{w_2 + \epsilon}^{\epsilon} \zeta(z) dz
$$

Adding together, the remaining integral in $I_3$ cancels with $I_1$ and the one in $I_2$ cancels with $I_4$, and what remains is $w_2 \eta_1 - w_1 \eta_2$. Equating with the result of the first method gives $\eta_1 w_2 - \eta_2 w_1 = 2 \pi i$, which proves the lemma.

</proof>

<p><strong>Lemma 10.</strong> The function $\zeta(z)$ is odd.</p>

<proof>
This is similar to the proof of <i>Lemma 4</i>. We define $g(z) = \zeta(z) + \zeta(-z)$. Around 0 the principal parts $1/z$ and $-1/z$ cancel out, making $g(z)$ holomorphic there, and since the series in $(4)$ vanishes at $z = 0$, $g(z) \rightarrow 0$ for $z \rightarrow 0$.
<br /><br />
We now use the fact that $\wp$ is even, and we have:

$$
(10.1) \quad \zeta'(-z) = -\wp(-z) = -\wp(z) = \zeta'(z)
$$

Differentiate $g(z)$:

$$
g'(z) = \zeta'(z) - \zeta'(-z)
$$

from $(10.1)$ we get $g'(z) = 0$ and hence $g(z)$ is constant. Since $g(z) = 0$ for $z \rightarrow 0$, $g(z) = 0$ everywhere and thus $\zeta(z) = -\zeta(-z)$.

</proof>

<h2 id="weierstrass-sigma-function">Weierstrass Sigma Function</h2>

<p>Another important function related to $\wp$ is the <em>Weierstrass sigma function</em>, denoted by $\sigma$ and defined as a function whose logarithmic derivative is $\zeta(z)$. More precisely, we have that:</p>

\[\frac{\sigma'(z)}{\sigma(z)} = \frac{d \ln \sigma(z)}{dz} = \zeta(z)\]

<p>This function has a specific product form, as shown in <em>Lemma 11</em>.</p>

<p><strong>Lemma 11.</strong> The Weierstrass sigma function can be expressed as</p>

\[(5) \quad \sigma(z) = z \prod_{w \in M, w \ne 0} \left(1 - \frac{z}{w}\right) \exp \left(\frac{z}{w} + \frac{z^2}{2w^2}\right)\]

<proof>
We just need to take the log of $(5)$ and differentiate, and see if we obtain $(4)$. For now we assume $z \not \in M$, so we can take the log to obtain:

$$
(11.1) \quad \ln \sigma(z) = \ln z + \sum_{w \in M, w \ne 0} \left( \ln \left(1 - \frac{z}{w} \right) + \frac{z}{w} +  \frac{z^2}{2w^2} \right)
$$

both $\ln z$ and $\ln (1 - z/w)$ exist since $z \not \in M$ and we can choose a suitable branch to make it a single-valued function. We can show the series converges in $(11.1)$ in which case we can differentiate term by term, to obtain:

$$
\frac{d\ln \sigma(z)}{dz} = \frac{1}{z} + \sum_{w \in M, w \ne 0} \left( \frac{d}{dz} \ln \left(1 - \frac{z}{w} \right) + \frac{1}{w} +  \frac{z}{w^2} \right)
$$

with

$$
\frac{d}{dz} \ln \left(1 - \frac{z}{w} \right) = - \frac{1/w}{1 - z/w} = \frac{1}{z - w}
$$

which gives us:

$$
\frac{d\ln \sigma(z)}{dz} = \frac{1}{z} + \sum_{w \in M, w \ne 0} \left(  \frac{1}{z - w} + \frac{1}{w} +  \frac{z}{w^2} \right)
$$

which is exactly $\zeta(z)$. It remains to show this also holds around any $\lambda \in M$. Since $\lambda$ is a simple zero of $\sigma(z)$, near $\lambda$ we have

$$
\sigma(z) = (z - \lambda) g(z)
$$

where $g(z)$ is non-zero and holomorphic. Taking the derivative:

$$
\sigma'(z) = g(z) + (z - \lambda) g'(z)
$$

dividing:

$$
\frac{\sigma'(z)}{\sigma(z)} = \frac{1}{z - \lambda} + \frac{g'(z)}{g(z)}
$$

compare that with the shape of $\zeta$ near $\lambda$

$$
\zeta(z) = \frac{1}{z - \lambda} + h(z)
$$

for a holomorphic function $h(z)$. If we take the difference between $\zeta(z)$ and $\frac{\sigma'(z)}{\sigma(z)}$, say $\delta(z)$, the poles cancel and we end up with a holomorphic function

$$
\delta(z) = h(z) - \frac{g'(z)}{g(z)}
$$

even at $z = \lambda$. We have shown $\delta(z) = 0$ for $z \not \in M$, so by continuity $\delta(\lambda) = 0$ as well. Hence $\zeta(z) = \sigma'(z) / \sigma(z)$ everywhere, as an identity between meromorphic functions.

</proof>

<p><em>Quasi-periodicity</em> doesn’t have to be additive, it can also be multiplicative. In the sigma function case it’s possible to show that, for a period $w$ with $w/2 \not \in M$ (for example an element of a base of $M$),</p>

\[\sigma(z + w) = -\exp(\eta_w (z + w / 2)) \sigma(z)\]

<h2 id="the-derivative-of-wpz">The derivative of $\wp(z)$</h2>

<p>The derivative $\wp'(z)$ can be expressed as:</p>

<p><strong>Lemma 12.</strong></p>

\[\wp'(z) = -2 \sum_{w \in M} \frac{1}{(z - w)^3}\]

<proof>
We differentiate $(1)$ term-by-term to obtain:

$$
\wp'(z) = -2 \frac{1}{z^3} + \sum_{w \in M, w \ne 0} \left(-2 \frac{1}{(z - w)^3}\right)
$$

The term $1/w^2$ disappears because it's constant with respect to $z$. Now we don't have to handle $w = 0$ separately and can move it into the sum:

$$
= -2 \sum_{w \in M} \frac{1}{(z - w)^3}
$$

</proof>

<p>From this expression we can see the poles of $\wp'(z)$ are the same as those of $\wp(z)$, i.e. the periods in $M$. Also, by plugging $z$ and $-z$ into this formula (and using that $M$ is symmetric under $w \mapsto -w$), we can conclude that</p>

<p><strong>Corollary 13.</strong> The function $\wp'(z)$ is odd.</p>

<p>By <em>Lemma 3</em> in [2], which says that elliptic functions are closed under differentiation, we conclude that:</p>

<p><strong>Corollary 14.</strong>  The function $\wp'(z)$ is elliptic with the same period module $M$.</p>

<p>We now characterize the zeros of $\wp'(z)$, via <em>Lemma 15</em>:</p>

<p><strong>Lemma 15.</strong> Let $a \not \in M$. Then $\wp'(a) = 0$ if and only if $2a \in M$.</p>

<proof>
First we assume $\wp'(a) = 0$ and we want to show $2a \in M$. We define the function $g(z) = \wp(z) - \wp(a)$. Since we're assuming $a \not \in M$, $\wp(a)$ is defined and thus $g(a) = 0$. Differentiating $g(z)$ gives us $g'(z) = \wp'(z)$. Since $\wp'(a) = 0$ by hypothesis, $g'(a) = 0$. We conclude that $a$ is a zero of $g(z)$ with multiplicity at least 2.
<br /><br />
Since $\wp(a)$ is constant, $g(z)$ has the same poles as $\wp(z)$, namely a single double pole at $0$ in the fundamental parallelogram, which by <i>Lemma 41</i> in [2] means $g(z)$ has exactly 2 zeros in this region, counting multiplicity. This means the double zero at $a$ accounts for all the zeros of $g(z)$.
<br /><br />
We have that $g(z)$ is even:
$$
g(-z) = \wp(-z) - \wp(a) =  \wp(z) - \wp(a) = g(z)
$$
Thus if $a$ is a zero of $g(z)$, so is $-a$. We claim then that $a$ and $-a$ are the same point modulo $M$. Otherwise some translation of $-a$ would correspond to a distinct point in the parallelogram of $a$ and that would give us at least 3 zeros (counting the double zero at $a$), so $a \equiv -a \pmod M$ which implies $2a \in M$.
<br /><br />
Now suppose $2a \in M$. We wish to prove $\wp'(a) = 0$. The hypothesis gives us $a \equiv -a \pmod M$. Since $\wp'$ has period module $M$, $\wp'(-a) = \wp'(a)$. Since $\wp'(z)$ is odd, $\wp'(-a) = -\wp'(a)$. Combining the two, $-\wp'(a) = \wp'(a)$, so $2\wp'(a) = 0$ and thus $\wp'(a) = 0$.
</proof>
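
<p><em>Lemma 15</em> and <em>Corollary 13</em> can be illustrated numerically. The Python sketch below assumes the square lattice with base $(1, i)$ and truncates the series from <em>Lemma 12</em> to the points $m w_1 + k w_2$ with $\abs{m}, \abs{k} \le N$. The helper name <code>wp_prime</code> and the sample points are choices made for this illustration.</p>

```python
def wp_prime(z, N=100, w1=1.0, w2=1j):
    """Truncated version of the series in Lemma 12.
    The defaults (square lattice, N = 100) are assumptions of this sketch."""
    total = 0
    for m in range(-N, N + 1):
        for k in range(-N, N + 1):
            total += 1 / (z - (m * w1 + k * w2))**3
    return -2 * total

# Half periods: a is not in M but 2a is, so the values should be close to 0.
for a in (0.5, 0.5j, 0.5 + 0.5j):
    print(a, abs(wp_prime(a)))

# A generic point: 2z is not in M, so the value should be far from 0.
z = 0.3 + 0.2j
print(abs(wp_prime(z)))

# Oddness: wp'(-z) + wp'(z) should vanish up to rounding.
print(abs(wp_prime(-z) + wp_prime(z)))
```

<p>The values at the half periods are not exactly zero because the truncated index set is only approximately symmetric around those points, but they should shrink as <code>N</code> grows.</p>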

<h2 id="elliptic-functions">Elliptic Functions</h2>

<p>We’re now ready to show that every elliptic function can be written in terms of $\wp(z)$ and $\wp'(z)$. We handle even functions and odd functions separately, which suffices since any function can be decomposed into even and odd functions [6].</p>

<p><strong>Lemma 16.</strong> Let $f(z)$ be an even elliptic function. Then there exists a rational function $R$, such that:</p>

\[f(z) = R(\wp(z))\]

<proof>

The key idea is, for each pole $a$ of $f(z)$, to come up with an expression $Q_a(\wp(z))$ involving $\wp(z)$ that "cancels out" the principal part of $f(z)$ at $a$. We sum all such expressions into one $Q(\wp(z))$. Then we subtract $Q(\wp(z))$ from $f(z)$, which leaves us with an elliptic function without poles, which must be a constant $C$ [2], so we can write

$$
f(z) = Q(\wp(z)) + C
$$

In the remainder of the proof we'll show what $Q_a$ looks like. We'll have to handle 3 types of poles separately:

<ol>
<li>$a \in M$</li>
<li>$a \not \in M$, $2a \in M$</li>
<li>$a \not \in M$, $2a \not \in M$</li>
</ol>

<br /><br />


We first handle the case where $a \in M$. Since $a \equiv 0 \pmod M$ and $f$ has period module $M$, $f$ behaves around $a$ exactly as it does around $0$, so it suffices to analyze the case where $a = 0$. Consider the Laurent expansion around $0$:

$$
f(z) = \sum_{n = -m}^{\infty} c_n z^n
$$

For $-z$ we have:

$$
f(-z) = \sum_{n = -m}^{\infty} (-1)^n c_n z^n
$$

Since $f(z) = f(-z)$, the coefficients must match term by term. For odd $k$ this gives $(-1)^k c_k = -c_k = c_k$, which implies $c_k = 0$, leaving us with:

$$
f(z) = \sum_{\substack{n = -m \\ n \mbox{ even}}}^{\infty} c_n z^n
$$

Thus the principal part of $f(z)$ around 0 is:

$$
\frac{c_{-2}}{z^2} + \frac{c_{-4}}{z^4} + \cdots + \frac{c_{-m}}{z^m}
$$

We recall that $\wp(z)$ around 0 has the principal part $z^{-2}$. When we raise it to a power, $\wp(z)^k$, we get a principal part whose leading term is $z^{-2k}$. Notice however that it can also contain lower order poles like $z^{-2k+2}$, $z^{-2k+4}$, etc.
<br /><br />
We construct a polynomial in $\wp(z)$ that matches the principal part of $f(z)$ as follows (note that $m$ is even): first we add the term $\beta_m \wp(z)^{m/2}$ with $\beta_m = c_{-m}$, which matches the coefficient of $z^{-m}$. Next we add the term $\beta_{m-2} \wp(z)^{(m-2)/2}$, where $\beta_{m-2}$ is chosen so that it adds up to $c_{-(m-2)}$ together with whatever $\beta_m \wp(z)^{m/2}$ contributed to the coefficient of $z^{-(m-2)}$. We proceed like this down to $\beta_2$. So the polynomial:

$$
Q_0(z) = \beta_{2} \wp(z) + \beta_{4} \wp(z)^2 + \cdots + \beta_{m} \wp(z)^{m/2}
$$

matches exactly the principal part of $f(z)$ around $0$. So if we do $f(z) - Q_0(z)$ the resulting function is holomorphic at $0$.
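
<br /><br />
As a small worked example of this matching (a sketch, with $a_2$ denoting the coefficient of $z^2$ in the Laurent expansion of $\wp$, a symbol introduced here only for illustration), suppose $m = 6$. Each summand of the series in $(1)$ vanishes at $z = 0$ and $\wp$ is even, so around $0$:

$$
\wp(z) = \frac{1}{z^2} + a_2 z^2 + \cdots, \quad \wp(z)^2 = \frac{1}{z^4} + 2 a_2 + \cdots, \quad \wp(z)^3 = \frac{1}{z^6} + \frac{3 a_2}{z^2} + \cdots
$$

Matching the coefficients of $z^{-6}$, $z^{-4}$ and $z^{-2}$ in turn gives $\beta_6 = c_{-6}$, $\beta_4 = c_{-4}$ and $\beta_2 = c_{-2} - 3 a_2 c_{-6}$.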

<br /><br />

Now consider the case in which $a \not \in M$ and $2a \in M$, that is, $a \equiv -a \pmod M$. Define $g(z) = \wp(z) - \wp(a)$. Since $a$ is not a pole of $\wp$, $g(a) = 0$. We've also shown in <i>Lemma 15</i> that $\wp'(a) = 0$, so $g'(a) = \wp'(a) = 0$. This means $g(z)$ has a double zero at $a$ and thus

$$
\frac{1}{\wp(z) - \wp(a)}
$$

has a double pole at $a$. For $a \not \in M$, $\wp(z)$ is holomorphic at $a$, so consider the Taylor expansion around $a$:

$$
\wp(z) = \wp(a) + \wp'(a)(z - a) + \frac{\wp''(a)}{2} (z - a)^2 + \cdots
$$

because $a$ is a double zero of $\wp(z) - \wp(a)$, $\wp'(a) = 0$ and $\wp''(a) \ne 0$:

$$
\wp(z) - \wp(a) = \frac{\wp''(a)}{2} (z - a)^2 + \cdots
$$

Finally because $a \equiv -a \pmod M$ (i.e. they're the same point relative to their parallelogram) and $\wp(z)$ is an even function its Taylor expansion around $a$ can only have coefficients of even index, so:

$$
\wp(z) - \wp(a) = \frac{\wp''(a)}{2} (z - a)^2 + \frac{\wp^{(4)}(a)}{4!} (z - a)^4 + \cdots
$$

So $\wp(z) - \wp(a)$ is a series in $(z - a)$ with only even powers. We can use <i>Lemma 19</i> to show that near $a$, we have:

$$
\frac{1}{\wp(z) - \wp(a)} = \frac{c'_1}{(z - a)^2} + c'_2 + c'_3 (z - a)^2 + c'_4 (z - a)^4 + \cdots
$$

So we have a series whose only principal part term is $c'_1 (z - a)^{-2}$, so we can use the same scheme we did for the case where $a = 0$ to "cancel out" the principal parts of $f(z)$ near $a$.
<br /><br />
Finally consider the case $a \not \in M$ and $2a \not \in M$. This is the most complicated case because now $a$ and $-a$ are not the same point so we need to consider the "shape" of $f(z)$ at both $a$ and $-a$. The Laurent expansion of $f(z)$ around $a$:

$$
f(z) = \sum_{n = -m}^{\infty} c_n (z - a)^n
$$

Now use the fact that $f(z)$ is even. Consider some $z$ around $-a$, say $z = -a + \epsilon$. Then $f(-a + \epsilon) = f(a - \epsilon)$. Since $a - \epsilon$ is a point around $a$ we have:

$$
f(-a + \epsilon) = f(a - \epsilon) = \sum_{n = -m}^{\infty} c_n (-1)^n \epsilon^n
$$

Replacing $\epsilon = z + a$ back, gives us:

$$
f(-a + \epsilon) = \sum_{n = -m}^{\infty} c_n (-1)^n (z + a)^n
$$

So we have, around $-a$:

$$
f(z) = \sum_{n = -m}^{\infty} c_n (-1)^n (z + a)^n
$$

This tells us that while the expansions at $a$ and $-a$, when they're distinct points, are not exactly the same as in the second case, there's this constraint that their Laurent coefficients match up to the sign $(-1)^n$. In particular their principal parts have the same coefficients up to that sign. This is important because we now need to come up with a single function $Q_a(\wp(z))$ that cancels out the principal parts at both $a$ and $-a$.
<br /><br />
It turns out this function is still a linear combination of powers of $1/(\wp(z) - \wp(a))$ as in the second case, but we need to show why. Since $2a \not \in M$, we have from <i>Lemma 15</i> that $\wp'(a) \ne 0$. Since $a$ is not a pole of $\wp$, consider its Taylor expansion around $a$:

$$
\wp(z) = \wp(a) + \wp'(a)(z - a) + \frac{\wp''(a)}{2!}(z - a)^2 + \cdots
$$

so

$$
\wp(z) - \wp(a) = \wp'(a)(z - a) + \frac{\wp''(a)}{2!}(z - a)^2 + \cdots
$$

From <i>Lemma 20</i> we have that the inverse around $a$ has the series:

$$
\frac{1}{\wp(z) - \wp(a)} = \frac{c'_1}{z - a} + c'_2 + c'_3 (z - a) + c'_4 (z - a)^2 + \cdots
$$

with $c'_1 = 1 / \wp'(a)$. We only care about the principal part of this series, so we hide the rest behind a holomorphic function:

$$
\frac{1}{\wp(z) - \wp(a)} = \frac{c'_1}{z - a} + h_a(z - a)
$$

Now consider the Taylor expansion of $\wp(z)$ around $-a$:

$$
\wp(z) = \wp(-a) + \wp'(-a)(z + a) + \frac{\wp''(-a)}{2!}(z + a)^2 + \cdots
$$

Using that $\wp(a) = \wp(-a)$ (since $\wp$ is even) and $\wp'(-a) = -\wp'(a)$ (since $\wp'$ is odd) we obtain:

$$
\wp(z) - \wp(a) = (-\wp'(a))(z + a) + \frac{\wp''(-a)}{2!}(z + a)^2 + \cdots
$$

Inverting and using <i>Lemma 20</i> gives us:

$$
\frac{1}{\wp(z) - \wp(a)} = \frac{c''_1}{z + a} + c''_2 + c''_3 (z + a) + c''_4 (z + a)^2 + \cdots
$$

with $c''_1 = -1 / \wp'(a)$. So this gives us $c''_1 = -c'_1$. Hiding the non-principal part behind a holomorphic function:

$$
\frac{1}{\wp(z) - \wp(a)} = -\frac{c'_1}{z + a} + h_{-a}(z + a)
$$

This lets us see things more clearly. If we define

$$
P_a(z) = \frac{1}{\wp(z) - \wp(a)}
$$

then we can use the construction from the first case to eliminate the poles of $f(z)$ at both $a$ and $-a$ with the same linear combination $Q_a(z)$ of powers of $P_a(z)$. The beauty of it is that because the coefficients of the principal parts of $f(z)$ at $a$ and at $-a$ are the same for even powers but opposite for odd powers, raising $P_a(z)$ to an odd power also yields opposite signs for the leading terms $1/(z - a)^k$ and $1/(z + a)^k$, while even powers yield the same sign.
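<br /><br />
To see this in the simplest situation, suppose (as an illustrative assumption) that $f(z)$ has a simple pole at $a$ with principal part $c_{-1} / (z - a)$. By the expansion around $-a$ above, its principal part there is $-c_{-1} / (z + a)$. Taking $Q_a(z) = c_{-1} \wp'(a) P_a(z)$ and using $c'_1 = 1/\wp'(a)$:

$$
Q_a(z) = \frac{c_{-1}}{z - a} + \cdots \quad \mbox{near } a, \qquad Q_a(z) = -\frac{c_{-1}}{z + a} + \cdots \quad \mbox{near } -a
$$

so $f(z) - Q_a(z)$ is holomorphic at both $a$ and $-a$.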

</proof>

<p>Recall that a rational function is of the form $R(z) = P(z)/Q(z)$ where $P$ and $Q$ are polynomials in $z$ and $Q$ is not identically zero. Now for odd functions:</p>

<p><strong>Lemma 17.</strong> Let $f(z)$ be an odd elliptic function. Then there exists a rational function $R(z)$ such that $f(z) = \wp'(z) R(\wp(z))$.</p>

<proof>
We define:

$$
h(z) = \frac{f(z)}{\wp'(z)}
$$

We have that $h(z)$ is an even function because $f(z)$ and $\wp'(z)$ are both odd:

$$
h(-z) = \frac{f(-z)}{\wp'(-z)} = \frac{-f(z)}{-\wp'(z)} = \frac{f(z)}{\wp'(z)} = h(z)
$$

Further, since $f(z)$ and $\wp'(z)$ are elliptic, so is $h(z)$. We need to be careful about the points where $\wp'(z) = 0$. Since $f(z)$ and $\wp'(z)$ are meromorphic, so is $h(z)$ and it has isolated poles. Define:

$$
H(z) = h(z) - h(-z)
$$

we have $H(z) = 0$ for all $z$ that is not a pole of $h(z)$, so its Laurent series near any pole $a$ would need to have all coefficients 0. So now we reduce the odd function $f(z)$ to the even $h(z)$, and we can use <i>Lemma 16</i> to show that there exists some $Q(\wp(z))$ such that:

$$
h(z) = Q(\wp(z)) + C
$$

and finally

$$
f(z) = (Q(\wp(z)) + C) \wp'(z)
$$

</proof>

<p>In [6] we showed that every function $f(z)$ can be decomposed into odd and even components:</p>

\[f(z) = f_o(z) + f_e(z)\]

<p>With</p>

\[f_e(z) = \frac{f(z) + f(-z)}{2} \qquad f_o(z) = \frac{f(z) - f(-z)}{2}\]

<p>Since elliptic functions are closed under arithmetic operations, we conclude that $f_e(z)$ and $f_o(z)$ are elliptic. We can then use <em>Lemma 16</em> to show that $f_e(z)$ can be written as $R_1(\wp(z))$ and <em>Lemma 17</em> to show that $f_o(z)$ can be written as $\wp'(z) R_2(\wp(z))$, so $f(z) = R_1(\wp(z)) + \wp'(z) R_2(\wp(z))$, leading to the following:</p>

<p><strong>Corollary 18.</strong> Every elliptic function $f(z)$ can be written as a rational function of $\wp(z)$ and $\wp'(z)$.</p>

<h2 id="conclusion">Conclusion</h2>

<p>In this post we introduced the Weierstrass functions which include the $\wp$, the zeta and the sigma functions. We ended up not using the latter two, but I included them because they’re discussed in Ahlfors.</p>

<p>The proof that every elliptic function can be expressed from $\wp(z)$ was extremely difficult to understand. It has many subtle details and every time I re-read the proof I realized I had glossed over something.</p>

<p>This proof is also not provided in Ahlfors (in any obvious way at least) and I relied entirely on ChatGPT to understand it! Like with learning history, I’m finding ChatGPT a lot more effective for learning math: I can ask it to explain things to me from different angles and dig into different parts.</p>

<p>Part of me is reluctant about this approach because I really like the idea of reading books, but going forward I might need to rethink it. Perhaps I’ll use textbooks to get a general idea of the field but rely more on ChatGPT to really understand things.</p>

<p>This post also marks the end of my journey with the textbook <em>Complex Analysis</em> by Lars V. Ahlfors. I’ll write another post summarizing the book and look back on this journey.</p>

<h2 id="appendix">Appendix</h2>

<p><strong>Lemma 19.</strong> Let $P(z)$ be a polynomial:</p>

\[P(z) = c_1 z^2 + c_2 z^4 + c_3 z^6 + \cdots\]

<p>then near 0, $1/P(z)$ is of the form:</p>

\[\frac{1}{P(z)} = \frac{c'_1}{z^2} + c'_2 + c'_3 z^2 + c'_4 z^4 + \cdots\]

<p>where $c'_1 = 1/c_1$.</p>

<proof>
First we factor $z^2$:

$$
P(z) = z^2 (c_1 + c_2 z^2 + c_3 z^4 + \cdots)
$$

Invert:

$$
\frac{1}{P(z)} = \frac{1}{z^2} \cdot \frac{1}{c_1 + c_2 z^2 + c_3 z^4 + \cdots} = \frac{1}{z^2} \cdot \frac{1}{c_1(1 + \alpha_2 z^2 + \alpha_3 z^4 + \cdots)}
$$

Now define

$$
u(z) = \alpha_2 z^2 + \alpha_3 z^4 + \cdots
$$

so we have:

$$
\frac{1}{P(z)} = \frac{1}{z^2} \frac{1}{c_1(1 + u(z))}
$$


as $z \rightarrow 0$, $u(z) \rightarrow 0$ since it has no constant term, so $\lvert u(z) \rvert < 1$ for $z$ close enough to $0$. Thus we can use the geometric series:

$$
\frac{1}{1 + u} = 1 - u + u^2 - u^3 + \cdots
$$

so we end up with:

$$
\frac{1}{P(z)} = \frac{1}{z^2} \frac{1}{c_1} (1 - u(z) + u(z)^2 - u(z)^3 + \cdots)
$$

because each term of $u(z)$ is an even power of $z$, the result will be some series in $z$ with only even powers. Thus:

$$
\frac{1}{P(z)} = \frac{c'_1}{z^2} + c'_2 + c'_3 z^2 + c'_4 z^4 + \cdots
$$

with $c'_1 = 1/c_1$.
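<br /><br />
For instance, writing $\alpha_k = c_k / c_1$ and expanding $1 - u + u^2$ up to $z^4$ gives $1 - \alpha_2 z^2 + (\alpha_2^2 - \alpha_3) z^4 + \cdots$, so the first few coefficients would be:

$$
c'_1 = \frac{1}{c_1}, \qquad c'_2 = -\frac{c_2}{c_1^2}, \qquad c'_3 = \frac{c_2^2 - c_1 c_3}{c_1^3}
$$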

QED.

</proof>

<p><strong>Lemma 20.</strong> Let $P(z)$ be a polynomial:</p>

\[P(z) = c_1 z + c_2 z^2 + c_3 z^3 + \cdots\]

<p>then near 0, $1/P(z)$ is of the form:</p>

\[\frac{1}{P(z)} = \frac{c'_1}{z} + c'_2 + c'_3 z + c'_4 z^2 + \cdots\]

<p>where $c'_1 = 1/c_1$.</p>

<proof>
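We reduce this to <i>Lemma 19</i> by writing $z = w^2$ so we have:

$$
P(w^2) = c_1 w^2 + c_2 w^4 + c_3 w^6 + \cdots
$$

By <i>Lemma 19</i>, near $w = 0$:

$$
\frac{1}{P(w^2)} = \frac{c'_1}{w^2} + c'_2 + c'_3 w^2 + c'_4 w^4 + \cdots
$$

with $c'_1 = 1/c_1$. Replacing $w^2 = z$ back gives the desired series.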

</proof>

<h2 id="references">References</h2>

<ul>
  <li>[1] Complex Analysis - Lars V. Ahlfors</li>
  <li>[<a href="https://www.kuniga.me/blog/2024/11/02/poles.html">2</a>] NP-Incompleteness: Elliptic Functions</li>
  <li>[<a href="https://www.kuniga.me/blog/2024/11/02/poles.html">3</a>] NP-Incompleteness: Zeros and Poles</li>
  <li>[<a href="https://www.kuniga.me/blog/2025/04/16/residue-theorem.html">4</a>] NP-Incompleteness: The Residue Theorem</li>
  <li>[<a href="https://www.kuniga.me/blog/2024/08/31/removable-singularities.html">5</a>] NP-Incompleteness: Removable Singularities</li>
  <li>[<a href="https://www.kuniga.me/blog/2021/10/09/hermitian-functions.html">6</a>] NP-Incompleteness: Hermitian Functions</li>
</ul>]]></content><author><name>Guilherme Kunigami</name></author><category term="blog" /><category term="analysis" /><summary type="html"><![CDATA[In our post on Elliptic functions [2] we started with the simply periodic functions such as $\sin z$. We noted that $e^{2\pi i z} / w$ is the simplest of the periodic functions and that every single simply periodic function $f(z)$ of period $w$ can be written as a function of it: $f(z) = g(e^{2\pi i z} / w)$. Then we introduced doubly periodic functions, also known as elliptic functions. One may ask if there is, analogously, the simplest elliptic function and whether it’s possible to write all elliptic functions as a function of it. The answer is yes! And this function is known as the Weierstrass ℘ function which we’ll study in this post.]]></summary></entry><entry><title type="html">Coding with AI</title><link href="https://www.kuniga.me/blog/2026/02/14/on-ai.html" rel="alternate" type="text/html" title="Coding with AI" /><published>2026-02-14T00:00:00+00:00</published><updated>2026-02-14T00:00:00+00:00</updated><id>https://www.kuniga.me/blog/2026/02/14/on-ai</id><content type="html" xml:base="https://www.kuniga.me/blog/2026/02/14/on-ai.html"><![CDATA[<!-- This needs to be define as included html because variables are not inherited by Jekyll pages -->

<figure class="image_float_left">
  <img src="https://www.kuniga.me/resources/blog/2026-02-14-on-ai/ai-coder.png" alt="Robot writing code via a computer. Generated with Nano Banana" />
</figure>

<p>In this post I’d like to share my thoughts on coding with AI and how it has affected me. Everyone is talking about this and I don’t have anything new to say, but I want to centralize points I’ve heard/read so far and document this point in time. It might be fun to come back to this after a while.</p>

<p><br /></p>

<!--more-->

<h2 id="a-new-era">A New Era</h2>

<p>I kept hearing leadership say that AI would fundamentally change how we work. I initially dismissed it as hyperbole, but then I started using Claude skills last December. I started getting value from it even before the introduction of Opus 4.5: I created a skill to carry out a refactoring that would have taken me days, and it finished the work in a couple of hours with little interaction needed from me.</p>

<p>Now I’m at a point where I don’t type code in the editor anymore, and do everything through Claude Code or the terminal. I plan to write more details on how I use AI for my day-to-day work in a separate post.</p>

<h2 id="loss-of-craft">Loss of Craft</h2>

<p>According to Daniel Pink, the three pillars of intrinsic motivation are autonomy, purpose and mastery. Delegating coding to AI has eliminated a lot of the mastery aspect of writing code. I’ve seen a lot of people at work sad / lost with this change.</p>

<p>I thought I’d be in the same boat because I love coding! I participated in ICPC competitions during college (I spent many school breaks happily practicing), I can do Leetcode for fun and still solve programming puzzles like Advent of Code. I was surprised at myself when I was excited about writing prompts all day.</p>

<p>Reflecting on why, I realize that most coding in a day-to-day job is not that interesting. It revolves around writing boilerplate, fixing compilation/type errors, adhering to convention and working with human-induced complexity and abstractions. I don’t enjoy any of that. I still plan to keep solving programming puzzles for fun.</p>

<h2 id="skill-atrophy">Skill Atrophy</h2>

<p>Some scientific studies show that people who use GPS have worsened spatial navigation abilities. While this is not permanent, if you get lost during a hike, you won’t be able to rely on these skills.</p>

<p>I think a similar pattern will happen in the future for programming. Most people will have trouble writing code by hand and once we delegate code reviews to AI we’ll have trouble reading it too. This will be even worse for people who are joining the workforce now, since they will not even have long-term memory to fall back on.</p>

<p>So in the event of an emergency blackout where we can’t rely on AI, very few people would be able to help, but maybe that’s ok.</p>

<h2 id="overwork">Overwork</h2>

<p>I saw this <a href="https://simonwillison.net/2026/Feb/9/ai-intensifies-work/">blog post</a> by Simon Willison that resonated with me, claiming that people work more with AI. It seems like a paradox given that AI is supposed to automate a lot of our work.</p>

<p>To me, the major reason for overworking is that now I can get work done in smaller chunks of time. Before, if I had a gap of 30 minutes free I’d browse my phone or do something else. Now I have time to write a prompt and have Claude Code fix a small issue. <a href="https://www.threads.com/@darkzuckerberg/post/DUvm6YKFNxC?xmt=AQF08RWtWRHTzpCUaAPjGdDBVU0FUwylRsJzj0zKHsMKH0boVvhQDKTk2zi1olA3YCj-LrF4&amp;slof=1">@darkzuckerberg</a> on Threads uses the term <em>casual productivity</em> for this.</p>

<p>This also increases the perceived opportunity cost: I feel like I’m constantly thinking of ways I could be leveraging AI to advance some project. This sentiment is also shared by <a href="https://molochinations.substack.com/p/ai-killed-the-individual-contributor">Philip Su</a>:</p>

<blockquote>
  <p>In honesty, I don’t even use the bathroom these days before prompting several AIs with work while I’m gone 120 seconds.</p>
</blockquote>

<p>This also means that my attention span, which is already not great (see <em>Context Switching</em>), is going to get worse.</p>

<h2 id="career-expectations">Career Expectations</h2>

<p>The part that I worry about the most is whether this new paradigm will benefit me more or less than the average.</p>

<p>I don’t enjoy writing documents and spending days in meetings to get alignment. While I don’t feel like I need to write the code myself, I do like having more direct contribution to projects, which so far has translated into doing the work myself or mentoring more junior engineers / interns where my level of contribution is at the task level. Currently AI seems to be providing the most value at this junior level and I’m able to leverage it.</p>

<p>That suggests junior people will be at a disadvantage because any engineer can use AI to do the same work, but junior people often lack the knowledge and experience to know what to build and verify the result. On the other hand, junior engineers are more open to adopting new technologies and are less expensive for the company, so who knows.</p>

<p>The senior engineer who typically delegates more concrete tasks to junior engineers can now do the same to AI agents and get the work done faster. However, the technical lead can also do this and with AI being able to handle more complex workflows, it will be possible to operate at a higher level of abstraction such as writing architectural documents.</p>

<p>My prediction is that the more AI advances, the more it will benefit higher level engineers. I expect it will widen the impact gap between the different levels and I wonder if this will reflect in compensation. This might also expose high-level engineers that are only good at people skills but not good at engineering, because now they might be expected to produce concrete output, even if prototypes.</p>

<p>We’ll see. I’m part anxious, part excited about how things will change in the next year or so.</p>

<h2 id="mind-the-gap">Mind the Gap</h2>

<p>Speaking of widening the impact gap, I expect that – at least while things are being figured out – some people will be vastly more productive with AI than others. Without AI, the ratio of the fastest typist vs. the slowest one is probably not much more than 2x. It’s said that the mythical 10x engineer is often more efficient because they choose to solve a problem in a different way or realize the problem does not need to be solved. However, it’s really hard to prove this efficiency objectively.</p>

<p>With AI, someone skilled at prompting might be able to orchestrate a large project without intervention. They might get it churning during sleep hours or weekends. And I think this could lead to a 100x difference between the most productive and least productive person, at least until tools mature.</p>

<p>Now that prototyping is cheap, people can also be compared more objectively. A company could have teams A and B compete to work on a major project in parallel, with different approaches. Someone from a different team might vibe code your project during a weekend because they’re more skilled at prompting. This is pretty anxiety-inducing and also plays into overwork (see <em>Overwork</em> above).</p>

<h2 id="context-switching">Context Switching</h2>

<p>Over my career a consistent piece of feedback I got is that I work on too many projects in parallel. I often need a lot of effort to correct course and focus on fewer things, but I’m just excited about too many things and invariably end up relapsing.</p>

<p>I found so far that with AI I can more effectively operate in this mode. I have a main project which I prioritize but when Claude is “Shimmying…” I can context switch to a side quest. I read some people struggle with this but since I’ve been doing this since forever, it feels natural to me.</p>

<p>Relatedly, Boris Cherny, the creator of Claude Code, recently did a Q&amp;A and mentioned that we’re in the <em>Golden age for ADHD</em>. I never got diagnosed with ADHD, but I check a lot of the symptoms.</p>

<h2 id="jevons-paradox">Jevons Paradox</h2>

<p>One interesting aspect of individuals becoming 2x or 10x more efficient is what companies will end up doing: reducing the workforce proportionally or increasing the number and scope of projects.</p>

<p>So far it seems like most of the productivity is in coding, but code review and operational oversight still seem to require human intervention. Also, a lot more changes going in means the surface area that needs to be supported increases.</p>

<p>I think it’s a matter of time before most human oversight is taken out of the loop, which will disrupt the software engineering profession. Whether this will cascade to other professions remains to be seen.</p>

<h2 id="coding-style-changes">Coding Style Changes</h2>

<p>It’s said that code is written once and read many times, so we need to optimize for readability. I think many people do the right thing for stuff like naming variables, but I found that when it comes to abstraction we often optimize for writing. The <em>DRY</em> (Don’t Repeat Yourself) principle is often used as justification for excessive abstraction which makes the code short and dense but hard to follow.</p>

<p>AI has no trouble generating tons of code so it can be as verbose as we want. It’s also good at being more consistent and thorough than humans so the argument to avoid duplicate code because it needs to be updated multiple times might not be as strong now. Multiple layers of useless abstraction might also require AI to bring in more context and degrade its performance.</p>

<p>On a similar vein, I’m changing my mind about comments. I often preferred sparse comments because it’s hard to keep them consistent with the code but AI is pretty good at keeping them up to date so I feel like it’s okay to have them now. One thing that AI is still bad at is explaining the “why”. It’s very good at explaining the “what” but it doesn’t explain why it decided to implement things this way until you actually ask it. It probably copied this behavior from training data.</p>]]></content><author><name>Guilherme Kunigami</name></author><category term="blog" /><category term="opinion" /><summary type="html"><![CDATA[In this post I’d like to share my thoughts on coding with AI and how it has affected me. Everyone is talking about this and I don’t have anything new to say, but want to centralize points I’ve heard/read so far and document this point in time. It might be fun to come back to this after a while.]]></summary></entry><entry><title type="html">bpftrace in C++</title><link href="https://www.kuniga.me/blog/2026/02/10/bpf-cpp.html" rel="alternate" type="text/html" title="bpftrace in C++" /><published>2026-02-10T00:00:00+00:00</published><updated>2026-02-10T00:00:00+00:00</updated><id>https://www.kuniga.me/blog/2026/02/10/bpf-cpp</id><content type="html" xml:base="https://www.kuniga.me/blog/2026/02/10/bpf-cpp.html"><![CDATA[<!-- This needs to be define as included html because variables are not inherited by Jekyll pages -->

<figure class="image_float_left">
  <img src="https://www.kuniga.me/resources/blog/2026-02-10-bpf-cpp/bpftrace.png" alt="Logo of bpftrace. A bee blended with a pencil" />
</figure>

<p>I learned a lot about BPF tools after reading the book <a href="https://www.kuniga.me/blog/2025/12/28/book-bpf-performance-tools.html">BPF Performance Tools</a> by Brendan Gregg. The book focused mostly on performance analysis at a lower level, either through kernel functions or libraries such as libc.</p>

<p>I wanted to leverage BPF at the application layer, in particular in C++. The book covers C++ very briefly in <em>Chapter 12</em>, and <em>Chapter 13</em> provides an example of analyzing a MySQL database, but still most of the examples assume the implementation is in C.</p>

<p>In this post I want to investigate how to inspect C++ applications.</p>

<!--more-->

<h2 id="dynamic-probes">Dynamic Probes</h2>

<p>Let’s start with dynamic probes. As we explained in [1], these are probes we can attach to without having to recompile the code. Since we’re only interested in user space, we’ll use <code class="language-plaintext highlighter-rouge">uprobes</code>.</p>

<h3 id="functions">Functions</h3>

<p>Let’s start with a simple example: free functions. In our example, we run an infinite loop simulating a webserver.</p>

<figure class="highlight"><pre><code class="language-cpp" data-lang="cpp"><span class="c1">// main.cpp</span>
<span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span>

<span class="kt">long</span> <span class="kt">long</span> <span class="nf">fibonacci</span><span class="p">(</span><span class="kt">int</span> <span class="n">n</span><span class="p">)</span> <span class="p">{</span>
  <span class="k">return</span> <span class="n">n</span> <span class="o">&lt;=</span> <span class="mi">1</span> <span class="o">?</span> <span class="n">n</span> <span class="o">:</span> <span class="n">fibonacci</span><span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">+</span> <span class="n">fibonacci</span><span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="mi">2</span><span class="p">);</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
  <span class="k">while</span> <span class="p">(</span><span class="nb">true</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">fibonacci</span><span class="p">(</span><span class="mi">40</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
  <span class="p">}</span>

  <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>

<p>We can compile this into a binary, say via</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">g++ main.cpp -o fibo -O3</code></pre></figure>

<p>and then use this bpftrace script that counts how many times <code class="language-plaintext highlighter-rouge">fibonacci()</code> was called during a 2-second period:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">// fib.bt

uprobe:./fibo:_Z9fibonaccii { @calls = count(); }

interval:s:2 { exit(); }</code></pre></figure>

<p>We can run it via:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">$ sudo bpftrace /tmp/fib.bt

Attached 3 probes
@calls: 9723017</code></pre></figure>

<p><strong>Name mangling.</strong> The first hurdle is that C++ mangles the name of functions, even free ones, so we need to provide that instead, <code class="language-plaintext highlighter-rouge">_Z9fibonaccii</code>. The easiest way to find the mangled symbols from a binary is by running this <code class="language-plaintext highlighter-rouge">bpftrace</code> command:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">$ sudo bpftrace -l 'uprobe:./fibo:*fib*'
uprobe:./fibo:_Z9fibonaccii</code></pre></figure>

<p>We can also get a <em>distribution</em> of values passed to fibonacci by changing <code class="language-plaintext highlighter-rouge">fib.bt</code>:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">uprobe:./fibo:_Z9fibonaccii {
  @h = hist(arg0);
}

interval:s:2 { exit(); }</code></pre></figure>

<p>And then:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">$ sudo bpftrace /tmp/fib.bt

@h:
[0]              1720997 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@                    |
[1]              2784630 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[2, 4)           2784630 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[4, 8)           1469908 |@@@@@@@@@@@@@@@@@@@@@@@                         |
[8, 16)           245746 |@                                               |
[16, 32)            5339 |                                                |
[32, 64)               3 |                                                |</code></pre></figure>

<p>Note that because <code class="language-plaintext highlighter-rouge">fibonacci</code> is recursive, we don’t need to worry about it being inlined by the compiler optimizer. Otherwise we need to modify the code to add <code class="language-plaintext highlighter-rouge">__attribute__((noinline))</code> if we want to turn off inlining for this function.</p>
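<p>As a minimal sketch of the non-recursive case (the functions <code class="language-plaintext highlighter-rouge">square</code> and <code class="language-plaintext highlighter-rouge">sum_of_squares</code> are hypothetical and not part of the example above), this is roughly how the attribute could be applied to a small function that the optimizer would otherwise likely inline, assuming GCC or Clang:</p>

```cpp
#include <cstdint>

// Hypothetical helper: small enough that the optimizer would normally
// inline it into its callers, leaving no separate call to attach a uprobe to.
// The attribute asks GCC/Clang to keep it as a real out-of-line function.
__attribute__((noinline)) std::int64_t square(std::int64_t x) {
  return x * x;
}

// Hypothetical caller: each loop iteration performs an actual call to
// square(), so a uprobe on its mangled symbol would fire once per iteration.
std::int64_t sum_of_squares(int n) {
  std::int64_t total = 0;
  for (int i = 1; i <= n; ++i) {
    total += square(i);
  }
  return total;
}
```

<p>The mangled name of <code class="language-plaintext highlighter-rouge">square</code> could then be looked up with the same <code class="language-plaintext highlighter-rouge">bpftrace -l</code> listing used earlier.</p>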

<h3 id="methods">Methods</h3>

<p>Class methods look like regular functions after compilation and the concept of method visibility does not exist at runtime, so we can also inspect private methods. We can modify our example to implement such a case:</p>

<figure class="highlight"><pre><code class="language-cpp" data-lang="cpp"><span class="c1">// main.cpp</span>
<span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span>

<span class="k">struct</span> <span class="nc">Fibo</span> <span class="p">{</span>
  <span class="c1">// Expensive recursive Fibonacci computation</span>
  <span class="kt">long</span> <span class="kt">long</span> <span class="n">run</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">return</span> <span class="n">run</span><span class="p">(</span><span class="mi">40</span><span class="p">);</span>
  <span class="p">}</span>

 <span class="nl">private:</span>
  <span class="kt">long</span> <span class="kt">long</span> <span class="n">run</span><span class="p">(</span><span class="kt">int</span> <span class="n">n</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">return</span> <span class="n">n</span> <span class="o">&lt;=</span> <span class="mi">1</span> <span class="o">?</span> <span class="n">n</span> <span class="o">:</span> <span class="n">run</span><span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">+</span> <span class="n">run</span><span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="mi">2</span><span class="p">);</span>
  <span class="p">}</span>
<span class="p">};</span>

<span class="kt">int</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
  <span class="n">Fibo</span> <span class="n">f</span><span class="p">;</span>
  <span class="k">while</span> <span class="p">(</span><span class="nb">true</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">f</span><span class="p">.</span><span class="n">run</span><span class="p">()</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
  <span class="p">}</span>

  <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>

<p>If we compile it to <code class="language-plaintext highlighter-rouge">fibo_class</code>, we can inspect the symbols by looking for the class name:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">$ sudo bpftrace -l 'uprobe:./fibo_class:*Fibo*'
uprobe:./fibo_class:_ZN4Fibo3runEi
uprobe:./fibo_class:_ZN4Fibo3runEv</code></pre></figure>

<p>Since we have overloaded signatures, it shows both. We can use <code class="language-plaintext highlighter-rouge">c++filt</code> to identify the correct overload, the one taking <code class="language-plaintext highlighter-rouge">int</code> as argument:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">$ c++filt _ZN4Fibo3runEi
Fibo::run(int)</code></pre></figure>
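<p>The same demangling can also be done programmatically. Below is a minimal sketch, assuming GCC or Clang with the Itanium C++ ABI, which provides <code class="language-plaintext highlighter-rouge">abi::__cxa_demangle</code> in <code class="language-plaintext highlighter-rouge">&lt;cxxabi.h&gt;</code>. The wrapper name <code class="language-plaintext highlighter-rouge">demangle</code> is made up for illustration:</p>

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <string>

// Hypothetical wrapper around abi::__cxa_demangle. Returns the demangled
// name, or the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
  int status = 0;
  // Passing nullptr for the buffer and length makes the runtime allocate
  // the result with malloc(); the caller is responsible for freeing it.
  char* raw = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || raw == nullptr) {
    return mangled;
  }
  std::string result(raw);
  std::free(raw);
  return result;
}
```

<p>With the symbols from this post, <code class="language-plaintext highlighter-rouge">demangle("_ZN4Fibo3runEi")</code> should give the same <code class="language-plaintext highlighter-rouge">Fibo::run(int)</code> that <code class="language-plaintext highlighter-rouge">c++filt</code> printed.</p>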

<p>So we can write a similar script to count the distribution of arguments. In C++, when methods are compiled, the <code class="language-plaintext highlighter-rouge">this</code> object that is implicit in code is made explicit as the first argument, so we have to account for that and probe the second argument <code class="language-plaintext highlighter-rouge">arg1</code> instead.</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">uprobe:./fibo_class:_ZN4Fibo3runEi { @h = hist(arg1); }

interval:s:2 { exit(); }</code></pre></figure>

<p><br /></p>

<h3 id="stl-strings">STL Strings</h3>

<p>bpftrace can handle C-style <code class="language-plaintext highlighter-rouge">char*</code> but not <code class="language-plaintext highlighter-rouge">std::string</code>. For that, we need to make assumptions about which STL implementation, operating system and compilation mode are used, making this unportable. Suppose we have:</p>

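<figure class="highlight"><pre><code class="language-cpp" data-lang="cpp"><span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span>
<span class="cp">#include</span> <span class="cpf">&lt;string&gt;</span>
<span class="cp">#include</span> <span class="cpf">&lt;unistd.h&gt;</span>  <span class="c1">// sleep()</span>

<span class="kt">void</span> <span class="nf">my_print</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">string</span><span class="o">&amp;</span> <span class="n">s</span><span class="p">)</span> <span class="p">{</span>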
  <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">s</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">int</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
  <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">s1</span> <span class="o">=</span> <span class="s">"hello"</span><span class="p">;</span>
  <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">s2</span> <span class="o">=</span> <span class="s">"world"</span><span class="p">;</span>

  <span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="k">while</span> <span class="p">(</span><span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">i</span> <span class="o">%</span> <span class="mi">3</span> <span class="o">==</span> <span class="mi">0</span> <span class="o">?</span> <span class="n">my_print</span><span class="p">(</span><span class="n">s1</span><span class="p">)</span> <span class="o">:</span> <span class="n">my_print</span><span class="p">(</span><span class="n">s2</span><span class="p">);</span>
    <span class="n">sleep</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
  <span class="p">}</span>

  <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>

<p>And we want to probe <code class="language-plaintext highlighter-rouge">s</code> in <code class="language-plaintext highlighter-rouge">my_print</code>. Suppose we compile this to <code class="language-plaintext highlighter-rouge">strhist</code>. I’m running this on Linux on an x86_64 architecture and can verify that my program links to <code class="language-plaintext highlighter-rouge">libstdc++</code>:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">$ ldd strhist
...
libstdc++.so.6 =&gt; /lib64/libstdc++.so.6 (...)
...</code></pre></figure>

<p>In this case it’s relatively safe to assume that if <code class="language-plaintext highlighter-rouge">std::string</code> is at the address <code class="language-plaintext highlighter-rouge">addr</code>, then <code class="language-plaintext highlighter-rouge">addr + 0</code> holds a pointer to the data and <code class="language-plaintext highlighter-rouge">addr + 8</code> holds the length of the string. This works even with SSO (small string optimization), in which the raw data is not stored on the heap but within the <code class="language-plaintext highlighter-rouge">std::string</code> structure itself, in an inline buffer. In that case the pointer at <code class="language-plaintext highlighter-rouge">addr + 0</code> will not point to the heap but to <code class="language-plaintext highlighter-rouge">addr + 16</code>, but for our purposes it doesn’t matter.</p>

<p>We can write a bpftrace script as such:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">// strhist.bt
uprobe:./strhist:_Z8my_printRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
{
  $s  = arg0;
  $p  = *(uint64*)($s + 0);    // char* - raw data
  $n  = *(uint64*)($s + 8);    // size_t - strlen

  // print as raw bytes (handles embedded NULs)
  printf("len=%lu str=%r\n", $n, buf($p, $n));
}</code></pre></figure>

<p>We’ll see it shows <code class="language-plaintext highlighter-rouge">hello</code> and <code class="language-plaintext highlighter-rouge">world</code> correctly. We can get a frequency count of each:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">// strhist.bt
uprobe:./strhist:_Z8my_printRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
{
  $s  = arg0;
  $p  = *(uint64*)($s + 0);    // char* - raw data
  $n  = *(uint64*)($s + 8);    // size_t - strlen

  @cnt[str($p)] = count();
}</code></pre></figure>

<p>The major drawback of <code class="language-plaintext highlighter-rouge">str($p)</code> is that its result is truncated, typically to 64 bytes, so if the strings share the same long prefix, it will not distinguish between them. Also, if a string contains <code class="language-plaintext highlighter-rouge">\0</code>, <code class="language-plaintext highlighter-rouge">str()</code> will treat it as the terminator instead of honoring <code class="language-plaintext highlighter-rouge">$n</code>. It’s possible to create a custom hashing function over a length larger than 64, but that length must be a constant.</p>

<h3 id="user-defined-classes">User-Defined Classes</h3>

<p>If the argument is a more complex class, it can be a lot more work to probe it. Suppose we have this contrived class <code class="language-plaintext highlighter-rouge">C</code> which we want to probe:</p>

<figure class="highlight"><pre><code class="language-cpp" data-lang="cpp"><span class="k">struct</span> <span class="nc">B</span> <span class="p">{</span>
  <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">s</span><span class="p">;</span>
<span class="p">};</span>

<span class="k">class</span> <span class="nc">C</span> <span class="p">{</span>
 <span class="nl">public:</span>
  <span class="n">C</span><span class="p">(</span><span class="kt">int</span> <span class="n">id</span><span class="p">)</span> <span class="o">:</span> <span class="n">id_</span><span class="p">(</span><span class="n">id</span><span class="p">)</span> <span class="p">{}</span>

  <span class="kt">void</span> <span class="n">add</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">key</span><span class="p">,</span> <span class="n">B</span> <span class="n">b</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">m_</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="n">b</span><span class="p">;</span>
  <span class="p">}</span>

 <span class="nl">private:</span>
  <span class="n">std</span><span class="o">::</span><span class="n">map</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">string</span><span class="p">,</span> <span class="n">B</span><span class="o">&gt;</span> <span class="n">m_</span><span class="p">;</span>
  <span class="kt">int</span> <span class="n">id_</span><span class="p">;</span>
<span class="p">};</span></code></pre></figure>

<p>It’s fairly complex, having STL data structures and other classes as member variables. Suppose we want to probe the member <code class="language-plaintext highlighter-rouge">id_</code> when <code class="language-plaintext highlighter-rouge">C</code> is passed by reference to a function:</p>

<figure class="highlight"><pre><code class="language-cpp" data-lang="cpp"><span class="n">__attribute__</span><span class="p">((</span><span class="n">noinline</span><span class="p">))</span> <span class="kt">void</span> <span class="nf">random_func</span><span class="p">(</span><span class="n">C</span><span class="o">&amp;</span> <span class="n">c</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">c</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="s">"key"</span><span class="p">,</span> <span class="n">B</span><span class="p">{</span><span class="s">"hello 2"</span><span class="p">});</span>
<span class="p">}</span></code></pre></figure>

<p>If we have debug symbols for the binary, we can inspect <code class="language-plaintext highlighter-rouge">C</code>’s structure to find the offset of <code class="language-plaintext highlighter-rouge">id_</code>:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">$ pahole -F dwarf -C 'C' struct.cpp.dwo
...
class C {
        class {
        } m_;                                            /*     0     0 */

        /* XXX 48 bytes hole, try to pack */

        int                        id_;                  /*    48     4 */
public:
...
}</code></pre></figure>

<p>It tells us the offset of <code class="language-plaintext highlighter-rouge">id_</code> is 48 bytes, so in bpftrace we can do:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">uprobe:./struct:_Z11random_funcR1C
{
  $c = *(int*)(arg0 + 48);
  printf("x=%d\n", $c);
}</code></pre></figure>

<p>We can also add a pseudo-struct that mimics the offsets from <code class="language-plaintext highlighter-rouge">C</code> by declaring a struct inline in the script:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">struct C_stub {
    char _pad[48];
    int x_;
}

uprobe:./struct:_Z11random_funcR1C
{
  $c = (struct C_stub*)arg0;
  printf("x=%d\n", $c-&gt;x_);
}</code></pre></figure>

<p>which is slightly more readable and could be useful if multiple bpftrace scripts need this <code class="language-plaintext highlighter-rouge">C</code> definition.</p>

<p>The conclusion from my exploration of uprobes is that while they are provided out of the box, working with non-primitive types can easily get very complex and brittle. We now explore the scenario where we can modify the source code to make it easier to inspect.</p>

<h2 id="static-probes">Static Probes</h2>

<p>We can add tracepoints to specific parts of the code. Because this requires modifying the code, these are known as <em>static probes</em>. One of the easiest ways to do it is using the Folly library, via the <code class="language-plaintext highlighter-rouge">folly/tracing/StaticTracepoint.h</code> header, which defines a few macros. We can add those to one of our previous examples:</p>

<figure class="highlight"><pre><code class="language-cpp" data-lang="cpp"><span class="cp">#include</span> <span class="cpf">&lt;folly/tracing/StaticTracepoint.h&gt;</span><span class="cp">
</span>
<span class="kt">void</span> <span class="nf">my_print</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">string</span><span class="o">&amp;</span> <span class="n">s</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">FOLLY_SDT</span><span class="p">(</span><span class="n">my_project</span><span class="p">,</span> <span class="n">start_proc</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">c_str</span><span class="p">());</span>

  <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">s</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>

  <span class="n">FOLLY_SDT</span><span class="p">(</span><span class="n">my_project</span><span class="p">,</span> <span class="n">end_proc</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">...</span></code></pre></figure>

<p>Suppose we compile this to <code class="language-plaintext highlighter-rouge">strhist</code>. To list the USDTs we can do:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">$ sudo bpftrace -l 'usdt:./strhist:*'
usdt:./strhist:my_project:end_proc
usdt:./strhist:my_project:start_proc</code></pre></figure>

<p>The first thing to notice is that the prefix is now <code class="language-plaintext highlighter-rouge">usdt</code> instead of <code class="language-plaintext highlighter-rouge">uprobe</code>. The second is that because we provided the name specifically, the name is not mangled.</p>

<p>Also notice we passed <code class="language-plaintext highlighter-rouge">s.c_str()</code>, which is of type <code class="language-plaintext highlighter-rouge">const char*</code> and is well supported by bpftrace. We don’t have to assume the data layout in <code class="language-plaintext highlighter-rouge">std::string</code> anymore, so our script becomes:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">// strhist.bt
usdt:./strhist:my_project:start_proc
{
  printf("str=%s\n", str(arg0));
}</code></pre></figure>

<p>In this case <code class="language-plaintext highlighter-rouge">s.c_str()</code> is cheap enough to run every time, even when no tracing is taking place. If we want to gate so that code only runs during tracing, we can use <code class="language-plaintext highlighter-rouge">FOLLY_SDT_IS_ENABLED</code> + <code class="language-plaintext highlighter-rouge">FOLLY_SDT_WITH_SEMAPHORE</code>:</p>

<figure class="highlight"><pre><code class="language-cpp" data-lang="cpp"><span class="cp">#include</span> <span class="cpf">&lt;folly/tracing/StaticTracepoint.h&gt;</span><span class="cp">
</span>
<span class="n">FOLLY_SDT_DEFINE_SEMAPHORE</span><span class="p">(</span><span class="n">my_project</span><span class="p">,</span> <span class="n">start_proc</span><span class="p">);</span>

<span class="kt">void</span> <span class="nf">my_print</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">string</span><span class="o">&amp;</span> <span class="n">s</span><span class="p">)</span> <span class="p">{</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">FOLLY_SDT_IS_ENABLED</span><span class="p">(</span><span class="n">my_project</span><span class="p">,</span> <span class="n">start_proc</span><span class="p">))</span> <span class="p">{</span>
    <span class="c1">// This block is only executed when tracing is going on.</span>
    <span class="n">FOLLY_SDT_WITH_SEMAPHORE</span><span class="p">(</span><span class="n">my_project</span><span class="p">,</span> <span class="n">start_proc</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">c_str</span><span class="p">());</span>
  <span class="p">}</span>
  <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">s</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">...</span></code></pre></figure>

<p>So USDTs are easier to work with but this requires more foresight.</p>

<h2 id="applications">Applications</h2>

<p>Now that we know how to define tracepoints, we can do other types of analysis besides counting how many times a tracepoint is hit.</p>

<h3 id="latency">Latency</h3>

<p>We can measure the amount of time elapsed between two probes, say <code class="language-plaintext highlighter-rouge">start_proc</code> and <code class="language-plaintext highlighter-rouge">end_proc</code>. As long as an <code class="language-plaintext highlighter-rouge">end_proc</code> tracepoint is always executed after a <code class="language-plaintext highlighter-rouge">start_proc</code> for a given thread, we can do:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">usdt:./my_binary:my_project:start_proc {
  @ts[tid] = nsecs;
}
usdt:./my_binary:my_project:end_proc {
  @lat_us = hist((nsecs - @ts[tid]) / 1000);
  delete(@ts[tid]);
}</code></pre></figure>

<p>Upon termination, <code class="language-plaintext highlighter-rouge">@lat_us</code> will display a histogram of duration of <code class="language-plaintext highlighter-rouge">end_proc - start_proc</code> in microseconds.</p>
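<p>One caveat: if tracing starts while a thread is already between the two probes, <code class="language-plaintext highlighter-rouge">end_proc</code> fires without a matching <code class="language-plaintext highlighter-rouge">start_proc</code>, <code class="language-plaintext highlighter-rouge">@ts[tid]</code> reads as 0 and the computed latency is bogus. A minimal sketch of a guarded variant, reusing the same hypothetical binary and probe names, adds a filter so the action only runs when a start timestamp was recorded:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">usdt:./my_binary:my_project:start_proc {
  @ts[tid] = nsecs;
}
usdt:./my_binary:my_project:end_proc
/ @ts[tid] /
{
  // Only runs when this thread recorded a start timestamp.
  @lat_us = hist((nsecs - @ts[tid]) / 1000);
  delete(@ts[tid]);
}</code></pre></figure>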

<h3 id="memory-allocation">Memory Allocation</h3>

<p>We can measure how much memory was allocated between two probes, say <code class="language-plaintext highlighter-rouge">start_proc</code> and <code class="language-plaintext highlighter-rouge">end_proc</code>. This works as long as an <code class="language-plaintext highlighter-rouge">end_proc</code> tracepoint is always executed after a <code class="language-plaintext highlighter-rouge">start_proc</code> for a given thread, and it requires assuming a specific memory allocator. Suppose the allocator is the standard <code class="language-plaintext highlighter-rouge">malloc</code> provided by <code class="language-plaintext highlighter-rouge">libc</code>. We can do:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">uprobe:/usr/lib/libc.so.6:malloc
/ @in_req[tid] / { @req_size[tid] = sum(arg0); }

usdt:./my_binary:my_project:start_proc { @in_req[tid] = 1; }
usdt:./my_binary:my_project:end_proc {
  printf("allocs=%d\n", @req_size[tid]);
  delete(@req_size[tid]);
  delete(@in_req[tid]);
}</code></pre></figure>

<p>When we hit the first tracepoint, <code class="language-plaintext highlighter-rouge">start_proc</code>, we set a flag. While this flag is set, the <code class="language-plaintext highlighter-rouge">malloc</code> action runs, which in this case adds up the first argument (the requested size in bytes).</p>

<p>This mechanism of setting a flag on start and clearing it at the end can be used with many of the other analyses described in [1].</p>
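<p>As an illustration of reusing the same flag, here is a sketch (not taken from [1]) that counts the system calls issued by threads while they are between the two probes. The binary and probe names are the same hypothetical ones as above, and <code class="language-plaintext highlighter-rouge">@syscalls</code> is a map name chosen for this example:</p>

<figure class="highlight"><pre><code class="language-text" data-lang="text">usdt:./my_binary:my_project:start_proc { @in_req[tid] = 1; }
usdt:./my_binary:my_project:end_proc   { delete(@in_req[tid]); }

// Count system calls made by threads that are between the two probes.
tracepoint:raw_syscalls:sys_enter
/ @in_req[tid] / { @syscalls[comm] = count(); }</code></pre></figure>

<p>The tracepoint fires for every process on the system, but the filter only lets through threads that previously hit <code class="language-plaintext highlighter-rouge">start_proc</code>. The counts are printed when the script exits.</p>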

<h2 id="conclusion">Conclusion</h2>

<p>I haven’t had the opportunity to use bpftrace in production at work, but now I have a better idea of its capabilities for a C++ application. I may update this post with other use cases should I run into them.</p>

<h2 id="references">References</h2>

<ul>
  <li>[<a href="https://www.kuniga.me/blog/2025/12/28/book-bpf-performance-tools.html">1</a>] BPF Performance Tools - Brendan Gregg</li>
</ul>]]></content><author><name>Guilherme Kunigami</name></author><category term="blog" /><category term="c++" /><summary type="html"><![CDATA[I learned a lot about BPF tools after reading the book BPF Performance Tools by Brendan Gregg. The book focused mostly on performance analysis at a lower level, either through kernel functions or libraries such as libc. I wanted to leverage BPF at the application layer, in particular in C++. The book covers C++ very briefly in Chapter 12 and Chapter 13 provides an example of analyzing a MySQL database but still, most of the examples assume the implementation being in C. In this post I want to investigate how to inspect C++ applications.]]></summary></entry><entry><title type="html">Elliptic Functions</title><link href="https://www.kuniga.me/blog/2026/01/30/elliptic-functions.html" rel="alternate" type="text/html" title="Elliptic Functions" /><published>2026-01-30T00:00:00+00:00</published><updated>2026-01-30T00:00:00+00:00</updated><id>https://www.kuniga.me/blog/2026/01/30/elliptic-functions</id><content type="html" xml:base="https://www.kuniga.me/blog/2026/01/30/elliptic-functions.html"><![CDATA[<!-- This needs to be define as included html because variables are not inherited by Jekyll pages -->

<figure class="image_float_left">
  <img src="https://www.kuniga.me/resources/blog/2026-01-30-elliptic-functions/jacobi.png" alt="https://en.wikipedia.org/wiki/Jacobi_elliptic_functions" />
</figure>

<p>Suppose we’re given a circle of radius $r$ and two points on its perimeter, and we want to compute the length of the arc between these points. This can be computed using elementary functions, for example by determining the angle $\theta$ in radians between these two points with respect to the center, in which case the arc length is $\theta r$.</p>

<p>For ellipses this is not as trivial. In the 18th century, Giulio Fagnano and Leonhard Euler were the first to use integrals to compute the arc length of an ellipse and these became known as <em>elliptic integrals</em>.</p>

<p>Niels Abel and Carl Jacobi studied the inverse of elliptic integrals and later realized they were doubly periodic, that is, they have two fundamental periods, as opposed to trigonometric functions like sine, which have a single period. Due to this connection, doubly periodic functions became known as <em>elliptic functions</em>.</p>

<!--more-->

<p>In this post we’ll study elliptic functions. We’ll start with simply periodic functions, then doubly periodic functions. We’ll spend most of the time studying properties of the periods.</p>

<h2 id="simply-periodic-functions">Simply Periodic Functions</h2>

<p>As the name suggests, periodic functions are those whose values repeat indefinitely, for example, $\sin z$, which repeats every $2 \pi$, that is $\sin z = \sin (z + 2\pi)$.</p>

<p>More generally, a function $f(z)$ is called <strong>periodic</strong> if there is a $w \ne 0$ such that $f(z) = f(z + w)$ for all $z$ in the domain, in which case $w$ is called a <strong>period</strong>. For this post, we’ll restrict ourselves to meromorphic functions.</p>

<p>If $w$ is the period of a function, that is $f(z) = f(z + w)$, inductively we can conclude that any integral multiple of $w$ is also a period, that is $f(z) = f(z + nw)$. If a period is not an integral multiple of any other period we call it the <strong>fundamental period</strong>. If a function has only one fundamental period, then $f$ is called <strong>simply periodic</strong>. For the remainder of the post, when we say periodic, we’ll imply simply periodic.</p>

<p>The function $f(z) = e^{2\pi i z / w}$ is the simplest periodic function and surprisingly, every periodic function can be expressed in terms of this one.</p>

<p><strong>Lemma 1.</strong> Let $f(z)$ be a meromorphic function with domain $\Omega$ and period $w$. Then there exists a unique meromorphic function $g$ such that:</p>

\[f(z) = g(e^{2\pi i z / w})\]

<proof>

Let $\zeta(z) = e^{2\pi i z / w}$ so we want $f(z) = g \circ \zeta (z)$. Let's denote the image of $\Omega$ under $\zeta$ as $\Omega'$. We first show that a function $g(\zeta)$ with domain $\Omega'$ exists. Since $f$ exists, for every $z \in \Omega$ there's a corresponding $f(z)$ in the image of $f$. For every $z \in \Omega$ there's a $\zeta \in \Omega'$ since $\zeta(z)$ exists. For each point $\zeta \in \Omega'$ we define $g(\zeta) = f(z)$, where $z$ is any point with $\zeta(z) = \zeta$.
<br /><br />
So every point in the domain of $g$ has a corresponding image. We still need to make sure this definition doesn't depend on which $z$ we pick. Two points $z_1, z_2$ map to the same $\zeta \in \Omega'$ exactly when $z_1 - z_2 = nw$ for some integer $n$, and since $f$ has period $w$, $f(z_1) = f(z_2)$. In particular $g(\zeta(z)) = g(\zeta(z + w)) = f(z)$, so $g$ is well defined.
<br /><br />
So far we've only found that $g$ is a valid mapping from $\Omega'$ to the image of $f$ but didn't prove anything about its properties. We can show it is meromorphic though. Consider the inverse of $\zeta(z)$, which we can obtain by applying the logarithm:

$$
z(\zeta) = \frac{w}{2\pi i} \log (\zeta)
$$

So we can write:

$$
g(\zeta) = f(z(\zeta)) = f \circ z(\zeta)
$$

The problem is that the logarithm is a multi-valued function and so is $z(\zeta)$. We need to turn it into a single-valued one by choosing a proper branch. Let $\zeta_0$ be a point in $\Omega'$. Because $\zeta(z)$ is the exponential, it cannot yield a 0, so $\zeta_0 \ne 0$. Consider a neighborhood $U$ of $\zeta_0$ that doesn't include the origin. In this region, we can choose a branch of the logarithm, denoted by $\log_U$ so that $\log_U(\zeta)$, $\zeta \in U$ is single-valued. Now we define the function $z_U(\zeta)$ in $U$ as:

$$
z_U(\zeta) = \frac{w}{2 \pi i} \log_U(\zeta)
$$

Since $\log_U$ is holomorphic in $U$, so is $z_U$. The composition of meromorphic/holomorphic functions is meromorphic so $g(\zeta)$ is meromorphic in $U$. There's nothing special in our choice of $\zeta_0$, so this applies to the entire domain $\Omega'$.

</proof>

<h3 id="discrete-fourier-series">Discrete Fourier Series</h3>

<p>Let $\Omega’$ be the domain of the function $g(\zeta)$ as defined in <em>Lemma 1</em>. Suppose it contains an annulus $r_1 \lt \abs{\zeta} \lt r_2$ in which $g$ has no poles. We can then express it via its Laurent series at 0 [2] as:</p>

\[g(\zeta) = \sum_{n=-\infty}^\infty c_n \zeta^n\]

<p>which then lets us express $f(z)$ as:</p>

\[f(z) = g \circ \zeta(z) = \sum_{n=-\infty}^\infty c_n e^{2 \pi i n z / w}\]

<p>which is the Fourier series [3].</p>
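<p>As a quick sanity check of this construction, take $f(z) = \sin z$ with period $w = 2\pi$, so that $\zeta = e^{iz}$. Euler’s formula gives:</p>

\[\sin z = \frac{e^{iz} - e^{-iz}}{2i} = \frac{\zeta - \zeta^{-1}}{2i} = g(\zeta)\]

<p>so in this case the Laurent series has only two non-zero coefficients, $c_1 = 1/(2i)$ and $c_{-1} = -1/(2i)$, and $g$ is holomorphic for every $\zeta \ne 0$.</p>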

<h3 id="properties">Properties</h3>

<p>We now consider some properties of periodic functions. First, <em>Lemma 2</em> shows periodic functions are closed under arithmetic operations.</p>

<p><strong>Lemma 2.</strong> Let $f$ and $g$ be periodic functions with the same period $w$. Then $f \pm g$, $f * g$ and $f / g$ are also periodic with period $w$.</p>

<proof>
We want to show that $h(z) = f(z) + g(z)$ is periodic. We have $f(z + w) = f(z)$ and $g(z + w) = g(z)$, so $h(z + w) = f(z + w) + g(z + w) = f(z) + g(z) = h(z)$, so $h$ has period $w$. The same argument applies to $f - g$, $f * g$ and $f / g$, assuming in the latter that $g$ is not identically zero.
</proof>

<p><strong>Lemma 3.</strong> Let $f$ be a periodic function with period $w$. Then $f’$ is also periodic with period $w$.</p>

<proof>
Define $g(z) = f(z + w)$. Since $f$ is periodic, $g(z) = f(z)$ and thus $g$ has the same poles as $f$. Consider the domain of $f$ without its poles. In this region $f$ and $g$ are holomorphic. Differentiating both gives us $g'(z) = f'(z)$ and $g'(z) = f'(z + w) d(z + w)/dz = f'(z + w)$. This implies $f'(z) = f'(z + w)$ for all $z$ and so $f'$ has period $w$.
</proof>

<h2 id="doubly-periodic-functions">Doubly Periodic Functions</h2>

<p>As the name suggests, doubly periodic functions have two fundamental periods. Doubly periodic functions are also known as <strong>elliptic functions</strong>, which is the term we’ll use henceforth. Any period $w$ of an elliptic function $f(z)$ can be written as an integral linear combination of its two fundamental periods, denoted by $w_1$ and $w_2$:</p>

\[w = n_1 w_1 + n_2 w_2\]

<p>We define the set of such periods as the <strong>period module</strong> of $f(z)$, denoted by $M$. Even if we don’t know $w_1$ and $w_2$ explicitly, we can guarantee that any integral linear combination of any other 2 periods in $M$ is also in $M$, as shown in <em>Lemma 4</em>.</p>

<p><strong>Lemma 4.</strong> Let $M$ be the period module of $f(z)$ and $u, v$ any periods in $M$. Then any integral linear combination of $u, v$ is also in $M$.</p>

<proof>

Since $u, v$ are in $M$, we can express them as integral linear combinations of $w_1, w_2$: $u = a_1 w_1 + a_2 w_2$ and $v = b_1 w_1 + b_2 w_2$, so any integral linear combination:

$$
w = n_1 u + n_2 v
$$

can be expressed as:

$$
w = n_1 (a_1 w_1 + a_2 w_2) + n_2 (b_1 w_1 + b_2 w_2)
$$

rearranging terms and grouping by $w_1, w_2$:

$$
w = (n_1 a_1 + n_2 b_1) w_1 + (n_1 a_2 + n_2 b_2) w_2
$$

Since all coefficients involved are integers, $w$ is also an integral linear combination of $w_1, w_2$ and thus belongs to $M$.
</proof>

<p>As <em>Lemma 5</em> shows, we can choose $w_1$ and $w_2$ such that $w_2 / w_1$ is not real, that is it has a non-zero imaginary part.</p>

<p><strong>Lemma 5.</strong> Let $M$ be the period module of $f(z)$. Then there exists $w_1, w_2 \in M$ such that for all $w \in M$, $w = n_1 w_1 + n_2 w_2$ and $w_2 / w_1$ is not real.</p>

<proof>
We choose $w_1$ to be a non-zero element with the smallest modulus in $M$. It doesn't change the proof but interestingly, there are either 2, 4 or 6 such candidates as shown in <i>Lemma 13</i> in the <i>Appendix</i>.
<br /><br />
Let $w_2$ be an element with the smallest modulus in $M$ that is not an integer multiple of $w_1$. Note that it can have the same modulus as $w_1$. We claim that $r = w_2 / w_1$ is not real. Suppose it is. By the choice of $w_2$ we know $r$ is not an integer, so it can be placed between two consecutive integers:

$$n \lt r \lt n + 1$$

Multiply this by $\abs{w_1}$ to get:

$$
n \abs{w_1} \lt r \abs{w_1} \lt n \abs{w_1} + \abs{w_1}
$$

subtract $n \abs{w_1}$:

$$
0 \lt (r - n) \abs{w_1} \lt \abs{w_1}
$$

Since $r - n \gt 0$, $\abs{r - n} = r - n$ and so $(r - n) \abs{w_1} = \abs{(r - n) w_1}$:

$$
0 \lt \abs{rw_1 - nw_1} \lt \abs{w_1}
$$

Since $r = w_2 / w_1$ or $w_2 = r w_1$:

$$
0 \lt \abs{w_2 - n w_1} \lt \abs{w_1}
$$

The number $w' = w_2 - n w_1$ is an integral linear combination of $w_1, w_2$ and is hence in $M$, but $\abs{w'} \lt \abs{w_1}$ contradicts our choice of $w_1$, which implies $w_2/w_1$ being real is false.
<br /><br />
It remains to show every $w \in M$ can be expressed as $w = n_1 w_1 + n_2 w_2$ for integers $n_1, n_2$. Suppose we want to find $\lambda_1, \lambda_2$ that satisfy:

$$
\begin{align}
w &amp;= \lambda_1 w_1 + \lambda_2 w_2 \\
\overline{w} &amp;= \lambda_1 \overline{w_1} + \lambda_2 \overline{w_2} \\
\end{align}
$$

We claim that the determinant of the coefficients is non-zero, i.e. $w_1\overline{w_2} - \overline{w_1} w_2 \ne 0$. If we denote $w_1 = a_1 + i b_1$ and $w_2 = a_2 + i b_2$

$$
w_1\overline{w_2} - \overline{w_1} w_2 = (a_1 + i b_1)(a_2 - i b_2) - (a_1 - i b_1)(a_2 + i b_2)
$$

Multiplying and cancelling terms we get:

$$
 = 2i (a_2 b_1 - a_1 b_2)
$$

If this is 0, then $a_2 b_1 = a_1 b_2$. We know $w_1$ and $w_2$ are non-zero. So if $a_2 = 0$, then $b_2 \ne 0$ and this implies $a_1 = 0$, which means $w_2 / w_1 = b_2 / b_1$ is real, a contradiction. A similar argument applies if any of $b_2$, $a_1$ or $b_1$ is zero. So we assume these are all non-zero terms and thus we can write:

$$
\frac{a_1}{a_2} = \frac{b_1}{b_2} = k
$$

Then $w_1 = a_1 + ib_1 = k(a_2 + ib_2) = kw_2$, so $w_2 / w_1 = 1/k$, which is also a contradiction since $1/k$ is real. We conclude that the determinant is non-zero and these equations have a unique solution. This other pair of equations

$$
\begin{align}
w &amp;= \overline{\lambda_1} w_1 + \overline{\lambda_2} w_2 \\
\overline{w} &amp;= \overline{\lambda_1} \overline{w_1} + \overline{\lambda_2} \overline{w_2} \\
\end{align}
$$

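has the same coefficient matrix, so it also has a non-zero determinant and a unique solution. It is obtained by taking the complex conjugate of both equations in the first pair, so $(\overline{\lambda_1}, \overline{\lambda_2})$ solves the same system as $(\lambda_1, \lambda_2)$. By uniqueness, $\lambda_1 = \overline{\lambda_1}$ and $\lambda_2 = \overline{\lambda_2}$, which implies both are real. So far we showed any complex number can be expressed as a linear combination of $w_1, w_2$ using real coefficients.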
<br /><br />
We wish to show that any period in $M$ can be expressed as an <i>integral</i> linear combination of $w_1$ and $w_2$. We'll do that by assuming it's not, and get to a contradiction. Start by assuming that either $\lambda_1$ or $\lambda_2$ is not an integer. Let $m_1$ and $m_2$ be the closest integers to $\lambda_1, \lambda_2$ that is:

$$
(5.1) \quad
\begin{align}
\abs{\lambda_1 - m_1} &amp; \le 1/2 \\
\abs{\lambda_2 - m_2} &amp; \le 1/2 \\
\end{align}
$$

Since $m_1, m_2$ are integers, $u = m_1 w_1 + m_2 w_2$ is in $M$ and since we're assuming either $\lambda_1$ and/or $\lambda_2$ are not integers, then $u \ne w$. By <i>Lemma 4</i>, any integer linear combination of $u$ and $w$ is also in $M$, in particular $v = w - u$. We now claim that $\abs{v} \lt \abs{w_2}$. To see why, expand $v$:

$$
v = w - m_1 w_1 - m_2 w_2
$$

Since any complex number can be expressed as a linear combination of $w_1, w_2$ using real coefficients, $w = \lambda_1 w_1 + \lambda_2 w_2$:

$$
v = (\lambda_1 - m_1) w_1 + (\lambda_2 - m_2) w_2
$$

considering the modulus:

$$
\abs{v} = \abs{(\lambda_1 - m_1) w_1 + (\lambda_2 - m_2) w_2}
$$

Because $w_2 / w_1$ is not real, the ratio $(\lambda_1 - m_1) w_1 / ((\lambda_2 - m_2) w_2)$ is also not real when both coefficients are non-zero (if one of them is zero, the bound $\abs{v} \lt \abs{w_2}$ follows directly from $(5.1)$). We can thus use the strict triangle inequality [4], i.e.

$$
\abs{(\lambda_1 - m_1) w_1 + (\lambda_2 - m_2) w_2} \lt \abs{(\lambda_1 - m_1) w_1} + \abs{(\lambda_2 - m_2) w_2}
$$

and obtain:

$$
\abs{v} \lt \abs{(\lambda_1 - m_1) w_1} + \abs{(\lambda_2 - m_2) w_2} =
\abs{\lambda_1 - m_1}\abs{w_1} + \abs{\lambda_2 - m_2}\abs{w_2}
$$

by our choice of $m_1, m_2$ we can use $(5.1)$:

$$
\abs{v} \lt \frac{1}{2}\abs{w_1} + \frac{1}{2}\abs{w_2}
$$

since $\abs{w_1} \le \abs{w_2}$:

$$
\abs{v} \lt \abs{w_2}
$$

The only way this can be true is if $v$ is an integer multiple of $w_1$, because otherwise we'd have picked $w_2 = v$. In that case $\lambda_2 = m_2$ and $\lambda_1 - m_1$ is an integer, implying both $\lambda_1, \lambda_2$ are integers, a contradiction. This means $\lambda_1, \lambda_2$ are integers and thus every $w \in M$ is an integral linear combination of $w_1$ and $w_2$.
</proof>

<p>Any pair of periods that satisfies <em>Lemma 5</em> will be called a <strong>base</strong> for the period module.</p>

<h3 id="unimodular-transformations">Unimodular Transformations</h3>

<p>Suppose we have a base $(w_1, w_2)$ and we want to perform a change of base to $(w'_1, w'_2)$. Since both $w'_1$ and $w'_2$ are in $M$, each can be expressed as an integer linear combination of $(w_1, w_2)$:</p>

\[\begin{align}
w'_1 &amp;= a w_1 + b w_2 \\
w'_2 &amp;= c w_1 + d w_2
\end{align}\]

<p>In matrix form it becomes:</p>

\[\left[ {\begin{array}{c}
   w_1' \\
   w_2' \\
  \end{array} } \right] =
  \left[ {\begin{array}{cc}
   a &amp; b \\
   c &amp; d \\
  \end{array} } \right]
  \left[ {\begin{array}{c}
   w_1 \\
   w_2 \\
  \end{array} } \right]\]

<p>So we can convert from one base to another via a transformation represented by an integer matrix. A <strong>unimodular matrix</strong> is an integer matrix whose determinant is $1$ or $-1$. <em>Lemma 6</em> shows that the base transformation is a unimodular matrix.</p>

<p><strong>Lemma 6.</strong> Let $(w_1, w_2)$ and $(w'_1, w'_2)$ be two bases. Let $T$ be the transformation that takes $(w_1, w_2)$ to $(w'_1, w'_2)$. Then $T$ is unimodular.</p>

<proof>

We start by noticing that

$$
\overline{w'_1} = \overline{a w_1 + b w_2} = a \overline{w_1} + b \overline{w_2}
$$

and similarly for $\overline{w'_2}$, so we can expand the matrix relation to:

$$
  \left[ {\begin{array}{cc}
   w_1' &amp; \overline{w'_1} \\
   w_2' &amp; \overline{w'_2} \\
  \end{array} } \right] =
  \left[ {\begin{array}{cc}
   a &amp; b \\
   c &amp; d \\
  \end{array} } \right]
  \left[ {\begin{array}{cc}
   w_1 &amp; \overline{w_1} \\
   w_2 &amp; \overline{w_2} \\
  \end{array} } \right]
$$

since $(w'_1, w'_2)$ is also a base, an analogous relation exists:

$$
  \left[ {\begin{array}{cc}
   w_1 &amp; \overline{w_1} \\
   w_2 &amp; \overline{w_2} \\
  \end{array} } \right] =
  \left[ {\begin{array}{cc}
   a' &amp; b' \\
   c' &amp; d' \\
  \end{array} } \right]
  \left[ {\begin{array}{cc}
   w'_1 &amp; \overline{w'_1} \\
   w'_2 &amp; \overline{w'_2} \\
  \end{array} } \right]
$$

Putting them together:

$$
  \left[ {\begin{array}{cc}
   w_1 &amp; \overline{w_1} \\
   w_2 &amp; \overline{w_2} \\
  \end{array} } \right] =
  \left[ {\begin{array}{cc}
   a' &amp; b' \\
   c' &amp; d' \\
  \end{array} } \right]
  \left[ {\begin{array}{cc}
   a &amp; b \\
   c &amp; d \\
  \end{array} } \right]
  \left[ {\begin{array}{cc}
   w_1 &amp; \overline{w_1} \\
   w_2 &amp; \overline{w_2} \\
  \end{array} } \right]
$$

As we've shown in <i>Lemma 5</i>, the determinant of the left hand side matrix is non-zero, because otherwise it would imply $w_2 / w_1$ is real, contradicting the hypothesis. This means it also has an inverse, so multiplying the relation above by this inverse gives us:

$$
  \left[ {\begin{array}{cc}
   1 &amp; 0 \\
   0 &amp; 1 \\
  \end{array} } \right] =
  \left[ {\begin{array}{cc}
   a' &amp; b' \\
   c' &amp; d' \\
  \end{array} } \right]
  \left[ {\begin{array}{cc}
   a &amp; b \\
   c &amp; d \\
  \end{array} } \right]
$$

which in turn implies the matrices on the right-hand side are inverse of each other and since the determinant of the identity is 1:

$$
\det \left[ {\begin{array}{cc}
   a' &amp; b' \\
   c' &amp; d' \\
  \end{array} } \right]

\det \left[ {\begin{array}{cc}
   a &amp; b \\
   c &amp; d \\
  \end{array} } \right] = 1
$$

and since the values are all integers, this implies their determinants are both equal to 1 or -1, which is the definition of unimodularity.
</proof>

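<p>We can relate the ratio of the new base to the ratio of the original one. Let $r = w_1/w_2$ and $r' = w'_1/w'_2$:</p>

\[r' = \frac{w'_1}{w'_2} = \frac{aw_1 + bw_2}{c w_1 + dw_2}\]

<p>Dividing the numerator and the denominator of the fraction by $w_2$:</p>

\[(1) \quad r' = \frac{a r + b}{cr + d}\]

<p>and from <em>Lemma 6</em>, we have $\abs{ad - bc} = 1$. The same form holds if we define the ratios as $r = w_2/w_1$ and $r' = w'_2/w'_1$ instead: dividing by $w_1$ gives $r' = (d r + c)/(b r + a)$, which is $(1)$ with the roles of $a, d$ and of $b, c$ swapped, and this leaves $ad - bc$ unchanged.</p>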
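<p>As a small sanity check of the change of base and of $(1)$, here's a sketch in Python. The matrix <code>T</code>, the base and the function names are hypothetical, chosen only for illustration. It applies a unimodular matrix to a base, recovers the original base with the integer inverse, and compares the ratio $w'_1/w'_2$ with $(1)$ evaluated at $r = w_1/w_2$.</p>

```python
def change_base(T, w1, w2):
    """Apply the integer matrix T = ((a, b), (c, d)) to the base (w1, w2)."""
    (a, b), (c, d) = T
    return a * w1 + b * w2, c * w1 + d * w2


def det(T):
    (a, b), (c, d) = T
    return a * d - b * c


def mobius(T, r):
    """The map r -> (a r + b) / (c r + d) from equation (1)."""
    (a, b), (c, d) = T
    return (a * r + b) / (c * r + d)


T = ((2, 1), (1, 1))           # hypothetical unimodular matrix, det(T) = 1
w1, w2 = 1 + 0j, 0.3 + 1.2j    # hypothetical base
u1, u2 = change_base(T, w1, w2)

# Since det(T) is 1 or -1, the inverse det(T) * [[d, -b], [-c, a]] is an integer matrix.
(a, b), (c, d) = T
T_inv = ((d * det(T), -b * det(T)), (-c * det(T), a * det(T)))

print(change_base(T_inv, u1, u2))       # recovers (w1, w2) up to rounding
print(u1 / u2, mobius(T, w1 / w2))      # both sides of (1)
```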

<h3 id="canonical-basis">Canonical Basis</h3>

<p>So far we’ve seen there can be multiple bases for a given period module. However, if we restrict the domain of the ratio $r = w_2/w_1$ for the base, we can make sure it’s unique. The region we’re interested in is defined by the following constraints:</p>

\[\begin{array}{ll}
(i)   &amp; \Im(r) \gt 0 \\
(ii)  &amp; -1/2 \lt \Re(r) \le 1/2 \\
(iii) &amp; \abs{r} \ge 1 \\
(iv)  &amp; \Re(r) \ge 0 \quad \mbox{if } \abs{r} = 1
\end{array}\]

<p>We call it the <strong>fundamental region</strong> and depict it in <em>Figure 1</em>. Our first result is to show a base exists in it, as shown in <em>Lemma 7</em>.</p>

<figure class="center_children">
  <img src="https://www.kuniga.me/resources/blog/2026-01-30-elliptic-functions/fundamental-region.png" alt="See caption" />
  <figcaption>Figure 1. The fundamental region defined by the constraints (i) to (iv) above.</figcaption>
</figure>

<p><strong>Lemma 7.</strong> There exists a base $(w_1, w_2)$ for a period module with $r = w_2/w_1$ in the fundamental region.</p>

<proof>
First we show how to satisfy property $(iii)$. If we choose $w_1$ and $w_2$ as in <i>Lemma 5</i>, we have that $\abs{w_1} \le \abs{w_2}$ and thus $\abs{r} \ge 1$.
<br /><br />
We proceed to property $(i)$. We already know that $\Im(r) \ne 0$ from <i>Lemma 5</i>. If it is negative, we can choose the base $(-w_1, w_2)$. The new ratio is $r' = w_2/(-w_1) = -r$, and since $\Im(-r) = -\Im(r)$, it guarantees $\Im (r') \gt 0$, so $(i)$ is satisfied. Property $(iii)$ hasn't been violated because $\abs{r'} = \abs{-r} = \abs{r}$.
<br /><br />
We now proceed to property $(iv)$. Assume $\abs{r} = 1$, so that $\abs{w_1} = \abs{w_2}$. If $\Re(r) \lt 0$, we can replace the base by $(-w_2, w_1)$. Note that because $\abs{w_1} = \abs{w_2}$, this choice is consistent with the process for picking a base in <i>Lemma 5</i>. The new ratio is $r' = -w_1/w_2 = -1/r$. Since $\abs{r} = 1$, we have $1/r = \overline{r}$, so $\Im(r') = -\Im(\overline{r}) = \Im(r)$ and $\Re(r') = -\Re(\overline{r}) = -\Re(r)$. So if $\Re(r) \lt 0$, then $\Re(r') \gt 0$. It remains to show this change doesn't violate the conditions $(i)$ or $(iii)$: $\abs{r'} = \abs{r} = 1$, which satisfies $(iii)$, and since $\Im(r') = \Im(r) \gt 0$, $(i)$ is satisfied.
<br /><br />
We finally proceed to property $(ii)$. We have that $\abs{w_2} \le \abs{w_1 \pm w_2}$: the points $w_1 \pm w_2$ belong to the period module $M$ and are not multiples of $w_1$, so if one of them had a modulus smaller than $\abs{w_2}$ we'd have picked it instead of $w_2$. We claim that $\abs{\Re(r)} \le 1/2$. We have $w_2  = w_1 r$, so $\abs{w_2} = \abs{w_1} \abs{r}$. We also have

$$
\abs{w_2 \pm w_1} = \abs{w_1 r \pm w_1} = \abs{w_1 (r \pm 1)} = \abs{w_1} \abs{r \pm 1}
$$

Since $\abs{w_2} \le \abs{w_1 \pm w_2}$,

$$
\abs{w_2} \le \abs{w_1} \abs{r \pm 1}
$$

Replacing $\abs{w_2} = \abs{w_1} \abs{r}$,

$$
\abs{w_1} \abs{r} \le \abs{w_1} \abs{r \pm 1}
$$

Since $w_1$ is non-zero we get:

$$
\abs{r} \le \abs{r \pm 1}
$$

Square both sides and use $\abs{z}^2 = z \overline{z}$:

$$
\abs{r}^2 \le \abs{r \pm 1}^2 = (r \pm 1)(\overline{r \pm 1}) = (r \pm 1)(\overline{r} \pm 1) = r \overline{r} \pm r \pm \overline{r} + 1 = \abs{r}^2 \pm (r + \overline{r}) + 1
$$

Subtracting $\abs{r}^2$ from both sides and using $z + \overline{z} = 2 \Re(z)$:

$$
0 \le \pm 2 \Re(r) + 1
$$

which gives us $\Re(r) \ge -1/2$ and $-\Re(r) \ge -1/2$, that is, $\Re(r) \le 1/2$. This almost matches property $(ii)$, except that $(ii)$ requires $\Re (r) \ne -1/2$. If $\Re(r) = -1/2$, then $\abs{r} = \abs{r + 1}$, that is, $\abs{w_2} = \abs{w_1 + w_2}$. Thus $w_3 = w_1 + w_2$ is a candidate for $w_2$ consistent with the process from <i>Lemma 5</i>. The new ratio is $r' = (w_1 + w_2) / w_1 = 1 + r$, and thus $\Re(r') = \Re(r) + 1 = 1/2$, so we satisfy $(ii)$ with this tweak. We need to make sure this doesn't violate properties $(i), (iii)$ or $(iv)$. Since $\abs{r'} = \abs{r + 1} = \abs{r}$, condition $(iii)$ is still satisfied. Since $r' = r + 1$, $\Im(r') = \Im(r)$ and property $(i)$ is satisfied. Property $(iv)$ is also satisfied, since $\Re(r') = 1/2 \ge 0$.

</proof>
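<p>In practice, a ratio can be moved into the fundamental region with the usual reduction by translations $r \rightarrow r + n$ and inversions $r \rightarrow -1/r$. This is a different route from the existence argument in <em>Lemma 7</em>, but both operations correspond to unimodular changes of base. The sketch below assumes floating point arithmetic with a small tolerance <code>eps</code>, and the function names are hypothetical.</p>

```python
import math


def in_fundamental_region(r, eps=1e-12):
    """Check properties (i) to (iv), treating |r| within eps of 1 as |r| = 1."""
    on_circle = abs(abs(r) - 1) <= eps
    return (r.imag > 0
            and -0.5 < r.real <= 0.5
            and abs(r) >= 1 - eps
            and (not on_circle or r.real >= 0))


def reduce_to_fundamental_region(r, eps=1e-12):
    """Map r (with Im(r) > 0) into the fundamental region."""
    if r.imag <= 0:
        raise ValueError("expected Im(r) > 0")
    while True:
        # Translate by an integer so that -1/2 < Re(r) <= 1/2, property (ii).
        r = r - math.ceil(r.real - 0.5)
        if abs(r) < 1 - eps:
            # Inside the unit circle: invert, which increases Im(r), then translate again.
            r = -1 / r
            continue
        if abs(abs(r) - 1) <= eps and r.real < 0:
            # On the unit circle with Re(r) < 0: -1/r mirrors it to Re(r) > 0, property (iv).
            r = -1 / r
        return r


reduced = reduce_to_fundamental_region(2.3 + 0.4j)
print(reduced, in_fundamental_region(reduced))
```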

<p>Now that we know such a base always exists, <em>Lemma 8</em> shows its ratio is unique.</p>

<p><strong>Lemma 8.</strong> Let $(w_1, w_2)$ be a base for a period module with $r = w_2/w_1$ in the fundamental region, then $r$ is unique.</p>

<proof>
Let $(w'_1, w'_2)$ be another base with $r' = w'_2/w'_1$ in the fundamental region. We'll show that $r' = r$. Without loss of generality, assume $\Im(r') \ge \Im(r)$ (if not, we swap the bases). From $(1)$ we have

$$
r' = \frac{a r + b}{cr + d}
$$

with $ad - bc = \pm 1$. Let's compute $\Im (r')$:

$$
\Im(r') = \Im\left(\frac{a r + b}{cr + d}\right)
$$

using that $\Im(z) = (z - \overline{z})/(2i)$,

$$
2i \Im(r') = \left(\frac{a r + b}{cr + d}\right) - \overline{\left(\frac{a r + b}{cr + d}\right)} = \left(\frac{a r + b}{cr + d}\right) - \left(\frac{\overline{a r + b}}{\overline{cr + d}}\right)
$$

Normalizing the denominators (and using $\abs{z}^2 = z \overline{z}$):

$$
= \frac{(a r + b)(\overline{cr + d}) - (\overline{a r + b})(cr + d)}{\abs{cr + d}^2}
$$

Multiplying terms and cancelling out things we end up with

$$
= \frac{(r - \overline{r})(ad - bc)}{\abs{cr + d}^2}
$$

using $ad - bc = \pm 1$:

$$
\Im(r') = \pm \frac{r - \overline{r}}{2i \abs{cr + d}^2}
$$

using that $\Im(z) = (z - \overline{z})/(2i)$:

$$
\Im(r') = \pm \frac{\Im(r)}{\abs{cr + d}^2}
$$

Since both $\Im(r'), \Im(r) \gt 0$ and the denominator is positive, it must be that:

$$
(8.1) \quad \Im(r') = \frac{\Im(r)}{\abs{cr + d}^2}
$$

which in turn implies that

$$(8.2) \quad ad - bc = 1$$

Also, since we're assuming $\Im(r) / \Im(r') \le 1$, we conclude

$$(8.3) \quad \abs{cr + d} \le 1$$

We now consider all possible values $a,b,c$ and $d$ can take. Start by assuming $c = 0$. Then $(8.3)$ gives $\abs{d} \le 1$, so $d \in \curly{-1, 0, 1}$. $d$ can't be 0 because then $ad - bc = 0$, contradicting $(8.2)$. So $\abs{d} = 1$. We have $ad - bc = 1$ and with $c = 0$ that $ad = 1$, which tells us $\abs{a} = 1$ with the same sign as $d$. With this we can simplify $(1)$:
<br /><br />

$$
r' = \frac{ar + b}{cr + d} = \frac{ar}{d} + \frac{b}{d} = r \pm b
$$

Since $b$ is real, $\Re(r') = \Re(r) \pm b$ or $\abs{b} = \abs{\Re(r') - \Re(r)}$. The maximum value the right hand side can assume is less than one, because $-1/2 \lt \Re(r'), \Re(r) \le 1/2$. Since $b$ is integer, it must be 0. Thus $r = r'$.
<br /><br />
This was for the case where $c = 0$. Now assume $c \ne 0$. Thus we can divide $(8.3)$ by $\abs{c}$:

$$\frac{\abs{cr + d}}{\abs{c}} \le \frac{1}{\abs{c}}$$

since the modulus of a quotient is the quotient of the moduli:

$$\abs{\frac{cr + d}{c}} = \abs{r + \frac{d}{c}} \le \frac{1}{\abs{c}}$$

in the complex plane, this inequality corresponds to a closed disk of radius $1/\abs{c}$ centered at the point $-d/c$ on the real line. If $\abs{c} \ge 2$, this implies that $\Im(r) \le 1/2$. However, the smallest value of $\Im(r)$ in the fundamental region is attained when $\Re(r) = 1/2$ and $\abs{r} = 1$, in which case $\Im(r) = \sqrt{3}/2 \gt 1/2$ (this is easier to visualize in <i>Figure 1</i> of the fundamental region), so this cannot happen and thus $\abs{c} \le 1$ and since $c \ne 0$, $\abs{c} = 1$.
<br /><br />
If $\abs{c} = 1$, $(8.3)$ becomes $\abs{r + d/c} \le 1$, where $d/c = \pm d$. Since $\abs{z} \ge \abs{\Re(z)}$, then $\abs{\Re(r) + d/c} \le \abs{r + d/c} \le 1$. Since $\abs{\Re(r)} \le 1/2$, $d$ cannot be too large: if $\abs{d} \ge 2$, then $\abs{\Re(r) + d/c} \ge 3/2 \gt 1$, a contradiction, so $d \in \curly{-1, 0, 1}$.
<br /><br />
Now suppose $d/c = 1$. Then $(8.3)$ becomes $\abs{r + 1} \le 1$, which is the closed unit disk centered at $(-1, 0)$. The only point of this disk with $\abs{r} \ge 1$, $\Re(r) \ge -1/2$ and $\Im(r) \gt 0$ is $(-1/2, \sqrt{3}/2)$, where its boundary meets the unit circle centered at the origin. However this point does not belong to the fundamental region, because $(ii)$ requires $\Re(r) \gt -1/2$ (this is easier to visualize in <i>Figure 1</i> of the fundamental region). Thus $d/c \ne 1$.
<br /><br />
Now suppose $d/c = -1$. Then $(8.3)$ becomes $\abs{r - 1} \le 1$, which is the closed unit disk centered at $(1, 0)$. The only point of this disk with $\abs{r} \ge 1$, $\Re(r) \le 1/2$ and $\Im(r) \gt 0$ is $r = (1/2, \sqrt{3}/2)$, which does belong to the fundamental region. Since this point is on the boundary of the disk, we have the equality $\abs{r - 1} = 1$, which implies $\abs{cr + d}^2 = 1$ and thus, from $(8.1)$, $\Im(r') = \Im(r) = \sqrt{3}/2$. There's a single point with this imaginary part in the fundamental region and thus $r = r'$.
<br /><br />
Finally suppose $d = 0$. Then $(8.3)$ becomes $\abs{r} \le 1$. Since $\abs{r} \ge 1$ in the fundamental region, $\abs{r} = 1$. From $(8.2)$ we get $bc = -1$. Since $\abs{c} = 1$, $\abs{b} = 1$ and they have opposite signs. From $(1)$ we get:

$$r' = \frac{ar + b}{cr} = \frac{a}{c} + \frac{b}{cr}  = \pm a - 1/r$$

Since $\abs{r} = 1$, $\abs{r}^2 = r \overline{r} = 1$ and $\overline{r} = 1/r$, so

$$r' = \pm a - \overline{r}$$

or

$$r' + \overline{r} = \pm a$$

Since $a$ is real, the imaginary parts of $r'$ and $\overline{r}$ cancel out and $\Im(r') = \Im(r)$. We also have that

$$\Re(r') + \Re(\overline{r}) = \Re(r') + \Re(r) = \pm a$$

Since $-1/2 \lt \Re(r'), \Re(r) \le 1/2$, their sum is at most $1$ and greater than $-1$, so $\pm a \in \curly{0, 1}$. If $\pm a = 1$, then it must be that $\Re(r) = \Re(r') = 1/2$. Since $\Im(r') = \Im(r)$, $r' = r$. Now if $a = 0$, $r' = -\overline{r}$ so $\Re(r') = -\Re(r)$. However, since $\abs{r} = 1$ and $\abs{r'} = 1$, property $(iv)$ tells us $\Re(r'), \Re(r) \ge 0$, so it must be that $\Re(r') = \Re(r) = 0$. In this case $\Im(r') = \Im(r) = 1$ and $r' = r$.
<br /><br />
So we proved that no matter which valid combination of $a, b, c$ and $d$, we end up with $r' = r$.
</proof>
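<p>The identity $(8.1)$ is the workhorse of the proof above, so it's worth checking numerically. The snippet below is a sketch under assumptions: the ratio <code>r</code> and the list of integer matrices with $ad - bc = 1$ are arbitrary choices made for this example, and the function names are hypothetical.</p>

```python
def imag_of_image(a, b, c, d, r):
    """Im(r') for r' = (a r + b) / (c r + d)."""
    return ((a * r + b) / (c * r + d)).imag


def predicted_imag(a, b, c, d, r):
    """Right-hand side of (8.1): Im(r) / |c r + d|^2, valid when ad - bc = 1."""
    return r.imag / abs(c * r + d) ** 2


r = 0.2 + 1.5j  # hypothetical ratio with Im(r) > 0
MATRICES = [(1, 0, 0, 1), (2, 1, 1, 1), (0, -1, 1, 0), (1, 2, 0, 1), (3, 2, 4, 3)]

for a, b, c, d in MATRICES:
    assert a * d - b * c == 1
    lhs = imag_of_image(a, b, c, d, r)
    rhs = predicted_imag(a, b, c, d, r)
    print((a, b, c, d), abs(lhs - rhs) < 1e-12)
```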

<p>We define any base $(w_1, w_2)$ with $r = w_2 / w_1$ in the fundamental region as a <strong>canonical basis</strong>. Note that canonical bases are not unique, given that $(-w_1, -w_2)$ is also a base with the same ratio $r$.</p>

<h3 id="lattice">Lattice</h3>

<p>Any base $(w_1, w_2)$ of the period module induces a lattice. The points on this lattice are those of the form $n_1 w_1 + n_2 w_2$ for integers $n_1, n_2$. Each “cell” on this lattice is a parallelogram with vertices $n_1 w_1 + n_2 w_2$, $(n_1 + 1) w_1 + n_2 w_2$, $n_1 w_1 + (n_2 + 1) w_2$ and $(n_1 + 1) w_1 + (n_2 + 1) w_2$.</p>

<p>The defining property of elliptic functions is that for any $a$, $f(a) = f(a + n_1 w_1 + n_2 w_2)$, so $f$ takes the same value at any two points in the same relative position within their respective parallelograms. Thus, an elliptic function is fully specified for the entire complex plane just from a single parallelogram.</p>

<figure class="center_children">
  <img src="https://www.kuniga.me/resources/blog/2026-01-30-elliptic-functions/lattice.png" alt="See caption" />
  <figcaption>Figure 2. The base $w_1$ and $w_2$ induce a lattice where the periods are the vertices. The function is fully specified just from one parallelogram.</figcaption>
</figure>

<h3 id="congruence-classes">Congruence Classes</h3>

<p>We say that $z_1$ is <strong>congruent</strong> to $z_2$ with respect to $M$, denoted by $z_1 \equiv z_2 \pmod M$, if the difference $z_1 - z_2$ belongs to $M$. This induces <strong>congruence classes</strong>, and for $z_1, z_2$ in the same class $f(z_1) = f(z_2)$.</p>
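<p>Congruence can be tested by writing the difference in terms of the base and checking that both coefficients are integers. Here's a minimal sketch, assuming a hypothetical base and floating point with a tolerance. The helper names <code>coords</code>, <code>is_congruent</code> and <code>representative</code> are made up for this example. The last one picks the member of a congruence class that lies in the parallelogram with vertices $0, w_1, w_2, w_1 + w_2$.</p>

```python
import math


def coords(z, w1, w2):
    """Real coefficients (l1, l2) with z = l1*w1 + l2*w2, assuming w2/w1 is not real."""
    l1 = (z * w2.conjugate()).imag / (w1 * w2.conjugate()).imag
    l2 = (z * w1.conjugate()).imag / (w2 * w1.conjugate()).imag
    return l1, l2


def is_congruent(z1, z2, w1, w2, tol=1e-9):
    """True if z1 - z2 is (numerically) an integer combination of w1 and w2."""
    l1, l2 = coords(z1 - z2, w1, w2)
    return abs(l1 - round(l1)) < tol and abs(l2 - round(l2)) < tol


def representative(z, w1, w2):
    """Point congruent to z inside the parallelogram with vertices 0, w1, w2, w1 + w2."""
    l1, l2 = coords(z, w1, w2)
    return (l1 - math.floor(l1)) * w1 + (l2 - math.floor(l2)) * w2


w1, w2 = 1 + 0j, 0.3 + 1.2j      # hypothetical base
z1 = 0.25 + 0.5j
z2 = z1 + 3 * w1 - 2 * w2        # differs from z1 by a period
print(is_congruent(z1, z2, w1, w2))            # same class
print(is_congruent(z1, z1 + 0.5 * w1, w1, w2)) # half a period apart
print(representative(z2, w1, w2))              # close to z1
```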

<h3 id="properties-1">Properties</h3>

<p>We now consider some properties of elliptic functions. <em>Lemma 2</em> and <em>Lemma 3</em> apply to elliptic functions since they’re doubly periodic and satisfy:</p>

\[f(z) = f(z + w_1) \quad \mbox{and} \quad f(z) = f(z + w_2)\]

<p>By definition elliptic functions are meromorphic. If they have no poles then they’re holomorphic. Since $f$ is fully specified from a single parallelogram, and a continuous function is bounded on a closed parallelogram, $f$ is bounded on the entire plane. According to Liouville’s theorem, a bounded entire function must be constant.</p>

<p><strong>Corollary 9.</strong> An elliptic function without poles is a constant.</p>

<p>If a function does have poles, <em>Lemma 10</em> shows the sum of their residues is 0.</p>

<p><strong>Lemma 10.</strong> The sum of residues of an elliptic function is 0.</p>

<proof>

Since poles are isolated singularities and a parallelogram $P$ is a bounded region, inside such a region there's a finite number of poles. We can also translate the lattice so that no poles lie on the boundary of a parallelogram. Consider a curve that corresponds to the boundary of one of the parallelograms, $\delta P$. From the <i>Residue Theorem</i> [5], the integral of $f$ over a closed curve $\gamma$, divided by $2\pi i$, equals the sum of the residues of the poles $a_1, \dots, a_n$ of $f$ inside $\gamma$, each weighted by its winding number:

$$
\frac{1}{2\pi i} \int_\gamma f(z) dz = \sum_{j = 1}^{n} n(\gamma, a_j) \mbox{Res}_{z = a_j} f(z)
$$

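If we take $\gamma$ to be $\delta P$, then due to the periodicity $f$ takes the same values on opposite sides of the parallelogram, which are traversed in opposite directions, so their contributions cancel out and hence

$$
\int_{\delta P} f(z) dz = 0
$$

Since $\delta P$ winds once around each pole inside $P$, $n(\delta P, a_j) = 1$ and the sum of the residues is 0.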

</proof>

<p>Because simple poles (poles of order 1) have non-zero residue, it means that a non-constant elliptic function cannot have a single simple pole.</p>

<p><strong>Lemma 11.</strong> A non-constant elliptic function has as many poles as it has zeros, counted with multiplicity.</p>
<proof>
From <i>Lemma 2</i> and <i>Lemma 3</i> we conclude that $g = f'/f$ is an elliptic function with same periods as $f$. Let $z_1, z_2, \dots, z_k$ be the zeros of $f$ inside a parallelogram $P$ with orders $m_1, \dots, m_k$ and $p_1, p_2, \dots, p_l$ be the poles of $f$ inside $P$ with orders $n_1, \dots, n_l$, where $P$ is translated so that none of them lies on its boundary. From the <i>Argument Principle</i> (<i>Theorem 3</i> in [5]), we have that

$$
\frac{1}{2\pi i} \int_{\delta P} \frac{f'(z)}{f(z)} dz = \sum_{j = 1}^{k} n(\delta P, z_j) m_j - \sum_{j = 1}^{l} n(\delta P, p_j) n_j
$$

Since $f'/f$ is elliptic, the integral around $\delta P$ is 0 by the same argument as in <i>Lemma 10</i>. Since each pole and zero is inside the parallelogram, $\delta P$ winds around each of them once, so $n(\delta P, z_j) = n(\delta P, p_j) = 1$. This proves that

$$
\sum_{j = 1}^{k} m_j = \sum_{j = 1}^{l} n_j
$$

</proof>

<p>An interesting consequence of <em>Lemma 11</em> is the following. Suppose $f(z)$ has $N$ poles. If $p$ is a pole of $f(z)$, then $p$ is also a pole of $f(z) - c$ because if $1/f(p) = 0$ then $1/(f(p) - c) = 0$. From <em>Lemma 11</em>, $f(z) - c$ has as many zeros as poles, so there are $N$ values of $z$ (counted with multiplicity) which make $f(z) = c$. Note that it doesn’t mean that the number of distinct zeros is the same for all $c$, since $N$ accounts for the multiplicity of zeros. The value $N$ is also known as the <strong>order</strong> of $f$.</p>

<p>In Ahlfors’ book [1], he claims that the order is the number of incongruent roots of the equation $f(z) = c$. This seems to imply it’s the same no matter $c$ and equals $N$, but I don’t understand why this is true.</p>

<p><strong>Theorem 12.</strong> Let $a_1, \dots, a_n$ be the zeros and $b_1, \dots, b_n$ be the poles of an elliptic function $f$ with period module $M$, taken inside a parallelogram and each repeated according to its multiplicity. Then</p>

\[a_1 + \dots + a_n \equiv b_1 + \dots + b_n \pmod M\]

<proof>

Consider a parallelogram $P$ and its boundary $\delta P$ translated such that no poles or zeros lie on the boundary. Let $z_1, \dots, z_k$ be the zeros of $f$ inside $P$ with orders $m_1, \dots, m_k$ and $p_1, \dots, p_l$ be the poles of $f$ inside $P$ with orders $n_1, \dots, n_l$. We already know that the poles and zeros of $f(z)$ are simple poles of $f'(z)/f(z)$. For $z \ne 0$, the poles of $f'(z)/f(z)$ coincide with those of $z f'(z)/f(z)$. From the argument principle, the residue of $f'(z)/f(z)$ is $m_j$ at the zero $z_j$ and $-n_j$ at the pole $p_j$. For a simple pole $a$ of $g$ we have that $\mbox{Res}_{z = a} z g(z) = a \mbox{Res}_{z = a} g(z)$ so that:

$$
\frac{1}{2\pi i} \int_{\delta P} \frac{z f'(z)}{f(z)} dz = \sum_{j = 1}^{k} n(\delta P, z_j) z_j m_j - \sum_{j = 1}^{l} n(\delta P, p_j) p_j n_j
$$

as in <i>Lemma 11</i>, $n(\delta P, z_j) = n(\delta P, p_j) = 1$ so:

$$
\frac{1}{2\pi i} \int_{\delta P} \frac{z f'(z)}{f(z)} dz = \sum_{j = 1}^{k} z_j m_j - \sum_{j = 1}^{l} p_j n_j
$$

In this case, since $zf'(z)/f(z)$ is not elliptic, we can't claim the integral is 0. However, consider two parallel sides of the parallelogram: from $a$ to $a + w_1$ and from $a + w_2$ to $a + w_1 + w_2$. Each point $z'$ in the second segment can be written as $z + w_2$ for $z$ on the first segment, and $f'/f$ takes the same value at both. Thus:

$$
\frac{1}{2\pi i} \left(\int_{a}^{a + w_1} \frac{z f'(z)}{f(z)} dz - \int_{a + w_2}^{a + w_1 + w_2} \frac{z f'(z)}{f(z)} dz\right) = \frac{1}{2\pi i} \int_{a}^{a + w_1} \frac{(z - (z + w_2)) f'(z)}{f(z)} dz
$$

Moving the constant out:

$$
= \frac{-w_2}{2\pi i} \int_{a}^{a + w_1} \frac{f'(z)}{f(z)} dz
$$

The integral equals $\log f(a + w_1) - \log f(a)$, with the logarithm continued along the segment. Since $w_1$ is a period, $f(a + w_1) = f(a)$, so the two logarithms can only differ by a multiple of $2\pi i$, that is, $\log f(a + w_1) - \log f(a) = 2\pi i k$ for some integer $k$. Thus we get an integer multiple of $w_2$:

$$
= -w_2 k
$$

Using a similar argument for the other pair of parallel sides we arrive at an integer multiple of $w_1$, and thus the integral over the boundary of the parallelogram, divided by $2 \pi i$, is of the form $k_1 w_1 + k_2 w_2$ for integers $k_1, k_2$. Putting it all together:

$$
\sum_{j = 1}^{k} z_j m_j - \sum_{j = 1}^{l} p_j n_j = k_1 w_1 + k_2 w_2
$$

The value $k_1 w_1 + k_2 w_2$ belongs to $M$. So by definition $\sum_{j = 1}^{k} z_j m_j$ and $\sum_{j = 1}^{l} p_j n_j$ are congruent. These are exactly the sum of the zeros and the sum of the poles when each one is repeated according to its order.

</proof>

<h2 id="conclusion">Conclusion</h2>

<p>Elliptic functions are pretty interesting! I found the name pretty deceptive and more complicated-sounding than it actually is. I much prefer the term doubly-periodic functions.</p>

<p>I’m starting to see why elliptic functions appear in number theory, given that a lot of the results around the period module guarantee that the results are integers. This connection is fascinating!</p>

<h2 id="appendix">Appendix</h2>

<p><strong>Lemma 13.</strong> Let $M$ be the period module of an elliptic function $f(z)$. There are either 2, 4, or 6 non-zero points in $M$ that are the closest to the origin.</p>

<proof>
Let $w_1$ be a non-zero point in $M$ with smallest modulus $r$. Then $-w_1$ must be in $M$ as well, and it has the same smallest modulus $r$. We conclude that there are at least two points with smallest modulus and their number is even.
<br /><br />
Suppose there exists another $w_2$ which is not an integral multiple of $w_1$ but with $\abs{w_2} = r$. Let's consider the square of the modulus of their difference, $\abs{w_1 - w_2}^2$. By the law of cosines:

$$
\abs{w_1 - w_2}^2 = \abs{w_1}^2 + \abs{w_2}^2 - 2\abs{w_1}\abs{w_2} \cos \theta
$$

where $\theta$ is the angle between $w_1$ and $w_2$ seen as vectors. Replacing with $r$:

$$
\abs{w_1 - w_2}^2 = 2r^2 - 2r^2 \cos \theta = 2r^2(1 - \cos \theta)
$$

If $\theta \lt 60^\circ$, then $\cos \theta \gt 1/2$, which would imply:

$$
\abs{w_1 - w_2}^2 = 2r^2(1 - \cos \theta) \lt 2r^2(1 - 1/2) = r^2
$$

So the non-zero point $w' = w_1 - w_2$ would have a modulus smaller than $r$, a contradiction. Thus the points on the circle of radius $r$ must be at least $60^\circ$ apart, so we can "fit" at most 6 of them. Since their number is even, this leaves 2, 4 or 6 as the possibilities.
</proof>
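<p>The three cases of <em>Lemma 13</em> can be observed with a brute-force search. The sketch below is an illustration under assumptions: the function name <code>closest_points</code> and the three example bases are hypothetical, and the search only looks at small coefficients, which is enough when the base is already close to reduced (as the ones used here).</p>

```python
import math


def closest_points(w1, w2, span=3, tol=1e-9):
    """Non-zero points n1*w1 + n2*w2 of minimal modulus, with |n1|, |n2| <= span."""
    points = [n1 * w1 + n2 * w2
              for n1 in range(-span, span + 1)
              for n2 in range(-span, span + 1)
              if (n1, n2) != (0, 0)]
    r = min(abs(p) for p in points)
    return [p for p in points if abs(p) <= r + tol]


generic = closest_points(1 + 0j, 0.3 + 1.2j)                      # a lattice without special symmetry
square = closest_points(1 + 0j, 1j)                               # square lattice
hexagonal = closest_points(1 + 0j, complex(0.5, math.sqrt(3) / 2))  # hexagonal lattice
print(len(generic), len(square), len(hexagonal))
```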

<h2 id="related-posts">Related Posts</h2>

<p><a href="https://www.kuniga.me/blog/2012/09/02/totally-unimodular-matrices.html">Totally Unimodular Matrices</a>. I was initially surprised to see unimodular matrices mentioned in the context of elliptic functions. I had studied them in the context of integer linear programming, but in hindsight it makes sense. There seems to be a deep connection between elliptic functions and integers, as mentioned in the conclusion.</p>

<h2 id="references">References</h2>

<ul>
  <li>[1] Complex Analysis - Lars V. Ahlfors</li>
  <li>[<a href="https://www.kuniga.me/blog/2024/11/02/poles.html">2</a>] NP-Incompleteness - Zeros and Poles</li>
  <li>[<a href="https://www.kuniga.me/blog/2021/07/31/discrete-fourier-transform.html">3</a>] NP-Incompleteness - Discrete Fourier Transforms</li>
  <li>[<a href="https://www.kuniga.me/docs/math/complex.html">4</a>] Complex Numbers Cheat Sheet</li>
  <li>[<a href="https://www.kuniga.me/blog/2025/04/16/residue-theorem.html">5</a>]  NP-Incompleteness - The Residue Theorem</li>
</ul>]]></content><author><name>Guilherme Kunigami</name></author><category term="blog" /><category term="analysis" /><summary type="html"><![CDATA[Suppose we’re given a circle of radius $r$ and two points in its perimeter, and we want to compute the length of the arc between these points. This can be computed using elementary functions, for example by determining the angle $\theta$ in radians between these two points with respect to the center and then the arc length is $\theta r$. For ellipses this is not as trivial. In the 18th century, Giulio Fagnano and Leonhard Euler were the first to use integrals to compute the arc length of an ellipse and these became known as elliptic integrals. Niels Abel and Carl Jacobi studied the inverse of elliptic integrals and later realized they were doubly periodic, that is they have two fundamental periods, as oppose to trigonometric functions line sine that has a single period. Due to this connection double periodic functions became known as elliptic functions.]]></summary></entry><entry><title type="html">[Book] Kafka: The Definitive Guide</title><link href="https://www.kuniga.me/blog/2026/01/17/book-kafka.html" rel="alternate" type="text/html" title="[Book] Kafka: The Definitive Guide" /><published>2026-01-17T00:00:00+00:00</published><updated>2026-01-17T00:00:00+00:00</updated><id>https://www.kuniga.me/blog/2026/01/17/book-kafka</id><content type="html" xml:base="https://www.kuniga.me/blog/2026/01/17/book-kafka.html"><![CDATA[<!-- This needs to be define as included html because variables are not inherited by Jekyll pages -->

<figure class="image_float_left">
  <img src="https://www.kuniga.me/resources//books/kafka.jpg" alt="Book cover." />
</figure>

<p>In this post I’ll share my notes on the book <em>Kafka: The Definitive Guide. Real-Time Data and Stream Processing at Scale</em> by Gwen Shapira, Todd Palino, Rajini Sivaram and Krit Petty.</p>

<p>This book covers many aspects of the popular open-source Kafka, a distributed queue.</p>

<!--more-->

<h2 id="book-summary">Book Summary</h2>

<p>I read the 2nd edition of the book. It has 457 pages including the Appendix, divided into 14 chapters. I skimmed most of the code snippets and chapters explaining configuration parameters since I was mostly interested in Kafka’s high-level architecture (I don’t use Kafka day-to-day).</p>

<p><strong>Architecture.</strong> <em>Chapter 1</em> provides a high-level overview of Kafka. <em>Chapters 3</em> and <em>4</em> dive into Producers and Consumers, respectively. <em>Chapter 6</em> dives into the core of Kafka.</p>

<p><strong>Features.</strong> In terms of feature set (functional and non-functional) and applications, <em>Chapter 7</em> describes Kafka’s reliability, and <em>Chapter 8</em> explains how to achieve exactly-once semantics. <em>Chapter 9</em> explains how to move data from/to Kafka to/from some other database, while <em>Chapter 14</em> explains stream processing, i.e. how to transform data from Kafka.</p>

<p><strong>Operations.</strong> <em>Chapter 2</em> explains how to install it, including dependencies such as <a href="https://www.kuniga.me/blog/2015/08/07/notes-on-zookeeper.html">Zookeeper</a>. <em>Chapter 5</em> explains how to manage Kafka operations (e.g. creating topics) programmatically. <em>Chapter 12</em> covers similar ground, but more manually, via CLI. <em>Chapter 10</em> explains how to replicate data across data centers, <em>Chapter 11</em> how to secure Kafka data and <em>Chapter 13</em> how to monitor Kafka.</p>

<p>As I said, I was mostly looking into learning about how Kafka works. So the most useful chapters for my needs were <em>Chapters 1, 3, 4, 6, 7, 8</em> and <em>14</em>. These are the ones covered in my notes.</p>

<h2 id="kafka-overview">Kafka Overview</h2>

<p>At the most basic level, Kafka is a distributed queue: you can have multiple hosts write data to it (producers) and have multiple hosts read from it (consumers) in a FIFO manner. One key feature of Kafka is that reading the data doesn’t remove the data from the queue. This allows multiple consumers to read the same data and even replay old data as long as it’s within Kafka’s retention.</p>

<p><strong>Messages and Batch.</strong> A <em>message</em> is equivalent to a row in a database. For performance reasons, the unit of data in Kafka is a <em>batch</em> of messages. How a message is encoded is opaque to Kafka (of course it must be consistent between producers and consumers) but a typical serialization used is <a href="https://avro.apache.org/">Apache Avro</a>.</p>

<p>Relatedly, data can have an underlying schema, but it is also opaque to Kafka. The schema must be stored externally to be used by both producers and consumers.</p>

<p><strong>Topics, Partitions and Segments.</strong> Kafka is actually a set of queues, not a single one. Each queue is called a <em>topic</em>.</p>

<p>For each topic there’s a corresponding set of <code class="language-plaintext highlighter-rouge">N</code> <em>partitions</em>. Each partition is replicated in one or more hosts (brokers). The broker that contains the “master” replica is called the <em>leader</em>, while those containing the “slave” replicas are called the <em>followers</em>.</p>

<p>Note that a partition is a logical concept. At a broker level, it’s implemented by a set of files called <em>segments</em>.</p>

<p><strong>Broker and Cluster.</strong> A <em>broker</em> is synonymous with a host, but refers specifically to hosts holding partitions (producers and consumers are not brokers). Note that a host stores partitions from multiple topics.</p>

<p><em>Cluster</em> is a logical grouping of <em>brokers</em>. In practice it seems like a cluster corresponds to a cluster deployment on a datacenter. The key constraint is that communication between brokers in a cluster must be fast.</p>

<p>Cross-cluster replication is possible, but it’s not used in the production path.</p>

<p><strong>Consumers and Producers.</strong> Producers are a set of client hosts that write data to a specific topic. Multiple producers can write to the same topic.</p>

<p>Consumers read data from a topic. In general there’s too much data in a topic for a single host to be able to process it, so multiple consumers must be used to read the data in parallel. These form a <em>consumer group</em>, in which each member reads a unique portion of the data from a topic. Multiple consumer groups can read data from the same topic.</p>

<p><strong>Checkpointing.</strong> Consumers may crash, the consumer group might scale up or down, so Kafka must know exactly up to which point a given consumer read its data. This is known as a checkpoint.</p>

<p>Since multiple consumer groups can read data from a topic, Kafka needs to store multiple checkpoints. These are stored colocated with the actual data in the broker.</p>
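<p>To make the queue-with-checkpoints idea concrete, here’s a toy model in Python. It’s a minimal sketch and not Kafka’s actual API: the class <code class="language-plaintext highlighter-rouge">Topic</code>, its methods and the single in-memory partition are all assumptions made for illustration. Reading doesn’t remove data from the log, and each consumer group keeps its own offset.</p>

```python
class Topic:
    """Toy single-partition topic: an append-only log plus one offset per consumer group."""

    def __init__(self):
        self.log = []
        self.offsets = {}  # consumer group name -> index of the next message to read

    def produce(self, message):
        self.log.append(message)

    def poll(self, group, max_records=2):
        start = self.offsets.get(group, 0)
        records = self.log[start:start + max_records]
        # Checkpoint: remember how far this group has read. The log itself is untouched.
        self.offsets[group] = start + len(records)
        return records


topic = Topic()
for message in ["a", "b", "c"]:
    topic.produce(message)

print(topic.poll("group-1"))  # first two messages
print(topic.poll("group-2"))  # a different group starts from the beginning
print(topic.poll("group-1"))  # group-1 resumes from its checkpoint
```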

<p><strong>Compaction.</strong> Kafka has a feature where messages can define a key (usually a column of the message) and only the last message of a given key is kept. This makes it kind of a key-value store, but the main purpose of this is to save space: if the older values of a key are not needed, their space can be reclaimed.</p>
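<p>The effect of compaction can be sketched in a few lines of Python. This only models the end result on an in-memory list of <code class="language-plaintext highlighter-rouge">(key, value)</code> pairs. The function name <code class="language-plaintext highlighter-rouge">compact</code> is hypothetical and this is not how Kafka implements it on segments.</p>

```python
def compact(log):
    """Keep only the last message for each key, preserving the order of the survivors."""
    last_index = {key: i for i, (key, _) in enumerate(log)}
    return [(key, value) for i, (key, value) in enumerate(log) if last_index[key] == i]


log = [("user-1", "a"), ("user-2", "b"), ("user-1", "c")]
print(compact(log))  # only the latest value of user-1 survives
```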

<h2 id="architecture">Architecture</h2>

<h3 id="writing-data">Writing Data</h3>

<p>The client wanting to write to Kafka must use a Kafka producer library. In the producer configuration, the user must provide the server address of at least one broker. This broker serves as an initial point of contact to return metadata about the other brokers in the cluster. With this information, the producer can talk directly to brokers it wants to write data to. There’s no intermediate gateway of sorts.</p>

<p><strong>Partitioning.</strong> The user can provide a key with each message (usually one of its columns), which the producer uses to determine to which partition, and thus to which broker, it should send the data. This means the assignment is sticky: messages with the same key always go to the same partition. This also means it’s subject to hot keys: if all keys are the same, only one broker will receive all the messages.</p>
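<p>Key-based partitioning boils down to hashing the key and taking it modulo the number of partitions. Here’s a minimal sketch, with assumptions: the function name <code class="language-plaintext highlighter-rouge">partition_for</code> is hypothetical and CRC32 is just a stand-in hash for this example, not the one the Kafka producer uses.</p>

```python
import zlib
from collections import Counter


def partition_for(key: bytes, num_partitions: int) -> int:
    # CRC32 is a stand-in hash chosen for this sketch, not Kafka's own hash function.
    return zlib.crc32(key) % num_partitions


NUM_PARTITIONS = 4

# Sticky: the same key always maps to the same partition.
print(partition_for(b"user-1", NUM_PARTITIONS) == partition_for(b"user-1", NUM_PARTITIONS))

# Hot key: if every message has the same key, a single partition gets all of them.
hot = Counter(partition_for(b"same-key", NUM_PARTITIONS) for _ in range(100))
print(hot)
```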

<p><strong>Batching.</strong> The producer doesn’t send data right away, it batches it for performance. Should the sending fail, there are retry mechanisms provided. The user is responsible for handling client failures: if the client crashes and no persistence mechanism exists, the buffered data is lost.</p>

<p>Note that all messages within a batch must belong to the same partition.</p>

<p><strong>Serialization.</strong> The producer handles serialization from the user format to bytes, and the user can provide a custom serializer. It attaches a piece of metadata describing the schema of the data. It is not the full schema, because the overhead would be prohibitive, but rather a schema ID. It assumes the schema can be looked up from this ID. This enables non-backward-compatible schema evolution: if a producer serializes with schema ID X and later starts to write with schema ID Y, the consumer will know whether to deserialize each message with schema X or Y.</p>
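<p>To make the schema ID idea concrete, here is a sketch under an assumed wire format: a 4-byte schema ID followed by the payload. The layout and the class name <code class="language-plaintext highlighter-rouge">SchemaIdFraming</code> are hypothetical, not the format used by any particular serializer:</p>

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class SchemaIdFraming {
    // Assumed layout for this sketch: [4-byte schema ID][payload bytes].
    static byte[] encode(int schemaId, byte[] payload) {
        return ByteBuffer.allocate(Integer.BYTES + payload.length)
                .putInt(schemaId)
                .put(payload)
                .array();
    }

    // The consumer reads the ID first to decide which schema to deserialize with.
    static int schemaIdOf(byte[] message) {
        return ByteBuffer.wrap(message).getInt();
    }

    static byte[] payloadOf(byte[] message) {
        ByteBuffer buffer = ByteBuffer.wrap(message);
        buffer.getInt(); // skip the schema ID
        byte[] payload = new byte[buffer.remaining()];
        buffer.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        byte[] message = encode(7, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(schemaIdOf(message));
        System.out.println(new String(payloadOf(message), StandardCharsets.UTF_8));
    }
}
```

<p>The overhead is a constant 4 bytes per message regardless of how large the schema is.</p>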

<h3 id="reading-data">Reading Data</h3>

<p>The client wanting to read from Kafka must use a Kafka consumer library. As mentioned before, each client is a member of a consumer group. It’s possible to scale in or out a consumer group by removing or adding more clients to the group. The consumer library will know how to re-distribute reads to them.</p>

<p>Sometimes we need key affinity, for example if the client aggregates the data and needs all the data with the same key to go to it. It’s possible to configure static assignment.</p>

<p>The consumer works by staying in an infinite loop and constantly calling <code class="language-plaintext highlighter-rouge">poll()</code> on the consumer client.</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="nf">while</span> <span class="o">(</span><span class="kc">true</span><span class="o">)</span> <span class="o">{</span>
  <span class="nc">ConsumerRecords</span><span class="o">&lt;</span><span class="nc">String</span><span class="o">,</span> <span class="nc">String</span><span class="o">&gt;</span> <span class="n">records</span> <span class="k">=</span> <span class="nv">consumer</span><span class="o">.</span><span class="py">poll</span><span class="o">(</span><span class="n">timeout</span><span class="o">);</span>

  <span class="nf">for</span> <span class="o">(</span><span class="nc">ConsumerRecord</span><span class="o">&lt;</span><span class="nc">String</span><span class="o">,</span> <span class="nc">String</span><span class="o">&gt;</span> <span class="n">record</span> <span class="k">:</span> <span class="kt">records</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">int</span> <span class="n">updatedCount</span> <span class="k">=</span> <span class="mi">1</span><span class="o">;</span>
    <span class="nf">if</span> <span class="o">(</span><span class="nv">custCountryMap</span><span class="o">.</span><span class="py">containsKey</span><span class="o">(</span><span class="nv">record</span><span class="o">.</span><span class="py">value</span><span class="o">()))</span> <span class="o">{</span>
      <span class="n">updatedCount</span> <span class="k">=</span> <span class="nv">custCountryMap</span><span class="o">.</span><span class="py">get</span><span class="o">(</span><span class="nv">record</span><span class="o">.</span><span class="py">value</span><span class="o">())</span> <span class="o">+</span> <span class="mi">1</span><span class="o">;</span>
    <span class="o">}</span>
    <span class="nv">custCountryMap</span><span class="o">.</span><span class="py">put</span><span class="o">(</span><span class="nv">record</span><span class="o">.</span><span class="py">value</span><span class="o">(),</span> <span class="n">updatedCount</span><span class="o">);</span>
    <span class="nc">JSONObject</span> <span class="n">json</span> <span class="k">=</span> <span class="k">new</span> <span class="nc">JSONObject</span><span class="o">(</span><span class="n">custCountryMap</span><span class="o">)</span> <span class="o">;</span>
    <span class="nv">System</span><span class="o">.</span><span class="py">out</span><span class="o">.</span><span class="py">println</span><span class="o">(</span><span class="nv">json</span><span class="o">.</span><span class="py">toString</span><span class="o">());</span>
  <span class="o">}</span>
<span class="o">}</span></code></pre></figure>

<p><strong>Checkpoints.</strong> As discussed earlier, Kafka uses checkpoints to know at which point of the queue a given consumer is reading. By default, Kafka commits a checkpoint for the data returned by a <code class="language-plaintext highlighter-rouge">poll()</code> when the next <code class="language-plaintext highlighter-rouge">poll()</code> is called. Assuming you don’t call the next <code class="language-plaintext highlighter-rouge">poll()</code> until you make sure the data is committed to the sink, this guarantees at-least-once semantics.</p>
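<p>The at-least-once behavior can be illustrated with a small simulation that doesn’t use the Kafka client at all. The class <code class="language-plaintext highlighter-rouge">AtLeastOnceDemo</code> and its parameters are hypothetical; it assumes a positive batch size and a consumer that crashes once, after processing a batch but before its checkpoint is committed:</p>

```java
import java.util.ArrayList;
import java.util.List;

final class AtLeastOnceDemo {
    // Returns everything the consumer processed, in order, including repeats.
    // crashAtBatch is the batch number after which the consumer crashes once
    // (use -1 for no crash). Assumes batchSize > 0.
    static List<String> deliver(List<String> log, int batchSize, int crashAtBatch) {
        List<String> delivered = new ArrayList<>();
        int committed = 0;   // checkpoint stored on the broker side
        int batchNumber = 0;
        boolean crashed = false;
        while (committed < log.size()) {
            int end = Math.min(committed + batchSize, log.size());
            delivered.addAll(log.subList(committed, end)); // process the batch
            if (!crashed && batchNumber == crashAtBatch) {
                crashed = true; // restart: resume from the last checkpoint
                continue;
            }
            committed = end;    // commit the checkpoint
            batchNumber++;
        }
        return delivered;
    }

    public static void main(String[] args) {
        // The first batch is processed twice: no data is lost, but some repeats.
        System.out.println(deliver(List.of("a", "b", "c", "d"), 2, 0));
    }
}
```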

<p>The consumer also has the flexibility to decide when Kafka saves a checkpoint, by turning off automatic commits and calling an explicit method, <code class="language-plaintext highlighter-rouge">commitSync()</code>. There’s an async version, <code class="language-plaintext highlighter-rouge">commitAsync()</code>, which doesn’t block during the checkpoint but also doesn’t handle retries, so some care must be taken to ensure the right semantics. Exactly-once semantics will be covered later.</p>

<p>Some care must also be taken when repartitioning happens. Suppose we add a new member to the consumer group. Then some existing members will “lose” the partitions they’re reading from, so they must commit a checkpoint before that happens. It’s possible to subscribe to such events.</p>

<p>The broker that keeps tabs on offsets is called the <em>group coordinator</em>, and it is also the partition leader of a topic called <code class="language-plaintext highlighter-rouge">__consumer_offsets</code> (partitioned by the consumer group ID). This topic is only used for durability, in case the leader crashes and the offset map must be reconstructed.</p>

<h3 id="cluster-membership">Cluster Membership</h3>

<p>Kafka uses Zookeeper as the source of truth for which brokers are part of a cluster. Brokers must periodically send a heartbeat to the Zookeeper ensemble, otherwise they’re considered dead.</p>

<p>Zookeeper is also used to elect a leader among the brokers, called the <strong>controller</strong>. The controller is responsible for deciding which broker is the leader of each partition and which are the followers, so it must know when brokers leave or join the cluster.</p>

<h3 id="replication">Replication</h3>

<p>The replication factor Kafka suggests is 2-3 but not more than that. Writes happen to the leader replica and typically reads are also only from the leader, but the latter can be changed to improve performance. The problem is that Kafka uses eventual consistency, so the replicas are usually not in sync.</p>

<p>The follower constantly asks for data from the leader, so the leader has an idea of how far behind each follower is. Followers lagging behind by more than a configured time (10s by default) are considered <em>out of sync</em>, otherwise they’re <em>in-sync</em> replicas. On leader failure, only in-sync replicas are candidates for leader election.</p>
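<p>A sketch of how the leader could classify followers, assuming it tracks the last time each follower was caught up. The class <code class="language-plaintext highlighter-rouge">InSyncReplicas</code>, the map layout and the broker names are hypothetical:</p>

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class InSyncReplicas {
    // Returns the followers whose last caught-up time is within maxLagMs of nowMs.
    static List<String> inSync(Map<String, Long> lastCaughtUpMs, long nowMs, long maxLagMs) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastCaughtUpMs.entrySet()) {
            if (nowMs - entry.getValue() <= maxLagMs) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> lastCaughtUpMs = new LinkedHashMap<>();
        lastCaughtUpMs.put("broker-2", 99_000L); // 1s behind
        lastCaughtUpMs.put("broker-3", 85_000L); // 15s behind
        // With a 10s limit, only broker-2 is eligible for leader election.
        System.out.println(inSync(lastCaughtUpMs, 100_000L, 10_000L));
    }
}
```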

<h3 id="retention">Retention</h3>

<p>Each partition is implemented via a set of files. By default each file contains up to 1 GB or 1 week worth of data, whichever limit is reached first. When either of these limits is reached, Kafka starts writing to a new file.</p>
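<p>The rolling rule can be sketched as follows. This is a simplified model that only counts files and assumes the size and age limits are passed in explicitly; <code class="language-plaintext highlighter-rouge">SegmentRoller</code> is a hypothetical name:</p>

```java
final class SegmentRoller {
    private final long maxBytes;
    private final long maxAgeMs;
    private long segmentStartMs;
    private long segmentBytes = 0;
    private int segmentCount = 1;

    SegmentRoller(long maxBytes, long maxAgeMs, long nowMs) {
        this.maxBytes = maxBytes;
        this.maxAgeMs = maxAgeMs;
        this.segmentStartMs = nowMs;
    }

    // Appends a message, starting a new file first if either limit is reached.
    void append(long nowMs, int messageBytes) {
        boolean tooBig = segmentBytes + messageBytes > maxBytes;
        boolean tooOld = nowMs - segmentStartMs >= maxAgeMs;
        if (segmentBytes > 0 && (tooBig || tooOld)) {
            segmentCount++;
            segmentStartMs = nowMs;
            segmentBytes = 0;
        }
        segmentBytes += messageBytes;
    }

    int segmentCount() {
        return segmentCount;
    }

    public static void main(String[] args) {
        SegmentRoller roller = new SegmentRoller(100, 1_000, 0);
        roller.append(0, 60);
        roller.append(10, 60);    // would exceed 100 bytes: roll by size
        roller.append(2_000, 10); // current file is too old: roll by age
        System.out.println(roller.segmentCount());
    }
}
```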

<p>The file contains metadata about the date range it covers, so an asynchronous process can constantly delete files in which all messages are out of retention.</p>

<h3 id="indexes">Indexes</h3>

<p>Kafka stores an index that maps logical topic offsets to actual files and file offsets. By default the granularity of the index is 4KB, i.e. for every 4KB of data written an entry is added to the index. This index also allows lookup by timestamp.</p>
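<p>A sparse index of this kind can be sketched with a sorted map: a lookup finds the closest indexed offset at or before the target, and the reader scans forward from there. The class <code class="language-plaintext highlighter-rouge">SparseIndex</code> is hypothetical and covers a single file only:</p>

```java
import java.util.Map;
import java.util.TreeMap;

final class SparseIndex {
    private final TreeMap<Long, Long> offsetToFilePosition = new TreeMap<>();
    private final long bytesPerEntry;
    private long bytesSinceLastEntry;

    SparseIndex(long bytesPerEntry) {
        this.bytesPerEntry = bytesPerEntry;
        this.bytesSinceLastEntry = bytesPerEntry; // so the first message is indexed
    }

    // Called for every appended message; adds an entry roughly every bytesPerEntry bytes.
    void onAppend(long offset, long filePosition, int messageBytes) {
        if (bytesSinceLastEntry >= bytesPerEntry) {
            offsetToFilePosition.put(offset, filePosition);
            bytesSinceLastEntry = 0;
        }
        bytesSinceLastEntry += messageBytes;
    }

    // File position to start scanning from, or -1 if the offset precedes the index.
    long scanStartFor(long offset) {
        Map.Entry<Long, Long> entry = offsetToFilePosition.floorEntry(offset);
        return entry == null ? -1 : entry.getValue();
    }

    public static void main(String[] args) {
        SparseIndex index = new SparseIndex(4096);
        for (long offset = 0; offset < 10; offset++) {
            index.onAppend(offset, offset * 1024, 1024); // ten 1KB messages
        }
        System.out.println(index.scanStartFor(6));
    }
}
```

<p>With 1KB messages only every fourth message gets an entry, so a lookup scans at most three extra messages.</p>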

<p>An index is a tradeoff between storage and compute. If we store every single message in the index it would take too much space. No index would make lookups take too long.</p>

<p>This index is needed because consumers can specify a time in the past for replay or to recover from a checkpoint.</p>

<h3 id="compaction">Compaction</h3>

<p>To implement compaction, a background thread processes a segment. It builds an in-memory hash table indexed by key, containing the offset of the most recent message for that key. Then it does a second pass in which it filters out any message whose offset is smaller than the one stored in the hash table for its key, and creates a new segment. The book is very confusing about this process, suggesting this is done in one pass.</p>
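<p>Here is how I picture the two passes, as a sketch over an in-memory list rather than real segment files. <code class="language-plaintext highlighter-rouge">LogCompactor</code> and <code class="language-plaintext highlighter-rouge">Message</code> are hypothetical names:</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LogCompactor {
    static final class Message {
        final long offset;
        final String key;
        final String value;

        Message(long offset, String key, String value) {
            this.offset = offset;
            this.key = key;
            this.value = value;
        }
    }

    static List<Message> compact(List<Message> segment) {
        // Pass 1: record the highest offset seen for each key.
        Map<String, Long> latestOffset = new HashMap<>();
        for (Message message : segment) {
            latestOffset.merge(message.key, message.offset, Long::max);
        }
        // Pass 2: keep only the message holding that highest offset.
        List<Message> compacted = new ArrayList<>();
        for (Message message : segment) {
            if (message.offset == latestOffset.get(message.key)) {
                compacted.add(message);
            }
        }
        return compacted;
    }

    public static void main(String[] args) {
        List<Message> segment = List.of(
                new Message(0, "a", "1"), new Message(1, "b", "1"), new Message(2, "a", "2"));
        for (Message message : compact(segment)) {
            System.out.println(message.offset + " " + message.key + "=" + message.value);
        }
    }
}
```

<p>Surviving messages keep their original offsets, so the compacted segment has gaps in its offset sequence.</p>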

<p>Note that this is an “eventual” compaction. Consumers reading from the past might still see “dirty” messages. It’s assumed that clients reading from compacted topics are themselves writing to some sort of key value store that also retains only the last entry.</p>

<p>A similar mechanism is used even for non-compacted topics, when data must be deleted. The producer might send a special message called a <em>tombstone</em>, which is basically a message with the key to be deleted and a null value. Consumers are supposed to handle this tombstone: if they process this message, they should delete the corresponding key on their side.</p>

<h2 id="exactly-once-semantics">Exactly-Once Semantics</h2>

<p>By default, Kafka offers at-least-once semantics. This means that on consumer restart or repartitions, it guarantees that all data it stores will be sent to the consumer but it might send the same data more than once.</p>

<p>Achieving exactly-once is very difficult because the process of reading, processing and writing data must become atomic, so Kafka must be involved in the whole process. Since Kafka must be aware of the writes, a limitation of this process is that the sink of the application must also be a Kafka topic.</p>

<p><em>Chapter 8</em> describes the setup necessary.</p>

<h3 id="idempotent-producers">Idempotent Producers</h3>

<p>To start, the producer must be made idempotent. This means that each message will contain extra metadata: the producer ID and a sequence number, which are used to uniquely identify a message and to detect gaps. For each partition, the broker keeps the last N messages from each producer. If it gets a sequence number that is already among these N messages, it simply discards the duplicate.</p>

<p>If it gets a sequence number out of that range it will fail. If the last sequence number it has is <code class="language-plaintext highlighter-rouge">x</code>, and it gets <code class="language-plaintext highlighter-rouge">x + 2</code>, it will detect a gap in the sequence and fail as well.</p>
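<p>The three outcomes (accept, deduplicate, fail) can be sketched with a tracker for a single producer and partition. This is my own simplified model, with a hypothetical <code class="language-plaintext highlighter-rouge">SequenceTracker</code> class that remembers the last <code class="language-plaintext highlighter-rouge">window</code> sequence numbers:</p>

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class SequenceTracker {
    enum Result { ACCEPTED, DUPLICATE, REJECTED }

    private final int window;                        // the "N" most recent messages
    private final Deque<Integer> recent = new ArrayDeque<>();
    private int lastSequence = -1;

    SequenceTracker(int window) {
        this.window = window;
    }

    Result onMessage(int sequence) {
        if (sequence == lastSequence + 1) {          // the expected next message
            recent.addLast(sequence);
            if (recent.size() > window) {
                recent.removeFirst();
            }
            lastSequence = sequence;
            return Result.ACCEPTED;
        }
        if (recent.contains(sequence)) {             // a retry of a recent message
            return Result.DUPLICATE;
        }
        return Result.REJECTED;                      // too old, or a gap in the sequence
    }

    public static void main(String[] args) {
        SequenceTracker tracker = new SequenceTracker(2);
        int[] sequences = {0, 1, 1, 3, 2, 0};
        for (int sequence : sequences) {
            System.out.println(sequence + " -> " + tracker.onMessage(sequence));
        }
    }
}
```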

<h3 id="transaction">Transaction</h3>

<p>Kafka implements transactions by handling both the writes to the sink and the checkpointing. The producer writing to the sink must be marked as transactional. Here’s an example of a write with transactions (compare this with the simpler example in <em>Reading Data</em>):</p>

<figure class="highlight"><pre><code class="language-scala" data-lang="scala"><span class="nv">producer</span><span class="o">.</span><span class="py">initTransactions</span><span class="o">();</span>

<span class="nf">while</span> <span class="o">(</span><span class="kc">true</span><span class="o">)</span> <span class="o">{</span>
  <span class="k">try</span> <span class="o">{</span>
    <span class="nc">ConsumerRecords</span><span class="o">&lt;</span><span class="nc">Integer</span><span class="o">,</span> <span class="nc">String</span><span class="o">&gt;</span> <span class="n">records</span> <span class="k">=</span> <span class="nv">consumer</span><span class="o">.</span><span class="py">poll</span><span class="o">(</span><span class="n">timeout</span><span class="o">);</span>
    <span class="nf">if</span> <span class="o">(</span><span class="nv">records</span><span class="o">.</span><span class="py">count</span><span class="o">()</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="o">)</span> <span class="o">{</span>
      <span class="nv">producer</span><span class="o">.</span><span class="py">beginTransaction</span><span class="o">();</span>
      <span class="nf">for</span> <span class="o">(</span><span class="nc">ConsumerRecord</span><span class="o">&lt;</span><span class="nc">Integer</span><span class="o">,</span> <span class="nc">String</span><span class="o">&gt;</span> <span class="n">record</span><span class="k">:</span> <span class="kt">records</span><span class="o">)</span> <span class="o">{</span>
        <span class="nc">ProducerRecord</span><span class="o">&lt;</span><span class="nc">Integer</span><span class="o">,</span> <span class="nc">String</span><span class="o">&gt;</span> <span class="n">customizedRecord</span> <span class="k">=</span> <span class="nf">transform</span><span class="o">(</span><span class="n">record</span><span class="o">);</span>
        <span class="nv">producer</span><span class="o">.</span><span class="py">send</span><span class="o">(</span><span class="n">customizedRecord</span><span class="o">);</span>
      <span class="o">}</span>
      <span class="nc">Map</span><span class="o">&lt;</span><span class="nc">TopicPartition</span><span class="o">,</span> <span class="nc">OffsetAndMetadata</span><span class="o">&gt;</span> <span class="n">offsets</span> <span class="k">=</span> <span class="nf">consumerOffsets</span><span class="o">();</span>
      <span class="nv">producer</span><span class="o">.</span><span class="py">sendOffsetsToTransaction</span><span class="o">(</span><span class="n">offsets</span><span class="o">,</span> <span class="nv">consumer</span><span class="o">.</span><span class="py">groupMetadata</span><span class="o">());</span>
      <span class="nv">producer</span><span class="o">.</span><span class="py">commitTransaction</span><span class="o">();</span>
    <span class="o">}</span>
  <span class="o">}</span> <span class="nf">catch</span> <span class="o">(</span><span class="nc">KafkaException</span> <span class="n">e</span><span class="o">)</span> <span class="o">{</span>
    <span class="nv">producer</span><span class="o">.</span><span class="py">abortTransaction</span><span class="o">();</span>
    <span class="nf">resetToLastCommittedPositions</span><span class="o">(</span><span class="n">consumer</span><span class="o">);</span>
  <span class="o">}</span>
<span class="o">}</span></code></pre></figure>

<h3 id="init-transactions">Init Transactions</h3>

<p>To prevent multiple producers with the same ID writing to Kafka, it uses a fencing mechanism. It stores a map of producer ID -&gt; timestamp. When a producer calls <code class="language-plaintext highlighter-rouge">initTransactions()</code> it sets the current timestamp for its ID in that map. It will keep using this same timestamp when writing messages.</p>

<p>When writing a message, the broker will check if the timestamp on the message matches the one in the map. If not it will reject. This prevents multiple producers from being alive at the same time. If a new producer starts and registers a higher timestamp, it overtakes all the other existing producers.</p>
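<p>The fencing check can be sketched as below. To keep the example deterministic I use a strictly increasing counter as a stand-in for the timestamp; the class <code class="language-plaintext highlighter-rouge">ProducerFencing</code> and its method names are hypothetical, not the broker’s actual code:</p>

```java
import java.util.HashMap;
import java.util.Map;

final class ProducerFencing {
    private final Map<String, Long> currentStamp = new HashMap<>();
    private long clock = 0; // assumption: a counter standing in for the timestamp

    // Registers a (new) producer instance and returns the stamp it must send with writes.
    long initTransactions(String producerId) {
        long stamp = ++clock;
        currentStamp.put(producerId, stamp);
        return stamp;
    }

    // A write is accepted only if it carries the most recently registered stamp.
    boolean acceptWrite(String producerId, long stamp) {
        Long current = currentStamp.get(producerId);
        return current != null && current == stamp;
    }

    public static void main(String[] args) {
        ProducerFencing broker = new ProducerFencing();
        long oldStamp = broker.initTransactions("producer-1");
        long newStamp = broker.initTransactions("producer-1"); // a second instance starts
        System.out.println(broker.acceptWrite("producer-1", oldStamp)); // the older instance is fenced
        System.out.println(broker.acceptWrite("producer-1", newStamp));
    }
}
```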

<p>When a producer calls <code class="language-plaintext highlighter-rouge">initTransactions()</code> Kafka also selects one of the brokers to be the <em>transaction coordinator</em> (hash of the producer ID, so the mapping is sticky). It will keep in-memory metadata about the transaction, such as state and which partitions are participating in the transaction. This broker is also the partition leader of a special internal topic named <code class="language-plaintext highlighter-rouge">__transaction_state</code> (partitioned by producer ID) which can be used to restore the state.</p>

<h3 id="begin-transaction">Begin Transaction</h3>

<p>To start a transaction, the producer calls <code class="language-plaintext highlighter-rouge">producer.beginTransaction()</code>. This doesn’t talk to Kafka, but the producer client is aware that the next message it sends will be inside a transaction and it will tell Kafka that.</p>

<p>When the producer does send a message, the coordinator first adds a message to the topic <code class="language-plaintext highlighter-rouge">__transaction_state</code> indicating that a transaction started. The message is then processed by the respective brokers as normal, but they mark the message as uncommitted (so that it doesn’t get returned to consumers).</p>

<p>The coordinator writes a message to <code class="language-plaintext highlighter-rouge">__transaction_state</code> with all the partitions that have been changed by the current write.</p>

<p>In addition to sending the messages, we also need to update the offsets for the consumers. Note that it’s the producer that commits the offset now, not the consumer: <code class="language-plaintext highlighter-rouge">producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata())</code>.</p>

<p>The <em>group coordinator</em> will keep non-committed offsets in a special place and once the transaction is committed it will merge these offsets into its main map.</p>

<h3 id="commit-transaction">Commit Transaction</h3>

<p>When the producer calls <code class="language-plaintext highlighter-rouge">producer.commitTransaction()</code>, the coordinator writes a message in the topic <code class="language-plaintext highlighter-rouge">__transaction_state</code> with the state <code class="language-plaintext highlighter-rouge">PREPARE_COMMIT</code>. Then it sends a request to the leaders of all the partitions involved to add a special message representing a “commit” marker. So when a broker returns its messages, only those preceding the last “commit” marker are considered valid.</p>

<p>Since the offset topic <code class="language-plaintext highlighter-rouge">__consumer_offsets</code> has also been changed during the transaction, it’s also notified and will add the “commit” marker.</p>

<h3 id="abort-transaction">Abort Transaction</h3>

<p>Similarly, if the transaction needs to be aborted, the producer calls <code class="language-plaintext highlighter-rouge">producer.abortTransaction()</code> and the coordinator goes through the same process, but instead of a “commit” marker it writes an “abort” marker. Messages between the previous marker and the “abort” marker are considered deleted.</p>
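<p>To check my understanding of the markers, here is a sketch of which messages a consumer would see in a single partition. It assumes one transactional producer per partition and encodes markers as the strings <code class="language-plaintext highlighter-rouge">COMMIT</code> and <code class="language-plaintext highlighter-rouge">ABORT</code>, which is purely an illustration device; <code class="language-plaintext highlighter-rouge">ReadCommitted</code> is a hypothetical name:</p>

```java
import java.util.ArrayList;
import java.util.List;

final class ReadCommitted {
    // Returns the data messages visible to consumers: those followed by a COMMIT marker.
    static List<String> visible(List<String> log) {
        List<String> visible = new ArrayList<>();
        List<String> pending = new ArrayList<>(); // messages of the open transaction
        for (String entry : log) {
            if (entry.equals("COMMIT")) {
                visible.addAll(pending);
                pending.clear();
            } else if (entry.equals("ABORT")) {
                pending.clear();                  // treated as deleted
            } else {
                pending.add(entry);
            }
        }
        return visible; // anything still pending is uncommitted and not returned
    }

    public static void main(String[] args) {
        System.out.println(visible(List.of("a", "b", "COMMIT", "c", "ABORT", "d")));
    }
}
```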

<p>Differently from the success case, in an aborted transaction we need the consumer to call <code class="language-plaintext highlighter-rouge">resetToLastCommittedPositions()</code>. While the <code class="language-plaintext highlighter-rouge">__consumer_offsets</code> topic also received the “abort” marker and will know not to return the pending offsets, on the client side we already moved to the next offset, so we need to reset it.</p>

<h2 id="data-pipelines">Data Pipelines</h2>

<p>Kafka is the input source for stream processing frameworks such as Flink, but it also provides its own stream processing framework known as <em>Kafka Connect</em>. This is mostly useful for sending data from Kafka to another source type (e.g. MySQL) or vice-versa, with simple transformations.</p>

<h2 id="related-posts">Related Posts</h2>

<p>In the post <a href="https://www.kuniga.me/blog/2025/03/29/queues.html">Queues</a> we discussed queues in general.</p>

<h2 id="conclusion">Conclusion</h2>

<p>I got what I wanted from this book: I learned a lot about Kafka! The part I found the most interesting and difficult was the transactions. I found it amusing how Kafka uses special topics for a bunch of internal operations, including checkpointing and transactions but also cluster replication. It’s like how in Linux everything is a file: in Kafka everything is a topic.</p>

<p>As I was trying to write down my understanding, I found some difficult topics such as compaction and transactions were missing details so I had to complement with external research.</p>

<p>As I tried to summarize the chapters I realized that related chapters don’t seem to be grouped together and some like <em>Chapter 5</em> and <em>12</em> feel like they should be one. This seems to be a common artifact of multi-author books, but didn’t impact the content of the book.</p>]]></content><author><name>Guilherme Kunigami</name></author><category term="blog" /><category term="distributed systems" /><summary type="html"><![CDATA[In this post I’ll share my notes on the book Kafka: The Definitive Guide. Real-Time Data and Stream Processing at Scale by Gwen Shapira, Todd Palino, Rajini Sivaram and Krit Petty. This book covers many aspects of the popular open-source Kafka, a distributed queue.]]></summary></entry><entry><title type="html">2025 in Review</title><link href="https://www.kuniga.me/blog/2026/01/01/2025-in-review.html" rel="alternate" type="text/html" title="2025 in Review" /><published>2026-01-01T00:00:00+00:00</published><updated>2026-01-01T00:00:00+00:00</updated><id>https://www.kuniga.me/blog/2026/01/01/2025-in-review</id><content type="html" xml:base="https://www.kuniga.me/blog/2026/01/01/2025-in-review.html"><![CDATA[<!-- This needs to be define as included html because variables are not inherited by Jekyll pages -->

<div class="headline">


<figure class="image_float_left">
  <img src="https://www.kuniga.me/resources/blog/2026-01-01-2025-in-review/chatgpt-2025.jpeg" alt="Image ChatGPT generated in my 2025 review." />
</figure>

This is a meta-post to review what happened in 2025. Every year I go over the posts I wrote, reflect on the blog as a whole and on the personal side, share things I've done (mostly trips and books read).

The thumbnail was generated by "Your year with ChatGPT" and I found it reflects well what I share in this post.

</div>

<!--more-->

<h2 id="posts-summary">Posts Summary</h2>

<p>In 2025 I focused mostly on my studies of Complex Analysis, C++ and operating systems. I was happy with the balance of work vs non-work related (almost half-half).</p>

<h3 id="complex-analysis">Complex Analysis</h3>

<p>I started reading the book <em>Complex Analysis</em> by Ahlfors in September of 2023. I’ve made steady progress on it in 2025 but I’ve dropped my goal of covering every single topic in the book. I skipped a few topics from the chapters on the Dirichlet problem.</p>

<p>My current plan is to study the elliptic functions chapter but skip the last chapter on globally analytic functions and my new estimate is that I can finish this book before mid 2026.</p>

<p>Here’s a list of all the posts I wrote as notes from the book this year:</p>

<ul>
  <li><a href="https://www.kuniga.me/blog/2025/01/18/max-principle.html">The Maximum Principle</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/03/15/general-cauchy.html">The General Form of Cauchy’s Theorem</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/04/16/residue-theorem.html">The Residue Theorem</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/05/31/runge-theorem.html">Runge’s Theorem</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/06/17/mittag-leffler-theorem.html">Mittag-Leffler’s Theorem</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/07/02/weierstrass-factorization-theorem.html">Weierstrass Factorization Theorem</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/07/19/gamma-function.html">The Gamma Function</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/08/01/harmonic-functions.html">Harmonic Functions</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/08/30/hadamard-theorem.html">Hadamard Factorization Theorem</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/10/25/riemann-zeta-function.html">The Riemann Zeta Function</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/12/14/subharmonic-functions.html">Subharmonic Functions</a></li>
  <li><a href="https://www.kuniga.me/blog/2025/12/31/perron-method.html">The Perron Method</a></li>
</ul>

<h3 id="work">Work</h3>

<p>I learned more about C++ such as <a href="https://www.kuniga.me/blog/2025/01/25/vector-views-in-cpp.html">Views</a>, <a href="https://www.kuniga.me/blog/2025/08/17/cpp-concepts.html">Concepts</a> and <a href="https://www.kuniga.me/blog/2025/09/10/cpo-in-cpp.html">CPOs</a>. I also studied Folly <a href="https://www.kuniga.me/blog/2025/05/02/folly-futures.html">futures</a>, <a href="https://www.kuniga.me/blog/2025/06/07/folly-executors.html">executors</a>, <a href="https://www.kuniga.me/blog/2025/06/18/folly-coroutines.html">coroutines</a> and their <a href="https://www.kuniga.me/blog/2025/11/01/coroutine-lifetimes.html">lifetimes</a>. As a background for Folly executors I also studied <a href="https://www.kuniga.me/blog/2025/05/16/async-io.html">asynchronous I/O</a>.</p>

<p>I studied how C++ binaries are compiled (in particular by <a href="https://www.kuniga.me/blog/2025/03/22/review-llvm-core-libs.html">LLVM</a>), linked (via <a href="https://www.kuniga.me/blog/2025/04/25/shared-libraries.html">shared libraries</a>) and executed (<a href="https://www.kuniga.me/blog/2025/04/12/elf.html">ELF</a> and <a href="https://www.kuniga.me/blog/2025/07/15/jemalloc.html">jemalloc</a>). More generally, I also had posts on distributed systems and operating systems: I wrote about <a href="https://www.kuniga.me/blog/2025/03/29/queues.html">queues</a>, read books on <a href="https://www.kuniga.me/blog/2025/07/26/review-the-hard-parts.html">microservices</a>, <a href="https://www.kuniga.me/blog/2025/10/10/review-systems-performance.html">system performance</a> and <a href="https://www.kuniga.me/blog/2025/12/28/book-bpf-performance-tools.html">BPF</a>.</p>

<p>I’m satisfied with being able to read so many technical books this year. I did this by reserving 30-60 min each morning for reading, and got through 4 books amounting to a few thousand pages. I applied some techniques described in <a href="https://www.kuniga.me/books/atomic-habits">Atomic Habits</a>: I committed to read at least one page a day and, more importantly, I started using a new phone without any work-related or social media apps. It’s incredible how much time I had been wasting each morning on those.</p>

<h3 id="leisure">Leisure</h3>

<p>I didn’t explore much outside of work and complex analysis but I enjoyed writing about my last visit to the <a href="https://www.kuniga.me/blog/2025/02/07/computer-history-museum.html">Computer History Museum</a>.</p>

<p>I belatedly completed the <a href="https://adventofcode.com/2024">Advent of Code</a> <em>2024</em>. It was a fun experience and I started writing a post about the problems but didn’t have time to finish.</p>

<h3 id="personal">Personal</h3>

<p>I wrote more “shower-thoughts” / personal opinion posts about <a href="https://www.kuniga.me/blog/2025/01/11/neomania.html">neomania</a>, <a href="https://www.kuniga.me/blog/2025/08/24/horseshoe-theory.html">horseshoe theory</a> and <a href="https://www.kuniga.me/blog/2025/12/20/complex-systems.html">complex systems</a>. A bit short of my plan of writing once a quarter but I’ve enjoyed working on these and hope to continue next year.</p>

<h2 id="the-blog-in-2025">The Blog in 2025</h2>

<h3 id="numbers">Numbers</h3>

<p>I have a resolution to post at least once a month on average but this year I actually averaged more than one post every other week, by writing 33 of them. The blog completed 15 years with 269 posts.</p>

<p>I still use <a href="https://www.goatcounter.com/">GoatCounter</a> for analytics. The tool <a href="https://www.kuniga.me/bulls_and_cows/">A Bulls and Cows Solver</a>
continues to be the most visited page with 4,744 visits. This is a solver I built for fun in 2018 as part of a post, <a href="https://www.kuniga.me/blog/2018/06/04/bulls-and-cows.html">Bulls and Cows</a>, which is a code-breaking 2-player game. I explain why this page is popular in my <a href="https://www.kuniga.me/blog/2025/01/01/2024-in-review.html">2024 review</a>.</p>

<p>Restricted to the blog, <a href="https://www.kuniga.me/blog/2021/05/13/lpc-in-python.html">Linear Predictive Coding in Python</a> written in 2021 continues to be the most popular post with 654 visits. Surprisingly, a post I wrote in 2024, <a href="https://www.kuniga.me/blog/2024/03/01/understanding-call-once-in-cpp.html">Understanding std::call_once() in C++</a> became quite popular with 571 visits. From this year, <a href="https://www.kuniga.me/blog/2025/06/07/folly-executors.html">Folly Executors</a> had the most visits, 119.</p>

<p>Overall traffic to my website grew by a tiny amount, from 17,098 to 17,145.</p>

<h3 id="ai">AI</h3>

<p>I’ve been noticing a drop in traffic to my blog posts, which I attribute to tools like ChatGPT and the fact that Google now often answers questions inline without users having to click links.</p>

<p>I actually don’t mind this change in pattern too much. I never wrote with visits in mind, but I’m happy when I see a popular post because I assume people find it useful. With AI it’s still possible that my posts are helping people behind LLMs, but I’ve lost any visibility into it.</p>

<h3 id="citations">Citations</h3>

<p>From time to time I Google my blog address to see if I can find any interesting references and this time I found it has been cited by a paper by folks at Purdue: <a href="https://arxiv.org/abs/2407.04583">Unbalanced optimal transport for stochastic particle tracking</a>. I couldn’t find it in any journal and it doesn’t seem peer reviewed, but it cites a 2013 post <a href="https://www.kuniga.me/blog/2013/08/13/totally-unimodular-matrix-recognition.html">Totally Unimodular Matrix Recognition</a>.</p>

<p>Most of my posts do not provide original ideas, being just notes on books and papers I read, so I’m surprised that anyone would cite one instead of the original source directly. Then again, most textbooks are also not original in content (only in presentation), often sourcing from papers, and still many papers cite textbooks.</p>

<h2 id="resolutions-for-2026">Resolutions for 2026</h2>

<p>As I mentioned above, I’ll commit to finish learning about complex analysis. I’m hoping to have time to start a new topic, which for now is <a href="https://en.wikipedia.org/wiki/Classical_mechanics">classical mechanics</a>.</p>

<p>I also want to explore more LLM tools such as Claude. The landscape has been changing very fast and I find it hard to stay on top of the best tools, so I want to dedicate time to explore them in more depth. I usually don’t follow the <a href="https://www.kuniga.me/blog/2025/01/11/neomania.html">newest trends</a> but this might be time well spent because it can make me more efficient and end up saving time in the long run.</p>

<h2 id="personal-1">Personal</h2>

<p>The end of the year is a good time to look back and remember all the things I’ve done besides work and the technical blog.</p>

<h3 id="trips">Trips</h3>

<figure class="center_children">
    <img src="https://www.kuniga.me/resources/blog/2026-01-01-2025-in-review/vietnam.png" alt="a collage of photos from a trip to Vietnam." />
    <figcaption>
      Vietnam, Part I. Top:
      1. <a href="https://photos.app.goo.gl/uYYrPvA1U7Ue99ou8" target="_blank">Boat ride in the Mekong Delta</a>, Bến Tre;
      2. <a href="https://photos.app.goo.gl/3xUbKxBaMR3CBbcc9" target="_blank">Riverside</a>, Hội An;
      3. <a href="https://photos.app.goo.gl/B82DaamKYXgsFgYJ9" target="_blank">Ruins of Champa Temples</a>, Mỹ Sơn.
      Bottom:
      4. <a href="https://photos.app.goo.gl/nreacHi4JH7ctcDSA" target="_blank">An Dinh Palace</a>, Hue;
      5. <a href="https://photos.app.goo.gl/h6xuWmeKzu3ruaQk9" target="_blank">Rice fields</a>, Mai Châu;
      6. <a href="https://photos.app.goo.gl/J3FV5sURga37rf2C7" target="_blank">Hạ Long Bay</a>.
    </figcaption>
</figure>

<p>In February, we visited Singapore and Vietnam. It was my first time in Southeast Asia and I learned a lot about the rich Vietnamese <a href="https://www.kuniga.me/docs/history/vietnam/">history</a> and enjoyed the local food a lot.</p>

<figure class="center_children">
    <img src="https://www.kuniga.me/resources/blog/2026-01-01-2025-in-review/singapore.png" alt="a collage of photos from a trip to Vietnam and Singapore." />
    <figcaption>
      Vietnam, Part II and Singapore. Top:
      1. <a href="https://photos.app.goo.gl/juyriNSiJvVdguCt9" target="_blank">View from the top of a hike</a> in Hoa Lư;
      2. <a href="https://photos.app.goo.gl/PbZWsH6pwqk6a4t36" target="_blank">Water puppet</a> theater, Hanoi;
      3. <a href="https://photos.app.goo.gl/6bQP5NNrycsAKPb6A" target="_blank">Bars on the side of train tracks</a>, Hanoi.
      Bottom:
      4. <a href="https://photos.app.goo.gl/bDq3zjV4g7Jkaq6KA" target="_blank">Marina Bay Sands and the Merlion</a>, Singapore;
      5. <a href="https://photos.app.goo.gl/YCquR1Paku4GC7uZ9" target="_blank">Gardens by the Bay</a> at night, Singapore;
      6. <a href="https://photos.app.goo.gl/PNoWvV4vT1juDpqT9" target="_blank">Changi Airport</a>, Singapore.
    </figcaption>
</figure>

<p>In November, after a business trip to London, we decided to visit Morocco. I also learned a lot about Morocco’s <a href="https://www.kuniga.me/docs/history/morocco">history</a> and was fascinated by its Islamic architecture, especially the tile work (zellij). It was a nice historical connection to the trip to <a href="https://www.kuniga.me/blog/2025/01/01/2024-in-review.html">Andalucia last year</a> because the Moors from Andalucia came from Morocco.</p>

<figure class="center_children">
    <img src="https://www.kuniga.me/resources/blog/2026-01-01-2025-in-review/morocco.png" alt="a collage of photos from a trip to Morocco" />
    <figcaption>
      Morocco. Top:
      1. <a href="https://photos.app.goo.gl/kKXCLZnbhssUjXeU9" target="_blank">Ksar of Aït Benhaddou</a>, Atlas Mountain Range;
      2. <a href="https://photos.app.goo.gl/pXihx3DBFhF9Ukw68" target="_blank">Dar El Bacha Palace</a>, Marrakesh;
      3. <a href="https://photos.app.goo.gl/oVQT3xyPBKudc3zp6" target="_blank">Tanneries</a>, Fez.
      Bottom:
      4. <a href="https://photos.app.goo.gl/wMEsT6Q7re8buW5s7" target="_blank">Blue buildings</a>, Chefchaouen;
      5. <a href="https://photos.app.goo.gl/Mp5Rm7f78RQhvfASA" target="_blank">Hassan II Mosque</a>, Casablanca;
      6. <a href="https://photos.app.goo.gl/jueYX1AM2AYmX8yP9" target="_blank">Roman Ruins</a>, Volubilis.
    </figcaption>
</figure>

<p>We also took quick trips to Portland, Oregon, and to Denver and Rocky Mountain National Park in Colorado.</p>

<h3 id="books">Books</h3>

<p>This year I started publishing the notes for books I read as soon as I finish them. In fact from now on I don’t consider a book read until I do that. The list of all books I read this year is now on my <a href="https://www.kuniga.me/books/">books page</a>. In this post I’ll make some overall commentary.</p>

<h4 id="fiction-and-poetry">Fiction and Poetry</h4>

<p>This year I read more fiction than usual, in particular sci-fi. Recommended by a friend, <a href="https://www.kuniga.me/books/manna">Manna</a> is an interesting thought experiment about what can happen if most human work is automated. I’ve been wanting to read Asimov for a while and finally scratched the itch with <a href="https://www.kuniga.me/books/i-robot">I, Robot</a>. It contains lots of interesting ideas about human-robot interactions.</p>

<p>I initially didn’t enjoy reading Borges’ <a href="https://www.kuniga.me/books/ficciones">Ficciones</a> because I couldn’t grok the heavily philosophical first story, but the subsequent ones were pretty interesting. Also, after reading the entries on Wikipedia I realized there were many other layers I had missed. I especially liked <em>The Library of Babel</em>.</p>

<p>Other works of fiction I read were <a href="https://www.kuniga.me/books/the-tempest">The Tempest</a> and <a href="https://www.kuniga.me/books/pachinko">Pachinko</a>. The Tempest was the first Shakespeare text I read and I found it underwhelming. Pachinko was okay.</p>

<p>I read my first book in Spanish, <a href="https://www.kuniga.me/books/romancero-gitano">Romancero Gitano</a>, a poetry book with very involved metaphors. It would have been hard to understand this book even in my mother tongue, but luckily it has lots of comments.</p>

<h4 id="history-and-travel">History and Travel</h4>

<p>I finished reading <a href="https://www.kuniga.me/books/the-scramble-for-africa">The Scramble for Africa</a>, which I started in 2024 after visiting a few countries in Southern Africa. It covers a short period of time but geographically it’s very broad: it touches on a large share of the countries in Africa.</p>

<p>For trip-related books, this year I read <a href="https://www.kuniga.me/books/perfume-dreams">Perfume Dreams</a> and <a href="https://www.kuniga.me/books/fire-in-the-lake">Fire in the Lake</a> for Vietnam and <a href="https://www.kuniga.me/books/singapore-a-very-short-history">Singapore: A Very Short History</a> for Singapore. While they’re not bad, I didn’t get the historical information I was after, except for the Singapore one. I decided that going forward I’ll just read Wikipedia.</p>

<p>I did just this for <a href="https://www.kuniga.me/docs/history/morocco">Morocco</a>. I read the main <a href="https://en.wikipedia.org/wiki/History_of_Morocco">Wikipedia page</a> and found it a lot more effective for learning about its history. To complement it, I read the memoir <a href="https://www.kuniga.me/books/for-bread-alone">For Bread Alone</a> by the Moroccan Mohamed Choukri in the hope of some cultural immersion, without success. I still haven’t found a good way to learn about a country’s culture through books. The closest I’ve gotten is through travel guides like Lonely Planet.</p>

<p>With similar hopes I read the book <a href="https://www.kuniga.me/books/a-ladys-life-in-the-rocky-mountains">A Lady’s Life in the Rocky Mountains</a> for a trip to Colorado and <a href="https://www.kuniga.me/books/portlandness">Portlandness</a> for Portland. The former wasn’t very insightful, but the latter did provide some interesting cultural background to Portland and I enjoyed the data viz from it.</p>

<h4 id="stem">STEM</h4>

<p>I read <a href="https://www.kuniga.me/books/the-body">The Body</a> which covers interesting bits about the human body. I also re-read <a href="https://www.kuniga.me/books/the-music-of-the-primes">The Music of the Primes</a> which I had read over a decade ago. I knew it was about the Riemann Hypothesis, and I wanted to revisit it after studying <a href="https://www.kuniga.me/blog/2025/10/25/riemann-zeta-function.html">The Riemann Zeta Function</a>.</p>

<p>I also read the biography of von Neumann, <a href="https://www.kuniga.me/books/the-man-from-the-future">The Man from the Future</a>. I like learning about the history of mathematics, so I wanted to see if I’d enjoy reading biographies of mathematicians. It was an interesting but not amazing read, so I’m not sure I want to make a habit of it.</p>

<h4 id="psychology">Psychology</h4>

<p>I enjoyed reading some books on self-improvement such as <a href="https://www.kuniga.me/books/the-almanack-of-naval-ravikant">The Almanack of Naval Ravikant</a>, <a href="https://www.kuniga.me/books/nonviolent-communication">Nonviolent Communication</a> and <a href="https://www.kuniga.me/books/atomic-habits">Atomic Habits</a> (re-read).</p>]]></content><author><name>Guilherme Kunigami</name></author><category term="blog" /><category term="retrospective" /><summary type="html"><![CDATA[This is a meta-post to review what happened in 2025. Every year I go over the posts I wrote, reflect on the blog as a whole and on the personal side, share things I've done (mostly trips and books read). The thumbnail was generated by "Your year with ChatGPT" and I found it reflects well what I share in this post.]]></summary></entry></feed>