kuniga.me > NP-Incompleteness > The Basel Problem
14 Mar 2023
Pietro Mengoli was an Italian mathematician and clergyman from Bologna who taught at the University of Bologna [1]. He worked on infinite series and limits of geometric figures.
In 1650 he posed the following problem: What does the following summation converge to? $\sum_{n=1}^{\infty} \frac{1}{n^2}$
It took almost 100 years for the problem to be solved: Euler provided a solution in 1735 whose result was correct, but his proof lacked some formalism.
In this post we’ll discuss Euler’s proof of this result and see which parts it was missing. It’s my first post in celebration of $\pi$-day.
The general idea of Euler’s proof is to show that $\sin(x)$ equals its Taylor series at 0, that is:
\[\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} \cdots\]And that it’s an entire function (we’ll define it below) and hence we can apply the Weierstrass Factorization theorem to show that:
\[\frac{\sin(x)}{x} = \left( 1 - \left(\frac{x}{\pi}\right)^2 \right)\left( 1 - \left(\frac{x}{2\pi}\right)^2 \right)\left( 1 - \left(\frac{x}{3\pi}\right)^2 \right) \cdots\]Then it shows that when we expand this product we obtain a power series in which the coefficient of $x^2$ is:
\[-\left(\frac{1}{\pi^2} + \frac{1}{4\pi^2} + \frac{1}{9\pi^2} + \cdots \right) = - \frac{1}{\pi^2} \sum_{n = 1}^{\infty} \frac{1}{n^2}\]Since this power series and the Taylor series of $\frac{\sin(x)}{x} = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \cdots$ represent the same function, they must have the same coefficients for every power of $x$. In particular, for $x^2$:
\[\frac{1}{3!} = \frac{1}{\pi^2} \sum_{n = 1}^{\infty} \frac{1}{n^2}\]Thus we obtain:
\[\sum_{n = 1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}\]Euler’s proof used the result of Weierstrass Factorization theorem without proving it. It would be another 100 years until Karl Weierstrass did it [2].
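Before going over the details, we can sanity-check the result numerically. A minimal Python sketch (the cutoff `N` and the helper name are mine):

```python
import math

# Partial sum of the Basel series up to an arbitrary cutoff N
def basel_partial_sum(N):
    return sum(1 / n**2 for n in range(1, N + 1))

approx = basel_partial_sum(100_000)
target = math.pi**2 / 6
# The partial sums approach pi^2/6; the error decays roughly like 1/N
print(approx, target)
```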
We now go over each of these steps in more detail.
Taylor Series can be used to approximate a function $f$ at a point $a$ via the series:
\[f(x) \sim \sum_{n = 0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n\]An intuition behind this is provided in 3Blue1Brown’s YouTube video [3]. For some functions $f$ the Taylor series converges to $f$ at all points.
Let’s compute the Taylor series for $\sin x$ at the point $a = 0$. We know that if $f(a) = \sin(a)$, then $f^{(1)}(a) = \cos(a)$, $f^{(2)}(a) = -\sin(a)$, $f^{(3)}(a) = -\cos(a)$, and the pattern cycles from there.
\[f(x) \sim \sin(a) + \frac{\cos(a)}{1!} (x - a) - \frac{\sin(a)}{2!} (x - a)^2 - \frac{\cos(a)}{3!} (x - a)^3 \\ + \frac{\sin(a)}{4!} (x - a)^4 + \frac{\cos(a)}{5!} (x - a)^5 \cdots\]Evaluating at $a = 0$ we have:
\[f(x) \sim x - \frac{x^3}{3!} + \frac{x^5}{5!} \cdots\]It’s not always true that the Taylor series at any point converges to $f$ at all points, but it’s possible to show that’s the case for $\sin(x)$!
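We can also check this convergence numerically; a quick sketch (the number of terms and the helper name are mine):

```python
import math

# Partial Taylor series of sin at a = 0: x - x^3/3! + x^5/5! - ...
def taylor_sin(x, terms=10):
    return sum((-1)**k * x**(2*k + 1) / math.factorial(2*k + 1)
               for k in range(terms))

# Even a few terms match math.sin closely for moderate x
for x in (0.5, 1.0, 2.0):
    print(x, taylor_sin(x), math.sin(x))
```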
Let $f$ be a function with complex domain and image. Let’s simplify and assume it’s a single-variable function, so $f: \mathbb{C} \rightarrow \mathbb{C}$. We can define the complex derivative at a (complex) point $z_0$ the same way we do for real-valued functions:
\[(1) \quad f'(z_0) = \lim_{z \rightarrow z_0} \frac{f(z) - f(z_0)}{z - z_0}\]Except that the values involved are complex numbers. A function $f$ is complex differentiable at $z_0$ if the limit (1) exists.
Let $U$ be an open set. We say $f$ is holomorphic on $U$ if it’s complex differentiable at every point in $U$. If $U$ is the whole complex plane, i.e. $U = \mathbb{C}$ in our case, then $f$ is called an entire function.
It’s possible to show that a power series:
\[\sum_{n = 0}^{\infty} a_n z^n\]that satisfies:
\[(2) \quad \lim_{n \rightarrow \infty} \abs{a_n}^{\frac{1}{n}} = 0\]is an entire function. The coefficients of $\sin(x)$’s Taylor series are either $0$ or $\pm\frac{1}{n!}$. Lemma 3 (see Appendix) shows that $\lim_{n \rightarrow \infty} \abs{\frac{1}{n!}}^{\frac{1}{n}} = 0$, so (2) holds, which proves $\sin(x)$ is an entire function.
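Condition (2) can also be probed numerically for the $\frac{1}{n!}$ coefficients. A sketch using logarithms to avoid overflow (the sample values of $n$ are arbitrary):

```python
import math

# (1/n!)^(1/n) = exp(-ln(n!)/n); math.lgamma(n + 1) equals ln(n!)
def coeff_root(n):
    return math.exp(-math.lgamma(n + 1) / n)

# The values shrink toward 0 as n grows
for n in (10, 100, 1000, 10000):
    print(n, coeff_root(n))
```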
Entire functions can always be expressed by their Taylor series expansion, which are power series, which we can think of as polynomials of infinite degree. This suggests we can define properties for entire functions analogous to polynomials, such as how fast they grow.
For polynomials, growth depends largely on the degree, so for example $a + bx + cx^2$ will be dominated by the $x^2$ term. This is analogous to the big-O notation we use in computer science: if we determine our algorithm executes $a + bx + cx^2$ operations, we’d say it has $O(x^2)$ complexity.
We can define a growth scale for entire functions as well. The problem with entire functions is that their domain is the complex numbers, and complex numbers cannot be compared the way reals can (e.g. which one is bigger: $i$ or $1$?), so we need to map them to reals.
The natural way to do this is to work with their norm which is real-valued. So instead of working with a specific complex number $z$, we work with some real $r$ and consider all complex numbers whose norm is $r$, i.e. the set $C_r = \curly{c \in \mathbb{C}, \abs{c} = r}$.
So if we want an upper bound on how fast $f$ grows for a given $r \in \mathbb{R}$, we can take the largest value of $\abs{f}$ for any $z \in C_r$, that is $\max_{\abs{z} = r} \abs{f(z)}$. This is called the maximum modulus function and denoted $M_f(r)$. Note that it is a real-valued function $M_f : \mathbb{R} \rightarrow \mathbb{R}$.
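We can approximate $M_f(r)$ numerically by sampling points on the circle of radius $r$; a sketch (the sample count and helper name are mine):

```python
import cmath

# Approximate M_f(r) = max |f(z)| over |z| = r by sampling the circle
def max_modulus(f, r, samples=1000):
    return max(abs(f(r * cmath.exp(2j * cmath.pi * k / samples)))
               for k in range(samples))

# For sin, the maximum on |z| = r is attained at z = +-ir, where
# |sin(ir)| = sinh(r), so M_sin(r) grows exponentially in r
for r in (1, 2, 3):
    print(r, max_modulus(cmath.sin, r))
```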
We can define a quantity measuring the exponential growth rate of $f(z)$, denoted by $\lambda$, called the order and defined as:
\[(3) \quad \lambda = \limsup_{r \rightarrow \infty} \frac{\ln \ln M_f(r)}{\ln r}\]Where $\limsup_{r \rightarrow \infty}$ is called the limit superior and defined as:
\[\limsup_{r \rightarrow \infty} x_r = \lim_{r \rightarrow \infty} \left( \sup_{m \ge r} x_m \right)\]We may ask: why not just $\sup \curly {x_r, r \in \mathbb{N}}$? It may be that the first terms are big but the terms get smaller as $r \rightarrow \infty$, so the $\sup$ would be dominated by the initial terms, whereas we want the behavior of the terms as $r \rightarrow \infty$.
We may then ask: why not just $\lim_{r \rightarrow \infty} x_r$? The sequence might not converge to a single value (e.g. it could oscillate). By taking the supremum of the tails we obtain a single number for the upper bound.
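To make the distinction concrete, here is a small sketch (the sequence and helper name are mine) using $x_r = (-1)^r + 1/r$: it has no limit, and its plain $\sup$ is dominated by the early term $x_2 = 1.5$, but its tail suprema approach the limit superior, $1$:

```python
# x_r = (-1)^r + 1/r oscillates and has no limit, but its tail
# suprema sup_{m >= r} x_m converge to 1 (the limit superior)
xs = {r: (-1)**r + 1/r for r in range(1, 2001)}

def tail_sup(start):
    return max(v for r, v in xs.items() if r >= start)

# sup over all terms is dominated by the early term x_2 = 1.5,
# while the tail suprema approach 1
print(tail_sup(1), tail_sup(100), tail_sup(1000))
```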
With these definitions in place, Lemma 1 is the result we’ll use later:
Lemma 1. The order of $\sin(z)$ is 1.
Going back to the analogy with polynomials, we can write a polynomial as a product of terms involving its zeros (also called roots). For example, the zeros of $x^2 - 5x + 4$ are $1$ and $4$, so it can be written as $(x - 1)(x - 4)$. That every polynomial factors this way over the complex numbers follows from the fundamental theorem of algebra.
Sometimes zeros have the same value. For example, $x^4 - 5x^3 + 4x^2$ has zeros $1$, $4$ and $0$ (twice), so it can be written as $(x - 0)(x - 0)(x - 1)(x - 4)$ or $x^2 (x - 1)(x - 4)$. In this case we say the zero $0$ has multiplicity 2.
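This factoring can be checked programmatically by multiplying the linear terms back out; a small sketch (coefficient lists in ascending order of degree; the helper name is mine):

```python
# Build polynomial coefficients (ascending: [c0, c1, ...]) from its zeros
def poly_from_roots(roots):
    coeffs = [1]  # the constant polynomial 1
    for root in roots:
        # multiply the current polynomial by (x - root)
        new = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i + 1] += c        # the x * c part
            new[i] += -root * c    # the -root * c part
        coeffs = new
    return coeffs

# zeros 0 (twice), 1 and 4 give x^4 - 5x^3 + 4x^2
print(poly_from_roots([0, 0, 1, 4]))
```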
The problem with this formula arises when the function has an infinite number of zeros, which is unfortunately the case for transcendental functions such as $\sin(z)$.
The Weierstrass Factorization Theorem generalizes the fundamental theorem of algebra by showing an entire function can be written as:
\[f(z) = z^m e^{g(z)} \prod_{k = 1}^{\infty} E_{n_k}(z/a_k)\]for some choice of $n_k$. Here the $a_k$ are the non-zero zeros of $f(z)$, $m$ is the multiplicity of the zero at the origin, and $E_n$ is defined as:
\[(4) \quad E_n(w) = (1 - w)\exp \left(\sum_{k=1}^{n} \frac{w^k}{k}\right)\]with the convention that for $n = 0$ the sum is empty, so $E_0(w) = 1 - w$. The Weierstrass theorem is not constructive, however: it doesn’t show how to find $g(z)$ nor the $n_k$, it just states that they exist. A special case: if we can prove that the sum:
\[\sum_{n = 1}^\infty \frac{1}{\abs{a_n}^{h + 1}}\]converges for some $h$, then we can replace every $n_k$ with $h$. So if we find the smallest $h$ for which it holds, we can simplify $(4)$. We still have no bound on the degree of $g(z)$, though.
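For $\sin(z)$ the zeros are at $a_n = \pm n\pi$, and we can probe numerically which $h$ makes the sum converge; a sketch (the cutoffs and helper name are mine):

```python
import math

# Partial sums of sum 1/|a_n|^(h+1) for the zeros a_n = n*pi of sin
def partial(h, N):
    return sum(1 / (n * math.pi)**(h + 1) for n in range(1, N + 1))

# h = 0 gives the harmonic series scaled by 1/pi (diverges);
# h = 1 gives partial sums that stabilize (converges)
for N in (10**3, 10**4, 10**5):
    print(N, partial(0, N), partial(1, N))
```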
Fortunately, there’s a more specific version of this theorem known as the Hadamard factorization theorem. It states that if $f(z)$ has order $\lambda$, there exists $h$, called the genus of $f(z)$, with $h \le \lambda \le h + 1$, such that we can replace $n_k$ in Weierstrass’ formula by $h$, and that $g(z)$ is a polynomial of degree at most $h$:
\[f(z) = z^{m} e^{g(z)} \prod_{n = 1}^{\infty} E_{h} \left( \frac{z}{a_n} \right)\]We can use the Hadamard factorization theorem to prove:
Theorem 2.
\[(5) \quad \frac{\sin(z)}{z} = \left( 1 - \left(\frac{z}{\pi}\right)^2 \right)\left( 1 - \left(\frac{z}{2\pi}\right)^2 \right)\left( 1 - \left(\frac{z}{3\pi}\right)^2 \right) \cdots\]To obtain the coefficient of $x^2$ we observe that our product is of the form $(1 + b_1 x^2)(1 + b_2 x^2) \cdots$ for constants $b_i$. If this were a finite product with $n$ terms, we could obtain a polynomial by multiplying these terms together. For example, for $n = 3$, $(1 + b_1 x^2)(1 + b_2 x^2)(1 + b_3 x^2)$, we’d have 8 terms:
\[1 + b_3 x^2 + b_2 x^2 + b_2 b_3 x^4 + b_1 x^2 + b_1 b_3 x^4 + b_1 b_2 x^4 + b_1 b_2 b_3 x^6\]Grouping by factors:
\[1 + (b_3 + b_2 + b_1) x^2 + (b_2 b_3 + b_1 b_3 + b_1 b_2) x^4 + b_1 b_2 b_3 x^6\]We can observe that the terms with $x^2$ are those with exactly one coefficient $b_i$, so in general the resulting coefficient of $x^2$ for a product in this form is the sum of all the $b$’s. In (5), $b_i = -(\frac{1}{i\pi})^2$, so the coefficient of $x^2$ is:
\[-\left(\frac{1}{\pi^2} + \frac{1}{4\pi^2} + \frac{1}{9\pi^2} + \cdots \right) = - \frac{1}{\pi^2} \sum_{n = 1}^{\infty} \frac{1}{n^2}\]We can now refer to the high-level steps of Euler’s proof to conclude that:
\[\sum_{n = 1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}\]It’s mind boggling how a problem so simple to state has such a difficult solution, one which not even Euler could prove to a satisfactory degree of formality. This reminds me of other problems like Fermat’s Last Theorem.
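As a final numerical sanity check of the factorization (5), we can truncate the infinite product and compare it to $\sin(x)/x$; a sketch (the cutoff `N` and helper name are mine):

```python
import math

# Truncated product (1 - (x/pi)^2)(1 - (x/(2 pi))^2) ... up to N factors
def sin_product(x, N):
    prod = 1.0
    for n in range(1, N + 1):
        prod *= 1 - (x / (n * math.pi))**2
    return prod

# Compare against sin(x)/x; the truncation error decays like 1/N
for x in (0.5, 1.0, 2.0):
    print(x, sin_product(x, 100_000), math.sin(x) / x)
```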
In this post we understood the difficulty of the problem and got some intuition on entire functions and the Weierstrass factorization theorem, but didn’t prove them either. Kruse’s thesis [6] dives into the proof but also omits the proofs of some lemmas.
Sheydvasser’s answer on Quora [5] was the most helpful to understand the application of Weierstrass factorization to $\sin(x)$.
I just finished the excellent book Real Analysis: A Long-Form Mathematics Textbook by Jay Cummings, and after writing this post I’m very interested in learning about complex analysis.
Complex power series are central to signal processing, so not surprisingly many posts on that topic share material with this one.
In Z-Transform we touched on the convergence of complex power series, in particular that if $\sum_{n = 0}^{\infty} a_n$ converges, then:
\[\lim_{n \rightarrow \infty} \abs{a_n} = 0\]Compare that with property (2), which power series representing entire functions satisfy:
\[\lim_{n \rightarrow \infty} \abs{a_n}^{\frac{1}{n}} = 0\]In Cepstrum we also used the Taylor series expansion for $\log(1 - x)$.
Weierstrass factorization theorem. That post has a reference to this one.
Hadamard Factorization Theorem. That post has a reference to this one.
Lemma 3.
\[\lim_{n \rightarrow \infty} \abs{\frac{1}{n!}}^{\frac{1}{n}} = 0\]Lemma 4.
\[M_{\sin}(r) \sim \frac{e^r}{2} \quad \text{as } r \rightarrow \infty\]Lemma 5. Let
\[f(z) = z e^{a} \prod_{n = 1}^{\infty} \left(1 - \left( \frac{z}{n \pi}\right) ^2\right)\]Then $f'(0) = e^{a}$.