kuniga.me > NP-Incompleteness > Functionals
26 Apr 2026
I started reading the book The Theoretical Minimum by Leonard Susskind and George Hrabovsky. A lot of the math from the early chapters looked familiar, but in Chapter 6: The Principle of Least Action, they describe and derive the Euler-Lagrange equation, which I don’t recall seeing before.
I wanted to explore this equation and its derivation, but from a more mathematical point of view. This led me down a short rabbit hole around functionals and Sobolev spaces, and since I like learning things from first principles, I decided to cover functionals first.
The thumbnail features the Hungarian mathematician Frigyes Riesz. He was first featured in the post Subharmonic Functions and is back here because he’s considered one of the founders of functional analysis and we’ll cover one theorem named after him, the Riesz Representation Theorem.
In the branch of math called functional analysis, the core object is the functional. Without qualifiers, a functional is implicitly assumed to be linear. In this post however, since we want to cover functionals in general, we'll say linear functional explicitly and assume the general case when saying functional.
A functional is essentially a function $f$ whose domain is a vector space $H$ and whose codomain is a field, either the reals or the complex numbers, generically denoted by $\mathbb{F}$:
\[f : H \rightarrow \mathbb{F}\]More intuitively, when $H$ is a space of functions, a functional is a function that takes a function as input and returns a scalar, much like functional programming, which operates on functions as objects.
One may ask: what do functions have to do with vector spaces? In a vector space, a vector is a broader concept than, say, a tuple of scalars as in $\mathbb{R}^3$: it's any object satisfying a set of axioms (e.g. addition, scalar multiplication). An example of a function vector space is the set of continuous functions.
If a functional satisfies additivity and homogeneity (compatibility with scalar multiplication), then it's called a linear functional. In other words, if $x, y$ are members of a vector space $V$, $\lambda \in \mathbb{F}$, and $f: V \rightarrow \mathbb{F}$ is a linear map:
\[f(x + y) = f(x) + f(y) \\ f(\lambda x) = \lambda f(x)\]We’ll now cover some related concepts and properties of functionals.
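As a quick sanity check of these two properties, here's a small numerical sketch (my own toy example, not from the book): the functional $f(g) = \int_0^1 g(t) \, dt$ is linear, and we can verify additivity and homogeneity with a simple Riemann sum:

```python
import numpy as np

def integral_functional(g, n=1000):
    """The linear functional f(g) = ∫₀¹ g(t) dt, approximated by a Riemann sum."""
    t = np.linspace(0.0, 1.0, n, endpoint=False)
    return np.sum(g(t)) / n

g1, g2, lam = np.sin, np.cos, 3.5

# Additivity: f(g1 + g2) = f(g1) + f(g2)
assert abs(integral_functional(lambda t: g1(t) + g2(t))
           - (integral_functional(g1) + integral_functional(g2))) < 1e-9

# Homogeneity: f(λ·g1) = λ·f(g1)
assert abs(integral_functional(lambda t: lam * g1(t))
           - lam * integral_functional(g1)) < 1e-9
```

Of course, a finite check like this doesn't prove linearity; for integration it follows from the linearity of the integral itself.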
The norm or operator norm of a functional $f$, denoted by $\norm{f}$, is defined as:
\[\norm{f} = \sup_{x \ne 0} \frac{\abs{f(x)}}{\norm{x}}\]That is, it's the supremum of $\abs{f(x)}$ normalized by the size of the input. Note that it doesn't make sense to talk about $\abs{f}$ by itself, even though the image of $f$ consists of scalars, since $f$ only produces a scalar once it's given a specific input.
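To make the operator norm concrete, here's a small sketch (my own example) with the linear functional $f(x) = \langle x, a \rangle$ on $\mathbb{R}^2$, where by Cauchy-Schwarz the supremum equals $\norm{a}$ and is attained at $x = a$:

```python
import numpy as np

a = np.array([3.0, -4.0])
f = lambda x: x @ a                  # linear functional on R^2

# Estimate sup |f(x)| / ||x|| by sampling many nonzero inputs.
rng = np.random.default_rng(0)
xs = rng.normal(size=(100_000, 2))
ratios = np.abs(xs @ a) / np.linalg.norm(xs, axis=1)

# By Cauchy-Schwarz the supremum is ||a|| = 5, so the sampled
# ratios approach 5 from below.
print(ratios.max(), np.linalg.norm(a))
```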
For continuity to make sense for functionals, the domain must be a topological space, i.e. it must have the notion of open sets, because continuity depends on these.
In a more specific case, if we assume the domain is a normed vector space, i.e. it has a norm, which induces a distance between its elements, then we can use the $\epsilon-\delta$ definition of continuity: a functional is continuous at $x_0$ if for every $\epsilon \gt 0$, there exists $\delta \gt 0$ such that:
\[\norm{x - x_0} \lt \delta \implies \abs{f(x) - f(x_0)} \lt \epsilon\]For a linear functional $f$ in particular, there's an alternative characterization: if there's an upper bound on how much $f$ can magnify its input, then it's continuous:
Lemma 1. The linear functional $f: H \rightarrow \mathbb{F}$ is continuous if and only if
\[\abs{f(x)} \le C \norm{x}\]for all $x \in H$ and some constant $C$.
Let $X, Y$ be vector spaces equipped with a norm and $f : X \rightarrow Y$. We say that $f$ is Fréchet differentiable if there exists a linear map $A: X \rightarrow Y$ (Fréchet derivative) such that:
\[(1) \quad \lim_{\norm{h} \rightarrow 0} \frac{\norm{f(x + h) - f(x) - A(h)}}{\norm{h}} = 0\]for $h \in X$. Note that $f$ is not necessarily a functional; it's one only if $Y = \mathbb{R}$ or $Y = \mathbb{C}$, and even then it's not necessarily linear. However, if $f$ is a functional, then its Fréchet derivative is itself a linear functional, being a linear map from a vector space to a field.
This is a general definition of the differential we see in real analysis. If we take $X = Y = \mathbb{R}$, then we can simplify $(1)$ to:
\[(2) \quad \lim_{h \rightarrow 0} \frac{f(x + h) - f(x) - f'(x)h}{h} = 0\]by choosing $A(h) = f'(x)h$. If we add $f'(x)h/h$ to both sides of the equation, we get the more familiar:
\[(3) \quad f'(x) = \lim_{h \rightarrow 0} \frac{f(x + h) - f(x)}{h}\]One might ask why the general definition uses the form $(1)$ instead of $(3)$. The expression $f(x + h) - f(x)$ is an element of $Y$ and $h \in X$, so for $(3)$ to make sense we'd need to define a division of elements of $Y$ by elements of $X$.
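To see definition $(1)$ in action, here's a numerical sketch (my own example) with $f(x) = \norm{x}^2$ on $\mathbb{R}^2$, whose Fréchet derivative at $x$ is the linear functional $A(h) = 2\langle x, h \rangle$. A quick expansion shows $f(x + h) - f(x) - A(h) = \norm{h}^2$, so the ratio in $(1)$ equals $\norm{h}$ and vanishes in the limit:

```python
import numpy as np

# f: R^2 -> R, f(x) = ||x||^2. Its Fréchet derivative at x is the
# linear functional A(h) = 2<x, h>, the candidate we plug into (1).
f = lambda x: x @ x
A = lambda x, h: 2 * (x @ h)

x = np.array([1.0, 2.0])
direction = np.array([0.6, -0.8])    # unit vector, so ||h|| = eps below
for eps in [1e-1, 1e-2, 1e-3, 1e-4]:
    h = eps * direction
    ratio = abs(f(x + h) - f(x) - A(x, h)) / np.linalg.norm(h)
    print(f"||h|| = {eps:.0e}  ratio = {ratio:.1e}")  # ratio equals ||h|| here
```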
The Fréchet derivative also generalizes the complex derivative, the one that defines holomorphic functions and underpins complex analysis. We can again use the form $(2)$, with the implicit assumption that the multiplication in $f'(x)h$ is the complex one.
Note how the Fréchet derivative goes one abstraction layer above by “wrapping” $f'(x)h$ as some function $A(h)$. This is similar to how topological spaces abstract normed spaces by working with open sets instead of norms (open sets are a more abstract notion than norms, since they can be defined from a norm but not the other way around).
Now we focus on properties that are only applicable if the functional is linear.
Let $X, Y$ be vector spaces and $f$ a linear map between them. The kernel is the subspace of $X$ defined as:
\[\ker f = \curly{x \in X : f(x) = 0}\]In other words, all elements in the domain that map to $0$ in the image. Note that $0$ here is not necessarily the scalar number $0$, but the $0$ element in the vector space $Y$.
We can verify that the kernel is indeed a subspace of $X$. It contains the zero vector because $f$ is linear and thus $f(0) = 0$. If $x, y \in \ker f$, then $x + y \in \ker f$, again because $f$ is linear and $f(x + y) = f(x) + f(y) = 0 + 0 = 0$. If $x \in \ker f$ and $\lambda$ is a scalar, then $\lambda x \in \ker f$ because $f(\lambda x) = \lambda f(x) = 0$.
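We can mirror this verification numerically (a toy example of mine) with a linear functional on $\mathbb{R}^3$, whose kernel is a plane through the origin:

```python
import numpy as np

a = np.array([1.0, -2.0, 3.0])
f = lambda x: x @ a                  # linear functional on R^3; ker f is a plane

# Two vectors orthogonal to a, hence in ker f:
x = np.array([2.0, 1.0, 0.0])
y = np.array([3.0, 0.0, -1.0])
assert f(x) == 0 and f(y) == 0

# The subspace properties: ker f contains 0 and is closed under
# addition and scalar multiplication.
assert f(np.zeros(3)) == 0
assert f(x + y) == 0
assert f(5.0 * x) == 0
```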
If $f$ is continuous, then $\ker f$ is closed. We can show this by using one of the topological definitions of continuity: A function $f: X \rightarrow Y$ is continuous if and only if for every $U$ that is a closed set in $Y$, $f^{-1}(U)$ is a closed set in $X$. Since $\curly{0}$ is a closed set in $Y$ and $\ker f$ is the pre-image of $\curly{0}$, $\ker f$ is closed.
So we know the kernel is a subspace of $X$, but it doesn't necessarily inherit all the properties of the vector space $X$. If the domain is a Hilbert space and $f$ is continuous, however, then $\ker f$ is a closed subspace and it's possible to show it is also a Hilbert space.
Intuitively, things that have linear properties can form a vector space, because many of the vector space axioms are about linear combinations of vectors. Since linear functionals have linear properties, they can also form a vector space! This vector space is called the dual of the domain vector space of the functionals.
More specifically, the dual space of a vector space $V$ is the set of all linear functionals that have $V$ as domain, denoted with a superscript asterisk:
\[V^{*} = \curly{f : V \rightarrow \mathbb{F}}\]where $f$ is a linear functional. The intuition here is that $f$ associates a measure to the vectors of $V$. For Hilbert spaces in particular, we have a nice identification between a vector space and its dual. Before we show that, we need the following result:
Theorem 2. (Riesz Representation Theorem) Let $H$ be a Hilbert space and $f: H \rightarrow \mathbb{F}$ a continuous linear functional. Then, there exists a unique vector $y \in H$ such that:
\[f(x) = \langle x, y\rangle \quad \forall x \in H\]and with $\norm{f} = \norm{y}$.
What this theorem is saying is that for any continuous linear functional $f$ over a Hilbert domain, there's exactly one element in that domain that “encodes” $f$ as an inner product with that element.
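Here's a finite-dimensional sketch (my own toy example) in $\mathbb{R}^4$, where every linear functional is continuous and the Riesz vector $y$ can be read off from $f$'s values on the standard basis:

```python
import numpy as np

n = 4
rng = np.random.default_rng(1)
w = rng.normal(size=n)

# A continuous linear functional on R^4 (here secretly built from w).
f = lambda x: x @ w

# Riesz: the representing vector y is recovered from f's action on the
# standard basis, y_i = f(e_i).
y = np.array([f(e) for e in np.eye(n)])

x = rng.normal(size=n)
assert np.isclose(f(x), x @ y)       # f(x) = <x, y>
assert np.allclose(y, w)             # y is exactly the hidden vector
```

The norm claim $\norm{f} = \norm{y}$ also holds here: by Cauchy-Schwarz, $\abs{\langle x, y \rangle} \le \norm{x} \norm{y}$ with equality at $x = y$.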
We can now claim that a Hilbert space is isomorphic to its dual, $H \cong H^*$, that is, there exists a bijection between these two sets. The map is defined by:
\[T(y) = f_y, \quad f_y(x) = \langle x, y \rangle\]which is a bijection since for each $f \in H^*$ there's a unique $y$ for which $f = f_y$, and conversely each $y$ defines a unique functional $f_y$. Further, since $\norm{f} = \norm{y}$, this bijection is “length-preserving”, which makes $T$ an isometric isomorphism.
This is a special identity for Hilbert spaces. It does not hold for example for the slightly more general Banach space.
I thought this would be my first post on functional analysis, but I had forgotten I wrote about Hilbert spaces before.
This was another topic for which I relied heavily on ChatGPT, and I really liked the interactive process. It's very gratifying to start with a blurry view of a topic and gradually build a cohesive and more intuitive picture.
One of the most amusing moments was when I asked ChatGPT what happens if we go recursive and build a vector space of functionals, and it pointed out that this is basically the dual space, which I had already studied. It then “clicked”.
It was also nice to connect with other parts I have studied in the past such as analysis and topology.