Motivation
In this post, we plan to explore the basic language of category theory with an eye towards defining sheaves, mathematical constructs which formalize the transition between local and global data on a space. In future posts we will explore the theory and application of sheaves in more detail, but first we need to lay the groundwork for our later discussion.
There is quite a bit of background and vocabulary necessary to make sense of sheaves. We could argue that sheaves are mathematical entities with rich structure in themselves, and thus are of interest to abstract mathematicians. Still, generally if one puts in the effort to learn any mathematical concept technical enough to be called "theory," one wishes to get something from the work. It is disappointing to go to all the work to learn such a discipline only to find that it doesn't actually give you a way to approach new problems!
Sheaves give us powerful tools which make many generalizations possible. For example, homology and cohomology can both be expanded from their purely topological roots into the world of sheaves, and the more general framework allows the topological results to fall out as special cases (as one hopes from a generalization). Viewed over graphs, sheaves show the topological underpinnings of certain graph and network theoretic problems.
But what new avenues do these open? Sheaves give us better insight into the structure of spaces. For instance, Joel Friedman [2] showed that morphisms from graphs $G_i$ to a graph $G$ can be viewed as sheaves $S(G_i)$ over $G$, and that morphisms between $G_i$ and $G_j$ induce morphisms between their respective sheaves. However, there are sheaf morphisms between these induced sheaves which are not the result of graph morphisms. These extra morphisms capture aspects of the structure of $G$ which have not been well captured in graph theory alone. This refinement of structural detection allows sheaf theory to address questions about graphs which have been intractable to normal graph theoretic approaches.
Category Theory
When one studies various structures in mathematics, one often encounters similar patterns in different constructs. For instance, cycles in topological spaces behave in some ways like abelian groups. As another example, there are some properties of a structure drawn from the base set on which the structure is imposed. Category theory is a language which makes such correspondences more rigorous, as well as the tools to turn these correspondences into a mathematical structure of their own.
We will explore an introduction to category theory with a focus in topology, both due to the personal interest of the author and because topology is the historical origin of categories. We will focus particularly on the categories of presheaves and sheaves. These categories are of particular interest to topological data processing and analysis, as we will explore more deeply in a future post.
Categories
First, though, we must understand the language of category theory. What exactly is a category? (We mostly follow the construction found in Elements of Algebraic Topology by Munkres, though with fewer, but more detailed examples).
We define a category to consist of three things:
- A class of objects
- For each ordered pair of objects $(X,Y)$, a set $\mathrm{hom}(X,Y)$ of morphisms
- A composition function on the morphisms such that $(f,g)=g \circ f: \mathrm{hom}(X,Y) \times \mathrm{hom}(Y,Z) \rightarrow \mathrm{hom}(X,Z)$ for all objects $X, Y, Z$ where $g \circ f$ is associative and has identities — that is
- If $f \in \mathrm{hom}(W,X), h \in \mathrm{hom}(X,Y)$, and $g\in \mathrm{hom}(Y,Z)$, then $h \circ (g \circ f)=(h\circ g)\circ f$
- There exists $1_{X} \in \mathrm{hom}(X,X)$ such that $1_{X}\circ f=f$ and $g\circ 1_{X}=g$
 
There are several categories which we use in the study of algebraic topology. Most ubiquitous are the category of topological spaces with continuous maps and standard composition and the category of abelian groups under homomorphism (with standard composition, which we will take as implied henceforth unless a particular composition rule is stated). The category of chain complexes and chain maps is also a fairly common sight, though there are others — later, for instance, we will discuss the category of topological spaces with restriction maps as morphisms, as well as the category of finite semimodules with quotient maps.
Usually the objects in a category are the things we study in a particular branch of mathematics (topological spaces, groups, rings, manifolds, etc.) and the morphisms are maps between members of our chosen object which are sufficient to preserve some aspect or aspects of our object (for instance, if we only wish to discuss topological invariants, we could consider the category of topological spaces under homeomorphism. If we cared about properties preserved by continuous maps, we would instead equip our category with morphisms of continuous maps).
Thus far, category theory hasn't given us anything new. It has only provided a slightly different way to talk about maps between structural elements. The theory's utility arises from functors, a type of map between categories.
Functors
A functor is a function $G:C \rightarrow D$ where $C$ and $D$ are categories such that
- for each object $X$ of $C, G(C)$ is an object of $D$;
- for each morphism $F:X\rightarrow Y$ of $C, G(f):G(X)\rightarrow G(Y)$ is a morphism of $D$;
- $G(1_{X})=1_{G(X)}$ for all $X$; and
- $G(g\circ f)=G(g)\circ G(f)$ for all $g,f$.
There are several basic functors, with the identity functor and the forgetful functor perhaps foremost among them. The identity functor maps from a category to itself and, as with most identity maps, takes objects and morphisms back to themselves. The forgetful functor, a map that takes a structured space to its underlying set and its morphisms to their underlying set maps.
  However, there are also some functors which we use regularly in algebraic topology without realizing their functorial nature. Perhaps the most common one we use is homology. $H_n:Top \rightarrow Ab$ is a functor from the category of topological spaces to the category of abelian groups by assigning to each topological space (that is, the objects of $Top$) its $nth$ homology group (an object of the category of abelian groups). Then given a continuous map $f$, the following diagram commutes:
The morphism $ f_*$ between $H_n(X)$ and $H_n(Y)$ is known as the pushforward. Its construction is discussed in detail in [4], but the relevant results for our purposes are that (i) the identity map induces the identity homomorphism, and (ii) for $f:K\rightarrow L$ and $g:L\rightarrow M$, we have $(g\circ f)_*=g_*\circ f_*$.
Clearly the first requirement holds: every space has an $nth$ homology group. Likewise, we noted that each map induces a homomorphism, so if we say $H_n$ takes $f$ to $f_*$ the second requirement is filled. The third and fourth requirements follow from the results we noted about induced homomorphisms.
Whether explicitly or implicitly stated, the functoriality of the homology construction is an integral part of the proof that homology is a topological invariant.
Likewise, there is a functor from the category of simplicial complexes and simplicial maps to that of chain complexes and chain maps which assigns $K\rightarrow \mathscr{C} (K)$ and $f\rightarrow f_\#$. Again, every simplicial complex has a chain complex associated with it, and every simplicial map $f$ induces a chain map $f_\#$, so the first two conditions are upheld. The latter two arise from the verification that the identity simplicial map $\iota$ induces the identity chain map and that $(g\circ f)_\#=g_\#\circ f_\#$, both of which follow from the definition of $f_\#$.
Other functors appear in topological studies. Homotopy relies on the fundamental group, for instance, which also exhibits functorial properties. While the fact that these are functors is an important and useful point, these results are generally used independently of category theory. Our interest, however, will lie with a form of functor that is rarely discussed outside of the context of category theory: sheaves.
What is a Sheaf?
N.B.: The information we present here on sheaf theory is largely drawn from [1], with some clarifications from [4].A sheaf is a means to keep track of data over a space. Data sources are required to agree on comparable data when they overlap in space, and local data is required to be sufficient to recover global data.
Formally, a presheaf is a functor $F:\rm{Open}^{op}(X) \rightarrow \rm{Alg}$ from the category of open sets on the topological space $X$ with restriction morphisms to an algebraic category. A sheaf is a presheaf with additional structure:
- Given an open cover $\{U_i\}$ of an open set $U$, if $s,t \in F(U)$ with $s|_{U_i}=t|_{U_i}$ for each $i$, then $s=t$.
- Given an open cover $\{U_i\}$ of an open set $U$, with an $s_i\in F(U_i)$ for each $i$ such that $s_i |_{U_i\cap U_j}=s_j|_{U_i\cap U_j}$ for all $i,j$, then there is an element $s\in F(U)$ such that $s|_{U_i}=s_i$ for all $i$.
  The first condition (called the locality condition) requires that two pieces of global data that look the same locally are in fact the same globally. The second (called the gluing condition) requires that data which agrees can be glued together into a global structure. We think of $s$ and $t$ (called sections) as particular choices of data with the assigned algebraic structures as all possible data.
  
  (Note that locality demands that $F(\varnothing)=0$, since $\cup_{i\in \varnothing} U_i$ is an open cover of $\varnothing$, and any two distinct sections agree on all elements of the cover yet are not equal. This requirement will generally be implied.)
Example 1
Let's look at a specific example of a sheaf built on a space $X$ composed of two disjoint open sets $A$ and $B$.We want to build a sheaf that works as much like a constant function as we can. Let's define a functor $F:Top(X)\rightarrow Grp$ from the category of open sets of $X$ with restriction maps to the category of groups with group homomorphisms by $F(U)=G$ for each non-empty open set $U$ for a fixed group $G$ and $F(g)=\iota$ for all restriction maps except those to $\varnothing$, which all map to $0$. Is this a sheaf?
That this is a presheaf (that is, that $F$ is a functor) is easy to show. For locality, suppose there is an open set $U$ with open cover $\{U_i\}$. Given two sections $s,t \in G$ with $s\neq t$, since restriction maps induce the identity homomorphism $s=s|_{U_i}$ and $t=t|_{U_i}$. Thus $s|_{U_i}\neq t|_{U_i}$ for some $i$ (indeed for all $i$), so by the contrapositive if $s,t \in F(U)$ with $s|_{U_i}=t|_{U_i}$ for each $i$, then $s=t$, so locality holds.
Finally, the gluing condition. Take $G=\{0,1\}$ and the open cover $A, B$. Choose the local sections $s_1=F(A)=0$ and $s_2=F(B)=1$. Since they don't intersect, $s_1|_{A\cap B}=0=s_2|_{A\cap B}$, yet there is no value in $s\in F(X)=G$ with $s_{A}=s_1$ and $s_{B}=s_2$, so there is no global section which works.
  The closest we can get to what we would think of intuitively as a constant sheaf on this space is to assign $F(U)=G\times G$ if $U$ intersects both $A$ and $B$, $F(U)=G\times \varnothing$ if $U$ only intersects $A$, and $F(U)=\varnothing \times G$ if $U$ only intersects $B$, rendering the following diagram:
Here, $V_1$ is the set of open subsets of $X$ which intersect $A$ and $B$, $V_2$ those that intersect only $A$, and $V_3$ those that intersect only $B$. Containments within one of these classes induces the identity map, and all containments of the null set induce the zero map, though we don't include these details in the diagram.
  Note that again this is a presheaf and locality holds, but now the gluing condition is fulfilled, and hence it is a sheaf. We call this the constant sheaf on this space.
(See [1] section 3.1 for more information on when constant sheaves are possible and when locally constant sheaves are required)
Example 2
In the previous example, we explored how locally consistent data can fail to produce a global section, where the data that we allow for the entire space is too restrictive to capture the variety of local data. Now we will explore the opposite problem: what happens when the global data isn't restrictive enough?
  Consider the discrete two point space $\{a, b\}$ where we assign $\mathbb{R}^3$ to $\{a,b\}$ and $\mathbb{R}$ to both $\{a\}$ and $\{b\}$, with the restriction map from $\{a,b\}$ to $\{a\}$ given by the projection $\pi_1$ onto its first coordinate and the restriction from $\{a,b\}$ to $\{b\}$ given by the projection $\pi_2$ onto its second coordinate, as shown in the diagram below.
  
Clearly any compatible local sections one chooses can be glued together as a global section. Indeed, given $s_1$ the local section over $\{a\}$ and $s_2$ over $\{b\}$, choosing any $(s_1, s_2, s_3)$ will be a global section consistent with both local sections. However, that very ease of choice makes this construction fail to be a sheaf. Two different global sections (any two distinct $s_3$ and $s_3'$ will do) agree on all local sections of the space, a violation on the condition of locality. In order to create a sheaf from this structure, we must either reduce the vector space attached to $\{a,b\}$ from $\mathbb{R}^3$ to $\mathbb{R}^2$ or we must introduce a third point in the space onto which we project the third coordinate from the full space, as illustrated below.
(See [1] section 2.1 for the development of where the original scenario might arise)
Conclusion
Thus far we have discussed the basic terms required to understand sheaf theory. However, the intuition behind the construction still isn't clear. We claimed at the beginning that sheaves were a method of, among other things, keeping track of data. One might ask how the construction we've created corresponds with data in the normal sense.
In the next post, we will show how we might understand the way sheaves track data in the context of pictures. Though sheaves are not usually applied in the case of image reconstruction (since it is easy enough to keep track of the involved information without going to the trouble of constructing a sheaf!), it will still give us an intuitive idea of what, exactly, a sheaf does, and what restriction morphisms, the gluing condition, and locality actually mean in a more concrete sense.
Works Cited
[1] Curry, Justin M., "Sheaves, Cosheaves and Applications," University of Pennsylvania, 2014. arXiv[2] Friedman, Joel, "Sheaves on Graphs, Their Homological Invariants, and a Proof of the Hanna Neumann Conjecture" University of British Columbia, 2011. arXiv
[3] Munkres, James R. "Elements of Algebraic Topology," Perseus Publishing, 1984
[4] R. Ghrist, Elementary Applied Topology, ed. 1.0, Createspace, 2014

