Automata in the Category of Glued Vector Spaces

Preparing to load PDF file. please wait...

0 of 0
Automata in the Category of Glued Vector Spaces

Transcript Of Automata in the Category of Glued Vector Spaces

Automata in the Category of Glued Vector Spaces∗†

Thomas Colcombet

Daniela Petri¸san

[email protected] [email protected]

CNRS, IRIF, Univ. Paris-Diderot, Paris 7, France

November 17, 2017

arXiv:1711.06065v1 [cs.FL] 16 Nov 2017

In this paper we adopt a category-theoretic approach to the conception of automata classes enjoying minimization by design. The main instantiation of our construction is a new class of automata that are hybrid between deterministic automata and automata weighted over a field.
1 Introduction
In this paper we introduce a new automata model, hybrid set-vector automata, designed to accept weighted languages over a field in a more efficient way than Schtzenberger’s weighted automata [13]. The space of states for these automata is not a vector space, but rather a union of vector spaces “glued” together along subspaces. We call them hybrid automata, since they naturally embed both deterministic finite state automata and finite automata weighted over a field. In Section 2 we present at an informal level a motivating example and the intuitions behind this construction, avoiding as much as possible categorytheoretical technicalities. We use this example to guide us throughout the rest of the paper.
A key property that the new automata model should satisfy is minimization. Since the morphisms of “glued” vector spaces are rather complicated to describe, proving the existence of minimal automata “by hand” is rather complicated. Therefore we opted for a more systematic approach and adopted a category-theoretic perspective for designing new forms of automata that enjoy minimization by design. In particular, we introduce the category of “glued” vector spaces in which these automata should live and we analyse its properties that render minimization possible.
∗This work was supported by the European Research Council (ERC) under the European Unions Horizon 2020 research and innovation programme (grant agreement No.670624), and by the DeLTA ANR project (ANR-16-CE40-0007). The authors also thank the Simons Institute for the Theory of Computing where this work has been partly developped.
†This is the knowledge enriched version of the same paper published in the proceedings of MFCS 2017.

Starting with the seminal papers of Arbib and Manes, see for example [3] and the references therein, and of Goguen [10], it became well established that category theory offers a neat understanding of several phenomena in automata theory. In particular, the key property of minimization in different contexts, such as for deterministic automata (over finite words) and Schtzenberger’s automata weighted over fields [13], arises from the same categorical reasons (existence of some limits/colimits and an (epi,mono)-factorization system [3]).
There is a long tradition of seeing automata either as algebras or coalgebras for a functor. However, in the case of deterministic automata, the algebraic view does not capture the accepting states, while the coalgebraic view does not capture the initial state. In the coalgebraic setting one needs to consider the so-called pointed coalgebras, see for example [1], where minimal automata are modelled as well-pointed coalgebras. The dual perspective of automata seen as both algebras or coalgebras, as well as the duality between reachability and observability, has been explored more recently in papers such as [4–6].
Here we take yet another approach to defining automata in a category. The reader acquainted with category theory will recognise that we see automata as functors from an input category (that specifies the type of the machines under consideration, which in this paper is restricted to word automata) to a category of output values. We show that the next ingredients are sufficient to ensure minimization: the existence of an initial and of a final automaton for a language, and a factorization system on the category in which we interpret our automata.
For example, deterministic and weighted automata over a field are obtained by considering as output categories the categories Set of sets and functions and Vec of vector spaces and linear maps, respectively. Since Set and Vec have all limits and colimits, it is very easy to prove the existence of initial and final automata accepting a given language. In both cases, the minimal automaton for a language is obtained by taking an epi-mono factorization of the unique arrow from the initial to the final automaton.
Notice that the initial and the final automata have infinite (-dimensional) state sets (spaces). If the language at issue is regular, that is, if the unique map from the initial to the final automaton factors through a finite (-dimensional) automaton then, automatically, the minimal automaton will also be finite (dimensional). However, this relies on very specific properties of the categories of sets and vector spaces, namely on the fact that the full subcategories Setfin of finite sets and Vecfin of finite dimensional vector spaces are closed in Set, respectively in Vec, under both quotients and subobjects.
Coming back to hybrid set-vector automata, we define them as word automata interpreted in an output category Glue(Vec) which we obtain as the completion of Vec under certain colimits, and can be described at an informal level as “glueings” of arbitrary vector spaces. The definition of this form of cocompletion Glue(C) of a category C is the subject of Section 4.
We are interested in those hybrid automata for which the state object admits a finitary description, which intuitively can be described as finite glueings of finite dimensional spaces. For this reason we will consider the subcategory Gluefin(Vecfin) of Glue(Vec). It turns out that Gluefin(Vecfin) is closed under quotients in Glue(Vec) but, crucially, it is not closed under subobjects. For example, a glueing of infinitely many one-dimensional spaces is a subobject of a two-dimensional space, but only the latter is an object of Gluefin(Vecfin).
This is the motivation for introducing a notion of (ES ,MS )-factorization

of a category C through a subcategory S. This is a refinement of the classical notion of factorization system on C and is used for isolating the semantical computations (in C) from the automata themselves (with an object from S as “set of configurations”).1 We show how it provides a minimization of “S-automata for representing C-languages”. A concrete instance of this is a factorization system on Glue(Vec) through Gluefin(Vecfin), which plays a crucial role in proving the existence of minimal Gluefin(Vecfin)-automata for recognizing weighted languages.
The rest of the paper is organised as follows. We first develop a motivating example of a hybrid set-vector space automaton in Section 2. We then identify in Section 3 the category-theoretic ingredients that are sufficient for a class of automata to enjoy minimization. We then turn to our main contribution, namely the description and the study of the properties of (finite-)mono-diagrams in a category, in Section 4. We conclude in Section 5 with a discussion of some of the design choices we made in this paper.

2 The hybridisation of deterministic finite state and vector automata

In this section, we (rather informally) describe the motivating example of this paper: the construction of a family of automata that naturally extends both deterministic finite state automata and finite automata weighted over a field in the sense of Schtzenberger (i.e., automata in the category of finite vector spaces). The intuition should then support the categorical constructions that we develop in the subsequent sections.
Set automata (deterministic automata). Let us fix ourselves an alphabet A. A deterministic automaton (or set automaton) is a tuple

A = (Q, i, f , (δa)a∈A) ,

in which Q is a set of states, i is a map from a one element set 1 = {0} to Q (i.e. an initial state), f is a map from Q to a two elements set 2 = {0, 1} (i.e. a set of accepting states), and δa is a map from Q to itself for all letters a ∈ A. Given a word u = a1 . . . an, the automaton accepts the map [[A]](u) : 1 → 2 defined as:

[[A]](u) = f ◦ δu ◦ i

where δ = δan ◦ · · · ◦ δa1 .

We recognize here the standard definition of a deterministic automaton, in which a word u is accepted if the map [[A]](u) is the constant 1, and rejected if it is the constant 0.
Vector space automata (automata weighted over a field). Now, we can use the same definition of an automaton, this time with Q a vector space (over, say, the field R), i a linear map from R to Q, f a linear map from Q to R (seen as a R-vector space as usual), and δa a linear map from Q to itself. In other words, we have used the same definition, but this time in the category of vector spaces. Given a word u, a vector space automaton A computes [[A]](u) : R → R as the composite described above. Since a linear map from R to R is only determined
1This distinction is usually not necessary, and we are not aware of its existence in the literature. It is crucial for us, thus we cannot use already existing results from the coalgebraic literature, e.g. [1].


by the image of 1, this automaton can be understood as associating to each input word u the real number [[A]](u)(1). We will informally refer to such automata in this section as vector space automata. Let us provide an example.
Leading example. For a word u ∈ {a, b, c}∗ let |u|a denote the number of occurrences of the letter a in u. Let us compute the map F which, given a word u ∈ {a, b, c}∗, outputs 2|u|a if it contains an even number of b’s and no c’s, and 0 in all other cases. This is achieved with the vector space automaton Avec = (Qvec = R2, ivec, f vec, δvec) where for all x, y ∈ R,

ivec(x) = (x, 0) , f vec(x, y) = x ,

δavec(x, y) = (2x, 2y) , δcvec(x, y) = (0, 0) .

δbvec(x, y) = (y, x) ,

One easily checks that indeed [[Avec]](u)(1) = F (u) for all words u ∈ A∗.
Can we do better? It is well known from Schtzenberger’s seminal work that the vector space automaton Avec is minimal, both in an algebraic sense (to be described later) as well as at an intuitive level in the sense that no vector space automaton could recognize F with a dimension one vector space as configuration space: Avec is “dimension minimal.”
However, let us think for one moment on how one would “implement” the function F as an online device that would get letters as input, and would modify its internal state accordingly. Would we implement concretely Avec directly? Probably not, since there is a more economic2 way to obtain the same result: we can maintain 2m where m is the number of a’s seen so far, together with one bit for remembering whether the number of b’s is even or odd. Such an automaton would start with 1 in its unique real valued register. Each time an a is met, the register is doubled, each time b is met, the bit is reversed, and when c is met, the register is set to 0. At the end of the input word, the automaton would output 0 or the value of the register depending on the current value of the bit. If we consider the configuration space that we use in this encoding, we use R R instead of R × R. Can we define an automata model that would faithfully implement this example?
A first generalization: disjoint unions of vector spaces. A way to achieve this is to interpret the generic notion of automata in the category of finite disjoint unions of vector spaces (duvs). One way to define such a finite disjoint unions of vector spaces is to use a finite set N of ‘indices’ p, q, r . . . , and to each index p associate a vector space Vp. The ‘space’ represented is then {(p, v) | p ∈ N, v ∈ Vp}. A ‘map’ between duvs represented by (N, V ) and (N , V ) is then a pair h : N → N together with a linear map fp from Vp to Vh(p) for all p ∈ N . It can be seen as mapping each (p, v) ∈ N × Vp to (h(p), fp(v)). Call this a duvs map. Such duvs maps are composed in a natural way. This defines a category, and hence we can consider duvs automata which are automata with a duvs for its state space, and transitions implemented by duvs maps.
For instance, we can pursue with the computation of F and provide a duvs automaton Aduvs = (Qduvs, iduvs, f duvs, δduvs) where Qduvs = {(s, x) | s ∈ {even, odd}, x ∈ R} (considered as a disjoint union of vector spaces with indices even and odd and all associated vector spaces Veven = Vodd = R). The maps can
2Under the reasonable assumption that maintaining a real is more costly than maintaining a bit.


be conveniently defined as follows:

iduvs(x) = (even, x) f duvs(even, x) = x
f duvs(odd, x) = 0

δaduvs(even, x) = (even, 2x) δbduvs(even, x) = (odd, x) δcduvs(even, x) = (even, 0)

δaduvs(odd, x) = (odd, 2x) δbduvs(odd, x) = (even, x) δcduvs(odd, x) = (odd, 0)

This automaton computes the expected F . It is also obvious that such automata over finite disjoint unions of vector spaces generalize both deterministic finite state automata (using only 0-dimensional vector spaces), and vector space automata (using only one index). However, is it the joint generalization that we hoped for? The answer is no...
Minimization of duvs automata. We could think that the above automaton Aduvs is minimal. However, it involved some arbitrary decisions when defining it. This can be seen in the fact that when δcduvs is applied, we chose to not change the index (and set to null the real value): this is arbitrary, and we could have exchanged even and odd, or fixed it abitrarily to even, or to odd. All these variants would be equally valid for computing F .
It is a bit difficult at this stage to explain the non-minimality of these automata since we did not introduce the proper notions yet. Let us try at a high level, invoking some standard automata-theoretic concepts. The first remark is that every configuration in Qduvs is ‘reachable’ in this automaton: indeed (even, x) = iduvs(x) and (odd, x) = δbduvs ◦ iduvs(x) for all x ∈ R. Hence there is no hope to improve the automaton Aduvs or one of its variants by some form of ‘restriction to its reachable configurations’. Only ‘quotienting of configurations’ remains. However, one can show that none among Aduvs and the variants mentioned above is the quotient of another. Keeping in mind the Myhill-Nerode equivalence, we should instead merge the configurations (even, 0) and (odd, 0) since these are observationally equivalent:

f duvs ◦ δuduvs(even, 0) = 0 = f duvs ◦ δuduvs(odd, 0) for all words u ∈ A∗.

However, the quotient duvs obtained by merging (even, 0) and (odd, 0), albeit not very intuitive, consists of one index associated to a two dimensional vector space, which is essentially an indexed version of the vector space automaton Avec computed before. At this stage, we understand that minimising in the category of duvs is not very helpful, as we do not obtain the desired optimisation.
How to proceed from here. The only reasonable thing to do is indeed to merge (even, 0) and (odd, 0), but we have to be more careful about the precise meaning of ‘quotient’. A possibility is to add explicitly equivalence classes in the definition of the automaton. However, category theory provides useful concepts and terminology for defining these objects: colimits, and more precisely the free co-completion of a category. In the previous paragraph, we have shown that the category of duvs – which is itself the free completion of Vec with respect to finite coproducts – is not a good ambient category for our purposes. We need more colimits, so that the notion of ‘quotient’ is further refined. At the other extreme, we could consider the free completion with respect to all colimits, which, informally, consists of objects obtained from the category using copying and gluing. We will explain later in Section 5 why we choose to not use this completion. Intuitively, by adding all colimits we glue the vector spaces “too much”, and not only we loose a geometric intuition of the objects we are dealing


with, but we may run into actual technical problems when it comes the existence of minimal automata.
Instead, we restrict our attention to a class of colimits (which strictly contains coproducts) for which different spaces in the colimit can be “glued” together along subspaces, but which do not contain implicit self folding (i.e., such that an element of a vector space is not glued to a distinct element of the same vector space, directly or indirectly). E.g., we can describe ‘two one-dimensional spaces, the 0-dimensional subspaces of which are identified through a linear bijection’. In this way we obtain the new category of glued vector spaces and hybrid set-vector space automata, corresponding to Glue(Vec)-automata in the rest of the paper.
Generic arguments of colimits provide the language for describing these objects, but do not solve the question of minimality. In particular, we are interested in automata whose space of configurations is a finite colimit belonging to the class described above. The categorical development in this work addresses the minimization problem for hybrid automata. An intuition in the case of gluing of vector spaces. In the case of gluing of vector spaces, it is possible to isolate a combinatorial statement that plays a crucial role in the existence of minimal hybrid set-vector automata:
(a) Any subset of a finite-dimensional vector space admits a minimal cover as a finite union of subspaces. (b) Furthermore, there is a unique such cover which is a union of subspaces which are incomparable with respect to inclusion.
For instance, in the original vector space automaton Avec, the states that are reachable in fact all belong to R × {0} ∪ {0} × R, and this is the minimal cover as in (a) of these reachable configurations. This subset of R2 has the structure of two R-spaces. These happen to intersect at (0, 0), hence it is necessary to glue them at 0 to faithfully represent this set of reachable configurations. Thanks to (b) this decomposition is canonical, and hence can be used for describing the automaton.
3 Automata in a category
In this section, we provide the general definition for a (finite word) automaton in a category. We also isolate properties guaranteeing the existence of minimal automata. Though presented differently, the material in the first subsection is essentially a slight variation around the work of Arbib and Manes [3], which introduced a notion of automaton in a category and, moreover, highlighted the connection between factorization systems of the ambient category, duality and minimization. In the remaining subsections we develop a refinement of this approach to minimization, and introduce a notion of factorization system through a subcategory.

3.1 Automata in a category, initial automaton, final automaton
Definition 1 (automata). Let C be a locally small category, I and F be objects of C, and A be some alphabet. An automaton A in the category C (over the alphabet A), for short a C, I, F -automaton (or simply C-automaton when I and F are obvious in the context), is a tuple (Q, i, f , δ), where Q is an object in C (called the state object), i : I → Q and f : Q → F are morphisms in C (called initial and final morphisms), and δ : A → C(Q, Q) is a function associating to each letter a ∈ A a morphism δa : Q → Q in C. We extend the function δ to A∗ as with δ being the identity morphism on Q and δwa = δa ◦ δw for all a ∈ A and w ∈ A∗.
A morphism of C, I, F -automata h : A → A is a morphism h : Q → Q in C between the state objects which commutes with the initial, final and transition morphisms:


Q δa Q
















Example 2. The two guiding instantiation of this definition are as follows.
When the category C is Set, I = 1 and F = 2, we recover the standard notion of a deterministic and complete automaton (over the alphabet A∗). In the second
case, when C is Vec over a base field K, I = K and F = K, we obtain K-weighted automata. Indeed, if Q is isomorphic to Kn for some natural number n, then linear maps i : K → Q are in one-to-one correspondence with vectors Kn. The same holds for linear maps f : Q → K, hence i and f are simply selecting an initial, respectively, a final vector.

Definition 3 (languages and language accepted). A C, I, F -language (or Clanguage when I and F are clear from the context) is a function L : A∗ → C(I, F ).
We say that A accepts the language L if L(w) = [[A]](w) := f ◦ δw ◦ i for all w ∈ A∗. Let AutoC(L) denote the category of C, I, F -automata for L, that is, the category whose objects are C, I, F -automata that accept the language L and whose arrows are morphisms of C, I, F -automata3.

Lemma 4. If the coproduct w∈A∗ I exists in C, then AutoC(L) has an initial object initC(L). If the product w∈A∗ F exists in C, then AutoC(L) has a final object finalC(L).
In the case of Set, these automata are well known. The first one has as states A∗, as initial state ε, and when it reads a letter a, its maps w to wa. Its final map sends the state w to L(w)(0). There exists one and exactly one morphism from this automaton to each automata for the same language. The generalisation of this construction is that the state space is the coproduct of A∗many copies of I. The final automaton is known as the automaton of ‘residuals’.
3If A accepts the language L and h : A → A is a morphism of C, I, F -automata, then A also accepts the language L. Hence, AutoC(L) is a ‘connected component’ in the category of all C, I, F -automata.


Its set of states are the maps from A∗ to 2. The initial state is L itself, and when reading a letter a, the state S is mapped to w → S(aw). The final map sends S to S(ε). The generalisation of this construction is that the state space is the product of A∗-many copies of F .

3.2 Factorizations through a subcategory

It is important in the development of this paper to distinguish the category AutoC(L) in which the initial and final automata for a language L exist (recall Lemma 4) and which contains ‘infinite automata’, from the subcategory, named AutoS (L) that is used for the concrete automata (with state object in S) which are intended to be algorithmically manageable. In this section, we provide the concept of factorizing through a subcategory, which articulates the relation between these two categories.

Definition 5 (factorization through a subcategory). Assume S is a subcategory of C. An arrow f : X → Y in C is called S-small if it factors through some object S of S, that is, f is the composite X u S v Y for some u : X → S and v : S → Y .
A factorization system through S on C (or simply a factorization system on C if C = S) is a pair (ES , MS ) where ES and MS are classes of arrows in C so that the codomains of all arrows in ES , the domains of all arrows in MS are in S, and the following conditions hold:

1. ES and MS are closed under composition with isomorphisms in S, on the right, respectively left side.

2. All S-small arrows in C have an (ES , MS )-factorization, that is, if f : X → Y factors through an object of S, then there exists e ∈ ES and m ∈ MS , such that f = m ◦ e.

3. The unique (ES , MS )-diagonalization property holds: for each commutative diagram X eT






with e ∈ ES and m ∈ MS , there exists a unique diagonal, that is, a unique morphism u : T → S such that u ◦ e = f and m ◦ u = g.

Using standard techniques, we can prove that whenever (ES , MS ) is a factorization system through S on C, both classes ES and MS are closed under composition, their intersection consists of precisely the isomorphisms in S, and,
as expected, that (ES , MS )-factorizations of S-small morphisms are unique up to isomorphism.

Example 6. Instantiating (C, S) to be (Set, Setfin) yields a natural factorization system through Setfin on Set (as the restriction of the standard (epi,mono)factorization system on Set to Setfin-small morphisms, i.e., the maps of finite image). Over these categories Setfin-automata are deterministic finite state automata inside the more general category of Set-automata which are deterministic


(potentially infinite) automata. The example (Vec, Vecfin) was already mentioned. In this case, being Vecfin-small is equivalent to having finite rank.
Notice that for (C, S) = (Set, Setfin) or (Vec, Vecfin), the factorization systems through the subcategories are obtained simply by restricting the factorization systems on Set, respectively Vec. This is because, in these cases S is closed under quotients and subobjects in C. The category Gluefin(Vecfin) used in this paper is closed under quotients, but in general not under subobjects (and this is the important reason for this extension of the standard notion of factorization). This is also a case in which there is a factorization system in the category C, that coincide over S with factorizing through S, but for which factorizing in C of an S-small morphism and factorizing it through S yield different results.
A factorization system on C lifts naturally to categories of C-valued functors. Automata being very close in definition to such a functor category, factorization systems also lift to them. Lemma 7 shows that this is also the case for factorization systems through S, assuming of course that the input and output objects I and F belong to S.
Lemma 7. Whenever (ES , MS ) is a factorization system through a category S then (EAutoS(L), MAutoS(L)) is a factorization system through AutoS (L) for the category AutoC(L), where EAutoS(L) (resp. MAutoS(L)) contains these AutoC(L)morphisms that belong to ES (resp. to MS ) as C-morphisms.
3.3 Minimization through a subcategory
In this section, we show how the joint combination of having initial and final automata for a language, as given by Lemma 4, together with a factorization system through a subcategory S yields the existence of a minimal S-automaton for small C-languages.
We make the following assumptions: (ES , MS ) is a factorization system through S on C, and L is a C-language accepted by some S-automaton such that there exist an initial initC(L) C-automaton and a final C-automaton finalC(L) for L.
Definition 8 (minimal automaton). The minimal C-automaton for L, denoted minS (L), is the4 S-automaton for L obtained by (EAutoS(L), MAutoS(L))factorization of the unique AutoS (L)-small morphism from initC(L) to finalC(L).
Theorem 9. For all S-automata A for L satisfying the above assumptions, we have
minS (L) ∼= obsS (reachS (A)) ∼= reachS (obsS (A)) ,
in which
• reachS (A) is the result of applying an (EAutoS(L), MAutoS(L))-factorization to the unique AutoS (L)-morphism from initC(L) to A, and
• obsS (A) is the result of applying an (EAutoS(L), MAutoS(L))-factorization to the unique AutoS (L)-morphism from A to finalC(L).
4It is unique up to isomorphism according to the diagonal property.

This theorem does not only state the existence of a minimal automaton, it also makes transparent how to make effective its construction: if one possesses both an implementation of reachS and obsS , then their sequencing minimises an input automaton. From the above theorem it immediately follows that minS (L) is both an EAutoS(L)-quotient of a MAutoS(L)-subobject of A and a MAutoS(L)subobject of an EAutoS(L)-quotient of A: the minimal automaton divides every other automaton for the language.
Proof idea. The proof is contained in the following commutative diagram, in which denotes AutoC(L)-morphisms in EAutoS(L), and AutoC(L)-morphisms in MAutoS (L):

initC (L)

reachS (A) obsS (reachS (A)) finalC(L)

minS (L)
That obsS (reachS (A)) is an (EAutoS(L), MAutoS(L))-factorization of the unique AutoC(L)-morphism from initC(L) to finalC(L) follows since EAutoS(L) is closed under composition. By the unique diagonal property, it is isomorphic to minS (L). The case reachS (obsS (A)) is symmetric.

3.4 A special case of factorization through
So far, the description of factorization and minimization of automata is very generic. Hereafter, the classes ES and MS are constructed along a particular principle which we describe now. In the next sections we will instantiate this construction when S is the subcategory Gluefin(Vecfin) of glued vector spaces.
In this section we fix an (E, M )-factorization system on C and a subcategory S → C.
Definition 10. An S-extremal epimorphism5 in C is an arrow e : X → S in C, with S an object in S, such that if e = m ◦ g where m is in M with domain in S, then m is an isomorphism. We set MS to be the class of arrows in M with domain in S, and ES to be the class of S-extremal epimorphisms.
Definition 11. An MS -subobject in S of an object X of C is an equivalence class up to isomorphism of a morphism S → X belonging to M , where S is an object of S. The MS -subobject S → X is called proper if it is not an isomorphism.
Lemma 12. Assume the following conditions hold:
1. all arrows in M are monomorphisms in C,
2. S is closed under E-quotients, i.e., if e : S → T is in E with S in S, then T is isomorphic to an object of S,
3. the intersection of a nonempty set of MS -subobjects of an object X of C exists and is an M S-subobject of X, and,
5Note that S-extremal epimorphisms need not be epimorphisms in C.

AutomataAutomatonCategoryVector SpacesMinimization