Deterministic Finite Wheeler Automata are a natural generalisation to regular languages of the theory of compressed data structures originated by the introduction of the Burrows-Wheeler transform. Indeed, if we can find a Wheeler automaton recognizing a given language L, such automaton can be used to design time and space efficient algorithms for representing and searching L. In this paper we introduce an alternative representation of Deterministic Wheeler Automata by showing that a natural map between strings and rational numbers in Qr0, 1q can be extended to represent the automaton’s states as intervals in Qr0, 1q. Using this representation it emerges a natural relationship between automata properties and some properties of real numbers. In addition, such representation enables us to formulate problems related to automata in a numerical setting. Although at the moment the numerical approach does not lead to time efficient algorithms, we believe this new perspective deserves further consideration. As a further demonstration of the convenience of this new representation, we use it to provide a simple proof of an unexpected result on regular languages. More precisely, we compare the size of the smallest Wheeler automaton recognizing a given language L with respect to the size of the smallest automaton, possibly non-Wheeler, recognizing the same language. We show settings in which there can be an exponential gap between the two sizes, and we discuss the implications of this result on the problem of representing regular languages.

Manzini, G., Policriti, A., Prezza, N., Riccardi, B. (2024). The Rational Construction of a Wheeler DFA. In 35th Annual Symposium on Combinatorial Pattern Matching, CPM 2024 (pp.1). Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing [10.4230/LIPIcs.CPM.2024.23].

The Rational Construction of a Wheeler DFA

Manzini G.;Riccardi B.
2024

Abstract

Deterministic Finite Wheeler Automata are a natural generalisation to regular languages of the theory of compressed data structures originated by the introduction of the Burrows-Wheeler transform. Indeed, if we can find a Wheeler automaton recognizing a given language L, such automaton can be used to design time and space efficient algorithms for representing and searching L. In this paper we introduce an alternative representation of Deterministic Wheeler Automata by showing that a natural map between strings and rational numbers in Qr0, 1q can be extended to represent the automaton’s states as intervals in Qr0, 1q. Using this representation it emerges a natural relationship between automata properties and some properties of real numbers. In addition, such representation enables us to formulate problems related to automata in a numerical setting. Although at the moment the numerical approach does not lead to time efficient algorithms, we believe this new perspective deserves further consideration. As a further demonstration of the convenience of this new representation, we use it to provide a simple proof of an unexpected result on regular languages. More precisely, we compare the size of the smallest Wheeler automaton recognizing a given language L with respect to the size of the smallest automaton, possibly non-Wheeler, recognizing the same language. We show settings in which there can be an exponential gap between the two sizes, and we discuss the implications of this result on the problem of representing regular languages.
paper
Co-lexicographical Sorting; Deterministic Finite Automata; Graph Indexing; String Matching; Wheeler languages;
English
35th Annual Symposium on Combinatorial Pattern Matching, CPM 2024 - 25 June 2024 through 27 June 2024
2024
Inenaga, S; Puglisi, SJ
35th Annual Symposium on Combinatorial Pattern Matching, CPM 2024
9783959773263
2024
296
1
23
open
Manzini, G., Policriti, A., Prezza, N., Riccardi, B. (2024). The Rational Construction of a Wheeler DFA. In 35th Annual Symposium on Combinatorial Pattern Matching, CPM 2024 (pp.1). Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing [10.4230/LIPIcs.CPM.2024.23].
File in questo prodotto:
File Dimensione Formato  
Manzini-2024-LIPIcs-VoR.pdf

accesso aperto

Tipologia di allegato: Publisher’s Version (Version of Record, VoR)
Licenza: Creative Commons
Dimensione 883.62 kB
Formato Adobe PDF
883.62 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10281/492759
Citazioni
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
Social impact