Efficient enumerations in words

Zuba, Wiktor

Praca doktorska

Pliki

Licencja

Korzystanie z tego materiału możliwe jest zgodnie z właściwymi przepisami o dozwolonym użytku lub o innych wyjątkach przewidzianych w przepisach prawa. Korzystanie w szerszym zakresie wymaga uzyskania zgody uprawnionego.

Efficient enumerations in words

Autor

Zuba, Wiktor

Promotor

Rytter, Wojciech

Data publikacji

2021-10-08

Abstrakt (EN)

The dissertation presented here consists of a set of results in text processing algorithms from four different scientific publications. The papers focus on algorithms and data structures that allow efficient enumeration and counting of certain classes of factors. Most of the results are based on different variations of the same key property - regularity of the set of occurrences of factors of specific types. In ”Efficient Representation and Counting of Antipower Factors in Words” we focused on a class of k-antipowers (words consisting of k pairwise different segments of the same length). The results of the paper consist of effective algorithms for enumerating all occurrences, counting all occurrences and counting different k-antipower factors in a string, working in O(nk log k + output), O(nk log k) and O(nk^4 log k log n) time, respectively. The paper ”Counting Distinct Patterns in Internal Dictionary Matching” deals with a problem in which we are given an input word T and its internal dictionary D. We want to answer questions of the form: ”How many distinct words from D are the factors of T[i . . j]?”. We have shown few algorithms and data structures, that allows us to do that effectively. The algorithms differ in exploited properties and complexities (each one can be more effective from the other depending on the ratio between T and D). In ”The Number of Repetitions in 2D-Strings” we focused on the extensions of the notions of squares and runs to the case of two-dimensional strings. We gave new bounds on the maximal number of their occurrences in strings and effective algorithms for enumerating them. In the fourth paper titled ”Efficient Enumeration of Distinct Factors Using Package Representations”, we proposed a new compact representation of string factors. It allows us to obtain new effective algorithms for enumeration and counting of distinct factors it contains. In particular it allowed us to obtain a new simple algorithm for finding distinct squares and a faster algorithm for finding all distinct antipowers in a string (improving one of the results from the first of enclosed papers).

Słowa kluczowe EN

internal dictionary

two-dimensional string

runs

antipower

subword

factor

counting

enumeration

data structure

algorithm

słownik wewnętrzny

słowo dwuwymiarowe

maksymalne powtórzenie

antypotęga

podsłowo

zliczanie

wyliczanie

struktura danych

algorytm

Inny tytuł

Efektywne zliczanie w słowach

Data obrony

2021-10-19

Licencja otwartego dostępu

Dozwolony użytek

Licencja

Efficient enumerations in words

Opcje

Abstrakt (EN)