Artykuł w czasopiśmie
Brak miniatury
Licencja

CC-BYCC-BY - Uznanie autorstwa

Minimum Common String Partition: Exact Algorithms

Autor
Mihajlin, Ivan
Kulikov, Alexander
Nikolaev, Maksim
Cygan, Marek
Reznikov, Grigory
Data publikacji
2021
Abstrakt (EN)

In the minimum common string partition problem (MCSP), one gets two strings and is asked to find the minimum number of cuts in the first string such that the second string can be obtained by rearranging the resulting pieces. It is a difficult algorithmic problem having applications in computational biology, text processing, and data compression. MCSP has been studied extensively from various algorithmic angles: there are many papers studying approximation, heuristic, and parameterized algorithms. At the same time, almost nothing is known about its exact complexity. In this paper, we present new results in this direction. We improve the known 2ⁿ upper bound (where n is the length of input strings) to ϕⁿ where ϕ ≈ 1.618... is the golden ratio. The algorithm uses Fibonacci numbers to encode subsets as monomials of a certain implicit polynomial and extracts one of its coefficients using the fast Fourier transform. Then, we show that the case of constant size alphabet can be solved in subexponential time 2^{O(nlog log n/log n)} by a hybrid strategy: enumerate all long pieces and use dynamic programming over histograms of short pieces. Finally, we prove almost matching lower bounds assuming the Exponential Time Hypothesis.

Słowa kluczowe EN
similarity measure
string distance
exact algorithms
upper bounds
lower bounds
Dyscyplina PBN
informatyka
Strony od-do
35:1-35:16
Data udostępnienia w otwartym dostępie
2021-08-31
Licencja otwartego dostępu
Uznanie autorstwa