Asymptotically half of binary words are shuffle squares
- Series
- Stochastics Seminar
- Time
- Thursday, April 16, 2026 - 15:30 for 1 hour (actually 50 minutes)
- Location
- Skiles 006
- Speaker
- Logan Post – Georgia Institute of Technology – lpost3@gatech.edu
A binary shuffle square is a binary word of even length that can be partitioned into two disjoint, identical subwords. While recognizing shuffle squares is NP-hard, we show that they are surprisingly ubiquitous. We prove that a uniformly random binary word $s$ of length $2n$ is a shuffle square with probability $\frac 12-o(n^{-1/15})$, verifying a conjecture of He, Huang, Nam, and Thaper. In particular, almost every binary word is at most two bit-deletions away from a shuffle square, giving the best possible average case for the “Longest Twin” problem.
By revealing the bits of $s$ sequentially, we reformulate the problem as a discrete stochastic process. We track the evolution of a “buffer set”, a collection of suffixes produced by the revealed bits. In this setting, there is a simple greedy algorithm which behaves like a SSRW; we define a local optimization which creates a negative bias. We also show that the buffer set is robust enough to absorb small defects, yielding a perfect partition with high probability.