TL;DR: Tokenize the repeating characters in a string with <code>(\w)\1*</code>.
<hr />
Some coding challenges (and the occasional real-world challenge) require you to act on the repeating characters in a string. A couple of examples include finding binary gaps, like the string of six zeroes in 1101000000101, and <a target="_blank" href="https://en.wikipedia.org/wiki/Run-length_encoding">run-length encoding (RLE)</a> challenges.
You can quickly tokenize a string's repeating characters using the regular expression <code>(\w)\1*</code>.
<ul>
<li><code>(\w)</code> captures any single alphanumeric character</li>
<li><code>\1*</code> looks for zero or more matches of the previous alphanumeric character</li>
</ul>
Here's how it looks in JavaScript:
<pre><code class="lang-javascript">const test = "1101000000101";
const regex = /(\w)\1*/g

const tokens = test.match(regex);

console.log(tokens); 
// =&gt; [ '11', '0', '1', '000000', '1', '0', '1' ]
</code></pre>
And here's how it looks in Python:
<pre><code class="lang-python">import re

test = "1101000000101"
regex = r"(\w)\1*"

tokens = [x[0] for x in re.finditer(regex, test)]

print(tokens)
# =&gt; ['11', '0', '1', '000000', '1', '0', '1']
</code></pre>
Regular expression disclaimer and checklist:
<ul>
<li>Test it at <a target="_blank" href="https://regex101.com/">regex101.com</a></li>
<li>Visualize it at <a target="_blank" href="https://regexper.com/">Regexper</a></li>
<li>Check for <a target="_blank" href="https://www.regular-expressions.info/catastrophic.html">catastrophic backtracking</a> and <a target="_blank" href="https://www.regular-expressions.info/redos.html">denial of service</a> vulnerabilities.</li>
</ul>
Photo by <a href="https://unsplash.com/@waldemarbrandt67w?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Waldemar Brandt</a> on <a href="https://unsplash.com/s/photos/pattern?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a>

**TL;DR**: Tokenize the repeating characters in a string with `(\w)\1*`.

---

Some coding challenges (and the occasional real-world challenge) require you to act on the repeating characters in a string. A couple of examples include finding binary gaps, like the string of six zeroes in 1101**000000**101, and [run-length encoding (RLE)](https://en.wikipedia.org/wiki/Run-length_encoding) challenges.

You can quickly tokenize a string's repeating characters using the regular expression `(\w)\1*`.

- `(\w)` captures any single alphanumeric character
- `\1*` looks for zero or more matches of the previous alphanumeric character

Here's how it looks in JavaScript:

```javascript
const test = "1101000000101";
const regex = /(\w)\1*/g

const tokens = test.match(regex);

console.log(tokens); 
// => [ '11', '0', '1', '000000', '1', '0', '1' ]
```

And here's how it looks in Python:

```python
import re

test = "1101000000101"
regex = r"(\w)\1*"

tokens = [x[0] for x in re.finditer(regex, test)]

print(tokens)
# => ['11', '0', '1', '000000', '1', '0', '1']
```

Regular expression disclaimer and checklist:

- Test it at [regex101.com](https://regex101.com/)
- Visualize it at [Regexper](https://regexper.com/)
- Check for [catastrophic backtracking](https://www.regular-expressions.info/catastrophic.html) and [denial of service](https://www.regular-expressions.info/redos.html) vulnerabilities.

Photo by <a href="https://unsplash.com/@waldemarbrandt67w?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Waldemar Brandt</a> on <a href="https://unsplash.com/s/photos/pattern?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a>

Regex: Tokenize repeating characters in a string