
No regex capture groups in Google Docs

Let's say you're collecting information from a bunch of sources. Your goal here is to aggregate and edit all of this information into a single, cohesive Google Doc.

Now let's say that one of your sources systematically provides lists in this format:

-This list
-Is maddeningly
-Formatted

The only thing correct about this list is its message: its format is problematic. It's problematic because Google Docs' autoformatting can't tell that it's a list (tangentially, it's also not a valid Markdown list).

In practice, this means that Google Docs won't format it as a bulleted list. Also, if I want to change this unordered-list imposter into an ordered list (that is, a numbered one), I'd have quite the clicky manual job ahead of me.

Regex to the rescue?

Normally yes, but not in Google Docs. "Find and Replace" in Google Docs does support regular expressions, but with one caveat that prevents solving this particular problem.

I should be able to do the equivalent of this Perl-style substitution:

s/(-)([A-Z])/$1 $2/

Where:

  1. s indicates that I'm doing substitution[1]
  2. / are the regex delimiters that separate the "Find" and "Replace" expressions[2]
  3. (-)([A-Z]) is the "Find" expression, which matches a hyphen followed by a capital letter and captures each of them in its own capture group
  4. $1 $2 is the "Replace" expression, using the values from the automatically-assigned capture group variables ($1 and $2, for - and [A-Z], respectively), with a space in between
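For comparison, here is roughly what that substitution looks like in an engine that does support capture groups. This is a minimal Python sketch, not anything Google Docs runs; the variable names are just for illustration, and note that Python spells the backreferences \1 and \2 rather than $1 and $2.

```python
import re

# Sample lines copied from the maddening list above.
lines = ["-This list", "-Is maddeningly", "-Formatted"]

# (-) captures the hyphen, ([A-Z]) captures the capital letter after it.
# count=1 mirrors s/// without a global flag: only the first match per line.
fixed = [re.sub(r"(-)([A-Z])", r"\1 \2", line, count=1) for line in lines]

for line in fixed:
    print(line)
```

Each line comes out with a space between the hyphen and the first word, which is the shape list autoformatting expects.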

This should work.

Instead, here's what you will get in Google Docs:

$1 $2his list
$1 $2s maddeningly
$1 $2ormatted

After a moment of assuming the problem was me, I checked the Google Docs help page on regular expressions and found this:

Note: Capture groups only work with Google Sheets

Bummer.

Workarounds?

Your workaround will vary depending on your situation. If it's a short list, you can do the work manually faster than you can write a regular expression anyway.

If it's a much longer list, you could do the character wrangling in your favorite text editor that supports capture groups, then paste the result back into Google Docs.

Or, if you happen to believe that all hyphens in your data are meant to be bullets, you could live dangerously: find every - and replace it with - followed by a space. Probably not for me.
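If you'd rather script the wrangling than do it in an editor, the same idea also covers the ordered-list case from earlier. Here's a rough Python sketch; to_numbered_list is a hypothetical helper name of my own, and it assumes every line starting with a hyphen plus a capital letter belongs to one single list (the numbering never restarts).

```python
import re


def to_numbered_list(text: str) -> str:
    """Rewrite lines like '-Item' as '1. Item', '2. Item', ... (hypothetical helper)."""
    out = []
    n = 0
    for line in text.splitlines():
        # Capture everything after a leading hyphen that is directly
        # followed by a capital letter, like the "Find" expression above.
        m = re.match(r"-([A-Z].*)", line)
        if m:
            n += 1
            out.append(f"{n}. {m.group(1)}")
        else:
            out.append(line)  # leave non-list lines untouched
    return "\n".join(out)


source = "-This list\n-Is maddeningly\n-Formatted"
print(to_numbered_list(source))
```

Paste the printed result back into Google Docs and the numbers are already in place, no clicking required.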

If you could assume all hyphen-bullets happen at the beginning of the line, you might be tempted to reach for something like:

s/^-/- /

But Google Docs help is here to let you know that's not going to work:

Note: This regular expression [^] only works with Google Sheets.

Bummer.
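Outside of Google Docs, the anchored version does behave. A small Python sketch, with sample text I made up for illustration; the one gotcha is that Python needs re.MULTILINE for ^ to match at the start of every line instead of only at the start of the whole string.

```python
import re

# The last line has a mid-line hyphen that should be left alone.
text = "-This list\n-Is maddeningly\n-Formatted\nA well-known aside"

# With re.MULTILINE, ^ anchors at the beginning of each line.
fixed = re.sub(r"^-", "- ", text, flags=re.MULTILINE)

print(fixed)
```

Only the hyphens at the start of a line pick up a space; the one in "well-known" is untouched.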


  1. In Google Docs (or most GUI text editing tools), you wouldn't need the s operator because you are provided with a "Find and Replace" tool in the UI. ↩︎

  2. Similarly, in most GUI text editors, you would probably not use delimiters because the "Find" and "Replace" expressions are entered into two different text fields. ↩︎