Regex isn't A Pokemon - Autumnotopia

How to Quickly Format Your VN Script -- or, Regex isn't A Pokemon!

If you're a gamedev who's worked in Ren'Py, you probably have some kind of workflow for getting your writing into shape, so to speak. Humans and computers don't speak the same language, so there are some considerations you have to make.

Some people are able to just... write their first draft in code. I cannot do that. So instead, I've tried to make it as painless as possible when it comes time to implement my script.

One thing I try to keep in mind when structuring my script is consistency - it's what computers crave. I can't write in straight-up .rpy code, but I can consistently structure my writing. Usually, this is the sort of thing I go with:

Clara:
Has she always been... well, you know...?
Sen:
Gay?
Clara:
A grim reaper.
Sen nodded solemnly.

For my purposes, this gets me where I want to go (Completed-First-Draft City). And the best part is that I can make the computer convert almost all of this into Ren'Py code for me. My secret trick is a little thing I like to call Regex (everyone calls it that, that's actually just what it's called).

Regex (or 'regular expressions') is basically a secret code used to find (and replace) patterns in text. It's sort of a buffed version of the normal search and replace you'd find in word processors. Instead of always doing literal searches, you'll often want to use special symbolic characters that have different meanings.

Regex can be a little tricky to comprehend at first, but I find that playing around with it a little can help (here's an online regex tester that you can use to practice). I'll walk you through how I would auto-magically convert the snippet I posted above, which I think should probably teach you mooost of the basic building blocks you'll need to know.

Okay, but like... how do I actually RUN this stuff on my script? Many code editors come with a Regex option in their search and replace settings - here's more info about where you can find those settings in VSCode or in Sublime Text. If you can't find a way to do it in any program you already use, you could probably pull it together by using an online editor. One warning: there are different 'versions' of Regex (often called flavors) that may change your results a little... so if something is acting funny, that may be a factor.

The first thing I do is figure out how I want my final code to look, and how things should translate between the two versions. Here's what my end goal looks like:

c "Has she always been... well, you know...?"
s "Gay?"
c "A grim reaper."
"Sen nodded solemnly."

With my structure in mind, let's start to isolate each character's lines. First up: Clara! Searching for literal strings like this is easy:

Find: Clara:

Clara:
Has she always been... well, you know...?
Sen:
Gay?
Clara:
A grim reaper.
Sen nodded solemnly.

(: is fine, but some special characters may confuse things. If you're ever unsure about a symbol, just throw a \ before it. For example, \. is how you'd search for a period specifically.)
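If you'd rather poke at this outside of an editor, here's a small sketch using Python's built-in re module (Python is my pick for the examples, not something this workflow requires). It assumes one line from the sample script and shows how an escaped period behaves compared to a bare one, plus re.escape, which adds the backslashes for you.

```python
import re

line = "Has she always been... well, you know...?"

# A bare "." is a special character, so it matches every character on the line.
unescaped = re.findall(r".", line)

# "\." only matches literal periods.
period_hits = re.findall(r"\.", line)

# re.escape() backslashes anything that might be special, so the
# text is searched for literally (the "?" here would otherwise be special).
literal_hit = re.search(re.escape("you know...?"), line)

print(len(unescaped), len(period_hits))  # 41 6
print(literal_hit.group(0))              # you know...?
```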

Now that we have the name tag selected, we have a problem: there's a line break. Thankfully, regex accounts for this. The special code to select linebreaks is \n. But even after the linebreak, another problem... Each line of dialogue is unique, so how can we find and replace them?

Bear with me for a second, because this is probably going to be the hardest sequence to wrap your mind around. But kind of like how \n means 'new line', there are other special sequences that mean other things. In Regex, a period is used to symbolize 'any individual character'. If we tacked a period onto our search query after the \n, the selection would look like this:

Find: Clara:\n.

Clara:
Has she always been... well, you know...?
Sen:
Gay?
Clara:
A grim reaper.
Sen nodded solemnly.

Okay, well... That's not quite it yet, obviously! Now we need to convey 'we want every character on the following line, not just the first'. Thankfully, we have ways to do that too! When you put a * after a character, it'll look for as many of that character in a row as possible (including none at all). And since a period normally won't match a line break, combining the two as .* basically says 'get the whole line':

Find: Clara:\n.*

Clara:
Has she always been... well, you know...?
Sen:
Gay?
Clara:
A grim reaper.
Sen nodded solemnly.
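To see exactly what each of the three find-queries so far grabs, here's a minimal sketch in Python. It assumes the sample script is laid out with each name tag on its own line, directly above the dialogue, and stored in a variable I've called script.

```python
import re

# The sample script, assuming one name tag per line followed by the dialogue.
script = (
    "Clara:\n"
    "Has she always been... well, you know...?\n"
    "Sen:\n"
    "Gay?\n"
    "Clara:\n"
    "A grim reaper.\n"
    "Sen nodded solemnly.\n"
)

tag_only = re.findall(r"Clara:", script)        # just the name tags
first_char = re.findall(r"Clara:\n.", script)   # tag + line break + one character
whole_line = re.findall(r"Clara:\n.*", script)  # tag + line break + rest of the line

print(tag_only)    # ['Clara:', 'Clara:']
print(first_char)  # ['Clara:\nH', 'Clara:\nA']
print(whole_line)
```

The last query stops at the end of each dialogue line because, by default, the period refuses to match a line break.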

Hey! That's pretty close! Now the final hurdle: we need to be able to manipulate these things! How are we supposed to do the 'replace' part of this find-and-replace operation? Well, I know that my Ren'Py variable for Clara's speaking lines is c, and that I'll want quotation marks... But how do I grab and manipulate her lines of dialogue?

To do that, we'll make use of grouping. If you put parentheses around a part of your 'find' query, Regex will keep that in mind and you can reference it in your 'replace' query. First, I'll make a minor modification to my find-query above: Clara:\n(.*) (Notice that the parentheses are around all of the stuff on the second line of text, after the line break)

Find: Clara:\n(.*)

Clara:
Has she always been... well, you know...?
Sen:
Gay?
Clara:
A grim reaper.
Sen nodded solemnly.

This means that the find-query now has enough info for us to structure our replace-query. My goal is to write a replace-query that's structured like Ren'Py code, which in this case means (character variable) "(dialogue)". I search and replace for each individual character, so for my replace-query I'll write Clara's variable, then the capture group. Which looks like:

Find: Clara:\n(.*)

Replace: c "\1"

(You'll notice that \1 hiding in there; that's how you reference your parenthesis'd group from your find-query. If you had multiple groups in your find-query, then you'd reference them by increasing number: \2, \3, etc.)

That Regex find-and-replace should automatically format all of Clara's lines. As mentioned, I do this individually by character (which is a little silly probably, but I feel like it's an acceptable amount of busywork since it saves me a LOT of copy pasting in general).
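Here's the same find-and-replace as a rough Python sketch, in case you want to run it as a script instead of through an editor. The script variable is my stand-in for the sample text, assuming each name tag sits on its own line; re.sub is Python's version of 'replace all'.

```python
import re

script = (
    "Clara:\n"
    "Has she always been... well, you know...?\n"
    "Sen:\n"
    "Gay?\n"
    "Clara:\n"
    "A grim reaper.\n"
    "Sen nodded solemnly.\n"
)

# Find: Clara:\n(.*)    Replace: c "\1"
# The parentheses capture the dialogue line; \1 pastes it back in.
converted = re.sub(r"Clara:\n(.*)", r'c "\1"', script)

print(converted)
# c "Has she always been... well, you know...?"
# Sen:
# Gay?
# c "A grim reaper."
# Sen nodded solemnly.
```

One thing to watch for: some tools write the group reference as $1 instead of \1 (VSCode's search box is one of them, as far as I know), so try that if \1 shows up literally in your results.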

c "Has she always been... well, you know...?"
s "Gay?"
c "A grim reaper."
Sen nodded solemnly.
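If swapping names by hand gets old, the per-character pass could be looped. This is only a sketch: the speakers dictionary and the tag_dialogue function are names I made up for illustration, and it assumes every speaker's tag follows the same 'Name:' plus line break structure.

```python
import re

# Hypothetical mapping of script names to Ren'Py character variables.
speakers = {"Clara": "c", "Sen": "s"}

def tag_dialogue(script, speakers):
    """Run the same find-and-replace once per character."""
    for name, variable in speakers.items():
        # re.escape keeps any odd symbols in a name from being read as regex.
        find = re.escape(name) + r":\n(.*)"
        replace = variable + r' "\1"'
        script = re.sub(find, replace, script)
    return script

script = (
    "Clara:\n"
    "Has she always been... well, you know...?\n"
    "Sen:\n"
    "Gay?\n"
    "Clara:\n"
    "A grim reaper.\n"
    "Sen nodded solemnly.\n"
)

tagged = tag_dialogue(script, speakers)
print(tagged)
```

The narration line is left alone because 'Sen nodded' has no colon and line break after the name, so the Sen query never matches it.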

Once I've adjusted the code for each character (which is basically just swapping one name and letter for another), I'm left with all the unattached lines - in my case, it's usually narration (though sometimes I'll have to manually fix a few instances where a character talks for multiple lines, too). So now instead of looking for lines by character name, I want to basically find 'lines that don't have quotes around them'. This involves another trick:

Find: (.*[^"])\n

c "Has she always been... well, you know...?"
s "Gay?"
c "A grim reaper."
Sen nodded solemnly.

This is a little weird, but basically it says 'find any whole line where the last character isn't a " and is followed by a line break'. If you have quotes in your script in some places it may need a manual adjustment, but this should pick up most of my remaining untagged lines. I use a similar replace-query to my prior examples (with one minor modification: since the find-query swallows the line break at the end, we need to add one back in so the line stays on its own line), which leaves us with our final queries and our Ren'Py-ready code:

Find: (.*[^"])\n

Replace: "\1"\n

c "Has she always been... well, you know...?"
s "Gay?"
c "A grim reaper."
"Sen nodded solemnly."
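For reference, here's that narration pass sketched in Python. The tagged variable is my stand-in for the script after every character's lines have been converted, and it assumes the text ends with a line break (without one, the last line has no \n to match against).

```python
import re

# The script after the per-character passes; only the narration is untagged.
tagged = (
    'c "Has she always been... well, you know...?"\n'
    's "Gay?"\n'
    'c "A grim reaper."\n'
    "Sen nodded solemnly.\n"
)

# Find: (.*[^"])\n    Replace: "\1"\n
# [^"] means 'any one character that is not a quotation mark'.
narrated = re.sub(r'(.*[^"])\n', r'"\1"\n', tagged)

print(narrated)
# c "Has she always been... well, you know...?"
# s "Gay?"
# c "A grim reaper."
# "Sen nodded solemnly."
```

A caveat if your script has blank lines: [^"] is happy to match a line break too, so a blank line right after a quoted line can get caught up in the match. Writing the group as (.*[^"\n]) instead rules that out.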

Obviously this exact code won't work in every situation, so you may need to make some modifications to get it to pick up what you're laying down. If you're struggling, play around with it in an online tester (a lot of them help by giving you breakdowns of what everything means)... And if you're still stumped, feel free to leave a comment and I may be able to help you figure it out.

Protip!

Even if your script isn't structured very regularly, you can at the very least use a simple regex query to wrap quotes around every line automatically using some of the strats above. This can be nice when you just want to get your game in-engine ASAP!

Find: (.*)

Replace: "\1"

This is a section of text in a visual novel. Is the cat in the box dead, or alive?
The world may never know...
"This is a section of text in a visual novel. Is the cat in the box dead, or alive?"
"The world may never know..."
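Here's a sketch of the same quick-wrap trick in Python, which also happens to be a nice example of the 'different versions of Regex' warning. The draft variable is a stand-in for your unstructured text. I've swapped (.*) for (.+) here, which is my own adjustment: recent versions of Python's re.sub also replace the empty match sitting at the end of each line, so (.*) leaves stray pairs of quotes behind, while + insists on at least one character.

```python
import re

draft = (
    "This is a section of text in a visual novel. "
    "Is the cat in the box dead, or alive?\n"
    "The world may never know...\n"
)

# Find: (.+)    Replace: "\1"
# (.+) is used instead of (.*) so empty matches are never replaced.
wrapped = re.sub(r"(.+)", r'"\1"', draft)

print(wrapped)
# "This is a section of text in a visual novel. Is the cat in the box dead, or alive?"
# "The world may never know..."
```

A side effect of (.+) is that blank lines get skipped instead of turning into empty "" lines, which is probably what you want anyway.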