Sometimes It’s Quicker and Easier to Use Ryan’s RegEx Tester Rather Than Writing an AutoHotkey Script
I used Ryan’s RegEx Tester in an earlier blog to create Web links without writing an AutoHotkey script. This time I take advantage of this powerful tool by using it to extract data for insertion into the INI file discussed in the last blog on this topic. The fact that you can paste any text into the top of the RegEx Tester, add a Regular Expression (and a substitution expression for RegExReplace()), then extract the altered text from the bottom pane makes it a unique AutoHotkey app. This capability alone can motivate someone to learn how to write Regular Expressions.
Extracting a Data List from Web Page Source Code
I needed to create a list of variable IDs to use as keys in the AHKRef.ini file. All of those keys appear in the HTML source code of the AutoHotkey.com “Variables and Expressions” page. (The HTML id attributes do not display when the page loads in the browser window; they appear only after selecting “Page source code” or “Inspect” from the right-click menu in most Web browsers. See “Developer’s Tools” to add menu items in Microsoft Edge.) Digging through the “Variables and Expressions” page source code, I could pick out the anchor targets embedded in the page (e.g. <tr id="Sec">).
I could work through the page source code to copy-and-paste each ID into the INI text file, but that would be tedious and prone to errors. I found it far easier to copy the entire page into the top of the RegExReplace tab of Ryan’s RegEx Tester, write the appropriate RegEx and “Replacement Text” expression, then retrieve the data from the bottom of the tool:
The most challenging problems involve writing the RegEx and determining the substitution text. While I explain in this blog how the solution works, anyone who wants to do similar work needs an understanding of the mysteries of Regular Expressions. Although they may look complicated, RegExs merely require a slightly different way of thinking. Once understood, RegExs represent a powerful method for manipulating text for a variety of purposes. Their advantage over the usual text manipulation commands (e.g. StringReplace) is greater versatility and power: a single expression can match a whole family of strings rather than one fixed string. However, writing a RegEx has a slight learning curve.
(There are many sources on the Web for learning Regular Expressions which apply to virtually any programming language, but they rarely offer AutoHotkey examples. That’s why I wrote a book exclusively about using Regular Expressions in AutoHotkey. I designed the book to bring the complete RegEx novice to a level where the enigmatic code starts to make sense by applying the RegExMatch() and RegExReplace() functions to practical problems. In most cases, you can copy example expressions from any Web source and use them directly in Ryan’s RegEx Tester.)
Note: While the double quotation marks work as shown in Ryan’s RegEx Tester, if you plan to use the RegEx directly in one of the AutoHotkey RegEx functions (RegExMatch() or RegExReplace()), be sure to escape each double quotation mark inside the quoted string with another one (i.e. ""). For example:
RegExReplace(Haystack, ".+?<tr id=""(.+?)"">.+?", "$1=$1`r")
Otherwise, you will get errors.
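To see the quote escaping in action, here is a minimal sketch in AutoHotkey v1 syntax. The one-line Haystack is invented sample data, not text copied from the real page, and the variable names are only illustrative.

```autohotkey
; AutoHotkey v1 sketch. The sample Haystack line is made up for illustration.
Haystack := "<tr id=""Sec""><td>A_Sec</td></tr>"

; Inside a quoted expression string, "" stands for one literal " character,
; so the engine receives:  <tr id="(.+?)">
Needle := "<tr id=""(.+?)"">"

; The third parameter of RegExMatch() is an output variable. In the default
; mode, the first captured subpattern lands in Match1.
FoundPos := RegExMatch(Haystack, Needle, Match)

MsgBox, % "Needle as the engine sees it: " Needle "`nCaptured ID: " Match1
```

With this sample line, the message box should show the needle with single quotation marks and the captured ID Sec.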
How the RegExReplace Works
When you look at all the garbage in the top of the RegEx Tester image above and compare it with the neat list of equations in the bottom, you can’t help but be impressed. We sifted through all the chaff to find the embedded IDs, then turned them into INI file key=value pairs.
We did this by applying the important “blah, blah, blah” RegEx (.+?). Introduced in a previous quick reference blog, the “blah, blah, blah” RegEx (.+?) devours anything and everything until it encounters text which matches the next part of the expression. Notice we use the “blah, blah, blah” RegEx three times in the Regular Expression:
.+?<tr id="(.+?)">.+?
This expression tells the RegEx engine to:
- Consume everything in sight until it reaches the string <tr id=" in the pane labeled Text to be searched: in the RegEx Tester.
- Save whatever follows, up to the next "> characters, as a substring (backreference $1) for later use. The pair of parentheses marks the part to save.
- Consume one more character (with nothing after it in the expression, the final .+? takes only the minimum), then start over: the leading .+? gobbles up text again until the next occurrence of a match for the <tr id="(.+?)"> expression.
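The find-capture-repeat cycle described in these steps can also be sketched as a loop in a script. Note that this is a variation rather than the RegExReplace approach used in the tester: it calls RegExMatch() repeatedly and skips ahead with the StartingPos parameter instead of consuming text with .+?. The two-line Haystack is invented sample data.

```autohotkey
; AutoHotkey v1 sketch. Sample text is invented; real page source is far longer.
Haystack := "<tr id=""Sec""><td>A_Sec</td></tr>`n<tr id=""MSec""><td>A_MSec</td></tr>"

IDs := ""
Pos := 1
; Each pass finds the next <tr id="..."> at or after Pos and captures the ID.
; RegExMatch() returns 0 when nothing more is found, which ends the loop.
while (Pos := RegExMatch(Haystack, "<tr id=""(.+?)"">", Match, Pos))
{
    IDs .= Match1 "`n"       ; Match1 holds the text captured by the parentheses
    Pos += StrLen(Match)     ; resume searching just past the current match
}

MsgBox, % IDs
```

For the sample text, IDs ends up holding Sec and MSec, one per line.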
The Replacement Text
If we only used the backreference $1 in the Replacement Text field, we would get a list of the IDs. By adding the equals sign and repeating that backreference, the substring appears twice as an INI key=value pair:
$1=$1 ; followed by a hidden new line character
(An unseen RETURN or ENTER has been inserted at the end of the Replacement Text field to create a new line for each occurrence of an ID. Otherwise, the keys would appear as one continuous line. If using the AutoHotkey RegExReplace() function in a script, be sure to replace the hidden new line with the `r escape character for a carriage return or `n for a newline.)
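For anyone moving this replacement into a script, here is a minimal AutoHotkey v1 sketch. The one-line Haystack is invented sample data, and the choice of `r`n (carriage return plus linefeed) as the line ending is an assumption; `r or `n alone may suit your editor better.

```autohotkey
; AutoHotkey v1 sketch; the sample line is invented for illustration.
Haystack := "junk <tr id=""Sec""> more junk"

; `r`n stands in for the hidden new line typed into the tester's
; Replacement Text field. $1 is the ID captured by the parentheses.
Result := RegExReplace(Haystack, ".+?<tr id=""(.+?)"">.+?", "$1=$1`r`n")

; For this sample, Result is "Sec=Sec", a line break, then "more junk".
MsgBox, % Result
```

Notice that the trailing "more junk" survives. RegExReplace() only rewrites the text it matches and leaves everything else alone, which is why you still pick out the key=value pairs you want from the results.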
The last step involves merely selecting and copying the desired key=value pairs from the Results pane and pasting them into the INI file using any text editor.
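If the copy-and-paste step ever needs automating, the built-in IniWrite command can add the pairs directly. The following is only a sketch in AutoHotkey v1 syntax: the section name Variables, the file location, and the two sample keys are all assumptions for illustration and should be adjusted to the real layout of AHKRef.ini.

```autohotkey
; AutoHotkey v1 sketch. "Variables" is a hypothetical section name, the file
; path is assumed, and Keys is sample data standing in for the extracted IDs.
IniFile := A_ScriptDir "\AHKRef.ini"
Keys := "Sec`nMSec"

; Split the list on linefeeds, dropping any stray carriage returns.
Loop, Parse, Keys, `n, `r
{
    if (A_LoopField = "")
        continue
    ; Parameter order: IniWrite, Value, Filename, Section, Key
    ; Key and value are the same, mirroring the $1=$1 replacement text.
    IniWrite, %A_LoopField%, %IniFile%, Variables, %A_LoopField%
}
```

For a one-time job, pasting by hand remains quicker; a loop like this only pays off if the list gets regenerated often.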
At times, when you only need to parse the data once, Ryan’s RegEx Tester offers huge advantages over writing a new AutoHotkey script. Rather than going through the change, save, and reload process with even the simplest script, you can do everything in this tool.
The RegEx Tester only requires you to insert the input text with a simple copy-and-paste. The output text changes in real time while you manipulate the RegEx function parameters, so you immediately see the results. Finally, you copy-and-paste the end product into any other application or file. Plus, since Ryan’s RegEx Tester is written in AutoHotkey, you can modify it to add other nifty features such as Hotkeys for sending results to a file or allowing the resizing of the input or output fields. Or, maybe a search feature for perusing the input text?