About Regular Expression

About Regular Expression Engine

The Regular Expression engine we use is Microsoft Regular Expression 1.0, which ships inside vbscript.dll, a dynamic link library found on most versions of Windows. Although it is very old, it can already do most of the things we need.

Fast Speed
Without complex but rarely used functions, it is simple and fast enough to meet the strict speed demands of mass replacing. Throughout my many replacement jobs, this engine has always been fast.
Simple Grammar
Its grammar is simple. Unlike Perl-style regexps, you do not need to write the regexp delimiter "/" at all, which is very confusing when combined with "\" for escaping characters.
Good Compatibility
The Regular Expression container, vbscript.dll, exists on most versions of Windows, so this tool can run on most of them, even the old Win 9x. Users of any Windows, with or without the .NET Framework or a Java runtime, can enjoy mass replacement.
Light Weight
Users do not need to install any additional regexp engine. If we used a later regexp engine from .NET or Java, users would have to install a cumbersome .NET or Java runtime that is hundreds of times the size of this tool. (Even a full-featured Perl regexp is several times its size.)
Excellent Reliability
Microsoft Regular Expression 1.0 is a stable version. It is more reliable than other free third-party regexp engines, which are constantly being updated or bug-fixed.
There were only two versions before the .NET version, 1.0 and 5.5, and there are no notable differences between them, which suggests that version 1.0 was already very good. In my work with various regexp engines, I have never met an error or a hang during replacing with this one, which I have sometimes met with other engines.
Limitation of Microsoft Regular Expression 1.0

Essential Grammar

Key Words

Metacharacters
Key     Description                                     Example   Explanation
^       Start of line                                   ^On       "On" must be the first word of a line
$       End of line
.       Any character except a line feed                          One character, similar to ? in Windows wildcards
X*      Preceding expression occurs zero or more times  (.|\n)*   Matches nothing, or anything (even paragraphs) of any length
X+      Preceding expression occurs one or more times   (.|\n)+   Matches anything of length >= 1, but not nothing
X?      Preceding expression occurs once or not at all
(XX)    Groups characters so that another metacharacter can follow the group, or so that the result can reference it
[XX]    A class of characters; matches any one of the characters inside
(A|B)   Similar to [AB], but A and B can each be any length, such as whole words
|       Either of the two sides can match
[^X]    A class of all characters except those inside
{a}     Preceding expression is repeated exactly a times
{a,b}   Preceding expression is repeated from a to b times
?       After a quantifier (as in X*? or X+?), match only as far as the nearest following expression
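
A minimal sketch of several metacharacters from the table above. It uses Python's re module purely for illustration, not the Microsoft engine this tool is built on. The sample strings and variable names are made up, and details such as the re.MULTILINE flag and the \2 \1 replacement notation are Python's own and may be written differently in other engines.

```python
import re

text = "On Monday we met.\nLater On we left.\ncolor colour"

# ^ anchors to the start of a line (in Python this needs re.MULTILINE).
starts = re.findall(r"^On", text, flags=re.MULTILINE)

# ? makes the preceding expression optional.
spellings = re.findall(r"colou?r", text)

# (.|\n)* can run across line breaks, while .* stops at the end of a line.
across = re.search(r"Monday(.|\n)*left", text)
single = re.search(r"Monday.*left", text)

# {a,b} repeats the preceding expression between a and b times.
digits = re.findall(r"[0-9]{2,3}", "7 42 123")

# A group (XX) can be referenced in the replacement text.
swapped = re.sub(r"(red) (car)", r"\2 \1", "red car")

# A ? placed after a quantifier stops at the nearest possible match.
greedy = re.search(r"<.+>", "<a><b>").group()
lazy = re.search(r"<.+?>", "<a><b>").group()

print(starts, spellings, digits, swapped, greedy, lazy)
```

Only the first "On" is found by ^On, because the second one is not at the start of its line. The search with .* finds nothing, since "Monday" and "left" sit on different lines.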
Escaped Characters
Key     Abbr. of   Description                                               Example    Explanation
\d      Digit      A digit                                                              Such as 1,2,3
\b      Boundary   A word boundary (some other RegEx flavors use \< and \>)  \bsome\b   Only the word "some" matches; "something" and "handsome" do not
\r      Return     Carriage return                                                      Line break on classic Mac systems
\n      Newline    Line feed                                                            Line break on Unix-like systems
\r\n               Whole line break                                                     Mostly for Windows systems; also used in many cross-platform files
\w      Word       One word character (letter, digit or underscore)          \b\w+\b    Without a + behind it, \w matches a single character such as a,b,c or A,B,C; \b\w+\b matches a whole word
\s      Space      White space: a blank, invisible character                            Such as space or tab
\t      Tab        A tab character
\W ...             An upper-case key is the negated class of its lower-case counterpart (the reversed range)
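
The escaped characters above can be tried out with a short sketch like the following. It is written with Python's re module as a stand-in, so small differences from the Microsoft engine are possible (for example, Python's \w also accepts non-Latin letters unless re.ASCII is set). All sample strings are invented for the illustration.

```python
import re

# \b marks a word boundary, so only the standalone word matches.
words = re.findall(r"\bsome\b", "some something handsome")

# \d matches one digit; \d+ matches a run of digits.
numbers = re.findall(r"\d+", "Order 66 shipped 3 items")

# \b\w+\b matches whole words; \w covers letters, digits and underscore.
whole_words = re.findall(r"\b\w+\b", "mass replace_2 tool")

# \s covers spaces and tabs, so \s+ can collapse mixed runs of them.
collapsed = re.sub(r"\s+", " ", "a \t b")

# \r\n is the two-character line break; here it is reduced to \n.
converted = re.sub(r"\r\n", "\n", "one\r\ntwo\r\n")

# \W is the negated class of \w.
non_word = re.findall(r"\W", "a-b c")

print(words, numbers, whole_words, repr(collapsed), repr(converted), non_word)
```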

About Windows wildcard

Most users are familiar with wildcards in Windows or MS Office, especially in MS Word: "*" matches any number of characters, and "?" matches any single character. In Regular Expressions, however, these two symbols have other meanings, so you have to use standard Regular Expression symbols or expressions to express your intention.

How to express the "*" of Windows wildcards
Use ".+" to match one or more characters. If you also want to match nothing, use ".*". Because "." does not match a line feed, neither form crosses a line break.
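
The difference between ".+" and ".*" can be seen in a small sketch. Python's re module is used here only as an illustration, and the file names are hypothetical.

```python
import re

names = ["report", "report2024", "report_final.txt"]

# "report.*" behaves like the wildcard "report*": the tail may be empty.
star_like = [n for n in names if re.fullmatch(r"report.*", n)]

# "report.+" demands at least one character after "report".
plus_like = [n for n in names if re.fullmatch(r"report.+", n)]

print(star_like)
print(plus_like)
```

The bare name "report" is accepted by ".*" but rejected by ".+", which is the only difference between the two forms.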

How to express the "?" of Windows wildcards
Use "." to match any single character. If you want to match only a Latin letter (no symbol or digit), use "[A-Za-z]"; "\w" is close, but it also accepts digits and the underscore.
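
As a rough sketch, the two wildcard translations could be combined in a small helper. The function name wildcard_to_regex and the sample file names are hypothetical, and the code is Python rather than anything this tool provides. It assumes that "*" should also match nothing and that "?" stands for exactly one character.

```python
import re

def wildcard_to_regex(pattern: str) -> str:
    """Translate a Windows-style wildcard into a regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # "*" may also match nothing
        elif ch == "?":
            parts.append(".")            # "?" stands for exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is taken literally
    return "".join(parts)

regex = wildcard_to_regex("file?.t*")
candidates = ["file1.txt", "fileA.tmp", "file.txt", "file12.txt"]
matches = [n for n in candidates if re.fullmatch(regex, n)]

print(regex)
print(matches)
```

Escaping the literal characters matters: the "." in "file?.t*" has to become "\." so that it is not treated as the "any character" metacharacter.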

For more references and examples, please visit our website.