The Generator
The generator is exported as the regexGen() function; everything else must be referenced from it.
To generate a regular expression, pass sub-expressions as parameters to the regexGen() function. The sub-expressions are then concatenated to form the whole regular expression.
A sub-expression can be a string, a number, a RegExp object, or any combination of calls to the methods (i.e., the sub-generators) of the regexGen() function object.
Strings passed to the regexGen(), text(), maybe(), anyCharOf() and anyCharBut() functions are always escaped as necessary, so you don't have to worry about which characters to escape.
The result of calling the regexGen() function is a RegExp object. See The RegExp Object section for details.
Since everything must be referenced from the regexGen() function, it is preferable to assign it to a short variable to simplify the code.
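To illustrate what "escaped as necessary" means for string sub-expressions, here is a minimal sketch in plain JavaScript. The helper name escapeLiteral is hypothetical and is not part of regexGen; the generator's actual escaping rules may differ.

```javascript
// Hypothetical helper, not part of regexGen: escape the characters that have
// a special meaning in a regular expression so the text matches literally.
function escapeLiteral(text) {
    return String(text).replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a pattern that matches the string 'a.b*c' and nothing else.
var literal = new RegExp('^' + escapeLiteral('a.b*c') + '$');
var matchesItself = literal.test('a.b*c');
var matchesOther = literal.test('aXbbc');
```

Without escaping, the dot and the asterisk would act as metacharacters and the second input would match as well.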
- Source:
Example
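// Assign the generator to a short variable.
var _ = regexGen;
// Sample input for illustration; any URL string will do.
var url = 'https://example.com:8080/index.html';
var regex = regexGen(
    _.startOfLine(),
    _.capture( 'http', _.maybe( 's' ) ), '://',
    _.capture( _.anyCharBut( ':/' ).repeat() ),
    _.group( ':', _.capture( _.digital().multiple(2,4) ) ).maybe(), '/',
    _.capture( _.anything() ),
    _.endOfLine()
);
var matches = regex.exec( url );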
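For comparison, the following sketch shows a hand-written regular expression with the same intent as the example above. It is an assumption about what the generator would produce, not its verified output, and the sample URL is made up for illustration.

```javascript
// Hand-written counterpart of the example (assumed, not generator output):
// scheme, host, optional 2-4 digit port, and the rest of the path.
var urlPattern = /^(https?):\/\/([^:\/]+)(?::(\d{2,4}))?\/(.*)$/;

var sampleUrl = 'https://example.com:8080/docs/index.html';
var urlMatches = urlPattern.exec(sampleUrl);

// Captures are numbered by the order of their opening parentheses.
var scheme = urlMatches[1];
var host = urlMatches[2];
var port = urlMatches[3];
var path = urlMatches[4];
```

The port group is non-capturing, so only the digits inside it occupy a capture index.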
Methods
-
(static) any() → {Term}
-
- Source:
Returns:
- Type
- Term
-
(static) anyChar() → {Term}
-
Matches any single character except the newline character (.)
- Source:
Returns:
- Type
- Term
-
(static) anyCharBut() → {Term}
-
Matches any single character except the given ones: ([^abc]). A two-element array denotes a character range. Usage: anyCharBut( [ 'a', 'c' ], [ '2', '6' ], 'fgh', 'z' ) generates ([^a-c2-6fghz])
- Source:
Returns:
- Type
- Term
-
(static) anyCharOf() → {Term}
-
Matches any single one of the given characters: ([abc]). A two-element array denotes a character range. Usage: anyCharOf( [ 'a', 'c' ], [ '2', '6' ], 'fgh', 'z' ) generates ([a-c2-6fghz])
- Source:
Returns:
- Type
- Term
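The character classes shown in the usage notes of anyCharBut() and anyCharOf() behave as follows when written as native regular expressions. This sketch uses only built-in RegExp literals and does not call the generator.

```javascript
// The class from the anyCharOf() usage note and its negated form.
var inSet = /[a-c2-6fghz]/;
var notInSet = /[^a-c2-6fghz]/;

var letterInRange = inSet.test('b');      // 'b' lies in the range a-c
var digitInRange = inSet.test('4');       // '4' lies in the range 2-6
var letterOutside = inSet.test('d');      // 'd' is not listed
var negatedOutside = notInSet.test('d');  // the negated class accepts 'd'
var negatedInside = notInSet.test('z');   // and rejects the listed 'z'
```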
-
(static) anything() → {Term}
-
Matches any sequence of characters except the newline character: (.*)
- Source:
Returns:
- Type
- Term
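Because the dot does not match a newline, a (.*) pattern stops at the end of the current line. A short sketch with native regular expressions:

```javascript
var twoLines = 'first line\nsecond line';

// .* consumes characters only up to the first newline.
var firstLine = /.*/.exec(twoLines)[0];

// Anchored at both ends, .* cannot span the newline, so the test fails.
var spansLines = /^.*$/.test(twoLines);
```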
-
(static) ascii() → {Term}
-
Matches the character with the code hh (two hexadecimal digits): (\xhh)
- Source:
Returns:
- Type
- Term
-
(static) backspace() → {Term}
-
Matches a backspace (U+0008): ([\b]). In a regular expression, the square brackets are required to match a literal backspace character. (Not to be confused with \b, the word boundary.)
- Source:
Returns:
- Type
- Term
-
(static) capture() → {Capture}
-
Matches specified terms and remembers the match. The generated parentheses are called capturing parentheses. A label is used by back references to look up the index of a capture. Indices are assigned from left to right, in the order in which the opening parentheses appear, i.e., in depth-first order. Example: capture( label('cap1'), capture( label('cap2'), 'xxx' ), capture( label('cap3'), '...' ), 'something else' )
- Source:
Returns:
- Type
- Capture
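The numbering rule for nested captures can be checked with a native regular expression. The pattern below is a hand-written approximation of the nested capture() example, assuming the string '...' is escaped to literal dots; it is not the generator's verified output.

```javascript
// Opening parentheses from left to right: group 1 (outer), 2 ('xxx'), 3 ('...').
var nested = /((xxx)(\.\.\.)something else)/;
var nestedMatch = nested.exec('xxx...something else');

var cap1 = nestedMatch[1];  // the whole outer capture
var cap2 = nestedMatch[2];  // first inner capture
var cap3 = nestedMatch[3];  // second inner capture
```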
-
(static) carriageReturn() → {Term}
-
Matches a carriage return: (\r)
- Source:
Returns:
- Type
- Term
-
(static) controlChar() → {Term}
-
Matches a control character in a string: (\cX), where X is a character ranging from A to Z.
- Source:
Returns:
- Type
- Term
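As a quick check of the control-character notation, the native \cX escape maps the letters A to Z to the code points U+0001 to U+001A, so Ctrl-I is a tab and Ctrl-J is a line feed:

```javascript
var matchesTab = /\cI/.test('\t');        // Ctrl-I is U+0009
var matchesLineFeed = /\cJ/.test('\n');   // Ctrl-J is U+000A
var crMatchesLf = /\cM/.test('\n');       // Ctrl-M is U+000D, not a line feed
```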
-
(static) digital() → {Term}
-
Matches a digit character: (\d)
- Source:
Returns:
- Type
- Term
-
(static) either() → {Sequence}
-
Adds alternative expressions, any one of which may match: (a|b)
- Source:
Returns:
- Type
- Sequence
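Alternation in a native regular expression looks like the sketch below. The non-capturing group is an assumption about how the alternatives would be delimited; it keeps the anchors applying to every alternative.

```javascript
// Without the group, ^ would bind only to 'cat' and $ only to 'dog'.
var pet = /^(?:cat|dog)$/;

var isCat = pet.test('cat');
var isDog = pet.test('dog');
var isCow = pet.test('cow');
```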
-
(static) endOfLine() → {Term}
-
- Source:
Returns:
- Type
- Term
-
(static) formFeed() → {Term}
-
Matches a form feed: (\f)
- Source:
Returns:
- Type
- Term
-
(static) group() → {Sequence}
-
Matches specified terms but does not remember the match. The generated parentheses are called non-capturing parentheses.
- Source:
Returns:
- Type
- Sequence
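The difference between capturing and non-capturing parentheses shows up in the match result. A minimal sketch with native regular expressions:

```javascript
// (?:ab)+ groups without remembering; (c) is the only capture.
var groupMatch = /(?:ab)+(c)/.exec('ababc');

var wholeMatch = groupMatch[0];
var onlyCapture = groupMatch[1];
var resultLength = groupMatch.length;  // whole match plus one capture
```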
-
(static) hexDigital() → {Term}
-
- Source:
Returns:
- Type
- Term
-
(static) ignoreCase()
-
Case-insensitivity modifier.
- Source:
-
(static) label() → {Label}
-
A label is a reference to a capture group and is allowed only inside a call to the capture() method.
- Source:
Returns:
- Type
- Label
-
(static) lineBreak() → {Term}
-
Matches any line break, including Unix (LF) and Windows (CRLF) line endings.
- Source:
Returns:
- Type
- Term
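One common way to match any line break in a native regular expression is sketched below. The exact pattern that lineBreak() generates is not stated here, so treat this alternation as an assumption.

```javascript
// Try CRLF first so a Windows line ending counts as a single break.
var anyLineBreak = /\r\n|\r|\n/;

var lines = 'alpha\r\nbeta\ngamma'.split(anyLineBreak);
```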
-
(static) lineFeed() → {Term}
-
Matches a line feed: (\n)
- Source:
Returns:
- Type
- Term
-
(static) many() → {Term}
-
Matches one or more occurrences: (x+)
- Source:
Returns:
- Type
- Term
-
(static) maybe() → {Term}
-
Matches an optional character sequence: ((?:abc)?). Shortcut for Term.maybe.
- Source:
Returns:
- Type
- Term
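The two quantifier forms shown for many() and maybe() behave as follows when written as native regular expressions:

```javascript
// (?:abc)? makes the whole sequence optional, not just the last character.
var optionalPrefix = /^(?:abc)?d$/;
var withoutPrefix = optionalPrefix.test('d');
var withPrefix = optionalPrefix.test('abcd');
var partialPrefix = optionalPrefix.test('abd');

// x+ requires at least one occurrence.
var oneOrMore = /^x+$/;
var threeTimes = oneOrMore.test('xxx');
var zeroTimes = oneOrMore.test('');
```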
-
(static) mixin(global)
-
A utility function that makes the regexGen generator easier to use by injecting its sub-generators into a target object.
Parameters:
Name Type Description
global Object The target object that the sub-generators will be injected into.
- Source:
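The idea of injecting sub-generators into a target object can be sketched as copying function-valued properties from one object to another. Everything in this block is hypothetical (mixinSketch, fakeGenerator and its members); it is not the implementation of mixin().

```javascript
// Hypothetical stand-in for the generator object; not the real regexGen.
var fakeGenerator = {
    digital: function () { return '\\d'; },
    version: 1
};

// Copy only the function-valued properties onto the target object.
function mixinSketch(source, target) {
    Object.keys(source).forEach(function (name) {
        if (typeof source[name] === 'function') {
            target[name] = source[name];
        }
    });
    return target;
}

var scope = mixinSketch(fakeGenerator, {});
```

After the call, scope.digital can be used without the fakeGenerator prefix, which is the convenience a mixin of this kind aims for.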
-
(static) nonDigital() → {Term}
-
Matches any non-digit character: (\D)
- Source:
Returns:
- Type
- Term
-
(static) nonSpace() → {Term}
-
Matches a single character other than white space: (\S)
- Source:
Returns:
- Type
- Term
-
(static) nonWord() → {Term}
-
Matches any non-word character: (\W)
- Source:
Returns:
- Type
- Term
-
(static) nonWordBoundary() → {Term}
-
Matches a non-word boundary: (\B). This matches a position where the previous and next characters are of the same type: either both must be word characters, or both must be non-word characters. The beginning and end of a string are considered non-word characters.
- Source:
Returns:
the non-word boundary expression term object.
- Type
- Term
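A short sketch of the non-word boundary with the native \B assertion: it matches "cat" only when the preceding character is also a word character.

```javascript
// In 'concat' the 'cat' follows 'n', so there is no boundary before it.
// The standalone 'cat' follows a space, which is a boundary, so \B fails.
var replaced = 'concat cat'.replace(/\Bcat/g, 'X');

// At the start of the string there is a boundary before 'c'.
var atStart = /\Bcat/.test('cat');
```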
-
(static) nullChar() → {Term}
-
Matches a NULL (U+0000) character: (\0). Do not follow this with another digit, because \0 followed by digits is an octal escape sequence.
- Source:
Returns:
- Type
- Term
-
(static) regex() → {Term|RegexOverwrite}
-
Inserts the given value into the expression as is, without escaping.
- Source:
Returns:
- Type
- Term | RegexOverwrite
-
(static) sameAs() → {CaptureReference}
-
Creates a back reference to a previous capture.
- Source:
Returns:
- Type
- CaptureReference
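A back reference matches the same text that an earlier capture matched. The sketch below uses the native numbered form (\1); how sameAs() maps labels to indices is not shown here.

```javascript
// Group 1 remembers the opening quote; \1 requires the same quote to close.
var quoted = /(['"])(.*?)\1/;
var quotedMatch = quoted.exec('say "hi" now');

var quoteChar = quotedMatch[1];
var quotedText = quotedMatch[2];

// Mismatched quotes do not satisfy the back reference.
var mismatched = /^(['"]).*\1$/.test('"abc\'');
```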
-
(static) searchAll()
-
Global search modifier ("g"). The default behaviour is with the "g" modifier, so this modifier works the other way around from the other modifiers.
- Source:
-
(static) searchMultiLine()
-
Multiline modifier ("m").
- Source:
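The three modifiers correspond to the native RegExp flags i, g and m. The sketch below shows the effect of each flag on built-in regular expressions; it does not call the generator.

```javascript
// i: letter case is ignored.
var caseInsensitive = /hello/i.test('HELLO');

// g: match() returns every match instead of only the first.
var allDigits = 'a1b2c3'.match(/\d/g);
var firstDigit = 'a1b2c3'.match(/\d/)[0];

// m: ^ and $ also match at line breaks inside the input.
var perLine = /^b$/m.test('a\nb');
var wholeInput = /^b$/.test('a\nb');
```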
-
(static) space() → {Term}
-
Matches a single white space character, including space, tab, form feed, line feed: (\s)
- Source:
Returns:
- Type
- Term
-
(static) startOfLine() → {Term}
-
- Source:
Returns:
- Type
- Term
-
(static) tab() → {Term}
-
Matches a tab (U+0009): (\t)
- Source:
Returns:
- Type
- Term
-
(static) text(value) → {Term}
-
Matches the given character sequence: (abc).
Parameters:
Name Type Description
value String The character sequence.
- Source:
Returns:
the text literal expression term object.
- Type
- Term
-
(static) unicode() → {Term}
-
Matches the character with the code hhhh (four hexadecimal digits): (\uhhhh)
- Source:
Returns:
- Type
- Term
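The two-digit and four-digit hexadecimal escapes described for ascii() and unicode() work like this in native regular expressions:

```javascript
// \x41 is the letter 'A' (two hexadecimal digits).
var matchesA = /\x41/.test('A');

// \u20AC is the euro sign (four hexadecimal digits).
var matchesEuro = /\u20AC/.test('\u20AC');
var euroMatchesA = /\u20AC/.test('A');
```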
-
(static) vertTab() → {Term}
-
Matches a vertical tab (U+000B): (\v)
- Source:
Returns:
- Type
- Term
-
(static) word() → {Term}
-
Matches any alphanumeric character including the underscore: (\w)
- Source:
Returns:
- Type
- Term
-
(static) wordBoundary() → {Term}
-
Matches a word boundary: (\b). A word boundary matches the position where a word character is not followed or preceded by another word character. Note that a matched word boundary is not included in the match. In other words, the length of a matched word boundary is zero. (Not to be confused with [\b], the backspace.)
- Source:
Returns:
the word boundary expression term object.
- Type
- Term
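Because a word boundary has zero length, it restricts where a match may start or end without consuming any characters. A sketch with the native \b assertion:

```javascript
// Only the standalone word matches; the 'cat' inside 'concat' does not.
var wholeWords = 'cat concat'.match(/\bcat\b/g);

// The boundary itself adds nothing to the matched text.
var matchedText = /\bcat\b/.exec('a cat.')[0];
```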
-
(static) words() → {Term}
-
Matches any alphanumeric character sequence including the underscore: (\w+)
- Source:
Returns:
- Type
- Term
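As a final illustration, the (\w+) pattern splits text into runs of letters, digits and underscores; any other character, such as a space or hyphen, ends a run.

```javascript
// The underscore and the digit belong to the first run; the hyphen does not.
var runs = 'foo_1 bar-baz'.match(/\w+/g);
```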