Hybridchill

NOTE: This project has been superseded by cfRegex, which provides more functionality and far better documentation.

Java RegEx Utilities

Java RegEx Utilities is a CFML component offering various functions which utilise Java's powerful regular expression features to provide functionality not available with standard CF (lookbehinds, function pointers, and so on).
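As an illustration of the kind of feature meant here, the following is a minimal sketch of a lookbehind written directly against java.util.regex, the engine the component draws on. It is plain Java rather than CFML, and the class name, method name and sample text are hypothetical, not part of jre-utils.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookbehindSketch {
    // Returns the first run of digits that is immediately preceded by "USD ",
    // or an empty string when there is no such run.
    static String amountAfterUsd(String text) {
        // (?<=USD ) is a lookbehind: it must match just before the digits,
        // but is not included in the returned match.
        Matcher m = Pattern.compile("(?<=USD )\\d+").matcher(text);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(amountAfterUsd("Total: USD 250, EUR 300")); // prints 250
    }
}
```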

Please send any questions about Java RegEx Utilities by email to jreutils_project@hybridchill.com

You can download Java RegEx Utilities from the Hybridchill Project Download page or the jre-utils RIAForge project page.

Functions

The latest v0.7 release of jre-utils supports the following functions: init, get, getFirst, getGroups, match, matchFirst, matchGroups, matches, replace, split.

The get* and match* groups are identical functions with different argument orders. The get* functions take Text then Regex, consistent with rereplace, whilst the match* functions take Regex then Text, the order used by rematch.

Most functions also have a *NoCase equivalent; for those that do not, use the (?i) embedded expression at the start of the regex.


init

Initialises the component, sets up configuration options.

ARGUMENTS
DefaultFlags
Optional, String list, default 'MULTILINE'
Any flags that apply by default.
IgnoreInvalidFlags
Optional, Boolean, default false
If true, ignores invalid flags, instead of throwing an error.
BackslashReferences
Optional, Boolean, default false
Allows you to use \1 instead of $1 in backreferences.
SetNullGroupsBlank
Optional, Boolean, default true
When returning backreferences, groups with no value are set to empty string if true, or left as null if false.
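How the component parses its flag lists is not shown in this document. As a rough sketch only, a comma-separated flag list with an IgnoreInvalidFlags-style switch could be turned into the bitmask that java.util.regex.Pattern expects as follows. This is plain Java, and the class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Pattern;

class FlagListSketch {
    // Converts a list such as "MULTILINE,DOTALL" into a Pattern flag bitmask.
    // Unknown names throw unless ignoreInvalid is true, in which case they are skipped.
    static int parseFlags(String flagList, boolean ignoreInvalid) {
        int bits = 0;
        for (String raw : flagList.split(",")) {
            String name = raw.trim().toUpperCase(Locale.ROOT);
            if (name.isEmpty()) continue; // tolerate empty list items
            switch (name) {
                case "UNIX_LINES":       bits |= Pattern.UNIX_LINES; break;
                case "CASE_INSENSITIVE": bits |= Pattern.CASE_INSENSITIVE; break;
                case "UNICODE_CASE":     bits |= Pattern.UNICODE_CASE; break;
                case "COMMENTS":         bits |= Pattern.COMMENTS; break;
                case "MULTILINE":        bits |= Pattern.MULTILINE; break;
                case "DOTALL":           bits |= Pattern.DOTALL; break;
                case "CANON_EQ":         bits |= Pattern.CANON_EQ; break;
                default:
                    if (!ignoreInvalid) {
                        throw new IllegalArgumentException("Invalid flag: " + name);
                    }
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        int flags = parseFlags("MULTILINE, DOTALL", false);
        System.out.println(flags == (Pattern.MULTILINE | Pattern.DOTALL)); // prints true
    }
}
```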

get

Returns an array of matches to the supplied RegEx.

ARGUMENTS
Text
String
Text to look for matches in.
Regex
String regular expression
Expression used to find match.
Flags
Optional, String list, default DefaultFlags
Any flags to apply to matching.
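The behaviour described for get corresponds to looping over Matcher.find() in Java. The sketch below is plain Java under that assumption, not the component's source. The names are hypothetical, and the flags are passed as a Pattern bitmask rather than a string list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GetSketch {
    // Collects every match of regex in text, in order of appearance.
    static List<String> get(String text, String regex, int flags) {
        List<String> results = new ArrayList<>();
        Matcher m = Pattern.compile(regex, flags).matcher(text);
        while (m.find()) {
            results.add(m.group());
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(get("a1 b22 c333", "\\d+", Pattern.MULTILINE)); // prints [1, 22, 333]
    }
}
```

Passing Pattern.CASE_INSENSITIVE as the flag gives the effect described for the NoCase variants.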

getNoCase

Exactly the same as get, but with CASE_INSENSITIVE flag enabled.


getFirst

This is a shortcut for get that returns only the first match, or an empty string if there are no matches.


getFirstNoCase

Exactly the same as getFirst, but with CASE_INSENSITIVE flag enabled.


getGroups

Returns an array of structs, one per match. Each struct contains a string 'match' holding the matched text, and an array 'groups' holding that match's backreferences.

ARGUMENTS
Text
String
Text to look for matches in.
Regex
String regular expression
Expression used to find match.
SetNullGroupsBlank
Optional, Boolean, default true (see the SetNullGroupsBlank option of init)
If true, groups with no values are set to blank, otherwise they are null.
Flags
Optional, String list, default DefaultFlags
Any flags to apply to matching.
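The return shape described for getGroups can be pictured with the following plain-Java sketch. It is an assumption about how such a result might be assembled with java.util.regex, not the component's code. The names are hypothetical, and a Map stands in for the CFML struct.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GetGroupsSketch {
    // For each match, records the full match text and the list of captured groups.
    // Groups that took no part in the match are null in Java; they become ""
    // when setNullGroupsBlank is true.
    static List<Map<String, Object>> getGroups(String text, String regex, boolean setNullGroupsBlank) {
        List<Map<String, Object>> results = new ArrayList<>();
        Matcher m = Pattern.compile(regex, Pattern.MULTILINE).matcher(text);
        while (m.find()) {
            List<String> groups = new ArrayList<>();
            for (int i = 1; i <= m.groupCount(); i++) {
                String g = m.group(i);
                groups.add(g == null && setNullGroupsBlank ? "" : g);
            }
            Map<String, Object> entry = new HashMap<>();
            entry.put("match", m.group());
            entry.put("groups", groups);
            results.add(entry);
        }
        return results;
    }

    public static void main(String[] args) {
        // The second group is optional, so it is unset for the match "b".
        for (Map<String, Object> entry : getGroups("a=1;b", "(\\w)(?:=(\\d))?", true)) {
            System.out.println(entry.get("match") + " -> " + entry.get("groups"));
        }
    }
}
```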

match

This is an alias of get, with the argument order changed to be consistent with the CFML rematch function.


matchNoCase

Exactly the same as match, but with CASE_INSENSITIVE flag enabled.


matchFirst

This is an alias of getFirst, with the argument order changed to be consistent with the CFML rematch function.


matchFirstNoCase

Exactly the same as matchFirst, but with CASE_INSENSITIVE flag enabled.


matchGroups

This is an alias of getGroups, with the argument order changed to be consistent with the CFML rematch function.


matchGroupsNoCase

Exactly the same as matchGroups, but with CASE_INSENSITIVE flag enabled.


matches

Returns true if the RegEx produced any matches, false otherwise.

ARGUMENTS
Text
String
Text to look for matches in.
Regex
String regular expression
Expression used to find match.
Flags
Optional, String list, default DefaultFlags
Any flags to apply to matching.
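Note that "produced any matches" is not the same test as Java's own String.matches, which requires the whole string to match. A minimal plain-Java sketch of the difference, assuming the function behaves like Matcher.find() (the names are hypothetical):

```java
import java.util.regex.Pattern;

class MatchesSketch {
    // True if regex matches anywhere in text, not only the whole string.
    static boolean matchesAnywhere(String text, String regex, int flags) {
        return Pattern.compile(regex, flags).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesAnywhere("order 42", "\\d+", 0)); // prints true
        System.out.println("order 42".matches("\\d+"));             // prints false (whole-string test)
    }
}
```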

matchesNoCase

Exactly the same as matches, but with CASE_INSENSITIVE flag enabled.


replace

Replaces matches to the regex with Replacement (text or function).

NOTE: For simple replacing, consider Text.replaceFirst(Regex,Replacement) or Text.replaceAll(Regex,Replacement) instead.

ARGUMENTS
Text
String
Text to use as replacement source.
Regex
String regular expression
Expression used to match replacements.
Replacement
String or UDF
String to replace each match with;
or
User-Defined Function to apply to each match. (See Callback Example for details.)
Scope
Optional, String, default 'ONE'
Determines whether only the first match (ONE) or every match (ALL) is replaced.
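For a plain string Replacement, the two Scope values line up with Matcher.replaceFirst and Matcher.replaceAll in Java. The following plain-Java sketch assumes that mapping. The names are hypothetical, and the replacement uses Java's $1 style of group reference.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceScopeSketch {
    // Replaces the first match when scope is ONE (the default here),
    // or every match when scope is ALL.
    static String replace(String text, String regex, String replacement, String scope) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return "ALL".equalsIgnoreCase(scope)
                ? m.replaceAll(replacement)
                : m.replaceFirst(replacement);
    }

    public static void main(String[] args) {
        System.out.println(replace("a-b-c", "-", "+", "ONE")); // prints a+b-c
        System.out.println(replace("a-b-c", "-", "+", "ALL")); // prints a+b+c
        System.out.println(replace("john smith", "(\\w+) (\\w+)", "$2 $1", "ONE")); // prints smith john
    }
}
```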

split

Splits the supplied text into an array using the supplied regex as delimiter.

NOTE: If you're not using the flags, it's faster to do Text.split(Regex) instead.

ARGUMENTS
Text
String
Input text to split into parts.
Regex
String regular expression
Expression used to split text.
Flags
Optional, String list, default DefaultFlags
Any flags to apply to matching.
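Text.split(Regex) accepts no flags, which is where a flag-aware split earns its keep. A plain-Java sketch of the idea, assuming it compiles the pattern with the flags and calls Pattern.split (the names are hypothetical):

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitSketch {
    // Splits text around matches of regex, with the given Pattern flags applied.
    static String[] split(String text, String regex, int flags) {
        return Pattern.compile(regex, flags).split(text);
    }

    public static void main(String[] args) {
        // Case-sensitive: only the lower-case x is a delimiter.
        System.out.println(Arrays.toString(split("oneXtwoxthree", "x", 0)));
        // Case-insensitive: both X and x are delimiters.
        System.out.println(Arrays.toString(split("oneXtwoxthree", "x", Pattern.CASE_INSENSITIVE)));
    }
}
```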

splitNoCase

Exactly the same as split, but with CASE_INSENSITIVE flag enabled.


Flags

The following flags can be used for the DefaultFlags option of init or the Flags argument of the individual functions:

UNIX_LINES
Only \n is recognised as a line terminator (\r is not)
Equivalent to (?d) embedded expression.
CASE_INSENSITIVE
Enables case-insensitive matching, by default only for the US-ASCII charset. (See UNICODE_CASE below)
Equivalent to (?i) embedded expression.
UNICODE_CASE
Allows CASE_INSENSITIVE to also work for Unicode.
Equivalent to (?u) embedded expression.
COMMENTS
Permits whitespace and single-line comments in the regex pattern.
Equivalent to (?x) embedded expression.
MULTILINE
Enables multiline mode, so ^ and $ also match at the start and end of each line. By default ^ and $ match only at the start and end of the entire string.
Equivalent to (?m) embedded expression.
DOTALL
Enables dotall mode, so the . also matches line terminators. By default . excludes line terminators.
Equivalent to (?s) embedded expression.
CANON_EQ
Enables canonical equivalence. When this flag is specified, two characters are considered to match only if their full canonical decompositions match.
There is no embedded expression for this flag.
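The effect of several of these flags can be checked directly against java.util.regex. The sketch below is plain Java with hypothetical names and is not tied to the component. It counts matches with and without a flag, and shows that the embedded expression gives the same result as the flag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagBehaviourSketch {
    // Counts how many times regex matches in text under the given flags.
    static int count(String text, String regex, int flags) {
        Matcher m = Pattern.compile(regex, flags).matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String lines = "one\ntwo\nthree";
        System.out.println(count(lines, "^\\w+$", 0));                 // prints 0
        System.out.println(count(lines, "^\\w+$", Pattern.MULTILINE)); // prints 3
        System.out.println(count(lines, "(?m)^\\w+$", 0));             // prints 3

        System.out.println(count("a\nb", "a.b", 0));              // prints 0
        System.out.println(count("a\nb", "a.b", Pattern.DOTALL)); // prints 1

        // CASE_INSENSITIVE alone covers US-ASCII only; UNICODE_CASE extends it.
        int both = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
        System.out.println(count("\u00c9", "\u00e9", Pattern.CASE_INSENSITIVE)); // prints 0
        System.out.println(count("\u00c9", "\u00e9", both));                     // prints 1
    }
}
```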

Callback Function

The replace() function supports a callback function for the replacement argument. This function should return a String and accept two arguments: a String named Match, and an optional Array named Groups.

See this example function that does nothing:

<cffunction name="CallbackFunction" returntype="String" output="false">
    <cfargument name="Match" type="String"/>
    <cfargument name="Groups" type="Array" default="#ArrayNew(1)#"/>

    <!---
        NOTE:
        This example function does nothing.
        It returns each match unchanged.
    --->

    <cfreturn Arguments.Match/>
</cffunction>

You must use a User-Defined Function for the callback function. You cannot use built-in functions directly. However, within your UDF you can put whatever CFML logic you like, as long as it returns a string.
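For readers curious how callback replacement can be built on java.util.regex, the usual technique is a find loop with Matcher.appendReplacement and Matcher.appendTail. The plain-Java sketch below uses that technique, with a BiFunction standing in for the CFML UDF. It is an assumption about the general approach, not the component's source, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CallbackReplaceSketch {
    // Replaces every match with whatever the callback returns for it.
    // The callback receives the matched text and the list of captured groups.
    static String replaceAll(String text, String regex,
                             BiFunction<String, List<String>, String> callback) {
        Matcher m = Pattern.compile(regex).matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            List<String> groups = new ArrayList<>();
            for (int i = 1; i <= m.groupCount(); i++) {
                groups.add(m.group(i));
            }
            String replacement = callback.apply(m.group(), groups);
            // quoteReplacement stops $ and \ in the callback's result
            // from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Doubles every number found in the text.
        String doubled = replaceAll("a1 b22", "\\d+",
                (match, groups) -> String.valueOf(Integer.parseInt(match) * 2));
        System.out.println(doubled); // prints a2 b44
    }
}
```

A callback that returns its Match argument unchanged, like the CFML example above, leaves the text as it was.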