Java RegEx Utilities
Java RegEx Utilities is a CFML component offering various functions which utilise Java's powerful regular expression features to provide functionality not available with standard CF (lookbehinds, function pointers, and so on).
Please send any questions or queries relating to Java RegEx Utilities via email to jreutils_project [at] hybridchill.com.
You can download Java RegEx Utilities from the Hybridchill Project Download page or the jre-utils RIAForge project page.
Functions
The latest v0.7 release of jre-utils supports the following functions: init, get, getFirst, getGroups, match, matchFirst, matchGroups, matches, replace, split.
The get* and match* functions are identical apart from their argument order. The get* functions follow the argument order of rereplace, whilst the match* functions follow the order of rematch.
Most functions also have a *NoCase equivalent - for those that do not, use the (?i) embedded expression.
init
Initialises the component and sets up configuration options.
ARGUMENTS
- DefaultFlags
- Optional, String list, default 'MULTILINE'
- Any flags that apply by default.
- IgnoreInvalidFlags
- Optional, Boolean, default false
- If true, ignores invalid flags, instead of throwing an error.
- BackslashReferences
- Optional, Boolean, default false
- Allows you to use \1 instead of $1 in backreferences.
- SetNullGroupsBlank
- Optional, Boolean, default true
- When returning backreferences, groups with no value are set to empty string if true, or left as null if false.
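The DefaultFlags and IgnoreInvalidFlags options can be pictured in plain Java terms. The following is only a sketch, assuming the component turns its flag list into a java.util.regex.Pattern bitmask; the class FlagListSketch and the helper parseFlags are hypothetical names and are not part of jre-utils.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

class FlagListSketch {
    // Flag names as listed in this document, mapped to java.util.regex.Pattern constants.
    static final Map<String, Integer> FLAGS = Map.of(
        "UNIX_LINES", Pattern.UNIX_LINES,
        "CASE_INSENSITIVE", Pattern.CASE_INSENSITIVE,
        "UNICODE_CASE", Pattern.UNICODE_CASE,
        "COMMENTS", Pattern.COMMENTS,
        "MULTILINE", Pattern.MULTILINE,
        "DOTALL", Pattern.DOTALL,
        "CANON_EQ", Pattern.CANON_EQ);

    // Hypothetical helper: combine a comma-separated flag list into one bitmask.
    // Unknown names raise an error unless ignoreInvalidFlags is true.
    static int parseFlags(String flagList, boolean ignoreInvalidFlags) {
        int mask = 0;
        for (String name : flagList.split(",")) {
            String key = name.trim().toUpperCase(Locale.ROOT);
            if (key.isEmpty()) continue; // tolerate empty list items
            Integer flag = FLAGS.get(key);
            if (flag != null) {
                mask |= flag;
            } else if (!ignoreInvalidFlags) {
                throw new IllegalArgumentException("Invalid flag: " + name.trim());
            }
        }
        return mask;
    }

    public static void main(String[] args) {
        System.out.println(parseFlags("MULTILINE", false));           // 8
        System.out.println(parseFlags("MULTILINE,DOTALL", false));    // 40
        System.out.println(parseFlags("MULTILINE,NOT_A_FLAG", true)); // 8
    }
}
```

The resulting integer is what Pattern.compile(regex, flags) expects as its second argument.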
get
Returns an array of matches to the supplied RegEx.
ARGUMENTS
- Text
- String
- Text to look for matches in.
- Regex
- String regular expression
- Expression used to find match.
- Flags
- Optional, String list, default DefaultFlags
- Any flags to apply to matching.
getNoCase
Exactly the same as get, but with the CASE_INSENSITIVE flag enabled.
getFirst
A shortcut for get that returns only the first result, or an empty string if there are no matches.
getFirstNoCase
Exactly the same as getFirst, but with the CASE_INSENSITIVE flag enabled.
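For readers who want to see what the get family amounts to, here is a minimal plain-Java sketch. It assumes the functions behave like a Matcher.find() loop over the text; the class GetSketch and its method signatures are illustrative only and do not reproduce the component's CFML interface. The sample pattern uses a lookbehind, one of the features mentioned in the introduction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GetSketch {
    // Collect every non-overlapping match of regex in text, in order of appearance.
    static List<String> get(String text, String regex, int flags) {
        List<String> results = new ArrayList<>();
        Matcher m = Pattern.compile(regex, flags).matcher(text);
        while (m.find()) {
            results.add(m.group());
        }
        return results;
    }

    // First match only, or an empty string when nothing matches.
    static String getFirst(String text, String regex, int flags) {
        Matcher m = Pattern.compile(regex, flags).matcher(text);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        // Lookbehind: digits that directly follow a dollar sign.
        System.out.println(get("cost $15 and $200, id 7", "(?<=\\$)\\d+", 0)); // [15, 200]
        // The NoCase variants correspond to adding CASE_INSENSITIVE.
        System.out.println(get("Cat cat CAT", "cat", Pattern.CASE_INSENSITIVE)); // [Cat, cat, CAT]
        System.out.println(getFirst("no digits here", "\\d+", 0).isEmpty());     // true
    }
}
```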
getGroups
Returns an array of structs, one per match. Each struct contains a string 'match' holding the matched text and an array 'groups' holding the backreferences for that match.
ARGUMENTS
- Text
- String
- Text to look for matches in.
- Regex
- String regular expression
- Expression used to find match.
- SetNullGroupsBlank
- Optional, Boolean, default true (see the SetNullGroupsBlank option of init)
- If true, groups with no values are set to blank, otherwise they are null.
- Flags
- Optional, String list, default DefaultFlags
- Any flags to apply to matching.
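The effect of SetNullGroupsBlank is easiest to see with an optional group that does not take part in a match. The sketch below is plain Java under the assumption that getGroups wraps Matcher.group(n), which returns null for such groups; GroupsSketch and Result are hypothetical names, and the list here is zero-indexed whereas CFML arrays start at 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupsSketch {
    // One result: the full match plus its captured groups (group 1 is stored first).
    static final class Result {
        final String match;
        final List<String> groups;
        Result(String match, List<String> groups) {
            this.match = match;
            this.groups = groups;
        }
    }

    static List<Result> getGroups(String text, String regex, boolean setNullGroupsBlank, int flags) {
        List<Result> results = new ArrayList<>();
        Matcher m = Pattern.compile(regex, flags).matcher(text);
        while (m.find()) {
            List<String> groups = new ArrayList<>();
            for (int i = 1; i <= m.groupCount(); i++) {
                String g = m.group(i); // null when the group did not take part in the match
                groups.add(g == null && setNullGroupsBlank ? "" : g);
            }
            results.add(new Result(m.group(), groups));
        }
        return results;
    }

    public static void main(String[] args) {
        // The second group is optional, so it is unset for the match "b".
        for (Result r : getGroups("a1 b", "([a-z])(\\d)?", true, 0)) {
            System.out.println(r.match + " -> " + r.groups);
        }
    }
}
```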
match
An alias of get, with the argument order changed to be consistent with the CFML rematch function.
matchNoCase
Exactly the same as match, but with the CASE_INSENSITIVE flag enabled.
matchFirst
An alias of getFirst, with the argument order changed to be consistent with the CFML rematch function.
matchFirstNoCase
Exactly the same as matchFirst, but with the CASE_INSENSITIVE flag enabled.
matchGroups
An alias of getGroups, with the argument order changed to be consistent with the CFML rematch function.
matchGroupsNoCase
Exactly the same as matchGroups, but with the CASE_INSENSITIVE flag enabled.
matches
Returns true if the RegEx produced any matches, false otherwise.
ARGUMENTS
- Text
- String
- Text to look for matches in.
- Regex
- String regular expression
- Expression used to find match.
- Flags
- Optional, String list, default DefaultFlags
- Any flags to apply to matching.
matchesNoCase
Exactly the same as matches, but with the CASE_INSENSITIVE flag enabled.
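The description "produced any matches" suggests that a match anywhere in the text is enough. The plain-Java sketch below takes that reading as an assumption and contrasts it with Java's own String.matches, which requires the whole string to match; MatchesSketch is a hypothetical name.

```java
import java.util.regex.Pattern;

class MatchesSketch {
    // True if regex matches anywhere in text (a partial match is sufficient).
    static boolean matches(String text, String regex, int flags) {
        return Pattern.compile(regex, flags).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(matches("hello world", "wor", 0));                      // true
        System.out.println("hello world".matches("wor"));                          // false: whole-string match
        System.out.println(matches("Hello", "hello", 0));                          // false
        System.out.println(matches("Hello", "hello", Pattern.CASE_INSENSITIVE));   // true
    }
}
```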
replace
Replaces matches to the regex with Replacement (text or function).
NOTE: For simple replacing, consider Text.replaceFirst(Regex,Replacement) or Text.replaceAll(Regex,Replacement) instead.
ARGUMENTS
- Text
- String
- Text in which matches are replaced.
- Regex
- String regular expression
- Expression used to match replacements.
- Replacement
- String or UDF
- String to replace each match with;
- or
- User-Defined Function to apply to each match. (See Callback Example for details.)
- Scope
- Optional, String, default 'ONE'
- Determines whether only the first match (ONE) or every match (ALL) is replaced.
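The two Scope values line up with the Java String methods named in the note above. The following sketch assumes that correspondence for string replacements only; ReplaceScopeSketch is a hypothetical name, and the $1-style backreferences shown are those of java.util.regex.

```java
class ReplaceScopeSketch {
    // ONE replaces only the first match; ALL replaces every match.
    static String replace(String text, String regex, String replacement, String scope) {
        if ("ALL".equalsIgnoreCase(scope)) {
            return text.replaceAll(regex, replacement);
        }
        return text.replaceFirst(regex, replacement);
    }

    public static void main(String[] args) {
        System.out.println(replace("a-b-c", "-", "+", "ONE")); // a+b-c
        System.out.println(replace("a-b-c", "-", "+", "ALL")); // a+b+c
        // Backreferences reorder the captured groups.
        System.out.println(replace("2024-01-15", "(\\d+)-(\\d+)-(\\d+)", "$3/$2/$1", "ONE")); // 15/01/2024
    }
}
```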
split
Splits the supplied text into an array using the supplied regex as delimiter.
NOTE: If you're not using the flags, it's faster to do Text.split(Regex) instead.
ARGUMENTS
- Text
- String
- Input text to split into parts.
- Regex
- String regular expression
- Expression used to split text.
- Flags
- Optional, String list, default DefaultFlags
- Any flags to apply to matching.
splitNoCase
Exactly the same as split, but with the CASE_INSENSITIVE flag enabled.
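The difference between split with flags and the plain Text.split(Regex) call can be sketched in Java as follows. This assumes split compiles the pattern with the requested flags and calls Pattern.split; SplitSketch is a hypothetical name. Note that Java drops trailing empty strings from the result in both forms.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitSketch {
    // Split text on regex, honouring the given flags.
    static String[] split(String text, String regex, int flags) {
        return Pattern.compile(regex, flags).split(text);
    }

    public static void main(String[] args) {
        // Without flags, String.split does the same job directly.
        System.out.println(Arrays.toString("a, b;c".split("\\s*[,;]\\s*")));  // [a, b, c]
        // With CASE_INSENSITIVE (as splitNoCase would use), x and X both delimit.
        System.out.println(Arrays.toString(
            split("oneXtwoxthree", "x", Pattern.CASE_INSENSITIVE)));          // [one, two, three]
    }
}
```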
Flags
The following can be used for the DefaultFlags option of init or for the Flags argument of the functions above:
- UNIX_LINES
- Only \n is recognised as a line terminator (\r is not).
- Equivalent to (?d) embedded expression.
- CASE_INSENSITIVE
- Enables case-insensitive matching, by default only for the US-ASCII charset. (See UNICODE_CASE below)
- Equivalent to (?i) embedded expression.
- UNICODE_CASE
- Allows CASE_INSENSITIVE to also work for Unicode.
- Equivalent to (?u) embedded expression.
- COMMENTS
- Permits whitespace and single-line comments in the regex pattern.
- Equivalent to (?x) embedded expression.
- MULTILINE
- Enables multiline mode, so ^ and $ also match at the start and end of each line. By default ^ and $ match only at the start and end of the entire string.
- Equivalent to (?m) embedded expression.
- DOTALL
- Enables dotall mode, so the . also matches line terminators. By default . excludes line terminators.
- Equivalent to (?s) embedded expression.
- CANON_EQ
- Enables canonical equivalence. When this flag is specified, two characters are considered to match if their full canonical decompositions match.
- There is no equivalent embedded expression.
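The behaviour of several of these flags can be checked directly against java.util.regex, which is the engine the flags belong to. The sketch below is plain Java; FlagBehaviourSketch and its count helper are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagBehaviourSketch {
    // Count the non-overlapping matches of regex in text under the given flags.
    static int count(String regex, int flags, String text) {
        Matcher m = Pattern.compile(regex, flags).matcher(text);
        int n = 0;
        while (m.find()) n++;
        return n;
    }

    public static void main(String[] args) {
        String lines = "one\ntwo\nthree";
        System.out.println(count("^\\w+$", 0, lines));                 // 0: anchors apply to the whole string
        System.out.println(count("^\\w+$", Pattern.MULTILINE, lines)); // 3: one match per line

        System.out.println(count("a.b", 0, "a\nb"));                   // 0: . excludes line terminators
        System.out.println(count("a.b", Pattern.DOTALL, "a\nb"));      // 1

        // The embedded expression works without passing any flag.
        System.out.println(count("(?i)hello", 0, "HELLO Hello"));      // 2

        // COMMENTS: whitespace and #-comments inside the pattern are ignored.
        String commented = "\\d{4}  # year\n - \\d{2}  # month";
        System.out.println(count(commented, Pattern.COMMENTS, "2024-01")); // 1

        // CASE_INSENSITIVE alone covers US-ASCII only; UNICODE_CASE extends it.
        int ci = Pattern.CASE_INSENSITIVE;
        System.out.println(count("\u00e9", ci, "\u00c9"));                        // 0
        System.out.println(count("\u00e9", ci | Pattern.UNICODE_CASE, "\u00c9")); // 1
    }
}
```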
Callback Function
The replace() function supports a callback function for the replacement argument. This function should return a String and accept two arguments: a String named Match, and an optional Array named Groups.
The following example function returns each match unchanged:
<cffunction name="CallbackFunction" returntype="String" output="false">
    <cfargument name="Match" type="String"/>
    <cfargument name="Groups" type="Array" default="#ArrayNew(1)#"/>
    <!---
        NOTE:
        This example function does nothing.
        It returns each match unchanged.
    --->
    <cfreturn Arguments.Match/>
</cffunction>
You must use a User-Defined Function for the callback function. You cannot use built-in functions directly. However, within your UDF you can put whatever CFML logic you like, as long as it returns a string.
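To illustrate how a callback replacement of this kind can be built on java.util.regex, here is a minimal sketch using Matcher.appendReplacement. It is not the component's source: CallbackReplaceSketch, the BiFunction callback type and the boolean all parameter (standing in for Scope) are assumptions, as is the choice to treat the callback's result as literal text and to pass unset groups as empty strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CallbackReplaceSketch {
    // Replace the first match (all == false) or every match (all == true)
    // with whatever the callback returns for that match and its groups.
    static String replace(String text, String regex,
                          BiFunction<String, List<String>, String> callback, boolean all) {
        Matcher m = Pattern.compile(regex).matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            List<String> groups = new ArrayList<>();
            for (int i = 1; i <= m.groupCount(); i++) {
                groups.add(m.group(i) == null ? "" : m.group(i));
            }
            String result = callback.apply(m.group(), groups);
            // quoteReplacement stops $ and \ in the result being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(result));
            if (!all) break;
        }
        m.appendTail(out); // copy the text after the last replaced match
        return out.toString();
    }

    public static void main(String[] args) {
        // Capitalise the first letter of each word.
        System.out.println(replace("hello world", "\\b(\\w)(\\w*)",
            (match, groups) -> groups.get(0).toUpperCase() + groups.get(1), true)); // Hello World
        // A callback that returns the match leaves the text unchanged.
        System.out.println(replace("hello world", "\\w+", (match, groups) -> match, true)); // hello world
    }
}
```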
