Regular expressions in .NET are pretty easy to use (assuming you understand the regex syntax, which is beside the point), and you can certainly think of some useful extension methods for System.String that would let you quickly validate a string against a particular pattern. But regular expressions have another great feature that you may not use as much: the ability to capture subexpressions into groups so that you can pull out a piece of the match.
The way this is done using the Regex class is pretty straightforward.
// Requires: using System.Text.RegularExpressions;
// Find a word that starts with H followed by a word that starts with W
string input = "Hello World";
string pattern = @"(H\w+) (W\w+)";
Match m = Regex.Match(input, pattern);
if (m.Success) {
    string hWord = m.Groups[1].Value; // "Hello"
    string wWord = m.Groups[2].Value; // "World"
}
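To make the group numbering concrete, here is a minimal, self-contained sketch of the same lookup as a console program. The names GroupDemo and FirstTwoGroups are made up for this illustration and are not part of the code in this post. It also reads Groups[0], which always holds the entire match; that is why the captured groups start at index 1.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Hypothetical helper for illustration: returns the entire match followed by
    // the first two captured groups, or null if the pattern does not match.
    public static string[] FirstTwoGroups(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        if (!m.Success)
        {
            return null;
        }
        // Groups[0] is the whole match; Groups[1] and Groups[2] are the captures.
        return new[] { m.Groups[0].Value, m.Groups[1].Value, m.Groups[2].Value };
    }

    public static void Main()
    {
        string[] parts = FirstTwoGroups("Hello World", @"(H\w+) (W\w+)");
        Console.WriteLine(parts[0]); // Hello World
        Console.WriteLine(parts[1]); // Hello
        Console.WriteLine(parts[2]); // World
    }
}
```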
This is easy enough, but I am annoyed by the code that accesses the groups. Group and Match objects seem like such a mundane detail to worry about. What if the code could be simplified to the following:
string input = "Hello World";
string pattern = @"(H\w+) (W\w+)";
string hWord, wWord;
if (input.MatchInto(pattern, out hWord, out wWord)) ...
It may not look like much of a savings in terms of lines of code, but I find that it looks much cleaner and hides the Match and Group plumbing. The full code is below.
/// <summary>
/// Performs a regular expression match against the specified string, and places the captured groups into the output parameters.
/// </summary>
/// <param name="input">The input string to match against.</param>
/// <param name="pattern">The regular expression pattern which should contain grouping expressions.</param>
/// <param name="value1">Receives the value of the 1st capture group (Groups[1]) or null if no match was made.</param>
/// <param name="value2">Receives the value of the 2nd capture group (Groups[2]) or null if no match was made.</param>
/// <param name="value3">Receives the value of the 3rd capture group (Groups[3]) or null if no match was made.</param>
/// <param name="value4">Receives the value of the 4th capture group (Groups[4]) or null if no match was made.</param>
/// <param name="value5">Receives the value of the 5th capture group (Groups[5]) or null if no match was made.</param>
/// <returns>True if the pattern matched (not necessarily all groups) otherwise false.</returns>
public static bool MatchInto( this string input, string pattern, out string value1, out string value2, out string value3, out string value4, out string value5 )
{
    value1 = value2 = value3 = value4 = value5 = null;
    // Regex lives in System.Text.RegularExpressions
    var match = Regex.Match( input, pattern );
    if ( match.Success ) {
        // Value1
        if ( match.Groups.Count > 1 && match.Groups[1].Success ) {
            value1 = match.Groups[1].Value;
        } // if
        // Value2
        if ( match.Groups.Count > 2 && match.Groups[2].Success ) {
            value2 = match.Groups[2].Value;
        } // if
        // Value3
        if ( match.Groups.Count > 3 && match.Groups[3].Success ) {
            value3 = match.Groups[3].Value;
        } // if
        // Value4
        if ( match.Groups.Count > 4 && match.Groups[4].Success ) {
            value4 = match.Groups[4].Value;
        } // if
        // Value5
        if ( match.Groups.Count > 5 && match.Groups[5].Success ) {
            value5 = match.Groups[5].Value;
        } // if
        return true;
    } // if
    return false;
}
/// <summary>
/// Performs a regular expression match against the specified string, and places the captured groups into the output parameters.
/// </summary>
/// <param name="input">The input string to match against.</param>
/// <param name="pattern">The regular expression pattern which should contain grouping expressions.</param>
/// <param name="value1">Receives the value of the 1st capture group (Groups[1]) or null if no match was made.</param>
/// <param name="value2">Receives the value of the 2nd capture group (Groups[2]) or null if no match was made.</param>
/// <param name="value3">Receives the value of the 3rd capture group (Groups[3]) or null if no match was made.</param>
/// <param name="value4">Receives the value of the 4th capture group (Groups[4]) or null if no match was made.</param>
/// <returns>True if the pattern matched (not necessarily all groups) otherwise false.</returns>
public static bool MatchInto( this string input, string pattern, out string value1, out string value2, out string value3, out string value4 )
{
    string value5;
    return MatchInto( input, pattern, out value1, out value2, out value3, out value4, out value5 );
}
/// <summary>
/// Performs a regular expression match against the specified string, and places the captured groups into the output parameters.
/// </summary>
/// <param name="input">The input string to match against.</param>
/// <param name="pattern">The regular expression pattern which should contain grouping expressions.</param>
/// <param name="value1">Receives the value of the 1st capture group (Groups[1]) or null if no match was made.</param>
/// <param name="value2">Receives the value of the 2nd capture group (Groups[2]) or null if no match was made.</param>
/// <param name="value3">Receives the value of the 3rd capture group (Groups[3]) or null if no match was made.</param>
/// <returns>True if the pattern matched (not necessarily all groups) otherwise false.</returns>
public static bool MatchInto( this string input, string pattern, out string value1, out string value2, out string value3 )
{
    string value4;
    string value5;
    return MatchInto( input, pattern, out value1, out value2, out value3, out value4, out value5 );
}
/// <summary>
/// Performs a regular expression match against the specified string, and places the captured groups into the output parameters.
/// </summary>
/// <param name="input">The input string to match against.</param>
/// <param name="pattern">The regular expression pattern which should contain grouping expressions.</param>
/// <param name="value1">Receives the value of the 1st capture group (Groups[1]) or null if no match was made.</param>
/// <param name="value2">Receives the value of the 2nd capture group (Groups[2]) or null if no match was made.</param>
/// <returns>True if the pattern matched (not necessarily all groups) otherwise false.</returns>
public static bool MatchInto( this string input, string pattern, out string value1, out string value2 )
{
    string value3;
    string value4;
    string value5;
    return MatchInto( input, pattern, out value1, out value2, out value3, out value4, out value5 );
}
/// <summary>
/// Performs a regular expression match against the specified string, and places the captured groups into the output parameters.
/// </summary>
/// <param name="input">The input string to match against.</param>
/// <param name="pattern">The regular expression pattern which should contain grouping expressions.</param>
/// <param name="value">Receives the value of the 1st capture group (Groups[1]) or null if no match was made.</param>
/// <returns>True if the pattern matched (not necessarily all groups) otherwise false.</returns>
public static bool MatchInto( this string input, string pattern, out string value )
{
    string value2;
    string value3;
    string value4;
    string value5;
    return MatchInto( input, pattern, out value, out value2, out value3, out value4, out value5 );
}
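One detail worth seeing in action is the "not necessarily all groups" part of the return value: when the pattern matches but an optional group does not take part, the corresponding output stays null. The sketch below shows that under a few assumptions. Extension methods have to live in a static class, and since the listing above does not show one, the name RegexExtensions is my own placeholder. To keep the sample compilable on its own it uses a condensed two-group stand-in rather than the full set of overloads, and the number-range pattern is just an illustrative input.

```csharp
using System;
using System.Text.RegularExpressions;

// The containing class is not shown above; the name RegexExtensions is an assumption.
public static class RegexExtensions
{
    // Condensed two-group stand-in for the overloads above, so this sample compiles on its own.
    public static bool MatchInto(this string input, string pattern, out string value1, out string value2)
    {
        value1 = value2 = null;
        Match match = Regex.Match(input, pattern);
        if (!match.Success)
        {
            return false;
        }
        // A group that did not take part in the match keeps its null default.
        if (match.Groups[1].Success) value1 = match.Groups[1].Value;
        if (match.Groups[2].Success) value2 = match.Groups[2].Value;
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        // The second group is optional, so it stays null when only one number is present.
        string pattern = @"(\d+)(?:-(\d+))?";

        string first, second;
        if ("10-20".MatchInto(pattern, out first, out second))
        {
            Console.WriteLine(first + " to " + second); // 10 to 20
        }
        if ("42".MatchInto(pattern, out first, out second))
        {
            Console.WriteLine(first + ", second is null: " + (second == null)); // 42, second is null: True
        }
    }
}
```

On C# 7.0 or later the output variables can also be declared inline at the call site, as in input.MatchInto(pattern, out var first, out var second), which removes the separate declaration line.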
