Composing an Email-address pattern

Suppose we want to construct a pattern that matches an email address.

We'll take a relaxed approach to the email-address specification, loosely following the syntax definition on Wikipedia.

The definition is split into a local part and a domain part.

Let's start with the local part, which may be implemented as follows:

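open ReggerIt

// Building blocks for the local part
let ucase = Between 'A' 'Z'
let lcase = Between 'a' 'z'
// Special characters permitted in the local part
let printable = OneOf "!#$%&'*+-/=?^_`{|}~"

// One or more letters, digits, special characters, or dots
let local = OnceOrMore (ucase ||| lcase ||| Macro.decimalDigit ||| printable ||| Plain ".")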
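
To make the relaxed rule concrete, here is a minimal sketch of roughly the same character set written by hand against System.Text.RegularExpressions. The names localRegex and isLocalPart are hypothetical, and the exact string ReggerIt produces for `local` may well differ; this only illustrates which inputs the rule accepts.

```fsharp
open System.Text.RegularExpressions

// Hand-written stand-in for the local part (an assumption, not ReggerIt output).
// The hyphen is escaped so it is read as a literal inside the character class.
let localRegex = Regex(@"^[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~.]+$")

// Hypothetical helper: true when the whole string is a valid local part.
let isLocalPart (s: string) = localRegex.IsMatch s

printfn "%b" (isLocalPart "disposable.style.email.with+symbol") // true
printfn "%b" (isLocalPart "john smith")                         // false: a space is not allowed
printfn "%b" (isLocalPart "..")                                 // true: dots are not restricted
```

The last case shows how relaxed the rule is: leading, trailing, and consecutive dots are all accepted, which a stricter reading of the specification would reject.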

The domain part is composed of a list of labels, separated by dots. A label may contain hyphens, but not at the start or end, and has a maximum length of 63 characters:

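let dot = Plain "."

// Characters allowed anywhere in a label
let labelUnrestricted = ucase ||| lcase ||| Macro.decimalDigit

// Either 2 to 63 characters with hyphens only in the interior, or a single character
let label = (labelUnrestricted + RepeatRange 0 61 (labelUnrestricted ||| Plain "-") + labelUnrestricted) ||| labelUnrestricted

// One or more dot-terminated labels followed by a final label (so at least two labels)
let domain = OnceOrMore (label + dot) + label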
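
The length arithmetic of the label rule (1 + 0..61 + 1 characters, or a single character) can be checked with a hand-written regular expression. This is a sketch only: labelRegex and isLabel are hypothetical names, and the expression is an assumed equivalent of `label`, not the text ReggerIt generates.

```fsharp
open System.Text.RegularExpressions

// Hand-written equivalent of the label rule (illustrative assumption):
// first and last characters are alphanumeric, up to 61 interior characters may include '-'.
let labelRegex = Regex(@"^(?:[A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9]|[A-Za-z0-9])$")

// Hypothetical helper: true when the whole string is a valid label.
let isLabel (s: string) = labelRegex.IsMatch s

printfn "%b" (isLabel "my-host")                 // true: hyphen in the interior
printfn "%b" (isLabel "-host")                   // false: hyphen at the start
printfn "%b" (isLabel "host-")                   // false: hyphen at the end
printfn "%b" (isLabel (String.replicate 63 "a")) // true: exactly the maximum length
printfn "%b" (isLabel (String.replicate 64 "a")) // false: one character too long
```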

Finally, the email-address pattern is constructed by gluing the parts together:

let email = local + Plain "@" + domain

Test the pattern:

open System.Text.RegularExpressions

// Convert the composed pattern into a regular-expression string
let pattern = email |> Convert.ToStringStartPattern

// Match each sample address against the pattern
[
    "simple@example.com"
    "very.common@example.com"
    "disposable.style.email.with+symbol@example.com"
    "other.email-with-hyphen@example.com"
]
|> List.map (fun input -> Regex.Match(input, pattern))
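
Regex.Match returns Match objects, so the results still have to be inspected. The sketch below shows one way to do that. It does not use ReggerIt: standIn is a hand-written, start-anchored stand-in for the composed pattern, and describe is a hypothetical helper. It also assumes that a pattern produced by ToStringStartPattern is anchored only at the start of the input, as the name suggests; if that holds, a match can succeed on a prefix, which is why the match length is compared with the input length.

```fsharp
open System.Text.RegularExpressions

// Hand-written stand-in, anchored at the start only (assumption: not ReggerIt output).
let standIn =
    @"^[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~.]+@(?:[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?\.)+[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"

// Hypothetical helper: reports whether the pattern matched at the start of the input
// and whether that match covers the entire input.
let describe (input: string) =
    let m = Regex.Match(input, standIn)
    input, m.Success, (m.Success && m.Length = input.Length)

[ "simple@example.com"
  "simple@example.com and more"
  "no-at-sign.example.com" ]
|> List.map describe
|> List.iter (fun (input, matched, whole) ->
    printfn "%s matched=%b whole=%b" input matched whole)
// simple@example.com matched=true whole=true
// simple@example.com and more matched=true whole=false
// no-at-sign.example.com matched=false whole=false
```

The second input illustrates the design choice: with a start-only anchor, trailing text does not prevent a match, so validating a complete address needs either the length check shown here or a pattern anchored at both ends.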