VBA - Regular expressions

September 2016


VBA - Regular expressions


This tutorial aims to explain all the intricacies relating to the use of regular expressions. You will learn how to navigate through strings to find substrings using regular expressions (commonly called RegExp).

Prerequisites

In the VBA Editor.
  • Go to the Tools menu > References.
  • Browse through the various references and check the following option "Microsoft VBScript Regular Expression 5.5".

Syntax and declaration

In these examples, the only terms we will require are:

The statement

Two options are available:

As Object

Dim reg As Object
Set reg = CreateObject("vbscript.regexp")

VBScript_RegExp_55.regexp

Dim reg As VBScript_RegExp_55.regexp
Set reg = New VBScript_RegExp_55.regexp

The properties

No matter how you stated the RegExp, the properties will have the same syntax.

The Pattern

The pattern must comply with the substring. It allows you to navigate through the string.
reg.Pattern = "IciLeMotif"

Test

We will test the string, if the pattern is found, it returns the following Boolean value: True.
reg.Test = "IciLeMotif"

The String begins with A (uppercase)

The pattern would be:
  • at the beginning of the string
  • there is an A


The notation of this pattern is: ^A

Function Prem_Lettre_A(expression As String) As Boolean
Dim reg As Object

Set reg = CreateObject("vbscript.regexp")
'Le Pattern est le motif que l'on recherche
'début de chaîne : ^
'doit être : A
reg.Pattern = "^A"
'le test renvoie un Boolean (parfait pour notre fonction Booléenne!!!)
Prem_Lettre_A = reg.test(expression)
End Function

To call the function:

Sub Test_A()
'Nous allons chercher si un String commence par "A" Majuscule
MsgBox Prem_Lettre_A("alors?")
MsgBox Prem_Lettre_A("Ahhh")
MsgBox Prem_Lettre_A("Pas mal non?")
End Sub

The string begins with a or A

The notation of the pattern is [aA]

Sub Test_a_ou_A()
'Nous allons chercher si un String commence par "a" ou "A"
MsgBox Prem_Lettre_a_ou_A("alors?")
MsgBox Prem_Lettre_a_ou_A("Ahhh")
MsgBox Prem_Lettre_a_ou_A("Pas mal non?")
End Sub

Function Prem_Lettre_a_ou_A(expression As String) As Boolean
Dim reg As Object

Set reg = CreateObject("vbscript.regexp")
'ici la première lettre : ^
'doit être : a ou A => [aA]
reg.Pattern = "^[aA]"
Prem_Lettre_a_ou_A = reg.test(expression)
End Function

The string begins with an Uppercase letter

The notation for this pattern is: [AZ]
Sub Commence_par_Majuscule()
MsgBox "alors? commence par une majuscule : " & Prem_Lettre_Majuscule("alors?")
MsgBox "Ahhh commence par une majuscule : " & Prem_Lettre_Majuscule("Ahhh")
MsgBox "Pas mal non?  commence par une majuscule : " & Prem_Lettre_Majuscule("Pas mal non?")
End Sub

Function Prem_Lettre_Majuscule(expression As String) As Boolean
Dim reg As Object

Set reg = CreateObject("vbscript.regexp")
'ici la première lettre : ^
'doit être une lettre Majuscule : [A-Z]
reg.Pattern = "^[A-Z]"
Prem_Lettre_Majuscule = reg.test(expression)
End Function

The string ends with

Use the following notation: $
Sub Fini_Par()
MsgBox "La phrase : Les RegExp c'est super! se termine par super : " & Fin_De_Phrase("Les RegExp c'est super!")
MsgBox "La phrase : C'est super les RegExp! se termine par super : " & Fin_De_Phrase("C'est super les RegExp!")
End Sub

Function Fin_De_Phrase(expression As String) As Boolean
Dim reg As Object

Set reg = CreateObject("vbscript.regexp")
'La fin de la chaine doit être : super!
'notation de fin de chaîne : $
reg.Pattern = "super!$"
'note le $ se place à la fin...
Fin_De_Phrase = reg.test(expression)
End Function

The string contains a number

The notation for this pattern is [0-9]. If you want to find a number between 3 and 7: [3-7]

Sub Contient_un_chiffre()
MsgBox "aze1rty contient un chiffre : " & A_Un_Chiffre("aze1rty")
MsgBox "azerty contient un chiffre : " & A_Un_Chiffre("azerty")
End Sub

Function A_Un_Chiffre(expression As String) As Boolean
Dim reg As Object

Set reg = CreateObject("vbscript.regexp")
'doit comporter un chiffre de 0 à 9 n'importe ou (début, milieu, fin de chaine...)
reg.Pattern = "[0-9]"
'remarque [0-9] s'écrit également : \d
    'reg.Pattern = "\d"
A_Un_Chiffre = reg.test(expression)
End Function

The string contains a 3-digit number

We will now "quantify" in our RegExp. To find three occurrences, we will use the curly braces {}.
Sub Contient_Un_Nombre_A_trois_Chiffres()
MsgBox "aze1rty contient 3 chiffres : " & Nb_A_Trois_Chiffre("aze1rty")
MsgBox "a1ze2rty3 contient 3 chiffres : " & Nb_A_Trois_Chiffre("a1ze2rty3")
MsgBox "azer123ty contient 3 chiffres : " & Nb_A_Trois_Chiffre("azer123ty")
End Sub

Function Nb_A_Trois_Chiffre(expression As String) As Boolean
Dim reg As Object

Set reg = CreateObject("vbscript.regexp")
'doit comporter 3 chiffres de 0 à 9 qui se suivent
'le nombre d'occurrence se note {}
reg.Pattern = "\d{3}" 'équivalant de : reg.Pattern = "[0-9]{3}"
Nb_A_Trois_Chiffre = reg.test(expression)
End Function

The string contains 3 separate numbers

The code

Sub Contient_trois_Chiffres()
MsgBox "aze1rty contient 3 chiffres séparés : " & Trois_Chiffre("aze1rty")
MsgBox "a1ze2rty3 contient 3 chiffres séparés : " & Trois_Chiffre("a1ze2rty3")
MsgBox "azer123ty contient 3 chiffres séparés : " & Trois_Chiffre("azer123ty")
End Sub

Function Trois_Chiffre(expression As String) As Boolean
Dim reg As Object

Set reg = CreateObject("vbscript.regexp")
'doit comporter 3 chiffres de 0 à 9 qui ne se suivent pas
'le nombre d'occurrence se note {}
'le point (.) indique n'importe quel caractère sauf le saut de ligne
'le + indique que ce qui le précède (ici le point) doit être représenté une ou une infinité de fois
reg.Pattern = "(.)+(\d{1})(.)+(\d{1})(.)+(\d{1})"
Trois_Chiffre = reg.test(expression)
End Function

A variant

Sub Contient_trois_Chiffres_Variante()
MsgBox "aze1rty contient 3 chiffres séparés : " & Trois_Chiffre_Simplifiee("aze1rty")
MsgBox "a1ze2rty3 contient 3 chiffres séparés : " & Trois_Chiffre_Simplifiee("a1ze2rty3")
MsgBox "azer123ty contient 3 chiffres séparés : " & Trois_Chiffre_Simplifiee("azer123ty")
End Sub

Function Trois_Chiffre_Simplifiee(expression As String) As Boolean
Dim reg As Object

Set reg = CreateObject("vbscript.regexp")
'Comme le même motif : (.)+(\d{1}) se répète trois fois :
reg.Pattern = "(.+\d{1}){3}"
Trois_Chiffre_Simplifiee = reg.test(expression)
End Function

Example of use

In our example, the goal is to determine whether the string is made up of the following sequence:
  • 1: The string starts with the word "Vis"
  • 2: followed by a space
  • 3: followed by 1 to 3 letters
  • 4: followed by a space
  • 5: followed by the letter M
  • 6: followed by 1 to 2 letter
  • 7: followed by a hyphen
  • 8: followed by 1 to 3 letters
  • 9: followed by the word "classe" surrounded by spaces AND in lowercase
  • 10: follow-up of 1-2 letters
  • 11: followed by a dot
  • 12: followed by a letter


Note that: Each block of the pattern is enclosed between brackets.

Notation:
  • 1- => (^Vis)
  • 2- => ( )
  • 3- => ([a-zA-Z]{1,3})
  • 4- => ( ) --- Note : on aurait pu l'insérer avec le M qui suit... Mis pour l'exemple
  • 5- => (M)
  • 6- => ([a-zA-Z]{1,2})
  • 7- => ( -)
  • 8- => ([a-zA-Z]{1,3})
  • 9- => ( classe )
  • 10- => ([a-zA-Z]{1,2})
  • 11- => (\.)
  • 12- => ([a-zA-Z]{1})



This gives us the following pattern
"(^Vis)( )([a-zA-Z]{1,3})( )(M)([a-zA-Z]{1,2})(-)([a-zA-Z]{1,3})( classe )([a-zA-Z]{1,2})(.)([a-zA-Z]{1})"

The code:
Sub Main()
If VerifieMaChaine("Vis xx Mxx-x classe xx.x") Then
    MsgBox "good"
Else
    MsgBox "pas glop"
End If
'manque l'espace avant le M :
If VerifieMaChaine("Vis xxMxx-x classe xx.x") Then
    MsgBox "good"
Else
    MsgBox "pas glop"
End If
End Sub

Function VerifieMaChaine(expression As String) As Boolean
Dim reg As Object

Set reg = CreateObject("vbscript.regexp")
'There are simpler example, but this one is the most complete
reg.Pattern = "(^Vis)( )([a-zA-Z]{1,3})( )(M)([a-zA-Z]{1,2})(-)([a-zA-Z]{1,3})( classe )([a-zA-Z]{1,2})(.)([a-zA-Z]{1})"
VerifieMaChaine = reg.test(expression)
End Function
End Sub

Download

The codes described in this tutorial are all listed in this workbook: http://cjoint.com/14au/DHgoqY7wwIw.htm

Related :

This document entitled « VBA - Regular expressions » from CCM (ccm.net) is made available under the Creative Commons license. You can copy, modify copies of this page, under the conditions stipulated by the license, as this note appears clearly.