If you wish to find lines that don’t end with a particular character, you can use a bracket expression with a caret character after the left square bracket in a regular expression.
Metacharacter | Description |
---|---|
[^ ] | Matches a single character that is not contained within the brackets. For example, [^abc] matches any character other than «a», «b», or «c». [^a-z] matches any single character that is not a lowercase letter from «a» to «z». Likewise, literal characters and ranges can be mixed. |
For example, suppose I have a file named list.txt containing the following words:
coat
cop
core
coaster
clam
chrome
counter
cure
claustrophobia
closet
If I wanted to find only those lines that don’t end with the letter «e», «r», or «t», I could use the regular expression [^ert]$ for a grep search as shown below:
$ grep '[^ert]$' list.txt
cop
clam
claustrophobia
$
The dollar sign at the end of the regular expression indicates that I’m looking for the pattern at the end of the line.
Metacharacter | Description |
---|---|
$ | Matches the ending position of the string or the position just before a string-ending newline. In line-based tools, it matches the ending position of any line. |
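The same filter can be sketched in Java, the language used for the later examples in this article. This is only a minimal illustration: the class name `EndsWithFilter` and the method `linesNotEndingInErt` are made up for this example, and the word list is hard-coded instead of being read from list.txt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class EndsWithFilter {
    // Same pattern as the grep example: the last character is not e, r, or t.
    private static final Pattern NOT_ERT_AT_END = Pattern.compile("[^ert]$");

    // Returns the lines whose final character is something other than e, r, or t.
    static List<String> linesNotEndingInErt(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            // find() succeeds if the pattern matches anywhere; the $ pins it to the end.
            if (NOT_ERT_AT_END.matcher(line).find()) {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("coat", "cop", "core", "coaster", "clam",
                "chrome", "counter", "cure", "claustrophobia", "closet");
        System.out.println(linesNotEndingInErt(words)); // [cop, clam, claustrophobia]
    }
}
```

Note that an empty line is never reported, because the bracket expression must consume one character before the end of the line.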
I receive a spreadsheet every month containing over a thousand email addresses that I need to validate. I extract the column containing the email addresses from the spreadsheet to a text file with a Python script. Most of the email addresses end with .com, though there are also .gov and .net addresses. Occasionally the «m» has been mistyped as an «n», so I can search for any email addresses that have some other character after «.co» by typing /\.co[^m]\n in the Vi text editor.
The backslash before the period is needed to «escape» the normal metacharacter meaning of the period, which otherwise in a regular expression (regexp) represents any character. The \n represents the newline character, i.e., the end of the line. I could also type a slash followed by \.co[^m]$ in Vi, since the dollar sign represents the end of the line. For a grep search, though, I need to use the dollar sign at the end of the regular expression.
$ grep '\.co[^m]$' email_list.txt
john.doe@example.con
$
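For anyone who would rather run this check from a program than from grep, here is a rough Java sketch of the same search. The class `MistypedComFinder`, the method `findSuspects`, and every sample address except john.doe@example.con are invented for illustration. The pattern is the escaped form `\.co[^m]$`, with each backslash doubled because it sits inside a Java string literal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class MistypedComFinder {
    // A literal ".co" followed by exactly one character other than "m" at the end.
    // "\\." in the Java source is the regex \. (a literal period).
    private static final Pattern SUSPECT = Pattern.compile("\\.co[^m]$");

    // Returns the addresses that end in ".co" plus some character other than "m".
    static List<String> findSuspects(List<String> addresses) {
        List<String> suspects = new ArrayList<>();
        for (String address : addresses) {
            if (SUSPECT.matcher(address).find()) {
                suspects.add(address);
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the extracted spreadsheet column.
        List<String> addresses = Arrays.asList(
                "jane.roe@example.com",
                "john.doe@example.con",
                "clerk@example.gov",
                "staff@example.net");
        System.out.println(findSuspects(addresses)); // [john.doe@example.con]
    }
}
```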
In regex, anchors have zero width. They do not match characters; rather, they match a position, i.e., before, after, or between characters.
To match the start or the end of a line, we use the following anchors:
- Caret (^) matches the position before the first character in the string.
- Dollar ($) matches the position right after the last character in the string.
Regex | String | Matches |
---|---|---|
^a | abc | Matches a |
c$ | abc | Matches c |
^[a-zA-Z]+$ | abc | Matches abc |
^[abc]$ | abc | Does not match. The pattern matches only a one-character string consisting of a, b, or c. |
[^abc] | abc | Does not match. Inside brackets the caret negates the set, so the pattern needs a character other than a, b, or c. |
^[mts][aeiou] | mother | Matches mo. The string must start with m, t, or s, immediately followed by a vowel. |
[^n]g$ | king | Does not match. The string must end with g preceded by a character other than n, and king ends with ng. |
[^k]g$ | kong | Matches ng. |
^g.+g$ | gang | Matches gang. The string must start and end with g, with one or more characters in between. |
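To see that an anchor consumes no characters, one can compare where a match starts and where it ends. The following is a minimal Java sketch; the class `AnchorWidthDemo` and the helper `firstMatchSpan` are names made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchorWidthDemo {
    // Returns {start, end} of the first match of regex in input, or null if none.
    static int[] firstMatchSpan(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? new int[] {m.start(), m.end()} : null;
    }

    static String describe(int[] span) {
        return span == null ? "no match" : span[0] + ".." + span[1];
    }

    public static void main(String[] args) {
        // A bare anchor matches a position, so start and end are equal.
        System.out.println(describe(firstMatchSpan("^", "abc")));       // 0..0
        System.out.println(describe(firstMatchSpan("$", "abc")));       // 3..3
        // Combined with a literal, only the literal contributes width.
        System.out.println(describe(firstMatchSpan("^a", "abc")));      // 0..1
        System.out.println(describe(firstMatchSpan("c$", "abc")));      // 2..3
        System.out.println(describe(firstMatchSpan("[^n]g$", "king"))); // no match
    }
}
```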
2. Regex to Match Start of Line
"^<insertPatternHere>"
- The caret ^ matches the position before the first character in the string.
- Applying ^h to howtodoinjava matches h.
- Applying ^t to howtodoinjava does not match anything because it expects the string to start with t.
- If we have a multi-line string, by default the caret matches the position before the very first character in the whole string. To match the position before the first character of any line, we must enable multi-line mode in the regular expression. In this case, the caret changes from matching only at the start of the entire string to matching at the start of any line within the string.
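The effect of multi-line mode on the caret can be sketched as follows. In Java the mode is enabled with the `Pattern.MULTILINE` flag (or the embedded `(?m)` flag). The class `MultilineCaretDemo`, the helper `countMatches`, and the sample text are assumptions made for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultilineCaretDemo {
    // Counts how many times regex matches in input when compiled with the given flags.
    static int countMatches(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "how\nto\nhow\ndo";
        // Default mode: ^ matches only at the very start of the input.
        System.out.println(countMatches("^how", 0, text));                 // 1
        // Multi-line mode: ^ also matches right after each line terminator.
        System.out.println(countMatches("^how", Pattern.MULTILINE, text)); // 2
    }
}
```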
Description | Matching Pattern |
---|---|
The line starts with a number | “^\d” or “^[0-9]” |
The line starts with a lowercase or an uppercase letter | “^[a-z]” or “^[A-Z]” |
The line starts with a letter (case-insensitive) | “^[a-zA-Z]” |
The line starts with a word | “^word” |
The line starts with a special character | “^[!@#\$%\^\&*\)\(+=._-]” |
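import java.util.regex.Pattern;

// Each call prints true.
System.out.println(Pattern.compile("^[0-9]").matcher("1stKnight").find());
System.out.println(Pattern.compile("^[a-zA-Z]").matcher("FirstKnight").find());
System.out.println(Pattern.compile("^First").matcher("FirstKnight").find());
// Inside a Java string literal each regex backslash is written as \\.
System.out.println(Pattern.compile("^[!@#\\$%\\^\\&*\\)\\(+=._-]").matcher("*1stKnight").find());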
Program output.
true
true
true
true
3. Regex to Match End of Line
"<insertPatternHere>$"
- The dollar $ matches the position after the last character in the string.
- Applying a$ to howtodoinjava matches a.
- Applying v$ to howtodoinjava does not match anything because it expects the string to end with v.
- If we have a multi-line string, by default the dollar matches the position after the very last character in the whole string. To match the position after the last character of any line, we must enable multi-line mode in the regular expression. In this case, the dollar changes from matching only at the end of the entire string to matching at the end of any line within the string.
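As a rough illustration of the dollar in multi-line mode, the sketch below collects every line of a text that ends with a digit. The class `MultilineDollarDemo`, the method `linesEndingWithDigit`, and the sample text are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultilineDollarDemo {
    // With MULTILINE, ^ and $ match at the start and end of every line,
    // so each match of ^.*[0-9]$ is one whole line that ends with a digit.
    private static final Pattern ENDS_WITH_DIGIT =
            Pattern.compile("^.*[0-9]$", Pattern.MULTILINE);

    static List<String> linesEndingWithDigit(String text) {
        List<String> lines = new ArrayList<>();
        Matcher m = ENDS_WITH_DIGIT.matcher(text);
        while (m.find()) {
            lines.add(m.group());
        }
        return lines;
    }

    public static void main(String[] args) {
        String text = "FirstKnight123\nSecondKnight\nThird3";
        System.out.println(linesEndingWithDigit(text)); // [FirstKnight123, Third3]
    }
}
```

The dot does not match a line terminator by default, which is why `.*` stays within a single line here.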
Description | Matching Pattern |
---|---|
The line ends with a number | “\d$” or “[0-9]$” |
The line ends with a lowercase or an uppercase letter | “[a-z]$” or “[A-Z]$” |
The line ends with a letter (case-insensitive) | “[a-zA-Z]$” |
The line ends with a word | “word$” |
The line ends with a special character | “[!@#\$%\^\&*\)\(+=._-]$” |
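import java.util.regex.Pattern;

// Each call prints true.
System.out.println(Pattern.compile("[0-9]$").matcher("FirstKnight123").find());
System.out.println(Pattern.compile("[a-zA-Z]$").matcher("FirstKnight").find());
System.out.println(Pattern.compile("Knight$").matcher("FirstKnight").find());
// Inside a Java string literal each regex backslash is written as \\.
System.out.println(Pattern.compile("[!@#\\$%\\^\\&*\\)\\(+=._-]$")
    .matcher("FirstKnight&").find());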
Program output.
true
true
true
true
In this post, I’m going to explain a regular expression that I built to find all the words in a text that end with a particular letter.
I was searching online for a regex to do that but I didn’t find the one that worked for my exact use case (so I started learning how regex works). So, I experimented and finally found the correct answer.
To solve: Find all words in the text that end with 'e'.
Solution: \w+e\b OR [a-zA-Z]+e\b
Explanation: There are probably many cheatsheets on Regular Expressions that can be referred to understand various parts of this regex solution. So I will try and keep the explanation short.
Let’s say my text is a list of random words: cater cat late gate ignore that sentence just match correct words here
To match words, ‘\w’ is used. It matches a single word character: a letter, a digit, or the underscore (_). We can also use a range of permitted characters instead: [a-zA-Z]. I am including both lowercase and uppercase ranges to cover words that contain either. On its own, this selects every character of every word in the text, one character at a time.
So, we need the ‘+’ sign. The ‘+’ is a quantifier that matches one or more repetitions of the preceding element, which extends the match past a single character. It is what makes entire-word selections possible. Now the expression \w+ will select every full word in the text.
I want to get the words that end with the letter ‘e’, so I put ‘e’ at the end of my expression, giving \w+e. At this point the expression selects every word that contains an ‘e’ after its first character, not only the words ending with ‘e’, which is not what I want. Another problem is that it selects only the part of a word up to its last ‘e’ and ignores the rest. So the selections are: cate (from cater), late, gate, ignore, sentence, corre (from correct), here
Okay, so this is close as it matches all the words that have ‘e’ but that’s not it. It should only select the words that end with ‘e’. The words ‘cater’ and ‘correct’ should not match.
To solve this, we need ‘\b’, which matches a word boundary. This is a zero-width position at the left or right edge of a word. So I’ll add it at the right side of the expression to make it \w+e\b, so that it considers word boundaries and checks whether the rightmost character is ‘e’. Now the expression matches the correct words and the result is: late gate ignore sentence here
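Here is a small Java sketch of the finished expression, written with its backslashes as `\w+e\b` (doubled inside the Java string literal). The class `WordsEndingWithE` and the method `wordsEndingWithE` are names I made up for this example; the sample text is the word list used above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordsEndingWithE {
    // One or more word characters, then "e", then a word boundary.
    private static final Pattern ENDS_WITH_E = Pattern.compile("\\w+e\\b");

    // Returns every match in the text, i.e. each word whose last letter is "e".
    static List<String> wordsEndingWithE(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = ENDS_WITH_E.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        String text = "cater cat late gate ignore that sentence just match correct words here";
        System.out.println(wordsEndingWithE(text)); // [late, gate, ignore, sentence, here]
    }
}
```

Because `\w+` needs at least one character before the final ‘e’, a one-letter word "e" on its own would not be matched by this expression.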
That’s all. Easy now! Hopefully it helps others as well!