Regex word not ending with

If you wish to find strings at the end of a line that don’t end with a particular character, you can use a bracket expression with a caret character immediately after the left square bracket in a regular expression.

Metacharacter Description
[^ ] Matches a single character that is not contained
within the brackets. For example, [^abc] matches any character other
than «a», «b», or «c». [^a-z] matches any single character that is
not a lowercase letter from «a» to «z». Likewise, literal characters
and ranges can be mixed.

E.g., suppose I have a file named list.txt containing the following words:

coat
cop
core
coaster
clam
chrome
counter
cure
claustrophobia
closet

If I wanted to find only those lines that don’t end with the letter «e», «r», or «t», I could use the regular expression [^ert]$ for a grep search as shown below:

$ grep '[^ert]$' list.txt
cop
clam
claustrophobia
$

The dollar sign at the end of the regular expression indicates that I’m
looking for the pattern at the end of the line.

Metacharacter Description
$ Matches the ending position of the string or the position just
before a string-ending newline. In line-based tools, it matches the
ending position of any line.
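The same filter can be sketched in Java for readers who are not working at a shell prompt. This is only an illustration: the class name NotEndingWith and the method filter are made up for this example, and the word list is the one from list.txt above.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of any library.
class NotEndingWith {
    // [^ert] is any single character except e, r, or t; $ anchors it at the end.
    private static final Pattern NOT_ERT_AT_END = Pattern.compile("[^ert]$");

    // Keeps only the words in which the pattern is found.
    static List<String> filter(List<String> words) {
        return words.stream()
                .filter(w -> NOT_ERT_AT_END.matcher(w).find())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("coat", "cop", "core", "coaster", "clam",
                "chrome", "counter", "cure", "claustrophobia", "closet");
        System.out.println(filter(words)); // [cop, clam, claustrophobia]
    }
}
```

The result agrees with the grep output shown earlier: only cop, clam, and claustrophobia survive.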

I receive a spreadsheet every month containing over a thousand email addresses that I need to validate. I extract the column containing the email addresses from the spreadsheet to a text file with a Python script. Most of the email addresses end with .com, though there are also .gov and .net addresses. Occasionally the «m» has been mistyped as an «n», so I search for any email addresses that have some other character after «.co» in the Vi text editor with the regular expression /\.co[^m]\n. The backslash before the period is needed to «escape» the normal metacharacter meaning of the period, which otherwise in a regular expression (regexp) represents any character. The \n represents the newline character, i.e., the end of the line. I could also type a slash followed by \.co[^m]$ in Vi, since the dollar sign represents the end of the line. For a grep search, though, I need to use the dollar sign at the end of the regular expression.

$ grep '\.co[^m]$' email_list.txt
john.doe@example.con
$
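For comparison, here is a minimal Java sketch of the same check, assuming the addresses have already been read into a list. The class name SuspectTld and the method suspects are invented for this example, and every address other than john.doe@example.con is made-up sample data.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of any library.
class SuspectTld {
    // "\\." is a literal period, [^m] is any single character except m,
    // and $ anchors the match at the end of the address.
    private static final Pattern CO_NOT_M = Pattern.compile("\\.co[^m]$");

    // Returns the addresses that end in ".co" plus one character other than m.
    static List<String> suspects(List<String> addresses) {
        return addresses.stream()
                .filter(a -> CO_NOT_M.matcher(a).find())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> addresses = List.of(
                "jane.roe@example.com",   // sample data
                "john.doe@example.con",
                "info@example.gov",       // sample data
                "admin@example.net");     // sample data
        System.out.println(suspects(addresses)); // [john.doe@example.con]
    }
}
```

In a Java string literal the backslash itself has to be doubled, which is why the pattern is written as "\\.co[^m]$".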

In regex, the anchors have zero width. They are not used for matching characters. Rather, they match a position, i.e., before, after, or between characters.

To match the start or the end of a line, we use the following anchors:

  • Caret (^) matches the position before the first character in the string.
  • Dollar ($) matches the position right after the last character in the string.
Regex            String      Matches
^a               abc         Matches a.
c$               abc         Matches c.
^[a-zA-Z]+$      abc         Matches abc.
^[abc]$          abc         Does not match. The pattern accepts only a one-character string: a, b, or c.
[^abc]           abc         Does not match. The pattern needs at least one character other than a, b, or c.
^[mts][aeiou]    mother      Matches mo. The string must start with m, t, or s, immediately followed by a vowel.
[^n]g$           king, ng    Does not match. The string should end with g, but not ng.
[^k]g$           kong        Matches ng.
^g.+g$           gang        Matches gang. The string must start and end with g, with one or more characters in between.
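The rows above can be checked with a few lines of Java. This is a sketch only: the class AnchorChecks and its found helper are hypothetical names, and found uses Matcher.find(), so a pattern without anchors may match anywhere in the string. Note that ^[abc]$ succeeds on the one-character string a but not on abc.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class AnchorChecks {
    // True when the regex is found somewhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("^a", "abc"));               // true
        System.out.println(found("c$", "abc"));               // true
        System.out.println(found("^[a-zA-Z]+$", "abc"));      // true
        System.out.println(found("^[abc]$", "a"));            // true
        System.out.println(found("^[abc]$", "abc"));          // false
        System.out.println(found("[^abc]", "abc"));           // false
        System.out.println(found("^[mts][aeiou]", "mother")); // true
        System.out.println(found("[^n]g$", "king"));          // false
        System.out.println(found("[^k]g$", "kong"));          // true
        System.out.println(found("^g.+g$", "gang"));          // true
    }
}
```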


2. Regex to Match Start of Line

"^<insertPatternHere>"
  • The caret ^ matches the position before the first character in the string.
  • Applying ^h to howtodoinjava matches h.
  • Applying ^t to howtodoinjava does not match anything because it expects the string to start with t.
  • If we have a multi-line string, by default the caret matches the position before the very first character in the whole string. To match the position before the first character of any line, we must enable the multi-line mode in the regular expression.
    In this case, the caret changes from matching only at the start of the entire string to matching at the start of any line within the string.
Description                                           Matching Pattern
The line starts with a number                         "^\d" or "^[0-9]"
The line starts with a lowercase or uppercase letter  "^[a-z]" or "^[A-Z]"
The line starts with a letter (case-insensitive)      "^[a-zA-Z]"
The line starts with a word                           "^word"
The line starts with a special character              "^[!@#\$%\^\&*\)\(+=._-]"
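import java.util.regex.Pattern;

// Each backslash is doubled because the pattern is written as a Java string literal.
System.out.println(Pattern.compile("^[0-9]").matcher("1stKnight").find());

System.out.println(Pattern.compile("^[a-zA-Z]").matcher("FirstKnight").find());

System.out.println(Pattern.compile("^First").matcher("FirstKnight").find());

System.out.println(Pattern.compile("^[!@#\\$%\\^\\&*\\)\\(+=._-]").matcher("*1stKnight").find());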

Program output.

true
true
true
true
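The multi-line behaviour of the caret described above can be seen by counting matches with and without the Pattern.MULTILINE flag. The following is a small sketch; the class LineStarts, the method count, and the sample text are assumptions made for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class LineStarts {
    // Counts how many times the regex is found in the text with the given flags.
    static int count(String regex, int flags, String text) {
        Matcher m = Pattern.compile(regex, flags).matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "1st\n2nd\nthird";
        // Default mode: ^ matches only at the start of the whole string.
        System.out.println(count("^\\d", 0, text));                 // 1
        // Multi-line mode: ^ also matches right after each line terminator.
        System.out.println(count("^\\d", Pattern.MULTILINE, text)); // 2
    }
}
```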

3. Regex to Match End of Line

"<insertPatternHere>$"
  • The dollar $ matches the position after the last character in the string.
  • Applying a$ to howtodoinjava matches a.
  • Applying v$ to howtodoinjava does not match anything because it expects the string to end with v.
  • If we have a multi-line string, by default the dollar matches the position after the very last character in the whole string.
    To match the position after the last character of any line, we must enable the multi-line mode in the regular expression. In this case, the dollar changes from matching only at the end of the entire string to matching at the end of any line within the string.
Description                                         Matching Pattern
The line ends with a number                         "\d$" or "[0-9]$"
The line ends with a lowercase or uppercase letter  "[a-z]$" or "[A-Z]$"
The line ends with a letter (case-insensitive)      "[a-zA-Z]$"
The line ends with a word                           "word$"
The line ends with a special character              "[!@#\$%\^\&*\)\(+=._-]$"
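import java.util.regex.Pattern;

// Each backslash is doubled because the pattern is written as a Java string literal.
System.out.println(Pattern.compile("[0-9]$").matcher("FirstKnight123").find());

System.out.println(Pattern.compile("[a-zA-Z]$").matcher("FirstKnight").find());

System.out.println(Pattern.compile("Knight$").matcher("FirstKnight").find());

System.out.println(Pattern.compile("[!@#\\$%\\^\\&*\\)\\(+=._-]$")
				.matcher("FirstKnight&").find());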

Program output.

true
true
true
true
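To illustrate how the dollar behaves in multi-line mode, the sketch below collects the last word of each line. The class LineEnds, the method trailingWords, and the sample text are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class LineEnds {
    // Returns every run of word characters that sits directly before a $ position.
    static List<String> trailingWords(String text, boolean multiline) {
        int flags = multiline ? Pattern.MULTILINE : 0;
        Matcher m = Pattern.compile("\\w+$", flags).matcher(text);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "first line\nsecond row\nlast";
        // Default mode: only the end of the whole string counts.
        System.out.println(trailingWords(text, false)); // [last]
        // Multi-line mode: the end of every line counts.
        System.out.println(trailingWords(text, true));  // [line, row, last]
    }
}
```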


In this post, I’m going to explain a regular expression that I built to find all the words in a text that end with a particular character (letter).
I was searching online for a regex to do that, but I didn’t find one that worked for my exact use case (so I started learning how regex works). So, I experimented and finally found the correct answer.

To solve: Find all words in the text that end with 'e'.
Solution: \w+e\b
          OR
          [a-zA-Z]+e\b

Explanation: There are probably many cheatsheets on regular expressions that can be consulted to understand the various parts of this regex solution. So I will try to keep the explanation short.

Let’s say my text is a list of random words:
cater cat late gate ignore that sentence just match correct words here

To match words, ‘\w’ is used. It matches a word character: a letter, a digit, or the underscore (_). We can also use a range of permitted characters instead: [a-zA-Z]. I am including both lowercase and uppercase ranges to include words that contain any of these. On its own, this matches every character of every word in the text, one character at a time.

So, we need the ‘+’ sign. The ‘+’ is a quantifier that repeats the preceding element one or more times, which expands the match past a single character. It is important for making entire word selections. Now the expression (\w+) will make full word selections of all text words.

I want to get the result of words that end with the letter ‘e’, so I put ‘e’ at the end of my expression. Up until this point, the expression \w+e will select all words that contain ‘e’ but not necessarily ending with ‘e’, which is not what I want. Another problem with this is that it will select the part of a word up to its last ‘e’ and ignore the rest of it. So the selections are:
cate late gate ignore sentence corre here
Okay, so this is close as it matches all the words that have ‘e’ but that’s not it. It should only select the words that end with ‘e’. The words ‘cater’ and ‘correct’ should not match.

To solve this, we need ‘\b’, which is used to match a word boundary. This is a zero-width position at the left or right side of a word. So I’ll add this at the right side of the expression to make it \w+e\b so that it considers word boundaries and checks if the rightmost character is ‘e’. And now the expression matches the correct words and the result is:
late gate ignore sentence here
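Here is a short Java sketch that runs both versions of the expression on the sample text, first without the word boundary and then with it (the backslashes are written out as \w and \b, doubled for the Java string literal). The class WordsEndingWithE and the method findAll are names I made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class WordsEndingWithE {
    // Collects every substring of the text that the regex matches, in order.
    static List<String> findAll(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "cater cat late gate ignore that sentence just match correct words here";
        // Without \b the match can stop in the middle of a word.
        System.out.println(findAll("\\w+e", text));
        // [cate, late, gate, ignore, sentence, corre, here]
        // With \b the 'e' has to be the last character of the word.
        System.out.println(findAll("\\w+e\\b", text));
        // [late, gate, ignore, sentence, here]
    }
}
```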

That’s all. Easy now! Hopefully it helps others as well!
