Extract word with set

I have a set of words (BELLOW, CELLO, HAAF, HABIT, HADAL, HAIR, HELLO, HELP, RABIT) stored in a std::set data structure.

  1. From the above set, I want to extract the words that start with 'H' (index 0) and store them in another container (say std::set<std::string> ctr).
    Now, ctr will have: HAAF, HABIT, HADAL, HAIR, HELLO, HELP

  2. Next, I want to fetch the words from ctr whose second letter (index 1) is 'A'.
    Now, ctr will have: HAAF, HABIT, HADAL, HAIR

  3. Finally, I want to fetch the words that have the letter 'A' at any index other than 0 and 1. Basically, I don't want to check positions 0 and 1 of the string.

    Now, ctr will have: HAAF, HADAL

I'm not sure how to do the 3rd step.

#include <iostream>
#include <set>
#include <string>

int main()
{
    std::set<std::string> words = {"BELLOW", "CELLO",  "HAAF", 
                                   "HABIT",  "HADAL", "HAIR",
                                   "HELLO", "HELP", "RABIT"};
    for (const std::string& s : words) {
        std::cout << s << std::endl;    
    }
    
    std::set<std::string> etr;
    
    /* Extract words start with letter 'H' */
    for (const std::string& s : words) {
        if (s[0] == 'H') {
           //std::cout << s << std::endl;  
           etr.insert(s);
        }
    }
    
    std::cout << std::endl;
    
    for (const std::string& s : etr) {
        std::cout << s << std::endl;    
    }
    
    std::set<std::string> etr2;
    
    /* Extract words start with letter 'H' & 
       second letter as 'A' */
    for (const std::string& s : etr) {
        if (s[1] == 'A') {
           //std::cout << s << std::endl;  
           etr2.insert(s);
        }
    }
    
    std::cout << std::endl;
    
    for (const std::string& s : etr2) {
        std::cout << s << std::endl;    
    }
    
    /* Extract words start with letter 'H' & 
       second letter as 'A', and any other letter as 'A'
       but not second letter */
    // << I'm not sure  >>    
      
    return 0;
}


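Solution which I expected (needs #include <algorithm> and #include <vector>):

    // Positions that must not be checked (indices 0 and 1)
    const std::vector<std::size_t> pos = {0, 1};
    // Separate result container, so the words from step 1 are not mixed in
    std::set<std::string> etr3;

    for (const std::string& s : etr2) {
        std::size_t occ = s.find('A');
        // Repeat until no further occurrence is found
        while (occ != std::string::npos) {
            // Keep the word if 'A' occurs at a position that is not in pos
            if (std::find(pos.begin(), pos.end(), occ) == pos.end()) {
                etr3.insert(s);
                break;
            }
            // Get the next occurrence after the current position
            occ = s.find('A', occ + 1);
        }
    }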


Given a set of characters, the task is to extract the different words that can be formed from them, using a defined dictionary. Approach: the third-party Python module enchant (it is not part of the standard library) handles dictionary lookups, and itertools from the standard library generates the candidate strings. The approach uses the following methods.

  • check() : A method of an enchant.Dict object. It returns True if the string is a word in that dictionary, else False.
  • permutations(str_arr, str_len) : From itertools. It yields every ordered arrangement of str_len characters taken from str_arr.

The enchant module might not be installed; the package on PyPI is called pyenchant, so it can be installed using pip3 install pyenchant. Below is the Python code implementation of the above approach.
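Before the full listing, a small sketch of what permutations() yields may help. It uses only the standard library; the input "abc" is an arbitrary example and is not part of the original program.

```python
from itertools import permutations

# Every ordered arrangement of 2 characters taken from "abc"
pairs = ["".join(p) for p in permutations("abc", 2)]
print(pairs)

# With no length argument, the full length of the input is used
full = ["".join(p) for p in permutations("abc")]
print(len(full))
```

Each result is a tuple of characters, which is why the listing joins it back into a string before passing it to check().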

Python3

import enchant
from itertools import permutations

d = enchant.Dict("en_US")
words = []
perm_word = []
str_arr = "star"
str_len = 4

print("Length of the string is : ", str_len)

while str_len > 1:
    if str_len == len(str_arr):
        # Full-length permutations
        perm = list(permutations(str_arr))
        str_len = str_len - 1
        for i in perm:
            words = ''.join(i)
            if d.check(words):
                perm_word.append(words)
                print(words + " is an English words")
        print("perm_word", perm_word)
    elif str_len > 1:
        # Shorter permutations, one length per pass
        perm = list(permutations(str_arr, str_len))
        str_len = str_len - 1
        for i in perm:
            words = ''.join(i)
            if d.check(words):
                perm_word.append(words)
                print(words + " is an English word")
        print("perm_word", perm_word)
    else:
        str_len = 0

Output : 

star is an English words
tars is an English words
arts is an English words
rats is an English words
perm_word ['star', 'tars', 'arts', 'rats']
sat is an English word
tar is an English word
art is an English word
rat is an English word
perm_word ['star', 'tars', 'arts', 'rats', 'sat', 'tar', 'art', 'rat']
st is an English word
ts is an English word
tr is an English word
as is an English word
at is an English word
rs is an English word
rt is an English word
perm_word ['star', 'tars', 'arts', 'rats', 'sat', 'tar', 'art', 'rat', 
'st', 'ts', 'tr', 'as', 'at', 'rs', 'rt']

Time complexity : O(n · n!). The loop generates all permutations of length n, n-1, ..., 2 of the given string and checks each one against the English dictionary. The total number of candidates is bounded by a constant multiple of n! (n factorial), where n is the length of the string, and joining each candidate into a string costs O(n) on top of the dictionary lookup.

Space complexity : O(n · n!). The list perm_word stores only the candidates that pass the dictionary check, so it is not the dominant term. The dominant term is perm = list(permutations(...)), which holds up to n! tuples of length n in memory at once.
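The memory cost noted above comes from turning the permutations into a list. A minimal sketch of the same search that iterates lazily instead is shown here. The function name find_words, the min_len parameter, and the SAMPLE_WORDS set are assumptions introduced for illustration; the small set stands in for the enchant dictionary so the sketch runs without pyenchant.

```python
from itertools import permutations

def find_words(letters, is_word, min_len=2):
    """Return the distinct arrangements of `letters` accepted by `is_word`.

    `is_word` is any callable taking a string and returning a bool,
    for example the check method of an enchant.Dict object.
    """
    found = []
    seen = set()
    for length in range(len(letters), min_len - 1, -1):
        # permutations() is a lazy iterator, so no n!-sized list is built
        for combo in permutations(letters, length):
            candidate = "".join(combo)
            if candidate not in seen and is_word(candidate):
                seen.add(candidate)
                found.append(candidate)
    return found

# Hypothetical stand-in vocabulary, not a real dictionary
SAMPLE_WORDS = {"star", "tars", "arts", "rats", "sat", "tar", "art", "rat", "as", "at"}

result = find_words("star", SAMPLE_WORDS.__contains__)
print(result)
```

The seen set also avoids reporting a word twice when the input contains repeated letters. The number of candidates examined is unchanged, so only the memory use improves, not the running time.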

I'd like to extract one or more words that match the same pattern from a string containing multiple words (Python regex). Here's the line:

new_appointment_requested; general; SAT-newlead

From that line, I need to extract "SAT-" and whatever comes after it within that word. In this case the output should be "SAT-newlead". It can also be SAT-oldlead or something completely different, and there can be more than one word containing this pattern at a time. To capture all possible scenarios:

  1. find the word starting with "SAT-" (\b)
  2. if it is the final or only word in the string, extract the rest of the word
  3. if there is a ; after the word (it's not the final word in the string), grab the word without the ;
  4. if more than one word matches this pattern, extract all instances as separate words

For some reason I can't wrap my head around regex, so any help on this would be appreciated.
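One possible way to cover the four cases is a single pattern passed to re.findall. This is a sketch rather than the only answer: it assumes that a word ends at the first semicolon or whitespace character, and the names SAT_WORD and extract_sat_words are made up for the example.

```python
import re

# \b       : "SAT-" must begin at a word boundary
# [^;\s]+  : then take everything up to the next ';' or whitespace
SAT_WORD = re.compile(r"\bSAT-[^;\s]+")

def extract_sat_words(line):
    """Return every word in `line` that starts with 'SAT-', without any ';'."""
    return SAT_WORD.findall(line)

single = extract_sat_words("new_appointment_requested; general; SAT-newlead")
several = extract_sat_words("SAT-oldlead; general; SAT-newlead")
none_found = extract_sat_words("new_appointment_requested; general")
print(single, several, none_found)
```

Because findall returns a list, the "only word", "final word", and "several words" cases all come back in the same shape, and an empty list means nothing matched.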

I have a string "rtcpOnNbActive true" stored in a variable x. I want to extract "true" as a substring and store it in a variable. How can I do this?

asked Dec 13, 2014 at 8:07 by Pratibha Jain (edited by 200_success)

Try this way:

y=$(echo "$x" | awk '{print $2}')
echo "$y"
  • echo "$x" prints the value of x.
  • awk '{print $2}' prints the second whitespace-separated field of that output.
  • $() captures the output of the pipeline so it can be assigned to y.

answered Dec 13, 2014 at 8:22 by jherran

Assuming that there’s at least one space before the substring you wish to extract (and that the substring does not contain any spaces), you can do this with a simple parameter expansion:

x="rtcpOnNbActive     true"
y="${x##* }"
echo "[$y]"

output

[true]

answered Dec 13, 2014 at 8:34 by PM 2Ring

It's possible to use bash arrays for that: just place the words inside parentheses, unquoted so that each word becomes its own element. For example:

arr=(first second third)
echo ${arr[1]}

str="first second third"
arr1=($str)
echo ${arr1[1]}

answered Oct 16, 2015 at 15:08 by kenorb

you can use awk:

echo "rtcpOnNbActive         true" | awk '{print $NF}'
true

NF is the number of fields in the current record, so $NF is the last field.

using sed:

echo "rtcpOnNbActive         true" | sed 's/.* //g'
true

using string expression:

 a="rtcpOnNbActive         true"
 echo ${a##* }
 true

using grep:

 echo "rtcpOnNbActive         true" | grep -Eo "[a-z]+$"
 true

-o prints only the matching part of the line, [a-z]+ matches one or more letters from a to z, and $ anchors the match at the end of the line.

answered Dec 13, 2014 at 9:59 by Hackaholic

You could use the read built-in

read -r _ y <<<"$x"
printf "%s\n" "$y"
true

answered Dec 14, 2014 at 1:39 by iruvar

Word splitting in bash can be accomplished very succinctly with the set builtin.

str="rtcpOnNbActive                       true"
# Expand positional parameters with arguments
set -- $str
# Now $1=rtcpOnNbActive and $2=true

This can be extended to a host of scenarios by changing the value of the IFS variable, e.g.:

ip_address="10.0.0.138"
IFS=.
set -- $ip_address
# $1=10, $2=0, $3=0, $4=138

answered Sep 29, 2022 at 11:46 by Edward Chamberlain

Pretty straightforward.

x="rtcpOnNbActive                       true"
y=${x##* }

The ${x##* } expansion removes the longest prefix that ends in a space, that is, everything up to and including the last space, and the result is assigned to y.

Very similar to what Hackaholic listed above, but a little more succinct.

answered Feb 10, 2021 at 17:10 by jgshawkey

A simple solution is

for /f "tokens=3" %%a in (database.txt) do set word3=%%a

After this statement, the variable %word3% will contain the third word from the line in the file.  If the file has more than one line, you will get the third word from the last line that has at least three words; the set word3=%%a command (after the do keyword) will be executed with %%a set to the third word from each such line. 
If you decide that you want to do more than one command per line, use the following syntax:

for /f "tokens=3" %%a in (database.txt) do (
     
     commands referencing %%a
     
)


Edit: As stated above, the code (commands) in the block following the do gets executed for every qualifying line.  If you want to “catch” only the first such line, you can do this by simply adding filtering logic, as in:

setlocal enabledelayedexpansion
set first=1
for /f "tokens=3" %%a in (database.txt) do (
      if !first! == 1 (
           set first=0
            
           commands referencing %%a
            
      )
)


You can replace the a (in %%a) with any letter, but it must be only a single letter; it’s not a normal variable.
