Helps with reading word by word

In this article, we are going to learn how to read a text file word by word in C. Each iteration of the loop reads one word from the file.

C fscanf Function


The fscanf() function is part of the C standard library and is declared in <stdio.h>. It reads formatted input from a stream. Its syntax is:

Syntax

int fscanf(FILE *stream, const char *format, ...)

Parameters:

  • stream − A pointer to a FILE object that identifies the stream.
  • format − A C string that contains one or more of the following items: whitespace characters, non-whitespace characters, and format specifiers.
  • A format specifier has the form %[*][width][modifiers]type.
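
To make the fields of a format specifier concrete, here is a minimal sketch. The helper name read_second_word and the 40-byte buffer are assumptions made for illustration, not part of the article. It uses the optional * (read but do not store) and the optional width (store at most that many characters).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the article): skips the first word in the
   stream and stores the second one in out, which must hold 40 bytes.
   Returns 1 on success, or 0 or EOF if there is no second word. */
int read_second_word(FILE *stream, char out[40])
{
    assert(stream != NULL && out != NULL);
    /* %*s  : '*' suppresses assignment, so the first word is read and dropped.
       %39s : width 39 caps the stored characters, leaving room for '\0'. */
    return fscanf(stream, "%*s %39s", out);
}
```

The return value counts only the items that were stored, so the suppressed first word does not contribute to it.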

1. Read File Word by Word in C using fscanf Function


Here we use the fscanf() function to read the text file. First, we open the file in read mode by calling fopen() with the "r" mode. Next, we find the size of the file, so that the buffer that holds its content is large enough for any word in it. We use the stat() function to find the file size.

  • Once we have the size and a buffer allocated for it, we start reading the file with the fscanf() function.
  • We keep reading the file word by word until we reach the end of the file. We pass fscanf() a scanset such as "%39[^-\n ]", which reads characters until it meets a hyphen, a newline, or a space, that is, until the current word ends. The optional width (39 here) caps the number of characters stored.
  • The call looks like this:
fscanf(in_file, "%39[^-\n ]", file_contents)

C Program to Read text File word by word


To run this program, we need a text file named Readme.txt in the same folder as our code. The content of the file is:

Hello My name is 
John 
danny
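#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

const char* filename = "Readme.txt";

int main(void)
{
    FILE *in_file = fopen(filename, "r");
    if (!in_file)
    {
        perror("fopen");
        return EXIT_FAILURE;
    }

    /* Ask the file system for the file size in bytes. */
    struct stat sb;
    if (stat(filename, &sb) == -1)
    {
        perror("stat");
        fclose(in_file);
        return EXIT_FAILURE;
    }

    /* One extra byte for the terminating '\0' that fscanf appends. */
    char *file_contents = malloc((size_t)sb.st_size + 1);
    if (!file_contents)
    {
        perror("malloc");
        fclose(in_file);
        return EXIT_FAILURE;
    }

    for (;;)
    {
        /* Skip any run of separators: hyphen, newline, or space. */
        if (fscanf(in_file, "%*[-\n ]") == EOF)
            break;
        /* Read one word, up to the next separator. */
        if (fscanf(in_file, "%[^-\n ]", file_contents) != 1)
            break;
        printf("> %s\n", file_contents);
    }

    free(file_contents);
    fclose(in_file);
    return EXIT_SUCCESS;
}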

Output

> Hello
> My
> name
> is
> John
> danny
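
If the file size is not known in advance (for example when the input is a pipe, where stat() does not report a useful size), a fixed buffer combined with a field width is one alternative. The sketch below is an illustration under that assumption, not the article's program. The name count_words and the 40-byte buffer are hypothetical, and tabs are not treated as separators because the scanset does not list them.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: prints and counts the words in a stream, treating
   '-', '\n' and ' ' as separators. */
size_t count_words(FILE *stream)
{
    char word[40];            /* 39 characters plus the terminating '\0' */
    size_t count = 0;

    assert(stream != NULL);
    for (;;) {
        /* Skip a run of separators; matches nothing if none are pending. */
        if (fscanf(stream, "%*[-\n ]") == EOF)
            break;
        /* Read up to 39 non-separator characters; longer words are split. */
        if (fscanf(stream, "%39[^-\n ]", word) != 1)
            break;
        printf("> %s\n", word);
        count++;
    }
    return count;
}
```

Calling count_words(stdin) from main would process piped input. Skipping separators in a separate call keeps the loop moving even when a line starts with a hyphen.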


    Given a text file, extract words from it. In other words, read the content of file word by word. Example : 

    Input: And in that dream, we were flying.
    Output:
    And
    in
    that
    dream,
    we
    were
    flying.


    Approach:
    1) Open the file that contains the text. For example, a file named "file.txt" contains the string "geeks for geeks".
    2) Create a file stream variable to read the file's content.
    3) Extract words from the file stream into a string variable in a while loop, printing each one.

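    CPP

    #include <fstream>
    #include <iostream>
    #include <string>

    using namespace std;

    int main()
    {
        // The input file is assumed to be in the working directory.
        string filename = "file.txt";

        ifstream file(filename);
        if (!file)
        {
            cerr << "Could not open " << filename << endl;
            return 1;
        }

        // operator>> skips whitespace and reads one word at a time.
        string word;
        while (file >> word)
        {
            cout << word << endl;
        }

        return 0;
    }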

    Output:

    geeks
    for
    geeks

    Time Complexity: O(N), where N is the size of the file, since the entire file is read once

    Auxiliary Space: O(1)


    I have a text file containing just lowercase letters and no punctuation except for spaces. I would like to know the best way of reading the file char by char, in a way that if the next char is a space, it signifies the end of one word and the start of a new word. i.e. as each character is read it is added to a string, if the next char is space, then the word is passed to another method and reset until the reader reaches the end of the file.

    I’m trying to do this with a StringReader, something like this:

    public String GetNextWord(StringReader reader)
    {
        String word = "";
        char c;
        do
        {
            c = Convert.ToChar(reader.Read());
            word += c;
        } while (c != ' ');
        return word;
    }
    

    and put the GetNextWord method in a while loop till the end of the file. Does this approach make sense or are there better ways of achieving this?

    John Saunders

    asked Mar 16, 2012 at 15:58

    Matt

    The String lib provides an easy way of doing this: string.Split(): if you read the entire string in, C# can automatically split it on every space:

    string[] words = reader.ReadToEnd().Split(' ');
    

    The words array now contains all of the words in the file and you can do whatever you want with them.

    Additionally, you may want to investigate the File.ReadAllText method in the System.IO namespace — it may make your life much easier for file imports to text.

    Edit: I guess this assumes that your file is not abhorrently large; as long as the entire thing can be reasonably read into memory, this will work most easily. If you have gigabytes of data to read in, you’ll probably want to shy away from this. I’d suggest using this approach though, if possible: it makes better use of the framework that you have at your disposal.

    answered Mar 16, 2012 at 16:02

    eouw0o83hf

    If you're interested in good performance even on very large files, you should have a look at the MemoryMappedFile class, which is new in .NET 4.0.

    For example:

    using (var mappedFile1 = MemoryMappedFile.CreateFromFile(filePath))
    {
        using (Stream mmStream = mappedFile1.CreateViewStream())
        {
            using (StreamReader sr = new StreamReader(mmStream, ASCIIEncoding.ASCII))
            {
                while (!sr.EndOfStream)
                {
                    var line = sr.ReadLine();
                    var lineWords = line.Split(' ');
                }
            }  
        }
    }
    

    From MSDN:

    A memory-mapped file maps the contents of a file to an application’s
    logical address space. Memory-mapped files enable programmers to work
    with extremely large files because memory can be managed concurrently,
    and they allow complete, random access to a file without the need for
    seeking. Memory-mapped files can also be shared across multiple
    processes.

    The CreateFromFile methods create a memory-mapped file from a
    specified path or a FileStream of an existing file on disk. Changes
    are automatically propagated to disk when the file is unmapped.

    The CreateNew methods create a memory-mapped file that is not mapped
    to an existing file on disk; and are suitable for creating shared
    memory for interprocess communication (IPC).

    A memory-mapped file is associated with a name.

    You can create multiple views of the memory-mapped file, including
    views of parts of the file. You can map the same part of a file to
    more than one address to create concurrent memory. For two views to
    remain concurrent, they have to be created from the same memory-mapped
    file. Creating two file mappings of the same file with two views does
    not provide concurrency.

    answered Mar 16, 2012 at 16:28

    Tim Schmelter

    First of all: StringReader reads from a string which is already in memory. This means that you will have to load up the input file in its entirety before being able to read from it, which kind of defeats the purpose of reading a few characters at a time; it can also be undesirable or even impossible if the input is very large.

    The class to read from a text stream (which is an abstraction over a source of data) is StreamReader, and you might want to use that one instead. Now StreamReader and StringReader share an abstract base class TextReader, which means that if you code against TextReader then you can have the best of both worlds.

    TextReader's public interface will indeed support your example code, so I'd say it's a reasonable starting point. You just need to fix the one glaring bug: there is no check for Read returning -1 (which signifies the end of available data).

    answered Mar 16, 2012 at 16:06

    Jon
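
The character-by-character idea, including the end-of-data check mentioned above, can be sketched in C rather than C#. This is an illustration only: next_word and its parameters are hypothetical names, and any whitespace character is assumed to end a word. In C, fgetc() signals the end of input by returning EOF, which is why the character is held in an int.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: reads the next whitespace-separated word, one
   character at a time, into out (capacity cap). Returns 1 if a word was
   read and 0 at end of input. Characters beyond cap - 1 are dropped. */
int next_word(FILE *in, char *out, size_t cap)
{
    size_t len = 0;
    int c;                      /* int, not char, so EOF stays distinguishable */

    assert(in != NULL && out != NULL && cap > 1);
    do {                        /* skip leading whitespace */
        c = fgetc(in);
    } while (c != EOF && isspace(c));
    while (c != EOF && !isspace(c)) {
        if (len + 1 < cap)
            out[len++] = (char)c;
        c = fgetc(in);
    }
    out[len] = '\0';
    return len > 0;
}
```

A caller would loop with while (next_word(file, buffer, sizeof buffer)) and pass each word on to its own processing function.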

    All in one line, here you go (assuming ASCII and perhaps not a 2gb file):

    var file = File.ReadAllText(@"C:\myfile.txt", Encoding.ASCII).Split(new[] { ' ' });
    

    This returns a string array, which you can iterate over and do whatever you need with.

    answered Mar 16, 2012 at 16:07

    Bryan Crosby

    I would do something like this:

    IEnumerable<string> ReadWords(StreamReader reader)
    {
        string line;
        while((line = reader.ReadLine())!=null)
        {
            foreach(string word in line.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries))
            {
                yield return word;
            }
        }
    }
    

    If you use File.ReadAllText instead, it loads the entire file into memory, so you can get an OutOfMemoryException and a lot of other problems.

    answered Mar 16, 2012 at 16:21

    Eugene

    If you want to read it without splitting the string (for example, because the lines are so long that you might hit an OutOfMemoryException), you can do it like this, using a StreamReader:

    while (sr.Peek() >= 0)
    {
        c = (char)sr.Read();
        if (c.Equals(' ') || c.Equals('\t') || c.Equals('\n') || c.Equals('\r'))
        {
            break;
        }
        else
            word += c;
    }
    return word;
    

    answered Aug 28, 2014 at 9:09

    MaticDiba

    This method splits your text into words when they are separated by one or more spaces (two spaces, for example).

    StreamReader streamReader = new StreamReader(filePath); //get the file
    string stringWithMultipleSpaces= streamReader.ReadToEnd(); //load file to string
    streamReader.Close();
    
    Regex r = new Regex(" +"); //specify delimiter (spaces)
    string [] words = r.Split(stringWithMultipleSpaces); //(convert string to array of words)
    
    foreach (String W in words)
    {
       MessageBox.Show(W);
    }
    

    answered Mar 16, 2012 at 16:08

    Andrew

    I created a simple console program for your exact requirement, using the files you mentioned. It should be easy to run and check. Please find the code below. Hope this helps.

    static void Main(string[] args)
        {
    
            string[] input = File.ReadAllLines(@"C:\Users\achikhale\Desktop\file.txt");
            string[] array1File = File.ReadAllLines(@"C:\Users\achikhale\Desktop\array1.txt");
            string[] array2File = File.ReadAllLines(@"C:\Users\achikhale\Desktop\array2.txt");
    
            List<string> finalResultarray1File = new List<string>();
            List<string> finalResultarray2File = new List<string>();
    
            foreach (string inputstring in input)
            {
                string[] wordTemps = inputstring.Split(' ');//  .Split(' ');
    
                foreach (string array1Filestring in array1File)
                {
                    string[] word1Temps = array1Filestring.Split(' ');
    
                    var result = word1Temps.Where(y => !string.IsNullOrEmpty(y) && wordTemps.Contains(y)).ToList();
    
                    if (result.Count > 0)
                    {
                        finalResultarray1File.AddRange(result);
                    }
    
                }
    
            }
    
            foreach (string inputstring in input)
            {
                string[] wordTemps = inputstring.Split(' ');//  .Split(' ');
    
                foreach (string array2Filestring in array2File)
                {
                    string[] word1Temps = array2Filestring.Split(' ');
    
                    var result = word1Temps.Where(y => !string.IsNullOrEmpty(y) && wordTemps.Contains(y)).ToList();
    
                    if (result.Count > 0)
                    {
                        finalResultarray2File.AddRange(result);
                    }
    
                }
    
            }
    
            if (finalResultarray1File.Count > 0)
            {
                Console.WriteLine("file array1.txt contains words: {0}", string.Join(";", finalResultarray1File));
            }
    
            if (finalResultarray2File.Count > 0)
            {
                Console.WriteLine("file array2.txt contains words: {0}", string.Join(";", finalResultarray2File));
            }
    
            Console.ReadLine();
    
        }
    }
    

    answered Sep 18, 2017 at 7:19

    AnkUser

    This code will extract words from a text file based on the Regex pattern. You can try playing with other patterns to see what works best for you.

        StreamReader reader =  new StreamReader(fileName);
    
        var pattern = new Regex(
                  @"( [^\W_\d]              # starting with a letter
                                            # followed by a run of either...
                      ( [^\W_\d] |          #   more letters or
                        [-'\d](?=[^\W_\d])  #   ', -, or digit followed by a letter
                      )*
                      [^\W_\d]              # and finishing with a letter
                    )",
                  RegexOptions.IgnorePatternWhitespace);
    
        string input = reader.ReadToEnd();
    
        foreach (Match m in pattern.Matches(input))
            Console.WriteLine("{0}", m.Groups[1].Value);
    
        reader.Close();       
    

    answered Sep 25, 2017 at 17:43

    live-love

    I need to write a program that reads the text of a text file and returns the number of times a specific word shows up. I can't figure out how to write a program that reads the text file word by word; I've only managed to write one that reads line by line. How would I make it read word by word?

    import java.io.*;
    import java.util.Scanner;
    
    
    public class main 
    {
    	public static void main (String [] args) 
    	{		
    		Scanner scan = new Scanner(System.in);
    		System.out.println("Name of file: ");		
    		String filename= scan.nextLine();
    	int count =0; 
    
    	try
    	{		
    		FileReader file = new FileReader(filename);			
    		BufferedReader reader = new BufferedReader(file);			
    		String a = "";			
    		int linecount = 0;						
    		String line;			
    		System.out.println("What word are you looking for: ");			
    		String a1 = scan.nextLine();						
    	
    		while((line = reader.readLine()) != null)			
    		{				
    			linecount++;								
    			if(line.equalsIgnoreCase("that"));
    					count++;			
    		}								
    		reader.close();		
    	}
    	catch(IOException e)
    	{
    		System.out.println("File Not found");
    	}
    	System.out.println("the word that appears " + count + " many times");
    }
    }
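
For comparison, the word-counting task can be sketched in C, reading word by word instead of line by line. This is an illustration under stated assumptions, not an answer from the thread: count_word and equals_ignore_case are hypothetical names, words are taken to be whitespace-separated, and attached punctuation is not stripped, so "that," does not match "that".

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: case-insensitive equality of two C strings. */
static int equals_ignore_case(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;            /* equal only if both ended together */
}

/* Counts whitespace-separated words in the stream that equal target,
   ignoring case. Words longer than 63 characters are split by the width. */
size_t count_word(FILE *in, const char *target)
{
    char word[64];
    size_t count = 0;

    assert(in != NULL && target != NULL);
    while (fscanf(in, "%63s", word) == 1) {
        if (equals_ignore_case(word, target))
            count++;
    }
    return count;
}
```

Because %63s stops at any whitespace, each loop iteration sees exactly one word, which is the unit the comparison needs.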
    

    Review

    I love this book!
    As a retired CA-credentialed K-6 teacher with 10 years as a kindergarten teacher, I love how sequential and step-by-step this book makes reading. It gives plenty of practice with previously taught words by adding only one new word at a time.

     My first grade grandson struggles to read and fussed about it until I got this book & he told me, «I love this book! I never knew reading could be this much fun!»

    It’s a great tool for students especially 6-8yr olds, who developmentally don’t pick up reading quickly even though they’ve been taught all the sounds and likely their readiness doesn’t click in until 2nd-3rd grade.  Excellent «structure notes» are available at the end of the book, for parents or teachers — Cynthia Claffey

     
    A great book that does exactly what it says!
    The book uses repetitive words and phrases along with great pictures to help jumpstart the learning process.  It also turns all this into a story the kids will enjoy. The dog, the kids and their interactions will be fun and funny for the little ones to read.  This is the beginning of a whole series of teaching books and if they all follow the patterns of this book, then each will be well worth the read. — AmazonReviewer
    Finally a resource that links sight words to actual sentences and book-style stories. 
    My son is a very smart 6 year old about to enter the 1st grade. However, reading did not come as easy as we thought it would. Prior to kindergarten, he was an avid reader. Following kindergarten he was frustrated with books and reading, and upset he could not pick up any book from his shelf and read it. Kindergarten taught him Fry sight words via flash cards and corresponding books, with levels increasing as books and flash card sets were mastered. However, he and many of his classmates did not seem to make the connection between the flash cards and words in books. 

     Since reading these Word by Word Level 1 — books 1 and 2, my son is excited about reading again. He sits and reads these books to us on his own with little to no help. He willingly reads to us at bedtime with no fight. We do not need to read it to him first. The repetition of the words in a story setting helps to reinforce what he is learning, while he feels the success of actually reading a book on his own. Also, you are learning the sight words in a book-sentence setting, with one new word per page, and repeat of earlier sight words introduced to reinforce and help with mastery. 

     He loves the cloud set-up for the text, kind of comic style, and how the number of sentences per page increases as you make your way through to book. He really feels an accomplishment when he reads a page all by himself with lots of text. Thank you for this! We are excited to continue with Books 3 and 4. Very hopeful this will make reading in first grade much easier for him, and help him meet his personal goal of being an independent reader. — Jennifer Gruda,

    About the Author

    Teacher and author Philip Gibson has more than 35 years’ experience teaching English to children and adults in 7 countries.

    After graduating from teacher training college in 1974, Philip Gibson spent the next 30+ years travelling the world teaching English to children and adults of all nationalities.  He now lives in Laos where, for the past 12 years, he has been researching, writing, trialing and improving his Word by Word series of illustrated, graded readers for English-speaking children learning to read and children learning English as a second or foreign language.  

     All the books in the series have been extensively researched, trialed, improved and taught to many groups and individuals, including Philip’s own children. 
