In this article, we will learn how to read a text file word by word in C, reading one word from the file on each iteration.
C fscanf Function
The fscanf() function is available in the C library. This function is used to read formatted input from a stream. The syntax of the fscanf function is:
Syntax
int fscanf(FILE *stream, const char *format, ...)
Parameters :
- stream − This is the pointer to a FILE object that identifies the stream.
- format − This is the C string that contains one or more of the following items − Whitespace character, Non-whitespace character, and Format specifiers.
- A format specifier has the form %[*][width][modifiers]type.
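To make the pieces of a format specifier concrete, here is a small sketch, written in C++ but using the same C stdio functions. The helper name second_word is made up for illustration. It uses `*` to skip one field and a width to cap how many characters are stored.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: skips the first whitespace-delimited field of `text`
// and returns the second one, truncated to at most 15 characters.
std::string second_word(const char *text) {
    char buf[16] = "";   // 15 characters plus the terminating '\0'
    // %*s  -> read a word but discard it ('*' suppresses the assignment)
    // %15s -> read a word and store at most 15 characters (the width)
    if (std::sscanf(text, "%*s %15s", buf) != 1) {
        return "";       // fewer than two fields were present
    }
    return buf;
}
```

The return value of sscanf counts only the fields that were actually stored, so the suppressed `%*s` field does not contribute to it.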
1. Read File Word by Word in C using fscanf Function
Here we use the fscanf function to read the text file. First, we open the file in read mode by calling fopen() with the "r" mode. Next, we look up the file's size so that we can allocate a buffer large enough to hold its content. We use the stat() function to find the file size.
- Once we have the size and buffer allocated for this size, we start reading the file by using the fscanf() function.
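- We keep reading the file word by word until we reach the end of the file. We pass fscanf() a scanset such as "%39[^-\n ] ", which stores up to 39 characters and stops at a hyphen, a newline, or a space, so each call returns the next word. The trailing space in the format skips the whitespace that follows the word.
- The call looks like this:
fscanf(in_file, "%39[^-\n ] ", file_contents)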
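As a point of comparison, here is a minimal sketch of the same loop using the plain `%s` conversion with a width instead of a scanset. It is written in C++ with the C stdio functions, and the helper name read_words_fscanf is an assumption made for this example. It compares the return value with 1 rather than with EOF, so the loop also ends if a conversion fails.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: returns every whitespace-delimited word in the file
// at `path`, or an empty vector if the file cannot be opened.
std::vector<std::string> read_words_fscanf(const char *path) {
    std::vector<std::string> words;
    std::FILE *fp = std::fopen(path, "r");
    if (!fp) {
        return words;
    }
    char buf[40];   // 39 characters plus the terminating '\0'
    // %39s skips leading whitespace, then stores at most 39 non-space
    // characters; a longer word is split across successive reads.
    while (std::fscanf(fp, "%39s", buf) == 1) {
        words.push_back(buf);
    }
    std::fclose(fp);
    return words;
}
```

Unlike the scanset version, `%s` splits only on whitespace, so a hyphenated word stays in one piece.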
C Program to Read text File word by word
To run this program, we need a text file named Readme.txt in the same folder as the code. The content of the file is:
Hello My name is John danny
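#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

const char *filename = "Readme.txt";

int main(void)
{
    FILE *in_file = fopen(filename, "r");
    if (!in_file) {
        perror("fopen");
        return EXIT_FAILURE;
    }

    /* Look up the file size so the buffer can hold any word in the file. */
    struct stat sb;
    if (stat(filename, &sb) == -1) {
        perror("stat");
        fclose(in_file);
        return EXIT_FAILURE;
    }

    /* One extra byte for the terminating '\0'. */
    char *file_contents = malloc(sb.st_size + 1);
    if (!file_contents) {
        perror("malloc");
        fclose(in_file);
        return EXIT_FAILURE;
    }

    /* Read up to the next hyphen, newline, or space; the trailing space in
       the format skips the whitespace after the word. Comparing with 1
       ends the loop at EOF and also when nothing could be matched. */
    while (fscanf(in_file, "%[^-\n ] ", file_contents) == 1) {
        printf("> %s\n", file_contents);
    }

    free(file_contents);
    fclose(in_file);
    return EXIT_SUCCESS;
}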
Output
> Hello
> My
> name
> is
> John
> danny
Question
How do I read from a text file word by word, without reading the whole line?
Answers
Okay then.
The short answer is that you can’t read word for word in C#. So you have two options:
1. Read line by line and split.
2. Read character by character and find the splitting characters.
You have examples of the former. For the latter, you may want to do something like this. (You'll probably have to modify this to your liking.)
string filename = @"C:\filename.txt";
using (StreamReader r = new StreamReader(filename))
{
    string s = string.Empty;
    int i = 0;
    while ((i = r.Read()) != -1)
    {
        Char c = Convert.ToChar(i);
        if (Char.IsDigit(c) || Char.IsLetter(c))
        {
            // Letters and digits extend the current word.
            s = s + c;
        }
        else
        {
            // Any other character ends the current word.
            if (s.Trim() != string.Empty)
                Console.WriteLine(s);
            s = string.Empty;
        }
    }
    // Print the last word if the file does not end with a separator.
    if (s != string.Empty)
        Console.WriteLine(s);
}
The long and the short of this is, though, that reading the file line by line is the simplest method to accomplish this, and reading it character by character is an extremely complex solution.
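For comparison, the first option (read line by line, then split) can be sketched in C++ as follows. This is only an illustration of the idea, not a translation of the C# code above; the helper name words_by_line is made up, and it splits on whitespace only.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads `in` one line at a time and splits each line
// on whitespace, collecting the words in order.
std::vector<std::string> words_by_line(std::istream &in) {
    std::vector<std::string> words;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);   // treat the line as its own stream
        std::string word;
        while (fields >> word) {
            words.push_back(word);
        }
    }
    return words;
}
```

Passing a std::ifstream reads from a file; a std::istringstream is convenient for trying the helper without one.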
David Morton — http://blog.davemorton.net/
Marked as answer by
Friday, May 8, 2009 3:37 PM
Given a text file, extract words from it. In other words, read the content of file word by word. Example :
Input: And in that dream, we were flying.
Output:
And
in
that
dream,
we
were
flying.
Approach:
1) Open the file that contains the string. For example, a file named "file.txt" contains the string "geeks for geeks".
2) Create a file stream variable to hold the file content.
3) Extract words from the file stream into a string variable in a while loop and print them.
CPP
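#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    // Open the file through a file stream.
    fstream file;
    string word, filename;
    filename = "file.txt";
    file.open(filename.c_str());

    // operator>> skips whitespace and extracts one word per call.
    while (file >> word)
    {
        cout << word << endl;
    }
    return 0;
}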
Output:
geeks
for
geeks
Time Complexity: O(N) // going through the entire file
Auxiliary Space: O(1)
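If the words need to be kept rather than printed immediately, one possible variation is to collect them with std::istream_iterator. The sketch below assumes a helper named extract_words, which is not part of the original program. Note that storing every word takes O(N) extra space instead of O(1).

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: gathers every whitespace-separated word from `in`.
// std::istream_iterator<std::string> applies operator>> repeatedly until
// extraction fails, which is the same stopping rule as while (file >> word).
std::vector<std::string> extract_words(std::istream &in) {
    return std::vector<std::string>(std::istream_iterator<std::string>(in),
                                    std::istream_iterator<std::string>());
}
```

Any input stream can be passed in, for example an fstream opened on "file.txt".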
- Use std::ifstream to Read File Word by Word in C++
- Use std::ispunct and std::string::erase Functions to Parse Punctuation Symbols in C++
This article demonstrates multiple methods for reading a file word by word in C++.
Use std::ifstream to Read File Word by Word in C++
The std::ifstream class can be used to conduct input operations on file-based streams. Namely, the std::ifstream type interfaces with a file buffer and operates on it using the extraction operator. Note that the std::fstream type is also provided in the I/O library; it is compatible with both the extraction (>>) and insertion (<<) operators.
First, we need to create an object of type ifstream by calling one of its constructors; in this case, only the filename string is passed to the constructor. Once the ifstream object is created, one of its methods, is_open, should be called to verify that the file was opened successfully before proceeding to read its contents.
To read the file word by word, we call the extraction operator on the ifstream object. We direct it into a string variable, which receives the next word: leading whitespace is skipped and extraction stops at the next whitespace character. Since we need to read each word until the end of the file, we put the extraction statement into the while loop condition. Additionally, we declare a vector of strings to store each word on every iteration and print them later in a separate loop.
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

using std::cout; using std::cerr;
using std::endl; using std::string;
using std::ifstream; using std::vector;

int main()
{
    string filename("input.txt");
    vector<string> words;
    string word;

    ifstream input_file(filename);
    if (!input_file.is_open()) {
        cerr << "Could not open the file - '"
             << filename << "'" << endl;
        return EXIT_FAILURE;
    }

    // Extraction fails once no word is left, which ends the loop.
    while (input_file >> word) {
        words.push_back(word);
    }

    for (const auto &i : words) {
        cout << i << endl;
    }

    input_file.close();
    return EXIT_SUCCESS;
}
Use std::ispunct and std::string::erase Functions to Parse Punctuation Symbols in C++
The only downside of the previous method is that it stores the punctuation characters attached to words in the destination vector. It would be better to strip them from each word before storing it in the vector container. We use the ispunct function, which takes a single character as an int parameter and returns a non-zero integer value if the character is punctuation; otherwise, zero is returned.
Note that the behavior of the ispunct function is undefined if the given argument is neither representable as unsigned char nor equal to EOF; thus, it is recommended to cast the character to unsigned char first. In the following example, we use two simple if conditions to check the last and first characters of each word. If punctuation is found, we call the built-in string function erase to remove the character.
#include <cctype>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

using std::cout; using std::cerr;
using std::endl; using std::string;
using std::ifstream; using std::vector;

int main()
{
    string filename("input.txt");
    vector<string> words;
    string word;

    ifstream input_file(filename);
    if (!input_file.is_open()) {
        cerr << "Could not open the file - '"
             << filename << "'" << endl;
        return EXIT_FAILURE;
    }

    while (input_file >> word) {
        // Strip one trailing and one leading punctuation character, if any.
        if (std::ispunct(static_cast<unsigned char>(word.back())))
            word.erase(word.end() - 1);
        if (!word.empty() && std::ispunct(static_cast<unsigned char>(word.front())))
            word.erase(word.begin());
        // A token such as "." is empty after stripping, so skip it.
        if (!word.empty())
            words.push_back(word);
    }

    for (const auto &i : words) {
        cout << i << endl;
    }

    input_file.close();
    return EXIT_SUCCESS;
}
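The example above removes at most one punctuation character from each end. A possible extension, sketched here under the assumption of a helper named strip_punct (not part of the article's code), removes every leading and trailing punctuation character while leaving punctuation inside the word alone.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns `word` without any leading or trailing
// punctuation; punctuation inside the word (as in "don't") is kept.
std::string strip_punct(const std::string &word) {
    std::size_t first = 0;
    std::size_t last = word.size();
    // Advance past punctuation at the front.
    while (first < last &&
           std::ispunct(static_cast<unsigned char>(word[first]))) {
        ++first;
    }
    // Back up over punctuation at the end.
    while (last > first &&
           std::ispunct(static_cast<unsigned char>(word[last - 1]))) {
        --last;
    }
    return word.substr(first, last - first);
}
```

A token made up entirely of punctuation comes back as an empty string, so a caller would probably want to skip empty results before storing them.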
In this section we will see how to read file content word by word using C++. The task is very simple: we use a file input stream to read the file contents. The file stream opens the file by name; we then extract each word from the stream into a variable called word and print the words one by one.
Algorithm
read_word_by_word(filename)
begin
   file = open file using filename
   while file has new word, do
      print the word into the console
   done
end
File Content (test_file.txt)
This is a test file. There are many words. The program will read this file word by word
Example
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

void read_word_by_word(string filename) {
    fstream file;
    string word;
    file.open(filename.c_str());
    while (file >> word) { // take word and print
        cout << word << endl;
    }
    file.close();
}

int main() {
    string name;
    cout << "Enter filename: ";
    cin >> name;
    read_word_by_word(name);
}
Output
Enter filename: test_file.txt
This
is
a
test
file.
There
are
many
words.
The
program
will
read
this
file
word
by
word
There are several ways to do this. Here’s one.
Below is a class I use pretty often called NTuple. It is the same idea as the Tuple<T>, Tuple<T1, T2>, etc classes that come with the .NET framework. However, the NTuple class is designed to hold a variable number of items. Two NTuple instances are equal if they contain the same number of values and those values are equal.
Given a set of columns
// as per OP, the list of columns to group by will be generated at runtime
IEnumerable<string> columnsToGroupBy = ...;
you can use the NTuple class to group by those columns like this:
var groups = dt.AsEnumerable()
.GroupBy(r => new NTuple<object>(from column in columnsToGroupBy select r[column]));
Here’s the beef:
public class NTuple<T> : IEquatable<NTuple<T>>
{
public NTuple(IEnumerable<T> values)
{
Values = values.ToArray();
}
public readonly T[] Values;
public override bool Equals(object obj)
{
if (ReferenceEquals(this, obj))
return true;
if (obj == null)
return false;
return Equals(obj as NTuple<T>);
}
public bool Equals(NTuple<T> other)
{
if (ReferenceEquals(this, other))
return true;
if (other == null)
return false;
var length = Values.Length;
if (length != other.Values.Length)
return false;
for (var i = 0; i < length; ++i)
if (!Equals(Values[i], other.Values[i]))
return false;
return true;
}
public override int GetHashCode()
{
var hc = 17;
foreach (var value in Values)
hc = hc*37 + (!ReferenceEquals(value, null) ? value.GetHashCode() : 0);
return hc;
}
}
Here’s a test case:
static void Main(string[] args)
{
// some sample data
var dt = new DataTable();
dt.Columns.Add("NAME", typeof(string));
dt.Columns.Add("CITY", typeof(string));
dt.Columns.Add("STATE", typeof(string));
dt.Columns.Add("VALUE", typeof(double));
dt.Rows.Add("Mike", "Tallahassee", "FL", 3);
dt.Rows.Add("Mike", "Tallahassee", "FL", 6);
dt.Rows.Add("Steve", "Tallahassee", "FL", 5);
dt.Rows.Add("Steve", "Tallahassee", "FL", 10);
dt.Rows.Add("Steve", "Orlando", "FL", 7);
dt.Rows.Add("Steve", "Orlando", "FL", 14);
dt.Rows.Add("Mike", "Orlando", "NY", 11);
dt.Rows.Add("Mike", "Orlando", "NY", 22);
// some "configuration" data
IEnumerable<string> columnsToGroupBy = new[] {"CITY", "STATE"};
string columnToAggregate = "VALUE";
// the test routine
foreach (var group in dt.AsEnumerable().GroupBy(r => new NTuple<object>(from column in columnsToGroupBy select r[column])))
{
foreach (var keyValue in group.Key.Values)
{
Debug.Write(keyValue);
Debug.Write(':');
}
Debug.WriteLine(group.Sum(r => Convert.ToDouble(r[columnToAggregate])));
}
}
-
06-10-2003
#1
Registered User
Reading in a file word by word
I am trying to write a program that reads a file name from the command line, reads the contents of that file, and outputs a list of words, one word per line, to a second file, whose name is also on the command line. For this program a word is a sequence of alphabetic characters. Any character which is not alphabetic is a separator character.
I know how to open files and create files but I am not sure what to do so that I can read in just one word at a time. I also want to know of any tips on how to write to the new file. Thank you in advance for any help!
Example
input file:
Hello%$this is just a test!
output file:
Hello
this
is
just
a
test
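One way to approach the task described above is to copy characters one at a time and start a new line whenever a run of letters ends. The sketch below is in C++ rather than C, and the helper name copy_words is an assumption made for illustration; it works on any pair of streams, so a std::ifstream opened on argv[1] and a std::ofstream opened on argv[2] could be passed in.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical helper: copies every run of alphabetic characters from `in`
// to `out`, one word per line. Returns the number of words written.
int copy_words(std::istream &in, std::ostream &out) {
    int count = 0;
    bool in_word = false;
    char c;
    while (in.get(c)) {
        if (std::isalpha(static_cast<unsigned char>(c))) {
            out.put(c);
            in_word = true;
        } else if (in_word) {
            out.put('\n');   // a separator ends the current word
            in_word = false;
            ++count;
        }
    }
    if (in_word) {           // input ended in the middle of a word
        out.put('\n');
        ++count;
    }
    return count;
}
```

With the example input from the post, the six words come out one per line.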
-
06-10-2003
#2
ATH0
Just read on a page or two in whatever book showed you how to open files, and it'll show you how to read.
Basically there are many ways to do it.
fgets would work.
fgetc would also work.
fscanf would also work…
Basically, there are a whole lot of f-ing functions that would do the job.
Quzah.
Hope is the first step on the road to disappointment.
-
06-10-2003
#3
Registered User
Open files
Cool, thank you, I got it to work… I kept forgetting to point to the right spot. Thank you for your help! The last thing I need to figure out is how to start a new word after a "non-letter" character.
Example:
If you use this code:
Code:
FILE *fp = fopen(argv[1], "r");
char buf[10];
/* %9s keeps each read inside buf (9 characters plus '\0'). */
while (fscanf(fp, "%9s", buf) != EOF) {
    printf("%s\n", buf);
}
On an input file that contains:
This@#$is just a test!
Your output will be
This@#$is
just
a
test!
What I need the output to be is:
This
is
just
a
test
Last edited by Bumblebee11; 06-10-2003 at 09:15 PM.
-
06-10-2003
#4
ATH0
What you would want to do then, is either modify how your fscanf call works, or, use a loop to run through the string you read in, and check for non-alpha characters.
To do that, something like this would work.
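The snippet that originally followed is not preserved. As a rough idea of what such a loop could look like (a C++ sketch, not the poster's code; the helper name split_token is made up), each token read with %s can be walked character by character and cut wherever a non-alphabetic character appears.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: breaks one token (as read by fscanf's %s) into the
// alphabetic words it contains, e.g. "This@#$is" gives "This" and "is".
std::vector<std::string> split_token(const std::string &token) {
    std::vector<std::string> words;
    std::string current;
    for (std::size_t i = 0; i < token.size(); ++i) {
        if (std::isalpha(static_cast<unsigned char>(token[i]))) {
            current += token[i];
        } else if (!current.empty()) {
            words.push_back(current);   // non-alpha character ends the word
            current.clear();
        }
    }
    if (!current.empty()) {
        words.push_back(current);       // token ended while inside a word
    }
    return words;
}
```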
Quzah.
Hope is the first step on the road to disappointment.
-
06-10-2003
#5
Registered User
Got it to work
I was just able to get my program to work. Thank you for your help!!!
First of all, sorry for my bad English… I will try not to make too many mistakes :D
OK, I need to write a program that reads some text from a file and then outputs statistics about the words. I have already finished the part that says how many characters there are in the text. Now I would like you to help me with something.
I want to write a function that reads the text from a file and then outputs it like this:
text read: Today is very warm.
text output when I run the program:
Today
is
very
warm
So I think I need to read this text from a file, put it in an array, and then output it with a for statement.
ifstream entrance;
char word;
entrance.open("lala.txt");
while (!entrance.eof())
{
………
………
}
entrance.close();
for(i=0; i…..)
cout<<word[i]<<endl;
But I don't know how to do this…
So can you please help me?
Thank you all!
somehow like this
Do I do this inside, or where?
ifstream entrance;
entrance.open("lala.txt");
while (!entrance.eof())
{
………
………
}
entrance.close();
And how do I tell it that the text inside the file should be stored in a string?
The good old c++ way!
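The code posted here is not preserved. Judging from the names quoted later in the thread (fin, str, and a std::vector<std::string> called words), it was probably along the lines of the following sketch. The function names load_words and print_words are assumptions added for illustration.

```cpp
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical helper: reads every whitespace-separated word from the file
// at `path` into a vector. An unopenable file yields an empty vector.
std::vector<std::string> load_words(const std::string &path) {
    std::vector<std::string> words;
    std::ifstream fin(path);      // the constructor opens the file
    std::string str;
    while (fin >> str) {          // stops when no further word can be read
        words.push_back(str);
    }
    return words;
}

// Hypothetical helper: prints one word per line.
void print_words(const std::vector<std::string> &words, std::ostream &out) {
    for (std::size_t i = 0; i < words.size(); ++i) {
        out << words[i] << '\n';
    }
}
```

Calling print_words(load_words("lala.txt"), std::cout) would print each word on its own line; punctuation attached to a word, such as the period in "warm.", stays attached.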
Can I change this line somehow: std::vector <std::string> words;?
We haven't done anything with vectors yet, so is there any other way to write this?
You could use a regular array of strings… I suppose.
thisfile contains:
THIS IS A TEST HERE HOLD ON
Output:
Read from a file!
This string is: THIS
This string is: IS
This string is: A
This string is: TEST
This string is: HERE
This string is: HOLD
This string is: ON
THIS IS A TEST HERE HOLD ON
Yeah, this code works like a charm, except for one thing.
If I change the size constant to a bigger value… like 500, then when I run the program the words are written like they should be, but there is a big blank area… and I need to scroll a long way down to get to "press any key to continue". So the program doesn't stop when there are no more words to output; it keeps on outputting…
Use a std::vector instead of an array. It doesn’t make sense to use an array when you don’t know how many elements you’re going to read in.
If you don't know the size of the data you read in, use a vector. That's all I can say. My array solution expects small input.
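If an array has to be used anyway, the blank output described above can be avoided by remembering how many slots were actually filled and looping only that far. The following is a sketch under assumptions: the capacity MAX_WORDS and the helper read_into_array are made up for this example and are not the code from the thread.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

const std::size_t MAX_WORDS = 500;   // assumed capacity, as in the post above

// Hypothetical helper: fills `words` from `in` and returns how many slots
// were filled, stopping when the array is full or the input runs out.
std::size_t read_into_array(std::istream &in, std::string (&words)[MAX_WORDS]) {
    std::size_t count = 0;
    while (count < MAX_WORDS && in >> words[count]) {
        ++count;
    }
    return count;
}
```

The printing loop would then run while i is less than the returned count, not while i is less than MAX_WORDS, so unused slots are never printed. A vector removes the fixed capacity altogether.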
OK, so a vector is just like an array, but you don't need to know the size of the data?
OK, so let me sum up that code with the vector. It reads the file and stores the data in a vector of strings.
that ifstream fin(lala.txt) is just like entrance.open("lala.txt")?
What exactly does this mean fin >>str ?
And what in the above code has the same meaning as while (!entrance.eof())?
So I would like to thank you in advance for all your help… I will try to work something out with this code, and if I don't know how to do additional stuff I will ask you guys again.
Again, thank you for your time and help.
that ifstream fin(lala.txt) is just like entrance.open("lala.txt")?
Yes. If you pass a string as argument to the constructor it will open the file for you. You can also let it go out of scope and the destructor will close it for you.
What exactly does this mean fin >>str ?
It’s a call to the extraction operator. Same as std::cin >> foo;
only you’re getting input from a file stream rather than directly from the user.
And what in above code has the same meaning as while (!entrance.eof())?
When used as a boolean value, the expression fin >> str will evaluate to false if the stream goes into fail(), bad() or eof() states.
Yes. fin is the name of the ifstream variable just like entrance. There is a constructor overload that opens a file immediately. So you can do these:
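The snippet that followed is missing. The two ways of opening a file that the post refers to can be sketched like this; the function names are made up for illustration, and both functions simply count the words so that the result is easy to check.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Opens the file in the constructor; the destructor closes it when the
// function returns, so no explicit close() call is needed.
std::size_t count_words_ctor(const std::string &path) {
    std::ifstream fin(path);
    std::size_t n = 0;
    for (std::string str; fin >> str; ) {
        ++n;
    }
    return n;
}

// Default-constructs the stream, then opens and closes it explicitly.
std::size_t count_words_open(const std::string &path) {
    std::ifstream entrance;
    entrance.open(path);
    std::size_t n = 0;
    for (std::string str; entrance >> str; ) {
        ++n;
    }
    entrance.close();
    return n;
}
```

Both forms behave the same for reading; the constructor form is just shorter.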
fin >> str is just like cin >> from iostream.
The difference here is that fin >> str will read up to a whitespace and stop, so in a loop it reads every word in. The reason it works like !entrance.eof() is because entrance.eof() checks to see if the badbit or eof flag is set (it's part of streams and is hard to explain here, you'll learn it later) and if it is, it returns true, otherwise false. Now fin >> has the same check. Once it reads the eof or a badbit flag is on it stops reading.
The code you have would be the same as mine if you did this as well:
filipe wrote:
When used as a boolean value, the expression fin >> str will evaluate to false if the stream goes into fail(), bad() or eof() states.
wolfgang wrote:
The difference here is that fin >> str will read up to a whitespace and stop, so in a loop it reads every word in. The reason it works like !entrance.eof() is because entrance.eof() checks to see if the badbit or eof flag is set
I thought so too, until a week ago or so, when this post surfaced: http://www.cplusplus.com/forum/general/34292/
It appears that the results from operator! and operator void* depend only on the value of the failbit and badbit, but not on the value of eof. The expression fin >> str is therefore not a strictly stronger condition than the expression !entrance.eof(). Now I see that the idiom while (fin >> str) { ... } actually works not because the condition fails immediately when the end of the stream is reached (if that were the case, the last word would escape from the body of the loop), but because the condition fails when the stream is exhausted and fin has no more characters to pass to str, which results in a failbit.
Thinking further in this direction, using the while (!fin.eof()) { fin >> str; ... } approach may be generally incorrect. It may work for extracting strings, but it could be incorrect for, say, chars. According to some sources (see **), the eof bit is set when you try to read something past the end of the file, not when you read the last character before the end of the file. So, if the previous extraction reached the end of the file without reading past it, the !fin.eof() condition will still evaluate to true, but the extraction in the next loop iteration will be unsuccessful, because there is nothing left to read, and there will be erroneous extra processing of non-existent input. I'm not sure about this. Any confirmation?
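The difference between the two loop conditions can be observed with a small experiment. The sketch below uses std::istringstream in place of a file so that it is self-contained; the function names are made up for this example. Each function counts how many times the loop body runs.

```cpp
#include <sstream>
#include <string>

// Counts the iterations of the while (!fin.eof()) pattern. The body runs
// even when the extraction inside it fails.
int iterations_with_eof_test(const std::string &text) {
    std::istringstream fin(text);
    std::string str;
    int iterations = 0;
    while (!fin.eof()) {
        fin >> str;      // may fail; str then no longer holds a new word
        ++iterations;
    }
    return iterations;
}

// Counts the iterations of the while (fin >> str) idiom. The body runs
// only after a successful extraction.
int iterations_with_extraction_test(const std::string &text) {
    std::istringstream fin(text);
    std::string str;
    int iterations = 0;
    while (fin >> str) {
        ++iterations;
    }
    return iterations;
}
```

For the input "one two\n", the first function reports 3 iterations and the second reports 2: after "two" is read, the trailing newline is still unread, so eof is not yet set and the body runs once more with a failed extraction. Without the trailing newline both report 2, because reading "two" runs into the end of the input and sets eof.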
** Random picks:
http://cpp.comsci.us/etymology/include/iostream/eof.html
http://stackoverflow.com/questions/292740/c-reading-from-a-file-blocks-any-further-writing-why
OK, I have worked on the code, but I took the one with arrays and not the one with vectors. Now I have a problem, because I have code which tells how many times a word is repeated in the text. If the array has size 10 and there are only 7 words in the file, it prints 3 empty entries so that it reaches size 10. I also don't know how to sort the words by size using vectors (I did this using arrays). So can someone please rewrite my code from arrays to vectors, but in a very simple way so I will understand it?
OK, I think I rewrote those 2 pieces of code with vectors…. But I have a problem in the second one, so can you take a look please?
I have these errors:
string undeclared
expected `;' before "a"
`a’ undeclared (first use this function)
`cout’ undeclared (first use this function)
`endl’ undeclared (first use this function)
You didn't specify the namespace. You need to specify it like this:
std::string a = words[i];
in line 28
std::cout << words[i] << std::endl;
in line 37.
Or, instead of typing std:: all over the place, you can use the using directive:
using namespace std;
Insert it after the include directives and it will apply to all of the following function definitions.
Insert it at the beginning of a function (like main() in this case) and it will apply to the body of the function.
Then you will not need to prepend std:: anymore.
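The snippets that illustrated the two placements are missing from the thread. As a small sketch of the function-scope variant next to the fully qualified one (the function names are invented for this example):

```cpp
#include <sstream>
#include <string>
#include <vector>

// Fully qualified names: no using directive is needed.
std::string join_qualified(const std::vector<std::string> &words) {
    std::ostringstream out;
    for (std::size_t i = 0; i < words.size(); ++i) {
        out << words[i] << '\n';
    }
    return out.str();
}

// Function-scope using directive: it applies only inside this body, so the
// return type and parameters before the brace still need std::.
std::string join_with_directive(const std::vector<std::string> &words) {
    using namespace std;
    ostringstream out;
    for (size_t i = 0; i < words.size(); ++i) {
        out << words[i] << '\n';
    }
    return out.str();
}
```

Placing the same directive right after the include lines instead would make it apply to every function that follows in the file.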
Regards
PS: how can I escape the array indexes, so that they are not treated as a tag?