About the feof() function in C
Recently, while working on an AddressBook assignment, I ran into a strange problem when reading a file. I used feof() to stop reading at the end of the file, but in some cases the last line of the file was never printed.
1. Problem description:
Initially I used the while loop from C How to Program, 8th edition (p. 449, Fig. 11.6) to read the text file line by line:
#include <stdio.h>

int main(void) {
    char buffer[100];
    FILE* inFile;
    if ((inFile = fopen("test.txt", "r")) == NULL) {
        puts("File could not be opened");
    } else {
        // priming read; "%99s" leaves room for the '\0' in buffer[100]
        fscanf(inFile, "%99s", buffer);
        // traditional while loop for reading the file
        // may lose the last line if there is no '\n' at the end of it
        while (!feof(inFile)) {
            printf("%s\n", buffer);
            fscanf(inFile, "%99s", buffer);
        }
        fclose(inFile);
    }
    return 0;
}
I tested the code above on two text files. The files are nearly identical; the only difference is whether there is a newline character '\n' at the end of the file.
Text file 1:
(the leading numbers are line numbers, not part of the file content)
1 Line1\n
2 Line2\n
3 Line3
Note: in this file, there is NO newline character '\n' at the end of the third line.
The output is:
Line1
Line2
The third line is never printed by this while loop!
Then I modified the text file by adding a newline character '\n' at the end of the file.
Text file 2:
1 Line1\n
2 Line2\n
3 Line3\n
Using the same code, the output is:
Line1
Line2
Line3
Now the output is correct!
So here is the question: why does this happen?
Since I am a beginner in C, I don't know how to inspect the source code of feof(). However, I can try to work out how it behaves from what I observed.
- Text file 2 has a '\n' at the end of the file. When fscanf() reads the last line ("Line3\n"), the %s conversion stops at the '\n' and leaves it in the stream. No read has run into the end of the file yet, so the end-of-file indicator of inFile is not set and feof() returns false. The while loop runs one more time: the last line is printed and fscanf() is called again. This time fscanf() skips the '\n', reaches the end of the file without finding a string, and returns EOF. The end-of-file indicator is now set, feof() returns true, and the loop exits.
- Text file 1 has NO '\n' at the end of the file. When fscanf() reads the last line ("Line3"), it keeps reading until it runs into the end of the file. That sets the end-of-file indicator, even though the read itself succeeded and "Line3" is in the buffer. feof() returns true and the loop exits. The last line is never printed, because the printf() for it would only run in the next iteration.
In other words, feof() does not predict whether the next read will fail; it only reports whether an earlier read has already hit the end of the file.
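This behavior can be checked directly. The sketch below is my own illustration, not code from the book, and the helper name eof_set_after_last_token is hypothetical. It reads every token with fscanf() and records whether feof() was already true immediately after the last successful read.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical helper: reads whitespace-separated tokens from `in` and
// returns 1 if the end-of-file indicator was already set right after the
// last successful read, 0 if it was not, and -1 if no token was read.
int eof_set_after_last_token(FILE *in) {
    char buffer[100];
    int result = -1;

    assert(in != NULL);
    // "%99s" leaves room for the terminating '\0' in buffer[100].
    while (fscanf(in, "%99s", buffer) == 1) {
        // feof() reports what the read above has already run into.
        result = feof(in) ? 1 : 0;
    }
    return result;
}
```

For a stream holding "Line1\nLine2\nLine3" I would expect this to return 1, and for "Line1\nLine2\nLine3\n" to return 0, which matches the two outputs observed above.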
2. How to solve this problem
2.1. for fscanf()
I googled this problem and did not find anyone asking the exact question I had, but I got some hints from related questions. On Stack Overflow, someone mentioned that feof() alone is not reliable and that the result of the read itself must also be checked. This leads to the solution below (only the while loop is shown; the other parts are the same):
while (!feof(inFile)) {
    int readNumber = fscanf(inFile, "%s", buffer);
    if (readNumber < 0) {
        break;
    }
    printf("%s\n", buffer);
}
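- We no longer need the priming read (the fscanf() before the while loop); every fscanf() call is inside the loop.
- fscanf() returns an integer: the number of items of the argument list that were successfully filled, or EOF (a negative value) if the input ends before the first item could be read. With "%s" that means 1 on success and EOF at the end of the file, so checking for a negative value is enough here; with other format strings it is safer to compare against the expected number of items. We use this return value to determine whether we have really reached the end of the file. In fact, the feof() call in the loop condition is now redundant, and the condition could simply be always true: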
while (1) {
    int readNumber = fscanf(inFile, "%s", buffer);
    if (readNumber < 0) {
        break;
    }
    printf("%s\n", buffer);
}
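The same idea can be written more compactly by testing the return value of fscanf() in the loop condition itself. This is a minimal sketch under my own assumptions: the helper name print_tokens and the `out` parameter are hypothetical, added so the function can print to any stream.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical helper: prints each whitespace-separated token of `in` on
// its own line to `out` and returns how many tokens were printed.
int print_tokens(FILE *in, FILE *out) {
    char buffer[100];
    int count = 0;

    assert(in != NULL && out != NULL);
    // The loop ends as soon as fscanf() fails to fill buffer,
    // so feof() is not needed to control it.
    while (fscanf(in, "%99s", buffer) == 1) {
        fprintf(out, "%s\n", buffer);
        count++;
    }
    return count;
}
```

Calling print_tokens(inFile, stdout) should print all three lines for both test files, with or without the final '\n'.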
2.2. for fgets()
A disadvantage of fscanf() with "%s" is that reading stops at any whitespace. If the input contains spaces, fscanf() cannot read the whole line in one call. In that case we have to use fgets().
fgets() returns a char pointer, which tells us whether the read succeeded. If the read fails, fgets() returns a NULL pointer.
while (!feof(inFile)) {
    char* line;
    line = fgets(buffer, 100, inFile);
    if (line == NULL) {
        break;
    }
    printf("%s", line);
}
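As with fscanf(), the return value of fgets() can drive the loop on its own. The sketch below is an illustration under my own assumptions (the helper name copy_lines and the `out` parameter are hypothetical). It also shows that fgets() keeps the spaces inside a line.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical helper: copies `in` to `out` piece by piece and returns the
// number of successful fgets() calls. A line longer than 99 characters
// arrives in several pieces, so this is not always the number of lines.
int copy_lines(FILE *in, FILE *out) {
    char buffer[100];
    int count = 0;

    assert(in != NULL && out != NULL);
    // fgets() returns NULL when nothing more can be read.
    while (fgets(buffer, sizeof buffer, in) != NULL) {
        fputs(buffer, out);
        count++;
    }
    return count;
}
```

For a stream holding "John Smith\nJane Doe" this should copy both lines unchanged, spaces included, even though the last line has no '\n'.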
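With fgets(), there is also a different strategy: check the content we read instead of the returned pointer. If the read fails because the end of the file was reached, the contents of the char array passed as the argument (buffer in our example) are not changed. So if feof() is true and the content of buffer ends with '\n', it means that 1) the last read failed, 2) the content of buffer is left over from the read before it, and 3) that content has already been used and should be discarded. This approach is more fragile than checking the returned pointer: buffer must hold a valid string before the first read (think of an empty file), and a read error is not detected at all.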
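buffer[0] = '\0';  // so strlen() is safe even if the file is empty
while (!feof(inFile)) {
    fgets(buffer, 100, inFile);
    size_t len = strlen(buffer);  // strlen() needs <string.h>
    // EOF plus an empty buffer or a trailing '\n' means this read failed
    if (feof(inFile) && (len == 0 || buffer[len - 1] == '\n')) {
        break;
    }
    printf("%s", buffer);
}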
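The claim that a failed read at the end of the file leaves buffer untouched can be checked with a small experiment. This is only a sketch; the helper name buffer_kept_after_failed_read is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: reads until fgets() fails, then returns 1 if buffer
// still holds the last piece that was read successfully, 0 otherwise.
int buffer_kept_after_failed_read(FILE *in) {
    char buffer[100] = "";
    char previous[100] = "";

    assert(in != NULL);
    while (fgets(buffer, sizeof buffer, in) != NULL) {
        strcpy(previous, buffer);  // remember the last successful read
    }
    // Only a failure at end of file guarantees unchanged contents;
    // after a read error the buffer contents are indeterminate.
    if (ferror(in)) {
        return 0;
    }
    return strcmp(previous, buffer) == 0;
}
```

For a stream holding "Line1\nLine2\n" I would expect this to return 1: the final fgets() call fails, and buffer still contains "Line2\n".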
3. Summary
I observed some unexpected behavior of feof() and provided solutions based on deduction. I believe I need to learn how streams and macros work in C to understand this issue better. I will update this post in the future.
11/21/2018
updated on 11/29/2018