Chapter 10 - FILE INPUT/OUTPUT
OUTPUT TO A FILE
Example program ------> FORMOUT.C
Load and display the file named FORMOUT.C for your first example of
writing data to a file. We begin as before with the include statement for
stdio.h, and include the header for the string functions. Then we define
some variables for use in the example including a rather strange looking
new type.
The type FILE is a structure (we will study structures in the
next chapter) and is defined in the stdio.h file. It is used to define
a file pointer for use in file operations. The definition of C requires
a pointer to a FILE type to access a file, and as usual, the name
can be any valid variable name. Many writers use fp for the name
of this first example file pointer so I suppose we should start with it
too.
OPENING A FILE
Before we can write to a file, we must open it. What this really means
is that we must tell the system that we want to write to a file and what
the filename is. We do this with the fopen() function illustrated
in line 11 of the program. The file pointer, fp in our case, will
point to the structure for the file. Two arguments are required for
this function, the filename first, followed by the file attribute. The
filename is any valid filename for your operating system, and can be expressed
in upper or lower case letters, or even mixed if you so desire. It is enclosed
in double quotes. For this example we have chosen the name TENLINES.TXT.
This file should not exist on your disk at this time. If you have a file
with this name, you should change its name or move it because when we execute
this program, its contents will be overwritten. If you don't have a file
by this name, this program will create one and write some data into it.
Note that we are not forced to use a string constant for the file name
as we have done here. This is only done here for convenience. We can use
a string variable which contains the filename then use any method we wish
to fill in the name of the file to open. This will be illustrated later
in this chapter.
READING ("r")
The second parameter is the file attribute and can be any of three letters,
"r", "w", or "a", and must be lower case. There are actually additional
attributes available in C to allow more flexible I/O, and after you complete
your study of this chapter, you should check the documentation for your
compiler to study the additional file opening attributes. When an "r" is
used, the file is opened for reading, a "w" is used to indicate a file
to be used for writing, and an "a" indicates that you desire to append
additional data to the data already in an existing file. Opening a file
for reading requires that the file already exist. If it does not exist,
the file pointer will be set to NULL and can be checked by the program.
It is not checked in this program, but could be easily checked as follows.
    if (fp == NULL) {
        printf("File failed to open\n");
        exit(1);
    }
Good programming practice would dictate that all file pointers be checked
to assure proper file opening in a manner similar to the above code. The
value of 1 used as the parameter of exit() will be explained shortly.
WRITING ("w")
When a file is opened for writing, it will be created if it does not
already exist and it will be reset if it does, resulting in deletion of
any data already there. If the file fails to open for any reason, a NULL
will be returned so the pointer should be tested as above.
APPENDING ("a")
When a file is opened for appending, it will be created if it does not
already exist and it will be initially empty. If it does exist, the data
input point will be set to the end of the data already contained in the
file so that new data will be added to any data that already exists in
the file. Once again, the return value can and should be checked for proper
opening.
OUTPUTTING TO THE FILE
The job of actually outputting to the file is nearly identical to the
outputting we have already done to the standard output device. The only
real differences are the new function names and the addition of the file
pointer as one of the function arguments. In the example program, fprintf()
replaces our familiar printf() function name, and the file pointer
defined earlier is the first argument within the parentheses. The remainder
of the statement looks like, and in fact is identical to, the printf()
statement.
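As a minimal sketch of the open, write, and close sequence described so
far, consider the function below. It is not the FORMOUT.C listing. The
function name write_lines and the text it writes are assumptions made
only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper, not part of FORMOUT.C: writes `count` numbered
   lines to the named file. Returns 0 on success, 1 if the open failed. */
int write_lines(const char *filename, int count)
{
    FILE *fp;
    int index;

    fp = fopen(filename, "w");   /* create the file, or empty an existing one */
    if (fp == NULL) {
        printf("File failed to open\n");
        return 1;
    }

    /* fprintf() is printf() with the file pointer as the first argument */
    for (index = 1; index <= count; index++)
        fprintf(fp, "This is line number %d\n", index);

    fclose(fp);
    return 0;
}
```

Calling write_lines("TENLINES.TXT", 10) from main() would leave ten
lines of text in the file and nothing on the monitor.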
CLOSING A FILE
To close a file, use the function fclose() with the file pointer
in the parentheses. Actually, in this simple program, it is not necessary
to close the file because the system will close all open files before returning
to the operating system. It would be good programming practice for you
to get in the habit of closing all files in spite of the fact that they
will be closed automatically, because that would act as a reminder to you
of what files are open at the end of each program.
You can open a file for writing, close it, and reopen it for reading,
then close it, and open it again for appending, etc. Each time you open
it, you could use the same file pointer, or you could use a different one.
The file pointer is simply a tool that you use to point to a file and you
decide what file it will point to.
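The following sketch reuses one file pointer for writing, appending,
and reading in turn. The function name mode_demo and the text written
are assumptions for illustration only, and the getc() function used to
count the lines is covered later in this chapter.

```c
#include <stdio.h>

/* Hypothetical helper: opens the same file three times with one file
   pointer. Returns the number of lines found in the file at the end,
   or -1 if any of the opens failed. */
int mode_demo(const char *filename)
{
    FILE *fp;
    int c;
    int lines = 0;

    fp = fopen(filename, "w");   /* creates the file or discards old data */
    if (fp == NULL)
        return -1;
    fprintf(fp, "first line\n");
    fclose(fp);

    fp = fopen(filename, "a");   /* same pointer, now adding to the end */
    if (fp == NULL)
        return -1;
    fprintf(fp, "second line\n");
    fclose(fp);

    fp = fopen(filename, "r");   /* same pointer again, now reading */
    if (fp == NULL)
        return -1;
    while ((c = getc(fp)) != EOF)
        if (c == '\n')
            lines++;
    fclose(fp);

    return lines;
}
```

Because the "w" open resets the file each time, the function should
return 2 no matter how many times it is called on the same file.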
Compile and run this program. When you run it, you will not get any
output to the monitor because it doesn't generate any. After running it,
look in your current directory for a file named TENLINES.TXT and examine
its contents. That is where your output will be. Compare the output with
that specified in the program. It should agree. If you add the pointer
test code described above, and if the file couldn't be opened for any reason,
there will be one line of text on the monitor and the file will be empty.
Do not erase the file named TENLINES.TXT yet. We will use it in some
of the other examples in this chapter.
OUTPUTTING A SINGLE CHARACTER AT A TIME
Example program ------> CHAROUT.C
Load the next example file, CHAROUT.C, and display it on your monitor.
This program will illustrate how to output a single character at a time.
The program begins with the include statements, then defines some variables
including a file pointer. The file pointer is named point this time,
but we could have used any other valid variable name. We then define a
string of characters to use in the output function using a strcpy()
function. We are ready to open the file for appending and we do so with
the fopen() function, except this time we write the filename in lower
case. This is done simply to illustrate that some operating systems
don't care about the case of the filename. Others, including UNIX, are
case sensitive, so on those systems you will need to make the case match
that of the existing file before compiling and executing this program. Notice that the file will
be opened for appending so we will add to the lines inserted during the
last program. If the file could not be opened properly, a NULL value is
returned by the fopen() function.
Lines 14 through 18 check to see if the file opened properly and return
an error indication to the operating system if it did not. The constant
named EXIT_FAILURE is defined in the stdlib.h file and is usually defined
to have the value of 1. The constant named EXIT_SUCCESS is also defined
in the stdlib.h file and is usually defined to have the value of 0. The
operating system can use the returned value to determine if the program
operated normally and can take appropriate action if necessary. For example,
if a two part program is to be executed and the first part returns an error
indication, there is no need to execute the second part of the program.
Your compiler probably executes in several passes with each successive
pass depending on successful completion of the previous pass.
The program is actually two nested for loops. The outer loop
is simply a count to ten so that we will go through the inner loop ten
times. The inner loop calls the function putc() repeatedly until
a character in the string named others is detected to be a zero.
This is the terminating null for the string.
THE putc() FUNCTION
The part of the program we are interested in is the putc() function
in line 23. It outputs one character at a time, the character being the
first argument in the parentheses and the file pointer being the second
and last argument. Why the designer of C made the pointer first in the
fprintf() function, and last in the putc() function is a
good question for which there may be no answer. It seems like this would
have been a good place to have used some consistency.
When the textline others is exhausted, a newline is needed because
a newline was not included in the definition above. A single putc()
is then executed which outputs the \n character to return the carriage
and do a linefeed.
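The nested loops just described might be sketched as follows. This is
not the CHAROUT.C listing. The function name append_chars and its
parameters are assumptions, and the text "Additional lines." is taken
from the output discussed later in this chapter.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of CHAROUT.C: appends the same line of
   text `repeat` times, one character at a time. */
int append_chars(const char *filename, int repeat)
{
    FILE *point;
    char others[35];
    int index, count;

    strcpy(others, "Additional lines.");

    point = fopen(filename, "a");          /* add to whatever is already there */
    if (point == NULL) {
        printf("File failed to open\n");
        return EXIT_FAILURE;
    }

    for (count = 1; count <= repeat; count++) {
        /* stop at the terminating null of the string */
        for (index = 0; others[index] != '\0'; index++)
            putc(others[index], point);    /* character first, pointer last */
        putc('\n', point);                 /* the string has no newline of its own */
    }

    fclose(point);
    return EXIT_SUCCESS;
}
```

Each call adds `repeat` new lines after any data already in the file.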
When the outer loop has been executed ten times, the program closes
the file and terminates. Compile and run this program but once again there
will be no output to the monitor. You need to assure that TENLINES.TXT
is in the current directory prior to execution.
Following execution of the program, examine the contents of the file
named TENLINES.TXT and you will see that the 10 new lines were added to
the end of the 10 that already existed. If you run it again, yet another
10 lines will be added. Once again, do not erase this file because we are
still not finished with it.
READING A FILE
Example program ------> READCHAR.C
Load the file named READCHAR.C and display it on your monitor. This
is our first program which can read from a file. This program begins with
the familiar include statements, some data definitions, and the file opening
statement which should require no explanation except for the fact that
an "r" is used here because we want to read from this file. In this program,
we check to see that the file exists, and if it does, we execute the main
body of the program. If it doesn't exist, we print a message and quit.
If the file does not exist, the system will set the pointer equal to NULL
which we test in line 12. If the pointer is NULL we display a message and
terminate the program.
The main body of the program is one do while loop in which a
single character is read from the file and output to the monitor until
an EOF (end of file) is detected from the input file. The file is then
closed and the program is terminated.
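A read loop of this kind might be sketched as shown here. This is not
the READCHAR.C listing. The function name show_file and its return
value are assumptions for illustration, and this version tests for the
EOF before it displays anything.

```c
#include <stdio.h>

/* Hypothetical helper, not part of READCHAR.C: displays a file on the
   monitor one character at a time. Returns the number of characters
   displayed, or -1 if the file could not be opened. */
long show_file(const char *filename)
{
    FILE *fp;
    int c;              /* int, so that EOF can be told apart from data */
    long shown = 0;

    fp = fopen(filename, "r");
    if (fp == NULL) {
        printf("File doesn't exist\n");
        return -1;
    }

    do {
        c = getc(fp);           /* get one character from the file */
        if (c != EOF) {
            putchar(c);         /* display it on the monitor */
            shown++;
        }
    } while (c != EOF);         /* repeat until EOF (end of file) */

    fclose(fp);
    return shown;
}
```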
CAUTION CAUTION CAUTION
At this point, we have the potential for one of the most common and
most perplexing problems of programming in C. The getc() function
returns an int, not a char, so that it can return every possible
character value plus the special EOF value. A problem develops if we
store the result in an unsigned char, because EOF is a negative value
(usually minus one) and an unsigned char type variable is not capable
of containing a negative value. With an 8 bit char, it can only hold the
values of zero to 255, so the minus one is stored as 255, which can
never compare equal to the EOF. This is a very frustrating problem to try to
find. The program can never find the EOF and will therefore never terminate
the loop. This is easy to prevent. Always use an int type variable
when the return can be an EOF, because an int is always signed.
According to the ANSI-C standard, a char can be implemented as either
a signed or an unsigned type by any particular compiler.
Some compilers use a char type that is not 8 bits long. If your compiler
uses other than 8 bits for a char type variable, the same arguments apply.
Do not use an unsigned type if you need to check for an EOF returned by
the function, because an EOF is usually defined as -1 which cannot be returned
in an unsigned type variable.
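The effect can be seen without reading any file at all. In this small
sketch the function names are assumptions chosen for illustration. The
EOF value is stored once in an unsigned char and once in an int, and
each stored value is then compared to EOF.

```c
#include <stdio.h>

/* Returns 1 if the value is EOF, 0 otherwise. */
int is_eof(int value)
{
    return value == EOF;
}

/* Stores EOF in an unsigned char first. The negative value cannot be
   held, so what comes back out is a positive number. */
int eof_found_with_unsigned_char(void)
{
    unsigned char c = (unsigned char)EOF;
    return is_eof(c);
}

/* Stores EOF in an int. The value is held unchanged. */
int eof_found_with_int(void)
{
    int c = EOF;
    return is_eof(c);
}
```

The first function returns 0 because the stored value can no longer
match EOF, and the second returns 1.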
There is yet another problem with this program but we will worry about
it when we get to the next program and solve it with the one following
that.
After you compile and run this program and are satisfied with the results,
it would be a good exercise to change the name of TENLINES.TXT and run
the program again to see that the NULL test actually works as stated. Be
sure to change the name back because we are still not finished with TENLINES.TXT.
In a real production program, you would not actually terminate the program.
You would give the user the opportunity to enter another filename for input.
We are interested in illustrating the basic file handling techniques here,
so we are using a very simple error handling method.
READING A WORD AT A TIME
Example program ------> READTEXT.C
Load and display the file named READTEXT.C for an example of how to
read a word at a time. This program is nearly identical to the last except
that this program uses the fscanf() function to read in a string
at a time. Because the fscanf() function stops reading when it finds
a space or a newline character, it will read a word at a time, and display
the results one word to a line. You will see this when you compile and
run it, but first we must examine a programming problem.
It is left as an exercise for the student to include a check for proper
file opening and performing a meaningful response if it does not open.
A meaningful response is to simply output an error message and exit to
the operating system.
THIS IS A PROBLEM
Inspection of the program will reveal that when we read data in and
detect the EOF, we print out something before we check for the EOF resulting
in an extra line of printout. What we usually print out is the same thing
printed on the prior pass through the loop because it is still in the buffer
named oneword. We therefore must check for EOF before we execute
the printf() function. This has been done in READGOOD.C, which you
will shortly examine, compile, and execute.
Compile and execute the program we have been studying, READTEXT.C and
observe the output. If you haven't changed TENLINES.TXT you will end up
with "Additional" and "lines." on two separate lines with an extra "lines."
displayed at the end of the output because of the printf() before
checking for EOF. Note that some compilers apparently clear the buffer
after printing so you may get an extra blank line instead of two lines
with "lines." on them.
Notice that we failed to check that the file opened properly. This is
very poor practice, and it will be left as an exercise for you to add the
required code to do so in a fashion similar to that used in the READCHAR.C
example program.
NOW LET'S FIX THE PROBLEM
Example program ------> READGOOD.C
Compile and execute READGOOD.C and observe that the extra "lines." does
not get displayed because of the extra check for the EOF in the middle
of the loop. This was also the problem referred to when we looked at READCHAR.C,
but I chose not to expound on it there because the error in the output
was not so obvious.
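The difference between the two approaches can be sketched as below.
These are not the READTEXT.C and READGOOD.C listings. The function
names, the returned counts, and the field width of 99 in the format
string are assumptions added for illustration.

```c
#include <stdio.h>

/* Prints before testing: the final, failed read still reaches printf().
   Returns the number of lines printed, or -1 if the open failed. */
int print_then_check(const char *filename)
{
    FILE *fp1 = fopen(filename, "r");
    char oneword[100] = "";
    int c, printed = 0;

    if (fp1 == NULL)
        return -1;
    do {
        c = fscanf(fp1, "%99s", oneword);   /* read one word */
        printf("%s\n", oneword);            /* runs even when c is EOF */
        printed++;
    } while (c != EOF);
    fclose(fp1);
    return printed;
}

/* Tests before printing: nothing is displayed for the failed read. */
int check_then_print(const char *filename)
{
    FILE *fp1 = fopen(filename, "r");
    char oneword[100] = "";
    int c, printed = 0;

    if (fp1 == NULL)
        return -1;
    do {
        c = fscanf(fp1, "%99s", oneword);
        if (c != EOF) {                     /* the extra check for EOF */
            printf("%s\n", oneword);
            printed++;
        }
    } while (c != EOF);
    fclose(fp1);
    return printed;
}
```

For a file holding the two words "Additional lines." the first function
prints three lines and the second prints two.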
Once again there is no check for the file opening properly, but you
know how to fix it by now and you should do so as an exercise.
We should point out that an experienced C programmer would not write
the code as given in this example because it compares c to EOF twice
during each pass through the loop and this is inefficient. We have been
using code that works and is very easy to understand, but as you gain experience
with C, you will begin to use more efficient coding methods, even if they
tend to become harder to read and understand. An experienced C programmer
would code lines 12 through 17 of READGOOD.C in the following manner:
    while ((c = fscanf(fp1, "%s", oneword)) != EOF)
    {
        printf("%s\n", oneword);
    }
There is no question that this code is more difficult to read, but if you
spend some time studying it, you will find that it is identical to the
code in the example program. Even though it is more efficient, it is not
clear whether the slight gain in efficiency is worth the reduced readability.
If the program saves ten milliseconds when reading in a file once a day,
and it takes a programmer an hour longer to make a modification to the
code a year after the program is released, there is not much savings in
using such code. Even though the time assumptions are all judgement calls
in the above text, it is plain to see that there are often tradeoffs when
writing a program. You will make many decisions concerning execution efficiency
and readability when you are writing non-trivial programs.
This is a rather contrived example, because most experienced C programmers
would not think this code is at all cryptic. It is written in standard
C notation. As you gain experience, you will come to accept this as clearly
written C code. The philosophical argument about code complexity and readability
has been made however, and should be considered for all software development.
FINALLY, WE READ A FULL LINE
Example program ------> READLINE.C
Load and display the file READLINE.C for an example of reading a complete
line. This program is very similar to those we have been studying except
that we read a complete line in this example program.
We are using fgets() which reads an entire line, including the
newline character, into a buffer. The buffer to be read into is the first
argument in the function call, and the maximum number of characters to
read is the second argument, followed by the file pointer. This function
will read characters into the input buffer until it either finds a newline
character, or it reads the maximum number of characters allowed minus one.
It leaves one character for the end of string null character. In addition,
if it reaches the EOF before reading any characters, it will return a value
of NULL. In our example, when the EOF is found, the pointer named c will be
assigned the value of NULL. NULL is the null pointer constant, and it is
defined in your stdio.h file.
When we find that the pointer named c has been assigned the value
of NULL, we can stop processing data, but we must check before we print
just like in the last program. Last of course, we close the file.
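A line at a time loop of this kind might look like the sketch below.
This is not the READLINE.C listing. The function name show_lines, the
buffer size of 100, and the returned count are assumptions for
illustration.

```c
#include <stdio.h>

/* Hypothetical helper, not part of READLINE.C: displays a file a line
   at a time. Returns the number of successful fgets() calls, or -1 if
   the file could not be opened. */
int show_lines(const char *filename)
{
    FILE *fp1;
    char oneword[100];
    char *c;
    int lines = 0;

    fp1 = fopen(filename, "r");
    if (fp1 == NULL) {
        printf("File failed to open\n");
        return -1;
    }

    do {
        /* buffer first, then its size, then the file pointer */
        c = fgets(oneword, 100, fp1);
        if (c != NULL) {                 /* check before we print */
            printf("%s", oneword);       /* the newline is already in the buffer */
            lines++;
        }
    } while (c != NULL);

    fclose(fp1);
    return lines;
}
```

A line longer than 99 characters would be delivered in more than one
piece, so the count matches the number of lines only for shorter lines.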
HOW TO USE A VARIABLE FILENAME
Example program ------> ANYFILE.C
Load and display the program ANYFILE.C for an example of reading from
any file. This program asks the user for the filename desired, and reads
in the filename, storing it in a string. Then it opens that file for reading.
The entire file is then read and displayed on the monitor. It should pose
no problems to your understanding so no additional comments will be made.
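One way to read a filename into a string and then open it is sketched
below. This is not the ANYFILE.C listing. The function name
open_requested_file, the use of fgets() to read the name, and the
buffer size are all assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of ANYFILE.C: reads a filename from
   `input` (normally stdin) and tries to open that file for reading.
   Returns the file pointer, or NULL if no name could be read or the
   file did not open. */
FILE *open_requested_file(FILE *input)
{
    char filename[100];

    printf("Enter filename -> ");
    if (fgets(filename, sizeof filename, input) == NULL)
        return NULL;

    /* fgets() keeps the newline, which is not part of the filename */
    filename[strcspn(filename, "\n")] = '\0';

    return fopen(filename, "r");    /* a string variable, not a constant */
}
```

A program would call open_requested_file(stdin), test the result
against NULL, and then read from the returned pointer as in the earlier
examples.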
Compile and run this program. When it requests a filename, enter the
name and extension of any text file available, even one of the example
C programs.
Enter an invalid file name to see what the system does when it cannot
open the file. If you were a user of this program, and possibly not very
computer literate, would you prefer that the program gave you a cryptic
message of some sort, or would you prefer that the program displayed a
neat message such as "The file you have asked for is not available. Would
you like to enter another filename?". Of course this is the message that
you can emit when you find that the file did not open properly. Your users
will appreciate the effort you put into error handling for their program.
HOW DO WE PRINT?
Example program ------> PRINTDAT.C
Load the last example program in this chapter, the one named PRINTDAT.C
for an example of how to print. This program should not present any surprises
to you, so we will move very quickly through it.
Once again, we open TENLINES.TXT for reading and we open PRN for writing.
Printing is identical to writing data to a disk file except that we use
a standard name for the filename. On MS-DOS and Windows systems, PRN is a
reserved device name that tells the operating system to send the output to
the printer. There are other names that are used occasionally such as LPT,
LPT1, or LPT2. Check the documentation for your particular compiler and
operating system. Some compilers use a predefined file pointer such as
stdprn for the print file. Once again, check your documentation.
The program is simply a loop in which a character is read, and if it
is not the EOF, it is displayed and printed. When the EOF is found, the
input file and the printer output files are both closed. Note that good
programming practice includes checking both file pointers to assure that
the files opened properly. You can now erase TENLINES.TXT from your disk.
We will not be using it in any of the later chapters.
A READING ASSIGNMENT
Spend some time studying the documentation for your compiler and reading
about the following functions. You will not understand everything about
them but you will get a good idea of how the library functions are documented.
fopen(), fclose(), putc(), putchar(), printf(), fprintf(), scanf(),
fgets()
Also spend some time studying stdio.h, looking for prototypes for the
above functions and for the declaration of FILE.
PROGRAMMING EXERCISES
- Write a program that will prompt for a filename for an input file, prompt
  for a filename for a write file, and open both plus a file to the printer.
  Enter a loop that will read a character, and output it to the file, the
  printer, and the monitor. Stop at EOF.
- Prompt for a filename to read. Read the file a line at a time and display
  it on the monitor with line numbers.
- Modify ANYFILE.C to test if the file exists and print a message if it doesn't.
  Use a method similar to that used in READCHAR.C.