File - A file is a resource for storing and retrieving information. Most
operating systems present a file to programs as a one-dimensional array of bytes. There are many types of files, such as text files, program
files and data files.
Why Files? - Primary memory is volatile, so a program's data is lost
when its execution finishes. Data that is needed later, whether for further processing or just for reading, has
to be stored in files.
File System - A method of organizing files on the disk so
that they can be retrieved later. Different operating systems use different
file systems to manage files on disk. For example, Windows uses the
FAT (File Allocation Table) and NTFS (New Technology File System) file systems.
Buffer - A buffer is a block of main memory
(RAM) used to hold data temporarily while it is being read or written. Accessing
the disk for every single read or write is slow, so data is moved in larger chunks through the buffer.
When reading, a chunk of data is first copied from the disk into the buffer and the program is served from there.
When writing, data is first collected in the buffer and written to the disk later,
for example when the buffer fills up or the file is closed.
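The effect of the output buffer can be observed with a small sketch. It assumes a hypothetical file name chosen by the caller, and the helper names bytes_on_disk() and buffer_demo() are made up for illustration. setvbuf() and fflush() are standard C functions. On typical implementations the file is still empty before fflush() is called, but the C standard does not guarantee exactly when a buffer is written out.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes currently stored in the file at
   path by reading it through a separate stream. Returns -1 if the file
   cannot be opened. */
long bytes_on_disk(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    long n = 0;
    while (fgetc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}

/* Hypothetical helper: writes "hello" through a fully buffered stream and
   records the on-disk size before and after fflush(). Returns 0 on success. */
int buffer_demo(const char *path, long *before_flush, long *after_flush)
{
    static char buf[BUFSIZ];            /* must outlive the stream */
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    if (setvbuf(fp, buf, _IOFBF, sizeof buf) != 0) {
        fclose(fp);
        return -1;
    }
    fputs("hello", fp);                 /* lands in buf, usually not yet on disk */
    *before_flush = bytes_on_disk(path);
    fflush(fp);                         /* forces the buffer out to the file */
    *after_flush = bytes_on_disk(path);
    return fclose(fp) == 0 ? 0 : -1;
}
```

After a call such as buffer_demo("buffer_demo.txt", &before, &after), after is 5 (the length of "hello"), while before shows how much had reached the file while the data was still sitting in the buffer.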
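Note - The file pointer (more precisely, the file position indicator) is the current read/write position in the file at any particular time. It should not be confused with the FILE * pointer returned by fopen().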
Text File - A text file is a file stored on the disk using
a character encoding standard such as ASCII or UTF-8.
ASCII is a 7-bit code, and each code is
stored in one byte on disk. So if you store the character 'a', it will be
stored as 97 (the ASCII code for the character a is 97). There are 128 different ASCII
codes for storing different characters on disk.
At disk level these files are still binary
files. The only difference is that their bytes are encoded with ASCII or
UTF-8 codes, i.e. each byte is written using these codes.
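To see that a character really is stored as its code, here is a minimal sketch. The helper name stored_byte_value() and the file name passed to it are hypothetical, and the result 97 assumes an ASCII-compatible system. It writes one character in text mode and reads the same byte back in binary mode.

```c
#include <stdio.h>

/* Hypothetical helper: writes one character to a text file, then reopens
   the file in binary mode and returns the value of the first byte.
   Returns -1 if the file cannot be opened, written, or read. */
int stored_byte_value(const char *path, char ch)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    int put_result = fputc(ch, fp);
    if (fclose(fp) == EOF || put_result == EOF)
        return -1;

    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    int byte = fgetc(fp);       /* EOF (a negative int) if nothing was read */
    fclose(fp);
    return byte == EOF ? -1 : byte;
}
```

On an ASCII-compatible system, stored_byte_value("ascii_demo.txt", 'a') returns 97.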
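One important aspect of these files is the end of file (EOF).
On DOS and Windows, a CTRL+Z character (byte value 26) has traditionally been treated as an end-of-file marker in text mode. On Unix-like systems no such character is stored in the file; the end of the file is simply the point after its last byte. In C, EOF is a negative int value returned by functions such as getc(), not a character stored in the file.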
Binary File - General binary files
do not have any restriction on how they are stored on disk. Each byte of a binary file can
hold any of the 256 possible 8-bit patterns. Executable files, object files, sound files and
image files are all binary files.
The end of a binary file is the position just after its last byte; there is no special EOF character.
File Operations:
1) Create
2) Open
3) Write
4) Read
5) Move
6) Close
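File Modes: The mode states the purpose of opening the file. The following modes
are supported by C:
1) r - read; the file must already exist
2) w - write; the file is created if it does not exist, otherwise its contents are discarded
3) a - append; the file is created if it does not exist, and writes always go to the end
4) r+ - read and write; the file must already exist
5) w+ - read and write; the file is created if it does not exist, otherwise its contents are discarded
6) a+ - read and append; the file is created if it does not exist, and writes always go to the end
7) rb - binary file read
8) wb - binary file write (create the file, or discard its existing contents)
9) ab - binary file append
10) rb+ - binary file read and write; the file must already exist
11) wb+ - binary file read and write (create the file, or discard its existing contents)
12) ab+ - binary file read and append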
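The difference between the r, w and a modes can be sketched as follows. The helper names write_with_mode() and file_length() and the file name used in the usage note are hypothetical, chosen only for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: opens path in the given mode, writes text to it and
   closes it again. Returns 0 on success, -1 on any failure. */
int write_with_mode(const char *path, const char *mode, const char *text)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    int ok = (fputs(text, fp) != EOF);
    if (fclose(fp) == EOF)
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: returns the length of the file in bytes, or -1 if
   it cannot be opened for reading (for example because it does not exist). */
long file_length(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    long n = 0;
    while (fgetc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```

Starting with no file present, file_length("modes_demo.txt") returns -1 because a read mode needs an existing file. Writing "abc" with mode "w" creates the file with 3 bytes, writing "de" with mode "a" grows it to 5 bytes, and writing "x" with mode "w" again discards the old contents and leaves 1 byte.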
Functions for handling basic file operations
in C:
1) fopen() - creates a new file or opens an
existing file.
2) fclose() - closes a file.
3) fread() - reads a block of unformatted (binary) data from a file.
4) fwrite() - writes a block of unformatted (binary) data to a file.
Other functions used for handling files in C:
1) fseek() - sets the position of the file pointer to
the desired location in the file.
2) ftell() - returns the current position of
the file pointer in the file.
3) rewind() - sets the position of the file
pointer to the beginning of the file.
4) getc(), fscanf(), getw() - read a
character, formatted data and an integer from a file, respectively. Note that getw() is not part of standard C.
5) putc(), fprintf(), putw() - write a
character, formatted data and an integer to a file, respectively. putw() is likewise not part of standard C.
Functions explained -
1) fopen() - This function opens the named file in the
specified mode. It returns a pointer that can be used to
manipulate the file. If the call fails for any reason, it
returns a null pointer.
Declaration - FILE *fopen(const char *filename, const char *mode);
Example -
FILE *ptr;
ptr = fopen("myfile.txt", "w");
// A FILE type pointer is declared. It will hold the pointer returned by fopen(), which identifies the opened stream.
// fopen() is called with two arguments: myfile.txt (the file which is to be opened or created)
// and w (the mode in which we want to open the file; w specifies that we just want to write to the file).
Note - If myfile.txt does not exist it will be created. If it already exists, its contents are discarded.
If the file cannot be opened,
fopen() returns NULL. This can happen for many reasons: the file is write
protected, a file opened for reading does not exist, etc. So most programs check
whether the fopen() call was successful. To do this you will see code like the
following after every fopen() call.
if(!ptr)
{
return 1;
}
This if statement makes the enclosing function return 1 when
fopen() is not successful, indicating that something went wrong with the call.
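As a slightly fuller sketch of the same check, the hypothetical helper can_open() below also reports why the call failed. perror() is a standard function that prints a message based on errno, which most implementations (including all POSIX systems) set when fopen() fails.

```c
#include <stdio.h>

/* Hypothetical helper: tries to open path in the given mode. On failure it
   prints a diagnostic with perror() and returns 0. On success it closes the
   stream again and returns 1. */
int can_open(const char *path, const char *mode)
{
    FILE *ptr = fopen(path, mode);
    if (ptr == NULL) {
        perror(path);       /* e.g. "missing.txt: No such file or directory" */
        return 0;
    }
    fclose(ptr);
    return 1;
}
```

For a file that does not exist yet, can_open(name, "r") returns 0, while can_open(name, "w") creates the file and returns 1.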
2) fclose() - This function closes the stream opened by the
fopen() function. Any data still held in the output buffer is written to the file before it is closed.
Declaration - int fclose(FILE *ptr);
As we see in the declaration, fclose() is
called with a FILE type pointer as its argument. This must be the
pointer that was returned by the fopen() call.
If the file is closed successfully this
function returns 0, otherwise it returns EOF (End Of File).
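Because buffered output is only written out when the stream is flushed or closed, the return value of fclose() is worth checking after writing. A minimal sketch, with the hypothetical helper name write_line_and_close():

```c
#include <stdio.h>

/* Hypothetical helper: writes one line to path and closes the file.
   Returns 0 only if both the write and the close succeeded. */
int write_line_and_close(const char *path, const char *line)
{
    FILE *ptr = fopen(path, "w");
    if (ptr == NULL)
        return -1;

    int status = 0;
    if (fputs(line, ptr) == EOF)
        status = -1;

    /* fclose() returns 0 on success and EOF on failure. A failure here can
       mean that buffered data never reached the file. */
    if (fclose(ptr) == EOF)
        status = -1;

    return status;
}
```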
3) fread() - This function reads the specified number of
elements from an input stream (a file in general terms), stores them in a
block of memory and returns the number of items read.
In other words this function reads
unformatted data from a stream into a buffer.
Declaration - size_t fread(void *buffer, size_t size, size_t count, FILE *ptr);
Here,
buffer - pointer to the buffer where the
data we read will be stored.
size - size in bytes of each element to be read.
count - number of elements to read.
ptr - pointer to the file stream from which
data will be read.
This function returns the number of items
read. If the number of items read is less than the requested amount (the count
parameter), we need to check whether this is because of an error or because EOF was
reached. We can do this with the help of the feof() and ferror() functions.
What is Stream? - A stream is a sequence of bytes
flowing from some source or to some destination. In C, an open stream is represented by a FILE object, and fopen() returns a pointer to it. The difference between a file and a stream is that the file is
the complete set of data stored on disk, while the stream is the channel through which part of that data passes via the
buffer.
Hence when we want to create, copy, delete,
move or open a file we work with the file itself. But when we just want to read and write
data we use a stream to do that.
Example -
Note - In the following example I have not
checked the error conditions while opening the file with fopen() or while
reading the file with fread().
size_t count;
char buffer[1000]; // Buffer where data is stored while reading
FILE *file_stream; // Stream from which data will be read
char *filename = "c:\\myfile.txt"; // file path and name
file_stream = fopen(filename, "r");
count = fread(buffer, sizeof(char), 1000, file_stream);
The last line of code reads up to 1000 bytes from
file_stream into the array named buffer.
count holds the number of bytes actually read.
If count is not equal to 1000, this
means either we reached EOF before 1000 bytes or some error occurred. To check
what happened we can use the feof() or ferror() functions described later.
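For comparison, here is a sketch of the same read with the error conditions checked. The helper name read_block() and its flag parameters are hypothetical; fread(), feof() and ferror() are the standard functions mentioned above.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to capacity bytes from the file at path
   into buffer. Returns the number of bytes read. *hit_eof is set to 1 if
   the read stopped at end of file, *had_error is set to 1 if the file
   could not be opened or a read error occurred. */
size_t read_block(const char *path, char *buffer, size_t capacity,
                  int *hit_eof, int *had_error)
{
    *hit_eof = 0;
    *had_error = 0;

    FILE *file_stream = fopen(path, "rb");
    if (file_stream == NULL) {
        *had_error = 1;
        return 0;
    }

    size_t count = fread(buffer, sizeof(char), capacity, file_stream);
    if (count < capacity) {
        /* A short read is normal at end of file; ferror() tells the
           two cases apart. */
        if (feof(file_stream))
            *hit_eof = 1;
        if (ferror(file_stream))
            *had_error = 1;
    }

    fclose(file_stream);
    return count;
}
```

For a 10-byte file and a 1000-byte buffer, read_block() returns 10 with *hit_eof set to 1 and *had_error left at 0.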
4) fwrite() - This function writes unformatted data from a buffer to a stream.
Declaration - size_t fwrite(const void *buffer, size_t size, size_t count, FILE *ptr);
Here,
buffer - pointer to the buffer containing
the data.
size - size in bytes of each element to be written.
count - number of elements to write.
ptr - pointer to the file stream to which data will
be written.
Example -
FILE *fp;
char str[] = "My name is KBanyal"; // String which we want to write to the file.
fp = fopen("file.txt", "w"); // fp is the pointer to the file we need to write data to.
fwrite(str, 1, sizeof(str), fp); // sizeof(str) also counts the terminating '\0', so that byte is written too.
Note -
fwrite() returns the total number of elements successfully written. If this number
is less than count, an error has occurred. The ferror()
function can be used to check whether an error occurred while writing to the
file.
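The return value check described in the note can be sketched like this. The helper name save_text() is hypothetical. Unlike the example above, this sketch uses strlen() so that the terminating '\0' is not written to the file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes text (without its terminating '\0') to the
   file at path using fwrite(). Returns 0 on success, -1 on failure. */
int save_text(const char *path, const char *text)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    size_t length = strlen(text);
    size_t written = fwrite(text, 1, length, fp);

    /* Fewer elements written than requested means an error occurred. */
    int failed = (written != length) || ferror(fp);

    if (fclose(fp) == EOF)
        failed = 1;

    return failed ? -1 : 0;
}
```

After save_text("fwrite_demo.txt", "My name is KBanyal") the file holds exactly the 18 characters of the string.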
5) fscanf() - This function reads formatted data from the stream. It reads bytes,
interprets them according to a format and stores the results in its arguments.
It returns the number of items successfully read and stored. Items that a specifier
tells fscanf() to skip are not included
in the count. If the end of the file is reached before anything could be read, it returns EOF.
Declaration - int fscanf(FILE *fp, const char *format, ...);
Here,
fp = pointer to a FILE object which is the stream to read from.
format = the format string. In it, a
sequence starting with a % sign is a format specifier. Specifiers are used
to specify the type and format of the data to be retrieved from the stream and
stored into the locations pointed to by the
additional arguments.
The prototype for an fscanf() format specifier is something like this - %[*][width][length]specifier
Here specifier specifies which characters are extracted from the stream. For example, the
following specifiers can be used to extract the corresponding element from the
stream.
%d reads an integer
%f reads a float
%lf reads a double
%c reads a character, including white space. If more than 1 character needs to be read at
a time, specify the width.
%s reads a string up to the first white space
%[...] reads a string, up to the first character not in the brackets
Example - %[abk] will read 'kbanyal' as 'kba'.
%[0123456789] would read in digits
%[^...] reads a string, up to the first character in the brackets
%[^\n] would read everything up to a newline
* is used to skip an element in the stream: the element is read but not stored.
Example - %*d will read one integer from the input stream and discard it.
Examples:
fscanf(infile, "%d,%c", &x, &c); // read an int and a char from the file, where the int and the char are separated by a comma
fscanf(infile, "%s", array); // read a string from the file into array; stops at white space
fscanf(infile, "%lf %24s", &d, array); // read a double and a string of up to 24 chars from infile
fscanf(infile, "%20[012345]", array); // read a string of at most 20 chars consisting only of chars in the set
fscanf(infile, "%ld %d%c", &x, &b, &c); // read two integer values, storing the first in a long and the second in an int, then read the end of line char into c
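Since fscanf() reports how many items it stored, a read is normally followed by a check of that count. The sketch below assumes a made-up record layout, "<id>,<grade> <name>", and the helper names read_record() and read_second_int() are hypothetical.

```c
#include <stdio.h>

/* Hypothetical record layout, assumed for this sketch:
   "<id>,<grade> <name>", for example "42,A kbanyal".
   name must have room for at least 25 chars (24 plus the '\0').
   Returns 0 if all three items were read, -1 otherwise. */
int read_record(FILE *infile, int *id, char *grade, char *name)
{
    /* %d reads the int, the comma must match literally, %c reads one
       char, the blank skips white space, %24s reads at most 24 chars. */
    int items = fscanf(infile, "%d,%c %24s", id, grade, name);
    return items == 3 ? 0 : -1;
}

/* Skips one integer with %*d and stores the next one. The skipped integer
   is not counted in the return value of fscanf(), so success means 1. */
int read_second_int(FILE *infile, int *value)
{
    return fscanf(infile, "%*d %d", value) == 1 ? 0 : -1;
}
```

Given the line "42,A kbanyal", read_record() stores 42, 'A' and "kbanyal". Given "10 20", read_second_int() stores 20. At end of file fscanf() returns EOF, so both helpers return -1.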
6) fprintf() - This function is
used to send formatted output to a file stream. It returns the total
number of characters printed. If an error occurs it returns a negative number.
Declaration - int fprintf(FILE *fp, const char *fs, ...);
fp = pointer to a file (actually a stream)
fs = format string we want to write to the file (actually the stream)
Return Value -
On success - total number of characters printed.
On error - negative number
fs, the format string, can be formed using different format tags.
Prototype = %[flags][width][.precision][length]specifier
The most common specifiers are the following:
%d = Displays an integer
%f = Displays a floating-point number in fixed decimal format
%e = Displays a floating-point number in exponential notation
%s = Displays a string of characters
%u = Displays an unsigned decimal integer
%c = Displays a character
Since there are lots of format tags and most of them
will rarely be used, I am just giving one or two examples for
each kind of tag below.
Flags = flags can be used
for various purposes. The main one is to format data as per requirement. For example -
1) - = is used to left justify
the field.
Example - fprintf(fp,"%-10d %c",143,'k');
Output: 143        k
7 blanks pad the field after 143, followed by the 1 blank from the format string.
2) 0 = pads the field with zeros
rather than blanks.
Example - fprintf(fp,"%010d %c",143,'k');
Output: 0000000143 k
The blanks on the left side are replaced with
7 0's.
Width = every value is printed with the
minimum width required to hold it. The width format tag can be used to
increase it further.
1) %[width]d - sets the minimum
field width of the given integer.
Example - fprintf(fp,"%4d",7);
Output:    7
3 blanks to the left of 7 to
increase its width.
2) %[width]s - sets the minimum
field width of the given string.
Example - fprintf(fp,"%15s","kbanyal");
Output:         kbanyal
8 blank spaces before kbanyal
to increase its width.
Precision = This format
tag takes different meanings for different format types.
1) %[width].[decimal]f - Here
the minimum field width is [width] and [decimal] is the number of digits printed after the decimal
point for a float value.
Example - fprintf(fp,"%10.3f",546666.7);
Output: 546666.700
2) %[minimum].[maximum]s - Here
the minimum width is [minimum] and the maximum number of characters printed is [maximum] for a string value.
If the string is longer than [maximum] it will be cropped to [maximum] characters.
Example - fprintf(fp,"%3.4s","kbanyal");
Output: kban
Length = the length modifier tells fprintf()
the size of the argument, i.e. that it is shorter or longer than the default type for the specifier. For example -
1) short int - The modifier to be
used in this case is 'h'. A short int always takes fewer or the same number of bits as an int,
that is short int <= int <= long.
Example - fprintf(fp,"%hd",1);
Output: 1
2) long double - The modifier to be
used in this case is 'L'.
Example - fprintf(fp,"%Lf",10.000000001223L); // the L suffix makes the constant a long double, as %Lf requires
Output: 10.000000
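The flag, width and precision examples can be collected into one sketch that also uses the return value of fprintf(). The helper names tally() and write_format_samples() are hypothetical. The comment on each line shows the text that the call produces, with the width of each field noted.

```c
#include <stdio.h>

/* Adds the result of one fprintf() call to *total.
   Returns -1 if that call reported an error (negative result). */
static int tally(int *total, int n)
{
    if (n < 0)
        return -1;
    *total += n;
    return 0;
}

/* Hypothetical helper: writes one sample line per format tag and returns
   the total number of characters written, or -1 on error. */
int write_format_samples(FILE *fp)
{
    int total = 0;

    /* "143" left justified in a field of 10, then a blank and 'k' */
    if (tally(&total, fprintf(fp, "%-10d %c\n", 143, 'k')) != 0) return -1;
    /* "0000000143 k": zero padded on the left to a width of 10 */
    if (tally(&total, fprintf(fp, "%010d %c\n", 143, 'k')) != 0) return -1;
    /* "   7": right justified in a field of 4 */
    if (tally(&total, fprintf(fp, "%4d\n", 7)) != 0) return -1;
    /* 8 blanks, then "kbanyal": right justified in a field of 15 */
    if (tally(&total, fprintf(fp, "%15s\n", "kbanyal")) != 0) return -1;
    /* "546666.700": width 10, 3 digits after the decimal point */
    if (tally(&total, fprintf(fp, "%10.3f\n", 546666.7)) != 0) return -1;
    /* "kban": at most 4 characters of the string are printed */
    if (tally(&total, fprintf(fp, "%3.4s\n", "kbanyal")) != 0) return -1;

    return total;
}
```

Called with stdout as the argument, the helper prints the six lines to the screen. Each count includes the newline character, so the total returned is 63.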
7) fseek() - This function
sets the current position in a file to a new location.
When we read from or write to a file, the stream keeps
track of our location in the file using the file pointer. At any time during reading or
writing, if we want to move to another location in the file we can
use the fseek() function for this.
Declaration - int fseek(FILE *ptr, long offset, int origin);
Here,
ptr = pointer to the file
offset = the offset within the file (in bytes), counted from origin
origin = the starting point. We can set it
using the following values:
SEEK_SET - Beginning of the file
SEEK_CUR - Current position of the file
pointer.
SEEK_END - End of file.
Return value -
On success - zero
On failure - nonzero value
Examples:
1) fseek(fp, 100, SEEK_SET); // Move to 100 bytes after the start of the file.
2) fseek(fp, 100, SEEK_CUR); // Move 100 bytes forward from the current position.
3) fseek(fp, -100, SEEK_END); // Move to 100 bytes before the end of the file.
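The three origins can be tried out with a small sketch. The helper name byte_at() is hypothetical, and the stream is assumed to be opened in binary mode ("rb") so that offsets are plain byte counts.

```c
#include <stdio.h>

/* Hypothetical helper: moves the file pointer to offset bytes from origin
   and returns the byte found there. Returns -1 if the seek fails or if
   there is no byte to read at that position. */
int byte_at(FILE *fp, long offset, int origin)
{
    if (fseek(fp, offset, origin) != 0)   /* nonzero means failure */
        return -1;
    int ch = fgetc(fp);                   /* reading advances by one byte */
    return ch == EOF ? -1 : ch;
}
```

For a file containing "abcdefghij", byte_at(fp, 3, SEEK_SET) returns 'd' and leaves the file pointer at offset 4. A following byte_at(fp, 2, SEEK_CUR) therefore returns 'g', and byte_at(fp, -1, SEEK_END) returns the last byte, 'j'.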
8) ftell() - This function tells us the
location in the file from which the next read will take data or at which the next write will
put data. Note - The location is relative to the beginning of the file.
If we want to know where we are in the file
at any particular time we can use this function to get that value.
Declaration - long int ftell(FILE *fp);
Return value -
On success - current offset relative to the
beginning of the file.
On failure - -1L
It can be used as follows:
long position;
FILE *fp;
fp = fopen("xxxx","r");
position = ftell(fp);
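A common use of ftell() together with fseek() is finding the size of a file. The sketch below uses the hypothetical helper name file_size(). It relies on seeking to SEEK_END in a binary stream, which works on common platforms although the C standard does not require every implementation to support it meaningfully.

```c
#include <stdio.h>

/* Hypothetical helper: returns the size of the file at path in bytes,
   or -1L if the file cannot be opened or its position cannot be read. */
long file_size(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1L;

    long size = -1L;
    if (fseek(fp, 0L, SEEK_END) == 0)   /* jump to the end of the file */
        size = ftell(fp);               /* offset from the beginning */

    fclose(fp);
    return size;
}
```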
9) rewind() - This function repositions the file
pointer to the start of the file. It also clears the error indicator of the stream.
Declaration - void rewind(FILE *fp);
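As a minimal sketch of rewind(), the hypothetical helper count_twice() below reads a stream to the end, rewinds it, and reads it a second time. Both passes see the same number of characters because the second one starts again from the beginning.

```c
#include <stdio.h>

/* Counts the characters from the current position up to end of file. */
long count_to_end(FILE *fp)
{
    long n = 0;
    while (fgetc(fp) != EOF)
        n++;
    return n;
}

/* Hypothetical helper: reads the whole stream, rewinds it, and reads it
   again, storing the character count of each pass. */
void count_twice(FILE *fp, long *first_pass, long *second_pass)
{
    *first_pass = count_to_end(fp);
    rewind(fp);                     /* back to offset 0 */
    *second_pass = count_to_end(fp);
}
```

Without the rewind() call the second pass would start at end of file and count 0 characters.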