How To Read File Into Char Array in C
This article provides a comprehensive guide on how to read files into character arrays in C, with easy-to-follow steps.
When you work in the C programming language, you may run into tasks that require reading a file into a character array, such as counting the frequency of each character or changing the case of the first word of every sentence. The solution is short, but it is not obvious if you have not done much file reading or writing. This article walks through, step by step, how to read files into character arrays in C.
Open a File in C
The easiest and most popular way to open a file in C is to use the fopen function in the format below:
file = fopen(file_name, "open_mode"); // open the file named by 'file_name' and store the FILE pointer in 'file'
The mode parameter of the fopen function specifies how the file is opened. It can be one of the following:
- "r": Open file for reading.
- "w": Truncate the file to zero length or create a file for writing.
- "a": Append to file or create a file for writing if it does not exist.
- "r+": Open file for reading and writing.
- "w+": Truncate the file to zero length or create a file for reading and writing.
- "a+": Append to file or create a file for reading and writing.
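To see how the modes differ in practice, here is a minimal sketch. The helper name write_then_append and the file path passed to it are made up for illustration; the function writes a line with "w", adds a second line with "a", and reads both back with "r".

```c
#include <stdio.h>

/* Hypothetical helper: shows "w", "a", and "r" on the same file.
   Returns 0 on success and -1 if any step fails.
   'out' receives the file contents as a string (cap must be at least 1). */
int write_then_append(const char *path, char *out, size_t cap)
{
    FILE *f = fopen(path, "w");          /* create or truncate */
    if (f == NULL) return -1;
    fputs("first\n", f);
    fclose(f);

    f = fopen(path, "a");                /* keep contents, write at the end */
    if (f == NULL) return -1;
    fputs("second\n", f);
    fclose(f);

    f = fopen(path, "r");                /* read only */
    if (f == NULL) return -1;
    size_t n = fread(out, 1, cap - 1, f);
    out[n] = '\0';
    fclose(f);
    return 0;
}
```

Because "a" keeps the existing contents, the buffer ends up holding both lines. Opening with "w" a second time instead would have discarded the first line.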
A file can fail to open, for example because it does not exist or you lack permission to read it. Always check the return value of the fopen function to make sure the file was opened successfully before you try to read from or write to it, like this:
// If 'fopen' returns NULL, print an error message and exit the program
if (file == NULL) {
    printf("Error: Failed to open file '%s'.\n", file_name);
    return 1;
}
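If you also want to know why the file failed to open, one possible approach is sketched below. The helper name open_for_reading is an assumption made for this example. It relies on fopen setting errno on failure, which POSIX systems and common C libraries do, although the C standard itself does not require it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: opens 'path' for reading and, on failure,
   reports the reason to stderr. Returns NULL if the file cannot be opened. */
FILE *open_for_reading(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        /* strerror turns the errno value into a readable message */
        fprintf(stderr, "Error: Failed to open file '%s': %s\n",
                path, strerror(errno));
    }
    return f;
}
```

Writing the message to stderr instead of stdout keeps error text out of the program's normal output, which helps when the output is redirected to a file.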
Read File Contents Character by Character
Before reading the file, you need a character array to store its contents. Let's declare one.
char buffer[1000]; // Declare a char array named 'buffer' that holds up to 1000 bytes
Now it's time to read the file with fgetc. Each call reads one character and advances the file position, so calling it repeatedly reads each subsequent character until the end of the file. A while loop makes this easy. Note that fgetc returns an int, not a char, so that the special value EOF can be distinguished from every valid character.
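int i = 0, c; // c holds the value returned by fgetc, i is the index into buffer
// Read until the end of the file or until the buffer is full (one byte is kept for '\0')
while (i < (int)sizeof(buffer) - 1 && (c = fgetc(file)) != EOF) {
    buffer[i] = c;
    i++;
}
buffer[i] = '\0'; // terminate the array so it can be used as a string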
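The character-by-character loop can also be packaged as a reusable function. The following is a sketch rather than a definitive implementation: the name read_into_fixed and its parameters are assumptions for this example. It stops when the array is full so that it never writes past the end.

```c
#include <stdio.h>

/* Hypothetical helper: reads at most cap - 1 characters from 'file' into
   'buffer' and terminates it with '\0'. Returns the number of characters
   stored. 'cap' must be at least 1. */
size_t read_into_fixed(FILE *file, char *buffer, size_t cap)
{
    size_t i = 0;
    int c; /* int, not char, so that EOF can be told apart from data */
    while (i + 1 < cap && (c = fgetc(file)) != EOF) {
        buffer[i++] = (char)c;
    }
    buffer[i] = '\0';
    return i;
}
```

A call such as read_into_fixed(file, buffer, sizeof buffer) fills a 1000-byte array with up to 999 characters. Anything beyond that stays unread in the file, which is the limitation the next section addresses.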
Resizable Buffer
The buffer array defined above holds at most 1000 characters, but many files are much larger than that. We can solve this problem by making the buffer resizable, using dynamic memory allocation with the malloc, calloc, and realloc functions provided by the C standard library.
char *buffer = NULL; // initialize buffer to NULL
size_t buffer_size = 0; // current capacity in bytes
/*Open the file here*/
// Read file character by character
int c;
size_t i = 0;
while ((c = fgetc(file)) != EOF) {
    // If buffer is full, resize it
    if (i >= buffer_size) {
        buffer_size += 1000; // increase buffer size by 1000 bytes
        char *tmp = realloc(buffer, buffer_size); // resize buffer
        if (tmp == NULL) { // on failure the old block is still valid, so free it
            printf("Error: Memory allocation failed.\n");
            free(buffer);
            return 1;
        }
        buffer = tmp;
    }
    buffer[i] = c;
    i++;
}
The snippet above uses the realloc function, which is useful because the file size is usually not known in advance: realloc grows an existing block and, when called with a NULL pointer, behaves like malloc. The malloc and calloc functions allocate a new block of memory of the specified size. In this example, you could use the following:
buffer = (char*)malloc(1000); // like char buffer[1000], but allocated on the heap, so it must be released with free()
You probably won't need malloc and calloc in this example. We will meet them again later.
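As a variation on the resizable buffer, the sketch below doubles the capacity each time the buffer fills up instead of adding a fixed 1000 bytes, which keeps the number of realloc calls small for large files. The function name read_all_growing and the starting capacity of 16 are assumptions chosen for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads everything from 'file' into a heap buffer that
   grows as needed. Returns a null-terminated buffer the caller must free(),
   or NULL if memory runs out. '*out_len' receives the number of bytes read. */
char *read_all_growing(FILE *file, size_t *out_len)
{
    size_t cap = 16;              /* small start so growth is easy to observe */
    size_t len = 0;
    char *buffer = malloc(cap);
    if (buffer == NULL) return NULL;

    int c;
    while ((c = fgetc(file)) != EOF) {
        if (len + 1 >= cap) {     /* keep one byte free for '\0' */
            size_t new_cap = cap * 2;
            char *tmp = realloc(buffer, new_cap);
            if (tmp == NULL) {    /* the old block is still valid: release it */
                free(buffer);
                return NULL;
            }
            buffer = tmp;
            cap = new_cap;
        }
        buffer[len++] = (char)c;
    }
    buffer[len] = '\0';
    *out_len = len;
    return buffer;
}
```

Storing the result of realloc in a temporary pointer matters: if the call fails and its NULL result overwrites the only pointer to the old block, that block can no longer be freed.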
File Contains Non-ASCII Characters
In C, a string is represented as a sequence of bytes, and the interpretation of those bytes depends on the character encoding. If the file contains non-ASCII characters, you need to use a character encoding that supports those characters, such as UTF-8 or UTF-16.
For this problem, you can use functions that handle wide characters, such as fgetwc and fgetws. These functions read one wide character (wchar_t) or one wide character string (wchar_t*) at a time, respectively.
Here are some modifications to the code to make it work when the file contains non-ASCII characters:
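wchar_t buffer[100];
// Open file for reading (the ",ccs=UTF-8" suffix is an extension of Microsoft's C runtime and glibc, not standard C)
file = fopen(filename, "r,ccs=UTF-8");
// Read file contents
wint_t c; // fgetwc returns wint_t so that WEOF can be represented
int i = 0;
while (i < 99 && (c = fgetwc(file)) != WEOF) { // leave room for the terminator
    buffer[i] = (wchar_t)c;
    i++;
}
buffer[i] = L'\0';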
Also, make sure that the program's locale and output stream use the correct encoding so that the characters are converted and displayed properly. On Unix-like operating systems such as macOS and Linux, you can select a UTF-8 locale with the setlocale function:
#include <locale.h>
int main()
{
    setlocale(LC_ALL, "en_US.UTF-8"); // locale names vary by system; "" selects the user's default locale
    // your code here
    return 0;
}
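The wide-character loop can be wrapped in a function as well. This is a sketch under stated assumptions: the name read_wide is invented for the example, the caller is expected to have selected a suitable locale with setlocale before reading, and the file must not have been used for byte-oriented I/O beforehand.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: reads at most cap - 1 wide characters from 'file'
   into 'buffer' and terminates it with L'\0'. Returns the number stored.
   'cap' must be at least 1. */
size_t read_wide(FILE *file, wchar_t *buffer, size_t cap)
{
    size_t i = 0;
    wint_t c; /* wint_t, not wchar_t, so that WEOF can be represented */
    while (i + 1 < cap && (c = fgetwc(file)) != WEOF) {
        buffer[i++] = (wchar_t)c;
    }
    buffer[i] = L'\0';
    return i;
}
```

After a successful call, the array can be printed with printf("%ls\n", buffer). The function returns a count of characters, not bytes, so an accented letter that takes two bytes in UTF-8 counts once.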
On Windows, you can call the _setmode function with the _O_U8TEXT flag to set the output encoding to UTF-8:
#include <fcntl.h> // _O_U8TEXT
#include <io.h> // _setmode()
#include <stdio.h> // _fileno(), stdout
int main()
{
    _setmode(_fileno(stdout), _O_U8TEXT);
    // your code here
    return 0;
}
Here's an example of a file containing the Vietnamese phrase "Xin chào!" (Hello), which includes an accented, non-ASCII character, saved in UTF-8 encoding:
Xin chào!
And here is the output of our program when I ran it on an online C compiler:
Xin chào!
...Program finished with exit code 0
Press ENTER to exit console.
Read File Contents as a Whole
If you are new to C, you can skip this section, but I still recommend reading it as an advanced exercise. There is another way to read a file into a char array: instead of reading character by character, determine the file size first and then read the whole file in one call. This solution is a little more complicated but also more efficient.
First, define the usual variables: a file pointer for the open file and a buffer to hold the character array. You also need a variable for the file size:
FILE *fp;
long file_size;
char *buffer;
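Then you can open the file for reading. Binary mode ("rb") is used here so that the size reported by ftell matches the number of bytes fread delivers; in text mode on Windows, line-ending translation can make the two differ:
fp = fopen("example.txt", "rb");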
To find the size of the file, you can use the ftell function. It returns the current position in the file, measured in bytes from the beginning:
current_byte = ftell(fp);
But wait, reading always starts at the beginning of the file, so ftell would return 0 at this point. No problem: the fseek function moves the current position to a different place in the file, here the end:
fseek(fp, 0, SEEK_END);
You can now get the file size. After that, move the position back to the beginning so that reading starts from the first byte:
file_size = ftell(fp);
rewind(fp); // move the position back to the beginning of the file
// Allocate memory for the char array
buffer = (char*) malloc(file_size + 1);
The use of the malloc function here is straightforward: it allocates an uninitialized char array of (file_size + 1) bytes, since the size of type char is 1 byte.
If you want to use the calloc function, here is how:
buffer = (char*) calloc(file_size + 1, sizeof(char));
The main difference between malloc and calloc is that malloc only allocates memory without initializing its contents, while calloc allocates the memory and initializes it to zero. The advantage of calloc is that the allocated memory is already zeroed out, which can be helpful if you plan to use the char array as a string later.
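A small sketch of why the zero-filling helps: a block returned by calloc is already a valid empty string, with no explicit terminator needed. The helper name alloc_string_buffer is an assumption made for this example.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates room for a string of up to 'len'
   characters plus the terminator. Because calloc zero-fills the block,
   the result is already a valid empty string. Returns NULL on failure. */
char *alloc_string_buffer(size_t len)
{
    return calloc(len + 1, sizeof(char));
}
```

With malloc, the same block would contain indeterminate bytes, so passing it to a string function before writing a terminator would be an error.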
// Read the file into the char array
fread(buffer, 1, file_size, fp);
The fread function takes the destination array, the size of each element to read, the number of elements to read, and the file pointer. With an element size of 1, its return value is the number of bytes actually read.
// Add a null terminator at the end of the char array
buffer[file_size] = '\0';
You might be wondering why the buffer needs an extra byte: why (file_size + 1) and not just (file_size)? The extra byte holds the null terminator, which is added at the end of the char array to mark the end of the string. If your only goal is to read a file into an array, this step is unnecessary. But if you later want to print the array as a string, it is required, because a string in C is defined to end with the null terminator \0.
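The steps of this section can be combined into one function with error checks at each stage. This is a sketch under assumptions, not a definitive implementation: the name read_whole_file and its parameters are invented for the example, and it relies on fseek to SEEK_END working for the file, which holds for regular files on common systems but not for pipes or terminals.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole file at 'path' into a freshly
   allocated, null-terminated buffer. The caller must free() the result.
   Returns NULL on any failure. '*out_size' receives the bytes read. */
char *read_whole_file(const char *path, size_t *out_size)
{
    FILE *fp = fopen(path, "rb");   /* binary mode: sizes match byte counts */
    if (fp == NULL) return NULL;

    char *buffer = NULL;
    if (fseek(fp, 0, SEEK_END) == 0) {
        long file_size = ftell(fp);
        if (file_size >= 0) {
            rewind(fp);
            buffer = malloc((size_t)file_size + 1);
            if (buffer != NULL) {
                size_t n = fread(buffer, 1, (size_t)file_size, fp);
                buffer[n] = '\0';   /* terminate after the bytes actually read */
                *out_size = n;
            }
        }
    }
    fclose(fp);
    return buffer;
}
```

Placing the terminator at index n, the count returned by fread, keeps the string valid even if fewer bytes were read than ftell reported.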
Clean Up Your Code
You have opened and used the file, so remember to close it afterward. Use the fclose function to close the stream held in the "file" pointer variable:
fclose(file);
Speaking of releasing resources, remember the "buffer" array that you used to store characters? If you allocated it dynamically (with malloc, calloc, or realloc), free it now to avoid a memory leak.
free(buffer);
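One detail is easy to miss when cleaning up on an error path: free accepts a NULL pointer and does nothing, but passing NULL to fclose is undefined behavior. A minimal sketch of a cleanup helper that accounts for this follows; the name release_resources is an assumption for the example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: releases both resources used in this article.
   Either argument may be NULL, for example when an earlier step failed. */
void release_resources(FILE *file, char *buffer)
{
    if (file != NULL) {
        fclose(file);   /* fclose must not be called with NULL */
    }
    free(buffer);       /* free(NULL) is a harmless no-op */
}
```

Calling one helper from every exit path makes it harder to forget one of the two steps.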
Here is an overview of what your solution should look like:
#include <stdio.h>
#include <stdlib.h>
int main() {
    FILE *file;
    char filename[] = "example.txt";
    char *buffer = NULL; // initialize buffer to NULL
    size_t buffer_size = 0;
    size_t i = 0;
    //Open file for reading
    file = fopen(filename, "r");
    //Check if file opened successfully
    if (file == NULL) {
        printf("Error: Failed to open file '%s'.\n", filename);
        return 1;
    }
    // Read file character by character
    int c;
    while ((c = fgetc(file)) != EOF) {
        // If buffer is full, resize it (one byte is always kept free for '\0')
        if (i + 1 >= buffer_size) {
            buffer_size += 1000; // increase buffer size by 1000 bytes
            char *tmp = realloc(buffer, buffer_size); // resize buffer
            if (tmp == NULL) {
                printf("Error: Memory allocation failed.\n");
                free(buffer);
                fclose(file);
                return 1;
            }
            buffer = tmp;
        }
        buffer[i] = c;
        i++;
    }
    // Close file
    fclose(file);
    // Print the character array (buffer is still NULL if the file was empty)
    if (buffer != NULL) {
        buffer[i] = '\0'; // add the null terminator so printf knows where to stop
        printf("%s", buffer);
    }
    // Free the dynamically allocated buffer
    free(buffer);
    return 0;
}
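Once the file is in a char array, tasks such as the character-frequency analysis mentioned at the start of this article become simple loops over the buffer. Here is a minimal sketch; the function name count_frequencies and the 256-entry table are assumptions for this example, and it counts bytes, not multibyte characters.

```c
#include <stddef.h>

/* Hypothetical helper: counts how often each byte value occurs in the
   first 'len' bytes of 'buffer'. 'counts' must have room for 256 entries. */
void count_frequencies(const char *buffer, size_t len, size_t counts[256])
{
    for (size_t b = 0; b < 256; b++) {
        counts[b] = 0;
    }
    for (size_t i = 0; i < len; i++) {
        /* cast to unsigned char so bytes above 127 give a valid index */
        counts[(unsigned char)buffer[i]]++;
    }
}
```

After the call, counts['a'] holds the number of times the letter a appears in the buffer.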
Conclusion
In this guide, we covered step-by-step solutions for reading a file into a character array in C: opening the file, creating a buffer, storing the entire contents of the file in the buffer, and finally closing the file and freeing all allocated memory.
We also discussed what to do when the file is too large for the initial buffer or when the file contains non-ASCII characters. Working through these cases should make it easier to handle similar situations in your own programs.
Published at DZone with permission of Ankur Ranpariya. See the original article here.