C++ fstream - use '\n' instead of std::endl

tl;dr version: If you have to write a lot of lines to a file and time is of the essence in your C++ code, don’t use std::endl as the line delimiter. Use a plain “\n” instead. Why?

Story time

When coding, you inevitably get to a point where you need to write some stuff to a file. C++ offers the fstream library, with ifstream for input and ofstream for output. You also have access to the C-style stdio.h and its FILE type. Finally, C# has many ways of accessing files, but StreamReader and StreamWriter are the commonly recommended ones.

While working on my dissertation project I ended up directly comparing the performance of C++ and C# when sorting 200,000 numbers. That required an input file with 200,000 integers, one on each line (technically 200,001 lines, as the first line states how many numbers follow).

The times to run were scary:

C#: 85ms
C++ (fstream): 1100ms

The algorithm employed by both languages for the default sort function (Array.Sort and std::sort) is Introsort, thus performance should be equivalent. Looking into this more, I decided to replace fstream with stdio.h. New C++ result?

C++ (FILE): 110ms

Now isn’t that interesting. Here’s the relevant code for comparison:

C++ fstream:

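// Read the count, then that many integers, one per line.
int n = 1;
ifstream f("input.txt");
f>>n;

int* v = new int[n];

for(int i=0;i<n;i++)
{
    f>>v[i];
}
f.close();

std::sort(v, v+n);

// Write the sorted numbers; std::endl flushes the stream after every line.
ofstream f2("output.txt");
for(int i=0;i<n;i++)
{
    f2<<v[i]<<endl;
}
f2.close();
delete[] v;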

C++ FILE:

// Read the count, then that many integers, one per line.
int n = 1;
FILE * pFile;

pFile = fopen("input.txt", "rb");
if (pFile == NULL) { fputs("File error", stderr); exit(1); }

fscanf(pFile, "%d", &n);

int* v = new int[n];
for(int i=0;i<n;i++)
{
    fscanf(pFile, "%d", &v[i]);
}
fclose(pFile);

std::sort(v, v+n);

FILE * oFile = fopen("output.txt", "w+");

for(int i=0;i<n;i++)
{
    fprintf(oFile, "%d\n", v[i]);
}
fclose(oFile); // writes out whatever is still buffered
delete[] v;

The numbers above come from running these tests on a Windows 8.1 machine. The C++ code was compiled with both gcc and Microsoft’s C++ compiler, and no differences in runtime were found. Each time, -O2 (or /O2) was used to optimize the code. Many other parameters were tried empirically, but none made a noticeable difference.

Now, I was curious where the problem was: reading the input or writing the output. Removing all output-related code and leaving just input and sorting resulted in these times:

C++ (FILE): 48ms
C++ (fstream): 51ms

Well, doesn’t that immediately show us where the problem is? Yes, the main issue is in writing to the file. But what if we put the output-related code back and replace f2<<v[i]<<endl with f2<<v[i]<<"\n"?

C++ (FILE): 110ms
C++ (fstream): 111ms

std::endl seems to be… not particularly fast.
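The comparison can be tried in isolation with a small timing helper. This is only a sketch and not the code that produced the numbers above: the function name time_write_ms, its parameters and the choice of std::chrono::steady_clock as the timer are all my assumptions.

```cpp
#include <chrono>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes one integer per line to `path` and returns the
// elapsed wall-clock time in milliseconds. When `use_endl` is true every line
// ends with std::endl (newline + flush), otherwise with a plain '\n'.
double time_write_ms(const std::string& path,
                     const std::vector<int>& values,
                     bool use_endl)
{
    const auto start = std::chrono::steady_clock::now();
    {
        std::ofstream out(path);
        for (int value : values)
        {
            if (use_endl)
                out << value << std::endl;
            else
                out << value << '\n';
        }
    } // the stream is flushed and closed here, still inside the timed region
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling it once with use_endl set to true and once with false on the same vector of 200,000 integers should produce identical files; only the reported times should differ, and by how much will depend on the platform and standard library.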

Of course, I didn’t dismiss the possibility that my machine was the root cause. I decided to compile and test the code on a dedicated Linux Mint server. Results:

C++ (FILE): 60ms
C++ (fstream with std::endl): 270ms
C++ (fstream without std::endl): 61ms

Purely for consistency I ran the C# executables through WINE.

C# (WINE): 350ms

Of course, I also made a C# application that does nothing, to see how much of that time is actually just WINE starting up and initialising. On average, the C# “Does Nothing” code ran under WINE in 250ms, so the code itself needed roughly 100ms to run. The Windows .NET libraries are different from the Mono .NET libraries, so no conclusions can be drawn from these C# numbers. Also, this wasn’t a very scientific approach.

You’re now yelling at me: std::endl is equivalent to “\n” AND a buffer flush!!!! Yes, it is. That flush seems to be a very costly operation, and the numbers above show just how costly it is on two different platforms.
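To make the “newline plus flush” part visible, here is a minimal sketch that counts how often the stream asks its buffer to flush. CountingBuf and count_flushes are hypothetical names made up for this illustration. It uses an in-memory buffer, so it shows how many flushes happen, not how much each one costs on a real file.

```cpp
#include <ostream>
#include <sstream>

// Hypothetical helper: a string buffer that counts flush requests.
// std::ostream::flush() (and therefore std::endl) ends up calling sync().
class CountingBuf : public std::stringbuf
{
public:
    int flushes = 0;

protected:
    int sync() override
    {
        ++flushes;
        return 0; // report success
    }
};

// Writes `lines` integers, one per line, and returns the number of flushes.
int count_flushes(int lines, bool use_endl)
{
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i)
    {
        if (use_endl)
            out << i << std::endl; // newline, then flush
        else
            out << i << '\n';      // newline only
    }
    return buf.flushes;
}
```

With std::endl the counter ends up equal to the number of lines written; with '\n' it stays at zero. On a file stream each of those flushes typically means handing the buffer to the operating system, which is where the time goes.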

Conclusion?

Avoid std::endl unless you need to flush. Heh. Also, C# performance is amazing. Also, there are no obvious differences in performance between stdio.h and fstream for basic I/O operations.
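In practice that could look like the sketch below: write every line with '\n' and flush once, at the point where the data actually has to be out. The helper names write_lines and lines_as_string are hypothetical, and the in-memory wrapper is only there so the helper is easy to try without touching the disk.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: one value per line, a single flush at the end.
void write_lines(std::ostream& out, const std::vector<int>& values)
{
    for (int value : values)
    {
        out << value << '\n'; // no flush per line
    }
    out.flush(); // flush once, when the output is complete
}

// Convenience wrapper for trying the helper on an in-memory stream.
std::string lines_as_string(const std::vector<int>& values)
{
    std::ostringstream out;
    write_lines(out, values);
    return out.str();
}
```

Because write_lines takes a std::ostream&, the same function works with an std::ofstream. Closing or destroying a file stream flushes it anyway, so the explicit flush only matters if something needs to read the data before that happens.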