Modern C++ for C Programmers: part 1

Welcome to part 1 of Modern C++ for C Programmers, please see the introduction for the goals and context of this series.

In this part we start with C++ features that you can use to spice up your code ’line by line’, without immediately having to use all 1400 pages of ‘The C++ Programming Language’.

Various code samples discussed here can be found on GitHub.

Relation between C and C++

C and C++ are actually very close relatives, to the point that many compilers have unified infrastructure for both languages. In other words, your C code is already going through codepaths shared with C++ (and likely written in C++). In fact, when trivial C programs are compiled as C++ with g++, identically sized binaries come out. All example programs in our beloved The C Programming Language compile as valid C++. Interestingly, the introduction of the 1988 edition of K&R notes Bjarne Stroustrup’s C++ “translator” was used extensively for local testing.

The relation goes further - the entire C library is included in C++ ‘by reference’, and C++ knows how to call all C code. And conversely, it is entirely possible to call C++ functions from C.

C++ was explicitly designed to not present unavoidable overhead compared against C. To quote from the ISO C++ website:

The zero-overhead principle is a guiding principle for the design of C++. It states that: What you don’t use, you don’t pay for (in time or space) and further: What you do use, you couldn’t hand code any better.

In other words, no feature should be added to C++ which would make any existing code (not using the new feature) larger or slower, nor should any feature be added for which the compiler would generate code that is not as good as a programmer would create without using the feature.

These are big claims, and they do require some proof. For this to be true in 2018, we do have to be careful. Lots of code uses exceptions, and these do come with some overhead. However, it is also possible to declare that all or part of our code is exception free, which leads the compiler to remove that infrastructure.

But, here is actual proof. Sorting 100 million integers using the C qsort() function, using std::sort() in C++ and using the C++-2017 parallel sort, we get the following timings:

C qsort(): 13.4 seconds  (13.4 CPU)
C++ std::sort(): 8.0 seconds (8.0 CPU)
C++ parallel sort: 1.7 seconds (11.8 seconds of CPU time)

What is this magic? The C++ version is 40% faster than C? How is this possible?

Here is the code:

int cmp(const void* a, const void* b)
{
  if(*(int*)a < *(int*)b)
    return -1;
  else if(*(int*)a > *(int*)b)
    return 1;
  else 
    return 0;  
}

int main(int argc, char**argv)
{
  auto lim = atoi(argv[1]);

  std::vector<int> vec;
  vec.reserve(lim);

  while(lim--)
    vec.push_back(random());

  if(*argv[2]=='q')
    qsort(&vec[0], vec.size(), sizeof(int), cmp);
  else if(*argv[2]=='p')
    std::sort(std::execution::par, vec.begin(), vec.end()); 
  else if(*argv[2]=='s')
    std::sort(vec.begin(), vec.end());
}

It is worth studying this a bit. The cmp() function is there for qsort(), and defines the sort order.

Main is main as in C, but then we see the first oddity: auto. We’ll cover this later, but auto almost always does what you think it does: calculate the required type and use it.

The next two lines define a vector containing integers, and reserve enough space in there for how many entries we want. This is an optional optimization. The while loop then fills the vector with ‘random’ numbers.

Next up.. something magic happens. We call the C qsort() function, to operate on the C++ vector containing our numbers. How is this possible? It turns out std::vector is explicitly designed to be interoperable with raw pointer operations. It is meant to be able to be passed to C library or system calls. It stores its data in a consecutive slab of memory that can be changed at will.

The next 4 lines use the C++ sorting functions. On some versions of G++, you may need this (non-standard) syntax to get the same result: __gnu_parallel::sort(vec.begin(), vec.end()).

So how come C++ std::sort is faster than qsort?

qsort() is a library function that accepts a comparison callback. The compiler (and its optimizer) can not look at the qsort() procedure as a whole therefore. In addition, there is function call overhead.

The C++ std::sort version meanwhile is actually a ’template’ which is able to inline the comparison predicate, which for ints defaults to the < operator.

To make sure we are being fair, since qsort() is using a custom comparator, and our std::sort is not, we can use:

std::sort(vec.begin(), vec.end(), 
          [](const auto& a, const auto& b) { return a < b; } 
     );

When executed, this still takes the same amount of time. To sort in reverse order, we could change a < b to b < a. But what is this magic syntax? This is a C++ lambda expression, a way to define functions inline. This can be used for many things, and defining a sort operation this way is highly idiomatic.

Finally, C++ 2017 comes with parallel versions of many core algorithms, and for our case, it appears the parallel sort is indeed delivering a 4.7-fold speedup on my 8 hyper-core machine.

Strings

It may be hard to believe, but for much of the time of C++’s original development, it did not have a string class. Writing such a class was somewhat of a rite of passage, and everyone made their own. The reason behind this was partially the prolonged attempt to make a class that was everything for everyone.

The std::string C++ ended up with in 1998 interoperates well with C code:

std::string dir("/etc/"), fname;
fname = dir + "hosts";
FILE* fp = fopen(fname.c_str(), "r");

std::string offers most functionality you’d expect, like concatenation (as shown above). Some further code:

auto pos = fname.find('/');
if(pos != string::npos) 
	cout << "First / is at " << pos << "\n";

pos = fname.find("host");
if(pos != string::npos) 
	cout << "Found host at " << pos << "\n";

std::string newname = fname;
newname += ".backup";

unlink(newname.c_str());

std::string provides unsafe and unchecked access to its characters with the [] operator, so newname[0] == '/', but wise people use newname.at(0) which performs bounds checking.

The post-2011 design of std::string is pretty interesting. The storage of basic string implementation could look like this:

struct mystring
{
	char* data;
	size_t len;
	size_t capacity; // how much we've allocated already
};

On a modern system, this is 24 bytes of data. The capacity field is used to store how much memory has been allocated so mystring knows when it needs to reallocate. Not reallocating every time a character is added to a string is a pretty big win.

Frequently however, the things we store in strings are a lot shorter than 24 bytes. For this reason, modern C++ std::string implementations implement Small String Optimization, which allows them to store 16 or even 21 bytes of characters within their own storage, without using malloc(), which is a speedup.

Another benefit of preventing needless calls to malloc() is that an array of strings is now stored in contiguous memory, which is great for memory cache hitrates, which often delivers whole factors of speedup.

After years of design, std::string may not be everything to everyone, but consistent with the ‘zero overhead’ principle, it beats what you would have quickly written by hand.

Summarising

In part 1 of this series, I hope to have shown you some interesting bits of C++ you could start using right away - gaining you a lot of new power without immediately filling your code with complicated stuff.

Part 2 can be found here.

If you have any favorite things you’d like to see discussed or questions, please hit me up on @bert_hu_bert or bert@hubertnet.nl