Resource Management

Cpp

#include <cstdint>
#include <mutex>

struct Stats {
    uint64_t someStat;
    uint64_t otherStat;

    std::mutex lock;
};

void updateSharedObject(Stats *obj) {
    std::lock_guard m(obj->lock);

    obj->someStat++;
    obj->otherStat = obj->someStat / 2;
}

Go

package main

import (
	"sync"
)

type Stats struct {
	someStat  uint64
	otherStat uint64

	sync.Mutex
}

func updateSharedObject(obj *Stats) {
	obj.Lock()
	defer obj.Unlock()

	obj.someStat++
	obj.otherStat = obj.someStat / 2
}

What This Code Does

Updates properties of a shared object using a lock, specifically a mutex, to ensure that updates are not applied concurrently.

What's the Same

Both solutions define a Stats struct that contains two integer fields and a mutex. Both solutions also contain an updateSharedObject function, which obtains an exclusive lock via the mutex and updates the struct's integer fields.

What's Different

While the code is fundamentally the same, the main difference is how resources are managed. The Go version acquires the lock and then uses defer obj.Unlock() to clean up, which in this case means releasing the lock. The C++ solution, by contrast, appears to acquire the lock without ever releasing it:

 std::lock_guard m(obj->lock);

In fact, the lock is released automatically at the end of the function. This is done via RAII, which stands for "Resource Acquisition Is Initialization." The term can be confusing, so let's first take a detour through constructors and destructors, and then we can define RAII more precisely.

Constructors & Destructors

Most OOP languages have the concept of a constructor. It is essentially a creation function that sets up the internal state of a new instance of a class. This also exists in C++. However, C++ has a separate concept called a destructor. The destructor mirrors the constructor: it tears down the object before its memory is reclaimed. Take the following class for example:

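#include <cstddef>

class int_array {
private:
    int *arr;

public:
    // constructor: allocate the backing array on the heap
    int_array(std::size_t size) {
        this->arr = new int[size];
    }
    // destructor: free the backing array
    ~int_array() {
        delete [] this->arr;
    }
};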

int_array is a class that wraps a C-style array. Upon construction, an array of dynamic size is allocated on the heap. If we were to simply free instances of int_array without freeing the memory allocated on construction, we would have a memory leak. Instead we define a destructor, ~int_array, which frees this memory. We can construct and delete an instance of int_array like so:

auto arr = new int_array(16);
// use array ...

delete arr;

new allocates memory on the heap for our int_array instance and then calls the constructor. Similarly, delete calls the destructor and then frees the memory for our int_array instance.

What if our int_array is allocated on the stack, like so:

void someFunction() {
    auto arr = int_array(16);
    // use array ...
}

In this case, because our int_array instance was allocated on the stack (note that new is not used here), it will automatically be freed when the function ends. C++ also guarantees that the destructor runs before the object's memory is reclaimed.
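To make the ordering concrete, here is a minimal sketch. Tracer and traceScope are hypothetical names introduced only for this illustration; the type simply records when its constructor and destructor run.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: logs its own construction and destruction.
struct Tracer {
    std::vector<std::string> &log;
    std::string name;

    Tracer(std::vector<std::string> &sink, std::string label)
        : log(sink), name(std::move(label)) {
        log.push_back("construct " + name);
    }
    ~Tracer() { log.push_back("destruct " + name); }
};

// Returns the events recorded while two stack objects live and die.
std::vector<std::string> traceScope() {
    std::vector<std::string> log;
    {
        Tracer first(log, "first");
        Tracer second(log, "second");
        log.push_back("end of scope");
    } // destructors run here, in reverse order of construction
    return log;
}
```

Calling traceScope() should yield the two constructions, then "end of scope", then the destruction of second followed by first. No explicit cleanup call appears anywhere in the function.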

Understanding RAII

An alternative, and likely more understandable, name for RAII is Scope-Based Resource Management, where a "Resource" could be anything such as:

  • Memory
  • DB Connections
  • Sockets
  • ThreadPool Workers

Essentially anything that requires some setup and cleanup can implement the RAII pattern by following a couple of rules:

  1. Acquire the required resources on construction
  2. Release acquired resources on destruction

Given these rules, we can now see how our original C++ locking code works. The std::lock_guard constructor acquires the lock, and the guard, m, lives on the stack. At the end of the function, m is destructed (this is where the lock is released) and then freed.
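The two rules can be sketched with a hand-rolled guard. ScopedLock and incrementBelow are hypothetical names for this illustration; ScopedLock is a simplified stand-in for std::lock_guard, not how the standard library necessarily implements it.

```cpp
#include <mutex>

// Hypothetical guard type, a simplified stand-in for std::lock_guard.
class ScopedLock {
    std::mutex &mtx;

public:
    // Rule 1: acquire the resource on construction.
    explicit ScopedLock(std::mutex &m) : mtx(m) { mtx.lock(); }
    // Rule 2: release the resource on destruction.
    ~ScopedLock() { mtx.unlock(); }

    // A guard owns its lock exclusively, so copying is disabled.
    ScopedLock(const ScopedLock &) = delete;
    ScopedLock &operator=(const ScopedLock &) = delete;
};

// Increments `counter` while it is below `limit`; returns whether it did.
bool incrementBelow(std::mutex &mtx, int &counter, int limit) {
    ScopedLock guard(mtx);
    if (counter >= limit) {
        return false; // early return: the destructor still releases the lock
    }
    counter++;
    return true;
}
```

Note that the early return needs no special handling. Whichever path leaves the function, guard is destructed and the mutex is unlocked, which is the same job defer obj.Unlock() does in the Go version.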

This pattern is extremely common in modern C++ code. Take for example some of the collections from the standard library, such as unordered_map or vector:

#include <vector>

void someFunction() {
    std::vector<int> vec;

    // dynamically grow heap-allocated, backing array
    for (int i = 0; i < 1000; i++) {
        vec.push_back(i);
    }

    // end function without explicit cleanup of vector
}

Digging Deeper

Due to the behavior of stack-allocated objects and RAII principles, it is often preferred to avoid the explicit use of new and delete altogether. Instead, objects should be allocated on the stack wherever possible. These stack-allocated objects can then manage heap allocations under the covers and ensure that allocation and cleanup follow the object's life cycle (construction, destruction).

When an object must be allocated on the heap, either directly or within another RAII object, a smart pointer should be used. Smart pointers represent heap-allocated resources on the stack and follow the same RAII principles. That is, when they are no longer in scope, they clean up the associated heap memory. The two main smart pointers from the standard library to know are unique_ptr and shared_ptr.

unique_ptr is a smart pointer that is only allowed to have a single owner. If you want to pass a unique_ptr to a function by value, it must be moved. Moving means giving ownership of the object away. After the move, the original unique_ptr is left empty (null) and must not be dereferenced; the object's lifetime is now tied to whoever received it. To make this more clear:

{ // start of scope
    unique_ptr<string> a;

    { // start of scope
        unique_ptr<string> b = make_unique<string>("Hello, World");

        // not allowed: this assignment tries to copy `b`, but a
        // unique_ptr cannot be copied (compile error if uncommented)
        // auto c = b;

        // `a` now owns the string
        a = move(b);

        // compiles, but `b` is now null, so dereferencing it is
        // undefined behavior (do not do this)
        // b->length();
    } // end of scope, `b` is cleaned up but the memory allocated for
      // "Hello, World" is untouched since `a` is now the owner

    // allowed, since `a` is the owner of the unique_ptr
    a->length();

} // end of scope, `a` is cleaned up and memory is reclaimed
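Passing a unique_ptr to a function follows the same rule. The sketch below uses hypothetical functions (consume, inspect, demoOwnership) to contrast taking ownership with merely borrowing the underlying object.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Takes ownership: the string is destroyed when `msg` goes out of scope.
std::size_t consume(std::unique_ptr<std::string> msg) {
    return msg->length();
}

// Borrows: the caller keeps ownership, so no move is needed.
std::size_t inspect(const std::string &msg) {
    return msg.length();
}

std::size_t demoOwnership() {
    auto greeting = std::make_unique<std::string>("Hello, World");

    std::size_t borrowed = inspect(*greeting);           // greeting still valid
    std::size_t consumed = consume(std::move(greeting)); // ownership given away

    // greeting is now null and must not be dereferenced
    return greeting == nullptr ? borrowed + consumed : 0;
}
```

If a function only needs to read or modify the object, taking a reference as inspect does is usually the simpler choice. Taking the unique_ptr by value is a way of stating in the signature that the function keeps the object.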

shared_ptr is a smart pointer that allows multiple owners. Each time a shared_ptr is copied, a ref-counter is incremented. When a shared_ptr is destructed (cleaned up), the ref-counter is decremented. When the ref-counter hits 0, then the associated heap-allocated memory is cleaned up. To make this more clear:

{ // start of scope
    shared_ptr<string> a;

    { // start of scope
        // new shared pointer, ref-count is 1
        shared_ptr<string> b = make_shared<string>("Hello, World");

        // allowed, copy is made and ref-count is 2
        auto c = b;

        // allowed, copy is made and ref-count is 3
        a = c;

        // both allowed to operate on underlying data
        b->length();
        c->length();
    } // end of scope, `b` and `c` cleaned up, ref-count is 1

} // end of scope, `a` is cleaned up, ref-count is 0, memory is freed
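The counts in the comments above can be checked with shared_ptr's use_count() member. The following is a minimal single-threaded sketch; traceRefCounts is a hypothetical name for this illustration.

```cpp
#include <memory>
#include <string>
#include <vector>

// Returns the reference count observed after each step.
std::vector<long> traceRefCounts() {
    std::vector<long> counts;
    std::shared_ptr<std::string> a;

    {
        auto b = std::make_shared<std::string>("Hello, World");
        counts.push_back(b.use_count()); // one owner: b

        auto c = b;
        counts.push_back(b.use_count()); // two owners: b, c

        a = c;
        counts.push_back(b.use_count()); // three owners: a, b, c
    } // b and c are destructed here

    counts.push_back(a.use_count());     // one owner left: a
    return counts;
}
```

use_count() is mainly useful for demonstrations and debugging like this. In multi-threaded code the value can change as soon as it is read, so it is best not to base program logic on it.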

With these two constructs, you should rarely need an owning raw pointer (*obj) or the new and delete keywords. This is a major step toward writing safer, more modern C++.
