Resource Management
Cpp
#include <cstdint>
#include <mutex>

struct Stats {
    uint64_t someStat;
    uint64_t otherStat;
    std::mutex lock;
};

void updateSharedObject(Stats *obj) {
    // Omitting the template argument relies on C++17 class template argument deduction.
    std::lock_guard m(obj->lock);
    obj->someStat++;
    obj->otherStat = obj->someStat / 2;
}
Go
package main

import (
	"sync"
)

type Stats struct {
	someStat  uint64
	otherStat uint64
	sync.Mutex
}

func updateSharedObject(obj *Stats) {
	obj.Lock()
	defer obj.Unlock()
	obj.someStat++
	obj.otherStat = obj.someStat / 2
}
What This Code Does
Updates properties of a shared object using a lock, specifically a mutex, to ensure that updates are not applied concurrently.
What's the Same
Both solutions define a Stats struct that contains two integer fields and a mutex lock. Both solutions also contain an updateSharedObject function, which obtains an exclusive lock via the mutex and updates the struct's integer fields.
What's Different
While the code is fundamentally the same, the main difference is how resources
are managed. The Go version acquires the lock and then uses defer obj.Unlock()
to clean up, which in this case means releasing the lock. The C++ solution, by contrast,
appears to acquire the lock but never release it:
std::lock_guard m(obj->lock);
In fact, the lock is released at the end of the function. This is done via RAII, which stands for "Resource Acquisition Is Initialization." The name can be a bit confusing, so let's first take a detour to explain constructors and destructors and then we can better define RAII.
Constructors & Destructors
Most OOP languages have the concept of a constructor. It is essentially a creation function that sets up the internal state of a new instance of a class. This also exists in C++. However, C++ also has a separate concept called a destructor. The destructor mirrors the constructor: it tears down the object before its memory is reclaimed. Take the following class for example:
#include <cstddef>

class int_array {
private:
    int *arr;

public:
    int_array(std::size_t size) {
        // `this` is a pointer in C++, so members are reached with `->`
        this->arr = new int[size];
    }

    ~int_array() {
        delete[] this->arr;
    }
};
int_array is a class that wraps a C-style array. Upon construction, an array of dynamic size is allocated on the heap. If we were to simply free instances of int_array without freeing the memory allocated on construction, we would have a memory leak. Instead we define a destructor, ~int_array, which frees this memory.
We can construct and delete an instance of int_array like so:
auto arr = new int_array(16);
// use array ...
delete arr;
new both allocates memory on the heap for our int_array instance and calls the constructor. Similarly, delete calls the destructor and then frees the memory for our int_array instance.
What if our int_array is allocated on the stack, like so:
void someFunction() {
    auto arr = int_array(16);
    // use array ...
}
In this case, because our int_array instance was allocated on the stack (note no use of new here), it will automatically be freed when the function ends. C++ also ensures that the destructor of each such object is called before the object is freed.
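To see this behavior in action, here is a minimal sketch. The `Tracer` class and the two functions are hypothetical helpers invented for this illustration; they simply record when constructors and destructors run. The sketch also suggests that the destructor still runs when a scope is left through an exception.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: appends an event to a log when it is constructed
// and another one when it is destructed.
class Tracer {
public:
    Tracer(std::vector<std::string> &log, std::string name)
        : log_(log), name_(std::move(name)) {
        log_.push_back("construct " + name_);
    }
    ~Tracer() { log_.push_back("destruct " + name_); }

private:
    std::vector<std::string> &log_;
    std::string name_;
};

// Leaves the scope normally: destructors run in reverse order of construction.
std::vector<std::string> normalExit() {
    std::vector<std::string> log;
    {
        Tracer a(log, "a");
        Tracer b(log, "b");
        log.push_back("body");
    } // `b` is destructed first, then `a`
    assert(log.size() == 5); // two constructions, the body, two destructions
    return log;
}

// Leaves the scope through an exception: the destructor runs before the
// catch block is entered.
std::vector<std::string> exceptionExit() {
    std::vector<std::string> log;
    try {
        Tracer a(log, "a");
        throw std::runtime_error("boom");
    } catch (const std::runtime_error &) {
        log.push_back("caught");
    }
    return log;
}
```

Neither function contains any explicit cleanup call; the end of the enclosing scope is what triggers the destructors.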
Understanding RAII
An alternative, and likely more understandable, name for RAII is Scope-Based Resource Management, where a "Resource" could be anything such as:
- Memory
- DB Connections
- Sockets
- ThreadPool Workers
An RAII type follows two rules:
- Acquire the required resources on construction
- Release acquired resources on destruction
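As a rough sketch of the acquire-on-construction and release-on-destruction steps listed above, consider a resource that is not memory. `WorkerPool`, `WorkerLease`, and `freeWhileLeased` are hypothetical names made up for this example, and the "pool" is only a counter of free workers rather than a real thread pool.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical pool that only tracks how many workers are free.
struct WorkerPool {
    int available;
};

// RAII handle: takes one worker in the constructor and hands it back in the
// destructor.
class WorkerLease {
public:
    explicit WorkerLease(WorkerPool &pool) : pool_(pool) {
        if (pool_.available == 0) {
            throw std::runtime_error("no workers available");
        }
        --pool_.available; // acquire on construction
    }
    ~WorkerLease() { ++pool_.available; } // release on destruction

    // A lease stands for exactly one worker, so copying is disabled.
    WorkerLease(const WorkerLease &) = delete;
    WorkerLease &operator=(const WorkerLease &) = delete;

private:
    WorkerPool &pool_;
};

// Returns the number of free workers observed while two leases are held.
int freeWhileLeased(WorkerPool &pool) {
    int before = pool.available;
    WorkerLease first(pool);
    WorkerLease second(pool);
    assert(pool.available == before - 2);
    return pool.available;
} // both leases are destructed here, returning their workers to the pool
```

With a pool of three workers, `freeWhileLeased` would observe one free worker, and the pool is back to three once the function returns.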
Given these rules, we can now see how our original C++ locking code works. The acquisition of the lock is represented by the std::lock_guard type, and the guard, m, is allocated on the stack. At the end of the function, m is destructed (this is where the lock is released) and freed.
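To make the mechanics concrete, here is a simplified, hypothetical stand-in for std::lock_guard. The names `simple_guard`, `Counter`, `increment`, and `incrementTwice` are assumptions for this sketch; the real std::lock_guard is a class template in the standard library, and this is only meant to show the shape of the idea.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical, simplified stand-in for std::lock_guard (illustration only).
class simple_guard {
public:
    explicit simple_guard(std::mutex &m) : mutex_(m) { mutex_.lock(); }
    ~simple_guard() { mutex_.unlock(); }

    // Copying a guard would unlock the mutex twice, so it is disabled.
    simple_guard(const simple_guard &) = delete;
    simple_guard &operator=(const simple_guard &) = delete;

private:
    std::mutex &mutex_;
};

struct Counter {
    std::mutex lock;
    int value = 0;
};

void increment(Counter &c) {
    simple_guard g(c.lock); // lock acquired in the constructor
    c.value++;
} // `g` is destructed here, which unlocks c.lock

int incrementTwice() {
    Counter c;
    increment(c);
    assert(c.value == 1);
    increment(c); // can lock again only because the first call released the lock
    return c.value;
}
```

The function body of `increment` never mentions unlocking; the release is tied entirely to the lifetime of `g`.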
This pattern is extremely common in modern C++ code. Take for example some of the collections from the standard library, such as unordered_map or vector:
#include <vector>

void someFunction() {
    std::vector<int> vec;
    // dynamically grow the heap-allocated backing array
    for (int i = 0; i < 1000; i++) {
        vec.push_back(i);
    }
    // end function without explicit cleanup of vector
}
Digging Deeper
Due to the behavior of stack-allocated objects and RAII principles, it is often preferred to avoid the explicit use of new and delete entirely. Instead, objects should be allocated on the stack wherever possible. These stack-allocated objects can then manage heap allocations under the covers and ensure that allocation and cleanup follow the object life cycle (construction, destruction).
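As one possible illustration, the int_array class from earlier could be sketched with a std::vector member instead of a raw pointer. The `at` and `size` members and the `sumOfSquares` function are hypothetical additions for this example and are not part of the original class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of int_array holding a std::vector instead of a raw pointer. The
// vector owns the heap buffer, so no destructor needs to be written here.
class int_array {
public:
    explicit int_array(std::size_t size) : arr_(size) {} // elements start at 0

    int &at(std::size_t i) { return arr_.at(i); } // bounds-checked access
    std::size_t size() const { return arr_.size(); }

private:
    std::vector<int> arr_;
};

int sumOfSquares(std::size_t n) {
    int_array squares(n); // the object lives on the stack, its buffer on the heap
    assert(squares.size() == n);
    for (std::size_t i = 0; i < squares.size(); i++) {
        squares.at(i) = static_cast<int>(i * i);
    }
    int total = 0;
    for (std::size_t i = 0; i < squares.size(); i++) {
        total += squares.at(i);
    }
    return total;
} // `squares` is destructed here and its vector frees the buffer
```

A side benefit of this design is that copies behave sensibly: copying this version copies the vector, whereas copying the raw-pointer version would leave two objects that both try to delete the same array.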
When an object must be allocated on the heap, either directly or within another RAII object, a smart pointer should be used. Smart pointers are a way to represent heap-allocated resources on the stack that follow the same RAII principles. That is, when they go out of scope, they clean up the associated heap memory. The two main smart pointers from the standard library to know are unique_ptr and shared_ptr.
unique_ptr is a smart pointer that is only allowed to have a single owner. If you want to pass a unique_ptr to a function by value, it must be moved. Moving means giving ownership of the object away. A moved-from unique_ptr is left empty and must no longer be dereferenced; the lifetime of the object is now tied to whoever it was moved to. To make this clearer:
{ // start of outer scope
    unique_ptr<string> a;
    { // start of inner scope
        unique_ptr<string> b = make_unique<string>("Hello, World");

        // not allowed: this assignment tries to copy `b`, but
        // unique_ptrs cannot be copied, so the line would not compile
        // auto c = b;

        // `a` now owns the string; `b` is left empty (nullptr)
        a = move(b);

        // not allowed: `b` is empty after the move, so dereferencing it
        // would be undefined behavior
        // b->length();
    } // end of inner scope, `b` is cleaned up but the memory allocated for
      // "Hello, World" is untouched since `a` is now the owner

    // allowed, since `a` is the owner of the string
    a->length();
} // end of outer scope, `a` is cleaned up and memory is reclaimed
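The walkthrough above moves between two local variables. Passing ownership into a function might look like the following sketch, where `consume` and `callerIsEmptyAfterMove` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Takes ownership of the string: the parameter is a unique_ptr passed by value.
std::size_t consume(std::unique_ptr<std::string> s) {
    return s->length();
} // `s` is destructed here, which frees the string

// Returns true if the caller's pointer is empty after ownership was moved away.
bool callerIsEmptyAfterMove() {
    auto greeting = std::make_unique<std::string>("Hello, World");

    // consume(greeting);  // would not compile: unique_ptr cannot be copied
    std::size_t len = consume(std::move(greeting));
    assert(len == 12);

    // After the move, `greeting` holds nullptr and must not be dereferenced.
    return greeting == nullptr;
}
```

Taking the unique_ptr by value makes the transfer visible at the call site, since the caller has to write std::move explicitly.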
shared_ptr is a smart pointer that allows multiple owners. Each time a shared_ptr is copied, a ref-counter is incremented. When a shared_ptr is destructed (cleaned up), the ref-counter is decremented. When the ref-counter hits 0, the associated heap-allocated memory is cleaned up. To make this clearer:
{ // start of outer scope
    shared_ptr<string> a;
    { // start of inner scope
        // new shared pointer, ref-count is 1
        shared_ptr<string> b = make_shared<string>("Hello, World");

        // allowed, copy is made and ref-count is 2
        auto c = b;

        // allowed, copy is made and ref-count is 3
        a = c;

        // both allowed to operate on underlying data
        b->length();
        c->length();
    } // end of inner scope, `b` and `c` cleaned up, ref-count is 1
} // end of outer scope, `a` is cleaned up, ref-count is 0, memory is freed
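The reference counts in the comments above can be observed with shared_ptr's use_count() member. The following is a small sketch; `traceRefCounts` is a hypothetical function name, and use_count() is best treated as a debugging aid rather than something to build logic on.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Records the reference count at each step of the shared_ptr walkthrough.
std::vector<long> traceRefCounts() {
    std::vector<long> counts;
    std::shared_ptr<std::string> a;
    {
        auto b = std::make_shared<std::string>("Hello, World");
        counts.push_back(b.use_count()); // only `b` owns the string

        auto c = b;
        counts.push_back(b.use_count()); // `b` and `c`

        a = c;
        counts.push_back(b.use_count()); // `a`, `b` and `c`
    } // `b` and `c` are destructed here
    counts.push_back(a.use_count()); // only `a` is left

    a.reset(); // drops the last owner, so the string is freed
    assert(a == nullptr);
    counts.push_back(a.use_count()); // an empty shared_ptr reports 0
    return counts;
}
```

In a single-threaded run like this one, the recorded sequence should be 1, 2, 3, 1, 0, matching the comments in the walkthrough.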
With these two constructs, you should rarely need to own memory through a raw pointer ( *obj ) or to call new and delete yourself. This is a major step toward writing safer, more modern C++.