Process Coordination
C++
    #include <atomic>
    #include <iostream>
    #include <string>
    #include <thread>
    #include <vector>

    int main() {
        std::string msg = "Hello, World!";
        std::atomic_int index{-1};
        std::vector<std::thread> threads;
        for (int i = 0; i < static_cast<int>(msg.length()); i++) {
            threads.push_back(std::thread([ii = i, c = msg[i], &index]() {
                while ((ii - 1) != index.load()) {
                    // busy wait
                }
                std::cout << c;
                index.fetch_add(1);
            }));
        }
        for (auto &t : threads) {
            t.join();
        }
    }
Go
    package main

    import (
        "fmt"
        "sync"
    )

    func main() {
        broadcastRegister := make(chan chan int, 1)
        broadcastIn := make(chan int, 5)
        go channelBroadcaster(broadcastRegister, broadcastIn)
        var wg sync.WaitGroup
        for i, c := range "Hello, World!" {
            workerIn := make(chan int, 20)
            broadcastRegister <- workerIn
            wg.Add(1)
            go coordinatedPrint(i, string(c), workerIn, broadcastIn, &wg)
        }
        broadcastIn <- -1
        wg.Wait()
    }

    func channelBroadcaster(register chan chan int, in chan int) {
        outs := make([]chan int, 0, 10)
        for {
            select {
            case out := <-register:
                outs = append(outs, out)
            case index := <-in:
                for _, out := range outs {
                    out <- index
                }
            }
        }
    }

    func coordinatedPrint(index int, char string, in chan int,
        out chan int, wg *sync.WaitGroup) {
        defer wg.Done()
        for {
            prevI := <-in
            if (index - 1) == prevI {
                fmt.Print(char)
                out <- index
                break
            }
        }
    }
What This Code Does
Given the string "Hello, World!", each program must create as many concurrent units (threads for C++, goroutines for Go) as there are characters. The units must then coordinate, without the help of the main thread, to print the characters in order. Each concurrent unit may receive, at a minimum, the character it is responsible for as well as the index of that character.
What's the Same
Both implementations use a similar approach (for the sake of comparison) to solve the problem. Each concurrency unit waits for the shared index to be one less than the index it holds; in other words, it waits until it is next in line to print its character.
Also, both solutions wait, in the main function, for all concurrency units to finish their processing. In the Go code this is done with a sync.WaitGroup; in the C++ code the join() method is called on each thread.
What's Different
A LOT. Let's break it down.
The first thing to understand, before discussing the code, is that the C++ and Go communities hold different philosophies around concurrency. One of the creators of Go is credited with the following quote:
Don't communicate by sharing memory; share memory by communicating. — Rob Pike
To expand on this quote: in Go, it is more idiomatic to communicate state through channels. Data sent over a channel should not be shared between goroutines; that is, values sent over channels are copied so that two goroutines never point to the same location in memory. This share-nothing approach is a common way to avoid data races in concurrent programming.
While these ideas are sound, they are not always typical in C++. Applications are often written in C++ because performance matters, and when that is the case and concurrent computation is involved, C++ code will frequently share memory across threads to avoid the cost of copying data.
Breaking Down Go
Because the Go example aims to coordinate through message passing, we make use of channels. Channels can be thought of as synchronized queues. However, channels do not, by default, support advanced queueing patterns such as fan-out. Since every printing goroutine needs to communicate with all of the others, we implement a simple broadcaster (channelBroadcaster) that fans out our messages for us.
A WaitGroup allows the main goroutine to wait for all of the other goroutines to exit. In this case, we are sharing memory by passing a pointer to the WaitGroup to each goroutine. Even though Go strives to avoid sharing memory, WaitGroup comes from the standard library's sync package and is considered both safe and idiomatic.
Breaking Down C++
The C++ solution shares the current print index across threads using a std::atomic_int. Each thread is launched via a lambda that captures its own index, its character, and a reference to the shared atomic (references are similar to pointers in that they allow memory to be shared). Each thread busy-waits as long as the current value of the atomic integer is not the one it is looking for (index.load() retrieves the current value).
Once the main thread has launched all of the threads, it simply waits for them to exit by calling join() on each thread handle.
Some Conclusions
While sharing memory results in less code for this particular (and contrived) example, the Go solution is arguably more efficient: at no point do the goroutines spin waiting for a message. Reading from a channel is a blocking operation that the Go scheduler handles efficiently; it can put a goroutine to sleep and wake it when a message is available on the channel, rather than wasting CPU resources.