A Memory Model for C++: FAQ
This is an attempt to summarize the responses to common questions/objections
to our work on threads in C++ that, in our opinion, do have clear-cut responses.
We're still working on some of the harder questions ...
Why is this being addressed as a C++ issue instead of a Posix/thread
library issue?
As pointed out in H. Boehm, "Threads Cannot Be Implemented As a Library",
PLDI 2005 (or the technical report version), the fundamental difficulty with the
current C++/pthreads approach to threading is that a C++ compiler
can introduce data races where there were none in the source. This is
fundamentally a language specification and compiler issue, and cannot be
addressed by changes in the threads library specification.
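As a rough illustration of the kind of transformation at issue, the sketch below replays one unlucky interleaving sequentially, without real threads. It assumes a hypothetical code generator that implements a store to one small field by rewriting the whole word that contains it, which is one of the cases discussed in the cited paper. The names Pair, widened_load, widened_store_a, and the two demo functions are invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Two logically independent fields that happen to share one machine word.
// Assume a is protected by one lock and b by another (locks omitted here,
// since the interleaving is replayed sequentially).
struct Pair {
    std::uint8_t a;
    std::uint8_t b;
};

// What the source says: the assignment touches only a.
inline void store_a_as_written(Pair& p, std::uint8_t value) {
    p.a = value;
}

// Hypothetical widened store, split into steps so that another thread's
// action can be replayed in between.
inline Pair widened_load(const Pair& p) {
    return p;  // step 1: read the whole word, including b
}
inline void widened_store_a(Pair& p, Pair snapshot, std::uint8_t value) {
    snapshot.a = value;  // step 2: update a in the private copy
    p = snapshot;        // step 3: write the whole word back, b included
}

// Replays: thread 1 starts "p.a = 1", thread 2 runs "p.b = 2",
// thread 1 finishes. Returns the final value of b.
inline int lost_update_demo() {
    Pair p{0, 0};
    Pair snapshot = widened_load(p);  // thread 1, holding a's lock
    p.b = 2;                          // thread 2, holding b's lock
    widened_store_a(p, snapshot, 1);  // thread 1, still holding a's lock
    assert(p.a == 1);                 // thread 1's own update did land
    return p.b;
}

// The same two source-level actions with the store performed as written.
inline int as_written_demo() {
    Pair p{0, 0};
    p.b = 2;                   // thread 2
    store_a_as_written(p, 1);  // thread 1
    return p.b;
}
```

Here lost_update_demo() returns 0 while as_written_demo() returns 2: the widened store overwrites the other thread's update to b, even though no two threads touch the same field in the source and each field is accessed only under its own lock.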
Why is this being addressed as a C++ issue instead of a C issue?
Historically the reason was that some of us had better connections to
the C++ committee. At this point, a better reason is that the C++
committee is actively working on a revised language specification, but
the C committee is not. We are trying to keep the C committee well-informed,
and are hoping that they will eventually adopt some of the C++ changes,
possibly as a technical report instead of a full standard revision.
Why cannot the compiler optimization issues just be side-stepped by
declaring the relevant shared variables volatile?
This turns out to be impractical, for several reasons:
- The examples of compiler-introduced data races in the previously
cited PLDI paper deal mostly with cases in which a shared variable,
say x, is already protected by a lock. Hence we would have to
require the programmer to declare lock-protected variables, i.e.
internal monitor variables, volatile. This was clearly never the
intention of the pthread standard. (See the discussion of
Memory Synchronization in SUSV3.)
More importantly, such a requirement would be completely impractical. It is
very common to "wrap" single-threaded code in a lock to make it usable
in a multithreaded application. This would not be possible if all
variables/fields in the single-threaded code now had to be declared
volatile.
Thread-safe versions of the C++ standard library effectively
rely on this approach, which we believe to be the only viable one.
It is also similar to the one adopted by more recent Java container libraries,
for example.
Requiring volatile declarations for lock-protected
variables would effectively require most libraries to come
in two versions: a standard version, and one in which all internal static
variables were declared volatile.
- It is not hard to construct examples in which the compiler
introduces a race on a variable that should in fact be accessed by
only a single thread. (For one such example, see
WG14 paper N1131.) Thus we would also have to tell programmers
to declare unshared variables volatile, if it could look to
the compiler as though they might be shared. This appears to us to
be a completely unreasonable request.
- Aside from the preceding two issues, it is not at all clear that
volatile provides meaningful guarantees for multithreaded
programs, or what those are. Dave Butenhof has often been quoted
as stating that
"The use of 'volatile' is not sufficient to ensure proper memory
visibility or synchronization between threads."
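The "wrapping" idiom mentioned in the first item above can be sketched as follows. This is a minimal example assuming C++11's std::mutex and std::thread, which postdate this FAQ (pthread mutexes would play the same role). Tally, LockedTally, and run_workers are hypothetical names. Note that no field of the single-threaded class is declared volatile; correctness rests entirely on the lock.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Ordinary single-threaded code: it knows nothing about threads.
class Tally {
public:
    void add(long n) { sum_ += n; ++calls_; }
    long sum() const { return sum_; }
    long calls() const { return calls_; }
private:
    long sum_ = 0;
    long calls_ = 0;
};

// Wrapper that makes Tally usable from several threads by holding a
// mutex around every access to the wrapped object.
class LockedTally {
public:
    void add(long n) {
        std::lock_guard<std::mutex> guard(mutex_);
        inner_.add(n);
    }
    long sum() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return inner_.sum();
    }
private:
    mutable std::mutex mutex_;
    Tally inner_;
};

// Starts `threads` workers that each call add(1) `per_thread` times,
// waits for all of them, and returns the final sum.
inline long run_workers(int threads, int per_thread) {
    assert(threads >= 0 && per_thread >= 0);
    LockedTally tally;
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&tally, per_thread] {
            for (int k = 0; k < per_thread; ++k) tally.add(1);
        });
    }
    for (auto& worker : pool) worker.join();
    return tally.sum();
}
```

With this wrapper, run_workers(4, 10000) yields 40000 provided the compiler does not introduce accesses to the wrapped fields outside the critical sections. If lock-protected data had to be volatile, Tally itself would need a second, volatile-qualified version before it could be wrapped this way.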
Why is it not possible to just adopt the Java memory model?
The Java memory model was heavily motivated by the desire to preserve
both type-safety and some other security guarantees, both of which are
essential if untrusted code is to be run in the same address space with
trusted code. Since running untrusted code this way is in any case impossible
for ordinary C or C++ code, neither guarantee is a consideration here.
Furthermore, it appears that making Java-like guarantees for C++ is
potentially expensive, particularly on architectures that provide a weak
memory model, or weak atomicity guarantees for ordinary stores. For
example, implementations would have to ensure that an object pointer
cannot be passed from one thread to another without previously making
the object's vtable pointer visible to the other thread. This is likely
to require a memory barrier during object construction. And C++
programmers, unlike Java programmers, tend to expect object construction
to be a very light-weight operation.
It does, however, appear more and more likely that we will borrow heavily
from the Java memory model, particularly to explain the semantics of C++
atomic operations.
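One way to place the cost of the object hand-off described above at the publication point, instead of in every constructor, is an explicit atomic publication. The following is a minimal sketch assuming C++11's std::atomic with release/acquire ordering, which postdates this FAQ and is not necessarily what the authors had in mind. Shape, Triangle, and publish_and_read are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};
struct Triangle : Shape {
    int sides() const override { return 3; }
};

// One thread constructs an object and publishes its address; another
// waits for the address and makes a virtual call through it.
inline int publish_and_read() {
    std::atomic<Shape*> slot{nullptr};
    int sides_seen = 0;

    std::thread consumer([&] {
        Shape* s;
        // The acquire load pairs with the release store below, so once a
        // non-null pointer is seen, everything written during construction
        // (including whatever the implementation uses for virtual
        // dispatch) is visible to this thread.
        while ((s = slot.load(std::memory_order_acquire)) == nullptr) {
            std::this_thread::yield();
        }
        sides_seen = s->sides();
        delete s;
    });

    std::thread producer([&] {
        // Plain construction: nothing here asks for a barrier. The
        // ordering is requested only by the release store that publishes.
        slot.store(new Triangle, std::memory_order_release);
    });

    producer.join();
    consumer.join();
    assert(sides_seen > 0);  // consumer finished before join returned
    return sides_seen;
}
```

In this sketch objects that are never handed to another thread pay nothing extra at construction. Passing the pointer through an ordinary, non-atomic variable instead would be a data race, which is the case a Java-like model would have to make safe.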
Why is it not possible to just adopt the CLI memory model?
See the answer to the preceding question. The CLI memory model
also still appears to be a moving target.