Memory model and atomics

Memory model and atomics in C++

Synchronized program behaviour

Sequential consistency (Lamport’s definition):

“…the results … is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by the program

Simultaneous operations (concurrent-with): no happens-before ordering.

Atomicity: Operation guaranteed to execute without visible intermediate state.

In a correctly synchronized program:

What happens to our code

SC is very inefficient to guarantee. The following changes may happen to the execution:

flag1 = 0; flag2 = 0;

void thread1() {
  while (true) {
    flag1 = 1;
    if (flag2 != 0) {
      // resolve contention
    } else {
      // critical section
      flag1 = 0;
    }
  }
}

void thread2() {
  while (true) {
    flag2 = 1;
    if (flag1 != 0) {
      // resolve contention
    } else {
      // critical section
      flag1 = 0;
    }
  }
}

The above execution is subject to:

As-if rule

C++ compiler can perform any changes so long as these are obeyed in any single-thread:

Compilers don’t know which memory locations are mutable shared and can asynchronously change due to op in another thread

Also program with undefined behaviour is free from the as-if rule.

Operation reordering

MM in C++11

Multithreading aware memory model: Using mutex/futex/barrier don’t require understanding of the MM.

Essential to make all MT facilities work.

Structure

Each var = 1 obj. Obj occupy at least 1 memory location Fundemantal types: exactly 1 memory location, whatever their size, even if they are part of an array. Adjacent bit field $\in$ same memory location

Concurrency

Mult threads access the same memory location.

Data race:

Ordering

Operation/Transaction: atomic logical operation on related data

Transactional memory: hardware guarantees for atomic load-stores of objects.

Acquire/Release barrier

One way barriers.

A release store x.store(rel) makes prior accesses (of x) visible to a thread performing acquire load x.load(acq) pairing with the store.

  1. Acq/Rel without SC can be reordered past each other
  2. With SC, they cannot.

With locks the compiler cannot reorder mtx.lock() before mtx.unlock because it would violate the internal acquire/release operations!

CPP Objects

Every var is an object.

Critical section implementation

  1. Locks
  2. Ordered Atomics
  3. Transactional Memory

Modification Order

Sequence of all writes to an object from all threads in the program.

Each object has a modification order.

In data race, diff threads see distinct sequences of values for a single variable.

Once a thread see a particular entry

Each individual object must have a agreed on modification order

Different objects can have different relative ordering of the modifications.

Atomic operations and types

Every atomic operation is an indivisible transaction, e.g. Load/store.

Non-atomic operations may be seen as partially complete.

atomic weapons for