`allocate_bucket()`

Remember that whole "two-level array" thingy? This is where it starts coming into play. allocate_bucket does just what is sounds like: allocating a bucket. Recall that a bucket is one of the arrays in the second level of the two level array.

+---+---+---+---+---+
| 1 | 2 | 3 | 4 | 5 |
+---+---+---+---+---+
  |   |   |   |   |
  v   v   v   v   v
+---+---+---+---+---+
| 1 | 2 | 3 | 4 | 5 |
+---+---+---+---+---+
    | 2 | 3 | 4 | 5 |
    +---+---+---+---+
        | 3 | 4 | 5 |
        +---+---+---+
          ^ | 4 | 5 |
          | +---+---+
          |     | 5 |
          |     +---+
          |
        we're allocating one of these little guys

There are two parts to allocate_bucket: allocating the memory and setting the pointer. We start off by tapping into the alloc crate's API. First, we create a Layout, which describes the allocation we want. The Layout::array::<Atomic64>() indicates that we want a bunch of AtomicU64 right next to each other in memory. If creating the layout fails (due to overflow), we call capacity_overflow, which just panics.

You might ask, why not just directly call panic!? Apparently, it reduces the generated code size if we just have panic in one function, which we then call from multiple places. I found this trick in the source code for std::vec::Vec. You can learn a lot from reading the Standard Library code. That's how I've learned a lot of the low level stuff I know. It's also a good way to see what good, idiomatic Rust looks like.

#![allow(unused)]
fn main() {
const FIRST_BUCKET_SIZE: usize = 8;

fn allocate_bucket(&self, bucket: usize) {
    // The shift-left is equivalent to raising 2 to the power of bucket
    let size = FIRST_BUCKET_SIZE * (1 << bucket);
    let layout = match Layout::array::<AtomicU64>(size) {
        Ok(layout) => layout,
        Err(_) => capacity_overflow(),
    };

}

The next thing we do is just another check. The Standard Library does both checks and I trust their strategy.

#![allow(unused)]
fn main() {
// Make sure allocation is ok
match alloc_guard(layout.size()) {
    Ok(_) => {}
    Err(_) => capacity_overflow(),
}

}

Miri is about to make its debut! Miri is a tool that runs your code in a special environment and detects undefined behavior (or UB, as the cool kids call it).

Now that our layout is all good, we can perform the actual allocation. We instantiate the Global struct, which is the allocator we're using. The allocator returns a pointer to our new allocation once it's finished allocating. Why are we using allocated_zeroed you might ask? Why not just allocate normally? The answer: It's Utmost Holiness: Miri. In all seriousness though, Miri has been and invaluable tool in catching memory and concurrency bugs. When we just allocate normally, Miri throws and error when we start actually using the memory later on, saying that intrinsics::atomic_cxchg_acqrel_failrelaxed(dst, old, new) requires initialized data. Thus, we just zero the memory for now. Later, it might be worth it to do some MaybeUninit magic, but honestly, I don't know if there'll be much, if any, performance gains.

Once again, we have more checks, and we'll just panic! if the allocation fails. handle_alloc_error is from the alloc crate:

#![allow(unused)]
fn main() {
let allocator = Global;

let allocation = allocator.allocate_zeroed(layout);
let ptr = match allocation {
    Ok(ptr) => ptr.as_ptr() as *mut AtomicU64,
    Err(_) => handle_alloc_error(layout),
};

}

The final part is to swap in the pointer into our array of buffers (the first level of the two-level array). We use the compare_exchange function, with a null pointer as the expected value, and our new pointer from the allocation. If compare_exchange fails, that means the pointer is no longer null, and someone else compare_exchangeded in a pointer. Therefore, the bucket is already allocated. In this case, we deallocate the freshly allocated memory. Notice how we assess the result of compare_exchange with Result::is_err(); we don't care about the value compare_exchange returns.

#![allow(unused)]
fn main() {
    if self.buffers[bucket] // <- this is an AtomicPtr<AtomicU64>
        .compare_exchange(
            ptr::null_mut::<AtomicU64>(), // old value
            ptr, // new value
            Ordering::AcqRel, // ordering on success
            Ordering::Relaxed, // ordering on fail
        )
        .is_err()
    {
        unsafe {
            // # Safety
            // We know that the pointer returned from the allocation is NonNull so
            // we can call unwrap() on NonNull::new(). We also know that the pointer
            // is pointing to the correct memory because we just got it from the
            // allocation. We know the layout is valid, as it is the same layout we
            // used to allocate.
            allocator.deallocate(NonNull::new(ptr as *mut u8).unwrap(), layout);
        }
    }
}

}

Success and Fail Orderings

Like all atomic operations, compare_exchange uses the orderings. Most operations take 1, but this bad boy takes two. Since compare_exchange reads and writes a memory location, we're using AcqRel. Since we always use AcqRel for the buckets, the load part (Acquire) of the compare_exchange will always see the most recent value because the store part is Release. If we just used Acquire, the store part of the compare_exchange would be Relaxed, which doesn't guarantee that the modification to the AtomicPtr<AtomicU64> is published to other threads by any certain point. Under a Relaxed situation, another thread might load a null pointer in its compare_exchange, even though our thread swapped in a pointer to memory!

That's the success ordering. The fail ordering is Relaxed because we don't need to establish any synchronization if the operation fails. It failed; we're not doing any stores. When I first saw this, I had the question, "Why do we provide different success and fail orderings if the compare_exchange doesn't know if it will fail or not?" The answer, thanks to Alice on the Rust User Forums, is that the compiler picks an ordering that will always satisfy the stronger ordering. Thus, compare_exchange(success: AcqRel, fail: Release) executes as compare_exchange(success: AcqRel, fail: Acquire) to ensure that the initial load is Acquire for both cases.

There's a little more to it; if you're still curious, see this thread on the Rust User Forums.

The last function in the "memory" section is reserve(), which I've "reserved" for last.

Complete source for `allocate_bucket()`

#![allow(unused)]
fn main() {
fn allocate_bucket(&self, bucket: usize) {
    // The shift-left is equivalent to raising 2 to the power of bucket
    let size = FIRST_BUCKET_SIZE * (1 << bucket);
    let layout = match Layout::array::<AtomicU64>(size) {
        Ok(layout) => layout,
        Err(_) => capacity_overflow(),
    };

    // Make sure allocation is ok
    match alloc_guard(layout.size()) {
        Ok(_) => {}
        Err(_) => capacity_overflow(),
    }

    let allocator = Global;
    // allocate_zeroed because miri complains about accessing uninitialized memory
    // TODO: Maybe use MaybeUninit?
    let allocation = allocator.allocate_zeroed(layout);
    let ptr = match allocation {
        Ok(ptr) => ptr.as_ptr() as *mut AtomicU64,
        Err(_) => handle_alloc_error(layout),
    };

    // If the CAS fails, then the bucket has already been initalized with memory
    // and we free the memory we just allocated
    if self.buffers[bucket]
        .compare_exchange(
            ptr::null_mut::<AtomicU64>(),
            ptr,
            Ordering::AcqRel,
            Ordering::Relaxed,
        )
        .is_err()
    {
        unsafe {
            // # Safety
            // We know that the pointer returned from the allocation is NonNull so
            // we can call unwrap() on NonNull::new(). We also know that the pointer
            // is pointing to the correct memory because we just got it from the
            // allocation. We know the layout is valid, as it is the same layout we
            // used to allocate.
            allocator.deallocate(NonNull::new(ptr as *mut u8).unwrap(), layout);
        }
    }
}

}

Building a Rusty, Lock-free Dynamically Resizable Array

allocate_bucket()

Success and Fail Orderings

Complete source for allocate_bucket()

`allocate_bucket()`

Complete source for `allocate_bucket()`