
### Continuous Integration

| Drone | Travis | Cirrus |
| -------- | ------ | ------- |
| [![Build Status](https://cloud.drone.io/api/badges/concurrencykit/ck/status.svg)](https://cloud.drone.io/concurrencykit/ck) | [![Build Status](https://travis-ci.org/concurrencykit/ck.svg)](https://travis-ci.org/concurrencykit/ck) | [![Build Status](https://api.cirrus-ci.com/github/concurrencykit/ck.svg?branch=master)](https://cirrus-ci.com/github/concurrencykit/ck) |

Compilers tested in the past include gcc, clang, cygwin, icc, mingw32, mingw64 and suncc across all supported architectures. All new architectures are required to pass the integration test and undergo extensive code review.

Continuous integration is currently enabled for the following targets:
 * `darwin/clang/x86-64`
 * `freebsd/clang/x86-64`
 * `linux/gcc/arm64`
 * `linux/gcc/x86-64`
 * `linux/clang/x86-64`
 * `linux/clang/ppc64le`

### Compile and Build

* Step 1.
  Run `./configure`. For additional options, try `./configure --help`.

* Step 2.
  To compile the regression tests (requires POSIX threads), use `make regressions`. To compile libck, use `make all` or `make`.

* Step 3.
  To install, use `make install`. To uninstall, use `make uninstall`.

See http://concurrencykit.org/ for more information.

### Supported Architectures

Concurrency Kit supports any architecture using compiler built-ins as a fallback. There is usually a performance degradation associated with this.

Concurrency Kit has specialized assembly for the following architectures:
 * `aarch64`
 * `arm`
 * `ppc`
 * `ppc64`
 * `s390x`
 * `sparcv9+`
 * `x86`
 * `x86_64`
 
### Features

#### Concurrency Primitives

##### ck_pr

Concurrency primitives as made available by the underlying architecture, including native support for all atomic operations, transactional memory, pipeline control, read-for-ownership and more.

##### ck_backoff

A simple and efficient (minimal noise) backoff function.
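
As a rough illustration of the idea (this is not the `ck_backoff` API), an exponential backoff can be sketched in plain C. The names `backoff_t`, `backoff_init`, `backoff_wait` and both constants are hypothetical and chosen only for this example.

```c
/* Hypothetical exponential backoff sketch; not the ck_backoff interface. */
#define BACKOFF_INITIAL 16u   /* assumed starting spin count */
#define BACKOFF_CEILING 1024u /* assumed upper bound on the spin count */

typedef struct {
    unsigned int limit; /* number of spin iterations for the next wait */
} backoff_t;

static void backoff_init(backoff_t *b)
{
    b->limit = BACKOFF_INITIAL;
}

/* Spin for the current limit, then double it until the ceiling is reached. */
static void backoff_wait(backoff_t *b)
{
    /* volatile keeps the compiler from removing the empty loop */
    for (volatile unsigned int i = 0; i < b->limit; i++) {
    }

    if (b->limit < BACKOFF_CEILING)
        b->limit *= 2;
}
```

A caller would typically invoke `backoff_wait` after each failed attempt to acquire a contended resource, so that repeated failures wait progressively longer.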

##### ck_cc

Abstracted compiler builtins for writing efficient concurrent data structures.

#### Safe Memory Reclamation

##### ck_epoch

A scalable safe memory reclamation mechanism with support for idle threads and various optimizations that make it better than or competitive with many state-of-the-art solutions.

##### ck_hp

Implements support for hazard pointers, a simple and efficient lock-free safe memory reclamation mechanism.

#### Data Structures

##### ck_array

A simple concurrently-readable pointer array structure.

##### ck_bitmap

An efficient multi-reader and multi-writer concurrent bitmap structure.
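
The general technique behind a concurrent bitmap can be sketched with C11 atomics. This is only an illustration under assumed names (`bitmap_t`, `bitmap_set`, `bitmap_reset`, `bitmap_test`, `BITMAP_WORDS`), not the `ck_bitmap` interface, and it performs no bounds checking.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned int) * CHAR_BIT)
#define BITMAP_WORDS 4 /* assumed fixed size for the sketch */

typedef struct {
    atomic_uint words[BITMAP_WORDS];
} bitmap_t;

static void bitmap_init(bitmap_t *bm)
{
    for (size_t i = 0; i < BITMAP_WORDS; i++)
        atomic_init(&bm->words[i], 0u);
}

/* Atomically set a bit; returns whether it was already set. */
static bool bitmap_set(bitmap_t *bm, size_t bit)
{
    unsigned int mask = 1u << (bit % WORD_BITS);
    unsigned int old = atomic_fetch_or(&bm->words[bit / WORD_BITS], mask);
    return (old & mask) != 0;
}

/* Atomically clear a bit; returns whether it was previously set. */
static bool bitmap_reset(bitmap_t *bm, size_t bit)
{
    unsigned int mask = 1u << (bit % WORD_BITS);
    unsigned int old = atomic_fetch_and(&bm->words[bit / WORD_BITS], ~mask);
    return (old & mask) != 0;
}

static bool bitmap_test(bitmap_t *bm, size_t bit)
{
    unsigned int mask = 1u << (bit % WORD_BITS);
    return (atomic_load(&bm->words[bit / WORD_BITS]) & mask) != 0;
}
```

Because each update is a single atomic read-modify-write on one word, any number of threads can set, clear and test bits without a lock.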

##### ck_ring

Efficient concurrent bounded FIFO data structures with various performance trade-offs. This includes specializations for single-reader, many-reader, single-writer and many-writer.
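
To give a feel for the single-writer, single-reader case, here is a minimal bounded ring written with C11 atomics. It is a sketch under assumed names (`spsc_ring_t`, `ring_enqueue`, `ring_dequeue`, `RING_SIZE`) and does not reproduce the `ck_ring` interface or its internal layout.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8u /* assumed capacity; must be a power of two */

typedef struct {
    atomic_uint head; /* next slot to read; written only by the consumer */
    atomic_uint tail; /* next slot to write; written only by the producer */
    void *slots[RING_SIZE];
} spsc_ring_t;

static void ring_init(spsc_ring_t *r)
{
    atomic_init(&r->head, 0u);
    atomic_init(&r->tail, 0u);
}

/* Producer side: returns false if the ring is full. */
static bool ring_enqueue(spsc_ring_t *r, void *item)
{
    unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (tail - head == RING_SIZE)
        return false;

    r->slots[tail & (RING_SIZE - 1)] = item;
    /* release publishes the slot contents before the new tail is visible */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false if the ring is empty. */
static bool ring_dequeue(spsc_ring_t *r, void **item)
{
    unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned int tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head == tail)
        return false;

    *item = r->slots[head & (RING_SIZE - 1)];
    /* release hands the slot back to the producer */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}
```

With exactly one producer and one consumer, no atomic read-modify-write is needed: each index has a single writer, and the acquire/release pairs order the slot accesses.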

##### ck_fifo

A reference implementation of the first published lock-free FIFO algorithm, with specialization for single-enqueuer-single-dequeuer and many-enqueuer-single-dequeuer and extensions to allow for node re-use.

##### ck_hp_fifo

A reference implementation of the above algorithm, implemented with safe memory reclamation using hazard pointers.

##### ck_hp_stack

A reference implementation of a Treiber stack with support for hazard pointers.

##### ck_stack

A reference implementation of an efficient lock-free stack, with specialized variants for a variety of memory management strategies and bounded concurrency.
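
The core of a lock-free (Treiber-style) stack is a compare-and-swap loop on the head pointer. The following sketch uses C11 atomics and hypothetical names (`struct lf_stack`, `struct lf_node`, `lf_stack_push`, `lf_stack_pop`); it is not the `ck_stack` interface. As written, the pop path is only safe if nodes are never freed or reused while other threads may still be popping, which is the ABA and reclamation problem that safe memory reclamation schemes address.

```c
#include <stdatomic.h>
#include <stddef.h>

struct lf_node {
    struct lf_node *next;
    int value;
};

struct lf_stack {
    _Atomic(struct lf_node *) head;
};

static void lf_stack_init(struct lf_stack *s)
{
    atomic_init(&s->head, NULL);
}

/* Link the node in front of the current head and publish it with a CAS. */
static void lf_stack_push(struct lf_stack *s, struct lf_node *n)
{
    struct lf_node *old = atomic_load_explicit(&s->head, memory_order_relaxed);

    do {
        n->next = old; /* old is refreshed by a failed CAS */
    } while (!atomic_compare_exchange_weak_explicit(&s->head, &old, n,
                 memory_order_release, memory_order_relaxed));
}

/*
 * Detach and return the head node, or NULL if the stack is empty.
 * Assumption: nodes are not freed or reused concurrently (no ABA protection).
 */
static struct lf_node *lf_stack_pop(struct lf_stack *s)
{
    struct lf_node *old = atomic_load_explicit(&s->head, memory_order_acquire);

    while (old != NULL &&
           !atomic_compare_exchange_weak_explicit(&s->head, &old, old->next,
                 memory_order_acquire, memory_order_acquire)) {
    }

    return old;
}
```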

##### ck_queue

A concurrent-reader-friendly derivative of the BSD queue interface. Coupled with a safe memory reclamation mechanism, it lets you implement scalable read-side queues with a simple search and replace.

##### ck_hs

An extremely efficient single-writer-many-reader hash set that satisfies lock-freedom with bounded concurrency without any usage of atomic operations and allows for recycling of unused or deleted slots. This data structure is recommended for use as a general hash set if it is possible to compute values from keys. Learn more at https://engineering.backtrace.io/workload-specialization/ and http://concurrencykit.org/articles/ck_hs.html.

##### ck_ht

A specialization of the `ck_hs` algorithm allowing for disjunct key-value pairs.

##### ck_rhs

A variant of `ck_hs` that utilizes robin-hood hashing to allow for improved performance with higher load factors and high deletion rates.

#### Synchronization Primitives

##### ck_ec

An extremely efficient event counter implementation, a better alternative to condition variables.

##### ck_barrier

A plethora of execution barriers, including centralized barriers, combining barriers, dissemination barriers, MCS barriers and tournament barriers.

##### ck_brlock

A simple big-reader lock implementation: a write-biased reader-writer lock with scalable read-side locking.

##### ck_bytelock

An implementation of bytelocks, for research purposes, allowing for (in theory) fast read-side acquisition without the use of atomic operations. In reality, memory barriers are required on the fast path.

##### ck_cohort

A generic lock cohorting interface that allows you to turn any lock into a scalable NUMA-friendly lock. There is a significant trade-off in fast path acquisition cost. Specialization is included for all relevant lock implementations in Concurrency Kit. Learn more by reading "Lock Cohorting: A General Technique for Designing NUMA Locks".

##### ck_elide

A generic lock elision framework that allows you to turn any lock implementation into an elision-aware implementation. This requires support for restricted transactional memory by the underlying hardware.

##### ck_pflock

Phase-fair reader-writer mutex that provides strong fairness guarantees between readers and writers. Learn more by reading "Spin-Based Reader-Writer Synchronization for Multiprocessor Real-Time Systems".

##### ck_rwcohort

A generic read-write lock cohorting interface that allows you to turn any read-write lock into a scalable NUMA-friendly lock. There is a significant trade-off in fast path acquisition cost. Specialization is included for all relevant lock implementations in Concurrency Kit. Learn more by reading "Lock Cohorting: A General Technique for Designing NUMA Locks".

##### ck_rwlock

A simple centralized write-biased read-write lock.
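
One way to picture a centralized write-biased read-write lock is a writer flag plus a reader count, where readers back off as soon as a writer announces itself. The sketch below uses C11 atomics with default sequentially consistent ordering and hypothetical names (`wb_rwlock_t`, `wb_read_lock`, `wb_write_lock` and so on); it is an illustration of the technique, not the `ck_rwlock` implementation.

```c
#include <stdatomic.h>

typedef struct {
    atomic_uint writer;  /* 1 while a writer holds or is acquiring the lock */
    atomic_uint readers; /* number of readers currently inside */
} wb_rwlock_t;

static void wb_rwlock_init(wb_rwlock_t *l)
{
    atomic_init(&l->writer, 0u);
    atomic_init(&l->readers, 0u);
}

static void wb_write_lock(wb_rwlock_t *l)
{
    /* Claim the writer flag first so that new readers stay out. */
    while (atomic_exchange(&l->writer, 1u) != 0u) {
    }
    /* Then wait for readers that were already inside to drain. */
    while (atomic_load(&l->readers) != 0u) {
    }
}

static void wb_write_unlock(wb_rwlock_t *l)
{
    atomic_store(&l->writer, 0u);
}

static void wb_read_lock(wb_rwlock_t *l)
{
    for (;;) {
        while (atomic_load(&l->writer) != 0u) {
        }
        atomic_fetch_add(&l->readers, 1u);
        /* Re-check: a writer may have arrived between the test and the add. */
        if (atomic_load(&l->writer) == 0u)
            return;
        atomic_fetch_sub(&l->readers, 1u);
    }
}

static void wb_read_unlock(wb_rwlock_t *l)
{
    atomic_fetch_sub(&l->readers, 1u);
}
```

The bias toward writers comes from the ordering in `wb_write_lock`: once the flag is set, arriving readers spin or retreat, so a steady stream of readers cannot starve the writer.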

##### ck_sequence

A sequence counter lock, popularized by the Linux kernel, that allows for very fast read and write synchronization for simple data structures where deep copy is permitted.
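
The sequence counter technique can be sketched as follows with C11 atomics, assuming a single writer. The type and function names (`seq_pair_t`, `seq_pair_write`, `seq_pair_read`) are hypothetical and this is not the `ck_sequence` interface. The writer makes the counter odd while it updates the data; a reader copies the data and retries if the counter was odd or changed during the copy.

```c
#include <stdatomic.h>

typedef struct {
    atomic_uint seq; /* odd while a write is in progress */
    atomic_int x;    /* protected data, accessed with relaxed atomics */
    atomic_int y;
} seq_pair_t;

static void seq_pair_init(seq_pair_t *p)
{
    atomic_init(&p->seq, 0u);
    atomic_init(&p->x, 0);
    atomic_init(&p->y, 0);
}

/* Assumption: only one thread ever calls seq_pair_write. */
static void seq_pair_write(seq_pair_t *p, int x, int y)
{
    unsigned int s = atomic_load_explicit(&p->seq, memory_order_relaxed);

    atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->x, x, memory_order_relaxed);
    atomic_store_explicit(&p->y, y, memory_order_relaxed);
    atomic_store_explicit(&p->seq, s + 2, memory_order_release);
}

/* Copy out a consistent (x, y) snapshot, retrying on interference. */
static void seq_pair_read(seq_pair_t *p, int *x, int *y)
{
    unsigned int before, after;

    do {
        before = atomic_load_explicit(&p->seq, memory_order_acquire);
        *x = atomic_load_explicit(&p->x, memory_order_relaxed);
        *y = atomic_load_explicit(&p->y, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&p->seq, memory_order_relaxed);
    } while ((before & 1u) != 0u || before != after);
}
```

Readers never write to shared memory, which is why this pattern suits small, frequently read structures that can be copied out wholesale.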

##### ck_swlock

A single-writer specialized read-lock that is copy-safe, useful for data structures that must remain small, be copied and contain in-band mutexes.

##### ck_tflock

Task-fair locks are fair read-write locks, derived from "Scalable reader-writer synchronization for shared-memory multiprocessors".

##### ck_spinlock

A basic but very fast spinlock implementation.

##### ck_spinlock_anderson

Scalable and fast Anderson spinlocks. Included for reference as one of the earliest scalable and fair lock implementations.

##### ck_spinlock_cas

A basic spinlock utilizing compare_and_swap.

##### ck_spinlock_dec

A basic spinlock, a C adaptation of the older optimized Linux kernel spinlock for x86. Primarily here for reference.

##### ck_spinlock_fas

A basic spinlock utilizing atomic exchange.
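
For illustration, a spinlock built on atomic exchange (fetch-and-store) can be written in a few lines of C11. The names `fas_lock_t`, `fas_lock`, `fas_trylock` and `fas_unlock` are hypothetical; this is a sketch of the technique rather than the `ck_spinlock_fas` implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_bool locked;
} fas_lock_t;

static void fas_lock_init(fas_lock_t *l)
{
    atomic_init(&l->locked, false);
}

/* One exchange attempt: succeeds if the previous value was false. */
static bool fas_trylock(fas_lock_t *l)
{
    return !atomic_exchange_explicit(&l->locked, true, memory_order_acquire);
}

static void fas_lock(fas_lock_t *l)
{
    while (atomic_exchange_explicit(&l->locked, true, memory_order_acquire)) {
        /* Spin on plain loads until the lock looks free, then retry. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed)) {
        }
    }
}

static void fas_unlock(fas_lock_t *l)
{
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

The inner load-only loop keeps waiting threads from repeatedly writing to the lock word, which reduces cache-line traffic under contention.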

##### ck_spinlock_clh

An efficient implementation of the scalable CLH lock, providing many of the same performance properties as MCS with a better fast path.

##### ck_spinlock_hclh

A NUMA-friendly CLH lock.

##### ck_spinlock_mcs

An implementation of the seminal scalable and fair MCS lock.

##### ck_spinlock_ticket

An implementation of fair centralized locks.
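
The fairness of a ticket lock comes from handing out tickets in arrival order and serving them in the same order. A minimal sketch with C11 atomics, using hypothetical names (`ticket_lock_t`, `ticket_lock`, `ticket_unlock`) and not the `ck_spinlock_ticket` interface, might look like this:

```c
#include <stdatomic.h>

typedef struct {
    atomic_uint next;    /* next ticket to hand out */
    atomic_uint serving; /* ticket currently allowed to hold the lock */
} ticket_lock_t;

static void ticket_lock_init(ticket_lock_t *l)
{
    atomic_init(&l->next, 0u);
    atomic_init(&l->serving, 0u);
}

static void ticket_lock(ticket_lock_t *l)
{
    /* Take a ticket; arrival order determines acquisition order. */
    unsigned int me = atomic_fetch_add_explicit(&l->next, 1u,
                                                memory_order_relaxed);

    while (atomic_load_explicit(&l->serving, memory_order_acquire) != me) {
    }
}

static void ticket_unlock(ticket_lock_t *l)
{
    /* Only the lock holder writes serving, so a plain increment suffices. */
    unsigned int cur = atomic_load_explicit(&l->serving, memory_order_relaxed);

    atomic_store_explicit(&l->serving, cur + 1, memory_order_release);
}
```

All waiters spin on the same `serving` word, which is what makes the lock centralized; queue-based locks such as MCS and CLH avoid that by giving each waiter its own location to spin on.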