summaryrefslogtreecommitdiffstats
path: root/doc/go_mem.html
blob: 026c1172e3a4c0ea9fa8f147202471f083b2cae8 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
<!--{
	"Title": "The Go Memory Model",
	"Subtitle": "Version of June 6, 2022",
	"Path": "/ref/mem"
}-->

<style>
p.rule {
  font-style: italic;
}
</style>

<h2 id="introduction">Introduction</h2>

<p>
The Go memory model specifies the conditions under which
reads of a variable in one goroutine can be guaranteed to
observe values produced by writes to the same variable in a different goroutine.
</p>


<h3 id="advice">Advice</h3>

<p>
Programs that modify data being simultaneously accessed by multiple goroutines
must serialize such access.
</p>

<p>
To serialize access, protect the data with channel operations or other synchronization primitives
such as those in the <a href="/pkg/sync/"><code>sync</code></a>
and <a href="/pkg/sync/atomic/"><code>sync/atomic</code></a> packages.
</p>

<p>
If you must read the rest of this document to understand the behavior of your program,
you are being too clever.
</p>

<p>
Don't be clever.
</p>

<h3 id="overview">Informal Overview</h3>

<p>
Go approaches its memory model in much the same way as the rest of the language,
aiming to keep the semantics simple, understandable, and useful.
This section gives a general overview of the approach and should suffice for most programmers.
The memory model is specified more formally in the next section.
</p>

<p>
A <em>data race</em> is defined as
a write to a memory location happening concurrently with another read or write to that same location,
unless all the accesses involved are atomic data accesses as provided by the <code>sync/atomic</code> package.
As noted already, programmers are strongly encouraged to use appropriate synchronization
to avoid data races.
In the absence of data races, Go programs behave as if all the goroutines
were multiplexed onto a single processor.
This property is sometimes referred to as DRF-SC: data-race-free programs
execute in a sequentially consistent manner.
</p>

<p>
While programmers should write Go programs without data races,
there are limitations to what a Go implementation can do in response to a data race.
An implementation may always react to a data race by reporting the race and terminating the program.
Otherwise, each read of a single-word-sized or sub-word-sized memory location
must observe a value actually written to that location (perhaps by a concurrent executing goroutine)
and not yet overwritten.
These implementation constraints make Go more like Java or JavaScript,
in that most races have a limited number of outcomes,
and less like C and C++, where the meaning of any program with a race
is entirely undefined, and the compiler may do anything at all.
Go's approach aims to make errant programs more reliable and easier to debug,
while still insisting that races are errors and that tools can diagnose and report them.
</p>

<h2 id="model">Memory Model</h2>

<p>
The following formal definition of Go's memory model closely follows
the approach presented by Hans-J. Boehm and Sarita V. Adve in
“<a href="https://www.hpl.hp.com/techreports/2008/HPL-2008-56.pdf">Foundations of the C++ Concurrency Memory Model</a>”,
published in PLDI 2008.
The definition of data-race-free programs and the guarantee of sequential consistency
for race-free programs are equivalent to the ones in that work.
</p>

<p>
The memory model describes the requirements on program executions,
which are made up of goroutine executions,
which in turn are made up of memory operations.
</p>

<p>
A <i>memory operation</i> is modeled by four details:
</p>
<ul>
<li>its kind, indicating whether it is an ordinary data read, an ordinary data write,
or a <i>synchronizing operation</i> such as an atomic data access,
a mutex operation, or a channel operation,
<li>its location in the program,
<li>the memory location or variable being accessed, and
<li>the values read or written by the operation.
</ul>
<p>
Some memory operations are <i>read-like</i>, including read, atomic read, mutex lock, and channel receive.
Other memory operations are <i>write-like</i>, including write, atomic write, mutex unlock, channel send, and channel close.
Some, such as atomic compare-and-swap, are both read-like and write-like.
</p>

<p>
A <i>goroutine execution</i> is modeled as a set of memory operations executed by a single goroutine.
</p>

<p>
<b>Requirement 1</b>:
The memory operations in each goroutine must correspond to a correct sequential execution of that goroutine,
given the values read from and written to memory.
That execution must be consistent with the <i>sequenced before</i> relation,
defined as the partial order requirements set out by the <a href="/ref/spec">Go language specification</a>
for Go's control flow constructs as well as the <a href="/ref/spec#Order_of_evaluation">order of evaluation for expressions</a>.
</p>

<p>
A Go <i>program execution</i> is modeled as a set of goroutine executions,
together with a mapping <i>W</i> that specifies the write-like operation that each read-like operation reads from.
(Multiple executions of the same program can have different program executions.)
</p>

<p>
<b>Requirement 2</b>:
For a given program execution, the mapping <i>W</i>, when limited to synchronizing operations,
must be explainable by some implicit total order of the synchronizing operations
that is consistent with sequencing and the values read and written by those operations.
</p>

<p>
The <i>synchronized before</i> relation is a partial order on synchronizing memory operations,
derived from <i>W</i>.
If a synchronizing read-like memory operation <i>r</i>
observes a synchronizing write-like memory operation <i>w</i>
(that is, if <i>W</i>(<i>r</i>) = <i>w</i>),
then <i>w</i> is synchronized before <i>r</i>.
Informally, the synchronized before relation is a subset of the implied total order
mentioned in the previous paragraph,
limited to the information that <i>W</i> directly observes.
</p>

<p>
The <i>happens before</i> relation is defined as the transitive closure of the
union of the sequenced before and synchronized before relations.
</p>

<p>
<b>Requirement 3</b>:
For an ordinary (non-synchronizing) data read <i>r</i> on a memory location <i>x</i>,
<i>W</i>(<i>r</i>) must be a write <i>w</i> that is <i>visible</i> to <i>r</i>,
where visible means that both of the following hold:
</p>

<ol>
<li><i>w</i> happens before <i>r</i>.
<li><i>w</i> does not happen before any other write <i>w'</i> (to <i>x</i>) that happens before <i>r</i>.
</ol>

<p>
A <i>read-write data race</i> on memory location <i>x</i>
consists of a read-like memory operation <i>r</i> on <i>x</i>
and a write-like memory operation <i>w</i> on <i>x</i>,
at least one of which is non-synchronizing,
which are unordered by happens before
(that is, neither <i>r</i> happens before <i>w</i>
nor <i>w</i> happens before <i>r</i>).
</p>

<p>
A <i>write-write data race</i> on memory location <i>x</i>
consists of two write-like memory operations <i>w</i> and <i>w'</i> on <i>x</i>,
at least one of which is non-synchronizing,
which are unordered by happens before.
</p>

<p>
Note that if there are no read-write or write-write data races on memory location <i>x</i>,
then any read <i>r</i> on <i>x</i> has only one possible <i>W</i>(<i>r</i>):
the single <i>w</i> that immediately precedes it in the happens before order.
</p>

<p>
More generally, it can be shown that any Go program that is data-race-free,
meaning it has no program executions with read-write or write-write data races,
can only have outcomes explained by some sequentially consistent interleaving
of the goroutine executions.
(The proof is the same as Section 7 of Boehm and Adve's paper cited above.)
This property is called DRF-SC.
</p>

<p>
The intent of the formal definition is to match
the DRF-SC guarantee provided to race-free programs
by other languages, including C, C++, Java, JavaScript, Rust, and Swift.
</p>

<p>
Certain Go language operations such as goroutine creation and memory allocation
act as synchronization operations.
The effect of these operations on the synchronized-before partial order
is documented in the “Synchronization” section below.
Individual packages are responsible for providing similar documentation
for their own operations.
</p>

<h2 id="restrictions">Implementation Restrictions for Programs Containing Data Races</h2>

<p>
The preceding section gave a formal definition of data-race-free program execution.
This section informally describes the semantics that implementations must provide
for programs that do contain races.
</p>

<p>
Any implementation can, upon detecting a data race,
report the race and halt execution of the program.
Implementations using ThreadSanitizer
(accessed with “<code>go</code> <code>build</code> <code>-race</code>”)
do exactly this.
</p>

<p>
A read of an array, struct, or complex number
may by implemented as a read of each individual sub-value
(array element, struct field, or real/imaginary component),
in any order.
Similarly, a write of an array, struct, or complex number
may be implemented as a write of each individual sub-value,
in any order.
</p>

<p>
A read <i>r</i> of a memory location <i>x</i>
holding a value
that is not larger than a machine word must observe
some write <i>w</i> such that <i>r</i> does not happen before <i>w</i>
and there is no write <i>w'</i> such that <i>w</i> happens before <i>w'</i>
and <i>w'</i> happens before <i>r</i>.
That is, each read must observe a value written by a preceding or concurrent write.
</p>

<p>
Additionally, observation of acausal and “out of thin air” writes is disallowed.
</p>

<p>
Reads of memory locations larger than a single machine word
are encouraged but not required to meet the same semantics
as word-sized memory locations,
observing a single allowed write <i>w</i>.
For performance reasons,
implementations may instead treat larger operations
as a set of individual machine-word-sized operations
in an unspecified order.
This means that races on multiword data structures
can lead to inconsistent values not corresponding to a single write.
When the values depend on the consistency
of internal (pointer, length) or (pointer, type) pairs,
as can be the case for interface values, maps,
slices, and strings in most Go implementations,
such races can in turn lead to arbitrary memory corruption.
</p>

<p>
Examples of incorrect synchronization are given in the
“Incorrect synchronization” section below.
</p>

<p>
Examples of the limitations on implementations are given in the
“Incorrect compilation” section below.
</p>

<h2 id="synchronization">Synchronization</h2>

<h3 id="init">Initialization</h3>

<p>
Program initialization runs in a single goroutine,
but that goroutine may create other goroutines,
which run concurrently.
</p>

<p class="rule">
If a package <code>p</code> imports package <code>q</code>, the completion of
<code>q</code>'s <code>init</code> functions happens before the start of any of <code>p</code>'s.
</p>

<p class="rule">
The completion of all <code>init</code> functions is synchronized before
the start of the function <code>main.main</code>.
</p>

<h3 id="go">Goroutine creation</h3>

<p class="rule">
The <code>go</code> statement that starts a new goroutine
is synchronized before the start of the goroutine's execution.
</p>

<p>
For example, in this program:
</p>

<pre>
var a string

func f() {
	print(a)
}

func hello() {
	a = "hello, world"
	go f()
}
</pre>

<p>
calling <code>hello</code> will print <code>"hello, world"</code>
at some point in the future (perhaps after <code>hello</code> has returned).
</p>

<h3 id="goexit">Goroutine destruction</h3>

<p>
The exit of a goroutine is not guaranteed to be synchronized before
any event in the program.
For example, in this program:
</p>

<pre>
var a string

func hello() {
	go func() { a = "hello" }()
	print(a)
}
</pre>

<p>
the assignment to <code>a</code> is not followed by
any synchronization event, so it is not guaranteed to be
observed by any other goroutine.
In fact, an aggressive compiler might delete the entire <code>go</code> statement.
</p>

<p>
If the effects of a goroutine must be observed by another goroutine,
use a synchronization mechanism such as a lock or channel
communication to establish a relative ordering.
</p>

<h3 id="chan">Channel communication</h3>

<p>
Channel communication is the main method of synchronization
between goroutines.  Each send on a particular channel
is matched to a corresponding receive from that channel,
usually in a different goroutine.
</p>

<p class="rule">
A send on a channel is synchronized before the completion of the
corresponding receive from that channel.
</p>

<p>
This program:
</p>

<pre>
var c = make(chan int, 10)
var a string

func f() {
	a = "hello, world"
	c &lt;- 0
}

func main() {
	go f()
	&lt;-c
	print(a)
}
</pre>

<p>
is guaranteed to print <code>"hello, world"</code>.  The write to <code>a</code>
is sequenced before the send on <code>c</code>, which is synchronized before
the corresponding receive on <code>c</code> completes, which is sequenced before
the <code>print</code>.
</p>

<p class="rule">
The closing of a channel is synchronized before a receive that returns a zero value
because the channel is closed.
</p>

<p>
In the previous example, replacing
<code>c &lt;- 0</code> with <code>close(c)</code>
yields a program with the same guaranteed behavior.
</p>

<p class="rule">
A receive from an unbuffered channel is synchronized before the completion of
the corresponding send on that channel.
</p>

<p>
This program (as above, but with the send and receive statements swapped and
using an unbuffered channel):
</p>

<pre>
var c = make(chan int)
var a string

func f() {
	a = "hello, world"
	&lt;-c
}

func main() {
	go f()
	c &lt;- 0
	print(a)
}
</pre>

<p>
is also guaranteed to print <code>"hello, world"</code>.  The write to <code>a</code>
is sequenced before the receive on <code>c</code>, which is synchronized before
the corresponding send on <code>c</code> completes, which is sequenced
before the <code>print</code>.
</p>

<p>
If the channel were buffered (e.g., <code>c = make(chan int, 1)</code>)
then the program would not be guaranteed to print
<code>"hello, world"</code>.  (It might print the empty string,
crash, or do something else.)
</p>

<p class="rule">
The <i>k</i>th receive on a channel with capacity <i>C</i> is synchronized before the completion of the <i>k</i>+<i>C</i>th send from that channel completes.
</p>

<p>
This rule generalizes the previous rule to buffered channels.
It allows a counting semaphore to be modeled by a buffered channel:
the number of items in the channel corresponds to the number of active uses,
the capacity of the channel corresponds to the maximum number of simultaneous uses,
sending an item acquires the semaphore, and receiving an item releases
the semaphore.
This is a common idiom for limiting concurrency.
</p>

<p>
This program starts a goroutine for every entry in the work list, but the
goroutines coordinate using the <code>limit</code> channel to ensure
that at most three are running work functions at a time.
</p>

<pre>
var limit = make(chan int, 3)

func main() {
	for _, w := range work {
		go func(w func()) {
			limit &lt;- 1
			w()
			&lt;-limit
		}(w)
	}
	select{}
}
</pre>

<h3 id="locks">Locks</h3>

<p>
The <code>sync</code> package implements two lock data types,
<code>sync.Mutex</code> and <code>sync.RWMutex</code>.
</p>

<p class="rule">
For any <code>sync.Mutex</code> or <code>sync.RWMutex</code> variable <code>l</code> and <i>n</i> &lt; <i>m</i>,
call <i>n</i> of <code>l.Unlock()</code> is synchronized before call <i>m</i> of <code>l.Lock()</code> returns.
</p>

<p>
This program:
</p>

<pre>
var l sync.Mutex
var a string

func f() {
	a = "hello, world"
	l.Unlock()
}

func main() {
	l.Lock()
	go f()
	l.Lock()
	print(a)
}
</pre>

<p>
is guaranteed to print <code>"hello, world"</code>.
The first call to <code>l.Unlock()</code> (in <code>f</code>) is synchronized
before the second call to <code>l.Lock()</code> (in <code>main</code>) returns,
which is sequenced before the <code>print</code>.
</p>

<p class="rule">
For any call to <code>l.RLock</code> on a <code>sync.RWMutex</code> variable <code>l</code>,
there is an <i>n</i> such that the <i>n</i>th call to <code>l.Unlock</code>
is synchronized before the return from <code>l.RLock</code>,
and the matching call to <code>l.RUnlock</code> is synchronized before the return from call <i>n</i>+1 to <code>l.Lock</code>.
</p>

<p class="rule">
A successful call to <code>l.TryLock</code> (or <code>l.TryRLock</code>)
is equivalent to a call to <code>l.Lock</code> (or <code>l.RLock</code>).
An unsuccessful call has no synchronizing effect at all.
As far as the memory model is concerned,
<code>l.TryLock</code> (or <code>l.TryRLock</code>)
may be considered to be able to return false
even when the mutex <i>l</i> is unlocked.
</p>

<h3 id="once">Once</h3>

<p>
The <code>sync</code> package provides a safe mechanism for
initialization in the presence of multiple goroutines
through the use of the <code>Once</code> type.
Multiple threads can execute <code>once.Do(f)</code> for a particular <code>f</code>,
but only one will run <code>f()</code>, and the other calls block
until <code>f()</code> has returned.
</p>

<p class="rule">
The completion of a single call of <code>f()</code> from <code>once.Do(f)</code>
is synchronized before the return of any call of <code>once.Do(f)</code>.
</p>

<p>
In this program:
</p>

<pre>
var a string
var once sync.Once

func setup() {
	a = "hello, world"
}

func doprint() {
	once.Do(setup)
	print(a)
}

func twoprint() {
	go doprint()
	go doprint()
}
</pre>

<p>
calling <code>twoprint</code> will call <code>setup</code> exactly
once.
The <code>setup</code> function will complete before either call
of <code>print</code>.
The result will be that <code>"hello, world"</code> will be printed
twice.
</p>

<h3 id="atomic">Atomic Values</h3>

<p>
The APIs in the <a href="/pkg/sync/atomic/"><code>sync/atomic</code></a>
package are collectively “atomic operations”
that can be used to synchronize the execution of different goroutines.
If the effect of an atomic operation <i>A</i> is observed by atomic operation <i>B</i>,
then <i>A</i> is synchronized before <i>B</i>.
All the atomic operations executed in a program behave as though executed
in some sequentially consistent order.
</p>

<p>
The preceding definition has the same semantics as C++’s sequentially consistent atomics
and Java’s <code>volatile</code> variables.
</p>

<h3 id="finalizer">Finalizers</h3>

<p>
The <a href="/pkg/runtime/"><code>runtime</code></a> package provides
a <code>SetFinalizer</code> function that adds a finalizer to be called when
a particular object is no longer reachable by the program.
A call to <code>SetFinalizer(x, f)</code> is synchronized before the finalization call <code>f(x)</code>.
</p>

<h3 id="more">Additional Mechanisms</h3>

<p>
The <code>sync</code> package provides additional synchronization abstractions,
including <a href="/pkg/sync/#Cond">condition variables</a>,
<a href="/pkg/sync/#Map">lock-free maps</a>,
<a href="/pkg/sync/#Pool">allocation pools</a>,
and
<a href="/pkg/sync/#WaitGroup">wait groups</a>.
The documentation for each of these specifies the guarantees it
makes concerning synchronization.
</p>

<p>
Other packages that provide synchronization abstractions
should document the guarantees they make too.
</p>


<h2 id="badsync">Incorrect synchronization</h2>

<p>
Programs with races are incorrect and
can exhibit non-sequentially consistent executions.
In particular, note that a read <i>r</i> may observe the value written by any write <i>w</i>
that executes concurrently with <i>r</i>.
Even if this occurs, it does not imply that reads happening after <i>r</i>
will observe writes that happened before <i>w</i>.
</p>

<p>
In this program:
</p>

<pre>
var a, b int

func f() {
	a = 1
	b = 2
}

func g() {
	print(b)
	print(a)
}

func main() {
	go f()
	g()
}
</pre>

<p>
it can happen that <code>g</code> prints <code>2</code> and then <code>0</code>.
</p>

<p>
This fact invalidates a few common idioms.
</p>

<p>
Double-checked locking is an attempt to avoid the overhead of synchronization.
For example, the <code>twoprint</code> program might be
incorrectly written as:
</p>

<pre>
var a string
var done bool

func setup() {
	a = "hello, world"
	done = true
}

func doprint() {
	if !done {
		once.Do(setup)
	}
	print(a)
}

func twoprint() {
	go doprint()
	go doprint()
}
</pre>

<p>
but there is no guarantee that, in <code>doprint</code>, observing the write to <code>done</code>
implies observing the write to <code>a</code>.  This
version can (incorrectly) print an empty string
instead of <code>"hello, world"</code>.
</p>

<p>
Another incorrect idiom is busy waiting for a value, as in:
</p>

<pre>
var a string
var done bool

func setup() {
	a = "hello, world"
	done = true
}

func main() {
	go setup()
	for !done {
	}
	print(a)
}
</pre>

<p>
As before, there is no guarantee that, in <code>main</code>,
observing the write to <code>done</code>
implies observing the write to <code>a</code>, so this program could
print an empty string too.
Worse, there is no guarantee that the write to <code>done</code> will ever
be observed by <code>main</code>, since there are no synchronization
events between the two threads.  The loop in <code>main</code> is not
guaranteed to finish.
</p>

<p>
There are subtler variants on this theme, such as this program.
</p>

<pre>
type T struct {
	msg string
}

var g *T

func setup() {
	t := new(T)
	t.msg = "hello, world"
	g = t
}

func main() {
	go setup()
	for g == nil {
	}
	print(g.msg)
}
</pre>

<p>
Even if <code>main</code> observes <code>g != nil</code> and exits its loop,
there is no guarantee that it will observe the initialized
value for <code>g.msg</code>.
</p>

<p>
In all these examples, the solution is the same:
use explicit synchronization.
</p>

<h2 id="badcompiler">Incorrect compilation</h2>

<p>
The Go memory model restricts compiler optimizations as much as it does Go programs.
Some compiler optimizations that would be valid in single-threaded programs are not valid in all Go programs.
In particular, a compiler must not introduce writes that do not exist in the original program,
it must not allow a single read to observe multiple values,
and it must not allow a single write to write multiple values.
</p>

<p>
All the following examples assume that `*p` and `*q` refer to
memory locations accessible to multiple goroutines.
</p>

<p>
Not introducing data races into race-free programs means not moving
writes out of conditional statements in which they appear.
For example, a compiler must not invert the conditional in this program:
</p>

<pre>
*p = 1
if cond {
	*p = 2
}
</pre>

<p>
That is, the compiler must not rewrite the program into this one:
</p>

<pre>
*p = 2
if !cond {
	*p = 1
}
</pre>

<p>
If <code>cond</code> is false and another goroutine is reading <code>*p</code>,
then in the original program, the other goroutine can only observe any prior value of <code>*p</code> and <code>1</code>.
In the rewritten program, the other goroutine can observe <code>2</code>, which was previously impossible.
</p>

<p>
Not introducing data races also means not assuming that loops terminate.
For example, a compiler must in general not move the accesses to <code>*p</code> or <code>*q</code>
ahead of the loop in this program:
</p>

<pre>
n := 0
for e := list; e != nil; e = e.next {
	n++
}
i := *p
*q = 1
</pre>

<p>
If <code>list</code> pointed to a cyclic list,
then the original program would never access <code>*p</code> or <code>*q</code>,
but the rewritten program would.
(Moving `*p` ahead would be safe if the compiler can prove `*p` will not panic;
moving `*q` ahead would also require the compiler proving that no other
goroutine can access `*q`.)
</p>

<p>
Not introducing data races also means not assuming that called functions
always return or are free of synchronization operations.
For example, a compiler must not move the accesses to <code>*p</code> or <code>*q</code>
ahead of the function call in this program
(at least not without direct knowledge of the precise behavior of <code>f</code>):
</p>

<pre>
f()
i := *p
*q = 1
</pre>

<p>
If the call never returned, then once again the original program
would never access <code>*p</code> or <code>*q</code>, but the rewritten program would.
And if the call contained synchronizing operations, then the original program
could establish happens before edges preceding the accesses
to <code>*p</code> and <code>*q</code>, but the rewritten program would not.
</p>

<p>
Not allowing a single read to observe multiple values means
not reloading local variables from shared memory.
For example, a compiler must not discard <code>i</code> and reload it
a second time from <code>*p</code> in this program:
</p>

<pre>
i := *p
if i &lt; 0 || i &gt;= len(funcs) {
	panic("invalid function index")
}
... complex code ...
// compiler must NOT reload i = *p here
funcs[i]()
</pre>

<p>
If the complex code needs many registers, a compiler for single-threaded programs
could discard <code>i</code> without saving a copy and then reload
<code>i = *p</code> just before
<code>funcs[i]()</code>.
A Go compiler must not, because the value of <code>*p</code> may have changed.
(Instead, the compiler could spill <code>i</code> to the stack.)
</p>

<p>
Not allowing a single write to write multiple values also means not using
the memory where a local variable will be written as temporary storage before the write.
For example, a compiler must not use <code>*p</code> as temporary storage in this program:
</p>

<pre>
*p = i + *p/2
</pre>

<p>
That is, it must not rewrite the program into this one:
</p>

<pre>
*p /= 2
*p += i
</pre>

<p>
If <code>i</code> and <code>*p</code> start equal to 2,
the original code does <code>*p = 3</code>,
so a racing thread can read only 2 or 3 from <code>*p</code>.
The rewritten code does <code>*p = 1</code> and then <code>*p = 3</code>,
allowing a racing thread to read 1 as well.
</p>

<p>
Note that all these optimizations are permitted in C/C++ compilers:
a Go compiler sharing a back end with a C/C++ compiler must take care
to disable optimizations that are invalid for Go.
</p>

<p>
Note that the prohibition on introducing data races
does not apply if the compiler can prove that the races
do not affect correct execution on the target platform.
For example, on essentially all CPUs, it is valid to rewrite
</p>

<pre>
n := 0
for i := 0; i < m; i++ {
	n += *shared
}
</pre>

into:

<pre>
n := 0
local := *shared
for i := 0; i < m; i++ {
	n += local
}
</pre>

<p>
provided it can be proved that <code>*shared</code> will not fault on access,
because the potential added read will not affect any existing concurrent reads or writes.
On the other hand, the rewrite would not be valid in a source-to-source translator.
</p>

<h2 id="conclusion">Conclusion</h2>

<p>
Go programmers writing data-race-free programs can rely on
sequentially consistent execution of those programs,
just as in essentially all other modern programming languages.
</p>

<p>
When it comes to programs with races,
both programmers and compilers should remember the advice:
don't be clever.
</p>