<!-- DO NOT EDIT THIS FILE.
This file is periodically generated from the content in the `/src/`
directory, so all fixes need to be made in `/src/`.
-->
[TOC]
# Smart Pointers
A *pointer* is a general concept for a variable that contains an address in
memory. This address refers to, or “points at,” some other data. The most
common kind of pointer in Rust is a reference, which you learned about in
Chapter 4. References are indicated by the `&` symbol and borrow the value they
point to. They don’t have any special capabilities other than referring to
data, and they have no overhead.
*Smart pointers*, on the other hand, are data structures that act like a
pointer but also have additional metadata and capabilities. The concept of
smart pointers isn’t unique to Rust: smart pointers originated in C++ and exist
in other languages as well. Rust has a variety of smart pointers defined in the
standard library that provide functionality beyond that provided by references.
To explore the general concept, we’ll look at a couple of different examples of
smart pointers, including a *reference counting* smart pointer type. This
pointer enables you to allow data to have multiple owners by keeping track of
the number of owners and, when no owners remain, cleaning up the data.
Rust, with its concept of ownership and borrowing, has an additional difference
between references and smart pointers: while references only borrow data, in
many cases smart pointers *own* the data they point to.
Though we didn’t call them that at the time, we’ve already encountered a few
smart pointers in this book, including `String` and `Vec<T>` in Chapter 8. Both
of these types count as smart pointers because they own some memory and allow
you to manipulate it. They also have metadata and extra capabilities or
guarantees. `String`, for example, stores its capacity as metadata and has the
extra ability to ensure its data will always be valid UTF-8.
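To make the idea of metadata concrete, here is a minimal sketch that reads the length and capacity a `String` tracks alongside its heap buffer. The helper name `len_and_capacity` is made up for this illustration, and the capacity actually reserved may be larger than what was requested.

```rust
// Hypothetical helper (not part of the standard library): report the
// metadata a `String` keeps next to the pointer to its heap buffer.
fn len_and_capacity(s: &String) -> (usize, usize) {
    (s.len(), s.capacity())
}

fn main() {
    // Ask for room for at least 16 bytes up front.
    let mut s = String::with_capacity(16);
    s.push_str("hello");

    let (len, capacity) = len_and_capacity(&s);
    // `len` counts the bytes in use; `capacity` is how many bytes fit
    // before the buffer has to grow.
    println!("len = {len}, capacity = {capacity}");
}
```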
Smart pointers are usually implemented using structs. Unlike an ordinary
struct, smart pointers implement the `Deref` and `Drop` traits. The `Deref`
trait allows an instance of the smart pointer struct to behave like a reference
so you can write your code to work with either references or smart pointers.
The `Drop` trait allows you to customize the code that’s run when an instance
of the smart pointer goes out of scope. In this chapter, we’ll discuss both
traits and demonstrate why they’re important to smart pointers.
Given that the smart pointer pattern is a general design pattern used
frequently in Rust, this chapter won’t cover every existing smart pointer. Many
libraries have their own smart pointers, and you can even write your own. We’ll
cover the most common smart pointers in the standard library:
* `Box<T>`, for allocating values on the heap
* `Rc<T>`, a reference counting type that enables multiple ownership
* `Ref<T>` and `RefMut<T>`, accessed through `RefCell<T>`, a type that enforces
the borrowing rules at runtime instead of compile time
In addition, we’ll cover the *interior mutability* pattern where an immutable
type exposes an API for mutating an interior value. We’ll also discuss
*reference cycles*: how they can leak memory and how to prevent them.
Let’s dive in!
## Using Box<T> to Point to Data on the Heap
The most straightforward smart pointer is a *box*, whose type is written
`Box<T>`. Boxes allow you to store data on the heap rather than the stack. What
remains on the stack is the pointer to the heap data. Refer to Chapter 4 to
review the difference between the stack and the heap.
Boxes don’t have performance overhead, other than storing their data on the
heap instead of on the stack. But they don’t have many extra capabilities
either. You’ll use them most often in these situations:
* When you have a type whose size can’t be known at compile time and you want
to use a value of that type in a context that requires an exact size
* When you have a large amount of data and you want to transfer ownership but
ensure the data won’t be copied when you do so
* When you want to own a value and you care only that it’s a type that
implements a particular trait rather than being of a specific type
We’ll demonstrate the first situation in “Enabling Recursive Types with Boxes”
on page XX. In the second case, transferring ownership of a large amount of
data can take a long time because the data is copied around on the stack. To
improve performance in this situation, we can store the large amount of data on
the heap in a box. Then, only the small amount of pointer data is copied around
on the stack, while the data it references stays in one place on the heap. The
third case is known as a *trait object*, and “Using Trait Objects That Allow
for Values of Different Types” on page XX is devoted to that topic. So what you
learn here you’ll apply again in that section!
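The second situation can be sketched as follows. The example assumes a 4 KiB array as the “large amount of data” and uses a made-up helper name, `heap_address_survives_move`. It checks that moving the box to a new owner leaves the heap data at the same address.

```rust
use std::mem::size_of;

// Hypothetical helper: move a boxed 4 KiB buffer to a new owner and report
// whether the heap data stayed at the same address.
fn heap_address_survives_move() -> bool {
    let big: Box<[u8; 4096]> = Box::new([0u8; 4096]);
    let before = big.as_ptr();

    // Moving the box copies only the pointer; `big` can no longer be used.
    let moved = big;
    let after = moved.as_ptr();

    before == after
}

fn main() {
    // For a sized `T`, a `Box<T>` is represented as a single pointer.
    assert_eq!(size_of::<Box<[u8; 4096]>>(), size_of::<usize>());
    assert!(heap_address_survives_move());
}
```

Only the pointer-sized box travels between owners here, which is why the address comparison holds.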
### Using Box<T> to Store Data on the Heap
Before we discuss the heap storage use case for `Box<T>`, we’ll cover the
syntax and how to interact with values stored within a `Box<T>`.
Listing 15-1 shows how to use a box to store an `i32` value on the heap.
Filename: src/main.rs
```
fn main() {
    let b = Box::new(5);
    println!("b = {b}");
}
```
Listing 15-1: Storing an `i32` value on the heap using a box
We define the variable `b` to have the value of a `Box` that points to the
value `5`, which is allocated on the heap. This program will print `b = 5`; in
this case, we can access the data in the box similar to how we would if this
data were on the stack. Just like any owned value, when a box goes out of
scope, as `b` does at the end of `main`, it will be deallocated. The
deallocation happens both for the box (stored on the stack) and the data it
points to (stored on the heap).
Putting a single value on the heap isn’t very useful, so you won’t use boxes by
themselves in this way very often. Having values like a single `i32` on the
stack, where they’re stored by default, is more appropriate in the majority of
situations. Let’s look at a case where boxes allow us to define types that we
wouldn’t be allowed to define if we didn’t have boxes.
### Enabling Recursive Types with Boxes
A value of a *recursive type* can have another value of the same type as part
of itself. Recursive types pose an issue because at compile time Rust needs to
know how much space a type takes up. However, the nesting of values of
recursive types could theoretically continue infinitely, so Rust can’t know how
much space the value needs. Because boxes have a known size, we can enable
recursive types by inserting a box in the recursive type definition.
As an example of a recursive type, let’s explore the *cons list*. This is a
data type commonly found in functional programming languages. The cons list
type we’ll define is straightforward except for the recursion; therefore, the
concepts in the example we’ll work with will be useful any time you get into
more complex situations involving recursive types.
#### More Information About the Cons List
A *cons list* is a data structure that comes from the Lisp programming language
and its dialects, is made up of nested pairs, and is the Lisp version of a
linked list. Its name comes from the `cons` function (short for *construct
function*) in Lisp that constructs a new pair from its two arguments. By
calling `cons` on a pair consisting of a value and another pair, we can
construct cons lists made up of recursive pairs.
For example, here’s a pseudocode representation of a cons list containing the
list `1, 2, 3` with each pair in parentheses:
```
(1, (2, (3, Nil)))
```
Each item in a cons list contains two elements: the value of the current item
and the next item. The last item in the list contains only a value called `Nil`
without a next item. A cons list is produced by recursively calling the `cons`
function. The canonical name to denote the base case of the recursion is `Nil`.
Note that this is not the same as the “null” or “nil” concept in Chapter 6,
which is an invalid or absent value.
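As a rough illustration of the nested pairs, the pseudocode above is close to valid Rust if tuples play the role of the pairs. This sketch assumes the unit value `()` as a stand-in for `Nil`, and the helper `items` is hypothetical. It is only an illustration of the shape: every list length gets a different tuple type, so this is not a practical list type.

```rust
// Hypothetical helper: pull the three values out of a fixed-length
// "cons list" written as nested tuples, with `()` standing in for `Nil`.
fn items(list: (i32, (i32, (i32, ())))) -> [i32; 3] {
    // Each pair holds the current value and the rest of the list.
    let (first, (second, (third, ()))) = list;
    [first, second, third]
}

fn main() {
    let list = (1, (2, (3, ())));
    println!("{:?}", items(list));
}
```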
The cons list isn’t a commonly used data structure in Rust. Most of the time
when you have a list of items in Rust, `Vec<T>` is a better choice to use.
Other, more complex recursive data types *are* useful in various situations,
but by starting with the cons list in this chapter, we can explore how boxes
let us define a recursive data type without much distraction.
Listing 15-2 contains an enum definition for a cons list. Note that this code
won’t compile yet because the `List` type doesn’t have a known size, which
we’ll demonstrate.
Filename: src/main.rs
```
enum List {
    Cons(i32, List),
    Nil,
}
```
Listing 15-2: The first attempt at defining an enum to represent a cons list
data structure of `i32` values
> Note: We’re implementing a cons list that holds only `i32` values for the
purposes of this example. We could have implemented it using generics, as we
discussed in Chapter 10, to define a cons list type that could store values of
any type.
Using the `List` type to store the list `1, 2, 3` would look like the code in
Listing 15-3.
Filename: src/main.rs
```
--snip--

use crate::List::{Cons, Nil};

fn main() {
    let list = Cons(1, Cons(2, Cons(3, Nil)));
}
```
Listing 15-3: Using the `List` enum to store the list `1, 2, 3`
The first `Cons` value holds `1` and another `List` value. This `List` value is
another `Cons` value that holds `2` and another `List` value. This `List` value
is one more `Cons` value that holds `3` and a `List` value, which is finally
`Nil`, the non-recursive variant that signals the end of the list.
If we try to compile the code in Listing 15-3, we get the error shown in
Listing 15-4.
```
error[E0072]: recursive type `List` has infinite size
 --> src/main.rs:1:1
  |
1 | enum List {
  | ^^^^^^^^^ recursive type has infinite size
2 |     Cons(i32, List),
  |               ---- recursive without indirection
  |
help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to make `List`
representable
  |
2 |     Cons(i32, Box<List>),
  |               ++++    +
```
Listing 15-4: The error we get when attempting to define a recursive enum
The error shows this type “has infinite size.” The reason is that we’ve defined
`List` with a variant that is recursive: it holds another value of itself
directly. As a result, Rust can’t figure out how much space it needs to store a
`List` value. Let’s break down why we get this error. First we’ll look at how
Rust decides how much space it needs to store a value of a non-recursive type.
#### Computing the Size of a Non-Recursive Type
Recall the `Message` enum we defined in Listing 6-2 when we discussed enum
definitions in Chapter 6:
```
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}
```
To determine how much space to allocate for a `Message` value, Rust goes
through each of the variants to see which variant needs the most space. Rust
sees that `Message::Quit` doesn’t need any space, `Message::Move` needs enough
space to store two `i32` values, and so forth. Because only one variant will be
used, the most space a `Message` value will need is the space it would take to
store the largest of its variants.
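The rule can be observed with `std::mem::size_of`. This is a sketch rather than a statement about exact layout: the precise byte counts depend on the target and on how the compiler arranges the enum, so the code only checks that `Message` is at least as large as its biggest payloads. The helper `message_size` is a name introduced for this example.

```rust
use std::mem::size_of;

// The `Message` enum from Listing 6-2. The variants are never constructed
// here, so the dead-code lint is silenced.
#[allow(dead_code)]
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

// Hypothetical helper: the number of bytes reserved for any `Message` value.
fn message_size() -> usize {
    size_of::<Message>()
}

fn main() {
    println!("Message:        {} bytes", message_size());
    println!("String payload: {} bytes", size_of::<String>());
    println!("three i32s:     {} bytes", size_of::<(i32, i32, i32)>());

    // Whatever the exact layout, the enum needs room for its largest variant.
    assert!(message_size() >= size_of::<String>());
    assert!(message_size() >= 3 * size_of::<i32>());
}
```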
Contrast this with what happens when Rust tries to determine how much space a
recursive type like the `List` enum in Listing 15-2 needs. The compiler starts
by looking at the `Cons` variant, which holds a value of type `i32` and a value
of type `List`. Therefore, `Cons` needs an amount of space equal to the size of
an `i32` plus the size of a `List`. To figure out how much memory the `List`
type needs, the compiler looks at the variants, starting with the `Cons`
variant. The `Cons` variant holds a value of type `i32` and a value of type
`List`, and this process continues infinitely, as shown in Figure 15-1.
Figure 15-1: An infinite `List` consisting of infinite `Cons` variants
#### Using Box<T> to Get a Recursive Type with a Known Size
Because Rust can’t figure out how much space to allocate for recursively
defined types, the compiler gives an error with this helpful suggestion:
```
help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to make `List`
representable
  |
2 |     Cons(i32, Box<List>),
  |               ++++    +
```
In this suggestion, *indirection* means that instead of storing a value
directly, we should change the data structure to store the value indirectly by
storing a pointer to the value instead.
Because a `Box<T>` is a pointer, Rust always knows how much space a `Box<T>`
needs: a pointer’s size doesn’t change based on the amount of data it’s
pointing to. This means we can put a `Box<T>` inside the `Cons` variant instead
of another `List` value directly. The `Box<T>` will point to the next `List`
value that will be on the heap rather than inside the `Cons` variant.
Conceptually, we still have a list, created with lists holding other lists, but
this implementation is now more like placing the items next to one another
rather than inside one another.
We can change the definition of the `List` enum in Listing 15-2 and the usage
of the `List` in Listing 15-3 to the code in Listing 15-5, which will compile.
Filename: src/main.rs
```
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use crate::List::{Cons, Nil};

fn main() {
    let list = Cons(
        1,
        Box::new(Cons(
            2,
            Box::new(Cons(
                3,
                Box::new(Nil)
            ))
        ))
    );
}
```
Listing 15-5: Definition of `List` that uses `Box<T>` in order to have a known
size
The `Cons` variant needs the size of an `i32` plus the space to store the box’s
pointer data. The `Nil` variant stores no values, so it needs less space than
the `Cons` variant. We now know that any `List` value will take up the size of
an `i32` plus the size of a box’s pointer data. By using a box, we’ve broken
the infinite, recursive chain, so the compiler can figure out the size it needs
to store a `List` value. Figure 15-2 shows what the `Cons` variant looks like
now.
Figure 15-2: A `List` that is not infinitely sized, because `Cons` holds a `Box`
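The size reasoning can be checked with `std::mem::size_of`. The sketch below avoids asserting an exact byte count, since padding and layout are up to the compiler. It only checks that a `Box<List>` is one pointer wide and that `List` has room for an `i32` plus that pointer. The helper `list_size` is hypothetical.

```rust
use std::mem::size_of;

// Same shape as the `List` in Listing 15-5. No values are built here, so
// the dead-code lint is silenced.
#[allow(dead_code)]
enum List {
    Cons(i32, Box<List>),
    Nil,
}

// Hypothetical helper: the finite size the compiler computes for `List`.
fn list_size() -> usize {
    size_of::<List>()
}

fn main() {
    // A box is one pointer wide no matter how long the list behind it is.
    assert_eq!(size_of::<Box<List>>(), size_of::<usize>());

    // `Cons` needs room for an `i32` and the box's pointer; padding may add more.
    assert!(list_size() >= size_of::<i32>() + size_of::<Box<List>>());

    println!("List takes {} bytes", list_size());
}
```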
Boxes provide only the indirection and heap allocation; they don’t have any
other special capabilities, like those we’ll see with the other smart pointer
types. They also don’t have the performance overhead that these special
capabilities incur, so they can be useful in cases like the cons list where the
indirection is the only feature we need. We’ll look at more use cases for boxes
in Chapter 17.
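To see the indirection in use, here is a small sketch that walks the boxed list and adds up its values. The `sum` function is not part of the original listing; it is one assumed way to traverse the structure, and it relies on deref coercion, which is explained later in this chapter.

```rust
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

// Hypothetical helper: add up every value by following the boxes to the end.
fn sum(list: &List) -> i32 {
    match list {
        // `rest` is a `&Box<List>`; Rust converts it to `&List` for the
        // recursive call (deref coercion, covered later in this chapter).
        Cons(value, rest) => value + sum(rest),
        // The base case of the recursion contributes nothing.
        Nil => 0,
    }
}

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    println!("sum = {}", sum(&list));
}
```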
The `Box<T>` type is a smart pointer because it implements the `Deref` trait,
which allows `Box<T>` values to be treated like references. When a `Box<T>`
value goes out of scope, the heap data that the box is pointing to is cleaned
up as well because of the `Drop` trait implementation. These two traits will be
even more important to the functionality provided by the other smart pointer
types we’ll discuss in the rest of this chapter. Let’s explore these two traits
in more detail.
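As a preview of the cleanup behavior, the following sketch makes the drop observable. The `CountsDrops` type and the `drops_after_scope` helper are invented for this illustration, and `Cell` is used only as a convenient counter. The value is placed in a box, and the counter is read after the box has left its scope.

```rust
use std::cell::Cell;

// Hypothetical type for this sketch: bumps a shared counter when dropped.
struct CountsDrops<'a>(&'a Cell<u32>);

impl Drop for CountsDrops<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Returns how many values had been dropped once the box left its scope.
fn drops_after_scope() -> u32 {
    let counter = Cell::new(0);
    {
        let _boxed = Box::new(CountsDrops(&counter));
        // The boxed value is still alive here.
        assert_eq!(counter.get(), 0);
    } // `_boxed` goes out of scope: the box drops the value it owns.
    counter.get()
}

fn main() {
    println!("values dropped: {}", drops_after_scope());
}
```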
## Treating Smart Pointers Like Regular References with Deref
Implementing the `Deref` trait allows you to customize the behavior of the
*dereference operator* `*` (not to be confused with the multiplication or glob
operator). By implementing `Deref` in such a way that a smart pointer can be
treated like a regular reference, you can write code that operates on
references and use that code with smart pointers too.
Let’s first look at how the dereference operator works with regular references.
Then we’ll try to define a custom type that behaves like `Box<T>`, and see why
the dereference operator doesn’t work like a reference on our newly defined
type. We’ll explore how implementing the `Deref` trait makes it possible for
smart pointers to work in ways similar to references. Then we’ll look at Rust’s
*deref coercion* feature and how it lets us work with either references or
smart pointers.
> Note: There’s one big difference between the `MyBox<T>` type we’re about to
build and the real `Box<T>`: our version will not store its data on the heap.
We are focusing this example on `Deref`, so where the data is actually stored
is less important than the pointer-like behavior.
### Following the Pointer to the Value
A regular reference is a type of pointer, and one way to think of a pointer is
as an arrow to a value stored somewhere else. In Listing 15-6, we create a
reference to an `i32` value and then use the dereference operator to follow the
reference to the value.
Filename: src/main.rs
```
fn main() {
    let x = 5; // [1]
    let y = &x; // [2]

    assert_eq!(5, x); // [3]
    assert_eq!(5, *y); // [4]
}
```
Listing 15-6: Using the dereference operator to follow a reference to an `i32`
value
The variable `x` holds an `i32` value `5` [1]. We set `y` equal to a reference
to `x` [2]. We can assert that `x` is equal to `5` [3]. However, if we want to
make an assertion about the value in `y`, we have to use `*y` to follow the
reference to the value it’s pointing to (hence *dereference*) so the compiler
can compare the actual value [4]. Once we dereference `y`, we have access to
the integer value `y` is pointing to that we can compare with `5`.
If we tried to write `assert_eq!(5, y);` instead, we would get this compilation
error:
```
error[E0277]: can't compare `{integer}` with `&{integer}`
 --> src/main.rs:6:5
  |
6 |     assert_eq!(5, y);
  |     ^^^^^^^^^^^^^^^^ no implementation for `{integer} ==
&{integer}`
  |
  = help: the trait `PartialEq<&{integer}>` is not implemented
for `{integer}`
```
Comparing a number and a reference to a number isn’t allowed because they’re
different types. We must use the dereference operator to follow the reference
to the value it’s pointing to.
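A short sketch of the comparisons that do type-check may help. The two helper functions are hypothetical names. The first dereferences so that both sides are `i32`, and the second compares two references, which still compares the values they point to.

```rust
// Hypothetical helper: follow the reference and compare `i32` with `i32`.
fn is_five_by_value(y: &i32) -> bool {
    *y == 5
}

// Hypothetical helper: compare `&i32` with `&i32`; the comparison still
// looks at the values behind the references.
fn is_five_by_reference(y: &i32) -> bool {
    y == &5
}

fn main() {
    let x = 5;
    let y = &x;

    assert!(is_five_by_value(y));
    assert!(is_five_by_reference(y));
}
```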
### Using Box<T> Like a Reference
We can rewrite the code in Listing 15-6 to use a `Box<T>` instead of a
reference; the dereference operator used on the `Box<T>` in Listing 15-7
functions in the same way as the dereference operator used on the reference in
Listing 15-6.
Filename: src/main.rs
```
fn main() {
    let x = 5;
    let y = Box::new(x); // [1]

    assert_eq!(5, x);
    assert_eq!(5, *y); // [2]
}
```
Listing 15-7: Using the dereference operator on a `Box<i32>`
The main difference between Listing 15-7 and Listing 15-6 is that here we set
`y` to be an instance of a box pointing to a copied value of `x` rather than a
reference pointing to the value of `x` [1]. In the last assertion [2], we can
use the dereference operator to follow the box’s pointer in the same way that
we did when `y` was a reference. Next, we’ll explore what is special about
`Box<T>` that enables us to use the dereference operator by defining our own
box type.
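The phrase “a copied value of `x`” can be made visible with a small sketch. It assumes a mutable box and a made-up helper, `boxed_copy_is_independent`, that changes the boxed value and then reports both numbers.

```rust
// Hypothetical helper: returns `(x, *y)` after changing only the boxed value.
fn boxed_copy_is_independent() -> (i32, i32) {
    let x = 5;
    // `i32` is `Copy`, so the box receives its own copy of `x` on the heap.
    let mut y = Box::new(x);

    // Writing through the box changes the heap copy, not `x`.
    *y += 1;

    (x, *y)
}

fn main() {
    let (x, boxed) = boxed_copy_is_independent();
    println!("x = {x}, *y = {boxed}");
}
```

With a reference, as in Listing 15-6, `y` would instead point at `x` itself.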
### Defining Our Own Smart Pointer
Let’s build a smart pointer similar to the `Box<T>` type provided by the
standard library to experience how smart pointers behave differently from
references by default. Then we’ll look at how to add the ability to use the
dereference operator.
The `Box<T>` type is ultimately defined as a tuple struct with one element, so
Listing 15-8 defines a `MyBox<T>` type in the same way. We’ll also define a
`new` function to match the `new` function defined on `Box<T>`.
Filename: src/main.rs
```
struct MyBox<T>(T); // [1]

impl<T> MyBox<T> {
    fn new(x: T) -> MyBox<T> { // [2]
        MyBox(x) // [3]
    }
}
```
Listing 15-8: Defining a `MyBox<T>` type
We define a struct named `MyBox` and declare a generic parameter `T` [1]
because we want our type to hold values of any type. The `MyBox` type is a
tuple struct with one element of type `T`. The `MyBox::new` function takes one
parameter of type `T` [2] and returns a `MyBox` instance that holds the value
passed in [3].
Let’s try adding the `main` function in Listing 15-7 to Listing 15-8 and
changing it to use the `MyBox<T>` type we’ve defined instead of `Box<T>`. The
code in Listing 15-9 won’t compile because Rust doesn’t know how to dereference
`MyBox`.
Filename: src/main.rs
```
fn main() {
    let x = 5;
    let y = MyBox::new(x);

    assert_eq!(5, x);
    assert_eq!(5, *y);
}
```
Listing 15-9: Attempting to use `MyBox<T>` in the same way we used references
and `Box<T>`
Here’s the resultant compilation error:
```
error[E0614]: type `MyBox<{integer}>` cannot be dereferenced
  --> src/main.rs:14:19
   |
14 |     assert_eq!(5, *y);
   |                   ^^
```
Our `MyBox<T>` type can’t be dereferenced because we haven’t implemented that
ability on our type. To enable dereferencing with the `*` operator, we
implement the `Deref` trait.
### Implementing the Deref Trait
As discussed in “Implementing a Trait on a Type” on page XX, to implement a
trait we need to provide implementations for the trait’s required methods. The
`Deref` trait, provided by the standard library, requires us to implement one
method named `deref` that borrows `self` and returns a reference to the inner
data. Listing 15-10 contains an implementation of `Deref` to add to the
definition of `MyBox<T>`.
Filename: src/main.rs
```
use std::ops::Deref;

impl<T> Deref for MyBox<T> {
    type Target = T; // [1]

    fn deref(&self) -> &Self::Target {
        &self.0 // [2]
    }
}
```
Listing 15-10: Implementing `Deref` on `MyBox<T>`
The `type Target = T;` syntax [1] defines an associated type for the `Deref`
trait to use. Associated types are a slightly different way of declaring a
generic parameter, but you don’t need to worry about them for now; we’ll cover
them in more detail in Chapter 19.
We fill in the body of the `deref` method with `&self.0` so `deref` returns a
reference to the value we want to access with the `*` operator [2]; recall from
“Using Tuple Structs Without Named Fields to Create Different Types” on page XX
that `.0` accesses the first value in a tuple struct. The `main` function in
Listing 15-9 that calls `*` on the `MyBox<T>` value now compiles, and the
assertions pass!
Without the `Deref` trait, the compiler can only dereference `&` references.
The `deref` method gives the compiler the ability to take a value of any type
that implements `Deref` and call the `deref` method to get a `&` reference that
it knows how to dereference.
When we entered `*y` in Listing 15-9, behind the scenes Rust actually ran this
code:
```
*(y.deref())
```
Rust substitutes the `*` operator with a call to the `deref` method and then a
plain dereference so we don’t have to think about whether or not we need to
call the `deref` method. This Rust feature lets us write code that functions
identically whether we have a regular reference or a type that implements
`Deref`.
The reason the `deref` method returns a reference to a value, and that the
plain dereference outside the parentheses in `*(y.deref())` is still necessary,
has to do with the ownership system. If the `deref` method returned the value
directly instead of a reference to the value, the value would be moved out of
`self`. We don’t want to take ownership of the inner value inside `MyBox<T>` in
this case or in most cases where we use the dereference operator.
Note that the `*` operator is replaced with a call to the `deref` method and
then a call to the `*` operator just once, each time we use a `*` in our code.
Because the substitution of the `*` operator does not recurse infinitely, we
end up with data of type `i32`, which matches the `5` in `assert_eq!` in
Listing 15-9.
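To make the substitution concrete, here is a minimal sketch that compares `*y` with the spelled-out `*(y.deref())`. It restates a `MyBox<T>` along the lines of Listings 15-8 and 15-10 only so that it compiles on its own; the `read_both_ways` helper is a hypothetical name introduced for illustration.

```rust
use std::ops::Deref;

// A tuple-struct wrapper in the spirit of the chapter's `MyBox<T>`,
// repeated here so the sketch is self-contained.
struct MyBox<T>(T);

impl<T> MyBox<T> {
    fn new(x: T) -> MyBox<T> {
        MyBox(x)
    }
}

impl<T> Deref for MyBox<T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

// Hypothetical helper: reads the inner `i32` with the operator and with
// the explicit call that the operator stands for.
fn read_both_ways(x: i32) -> (i32, i32) {
    let y = MyBox::new(x);
    let with_operator = *y; // what we write
    let spelled_out = *(y.deref()); // what Rust runs behind the scenes
    (with_operator, spelled_out)
}

fn main() {
    let (a, b) = read_both_ways(5);
    assert_eq!(a, b);
    println!("*y = {a}, *(y.deref()) = {b}");
}
```

Both reads copy the `i32` out through a reference, so `y` still owns its value afterward.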
### Implicit Deref Coercions with Functions and Methods
*Deref coercion* converts a reference to a type that implements the `Deref`
trait into a reference to another type. For example, deref coercion can convert
`&String` to `&str` because `String` implements the `Deref` trait such that it
returns `&str`. Deref coercion is a convenience Rust performs on arguments to
functions and methods, and works only on types that implement the `Deref`
trait. It happens automatically when we pass a reference to a particular type’s
value as an argument to a function or method that doesn’t match the parameter
type in the function or method definition. A sequence of calls to the `deref`
method converts the type we provided into the type the parameter needs.
Deref coercion was added to Rust so that programmers writing function and
method calls don’t need to add as many explicit references and dereferences
with `&` and `*`. The deref coercion feature also lets us write more code that
can work for either references or smart pointers.
To see deref coercion in action, let’s use the `MyBox<T>` type we defined in
Listing 15-8 as well as the implementation of `Deref` that we added in Listing
15-10. Listing 15-11 shows the definition of a function that has a string slice
parameter.
Filename: src/main.rs
```
fn hello(name: &str) {
    println!("Hello, {name}!");
}
```
Listing 15-11: A `hello` function that has the parameter `name` of type `&str`
We can call the `hello` function with a string slice as an argument, such as
`hello("Rust");`, for example. Deref coercion makes it possible to call `hello`
with a reference to a value of type `MyBox<String>`, as shown in Listing 15-12.
Filename: src/main.rs
```
fn main() {
    let m = MyBox::new(String::from("Rust"));
    hello(&m);
}
```
Listing 15-12: Calling `hello` with a reference to a `MyBox<String>` value,
which works because of deref coercion
Here we’re calling the `hello` function with the argument `&m`, which is a
reference to a `MyBox<String>` value. Because we implemented the `Deref` trait
on `MyBox<T>` in Listing 15-10, Rust can turn `&MyBox<String>` into `&String`
by calling `deref`. The standard library provides an implementation of `Deref`
on `String` that returns a string slice, as shown in the API documentation
for `Deref`. Rust calls `deref` again to turn the `&String` into `&str`, which
matches the `hello` function’s definition.
If Rust didn’t implement deref coercion, we would have to write the code in
Listing 15-13 instead of the code in Listing 15-12 to call `hello` with a value
of type `&MyBox<String>`.
Filename: src/main.rs
```
fn main() {
    let m = MyBox::new(String::from("Rust"));
    hello(&(*m)[..]);
}
```
Listing 15-13: The code we would have to write if Rust didn’t have deref
coercion
The `(*m)` dereferences the `MyBox<String>` into a `String`. Then the `&` and
`[..]` take a string slice of the `String` that is equal to the whole string to
match the signature of `hello`. This code without deref coercions is harder to
read, write, and understand with all of these symbols involved. Deref coercion
allows Rust to handle these conversions for us automatically.
When the `Deref` trait is defined for the types involved, Rust will analyze the
types and use `Deref::deref` as many times as necessary to get a reference to
match the parameter’s type. The number of times that `Deref::deref` needs to be
inserted is resolved at compile time, so there is no runtime penalty for taking
advantage of deref coercion!
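As a rough illustration of "as many times as necessary," the sketch below writes out the two `Deref::deref` calls that the compiler would insert for a `&MyBox<String>` argument. It assumes a `MyBox<T>` like the one in Listing 15-10, restated here, and uses a hypothetical `greeting` function that returns its text rather than printing it so the two paths can be compared.

```rust
use std::ops::Deref;

// Restated from the chapter's listings so the sketch compiles on its own.
struct MyBox<T>(T);

impl<T> MyBox<T> {
    fn new(x: T) -> MyBox<T> {
        MyBox(x)
    }
}

impl<T> Deref for MyBox<T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

// Hypothetical variant of `hello` that returns the greeting.
fn greeting(name: &str) -> String {
    format!("Hello, {name}!")
}

fn main() {
    let m = MyBox::new(String::from("Rust"));

    // Implicit: deref coercion turns `&MyBox<String>` into `&str`.
    let implicit = greeting(&m);

    // Explicit: the same two steps, written by hand.
    let as_string: &String = Deref::deref(&m); // &MyBox<String> -> &String
    let as_str: &str = Deref::deref(as_string); // &String -> &str
    let explicit = greeting(as_str);

    assert_eq!(implicit, explicit);
    println!("{implicit}");
}
```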
### How Deref Coercion Interacts with Mutability
Similar to how you use the `Deref` trait to override the `*` operator on
immutable references, you can use the `DerefMut` trait to override the `*`
operator on mutable references.
Rust does deref coercion when it finds types and trait implementations in three
cases:
* From `&T` to `&U` when `T: Deref<Target=U>`
* From `&mut T` to `&mut U` when `T: DerefMut<Target=U>`
* From `&mut T` to `&U` when `T: Deref<Target=U>`
The first two cases are the same except that the second involves mutable references.
The first case states that if you have a `&T`, and `T` implements `Deref` to
some type `U`, you can get a `&U` transparently. The second case states that
the same deref coercion happens for mutable references.
The third case is trickier: Rust will also coerce a mutable reference to an
immutable one. But the reverse is *not* possible: immutable references will
never coerce to mutable references. Because of the borrowing rules, if you have
a mutable reference, that mutable reference must be the only reference to that
data (otherwise, the program wouldn’t compile). Converting one mutable
reference to one immutable reference will never break the borrowing rules.
Converting an immutable reference to a mutable reference would require that the
initial immutable reference is the only immutable reference to that data, but
the borrowing rules don’t guarantee that. Therefore, Rust can’t make the
assumption that converting an immutable reference to a mutable reference is
possible.
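The three cases can be sketched with a small wrapper that implements both traits. This is an illustration under assumptions: the `MyBox<T>` here is a restated stand-in for the chapter's type with an added `DerefMut` implementation, and `shout`, `length`, and `run` are hypothetical names.

```rust
use std::ops::{Deref, DerefMut};

struct MyBox<T>(T);

impl<T> Deref for MyBox<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

// `DerefMut` builds on `Deref` and reuses its `Target` type.
impl<T> DerefMut for MyBox<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.0
    }
}

// Needs a mutable reference to a `String`.
fn shout(text: &mut String) {
    text.push('!');
}

// Needs only an immutable string slice.
fn length(text: &str) -> usize {
    text.len()
}

fn run() -> (usize, usize, String) {
    let mut m = MyBox(String::from("Rust"));

    // `&mut T` to `&mut U`: `&mut MyBox<String>` becomes `&mut String`.
    shout(&mut m);

    // `&mut T` to `&U`: a mutable reference coerces to an immutable one.
    let via_mut = length(&mut m);

    // `&T` to `&U`: `&MyBox<String>` becomes `&String`, then `&str`.
    let via_shared = length(&m);

    (via_mut, via_shared, m.0)
}

fn main() {
    let (via_mut, via_shared, text) = run();
    println!("{text}: {via_mut} and {via_shared} bytes");
}
```

Passing `&m` to `shout` would be rejected at compile time, which is the missing fourth case: an immutable reference never coerces to a mutable one.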
## Running Code on Cleanup with the Drop Trait
The second trait important to the smart pointer pattern is `Drop`, which lets
you customize what happens when a value is about to go out of scope. You can
provide an implementation for the `Drop` trait on any type, and that code can
be used to release resources like files or network connections.
We’re introducing `Drop` in the context of smart pointers because the
functionality of the `Drop` trait is almost always used when implementing a
smart pointer. For example, when a `Box<T>` is dropped it will deallocate the
space on the heap that the box points to.
In some languages, for some types, the programmer must call code to free memory
or resources every time they finish using an instance of those types. Examples
include file handles, sockets, and locks. If they forget, the system might
become overloaded and crash. In Rust, you can specify that a particular bit of
code be run whenever a value goes out of scope, and the compiler will insert
this code automatically. As a result, you don’t need to be careful about
placing cleanup code everywhere in a program that an instance of a particular
type is finished with—you still won’t leak resources!
You specify the code to run when a value goes out of scope by implementing the
`Drop` trait. The `Drop` trait requires you to implement one method named
`drop` that takes a mutable reference to `self`. To see when Rust calls `drop`,
let’s implement `drop` with `println!` statements for now.
Listing 15-14 shows a `CustomSmartPointer` struct whose only custom
functionality is that it prints a `Dropping CustomSmartPointer` message
containing its data when the instance goes out of scope, to show when Rust runs
the `drop` method.
Filename: src/main.rs
```
struct CustomSmartPointer {
    data: String,
}

impl Drop for CustomSmartPointer { // [1]
    fn drop(&mut self) {
        println!( // [2]
            "Dropping CustomSmartPointer with data `{}`!",
            self.data
        );
    }
}

fn main() {
    let c = CustomSmartPointer { // [3]
        data: String::from("my stuff"),
    };
    let d = CustomSmartPointer { // [4]
        data: String::from("other stuff"),
    };
    println!("CustomSmartPointers created."); // [5]
} // [6]
```
Listing 15-14: A `CustomSmartPointer` struct that implements the `Drop` trait
where we would put our cleanup code
The `Drop` trait is included in the prelude, so we don’t need to bring it into
scope. We implement the `Drop` trait on `CustomSmartPointer` [1] and provide an
implementation for the `drop` method that calls `println!` [2]. The body of the
`drop` method is where you would place any logic that you wanted to run when an
instance of your type goes out of scope. We’re printing some text here to
demonstrate visually when Rust will call `drop`.
In `main`, we create two instances of `CustomSmartPointer` at [3] and [4] and
then print `CustomSmartPointers created` [5]. At the end of `main` [6], our
instances of `CustomSmartPointer` will go out of scope, and Rust will call the
code we put in the `drop` method [2], printing our final message. Note that we
didn’t need to call the `drop` method explicitly.
When we run this program, we’ll see the following output:
```
CustomSmartPointers created.
Dropping CustomSmartPointer with data `other stuff`!
Dropping CustomSmartPointer with data `my stuff`!
```
Rust automatically called `drop` for us when our instances went out of scope,
calling the code we specified. Variables are dropped in the reverse order of
their creation, so `d` was dropped before `c`. This example’s purpose is to
give you a visual guide to how the `drop` method works; usually you would
specify the cleanup code that your type needs to run rather than a print
message.
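If you would rather check the drop order in code than read it off the terminal, one option is to have `drop` append to a shared log. The sketch below is an assumption-laden variation on Listing 15-14: the `Noisy` type and `drop_order` function are hypothetical, and it borrows `RefCell<T>` (covered later in this chapter) purely as a recording device.

```rust
use std::cell::RefCell;

// Hypothetical type that records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _first = Noisy { name: "first", log: &log };
        let _second = Noisy { name: "second", log: &log };
        log.borrow_mut().push("end of scope");
    } // `_second` is dropped here, and then `_first`.
    log.into_inner()
}

fn main() {
    // Prints ["end of scope", "second", "first"]
    println!("{:?}", drop_order());
}
```

The bindings are named `_first` and `_second` rather than `_` because a bare `_` pattern would not bind the value, and it would be dropped immediately instead of at the end of the scope.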
Unfortunately, it’s not straightforward to disable the automatic `drop`
functionality. Disabling `drop` isn’t usually necessary; the whole point of the
`Drop` trait is that it’s taken care of automatically. Occasionally, however,
you might want to clean up a value early. One example is when using smart
pointers that manage locks: you might want to force the `drop` method that
releases the lock so that other code in the same scope can acquire the lock.
Rust doesn’t let you call the `Drop` trait’s `drop` method manually; instead,
you have to call the `std::mem::drop` function provided by the standard library
if you want to force a value to be dropped before the end of its scope.
If we try to call the `Drop` trait’s `drop` method manually by modifying the
`main` function from Listing 15-14, as shown in Listing 15-15, we’ll get a
compiler error.
Filename: src/main.rs
```
fn main() {
    let c = CustomSmartPointer {
        data: String::from("some data"),
    };
    println!("CustomSmartPointer created.");
    c.drop();
    println!(
        "CustomSmartPointer dropped before the end of main."
    );
}
```
Listing 15-15: Attempting to call the `drop` method from the `Drop` trait
manually to clean up early
When we try to compile this code, we’ll get this error:
```
error[E0040]: explicit use of destructor method
  --> src/main.rs:16:7
   |
16 |     c.drop();
   |     --^^^^--
   |     | |
   |     | explicit destructor calls not allowed
   |     help: consider using `drop` function: `drop(c)`
```
This error message states that we’re not allowed to explicitly call `drop`. The
error message uses the term *destructor*, which is the general programming term
for a function that cleans up an instance. A *destructor* is analogous to a
*constructor*, which creates an instance. The `drop` function in Rust is one
particular destructor.
Rust doesn’t let us call `drop` explicitly because Rust would still
automatically call `drop` on the value at the end of `main`. This would cause a
*double free* error because Rust would be trying to clean up the same value
twice.
We can’t disable the automatic insertion of `drop` when a value goes out of
scope, and we can’t call the `drop` method explicitly. So, if we need to force
a value to be cleaned up early, we use the `std::mem::drop` function.
The `std::mem::drop` function is different from the `drop` method in the `Drop`
trait. We call it by passing as an argument the value we want to force-drop.
The function is in the prelude, so we can modify `main` in Listing 15-15 to
call the `drop` function, as shown in Listing 15-16.
Filename: src/main.rs
```
fn main() {
    let c = CustomSmartPointer {
        data: String::from("some data"),
    };
    println!("CustomSmartPointer created.");
    drop(c);
    println!(
        "CustomSmartPointer dropped before the end of main."
    );
}
```
Listing 15-16: Calling `std::mem::drop` to explicitly drop a value before it
goes out of scope
Running this code will print the following:
```
CustomSmartPointer created.
Dropping CustomSmartPointer with data `some data`!
CustomSmartPointer dropped before the end of main.
```
The text ``Dropping CustomSmartPointer with data `some data`!`` is printed
between the `CustomSmartPointer created.` and `CustomSmartPointer dropped
before the end of main.` text, showing that the `drop` method code is called to
drop `c` at that point.
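The lock scenario mentioned earlier can be sketched with the standard library's `Mutex<T>`, whose guard releases the lock in its `Drop` implementation. `Mutex<T>` belongs to the concurrency material in Chapter 16, so treat this as a preview under that assumption; the `lock_states` helper is a hypothetical name.

```rust
use std::sync::Mutex;

// Hypothetical helper: reports whether the lock could be acquired while a
// guard is alive, and again after that guard has been force-dropped.
fn lock_states() -> (bool, bool) {
    let lock = Mutex::new(0);

    let guard = lock.lock().unwrap();
    // The guard holds the lock, so a second attempt fails.
    let free_while_held = lock.try_lock().is_ok();

    // Force the guard's cleanup now instead of at the end of the scope.
    drop(guard);
    let free_after_drop = lock.try_lock().is_ok();

    (free_while_held, free_after_drop)
}

fn main() {
    let (held, after) = lock_states();
    println!("available while held: {held}, after drop: {after}");
}
```

Here `try_lock` is used instead of `lock` so that the failed attempt returns an error rather than blocking.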
You can use code specified in a `Drop` trait implementation in many ways to
make cleanup convenient and safe: for instance, you could use it to create your
own memory allocator! With the `Drop` trait and Rust’s ownership system, you
don’t have to remember to clean up because Rust does it automatically.
You also don’t have to worry about problems resulting from accidentally
cleaning up values still in use: the ownership system that makes sure
references are always valid also ensures that `drop` gets called only once when
the value is no longer being used.
Now that we’ve examined `Box<T>` and some of the characteristics of smart
pointers, let’s look at a few other smart pointers defined in the standard
library.
## Rc<T>, the Reference Counted Smart Pointer
In the majority of cases, ownership is clear: you know exactly which variable
owns a given value. However, there are cases when a single value might have
multiple owners. For example, in graph data structures, multiple edges might
point to the same node, and that node is conceptually owned by all of the edges
that point to it. A node shouldn’t be cleaned up unless it doesn’t have any
edges pointing to it and so has no owners.
You have to enable multiple ownership explicitly by using the Rust type
`Rc<T>`, which is an abbreviation for *reference counting*. The `Rc<T>` type
keeps track of the number of references to a value to determine whether or not
the value is still in use. If there are zero references to a value, the value
can be cleaned up without any references becoming invalid.
Imagine `Rc<T>` as a TV in a family room. When one person enters to watch TV,
they turn it on. Others can come into the room and watch the TV. When the last
person leaves the room, they turn off the TV because it’s no longer being used.
If someone turns off the TV while others are still watching it, there would be
an uproar from the remaining TV watchers!
We use the `Rc<T>` type when we want to allocate some data on the heap for
multiple parts of our program to read and we can’t determine at compile time
which part will finish using the data last. If we knew which part would finish
last, we could just make that part the data’s owner, and the normal ownership
rules enforced at compile time would take effect.
Note that `Rc<T>` is only for use in single-threaded scenarios. When we discuss
concurrency in Chapter 16, we’ll cover how to do reference counting in
multithreaded programs.
### Using Rc<T> to Share Data
Let’s return to our cons list example in Listing 15-5. Recall that we defined
it using `Box<T>`. This time, we’ll create two lists that both share ownership
of a third list. Conceptually, this looks similar to Figure 15-3.
Figure 15-3: Two lists, `b` and `c`, sharing ownership of a third list, `a`
We’ll create list `a` that contains `5` and then `10`. Then we’ll make two more
lists: `b` that starts with `3` and `c` that starts with `4`. Both `b` and `c`
lists will then continue on to the first `a` list containing `5` and `10`. In
other words, both lists will share the first list containing `5` and `10`.
Trying to implement this scenario using our definition of `List` with `Box<T>`
won’t work, as shown in Listing 15-17.
Filename: src/main.rs
```
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use crate::List::{Cons, Nil};

fn main() {
    let a = Cons(5, Box::new(Cons(10, Box::new(Nil))));
    let b = Cons(3, Box::new(a)); // [1]
    let c = Cons(4, Box::new(a)); // [2]
}
```
Listing 15-17: Demonstrating that we’re not allowed to have two lists using
`Box<T>` that try to share ownership of a third list
When we compile this code, we get this error:
```
error[E0382]: use of moved value: `a`
  --> src/main.rs:11:30
   |
9  |     let a = Cons(5, Box::new(Cons(10, Box::new(Nil))));
   |         - move occurs because `a` has type `List`, which does not implement the `Copy` trait
10 |     let b = Cons(3, Box::new(a));
   |                              - value moved here
11 |     let c = Cons(4, Box::new(a));
   |                              ^ value used here after move
```
The `Cons` variants own the data they hold, so when we create the `b` list [1],
`a` is moved into `b` and `b` owns `a`. Then, when we try to use `a` again when
creating `c` [2], we’re not allowed to because `a` has been moved.
We could change the definition of `Cons` to hold references instead, but then
we would have to specify lifetime parameters. By specifying lifetime
parameters, we would be specifying that every element in the list will live at
least as long as the entire list. This is the case for the elements and lists
in Listing 15-17, but not in every scenario.
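For comparison, here is roughly what the reference-based alternative could look like. This is a sketch, not the approach the chapter takes: the lifetime parameter `'a`, the `sum` function, and `shared_sums` are all introduced here for illustration.

```rust
// A cons list whose tail is a borrowed reference instead of an owned box.
enum List<'a> {
    Cons(i32, &'a List<'a>),
    Nil,
}

// Adds up every element so we can confirm that both lists reach `a`.
fn sum(list: &List) -> i32 {
    match list {
        List::Cons(value, rest) => *value + sum(rest),
        List::Nil => 0,
    }
}

fn shared_sums() -> (i32, i32) {
    // Every piece needs its own binding that outlives the lists built on it.
    let nil = List::Nil;
    let tail = List::Cons(10, &nil);
    let a = List::Cons(5, &tail);

    // Both `b` and `c` borrow `a`, so nothing is moved.
    let b = List::Cons(3, &a);
    let c = List::Cons(4, &a);

    (sum(&b), sum(&c))
}

fn main() {
    let (b_total, c_total) = shared_sums();
    println!("b sums to {b_total}, c sums to {c_total}");
}
```

This works because every element is a local variable that outlives the lists borrowing it. A function could not build such a list from its own locals and return it, which is the kind of scenario where the lifetime requirement gets in the way.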
Instead, we’ll change our definition of `List` to use `Rc<T>` in place of
`Box<T>`, as shown in Listing 15-18. Each `Cons` variant will now hold a value
and an `Rc<T>` pointing to a `List`. When we create `b`, instead of taking
ownership of `a`, we’ll clone the `Rc<List>` that `a` is holding, thereby
increasing the number of references from one to two and letting `a` and `b`
share ownership of the data in that `Rc<List>`. We’ll also clone `a` when
creating `c`, increasing the number of references from two to three. Every time
we call `Rc::clone`, the reference count to the data within the `Rc<List>` will
increase, and the data won’t be cleaned up unless there are zero references to
it.
Filename: src/main.rs
```
enum List {
    Cons(i32, Rc<List>),
    Nil,
}

use crate::List::{Cons, Nil};
use std::rc::Rc; // [1]

fn main() {
    let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil))))); // [2]
    let b = Cons(3, Rc::clone(&a)); // [3]
    let c = Cons(4, Rc::clone(&a)); // [4]
}
```
Listing 15-18: A definition of `List` that uses `Rc<T>`
We need to add a `use` statement to bring `Rc<T>` into scope [1] because it’s
not in the prelude. In `main`, we create the list holding `5` and `10` and
store it in a new `Rc<List>` in `a` [2]. Then, when we create `b` [3] and `c`
[4], we call the `Rc::clone` function and pass a reference to the `Rc<List>` in
`a` as an argument.
We could have called `a.clone()` rather than `Rc::clone(&a)`, but Rust’s
convention is to use `Rc::clone` in this case. The implementation of
`Rc::clone` doesn’t make a deep copy of all the data like most types’
implementations of `clone` do. The call to `Rc::clone` only increments the
reference count, which doesn’t take much time. Deep copies of data can take a
lot of time. By using `Rc::clone` for reference counting, we can visually
distinguish between the deep-copy kinds of clones and the kinds of clones that
increase the reference count. When looking for performance problems in the
code, we only need to consider the deep-copy clones and can disregard calls to
`Rc::clone`.
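The difference between the two kinds of clone can be observed with `Rc::ptr_eq`, which reports whether two `Rc<T>` values point to the same allocation. The sketch below uses an `Rc<Vec<i32>>` instead of the cons list to keep it short; `compare_clones` is a hypothetical helper.

```rust
use std::rc::Rc;

// Hypothetical helper: returns whether each clone shares `a`'s allocation,
// plus the resulting strong count of `a`.
fn compare_clones() -> (bool, bool, usize) {
    let a = Rc::new(vec![5, 10]);

    // Reference-count clone: no data is copied, only the count changes.
    let shared = Rc::clone(&a);

    // Deep copy: clones the inner `Vec` and wraps it in a brand-new `Rc`.
    let copied = Rc::new((*a).clone());

    (
        Rc::ptr_eq(&a, &shared),
        Rc::ptr_eq(&a, &copied),
        Rc::strong_count(&a),
    )
}

fn main() {
    let (shares, copy_shares, count) = compare_clones();
    println!("Rc::clone shares: {shares}, deep copy shares: {copy_shares}, count: {count}");
}
```

The deep copy has its own count of 1 and does not affect the count of `a`.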
### Cloning an Rc<T> Increases the Reference Count
Let’s change our working example in Listing 15-18 so we can see the reference
counts changing as we create and drop references to the `Rc<List>` in `a`.
In Listing 15-19, we’ll change `main` so it has an inner scope around list `c`;
then we can see how the reference count changes when `c` goes out of scope.
Filename: src/main.rs
```
--snip--
fn main() {
    let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil)))));
    println!(
        "count after creating a = {}",
        Rc::strong_count(&a)
    );
    let b = Cons(3, Rc::clone(&a));
    println!(
        "count after creating b = {}",
        Rc::strong_count(&a)
    );
    {
        let c = Cons(4, Rc::clone(&a));
        println!(
            "count after creating c = {}",
            Rc::strong_count(&a)
        );
    }
    println!(
        "count after c goes out of scope = {}",
        Rc::strong_count(&a)
    );
}
```
Listing 15-19: Printing the reference count
At each point in the program where the reference count changes, we print the
reference count, which we get by calling the `Rc::strong_count` function. This
function is named `strong_count` rather than `count` because the `Rc<T>` type
also has a `weak_count`; we’ll see what `weak_count` is used for in “Preventing
Reference Cycles Using Weak<T>” later in this chapter.
This code prints the following:
```
count after creating a = 1
count after creating b = 2
count after creating c = 3
count after c goes out of scope = 2
```
We can see that the `Rc<List>` in `a` has an initial reference count of 1; then
each time we call `clone`, the count goes up by 1. When `c` goes out of scope,
the count goes down by 1. We don’t have to call a function to decrease the
reference count like we have to call `Rc::clone` to increase the reference
count: the implementation of the `Drop` trait decreases the reference count
automatically when an `Rc<T>` value goes out of scope.
What we can’t see in this example is that when `b` and then `a` go out of scope
at the end of `main`, the count is then 0, and the `Rc<List>` is cleaned up
completely. Using `Rc<T>` allows a single value to have multiple owners, and
the count ensures that the value remains valid as long as any of the owners
still exist.
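One way to observe the final cleanup is to give the shared value a `Drop` implementation that flips a flag. In the sketch below, the `Tracked` type and the `cleanup_timing` function are hypothetical, and `Cell<bool>` from the standard library serves only as a simple flag.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical value that sets a flag when it is finally cleaned up.
struct Tracked<'a> {
    dropped: &'a Cell<bool>,
}

impl<'a> Drop for Tracked<'a> {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

fn cleanup_timing() -> (bool, bool) {
    let flag = Cell::new(false);

    let a = Rc::new(Tracked { dropped: &flag });
    let b = Rc::clone(&a); // count is now 2

    drop(a); // count falls to 1; the value must stay alive for `b`
    let after_first_owner = flag.get();

    drop(b); // count falls to 0; the value itself is dropped
    let after_last_owner = flag.get();

    (after_first_owner, after_last_owner)
}

fn main() {
    let (first, last) = cleanup_timing();
    println!("dropped after first owner: {first}, after last owner: {last}");
}
```

The explicit `drop` calls stand in for the owners going out of scope; the count behaves the same way in both situations.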
Via immutable references, `Rc<T>` allows you to share data between multiple
parts of your program for reading only. If `Rc<T>` allowed you to have multiple
mutable references too, you might violate one of the borrowing rules discussed
in Chapter 4: multiple mutable borrows to the same place can cause data races
and inconsistencies. But being able to mutate data is very useful! In the next
section, we’ll discuss the interior mutability pattern and the `RefCell<T>`
type that you can use in conjunction with an `Rc<T>` to work with this
immutability restriction.
## RefCell<T> and the Interior Mutability Pattern
*Interior mutability* is a design pattern in Rust that allows you to mutate
data even when there are immutable references to that data; normally, this
action is disallowed by the borrowing rules. To mutate data, the pattern uses
`unsafe` code inside a data structure to bend Rust’s usual rules that govern
mutation and borrowing. Unsafe code indicates to the compiler that we’re
checking the rules manually instead of relying on the compiler to check them
for us; we will discuss unsafe code more in Chapter 19.
We can use types that use the interior mutability pattern only when we can
ensure that the borrowing rules will be followed at runtime, even though the
compiler can’t guarantee that. The `unsafe` code involved is then wrapped in a
safe API, and the outer type is still immutable.
Let’s explore this concept by looking at the `RefCell<T>` type that follows the
interior mutability pattern.
### Enforcing Borrowing Rules at Runtime with RefCell<T>
Unlike `Rc<T>`, the `RefCell<T>` type represents single ownership over the data
it holds. So what makes `RefCell<T>` different from a type like `Box<T>`?
Recall the borrowing rules you learned in Chapter 4:
* At any given time, you can have *either* one mutable reference or any number
of immutable references (but not both).
* References must always be valid.
With references and `Box<T>`, the borrowing rules’ invariants are enforced at
compile time. With `RefCell<T>`, these invariants are enforced *at runtime*.
With references, if you break these rules, you’ll get a compiler error. With
`RefCell<T>`, if you break these rules, your program will panic and exit.
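To see the runtime check without crashing the program, we can use `try_borrow_mut`, the non-panicking counterpart of `borrow_mut` that returns a `Result`. The following is a small sketch with a hypothetical `borrow_checks` helper.

```rust
use std::cell::RefCell;

// Hypothetical helper: asks for a mutable borrow while an immutable borrow
// is active, and again after that borrow has been released.
fn borrow_checks() -> (bool, bool) {
    let cell = RefCell::new(vec![1, 2, 3]);

    let reader = cell.borrow();
    // `borrow_mut` would panic here; `try_borrow_mut` returns an `Err`.
    let allowed_while_reading = cell.try_borrow_mut().is_ok();
    drop(reader);

    // With no outstanding borrows, the mutable borrow is granted.
    let allowed_after_release = cell.try_borrow_mut().is_ok();

    (allowed_while_reading, allowed_after_release)
}

fn main() {
    let (during, after) = borrow_checks();
    println!("mutable borrow allowed while reading: {during}, after: {after}");
}
```

Nothing in this code is rejected at compile time. The `RefCell<T>` tracks its active borrows as the program runs and refuses the conflicting one.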
The advantages of checking the borrowing rules at compile time are that errors
will be caught sooner in the development process, and there is no impact on
runtime performance because all the analysis is completed beforehand. For those
reasons, checking the borrowing rules at compile time is the best choice in the
majority of cases, which is why this is Rust’s default.
The advantage of checking the borrowing rules at runtime instead is that
certain memory-safe scenarios are then allowed, where they would’ve been
disallowed by the compile-time checks. Static analysis, like that performed by
the Rust compiler, is inherently conservative. Some properties of code are
impossible to detect by
analyzing the code: the most famous example is the Halting Problem, which is
beyond the scope of this book but is an interesting topic to research.
Because some analysis is impossible, if the Rust compiler can’t be sure the
code complies with the ownership rules, it might reject a correct program; in
this way, it’s conservative. If Rust accepted an incorrect program, users
wouldn’t be able to trust in the guarantees Rust makes. However, if Rust
rejects a correct program, the programmer will be inconvenienced, but nothing
catastrophic can occur. The `RefCell<T>` type is useful when you’re sure your
code follows the borrowing rules but the compiler is unable to understand and
guarantee that.
Similar to `Rc<T>`, `RefCell<T>` is only for use in single-threaded scenarios
and will give you a compile-time error if you try using it in a multithreaded
context. We’ll talk about how to get the functionality of `RefCell<T>` in a
multithreaded program in Chapter 16.
Here is a recap of the reasons to choose `Box<T>`, `Rc<T>`, or `RefCell<T>`:
* `Rc<T>` enables multiple owners of the same data; `Box<T>` and `RefCell<T>`
have single owners.
* `Box<T>` allows immutable or mutable borrows checked at compile time; `Rc<T>`
allows only immutable borrows checked at compile time; `RefCell<T>` allows
immutable or mutable borrows checked at runtime.
* Because `RefCell<T>` allows mutable borrows checked at runtime, you can
mutate the value inside the `RefCell<T>` even when the `RefCell<T>` is
immutable.
Mutating the value inside an immutable value is the *interior mutability*
pattern. Let’s look at a situation in which interior mutability is useful and
examine how it’s possible.
### Interior Mutability: A Mutable Borrow to an Immutable Value
A consequence of the borrowing rules is that when you have an immutable value,
you can’t borrow it mutably. For example, this code won’t compile:
Filename: src/main.rs
```
fn main() {
    let x = 5;
    let y = &mut x;
}
```
If you tried to compile this code, you’d get the following error:
```
error[E0596]: cannot borrow `x` as mutable, as it is not declared as mutable
 --> src/main.rs:3:13
  |
2 |     let x = 5;
  |         - help: consider changing this to be mutable: `mut x`
3 |     let y = &mut x;
  |             ^^^^^^ cannot borrow as mutable
```
However, there are situations in which it would be useful for a value to mutate
itself in its methods but appear immutable to other code. Code outside the
value’s methods would not be able to mutate the value. Using `RefCell<T>` is
one way to get the ability to have interior mutability, but `RefCell<T>`
doesn’t get around the borrowing rules completely: the borrow checker in the
compiler allows this interior mutability, and the borrowing rules are checked
at runtime instead. If you violate the rules, you’ll get a `panic!` instead of
a compiler error.
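As a minimal sketch of a value that mutates itself in its methods while appearing immutable to other code, consider a hypothetical `Counter` type (not part of the chapter's examples) that keeps its state in a `RefCell<u32>`.

```rust
use std::cell::RefCell;

// Hypothetical type: callers only ever need an immutable binding.
struct Counter {
    hits: RefCell<u32>,
}

impl Counter {
    fn new() -> Counter {
        Counter {
            hits: RefCell::new(0),
        }
    }

    // Takes `&self`, yet updates the count through the `RefCell<T>`.
    fn record(&self) {
        *self.hits.borrow_mut() += 1;
    }

    fn total(&self) -> u32 {
        *self.hits.borrow()
    }
}

fn main() {
    let counter = Counter::new(); // note: no `mut`
    counter.record();
    counter.record();
    println!("recorded {} hits", counter.total());
}
```

Because `hits` is a private field, code outside the module that defines `Counter` can change the count only by calling `record`.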
Let’s work through a practical example where we can use `RefCell<T>` to mutate
an immutable value and see why that is useful.
#### A Use Case for Interior Mutability: Mock Objects
Sometimes during testing a programmer will use a type in place of another type,
in order to observe particular behavior and assert that it’s implemented
correctly. This placeholder type is called a *test double*. Think of it in the
sense of a stunt double in filmmaking, where a person steps in and substitutes
for an actor to do a particularly tricky scene. Test doubles stand in for other
types when we’re running tests. *Mock objects* are specific types of test
doubles that record what happens during a test so you can assert that the
correct actions took place.
Rust doesn’t have objects in the same sense as other languages have objects,
and Rust doesn’t have mock object functionality built into the standard library
as some other languages do. However, you can definitely create a struct that
will serve the same purposes as a mock object.
Here’s the scenario we’ll test: we’ll create a library that tracks a value
against a maximum value and sends messages based on how close to the maximum
value the current value is. This library could be used to keep track of a
user’s quota for the number of API calls they’re allowed to make, for example.
Our library will only provide the functionality of tracking how close to the
maximum a value is and what the messages should be at what times. Applications
that use our library will be expected to provide the mechanism for sending the
messages: the application could display a message in its own interface, send an
email, send a text message, or do something else. The library doesn’t need to
know that detail. All it needs is something that implements a trait we’ll
provide called `Messenger`. Listing 15-20 shows the library code.
Filename: src/lib.rs
```
pub trait Messenger {
    fn send(&self, msg: &str); // [1]
}

pub struct LimitTracker<'a, T: Messenger> {
    messenger: &'a T,
    value: usize,
    max: usize,
}

impl<'a, T> LimitTracker<'a, T>
where
    T: Messenger,
{
    pub fn new(
        messenger: &'a T,
        max: usize
    ) -> LimitTracker<'a, T> {
        LimitTracker {
            messenger,
            value: 0,
            max,
        }
    }

    pub fn set_value(&mut self, value: usize) { // [2]
        self.value = value;

        let percentage_of_max =
            self.value as f64 / self.max as f64;

        if percentage_of_max >= 1.0 {
            self.messenger
                .send("Error: You are over your quota!");
        } else if percentage_of_max >= 0.9 {
            self.messenger
                .send("Urgent: You're at 90% of your quota!");
        } else if percentage_of_max >= 0.75 {
            self.messenger
                .send("Warning: You're at 75% of your quota!");
        }
    }
}
```
Listing 15-20: A library to keep track of how close a value is to a maximum
value and warn when the value is at certain levels
One important part of this code is that the `Messenger` trait has one method
called `send` that takes an immutable reference to `self` and the text of the
message [1]. This trait is the interface our mock object needs to implement so
that the mock can be used in the same way a real object is. The other important
part is that we want to test the behavior of the `set_value` method on the
`LimitTracker` [2]. We can change what we pass in for the `value` parameter,
but `set_value` doesn’t return anything for us to make assertions on. We want
to be able to say that if we create a `LimitTracker` with something that
implements the `Messenger` trait and a particular value for `max`, when we pass
different numbers for `value` the messenger is told to send the appropriate
messages.
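To get a feel for which message each value should trigger, the branch logic of `set_value` can be pulled out into a pure function and tried on a few numbers. The `quota_message` helper below is hypothetical and exists only for this illustration; the library itself sends the text through the `Messenger` rather than returning it.

```rust
// Hypothetical helper that mirrors the branch order in `set_value`.
fn quota_message(value: usize, max: usize) -> Option<&'static str> {
    let percentage_of_max = value as f64 / max as f64;

    if percentage_of_max >= 1.0 {
        Some("Error: You are over your quota!")
    } else if percentage_of_max >= 0.9 {
        Some("Urgent: You're at 90% of your quota!")
    } else if percentage_of_max >= 0.75 {
        Some("Warning: You're at 75% of your quota!")
    } else {
        None
    }
}

fn main() {
    // With a `max` of 100: 50 sends nothing, 80 gives 0.8 (a warning),
    // 95 gives 0.95 (urgent), and 100 gives 1.0 (an error).
    for value in [50, 80, 95, 100] {
        println!("{value}: {:?}", quota_message(value, 100));
    }
}
```

A function that returns its result like this is easy to assert on. The real `set_value` returns nothing, which is why the test needs a mock `Messenger` to observe its behavior.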
We need a mock object that, instead of sending an email or text message when we
call `send`, will only keep track of the messages it’s told to send. We can
create a new instance of the mock object, create a `LimitTracker` that uses the
mock object, call the `set_value` method on `LimitTracker`, and then check that
the mock object has the messages we expect. Listing 15-21 shows an attempt to
implement a mock object to do just that, but the borrow checker won’t allow it.
Filename: src/lib.rs
```
#[cfg(test)]
mod tests {
    use super::*;

    struct MockMessenger { // [1]
        sent_messages: Vec<String>, // [2]
    }

    impl MockMessenger {
        fn new() -> MockMessenger { // [3]
            MockMessenger {
                sent_messages: vec![],
            }
        }
    }

    impl Messenger for MockMessenger { // [4]
        fn send(&self, message: &str) {
            self.sent_messages.push(String::from(message)); // [5]
        }
    }

    #[test]
    fn it_sends_an_over_75_percent_warning_message() { // [6]
        let mock_messenger = MockMessenger::new();
        let mut limit_tracker = LimitTracker::new(
            &mock_messenger,
            100
        );

        limit_tracker.set_value(80);

        assert_eq!(mock_messenger.sent_messages.len(), 1);
    }
}
```
Listing 15-21: An attempt to implement a `MockMessenger` that isn’t allowed by
the borrow checker
This test code defines a `MockMessenger` struct [1] that has a `sent_messages`
field with a `Vec` of `String` values [2] to keep track of the messages it’s
told to send. We also define an associated function `new` [3] to make it
convenient to create new `MockMessenger` values that start with an empty list
of messages. We then implement the `Messenger` trait for `MockMessenger` [4] so
we can give a `MockMessenger` to a `LimitTracker`. In the definition of the
`send` method [5], we take the message passed in as a parameter and store it in
the `MockMessenger` list of `sent_messages`.
In the test, we’re testing what happens when the `LimitTracker` is told to set
`value` to something that is more than 75 percent of the `max` value [6]. First
we create a new `MockMessenger`, which will start with an empty list of
messages. Then we create a new `LimitTracker` and give it a reference to the
new `MockMessenger` and a `max` value of `100`. We call the `set_value` method
on the `LimitTracker` with a value of `80`, which is more than 75 percent of
100. Then we assert that the list of messages that the `MockMessenger` is
keeping track of should now have one message in it.
However, there’s one problem with this test, as shown here:
```
error[E0596]: cannot borrow `self.sent_messages` as mutable, as it is behind a
`&` reference
--> src/lib.rs:58:13
|
2 | fn send(&self, msg: &str);
| ----- help: consider changing that to be a mutable reference:
`&mut self`
...
58 | self.sent_messages.push(String::from(message));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `self` is a
`&` reference, so the data it refers to cannot be borrowed as mutable
```
We can’t modify the `MockMessenger` to keep track of the messages because the
`send` method takes an immutable reference to `self`. We also can’t take the
suggestion from the error text to use `&mut self` instead because then the
signature of `send` wouldn’t match the signature in the `Messenger` trait
definition (feel free to try it and see what error message you get).
This is a situation in which interior mutability can help! We’ll store the
`sent_messages` within a `RefCell<T>`, and then the `send` method will be able
to modify `sent_messages` to store the messages we’ve seen. Listing 15-22 shows
what that looks like.
Filename: src/lib.rs
```
#[cfg(test)]
mod tests {
    use super::*;
    use std::cell::RefCell;

    struct MockMessenger {
        sent_messages: RefCell<Vec<String>>, // [1]
    }

    impl MockMessenger {
        fn new() -> MockMessenger {
            MockMessenger {
                sent_messages: RefCell::new(vec![]), // [2]
            }
        }
    }

    impl Messenger for MockMessenger {
        fn send(&self, message: &str) {
            self.sent_messages
                .borrow_mut() // [3]
                .push(String::from(message));
        }
    }

    #[test]
    fn it_sends_an_over_75_percent_warning_message() {
        // --snip--

        assert_eq!(
            mock_messenger.sent_messages.borrow().len(), // [4]
            1
        );
    }
}
```
Listing 15-22: Using `RefCell<T>` to mutate an inner value while the outer
value is considered immutable
The `sent_messages` field is now of type `RefCell<Vec<String>>` [1] instead of
`Vec<String>`. In the `new` function, we create a new `RefCell<Vec<String>>`
instance around the empty vector [2].
For the implementation of the `send` method, the first parameter is still an
immutable borrow of `self`, which matches the trait definition. We call
`borrow_mut` on the `RefCell<Vec<String>>` in `self.sent_messages` [3] to get a
mutable reference to the value inside the `RefCell<Vec<String>>`, which is the
vector. Then we can call `push` on the mutable reference to the vector to keep
track of the messages sent during the test.
The last change we have to make is in the assertion: to see how many items are
in the inner vector, we call `borrow` on the `RefCell<Vec<String>>` to get an
immutable reference to the vector [4].
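To try the pattern outside the test module, here is a minimal, self-contained sketch. The `Messenger` trait and `LimitTracker` below are simplified stand-ins for the library code, which isn’t shown in this section: this version sends only the 75 percent warning, and its message text is made up. The `messages_sent_for` helper is hypothetical and exists only for illustration.

```rust
use std::cell::RefCell;

// Simplified stand-ins for the library code under test (assumptions).
pub trait Messenger {
    fn send(&self, msg: &str);
}

pub struct LimitTracker<'a, T: Messenger> {
    messenger: &'a T,
    value: usize,
    max: usize,
}

impl<'a, T: Messenger> LimitTracker<'a, T> {
    pub fn new(messenger: &'a T, max: usize) -> LimitTracker<'a, T> {
        LimitTracker { messenger, value: 0, max }
    }

    pub fn set_value(&mut self, value: usize) {
        self.value = value;
        let used = self.value as f64 / self.max as f64;
        if used >= 0.75 {
            // Illustrative message text, not the library's wording.
            self.messenger.send("Warning: over 75% of quota used");
        }
    }
}

// The mock records messages behind a RefCell so `send` can take `&self`.
struct MockMessenger {
    sent_messages: RefCell<Vec<String>>,
}

impl Messenger for MockMessenger {
    fn send(&self, message: &str) {
        self.sent_messages.borrow_mut().push(String::from(message));
    }
}

// Hypothetical helper: sets `value` on a tracker with a max of 100 and
// reports how many messages the mock recorded.
fn messages_sent_for(value: usize) -> usize {
    let mock = MockMessenger { sent_messages: RefCell::new(vec![]) };
    let mut tracker = LimitTracker::new(&mock, 100);
    tracker.set_value(value);
    let count = mock.sent_messages.borrow().len();
    count
}

fn main() {
    println!("messages at 80: {}", messages_sent_for(80));
    println!("messages at 10: {}", messages_sent_for(10));
}
```

The mock never needs a `mut` binding: all mutation happens through `borrow_mut` on the `RefCell<Vec<String>>` inside it.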
Now that you’ve seen how to use `RefCell<T>`, let’s dig into how it works!
#### Keeping Track of Borrows at Runtime with RefCell<T>
When creating immutable and mutable references, we use the `&` and `&mut`
syntax, respectively. With `RefCell<T>`, we use the `borrow` and `borrow_mut`
methods, which are part of the safe API that belongs to `RefCell<T>`. The
`borrow` method returns the smart pointer type `Ref<T>`, and `borrow_mut`
returns the smart pointer type `RefMut<T>`. Both types implement `Deref`, so we
can treat them like regular references.
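As a small sketch of what “treat them like regular references” means in practice, the hypothetical `push_and_len` function below names the `RefMut<T>` and `Ref<T>` types explicitly and then calls `Vec` methods straight through them.

```rust
use std::cell::{Ref, RefCell, RefMut};

// Hypothetical helper: pushes `item`, then reports the vector's length.
fn push_and_len(cell: &RefCell<Vec<i32>>, item: i32) -> usize {
    {
        let mut writer: RefMut<Vec<i32>> = cell.borrow_mut();
        // RefMut dereferences to the Vec, so `push` can be called directly.
        writer.push(item);
    } // `writer` is dropped here, which ends the mutable borrow

    let reader: Ref<Vec<i32>> = cell.borrow();
    // Ref dereferences to the Vec as well, so `len` works the same way.
    let len = reader.len();
    len
}

fn main() {
    let cell = RefCell::new(vec![1, 2]);
    println!("length after push = {}", push_and_len(&cell, 3));
}
```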
The `RefCell<T>` keeps track of how many `Ref<T>` and `RefMut<T>` smart
pointers are currently active. Every time we call `borrow`, the `RefCell<T>`
increases its count of how many immutable borrows are active. When a `Ref<T>`
value goes out of scope, the count of immutable borrows goes down by 1. Just
like the compile-time borrowing rules, `RefCell<T>` lets us have many immutable
borrows or one mutable borrow at any point in time.
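One way to watch this bookkeeping without triggering a panic is to use the non-panicking `try_borrow` and `try_borrow_mut` methods on `RefCell<T>`, which return a `Result` instead. The sketch below is only an illustration; the `probe_rules` function name is made up for this example.

```rust
use std::cell::RefCell;

// Returns three flags:
//   0: a second immutable borrow succeeds while one is active
//   1: a mutable borrow succeeds while an immutable borrow is active
//   2: a mutable borrow succeeds after the immutable borrow is released
fn probe_rules() -> (bool, bool, bool) {
    let cell = RefCell::new(5);

    let first = cell.borrow(); // one immutable borrow is now active
    let second_shared_ok = cell.try_borrow().is_ok();
    let mutable_while_shared_ok = cell.try_borrow_mut().is_ok();

    drop(first); // the count of immutable borrows goes back to 0
    let mutable_after_release_ok = cell.try_borrow_mut().is_ok();

    (second_shared_ok, mutable_while_shared_ok, mutable_after_release_ok)
}

fn main() {
    println!("{:?}", probe_rules()); // prints (true, false, true)
}
```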
If we try to violate these rules, rather than getting a compiler error as we
would with references, the implementation of `RefCell<T>` will panic at
runtime. Listing 15-23 shows a modification of the implementation of `send` in
Listing 15-22. We’re deliberately trying to create two mutable borrows that are
active in the same scope to illustrate that `RefCell<T>` prevents us from doing
this at runtime.
Filename: src/lib.rs
```
impl Messenger for MockMessenger {
    fn send(&self, message: &str) {
        let mut one_borrow = self.sent_messages.borrow_mut();
        let mut two_borrow = self.sent_messages.borrow_mut();

        one_borrow.push(String::from(message));
        two_borrow.push(String::from(message));
    }
}
```
Listing 15-23: Creating two mutable references in the same scope to see that
`RefCell<T>` will panic
We create a variable `one_borrow` for the `RefMut<T>` smart pointer returned
from `borrow_mut`. Then we create another mutable borrow in the same way in the
variable `two_borrow`. This makes two mutable references in the same scope,
which isn’t allowed. When we run the tests for our library, the code in Listing
15-23 will compile without any errors, but the test will fail:
```
---- tests::it_sends_an_over_75_percent_warning_message stdout ----
thread 'main' panicked at 'already borrowed: BorrowMutError', src/lib.rs:60:53
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```
Notice that the code panicked with the message `already borrowed:
BorrowMutError`. This is how `RefCell<T>` handles violations of the borrowing
rules at runtime.
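For contrast, here is a sketch of one way to make the two pushes coexist: end the first mutable borrow before starting the second. To keep the example self-contained it uses a standalone `MockMessenger` with a hypothetical `send_twice` method rather than the `Messenger` trait implementation.

```rust
use std::cell::RefCell;

struct MockMessenger {
    sent_messages: RefCell<Vec<String>>,
}

impl MockMessenger {
    // Hypothetical method: records the same message twice without panicking.
    fn send_twice(&self, message: &str) {
        {
            let mut one_borrow = self.sent_messages.borrow_mut();
            one_borrow.push(String::from(message));
        } // `one_borrow` goes out of scope, releasing the mutable borrow

        // No other borrow is active now, so this call succeeds.
        let mut two_borrow = self.sent_messages.borrow_mut();
        two_borrow.push(String::from(message));
    }
}

fn main() {
    let mock = MockMessenger { sent_messages: RefCell::new(vec![]) };
    mock.send_twice("hello");
    println!("recorded {} messages", mock.sent_messages.borrow().len());
}
```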
Choosing to catch borrowing errors at runtime rather than compile time, as
we’ve done here, means you’d potentially be finding mistakes in your code later
in the development process: possibly not until your code was deployed to
production. Also, your code would incur a small runtime performance penalty as
a result of keeping track of the borrows at runtime rather than compile time.
However, using `RefCell<T>` makes it possible to write a mock object that can
modify itself to keep track of the messages it has seen while you’re using it
in a context where only immutable values are allowed. You can use `RefCell<T>`
despite its trade-offs to get more functionality than regular references
provide.
### Allowing Multiple Owners of Mutable Data with Rc<T> and RefCell<T>
A common way to use `RefCell<T>` is in combination with `Rc<T>`. Recall that
`Rc<T>` lets you have multiple owners of some data, but it only gives immutable
access to that data. If you have an `Rc<T>` that holds a `RefCell<T>`, you can
get a value that can have multiple owners *and* that you can mutate!
For example, recall the cons list example in Listing 15-18 where we used
`Rc<T>` to allow multiple lists to share ownership of another list. Because
`Rc<T>` holds only immutable values, we can’t change any of the values in the
list once we’ve created them. Let’s add in `RefCell<T>` for its ability to
change the values in the lists. Listing 15-24 shows that by using a
`RefCell<T>` in the `Cons` definition, we can modify the value stored in all
the lists.
Filename: src/main.rs
```
#[derive(Debug)]
enum List {
    Cons(Rc<RefCell<i32>>, Rc<List>),
    Nil,
}

use crate::List::{Cons, Nil};
use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    let value = Rc::new(RefCell::new(5)); // [1]

    let a = Rc::new(Cons(Rc::clone(&value), Rc::new(Nil))); // [2]

    let b = Cons(Rc::new(RefCell::new(3)), Rc::clone(&a));
    let c = Cons(Rc::new(RefCell::new(4)), Rc::clone(&a));

    *value.borrow_mut() += 10; // [3]

    println!("a after = {:?}", a);
    println!("b after = {:?}", b);
    println!("c after = {:?}", c);
}
```
Listing 15-24: Using `Rc<RefCell<i32>>` to create a `List` that we can mutate
We create a value that is an instance of `Rc<RefCell<i32>>` and store it in a
variable named `value` [1] so we can access it directly later. Then we create a
`List` in `a` with a `Cons` variant that holds `value` [2]. We need to clone
`value` so both `a` and `value` have ownership of the inner `5` value rather
than transferring ownership from `value` to `a` or having `a` borrow from
`value`.
We wrap the list `a` in an `Rc<T>` so when we create lists `b` and `c`, they
can both refer to `a`, which is what we did in Listing 15-18.
After we’ve created the lists in `a`, `b`, and `c`, we want to add 10 to the
value in `value` [3]. We do this by calling `borrow_mut` on `value`, which uses
the automatic dereferencing feature we discussed in “Where’s the -> Operator?”
in Chapter 5 to dereference the `Rc<T>` to the inner `RefCell<T>` value. The
`borrow_mut` method returns a `RefMut<T>` smart pointer, and we use the
dereference operator on it and change the inner value.
When we print `a`, `b`, and `c`, we can see that they all have the modified
value of `15` rather than `5`:
```
a after = Cons(RefCell { value: 15 }, Nil)
b after = Cons(RefCell { value: 3 }, Cons(RefCell { value: 15 }, Nil))
c after = Cons(RefCell { value: 4 }, Cons(RefCell { value: 15 }, Nil))
```
This technique is pretty neat! By using `RefCell<T>`, we have an outwardly
immutable `List` value. But we can use the methods on `RefCell<T>` that provide
access to its interior mutability so we can modify our data when we need to.
The runtime checks of the borrowing rules protect us from data races, and it’s
sometimes worth trading a bit of speed for this flexibility in our data
structures. Note that `RefCell<T>` does not work for multithreaded code!
`Mutex<T>` is the thread-safe version of `RefCell<T>`, and we’ll discuss
`Mutex<T>` in Chapter 16.
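Stripped of the cons list, the same combination can be sketched with a plain integer. In this illustrative example (the `shared_total` function and variable names are made up), three `Rc` handles own one `RefCell<i32>`, and a mutation made through any handle is visible through the others.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Returns the final value and the number of owners of the shared cell.
fn shared_total() -> (i32, usize) {
    let shared = Rc::new(RefCell::new(5));
    let first_owner = Rc::clone(&shared);
    let second_owner = Rc::clone(&shared);

    // Each owner mutates the same inner value through its own handle.
    *first_owner.borrow_mut() += 10; // 5 + 10 = 15
    *second_owner.borrow_mut() *= 2; // 15 * 2 = 30

    let total = *shared.borrow();
    (total, Rc::strong_count(&shared))
}

fn main() {
    println!("{:?}", shared_total()); // prints (30, 3)
}
```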
## Reference Cycles Can Leak Memory
Rust’s memory safety guarantees make it difficult, but not impossible, to
accidentally create memory that is never cleaned up (known as a *memory leak*).
Preventing memory leaks entirely is not one of Rust’s guarantees, meaning
memory leaks are memory safe in Rust. We can see that Rust allows memory leaks
by using `Rc<T>` and `RefCell<T>`: it’s possible to create references where
items refer to each other in a cycle. This creates memory leaks because the
reference count of each item in the cycle will never reach 0, and the values
will never be dropped.
### Creating a Reference Cycle
Let’s look at how a reference cycle might happen and how to prevent it,
starting with the definition of the `List` enum and a `tail` method in Listing
15-25.
Filename: src/main.rs
```
use crate::List::{Cons, Nil};
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Debug)]
enum List {
    Cons(i32, RefCell<Rc<List>>), // [1]
    Nil,
}

impl List {
    fn tail(&self) -> Option<&RefCell<Rc<List>>> { // [2]
        match self {
            Cons(_, item) => Some(item),
            Nil => None,
        }
    }
}
```
Listing 15-25: A cons list definition that holds a `RefCell<T>` so we can
modify what a `Cons` variant is referring to
We’re using another variation of the `List` definition from Listing 15-5. The
second element in the `Cons` variant is now `RefCell<Rc<List>>` [1], meaning
that instead of having the ability to modify the `i32` value as we did in
Listing 15-24, we want to modify the `List` value a `Cons` variant is pointing
to. We’re also adding a `tail` method [2] to make it convenient for us to
access the second item if we have a `Cons` variant.
In Listing 15-26, we’re adding a `main` function that uses the definitions in
Listing 15-25. This code creates a list in `a` and a list in `b` that points to
the list in `a`. Then it modifies the list in `a` to point to `b`, creating a
reference cycle. There are `println!` statements along the way to show what the
reference counts are at various points in this process.
Filename: src/main.rs
```
fn main() {
    let a = Rc::new(Cons(5, RefCell::new(Rc::new(Nil)))); // [1]

    println!("a initial rc count = {}", Rc::strong_count(&a));
    println!("a next item = {:?}", a.tail());

    let b = Rc::new(Cons(10, RefCell::new(Rc::clone(&a)))); // [2]

    println!(
        "a rc count after b creation = {}",
        Rc::strong_count(&a)
    );
    println!("b initial rc count = {}", Rc::strong_count(&b));
    println!("b next item = {:?}", b.tail());

    if let Some(link) = a.tail() { // [3]
        *link.borrow_mut() = Rc::clone(&b); // [4]
    }

    println!(
        "b rc count after changing a = {}",
        Rc::strong_count(&b)
    );
    println!(
        "a rc count after changing a = {}",
        Rc::strong_count(&a)
    );

    // Uncomment the next line to see that we have a cycle;
    // it will overflow the stack
    // println!("a next item = {:?}", a.tail());
}
```
Listing 15-26: Creating a reference cycle of two `List` values pointing to each
other
We create an `Rc<List>` instance holding a `List` value in the variable `a`
with an initial list of `5, Nil` [1]. We then create an `Rc<List>` instance
holding another `List` value in the variable `b` that contains the value `10`
and points to the list in `a` [2].
We modify `a` so it points to `b` instead of `Nil`, creating a cycle. We do
that by using the `tail` method to get a reference to the `RefCell<Rc<List>>`
in `a`, which we put in the variable `link` [3]. Then we use the `borrow_mut`
method on the `RefCell<Rc<List>>` to change the value inside from an `Rc<List>`
that holds a `Nil` value to the `Rc<List>` in `b` [4].
When we run this code, keeping the last `println!` commented out for the
moment, we’ll get this output:
```
a initial rc count = 1
a next item = Some(RefCell { value: Nil })
a rc count after b creation = 2
b initial rc count = 1
b next item = Some(RefCell { value: Cons(5, RefCell { value: Nil }) })
b rc count after changing a = 2
a rc count after changing a = 2
```
The reference count of the `Rc<List>` instances in both `a` and `b` is 2 after
we change the list in `a` to point to `b`. At the end of `main`, Rust drops the
variable `b`, which decreases the reference count of the `b` `Rc<List>`
instance from 2 to 1. The memory that `Rc<List>` has on the heap won’t be
dropped at this point because its reference count is 1, not 0. Then Rust drops
`a`, which decreases the reference count of the `a` `Rc<List>` instance from 2
to 1 as well. This instance’s memory can’t be dropped either, because the other
`Rc<List>` instance still refers to it. The memory allocated to the list will
remain uncollected forever. To visualize this reference cycle, we’ve created a
diagram in Figure 15-4.
Figure 15-4: A reference cycle of lists `a` and `b` pointing to each other
If you uncomment the last `println!` and run the program, Rust will try to
print this cycle with `a` pointing to `b` pointing to `a` and so forth until it
overflows the stack.
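To observe the leak directly rather than infer it from the counts, we can give the values a `Drop` implementation that records when it runs. The following is a sketch under assumptions: it uses a simplified `Node` struct instead of the `List` enum, and the `drops` counter and `drops_observed` function are invented for this illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// A simplified stand-in for a list item; not the `List` enum from the listings.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
    drops: Rc<RefCell<u32>>, // shared counter bumped whenever a Node is dropped
}

impl Drop for Node {
    fn drop(&mut self) {
        *self.drops.borrow_mut() += 1;
    }
}

// Builds `b -> a`, optionally closes the cycle with `a -> b`, and returns
// how many of the two nodes were dropped once both variables went away.
fn drops_observed(make_cycle: bool) -> u32 {
    let drops = Rc::new(RefCell::new(0u32));
    {
        let a = Rc::new(Node {
            next: RefCell::new(None),
            drops: Rc::clone(&drops),
        });
        let b = Rc::new(Node {
            next: RefCell::new(Some(Rc::clone(&a))),
            drops: Rc::clone(&drops),
        });
        if make_cycle {
            *a.next.borrow_mut() = Some(Rc::clone(&b));
        }
    } // `b` and then `a` go out of scope here

    let count = *drops.borrow();
    count
}

fn main() {
    println!("without cycle: {} drops", drops_observed(false)); // 2
    println!("with cycle: {} drops", drops_observed(true)); // 0
}
```

With the cycle in place, each strong count only falls from 2 to 1, so neither `drop` method ever runs.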
Compared to a real-world program, the consequences of creating a reference
cycle in this example aren’t very dire: right after we create the reference
cycle, the program ends. However, if a more complex program allocated lots of
memory in a cycle and held onto it for a long time, the program would use more
memory than it needed and might overwhelm the system, causing it to run out of
available memory.
Creating reference cycles is not easily done, but it’s not impossible either.
If you have `RefCell<T>` values that contain `Rc<T>` values or similar nested
combinations of types with interior mutability and reference counting, you must
ensure that you don’t create cycles; you can’t rely on Rust to catch them.
Creating a reference cycle would be a logic bug in your program that you should
use automated tests, code reviews, and other software development practices to
minimize.
Another solution for avoiding reference cycles is reorganizing your data
structures so that some references express ownership and some references don’t.
As a result, you can have cycles made up of some ownership relationships and
some non-ownership relationships, and only the ownership relationships affect
whether or not a value can be dropped. In Listing 15-25, we always want `Cons`
variants to own their list, so reorganizing the data structure isn’t possible.
Let’s look at an example using graphs made up of parent nodes and child nodes
to see when non-ownership relationships are an appropriate way to prevent
reference cycles.
### Preventing Reference Cycles Using Weak<T>
So far, we’ve demonstrated that calling `Rc::clone` increases the
`strong_count` of an `Rc<T>` instance, and an `Rc<T>` instance is only cleaned
up if its `strong_count` is 0. You can also create a *weak reference* to the
value within an `Rc<T>` instance by calling `Rc::downgrade` and passing a
reference to the `Rc<T>`. Strong references are how you can share ownership of
an `Rc<T>` instance. Weak references don’t express an ownership relationship,
and their count doesn’t affect when an `Rc<T>` instance is cleaned up. They
won’t cause a reference cycle because any cycle involving some weak references
will be broken once the strong reference count of values involved is 0.
When you call `Rc::downgrade`, you get a smart pointer of type `Weak<T>`.
Instead of increasing the `strong_count` in the `Rc<T>` instance by 1, calling
`Rc::downgrade` increases the `weak_count` by 1. The `Rc<T>` type uses
`weak_count` to keep track of how many `Weak<T>` references exist, similar to
`strong_count`. The difference is the `weak_count` doesn’t need to be 0 for the
`Rc<T>` instance to be cleaned up.
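A short sketch of the two counters side by side (the `counts_after_downgrade` function name is only illustrative): creating `Weak<T>` pointers raises `weak_count` and leaves `strong_count` untouched.

```rust
use std::rc::{Rc, Weak};

// Returns (strong_count, weak_count) after creating two weak references.
fn counts_after_downgrade() -> (usize, usize) {
    let strong = Rc::new(String::from("shared"));

    // Each call to Rc::downgrade adds 1 to weak_count only.
    let _weak_one: Weak<String> = Rc::downgrade(&strong);
    let _weak_two: Weak<String> = Rc::downgrade(&strong);

    (Rc::strong_count(&strong), Rc::weak_count(&strong))
}

fn main() {
    println!("{:?}", counts_after_downgrade()); // prints (1, 2)
}
```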
Because the value that `Weak<T>` references might have been dropped, to do
anything with the value that a `Weak<T>` is pointing to you must make sure the
value still exists. Do this by calling the `upgrade` method on a `Weak<T>`
instance, which will return an `Option<Rc<T>>`. You’ll get a result of `Some`
if the `Rc<T>` value has not been dropped yet and a result of `None` if the
`Rc<T>` value has been dropped. Because `upgrade` returns an `Option<Rc<T>>`,
Rust will ensure that the `Some` case and the `None` case are handled, and
there won’t be an invalid pointer.
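The following sketch shows both outcomes of `upgrade`. The function name `upgrade_before_and_after_drop` is made up for this example; it copies the inner `i32` out of the temporary `Rc<T>` so that nothing keeps the value alive by accident.

```rust
use std::rc::{Rc, Weak};

// Returns what `upgrade` yields while the value is alive and after it is dropped.
fn upgrade_before_and_after_drop() -> (Option<i32>, Option<i32>) {
    let strong = Rc::new(42);
    let weak: Weak<i32> = Rc::downgrade(&strong);

    // The value still exists, so upgrade returns Some(Rc<i32>).
    let before = weak.upgrade().map(|rc| *rc);

    // Dropping the only strong reference cleans up the value.
    drop(strong);

    // The Weak pointer now has nothing to point to, so upgrade returns None.
    let after = weak.upgrade().map(|rc| *rc);

    (before, after)
}

fn main() {
    println!("{:?}", upgrade_before_and_after_drop()); // prints (Some(42), None)
}
```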
As an example, rather than using a list whose items know only about the next
item, we’ll create a tree whose items know about their children items *and*
their parent items.
#### Creating a Tree Data Structure: A Node with Child Nodes
To start, we’ll build a tree with nodes that know about their child nodes.
We’ll create a struct named `Node` that holds its own `i32` value as well as
references to its children `Node` values:
Filename: src/main.rs
```
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Debug)]
struct Node {
    value: i32,
    children: RefCell<Vec<Rc<Node>>>,
}
```
We want a `Node` to own its children, and we want to share that ownership with
variables so we can access each `Node` in the tree directly. To do this, we
define the `Vec<T>` items to be values of type `Rc<Node>`. We also want to
modify which nodes are children of another node, so we have a `RefCell<T>` in
`children` around the `Vec<Rc<Node>>`.
Next, we’ll use our struct definition and create one `Node` instance named
`leaf` with the value `3` and no children, and another instance named `branch`
with the value `5` and `leaf` as one of its children, as shown in Listing 15-27.
Filename: src/main.rs
```
fn main() {
    let leaf = Rc::new(Node {
        value: 3,
        children: RefCell::new(vec![]),
    });

    let branch = Rc::new(Node {
        value: 5,
        children: RefCell::new(vec![Rc::clone(&leaf)]),
    });
}
```
Listing 15-27: Creating a `leaf` node with no children and a `branch` node with
`leaf` as one of its children
We clone the `Rc<Node>` in `leaf` and store that in `branch`, meaning the
`Node` in `leaf` now has two owners: `leaf` and `branch`. We can get from
`branch` to `leaf` through `branch.children`, but there’s no way to get from
`leaf` to `branch`. The reason is that `leaf` has no reference to `branch` and
doesn’t know they’re related. We want `leaf` to know that `branch` is its
parent. We’ll do that next.
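Before adding that link, here is a sketch of the downward direction that already works. The hypothetical `sum_values` function walks from a node to its descendants through `children`; `tree_total` is likewise made up to build the same two-node tree.

```rust
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Debug)]
struct Node {
    value: i32,
    children: RefCell<Vec<Rc<Node>>>,
}

// Hypothetical helper: adds a node's value to the values of all its descendants.
fn sum_values(node: &Node) -> i32 {
    let children = node.children.borrow();
    let child_total: i32 = children.iter().map(|child| sum_values(child)).sum();
    node.value + child_total
}

// Builds the `leaf` and `branch` nodes described above and sums the tree.
fn tree_total() -> i32 {
    let leaf = Rc::new(Node {
        value: 3,
        children: RefCell::new(vec![]),
    });
    let branch = Rc::new(Node {
        value: 5,
        children: RefCell::new(vec![Rc::clone(&leaf)]),
    });
    sum_values(&branch)
}

fn main() {
    println!("total = {}", tree_total()); // prints total = 8
}
```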
#### Adding a Reference from a Child to Its Parent
To make the child node aware of its parent, we need to add a `parent` field to
our `Node` struct definition. The trouble is in deciding what the type of
`parent` should be. We know it can’t contain an `Rc<T>` because that would
create a reference cycle with `leaf.parent` pointing to `branch` and
`branch.children` pointing to `leaf`, which would cause their `strong_count`
values to never be 0.
Thinking about the relationships another way, a parent node should own its
children: if a parent node is dropped, its child nodes should be dropped as
well. However, a child should not own its parent: if we drop a child node, the
parent should still exist. This is a case for weak references!
So, instead of `Rc<T>`, we’ll make the type of `parent` use `Weak<T>`,
specifically a `RefCell<Weak<Node>>`. Now our `Node` struct definition looks
like this:
Filename: src/main.rs
```
use std::cell::RefCell;
use std::rc::{Rc, Weak};

#[derive(Debug)]
struct Node {
    value: i32,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}
```
A node will be able to refer to its parent node but doesn’t own its parent. In
Listing 15-28, we update `main` to use this new definition so the `leaf` node
will have a way to refer to its parent, `branch`.
Filename: src/main.rs
```
fn main() {
    let leaf = Rc::new(Node {
        value: 3,
        parent: RefCell::new(Weak::new()), // [1]
        children: RefCell::new(vec![]),
    });

    println!( // [2]
        "leaf parent = {:?}",
        leaf.parent.borrow().upgrade()
    );

    let branch = Rc::new(Node {
        value: 5,
        parent: RefCell::new(Weak::new()), // [3]
        children: RefCell::new(vec![Rc::clone(&leaf)]),
    });

    *leaf.parent.borrow_mut() = Rc::downgrade(&branch); // [4]

    println!( // [5]
        "leaf parent = {:?}",
        leaf.parent.borrow().upgrade()
    );
}
```
Listing 15-28: A `leaf` node with a weak reference to its parent node, `branch`
Creating the `leaf` node looks similar to Listing 15-27 with the exception of
the `parent` field: `leaf` starts out without a parent, so we create a new,
empty `Weak<Node>` reference instance [1].
At this point, when we try to get a reference to the parent of `leaf` by using
the `upgrade` method, we get a `None` value. We see this in the output from the
first `println!` statement [2]:
```
leaf parent = None
```
When we create the `branch` node, it will also have a new `Weak<Node>`
reference in the `parent` field [3] because `branch` doesn’t have a parent
node. We still have `leaf` as one of the children of `branch`. Once we have the
`Node` instance in `branch`, we can modify `leaf` to give it a `Weak<Node>`
reference to its parent [4]. We use the `borrow_mut` method on the
`RefCell<Weak<Node>>` in the `parent` field of `leaf`, and then we use the
`Rc::downgrade` function to create a `Weak<Node>` reference to `branch` from
the `Rc<Node>` in `branch`.
When we print the parent of `leaf` again [5], this time we’ll get a `Some`
variant holding `branch`: now `leaf` can access its parent! When we print
`leaf`, we also avoid the cycle that eventually ended in a stack overflow like
we had in Listing 15-26; the `Weak<Node>` references are printed as `(Weak)`:
```
leaf parent = Some(Node { value: 5, parent: RefCell { value: (Weak) },
children: RefCell { value: [Node { value: 3, parent: RefCell { value: (Weak) },
children: RefCell { value: [] } }] } })
```
The lack of infinite output indicates that this code didn’t create a reference
cycle. We can also tell this by looking at the values we get from calling
`Rc::strong_count` and `Rc::weak_count`.
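In code that uses such a tree, the upward lookup is often wrapped in a small helper. The sketch below assumes a hypothetical `parent_value` function that upgrades the weak pointer and reads the parent’s `value`, returning `None` when there is no parent.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    value: i32,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

// Hypothetical helper: the parent's value, if the parent still exists.
fn parent_value(node: &Node) -> Option<i32> {
    let parent = node.parent.borrow().upgrade();
    parent.map(|p| p.value)
}

// Looks up leaf's parent before and after `branch` is attached.
fn lookup_before_and_after() -> (Option<i32>, Option<i32>) {
    let leaf = Rc::new(Node {
        value: 3,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(vec![]),
    });
    let before = parent_value(&leaf);

    let branch = Rc::new(Node {
        value: 5,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(vec![Rc::clone(&leaf)]),
    });
    *leaf.parent.borrow_mut() = Rc::downgrade(&branch);
    let after = parent_value(&leaf);

    (before, after)
}

fn main() {
    println!("{:?}", lookup_before_and_after()); // prints (None, Some(5))
}
```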
#### Visualizing Changes to strong_count and weak_count
Let’s look at how the `strong_count` and `weak_count` values of the `Rc<Node>`
instances change by creating a new inner scope and moving the creation of
`branch` into that scope. By doing so, we can see what happens when `branch` is
created and then dropped when it goes out of scope. The modifications are shown
in Listing 15-29.
Filename: src/main.rs
```
fn main() {
    let leaf = Rc::new(Node {
        value: 3,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(vec![]),
    });

    println!( // [1]
        "leaf strong = {}, weak = {}",
        Rc::strong_count(&leaf),
        Rc::weak_count(&leaf),
    );

    { // [2]
        let branch = Rc::new(Node {
            value: 5,
            parent: RefCell::new(Weak::new()),
            children: RefCell::new(vec![Rc::clone(&leaf)]),
        });

        *leaf.parent.borrow_mut() = Rc::downgrade(&branch);

        println!( // [3]
            "branch strong = {}, weak = {}",
            Rc::strong_count(&branch),
            Rc::weak_count(&branch),
        );

        println!( // [4]
            "leaf strong = {}, weak = {}",
            Rc::strong_count(&leaf),
            Rc::weak_count(&leaf),
        );
    } // [5]

    println!( // [6]
        "leaf parent = {:?}",
        leaf.parent.borrow().upgrade()
    );
    println!( // [7]
        "leaf strong = {}, weak = {}",
        Rc::strong_count(&leaf),
        Rc::weak_count(&leaf),
    );
}
```
Listing 15-29: Creating `branch` in an inner scope and examining strong and
weak reference counts
After `leaf` is created, its `Rc<Node>` has a strong count of 1 and a weak
count of 0 [1]. In the inner scope [2], we create `branch` and associate it
with `leaf`, at which point when we print the counts [3], the `Rc<Node>` in
`branch` will have a strong count of 1 and a weak count of 1 (for `leaf.parent`
pointing to `branch` with a `Weak<Node>`). When we print the counts in `leaf`
[4], we’ll see it will have a strong count of 2 because `branch` now has a
clone of the `Rc<Node>` of `leaf` stored in `branch.children`, but will still
have a weak count of 0.
When the inner scope ends [5], `branch` goes out of scope and the strong count
of its `Rc<Node>` decreases to 0, so its `Node` is dropped. The weak count of 1
from `leaf.parent` has no bearing on whether or not that `Node` is dropped, so
we don’t get any memory leaks!
If we try to access the parent of `leaf` after the end of the scope, we’ll get
`None` again [6]. At the end of the program [7], the `Rc<Node>` in `leaf` has a
strong count of 1 and a weak count of 0 because the variable `leaf` is now the
only reference to the `Rc<Node>` again.
All of the logic that manages the counts and value dropping is built into
`Rc<T>` and `Weak<T>` and their implementations of the `Drop` trait. By
specifying that the relationship from a child to its parent should be a
`Weak<T>` reference in the definition of `Node`, you’re able to have parent
nodes point to child nodes and vice versa without creating a reference cycle
and memory leaks.
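To confirm that both nodes really are cleaned up, we can log each drop. This is a sketch under assumptions: the `name` and `log` fields, the `Drop` implementation, and the `drop_order` function are additions made for this illustration and are not part of the `Node` definition used in the listings.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    name: &'static str,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
    log: Rc<RefCell<Vec<&'static str>>>, // shared record of dropped nodes
}

impl Drop for Node {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Builds a branch with one leaf, links the leaf back with a Weak pointer,
// and returns the names of the nodes in the order they were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let leaf = Rc::new(Node {
            name: "leaf",
            parent: RefCell::new(Weak::new()),
            children: RefCell::new(vec![]),
            log: Rc::clone(&log),
        });
        let branch = Rc::new(Node {
            name: "branch",
            parent: RefCell::new(Weak::new()),
            children: RefCell::new(vec![Rc::clone(&leaf)]),
            log: Rc::clone(&log),
        });
        *leaf.parent.borrow_mut() = Rc::downgrade(&branch);
    } // `branch` is dropped first, then `leaf`

    let recorded = log.borrow().clone();
    recorded
}

fn main() {
    println!("{:?}", drop_order()); // prints ["branch", "leaf"]
}
```

Both names appear in the log, so the weak parent link did not keep either node alive.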
## Summary
This chapter covered how to use smart pointers to make different guarantees and
trade-offs from those Rust makes by default with regular references. The
`Box<T>` type has a known size and points to data allocated on the heap. The
`Rc<T>` type keeps track of the number of references to data on the heap so
that data can have multiple owners. The `RefCell<T>` type with its interior
mutability gives us a type that we can use when we need an immutable type but
need to change an inner value of that type; it also enforces the borrowing
rules at runtime instead of at compile time.
Also discussed were the `Deref` and `Drop` traits, which enable a lot of the
functionality of smart pointers. We explored reference cycles that can cause
memory leaks and how to prevent them using `Weak<T>`.
If this chapter has piqued your interest and you want to implement your own
smart pointers, check out “The Rustonomicon” at
*https://doc.rust-lang.org/stable/nomicon* for more useful information.
Next, we’ll talk about concurrency in Rust. You’ll even learn about a few new
smart pointers.