From 58325b93c5b6212697b088371809e9948fee8052 Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Tue, 24 Jan 2023 19:43:45 -0500 Subject: [PATCH] t5619: demonstrate clone_local() with ambiguous transport When cloning a repository, Git must determine (a) what transport mechanism to use, and (b) whether or not the clone is local. Since f38aa83f9a (use local cloning if insteadOf makes a local URL, 2014-07-17), the latter check happens after the remote has been initialized, and references the remote's URL instead of the local path. This is done to make it possible for a `url..insteadOf` rule to convert a remote URL into a local one, in which case the `clone_local()` mechanism should be used. However, with a specially crafted repository, Git can be tricked into using a non-local transport while still setting `is_local` to "1" and using the `clone_local()` optimization. The below test case demonstrates such an instance, and shows that it can be used to include arbitrary (known) paths in the working copy of a cloned repository on a victim's machine[^1], even if local file clones are forbidden by `protocol.file.allow`. This happens in a few parts: 1. We first call `get_repo_path()` to see if the remote is a local path. If it is, we replace the repo name with its absolute path. 2. We then call `transport_get()` on the repo name and decide how to access it. If it was turned into an absolute path in the previous step, then we should always treat it like a file. 3. We use `get_repo_path()` again, and set `is_local` as appropriate. But it's already too late to rewrite the repo name as an absolute path, since we've already fed it to the transport code. The attack works by including a submodule whose URL corresponds to a path on disk. In the below example, the repository "sub" is reachable via the dumb HTTP protocol at (something like): http://127.0.0.1:NNNN/dumb/sub.git However, the path "http:/127.0.0.1:NNNN/dumb" (that is, a top-level directory called "http:", then nested directories "127.0.0.1:NNNN", and "dumb") exists within the repository, too. To determine this, it first picks the appropriate transport, which is dumb HTTP. It then uses the remote's URL in order to determine whether the repository exists locally on disk. However, the malicious repository also contains an embedded stub repository which is the target of a symbolic link at the local path corresponding to the "sub" repository on disk (i.e., there is a symbolic link at "http:/127.0.0.1/dumb/sub.git", pointing to the stub repository via ".git/modules/sub/../../../repo"). This stub repository fools Git into thinking that a local repository exists at that URL and thus can be cloned locally. The affected call is in `get_repo_path()`, which in turn calls `get_repo_path_1()`, which locates a valid repository at that target. This then causes Git to set the `is_local` variable to "1", and in turn instructs Git to clone the repository using its local clone optimization via the `clone_local()` function. The exploit comes into play because the stub repository's top-level "$GIT_DIR/objects" directory is a symbolic link which can point to an arbitrary path on the victim's machine. `clone_local()` resolves the top-level "objects" directory through a `stat(2)` call, meaning that we read through the symbolic link and copy or hardlink the directory contents at the destination of the link. In other words, we can get steps (1) and (3) to disagree by leveraging the dangling symlink to pick a non-local transport in the first step, and then set is_local to "1" in the third step when cloning with `--separate-git-dir`, which makes the symlink non-dangling. This can result in data-exfiltration on the victim's machine when sensitive data is at a known path (e.g., "/home/$USER/.ssh"). The appropriate fix is two-fold: - Resolve the transport later on (to avoid using the local clone optimization with a non-local transport). - Avoid reading through the top-level "objects" directory when (correctly) using the clone_local() optimization. This patch merely demonstrates the issue. The following two patches will implement each part of the above fix, respectively. [^1]: Provided that any target directory does not contain symbolic links, in which case the changes from 6f054f9fb3 (builtin/clone.c: disallow `--local` clones with symlinks, 2022-07-28) will abort the clone. Reported-by: yvvdwf Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- t/t5619-clone-local-ambiguous-transport.sh | 63 ++++++++++++++++++++++ 1 file changed, 63 insertions(+) create mode 100755 t/t5619-clone-local-ambiguous-transport.sh diff --git a/t/t5619-clone-local-ambiguous-transport.sh b/t/t5619-clone-local-ambiguous-transport.sh new file mode 100755 index 0000000000..7ebd31a150 --- /dev/null +++ b/t/t5619-clone-local-ambiguous-transport.sh @@ -0,0 +1,63 @@ +#!/bin/sh + +test_description='test local clone with ambiguous transport' + +. ./test-lib.sh +. "$TEST_DIRECTORY/lib-httpd.sh" + +if ! test_have_prereq SYMLINKS +then + skip_all='skipping test, symlink support unavailable' + test_done +fi + +start_httpd + +REPO="$HTTPD_DOCUMENT_ROOT_PATH/sub.git" +URI="$HTTPD_URL/dumb/sub.git" + +test_expect_success 'setup' ' + mkdir -p sensitive && + echo "secret" >sensitive/secret && + + git init --bare "$REPO" && + test_commit_bulk -C "$REPO" --ref=main 1 && + + git -C "$REPO" update-ref HEAD main && + git -C "$REPO" update-server-info && + + git init malicious && + ( + cd malicious && + + git submodule add "$URI" && + + mkdir -p repo/refs && + touch repo/refs/.gitkeep && + printf "ref: refs/heads/a" >repo/HEAD && + ln -s "$(cd .. && pwd)/sensitive" repo/objects && + + mkdir -p "$HTTPD_URL/dumb" && + ln -s "../../../.git/modules/sub/../../../repo/" "$URI" && + + git add . && + git commit -m "initial commit" + ) && + + # Delete all of the references in our malicious submodule to + # avoid the client attempting to checkout any objects (which + # will be missing, and thus will cause the clone to fail before + # we can trigger the exploit). + git -C "$REPO" for-each-ref --format="delete %(refname)" >in && + git -C "$REPO" update-ref --stdin