diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-09 13:34:27 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-09 13:34:27 +0000 |
commit | 4dbdc42d9e7c3968ff7f690d00680419c9b8cb0f (patch) | |
tree | 47c1d492e9c956c1cd2b74dbd3b9d8b0db44dc4e /Documentation/howto | |
parent | Initial commit. (diff) | |
download | git-4dbdc42d9e7c3968ff7f690d00680419c9b8cb0f.tar.xz git-4dbdc42d9e7c3968ff7f690d00680419c9b8cb0f.zip |
Adding upstream version 1:2.43.0.upstream/1%2.43.0
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'Documentation/howto')
-rw-r--r-- | Documentation/howto/coordinate-embargoed-releases.txt | 246 | ||||
-rw-r--r-- | Documentation/howto/keep-canonical-history-correct.txt | 216 | ||||
-rw-r--r-- | Documentation/howto/maintain-git.txt | 480 | ||||
-rw-r--r-- | Documentation/howto/new-command.txt | 106 | ||||
-rw-r--r-- | Documentation/howto/rebase-from-internal-branch.txt | 164 | ||||
-rw-r--r-- | Documentation/howto/rebuild-from-update-hook.txt | 90 | ||||
-rw-r--r-- | Documentation/howto/recover-corrupted-blob-object.txt | 144 | ||||
-rw-r--r-- | Documentation/howto/recover-corrupted-object-harder.txt | 479 | ||||
-rw-r--r-- | Documentation/howto/revert-a-faulty-merge.txt | 273 | ||||
-rw-r--r-- | Documentation/howto/revert-branch-rebase.txt | 187 | ||||
-rw-r--r-- | Documentation/howto/separating-topic-branches.txt | 94 | ||||
-rw-r--r-- | Documentation/howto/setup-git-server-over-http.txt | 285 | ||||
-rw-r--r-- | Documentation/howto/update-hook-example.txt | 192 | ||||
-rw-r--r-- | Documentation/howto/use-git-daemon.txt | 54 | ||||
-rw-r--r-- | Documentation/howto/using-merge-subtree.txt | 75 | ||||
-rw-r--r-- | Documentation/howto/using-signed-tag-in-pull-request.txt | 217 |
16 files changed, 3302 insertions, 0 deletions
diff --git a/Documentation/howto/coordinate-embargoed-releases.txt b/Documentation/howto/coordinate-embargoed-releases.txt new file mode 100644 index 0000000..b9cb95e --- /dev/null +++ b/Documentation/howto/coordinate-embargoed-releases.txt @@ -0,0 +1,246 @@ +Content-type: text/asciidoc +Abstract: When a vulnerability is reported, we follow these guidelines to + assess the vulnerability, create and review a fix, and coordinate embargoed + security releases. + +How we coordinate embargoed releases +------------------------------------ + +To protect Git users from critical vulnerabilities, we do not just release +fixed versions like regular maintenance releases. Instead, we coordinate +releases with packagers, keeping the fixes under an embargo until the release +date. That way, users will have a chance to upgrade on that date, no matter +what Operating System or distribution they run. + +The `git-security` mailing list +------------------------------- + +Responsible disclosures of vulnerabilities, analysis, proposed fixes as +well as the orchestration of coordinated embargoed releases all happen on the +`git-security` mailing list at <git-security@googlegroups.com>. + +In this context, the term "embargo" refers to the time period that information +about a vulnerability is kept under wraps and only shared on a need-to-know +basis. This is necessary to protect Git's users from bad actors who would +otherwise be made aware of attack vectors that could be exploited. "Lifting the +embargo" refers to publishing the version that fixes the vulnerabilities. + +Audience of the `git-security` mailing list +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Anybody may contact the `git-security` mailing list by sending an email +to <git-security@googlegroups.com>, though the archive is closed to the +public and only accessible to subscribed members. + +There are a few dozen subscribed members: core Git developers who are trusted +with addressing vulnerabilities, and stakeholders (i.e. owners of products +affected by security vulnerabilities in Git). + +Most of the discussions revolve around assessing the severity of the reported +issue (including the decision whether the report is security-relevant or can be +redirected to the public mailing list), how to remediate the issue, determining +the timeline of the disclosure as well as aligning priorities and +requirements. + +Communications +~~~~~~~~~~~~~~ + +If you are a stakeholder, it is a good idea to pay close attention to the +discussions, as pertinent information may be buried in the middle of a lively +conversation that might not look relevant to your interests. For example, the +tentative timeline might be agreed upon in the middle of discussing code +comment formatting in one of the patches and whether or not to combine fixes +for multiple, separate vulnerabilities into the same embargoed release. Most +mail threads are not usually structured specifically to communicate +agreements, assessments or timelines. + +Typical timeline +---------------- + +- A potential vulnerability is reported to the `git-security` mailing list. + +- The members of the git-security list start a discussion to give an initial + assessment of the severity of the reported potential vulnerability. + We aspire to do so within a few days. + +- After discussion, if consensus is reached that it is not critical enough + to warrant any embargo, the reporter is redirected to the public Git mailing + list. This ends the reporter's interaction with the `git-security` list. + +- If it is deemed critical enough for an embargo, ideas are presented on how to + address the vulnerability. + +- Usually around that time, the Git maintainer or their delegate(s) open a draft + security advisory in the `git/git` repository on GitHub (see below for more + details). + +- Code review can take place in a variety of different locations, + depending on context. These are: patches sent inline on the git-security list, + a private fork on GitHub associated with the draft security advisory, or the + git/cabal repository. + +- Contributors working on a fix should consider beginning by sending + patches to the git-security list (inline with the original thread), since they + are accessible to all subscribers, along with the original reporter. + +- Once the review has settled and everyone involved in the review agrees that + the patches are nearing the finish line, the Git maintainer, and others + determine a release date as well as the release trains that are serviced. The + decision regarding which versions need a backported fix is based on input from + the reporter, the contributor who worked on the patches, and from + stakeholders. Operators of hosting sites who may want to analyze whether the + given issue is exploited via any of the repositories they host, and binary + packagers who want to make sure their product gets patched adequately against + the vulnerability, for example, may want to give their input at this stage. + +- While the Git community does its best to accommodate the specific timeline + requests of the various binary packagers, the nature of the issue may preclude + a prolonged release schedule. For fixes deemed urgent, it may be in the best + interest of the Git users community to shorten the disclosure and release + timeline, and packagers may need to adapt accordingly. + +- Subsequently, branches with the fixes are pushed to the git/cabal repository. + +- The tags are created by the Git maintainer and pushed to the same repository. + +- The Git for Windows, Git for macOS, BSD, Debian, etc. maintainers prepare the + corresponding release artifacts, based on the tags created that have been + prepared by the Git maintainer. + +- The release artifacts prepared by various binary packagers can be + made available to stakeholders under embargo via a mail to the + `git-security` list. + +- Less than a week before the release, a mail with the relevant information is + sent to <distros@vs.openwall.org> (see below), a list used to pre-announce + embargoed releases of open source projects to the stakeholders of all major + distributions of Linux as well as other OSes. + +- Public communication is then prepared in advance of the release date. This + includes blog posts and mails to the Git and Git for Windows mailing lists. + +- On the day of the release, at around 10am Pacific Time, the Git maintainer + pushes the tag and the `master` branch to the public repository, then sends + out an announcement mail. + +- Once the tag is pushed, the Git for Windows maintainer publishes the + corresponding tag and creates a GitHub Release with the associated release + artifacts (Git for Windows installer, Portable Git, MinGit, etc). + +- Git for Windows release is then announced via a mail to the public Git and + Git for Windows mailing lists as well as via a tweet. + +- Ditto for distribution packagers for Linux and other platforms: + their releases are announced via their preferred channels. + +- A mail to <oss-security@lists.openwall.org> (see below for details) is sent + as a follow-up to the <distros@vs.openwall.org> one, describing the + vulnerability in detail, often including a proof of concept of an exploit. + +Note: The Git project makes no guarantees about timelines, but aims to keep +embargoes reasonably short in the interest of keeping Git's users safe. + +Opening a Security Advisory draft +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The first step is to https://github.com/git/git/security/advisories/new[open +an advisory]. Technically, this is not necessary. However, it is the most +convenient way to obtain the CVE number and it gives us a private repository +associated with it that can be used to collaborate on a fix. + +Notifying the Linux distributions +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +At most two weeks before release date, we need to send a notification to +<distros@vs.openwall.org>, preferably less than 7 days before the release date. +This will reach most (all?) Linux distributions. See an example below, and the +guidelines for this mailing list at +https://oss-security.openwall.org/wiki/mailing-lists/distros#how-to-use-the-lists[here]. + +Once the version has been published, we send a note about that to oss-security. +As an example, see https://www.openwall.com/lists/oss-security/2019/12/13/1[the +v2.24.1 mail]; +https://oss-security.openwall.org/wiki/mailing-lists/oss-security[Here] are +their guidelines. + +The mail to oss-security should also describe the exploit, and give credit to +the reporter(s): security researchers still receive too little respect for the +invaluable service they provide, and public credit goes a long way to keep them +paid by their respective organizations. + +Technically, describing any exploit can be delayed up to 7 days, but we usually +refrain from doing that, including it right away. + +As a courtesy we typically attach a Git bundle (as `.tar.xz` because the list +will drop `.bundle` attachments) in the mail to distros@ so that the involved +parties can take care of integrating/backporting them. This bundle is typically +created using a command like this: + + git bundle create cve-xxx.bundle ^origin/master vA.B.C vD.E.F + tar cJvf cve-xxx.bundle.tar.xz cve-xxx.bundle + +Example mail to distros@vs.openwall.org +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.... +To: distros@vs.openwall.org +Cc: git-security@googlegroups.com, <other people involved in the report/fix> +Subject: [vs] Upcoming Git security fix release + +Team, + +The Git project will release new versions on <date> at 10am Pacific Time or +soon thereafter. I have attached a Git bundle (embedded in a `.tar.xz` to avoid +it being dropped) which you can fetch into a clone of +https://github.com/git/git via `git fetch --tags /path/to/cve-xxx.bundle`, +containing the tags for versions <versions>. + +You can verify with `git tag -v <tag>` that the versions were signed by +the Git maintainer, using the same GPG key as e.g. v2.24.0. + +Please use these tags to prepare `git` packages for your various +distributions, using the appropriate tagged versions. The added test cases +help verify the correctness. + +The addressed issues are: + +<list of CVEs with a short description, typically copy/pasted from Git's +release notes, usually demo exploit(s), too> + +Credit for finding the vulnerability goes to <reporter>, credit for fixing +it goes to <developer>. + +Thanks, +<name> + +.... + +Example mail to oss-security@lists.openwall.com +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.... +To: oss-security@lists.openwall.com +Cc: git-security@googlegroups.com, <other people involved in the report/fix> +Subject: git: <copy from security advisory> + +Team, + +The Git project released new versions on <date>, addressing <CVE>. + +All supported platforms are affected in one way or another, and all Git +versions all the way back to <version> are affected. The fixed versions are: +<versions>. + +Link to the announcement: <link to lore.kernel.org/git> + +We highly recommend to upgrade. + +The addressed issues are: +* <list of CVEs and their explanations, along with demo exploits> + +Credit for finding the vulnerability goes to <reporter>, credit for fixing +it goes to <developer>. + +Thanks, +<name> +....
\ No newline at end of file diff --git a/Documentation/howto/keep-canonical-history-correct.txt b/Documentation/howto/keep-canonical-history-correct.txt new file mode 100644 index 0000000..35d48ef --- /dev/null +++ b/Documentation/howto/keep-canonical-history-correct.txt @@ -0,0 +1,216 @@ +From: Junio C Hamano <gitster@pobox.com> +Date: Wed, 07 May 2014 13:15:39 -0700 +Subject: Beginner question on "Pull is mostly evil" +Abstract: This how-to explains a method for keeping a + project's history correct when using git pull. +Content-type: text/asciidoc + +Keep authoritative canonical history correct with git pull +========================================================== + +Sometimes a new project integrator will end up with project history +that appears to be "backwards" from what other project developers +expect. This howto presents a suggested integration workflow for +maintaining a central repository. + +Suppose that that central repository has this history: + +------------ + ---o---o---A +------------ + +which ends at commit `A` (time flows from left to right and each node +in the graph is a commit, lines between them indicating parent-child +relationship). + +Then you clone it and work on your own commits, which leads you to +have this history in *your* repository: + +------------ + ---o---o---A---B---C +------------ + +Imagine your coworker did the same and built on top of `A` in *his* +repository in the meantime, and then pushed it to the +central repository: + +------------ + ---o---o---A---X---Y---Z +------------ + +Now, if you `git push` at this point, because your history that leads +to `C` lacks `X`, `Y` and `Z`, it will fail. You need to somehow make +the tip of your history a descendant of `Z`. + +One suggested way to solve the problem is "fetch and then merge", aka +`git pull`. When you fetch, your repository will have a history like +this: + +------------ + ---o---o---A---B---C + \ + X---Y---Z +------------ + +Once you run merge after that, while still on *your* branch, i.e. `C`, +you will create a merge `M` and make the history look like this: + +------------ + ---o---o---A---B---C---M + \ / + X---Y---Z +------------ + +`M` is a descendant of `Z`, so you can push to update the central +repository. Such a merge `M` does not lose any commit in both +histories, so in that sense it may not be wrong, but when people want +to talk about "the authoritative canonical history that is shared +among the project participants", i.e. "the trunk", they often view +it as "commits you see by following the first-parent chain", and use +this command to view it: + +------------ + $ git log --first-parent +------------ + +For all other people who observed the central repository after your +coworker pushed `Z` but before you pushed `M`, the commit on the trunk +used to be `o-o-A-X-Y-Z`. But because you made `M` while you were on +`C`, `M`'s first parent is `C`, so by pushing `M` to advance the +central repository, you made `X-Y-Z` a side branch, not on the trunk. + +You would rather want to have a history of this shape: + +------------ + ---o---o---A---X---Y---Z---M' + \ / + B-----------C +------------ + +so that in the first-parent chain, it is clear that the project first +did `X` and then `Y` and then `Z` and merged a change that consists of +two commits `B` and `C` that achieves a single goal. You may have +worked on fixing the bug #12345 with these two patches, and the merge +`M'` with swapped parents can say in its log message "Merge +fix-bug-12345". Having a way to tell `git pull` to create a merge +but record the parents in reverse order may be a way to do so. + +Note that I said "achieves a single goal" above, because this is +important. "Swapping the merge order" only covers a special case +where the project does not care too much about having unrelated +things done on a single merge but cares a lot about first-parent +chain. + +There are multiple schools of thought about the "trunk" management. + + 1. Some projects want to keep a completely linear history without any + merges. Obviously, swapping the merge order would not match their + taste. You would need to flatten your history on top of the + updated upstream to result in a history of this shape instead: ++ +------------ + ---o---o---A---X---Y---Z---B---C +------------ ++ +with `git pull --rebase` or something. + + 2. Some projects tolerate merges in their history, but do not worry + too much about the first-parent order, and allow fast-forward + merges. To them, swapping the merge order does not hurt, but + it is unnecessary. + + 3. Some projects want each commit on the "trunk" to do one single + thing. The output of `git log --first-parent` in such a project + would show either a merge of a side branch that completes a single + theme, or a single commit that completes a single theme by itself. + If your two commits `B` and `C` (or they may even be two groups of + commits) were solving two independent issues, then the merge `M'` + we made in the earlier example by swapping the merge order is + still not up to the project standard. It merges two unrelated + efforts `B` and `C` at the same time. + +For projects in the last category (Git itself is one of them), +individual developers would want to prepare a history more like +this: + +------------ + C0--C1--C2 topic-c + / + ---o---o---A master + \ + B0--B1--B2 topic-b +------------ + +That is, keeping separate topics on separate branches, perhaps like +so: + +------------ + $ git clone $URL work && cd work + $ git checkout -b topic-b master + $ ... work to create B0, B1 and B2 to complete one theme + $ git checkout -b topic-c master + $ ... same for the theme of topic-c +------------ + +And then + +------------ + $ git checkout master + $ git pull --ff-only +------------ + +would grab `X`, `Y` and `Z` from the upstream and advance your master +branch: + +------------ + C0--C1--C2 topic-c + / + ---o---o---A---X---Y---Z master + \ + B0--B1--B2 topic-b +------------ + +And then you would merge these two branches separately: + +------------ + $ git merge topic-b + $ git merge topic-c +------------ + +to result in + +------------ + C0--C1---------C2 + / \ + ---o---o---A---X---Y---Z---M---N + \ / + B0--B1-----B2 +------------ + +and push it back to the central repository. + +It is very much possible that while you are merging topic-b and +topic-c, somebody again advanced the history in the central repository +to put `W` on top of `Z`, and make your `git push` fail. + +In such a case, you would rewind to discard `M` and `N`, update the +tip of your 'master' again and redo the two merges: + +------------ + $ git reset --hard origin/master + $ git pull --ff-only + $ git merge topic-b + $ git merge topic-c +------------ + +The procedure will result in a history that looks like this: + +------------ + C0--C1--------------C2 + / \ + ---o---o---A---X---Y---Z---W---M'--N' + \ / + B0--B1---------B2 +------------ + +See also http://git-blame.blogspot.com/2013/09/fun-with-first-parent-history.html diff --git a/Documentation/howto/maintain-git.txt b/Documentation/howto/maintain-git.txt new file mode 100644 index 0000000..013014b --- /dev/null +++ b/Documentation/howto/maintain-git.txt @@ -0,0 +1,480 @@ +From: Junio C Hamano <gitster@pobox.com> +Date: Wed, 21 Nov 2007 16:32:55 -0800 +Subject: Addendum to "MaintNotes" +Abstract: Imagine that Git development is racing along as usual, when our friendly + neighborhood maintainer is struck down by a wayward bus. Out of the + hordes of suckers (loyal developers), you have been tricked (chosen) to + step up as the new maintainer. This howto will show you "how to" do it. +Content-type: text/asciidoc + +How to maintain Git +=================== + +Activities +---------- + +The maintainer's Git time is spent on three activities. + + - Communication (45%) + + Mailing list discussions on general design, fielding user + questions, diagnosing bug reports; reviewing, commenting on, + suggesting alternatives to, and rejecting patches. + + - Integration (50%) + + Applying new patches from the contributors while spotting and + correcting minor mistakes, shuffling the integration and + testing branches, pushing the results out, cutting the + releases, and making announcements. + + - Own development (5%) + + Scratching my own itch and sending proposed patch series out. + +The Policy +---------- + +The policy on Integration is informally mentioned in "A Note +from the maintainer" message, which is periodically posted to +this mailing list after each feature release is made. + + - Feature releases are numbered as vX.Y.0 and are meant to + contain bugfixes and enhancements in any area, including + functionality, performance and usability, without regression. + + - One release cycle for a feature release is expected to last for + eight to ten weeks. + + - Maintenance releases are numbered as vX.Y.Z and are meant + to contain only bugfixes for the corresponding vX.Y.0 feature + release and earlier maintenance releases vX.Y.W (W < Z). + + - 'master' branch is used to prepare for the next feature + release. In other words, at some point, the tip of 'master' + branch is tagged with vX.Y.0. + + - 'maint' branch is used to prepare for the next maintenance + release. After the feature release vX.Y.0 is made, the tip + of 'maint' branch is set to that release, and bugfixes will + accumulate on the branch, and at some point, the tip of the + branch is tagged with vX.Y.1, vX.Y.2, and so on. + + - 'next' branch is used to publish changes (both enhancements + and fixes) that (1) have worthwhile goal, (2) are in a fairly + good shape suitable for everyday use, (3) but have not yet + demonstrated to be regression free. New changes are tested + in 'next' before merged to 'master'. + + - 'seen' branch is used to publish other proposed changes that do + not yet pass the criteria set for 'next'. + + - The tips of 'master' and 'maint' branches will not be rewound to + allow people to build their own customization on top of them. + Early in a new development cycle, 'next' is rewound to the tip of + 'master' once, but otherwise it will not be rewound until the end + of the cycle. + + - Usually 'master' contains all of 'maint' and 'next' contains all + of 'master'. 'seen' contains all the topics merged to 'next', but + is rebuilt directly on 'master'. + + - The tip of 'master' is meant to be more stable than any + tagged releases, and the users are encouraged to follow it. + + - The 'next' branch is where new action takes place, and the + users are encouraged to test it so that regressions and bugs + are found before new topics are merged to 'master'. + +Note that before v1.9.0 release, the version numbers used to be +structured slightly differently. vX.Y.Z were feature releases while +vX.Y.Z.W were maintenance releases for vX.Y.Z. + + +A Typical Git Day +----------------- + +A typical Git day for the maintainer implements the above policy +by doing the following: + + - Scan mailing list. Respond with review comments, suggestions + etc. Kibitz. Collect potentially usable patches from the + mailing list. Patches about a single topic go to one mailbox (I + read my mail in Gnus, and type \C-o to save/append messages in + files in mbox format). + + - Write his own patches to address issues raised on the list but + nobody has stepped up to solve. Send it out just like other + contributors do, and pick them up just like patches from other + contributors (see above). + + - Review the patches in the saved mailboxes. Edit proposed log + message for typofixes and clarifications, and add Acks + collected from the list. Edit patch to incorporate "Oops, + that should have been like this" fixes from the discussion. + + - Classify the collected patches and handle 'master' and + 'maint' updates: + + - Obviously correct fixes that pertain to the tip of 'maint' + are directly applied to 'maint'. + + - Obviously correct fixes that pertain to the tip of 'master' + are directly applied to 'master'. + + - Other topics are not handled in this step. + + This step is done with "git am". + + $ git checkout master ;# or "git checkout maint" + $ git am -sc3 mailbox + $ make test + + In practice, almost no patch directly goes to 'master' or + 'maint'. + + - Review the last issue of "What's cooking" message, review the + topics ready for merging (topic->master and topic->maint). Use + "Meta/cook -w" script (where Meta/ contains a checkout of the + 'todo' branch) to aid this step. + + And perform the merge. Use "Meta/Reintegrate -e" script (see + later) to aid this step. + + $ Meta/cook -w last-issue-of-whats-cooking.mbox + + $ git checkout master ;# or "git checkout maint" + $ echo ai/topic | Meta/Reintegrate -e ;# "git merge ai/topic" + $ git log -p ORIG_HEAD.. ;# final review + $ git diff ORIG_HEAD.. ;# final review + $ make test ;# final review + + - Handle the remaining patches: + + - Anything unobvious that is applicable to 'master' (in other + words, does not depend on anything that is still in 'next' + and not in 'master') is applied to a new topic branch that + is forked from the tip of 'master' (or the last feature release, + which is a bit older than 'master'). This includes both + enhancements and unobvious fixes to 'master'. A topic + branch is named as ai/topic where "ai" is two-letter string + named after author's initial and "topic" is a descriptive name + of the topic (in other words, "what's the series is about"). + + - An unobvious fix meant for 'maint' is applied to a new + topic branch that is forked from the tip of 'maint' (or the + oldest and still relevant maintenance branch). The + topic may be named as ai/maint-topic. + + - Changes that pertain to an existing topic are applied to + the branch, but: + + - obviously correct ones are applied first; + + - questionable ones are discarded or applied to near the tip; + + - Replacement patches to an existing topic are accepted only + for commits not in 'next'. + + The initial round is done with: + + $ git checkout ai/topic ;# or "git checkout -b ai/topic master" + $ git am -sc3 mailbox + + and replacing an existing topic with subsequent round is done with: + + $ git checkout master...ai/topic ;# try to reapply to the same base + $ git am -sc3 mailbox + + to prepare the new round on a detached HEAD, and then + + $ git range-diff @{-1}... + $ git diff @{-1} + + to double check what changed since the last round, and finally + + $ git checkout -B @{-1} + + to conclude (the last step is why a topic already in 'next' is + not replaced but updated incrementally). + + Whether it is the initial round or a subsequent round, the topic + may not build even in isolation, or may break the build when + merged to integration branches due to bugs. There may already + be obvious and trivial improvements suggested on the list. The + maintainer often adds an extra commit, with "SQUASH???" in its + title, to fix things up, before publishing the integration + branches to make it usable by other developers for testing. + These changes are what the maintainer is not 100% committed to + (trivial typofixes etc. are often squashed directly into the + patches that need fixing, without being applied as a separate + "SQUASH???" commit), so that they can be removed easily as needed. + + + - Merge maint to master as needed: + + $ git checkout master + $ git merge maint + $ make test + + - Merge master to next as needed: + + $ git checkout next + $ git merge master + $ make test + + - Review the last issue of "What's cooking" again and see if topics + that are ready to be merged to 'next' are still in good shape + (e.g. has there any new issue identified on the list with the + series?) + + - Prepare 'jch' branch, which is used to represent somewhere + between 'master' and 'seen' and often is slightly ahead of 'next'. + + $ Meta/Reintegrate master..jch >Meta/redo-jch.sh + + The result is a script that lists topics to be merged in order to + rebuild 'seen' as the input to Meta/Reintegrate script. Remove + later topics that should not be in 'jch' yet. Add a line that + consists of '### match next' before the name of the first topic + in the output that should be in 'jch' but not in 'next' yet. + + - Now we are ready to start merging topics to 'next'. For each + branch whose tip is not merged to 'next', one of three things can + happen: + + - The commits are all next-worthy; merge the topic to next; + - The new parts are of mixed quality, but earlier ones are + next-worthy; merge the early parts to next; + - Nothing is next-worthy; do not do anything. + + This step is aided with Meta/redo-jch.sh script created earlier. + If a topic that was already in 'next' gained a patch, the script + would list it as "ai/topic~1". To include the new patch to the + updated 'next', drop the "~1" part; to keep it excluded, do not + touch the line. If a topic that was not in 'next' should be + merged to 'next', add it at the end of the list. Then: + + $ git checkout -B jch master + $ sh Meta/redo-jch.sh -c1 + + to rebuild the 'jch' branch from scratch. "-c1" tells the script + to stop merging at the first line that begins with '###' + (i.e. the "### match next" line you added earlier). + + At this point, build-test the result. It may reveal semantic + conflicts (e.g. a topic renamed a variable, another added a new + reference to the variable under its old name), in which case + prepare an appropriate merge-fix first (see appendix), and + rebuild the 'jch' branch from scratch, starting at the tip of + 'master'. + + Then do the same to 'next' + + $ git checkout next + $ sh Meta/redo-jch.sh -c1 -e + + The "-e" option allows the merge message that comes from the + history of the topic and the comments in the "What's cooking" to + be edited. The resulting tree should match 'jch' as the same set + of topics are merged on 'master'; otherwise there is a mismerge. + Investigate why and do not proceed until the mismerge is found + and rectified. + + $ git diff jch next + + Then build the rest of 'jch': + + $ git checkout jch + $ sh Meta/redo-jch.sh + + When all is well, clean up the redo-jch.sh script with + + $ sh Meta/redo-jch.sh -u + + This removes topics listed in the script that have already been + merged to 'master'. This may lose '### match next' marker; + add it again to the appropriate place when it happens. + + - Rebuild 'seen'. + + $ Meta/Reintegrate jch..seen >Meta/redo-seen.sh + + Edit the result by adding new topics that are not still in 'seen' + in the script. Then + + $ git checkout -B seen jch + $ sh Meta/redo-seen.sh + + When all is well, clean up the redo-seen.sh script with + + $ sh Meta/redo-seen.sh -u + + Double check by running + + $ git branch --no-merged seen + + to see there is no unexpected leftover topics. + + At this point, build-test the result for semantic conflicts, and + if there are, prepare an appropriate merge-fix first (see + appendix), and rebuild the 'seen' branch from scratch, starting at + the tip of 'jch'. + + - Update "What's cooking" message to review the updates to + existing topics, newly added topics and graduated topics. + + This step is helped with Meta/cook script. + + $ Meta/cook + + This script inspects the history between master..seen, finds tips + of topic branches, compares what it found with the current + contents in Meta/whats-cooking.txt, and updates that file. + Topics not listed in the file but are found in master..seen are + added to the "New topics" section, topics listed in the file that + are no longer found in master..seen are moved to the "Graduated to + master" section, and topics whose commits changed their states + (e.g. used to be only in 'seen', now merged to 'next') are updated + with change markers "<<" and ">>". + + Look for lines enclosed in "<<" and ">>"; they hold contents from + old file that are replaced by this integration round. After + verifying them, remove the old part. Review the description for + each topic and update its doneness and plan as needed. To review + the updated plan, run + + $ Meta/cook -w + + which will pick up comments given to the topics, such as "Will + merge to 'next'", etc. (see Meta/cook script to learn what kind + of phrases are supported). + + - Compile, test and install all four (five) integration branches; + Meta/Dothem script may aid this step. + + - Format documentation if the 'master' branch was updated; + Meta/dodoc.sh script may aid this step. + + - Push the integration branches out to public places; Meta/pushall + script may aid this step. + +Observations +------------ + +Some observations to be made. + + * Each topic is tested individually, and also together with other + topics cooking first in 'seen', then in 'jch' and then in 'next'. + Until it matures, no part of it is merged to 'master'. + + * A topic already in 'next' can get fixes while still in + 'next'. Such a topic will have many merges to 'next' (in + other words, "git log --first-parent next" will show many + "Merge branch 'ai/topic' to next" for the same topic. + + * An unobvious fix for 'maint' is cooked in 'next' and then + merged to 'master' to make extra sure it is Ok and then + merged to 'maint'. + + * Even when 'next' becomes empty (in other words, all topics + prove stable and are merged to 'master' and "git diff master + next" shows empty), it has tons of merge commits that will + never be in 'master'. + + * In principle, "git log --first-parent master..next" should + show nothing but merges (in practice, there are fixup commits + and reverts that are not merges). + + * Commits near the tip of a topic branch that are not in 'next' + are fair game to be discarded, replaced or rewritten. + Commits already merged to 'next' will not be. + + * Being in the 'next' branch is not a guarantee for a topic to + be included in the next feature release. Being in the + 'master' branch typically is. + + * Due to the nature of "SQUASH???" fix-ups, if the original author + agrees with the suggested changes, it is OK to squash them to + appropriate patches in the next round (when the suggested change + is small enough, the author should not even bother with + "Helped-by"). It is also OK to drop them from the next round + when the original author does not agree with the suggestion, but + the author is expected to say why somewhere in the discussion. + + +Appendix +-------- + +Preparing a "merge-fix" +~~~~~~~~~~~~~~~~~~~~~~~ + +A merge of two topics may not textually conflict but still have +conflict at the semantic level. A classic example is for one topic +to rename a variable and all its uses, while another topic adds a +new use of the variable under its old name. When these two topics +are merged together, the reference to the variable newly added by +the latter topic will still use the old name in the result. + +The Meta/Reintegrate script that is used by redo-jch and redo-seen +scripts implements a crude but usable way to work around this issue. +When the script merges branch $X, it checks if "refs/merge-fix/$X" +exists, and if so, the effect of it is squashed into the result of +the mechanical merge. In other words, + + $ echo $X | Meta/Reintegrate + +is roughly equivalent to this sequence: + + $ git merge --rerere-autoupdate $X + $ git commit + $ git cherry-pick -n refs/merge-fix/$X + $ git commit --amend + +The goal of this "prepare a merge-fix" step is to come up with a +commit that can be squashed into a result of mechanical merge to +correct semantic conflicts. + +After finding that the result of merging branch "ai/topic" to an +integration branch had such a semantic conflict, say seen~4, check the +problematic merge out on a detached HEAD, edit the working tree to +fix the semantic conflict, and make a separate commit to record the +fix-up: + + $ git checkout seen~4 + $ git show -s --pretty=%s ;# double check + Merge branch 'ai/topic' to seen + $ edit + $ git commit -m 'merge-fix/ai/topic' -a + +Then make a reference "refs/merge-fix/ai/topic" to point at this +result: + + $ git update-ref refs/merge-fix/ai/topic HEAD + +Then double check the result by asking Meta/Reintegrate to redo the +merge: + + $ git checkout seen~5 ;# the parent of the problem merge + $ echo ai/topic | Meta/Reintegrate + $ git diff seen~4 + +This time, because you prepared refs/merge-fix/ai/topic, the +resulting merge should have been tweaked to include the fix for the +semantic conflict. + +Note that this assumes that the order in which conflicting branches +are merged does not change. If the reason why merging ai/topic +branch needs this merge-fix is because another branch merged earlier +to the integration branch changed the underlying assumption ai/topic +branch made (e.g. ai/topic branch added a site to refer to a +variable, while the other branch renamed that variable and adjusted +existing use sites), and if you changed redo-jch (or redo-seen) script +to merge ai/topic branch before the other branch, then the above +merge-fix should not be applied while merging ai/topic, but should +instead be applied while merging the other branch. You would need +to move the fix to apply to the other branch, perhaps like this: + + $ mf=refs/merge-fix + $ git update-ref $mf/$the_other_branch $mf/ai/topic + $ git update-ref -d $mf/ai/topic diff --git a/Documentation/howto/new-command.txt b/Documentation/howto/new-command.txt new file mode 100644 index 0000000..880c511 --- /dev/null +++ b/Documentation/howto/new-command.txt @@ -0,0 +1,106 @@ +From: Eric S. Raymond <esr@thyrsus.com> +Abstract: This is how-to documentation for people who want to add extension + commands to Git. It should be read alongside builtin.h. +Content-type: text/asciidoc + +How to integrate new subcommands +================================ + +This is how-to documentation for people who want to add extension +commands to Git. It should be read alongside builtin.h. + +Runtime environment +------------------- + +Git subcommands are standalone executables that live in the Git exec +path, normally /usr/lib/git-core. The git executable itself is a +thin wrapper that knows where the subcommands live, and runs them by +passing command-line arguments to them. + +(If "git foo" is not found in the Git exec path, the wrapper +will look in the rest of your $PATH for it. Thus, it's possible +to write local Git extensions that don't live in system space.) + +Implementation languages +------------------------ + +Most subcommands are written in C or shell. A few are written in +Perl. + +While we strongly encourage coding in portable C for portability, +these specific scripting languages are also acceptable. We won't +accept more without a very strong technical case, as we don't want +to broaden the Git suite's required dependencies. Import utilities, +surgical tools, remote helpers and other code at the edges of the +Git suite are more lenient and we allow Python (and even Tcl/tk), +but they should not be used for core functions. + +This may change in the future. Especially Python is not allowed in +core because we need better Python integration in the Git Windows +installer before we can be confident people in that environment +won't experience an unacceptably large loss of capability. + +C commands are normally written as single modules, named after the +command, that link a collection of functions called libgit. Thus, +your command 'git-foo' would normally be implemented as a single +"git-foo.c" (or "builtin/foo.c" if it is to be linked to the main +binary); this organization makes it easy for people reading the code +to find things. + +See the CodingGuidelines document for other guidance on what we consider +good practice in C and shell, and api-builtin.txt for the support +functions available to built-in commands written in C. + +What every extension command needs +---------------------------------- + +You must have a man page, written in asciidoc (this is what Git help +followed by your subcommand name will display). Be aware that there is +a local asciidoc configuration and macros which you should use. It's +often helpful to start by cloning an existing page and replacing the +text content. + +You must have a test, written to report in TAP (Test Anything Protocol). +Tests are executables (usually shell scripts) that live in the 't' +subdirectory of the tree. Each test name begins with 't' and a sequence +number that controls where in the test sequence it will be executed; +conventionally the rest of the name stem is that of the command +being tested. + +Read the file t/README to learn more about the conventions to be used +in writing tests, and the test support library. + +Integrating a command +--------------------- + +Here are the things you need to do when you want to merge a new +subcommand into the Git tree. + +1. Don't forget to sign off your patch! + +2. Append your command name to one of the variables BUILTIN_OBJS, +EXTRA_PROGRAMS, SCRIPT_SH, SCRIPT_PERL or SCRIPT_PYTHON. + +3. Drop its test in the t directory. + +4. If your command is implemented in an interpreted language with a +p-code intermediate form, make sure .gitignore in the main directory +includes a pattern entry that ignores such files. Python .pyc and +.pyo files will already be covered. + +5. If your command has any dependency on a particular version of +your language, document it in the INSTALL file. + +6. There is a file command-list.txt in the distribution main directory +that categorizes commands by type, so they can be listed in appropriate +subsections in the documentation's summary command list. Add an entry +for yours. To understand the categories, look at command-list.txt +in the main directory. If the new command is part of the typical Git +workflow and you believe it common enough to be mentioned in 'git help', +map this command to a common group in the column [common]. + +7. Give the maintainer one paragraph to include in the RelNotes file +to describe the new feature; a good place to do so is in the cover +letter [PATCH 0/n]. + +That's all there is to it. diff --git a/Documentation/howto/rebase-from-internal-branch.txt b/Documentation/howto/rebase-from-internal-branch.txt new file mode 100644 index 0000000..f2e10a7 --- /dev/null +++ b/Documentation/howto/rebase-from-internal-branch.txt @@ -0,0 +1,164 @@ +From: Junio C Hamano <gitster@pobox.com> +To: git@vger.kernel.org +Cc: Petr Baudis <pasky@suse.cz>, Linus Torvalds <torvalds@osdl.org> +Subject: Re: sending changesets from the middle of a git tree +Date: Sun, 14 Aug 2005 18:37:39 -0700 +Abstract: In this article, JC talks about how he rebases the + public "seen" branch using the core Git tools when he updates + the "master" branch, and how "rebase" works. Also discussed + is how this applies to individual developers who sends patches + upstream. +Content-type: text/asciidoc + +How to rebase from an internal branch +===================================== + +-------------------------------------- +Petr Baudis <pasky@suse.cz> writes: + +> Dear diary, on Sun, Aug 14, 2005 at 09:57:13AM CEST, I got a letter +> where Junio C Hamano <junkio@cox.net> told me that... +>> Linus Torvalds <torvalds@osdl.org> writes: +>> +>> > Junio, maybe you want to talk about how you move patches from your +>> > "seen" branch to the real branches. +>> +> Actually, wouldn't this be also precisely for what StGIT is intended to? +-------------------------------------- + +Exactly my feeling. I was sort of waiting for Catalin to speak +up. With its basing philosophical ancestry on quilt, this is +the kind of task StGIT is designed to do. + +I just have done a simpler one, this time using only the core +Git tools. + +I had a handful of commits that were ahead of master in 'seen', and I +wanted to add some documentation bypassing my usual habit of +placing new things in 'seen' first. At the beginning, the commit +ancestry graph looked like this: + + *"seen" head + master --> #1 --> #2 --> #3 + +So I started from master, made a bunch of edits, and committed: + + $ git checkout master + $ cd Documentation; ed git.txt ... + $ cd ..; git add Documentation/*.txt + $ git commit -s + +After the commit, the ancestry graph would look like this: + + *"seen" head + master^ --> #1 --> #2 --> #3 + \ + \---> master + +The old master is now master^ (the first parent of the master). +The new master commit holds my documentation updates. + +Now I have to deal with "seen" branch. + +This is the kind of situation I used to have all the time when +Linus was the maintainer and I was a contributor, when you look +at "master" branch being the "maintainer" branch, and "seen" +branch being the "contributor" branch. Your work started at the +tip of the "maintainer" branch some time ago, you made a lot of +progress in the meantime, and now the maintainer branch has some +other commits you do not have yet. And "git rebase" was written +with the explicit purpose of helping to maintain branches like +"seen". You _could_ merge master to 'seen' and keep going, but if you +eventually want to cherrypick and merge some but not necessarily +all changes back to the master branch, it often makes later +operations for _you_ easier if you rebase (i.e. carry forward +your changes) "seen" rather than merge. So I ran "git rebase": + + $ git checkout seen + $ git rebase master seen + +What this does is to pick all the commits since the current +branch (note that I now am on "seen" branch) forked from the +master branch, and forward port these changes. + + master^ --> #1 --> #2 --> #3 + \ *"seen" head + \---> master --> #1' --> #2' --> #3' + +The diff between master^ and #1 is applied to master and +committed to create #1' commit with the commit information (log, +author and date) taken from commit #1. On top of that #2' and #3' +commits are made similarly out of #2 and #3 commits. + +Old #3 is not recorded in any of the .git/refs/heads/ file +anymore, so after doing this you will have dangling commit if +you ran fsck-cache, which is normal. After testing "seen", you +can run "git prune" to get rid of those original three commits. + +While I am talking about "git rebase", I should talk about how +to do cherrypicking using only the core Git tools. + +Let's go back to the earlier picture, with different labels. + +You, as an individual developer, cloned upstream repository and +made a couple of commits on top of it. + + *your "master" head + upstream --> #1 --> #2 --> #3 + +You would want changes #2 and #3 incorporated in the upstream, +while you feel that #1 may need further improvements. So you +prepare #2 and #3 for e-mail submission. + + $ git format-patch master^^ master + +This creates two files, 0001-XXXX.patch and 0002-XXXX.patch. Send +them out "To: " your project maintainer and "Cc: " your mailing +list. You could use contributed script git-send-email if +your host has necessary perl modules for this, but your usual +MUA would do as long as it does not corrupt whitespaces in the +patch. + +Then you would wait, and you find out that the upstream picked +up your changes, along with other changes. + + where *your "master" head + upstream --> #1 --> #2 --> #3 + used \ + to be \--> #A --> #2' --> #3' --> #B --> #C + *upstream head + +The two commits #2' and #3' in the above picture record the same +changes your e-mail submission for #2 and #3 contained, but +probably with the new sign-off line added by the upstream +maintainer and definitely with different committer and ancestry +information, they are different objects from #2 and #3 commits. + +You fetch from upstream, but not merge. + + $ git fetch upstream + +This leaves the updated upstream head in .git/FETCH_HEAD but +does not touch your .git/HEAD or .git/refs/heads/master. +You run "git rebase" now. + + $ git rebase FETCH_HEAD master + +Earlier, I said that rebase applies all the commits from your +branch on top of the upstream head. Well, I lied. "git rebase" +is a bit smarter than that and notices that #2 and #3 need not +be applied, so it only applies #1. The commit ancestry graph +becomes something like this: + + where *your old "master" head + upstream --> #1 --> #2 --> #3 + used \ your new "master" head* + to be \--> #A --> #2' --> #3' --> #B --> #C --> #1' + *upstream + head + +Again, "git prune" would discard the disused commits #1-#3 and +you continue on starting from the new "master" head, which is +the #1' commit. + +-jc diff --git a/Documentation/howto/rebuild-from-update-hook.txt b/Documentation/howto/rebuild-from-update-hook.txt new file mode 100644 index 0000000..db219f5 --- /dev/null +++ b/Documentation/howto/rebuild-from-update-hook.txt @@ -0,0 +1,90 @@ +Subject: [HOWTO] Using post-update hook +Message-ID: <7vy86o6usx.fsf@assigned-by-dhcp.cox.net> +From: Junio C Hamano <gitster@pobox.com> +Date: Fri, 26 Aug 2005 18:19:10 -0700 +Abstract: In this how-to article, JC talks about how he + uses the post-update hook to automate Git documentation page + shown at https://www.kernel.org/pub/software/scm/git/docs/. +Content-type: text/asciidoc + +How to rebuild from update hook +=============================== + +The pages under https://www.kernel.org/pub/software/scm/git/docs/ +are built from Documentation/ directory of the git.git project +and needed to be kept up-to-date. The www.kernel.org/ servers +are mirrored and I was told that the origin of the mirror is on +the machine $some.kernel.org, on which I was given an account +when I took over Git maintainership from Linus. + +The directories relevant to this how-to are these two: + + /pub/scm/git/git.git/ The public Git repository. + /pub/software/scm/git/docs/ The HTML documentation page. + +So I made a repository to generate the documentation under my +home directory over there. + + $ cd + $ mkdir doc-git && cd doc-git + $ git clone /pub/scm/git/git.git/ docgen + +What needs to happen is to update the $HOME/doc-git/docgen/ +working tree, build HTML docs there and install the result in +/pub/software/scm/git/docs/ directory. So I wrote a little +script: + + $ cat >dododoc.sh <<\EOF + #!/bin/sh + cd $HOME/doc-git/docgen || exit + + unset GIT_DIR + + git pull /pub/scm/git/git.git/ master && + cd Documentation && + make install-webdoc + EOF + +Initially I used to run this by hand whenever I push into the +public Git repository. Then I did a cron job that ran twice a +day. The current round uses the post-update hook mechanism, +like this: + + $ cat >/pub/scm/git/git.git/hooks/post-update <<\EOF + #!/bin/sh + # + # An example hook script to prepare a packed repository for use over + # dumb transports. + # + # To enable this hook, make this file executable by "chmod +x post-update". + + case " $* " in + *' refs/heads/master '*) + echo $HOME/doc-git/dododoc.sh | at now + ;; + esac + exec git-update-server-info + EOF + $ chmod +x /pub/scm/git/git.git/hooks/post-update + +There are four things worth mentioning: + + - The update-hook is run after the repository accepts a "git + push", under my user privilege. It is given the full names + of refs that have been updated as arguments. My post-update + runs the dododoc.sh script only when the master head is + updated. + + - When update-hook is run, GIT_DIR is set to '.' by the calling + receive-pack. This is inherited by the dododoc.sh run via + the "at" command, and needs to be unset; otherwise, "git + pull" it does into $HOME/doc-git/docgen/ repository would not + work correctly. + + - The stdout of update hook script is not connected to git + push; I run the heavy part of the command inside "at", to + receive the execution report via e-mail. + + - This is still crude and does not protect against simultaneous + make invocations stomping on each other. I would need to add + some locking mechanism for this. diff --git a/Documentation/howto/recover-corrupted-blob-object.txt b/Documentation/howto/recover-corrupted-blob-object.txt new file mode 100644 index 0000000..1b3b188 --- /dev/null +++ b/Documentation/howto/recover-corrupted-blob-object.txt @@ -0,0 +1,144 @@ +Date: Fri, 9 Nov 2007 08:28:38 -0800 (PST) +From: Linus Torvalds <torvalds@linux-foundation.org> +Subject: corrupt object on git-gc +Abstract: Some tricks to reconstruct blob objects in order to fix + a corrupted repository. +Content-type: text/asciidoc + +How to recover a corrupted blob object +====================================== + +----------------------------------------------------------- +On Fri, 9 Nov 2007, Yossi Leybovich wrote: +> +> Did not help still the repository look for this object? +> Any one know how can I track this object and understand which file is it +----------------------------------------------------------- + +So exactly *because* the SHA-1 hash is cryptographically secure, the hash +itself doesn't actually tell you anything, in order to fix a corrupt +object you basically have to find the "original source" for it. + +The easiest way to do that is almost always to have backups, and find the +same object somewhere else. Backups really are a good idea, and Git makes +it pretty easy (if nothing else, just clone the repository somewhere else, +and make sure that you do *not* use a hard-linked clone, and preferably +not the same disk/machine). + +But since you don't seem to have backups right now, the good news is that +especially with a single blob being corrupt, these things *are* somewhat +debuggable. + +First off, move the corrupt object away, and *save* it. The most common +cause of corruption so far has been memory corruption, but even so, there +are people who would be interested in seeing the corruption - but it's +basically impossible to judge the corruption until we can also see the +original object, so right now the corrupt object is useless, but it's very +interesting for the future, in the hope that you can re-create a +non-corrupt version. + +----------------------------------------------------------- +So: + +> ib]$ mv .git/objects/4b/9458b3786228369c63936db65827de3cc06200 ../ +----------------------------------------------------------- + +This is the right thing to do, although it's usually best to save it under +it's full SHA-1 name (you just dropped the "4b" from the result ;). + +Let's see what that tells us: + +----------------------------------------------------------- +> ib]$ git-fsck --full +> broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8 +> to blob 4b9458b3786228369c63936db65827de3cc06200 +> missing blob 4b9458b3786228369c63936db65827de3cc06200 +----------------------------------------------------------- + +Ok, I removed the "dangling commit" messages, because they are just +messages about the fact that you probably have rebased etc, so they're not +at all interesting. But what remains is still very useful. In particular, +we now know which tree points to it! + +Now you can do + + git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8 + +which will show something like + + 100644 blob 8d14531846b95bfa3564b58ccfb7913a034323b8 .gitignore + 100644 blob ebf9bf84da0aab5ed944264a5db2a65fe3a3e883 .mailmap + 100644 blob ca442d313d86dc67e0a2e5d584b465bd382cbf5c COPYING + 100644 blob ee909f2cc49e54f0799a4739d24c4cb9151ae453 CREDITS + 040000 tree 0f5f709c17ad89e72bdbbef6ea221c69807009f6 Documentation + 100644 blob 1570d248ad9237e4fa6e4d079336b9da62d9ba32 Kbuild + 100644 blob 1c7c229a092665b11cd46a25dbd40feeb31661d9 MAINTAINERS + ... + +and you should now have a line that looks like + + 10064 blob 4b9458b3786228369c63936db65827de3cc06200 my-magic-file + +in the output. This already tells you a *lot* it tells you what file the +corrupt blob came from! + +Now, it doesn't tell you quite enough, though: it doesn't tell what +*version* of the file didn't get correctly written! You might be really +lucky, and it may be the version that you already have checked out in your +working tree, in which case fixing this problem is really simple, just do + + git hash-object -w my-magic-file + +again, and if it outputs the missing SHA-1 (4b945..) you're now all done! + +But that's the really lucky case, so let's assume that it was some older +version that was broken. How do you tell which version it was? + +The easiest way to do it is to do + + git log --raw --all --full-history -- subdirectory/my-magic-file + +and that will show you the whole log for that file (please realize that +the tree you had may not be the top-level tree, so you need to figure out +which subdirectory it was in on your own), and because you're asking for +raw output, you'll now get something like + + commit abc + Author: + Date: + .. + :100644 100644 4b9458b... newsha... M somedirectory/my-magic-file + + + commit xyz + Author: + Date: + + .. + :100644 100644 oldsha... 4b9458b... M somedirectory/my-magic-file + +and this actually tells you what the *previous* and *subsequent* versions +of that file were! So now you can look at those ("oldsha" and "newsha" +respectively), and hopefully you have done commits often, and can +re-create the missing my-magic-file version by looking at those older and +newer versions! + +If you can do that, you can now recreate the missing object with + + git hash-object -w <recreated-file> + +and your repository is good again! + +(Btw, you could have ignored the fsck, and started with doing a + + git log --raw --all + +and just looked for the sha of the missing object (4b9458b..) in that +whole thing. It's up to you - Git does *have* a lot of information, it is +just missing one particular blob version. + +Trying to recreate trees and especially commits is *much* harder. So you +were lucky that it's a blob. It's quite possible that you can recreate the +thing. + + Linus diff --git a/Documentation/howto/recover-corrupted-object-harder.txt b/Documentation/howto/recover-corrupted-object-harder.txt new file mode 100644 index 0000000..5efb4fe --- /dev/null +++ b/Documentation/howto/recover-corrupted-object-harder.txt @@ -0,0 +1,479 @@ +Date: Wed, 16 Oct 2013 04:34:01 -0400 +From: Jeff King <peff@peff.net> +Subject: pack corruption post-mortem +Abstract: Recovering a corrupted object when no good copy is available. +Content-type: text/asciidoc + +How to recover an object from scratch +===================================== + +I was recently presented with a repository with a corrupted packfile, +and was asked if the data was recoverable. This post-mortem describes +the steps I took to investigate and fix the problem. I thought others +might find the process interesting, and it might help somebody in the +same situation. + +******************************** +Note: In this case, no good copy of the repository was available. For +the much easier case where you can get the corrupted object from +elsewhere, see link:recover-corrupted-blob-object.html[this howto]. +******************************** + +I started with an fsck, which found a problem with exactly one object +(I've used $pack and $obj below to keep the output readable, and also +because I'll refer to them later): + +----------- + $ git fsck + error: $pack SHA1 checksum mismatch + error: index CRC mismatch for object $obj from $pack at offset 51653873 + error: inflate: data stream error (incorrect data check) + error: cannot unpack $obj from $pack at offset 51653873 +----------- + +The pack checksum failing means a byte is munged somewhere, and it is +presumably in the object mentioned (since both the index checksum and +zlib were failing). + +Reading the zlib source code, I found that "incorrect data check" means +that the adler-32 checksum at the end of the zlib data did not match the +inflated data. So stepping the data through zlib would not help, as it +did not fail until the very end, when we realize the CRC does not match. +The problematic bytes could be anywhere in the object data. + +The first thing I did was pull the broken data out of the packfile. I +needed to know how big the object was, which I found out with: + +------------ + $ git show-index <$idx | cut -d' ' -f1 | sort -n | grep -A1 51653873 + 51653873 + 51664736 +------------ + +Show-index gives us the list of objects and their offsets. We throw away +everything but the offsets, and then sort them so that our interesting +offset (which we got from the fsck output above) is followed immediately +by the offset of the next object. Now we know that the object data is +10863 bytes long, and we can grab it with: + +------------ + dd if=$pack of=object bs=1 skip=51653873 count=10863 +------------ + +I inspected a hexdump of the data, looking for any obvious bogosity +(e.g., a 4K run of zeroes would be a good sign of filesystem +corruption). But everything looked pretty reasonable. + +Note that the "object" file isn't fit for feeding straight to zlib; it +has the git packed object header, which is variable-length. We want to +strip that off so we can start playing with the zlib data directly. You +can either work your way through it manually (the format is described in +linkgit:gitformat-pack[5]), +or you can walk through it in a debugger. I did the latter, creating a +valid pack like: + +------------ + # pack magic and version + printf 'PACK\0\0\0\2' >tmp.pack + # pack has one object + printf '\0\0\0\1' >>tmp.pack + # now add our object data + cat object >>tmp.pack + # and then append the pack trailer + /path/to/git.git/t/helper/test-tool sha1 -b <tmp.pack >trailer + cat trailer >>tmp.pack +------------ + +and then running "git index-pack tmp.pack" in the debugger (stop at +unpack_raw_entry). Doing this, I found that there were 3 bytes of header +(and the header itself had a sane type and size). So I stripped those +off with: + +------------ + dd if=object of=zlib bs=1 skip=3 +------------ + +I ran the result through zlib's inflate using a custom C program. And +while it did report the error, I did get the right number of output +bytes (i.e., it matched git's size header that we decoded above). But +feeding the result back to "git hash-object" didn't produce the same +sha1. So there were some wrong bytes, but I didn't know which. The file +happened to be C source code, so I hoped I could notice something +obviously wrong with it, but I didn't. I even got it to compile! + +I also tried comparing it to other versions of the same path in the +repository, hoping that there would be some part of the diff that didn't +make sense. Unfortunately, this happened to be the only revision of this +particular file in the repository, so I had nothing to compare against. + +So I took a different approach. Working under the guess that the +corruption was limited to a single byte, I wrote a program to munge each +byte individually, and try inflating the result. Since the object was +only 10K compressed, that worked out to about 2.5M attempts, which took +a few minutes. + +The program I used is here: + +---------------------------------------------- +#include <stdio.h> +#include <unistd.h> +#include <string.h> +#include <signal.h> +#include <zlib.h> + +static int try_zlib(unsigned char *buf, int len) +{ + /* make this absurdly large so we don't have to loop */ + static unsigned char out[1024*1024]; + z_stream z; + int ret; + + memset(&z, 0, sizeof(z)); + inflateInit(&z); + + z.next_in = buf; + z.avail_in = len; + z.next_out = out; + z.avail_out = sizeof(out); + + ret = inflate(&z, 0); + inflateEnd(&z); + return ret >= 0; +} + +/* eye candy */ +static int counter = 0; +static void progress(int sig) +{ + fprintf(stderr, "\r%d", counter); + alarm(1); +} + +int main(void) +{ + /* oversized so we can read the whole buffer in */ + unsigned char buf[1024*1024]; + int len; + unsigned i, j; + + signal(SIGALRM, progress); + alarm(1); + + len = read(0, buf, sizeof(buf)); + for (i = 0; i < len; i++) { + unsigned char c = buf[i]; + for (j = 0; j <= 0xff; j++) { + buf[i] = j; + + counter++; + if (try_zlib(buf, len)) + printf("i=%d, j=%x\n", i, j); + } + buf[i] = c; + } + + alarm(0); + fprintf(stderr, "\n"); + return 0; +} +---------------------------------------------- + +I compiled and ran with: + +------- + gcc -Wall -Werror -O3 munge.c -o munge -lz + ./munge <zlib +------- + + +There were a few false positives early on (if you write "no data" in the +zlib header, zlib thinks it's just fine :) ). But I got a hit about +halfway through: + +------- + i=5642, j=c7 +------- + +I let it run to completion, and got a few more hits at the end (where it +was munging the CRC to match our broken data). So there was a good +chance this middle hit was the source of the problem. + +I confirmed by tweaking the byte in a hex editor, zlib inflating the +result (no errors!), and then piping the output into "git hash-object", +which reported the sha1 of the broken object. Success! + +I fixed the packfile itself with: + +------- + chmod +w $pack + printf '\xc7' | dd of=$pack bs=1 seek=51659518 conv=notrunc + chmod -w $pack +------- + +The `\xc7` comes from the replacement byte our "munge" program found. +The offset 51659518 is derived by taking the original object offset +(51653873), adding the replacement offset found by "munge" (5642), and +then adding back in the 3 bytes of git header we stripped. + +After that, "git fsck" ran clean. + +As for the corruption itself, I was lucky that it was indeed a single +byte. In fact, it turned out to be a single bit. The byte 0xc7 was +corrupted to 0xc5. So presumably it was caused by faulty hardware, or a +cosmic ray. + +And the aborted attempt to look at the inflated output to see what was +wrong? I could have looked forever and never found it. Here's the diff +between what the corrupted data inflates to, versus the real data: + +-------------- + - cp = strtok (arg, "+"); + + cp = strtok (arg, "."); +-------------- + +It tweaked one byte and still ended up as valid, readable C that just +happened to do something totally different! One takeaway is that on a +less unlucky day, looking at the zlib output might have actually been +helpful, as most random changes would actually break the C code. + +But more importantly, git's hashing and checksumming noticed a problem +that easily could have gone undetected in another system. The result +still compiled, but would have caused an interesting bug (that would +have been blamed on some random commit). + + +The adventure continues... +-------------------------- + +I ended up doing this again! Same entity, new hardware. The assumption +at this point is that the old disk corrupted the packfile, and then the +corruption was migrated to the new hardware (because it was done by +rsync or similar, and no fsck was done at the time of migration). + +This time, the affected blob was over 20 megabytes, which was far too +large to do a brute-force on. I followed the instructions above to +create the `zlib` file. I then used the `inflate` program below to pull +the corrupted data from that. Examining that output gave me a hint about +where in the file the corruption was. But now I was working with the +file itself, not the zlib contents. So knowing the sha1 of the object +and the approximate area of the corruption, I used the `sha1-munge` +program below to brute-force the correct byte. + +Here's the inflate program (it's essentially `gunzip` but without the +`.gz` header processing): + +-------------------------- +#include <stdio.h> +#include <string.h> +#include <zlib.h> +#include <stdlib.h> + +int main(int argc, char **argv) +{ + /* + * oversized so we can read the whole buffer in; + * this could actually be switched to streaming + * to avoid any memory limitations + */ + static unsigned char buf[25 * 1024 * 1024]; + static unsigned char out[25 * 1024 * 1024]; + int len; + z_stream z; + int ret; + + len = read(0, buf, sizeof(buf)); + memset(&z, 0, sizeof(z)); + inflateInit(&z); + + z.next_in = buf; + z.avail_in = len; + z.next_out = out; + z.avail_out = sizeof(out); + + ret = inflate(&z, 0); + if (ret != Z_OK && ret != Z_STREAM_END) + fprintf(stderr, "initial inflate failed (%d)\n", ret); + + fprintf(stderr, "outputting %lu bytes", z.total_out); + fwrite(out, 1, z.total_out, stdout); + return 0; +} +-------------------------- + +And here is the `sha1-munge` program: + +-------------------------- +#include <stdio.h> +#include <unistd.h> +#include <string.h> +#include <signal.h> +#include <openssl/sha.h> +#include <stdlib.h> + +/* eye candy */ +static int counter = 0; +static void progress(int sig) +{ + fprintf(stderr, "\r%d", counter); + alarm(1); +} + +static const signed char hexval_table[256] = { + -1, -1, -1, -1, -1, -1, -1, -1, /* 00-07 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* 08-0f */ + -1, -1, -1, -1, -1, -1, -1, -1, /* 10-17 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* 18-1f */ + -1, -1, -1, -1, -1, -1, -1, -1, /* 20-27 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* 28-2f */ + 0, 1, 2, 3, 4, 5, 6, 7, /* 30-37 */ + 8, 9, -1, -1, -1, -1, -1, -1, /* 38-3f */ + -1, 10, 11, 12, 13, 14, 15, -1, /* 40-47 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* 48-4f */ + -1, -1, -1, -1, -1, -1, -1, -1, /* 50-57 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* 58-5f */ + -1, 10, 11, 12, 13, 14, 15, -1, /* 60-67 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* 68-67 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* 70-77 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* 78-7f */ + -1, -1, -1, -1, -1, -1, -1, -1, /* 80-87 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* 88-8f */ + -1, -1, -1, -1, -1, -1, -1, -1, /* 90-97 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* 98-9f */ + -1, -1, -1, -1, -1, -1, -1, -1, /* a0-a7 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* a8-af */ + -1, -1, -1, -1, -1, -1, -1, -1, /* b0-b7 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* b8-bf */ + -1, -1, -1, -1, -1, -1, -1, -1, /* c0-c7 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* c8-cf */ + -1, -1, -1, -1, -1, -1, -1, -1, /* d0-d7 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* d8-df */ + -1, -1, -1, -1, -1, -1, -1, -1, /* e0-e7 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* e8-ef */ + -1, -1, -1, -1, -1, -1, -1, -1, /* f0-f7 */ + -1, -1, -1, -1, -1, -1, -1, -1, /* f8-ff */ +}; + +static inline unsigned int hexval(unsigned char c) +{ +return hexval_table[c]; +} + +static int get_sha1_hex(const char *hex, unsigned char *sha1) +{ + int i; + for (i = 0; i < 20; i++) { + unsigned int val; + /* + * hex[1]=='\0' is caught when val is checked below, + * but if hex[0] is NUL we have to avoid reading + * past the end of the string: + */ + if (!hex[0]) + return -1; + val = (hexval(hex[0]) << 4) | hexval(hex[1]); + if (val & ~0xff) + return -1; + *sha1++ = val; + hex += 2; + } + return 0; +} + +int main(int argc, char **argv) +{ + /* oversized so we can read the whole buffer in */ + static unsigned char buf[25 * 1024 * 1024]; + char header[32]; + int header_len; + unsigned char have[20], want[20]; + int start, len; + SHA_CTX orig; + unsigned i, j; + + if (!argv[1] || get_sha1_hex(argv[1], want)) { + fprintf(stderr, "usage: sha1-munge <sha1> [start] <file.in\n"); + return 1; + } + + if (argv[2]) + start = atoi(argv[2]); + else + start = 0; + + len = read(0, buf, sizeof(buf)); + header_len = sprintf(header, "blob %d", len) + 1; + fprintf(stderr, "using header: %s\n", header); + + /* + * We keep a running sha1 so that if you are munging + * near the end of the file, we do not have to re-sha1 + * the unchanged earlier bytes + */ + SHA1_Init(&orig); + SHA1_Update(&orig, header, header_len); + if (start) + SHA1_Update(&orig, buf, start); + + signal(SIGALRM, progress); + alarm(1); + + for (i = start; i < len; i++) { + unsigned char c; + SHA_CTX x; + +#if 0 + /* + * deletion -- this would not actually work in practice, + * I think, because we've already committed to a + * particular size in the header. Ditto for addition + * below. In those cases, you'd have to do the whole + * sha1 from scratch, or possibly keep three running + * "orig" sha1 computations going. + */ + memcpy(&x, &orig, sizeof(x)); + SHA1_Update(&x, buf + i + 1, len - i - 1); + SHA1_Final(have, &x); + if (!memcmp(have, want, 20)) + printf("i=%d, deletion\n", i); +#endif + + /* + * replacement -- note that this tries each of the 256 + * possible bytes. If you suspect a single-bit flip, + * it would be much shorter to just try the 8 + * bit-flipped variants. + */ + c = buf[i]; + for (j = 0; j <= 0xff; j++) { + buf[i] = j; + + memcpy(&x, &orig, sizeof(x)); + SHA1_Update(&x, buf + i, len - i); + SHA1_Final(have, &x); + if (!memcmp(have, want, 20)) + printf("i=%d, j=%02x\n", i, j); + } + buf[i] = c; + +#if 0 + /* addition */ + for (j = 0; j <= 0xff; j++) { + unsigned char extra = j; + memcpy(&x, &orig, sizeof(x)); + SHA1_Update(&x, &extra, 1); + SHA1_Update(&x, buf + i, len - i); + SHA1_Final(have, &x); + if (!memcmp(have, want, 20)) + printf("i=%d, addition=%02x", i, j); + } +#endif + + SHA1_Update(&orig, buf + i, 1); + counter++; + } + + alarm(0); + fprintf(stderr, "\r%d\n", counter); + return 0; +} +-------------------------- diff --git a/Documentation/howto/revert-a-faulty-merge.txt b/Documentation/howto/revert-a-faulty-merge.txt new file mode 100644 index 0000000..19f59cc --- /dev/null +++ b/Documentation/howto/revert-a-faulty-merge.txt @@ -0,0 +1,273 @@ +Date: Fri, 19 Dec 2008 00:45:19 -0800 +From: Linus Torvalds <torvalds@linux-foundation.org>, Junio C Hamano <gitster@pobox.com> +Subject: Re: Odd merge behaviour involving reverts +Abstract: Sometimes a branch that was already merged to the mainline + is later found to be faulty. Linus and Junio give guidance on + recovering from such a premature merge and continuing development + after the offending branch is fixed. +Message-ID: <7vocz8a6zk.fsf@gitster.siamese.dyndns.org> +References: <alpine.LFD.2.00.0812181949450.14014@localhost.localdomain> +Content-type: text/asciidoc + +How to revert a faulty merge +============================ + +Alan <alan@clueserver.org> said: + + I have a master branch. We have a branch off of that that some + developers are doing work on. They claim it is ready. We merge it + into the master branch. It breaks something so we revert the merge. + They make changes to the code. they get it to a point where they say + it is ok and we merge again. + + When examined, we find that code changes made before the revert are + not in the master branch, but code changes after are in the master + branch. + +and asked for help recovering from this situation. + +The history immediately after the "revert of the merge" would look like +this: + + ---o---o---o---M---x---x---W + / + ---A---B + +where A and B are on the side development that was not so good, M is the +merge that brings these premature changes into the mainline, x are changes +unrelated to what the side branch did and already made on the mainline, +and W is the "revert of the merge M" (doesn't W look M upside down?). +IOW, `"diff W^..W"` is similar to `"diff -R M^..M"`. + +Such a "revert" of a merge can be made with: + + $ git revert -m 1 M + +After the developers of the side branch fix their mistakes, the history +may look like this: + + ---o---o---o---M---x---x---W---x + / + ---A---B-------------------C---D + +where C and D are to fix what was broken in A and B, and you may already +have some other changes on the mainline after W. + +If you merge the updated side branch (with D at its tip), none of the +changes made in A or B will be in the result, because they were reverted +by W. That is what Alan saw. + +Linus explains the situation: + + Reverting a regular commit just effectively undoes what that commit + did, and is fairly straightforward. But reverting a merge commit also + undoes the _data_ that the commit changed, but it does absolutely + nothing to the effects on _history_ that the merge had. + + So the merge will still exist, and it will still be seen as joining + the two branches together, and future merges will see that merge as + the last shared state - and the revert that reverted the merge brought + in will not affect that at all. + + So a "revert" undoes the data changes, but it's very much _not_ an + "undo" in the sense that it doesn't undo the effects of a commit on + the repository history. + + So if you think of "revert" as "undo", then you're going to always + miss this part of reverts. Yes, it undoes the data, but no, it doesn't + undo history. + +In such a situation, you would want to first revert the previous revert, +which would make the history look like this: + + ---o---o---o---M---x---x---W---x---Y + / + ---A---B-------------------C---D + +where Y is the revert of W. Such a "revert of the revert" can be done +with: + + $ git revert W + +This history would (ignoring possible conflicts between what W and W..Y +changed) be equivalent to not having W or Y at all in the history: + + ---o---o---o---M---x---x-------x---- + / + ---A---B-------------------C---D + +and merging the side branch again will not have conflict arising from an +earlier revert and revert of the revert. + + ---o---o---o---M---x---x-------x-------* + / / + ---A---B-------------------C---D + +Of course the changes made in C and D still can conflict with what was +done by any of the x, but that is just a normal merge conflict. + +On the other hand, if the developers of the side branch discarded their +faulty A and B, and redone the changes on top of the updated mainline +after the revert, the history would have looked like this: + + ---o---o---o---M---x---x---W---x---x + / \ + ---A---B A'--B'--C' + +If you reverted the revert in such a case as in the previous example: + + ---o---o---o---M---x---x---W---x---x---Y---* + / \ / + ---A---B A'--B'--C' + +where Y is the revert of W, A' and B' are rerolled A and B, and there may +also be a further fix-up C' on the side branch. `"diff Y^..Y"` is similar +to `"diff -R W^..W"` (which in turn means it is similar to `"diff M^..M"`), +and `"diff A'^..C'"` by definition would be similar but different from that, +because it is a rerolled series of the earlier change. There will be a +lot of overlapping changes that result in conflicts. So do not do "revert +of revert" blindly without thinking.. + + ---o---o---o---M---x---x---W---x---x + / \ + ---A---B A'--B'--C' + +In the history with rebased side branch, W (and M) are behind the merge +base of the updated branch and the tip of the mainline, and they should +merge without the past faulty merge and its revert getting in the way. + +To recap, these are two very different scenarios, and they want two very +different resolution strategies: + + - If the faulty side branch was fixed by adding corrections on top, then + doing a revert of the previous revert would be the right thing to do. + + - If the faulty side branch whose effects were discarded by an earlier + revert of a merge was rebuilt from scratch (i.e. rebasing and fixing, + as you seem to have interpreted), then re-merging the result without + doing anything else fancy would be the right thing to do. + (See the ADDENDUM below for how to rebuild a branch from scratch + without changing its original branching-off point.) + +However, there are things to keep in mind when reverting a merge (and +reverting such a revert). + +For example, think about what reverting a merge (and then reverting the +revert) does to bisectability. Ignore the fact that the revert of a revert +is undoing it - just think of it as a "single commit that does a lot". +Because that is what it does. + +When you have a problem you are chasing down, and you hit a "revert this +merge", what you're hitting is essentially a single commit that contains +all the changes (but obviously in reverse) of all the commits that got +merged. So it's debugging hell, because now you don't have lots of small +changes that you can try to pinpoint which _part_ of it changes. + +But does it all work? Sure it does. You can revert a merge, and from a +purely technical angle, Git did it very naturally and had no real +troubles. It just considered it a change from "state before merge" to +"state after merge", and that was it. Nothing complicated, nothing odd, +nothing really dangerous. Git will do it without even thinking about it. + +So from a technical angle, there's nothing wrong with reverting a merge, +but from a workflow angle it's something that you generally should try to +avoid. + +If at all possible, for example, if you find a problem that got merged +into the main tree, rather than revert the merge, try _really_ hard to +bisect the problem down into the branch you merged, and just fix it, or +try to revert the individual commit that caused it. + +Yes, it's more complex, and no, it's not always going to work (sometimes +the answer is: "oops, I really shouldn't have merged it, because it wasn't +ready yet, and I really need to undo _all_ of the merge"). So then you +really should revert the merge, but when you want to re-do the merge, you +now need to do it by reverting the revert. + +ADDENDUM + +Sometimes you have to rewrite one of a topic branch's commits *and* you can't +change the topic's branching-off point. Consider the following situation: + + P---o---o---M---x---x---W---x + \ / + A---B---C + +where commit W reverted commit M because it turned out that commit B was wrong +and needs to be rewritten, but you need the rewritten topic to still branch +from commit P (perhaps P is a branching-off point for yet another branch, and +you want be able to merge the topic into both branches). + +The natural thing to do in this case is to checkout the A-B-C branch and use +"rebase -i P" to change commit B. However this does not rewrite commit A, +because "rebase -i" by default fast-forwards over any initial commits selected +with the "pick" command. So you end up with this: + + P---o---o---M---x---x---W---x + \ / + A---B---C <-- old branch + \ + B'---C' <-- naively rewritten branch + +To merge A-B'-C' into the mainline branch you would still have to first revert +commit W in order to pick up the changes in A, but then it's likely that the +changes in B' will conflict with the original B changes re-introduced by the +reversion of W. + +However, you can avoid these problems if you recreate the entire branch, +including commit A: + + A'---B'---C' <-- completely rewritten branch + / + P---o---o---M---x---x---W---x + \ / + A---B---C + +You can merge A'-B'-C' into the mainline branch without worrying about first +reverting W. Mainline's history would look like this: + + A'---B'---C'------------------ + / \ + P---o---o---M---x---x---W---x---M2 + \ / + A---B---C + +But if you don't actually need to change commit A, then you need some way to +recreate it as a new commit with the same changes in it. The rebase command's +--no-ff option provides a way to do this: + + $ git rebase [-i] --no-ff P + +The --no-ff option creates a new branch A'-B'-C' with all-new commits (all the +SHA IDs will be different) even if in the interactive case you only actually +modify commit B. You can then merge this new branch directly into the mainline +branch and be sure you'll get all of the branch's changes. + +You can also use --no-ff in cases where you just add extra commits to the topic +to fix it up. Let's revisit the situation discussed at the start of this howto: + + P---o---o---M---x---x---W---x + \ / + A---B---C----------------D---E <-- fixed-up topic branch + +At this point, you can use --no-ff to recreate the topic branch: + + $ git checkout E + $ git rebase --no-ff P + +yielding + + A'---B'---C'------------D'---E' <-- recreated topic branch + / + P---o---o---M---x---x---W---x + \ / + A---B---C----------------D---E + +You can merge the recreated branch into the mainline without reverting commit W, +and mainline's history will look like this: + + A'---B'---C'------------D'---E' + / \ + P---o---o---M---x---x---W---x---M2 + \ / + A---B---C diff --git a/Documentation/howto/revert-branch-rebase.txt b/Documentation/howto/revert-branch-rebase.txt new file mode 100644 index 0000000..a3e5595 --- /dev/null +++ b/Documentation/howto/revert-branch-rebase.txt @@ -0,0 +1,187 @@ +From: Junio C Hamano <gitster@pobox.com> +To: git@vger.kernel.org +Subject: [HOWTO] Reverting an existing commit +Abstract: In this article, JC gives a small real-life example of using + 'git revert' command, and using a temporary branch and tag for safety + and easier sanity checking. +Date: Mon, 29 Aug 2005 21:39:02 -0700 +Content-type: text/asciidoc +Message-ID: <7voe7g3uop.fsf@assigned-by-dhcp.cox.net> + +How to revert an existing commit +================================ + +One of the changes I pulled into the 'master' branch turns out to +break building Git with GCC 2.95. While they were well-intentioned +portability fixes, keeping things working with gcc-2.95 was also +important. Here is what I did to revert the change in the 'master' +branch and to adjust the 'seen' branch, using core Git tools and +barebone Porcelain. + +First, prepare a throw-away branch in case I screw things up. + +------------------------------------------------ +$ git checkout -b revert-c99 master +------------------------------------------------ + +Now I am on the 'revert-c99' branch. Let's figure out which commit to +revert. I happen to know that the top of the 'master' branch is a +merge, and its second parent (i.e. foreign commit I merged from) has +the change I would want to undo. Further I happen to know that that +merge introduced 5 commits or so: + +------------------------------------------------ +$ git show-branch --more=4 master master^2 | head +* [master] Merge refs/heads/portable from http://www.cs.berkeley.... + ! [master^2] Replace C99 array initializers with code. +-- +- [master] Merge refs/heads/portable from http://www.cs.berkeley.... +*+ [master^2] Replace C99 array initializers with code. +*+ [master^2~1] Replace unsetenv() and setenv() with older putenv(). +*+ [master^2~2] Include sys/time.h in daemon.c. +*+ [master^2~3] Fix ?: statements. +*+ [master^2~4] Replace zero-length array decls with []. +* [master~1] tutorial note about git branch +------------------------------------------------ + +The '--more=4' above means "after we reach the merge base of refs, +show until we display four more common commits". That last commit +would have been where the "portable" branch was forked from the main +git.git repository, so this would show everything on both branches +since then. I just limited the output to the first handful using +'head'. + +Now I know 'master^2~4' (pronounce it as "find the second parent of +the 'master', and then go four generations back following the first +parent") is the one I would want to revert. Since I also want to say +why I am reverting it, the '-n' flag is given to 'git revert'. This +prevents it from actually making a commit, and instead 'git revert' +leaves the commit log message it wanted to use in '.msg' file: + +------------------------------------------------ +$ git revert -n master^2~4 +$ cat .msg +Revert "Replace zero-length array decls with []." + +This reverts 6c5f9baa3bc0d63e141e0afc23110205379905a4 commit. +$ git diff HEAD ;# to make sure what we are reverting makes sense. +$ make CC=gcc-2.95 clean test ;# make sure it fixed the breakage. +$ make clean test ;# make sure it did not cause other breakage. +------------------------------------------------ + +The reverted change makes sense (from reading the 'diff' output), does +fix the problem (from 'make CC=gcc-2.95' test), and does not cause new +breakage (from the last 'make test'). I'm ready to commit: + +------------------------------------------------ +$ git commit -a -s ;# read .msg into the log, + # and explain why I am reverting. +------------------------------------------------ + +I could have screwed up in any of the above steps, but in the worst +case I could just have done 'git checkout master' to start over. +Fortunately I did not have to; what I have in the current branch +'revert-c99' is what I want. So merge that back into 'master': + +------------------------------------------------ +$ git checkout master +$ git merge revert-c99 ;# this should be a fast-forward +Updating from 10d781b9caa4f71495c7b34963bef137216f86a8 to e3a693c... + cache.h | 8 ++++---- + commit.c | 2 +- + ls-files.c | 2 +- + receive-pack.c | 2 +- + server-info.c | 2 +- + 5 files changed, 8 insertions(+), 8 deletions(-) +------------------------------------------------ + +There is no need to redo the test at this point. We fast-forwarded +and we know 'master' matches 'revert-c99' exactly. In fact: + +------------------------------------------------ +$ git diff master..revert-c99 +------------------------------------------------ + +says nothing. + +Then we rebase the 'seen' branch as usual. + +------------------------------------------------ +$ git checkout seen +$ git tag seen-anchor seen +$ git rebase master +* Applying: Redo "revert" using three-way merge machinery. +First trying simple merge strategy to cherry-pick. +* Applying: Remove git-apply-patch-script. +First trying simple merge strategy to cherry-pick. +Simple cherry-pick fails; trying Automatic cherry-pick. +Removing Documentation/git-apply-patch-script.txt +Removing git-apply-patch-script +* Applying: Document "git cherry-pick" and "git revert" +First trying simple merge strategy to cherry-pick. +* Applying: mailinfo and applymbox updates +First trying simple merge strategy to cherry-pick. +* Applying: Show commits in topo order and name all commits. +First trying simple merge strategy to cherry-pick. +* Applying: More documentation updates. +First trying simple merge strategy to cherry-pick. +------------------------------------------------ + +The temporary tag 'seen-anchor' is me just being careful, in case 'git +rebase' screws up. After this, I can do these for sanity check: + +------------------------------------------------ +$ git diff seen-anchor..seen ;# make sure we got the master fix. +$ make CC=gcc-2.95 clean test ;# make sure it fixed the breakage. +$ make clean test ;# make sure it did not cause other breakage. +------------------------------------------------ + +Everything is in the good order. I do not need the temporary branch +or tag anymore, so remove them: + +------------------------------------------------ +$ rm -f .git/refs/tags/seen-anchor +$ git branch -d revert-c99 +------------------------------------------------ + +It was an emergency fix, so we might as well merge it into the +'release candidate' branch, although I expect the next release would +be some days off: + +------------------------------------------------ +$ git checkout rc +$ git pull . master +Packing 0 objects +Unpacking 0 objects + +* commit-ish: e3a693c... refs/heads/master from . +Trying to merge e3a693c... into 8c1f5f0... using 10d781b... +Committed merge 7fb9b7262a1d1e0a47bbfdcbbcf50ce0635d3f8f + cache.h | 8 ++++---- + commit.c | 2 +- + ls-files.c | 2 +- + receive-pack.c | 2 +- + server-info.c | 2 +- + 5 files changed, 8 insertions(+), 8 deletions(-) +------------------------------------------------ + +And the final repository status looks like this: + +------------------------------------------------ +$ git show-branch --more=1 master seen rc +! [master] Revert "Replace zero-length array decls with []." + ! [seen] git-repack: Add option to repack all objects. + * [rc] Merge refs/heads/master from . +--- + + [seen] git-repack: Add option to repack all objects. + + [seen~1] More documentation updates. + + [seen~2] Show commits in topo order and name all commits. + + [seen~3] mailinfo and applymbox updates + + [seen~4] Document "git cherry-pick" and "git revert" + + [seen~5] Remove git-apply-patch-script. + + [seen~6] Redo "revert" using three-way merge machinery. + - [rc] Merge refs/heads/master from . +++* [master] Revert "Replace zero-length array decls with []." + - [rc~1] Merge refs/heads/master from . +... [master~1] Merge refs/heads/portable from http://www.cs.berkeley.... +------------------------------------------------ diff --git a/Documentation/howto/separating-topic-branches.txt b/Documentation/howto/separating-topic-branches.txt new file mode 100644 index 0000000..81be0d6 --- /dev/null +++ b/Documentation/howto/separating-topic-branches.txt @@ -0,0 +1,94 @@ +From: Junio C Hamano <gitster@pobox.com> +Subject: Separating topic branches +Abstract: In this article, JC describes how to separate topic branches. +Content-type: text/asciidoc + +How to separate topic branches +============================== + +This text was originally a footnote to a discussion about the +behaviour of the git diff commands. + +Often I find myself doing that [running diff against something other +than HEAD] while rewriting messy development history. For example, I +start doing some work without knowing exactly where it leads, and end +up with a history like this: + + "master" + o---o + \ "topic" + o---o---o---o---o---o + +At this point, "topic" contains something I know I want, but it +contains two concepts that turned out to be completely independent. +And often, one topic component is larger than the other. It may +contain more than two topics. + +In order to rewrite this mess to be more manageable, I would first do +"diff master..topic", to extract the changes into a single patch, start +picking pieces from it to get logically self-contained units, and +start building on top of "master": + + $ git diff master..topic >P.diff + $ git checkout -b topicA master + ... pick and apply pieces from P.diff to build + ... commits on topicA branch. + + o---o---o + / "topicA" + o---o"master" + \ "topic" + o---o---o---o---o---o + +Before doing each commit on "topicA" HEAD, I run "diff HEAD" +before update-index the affected paths, or "diff --cached HEAD" +after. Also I would run "diff --cached master" to make sure +that the changes are only the ones related to "topicA". Usually +I do this for smaller topics first. + +After that, I'd do the remainder of the original "topic", but +for that, I do not start from the patchfile I extracted by +comparing "master" and "topic" I used initially. Still on +"topicA", I extract "diff topic", and use it to rebuild the +other topic: + + $ git diff -R topic >P.diff ;# --cached also would work fine + $ git checkout -b topicB master + ... pick and apply pieces from P.diff to build + ... commits on topicB branch. + + "topicB" + o---o---o---o---o + / + /o---o---o + |/ "topicA" + o---o"master" + \ "topic" + o---o---o---o---o---o + +After I am done, I'd try a pretend-merge between "topicA" and +"topicB" in order to make sure I have not missed anything: + + $ git pull . topicA ;# merge it into current "topicB" + $ git diff topic + "topicB" + o---o---o---o---o---* (pretend merge) + / / + /o---o---o----------' + |/ "topicA" + o---o"master" + \ "topic" + o---o---o---o---o---o + +The last diff better not to show anything other than cleanups +for cruft. Then I can finally clean things up: + + $ git branch -D topic + $ git reset --hard HEAD^ ;# nuke pretend merge + + "topicB" + o---o---o---o---o + / + /o---o---o + |/ "topicA" + o---o"master" diff --git a/Documentation/howto/setup-git-server-over-http.txt b/Documentation/howto/setup-git-server-over-http.txt new file mode 100644 index 0000000..bfe6f9b --- /dev/null +++ b/Documentation/howto/setup-git-server-over-http.txt @@ -0,0 +1,285 @@ +From: Rutger Nijlunsing <rutger@nospam.com> +Subject: Setting up a Git repository which can be pushed into and pulled from over HTTP(S). +Date: Thu, 10 Aug 2006 22:00:26 +0200 +Content-type: text/asciidoc + +How to setup Git server over http +================================= + +NOTE: This document is from 2006. A lot has happened since then, and this +document is now relevant mainly if your web host is not CGI capable. +Almost everyone else should instead look at linkgit:git-http-backend[1]. + +Since Apache is one of those packages people like to compile +themselves while others prefer the bureaucrat's dream Debian, it is +impossible to give guidelines which will work for everyone. Just send +some feedback to the mailing list at git@vger.kernel.org to get this +document tailored to your favorite distro. + + +What's needed: + +- Have an Apache web-server + + On Debian: + $ apt-get install apache2 + To get apache2 by default started, + edit /etc/default/apache2 and set NO_START=0 + +- can edit the configuration of it. + + This could be found under /etc/httpd, or refer to your Apache documentation. + + On Debian: this means being able to edit files under /etc/apache2 + +- can restart it. + + 'apachectl --graceful' might do. If it doesn't, just stop and + restart apache. Be warning that active connections to your server + might be aborted by this. + + On Debian: + $ /etc/init.d/apache2 restart + or + $ /etc/init.d/apache2 force-reload + (which seems to do the same) + This adds symlinks from the /etc/apache2/mods-enabled to + /etc/apache2/mods-available. + +- have permissions to chown a directory + +- have Git installed on the client, and + +- either have Git installed on the server or have a webdav client on + the client. + +In effect, this means you're going to be root, or that you're using a +preconfigured WebDAV server. + + +Step 1: setup a bare Git repository +----------------------------------- + +At the time of writing, git-http-push cannot remotely create a Git +repository. So we have to do that at the server side with Git. Another +option is to generate an empty bare repository at the client and copy +it to the server with a WebDAV client (which is the only option if Git +is not installed on the server). + +Create the directory under the DocumentRoot of the directories served +by Apache. As an example we take /usr/local/apache2, but try "grep +DocumentRoot /where/ever/httpd.conf" to find your root: + + $ cd /usr/local/apache/htdocs + $ mkdir my-new-repo.git + + On Debian: + + $ cd /var/www + $ mkdir my-new-repo.git + + +Initialize a bare repository + + $ cd my-new-repo.git + $ git --bare init + + +Change the ownership to your web-server's credentials. Use `"grep ^User +httpd.conf"` and `"grep ^Group httpd.conf"` to find out: + + $ chown -R www.www . + + On Debian: + + $ chown -R www-data.www-data . + + +If you do not know which user Apache runs as, you can alternatively do +a "chmod -R a+w .", inspect the files which are created later on, and +set the permissions appropriately. + +Restart apache2, and check whether http://server/my-new-repo.git gives +a directory listing. If not, check whether apache started up +successfully. + + +Step 2: enable DAV on this repository +------------------------------------- + +First make sure the dav_module is loaded. For this, insert in httpd.conf: + + LoadModule dav_module libexec/httpd/libdav.so + AddModule mod_dav.c + +Also make sure that this line exists which is the file used for +locking DAV operations: + + DAVLockDB "/usr/local/apache2/temp/DAV.lock" + + On Debian these steps can be performed with: + + Enable the dav and dav_fs modules of apache: + $ a2enmod dav_fs + (just to be sure. dav_fs might be unneeded, I don't know) + $ a2enmod dav + The DAV lock is located in /etc/apache2/mods-available/dav_fs.conf: + DAVLockDB /var/lock/apache2/DAVLock + +Of course, it can point somewhere else, but the string is actually just a +prefix in some Apache configurations, and therefore the _directory_ has to +be writable by the user Apache runs as. + +Then, add something like this to your httpd.conf + + <Location /my-new-repo.git> + DAV on + AuthType Basic + AuthName "Git" + AuthUserFile /usr/local/apache2/conf/passwd.git + Require valid-user + </Location> + + On Debian: + Create (or add to) /etc/apache2/conf.d/git.conf : + + <Location /my-new-repo.git> + DAV on + AuthType Basic + AuthName "Git" + AuthUserFile /etc/apache2/passwd.git + Require valid-user + </Location> + + Debian automatically reads all files under /etc/apache2/conf.d. + +The password file can be somewhere else, but it has to be readable by +Apache and preferably not readable by the world. + +Create this file by + $ htpasswd -c /usr/local/apache2/conf/passwd.git <user> + + On Debian: + $ htpasswd -c /etc/apache2/passwd.git <user> + +You will be asked a password, and the file is created. Subsequent calls +to htpasswd should omit the '-c' option, since you want to append to the +existing file. + +You need to restart Apache. + +Now go to http://<username>@<servername>/my-new-repo.git in your +browser to check whether it asks for a password and accepts the right +password. + +On Debian: + + To test the WebDAV part, do: + + $ apt-get install litmus + $ litmus http://<servername>/my-new-repo.git <username> <password> + + Most tests should pass. + +A command-line tool to test WebDAV is cadaver. If you prefer GUIs, for +example, konqueror can open WebDAV URLs as "webdav://..." or +"webdavs://...". + +If you're into Windows, from XP onwards Internet Explorer supports +WebDAV. For this, do Internet Explorer -> Open Location -> +http://<servername>/my-new-repo.git [x] Open as webfolder -> login . + + +Step 3: setup the client +------------------------ + +Make sure that you have HTTP support, i.e. your Git was built with +libcurl (version more recent than 7.10). The command 'git http-push' with +no argument should display a usage message. + +Then, add the following to your $HOME/.netrc (you can do without, but will be +asked to input your password a _lot_ of times): + + machine <servername> + login <username> + password <password> + +...and set permissions: + chmod 600 ~/.netrc + +If you want to access the web-server by its IP, you have to type that in, +instead of the server name. + +To check whether all is OK, do: + + curl --netrc --location -v http://<username>@<servername>/my-new-repo.git/HEAD + +...this should give something like 'ref: refs/heads/master', which is +the content of the file HEAD on the server. + +Now, add the remote in your existing repository which contains the project +you want to export: + + $ git-config remote.upload.url \ + http://<username>@<servername>/my-new-repo.git/ + +It is important to put the last '/'; Without it, the server will send +a redirect which git-http-push does not (yet) understand, and git-http-push +will repeat the request infinitely. + + +Step 4: make the initial push +----------------------------- + +From your client repository, do + + $ git push upload master + +This pushes branch 'master' (which is assumed to be the branch you +want to export) to repository called 'upload', which we previously +defined with git-config. + + +Using a proxy: +-------------- + +If you have to access the WebDAV server from behind an HTTP(S) proxy, +set the variable 'all_proxy' to `http://proxy-host.com:port`, or +`http://login-on-proxy:passwd-on-proxy@proxy-host.com:port`. See 'man +curl' for details. + + +Troubleshooting: +---------------- + +If git-http-push says + + Error: no DAV locking support on remote repo http://... + +then it means the web-server did not accept your authentication. Make sure +that the user name and password matches in httpd.conf, .netrc and the URL +you are uploading to. + +If git-http-push shows you an error (22/502) when trying to MOVE a blob, +it means that your web-server somehow does not recognize its name in the +request; This can happen when you start Apache, but then disable the +network interface. A simple restart of Apache helps. + +Errors like (22/502) are of format (curl error code/http error +code). So (22/404) means something like 'not found' at the server. + +Reading /usr/local/apache2/logs/error_log is often helpful. + + On Debian: Read /var/log/apache2/error.log instead. + +If you access HTTPS locations, Git may fail verifying the SSL +certificate (this is return code 60). Setting http.sslVerify=false can +help diagnosing the problem, but removes security checks. + + +Debian References: http://www.debian-administration.org/articles/285 + +Authors + Johannes Schindelin <Johannes.Schindelin@gmx.de> + Rutger Nijlunsing <git@wingding.demon.nl> + Matthieu Moy <Matthieu.Moy@imag.fr> diff --git a/Documentation/howto/update-hook-example.txt b/Documentation/howto/update-hook-example.txt new file mode 100644 index 0000000..151ee84 --- /dev/null +++ b/Documentation/howto/update-hook-example.txt @@ -0,0 +1,192 @@ +From: Junio C Hamano <gitster@pobox.com> and Carl Baldwin <cnb@fc.hp.com> +Subject: control access to branches. +Date: Thu, 17 Nov 2005 23:55:32 -0800 +Message-ID: <7vfypumlu3.fsf@assigned-by-dhcp.cox.net> +Abstract: An example hooks/update script is presented to + implement repository maintenance policies, such as who can push + into which branch and who can make a tag. +Content-type: text/asciidoc + +How to use the update hook +========================== + +When your developer runs git-push into the repository, +git-receive-pack is run (either locally or over ssh) as that +developer, so is hooks/update script. Quoting from the relevant +section of the documentation: + + Before each ref is updated, if $GIT_DIR/hooks/update file exists + and executable, it is called with three parameters: + + $GIT_DIR/hooks/update refname sha1-old sha1-new + + The refname parameter is relative to $GIT_DIR; e.g. for the + master head this is "refs/heads/master". Two sha1 are the + object names for the refname before and after the update. Note + that the hook is called before the refname is updated, so either + sha1-old is 0{40} (meaning there is no such ref yet), or it + should match what is recorded in refname. + +So if your policy is (1) always require fast-forward push +(i.e. never allow "git-push repo +branch:branch"), (2) you +have a list of users allowed to update each branch, and (3) you +do not let tags to be overwritten, then you can use something +like this as your hooks/update script. + +[jc: editorial note. This is a much improved version by Carl +since I posted the original outline] + +---------------------------------------------------- +#!/bin/bash + +umask 002 + +# If you are having trouble with this access control hook script +# you can try setting this to true. It will tell you exactly +# why a user is being allowed/denied access. + +verbose=false + +# Default shell globbing messes things up downstream +GLOBIGNORE=* + +function grant { + $verbose && echo >&2 "-Grant- $1" + echo grant + exit 0 +} + +function deny { + $verbose && echo >&2 "-Deny- $1" + echo deny + exit 1 +} + +function info { + $verbose && echo >&2 "-Info- $1" +} + +# Implement generic branch and tag policies. +# - Tags should not be updated once created. +# - Branches should only be fast-forwarded unless their pattern starts with '+' +case "$1" in + refs/tags/*) + git rev-parse --verify -q "$1" && + deny >/dev/null "You can't overwrite an existing tag" + ;; + refs/heads/*) + # No rebasing or rewinding + if expr "$2" : '0*$' >/dev/null; then + info "The branch '$1' is new..." + else + # updating -- make sure it is a fast-forward + mb=$(git merge-base "$2" "$3") + case "$mb,$2" in + "$2,$mb") info "Update is fast-forward" ;; + *) noff=y; info "This is not a fast-forward update.";; + esac + fi + ;; + *) + deny >/dev/null \ + "Branch is not under refs/heads or refs/tags. What are you trying to do?" + ;; +esac + +# Implement per-branch controls based on username +allowed_users_file=$GIT_DIR/info/allowed-users +username=$(id -u -n) +info "The user is: '$username'" + +if test -f "$allowed_users_file" +then + rc=$(cat $allowed_users_file | grep -v '^#' | grep -v '^$' | + while read heads user_patterns + do + # does this rule apply to us? + head_pattern=${heads#+} + matchlen=$(expr "$1" : "${head_pattern#+}") + test "$matchlen" = ${#1} || continue + + # if non-ff, $heads must be with the '+' prefix + test -n "$noff" && + test "$head_pattern" = "$heads" && continue + + info "Found matching head pattern: '$head_pattern'" + for user_pattern in $user_patterns; do + info "Checking user: '$username' against pattern: '$user_pattern'" + matchlen=$(expr "$username" : "$user_pattern") + if test "$matchlen" = "${#username}" + then + grant "Allowing user: '$username' with pattern: '$user_pattern'" + fi + done + deny "The user is not in the access list for this branch" + done + ) + case "$rc" in + grant) grant >/dev/null "Granting access based on $allowed_users_file" ;; + deny) deny >/dev/null "Denying access based on $allowed_users_file" ;; + *) ;; + esac +fi + +allowed_groups_file=$GIT_DIR/info/allowed-groups +groups=$(id -G -n) +info "The user belongs to the following groups:" +info "'$groups'" + +if test -f "$allowed_groups_file" +then + rc=$(cat $allowed_groups_file | grep -v '^#' | grep -v '^$' | + while read heads group_patterns + do + # does this rule apply to us? + head_pattern=${heads#+} + matchlen=$(expr "$1" : "${head_pattern#+}") + test "$matchlen" = ${#1} || continue + + # if non-ff, $heads must be with the '+' prefix + test -n "$noff" && + test "$head_pattern" = "$heads" && continue + + info "Found matching head pattern: '$head_pattern'" + for group_pattern in $group_patterns; do + for groupname in $groups; do + info "Checking group: '$groupname' against pattern: '$group_pattern'" + matchlen=$(expr "$groupname" : "$group_pattern") + if test "$matchlen" = "${#groupname}" + then + grant "Allowing group: '$groupname' with pattern: '$group_pattern'" + fi + done + done + deny "None of the user's groups are in the access list for this branch" + done + ) + case "$rc" in + grant) grant >/dev/null "Granting access based on $allowed_groups_file" ;; + deny) deny >/dev/null "Denying access based on $allowed_groups_file" ;; + *) ;; + esac +fi + +deny >/dev/null "There are no more rules to check. Denying access" +---------------------------------------------------- + +This uses two files, $GIT_DIR/info/allowed-users and +allowed-groups, to describe which heads can be pushed into by +whom. The format of each file would look like this: + + refs/heads/master junio + +refs/heads/seen junio + refs/heads/cogito$ pasky + refs/heads/bw/.* linus + refs/heads/tmp/.* .* + refs/tags/v[0-9].* junio + +With this, Linus can push or create "bw/penguin" or "bw/zebra" +or "bw/panda" branches, Pasky can do only "cogito", and JC can +do master and "seen" branches and make versioned tags. And anybody +can do tmp/blah branches. The '+' sign at the "seen" record means +that JC can make non-fast-forward pushes on it. diff --git a/Documentation/howto/use-git-daemon.txt b/Documentation/howto/use-git-daemon.txt new file mode 100644 index 0000000..2cad9b3 --- /dev/null +++ b/Documentation/howto/use-git-daemon.txt @@ -0,0 +1,54 @@ +Content-type: text/asciidoc + +How to use git-daemon +===================== + +Git can be run in inetd mode and in stand alone mode. But all you want is +to let a coworker pull from you, and therefore need to set up a Git server +real quick, right? + +Note that git-daemon is not really chatty at the moment, especially when +things do not go according to plan (e.g. a socket could not be bound). + +Another word of warning: if you run + + $ git ls-remote git://127.0.0.1/rule-the-world.git + +and you see a message like + + fatal: The remote end hung up unexpectedly + +it only means that _something_ went wrong. To find out _what_ went wrong, +you have to ask the server. (Git refuses to be more precise for your +security only. Take off your shoes now. You have any coins in your pockets? +Sorry, not allowed -- who knows what you planned to do with them?) + +With these two caveats, let's see an example: + + $ git daemon --reuseaddr --verbose --base-path=/home/gitte/git \ + --export-all -- /home/gitte/git/rule-the-world.git + +(Of course, unless your user name is `gitte` _and_ your repository is in +~/rule-the-world.git, you have to adjust the paths. If your repository is +not bare, be aware that you have to type the path to the .git directory!) + +This invocation tries to reuse the address if it is already taken +(this can save you some debugging, because otherwise killing and restarting +git-daemon could just silently fail to bind to a socket). + +Also, it is (relatively) verbose when somebody actually connects to it. +It also sets the base path, which means that all the projects which can be +accessed using this daemon have to reside in or under that path. + +The option `--export-all` just means that you _don't_ have to create a +file named `git-daemon-export-ok` in each exported repository. (Otherwise, +git-daemon would complain loudly, and refuse to cooperate.) + +Last of all, the repository which should be exported is specified. It is +a good practice to put the paths after a "--" separator. + +Now, test your daemon with + + $ git ls-remote git://127.0.0.1/rule-the-world.git + +If this does not work, find out why, and submit a patch to this document. diff --git a/Documentation/howto/using-merge-subtree.txt b/Documentation/howto/using-merge-subtree.txt new file mode 100644 index 0000000..3bd581a --- /dev/null +++ b/Documentation/howto/using-merge-subtree.txt @@ -0,0 +1,75 @@ +Date: Sat, 5 Jan 2008 20:17:40 -0500 +From: Sean <seanlkml@sympatico.ca> +To: Miklos Vajna <vmiklos@frugalware.org> +Cc: git@vger.kernel.org +Subject: how to use git merge -s subtree? +Abstract: In this article, Sean demonstrates how one can use the subtree merge + strategy. +Content-type: text/asciidoc +Message-ID: <BAYC1-PASMTP12374B54BA370A1E1C6E78AE4E0@CEZ.ICE> + +How to use the subtree merge strategy +===================================== + +There are situations where you want to include content in your project +from an independently developed project. You can just pull from the +other project as long as there are no conflicting paths. + +The problematic case is when there are conflicting files. Potential +candidates are Makefiles and other standard filenames. You could merge +these files but probably you do not want to. A better solution for this +problem can be to merge the project as its own subdirectory. This is not +supported by the 'recursive' merge strategy, so just pulling won't work. + +What you want is the 'subtree' merge strategy, which helps you in such a +situation. + +In this example, let's say you have the repository at `/path/to/B` (but +it can be a URL as well, if you want). You want to merge the 'master' +branch of that repository to the `dir-B` subdirectory in your current +branch. + +Here is the command sequence you need: + +---------------- +$ git remote add -f Bproject /path/to/B <1> +$ git merge -s ours --no-commit --allow-unrelated-histories Bproject/master <2> +$ git read-tree --prefix=dir-B/ -u Bproject/master <3> +$ git commit -m "Merge B project as our subdirectory" <4> + +$ git pull -s subtree Bproject master <5> +---------------- +<1> name the other project "Bproject", and fetch. +<2> prepare for the later step to record the result as a merge. +<3> read "master" branch of Bproject to the subdirectory "dir-B". +<4> record the merge result. +<5> maintain the result with subsequent merges using "subtree" + +The first four commands are used for the initial merge, while the last +one is to merge updates from 'B project'. + +Comparing 'subtree' merge with submodules +----------------------------------------- + +- The benefit of using subtree merge is that it requires less + administrative burden from the users of your repository. It works with + older (before Git v1.5.2) clients and you have the code right after + clone. + +- However if you use submodules then you can choose not to transfer the + submodule objects. This may be a problem with the subtree merge. + +- Also, in case you make changes to the other project, it is easier to + submit changes if you just use submodules. + +Additional tips +--------------- + +- If you made changes to the other project in your repository, they may + want to merge from your project. This is possible using subtree -- it + can shift up the paths in your tree and then they can merge only the + relevant parts of your tree. + +- Please note that if the other project merges from you, then it will + connect its history to yours, which can be something they don't want + to. diff --git a/Documentation/howto/using-signed-tag-in-pull-request.txt b/Documentation/howto/using-signed-tag-in-pull-request.txt new file mode 100644 index 0000000..bbf040e --- /dev/null +++ b/Documentation/howto/using-signed-tag-in-pull-request.txt @@ -0,0 +1,217 @@ +From: Junio C Hamano <gitster@pobox.com> +Date: Tue, 17 Jan 2011 13:00:00 -0800 +Subject: Using signed tag in pull requests +Abstract: Beginning v1.7.9, a contributor can push a signed tag to her + publishing repository and ask her integrator to pull it. This assures the + integrator that the pulled history is authentic and allows others to + later validate it. +Content-type: text/asciidoc + +How to use a signed tag in pull requests +======================================== + +A typical distributed workflow using Git is for a contributor to fork a +project, build on it, publish the result to her public repository, and ask +the "upstream" person (often the owner of the project where she forked +from) to pull from her public repository. Requesting such a "pull" is made +easy by the `git request-pull` command. + +Earlier, a typical pull request may have started like this: + +------------ + The following changes since commit 406da78032179...: + + Froboz 3.2 (2011-09-30 14:20:57 -0700) + + are available in the Git repository at: + + example.com:/git/froboz.git for-xyzzy +------------ + +followed by a shortlog of the changes and a diffstat. + +The request was for a branch name (e.g. `for-xyzzy`) in the public +repository of the contributor, and even though it stated where the +contributor forked her work from, the message did not say anything about +the commit to expect at the tip of the for-xyzzy branch. If the site that +hosts the public repository of the contributor cannot be fully trusted, it +was unnecessarily hard to make sure what was pulled by the integrator was +genuinely what the contributor had produced for the project. Also there +was no easy way for third-party auditors to later verify the resulting +history. + +Starting from Git release v1.7.9, a contributor can add a signed tag to +the commit at the tip of the history and ask the integrator to pull that +signed tag. When the integrator runs `git pull`, the signed tag is +automatically verified to assure that the history is not tampered with. +In addition, the resulting merge commit records the content of the signed +tag, so that other people can verify that the branch merged by the +integrator was signed by the contributor, without fetching the signed tag +used to validate the pull request separately and keeping it in the refs +namespace. + +This document describes the workflow between the contributor and the +integrator, using Git v1.7.9 or later. + + +A contributor or a lieutenant +----------------------------- + +After preparing her work to be pulled, the contributor uses `git tag -s` +to create a signed tag: + +------------ + $ git checkout work + $ ... "git pull" from sublieutenants, "git commit" your own work ... + $ git tag -s -m "Completed frotz feature" frotz-for-xyzzy work +------------ + +Note that this example uses the `-m` option to create a signed tag with +just a one-liner message, but this is for illustration purposes only. It +is advisable to compose a well-written explanation of what the topic does +to justify why it is worthwhile for the integrator to pull it, as this +message will eventually become part of the final history after the +integrator responds to the pull request (as we will see later). + +Then she pushes the tag out to her public repository: + +------------ + $ git push example.com:/git/froboz.git/ +frotz-for-xyzzy +------------ + +There is no need to push the `work` branch or anything else. + +Note that the above command line used a plus sign at the beginning of +`+frotz-for-xyzzy` to allow forcing the update of a tag, as the same +contributor may want to reuse a signed tag with the same name after the +previous pull request has already been responded to. + +The contributor then prepares a message to request a "pull": + +------------ + $ git request-pull v3.2 example.com:/git/froboz.git/ frotz-for-xyzzy >msg.txt +------------ + +The arguments are: + +. the version of the integrator's commit the contributor based her work on; +. the URL of the repository, to which the contributor has pushed what she + wants to get pulled; and +. the name of the tag the contributor wants to get pulled (earlier, she could + write only a branch name here). + +The resulting msg.txt file begins like so: + +------------ + The following changes since commit 406da78032179...: + + Froboz 3.2 (2011-09-30 14:20:57 -0700) + + are available in the Git repository at: + + example.com:/git/froboz.git tags/frotz-for-xyzzy + + for you to fetch changes up to 703f05ad5835c...: + + Add tests and documentation for frotz (2011-12-02 10:02:52 -0800) + + ----------------------------------------------- + Completed frotz feature + ----------------------------------------------- +------------ + +followed by a shortlog of the changes and a diffstat. Comparing this with +the earlier illustration of the output from the traditional `git request-pull` +command, the reader should notice that: + +. The tip commit to expect is shown to the integrator; and +. The signed tag message is shown prominently between the dashed lines + before the shortlog. + +The latter is why the contributor would want to justify why pulling her +work is worthwhile when creating the signed tag. The contributor then +opens her favorite MUA, reads msg.txt, edits and sends it to her upstream +integrator. + + +Integrator +---------- + +After receiving such a pull request message, the integrator fetches and +integrates the tag named in the request, with: + +------------ + $ git pull example.com:/git/froboz.git/ tags/frotz-for-xyzzy +------------ + +This operation will always open an editor to allow the integrator to fine +tune the commit log message when merging a signed tag. Also, pulling a +signed tag will always create a merge commit even when the integrator does +not have any new commit since the contributor's work forked (i.e. 'fast +forward'), so that the integrator can properly explain what the merge is +about and why it was made. + +In the editor, the integrator will see something like this: + +------------ + Merge tag 'frotz-for-xyzzy' of example.com:/git/froboz.git/ + + Completed frotz feature + # gpg: Signature made Fri 02 Dec 2011 10:03:01 AM PST using RSA key ID 96AFE6CB + # gpg: Good signature from "Con Tributor <nitfol@example.com>" +------------ + +Notice that the message recorded in the signed tag "Completed frotz +feature" appears here, and again that is why it is important for the +contributor to explain her work well when creating the signed tag. + +As usual, the lines commented with `#` are stripped out. The resulting +commit records the signed tag used for this validation in a hidden field +so that it can later be used by others to audit the history. There is no +need for the integrator to keep a separate copy of the tag in his +repository (i.e. `git tag -l` won't list the `frotz-for-xyzzy` tag in the +above example), and there is no need to publish the tag to his public +repository, either. + +After the integrator responds to the pull request and her work becomes +part of the permanent history, the contributor can remove the tag from +her public repository, if she chooses, in order to keep the tag namespace +of her public repository clean, with: + +------------ + $ git push example.com:/git/froboz.git :frotz-for-xyzzy +------------ + + +Auditors +-------- + +The `--show-signature` option can be given to `git log` or `git show` and +shows the verification status of the embedded signed tag in merge commits +created when the integrator responded to a pull request of a signed tag. + +A typical output from `git show --show-signature` may look like this: + +------------ + $ git show --show-signature + commit 02306ef6a3498a39118aef9df7975bdb50091585 + merged tag 'frotz-for-xyzzy' + gpg: Signature made Fri 06 Jan 2012 12:41:49 PM PST using RSA key ID 96AFE6CB + gpg: Good signature from "Con Tributor <nitfol@example.com>" + Merge: 406da78 703f05a + Author: Inte Grator <xyzzy@example.com> + Date: Tue Jan 17 13:49:41 2012 -0800 + + Merge tag 'frotz-for-xyzzy' of example.com:/git/froboz.git/ + + Completed frotz feature + + * tag 'frotz-for-xyzzy' (100 commits) + Add tests and documentation for frotz + ... +------------ + +There is no need for the auditor to explicitly fetch the contributor's +signature, or to even be aware of what tag(s) the contributor and integrator +used to communicate the signature. All the required information is recorded +as part of the merge commit. |