diff options
Diffstat (limited to 'Documentation/git-fast-export.txt')
-rw-r--r-- | Documentation/git-fast-export.txt | 284 |
1 files changed, 284 insertions, 0 deletions
diff --git a/Documentation/git-fast-export.txt b/Documentation/git-fast-export.txt new file mode 100644 index 0000000..4643ddb --- /dev/null +++ b/Documentation/git-fast-export.txt @@ -0,0 +1,284 @@ +git-fast-export(1) +================== + +NAME +---- +git-fast-export - Git data exporter + + +SYNOPSIS +-------- +[verse] +'git fast-export' [<options>] | 'git fast-import' + +DESCRIPTION +----------- +This program dumps the given revisions in a form suitable to be piped +into 'git fast-import'. + +You can use it as a human-readable bundle replacement (see +linkgit:git-bundle[1]), or as a format that can be edited before being +fed to 'git fast-import' in order to do history rewrites (an ability +relied on by tools like 'git filter-repo'). + +OPTIONS +------- +--progress=<n>:: + Insert 'progress' statements every <n> objects, to be shown by + 'git fast-import' during import. + +--signed-tags=(verbatim|warn|warn-strip|strip|abort):: + Specify how to handle signed tags. Since any transformation + after the export can change the tag names (which can also happen + when excluding revisions) the signatures will not match. ++ +When asking to 'abort' (which is the default), this program will die +when encountering a signed tag. With 'strip', the tags will silently +be made unsigned, with 'warn-strip' they will be made unsigned but a +warning will be displayed, with 'verbatim', they will be silently +exported and with 'warn', they will be exported, but you will see a +warning. + +--tag-of-filtered-object=(abort|drop|rewrite):: + Specify how to handle tags whose tagged object is filtered out. + Since revisions and files to export can be limited by path, + tagged objects may be filtered completely. ++ +When asking to 'abort' (which is the default), this program will die +when encountering such a tag. With 'drop' it will omit such tags from +the output. With 'rewrite', if the tagged object is a commit, it will +rewrite the tag to tag an ancestor commit (via parent rewriting; see +linkgit:git-rev-list[1]) + +-M:: +-C:: + Perform move and/or copy detection, as described in the + linkgit:git-diff[1] manual page, and use it to generate + rename and copy commands in the output dump. ++ +Note that earlier versions of this command did not complain and +produced incorrect results if you gave these options. + +--export-marks=<file>:: + Dumps the internal marks table to <file> when complete. + Marks are written one per line as `:markid SHA-1`. Only marks + for revisions are dumped; marks for blobs are ignored. + Backends can use this file to validate imports after they + have been completed, or to save the marks table across + incremental runs. As <file> is only opened and truncated + at completion, the same path can also be safely given to + --import-marks. + The file will not be written if no new object has been + marked/exported. + +--import-marks=<file>:: + Before processing any input, load the marks specified in + <file>. The input file must exist, must be readable, and + must use the same format as produced by --export-marks. + +--mark-tags:: + In addition to labelling blobs and commits with mark ids, also + label tags. This is useful in conjunction with + `--export-marks` and `--import-marks`, and is also useful (and + necessary) for exporting of nested tags. It does not hurt + other cases and would be the default, but many fast-import + frontends are not prepared to accept tags with mark + identifiers. ++ +Any commits (or tags) that have already been marked will not be +exported again. If the backend uses a similar --import-marks file, +this allows for incremental bidirectional exporting of the repository +by keeping the marks the same across runs. + +--fake-missing-tagger:: + Some old repositories have tags without a tagger. The + fast-import protocol was pretty strict about that, and did not + allow that. So fake a tagger to be able to fast-import the + output. + +--use-done-feature:: + Start the stream with a 'feature done' stanza, and terminate + it with a 'done' command. + +--no-data:: + Skip output of blob objects and instead refer to blobs via + their original SHA-1 hash. This is useful when rewriting the + directory structure or history of a repository without + touching the contents of individual files. Note that the + resulting stream can only be used by a repository which + already contains the necessary objects. + +--full-tree:: + This option will cause fast-export to issue a "deleteall" + directive for each commit followed by a full list of all files + in the commit (as opposed to just listing the files which are + different from the commit's first parent). + +--anonymize:: + Anonymize the contents of the repository while still retaining + the shape of the history and stored tree. See the section on + `ANONYMIZING` below. + +--anonymize-map=<from>[:<to>]:: + Convert token `<from>` to `<to>` in the anonymized output. If + `<to>` is omitted, map `<from>` to itself (i.e., do not + anonymize it). See the section on `ANONYMIZING` below. + +--reference-excluded-parents:: + By default, running a command such as `git fast-export + master~5..master` will not include the commit master{tilde}5 + and will make master{tilde}4 no longer have master{tilde}5 as + a parent (though both the old master{tilde}4 and new + master{tilde}4 will have all the same files). Use + --reference-excluded-parents to instead have the stream + refer to commits in the excluded range of history by their + sha1sum. Note that the resulting stream can only be used by a + repository which already contains the necessary parent + commits. + +--show-original-ids:: + Add an extra directive to the output for commits and blobs, + `original-oid <SHA1SUM>`. While such directives will likely be + ignored by importers such as git-fast-import, it may be useful + for intermediary filters (e.g. for rewriting commit messages + which refer to older commits, or for stripping blobs by id). + +--reencode=(yes|no|abort):: + Specify how to handle `encoding` header in commit objects. When + asking to 'abort' (which is the default), this program will die + when encountering such a commit object. With 'yes', the commit + message will be re-encoded into UTF-8. With 'no', the original + encoding will be preserved. + +--refspec:: + Apply the specified refspec to each ref exported. Multiple of them can + be specified. + +[<git-rev-list-args>...]:: + A list of arguments, acceptable to 'git rev-parse' and + 'git rev-list', that specifies the specific objects and references + to export. For example, `master~10..master` causes the + current master reference to be exported along with all objects + added since its 10th ancestor commit and (unless the + --reference-excluded-parents option is specified) all files + common to master{tilde}9 and master{tilde}10. + +EXAMPLES +-------- + +------------------------------------------------------------------- +$ git fast-export --all | (cd /empty/repository && git fast-import) +------------------------------------------------------------------- + +This will export the whole repository and import it into the existing +empty repository. Except for reencoding commits that are not in +UTF-8, it would be a one-to-one mirror. + +----------------------------------------------------- +$ git fast-export master~5..master | + sed "s|refs/heads/master|refs/heads/other|" | + git fast-import +----------------------------------------------------- + +This makes a new branch called 'other' from 'master~5..master' +(i.e. if 'master' has linear history, it will take the last 5 commits). + +Note that this assumes that none of the blobs and commit messages +referenced by that revision range contains the string +'refs/heads/master'. + + +ANONYMIZING +----------- + +If the `--anonymize` option is given, git will attempt to remove all +identifying information from the repository while still retaining enough +of the original tree and history patterns to reproduce some bugs. The +goal is that a git bug which is found on a private repository will +persist in the anonymized repository, and the latter can be shared with +git developers to help solve the bug. + +With this option, git will replace all refnames, paths, blob contents, +commit and tag messages, names, and email addresses in the output with +anonymized data. Two instances of the same string will be replaced +equivalently (e.g., two commits with the same author will have the same +anonymized author in the output, but bear no resemblance to the original +author string). The relationship between commits, branches, and tags is +retained, as well as the commit timestamps (but the commit messages and +refnames bear no resemblance to the originals). The relative makeup of +the tree is retained (e.g., if you have a root tree with 10 files and 3 +trees, so will the output), but their names and the contents of the +files will be replaced. + +If you think you have found a git bug, you can start by exporting an +anonymized stream of the whole repository: + +--------------------------------------------------- +$ git fast-export --anonymize --all >anon-stream +--------------------------------------------------- + +Then confirm that the bug persists in a repository created from that +stream (many bugs will not, as they really do depend on the exact +repository contents): + +--------------------------------------------------- +$ git init anon-repo +$ cd anon-repo +$ git fast-import <../anon-stream +$ ... test your bug ... +--------------------------------------------------- + +If the anonymized repository shows the bug, it may be worth sharing +`anon-stream` along with a regular bug report. Note that the anonymized +stream compresses very well, so gzipping it is encouraged. If you want +to examine the stream to see that it does not contain any private data, +you can peruse it directly before sending. You may also want to try: + +--------------------------------------------------- +$ perl -pe 's/\d+/X/g' <anon-stream | sort -u | less +--------------------------------------------------- + +which shows all of the unique lines (with numbers converted to "X", to +collapse "User 0", "User 1", etc into "User X"). This produces a much +smaller output, and it is usually easy to quickly confirm that there is +no private data in the stream. + +Reproducing some bugs may require referencing particular commits or +paths, which becomes challenging after refnames and paths have been +anonymized. You can ask for a particular token to be left as-is or +mapped to a new value. For example, if you have a bug which reproduces +with `git rev-list sensitive -- secret.c`, you can run: + +--------------------------------------------------- +$ git fast-export --anonymize --all \ + --anonymize-map=sensitive:foo \ + --anonymize-map=secret.c:bar.c \ + >stream +--------------------------------------------------- + +After importing the stream, you can then run `git rev-list foo -- bar.c` +in the anonymized repository. + +Note that paths and refnames are split into tokens at slash boundaries. +The command above would anonymize `subdir/secret.c` as something like +`path123/bar.c`; you could then search for `bar.c` in the anonymized +repository to determine the final pathname. + +To make referencing the final pathname simpler, you can map each path +component; so if you also anonymize `subdir` to `publicdir`, then the +final pathname would be `publicdir/bar.c`. + +LIMITATIONS +----------- + +Since 'git fast-import' cannot tag trees, you will not be +able to export the linux.git repository completely, as it contains +a tag referencing a tree instead of a commit. + +SEE ALSO +-------- +linkgit:git-fast-import[1] + +GIT +--- +Part of the linkgit:git[1] suite |