mirror of https://github.com/git/git.git synced 2025-04-05 04:09:31 +00:00
Dmitry Ivankov a7e9c34126 fast-import: treat cat-blob as a delta base hint for next blob
The delta base for a blob is the previously saved blob. If we treat
the blob requested by cat-blob as the delta base for the next blob,
nothing is likely to become worse.

A fast-import stream producer such as svn-fe uses cat-blob as
follows:
- svn-fe reads a file delta in svn format
- to apply it, svn-fe requests the 'svn delta base' via cat-blob
- it applies the 'svn delta' to the response
- it produces a blob command to store the result
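
As a rough illustration of the steps above, the sketch below formats the
two stream commands a front end would send for one file revision. The
`cat-blob`, `blob`, `mark` and `data` commands belong to the fast-import
stream format; the helper names are hypothetical, and reading the
cat-blob response and applying the svn delta are left out.

```python
def cat_blob_command(mark: int) -> bytes:
    """Format a 'cat-blob' request for the blob stored under :mark."""
    return b"cat-blob :%d\n" % mark


def blob_command(mark: int, payload: bytes) -> bytes:
    """Format a 'blob' command that stores payload under :mark."""
    return b"blob\nmark :%d\ndata %d\n%s\n" % (mark, len(payload), payload)


def next_revision_commands(base_mark: int, new_mark: int, new_payload: bytes) -> bytes:
    """Commands for one revision: request the base, then store the result.

    new_payload is assumed to be the content the front end computed by
    applying the svn delta to the cat-blob response (not modelled here).
    """
    return cat_blob_command(base_mark) + blob_command(new_mark, new_payload)
```

In a real exchange the front end has to read the cat-blob response
before it can compute the payload, so the two commands are not written
back to back as they are in this simplified helper.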

Currently there is no way for svn-fe to give fast-import a hint about
an object's delta base, even though the blob requested with cat-blob
is, most of the time, the best delta base possible. Of course, it may
turn out not to be a good delta base, but we don't know of a better
one anyway.

So treat cat-blob's result as the delta base for the next blob. The
gain is substantial: a 2x to 7x reduction in pack size and a 1.2x to
3x speedup, because diff_delta is faster on good deltas. git gc
--aggressive can compress the pack by a further 10% to 70%, at the
cost of more cpu time, more real time and 3 cpu cores.
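
The effect of a better base can be sketched with a toy model. This is
not fast-import's code: zlib with a preset dictionary stands in for
delta encoding (it is not Git's diff_delta), and `ImportModel`,
`pseudo_file` and `encoded_size` are hypothetical names. The model only
assumes what the message states: the default base is the previously
saved blob, and cat-blob replaces it with the requested blob.

```python
import hashlib
import zlib


def pseudo_file(seed: str, lines: int = 200) -> bytes:
    """Deterministic, poorly compressible text standing in for a file."""
    return b"".join(
        hashlib.sha256(f"{seed}-{i}".encode()).hexdigest().encode() + b"\n"
        for i in range(lines)
    )


def encoded_size(target: bytes, base: bytes) -> int:
    """Size of target compressed with base as a preset dictionary.

    Stand-in for delta encoding; not Git's diff_delta.
    """
    compressor = zlib.compressobj(zdict=base) if base else zlib.compressobj()
    return len(compressor.compress(target) + compressor.flush())


class ImportModel:
    """Toy model that tracks which blob serves as the next delta base."""

    def __init__(self) -> None:
        self.blobs = {}         # mark -> content
        self.base_hint = b""    # content used as the next base

    def cat_blob(self, mark: int) -> bytes:
        content = self.blobs[mark]
        self.base_hint = content  # behaviour described in the message
        return content

    def store_blob(self, mark: int, content: bytes) -> int:
        size = encoded_size(content, self.base_hint)
        self.blobs[mark] = content
        self.base_hint = content  # default base: previously saved blob
        return size


def demo() -> tuple:
    a = pseudo_file("a")
    b = pseudo_file("b")
    a2 = a.replace(a[:65], b"changed first line\n", 1)  # small edit to a

    without_hint = ImportModel()
    without_hint.store_blob(1, a)
    without_hint.store_blob(2, b)
    size_without = without_hint.store_blob(3, a2)  # base is unrelated b

    with_hint = ImportModel()
    with_hint.store_blob(1, a)
    with_hint.store_blob(2, b)
    with_hint.cat_blob(1)                          # front end asks for a
    size_with = with_hint.store_blob(3, a2)        # base is now a
    return size_without, size_with
```

Calling `demo()` returns the two encoded sizes; the second is smaller
because the new content is nearly identical to the requested base. The
actual ratios reported above come from real svn-fe streams, not from
this model.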

Tested on 213M and 2.7G fast-import streams: the resulting packs are
22M and 113M and the import times are 7s and 60s. Both streams were
produced by svn-fe, captured, and then used as raw input for
fast-import.

For streams produced by git-fast-export there is no change, as it
doesn't use cat-blob and doesn't try to reorder blobs in a smart way
to make successive deltas small.

Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Acked-by: David Barr <davidbarr@google.com>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-22 11:57:07 -07:00

////////////////////////////////////////////////////////////////

	GIT - the stupid content tracker

////////////////////////////////////////////////////////////////

"git" can mean anything, depending on your mood.

 - random three-letter combination that is pronounceable, and not
   actually used by any common UNIX command.  The fact that it is a
   mispronunciation of "get" may or may not be relevant.
 - stupid. contemptible and despicable. simple. Take your pick from the
   dictionary of slang.
 - "global information tracker": you're in a good mood, and it actually
   works for you. Angels sing, and a light suddenly fills the room.
 - "goddamn idiotic truckload of sh*t": when it breaks

Git is a fast, scalable, distributed revision control system with an
unusually rich command set that provides both high-level operations
and full access to internals.

Git is an Open Source project covered by the GNU General Public License.
It was originally written by Linus Torvalds with the help of a group of
hackers around the net. It is currently maintained by Junio C Hamano.

Please read the file INSTALL for installation instructions.

See Documentation/gittutorial.txt to get started, then see
Documentation/everyday.txt for a useful minimum set of commands, and
Documentation/git-commandname.txt for documentation of each command.
If git has been correctly installed, then the tutorial can also be
read with "man gittutorial" or "git help tutorial", and the
documentation of each command with "man git-commandname" or "git help
commandname".

CVS users may also want to read Documentation/gitcvs-migration.txt
("man gitcvs-migration" or "git help cvs-migration" if git is
installed).

Many Git online resources are accessible from http://git-scm.com/
including full documentation and Git related tools.

The user discussion and development of Git take place on the Git
mailing list -- everyone is welcome to post bug reports, feature
requests, comments and patches to git@vger.kernel.org. To subscribe
to the list, send an email with just "subscribe git" in the body to
majordomo@vger.kernel.org. The mailing list archives are available at
http://marc.theaimsgroup.com/?l=git and other archival sites.

The messages titled "A note from the maintainer", "What's in
git.git (stable)" and "What's cooking in git.git (topics)" and
the discussion following them on the mailing list give a good
reference for project status, development direction and
remaining tasks.
Description
Git Source Code Mirror - This is a publish-only repository but pull requests can be turned into patches to the mailing list via GitGitGadget (https://gitgitgadget.github.io/). Please follow Documentation/SubmittingPatches procedure for any of your improvements.
Languages
C 50.1%
Shell 38.4%
Perl 5.1%
Tcl 3.2%
Python 0.8%
Other 2.1%