Mirror of https://github.com/git/git.git (synced 2025-03-21 16:13:59 +00:00)
user-manual: rewrite index discussion

Add an example using git-ls-files, standardize on the new "index"
terminology (as opposed to "cache"), attempt to clarify discussion and
make it a little shorter, avoid some unnecessary jargon ("write-back
cache").

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
parent 1c6045fffa
commit 1c097891e4
@@ -2911,57 +2911,63 @@ gitlink:git-verify-tag[1].
 
 
 [[the-index]]
-The "index" aka "Current Directory Cache"
------------------------------------------
+The index
+-----------
 
-The index is a simple binary file, which contains an efficient
-representation of the contents of a virtual directory. It
-does so by a simple array that associates a set of names, dates,
-permissions and content (aka "blob") objects together. The cache is
-always kept ordered by name, and names are unique (with a few very
-specific rules) at any point in time, but the cache has no long-term
-meaning, and can be partially updated at any time.
+The index is a binary file (generally kept in .git/index) containing a
+sorted list of path names, each with permissions and the SHA1 of a blob
+object; gitlink:git-ls-files[1] can show you the contents of the index:
 
-In particular, the index certainly does not need to be consistent with
-the current directory contents (in fact, most operations will depend on
-different ways to make the index 'not' be consistent with the directory
-hierarchy), but it has three very important attributes:
+-------------------------------------------------
+$ git ls-files --stage
+100644 63c918c667fa005ff12ad89437f2fdc80926e21c 0 .gitignore
+100644 5529b198e8d14decbe4ad99db3f7fb632de0439d 0 .mailmap
+100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0 COPYING
+100644 a37b2152bd26be2c2289e1f57a292534a51a93c7 0 Documentation/.gitignore
+100644 fbefe9a45b00a54b58d94d06eca48b03d40a50e0 0 Documentation/Makefile
+...
+100644 2511aef8d89ab52be5ec6a5e46236b4b6bcd07ea 0 xdiff/xtypes.h
+100644 2ade97b2574a9f77e7ae4002a4e07a6a38e46d07 0 xdiff/xutils.c
+100644 d5de8292e05e7c36c4b68857c1cf9855e3d2f70a 0 xdiff/xutils.h
+-------------------------------------------------
 
-'(a) it can re-generate the full state it caches (not just the
-directory structure: it contains pointers to the "blob" objects so
-that it can regenerate the data too)'
+Note that in older documentation you may see the index called the
+"current directory cache" or just the "cache". It has three important
+properties:
 
-As a special case, there is a clear and unambiguous one-way mapping
-from a current directory cache to a "tree object", which can be
-efficiently created from just the current directory cache without
-actually looking at any other data. So a directory cache at any one
-time uniquely specifies one and only one "tree" object (but has
-additional data to make it easy to match up that tree object with what
-has happened in the directory)
+1. The index contains all the information necessary to generate a single
+(uniquely determined) tree object.
++
+For example, running gitlink:git-commit[1] generates this tree object
+from the index, stores it in the object database, and uses it as the
+tree object associated with the new commit.
 
-'(b) it has efficient methods for finding inconsistencies between that
-cached state ("tree object waiting to be instantiated") and the
-current state.'
+2. The index enables fast comparisons between the tree object it defines
+and the working tree.
++
+It does this by storing some additional data for each entry (such as
+the last modified time). This data is not displayed above, and is not
+stored in the created tree object, but it can be used to determine
+quickly which files in the working directory differ from what was
+stored in the index, and thus save git from having to read all of the
+data from such files to look for changes.
 
-'(c) it can additionally efficiently represent information about merge
-conflicts between different tree objects, allowing each pathname to be
+3. It can efficiently represent information about merge conflicts
+between different tree objects, allowing each pathname to be
 associated with sufficient information about the trees involved that
-you can create a three-way merge between them.'
+you can create a three-way merge between them.
++
+We saw in <<conflict-resolution>> that during a merge the index can
+store multiple versions of a single file (called "stages"). The third
+column in the gitlink:git-ls-files[1] output above is the stage
+number, and will take on values other than 0 for files with merge
+conflicts.
 
-Those are the ONLY three things that the directory cache does. It's a
-cache, and the normal operation is to re-generate it completely from a
-known tree object, or update/compare it with a live tree that is being
-developed. If you blow the directory cache away entirely, you generally
-haven't lost any information as long as you have the name of the tree
-that it described.
+The index is thus a sort of temporary staging area, which is filled with
+a tree which you are in the process of working on.
 
-At the same time, the index is also the staging area for creating
-new trees, and creating a new tree always involves a controlled
-modification of the index file. In particular, the index file can
-have the representation of an intermediate tree that has not yet been
-instantiated. So the index can be thought of as a write-back cache,
-which can contain dirty information that has not yet been written back
-to the backing store.
+If you blow the index away entirely, you generally haven't lost any
+information as long as you have the name of the tree that it described.
 
 [[low-level-operations]]
 Low-level git operations
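
Note on the added `git ls-files --stage` example: each line of that listing carries a mode, a blob SHA1, a stage number and a path, and the new text says the stage is nonzero for files with merge conflicts. A minimal sketch of reading such a listing follows. The names `IndexEntry`, `parse_stage_line` and `conflicted_paths` are hypothetical helpers, not part of git, and the sketch assumes paths are printed unquoted (git may quote unusual path names unless `-z` is used).

```python
from typing import Iterable, List, NamedTuple


class IndexEntry(NamedTuple):
    """One line of `git ls-files --stage` output (hypothetical helper type)."""
    mode: str
    sha1: str
    stage: int
    path: str


def parse_stage_line(line: str) -> IndexEntry:
    # Fields are whitespace-separated; the path is everything after the
    # third field, so paths containing spaces are kept intact.
    mode, sha1, stage, path = line.rstrip("\n").split(None, 3)
    return IndexEntry(mode, sha1, int(stage), path)


def conflicted_paths(lines: Iterable[str]) -> List[str]:
    # A path is treated as conflicted if any of its entries has a
    # nonzero stage number.
    entries = (parse_stage_line(line) for line in lines)
    return sorted({e.path for e in entries if e.stage != 0})


# Two lines copied from the listing in the patch above.
SAMPLE = [
    "100644 63c918c667fa005ff12ad89437f2fdc80926e21c 0 .gitignore",
    "100644 6ff87c4664981e4397625791c8ea3bbb5f2279a3 0 COPYING",
]

if __name__ == "__main__":
    for entry in map(parse_stage_line, SAMPLE):
        print(entry.stage, entry.mode, entry.path)
    print("conflicted:", conflicted_paths(SAMPLE))
```

In a real repository the input lines could come from running `git ls-files --stage` and splitting its output on newlines; the sample here stays with the lines already shown in the patch.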