From: Junio C Hamano <gitster@pobox.com>
Date: Wed, 07 May 2014 13:15:39 -0700
Subject: Beginner question on "Pull is mostly evil"
Abstract: This how-to explains a method for keeping a
 project's history correct when using git pull.
Content-type: text/asciidoc

Keep authoritative canonical history correct with git pull
==========================================================

Sometimes a new project integrator will end up with project history
that appears to be "backwards" from what other project developers
expect. This howto presents a suggested integration workflow for
maintaining a central repository.

Suppose that the central repository has this history:

------------
    ---o---o---A
------------

which ends at commit `A` (time flows from left to right, each node
in the graph is a commit, and the lines between them indicate
parent-child relationships).

Then you clone it and work on your own commits, which leads you to
have this history in *your* repository:

------------
    ---o---o---A---B---C
------------

Imagine your coworker did the same and built on top of `A` in *his*
repository in the meantime, and then pushed it to the
central repository:

------------
    ---o---o---A---X---Y---Z
------------

Now, if you `git push` at this point, the push will fail, because your
history that leads to `C` lacks `X`, `Y` and `Z`.  You need to somehow
make the tip of your history a descendant of `Z`.
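
The rejected push can be reproduced in a throwaway sandbox. The
following is only a sketch: the directory names (`central.git`, `seed`,
`you`, `coworker`), the `commit` helper and the demo identity are
assumptions made for illustration and are not part of the original
scenario.

```shell
set -e
sandbox=$(mktemp -d)    # throwaway directory; all names here are hypothetical
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# commit <repo> <name>: add a file <name>.txt and record it as commit <name>
commit() {
    echo "$2" >"$1/$2.txt"
    git -C "$1" add "$2.txt"
    git -C "$1" commit -q -m "$2"
}

# Central repository whose history ends at A.
git init -q --bare central.git
git -C central.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
commit seed A
git -C seed push -q "$sandbox/central.git" master

# You and your coworker both clone at A and work independently.
git clone -q "$sandbox/central.git" you
git clone -q "$sandbox/central.git" coworker
for c in B C; do commit you "$c"; done
for c in X Y Z; do commit coworker "$c"; done
git -C coworker push -q origin master       # the coworker gets there first

# Your push is not a fast-forward of Z, so the central repository refuses it.
if git -C you push -q origin master 2>/dev/null; then
    push_result=accepted
else
    push_result=rejected
fi
echo "your push was $push_result"
```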

One suggested way to solve the problem is "fetch and then merge", aka
`git pull`. When you fetch, your repository will have a history like
this:

------------
    ---o---o---A---B---C
		\
		 X---Y---Z
------------

Once you run merge after that, while still on *your* branch, i.e. `C`,
you will create a merge `M` and make the history look like this:

------------
    ---o---o---A---B---C---M
		\         /
		 X---Y---Z
------------

`M` is a descendant of `Z`, so you can push to update the central
repository.  Such a merge `M` does not lose any commit from either
history, so in that sense it may not be wrong, but when people want
to talk about "the authoritative canonical history that is shared
among the project participants", i.e. "the trunk", they often view
it as "commits you see by following the first-parent chain", and use
this command to view it:

------------
    $ git log --first-parent
------------

For all other people who observed the central repository after your
coworker pushed `Z` but before you pushed `M`, the commits on the trunk
used to be `o-o-A-X-Y-Z`.  But because you made `M` while you were on
`C`, `M`'s first parent is `C`, so by pushing `M` to advance the
central repository, you turned `X-Y-Z` into a side branch that is no
longer on the trunk.
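
The effect on the first-parent chain can be checked directly. This is a
sketch under the same kind of assumptions as any sandbox demo: the
repository names, the `commit` helper and the merge message `M` are
hypothetical, and "fetch and then merge" is spelled out as two commands
so that the result does not depend on how `git pull` is configured.

```shell
set -e
sandbox=$(mktemp -d)    # throwaway directory; all names here are hypothetical
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# commit <repo> <name>: add a file <name>.txt and record it as commit <name>
commit() {
    echo "$2" >"$1/$2.txt"
    git -C "$1" add "$2.txt"
    git -C "$1" commit -q -m "$2"
}

git init -q --bare central.git
git -C central.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
commit seed A
git -C seed push -q "$sandbox/central.git" master

git clone -q "$sandbox/central.git" you
git clone -q "$sandbox/central.git" coworker
for c in B C; do commit you "$c"; done
for c in X Y Z; do commit coworker "$c"; done
git -C coworker push -q origin master

# Fetch, then merge while still on your own branch (tip C).
git -C you fetch -q origin
git -C you merge -q --no-edit -m M origin/master

# The first-parent chain follows your commits; X, Y and Z are absent.
trunk=$(git -C you log --first-parent --format=%s | paste -s -d ' ' -)
echo "$trunk"                                       # prints: M C B A
first_parent=$(git -C you log -1 --format=%s 'HEAD^1')
second_parent=$(git -C you log -1 --format=%s 'HEAD^2')
echo "first parent: $first_parent, second parent: $second_parent"
```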

You would rather have a history of this shape:

------------
    ---o---o---A---X---Y---Z---M'
		\             /
		 B-----------C
------------

so that in the first-parent chain, it is clear that the project first
did `X` and then `Y` and then `Z`, and then merged a change that
consists of two commits `B` and `C` that together achieve a single
goal.  You may have worked on fixing the bug #12345 with these two
patches, and the merge `M'` with swapped parents can say in its log
message "Merge fix-bug-12345". Having a way to tell `git pull` to
create a merge but record the parents in reverse order may be a way to
do so.
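
Until such an option exists, one manual way to obtain the swapped
parent order is to start from the upstream tip and merge your own work
into it. The sketch below is an assumption about how one might do this,
not a feature of `git pull`; the branch name `fix-bug-12345`, the
repository names and the `commit` helper are hypothetical.

```shell
set -e
sandbox=$(mktemp -d)    # throwaway directory; all names here are hypothetical
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# commit <repo> <name>: add a file <name>.txt and record it as commit <name>
commit() {
    echo "$2" >"$1/$2.txt"
    git -C "$1" add "$2.txt"
    git -C "$1" commit -q -m "$2"
}

git init -q --bare central.git
git -C central.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
commit seed A
git -C seed push -q "$sandbox/central.git" master

git clone -q "$sandbox/central.git" you
git clone -q "$sandbox/central.git" coworker
for c in B C; do commit you "$c"; done
for c in X Y Z; do commit coworker "$c"; done
git -C coworker push -q origin master

git -C you fetch -q origin
git -C you branch fix-bug-12345 master      # keep B and C under a topic name
git -C you reset -q --hard origin/master    # master now ends at Z
# Merging from Z makes Z the first parent and C the second parent.
git -C you merge -q --no-ff --no-edit -m "Merge fix-bug-12345" fix-bug-12345
git -C you push -q origin master

trunk=$(git -C you log --first-parent --format=%s | paste -s -d ',' -)
echo "$trunk"       # prints: Merge fix-bug-12345,Z,Y,X,A
```

Note that `git reset --hard` discards uncommitted changes, so this
should only be run with a clean working tree.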

Note that I said "achieves a single goal" above, because this is
important.  "Swapping the merge order" only covers a special case
where the project does not care too much about having unrelated
things done in a single merge but cares a lot about the first-parent
chain.

There are multiple schools of thought about "trunk" management.

 1. Some projects want to keep a completely linear history without any
    merges.  Obviously, swapping the merge order would not match their
    taste.  You would need to flatten your history on top of the
    updated upstream to result in a history of this shape instead:
+
------------
    ---o---o---A---X---Y---Z---B---C
------------
+
for example with `git pull --rebase`.

 2. Some projects tolerate merges in their history, but do not worry
    too much about the first-parent order, and allow fast-forward
    merges.  To them, swapping the merge order does not hurt, but
    it is unnecessary.

 3. Some projects want each commit on the "trunk" to do one single
    thing.  The output of `git log --first-parent` in such a project
    would show either a merge of a side branch that completes a single
    theme, or a single commit that completes a single theme by itself.
    If your two commits `B` and `C` (or they may even be two groups of
    commits) were solving two independent issues, then the merge `M'`
    we made in the earlier example by swapping the merge order is
    still not up to the project standard.  It merges two unrelated
    efforts `B` and `C` at the same time.
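
For a project in the first category, the flattened history can be
checked in a sandbox as follows. This is a sketch only: the repository
names and the `commit` helper are hypothetical, and the remote and
branch are passed to `git pull --rebase` explicitly to keep the example
independent of local configuration.

```shell
set -e
sandbox=$(mktemp -d)    # throwaway directory; all names here are hypothetical
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# commit <repo> <name>: add a file <name>.txt and record it as commit <name>
commit() {
    echo "$2" >"$1/$2.txt"
    git -C "$1" add "$2.txt"
    git -C "$1" commit -q -m "$2"
}

git init -q --bare central.git
git -C central.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
commit seed A
git -C seed push -q "$sandbox/central.git" master

git clone -q "$sandbox/central.git" you
git clone -q "$sandbox/central.git" coworker
for c in B C; do commit you "$c"; done
for c in X Y Z; do commit coworker "$c"; done
git -C coworker push -q origin master

# Replay B and C on top of the updated upstream instead of merging.
git -C you pull -q --rebase origin master

trunk=$(git -C you log --format=%s | paste -s -d ' ' -)
merges=$(git -C you rev-list --merges --count HEAD)
echo "$trunk"                   # prints: C B Z Y X A
echo "merge commits: $merges"   # prints: merge commits: 0
```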

For projects in the last category (Git itself is one of them),
individual developers would want to prepare a history more like
this:

------------
		 C0--C1--C2     topic-c
		/
    ---o---o---A                master
		\
		 B0--B1--B2     topic-b
------------

That is, keeping separate topics on separate branches, perhaps like
so:

------------
    $ git clone $URL work && cd work
    $ git checkout -b topic-b master
    $ ... work to create B0, B1 and B2 to complete one theme
    $ git checkout -b topic-c master
    $ ... same for the theme of topic-c
------------

And then

------------
    $ git checkout master
    $ git pull --ff-only
------------

would grab `X`, `Y` and `Z` from the upstream and advance your master
branch:

------------
		 C0--C1--C2     topic-c
		/
    ---o---o---A---X---Y---Z    master
		\
		 B0--B1--B2     topic-b
------------

And then you would merge these two branches separately:

------------
    $ git merge topic-b
    $ git merge topic-c
------------

to result in

------------
		 C0--C1---------C2
		/                 \
    ---o---o---A---X---Y---Z---M---N
		\             /
		 B0--B1-----B2
------------

and push it back to the central repository.
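
The whole sequence can be rehearsed end to end in a sandbox. The sketch
below makes several assumptions for illustration: the repository names
(`central.git`, `seed`, `coworker`), the `commit` helper and the short
merge messages `M` and `N` are hypothetical, while `work`, `topic-b`
and `topic-c` follow the commands shown above.

```shell
set -e
sandbox=$(mktemp -d)    # throwaway directory; all names here are hypothetical
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# commit <repo> <name>: add a file <name>.txt and record it as commit <name>
commit() {
    echo "$2" >"$1/$2.txt"
    git -C "$1" add "$2.txt"
    git -C "$1" commit -q -m "$2"
}

git init -q --bare central.git
git -C central.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
commit seed A
git -C seed push -q "$sandbox/central.git" master

git clone -q "$sandbox/central.git" work
git clone -q "$sandbox/central.git" coworker

# One topic branch per theme, both forked from master at A.
git -C work checkout -q -b topic-b master
for c in B0 B1 B2; do commit work "$c"; done
git -C work checkout -q -b topic-c master
for c in C0 C1 C2; do commit work "$c"; done

# Meanwhile the coworker publishes X, Y and Z.
for c in X Y Z; do commit coworker "$c"; done
git -C coworker push -q origin master

# Fast-forward master to Z, then merge each topic separately.
git -C work checkout -q master
git -C work pull -q --ff-only
git -C work merge -q --no-edit -m M topic-b
git -C work merge -q --no-edit -m N topic-c
git -C work push -q origin master

trunk=$(git -C work log --first-parent --format=%s | paste -s -d ' ' -)
echo "$trunk"       # prints: N M Z Y X A
```

Each entry on the first-parent chain is then either an upstream commit
or a merge that brings in exactly one topic.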

It is quite possible that while you are merging topic-b and topic-c,
somebody again advanced the history in the central repository by
putting `W` on top of `Z`, which makes your `git push` fail.

In such a case, you would rewind to discard `M` and `N`, update the
tip of your 'master' again and redo the two merges:

------------
    $ git reset --hard origin/master
    $ git pull --ff-only
    $ git merge topic-b
    $ git merge topic-c
------------

The procedure will result in a history that looks like this:

------------
		 C0--C1--------------C2
		/                     \
    ---o---o---A---X---Y---Z---W---M'--N'
		\                 /
		 B0--B1---------B2
------------
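
The rewind-and-redo cycle can be rehearsed in a sandbox as well. In
this sketch the `integrate` function simply wraps the four commands
above (plus a checkout of `master`); it, the `commit` helper, the
repository names other than `work`, and the merge messages are
hypothetical names chosen for illustration.

```shell
set -e
sandbox=$(mktemp -d)    # throwaway directory; all names here are hypothetical
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# commit <repo> <name>: add a file <name>.txt and record it as commit <name>
commit() {
    echo "$2" >"$1/$2.txt"
    git -C "$1" add "$2.txt"
    git -C "$1" commit -q -m "$2"
}

# integrate: rebuild master from the latest upstream and merge both topics
integrate() {
    git -C work checkout -q master
    git -C work reset -q --hard origin/master   # discard any earlier merges
    git -C work pull -q --ff-only
    git -C work merge -q --no-edit -m "Merge topic-b" topic-b
    git -C work merge -q --no-edit -m "Merge topic-c" topic-c
}

git init -q --bare central.git
git -C central.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
commit seed A
git -C seed push -q "$sandbox/central.git" master

git clone -q "$sandbox/central.git" work
git clone -q "$sandbox/central.git" coworker
git -C work checkout -q -b topic-b master
for c in B0 B1 B2; do commit work "$c"; done
git -C work checkout -q -b topic-c master
for c in C0 C1 C2; do commit work "$c"; done
for c in X Y Z; do commit coworker "$c"; done
git -C coworker push -q origin master

integrate                                   # creates M and N on top of Z
commit coworker W                           # upstream moves again before you push
git -C coworker push -q origin master
if git -C work push -q origin master 2>/dev/null; then
    first_push=accepted
else
    first_push=rejected
fi

integrate                                   # rewind, update, redo both merges
git -C work push -q origin master
trunk=$(git -C work log --first-parent --format=%s | paste -s -d ',' -)
echo "first push: $first_push"  # prints: first push: rejected
echo "$trunk"                   # prints: Merge topic-c,Merge topic-b,W,Z,Y,X,A
```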

See also http://git-blame.blogspot.com/2013/09/fun-with-first-parent-history.html