aboutsummaryrefslogtreecommitdiff
path: root/posts/hunk-editing.org
diff options
context:
space:
mode:
Diffstat (limited to 'posts/hunk-editing.org')
-rw-r--r--posts/hunk-editing.org302
1 files changed, 302 insertions, 0 deletions
diff --git a/posts/hunk-editing.org b/posts/hunk-editing.org
new file mode 100644
index 0000000..df93982
--- /dev/null
+++ b/posts/hunk-editing.org
@@ -0,0 +1,302 @@
+#+TITLE: The Art of Manually Editing Hunks
+#+DESCRIPTION: How to edit hunk diffs
+#+TAGS: Git
+#+TAGS: How-to
+#+TAGS: Tips and Tricks
+#+DATE: 2015-10-24
+#+SLUG: art-manually-edit-hunks
+#+LINK: udiff https://www.gnu.org/software/diffutils/manual/html_node/Unified-Format.html
+#+LINK: git-show https://www.kernel.org/pub/software/scm/git/docs/git-show.html
+#+LINK: git-diff https://www.kernel.org/pub/software/scm/git/docs/git-diff.html
+
+#+BEGIN_PREVIEW
+There's a certain art to editing hunks, seemingly arcane. Hunks are blocks of
+changes typically found in unified diff patch files, or, more commonly today,
+found in Git patches.
+#+END_PREVIEW
+
+Git uses its own variant of the [[udiff][unified diff format]], but it isn't
+much different. The differences between the unified format and Git's are
+usually not significant. The patch files created with [[git-show][git-show]]
+or [[git-diff][git-diff]] are consumable by the usual tools, ~patch~, ~git~,
+~vimdiff~, etc.
+
+** Short Introduction to Unified Diff
+
+A unified diff may look something similar to (freely copied from the
+~diffutils~ manual):
+
+#+BEGIN_SRC diff
+ --- lao 2002-02-21 23:30:39.942229878 -0800
+ +++ tzu 2002-02-21 23:30:50.442260588 -0800
+ @@ -1,7 +1,6 @@
+ -The Way that can be told of is not the eternal Way;
+ -The name that can be named is not the eternal name.
+ The Nameless is the origin of Heaven and Earth;
+ -The Named is the mother of all things.
+ +The named is the mother of all things.
+ +
+ Therefore let there always be non-being,
+ so we may see their subtlety,
+ And let there always be being,
+ @@ -9,3 +8,6 @@
+ The two are the same,
+ But after they are produced,
+ they have different names.
+ +They both may be called deep and profound.
+ +Deeper and more profound,
+ +The door of all subtleties!
+#+END_SRC
+
+The first two lines define the files that are input into the ~diff~ program,
+the first, ~lao~, being the "source" file and the second, ~tzu~, being the
+"new" file. The starting characters ~---~ and ~+++~ denote the lines from
+each.
+
+~+~ denotes a line that will be added to the first file and ~-~ denotes a line
+that will be removed from the first file. Lines with no changes are preceded
+by a single space.
+
+The ~@@ -1,7 +1,6 @@~ and ~@@ -9,3 +8,6 @@~ are the hunk identifiers. That is,
+diff hunks are the blocks identified by ~@@ -line number[,context] +line
+number[, context] @@~ in the diff format. The ~context~ number is optional and
+occasionally not needed. However, it is always included in when using
+~git-diff~. The line numbers defines the number the hunk begins. The context
+number defines the number of lines in the hunk. Unlike the line number, it
+often differs between the two files. In the first hunk of the example above,
+the context numbers are ~7~ and ~6~, respectively. That is, lines preceded
+with a ~-~ and a space equals 7. Similarly, lines starting with a ~+~ and a
+space equals 6.
+
+#+BEGIN_QUOTE
+ Lines starting with a space count towards the context of both files.
+#+END_QUOTE
+
+Since the second file has a smaller context, this means we are removing more
+(by one) lines than we are adding. To ~diff~, updating a line is the same as
+removing the old line and adding a new line (with the changes).
+
+Armed with this information, we can start editing hunks that can be cleanly
+applied.
+
+** Motivation
+
+What might be the motivation for even wanting to edit hunk files? The biggest I
+see is when using ~git-add --patch~. Particularly when the changes run
+together and cannot be split apart automatically. We can see this in the diff
+above.
+
+The trivial case is being able to stage a single hunk of the above diff,
+nothing has to be done to stage the changes separately other than using the
+~--patch~ option.
+
+However, staging separate changes inside a hunk becomes slightly more
+complicated. Often, if the changes are broken up with a even just a single
+line (if it exists), they can be split. When they run together, it becomes
+more difficult to do.
+
+Of course, a way to solve this problem, is to manually back out the changes (a
+series of "undos"), save the file, stage it, play back the changes (a series of
+"redos", perhaps). This can be very error prone and if you make any other
+changes during between undo and redo, you may have lost the changes.
+Therefore, being able to manually edit the specific hunk into the right shape,
+no changes are lost.
+
+** Hunk Editing Example
+
+Let's walk through an example of staging some changes, and manually editing a
+hunk to stage them into the patches we want.
+
+Create a temporary Git repository, this will be a just some basic stuff for
+testing.
+
+#+BEGIN_SRC sh
+ % cd /tmp
+ % git init foo
+ % cd foo
+#+END_SRC
+
+#+BEGIN_QUOTE
+ From here on, we will assume the working directory to be ~/tmp/foo~.
+#+END_QUOTE
+
+Inside this new Git repository, add a new file, ~quicksort.exs~:
+
+#+BEGIN_SRC elixir
+ defmodule Quicksort do
+
+ def sort(list) do
+ _sort(list)
+ end
+
+ defp _sort([]), do: []
+ defp _sort(list = [h|t]) do
+ _sort(Enum.filter(list, &(&1 < h))) ++ [h] ++ _sort(Enum.filter(list, &(&1 > h)))
+ end
+
+ end
+#+END_SRC
+
+Perform the usual actions, ~git-add~ and ~git-commit~:
+
+#+BEGIN_SRC sh
+ % git add quicksort.exs
+ % git commit -m 'initial commit'
+#+END_SRC
+
+Now, let's make some changes. For one, there's compiler warning about the
+unused variable ~t~ and the actually sorting seems a bit dense. Let's fix the
+warning and breakup the sorting:
+
+#+BEGIN_SRC elixir
+ defmodule Quicksort do
+
+ def sort(list) do
+ _sort(list)
+ end
+
+ defp _sort([]), do: []
+ defp _sort(list = [h|_]) do
+ (list |> Enum.filter(&(&1 < h)) |> _sort)
+ ++ [h] ++
+ (list |> Enum.filter(&(&1 > h)) |> _sort)
+ end
+
+ end
+#+END_SRC
+
+Saving this version of the file should produce a diff similar to the following:
+
+#+BEGIN_SRC diff
+ diff --git a/quicksort.exs b/quicksort.exs
+ index 97b60b4..ed2446b 100644
+ --- a/quicksort.exs
+ +++ b/quicksort.exs
+ @@ -5,8 +5,10 @@ defmodule Quicksort do
+ end
+
+ defp _sort([]), do: []
+ - defp _sort(list = [h|t]) do
+ - _sort(Enum.filter(list, &(&1 < h))) ++ [h] ++ _sort(Enum.filter(list, &(&1 > h)))
+ + defp _sort(list = [h|_]) do
+ + (list |> Enum.filter(&(&1 < h)) |> _sort)
+ + ++ [h] ++
+ + (list |> Enum.filter(&(&1 > h)) |> _sort)
+ end
+
+ end
+#+END_SRC
+
+However, since these changes are actually, argubly, two different changes, they
+should live in two commits. Let's stage the change for ~t~ to ~_~:
+
+#+BEGIN_SRC sh
+ % git add --patch
+#+END_SRC
+
+We will be presented with the diff from before:
+
+#+BEGIN_SRC sh
+ diff --git a/quicksort.exs b/quicksort.exs
+ index 97b60b4..ed2446b 100644
+ --- a/quicksort.exs
+ +++ b/quicksort.exs
+ @@ -5,8 +5,10 @@ defmodule Quicksort do
+ end
+
+ defp _sort([]), do: []
+ - defp _sort(list = [h|t]) do
+ - _sort(Enum.filter(list, &(&1 < h))) ++ [h] ++ _sort(Enum.filter(list, &(&1 > h)))
+ + defp _sort(list = [h|_]) do
+ + (list |> Enum.filter(&(&1 < h)) |> _sort)
+ + ++ [h] ++
+ + (list |> Enum.filter(&(&1 > h)) |> _sort)
+ end
+
+ end
+ Stage this hunk [y,n,q,a,d,/,e,?]?
+#+END_SRC
+
+First thing we want to try is using the ~split(s)~ option. However, this is an
+invalid choice because Git does not know how to split this hunk and we will be
+presented with the available options and the hunk again. The option we then
+want is ~edit(e)~.
+
+We will be dropped into our default editor, environment variable ~$EDITOR~, Git
+~core.editor~ setting. From there, we will be presented with something of the
+following:
+
+#+BEGIN_SRC diff
+ # Manual hunk edit mode -- see bottom for a quick guide
+ @@ -5,8 +5,10 @@ defmodule Quicksort do
+ end
+
+ defp _sort([]), do: []
+ - defp _sort(list = [h|t]) do
+ - _sort(Enum.filter(list, &(&1 < h))) ++ [h] ++ _sort(Enum.filter(list, &(&1 > h)))
+ + defp _sort(list = [h|_]) do
+ + (list |> Enum.filter(&(&1 < h)) |> _sort)
+ + ++ [h] ++
+ + (list |> Enum.filter(&(&1 > h)) |> _sort)
+ end
+
+ end
+ # ---
+ # To remove '-' lines, make them ' ' lines (context).
+ # To remove '+' lines, delete them.
+ # Lines starting with # will be removed.
+ #
+ # If the patch applies cleanly, the edited hunk will immediately be
+ # marked for staging. If it does not apply cleanly, you will be given
+ # an opportunity to edit again. If all lines of the hunk are removed,
+ # then the edit is aborted and the hunk is left unchanged.
+#+END_SRC
+
+From here, we want to replace the leading minus of the change removal to a
+space and remove the last three additions.
+
+That is, we want the diff to look like:
+
+#+BEGIN_SRC diff
+ @@ -5,8 +5,10 @@ defmodule Quicksort do
+ end
+
+ defp _sort([]), do: []
+ - defp _sort(list = [h|t]) do
+ sort(Enum.filter(list, &(&1 < h))) ++ [h] ++ _sort(Enum.filter(list, &(&1 > h)))
+ + defp _sort(list = [h|_]) do
+ end
+
+ end
+#+END_SRC
+
+Saving and closing the editor now, Git will have staged the desired diff. We
+can check the staged changes via ~git-diff~:
+
+#+BEGIN_SRC diff
+ % git diff --cached
+ diff --git a/quicksort.exs b/quicksort.exs
+ index 97b60b4..94a5101 100644
+ --- a/quicksort.exs
+ +++ b/quicksort.exs
+ @@ -5,8 +5,8 @@ defmodule Quicksort do
+ end
+
+ defp _sort([]), do: []
+ - defp _sort(list = [h|t]) do
+ _sort(Enum.filter(list, &(&1 < h))) ++ [h] ++ _sort(Enum.filter(list, &(&1 > h)))
+ + defp _sort(list = [h|_]) do
+ end
+
+ end
+#+END_SRC
+
+Notice, the hunk context data was updated correctly to match the new changes.
+
+From here, commit the first change, and then add and commit the second change.
+
+Something to watch out for is over zealously removing changed lines. For
+example, in Elixir quicksort example we have just did, if we entirely removed
+the second ~-~ from the diff /and/ manually updated the hunk header, the patch
+will never apply cleanly. Therefore, be especially careful with removing ~-~
+lines.