annotate hggit/hg2git.py @ 672:fbfa6353d96c

hg2git: fix subrepo handling to be deterministic

Previously, the correctness of _handle_subrepos was based on the order the files were processed in. For example, consider the case where a subrepo at location 'loc' is replaced with a file at 'loc', while another subrepo exists. This would cause .hgsubstate and .hgsub to be modified and the file added.

If .hgsubstate was seen _before_ 'loc' in the modified/added loop, then _handle_subrepos would run and remove 'loc' correctly, before 'loc' was added back later. If, however, .hgsubstate was seen _after_ 'loc', then _handle_subrepos would run after 'loc' was added and would remove 'loc'.

With this patch, _handle_subrepos merely computes the changes that need to be applied. The changes are then applied, making sure removed files and subrepos are processed before added ones.

This was detected by setting a random PYTHONHASHSEED (in this case, 3910358828) and running the test suite against it. An upcoming patch will randomize the PYTHONHASHSEED in run-tests.py, just like is done in Mercurial.
author Siddharth Agarwal <sid0@fb.com>
date Wed, 19 Feb 2014 20:52:59 -0800
parents 71fb5dd678bc
children d5facc1be5f8
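
The ordering problem described in the commit message can be sketched with a toy model. This is not the hg-git code: the tree is a plain dict, and the names apply_interleaved, apply_two_phase, stale_subrepos and new_files are hypothetical, chosen only for illustration. It shows why interleaving subrepo cleanup with file additions depends on iteration order, while computing the changes first and applying removals before additions does not.

```python
# Toy model (assumption): a "tree" is a dict mapping path -> entry kind.
# None of these helpers exist in hg-git; they only illustrate the idea.

def apply_interleaved(tree, changed_files, stale_subrepos, new_files):
    """Order-dependent variant: subrepo cleanup runs when .hgsubstate is seen."""
    tree = dict(tree)
    for path in changed_files:
        if path == '.hgsubstate':
            # Cleanup happens at whatever point .hgsubstate shows up.
            for sub in stale_subrepos:
                tree.pop(sub, None)
        elif path in new_files:
            tree[path] = 'file'
    return tree


def apply_two_phase(tree, changed_files, stale_subrepos, new_files):
    """Deterministic variant: all removals are applied before any addition."""
    tree = dict(tree)
    for sub in stale_subrepos:
        tree.pop(sub, None)
    for path in changed_files:
        if path in new_files:
            tree[path] = 'file'
    return tree


# Subrepo 'loc' is replaced by a regular file 'loc'; subrepo 'other' stays.
before = {'loc': 'gitlink', 'other': 'gitlink'}
stale = ['loc']
new = {'loc'}
expected = {'loc': 'file', 'other': 'gitlink'}

good_order = ['.hgsubstate', 'loc']
bad_order = ['loc', '.hgsubstate']

# The interleaved variant is only correct for one of the two orders.
assert apply_interleaved(before, good_order, stale, new) == expected
assert apply_interleaved(before, bad_order, stale, new) == {'other': 'gitlink'}

# The two-phase variant gives the same result for both orders.
assert apply_two_phase(before, good_order, stale, new) == expected
assert apply_two_phase(before, bad_order, stale, new) == expected
```

In the real code the iteration order came from a set of paths, which is why the failure only surfaced under particular hash seeds.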
rev   line source
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1 # This file contains code dealing specifically with converting Mercurial
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
2 # repositories to Git repositories. Code in this file is meant to be a generic
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
3 # library and should be usable outside the context of hg-git or an hg command.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
4
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
5 import os
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
6 import stat
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
7
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
8 import dulwich.objects as dulobjs
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
9 import mercurial.node
637
23d7caeed05a hg2git: store ctx instead of rev
Siddharth Agarwal <sid0@fb.com>
parents: 636
diff changeset
10 import mercurial.context
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
11
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
12 import util
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
13
671
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
14 def parse_subrepos(ctx):
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
15 sub = util.OrderedDict()
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
16 if '.hgsub' in ctx:
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
17 sub = util.parse_hgsub(ctx['.hgsub'].data().splitlines())
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
18 substate = util.OrderedDict()
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
19 if '.hgsubstate' in ctx:
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
20 substate = util.parse_hgsubstate(
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
21 ctx['.hgsubstate'].data().splitlines())
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
22 return sub, substate
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
23
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
24 class IncrementalChangesetExporter(object):
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
25 """Incrementally export Mercurial changesets to Git trees.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
26
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
27 The purpose of this class is to facilitate Git tree export that is more
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
28 optimal than brute force.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
29
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
30 A "dumb" implementation of Mercurial to Git export would iterate over
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
31 every file present in a Mercurial changeset and would convert each to
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
32 a Git blob and then conditionally add it to a Git repository if it didn't
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
33 yet exist. This is suboptimal because the overhead associated with
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
34 obtaining every file's raw content and converting it to a Git blob is
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
35 not trivial!
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
36
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
37 This class works around the suboptimality of brute force export by
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
38 leveraging the information stored in Mercurial - the knowledge of what
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
39 changed between changesets - to only export Git objects corresponding to
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
40 changes in Mercurial. In the context of converting Mercurial repositories
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
41 to Git repositories, we only export objects Git (possibly) hasn't seen yet.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
42 This prevents a lot of redundant work and is thus faster.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
43
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
44 Callers instantiate an instance of this class against a mercurial.localrepo
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
45 instance. They then associate it with a specific changeset by calling
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
46 update_changeset(). On each call to update_changeset(), the instance
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
47 computes the difference between the current and new changesets and emits
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
48 Git objects that haven't yet been encountered during the lifetime of the
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
49 class instance. In other words, it expresses Mercurial changeset deltas in
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
50 terms of Git objects. Callers then (usually) take this set of Git objects
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
51 and add them to the Git repository.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
52
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
53 This class only emits Git blobs and trees, not commits.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
54
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
55 The tree calculation part of this class is essentially a reimplementation
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
56 of dulwich.index.commit_tree. However, since our implementation reuses
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
57 Tree instances and only recalculates SHA-1 when things change, we are
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
58 more efficient.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
59 """
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
60
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
61 def __init__(self, hg_repo):
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
62 """Create an instance against a mercurial.localrepo."""
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
63 self._hg = hg_repo
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
64
637
23d7caeed05a hg2git: store ctx instead of rev
Siddharth Agarwal <sid0@fb.com>
parents: 636
diff changeset
65 # Our current revision's context.
23d7caeed05a hg2git: store ctx instead of rev
Siddharth Agarwal <sid0@fb.com>
parents: 636
diff changeset
66 self._ctx = mercurial.context.changectx(hg_repo, 'null')
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
67
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
68 # Maps directory path to dulwich.objects.Tree.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
69 self._dirs = {}
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
70
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
71 # Mercurial file nodeid to Git blob SHA-1. Used to prevent redundant
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
72 # blob calculation.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
73 self._blob_cache = {}
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
74
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
75 @property
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
76 def root_tree_sha(self):
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
77 """The SHA-1 of the root Git tree.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
78
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
79 This is needed to construct a Git commit object.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
80 """
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
81 return self._dirs[''].id
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
82
636
0ab89bd32c8e hg2git: rename ctx to newctx in update_changeset
Siddharth Agarwal <sid0@fb.com>
parents: 598
diff changeset
83 def update_changeset(self, newctx):
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
84 """Set the tree to track a new Mercurial changeset.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
85
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
86 This is a generator of 2-tuples. The first item in each tuple is a
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
87 dulwich object, either a Blob or a Tree. The second item is the
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
88 corresponding Mercurial nodeid for the item, if any. Only blobs will
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
89 have nodeids. Trees do not correspond to a specific nodeid, so it does
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
90 not make sense to emit a nodeid for them.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
91
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
92 When exporting trees from Mercurial, callers typically write the
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
93 returned dulwich object to the Git repo via the store's add_object().
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
94
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
95 Some emitted objects may already exist in the Git repository. This
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
96 class does not know about the Git repository, so it's up to the caller
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
97 to conditionally add the object, etc.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
98
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
99 Emitted objects are those that have changed since the last call to
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
100 update_changeset. If this is the first call to update_changeset, all
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
101 objects in the tree are emitted.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
102 """
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
103 # Our general strategy is to accumulate dulwich.objects.Blob and
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
104 # dulwich.objects.Tree instances for the current Mercurial changeset.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
105 # We do this incrementally by iterating over the Mercurial-reported
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
106 # changeset delta. We rely on the behavior of dulwich to lazily
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
107 # calculate a Tree's SHA-1 when we modify it. This is critical to
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
108 # performance.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
109
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
110 # In theory we should be able to look at changectx.files(). This is
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
111 # *much* faster. However, it may not be accurate, especially with older
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
112 # repositories, which may not record things like deleted files
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
113 # explicitly in the manifest (which is where files() gets its data).
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
114 # The only reliable way to get the full set of changes is by looking at
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
115 # the full manifest. And, the easy way to compare two manifests is
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
116 # localrepo.status().
638
f828d82c35dc hg2git: call status on newctx, not newctx.rev()
Siddharth Agarwal <sid0@fb.com>
parents: 637
diff changeset
117 modified, added, removed = self._hg.status(self._ctx, newctx)[0:3]
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
118
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
119 # We track which directories/trees have been modified in this update and we
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
120 # only export those.
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
121 dirty_trees = set()
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
122
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
123 subadded, subremoved = [], []
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
124
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
125 for s in modified, added, removed:
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
126 if '.hgsub' in s or '.hgsubstate' in s:
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
127 subadded, subremoved = self._handle_subrepos(newctx)
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
128 break
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
129
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
130 # We first process subrepo and file removals so we can prune dead trees.
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
131 for path in subremoved:
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
132 self._remove_path(path, dirty_trees)
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
133
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
134 for path in removed:
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
135 if path == '.hgsubstate' or path == '.hgsub':
649
53423381c540 hg2git: call _handle_subrepos when .hgsubstate is removed
Siddharth Agarwal <sid0@fb.com>
parents: 648
diff changeset
136 continue
53423381c540 hg2git: call _handle_subrepos when .hgsubstate is removed
Siddharth Agarwal <sid0@fb.com>
parents: 648
diff changeset
137
645
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
138 self._remove_path(path, dirty_trees)
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
139
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
140 for path, sha in subadded:
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
141 d = os.path.dirname(path)
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
142 tree = self._dirs.setdefault(d, dulobjs.Tree())
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
143 dirty_trees.add(d)
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
144 tree.add(os.path.basename(path), dulobjs.S_IFGITLINK, sha)
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
145
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
146 # For every file that changed or was added, we need to calculate the
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
147 # corresponding Git blob and its tree entry. We emit the blob
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
148 # immediately and update trees to be aware of its presence.
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
149 for path in set(modified) | set(added):
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
150 if path == '.hgsubstate' or path == '.hgsub':
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
151 continue
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
152
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
153 d = os.path.dirname(path)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
154 tree = self._dirs.setdefault(d, dulobjs.Tree())
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
155 dirty_trees.add(d)
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
156
636
0ab89bd32c8e hg2git: rename ctx to newctx in update_changeset
Siddharth Agarwal <sid0@fb.com>
parents: 598
diff changeset
157 fctx = newctx[path]
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
158
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
159 entry, blob = IncrementalChangesetExporter.tree_entry(fctx,
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
160 self._blob_cache)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
161 if blob is not None:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
162 yield (blob, fctx.filenode())
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
163
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
164 tree.add(*entry)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
165
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
166 # Now that all the trees represent the current changeset, recalculate
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
167 # the tree IDs and emit them. Note that we wait until now to calculate
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
168 # tree SHA-1s. This is an important difference between us and
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
169 # dulwich.index.commit_tree(), which builds new Tree instances for each
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
170 # series of blobs.
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
171 for obj in self._populate_tree_entries(dirty_trees):
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
172 yield (obj, None)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
173
637
23d7caeed05a hg2git: store ctx instead of rev
Siddharth Agarwal <sid0@fb.com>
parents: 636
diff changeset
174 self._ctx = newctx
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
175
645
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
176 def _remove_path(self, path, dirty_trees):
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
177 """Remove a path (file or git link) from the current changeset.
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
178
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
179 If the tree containing this path is empty, it might be removed."""
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
180 d = os.path.dirname(path)
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
181 tree = self._dirs.get(d, dulobjs.Tree())
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
182
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
183 del tree[os.path.basename(path)]
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
184 dirty_trees.add(d)
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
185
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
186 # If removing this file made the tree empty, we should delete this
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
187 # tree. This could result in parent trees losing their only child
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
188 # and so on.
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
189 if not len(tree):
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
190 self._remove_tree(d)
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
191 else:
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
192 self._dirs[d] = tree
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
193
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
194 def _remove_tree(self, path):
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
195 """Remove a (presumably empty) tree from the current changeset.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
196
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
197 A now-empty tree may be the only child of its parent. So, we traverse
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
198 up the chain to the root tree, deleting any empty trees along the way.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
199 """
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
200 try:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
201 del self._dirs[path]
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
202 except KeyError:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
203 return
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
204
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
205 # Now we traverse up to the parent and delete any references.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
206 if path == '':
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
207 return
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
208
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
209 basename = os.path.basename(path)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
210 parent = os.path.dirname(path)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
211 while True:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
212 tree = self._dirs.get(parent, None)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
213
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
214 # No parent entry. Nothing to remove or update.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
215 if tree is None:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
216 return
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
217
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
218 try:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
219 del tree[basename]
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
220 except KeyError:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
221 return
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
222
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
223 if len(tree):
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
224 return
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
225
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
226 # The parent tree is empty. So, we can delete it.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
227 del self._dirs[parent]
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
228
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
229 if parent == '':
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
230 return
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
231
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
232 basename = os.path.basename(parent)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
233 parent = os.path.dirname(parent)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
234
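The pruning walk in _remove_path and _remove_tree can be sketched outside of dulwich. The following is a minimal illustration, not the hg-git implementation: it assumes a hypothetical `dirs` dict mapping each directory path to a set of child names (standing in for `self._dirs` and its `Tree` objects), and it is written for Python 3.

```python
import posixpath


def remove_path(dirs, path):
    """Remove `path` from `dirs` and prune any directories left empty.

    `dirs` is a hypothetical stand-in for self._dirs: it maps a directory
    path ('' is the root) to the set of names that directory contains.
    """
    parent, name = posixpath.split(path)
    entries = dirs.get(parent)
    if entries is None or name not in entries:
        return
    entries.discard(name)

    # Walk up towards the root while the directory we just edited is empty.
    while not entries:
        del dirs[parent]
        if parent == '':
            return
        parent, name = posixpath.split(parent)
        entries = dirs.get(parent)
        if entries is None:
            # No parent entry, so there is nothing left to update.
            return
        entries.discard(name)


dirs = {'': {'a', 'top.txt'}, 'a': {'b'}, 'a/b': {'f.txt'}}
remove_path(dirs, 'a/b/f.txt')
```

Removing the only file under a/b empties a/b, which in turn empties a; the root survives because it still holds top.txt.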
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
235 def _populate_tree_entries(self, dirty_trees):
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
236 self._dirs.setdefault('', dulobjs.Tree())
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
237
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
238 # Fill in missing directories.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
239 for path in self._dirs.keys():
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
240 parent = os.path.dirname(path)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
241
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
242 while parent != '':
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
243 parent_tree = self._dirs.get(parent, None)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
244
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
245 if parent_tree is not None:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
246 break
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
247
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
248 self._dirs[parent] = dulobjs.Tree()
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
249 parent = os.path.dirname(parent)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
250
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
251 for dirty in list(dirty_trees):
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
252 parent = os.path.dirname(dirty)
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
253
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
254 while parent != '':
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
255 if parent in dirty_trees:
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
256 break
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
257
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
258 dirty_trees.add(parent)
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
259 parent = os.path.dirname(parent)
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
260
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
261 # The root tree is always dirty but doesn't always get updated.
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
262 dirty_trees.add('')
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
263
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
264 # We only need to recalculate and export dirty trees.
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
265 for d in sorted(dirty_trees, key=len, reverse=True):
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
266 # Only happens for deleted directories.
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
267 try:
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
268 tree = self._dirs[d]
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
269 except KeyError:
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
270 continue
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
271
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
272 yield tree
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
273
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
274 if d == '':
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
275 continue
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
276
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
277 parent_tree = self._dirs[os.path.dirname(d)]
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
278
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
279 # Accessing the tree's ID is what triggers SHA-1 calculation and is
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
280 # the expensive part (at least if the tree has been modified since
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
281 # the last time we retrieved its ID). Also, assigning an entry to a
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
282 # tree (even if it already exists) invalidates the existing tree
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
283 # and incurs SHA-1 recalculation. So, it's in our interest to avoid
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
284 # invalidating trees. Since we only update the entries of dirty
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
285 # trees, this should hold true.
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
286 parent_tree[os.path.basename(d)] = (stat.S_IFDIR, tree.id)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
287
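The dirty-tree bookkeeping in _populate_tree_entries can be sketched in isolation. This is a minimal illustration under stated assumptions rather than the code above: `dirty_order` is a hypothetical helper, paths are plain '/'-separated strings, and the sketch targets Python 3. It marks every ancestor of a dirty directory as dirty, then orders the result longest path first, as the `sorted(..., key=len, reverse=True)` call above does. A child path is always longer than its parent path, so each tree is visited before the parent that records its ID.

```python
import posixpath


def dirty_order(dirty_trees):
    """Return dirty directories plus all their ancestors, children first."""
    dirty = set(dirty_trees)
    for d in list(dirty):
        parent = posixpath.dirname(d)
        while parent != '':
            if parent in dirty:
                # This ancestor is already marked; its own ancestors are
                # covered either earlier or when it is visited itself.
                break
            dirty.add(parent)
            parent = posixpath.dirname(parent)

    # The root tree always has to be re-exported.
    dirty.add('')
    return sorted(dirty, key=len, reverse=True)


order = dirty_order({'a/b/c', 'x'})
```

Directories whose paths have the same length may come out in either order, which is harmless because neither can be the parent of the other.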
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
288 def _handle_subrepos(self, newctx):
648
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
289 sub, substate = parse_subrepos(self._ctx)
647
3ceacdd23abe hg2git: add 'new' prefix to _handle_subrepos variables
Siddharth Agarwal <sid0@fb.com>
parents: 646
diff changeset
290 newsub, newsubstate = parse_subrepos(newctx)
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
291
648
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
292 # For each path, the logic is described by the following table. 'no'
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
293 # stands for 'the subrepo doesn't exist', 'git' stands for 'git
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
294 # subrepo', and 'hg' stands for 'hg or other subrepo'.
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
295 #
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
296 # old new | action
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
297 # * git | link (1)
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
298 # git hg | delete (2)
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
299 # git no | delete (3)
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
300 #
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
301 # All other combinations are 'do nothing'.
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
302 #
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
303 # git links without corresponding submodule paths are stored as subrepos
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
304 # with a substate but without an entry in .hgsub.
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
305
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
306 # 'added' is both modified and added
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
307 added, removed = [], []
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
308
648
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
309 def isgit(sub, path):
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
310 return path not in sub or sub[path].startswith('[git]')
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
311
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
312 for path, sha in substate.iteritems():
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
313 if not isgit(sub, path):
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
314 # old = hg -- will be handled in next loop
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
315 continue
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
316 # old = git
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
317 if path not in newsubstate or not isgit(newsub, path):
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
318 # new = hg or no, case (2) or (3)
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
319 removed.append(path)
648
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
320
647
3ceacdd23abe hg2git: add 'new' prefix to _handle_subrepos variables
Siddharth Agarwal <sid0@fb.com>
parents: 646
diff changeset
321 for path, sha in newsubstate.iteritems():
648
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
322 if not isgit(newsub, path):
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
323 # new = hg or no; the only cases we care about are handled above
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
324 continue
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
325
648
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
326 # case (1)
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
327 added.append((path, sha))
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
328
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
329 return added, removed
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
330
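The decision table in _handle_subrepos can be exercised with plain dicts. The sketch below is an illustration only: `subrepo_changes` is a hypothetical free function, the four dict arguments stand in for what `parse_subrepos` returns for the old and new contexts, and it uses Python 3 dict iteration instead of `iteritems`.

```python
def subrepo_changes(sub, substate, newsub, newsubstate):
    """Compute (added, removed) git links between two subrepo states.

    `sub`/`newsub` map path -> source string from .hgsub, and
    `substate`/`newsubstate` map path -> revision from .hgsubstate.
    """
    def isgit(subs, path):
        # A path with a substate but no .hgsub entry counts as a git link.
        return path not in subs or subs[path].startswith('[git]')

    added, removed = [], []

    for path in substate:
        if not isgit(sub, path):
            continue  # old = hg: nothing to delete on the Git side
        if path not in newsubstate or not isgit(newsub, path):
            removed.append(path)  # cases (2) and (3)

    for path, sha in newsubstate.items():
        if isgit(newsub, path):
            added.append((path, sha))  # case (1)

    return added, removed


old_sub = {'g': '[git]https://example.invalid/g.git', 'h': 'https://example.invalid/h'}
old_state = {'g': 'aaa', 'h': 'bbb'}
new_sub = {'n': '[git]https://example.invalid/n.git'}
new_state = {'n': 'ccc'}
added, removed = subrepo_changes(old_sub, old_state, new_sub, new_state)
```

Because the function only computes the two lists, a caller can apply every removal before any addition, which is the ordering the changeset description relies on for deterministic results.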
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
331 @staticmethod
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
332 def tree_entry(fctx, blob_cache):
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
333 """Compute a dulwich TreeEntry from a filectx.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
334
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
335 A side effect is the TreeEntry is stored in the passed cache.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
336
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
337 Returns a 2-tuple of (dulwich.objects.TreeEntry, dulwich.objects.Blob).
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
338 """
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
339 blob_id = blob_cache.get(fctx.filenode(), None)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
340 blob = None
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
341
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
342 if blob_id is None:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
343 blob = dulobjs.Blob.from_string(fctx.data())
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
344 blob_id = blob.id
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
345 blob_cache[fctx.filenode()] = blob_id
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
346
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
347 flags = fctx.flags()
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
348
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
349 if 'l' in flags:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
350 mode = 0120000
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
351 elif 'x' in flags:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
352 mode = 0100755
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
353 else:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
354 mode = 0100644
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
355
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
356 return (dulobjs.TreeEntry(os.path.basename(fctx.path()), mode, blob_id),
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
357 blob)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
358
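The flag-to-mode mapping in tree_entry uses Python 2 octal literals (0120000 and so on), which are a syntax error in Python 3. As a small illustration, assuming a hypothetical helper named `git_mode` that takes the string returned by `fctx.flags()`, the same mapping in Python 3 syntax could look like this:

```python
def git_mode(flags):
    """Map Mercurial file flags to a Git tree entry mode."""
    if 'l' in flags:
        return 0o120000  # symbolic link
    elif 'x' in flags:
        return 0o100755  # executable file
    return 0o100644      # regular file


modes = [git_mode(f) for f in ('l', 'x', '')]
```

The symlink check comes first, matching the order of the branches in tree_entry.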