annotate hggit/hg2git.py @ 710:623cb724c3d0

hg2git: in _init_dirs, store keys without leading '/' (issue103)

Previously, whenever a tree that wasn't the root ('') was stored, we'd prepend a '/' to it. Then, when we'd try retrieving the entry, we'd do so without the leading '/'. This caused data loss because existing tree entries were dropped on the floor. Fix that by only adding '/' if we're adding to a non-empty initial path.

This wasn't detected in tests because most of them deal only with files in the root and not ones in subdirectories.
author Siddharth Agarwal <sid0@fb.com>
date Tue, 25 Mar 2014 11:11:04 -0700
parents 5c7943ca051f
children 81c55f8629ba
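
The key mismatch described in the commit message can be illustrated with a small standalone sketch. It is not the hg-git code: nested dicts stand in for Git trees, and collect_dirs, join_buggy and join_fixed are hypothetical names introduced here. The "buggy" join is an assumption reconstructed from the description above (always prepending '/'), while the "fixed" join mirrors the path == '' check at lines 94-97 of the annotated source.

```python
import posixpath


def collect_dirs(root, join):
    """Walk a nested-dict stand-in for Git trees, keying each directory by path."""
    dirs = {}
    todo = [('', root)]  # depth-first, same shape as the loop in _init_dirs
    while todo:
        path, tree = todo.pop()
        dirs[path] = tree
        for name, child in tree.items():
            if isinstance(child, dict):  # a dict stands in for a subtree
                todo.append((join(path, name), child))
    return dirs


def join_buggy(path, name):
    # Assumed pre-fix behaviour: always prepend '/', so 'src' becomes '/src'.
    return path + '/' + name


def join_fixed(path, name):
    # Post-fix behaviour: only add '/' when the parent path is non-empty.
    return name if path == '' else path + '/' + name


root = {'README': 'blob', 'src': {'main.py': 'blob', 'pkg': {'mod.py': 'blob'}}}

# Later lookups derive the directory key from the file path, with no leading '/'.
lookup = posixpath.dirname('src/pkg/mod.py')

buggy = collect_dirs(root, join_buggy)
fixed = collect_dirs(root, join_fixed)

print(lookup in buggy)  # the existing tree is missed, so a fresh empty one would replace it
print(lookup in fixed)  # the existing tree is found and reused
```

With the buggy keys, a lookup such as self._dirs.setdefault(d, dulobjs.Tree()) would silently create an empty tree for 'src/pkg' instead of reusing the one loaded from the Git commit, which matches the data loss the commit message reports.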
rev   line source
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1 # This file contains code dealing specifically with converting Mercurial
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
2 # repositories to Git repositories. Code in this file is meant to be a generic
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
3 # library and should be usable outside the context of hg-git or an hg command.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
4
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
5 import os
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
6 import stat
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
7
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
8 import dulwich.objects as dulobjs
707
d5facc1be5f8 hg2git: implement a method to initialize _dirs from a Git commit
Siddharth Agarwal <sid0@fb.com>
parents: 672
diff changeset
9 from dulwich import diff_tree
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
10
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
11 import util
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
12
671
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
13 def parse_subrepos(ctx):
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
14 sub = util.OrderedDict()
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
15 if '.hgsub' in ctx:
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
16 sub = util.parse_hgsub(ctx['.hgsub'].data().splitlines())
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
17 substate = util.OrderedDict()
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
18 if '.hgsubstate' in ctx:
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
19 substate = util.parse_hgsubstate(
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
20 ctx['.hgsubstate'].data().splitlines())
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
21 return sub, substate
71fb5dd678bc hg2git: move parse_subrepos to top level
Siddharth Agarwal <sid0@fb.com>
parents: 649
diff changeset
22
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
23 class IncrementalChangesetExporter(object):
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
24 """Incrementally export Mercurial changesets to Git trees.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
25
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
26 The purpose of this class is to facilitate Git tree export that is more
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
27 optimal than brute force.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
28
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
29 A "dumb" implementation of Mercurial to Git export would iterate over
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
30 every file present in a Mercurial changeset and would convert each to
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
31 a Git blob and then conditionally add it to a Git repository if it didn't
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
32 yet exist. This is suboptimal because the overhead associated with
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
33 obtaining every file's raw content and converting it to a Git blob is
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
34 not trivial!
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
35
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
36 This class works around the suboptimality of brute force export by
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
37 leveraging the information stored in Mercurial - the knowledge of what
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
38 changed between changesets - to only export Git objects corresponding to
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
39 changes in Mercurial. In the context of converting Mercurial repositories
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
40 to Git repositories, we only export objects Git (possibly) hasn't seen yet.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
41 This prevents a lot of redundant work and is thus faster.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
42
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
43 Callers instantiate an instance of this class against a mercurial.localrepo
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
44 instance. They then associate it with a specific changeset by calling
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
45 update_changeset(). On each call to update_changeset(), the instance
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
46 computes the difference between the current and new changesets and emits
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
47 Git objects that haven't yet been encountered during the lifetime of the
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
48 class instance. In other words, it expresses Mercurial changeset deltas in
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
49 terms of Git objects. Callers then (usually) take this set of Git objects
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
50 and add them to the Git repository.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
51
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
52 This class only emits Git blobs and trees, not commits.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
53
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
54 The tree calculation part of this class is essentially a reimplementation
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
55 of dulwich.index.commit_tree. However, since our implementation reuses
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
56 Tree instances and only recalculates SHA-1 when things change, we are
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
57 more efficient.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
58 """
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
59
709
5c7943ca051f hg2git: start incremental conversion from a known commit
Siddharth Agarwal <sid0@fb.com>
parents: 707
diff changeset
60 def __init__(self, hg_repo, start_ctx, git_store, git_commit):
5c7943ca051f hg2git: start incremental conversion from a known commit
Siddharth Agarwal <sid0@fb.com>
parents: 707
diff changeset
61 """Create an instance against a mercurial.localrepo.
5c7943ca051f hg2git: start incremental conversion from a known commit
Siddharth Agarwal <sid0@fb.com>
parents: 707
diff changeset
62
5c7943ca051f hg2git: start incremental conversion from a known commit
Siddharth Agarwal <sid0@fb.com>
parents: 707
diff changeset
63 start_ctx is the context for a Mercurial commit that has a Git
5c7943ca051f hg2git: start incremental conversion from a known commit
Siddharth Agarwal <sid0@fb.com>
parents: 707
diff changeset
64 equivalent, passed in as git_commit. The incremental computation will be
5c7943ca051f hg2git: start incremental conversion from a known commit
Siddharth Agarwal <sid0@fb.com>
parents: 707
diff changeset
65 started from this commit. git_store is the Git object store the commit
5c7943ca051f hg2git: start incremental conversion from a known commit
Siddharth Agarwal <sid0@fb.com>
parents: 707
diff changeset
66 comes from. start_ctx can be repo[nullid], in which case git_commit
5c7943ca051f hg2git: start incremental conversion from a known commit
Siddharth Agarwal <sid0@fb.com>
parents: 707
diff changeset
67 should be None.
5c7943ca051f hg2git: start incremental conversion from a known commit
Siddharth Agarwal <sid0@fb.com>
parents: 707
diff changeset
68 """
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
69 self._hg = hg_repo
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
70
637
23d7caeed05a hg2git: store ctx instead of rev
Siddharth Agarwal <sid0@fb.com>
parents: 636
diff changeset
71 # Our current revision's context.
709
5c7943ca051f hg2git: start incremental conversion from a known commit
Siddharth Agarwal <sid0@fb.com>
parents: 707
diff changeset
72 self._ctx = start_ctx
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
73
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
74 # Path to dulwich.objects.Tree.
709
5c7943ca051f hg2git: start incremental conversion from a known commit
Siddharth Agarwal <sid0@fb.com>
parents: 707
diff changeset
75 self._init_dirs(git_store, git_commit)
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
76
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
77 # Mercurial file nodeid to Git blob SHA-1. Used to prevent redundant
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
78 # blob calculation.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
79 self._blob_cache = {}
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
80
707
d5facc1be5f8 hg2git: implement a method to initialize _dirs from a Git commit
Siddharth Agarwal <sid0@fb.com>
parents: 672
diff changeset
81 def _init_dirs(self, store, commit):
d5facc1be5f8 hg2git: implement a method to initialize _dirs from a Git commit
Siddharth Agarwal <sid0@fb.com>
parents: 672
diff changeset
82 """Initialize self._dirs for a Git object store and commit."""
d5facc1be5f8 hg2git: implement a method to initialize _dirs from a Git commit
Siddharth Agarwal <sid0@fb.com>
parents: 672
diff changeset
83 self._dirs = {}
d5facc1be5f8 hg2git: implement a method to initialize _dirs from a Git commit
Siddharth Agarwal <sid0@fb.com>
parents: 672
diff changeset
84 if commit is None:
d5facc1be5f8 hg2git: implement a method to initialize _dirs from a Git commit
Siddharth Agarwal <sid0@fb.com>
parents: 672
diff changeset
85 return
d5facc1be5f8 hg2git: implement a method to initialize _dirs from a Git commit
Siddharth Agarwal <sid0@fb.com>
parents: 672
diff changeset
86 dirkind = stat.S_IFDIR
d5facc1be5f8 hg2git: implement a method to initialize _dirs from a Git commit
Siddharth Agarwal <sid0@fb.com>
parents: 672
diff changeset
87 # depth-first order, chosen arbitrarily
d5facc1be5f8 hg2git: implement a method to initialize _dirs from a Git commit
Siddharth Agarwal <sid0@fb.com>
parents: 672
diff changeset
88 todo = [('', store[commit.tree])]
d5facc1be5f8 hg2git: implement a method to initialize _dirs from a Git commit
Siddharth Agarwal <sid0@fb.com>
parents: 672
diff changeset
89 while todo:
d5facc1be5f8 hg2git: implement a method to initialize _dirs from a Git commit
Siddharth Agarwal <sid0@fb.com>
parents: 672
diff changeset
90 path, tree = todo.pop()
d5facc1be5f8 hg2git: implement a method to initialize _dirs from a Git commit
Siddharth Agarwal <sid0@fb.com>
parents: 672
diff changeset
91 self._dirs[path] = tree
d5facc1be5f8 hg2git: implement a method to initialize _dirs from a Git commit
Siddharth Agarwal <sid0@fb.com>
parents: 672
diff changeset
92 for entry in tree.iteritems():
d5facc1be5f8 hg2git: implement a method to initialize _dirs from a Git commit
Siddharth Agarwal <sid0@fb.com>
parents: 672
diff changeset
93 if entry.mode == dirkind:
710
623cb724c3d0 hg2git: in _init_dirs, store keys without leading '/' (issue103)
Siddharth Agarwal <sid0@fb.com>
parents: 709
diff changeset
94 if path == '':
623cb724c3d0 hg2git: in _init_dirs, store keys without leading '/' (issue103)
Siddharth Agarwal <sid0@fb.com>
parents: 709
diff changeset
95 newpath = entry.path
623cb724c3d0 hg2git: in _init_dirs, store keys without leading '/' (issue103)
Siddharth Agarwal <sid0@fb.com>
parents: 709
diff changeset
96 else:
623cb724c3d0 hg2git: in _init_dirs, store keys without leading '/' (issue103)
Siddharth Agarwal <sid0@fb.com>
parents: 709
diff changeset
97 newpath = path + '/' + entry.path
623cb724c3d0 hg2git: in _init_dirs, store keys without leading '/' (issue103)
Siddharth Agarwal <sid0@fb.com>
parents: 709
diff changeset
98 todo.append((newpath, store[entry.sha]))
707
d5facc1be5f8 hg2git: implement a method to initialize _dirs from a Git commit
Siddharth Agarwal <sid0@fb.com>
parents: 672
diff changeset
99
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
100 @property
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
101 def root_tree_sha(self):
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
102 """The SHA-1 of the root Git tree.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
103
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
104 This is needed to construct a Git commit object.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
105 """
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
106 return self._dirs[''].id
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
107
636
0ab89bd32c8e hg2git: rename ctx to newctx in update_changeset
Siddharth Agarwal <sid0@fb.com>
parents: 598
diff changeset
108 def update_changeset(self, newctx):
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
109 """Set the tree to track a new Mercurial changeset.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
110
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
111 This is a generator of 2-tuples. The first item in each tuple is a
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
112 dulwich object, either a Blob or a Tree. The second item is the
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
113 corresponding Mercurial nodeid for the item, if any. Only blobs will
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
114 have nodeids. Trees do not correspond to a specific nodeid, so it does
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
115 not make sense to emit a nodeid for them.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
116
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
117 When exporting trees from Mercurial, callers typically write the
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
118 returned dulwich object to the Git repo via the store's add_object().
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
119
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
120 Some emitted objects may already exist in the Git repository. This
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
121 class does not know about the Git repository, so it's up to the caller
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
122 to conditionally add the object, etc.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
123
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
124 Emitted objects are those that have changed since the last call to
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
125 update_changeset. If this is the first call to update_changeset, all
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
126 objects in the tree are emitted.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
127 """
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
128 # Our general strategy is to accumulate dulwich.objects.Blob and
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
129 # dulwich.objects.Tree instances for the current Mercurial changeset.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
130 # We do this incrementally by iterating over the Mercurial-reported
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
131 # changeset delta. We rely on the behavior of dulwich to lazily
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
132 # calculate a Tree's SHA-1 when we modify it. This is critical to
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
133 # performance.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
134
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
135 # In theory we should be able to look at changectx.files(). This is
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
136 # *much* faster. However, it may not be accurate, especially with older
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
137 # repositories, which may not record things like deleted files
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
138 # explicitly in the manifest (which is where files() gets its data).
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
139 # The only reliable way to get the full set of changes is by looking at
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
140 # the full manifest. And, the easy way to compare two manifests is
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
141 # localrepo.status().
638
f828d82c35dc hg2git: call status on newctx, not newctx.rev()
Siddharth Agarwal <sid0@fb.com>
parents: 637
diff changeset
142 modified, added, removed = self._hg.status(self._ctx, newctx)[0:3]
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
143
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
144 # We track which directories/trees were modified in this update and we
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
145 # only export those.
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
146 dirty_trees = set()
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
147
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
148 subadded, subremoved = [], []
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
149
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
150 for s in modified, added, removed:
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
151 if '.hgsub' in s or '.hgsubstate' in s:
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
152 subadded, subremoved = self._handle_subrepos(newctx)
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
153 break
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
154
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
155 # We first process subrepo and file removals so we can prune dead trees.
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
156 for path in subremoved:
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
157 self._remove_path(path, dirty_trees)
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
158
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
159 for path in removed:
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
160 if path == '.hgsubstate' or path == '.hgsub':
649
53423381c540 hg2git: call _handle_subrepos when .hgsubstate is removed
Siddharth Agarwal <sid0@fb.com>
parents: 648
diff changeset
161 continue
53423381c540 hg2git: call _handle_subrepos when .hgsubstate is removed
Siddharth Agarwal <sid0@fb.com>
parents: 648
diff changeset
162
645
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
163 self._remove_path(path, dirty_trees)
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
164
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
165 for path, sha in subadded:
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
166 d = os.path.dirname(path)
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
167 tree = self._dirs.setdefault(d, dulobjs.Tree())
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
168 dirty_trees.add(d)
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
169 tree.add(os.path.basename(path), dulobjs.S_IFGITLINK, sha)
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
170
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
171 # For every file that changed or was added, we need to calculate the
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
172 # corresponding Git blob and its tree entry. We emit the blob
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
173 # immediately and update trees to be aware of its presence.
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
174 for path in set(modified) | set(added):
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
175 if path == '.hgsubstate' or path == '.hgsub':
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
176 continue
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
177
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
178 d = os.path.dirname(path)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
179 tree = self._dirs.setdefault(d, dulobjs.Tree())
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
180 dirty_trees.add(d)
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
181
636
0ab89bd32c8e hg2git: rename ctx to newctx in update_changeset
Siddharth Agarwal <sid0@fb.com>
parents: 598
diff changeset
182 fctx = newctx[path]
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
183
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
184 entry, blob = IncrementalChangesetExporter.tree_entry(fctx,
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
185 self._blob_cache)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
186 if blob is not None:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
187 yield (blob, fctx.filenode())
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
188
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
189 tree.add(*entry)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
190
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
191 # Now that all the trees represent the current changeset, recalculate
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
192 # the tree IDs and emit them. Note that we wait until now to calculate
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
193 # tree SHA-1s. This is an important difference between us and
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
194 # dulwich.index.commit_tree(), which builds new Tree instances for each
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
195 # series of blobs.
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
196 for obj in self._populate_tree_entries(dirty_trees):
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
197 yield (obj, None)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
198
637
23d7caeed05a hg2git: store ctx instead of rev
Siddharth Agarwal <sid0@fb.com>
parents: 636
diff changeset
199 self._ctx = newctx
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
200
645
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
201 def _remove_path(self, path, dirty_trees):
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
202 """Remove a path (file or git link) from the current changeset.
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
203
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
204 If the tree containing this path is empty, it might be removed."""
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
205 d = os.path.dirname(path)
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
206 tree = self._dirs.get(d, dulobjs.Tree())
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
207
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
208 del tree[os.path.basename(path)]
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
209 dirty_trees.add(d)
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
210
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
211 # If removing this file made the tree empty, we should delete this
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
212 # tree. This could result in parent trees losing their only child
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
213 # and so on.
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
214 if not len(tree):
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
215 self._remove_tree(d)
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
216 else:
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
217 self._dirs[d] = tree
104f536be5c7 hg2git: factor out remove path logic into a separate function
Siddharth Agarwal <sid0@fb.com>
parents: 638
diff changeset
218
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
219 def _remove_tree(self, path):
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
220 """Remove a (presumably empty) tree from the current changeset.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
221
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
222 A now-empty tree may be the only child of its parent. So, we traverse
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
223 up the chain to the root tree, deleting any empty trees along the way.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
224 """
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
225 try:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
226 del self._dirs[path]
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
227 except KeyError:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
228 return
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
229
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
230 # Now we traverse up to the parent and delete any references.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
231 if path == '':
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
232 return
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
233
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
234 basename = os.path.basename(path)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
235 parent = os.path.dirname(path)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
236 while True:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
237 tree = self._dirs.get(parent, None)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
238
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
239 # No parent entry. Nothing to remove or update.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
240 if tree is None:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
241 return
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
242
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
243 try:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
244 del tree[basename]
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
245 except KeyError:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
246 return
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
247
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
248 if len(tree):
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
249 return
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
250
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
251 # The parent tree is empty. So, we can delete it.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
252 del self._dirs[parent]
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
253
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
254 if parent == '':
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
255 return
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
256
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
257 basename = os.path.basename(parent)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
258 parent = os.path.dirname(parent)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
259
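The upward pruning done by _remove_tree can be sketched in isolation. This is a minimal illustration, not the hg-git implementation: plain dicts stand in for dulwich Tree objects, the function name remove_tree is hypothetical, and it is written for Python 3 rather than the Python 2 of the annotated file.

```python
import posixpath


def remove_tree(dirs, path):
    """Drop the tree at `path`, then prune ancestors that become empty.

    `dirs` maps a directory path ('' is the root) to a dict of child
    entries. Plain dicts are an assumption made for this sketch.
    """
    if dirs.pop(path, None) is None:
        return  # nothing stored under this path

    while path != '':
        parent, basename = posixpath.split(path)
        tree = dirs.get(parent)
        if tree is None or basename not in tree:
            return  # no parent entry; nothing to remove or update
        del tree[basename]
        if tree:
            return  # the parent still has other children
        del dirs[parent]  # the parent is now empty, so prune it too
        path = parent
```

With dirs holding a root that contains 'README' and 'a', where 'a' contains only the empty tree 'a/b', calling remove_tree(dirs, 'a/b') removes both 'a/b' and 'a' and leaves the root with just 'README'.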
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
260 def _populate_tree_entries(self, dirty_trees):
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
261 self._dirs.setdefault('', dulobjs.Tree())
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
262
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
263 # Fill in missing directories.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
264 for path in self._dirs.keys():
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
265 parent = os.path.dirname(path)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
266
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
267 while parent != '':
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
268 parent_tree = self._dirs.get(parent, None)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
269
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
270 if parent_tree is not None:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
271 break
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
272
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
273 self._dirs[parent] = dulobjs.Tree()
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
274 parent = os.path.dirname(parent)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
275
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
276 for dirty in list(dirty_trees):
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
277 parent = os.path.dirname(dirty)
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
278
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
279 while parent != '':
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
280 if parent in dirty_trees:
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
281 break
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
282
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
283 dirty_trees.add(parent)
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
284 parent = os.path.dirname(parent)
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
285
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
286 # The root tree is always dirty but doesn't always get updated.
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
287 dirty_trees.add('')
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
288
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
289 # We only need to recalculate and export dirty trees.
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
290 for d in sorted(dirty_trees, key=len, reverse=True):
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
291 # Only happens for deleted directories.
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
292 try:
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
293 tree = self._dirs[d]
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
294 except KeyError:
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
295 continue
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
296
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
297 yield tree
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
298
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
299 if d == '':
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
300 continue
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
301
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
302 parent_tree = self._dirs[os.path.dirname(d)]
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
303
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
304 # Accessing the tree's ID is what triggers SHA-1 calculation and is
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
305 # the expensive part (at least if the tree has been modified since
598
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
306 # the last time we retrieved its ID). Also, assigning an entry to a
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
307 # tree (even if it already exists) invalidates the existing tree
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
308 # and incurs SHA-1 recalculation. So, it's in our interest to avoid
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
309 # invalidating trees. Since we only update the entries of dirty
792955be68dd Only export modified Git trees
Gregory Szorc <gregory.szorc@gmail.com>
parents: 596
diff changeset
310 # trees, this should hold true.
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
311 parent_tree[os.path.basename(d)] = (stat.S_IFDIR, tree.id)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
312
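The dirty-tree bookkeeping in _populate_tree_entries reduces to two steps: mark every ancestor of a dirty directory as dirty, then visit the result deepest first so that a child's ID is known before its parent is updated. The sketch below shows only that ordering logic; the function name export_order is hypothetical, and it assumes Python 3 and '/'-separated paths.

```python
import posixpath


def export_order(dirty_trees):
    """Return dirty directories, children before parents, root last."""
    dirty = set(dirty_trees)

    # Propagate dirtiness upward; stop early once an ancestor is
    # already marked, since its own ancestors are handled with it.
    for d in list(dirty):
        parent = posixpath.dirname(d)
        while parent != '' and parent not in dirty:
            dirty.add(parent)
            parent = posixpath.dirname(parent)

    # The root tree is always considered dirty.
    dirty.add('')

    # Longest path first: a directory's path is strictly longer than
    # any of its ancestors' paths, so children always precede parents.
    return sorted(dirty, key=len, reverse=True)
```

Sorting by string length rather than by depth is enough here because only the relative order of a directory and its own ancestors matters; unrelated directories may be visited in any order.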
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
313 def _handle_subrepos(self, newctx):
648
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
314 sub, substate = parse_subrepos(self._ctx)
647
3ceacdd23abe hg2git: add 'new' prefix to _handle_subrepos variables
Siddharth Agarwal <sid0@fb.com>
parents: 646
diff changeset
315 newsub, newsubstate = parse_subrepos(newctx)
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
316
648
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
317 # For each path, the logic is described by the following table. 'no'
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
318 # stands for 'the subrepo doesn't exist', 'git' stands for 'git
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
319 # subrepo', and 'hg' stands for 'hg or other subrepo'.
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
320 #
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
321 # old new | action
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
322 # * git | link (1)
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
323 # git hg | delete (2)
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
324 # git no | delete (3)
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
325 #
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
326 # All other combinations are 'do nothing'.
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
327 #
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
328 # git links without corresponding submodule paths are stored as subrepos
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
329 # with a substate but without an entry in .hgsub.
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
330
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
331 # 'added' is both modified and added
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
332 added, removed = [], []
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
333
648
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
334 def isgit(sub, path):
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
335 return path not in sub or sub[path].startswith('[git]')
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
336
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
337 for path, sha in substate.iteritems():
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
338 if not isgit(sub, path):
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
339 # old = hg -- will be handled in next loop
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
340 continue
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
341 # old = git
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
342 if path not in newsubstate or not isgit(newsub, path):
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
343 # new = hg or no, case (2) or (3)
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
344 removed.append(path)
648
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
345
647
3ceacdd23abe hg2git: add 'new' prefix to _handle_subrepos variables
Siddharth Agarwal <sid0@fb.com>
parents: 646
diff changeset
346 for path, sha in newsubstate.iteritems():
648
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
347 if not isgit(newsub, path):
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
348 # new = hg or no; the only cases we care about are handled above
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
349 continue
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
350
648
bd63cdfbc1de hg2git: make _handle_subrepos worked in the removed case
Siddharth Agarwal <sid0@fb.com>
parents: 647
diff changeset
351 # case (1)
672
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
352 added.append((path, sha))
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
353
fbfa6353d96c hg2git: fix subrepo handling to be deterministic
Siddharth Agarwal <sid0@fb.com>
parents: 671
diff changeset
354 return added, removed
596
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
355
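The old/new decision table in _handle_subrepos can be exercised without Mercurial by passing the parsed .hgsub and .hgsubstate mappings directly. This is a hedged sketch: diff_subrepos and is_git are hypothetical names, plain dicts replace the output of parse_subrepos, and dict.items() replaces the Python 2 iteritems() used in the annotated file.

```python
def is_git(sub, path):
    # A path with a substate but no .hgsub entry is a bare git link;
    # otherwise the .hgsub source must carry the '[git]' prefix.
    return path not in sub or sub[path].startswith('[git]')


def diff_subrepos(sub, substate, newsub, newsubstate):
    """Return (added, removed) git links between two changesets."""
    # Cases (2) and (3): old = git, new = hg or missing -> delete.
    removed = [
        path for path in substate
        if is_git(sub, path)
        and (path not in newsubstate or not is_git(newsub, path))
    ]
    # Case (1): new = git, whatever the old state was -> link.
    added = [
        (path, sha) for path, sha in newsubstate.items()
        if is_git(newsub, path)
    ]
    return added, removed
```

Note that 'added' also covers git subrepos whose SHA is unchanged, which matches the "'added' is both modified and added" comment in the original.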
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
356 @staticmethod
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
357 def tree_entry(fctx, blob_cache):
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
358 """Compute a dulwich TreeEntry from a filectx.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
359
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
360 A side effect is that the blob's ID is stored in the passed cache.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
361
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
362 Returns a 2-tuple of (dulwich.objects.TreeEntry, dulwich.objects.Blob). The Blob is None if its ID was already in the cache.
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
363 """
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
364 blob_id = blob_cache.get(fctx.filenode(), None)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
365 blob = None
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
366
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
367 if blob_id is None:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
368 blob = dulobjs.Blob.from_string(fctx.data())
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
369 blob_id = blob.id
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
370 blob_cache[fctx.filenode()] = blob_id
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
371
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
372 flags = fctx.flags()
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
373
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
374 if 'l' in flags:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
375 mode = 0120000
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
376 elif 'x' in flags:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
377 mode = 0100755
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
378 else:
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
379 mode = 0100644
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
380
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
381 return (dulobjs.TreeEntry(os.path.basename(fctx.path()), mode, blob_id),
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
382 blob)
d6b9c30a3e0f Export Git objects from incremental Mercurial changes
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
383
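The caching and mode selection in tree_entry can be illustrated without dulwich or Mercurial. The sketch below is an assumption-laden stand-in, not the library's API: git_blob_id, git_mode and the tree_entry signature shown here are hypothetical, the blob ID is computed directly with hashlib using Git's standard "blob <size>\0<data>" object encoding, and the modes are written with Python 3 octal literals (the annotated file uses the Python 2 form 0120000).

```python
import hashlib


def git_blob_id(data):
    # Git hashes a blob as SHA-1 over "blob <size>\0" followed by the data.
    header = b'blob %d\x00' % len(data)
    return hashlib.sha1(header + data).hexdigest()


def git_mode(flags):
    # Mercurial flags: 'l' marks a symlink, 'x' an executable file.
    if 'l' in flags:
        return 0o120000
    if 'x' in flags:
        return 0o100755
    return 0o100644


def tree_entry(name, filenode, flags, read_data, blob_cache):
    """Return ((name, mode, blob_id), data); data is None on a cache hit.

    `read_data` is a zero-argument callable, so file contents are only
    read when the filenode has not been seen before.
    """
    blob_id = blob_cache.get(filenode)
    data = None
    if blob_id is None:
        data = read_data()
        blob_id = git_blob_id(data)
        blob_cache[filenode] = blob_id
    return (name, git_mode(flags), blob_id), data
```

Keying the cache on the Mercurial filenode means a file revision that appears in many changesets is hashed once; later calls return the cached ID and no blob data to emit.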