comparison hggit/git_handler.py @ 915:efcefc3522bd

pull: consider remotes during discovery

The default dulwich graph walker only walks from refs/heads. During the discovery phase of fetching this causes it to redownload commits that are only referenced by refs/remotes. In a normal hggit case, this seems to mean it redownloads the entire git repo on every hg pull.

Added a --debug to a test to check the object count (it decreased from 21 to 10 as part of this patch).
author Durham Goode <durham@fb.com>
date Tue, 23 Jun 2015 20:17:10 -0700
parents d153586c28f8
children 6aa31a3b0506
comparison
--- hggit/git_handler.py @ 914:e4006703a287
+++ hggit/git_handler.py @ 915:efcefc3522bd
@@ -1073,11 +1073,19 @@
 
         return new_refs
 
     def fetch_pack(self, remote_name, heads=None):
         client, path = self.get_transport_and_path(remote_name)
-        graphwalker = self.git.get_graph_walker()
+
+        # The dulwich default walk only checks refs/heads/. We also want to
+        # consider remotes when doing discovery, so we build our own list. We
+        # can't just do 'refs/' here because the tag class doesn't have a
+        # parents function for walking, and older versions of dulwich don't like
+        # that.
+        haveheads = self.git.refs.as_dict('refs/remotes/').values()
+        haveheads.extend(self.git.refs.as_dict('refs/heads/').values())
+        graphwalker = self.git.get_graph_walker(heads=haveheads)
 
         def determine_wants(refs):
             filteredrefs = self.filter_refs(refs, heads)
             return [x for x in filteredrefs.itervalues() if x not in self.git]
 
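
The effect of the change can be illustrated with a toy model. This is only a sketch under stated assumptions, not hggit's or dulwich's actual code: refs are a plain dict of ref name to SHA, the commit graph is a dict of SHA to parent SHAs, and `collect_have_heads` and `commits_to_fetch` are hypothetical helpers invented for this illustration. Real pack negotiation is more involved; the point is only that a walker seeded from refs/heads/ alone cannot tell the remote about history reachable solely from refs/remotes/.

```python
def collect_have_heads(refs, prefixes=("refs/remotes/", "refs/heads/")):
    """Return the unique SHAs of every ref under the given prefixes.

    `refs` is a plain dict of full ref name -> SHA, standing in for the
    repository's ref container (an assumption made for this sketch).
    Tags are left out on purpose, mirroring the comment in the patch.
    """
    seen = set()
    haves = []
    for prefix in prefixes:
        for name in sorted(refs):
            sha = refs[name]
            if name.startswith(prefix) and sha not in seen:
                seen.add(sha)
                haves.append(sha)
    return haves


def commits_to_fetch(wants, haves, parents):
    """Toy discovery: walk back from `wants` and stop at any commit that
    is reachable from the advertised `haves`."""
    # Everything reachable from the advertised haves counts as known.
    known = set()
    stack = list(haves)
    while stack:
        sha = stack.pop()
        if sha not in known:
            known.add(sha)
            stack.extend(parents.get(sha, ()))

    # Anything reachable from the wants that is not known must be sent.
    missing = set()
    stack = list(wants)
    while stack:
        sha = stack.pop()
        if sha in known or sha in missing:
            continue
        missing.add(sha)
        stack.extend(parents.get(sha, ()))
    return sorted(missing)


# Hypothetical linear history: a <- b <- c <- d.
parents = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"]}

# Locally, only a remote-tracking ref points at c; the local head is still a.
local_refs = {
    "refs/heads/master": "a",
    "refs/remotes/origin/master": "c",
    "refs/tags/v1": "b",
}

heads_only = collect_have_heads(local_refs, prefixes=("refs/heads/",))
with_remotes = collect_have_heads(local_refs)

print(commits_to_fetch(["d"], heads_only, parents))    # ['b', 'c', 'd']
print(commits_to_fetch(["d"], with_remotes, parents))  # ['d']
```

Note that the patched lines rely on Python 2, where `dict.values()` returns a list. On Python 3 it returns a view with no `extend` method, so a port would need to wrap the first call in `list(...)`.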