view hgext/states.py @ 60:14a4499d2cd6

small refactoring and big doc update. Sorry for the big commit crecord one so much diff seems to confuse my powerbook to death :-/
author Pierre-Yves David <pierre-yves.david@ens-lyon.org>
date Mon, 12 Sep 2011 14:05:32 +0200
parents 02fba620d139
children 0dfe459c7b1c

# states.py - introduce the state concept for Mercurial changesets
#
# Copyright 2011 Pierre-Yves David <pierre-yves.david@ens-lyon.org>
#                Logilab SA        <contact@logilab.fr>
#                Augie Fackler     <durin42@gmail.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

'''introduce the state concept for Mercurial changesets

(see http://mercurial.selenic.com/wiki/StatesPlan)

General concept
===============

This extension adds the state concept. Each changeset is now in a specific state
that controls its mutability and its exchange.

States properties
.................

The states extension currently alters two properties of changesets:

:mutability:  history-rewriting tools should refuse to work on immutable changesets
:sharing:     shared changesets are exchanged during pull and push; others are not

Here is a small summary of the properties of the existing states::

    ||           || mutable || shared ||
    || published ||         ||   x    ||
    || ready     ||    x    ||   x    ||
    || draft     ||    x    ||        ||

States consistency and ordering
...............................

States of changesets have to be consistent with each other. A changeset can only
have ancestors in its own state or in a compatible state.

Example:

    A ``published`` changeset can't have a ``draft`` parent.

A state is compatible with itself and all "smaller" states. The order is as follows::

    published < ready < draft


.. note:

    This section is probably far too conceptual for people. The result is just
    that: A ``published`` changeset can only have ``published`` ancestors. A
    ``ready`` changeset can only have ``published`` or ``ready`` ancestors.

    Moreover, there is a need for a nice word to refer to "a state smaller than
    another".
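
To make the rule concrete, here is a minimal sketch on a toy history. It is not
code from this extension: the ``parents`` and ``states`` dictionaries and the
``inconsistent`` helper are hypothetical names used for illustration only.

```python
# State names ordered from the "smallest" to the "largest" one.
ORDER = ["published", "ready", "draft"]
RANK = dict((name, i) for i, name in enumerate(ORDER))

def inconsistent(parents, states):
    """Return the sorted (child, parent) pairs that break the ordering rule.

    `parents` maps a changeset id to the list of its parent ids and
    `states` maps a changeset id to its state name (both hypothetical).
    Checking direct parents is enough: if every edge is fine, every
    ancestor is fine too.
    """
    bad = []
    for child, plist in parents.items():
        for parent in plist:
            # a parent must be in the same state or in a smaller one
            if RANK[states[parent]] > RANK[states[child]]:
                bad.append((child, parent))
    return sorted(bad)

# linear toy history: a <- b <- c
parents = {"a": [], "b": ["a"], "c": ["b"]}
ok = {"a": "published", "b": "ready", "c": "draft"}
broken = {"a": "draft", "b": "published", "c": "draft"}
```

With these inputs, ``inconsistent(parents, ok)`` is empty while
``inconsistent(parents, broken)`` reports the ``published`` changeset ``b``
sitting on top of the ``draft`` changeset ``a``.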


States details
==============


published
    Changesets in the ``published`` state are the core of the history.  They are
    changesets that you published to the world. People can expect them to always
    exist. They are changesets as you know them. **By default all changesets
    are published**

    - They are exchanged with other repositories (included in pull//push).

    - They are not mutable, extensions rewriting history should refuse to
      rewrite them.

ready
    Changesets in the ``ready`` state have not yet been accepted in the
    immutable history. You can share them with others for review, testing or
    improvement. Any ``ready`` changeset can either be included in the
    published history (and become immutable) or be rewritten and never make it
    to the published history.

    - They are exchanged with other repositories (included in pull//push).

    - They are mutable, history-rewriting extensions will work on them.

draft

    Changesets in the ``draft`` state are heavy work in progress you are not
    yet willing to share with others.

    - They are not exchanged with other repositories. pull//push do not see them.
    - They are mutable, history-rewriting extensions will work on them.

--

.. note:

    The ``dead`` state mentioned on the wiki page is missing. There are two main
    reasons for it:

    1. The ``dead`` state has a different behaviour that requires more work to be
       implemented.

    2. I believe that the use cases of ``dead changeset`` are better covered by
       the ``obsolete`` extension.

--

.. note:

    I'm tempted to add a state with the same properties as ``ready`` for review
    workflows::

        ||           || mutable || shared ||
        || published ||         ||   x    ||
        || ready     ||    x    ||   x    ||
        || inprogress||    x    ||   x    ||
        || draft     ||    x    ||        ||

    The ``ready`` state would be for changesets that await review by someone who
    can "publish" them.



Current features and usage
==========================


Enabling states
...............

The extension adds a :hg:`states` command to display and choose which states
are used by a repository, see :hg:`help states` for details.

By default all changesets in the repository are ``published``. Other states
must be explicitly activated. Changesets in a remote repository that doesn't
support states are all seen as ``published``.

.. note:

    When a state is not activated, changesets in this state are handled as
    changesets of the previous state (``draft`` are handled as ``ready``,
    ``ready`` are handled as ``published``).
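
The fallback described in this note can be sketched as follows. This is only an
illustration under simplifying assumptions: states are plain names and
``effective_state`` is a hypothetical helper, not a function of this extension.

```python
# State names ordered from the "smallest" to the "largest" one.
ORDER = ["published", "ready", "draft"]

def effective_state(name, enabled):
    """Return the state actually used for a changeset in state `name`.

    `enabled` is the set of activated state names. A disabled state falls
    back to the nearest previous state; ``published`` is always activated,
    so the loop stops there at the latest.
    """
    index = ORDER.index(name)
    while index > 0 and ORDER[index] not in enabled:
        index -= 1
    return ORDER[index]
```

For example, with only ``published`` and ``draft`` activated, a ``ready``
changeset is handled as ``published`` while a ``draft`` one stays ``draft``.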

TODO:

- have a configuration in hgrc::

    [states]
    ready=(off|on)(-inherit)?
    <state>=(off|on)(-inherit)?

 :off:     state disabled for new repo
 :on:      state enabled  for new repo
 :inherit: if present, inherit states of source on :hg:`clone`.

- have a switch to select if changesets do change state on state activation.

- display the number of changesets that change state when activating a state.



State transition
................

Changesets you create locally will be in the ``draft`` state (or any previous
state if ``draft`` isn't enabled).

There are some situations where the state of a changeset will change
automatically. Automatic movement always goes in the same direction:
``draft`` -> ``ready`` -> ``published``

1. When you pull or push, boundaries move. Common changesets that are
``published`` in one of the two repositories are set to ``published``. The same
goes for ``ready`` etc. (states are evaluated in increasing order; XXX I bet no
one understands this parenthesis). A pull alters the local repository. A push
alters both the local and the remote repository.

.. note:

    As repositories without any specific state have all their changesets
    ``published``, pushing to such a repository will ``publish`` all common
    changesets.

2. Tagged changesets automatically get published. The tagging changeset is
published too. This doesn't apply to local tags.


You can also manually change changeset state with a dedicated command for each
state. See :hg:`published`, :hg:`ready` and :hg:`draft` for details.

XXX maybe we can detail the general behaviour here

:hg <state> revs:                 move boundary of state so it includes revs
                                  ( revs included in ::<state>heads())
:hg --exact <state> revs:         move boundary so that revs are exactly in state
                                  <state> ( all([rev.state == <state> for rev in
                                  revs]))
:hg --exact --force <state> revs: move boundary even if it creates inconsistency
                                  (with tag for example)

TODO:

- implement --exact

- implement consistency check

- implement --force


Existing command change
.......................

As said in the previous section:

:commit:    Create draft changesets (or changesets in the first enabled previous state).
:tag:       Move the tagged and tagging changesets to the ``published`` state.
:incoming:  Exclude ``draft`` changesets of the remote repository.
:outgoing:  Exclude ``draft`` changesets of the local repository.
:pull:      As :hg:`in`  + change state of local changesets according to the remote side.
:push:      As :hg:`out` + sync state of common changesets on both sides.
:rollback:  Restore states heads as they were before the last transaction (see bookmarks).

Template
........

A new template keyword ``{state}`` has been added.

Revset
......

    We add new ``readyheads()`` and ``publishedheads()`` revset directives. They
    return the heads of each state **as if all of them were activated**.

    XXX TODO - I would like to

    - move the current ``<state>heads()`` directives to
      _``<state>heads()``

    - add ``<state>heads()`` directives that return the heads currently in use

    - add ``<state>()`` directives that match all nodes in a state.

Implementation
==============

State definition
................

Conceptually:

The set of nodes in a state is defined by the set of the state heads. This allows
easy storage, exchange and consistency.

.. note: A cache of the complete set of nodes that belong to a state will
         probably be needed for performance.
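
As an illustration of this definition, the sketch below finds the state of a
node from the state heads on a toy history. It is a simplified model, not the
extension's code: ``parents`` and ``stateheads`` are hypothetical dictionaries
and changesets are plain strings.

```python
def ancestors_of(heads, parents):
    """Return `heads` plus all their ancestors in the toy DAG `parents`."""
    seen = set()
    todo = list(heads)
    while todo:
        node = todo.pop()
        if node not in seen:
            seen.add(node)
            todo.extend(parents[node])
    return seen

def node_state(node, parents, stateheads):
    """Return the first state whose heads reach `node`.

    States are tried from the smallest to the largest one. The last state
    (``draft``) needs no heads of its own: it holds everything else.
    """
    for name in ("published", "ready"):
        if node in ancestors_of(stateheads[name], parents):
            return name
    return "draft"

# linear toy history: a <- b <- c <- d
parents = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"]}
stateheads = {"published": ["b"], "ready": ["c"]}
```

Here ``a`` and ``b`` are ``published``, ``c`` is ``ready`` and ``d`` is
``draft``. Recomputing the ancestors on every lookup is what makes a cache
attractive.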

Code wise:

There is a ``state`` class that holds the state properties and some useful
logic (name, revset entry etc).

All defined states are accessible through the STATES tuple at the root of the
module, or through the STATESMAP dictionary that allows fetching a state from
its name.

You can get and edit the list of head nodes that define a state with two methods
on repo.

:stateheads(<state>):        Returns the list of head nodes that define a state
:setstate(<state>, [nodes]): Move the state boundary forward to include the given
                             nodes in the given state.

Those methods handle ``node`` and not rev as it seems more resilient to me than
rev in a mutable world. Maybe it would make more sense to have ``node`` stored
on disk but revisions in the code.

Storage
.......

States related data are stored in the ``.hg/states/`` directory.

The ``.hg/states/Enabled`` file lists the states enabled in this
repository. This data is *not* stored in the .hg/hgrc because the .hg/hgrc
might be ignored for trust reasons. As messing up states can be pretty
annoying (publishing unfinalized changesets, pulling draft ones etc), we don't
want trust issues to interfere with enabled states information.

Each ``.hg/states/<state>-heads`` file lists the nodes that define a state.
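
The heads files hold one hexadecimal node per line. Below is a minimal sketch of
how such content can be parsed. It works on a string instead of a real file, and
``parse_heads`` is a hypothetical helper; falling back to the null node when the
file is missing is an assumption modelled on the reading code of this module.

```python
import binascii

# 20 zero bytes, the same value as mercurial.node.nullid
NULLID = bytes(bytearray(20))

def parse_heads(content):
    """Turn the content of a heads file into a sorted list of binary nodes.

    `content` is None when the file does not exist, in which case the
    boundary is the null revision only.
    """
    if content is None:
        return [NULLID]
    return sorted(binascii.unhexlify(h) for h in content.split())

# two hypothetical 40-digit hex nodes separated by whitespace
sample = "ff" * 20 + " " + "00" * 19 + "01"
heads = parse_heads(sample)
```

Splitting on any whitespace keeps the parser tolerant of trailing newlines and
empty lines.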

_NOSHARE filtering
..................

Any changeset in a state with the _NOSHARE property will be excluded from pull,
push, clone, incoming, outgoing and bundle. This is done through three
mechanisms:

1. Wrapping the findcommonincoming and findcommonoutgoing code with (not very
   efficient) logic that recomputes the exchanged heads.

2. Altering the ``heads`` wireprotocol command to return shared heads.

3. Disabling hardlink cloning when there is _NOSHARE changeset available.
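
The head recomputation of the first mechanism boils down to
``heads(::candidates - _NOSHARE)``. Here is a minimal sketch of that computation
on a toy history. It is a simplified model rather than the wrapped discovery
code: ``parents`` and ``noshare`` are hypothetical inputs and changesets are
plain strings.

```python
def ancestors_of(heads, parents):
    """Return `heads` plus all their ancestors in the toy DAG `parents`."""
    seen = set()
    todo = list(heads)
    while todo:
        node = todo.pop()
        if node not in seen:
            seen.add(node)
            todo.extend(parents[node])
    return seen

def dag_heads(nodes, parents):
    """Return the members of `nodes` that are not a parent of another member."""
    nonheads = set(p for n in nodes for p in parents[n])
    return sorted(n for n in nodes if n not in nonheads)

def reduce_heads(candidates, noshare, parents):
    """Compute heads(::candidates - noshare) on the toy DAG."""
    shareable = ancestors_of(candidates, parents) - set(noshare)
    return dag_heads(shareable, parents)

# toy history: a <- b, then two children of b: c (not shared) and d
parents = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"]}
noshare = ["c"]
```

With this history, ``reduce_heads(["c", "d"], noshare, parents)`` gives
``["d"]`` and ``reduce_heads(["c"], noshare, parents)`` falls back to ``["b"]``,
the closest shareable ancestor.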

Internal plumbing
-----------------

Summary of what we do:

* states are objects

* repo.__class__ is extended

* discovery is wrapped up

* wire protocol is patched

* transaction and rollback mechanism are wrapped up.

* XXX we write a new version of the boundary whenever something happens. We need
  a smarter and faster way to do this.


'''
import os
from functools import partial

from mercurial.i18n import _
from mercurial import cmdutil
from mercurial import scmutil
from mercurial import context
from mercurial import revset
from mercurial import templatekw
from mercurial import util
from mercurial import node
from mercurial.node import nullid, hex, short
from mercurial import discovery
from mercurial import extensions
from mercurial import wireproto
from mercurial import pushkey
from mercurial import error
from mercurial.lock import release


# states property constants
_NOSHARE=2
_MUTABLE=1

class state(object):
    """State of changeset

    A utility object that handles several behaviours and contains useful code

    A state is defined by:
        - Its name
        - Its properties (defined right above)

        - Its next state.

    XXX maybe we could stick a description of the state semantics here.
    """

    def __init__(self, name, properties=0, next=None):
        self.name = name
        self.properties = properties
        assert next is None or self < next
        self.next = next

    def __repr__(self):
        return 'state(%s)' % self.name

    def __str__(self):
        return self.name

    @util.propertycache
    def trackheads(self):
        """Do we need to track heads of changeset in this state ?

        We don't need to track heads for the last state as this is repo heads"""
        return self.next is not None

    def __cmp__(self, other):
        """Use property to compare states.

        This is a naive approach that assumes that the next state has strictly
        more properties than the one before
        # assert min(self, other).properties == self.properties & other.properties
        """
        return cmp(self.properties, other.properties)

    @util.propertycache
    def _revsetheads(self):
        """function to be used by revset to finds heads of this states"""
        assert self.trackheads
        def revsetheads(repo, subset, x):
            # report the actual symbol name instead of a hardcoded one
            args = revset.getargs(x, 0, 0,
                                  _('%s takes no arguments') % self.headssymbol)
            heads = []
            for h in repo._statesheads[self]:
                try:
                    heads.append(repo.changelog.rev(h))
                except error.LookupError:
                    pass
            heads.sort()
            return heads
        return revsetheads

    @util.propertycache
    def headssymbol(self):
        """name of the revset symbols"""
        if self.trackheads:
            return "%sheads" % self.name
        else:
            return 'heads'

# Actual state definition

ST2 = state('draft', _NOSHARE | _MUTABLE)
ST1 = state('ready', _MUTABLE, next=ST2)
ST0 = state('published', next=ST1)

# all available state
STATES = (ST0, ST1, ST2)
# all available state by name
STATESMAP = dict([(st.name, st) for st in STATES])

@util.cachefunc
def laststatewithout(prop):
    """Find the states with the most property but <prop>

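    (This function is necessary because the whole state stuff is abstracted)"""
    # STATES is ordered by increasing properties: keep the last state seen
    # before <prop> shows up, including when no state has <prop> at all
    candidate = None
    for state in STATES:
        if state.properties & prop:
            break
        candidate = state
    return candidate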

# util function
#############################
def noderange(repo, revsets):
    """The same as revrange but return node"""
    return map(repo.changelog.node,
               scmutil.revrange(repo, revsets))

# Patch changectx
#############################

# named ``ctxstate`` so it does not shadow the ``state`` class defined above
def ctxstate(ctx):
    """return the state object associated to the context"""
    if ctx.node() is None:
        return STATES[-1]
    return ctx._repo.nodestate(ctx.node())
context.changectx.state = ctxstate

# improve template
#############################

def showstate(ctx, **args):
    """Show the name of the state associated with the context"""
    return ctx.state()


# New commands
#############################


def cmdstates(ui, repo, *states, **opt):
    """view and modify activated states.

    With no argument, list activated states.

    With arguments, activate the given states.

    With arguments plus the --off switch, deactivate the given states.

    note: the published state is always activated."""

    if not states:
        for st in sorted(repo._enabledstates):
            ui.write('%s\n' % st)
    else:
        off = opt.get('off', False)
        for state_name in states:
            for st in STATES:
                if st.name == state_name:
                    break
            else:
                ui.write_err(_('no state named %s\n') % state_name)
                return 1
            if off:
                # --off must never enable a state that was not enabled
                repo._enabledstates.discard(st)
            else:
                repo._enabledstates.add(st)
        repo._writeenabledstates()
    return 0

cmdtable = {'states': (cmdstates,
                       [('', 'off', False, _('deactivate the state'))],
                       '<state>')}

# automatic generation of command that set state
def makecmd(state):
    def cmdmoveheads(ui, repo, *changesets):
        revs = scmutil.revrange(repo, changesets)
        repo.setstate(state, [repo.changelog.node(rev) for rev in revs])
        return 0
    # a formatted string is not a docstring literal, so set __doc__ explicitly
    cmdmoveheads.__doc__ = """set revisions in %s state

    This command also alters the state of ancestors if necessary.
    """ % state
    return cmdmoveheads

for state in STATES:
    if state.trackheads:
        cmdmoveheads = makecmd(state)
        cmdtable[state.name] = (cmdmoveheads, [], '<revset>')

# Pushkey mechanism for mutable
#########################################

def pushstatesheads(repo, key, old, new):
    """receive a new state for a revision via pushkey

    It only moves a revision from a state to a <= one

    Return True if the <key> revision exists in the repository
    Return False otherwise (and don't alter any state)"""
    st = STATESMAP[new]
    w = repo.wlock()
    try:
        newhead = node.bin(key)
        try:
            repo[newhead]
        except error.RepoLookupError:
            return False
        repo.setstate(st, [newhead])
        return True
    finally:
        w.release()

def liststatesheads(repo):
    """List the boundary of all states.

    {"node-hex" -> "comma separated list of state",}
    """
    keys = {}
    for state in [st for st in STATES if st.trackheads]:
        for head in repo.stateheads(state):
            head = node.hex(head)
            if head in keys:
                keys[head] += ',' + state.name
            else:
                keys[head] = state.name
    return keys

pushkey.register('states-heads', pushstatesheads, liststatesheads)


# Wrap discovery
####################
def filterprivateout(orig, repo, *args, **kwargs):
    """wrapper for findcommonoutgoing that removes _NOSHARE"""
    common, heads = orig(repo, *args, **kwargs)
    if getattr(repo, '_reducehead', None) is not None:
        heads = repo._reducehead(heads)
    # always return a result, even for repos without states support
    return common, heads

def filterprivatein(orig, repo, remote, *args, **kwargs):
    """wrapper for findcommonincoming that remove _NOSHARE"""
    common, anyinc, heads = orig(repo, remote, *args, **kwargs)
    if getattr(remote, '_reducehead', None) is not None:
        heads = remote._reducehead(heads)
    return common, anyinc, heads

# WireProtocols
####################
def wireheads(repo, proto):
    """Altered head command that doesn't include _NOSHARE

    This is a wire protocol command"""
    st = laststatewithout(_NOSHARE)
    h = repo.stateheads(st)
    return wireproto.encodelist(h) + "\n"

def uisetup(ui):
    """
    * patch stuff for the _NOSHARE property
    * add template keyword
    """
    # patch discovery
    extensions.wrapfunction(discovery, 'findcommonoutgoing', filterprivateout)
    extensions.wrapfunction(discovery, 'findcommonincoming', filterprivatein)

    # patch wireprotocol
    wireproto.commands['heads'] = (wireheads, '')

    # add template keyword
    templatekw.keywords['state'] = showstate

def extsetup(ui):
    """Extension setup

    * add revset entry"""
    for state in STATES:
        if state.trackheads:
            revset.symbols[state.headssymbol] = state._revsetheads

def reposetup(ui, repo):
    """Repository setup

    * extend repo class with states logic"""

    if not repo.local():
        return

    ocancopy = repo.cancopy
    opull = repo.pull
    opush = repo.push
    o_tag = repo._tag
    orollback = repo.rollback
    o_writejournal = repo._writejournal
    class statefulrepo(repo.__class__):
        """An extension of repo class that handle state logic

        - nodestate
        - stateheads
        """

        def nodestate(self, node):
            """return the state object associated to the given node"""
            rev = self.changelog.rev(node)

            for state in STATES:
                # skip the last state: its heads are not tracked
                if state.next is not None:
                    ancestors = map(self.changelog.rev, self.stateheads(state))
                    ancestors.extend(self.changelog.ancestors(*ancestors))
                    if rev in ancestors:
                        break
            return state



        def stateheads(self, state):
            """Return the set of heads that define the state"""
            # look for a relevant state
            while state.trackheads and state.next not in self._enabledstates:
                state = state.next
            # last state have no cached head.
            if state.trackheads:
                return self._statesheads[state]
            return self.heads()

        @util.propertycache
        def _statesheads(self):
            """{ state-object -> set(defining head)} mapping"""
            return self._readstatesheads()


        def _readheadsfile(self, filename):
            """read head from the given file

            XXX move me elsewhere"""
            heads = [nullid]
            try:
                f = self.opener(filename)
                try:
                    heads = sorted([node.bin(n) for n in f.read().split() if n])
                finally:
                    f.close()
            except IOError:
                pass
            return heads

        def _readstatesheads(self, undo=False):
            """read all state heads

            XXX move me elsewhere"""
            statesheads = {}
            for state in STATES:
                if state.trackheads:
                    filemask = 'states/%s-heads'
                    filename = filemask % state.name
                    statesheads[state] = self._readheadsfile(filename)
            return statesheads

        def _writeheadsfile(self, filename, heads):
            """write the given <heads> in the file at <filename>

            XXX move me elsewhere"""
            f = self.opener(filename, 'w', atomictemp=True)
            try:
                for h in heads:
                    f.write(hex(h) + '\n')
                f.rename()
            finally:
                f.close()

        def _writestateshead(self):
            """write all heads

            XXX move me elsewhere"""
            # XXX transaction!
            for state in STATES:
                if state.trackheads:
                    filename = 'states/%s-heads' % state.name
                    self._writeheadsfile(filename, self._statesheads[state])

        def setstate(self, state, nodes):
            """change the state of the target changesets and their ancestors.

            Simplify the list of heads."""
            assert not isinstance(nodes, basestring), repr(nodes)
            heads = self._statesheads[state]
            olds = heads[:]
            heads.extend(nodes)
            heads[:] = set(heads)
            heads.sort()
            if olds != heads:
                # use self, not the ``repo`` captured by reposetup
                heads[:] = noderange(self, ["heads(::%s())" % state.headssymbol])
                heads.sort()
            if olds != heads:
                self._writestateshead()
            if state.next is not None and state.next.trackheads:
                self.setstate(state.next, nodes) # cascading

        def _reducehead(self, candidates):
            """recompute a set of heads so it doesn't include _NOSHARE changeset

            This is basically a complicated method that compute
            heads(::candidates - _NOSHARE)
            """
            selected = set()
            st = laststatewithout(_NOSHARE)
            candidates = set(map(self.changelog.rev, candidates))
            heads = set(map(self.changelog.rev, self.stateheads(st)))
            shareable = set(self.changelog.ancestors(*heads))
            shareable.update(heads)
            selected = candidates & shareable
            unselected = candidates - shareable
            for rev in unselected:
                for revh in heads:
                    if self.changelog.descendant(revh, rev):
                        selected.add(revh)
            return sorted(map(self.changelog.node, selected))

        ### enable // disable logic

        @util.propertycache
        def _enabledstates(self):
            """The set of state enabled in this repository"""
            return self._readenabledstates()

        def _readenabledstates(self):
            """read enabled states from disk"""
            states = set()
            states.add(ST0)
            try:
                f = self.opener('states/Enabled')
                try:
                    for line in f:
                        st = STATESMAP.get(line.strip())
                        if st is not None:
                            states.add(st)
                finally:
                    f.close()
            except IOError:
                # no file yet: only ``published`` is enabled
                pass
            return states

        def _writeenabledstates(self):
            """write enabled states to disk"""
            f = self.opener('states/Enabled', 'w', atomictemp=True)
            try:
                for st in self._enabledstates:
                    f.write(st.name + '\n')
                f.rename()
            finally:
                f.close()

        ### local clone support

        def cancopy(self):
            """deny copy if there is _NOSHARE changeset"""
            st = laststatewithout(_NOSHARE)
            return ocancopy() and (self.stateheads(st) == self.heads())

        ### pull // push support

        def pull(self, remote, *args, **kwargs):
            """altered pull that also update states heads on local repo"""
            result = opull(remote, *args, **kwargs)
            remoteheads = self._pullstatesheads(remote)
            for st, heads in remoteheads.iteritems():
                self.setstate(st, heads)
            return result

        def push(self, remote, *args, **opts):
            """altered push that also update states heads on local and remote"""
            result = opush(remote, *args, **opts)
            remoteheads = self._pullstatesheads(remote)
            for st, heads in remoteheads.iteritems():
                self.setstate(st, heads)
                if heads != self.stateheads(st):
                    self._pushstatesheads(remote, st,  heads)
            return result

        def _pushstatesheads(self, remote, state, remoteheads):
            """push heads of a given state to remote

            This handles pushing a boundary that does not exist on the remote
            host: we fall back to the parents of the refused head.
            This is done in a very naive way"""
            local = set(self.stateheads(state))
            missing = local - set(remoteheads)
            while missing:
                h = missing.pop()
                ok = remote.pushkey('states-heads', node.hex(h), '', state.name)
                if not ok:
                    missing.update(p.node() for p in self[h].parents())


        def _pullstatesheads(self, remote):
            """pull all remote states boundary locally

            This can only make the boundary move on a newer changeset"""
            remoteheads = {}
            self.ui.debug('checking for states-heads on remote server\n')
            if 'states-heads' not in remote.listkeys('namespaces'):
                self.ui.debug('states-heads not enabled on the remote server, '
                              'marking everything as published\n')
                remoteheads[ST0] = remote.heads()
            else:
                self.ui.debug('server has states-heads enabled, merging lists\n')
                remotekeys = remote.listkeys('states-heads')
                # ``hexnode`` avoids shadowing the imported ``hex`` function
                for hexnode, statenames in remotekeys.iteritems():
                    for stn in statenames.split(','):
                        heads = remoteheads.setdefault(STATESMAP[stn], [])
                        heads.append(node.bin(hexnode))
            return remoteheads

        ### Tag support

        def _tag(self, names, node, *args, **kwargs):
            """Altered version of _tag that make tag (and tagging) published"""
            tagnode = o_tag(names, node, *args, **kwargs)
            if tagnode is not None: # do nothing for local one
                self.setstate(ST0, [node, tagnode])
            return tagnode

        ### rollback support

        def _writejournal(self, desc):
            """extended _writejournal that also save states"""
            entries = list(o_writejournal(desc))
            for state in STATES:
                if state.trackheads:
                    filename = 'states/%s-heads' % state.name
                    filepath = self.join(filename)
                    if  os.path.exists(filepath):
                        journalname = 'states/journal.%s-heads' % state.name
                        journalpath = self.join(journalname)
                        util.copyfile(filepath, journalpath)
                        entries.append(journalpath)
            return tuple(entries)

        def rollback(self, dryrun=False):
            """extended rollback that also restore states"""
            wlock = lock = None
            try:
                wlock = self.wlock()
                lock = self.lock()
                ret = orollback(dryrun)
                if not (ret or dryrun): # rollback did not fail
                    for state in STATES:
                        if state.trackheads:
                            # format the name before joining, so a '%' in the
                            # repository path cannot break the substitution
                            src = self.join('states/undo.%s-heads' % state.name)
                            dest = self.join('states/%s-heads' % state.name)
                            if os.path.exists(src):
                                util.rename(src, dest)
                            elif os.path.exists(dest): #unlink in any case
                                os.unlink(dest)
                    self.__dict__.pop('_statesheads', None)
                return ret
            finally:
                release(lock, wlock)

    repo.__class__ = statefulrepo