TL;DR; All the questions that we needed to clarify for
to complete have been answered.
We are now working on having a final implementation enabled by default in
In the past couple of years, multiple important milestones have been reached
for Mercurial's Changeset Evolution project.
Changeset Evolution allows for simple, safe and powerful exchange of draft
changesets that can be rewritten by a distributed team.
(The concept is also useful locally, but the most innovative parts are the
This progress is the result of the combined effort of a dozen people; thanks to
all of them.
In particular, we have been working on final algorithms for two of the central
concepts of changeset evolution: exchanging markers and automatic resolution
They now just need a final implementation in Core.
In addition, the last uncharted territory, the restoration of older
evolutions of changesets, has been explored.
We discovered that we have to track more data regarding changeset folding to
perform this task well.
This will trigger what we expect to be the last update in tracked data and
There are many improvements made apart from the ones we just highlighted, and
we'll go over them later in this post.
However, the highlighted ones are important milestones.
All the major algorithmic questions are now settled: we know how each complex
issue will be handled and what associated data storage we need.
Removing unknowns on those questions was a priority because they can have a
high impact on how we design user interfaces and on-disk data formats,
two things that are complicated to change afterwards.
Now that all those areas have been explored, we have a clear view of the route to
Evolve completion and can focus on actions that move directly toward this goal.
We need to polish and upstream the concept and code that are still inside the
Some of that code is already in good shape to go upstream.
The road is clear, but there is a lot to carry along, any kind of help is
Summary of work done since 2017
In general, all areas received many fixes, improvements, and polish. Some
important areas made significant progress:
Evolution provides powerful features, capable of handling complex issues.
We must do all we can to smooth this complexity, in particular when it comes
to naming new concepts.
To address feedback regarding the naming scheme, a group of people
dedicated a significant amount of time to coming up with a new naming scheme for
most of the changeset evolution concepts.
The resulting scheme was then applied in Core Mercurial, the Evolve
extension and TortoiseHg:
- Instability replaces Troubles
- Unstable replaces Troubled
- Predecessor replaces Precursor
- Orphan replaces Unstable
- Content-divergent replaces Divergent
- Phase-divergent replaces Bumped
wiki page associated with the renaming discussion
In addition, we made some updates to command and flag names:
- The hg uncommit command is merged into hg amend as hg amend --extract
- The unloved hg prune --biject flag has been renamed to hg prune --pair
To make distributed history editing manageable for the users involved, it is
important to make it easy for someone to understand what happened outside of
their local repository.
Changeset evolution tracks the changes to all changesets that have been rewritten.
This information is valuable in many ways, for example for
displaying what happened to a changeset, or automatically stabilizing it.
We made multiple improvements to make sure the information tracked was more
informative and better put to use.
Previously the obsolescence information was mostly used to detect and resolve
Obsmarkers now store more data:
- the effect of an obsolescence marker (message change, patch change, etc),
- the operation that created it (amend, rebase, ...),
- optionally, a user-defined note.
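As an illustration only, the extra information listed above can be pictured as fields on a marker record. The following Python sketch is not Mercurial's on-disk format or API: the class MarkerSketch, its field names and the describe helper are all hypothetical, chosen just to show how such data could feed a human-readable history.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class MarkerSketch:
    """Hypothetical in-memory view of one obsolescence marker."""
    predecessor: str                         # changeset that was rewritten
    successors: Tuple[str, ...]              # empty when the changeset was pruned
    effects: FrozenSet[str] = frozenset()    # e.g. {"description", "content"}
    operation: Optional[str] = None          # e.g. "amend", "rebase"
    note: Optional[str] = None               # free-form user note


def describe(marker: MarkerSketch) -> str:
    """Build a one-line summary from the data carried by the marker."""
    if marker.successors:
        text = f"{marker.predecessor} rewritten as {', '.join(marker.successors)}"
    else:
        text = f"{marker.predecessor} pruned"
    if marker.operation:
        text += f" using {marker.operation}"
    if marker.effects:
        text += f" ({', '.join(sorted(marker.effects))})"
    if marker.note:
        text += f" [note: {marker.note}]"
    return text


example = MarkerSketch("a1", ("b2",), frozenset({"description"}), "amend", "typo fix")
print(describe(example))
```

The point of the sketch is that each extra field directly becomes something a user can read, without having to compare the old and new changesets by hand.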
And that data gets used in more places:
- The hg obslog command displays the evolution of a changeset,
- Display successors and predecessors in hgweb when appropriate,
- Point at latest successors of obsolete changeset in
- Point to latest successors of obsolete working copy and accessed hidden changesets.
This greatly improved users' ability to understand the evolution that they, and their
team members, have performed,
making Changeset Evolution more accessible to people unfamiliar with it.
This area is in very good shape: there are multiple possible improvements,
but overall we have the features we need to provide a reasonable experience.
A good share of them have been implemented in Core already, the rest needs to
See the "What next?" section for possible future improvements in this area.
History Rewriting Commands
Smooth distributed history rewriting requires a good set of commands to rewrite
They need to be easy to use and to record the user's intent well so that we can
Changeset evolution allows more flexible history editing but also relies on
the user using the right command for the right operation to build a useful
A quality evolution history provides the user with the best experience.
To achieve this goal we need a clear, compact and powerful set of commands to
Progress was made on existing commands and new commands:
- hg fold got flags to clarify its two modes, --from and --exact (we are
not entirely happy with the result yet),
- hg uncommit is now available as hg amend --extract. This reduces the
number of commands and closes the debate about the
- hg amend gains a new --patch flag to directly edit a changeset's diff,
- The hg grab alias got upgraded to a full hg pick command (with proper
--continue support). The command is a mix of graft and rebase.
It can be used to reorganize a stack of changesets.
An important milestone is the introduction of a rewind command. It is a
command to restore a stack of changesets to the state of their predecessors.
The command is at an early stage and will need some time to mature; however,
it is already useful and part of some people's main workflow.
rewind command was part of our exploration of the evolution
And, indeed, we gathered very important insights.
hg rewind command can automatically find and restore predecessors of
To do so, it walks the evolution graph, but in the opposite
direction from the hg evolve command.
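The two walking directions can be pictured with a small Python sketch. It is only a toy model: markers are reduced to (predecessor, successors) pairs, and the helper names build_index and walk are hypothetical, not Mercurial APIs. The real commands do much more than compute reachability.

```python
from collections import defaultdict


def build_index(markers):
    """Index toy markers, given as (predecessor, successors) pairs, both ways."""
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for pred, succs in markers:
        for succ in succs:
            successors[pred].add(succ)
            predecessors[succ].add(pred)
    return successors, predecessors


def walk(start, edges):
    """Return every node reachable from start by following edges."""
    seen = set()
    todo = [start]
    while todo:
        node = todo.pop()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


# A changeset rewritten twice: A0 -> A1 -> A2
markers = [("A0", ["A1"]), ("A1", ["A2"])]
successors, predecessors = build_index(markers)

# Forward walk (the direction used to find where a changeset evolved to)
print(sorted(walk("A0", successors)))
# Backward walk (the direction used to find older versions to restore)
print(sorted(walk("A2", predecessors)))
```

The same index serves both directions; only the edge set passed to walk changes.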
It turns out that, in the same way we had to store special information for
split, we will need extra information to properly differentiate fold from
We currently do not have this information.
This is a good example of a data requirement we had to discover early in order to
shape a functional changeset evolution in the end.
Changeset Evolution's core purpose is to unlock history editing in a distributed
To achieve this goal, the exchange of obsolescence information between
repositories is critical.
The question of which obsolescence markers should be exchanged during push and
pull was solved a while ago.
Further testing in diverse environment setups has confirmed this logic is
However, at the start of 2017, an important issue remained: how to discover
which markers are missing on the other side?
Without an efficient way to detect them, we could not provide an efficient
synchronization of the obsolescence data.
Fortunately, we developed an algorithm and protocol that can perform an
efficient discovery for obsolescence markers.
In our daily usage on the Mercurial repository, this saves us the extra minute
we were spending on obsmarkers discovery using the previous method.
The data structures used by this algorithm scale well with repository size,
an important point that qualifies this solution for all usages.
This new method got tested in various settings and it will be used by default
in the next version of the Evolve extension.
In addition, one data structure we developed for this algorithm, "stablerange",
will likely be useful to help with exchanges and caching of other data.
It could be used for pull bundle caching decisions.
The main goal of the current implementation was to validate the approach and
the scaling property of our algorithm.
Its performance and cache storage implementation are not great and
will have to be reworked when upstreamed.
Workflow and stack
Something important connected to evolution is the clear definition of the
group of changesets related to the current user work.
Such clarity is a great help to provide good behavior and information to the
users regarding possible instability in their current work.
We made a lot of progress in this area in the past couple of years.
- Automatic instability resolution, using hg evolve, restricts itself to
the current topic by default.
This avoids selecting unrelated changesets during stabilization, something
that has been confusing users.
- The hg next and hg prev commands also restrict themselves to the current
topic, making their movement more predictable and useful.
- The hg stack command allows for a quick view of the work in progress,
including listing changesets in a semantic order, same as if they were all
This work also offers a simple, yet powerful, workflow for feature branching in
Having topics tightly linked to phases makes them a good tool to enforce a healthy
phase movement practice in a project.
Since Changeset evolution is also tightly coupled with phases, healthy phase
movement is important here. Here is the progress made:
New server repository mode: changesets with a topic stay draft on push, others
get published. This is especially useful as it bridges the gap between publishing and
Using this mode by default will preserve backward compatibility with
current Mercurial behavior regarding phase movement while allowing the
user to opt in from the client side for exchanging drafts when they want
New push flag --publish: to publish selected changesets on push.
There have been various improvements on topic usability. Notably:
- clarification of topic activation, state, and movement on push/pull and
- ability to force new draft commits to have a topic (or automatically assign
a random one),
- improved topic discoverability with hg topic --age,
- introduction of an s0 label referring to the parent of the topic root,
- updating to s0 keeps the topic active, making insertion of new
changesets at the start of the topic simpler,
- hg prev can go down to
We are getting good feedback from people using this workflow.
We would like to upstream all these improvements.
Some of them are not really attached to topic (eg: hg push --publish) but we
feel that topic is overall a good branching solution for Mercurial and would
like to see more of it upstream.
This is a critical part of Evolution and we made a lot of important progress in
Exchanging draft changesets can bring "instabilities" in one's repository.
We can use various strategies to reduce the odds of it happening; however,
given the distributed nature of Mercurial, we cannot guarantee it won't.
Because instability can happen, we need a good automatic resolution of it.
We are now in pretty good shape.
hg evolve command can now keep track of multi-step operations unlocking
- Good handling of evolution interrupted by merge conflict,
--stop flags works as expected. (for both
hg evolve and
- Automatic stabilization of situations involving an orphan merge,
- Automatic stabilization of phase-divergence in most complex cases,
- Automatic stabilization of content-divergence in most complex cases.
So we now have all the instability types well under control.
There are still some corner cases involving split, fold or merge that remain to
be properly handled.
However, the core tool to handle them now exists.
With all these improvements to
hg evolve and
hg next, we will be able to
update the default behavior of these two commands to something final, and
upstream them (more about this in the next section).
This section focuses on concrete actions to bring Changeset Evolution to
It is intended for people with some knowledge of Mercurial internals.
The most important thing to clear up right now is the need for better
tracking of fold.
Fixing it right might eventually affect disk storage and various algorithms.
Getting this out of the way as soon as possible is important.
In the same move, we need to mature the
hg rewind command.
Easy undo is a critical item if we are to hand changeset evolution to all
In parallel, there are multiple areas in the Evolve extension that are ready
to be cleaned up and upstreamed (obsmarkers discovery, cache, internal
rewriting toolkit, …).
We should also resume the upstreaming of the improvements regarding stack
management and publishing workflow contained in the topic extension.
At first, this area might seem unrelated to enabling changeset evolution in
Core. In practice, the clarification of working set and publishing workflows
greatly reduces complexity around local and distributed work on draft changesets.
So time spent improving these areas is well balanced by the time we do not have
to spend solving more complex user experience issues elsewhere.
Finally, there are a couple of areas that are mostly done but need focus to
be wrapped up.
This is mainly about the history rewriting commands and automatic instabilities
In these cases, there are small actions to take before freezing the user
experience in Core.
Of course, more improvements will be made to them once in Core.
However, such improvements do not seem necessary to enable evolution by
We need better tracking of fold operations.
This is necessary to provide a fully functional
hg rewind command.
This command is important because users need a simple way to undo mistakes.
We need to update the marker creation API and to store this information.
Such change might impact the on-disk storage format and more.
The way we currently store split has a couple of quirks (eg: when only some
of the split successors are exchanged). The current ideas for storing fold
data could also be applied to split, handling these cases better than with the
current split encoding. Tracking splits and folds is quite an interesting
issue; we'll get back to it in an independent blog post.
These possible low-level changes make the tracking of fold a priority target.
Upgrading the on-disk storage of existing users is never simple and it might also
affect multiple algorithms that will have to cope with a different split
History rewriting toolkits
To power its history rewriting commands and its UI, Evolve has a full toolkit
for building history rewriting commands.
This toolkit is ready to be upstreamed and would be a good first step toward
upstreaming the commands, free of any question related to user experience.
All the current work done around better evolution history is either already
upstream, or ready to be upstreamed.
We should keep that effort going.
There is one extra feature that users have been requesting, a clear
transaction log to keep trace of what exactly happens from each past
This would be especially useful on push and pull as they may introduce many
changes from other people in one repository.
This would fit the journal extension, making it track a wider range of data.
This is not a blocker, but it would be nice to have.
The Evolve extension provides multiple commands; most of them are close to being
"done" enough for our goal of Evolve enabled by default in Core.
The one command that needs serious attention to mature is the
Users can "rewind" the evolution of a stack of changesets using this command.
The first version of this command revealed we needed to record more data about
Once this data is available we should have all the pieces to build a final user
experience for this command.
Having this command is very important for the Changeset Evolution experience.
To trust the tool, users need to be confident they can undo their mistakes.
History rewriting command
Over the years, history rewriting commands in the Evolve extension evolved into a
pretty solid base.
As mentioned in a previous subsection, the internal toolkit powering these
functions could be upstreamed now.
The commands themselves need various adjustments to ensure consistency and
solve some long-standing questions (those we'll cover in the second part of this blog post).
All these adjustments are small and should be easy to perform.
That last round of polish is best done in the Evolve extension to give it the
widest possible testing before we settle it.
Here are examples of small adjustments we need to make:
- hg amend --extract needs to expose a flag similar to hg uncommit --rev
before we can fully drop the older command,
hg fold requires either one of
We should pick one to be the default.
--exact has a bit more support among Mercurial developers, but also requires the
user to learn some revsets, which is bad.
If we have access to the shorter in-stack references from topic (s1, s2, etc),
the hg fold s1+s2 form might be good enough,
provided we give a proper example of how to use this in the documentation.
- The hg split command is inconsistent with other similar commands (amend,
revert, uncommit, etc).
It does not accept filenames and is interactive by default.
We should align its behavior with the other commands.
Of course, this command-set will keep getting larger improvements.
For example, ideas have been floating around about making it easier to control
changeset order in the stack via new command flags.
However, those improvements are not on the critical path to enable changeset
evolution in Core.
So we don't plan to spend time on them until then.
There has been a long-standing debate: should we automatically evolve
descendants of rewritten changesets?
Automatically evolving orphans provides a smoother experience to users in many
cases but comes with multiple drawbacks:
- It can trigger merges earlier than the user wants, forcing them to switch
context away from the changeset they were rewriting.
- When rewriting all changesets in a stack, automatically evolving all
descendants means the creation of an
N² complexity scales very badly and this can have a very dire impact on
users' repositories and overall experience.
So we tend to steer users toward gradual evolution instead.
This issue is referred to as "obs markers explosion".
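To get a feel for the numbers, here is a rough counting model in Python. It rests on simplifying assumptions that are ours, not a description of the actual implementation: every changeset of a linear stack is rewritten exactly once, from bottom to top, and each rewrite or evolve of a changeset creates exactly one marker. The function names are hypothetical.

```python
def markers_with_auto_evolve(stack_size):
    """Every rewrite immediately evolves all descendants of the rewritten changeset."""
    total = 0
    for position in range(1, stack_size + 1):   # 1 is the bottom of the stack
        total += 1                              # marker for the rewrite itself
        total += stack_size - position          # one marker per evolved descendant
    return total


def markers_with_gradual_evolve(stack_size):
    """Each changeset is evolved once, right before it gets rewritten."""
    if stack_size == 0:
        return 0
    rewrites = stack_size
    evolves = stack_size - 1                    # the bottom changeset is never orphan
    return rewrites + evolves


for n in (5, 10, 100):
    print(n, markers_with_auto_evolve(n), markers_with_gradual_evolve(n))
```

Under these assumptions the automatic variant creates n(n+1)/2 markers against 2n-1 for the gradual one, so the gap widens quickly as stacks grow.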
However, there has been a large demand for automatic evolution at least in some
Some commands can guarantee they won't introduce conflicts, lifting related
We could even consider extending that logic to all commands as long as no
conflict is detected.
The N² obsmarkers explosion is trickier because all commands could possibly
Math does not really make compromises, N² is unsustainable for most values of N.
However, we still have room for improved experience.
We could imagine allowing auto evolution if the number of descendants is small
(maybe 4 or 5), or try to detect repeated rewrite in the stack.
Once we detect a problematic pattern or stack size, we could skip auto
evolution and redirect the user toward a helpful documentation page.
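Such a safeguard could look like the following Python sketch. Nothing here exists in Mercurial or Evolve: the function name, its parameters and the default thresholds are hypothetical, picked only to mirror the "4 or 5 descendants" and "repeated rewrite" ideas above.

```python
def should_auto_evolve(descendant_count, rewrites_in_stack=0,
                       max_descendants=5, max_rewrites=1):
    """Decide whether orphan descendants may be evolved automatically.

    descendant_count: number of descendants that would become orphan.
    rewrites_in_stack: how many rewrites already happened in this stack
        (a crude stand-in for "repeated rewrite" detection).
    """
    if descendant_count == 0:
        return False          # nothing to evolve
    if descendant_count > max_descendants:
        return False          # stack too large, risk of marker explosion
    if rewrites_in_stack > max_rewrites:
        return False          # repeated rewrites, fall back to manual evolve
    return True


print(should_auto_evolve(3))                        # small stack
print(should_auto_evolve(12))                       # large stack
print(should_auto_evolve(3, rewrites_in_stack=4))   # repeated rewrites
```

When the function returns False, a command would skip auto evolution and could print a hint pointing at the documentation instead.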
The hg stack command provided by the topic extension provides a clear view of
the current work in progress, even when orphans are involved.
It is a good tool to reduce the user's urge to run
hg evolve --all
A good way to avoid exposing users to the complexity inherent to instability is
to avoid creating it in the first place.
Core Mercurial already has some mechanism that history rewriting commands can
use to check if their planned rewrite is valid.
However, we need to make sure it is used by all history rewriting operations and
that it catches all the instability types we want to prevent.
All the logic we use for obsmarkers discovery is sound and ready to be
However, the code will have to be rewritten as it gets in.
The current implementation evolved from a multi-step experiment, uses too
many complicated indirections and is not very effective in general.
Some of the cache storage we use is also problematic and will have to be
In practice, most of the data we cache is not volatile.
It is an inherent property of changesets, so we could imagine storing these
values directly in a "changelog v3" format instead of keeping an independent
I expect the discovery to keep being improved over time; however, these
improvements can come later and are not on the critical path for enabling
evolution in Core.
The work needed to upstream this is well defined but significant.
Finding other use cases for the stablerange data might get more people
interested in making it happen.
The caches used for obsmarkers discovery share a lot of code with other
performance related caches that should move upstream too.
Stack and workflow
The topic extension provides various features that contribute to providing a
smooth Changeset Evolution experience. While not strictly necessary, these
features simplify the task of making changeset evolution accessible to all
One of the core features of topic is the clear definition of the current working
set: the stack.
As each changeset explicitly carries the topic information, there is no room for
Related changesets that a user is actively working on are not always linear.
Either the user action or the instability brought in by distributed rewriting
can spread them on multiple topological branches.
The clear stack definition from topic handles these situations, making it an
This stack definition simplifies operations around evolution.
For example, restricting most operations to the current stack makes things
much more predictable:
- hg prev and hg next ignore other unrelated
- hg evolve only selects items in the stack.
Having a predictable outcome for these commands is important for users to trust
Without a clear stack boundary, the behavior of these commands becomes either more
limited or more complicated to explain.
The stack defines a limited number of changesets relevant to the current situation.
A small number allows for better UI.
For example, we can provide a hg stack command that displays orphan changesets
as if they were already evolved.
This provides a preview of the final structure of their stack, even if some of
it is still orphan.
This is a powerful tool to make changeset evolution accessible to all kinds of
The limited number of changesets makes it possible to bring back incremental
numbers to refer to a changeset.
With topic the first item in the stack is nicknamed "s1", the second "s2".
This is very useful to refer to changesets without having to copy paste obscure
The numbering is also preserved across rewrites, making it useful for a longer
Changeset Evolution empowers commit-centric workflows and "s#" aliases make it
easier to reference individual commits.
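The idea behind the "s#" aliases can be sketched in a few lines of Python. This is an illustration under our own assumptions, not the topic extension's code: the stack is modelled as an ordered list of node identifiers, and the stack_aliases helper is hypothetical.

```python
def stack_aliases(stack, base=None):
    """Map "s1", "s2", ... to the changesets of a stack, bottom first.

    base, when given, is the parent of the stack's first changeset and
    gets the "s0" alias.
    """
    aliases = {}
    if base is not None:
        aliases["s0"] = base
    for index, node in enumerate(stack, start=1):
        aliases[f"s{index}"] = node
    return aliases


before = stack_aliases(["a1f3", "9c2e", "77b0"], base="d0d0")
# After rewriting the second changeset, its successor takes the same slot,
# so "s2" keeps pointing at "the second changeset of my work".
after = stack_aliases(["a1f3", "5e8a", "77b0"], base="d0d0")

print(before["s2"], after["s2"])
```

Because an alias names a position in the stack rather than a hash, it stays meaningful while the hashes underneath change.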
In practice, there are other ways to define resilient stacks that provide these benefits.
For example, a stack definition based on phases and named-branch would do.
Named-branches are a strong fit for long-lived branches, but they have life
Topic feels like a better solution for feature branching.
Exchange and Publishing workflow
Another interesting aspect of topics is how their life cycle is tied to phases.
Changeset Evolution can be used on the non-public part of history, and for
sanity's sake, this part should stay fairly limited.
Nobody wants to see a six-month-old changeset rewritten, creating tons of orphan
changesets in the process.
A core property of topics is that they fade away when they are published.
This means that accepting a topic into the main branch requires publishing it.
Having a workflow step that explicitly involves publishing makes sure the set of
non-public changesets remains reasonable.
Topic also solves another pain point regarding phases.
The main purpose of Changeset evolution is to unlock distributed collaboration
on draft changesets.
However, the current default for any Mercurial server is to publish changesets on
push … so much for exchanging drafts.
It is possible to configure the server to no longer publish on push but this
has multiple drawbacks.
First, this is a server-side config, usually more complicated to set up.
Second, it means the phase cycle has to be manually thought about and handled for
Finally, the change affects all users: you can't have a small group of advanced
users playing with advanced features without impacting other users.
A common workaround is to have two repositories, a publishing, and a
This makes things more complicated than we would like.
Because topic is a new concept that people have to opt in to, we have more
We could have a new default mode for servers where changesets are published on
push as usual unless they have a topic.
This way, users can opt in to draft changeset exchange without server side
configuration and without impacting the other users.
And since exchanging drafts requires them to have a topic, they won't interfere
with the usual branch resolution of users not interested in the topic.
This offers a smooth path to draft changeset exchange without breaking backward
As a bonus, that scheme forces draft changesets to explicitly belong to a feature
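The server mode described above boils down to a small decision rule, sketched here in Python. This is a toy model, not the server's code: the function name, the mapping of node to topic, and the publish_flag parameter (standing in for the effect of pushing with --publish) are assumptions made for the illustration.

```python
def phases_after_push(pushed, publish_flag=False):
    """Return the phase each pushed changeset ends up in on the server.

    pushed: mapping of changeset identifier -> topic name, or None when
        the changeset has no topic.
    publish_flag: models a push that explicitly asks for publication.
    """
    result = {}
    for node, topic in pushed.items():
        if publish_flag or not topic:
            result[node] = "public"   # usual behavior: publish on push
        else:
            result[node] = "draft"    # topic changesets stay rewritable
    return result


pushed = {"a1f3": None, "9c2e": "fix-parser", "77b0": "fix-parser"}
print(phases_after_push(pushed))
print(phases_after_push(pushed, publish_flag=True))
```

Users who never set a topic see no difference from a publishing server, which is what keeps the mode backward compatible.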
To deal with the phase cycle, the topic extension provides various options:
- A new hg push --publish flag that makes a push publishing even when
pushing to a non-publishing server
- A config option to have server behave as described above (non-publishing for
Topic also comes with small workflow improvements, like intuitive rebase
destinations, making it easier to use.
One of the key improvements is the ability to push a new topic to a server
Removing the cumbersome need to use this dangerous flag.
The concepts explained above are in good overall shape, and we got good
feedback on them.
So, what would it take to add support for all this stack and workflow related
Some of them are clear and good workflow improvements not strongly related to topic:
- The hg push --publish option proved very useful,
- The experimental config option to limit a repository to 1 head per name
should learn about closed heads and graduate from experimental.
Some of the "stack" related logic is also independent of
topic and could be
first implemented around named branches.
- "s#" aliases for changesets in a stack,
- Features from hg stack (eg: evolution aware order),
- Constraint to the range of hg next and
Finally, there are interesting pieces directly related to
topic manipulation commands and lifecycle,
- The new publishing mode for servers,
- The ability to push new topic without --force.
General performance and cleanup work will be needed.
However implementing these specific features directly in Core will be
significantly simpler than from the extension.
Commands changes and upstreaming
There are some incoming changes to hg next: the clear definition of a working
set to move within and the improved recovery when conflicts occur during
evolution mean we no longer need to hide hg next triggered evolution behind
a --evolve flag.
It should become the default.
And a related change to hg evolve: step by step evolution is an important
feature. However, since hg next will be able to fill this role, it makes sense
for hg evolve to evolve all changesets in the current stack by default.
In addition, matching the rebase behavior of preserving the working copy parent
seems in order.
These changes will happen in the Evolve extension.
However, after they happen, both hg evolve and hg next should be ready to
be upstreamed provided we have access to a clear definition of the "current
working set" of revisions (as topic provides).
We already have a couple of ways to trigger automatic instability resolution
(eg: next, evolve, …).
However, there could be another good vector to expose it to the user
The command is already dealing with working copy change and merge conflict.
An hg update --evolve flag could make sense and offer a simple user
experience for some of the simpler cases.
Lowering the barrier of entry is always useful.
hg evolve can now handle the majority of instability cases, but some
remain to be handled.
The unhandled cases usually involve a mix of phase-divergence and
content-divergence or some merges, splits or folds. We need to hunt down and
handle these remaining cases.
It is probably simpler to tackle the last corner cases of automatic
stabilization from the Evolve extension,
so that we can have a wider set of users to test them more quickly.
Besides that, the whole logic is in good shape and ready to be upstreamed.
We can probably start to upstream some of the core bits sooner (eg: upstreaming
hg next would make a good excuse to upstream the orphan resolution).
An alternative approach would be to upstream the current instabilities
resolution logic in a way where the Evolve extension can monkey patch fixes
for people using an older version.
This might be more work overall.
There have been inline documentation and tutorials written alongside evolve
However, most of it does not contain the latest commands and workflow. Help is
welcome to refresh them.
I'm happy with the progress made in the past years.
Like many, I wish we could have done more, faster, but I'm very excited to have
a clear view on project completion now.
By project completion, I mean Mercurial Core containing a good enough subset of
Changeset Evolution that it could be enabled by default.
At Octobus, we do our best to bring this to completion and mobilize people and
resources to reach this goal.
Next, our own effort will focus on getting the foundational concepts in Core:
rewind command, automated resolution of instabilities and efficient obsmarkers
In parallel, we'll take care of the supporting concepts necessary to safely
enable Changeset Evolution for all users (obs-history viewing, supporting
commands, stack and workflow, …).
Of course, this effort is not just performed by people at Octobus or people
working closely with us.
Other Mercurial contributors are also working on their own to complete this project.
What exactly they will do, and in what order, is for them to decide.
All sorts of help are welcome.
For multiple years now we have helped the project move forward in many ways: through
direct contributions of course, but also by training new people to contribute to
the concept, and finally by efficiently gathering and spending money, reaching
out, funding, and steering other members of our
open source community.
Reach out to us if you want to contribute time or money to see Changeset
Evolution enabled by default in Mercurial.
Discussions around the Changeset Evolution concept usually happen on the
#hg-evolve IRC channel on freenode.
If you are using the Evolve extension, do not forget to subscribe to the user list,