Boosting a Python app with Rust: the case of Mercurial
Incremental enhancing of a medium/large Python code base with Rust
Georges Racinet
Extensions: what and why ?
An extension is a Python module made of native code.
- Performance (speed)
- Memory efficiency
- True multithreading: releasing the GIL
- Wrapping an existing library (not our case)
Traditionally, written in C, using the CPython API and taking the form
of a shared object library, that CPython loads dynamically.
Mercurial (hg)
https://mercurial-scm.org
- Distributed version control system (DVCS)
- Same generation as Git
- Core written in Python (85 kLOC), boosted by C Extensions (10 kLOC)
- Supports Python plugins (called Hg Extensions)
- Some strong points:
- non-destructive, shareable, history manipulation (rebase, amend)
- very large code bases (monorepos)
- large histories (millions commits)
Why in Rust ?
- memory safe (leaks, security)
- not garbage collected by default
- focus on multithreading
- ABI compatible with C
- batteries included, too!
Example: ancestors iteration
import heapq
def iter_ancestors(parentfn, initrevs):
seen = {nullrev}
seen.update(initrevs)
visit = [-r for r in initrevs]
heapq.heapify(visit)
while visit:
current = -visit[0]
yield current
p1, p2 = parentfn(current)
if p1 not in seen:
heapq.heapreplace(visit, -p1)
seen.add(p1)
else:
heapq.heappop(visit)
heapq.heappush(visit, -p2)
seen.add(p2)
We are already squeezing the most out of Python.
The code actually starts like this:
def iter_ancestors(parentfn, initrevs):
seen = {nullrev}
see = seen.update
heappush = heapq.heappush
see(initrevs)
visit = [-r for r in initrevs]
heapq.heapify(visit)
(…)
Such method prefetching really make a difference in hg case!
Example: ancestors iteration (Rust)
pub struct AncestorsIterator<G: Graph> {
graph: G,
visit: BinaryHeap<Revision>,
seen: HashSet<Revision>,
}
impl AncestorsIterator<G: Graph> {
pub fn new(graph: G, initrevs: impl IntoIterator<Item = Revision> -> Self {
let visit: BinaryHeap<Revision> = initrevs.collect();
let seen = visit.iter().cloned().collect();
seen.insert(NULL_REVISION);
AncestorsIterator {visit: visit, seen: seen};
}
}
Example: ancestors iteration (Rust)
impl<G: Graph> Iterator for AncestorsIterator<G> {
type Item = Result<Revision, GraphError>;
fn next(&mut self) -> Option<Self::Item> {
let current = match self.visit.peek() {
None => { return None; }
Some(c) => *c,
};
let [p1, p2] = self.graph.parents(current).map_err(|e| Some(e))?;
if self.seen.insert(p1) {
*(self.visit.peek_mut().unwrap()) = p1;
} else {
self.visit.pop();
}
if self.seen.insert(p2) {
self.visit.push(p2);
}
Some(Ok(current))
}
}
FFI options for Rust extensions
(FFI = Foreign Function Interface)
- direct FFI: a C Extension does the binding
- rust-cpython
- PyO3
- cffi
cffi
- universal (PyPy)
- more overhead
- standard case: object written in the low-level language has a
complete API made of simple types (int, bytes, str…)
- calling back Python is trickier
Practical conclusion:
- keep the option open
- postpone until Rust code is complete enough that it doesn't
need to exchange complex objects with the Python layer
rust-cpython
- lower layer, sys-python just exports CPython API
(PyDict_Set, etc.)
- upper layer cpython exposes as natural Rust constructs
- helper macros to define Python classes and modules directly
- has the compiler check GIL acquisition by expressing it as a marker
pseudo object.
PyO3
- Started as a fork of rust-cpython
- Still similar (marker)
- Arguably, more modern macro system (leverages procedural macros)
- Depends on Rust unstable (no-no for us)
FFI options: conclusions
- plan for several FFI options
- separate a pure Rust layer hg-core
- start with rust-cpython
FFI options: conclusions
Fastpath to C ?
Fastpath to C: capsules
https://docs.python.org/3.7/extending/extending.html#using-capsules
- PyCapsule are Python objects wrapping (function) pointers
- can be defined in a module and retrieved in another
- meant exactly to share C APIs between extensions!
~ $ python3
Python 3.7.2 (default, Jan 3 2019, 02:55:40)
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime
>>> datetime.datetime_CAPI
<capsule object "datetime.datetime_CAPI" at 0x7fdfb6713b10>
The dataflow challenge
handling big Python data structures making round trips between
Python -> Rust -> Python
Dataflow challenge: converting
- Easy to implement and reason about
- Can be very inefficient
- Example: hg-core's treatment amounts to add or remove one
element in a set of 10^6 commits…
- Rule of thumb: acceptable as an intermediate step, or for small data
Dataflow challenge: encapsulating
- No need to convert/copy huge objects
- Either reacquire the GIL (slow) or keep it (kills multithreading)
- Must define abstractions (traits) so that hg-core isn't polluted by CPython
specifics (GIL again).
Dataflow solution: Rust owns it all
Complex objects passing the interface between languages
are implemented in Rust.
- example follow-up: we'd have a RustCommitSet class, backed by a
Rust HashSet, and implementing the whole set API.
- ideally, the Python layer would only work with RustSet instances
- alternatively we'd convert just once and return a RustCommitSet
- don't care about the GIL, introduce proper locking in the Rust realm
- in the case of Mercurial, we have a custom smartset class anyway
Dataflow solution: not much to exchange
- hg-core has grown a lot
- its Python API involves in practice only tiny structures or isn't
called repeatedly
- high level questions only: "return the heads the remote repo has
that we haven't".