Boosting a Python app with Rust: the case of Mercurial

Incremental enhancing of a medium/large Python code base with Rust

Georges Racinet

Extensions: what and why ?

the best of two worlds

An extension is a Python module made of native code.

Traditionally, written in C, using the CPython API and taking the form of a shared object library, that CPython loads dynamically.

Mercurial (hg)

https://mercurial-scm.org

Why in Rust ?

Example: ancestors iteration

import heapq

def iter_ancestors(parentfn, initrevs):
  seen = {nullrev}
  seen.update(initrevs)
  visit = [-r for r in initrevs]
  heapq.heapify(visit)
  while visit:
      current = -visit[0]
      yield current
      p1, p2 = parentfn(current)
      if p1 not in seen:
          heapq.heapreplace(visit, -p1)
          seen.add(p1)
      else:
          heapq.heappop(visit)
      heapq.heappush(visit, -p2)
      seen.add(p2)

A performance glimpse

We are already squeezing the most out of Python.

The code actually starts like this:

def iter_ancestors(parentfn, initrevs):
  seen = {nullrev}
  see = seen.update
  heappush = heapq.heappush

  see(initrevs)
  visit = [-r for r in initrevs]
  heapq.heapify(visit)

(…)

Such method prefetching really make a difference in hg case!

Example: ancestors iteration (Rust)

pub struct AncestorsIterator<G: Graph> {
    graph: G,
    visit: BinaryHeap<Revision>,
    seen: HashSet<Revision>,
}

impl AncestorsIterator<G: Graph> {
    pub fn new(graph: G, initrevs: impl IntoIterator<Item = Revision> -> Self {
        let visit: BinaryHeap<Revision> = initrevs.collect();
        let seen = visit.iter().cloned().collect();
        seen.insert(NULL_REVISION);
        AncestorsIterator {visit: visit, seen: seen};
    }
}

Example: ancestors iteration (Rust)

impl<G: Graph> Iterator for AncestorsIterator<G> {
    type Item = Result<Revision, GraphError>;

    fn next(&mut self) -> Option<Self::Item> {
        let current = match self.visit.peek() {
            None => { return None; }
            Some(c) => *c,
        };
        let [p1, p2] = self.graph.parents(current).map_err(|e| Some(e))?;
        if self.seen.insert(p1) {
            *(self.visit.peek_mut().unwrap()) = p1;
        } else {
            self.visit.pop();
        }
        if self.seen.insert(p2) {
            self.visit.push(p2);
        }
        Some(Ok(current))
    }
 }

FFI options for Rust extensions

(FFI = Foreign Function Interface)

cffi

Practical conclusion:

rust-cpython

PyO3

FFI options: conclusions

FFI options: conclusions

Fastpath to C ?

Fastpath to C: capsules

https://docs.python.org/3.7/extending/extending.html#using-capsules

~ $ python3
Python 3.7.2 (default, Jan  3 2019, 02:55:40)
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime
>>> datetime.datetime_CAPI
<capsule object "datetime.datetime_CAPI" at 0x7fdfb6713b10>

The dataflow challenge

handling big Python data structures making round trips between

Python -> Rust -> Python

Dataflow challenge: converting

Dataflow challenge: encapsulating

Dataflow solution: Rust owns it all

Complex objects passing the interface between languages are implemented in Rust.

Dataflow solution: not much to exchange

SpaceForward
Right, Down, Page DownNext slide
Left, Up, Page UpPrevious slide
GGo to slide number
POpen presenter console
HToggle this help