Sergei Stepanenko
Research Interests
I am interested in formal verification, programming language design, compilers, type systems, and functional programming. To date, I have contributed to the verification of an idealized model of hypervisors, the propositional semantics of generalized algebraic datatypes, and the denotational semantics of various types of continuation manipulations.
Projects
- Hypervisors
Details
High-level description
We developed a simplified model of the Hafnium hypervisor and provided it with three elements:- A novel definition of weakest precondition that allows reasoning about assembly with cooperative concurrency.
- Separation logic rules for Hafnium memory management instructions.
- A logical relation to express “robust security”. With this, we were able to verify some examples where trusted code is linked with adversary code. By providing limited knowledge about the adversary code, we could still prove that the trusted code invariants are preserved.
My contribution
I have contributed to the definition of the hypervisor model, the separation logic rules for it, and the proof of the fundamental lemma. - GADTs
Details
High-level description
We provided a propositional model for a version of System F&omega with built-in equalities between types, allowing the expression of GADTs used in programming languages that support them (e.g., Haskell, OCaml). For this calculus, we provided the first propositional model using a novel technique that combines normalization by evaluation with logical relations. With this model, we were able to prove several things:- Combining impredicative polymorphism with injectivity of polymorphic types leads to a new encoding of non-terminating computation. Consequently, the language supporting them is not strongly normalizing even without effects or recursion.
- Limited support of parametric reasoning about languages is possible even with syntactic constraints on types.
- We were able to provide a few semantically correct but syntactically ill-typed programs and prove their properties.
My contribution
I have contributed to the definition of the semantic model, the proof of soundness of the language, and its extension to include effects such as state. This demonstrates that GADTs are orthogonal to additional computational language features. - Guarded interaction trees
Details
High-level description
I’m currently working on continuation-dependent effects. We extended Guarded Interaction Trees to allow effects that can observe or modify continuations around them. Using this extension, we provided a denotational semantics for a language with call/cc and for a language with shift/reset. Using that, we aim to provide denotational semantics to feature-rich languages, like OCaml with effect handlers, Scheme or SML with call/cc.
My contribution
In this project, I’ve been involved in modification of the original definition of reification for GITs, to allow them to capture continuation, thus, enabling reasoning about control-flow modification effects. To show that our approach is viable, I’ve provided a sound and adequate direct interpretation of a language with call/cc. - Scheme embedded in Discord
Details
High-level description
An extendable Scheme in Discord. Allows to easily extend the standard library by specifying wrappers for Discord-specific commands. This is just a toy project of mine that I really like.
Publications
- VMSL: A Separation Logic for Mechanised Robust Safety of Virtual Machines Communicating above FF-A
PLDI 2023, Zongyuan Liu, Sergei Stepanenko, Jean Pichon-Pharabod, Amin Timany, Aslan Askarov, and Lars Birkedal
Details
Abstract
Thin hypervisors make it possible to isolate key security components like keychains, fingerprint readers, and digital wallets from the easily-compromised operating system. To work together, virtual machines running on top of the hypervisor can make hypercalls to the hypervisor to share pages between each other in a controlled way. However, the design of such hypercall ABIs remains a delicate balancing task between conflicting needs for expressivity, performance, and security. In particular, it raises the question of what makes the specification of a hypervisor, and of its hypercall ABIs, good enough for the virtual machines. In this paper, we validate the expressivity and security of the design of the hypercall ABIs of Arm’s FF-A. We formalise a substantial fragment of FF-A as a machine with a simplified ISA in which hypercalls are steps of the machine. We then develop VMSL, a novel separation logic, which we prove sound with respect to the machine execution model, and use it to reason modularly about virtual machines which communicate through the hypercall ABIs, demonstrating the hypercall ABIs’ expressivity. Moreover, we use the logic to prove robust safety of communicating virtual machines, that is, the guarantee that even if some of the virtual machines are compromised and execute unknown code, they cannot break the safety properties of other virtual machines running known code. This demonstrates the intended security guarantees of the hypercall ABIs. All the results in the paper have been formalised in Coq using the Iris framework. - The Essence of Generalized Algebraic Data Types
POPL 2024, Filip Sieczkowski, Sergei Stepanenko, Jonathan Sterling, and Lars Birkedal
Details
Abstract
This paper considers direct encodings of generalized algebraic data types (GADTs) in a minimal suitable lambda-calculus. To this end, we develop an extension of System Fω with recursive types and internalized type equalities with injective constant type constructors. We show how GADTs and associated pattern-matching constructs can be directly expressed in the calculus, thus showing that it may be treated as a highly idealized modern functional programming language. We prove that the internalized type equalities in conjunction with injectivity rules increase the expressive power of the calculus by establishing a non-macro-expressibility result in Fω, and prove the system type-sound via a syntactic argument. Finally, we build two relational models of our calculus: a simple, unary model that illustrates a novel, two-stage interpretation technique, necessary to account for the equational constraints; and a more sophisticated, binary model that relaxes the construction to allow, for the first time, formal reasoning about data-abstraction in a calculus equipped with GADTs.
Talks
- The Essence of Generalized Algebraic Data Types, INRIA, 2023, Announcement
- The Essence of Generalized Algebraic Data Types, POPL, 2024, Slides
Community Service
- Committee Member in Artifact Evaluation Committee within the Research Artifacts-track, PLDI, 2024
- External reviewer, IJCAR, 2024
Teaching
- Compilation, TA, Fall 2021
- Distributed Systems and Security, TA, Fall 2022
- Computability and Logic, TA, Spring 2023
- Compilation, TA, Fall 2023
- Computer Architecture, Networks and Operating Systems, TA, Spring 2024, GDB notes