First International Workshop on Rigorous Systems Engineering

April 10, 2011, Salzburg, Austria

Affiliated with EUROSYS'11 and supported by RiSE

News

March 25: program published.

Invited Talks

Multi-Path Analysis for Real-World Software Systems

George Candea, EPFL, Switzerland

Abstract

Reasoning about multiple paths in complex, large software systems, which interact richly with their environment, is difficult - this talk will describe steps toward making such analysis practical. I will describe S2E, a new platform for analyzing the properties and behavior of software systems.

S2E’s novelty lies in its ability to scale to large real systems, such as a full Windows stack. S2E is based on two new ideas: selective symbolic execution, a way to automatically minimize the amount of code that has to be executed symbolically given a target analysis, and relaxed execution consistency models, a way to make principled performance/precision trade-offs in complex analyses. These techniques give S2E three key abilities: to simultaneously analyze entire families of execution paths, instead of just one execution at a time; to perform the analyses in-vivo within a real software stack - user programs, libraries, kernel, drivers, etc. - instead of using abstract models of these layers; and to operate directly on binaries, thus being able to analyze even proprietary software. Conceptually, S2E is an automated path explorer with modular path analyzers: the explorer drives the target system down all execution paths of interest, while analyzers check properties of each such path (e.g., to look for bugs) or simply collect information (e.g., count page faults). Desired paths can be specified in multiple ways, and one can either combine existing analyzers to build a custom analysis tool, or write new analyzers using the S2E API.
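The explorer/analyzer split can be pictured with a toy sketch. Everything here (the target program, the analyzer classes, the `explore` driver) is our own invention for illustration, not the S2E API: an explorer enumerates every path of a tiny two-branch program, and pluggable analyzers either check a property per path or just collect statistics.

```python
# Toy illustration of "path explorer + modular path analyzers".
# Not S2E: all names and the API shape are hypothetical.
from itertools import product

def target_program(a: bool, b: bool) -> str:
    # Tiny "system under analysis" with four feasible paths.
    trace = []
    if a:
        trace.append("open")
    if b:
        trace.append("write")
    else:
        trace.append("read")
    return "/".join(trace)

class CountingAnalyzer:
    """Collects information per path (cf. counting page faults)."""
    def __init__(self):
        self.paths = 0
    def on_path(self, inputs, trace):
        self.paths += 1

class BugAnalyzer:
    """Checks a property on each path (cf. bug finding)."""
    def __init__(self):
        self.bugs = []
    def on_path(self, inputs, trace):
        if "write" in trace and "open" not in trace:
            self.bugs.append(inputs)  # report: write without open

def explore(program, analyzers):
    # Drive the target down all paths of interest; here a "path"
    # is simply determined by the two branch inputs.
    for a, b in product([False, True], repeat=2):
        trace = program(a, b)
        for an in analyzers:
            an.on_path((a, b), trace)

counter, bugs = CountingAnalyzer(), BugAnalyzer()
explore(target_program, [counter, bugs])
print(counter.paths)   # 4 paths explored
print(bugs.bugs)       # [(False, True)] — "write" without "open"
```

The point of the design is that the explorer is written once, while each analysis is a small independent plugin observing every completed path.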

So far, S2E has been used for comprehensive performance profiling, reverse engineering of proprietary software, and bug finding for both kernel-mode and user-mode binaries. Using S2E to build these tools is easy: each took less than 770 LOC and 40 person-hours.

Bio

George Candea heads the Dependable Systems Lab at EPFL in Switzerland, where he focuses on techniques, tools, and runtimes that improve the dependability of software systems while increasing programmer productivity. Until 2009, George was also Chief Scientist of Aster Data, a Silicon Valley-based large-scale data analytics company he co-founded in 2005; Aster Data received the 2011 "Technology Pioneer" award from the World Economic Forum.

George is a recipient of the MIT TR35 "Top 35 Young Technology Innovators" award for 2005. In 2001, George was part of the founding team of the Stanford/Berkeley Recovery-Oriented Computing (ROC) project. Previously, he worked at Oracle, IBM Research, and Microsoft Research. George received his PhD in computer science from Stanford University in 2005 and his BS (1997) and MEng (1998) in computer science from the Massachusetts Institute of Technology.


Model Checking Unbounded Threads

Daniel Kröning, Oxford University, United Kingdom

Abstract

The trend towards multi-core computing has increased the importance of concurrent software dramatically. The talk will begin with a brief introduction to the pitfalls in this area. We then show how to use Model Checking techniques to obtain formal guarantees about programs of this kind. Specifically, we present how to apply counter abstraction on real-world concurrent programs to factor out redundancy due to thread replication. The traditional global state representation as a vector of local states is replaced by a vector of thread counters, one per local state.

In practice, straightforward implementations of this idea are unfavorably sensitive to the number of local states. We present a novel symbolic exploration algorithm that avoids this problem by carefully scheduling which counters to track at any moment during the search. Our experiments are carried out on Boolean programs, an abstraction promoted by the SLAM project. To our knowledge, this marks the first application of counter abstraction to programs with non-trivial local state spaces, and results in the first scalable Model Checker for concurrent Boolean programs.
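The state-space saving behind counter abstraction can be shown with a small sketch (our own toy, not the talk's algorithm): for n replicated threads each in one of k local states, the concrete global state is an n-tuple of local states, while the abstraction keeps only one counter per local state, discarding thread identity.

```python
# Toy counter abstraction: map concrete global states (tuples of
# per-thread local states) to counter vectors, one per local state.
# N_THREADS and N_LOCAL are assumed toy parameters.
from itertools import product
from collections import Counter

N_THREADS, N_LOCAL = 4, 3

def abstract(global_state):
    c = Counter(global_state)
    return tuple(c[s] for s in range(N_LOCAL))

concrete = set(product(range(N_LOCAL), repeat=N_THREADS))
counters = {abstract(g) for g in concrete}

print(len(concrete))   # 3^4 = 81 concrete global states
print(len(counters))   # C(4+3-1, 3-1) = 15 counter vectors
```

The gap widens quickly with thread count (exponential versus polynomial in n), which is exactly the redundancy due to thread replication that the abstraction factors out; the talk's sensitivity concern is the other axis, the number of local states k.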

Bio

Daniel Kröning received the M.E. and doctoral degrees in computer science from the University of Saarland, Saarbrücken, Germany, in 1999 and 2001, respectively. He joined the Model Checking group in the Computer Science Department at Carnegie Mellon University, Pittsburgh PA, USA, in 2001 as a Post-Doc.

He was an assistant professor at the Swiss Federal Institute of Technology (ETH) in Zürich, Switzerland, from 2004 to 2007. He is now a Reader at the Computing Laboratory at Oxford University. His research interests include automated formal verification of hardware and software systems, decision procedures, embedded systems, and hardware/software co-design.

Overview

This workshop provides a platform for the exchange of ideas between systems research and formal methods. Recent years have seen a tremendous increase in the reasoning power of formal verification techniques. Today formal methods are routinely applied during the design of processors. Device drivers are model checked. Microkernels and hypervisors have been fully formally verified.

With this workshop we want to stimulate further research in rigorous system design in this direction. We particularly welcome contributions that "cross the border": contributions that apply established, adapted, or new formal techniques to computer systems, or that use precise formal modelling within systems research. Systems papers that demonstrate the need for formal reasoning are equally interesting.

In the CFP we solicited original and short papers as well, but we received only presentation-only submissions. Nevertheless, we think we have been able to put together an inspiring and exciting program.

Program

08:50 - 09:00 opening

09:00 - 10:00
George Candea
Multi-Path Analysis for Real-World Software Systems

10:00 - 10:30 coffee break

10:30 - 11:00
Andreas Haas, Christoph Kirsch, Ana Sokolova and Stephanie Stroka
Towards Automating Heap Management through Short-term Memory

We are interested in dynamic heap management that provides the performance and predictability of explicit memory management with the ease of use of implicit memory management. To gain new insights, we designed a new heap management model which may provide an interesting target for program analysis and for the generation of explicit memory management code.

In our new heap management model, called short-term memory, objects on the heap expire after a finite amount of time, which makes deallocation unnecessary. Expiration of longer-needed objects may be extended by refreshing.

The generation of refresh calls seems more feasible than the generation of free calls for several reasons:

  • In short-term memory it is valid to refresh an object multiple times, which makes it possible to conservatively add refresh calls for all objects that are potentially needed, even where no refreshing is necessary.
  • Only needed objects require refreshing. Needed objects are always reachable, independent of the code location. This may allow more flexible positioning of memory management code for code generation.

Our current implementation [SBG10c] uses logical clocks controlled by explicit tick calls. The positioning of tick calls influences memory consumption, since with a faster time advance objects may expire earlier. In our first experiments we added tick and refresh calls manually. We ported applications written in C, Java, and Go to use short-term memory with a single clock and obtained competitive performance in time and space. As a next step we will place refresh and tick calls automatically, and we will employ multiple clocks to allow better code generation and program analysis, and to further improve runtime behavior.
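The tick/refresh mechanics above can be sketched in a few lines. This is a minimal model of the idea, with names of our own choosing rather than the authors' API: objects expire a fixed number of ticks after their last refresh, tick() advances the logical clock and reclaims expired objects, and refreshing an already-refreshed object is always valid.

```python
# Minimal sketch of short-term memory (hypothetical API, single clock):
# objects expire `lifetime` ticks after allocation or last refresh.
class ShortTermHeap:
    def __init__(self, lifetime):
        self.lifetime = lifetime   # ticks an object survives unrefreshed
        self.clock = 0
        self.expiry = {}           # object id -> expiration time

    def alloc(self, obj):
        self.expiry[id(obj)] = self.clock + self.lifetime
        return obj

    def refresh(self, obj):
        # Refreshing multiple times is always valid; it only extends life.
        self.expiry[id(obj)] = self.clock + self.lifetime

    def tick(self):
        self.clock += 1
        # "Deallocation" is just dropping expired entries.
        self.expiry = {k: t for k, t in self.expiry.items() if t > self.clock}

    def live(self):
        return len(self.expiry)

heap = ShortTermHeap(lifetime=2)
a = heap.alloc([1, 2, 3])
b = heap.alloc("short-lived")
heap.tick()
heap.refresh(a)     # a is still needed: extend its expiration
heap.tick()         # b expires here (allocated at t=0, lifetime 2)
print(heap.live())  # 1 — only the refreshed object survives
```

The sketch also shows why tick placement matters: moving the second tick() earlier or later directly changes when the unrefreshed object is reclaimed, i.e., the program's memory consumption.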

11:00 - 11:30
Vasu Singh
Relaxed Access Models for Transactional Memory

Transactional memory (TM) has shown potential to simplify the task of writing concurrent programs. Recent studies have shown that the conflict detection mechanisms for TM do not allow the performance of TM to scale to that of fine-grained locking in concurrent data structures. This has led researchers to relax the conflict detection for transactions, using mechanisms like early release and elastic transactions.

We formalize the notion of relaxed transactional access models and their desired properties. We introduce a new transactional access model, the peek-quake model, which adds two new commands: peek (a lighter form of read) and its dual, quake (a heavier form of write), for finer conflict detection in STM. In our augmented model, a normal read conflicts with every write, while a peek conflicts only with quakes. We compare the peek-quake access model to existing access models in our formalism. We use the peek-quake access model to implement concurrent data structures such as linked-list-based sets and red-black trees, and we extend the TL2 implementation with support for the peek and quake commands. We evaluate the performance of the peek-quake based TL2 on the linked-list-based set and red-black tree (microbenchmarks) and on the STAMP benchmarks. Compared to TL2, we observe performance gains of up to 2x for the microbenchmarks and 1.2x for the genome benchmark.
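The conflict relation as we read it from the abstract can be tabulated directly: a normal read conflicts with every write (and quake), while a peek conflicts only with quakes. The sketch below is an illustration of that relation, not TL2 code, and the transaction encoding is our own.

```python
# Peek-quake conflict relation as stated in the abstract (symmetric).
# Pairs absent from the table (read/read, peek/peek, peek/read) do not conflict.
CONFLICTS = {
    ("read",  "write"): True,  ("read",  "quake"): True,
    ("peek",  "write"): False, ("peek",  "quake"): True,
    ("write", "write"): True,  ("write", "quake"): True,
    ("quake", "quake"): True,
}

def conflicts(op1, op2):
    return CONFLICTS.get((op1, op2), CONFLICTS.get((op2, op1), False))

def transactions_conflict(acc1, acc2):
    # acc: {location: operation}; transactions conflict iff some shared
    # location is accessed with conflicting operations.
    return any(conflicts(op, acc2[loc])
               for loc, op in acc1.items() if loc in acc2)

t1 = {"x": "peek", "y": "read"}
t2 = {"x": "write", "z": "write"}
print(transactions_conflict(t1, t2))  # False — a peek tolerates a write
t3 = {"x": "quake"}
print(transactions_conflict(t1, t3))  # True — a peek conflicts with a quake
```

This is what "finer conflict detection" buys: the first pair of transactions can commit concurrently under peek-quake, whereas replacing the peek with a normal read would force a conflict on x.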

11:30 - 12:00
Stephan Falke, Florian Merz and Carsten Sinz
The Low-Level Software Bounded Model Checker LLBMC

We present LLBMC, a tool for bounded model checking of C programs. LLBMC uses the LLVM compiler framework to translate C programs into LLVM's intermediate representation. The resulting code is then converted into our own logical intermediate representation and simplified using rewrite rules. The simplified formula is finally passed to an SMT solver. In contrast to existing tools, LLBMC uses a bit-precise memory model. It can thus handle, e.g., memory-based re-interpret casts, which frequently occur in system-level programs.
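To see why a bit-precise memory model matters, consider the C idiom `float f = 1.0f; int i = *(int *)&f;`, which re-reads the same four bytes under a different type. The sketch below simulates this with Python's struct module as a stand-in: a byte-accurate memory model observes the re-interpretation, whereas a model that tracks one typed value per variable cannot.

```python
# Simulating a memory-based re-interpret cast over byte-accurate memory.
# (Illustration of the concept; not LLBMC's internal representation.)
import struct

memory = bytearray(struct.pack("<f", 1.0))  # store the float 1.0f (4 bytes)
i = struct.unpack("<i", bytes(memory))[0]   # re-read the same bytes as an int
print(hex(i))                               # 0x3f800000 — IEEE 754 bits of 1.0f
```

A verifier whose memory is an array of bytes (or bits) proves facts about `i` like the one above for free; one that abstracts memory into typed cells must either reject such code or lose precision.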

12:00 - 13:30 lunch break

13:30 - 14:30
Daniel Kröning
Model Checking Unbounded Threads

14:30 - 15:00
Hannes Payer, Ana Sokolova, Harald Roeck and Christoph Kirsch
Scalability versus Semantics of Concurrent Objects

The semantics of a concurrent object is determined by a correctness condition (e.g. linearizability) and a sequential specification (e.g. FIFO). An implementation of a concurrent object may incur high synchronization overhead for realizing the semantics and thus limit scalability. A relaxed version of a given correctness condition and sequential specification may allow us to gain scalability at the expense of deviating from the original semantics (e.g. linearizable FIFO). We study the trade-off between scalability and semantical deviation based on concurrent FIFO queues and present concurrent object components that allow us to control that trade-off and provide scalability even under high contention scenarios [PODC 2011 Brief Announcement].
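One common shape of such a trade-off can be sketched concretely. The construction below is our generic illustration of relaxing FIFO for scalability, not the paper's components: elements are spread over k segments, and a dequeue may serve any non-empty segment, so contention on any single segment drops while the observed order can deviate from strict FIFO by a bounded amount.

```python
# Sketch of a k-segmented, order-relaxed FIFO queue (hypothetical design).
from collections import deque

class RelaxedQueue:
    def __init__(self, k):
        self.segments = [deque() for _ in range(k)]
        self.enq = 0
        self.deq = 0

    def enqueue(self, x):
        # Spread enqueues round-robin over segments to reduce contention.
        self.segments[self.enq % len(self.segments)].append(x)
        self.enq += 1

    def dequeue(self):
        # A concurrent implementation would pick a segment per thread or
        # randomly; round-robin keeps this sequential sketch deterministic.
        for _ in range(len(self.segments)):
            seg = self.segments[self.deq % len(self.segments)]
            self.deq += 1
            if seg:
                return seg.popleft()
        raise IndexError("empty")

q = RelaxedQueue(k=2)
for x in range(4):
    q.enqueue(x)
out = [q.dequeue() for _ in range(4)]
print(out)  # FIFO order happens to hold here; it is not guaranteed in general
```

Measuring how far `out` may deviate from enqueue order, as a function of k and the contention pattern, is exactly the kind of semantical-deviation-versus-scalability question the talk studies.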

15:00 - 15:30 coffee break

15:30 - 16:00
Roderick Bloem
When is Declarative Programming Useful?

Computer science has seen a long push from imperative to declarative programming, which has failed miserably. The idea that the programmer should state *what* should be computed instead of *how* has led to languages like Prolog, Lisp, and Caml, which have never become mainstream. We argue that computations are much more easily programmed imperatively, but that the same does not hold for concurrency. In concurrency, statements of correctness may be as simple as "no race conditions" or "the functionality of the program is the same whether it runs on one or two processors." Thus, we need a programming paradigm that combines the strengths of imperative languages where data is concerned with those of declarative programming for concurrency. We argue that this line of research deserves more attention and look at some initial ideas to make it real.

16:00 - 16:30
Paul Marinescu and Cristian Cadar
Zero-Effort Test Improvement via Symbolic Execution

Software testing is an expensive and time-consuming process, often involving the manual creation of comprehensive test suites. However, current testing methodologies do not take full advantage of these tests. We introduce a technique for augmenting existing test cases with no additional human effort, via a lightweight symbolic execution mechanism. Our prototype implementation found a previously unknown bug in the cut program from GNU Coreutils, which is missed both by the regression suite and by extensive symbolic execution alone.

16:30 - 17:00
Andreas Holzer and Helmut Veith
Seamless Testing for Models and Code

We describe an approach to model-based testing where a test suite is generated from a model and automatically concretized to drive an implementation. Motivated by an industrial project involving DO-178B compliant avionics software, where the models are UML activity diagrams and the implementation is ANSI C, we developed a seamless testing environment based on our test specification language FQL. We demonstrate how to apply FQL to activity diagrams in such a way that FQL test specifications easily translate from UML to C code. Our approach does not require any additional glue or auxiliary code but is fully automatic except for straightforward source code annotations that link source and model. In this way, we can check for modeled but unimplemented behavior and vice versa, and we can also evaluate the degree of abstraction between model and implementation.

17:00 - 17:10 closing