PRAVDA - Parametrized Verification of Fault-tolerant Distributed Algorithms
PRAVDA - Parametrized Verification of Fault-tolerant Distributed Algorithms
Disciplines
Computer Sciences (100%)
Keywords
-
Model Checking,
Fault Tolerant,
Distrubuted Algorithms,
Parametrized Model Checking,
Automated Verification,
Byzantine faults
In the design of reliable computer systems, one has to address two challenges: on the one hand, we have to design means that tolerate partial failure that is outside the control of a system designer (e.g., power outages), and on the other hand, find design faults (bugs) in order to fix them. The former is addressed by means of replication and fault-tolerant distributed algorithms (FTDAs), while the latter is dealt with by rigorous system and software engineering methods, such as model checking. These two challenges have been studied by disjoint research communities. The distributed algorithm community has created a wealth of interesting FTDAs. While in the past these FTDAs were implemented only in safety-critical systems, we currently see more and more implementations of FTDAs for other application domains. The more implementations we will see, the more important it becomes to have effective methods to verify that these algorithms actually perform the task they were designed for. The classic approach to ascertain the correctness of FTDAs is paper-and-pencil proofs. Conducting such proofs is an error-prone task, because the reasoning required is actually quite different from arguments in mainstream computer science. As a result, even published distributed algorithms contain bugs. It is therefore questionable whether FTDAs that are designed to increase to reliability of computer systems actually reach this goal. Hence, we require additional means to increase the trust in the correctness of FTDAs. This project explores model checking for this purpose, because model checking promises a high degree and automation, and has been used successfully for software and hardware verification. In particular this project considers the issue of parametrization in FTDAs. FTDAs are classically parametrized by the number of processes n and the assumed maximal number of faults t, using resilience conditions that require, for instance, that a majority of the processes is correct, which is formalized using expressions such as n > 2t. This leads to the verification problem that a given FTDA should be verified for all combinations of n and t satisfying n > 2t, that is, an infinite family of system instances with fixed n and t has to be verified. The verification community has considered similar problems in the research area of parametrized model checking. Parametrized model checking is still considered a field that is full of open research questions. In particular, FTDAs have several features that introduce challenges that have not been considered in the literature so far, such as, the mentioned resilience conditions, process code that uses the parameters, faults, intricate timing assumptions, etc. Within the PRAVDA project, we will develop new parametrized model checking methods, that will allow us to ascertain the correctness of state-of-the-art FTDAs.
In the design of reliable computer systems, one has to address two challenges: on the one hand, we have to design means that tolerate partial failure that is outside the control of a system designer (e.g., power outages, or parts of a network being disconnected), and on the other hand, find design flaws (bugs) in order to fix them. The former is addressed by means of replication and fault-tolerant distributed algorithms (FTDAs), while the latter is dealt with by rigorous system and software engineering methods, such as model checking. While in the past these FTDAs were implemented only in safety-critical systems, we currently see more and more implementations of FTDAs for other application domains. The classic approach to ascertain the correctness of FTDAs is paper-and-pencil proofs. Conducting such proofs is an error-prone task, because the reasoning required is actually quite different from arguments in mainstream computer science. As a result, even published distributed algorithms contain bugs. It is therefore questionable whether FTDAs that are designed to increase the reliability of computer systems actually reach this goal. Hence, we require additional means to increase the trust in the correctness of FTDAs. The PRAVDA project developed new automated verification techniques for this purpose, e.g., new parameterized model checking or deductive verification methods. These methods promise a higher degree and automation, and are less error-prone than the classic approach. We made significant progress both towards automated verification of high-level protocols as well as towards deductive verification of code. While classic automated verification techniques focus on safety properties (nothing bad ever happens), fault-tolerant distributed algorithms motivated us to specifically also look into liveness properties (eventually something good happens). Another focus was to better understand the relationship between asynchronous semantics (that capture how a system behaves in reality) and synchronous semantics (that are more abstract and simpler to reason in). We adapted classic notions of reductions to the specifics of FTDAs, which allowed us to make several breakthroughs for their verification. In parallel, and out of scope of the PRAVDA proposal, proof-of-stake blockchain projects, such as Tendermint or Facebook's Libra, implemented Byzantine fault-tolerant distributed algorithms for the large scale, that is, for hundreds of participating processes. For such large systems, parameterized verification methods provide, in principle, a solid basis to ascertain the correctness of the systems. Indeed, given the amount of money that is managed in such blockchain systems, the automated verification of fault-tolerant distributed systems suddenly finds itself in the mainstream of computer science. Although the gap between verification of high-level protocols and implementations is still considerable (and warrants more research in this area), the research of PRAVDA builds a solid basis.
- Technische Universität Wien - 100%
- Bernadette Charron-Bost, Ecole Normale Superieure Paris - France
- Stephan Merz, INRIA Nancy - France
Research Output
- 365 Citations
- 18 Publications
- 3 Software
-
2021
Title Verifying safety of synchronous fault-tolerant algorithms by bounded model checking DOI 10.1007/s10009-021-00637-9 Type Journal Article Author Stoilkovska I Journal International Journal on Software Tools for Technology Transfer Pages 33-48 -
2018
Title ByMC: Byzantine Model Checker DOI 10.1007/978-3-030-03424-5_22 Type Book Chapter Author Konnov I Publisher Springer Nature Pages 327-342 -
2017
Title On the completeness of bounded model checking for threshold-based distributed algorithms: Reachability DOI 10.1016/j.ic.2016.03.006 Type Journal Article Author Konnov I Journal Information and Computation Pages 95-109 Link Publication -
2019
Title Verification of Randomized Consensus Algorithms Under Round-Rigid Adversaries DOI 10.4230/lipics.concur.2019.33 Type Conference Proceeding Abstract Author Bertrand N Conference LIPIcs, Volume 140, CONCUR 2019 Pages 33:1 - 33:15 Link Publication -
2019
Title Communication-Closed Asynchronous Protocols DOI 10.1007/978-3-030-25543-5_20 Type Book Chapter Author Damian A Publisher Springer Nature Pages 344-363 -
2019
Title TLA+ model checking made symbolic DOI 10.1145/3360549 Type Journal Article Author Konnov I Journal Proceedings of the ACM on Programming Languages Pages 1-30 Link Publication -
2018
Title Synthesis of Distributed Algorithms with Parameterized Threshold Guards DOI 10.4230/lipics.opodis.2017.32 Type Conference Proceeding Abstract Author Konnov I Conference LIPIcs, Volume 95, OPODIS 2017 Pages 32:1 - 32:20 Link Publication -
2018
Title Reachability in Parameterized Systems: All Flavors of Threshold Automata DOI 10.4230/lipics.concur.2018.19 Type Conference Proceeding Abstract Author Konnov I Conference LIPIcs, Volume 118, CONCUR 2018 Pages 19:1 - 19:17 Link Publication -
2016
Title What You Always Wanted to Know About Model Checking of Fault-Tolerant Distributed Algorithms DOI 10.1007/978-3-319-41579-6_2 Type Book Chapter Author Konnov I Publisher Springer Nature Pages 6-21 -
2016
Title Decidability in Parameterized Verification DOI 10.1145/2951860.2951873 Type Journal Article Author Bloem R Journal ACM SIGACT News Pages 53-64 Link Publication -
2015
Title SMT and POR Beat Counter Abstraction: Parameterized Model Checking of Threshold-Based Distributed Algorithms DOI 10.1007/978-3-319-21690-4_6 Type Book Chapter Author Konnov I Publisher Springer Nature Pages 85-102 -
2017
Title A short counterexample property for safety and liveness verification of fault-tolerant distributed algorithms DOI 10.1145/3009837.3009860 Type Conference Proceeding Abstract Author Konnov I Pages 719-734 Link Publication -
2015
Title Decidability of Parameterized Verification DOI 10.2200/s00658ed1v01y201508dct013 Type Journal Article Author Bloem R Journal Synthesis Lectures on Distributed Computing Theory Pages 1-170 Link Publication -
2017
Title A short counterexample property for safety and liveness verification of fault-tolerant distributed algorithms DOI 10.1145/3093333.3009860 Type Journal Article Author Konnov I Journal ACM SIGPLAN Notices Pages 719-734 Link Publication -
2017
Title Parameterized Model Checking of Synchronous Distributed Algorithms by Abstraction DOI 10.1007/978-3-319-73721-8_1 Type Book Chapter Author Aminof B Publisher Springer Nature Pages 1-24 -
2017
Title Para2: parameterized path reduction, acceleration, and SMT for reachability in threshold-guarded distributed algorithms DOI 10.1007/s10703-017-0297-4 Type Journal Article Author Konnov I Journal Formal Methods in System Design Pages 270-307 Link Publication -
2017
Title Accuracy of Message Counting Abstraction in Fault-Tolerant Distributed Algorithms DOI 10.1007/978-3-319-52234-0_19 Type Book Chapter Author Konnov I Publisher Springer Nature Pages 347-366 -
2019
Title Verifying Safety of Synchronous Fault-Tolerant Algorithms by Bounded Model Checking DOI 10.1007/978-3-030-17465-1_20 Type Book Chapter Author Stoilkovska I Publisher Springer Nature Pages 357-374
-
2019
Link
Title Artifact and instructions to generate experimental results for TACAS 2019 paper: Verifying Safety of Synchronous Fault-Tolerant Algorithms by Bounded Model Checking DOI 10.6084/m9.figshare.7824929 Link Link -
2019
Link
Title Artifact and instructions to generate experimental results for TACAS 2019 paper: Verifying Safety of Synchronous Fault-Tolerant Algorithms by Bounded Model Checking DOI 10.6084/m9.figshare.7824929.v1 Link Link -
2017
Link
Title ByMC: Byzantine Model Checker Link Link