Partially Synchronous Distributed Real-Time Systems
Partially Synchronous Distributed Real-Time Systems
Disciplines
Computer Sciences (100%)
Keywords
-
Distributed computing models,
Real-time scheduling,
Partially synchronous systems,
Worst case response time analysis,
Fault-tolerant distr. real-time systems
The proposed project "Partially Synchronous Distributed Real-Time Systems" (PSRTS) ist devoted to the development of a sound scientific basis for fault-tolerant distributed hard real-time systems with a high degree of concurrency and, hence, relaxed synchrony-by-design. By contrast, most existing distributed hard real-time systems adhere to the synchronous distributed computing paradigm, which is based on a fine-grained common notion of time that can be used to closely synchronize all distributed computations. For example, the popular Time- Triggered Protocol TTP is based on distributed processes that communicate over shared buses via time-slotted communication (TDMA). This approach, which obviously limits the concurrency in the system considerably, greatly simplifies the handling of both real-time and fault-tolerance requirements, but does not allow the application to fully exploit the computing power available in the system: Due to the inevitable overhead of synchronous computation and communication, it is essentially the slowest part of the system that determines the overall computing speed. Besides wasting resources, this also creates the danger of under-estimating the worst case delay bounds required for synchrony-by-design, and hence running the system outside its specification sometimes. Given the often severe power & space constraints and the criticality of aerospace/automotive applications, for example, it is tempting to increase the concurrency in the system in order to better utilize the scarce computing resources and to reduce the system`s vulnerability with respect to incorrectly estimated worst case bounds. Past research has explored several ways for achieving this: Essentially, the synchrony degree of a distributed real-time system can be relaxed in time, in space, and via suitable (partially synchronous) algorithms. We focus entirely on the latter approach here, which is particularly promising and challenging from a research perspective: Much research has been (and is being) conducted on partially synchronous distributed algorithms that can be started from. On the other hand, we are not aware of any research on how to conduct a worst case response time analysis in a system that it not synchronous by design. The proposed project PSRTS targets exactly this gap in the intersection between distributed algorithms and real- time systems: Its purpose is to revise/adapt/extend existing approaches in order to add a proper real-time systems perspective to the theory of distributed algorithms. The project consists of three cleanly defined sub-topics, namely, (1) provision of a distributed real-time computing model, which is compatible with classic distributed computing models but also allows to incorporate real-time scheduling and analysis, (2) the selection/development of a partially synchronous system model, which is synchronous enough to allow a solution of all important distributed computing problems (like consensus) but asynchronous enough to allow a significant degree of concurrency, and, (3) a reasonably simple but realistic worst case schedulability analysis of distributed algorithms running atop of (1) & (2), which allows to break the inevitable cyclic dependency of the timing performances of a distributed algorithm and the underlying distributed computing system.
The ultimate goal of PSRTS (Partially Synchronous Distributed Real-Time Systems) has been to add a proper real-time systems perspective to fault-tolerant distributed algorithms, by establishing a sound scientific basis for fault-tolerant distributed real-time systems with a high degree of concurrency and, hence, relaxed synchrony-by-design. By contrast, state-of-the-art distributed hard real-time systems adhere to the synchronous distributed computing paradigm, which is based on a fine-grained common notion of time that can be used to closely synchronize all distributed computations. For example, the popular Time-Triggered Protocol TTP is based on distributed processes that communicate over shared buses via time-slotted communication. Although strict synchrony-by-design greatly simplifies the handling of both real-time and fault-tolerance requirements, it does not allow the application to fully exploit the computing power available in the system: Due to the inevitable overhead of synchronous computation and communication, it is essentially the slowest part of the system that determines the overall computing speed. Besides sub-optimal utilization, i.e., wasting resources, this also creates the danger of running the system outside its specification sometimes. Given the often severe power & space constraints and the criticality of aerospace/automotive applications, for example, increasing the concurrency in the system in order to better utilize scarce computing resources and to reduce the system's vulnerability is important. PSRTS has been devoted to the development of the scientific basis for such systems, and provided the following major results:A comprehensive Real-Time Model, which is the first distributed computing model that reconciles fault-tolerant distributed algorithms with real-time analysis. Essentially, it replaces zero-time state transitions by non-zero non-preemptable time computing steps and thus allows queueing effects and scheduling issues to enter the picture.Several instances of partially synchronous system models, like the time-free Asynchronous Bounded Cycle model, which provide different relaxations of synchrony conditions, as well as related algorithms for fault-tolerant distributed agreement and proof techniques. Advanced (real-time) analysis techniques, primarily based on non-standard algebras, and their application to some network algorithms. Essentially, these methods allow to treat distributed systems as linear dynamic systems.Besides its primary research results, PSRTS also stimulated several follow-up projects, including transdisciplinary research in formal verification and digital integrated circuits.
- Technische Universität Wien - 100%
Research Output
- 455 Citations
- 47 Publications
-
2012
Title Agreement in Directed Dynamic Networks DOI 10.1007/978-3-642-31104-8_7 Type Book Chapter Author Biely M Publisher Springer Nature Pages 73-84 -
2012
Title Efficient Checking of Link-Reversal-Based Concurrent Systems DOI 10.1007/978-3-642-32940-1_34 Type Book Chapter Author Függer M Publisher Springer Nature Pages 486-499 -
2013
Title An infrastructure for accurate characterization of single-event transients in digital circuits DOI 10.1016/j.micpro.2013.04.011 Type Journal Article Author Veeravalli V Journal Microprocessors and Microsystems Pages 772-791 Link Publication -
2013
Title On the performance of a retransmission-based synchronizer DOI 10.1016/j.tcs.2012.04.035 Type Journal Article Author Nowak T Journal Theoretical Computer Science Pages 25-39 Link Publication -
2012
Title Agreement in Directed Dynamic Networks DOI 10.48550/arxiv.1204.0641 Type Preprint Author Biely M -
2012
Title Architecture and Design Analysis of a Digital Single-Event Transient/Upset Measurement Chip DOI 10.1109/dsd.2012.26 Type Conference Proceeding Abstract Author Veeravalli V Pages 8-17 -
2012
Title Radiation-Tolerant Combinational Gates - An Implementation Based Comparison DOI 10.1109/ddecs.2012.6219036 Type Conference Proceeding Abstract Author Veeravalli V Pages 115-120 -
2011
Title Easy impossibility proofs for k-set agreement in message passing systems DOI 10.1145/1993806.1993846 Type Conference Proceeding Abstract Author Biely M Pages 227-228 Link Publication -
2011
Title On the Performance of a Retransmission-Based Synchronizer DOI 10.1007/978-3-642-22212-2_21 Type Book Chapter Author Nowak T Publisher Springer Nature Pages 234-245 -
2011
Title Optimal regional consecutive leader election in mobile ad-hoc networks DOI 10.1145/1998476.1998485 Type Conference Proceeding Abstract Author Chung H Pages 52-61 -
2011
Title Reconciling fault-tolerant distributed computing and systems-on-chip DOI 10.1007/s00446-011-0151-7 Type Journal Article Author Függer M Journal Distributed Computing Pages 323-355 Link Publication -
2011
Title Full Reversal Routing as a Linear Dynamical System DOI 10.1007/978-3-642-22212-2_10 Type Book Chapter Author Charron-Bost B Publisher Springer Nature Pages 101-112 -
2011
Title The Asynchronous Bounded-Cycle model DOI 10.1016/j.tcs.2010.08.001 Type Journal Article Author Robinson P Journal Theoretical Computer Science Pages 5580-5601 Link Publication -
2011
Title Easy Impossibility Proofs for k-Set Agreement in Message Passing Systems DOI 10.1007/978-3-642-25873-2_21 Type Book Chapter Author Biely M Publisher Springer Nature Pages 299-312 -
2011
Title Solving k-set agreement with stable skeleton graphs. Type Conference Proceeding Abstract Author Biely M Conference T. Kikuno and T. Tsuchiya, editors: Proceedings 25th IEEE International Symposium on Parallel and Distributed Processing (IPDPS'11) Workshops -
2013
Title Reconciling fault-tolerant distributed algorithms and real-time computing DOI 10.1007/s00446-013-0204-1 Type Journal Article Author Moser H Journal Distributed Computing Pages 203-230 Link Publication -
2010
Title Real-time analysis of round-based distributed algorithms. Type Conference Proceeding Abstract Author Kössler A Conference Proceedings of the 1st International Real-Time Scheduling Open Problems Seminar (RTSOPS'10), in conjunction with 22nd Euromicro Conference on Real-Time Systems (ECRTS'10) -
2010
Title Brief Announcement: Regional Consecutive Leader Election in Mobile Ad-Hoc Networks DOI 10.1007/978-3-642-16988-5_8 Type Book Chapter Author Chung H Publisher Springer Nature Pages 89-91 -
2010
Title How to Speed-up Fault-Tolerant Clock Generation in VLSI Systems-on-Chip via Pipelining DOI 10.1109/edcc.2010.35 Type Conference Proceeding Abstract Author Függer M Pages 230-239 -
2010
Title Regional consecutive leader election in mobile ad-hoc networks DOI 10.1145/1860684.1860701 Type Conference Proceeding Abstract Author Chung H Pages 81-90 -
2022
Title Hollow Gradient-Structured Iron-Anchored Carbon Nanospheres for Enhanced Electromagnetic Wave Absorption DOI 10.1007/s40820-022-00963-w Type Journal Article Author Wu C Journal Nano-Micro Letters Pages 7 Link Publication -
2009
Title Weak Synchrony Models and Failure Detectors for Message Passing (k-)Set Agreement DOI 10.1007/978-3-642-10877-8_23 Type Book Chapter Author Biely M Publisher Springer Nature Pages 285-299 -
2009
Title Link Reversal: How to Play Better to Work Less DOI 10.1007/978-3-642-05434-1_10 Type Book Chapter Author Charron-Bost B Publisher Springer Nature Pages 88-101 -
2009
Title Weak synchrony models and failure detectors for message passing k-set agreement. Type Conference Proceeding Abstract Author Biely M Conference Proceedings of the 23rd International Symposium on Distributed Computing (DISC'09) -
2009
Title On the Stability and Robustness of Non-Synchronous Circuits with Timing Loops. Type Conference Proceeding Abstract Author Függer M Conference 3rd Workshop on Dependable and Secure Nanocomputing, Estoril, Portugal -
2008
Title The Asynchronous Bounded-Cycle Model DOI 10.1007/978-3-540-89335-6_20 Type Book Chapter Author Robinson P Publisher Springer Nature Pages 246-262 -
2008
Title Proceedings of the twenty-seventh ACM symposium on Principles of distributed computing, PODC '08 DOI 10.1145/1400751 Type Journal Article -
2008
Title The asynchronous bounded-cycle model DOI 10.1145/1400751.1400815 Type Conference Proceeding Abstract Author Robinson P Pages 423-423 -
2008
Title Chasing the Weakest System Model for Implementing O and Consensus DOI 10.1109/tdsc.2008.24 Type Journal Article Author Hutle M Journal IEEE Transactions on Dependable and Secure Computing Pages 269-281 -
2008
Title Optimal Deterministic Remote Clock Estimation in Real-Time Systems DOI 10.1007/978-3-540-92221-6_24 Type Book Chapter Author Moser H Publisher Springer Nature Pages 363-387 -
2015
Title Time Complexity of Link Reversal Routing DOI 10.1145/2644815 Type Journal Article Author Charron-Bost B Journal ACM Transactions on Algorithms (TALG) Pages 1-39 Link Publication -
2015
Title The effect of forgetting on the performance of a synchronizer DOI 10.1016/j.peva.2015.08.002 Type Journal Article Author Függer M Journal Performance Evaluation Pages 1-16 Link Publication -
2014
Title Architecture for Monitoring SET Propagation in 16-bit Sklansky Adder DOI 10.1109/isqed.2014.6783354 Type Conference Proceeding Abstract Author Veeravalli V Pages 412-419 -
2018
Title Gracefully degrading consensus and k-set agreement in directed dynamic networks DOI 10.1016/j.tcs.2018.02.019 Type Journal Article Author Biely M Journal Theoretical Computer Science Pages 41-77 Link Publication -
0
DOI 10.1145/1993806 Type Other -
2009
Title Towards a real-time distributed computing model DOI 10.1016/j.tcs.2008.10.012 Type Journal Article Author Moser H Journal Theoretical Computer Science Pages 629-659 -
2009
Title The Theta-Model: achieving synchrony without clocks DOI 10.1007/s00446-009-0080-x Type Journal Article Author Widder J Journal Distributed Computing Pages 29-47 -
2009
Title Brief announcement: How to speed-up fault-tolerant clock generation in VLSI systems-on-chip via pipelining. Type Conference Proceeding Abstract Author Dielacher A Conference Proceedings of the 28th ACM Symposium on Principles of Distributed Computing (PODC'09) -
2009
Title Optimal message-driven implementations of omega with mute processes DOI 10.1145/1462187.1462191 Type Journal Article Author Biely M Journal ACM Transactions on Autonomous and Adaptive Systems (TAAS) Pages 1-22 -
2009
Title Routing without ordering DOI 10.1145/1583991.1584034 Type Conference Proceeding Abstract Author Charron-Bost B Pages 145-153 -
2011
Title Synchronous consensus under hybrid process and link failures DOI 10.1016/j.tcs.2010.09.032 Type Journal Article Author Biely M Journal Theoretical Computer Science Pages 5602-5630 Link Publication -
2011
Title Brief announcement: Full reversal routing as a linear dynamical system. Type Conference Proceeding Abstract Author Charron-Bost B Conference Proceedings 23rd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'11) -
2011
Title Reconciling Fault-Tolerant Distributed Algorithms and Real-Time Computing DOI 10.1007/978-3-642-22212-2_5 Type Book Chapter Author Moser H Publisher Springer Nature Pages 42-53 -
2011
Title Easy Impossibility Proofs for k-Set Agreement in Message Passing Systems DOI 10.48550/arxiv.1103.3671 Type Preprint Author Biely M -
2011
Title Solving k-Set Agreement with Stable Skeleton Graphs DOI 10.48550/arxiv.1102.4423 Type Preprint Author Biely M -
2011
Title Solving k-Set Agreement with Stable Skeleton Graphs DOI 10.1109/ipdps.2011.301 Type Conference Proceeding Abstract Author Biely M Pages 1488-1495 Link Publication -
2013
Title The Effect of Forgetting on the Performance of a Synchronizer DOI 10.1007/978-3-642-45346-5_14 Type Book Chapter Author Függer M Publisher Springer Nature Pages 185-200