Fully Programmable GPU Pipelines
DACH: Austria - Germany - Switzerland
Disciplines
Computer Sciences (100%)
Keywords
- Graphics Processing Unit (GPU), Rendering Pipeline, Parallel Computing
A modern computer system has a graphics processing unit (GPU) with enormous computational power. However, the overall hardware architecture of the GPU has not changed for almost 15 years. A fixed-function pipeline with a static sequence of stages supports a certain type of graphics application, primarily computer games. With new GPU programming languages such as CUDA, we can use the power of the GPU for computing simulations, but we cannot easily build new graphics pipelines, because certain parts of the GPU are not accessible from CUDA. In this project, we propose to overcome this restriction with a new software framework that supplies the missing parts of a graphics pipeline on top of a GPU programming language. This is challenging, because the framework has to be very efficient, or it will not be able to compete with a standard graphics pipeline. However, we can exploit the high flexibility afforded by a software implementation to produce a competitive framework. More important than pure efficiency is that this software framework for graphics pipelines lets us overcome the restrictions of the conventional pipeline. For example, we can replace a rectangular dense framebuffer with a much more efficient irregular pixel structure. We can also replace uniform sampling patterns for pixels with non-uniform ones, feeding into new virtual reality devices such as the Oculus Rift.
Real-time rendering is important not only in the entertainment industry, but for all types of visual applications, such as medical data visualization, simulation, education, and architecture. The real-time rendering pipeline is typically tied to efficient hardware, i.e., the graphics processing unit (GPU). However, while tight coupling to a hardware architecture yields high performance and power efficiency, flexibility is sacrificed. Thus, it is not surprising that real-time rendering pipelines have hardly changed since the introduction of programmable shading. Although programmability added flexibility within certain stages of the pipeline, the pipeline itself remains rigid. This inflexible design restricts research and development of new rendering architectures. Any novel rendering approach that does not follow the predefined pipeline is doomed to fail, as it can never compete in performance with approaches that fit the hardware pipeline. In this project, we showed that rendering pipelines can run entirely in software. To reach this result, efficient dynamic scheduling of the pipeline stages is essential. Scheduling must balance multiple trade-offs: generating parallelism versus keeping data compact, allowing execution on arbitrary processing cores versus keeping data local for faster access, and streamlining the processing versus supporting dynamic decision making. By considering these trade-offs and deriving general solutions to multiple scheduling problems, we built a complete streaming rendering pipeline whose performance is reasonably close to that of the hardware pipeline while offering unprecedented flexibility. With this approach, we showed that novel rendering algorithms can easily be derived and that small changes to the pipeline can have a lasting impact on rendering quality and rendering speed.
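The trade-off between generating parallelism and keeping data compact can be illustrated with a small host-side sketch. This is not the project's scheduler; it is a minimal sequential Python model under stated assumptions: each stage has its own work queue, the hypothetical function `run_pipeline` prefers the most downstream stage that can fill a whole batch, and otherwise falls back to the most upstream non-empty stage to generate more work. The stage functions `transform`, `rasterize`, and `shade` and the parameter `batch_size` are illustrative names only.

```python
from collections import deque


def run_pipeline(stages, inputs, batch_size=4):
    """Drain a chain of stages with one work queue per stage.

    stages: list of functions; each maps one item to a list of items
            for the next stage (possibly empty, possibly several).
    Returns (outputs, peak_queued), where peak_queued is the largest
    number of items buffered between stages at any point in time.
    """
    queues = [deque() for _ in stages]
    queues[0].extend(inputs)
    outputs = []
    peak_queued = 0

    while any(queues):
        # Prefer the most downstream stage that can fill a whole batch:
        # this drains intermediate data early and keeps buffers compact.
        ready = [i for i, q in enumerate(queues) if len(q) >= batch_size]
        if ready:
            stage = max(ready)
        else:
            # No full batch anywhere: run the most upstream non-empty
            # stage with a partial batch to generate work for later stages.
            stage = min(i for i, q in enumerate(queues) if q)

        q = queues[stage]
        batch = [q.popleft() for _ in range(min(batch_size, len(q)))]
        for item in batch:
            produced = stages[stage](item)
            if stage + 1 < len(stages):
                queues[stage + 1].extend(produced)
            else:
                outputs.extend(produced)

        # Track how much intermediate data is buffered between stages.
        peak_queued = max(peak_queued, sum(len(b) for b in queues[1:]))

    return outputs, peak_queued


# Illustrative stand-ins for pipeline stages (assumptions, not real shaders).
def transform(v):
    return [v * 2]                      # one item in, one item out


def rasterize(v):
    return [(v, k) for k in range(3)]   # amplification: one in, three out


def shade(frag):
    return [frag[0] + frag[1]]          # one item in, one item out


outputs, peak_queued = run_pipeline([transform, rasterize, shade], range(5))
```

In this model, a larger `batch_size` stands for wider parallel execution per scheduling step, at the cost of more intermediate data waiting in the queues; `peak_queued` makes that cost visible. A GPU implementation would additionally have to address locality and concurrent queue access, which this sketch leaves out.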
Our pipeline can be used not only to alter the rendering pipeline, but also to experiment with alternative hardware designs and completely new rendering pipelines. For example, we presented a novel rendering pipeline for vector graphics, which are typically used for font rendering and on websites. Our hierarchical rasterizer for vector graphics rendering not only outperforms the state-of-the-art hardware-supported rendering methods, but also achieves significantly better quality. Another example of the advantages of our approach is its application to virtual reality rendering on head-mounted displays. Here, we could significantly reduce the perceived latency and thus tackle one of the major issues with head-mounted displays: motion sickness. Finally, we applied the scheduling solutions devised for rendering pipelines to other domains, including sparse linear algebra operations (typically found in material simulation), dynamic graph processing (for example, of large social networks), and mesh processing (the creation and manipulation of three-dimensional models). In all domains, we outperformed the previous state of the art, including handcrafted solutions from both research and industry. These surprising results show that advanced, adaptive scheduling strategies have the potential to transform multiple domains that rely on efficient computation on manycore processors.
- Technische Universität Graz - 100%
- Markus Steinberger, Technische Universität Graz, national collaboration partner
- Matthias Nießner, TU München - Germany
- Jan Kautz, NVIDIA - USA
Research Output
- 299 Citations
- 19 Publications
- 3 Scientific Awards
- 2020: Subdivision-Specialized Linear Algebra Kernels for Static and Dynamic Mesh Connectivity on the GPU. Author: Mlakar D. Type: Journal Article. Journal: Computer Graphics Forum.
- 2019: Adaptive sparse matrix-matrix multiplication on the GPU. Author: Winter M. Type: Conference Proceeding Abstract. Pages: 68-81. DOI: 10.1145/3293883.3295701
- 2019: Breadth-First Search on Dynamic Graphs using Dynamic Parallelism on the GPU. Author: Tödling D. Type: Conference Proceeding Abstract. Pages: 1-7. DOI: 10.1109/hpec.2019.8916476
- 2018: Human upper-body inverse kinematics for increased embodiment in consumer-grade virtual reality. Author: Parger M. Type: Conference Proceeding Abstract. Pages: 1-10. DOI: 10.1145/3281505.3281529
- 2021: Are dynamic memory managers on GPUs slow? Author: Winter M. Type: Conference Proceeding Abstract. Pages: 219-233. DOI: 10.1145/3437801.3441612
- 2020: Stochastic Substitute Trees for Real-Time Global Illumination. Author: Tatzgern W. Type: Conference Proceeding Abstract. Pages: 1-9. DOI: 10.1145/3384382.3384521
- 2020: Ouroboros. Author: Winter M. Type: Conference Proceeding Abstract. Pages: 1-12. DOI: 10.1145/3392717.3392742
- 2019: Hierarchical Rasterization of Curved Primitives for Vector Graphics Rendering on the GPU. Author: Dokter M. Type: Journal Article. Journal: Computer Graphics Forum. Pages: 93-103. DOI: 10.1111/cgf.13622
- 2020: spECK. Author: Parger M. Type: Conference Proceeding Abstract. Pages: 362-375. DOI: 10.1145/3332466.3374521
- 2017: Autonomous, Independent Management of Dynamic Graphs on GPUs. Author: Winter M. Type: Conference Proceeding Abstract. Pages: 1-7. DOI: 10.1109/hpec.2017.8091058
- 2017: Dynamic scheduling for efficient hierarchical sparse matrix operations on the GPU. Author: Derler A. Type: Conference Proceeding Abstract. Pages: 1-10. DOI: 10.1145/3079079.3079085
- 2018: faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU. Author: Winter M. Type: Conference Proceeding Abstract. Pages: 1-13. DOI: 10.1109/sc.2018.00063
- 2017: Effective static bin patterns for sort-middle rendering. Author: Kerbl B. Type: Conference Proceeding Abstract. Pages: 1-10. DOI: 10.1145/3105762.3105777
- 2019: The camera offset space. Author: Hladky J. Type: Journal Article. Journal: ACM Transactions on Graphics (TOG). Pages: 1-14. DOI: 10.1145/3355089.3356530
- 2018: On-the-fly Vertex Reuse for Massively-Parallel Software Geometry Processing. Author: Kenzel M. Type: Journal Article. Journal: Proceedings of the ACM on Computer Graphics and Interactive Techniques. Pages: 1-17. DOI: 10.1145/3233303
- 2018: Revisiting The Vertex Cache. Author: Kerbl B. Type: Journal Article. Journal: Proceedings of the ACM on Computer Graphics and Interactive Techniques. Pages: 1-16. DOI: 10.1145/3233302
- 2018: A high-performance software graphics pipeline architecture for the GPU. Author: Kenzel M. Type: Journal Article. Journal: ACM Transactions on Graphics (TOG). Pages: 1-15. DOI: 10.1145/3197517.3201374
- 2018: The Broker Queue. Author: Kerbl B. Type: Conference Proceeding Abstract. Pages: 76-85. DOI: 10.1145/3205289.3205291
- 2018: On-the-fly Vertex Reuse for Massively-Parallel Software Geometry Processing. Author: Kenzel M. Type: Preprint. DOI: 10.48550/arxiv.1805.08893
- 2020: Eurographics 2020 Best Paper Award. Type: Research prize. Level of Recognition: Continental/International
- 2019: Computer Graphics Forum Associate Editor. Type: Appointed as the editor/advisor to a journal or book series. Level of Recognition: Continental/International
- 2017: Best Student Paper Award High Performance Extreme Computing. Type: Research prize. Level of Recognition: Continental/International