Project detail

Grant DOI 10.55776/I3007
Funding program Principal Investigator Projects International
Status ended
Start March 1, 2017
End February 29, 2020
Funding amount € 138,146

DACH: Austria - Germany - Switzerland

Disciplines

Computer Sciences (100%)

Keywords

Graphics Processing Unit (GPU), Rendering Pipeline, Parallel Computing

Abstract

A modern computer system has a graphics processing unit (GPU) with enormous computational power. However, the overall hardware architecture of the GPU has not changed for almost 15 years. A fixed-function pipeline with a static sequence of stages supports a certain type of graphics application and is primarily designed to deliver computer games. With GPU programming languages such as CUDA, we can use the GPU's power for computing simulations, but we cannot easily build new graphics pipelines, because certain parts of the GPU are not accessible from CUDA. In this project, we propose to overcome this restriction with a new software framework that supplies the missing parts of a graphics pipeline from within a GPU programming language. This is challenging, because the framework has to be very efficient, or it will not be able to compete with a standard graphics pipeline. However, we can make use of the high flexibility afforded by a software implementation to produce a competitive framework. More important than pure efficiency is that, with this software framework for graphics pipelines, we can overcome all the restrictions of the conventional pipeline. For example, we can replace a rectangular dense framebuffer with a much more efficient irregular pixel structure. We can also replace uniform sampling patterns for pixels with non-uniform ones, feeding into new virtual reality devices such as the Oculus Rift.

Final report

Real-time rendering is important not only in the entertainment industry, but for all types of visual applications, such as medical data visualization, simulations, education, and architecture. The real-time rendering pipeline is typically tied to efficient hardware, i.e., the graphics processing unit (GPU). However, while tight coupling to a hardware architecture yields high performance and power efficiency, flexibility is sacrificed. Thus, it is not surprising that the real-time rendering pipeline has hardly changed since the introduction of programmable shading. Although programmability added flexibility within certain stages of the pipeline, the pipeline itself remains rigid. This inflexible design restricts research and development of new rendering architectures. Any novel rendering approach that does not follow the predefined pipeline is doomed to fail, as it can never compete in performance with approaches that fit the hardware pipeline. In this project, we showed that rendering pipelines can run entirely in software. To reach this result, efficient dynamic scheduling of the pipeline stages is essential. Scheduling must balance multiple trade-offs: generating parallelism versus keeping data compact, allowing execution on arbitrary processing cores versus keeping data local for faster access, and streamlining the processing versus supporting dynamic decision making. Considering these trade-offs and deriving general solutions to multiple scheduling problems, we derived a complete streaming rendering pipeline whose performance is reasonably close to that of the hardware pipeline, while offering unprecedented flexibility. With this approach, we showed that novel rendering algorithms can easily be derived and that small changes to the pipeline can have a lasting impact on rendering quality and rendering speed.

Our pipeline can be used not only to alter the rendering pipeline, but also to experiment with alternative hardware designs and completely new rendering pipelines. For example, we presented a novel rendering pipeline for vector graphics, which are typically used for font rendering and on websites. Our hierarchical rasterizer for vector graphics rendering not only outperforms the state-of-the-art hardware-supported rendering methods, but also achieves significantly better quality. Another example of the advantages of our approach is its application to virtual reality rendering on head-mounted displays. For this example, we could significantly reduce the perceived latency and thus tackle one of the major issues with head-mounted displays: motion sickness. Finally, we applied the scheduling solutions devised for rendering pipelines to other domains, including sparse linear algebra operations (typically found in material simulation), dynamic graph processing (for example, of large social networks), and mesh processing (the creation and manipulation of three-dimensional models). In all domains, we outperformed the previous state of the art, including handcrafted solutions from both research and industry. These surprising results show that advanced, adaptive scheduling strategies have the potential to transform multiple domains relying on efficient computation on manycore processors.
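
As a rough, CPU-only illustration of the first scheduling trade-off named in the final report (generating parallelism versus keeping data compact), the following Python sketch runs items through a chain of stages using per-stage queues. It is not the project's implementation: the function `run_pipeline`, the `batch_size` parameter, the toy stages, and the "latest stage with a full batch first" policy are all assumptions made for this example.

```python
from collections import deque


def run_pipeline(stages, inputs, batch_size=4):
    """Run items through `stages` with a simple dynamic scheduler.

    stages: list of functions; each takes one item and returns a list of
            zero or more items for the next stage, so work may expand.
    Returns (outputs, schedule), where schedule records one
    (stage index, batch length) pair per executed batch.
    """
    queues = [deque() for _ in stages]
    queues[0].extend(inputs)
    outputs, schedule = [], []
    while any(queues):
        # Stages that can fill a whole batch (enough parallel work).
        full = [i for i, q in enumerate(queues) if len(q) >= batch_size]
        if full:
            # Prefer the latest such stage: this drains intermediate
            # data toward the output and keeps the queues short.
            stage = max(full)
        else:
            # No full batch anywhere: run the earliest non-empty stage
            # so that its results can help fill batches downstream.
            stage = min(i for i, q in enumerate(queues) if q)
        queue = queues[stage]
        batch = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
        schedule.append((stage, len(batch)))
        for item in batch:
            produced = stages[stage](item)
            if stage + 1 < len(stages):
                queues[stage + 1].extend(produced)
            else:
                outputs.extend(produced)
    return outputs, schedule


# Toy stand-ins for pipeline stages (hypothetical, not real rendering code).
toy_stages = [
    lambda v: [v * 2],                     # "vertex" stage: one in, one out
    lambda p: [(p, k) for k in range(3)],  # "rasterizer": 1 primitive -> 3 fragments
    lambda f: [f[0] + f[1]],               # "fragment" stage: one in, one out
]

outputs, schedule = run_pipeline(toy_stages, range(1, 6))
```

Inspecting `schedule` shows which stage ran at each step. In this sketch, a smaller `batch_size` keeps the intermediate queues shorter at the cost of less work per batch, while a larger one does the opposite.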

Research institution(s)

Technische Universität Graz - 100%

Project participants

Markus Steinberger, Technische Universität Graz, national collaboration partner

International project participants

Matthias Nießner, TU München - Germany
Jan Kautz, NVIDIA - USA

Research Output

299 Citations
19 Publications
3 Scientific Awards

Publications

Title	On-the-fly Vertex Reuse for Massively-Parallel Software Geometry Processing
DOI	10.48550/arxiv.1805.08893
Type	Preprint
Author	Kenzel M

Title	Are dynamic memory managers on GPUs slow?
DOI	10.1145/3437801.3441612
Type	Conference Proceeding Abstract
Author	Winter M
Pages	219-233

Title	Autonomous, Independent Management of Dynamic Graphs on GPUs
DOI	10.1109/hpec.2017.8091058
Type	Conference Proceeding Abstract
Author	Winter M
Pages	1-7

Title	Subdivision-Specialized Linear Algebra Kernels for Static and Dynamic Mesh Connectivity on the GPU
Type	Journal Article
Author	Mlakar D
Journal	Computer Graphics Forum

Title	Adaptive sparse matrix-matrix multiplication on the GPU
DOI	10.1145/3293883.3295701
Type	Conference Proceeding Abstract
Author	Winter M
Pages	68-81

Title	The camera offset space
DOI	10.1145/3355089.3356530
Type	Journal Article
Author	Hladky J
Journal	ACM Transactions on Graphics (TOG)
Pages	1-14

Title	A high-performance software graphics pipeline architecture for the GPU
DOI	10.1145/3197517.3201374
Type	Journal Article
Author	Kenzel M
Journal	ACM Transactions on Graphics (TOG)
Pages	1-15

Title	The Broker Queue
DOI	10.1145/3205289.3205291
Type	Conference Proceeding Abstract
Author	Kerbl B
Pages	76-85

Title	Revisiting The Vertex Cache
DOI	10.1145/3233302
Type	Journal Article
Author	Kerbl B
Journal	Proceedings of the ACM on Computer Graphics and Interactive Techniques
Pages	1-16

Title	On-the-fly Vertex Reuse for Massively-Parallel Software Geometry Processing
DOI	10.1145/3233303
Type	Journal Article
Author	Kenzel M
Journal	Proceedings of the ACM on Computer Graphics and Interactive Techniques
Pages	1-17

Title	Human upper-body inverse kinematics for increased embodiment in consumer-grade virtual reality
DOI	10.1145/3281505.3281529
Type	Conference Proceeding Abstract
Author	Parger M
Pages	1-10

Title	Ouroboros
DOI	10.1145/3392717.3392742
Type	Conference Proceeding Abstract
Author	Winter M
Pages	1-12

Title	Hierarchical Rasterization of Curved Primitives for Vector Graphics Rendering on the GPU
DOI	10.1111/cgf.13622
Type	Journal Article
Author	Dokter M
Journal	Computer Graphics Forum
Pages	93-103

Title	Stochastic Substitute Trees for Real-Time Global Illumination
DOI	10.1145/3384382.3384521
Type	Conference Proceeding Abstract
Author	Tatzgern W
Pages	1-9

Title	Breadth-First Search on Dynamic Graphs using Dynamic Parallelism on the GPU
DOI	10.1109/hpec.2019.8916476
Type	Conference Proceeding Abstract
Author	Tödling D
Pages	1-7

Title	faimGraph: High Performance Management of Fully-Dynamic Graphs Under Tight Memory Constraints on the GPU
DOI	10.1109/sc.2018.00063
Type	Conference Proceeding Abstract
Author	Winter M
Pages	1-13

Title	Effective static bin patterns for sort-middle rendering
DOI	10.1145/3105762.3105777
Type	Conference Proceeding Abstract
Author	Kerbl B
Pages	1-10

Title	spECK
DOI	10.1145/3332466.3374521
Type	Conference Proceeding Abstract
Author	Parger M
Pages	362-375

Title	Dynamic scheduling for efficient hierarchical sparse matrix operations on the GPU
DOI	10.1145/3079079.3079085
Type	Conference Proceeding Abstract
Author	Derler A
Pages	1-10

Scientific Awards

Title	Eurographics 2020 Best Paper Award
Type	Research prize
Level of Recognition	Continental/International

Title	Computer Graphics Forum Associate Editor
Type	Appointed as the editor/advisor to a journal or book series
Level of Recognition	Continental/International

Title	Best Student Paper Award High Performance Extreme Computing
Type	Research prize
Level of Recognition	Continental/International


Fully Programmable GPU Pipelines
