Managed Volume Processing on the GPU
Disciplines
Computer Sciences (100%)
Keywords
Volume Rendering, Memory Management, GPU Programming, Scientific Visualization, Scheduling
Volumetric data is very common in medicine, geology, and engineering, but the high complexity of the data and algorithms has prevented widespread use of volume graphics. Recently, however, 3D image processing and visualization algorithms have been parallelized and ported to graphics processing units (GPUs). This proposal is concerned with new ways of designing volume graphics algorithms for the GPU that can cope with these huge problems interactively by making better use of GPU capacity. Unfortunately, only certain parts of common image or volume processing algorithms can be mapped to the standard GPU stream processing model. For most real-world problems, writing programs for this architecture is a tedious task. As a result, most algorithms use the available processing power only for small subtasks: the number crunching in inner loops. For example, direct volume rendering (DVR) methods send rays into a volumetric object, accumulate intensities, divide rays into sub-rays, scatter rays in materials, and/or extract certain features. All GPU implementations of DVR use one processing unit per pixel, regardless of whether the pixel requires very complex calculations or not. This strategy frequently leads to strong load imbalances. A particular problem of interactive applications such as volume graphics is that they are not traditional number-crunching tasks, which require only high computational throughput and have relaxed or no latency constraints. On the contrary, interactive applications must meet real-time deadlines to ensure interactive response. This is a classical real-time resource scheduling problem. Such deadlines can be met only by adaptive algorithms that rely on complex flow control and memory management decisions during parallel execution. Both are currently available only on the CPU, which allows access to privileged mode through the operating system.
On the GPU, components for high-level scheduling involving latency hiding and memory management are missing or inaccessible. The desired full utilization of the GPU is therefore very difficult to achieve for complex graphics algorithms with real-time demands. The main goal of this proposal is to build a toolset that allows the full power of the GPU to be harnessed for a general class of real-time volume graphics algorithms. We propose a managed volume processing system that incorporates the missing components. Its key modules are a task model, a workload scheduler with real-time capabilities, and a virtual memory management system executed in tandem on the GPU and CPU. We will rely on the most recent hardware developments and use OpenCL as the standardized interface to access them.
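The idea of scheduling work against a frame deadline can be illustrated with a minimal CPU-side sketch. It is not the proposed GPU scheduler: the names `WorkPackage` and `run_frame`, the abstract cost units, and the fixed per-frame budget are all assumptions made for illustration. Packages are executed in priority order until the next one would exceed the budget, and the rest are deferred to a later frame.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class WorkPackage:
    # Hypothetical task record; only `priority` takes part in ordering.
    priority: float                    # lower value = more important
    cost: int = field(compare=False)   # assumed cost estimate in abstract time units
    name: str = field(compare=False)


def run_frame(packages, budget):
    """Run packages in priority order until the next one would exceed the budget."""
    queue = list(packages)
    heapq.heapify(queue)               # min-heap keyed on priority
    executed, spent = [], 0
    while queue and spent + queue[0].cost <= budget:
        pkg = heapq.heappop(queue)
        spent += pkg.cost              # stand-in for actually executing the package
        executed.append(pkg.name)
    deferred = sorted(p.name for p in queue)  # left for a later frame
    return executed, deferred, spent


packages = [
    WorkPackage(2.0, 4, "background"),
    WorkPackage(0.5, 3, "focus region"),
    WorkPackage(1.0, 5, "silhouette"),
]
executed, deferred, spent = run_frame(packages, budget=9)
```

With a budget of 9 units, the two most important packages run and the lowest-priority one is deferred. A real scheduler would also have to estimate costs at run time and balance work across many processors, which this sketch leaves out.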
In recent years, we have witnessed a fundamental change in computing. Processors hit the so-called power wall, which prevents further increases in clock speed. Thus, arguably the only way to meet the ever-growing demand for more processing power is parallelism: the concurrent execution of multiple smaller tasks on a chip with many cores. The most widespread many-core chip today is the graphics processing unit (GPU). While a GPU offers tremendous processing power, until now this power could be harnessed only by algorithms that can be divided into thousands of coherent execution threads. In this project, we tackled the problem of unlocking GPU execution for a new class of algorithms. To this end, we developed algorithms that form the basis of an operating system on the GPU. This operating system allows the description and execution of algorithms with inhomogeneous, time-varying parallelism. In the background, our system collects thousands of work packages, combines them for efficient cooperative execution, and chooses the best-fitting processors for execution. Furthermore, we offer the possibility to prioritize those work packages freely and dynamically, which allows the concurrent execution of multiple algorithms. To assist highly parallel programs, we provide a memory allocator that can efficiently serve concurrent requests from tens of thousands of threads. With this research, we have provided the currently fastest algorithms for the GPU in the areas of queuing strategies, memory management, and prioritized scheduling. Additionally, we have advanced the state of the art in the processing of volume data, rendering, and geometric algorithms. Our model speeds up the simulation of global illumination by assigning the available processing power to those regions that will increase image quality most. We were able to significantly speed up multiple techniques for volume visualization by changing only the way the execution is carried out.
Finally, we have shown that our model can be used to generate and render complex geometric models in real time on the GPU, whereas previous methods would take hours to complete the same task. The results of this project will very likely influence the design and execution strategies for future parallel architectures.
- Technische Universität Graz - 100%
Research Output
- 512 Citations
- 24 Publications
- 2014. Title: Parallel generation of architecture on the GPU; DOI: 10.1111/cgf.12312; Type: Journal Article; Author: Steinberger M; Journal: Computer Graphics Forum; Pages: 73-82
- 2015. Title: Reyes rendering on the GPU; DOI: 10.1145/2788539.2788543; Type: Conference Proceeding Abstract; Author: Sattlecker M; Pages: 31-38
- 2014. Title: On-the-fly generation and rendering of infinite cities on the GPU; DOI: 10.1111/cgf.12315; Type: Journal Article; Author: Steinberger M; Journal: Computer Graphics Forum; Pages: 105-114
- 2014. Title: Parallel Irradiance Caching for Interactive Monte-Carlo Direct Volume Rendering; DOI: 10.1111/cgf.12362; Type: Journal Article; Author: Khlebnikov R; Journal: Computer Graphics Forum; Pages: 61-70
- 2014. Title: Whippletree; DOI: 10.1145/2661229.2661250; Type: Journal Article; Author: Steinberger M; Journal: ACM Transactions on Graphics (TOG); Pages: 1-11
- 2013. Title: Volume Rendering with advanced GPU scheduling strategies; Type: Conference Proceeding Abstract; Author: Schmalstieg D et al.; Conference: Proceedings of the IEEE Scientific Visualization Posters
- 2012. Title: Massively parallel dynamic memory allocation for the GPU; Type: Conference Proceeding Abstract; Author: Schmalstieg D et al.; Conference: Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units (GPGPU-6)
- 2012. Title: Stochastic Particle-Based Volume Rendering; Type: Conference Proceeding Abstract; Author: Kainz B et al.; Conference: Proceedings of Central European Seminar on Computer Graphics (CESCG)
- 2012. Title: Procedural Texture Synthesis for Zoom-Independent Visualization of Multivariate Data; DOI: 10.1111/j.1467-8659.2012.03127.x; Type: Journal Article; Author: Khlebnikov R; Journal: Computer Graphics Forum; Pages: 1355-1364
- 2012. Title: OmniKinect; DOI: 10.1145/2407336.2407342; Type: Conference Proceeding Abstract; Author: Kainz B; Pages: 25-32
- 2012. Title: ScatterAlloc: Massively Parallel Dynamic Memory Allocation for the GPU; DOI: 10.1109/inpar.2012.6339604; Type: Conference Proceeding Abstract; Author: Steinberger M; Pages: 1-10
- 2012. Title: Softshell; DOI: 10.1145/2366145.2366180; Type: Journal Article; Author: Steinberger M; Journal: ACM Transactions on Graphics (TOG); Pages: 1-11
- 2016. Title: Hierarchical Bucket Queuing for Fine-Grained Priority Scheduling on the GPU; DOI: 10.1111/cgf.13075; Type: Journal Article; Author: Kerbl B; Journal: Computer Graphics Forum; Pages: 232-246
- 2015. Title: Fast Volume Reconstruction from Motion Corrupted Stacks of 2D Slices; DOI: 10.1109/tmi.2015.2415453; Type: Journal Article; Author: Kainz B; Journal: IEEE Transactions on Medical Imaging; Pages: 1901-1913
- 2011. Title: Stylization-based ray prioritization for guaranteed frame rates; DOI: 10.1145/2024676.2024685; Type: Conference Proceeding Abstract; Author: Kainz B; Pages: 43-54
- 2012. Title: Priority-Based Task Management in a GPGPU Megakernel; Type: Conference Proceeding Abstract; Author: Kerbl B; Conference: Proceedings of Central European Seminar on Computer Graphics (CESCG)
- 2012. Title: Interactive Self-Organizing Windows; DOI: 10.1111/j.1467-8659.2012.03041.x; Type: Journal Article; Author: Steinberger M; Journal: Computer Graphics Forum; Pages: 621-630
- 2012. Title: OmniKinect: real-time dense volumetric data acquisition and applications; Type: Conference Proceeding Abstract; Author: Kainz B; Conference: VRST '12 Proceedings of the 18th ACM symposium on Virtual reality software and Technology
- 2012. Title: Volumetric Real-Time Particle-Based Representation of Large Unstructured Tetrahedral Polygon Meshes; DOI: 10.1007/978-3-642-33463-4_16; Type: Book Chapter; Author: Voglreiter P; Publisher: Springer Nature; Pages: 159-168
- 2012. Title: Ray prioritization using stylization and visual saliency; DOI: 10.1016/j.cag.2012.03.037; Type: Journal Article; Author: Steinberger M; Journal: Computers & Graphics; Pages: 673-684
- 2013. Title: Adaptive ghosted views for Augmented Reality; Type: Conference Proceeding Abstract; Author: Kalkofen D; Conference: IEEE International Symposium on Mixed and Augmented Reality (ISMAR)
- 2013. Title: Fast dynamic memory allocator for massively parallel architectures; DOI: 10.1145/2458523.2458535; Type: Conference Proceeding Abstract; Author: Widmer S; Pages: 120-126
- 2013. Title: Adaptive Ghosted Views for Augmented Reality; DOI: 10.1109/ismar.2013.6671758; Type: Conference Proceeding Abstract; Author: Denis K; Pages: 1-9
- 2013. Title: Noise-Based Volume Rendering for the Visualization of Multivariate Volumetric Data; DOI: 10.1109/tvcg.2013.180; Type: Journal Article; Author: Khlebnikov R; Journal: IEEE Transactions on Visualization and Computer Graphics; Pages: 2926-2935