GPU Gems 2: Programming Techniques for High-Performance Graphics and General-Purpose Computation, Safari

By Matt Pharr, Randima Fernando

Published by Addison-Wesley Professional

Published Date: Dec 27, 2007

Description

“GPU Gems 2 isn’t meant to simply adorn your bookshelf—it’s required reading for anyone trying to keep pace with the rapid evolution of programmable graphics. If you’re serious about graphics, this book will take you to the edge of what the GPU can do.”

—Remi Arnaud, Graphics Architect at Sony Computer Entertainment
“The topics covered in GPU Gems 2 are critical to the next generation of game engines.”

—Gary McTaggart, Software Engineer at Valve, Creators of Half-Life and Counter-Strike

This sequel to the best-selling first volume of GPU Gems details the latest programming techniques for today’s graphics processing units (GPUs). As GPUs find their way into mobile phones, handheld gaming devices, and consoles, GPU expertise is even more critical in today’s competitive environment. Real-time graphics programmers will discover the latest algorithms for creating advanced visual effects, strategies for managing complex scenes, and advanced image processing techniques. Readers will also learn new methods for using the substantial processing power of the GPU in other computationally intensive applications, such as scientific computing and finance. Twenty of the book’s forty-eight chapters are devoted to GPGPU programming, from basic concepts to advanced techniques. Written by experts in cutting-edge GPU programming, this book offers readers practical means to harness the enormous capabilities of GPUs.

Major topics covered include:

  • Geometric Complexity
  • Shading, Lighting, and Shadows
  • High-Quality Rendering
  • General-Purpose Computation on GPUs: A Primer
  • Image-Oriented Computing
  • Simulation and Numerical Algorithms

Contributors are from the following corporations and universities:

1C: Maddox Games
2015
Apple Computer
Armstrong Atlantic State University
Climax Entertainment
Crytek
discreet
ETH Zurich
GRAVIR/IMAG—INRIA
GSC Game World
Lionhead Studios
Lund University
Massachusetts Institute of Technology
mental images
Microsoft Research
NVIDIA Corporation
Piranha Bytes
Siemens Corporate Research
Siemens Medical Solutions
Simutronics Corporation
Sony Pictures Imageworks
Stanford University
Stony Brook University
Technische Universität München
University of California, Davis
University of North Carolina at Chapel Hill
University of Potsdam
University of Tokyo
University of Toronto
University of Utah
University of Virginia
University of Waterloo
Vienna University of Technology
VRVis Research Center

Section editors include NVIDIA engineers Kevin Bjorke, Cem Cebenoyan, Simon Green, Mark Harris, Craig Kolb, and Matthias Wloka.

The accompanying CD-ROM includes complementary examples and sample programs.



Table of Contents

Foreword xxix

Preface xxxi

Contributors xxxv

PART I: GEOMETRIC COMPLEXITY 1

Chapter 1: Toward Photorealism in Virtual Botany 7
David Whatley, Simutronics Corporation

1.1 Scene Management 7

1.2 The Grass Layer 11

1.3 The Ground Clutter Layer 17

1.4 The Tree and Shrub Layers 18

1.5 Shadowing 20

1.6 Post-Processing 22

1.7 Conclusion 24

1.8 References 24

Chapter 2: Terrain Rendering Using GPU-Based Geometry Clipmaps 27
Arul Asirvatham, Microsoft Research
Hugues Hoppe, Microsoft Research

2.1 Review of Geometry Clipmaps 27

2.2 Overview of GPU Implementation 30

2.3 Rendering 32

2.4 Update 39

2.5 Results and Discussion 43

2.6 Summary and Improvements 43

2.7 References 44

Chapter 3: Inside Geometry Instancing 47
Francesco Carucci, Lionhead Studios

3.1 Why Geometry Instancing? 48

3.2 Definitions 49

3.3 Implementation 53

3.4 Conclusion 65

3.5 References 67

Chapter 4: Segment Buffering 69
Jon Olick, 2015

4.1 The Problem Space 69

4.2 The Solution 70

4.3 The Method 71

4.4 Improving the Technique 72

4.5 Conclusion 72

4.6 References 73

Chapter 5: Optimizing Resource Management with Multistreaming 75
Oliver Hoeller, Piranha Bytes
Kurt Pelzer, Piranha Bytes

5.1 Overview 76

5.2 Implementation 77

5.3 Conclusion 89

5.4 References 90

Chapter 6: Hardware Occlusion Queries Made Useful 91
Michael Wimmer, Vienna University of Technology
Jiří Bittner, Vienna University of Technology

6.1 Introduction 91

6.2 For Which Scenes Are Occlusion Queries Effective? 92

6.3 What Is Occlusion Culling? 93

6.4 Hierarchical Stop-and-Wait Method 94

6.5 Coherent Hierarchical Culling 97

6.6 Optimizations 105

6.7 Conclusion 106

6.8 References 108

Chapter 7: Adaptive Tessellation of Subdivision Surfaces with Displacement Mapping 109
Michael Bunnell, NVIDIA Corporation

7.1 Subdivision Surfaces 109

7.2 Displacement Mapping 119

7.3 Conclusion 122

7.4 References 122

Chapter 8: Per-Pixel Displacement Mapping with Distance Functions 123
William Donnelly, University of Waterloo

8.1 Introduction 123

8.2 Previous Work 125

8.3 The Distance-Mapping Algorithm 126

8.4 Computing the Distance Map 130

8.5 The Shaders 130

8.6 Results 132

8.7 Conclusion 134

8.8 References 135

PART II: SHADING, LIGHTING, AND SHADOWS 137

Chapter 9: Deferred Shading in S.T.A.L.K.E.R. 143
Oles Shishkovtsov, GSC Game World

9.1 Introduction 143

9.2 The Myths 145

9.3 Optimizations 147

9.4 Improving Quality 154

9.5 Antialiasing 158

9.6 Things We Tried but Did Not Include in the Final Code 162

9.7 Conclusion 164

9.8 References 165

Chapter 10: Real-Time Computation of Dynamic Irradiance Environment Maps 167
Gary King, NVIDIA Corporation

10.1 Irradiance Environment Maps 167

10.2 Spherical Harmonic Convolution 170

10.3 Mapping to the GPU 172

10.4 Further Work 175

10.5 Conclusion 176

10.6 References 176

Chapter 11: Approximate Bidirectional Texture Functions 177
Jan Kautz, Massachusetts Institute of Technology

11.1 Introduction 177

11.2 Acquisition 179

11.3 Rendering 181

11.4 Results 184

11.5 Conclusion 187

11.6 References 187

Chapter 12: Tile-Based Texture Mapping 189
Li-Yi Wei, NVIDIA Corporation

12.1 Our Approach 191

12.2 Texture Tile Construction 191

12.3 Texture Tile Packing 192

12.4 Texture Tile Mapping 195

12.5 Mipmap Issues 197

12.6 Conclusion 198

12.7 References 199

Chapter 13: Implementing the mental images Phenomena Renderer on the GPU 201
Martin-Karl Lefrançois, mental images

13.1 Introduction 201

13.2 Shaders and Phenomena 202

13.3 Implementing Phenomena Using Cg 205

13.4 Conclusion 221

13.5 References 222

Chapter 14: Dynamic Ambient Occlusion and Indirect Lighting 223
Michael Bunnell, NVIDIA Corporation

14.1 Surface Elements 223

14.2 Ambient Occlusion 225

14.3 Indirect Lighting and Area Lights 231

14.4 Conclusion 232

14.5 References 233

Chapter 15: Blueprint Rendering and “Sketchy Drawings” 235
Marc Nienhaus, University of Potsdam, Hasso-Plattner-Institute
Jürgen Döllner, University of Potsdam, Hasso-Plattner-Institute

15.1 Basic Principles 236

15.2 Blueprint Rendering 238

15.3 Sketchy Rendering 244

15.4 Conclusion 251

15.5 References 252

Chapter 16: Accurate Atmospheric Scattering 253
Sean O’Neil

16.1 Introduction 253

16.2 Solving the Scattering Equations 254

16.3 Making It Real-Time 258

16.4 Squeezing It into a Shader 260

16.5 Implementing the Scattering Shaders 262

16.6 Adding High-Dynamic-Range Rendering 265

16.7 Conclusion 266

16.8 References 267

Chapter 17: Efficient Soft-Edged Shadows Using Pixel Shader Branching 269
Yury Uralsky, NVIDIA Corporation

17.1 Current Shadowing Techniques 270

17.2 Soft Shadows with a Single Shadow Map 271

17.3 Conclusion 281

17.4 References 282

Chapter 18: Using Vertex Texture Displacement for Realistic Water Rendering 283
Yuri Kryachko, 1C: Maddox Games

18.1 Water Models 283

18.2 Implementation 284

18.3 Conclusion 294

18.4 References 294

Chapter 19: Generic Refraction Simulation 295
Tiago Sousa, Crytek

19.1 Basic Technique 296

19.2 Refraction Mask 297

19.3 Examples 300

19.4 Conclusion 305

19.5 References 305

PART III: HIGH-QUALITY RENDERING 307

Chapter 20: Fast Third-Order Texture Filtering 313
Christian Sigg, ETH Zurich
Markus Hadwiger, VRVis Research Center

20.1 Higher-Order Filtering 314

20.2 Fast Recursive Cubic Convolution 315

20.3 Mipmapping 320

20.4 Derivative Reconstruction 324

20.5 Conclusion 327

20.6 References 328

Chapter 21: High-Quality Antialiased Rasterization 331
Dan Wexler, NVIDIA Corporation
Eric Enderton, NVIDIA Corporation

21.1 Overview 331

21.2 Downsampling 334

21.3 Padding 336

21.4 Filter Details 337

21.5 Two-Pass Separable Filtering 338

21.6 Tiling and Accumulation 339

21.7 The Code 339

21.8 Conclusion 344

21.9 References 344

Chapter 22: Fast Prefiltered Lines 345
Eric Chan, Massachusetts Institute of Technology
Frédo Durand, Massachusetts Institute of Technology

22.1 Why Sharp Lines Look Bad 345

22.2 Bandlimiting the Signal 347

22.3 The Preprocess 349

22.4 Runtime 351

22.5 Implementation Issues 355

22.6 Examples 356

22.7 Conclusion 358

22.8 References 359

Chapter 23: Hair Animation and Rendering in the Nalu Demo 361
Hubert Nguyen, NVIDIA Corporation
William Donnelly, NVIDIA Corporation

23.1 Hair Geometry 362

23.2 Dynamics and Collisions 366

23.3 Hair Shading 369

23.4 Conclusion and Future Work 378

23.5 References 380

Chapter 24: Using Lookup Tables to Accelerate Color Transformations 381
Jeremy Selan, Sony Pictures Imageworks

24.1 Lookup Table Basics 381

24.2 Implementation 386

24.3 Conclusion 392

24.4 References 392

Chapter 25: GPU Image Processing in Apple’s Motion 393
Pete Warden, Apple Computer

25.1 Design 393

25.2 Implementation 397

25.3 Debugging 406

25.4 Conclusion 407

25.5 References 408

Chapter 26: Implementing Improved Perlin Noise 409
Simon Green, NVIDIA Corporation

26.1 Random but Smooth 409

26.2 Storage vs. Computation 410

26.3 Implementation Details 411

26.4 Conclusion 415

26.5 References 416

Chapter 27: Advanced High-Quality Filtering 417
Justin Novosad, discreet

27.1 Implementing Filters on GPUs 417

27.2 The Problem of Digital Image Resampling 422

27.3 Shock Filtering: A Method for Deblurring Images 430

27.4 Filter Implementation Tips 433

27.5 Advanced Applications 433

27.6 Conclusion 434

27.7 References 435

Chapter 28: Mipmap-Level Measurement 437
Iain Cantlay, Climax Entertainment

28.1 Which Mipmap Level Is Visible? 438

28.2 GPU to the Rescue 439

28.3 Sample Results 447

28.4 Conclusion 448

28.5 References 449

PART IV: GENERAL-PURPOSE COMPUTATION ON GPUS: A PRIMER 451

Chapter 29: Streaming Architectures and Technology Trends 457
John Owens, University of California, Davis

29.1 Technology Trends 457

29.2 Keys to High-Performance Computing 461

29.3 Stream Computation 464

29.4 The Future and Challenges 468

29.5 References 470

Chapter 30: The GeForce 6 Series GPU Architecture 471
Emmett Kilgariff, NVIDIA Corporation
Randima Fernando, NVIDIA Corporation

30.1 How the GPU Fits into the Overall Computer System 471

30.2 Overall System Architecture 473

30.3 GPU Features 481

30.4 Performance 488

30.5 Achieving Optimal Performance 490

30.6 Conclusion 491

Chapter 31: Mapping Computational Concepts to GPUs 493
Mark Harris, NVIDIA Corporation

31.1 The Importance of Data Parallelism 493

31.2 An Inventory of GPU Computational Resources 497

31.3 CPU-GPU Analogies 500

31.4 From Analogies to Implementation 503

31.5 A Simple Example 505

31.6 Conclusion 508

31.7 References 508

Chapter 32: Taking the Plunge into GPU Computing 509
Ian Buck, Stanford University

32.1 Choosing a Fast Algorithm 509

32.2 Understanding Floating Point 513

32.3 Implementing Scatter 515

32.4 Conclusion 518

32.5 References 519

Chapter 33: Implementing Efficient Parallel Data Structures on GPUs 521
Aaron Lefohn, University of California, Davis
Joe Kniss, University of Utah
John Owens, University of California, Davis

33.1 Programming with Streams 521

33.2 The GPU Memory Model 524

33.3 GPU-Based Data Structures 528

33.4 Performance Considerations 540

33.5 Conclusion 543

33.6 References 544

Chapter 34: GPU Flow-Control Idioms 547
Mark Harris, NVIDIA Corporation
Ian Buck, Stanford University

34.1 Flow-Control Challenges 547

34.2 Basic Flow-Control Strategies 549

34.3 Data-Dependent Looping with Occlusion Queries 554

34.4 Conclusion 555

Chapter 35: GPU Program Optimization 557
Cliff Woolley, University of Virginia

35.1 Data-Parallel Computing 557

35.2 Computational Frequency 561

35.3 Profiling and Load Balancing 568

35.4 Conclusion 570

35.5 References 570

Chapter 36: Stream Reduction Operations for GPGPU Applications 573
Daniel Horn, Stanford University

36.1 Filtering Through Compaction 574

36.2 Motivation: Collision Detection 579

36.3 Filtering for Subdivision Surfaces 583

36.4 Conclusion 587

36.5 References 587

PART V: IMAGE-ORIENTED COMPUTING 591

Chapter 37: Octree Textures on the GPU 595
Sylvain Lefebvre, GRAVIR/IMAG—INRIA
Samuel Hornus, GRAVIR/IMAG—INRIA
Fabrice Neyret, GRAVIR/IMAG—INRIA

37.1 A GPU-Accelerated Hierarchical Structure: The N³-Tree 597

37.2 Application 1: Painting on Meshes 602

37.3 Application 2: Surface Simulation 611

37.4 Conclusion 612

37.5 References 613

Chapter 38: High-Quality Global Illumination Rendering Using Rasterization 615
Toshiya Hachisuka, The University of Tokyo

38.1 Global Illumination via Rasterization 616

38.2 Overview of Final Gathering 617

38.3 Final Gathering via Rasterization 621

38.4 Implementation Details 625

38.5 A Global Illumination Renderer on the GPU 627

38.6 Conclusion 632

38.7 References 632

Chapter 39: Global Illumination Using Progressive Refinement Radiosity 635
Greg Coombe, University of North Carolina at Chapel Hill
Mark Harris, NVIDIA Corporation

39.1 Radiosity Foundations 636

39.2 GPU Implementation 638

39.3 Adaptive Subdivision 643

39.4 Performance 645

39.5 Conclusion 645

39.6 References 647

Chapter 40: Computer Vision on the GPU 649
James Fung, University of Toronto

40.1 Introduction 649

40.2 Implementation Framework 650

40.3 Application Examples 651

40.4 Parallel Computer Vision Processing 664

40.5 Conclusion 664

40.6 References 665

Chapter 41: Deferred Filtering: Rendering from Difficult Data Formats 667
Joe Kniss, University of Utah
Aaron Lefohn, University of California, Davis
Nathaniel Fout, University of California, Davis

41.1 Introduction 667

41.2 Why Defer? 668

41.3 Deferred Filtering Algorithm 669

41.4 Why It Works 673

41.5 Conclusions: When to Defer 673

41.6 References 674

Chapter 42: Conservative Rasterization 677
Jon Hasselgren, Lund University
Tomas Akenine-Möller, Lund University
Lennart Ohlsson, Lund University

42.1 Problem Definition 678

42.2 Two Conservative Algorithms 679

42.3 Robustness Issues 686

42.4 Conservative Depth 687

42.5 Results and Conclusions 689

42.6 References 690

PART VI: SIMULATION AND NUMERICAL ALGORITHMS 691

Chapter 43: GPU Computing for Protein Structure Prediction 695
Paulius Micikevicius, Armstrong Atlantic State University

43.1 Introduction 695

43.2 The Floyd-Warshall Algorithm and Distance-Bound Smoothing 697

43.3 GPU Implementation 698

43.4 Experimental Results 701

43.5 Conclusions and Further Work 701

43.6 References 702

Chapter 44: A GPU Framework for Solving Systems of Linear Equations 703
Jens Krüger, Technische Universität München
Rüdiger Westermann, Technische Universität München

44.1 Overview 703

44.2 Representation 704

44.3 Operations 708

44.4 A Sample Partial Differential Equation 714

44.5 Conclusion 718

44.6 References 718

Chapter 45: Options Pricing on the GPU 719
Craig Kolb, NVIDIA Corporation
Matt Pharr, NVIDIA Corporation

45.1 What Are Options? 719

45.2 The Black-Scholes Model 721

45.3 Lattice Models 725

45.4 Conclusion 730

45.5 References 731

Chapter 46: Improved GPU Sorting 733
Peter Kipfer, Technische Universität München
Rüdiger Westermann, Technische Universität München

46.1 Sorting Algorithms 733

46.2 A Simple First Approach 734

46.3 Fast Sorting 735

46.4 Using All GPU Resources 738

46.5 Conclusion 745

46.6 References 746

Chapter 47: Flow Simulation with Complex Boundaries 747
Wei Li, Siemens Corporate Research
Zhe Fan, Stony Brook University
Xiaoming Wei, Stony Brook University
Arie Kaufman, Stony Brook University

47.1 Introduction 747

47.2 The Lattice Boltzmann Method 748

47.3 GPU-Based LBM 749

47.4 GPU-Based Boundary Handling 753

47.5 Visualization 759

47.6 Experimental Results 760

47.7 Conclusion 761

47.8 References 763

Chapter 48: Medical Image Reconstruction with the FFT 765
Thilaka Sumanaweera, Siemens Medical Solutions USA
Donald Liu, Siemens Medical Solutions USA

48.1 Background 765

48.2 The Fourier Transform 766

48.3 The FFT Algorithm 767

48.4 Implementation on the GPU 768

48.5 The FFT in Medical Imaging 776

48.6 Conclusion 783

48.7 References 784

Index 785

Purchase Info

ISBN-10: 0-321-54541-9

ISBN-13: 978-0-321-54541-1

Format: Safari PTG

This publication is not currently for sale.