Multicore Application Programming: for Windows, Linux, and Oracle Solaris

By Darryl Gove

Published by Addison-Wesley Professional

Published Date: Nov 8, 2010


This is the Safari online edition of the printed book.

Write High-Performance, Highly Scalable Multicore Applications for Leading Platforms

Multicore Application Programming is a comprehensive, practical guide to high-performance multicore programming that any experienced developer can use.


Author Darryl Gove covers the leading approaches to parallelization on Windows, Linux, and Oracle Solaris. Through practical examples, he illuminates the challenges involved in writing applications that fully utilize multicore processors, helping you produce applications that are functionally correct, offer superior performance, and scale well to eight cores, sixteen cores, and beyond.


The book reveals how specific hardware implementations impact application performance and shows how to avoid common pitfalls. Step by step, you’ll write applications that can handle large numbers of parallel threads, and you’ll master advanced parallelization techniques. You’ll learn how to


  • Identify your best opportunities to use parallelism
  • Share data safely between multiple threads
  • Write applications using POSIX or Windows threads
  • Hand-code synchronization and sharing
  • Take advantage of automatic parallelization and OpenMP
  • Overcome common obstacles to scaling
  • Apply new approaches to writing correct, fast, scalable parallel code


Multicore Application Programming isn’t wedded to a single approach or platform: It is for every experienced C programmer working with any contemporary multicore processor in any leading operating system environment.

Table of Contents

Preface

Acknowledgments

About the Author


Chapter 1: Hardware, Processes, and Threads

Examining the Insides of a Computer

The Motivation for Multicore Processors

The Characteristics of Multiprocessor Systems

The Translation of Source Code to Assembly Language

Summary


Chapter 2: Coding for Performance

Defining Performance

Understanding Algorithmic Complexity

How Structure Impacts Performance

The Role of the Compiler

Identifying Where Time Is Spent Using Profiling

How Not to Optimize

Performance by Design

Summary


Chapter 3: Identifying Opportunities for Parallelism

Using Multiple Processes to Improve System Productivity

Multiple Users Utilizing a Single System

Improving Machine Efficiency Through Consolidation

Using Parallelism to Improve the Performance of a Single Task

Parallelization Patterns

How Dependencies Influence the Ability to Run Code in Parallel

Identifying Parallelization Opportunities

Summary


Chapter 4: Synchronization and Data Sharing

Data Races

Synchronization Primitives

Deadlocks and Livelocks

Communication Between Threads and Processes

Storing Thread-Private Data

Summary


Chapter 5: Using POSIX Threads

Creating Threads

Compiling Multithreaded Code

Process Termination

Sharing Data Between Threads

Variables and Memory

Multiprocess Programming

Sockets

Reentrant Code and Compiler Flags

Summary


Chapter 6: Windows Threading

Creating Native Windows Threads

Methods of Synchronization and Resource Sharing

Wide String Handling in Windows

Creating Processes

Atomic Updates of Variables

Allocating Thread-Local Storage

Setting Thread Priority

Summary


Chapter 7: Using Automatic Parallelization and OpenMP

Using Automatic Parallelization to Produce a Parallel Application

Using OpenMP to Produce a Parallel Application

Ensuring That Code in a Parallel Region Is Executed in Order

Collapsing Loops to Improve Workload Balance

Enforcing Memory Consistency

An Example of Parallelization

Summary


Chapter 8: Hand-Coded Synchronization and Sharing

Atomic Operations

Operating System–Provided Atomics

Lockless Algorithms

Summary


Chapter 9: Scaling with Multicore Processors

Constraints to Application Scaling

Hardware Constraints to Scaling

Operating System Constraints to Scaling

Multicore Processors and Scaling

Summary


Chapter 10: Other Parallelization Technologies

GPU-Based Computing

Language Extensions

Alternative Languages

Clustering Technologies

Transactional Memory

Vectorization

Summary


Chapter 11: Concluding Remarks

Writing Parallel Applications

Parallel Code on Multicore Processors

The Future


Bibliography

Index