
CUDA by Example: An Introduction to General-Purpose GPU Programming (Reprint Edition)


About the Book

CUDA is a computing architecture designed to facilitate the development of parallel programs. Combined with a comprehensive software platform, the CUDA architecture lets programmers harness the tremendous power of the graphics processing unit (GPU) to build high-performance applications. GPUs, of course, have long been used to run complex graphics and game applications. CUDA now brings this valuable resource to programmers working in other fields, including science, engineering, and finance. These programmers need no knowledge of graphics programming; they simply write code in an appropriately extended version of the C language.

Written by two senior members of the CUDA software platform team, the book shows programmers how to employ this new technology and walks through every area of CUDA development with working examples. After a brief introduction to the CUDA platform and architecture and a quick-start guide to CUDA C, it details the techniques behind each key CUDA feature and the trade-offs involved in using them. By the end, you will know when to use each CUDA C extension and how to write CUDA software that delivers truly outstanding performance.
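
To give a concrete sense of what "an appropriately extended version of C" looks like, the sketch below shows a minimal CUDA C program of the kind the early chapters build toward. It is an illustrative example rather than code reproduced from the book, and the kernel name and values are only placeholders: the __global__ qualifier marks a function that runs on the GPU, and the <<<blocks, threads>>> syntax launches it from ordinary host code.

#include <stdio.h>

// __global__ marks a kernel: a C function compiled for and executed on the GPU.
__global__ void add(int a, int b, int *c) {
    *c = a + b;
}

int main(void) {
    int c;
    int *dev_c;

    // Allocate device memory, then launch the kernel with CUDA's
    // triple-angle-bracket syntax: one block of one thread is enough here.
    cudaMalloc((void **)&dev_c, sizeof(int));
    add<<<1, 1>>>(2, 7, dev_c);

    // Copy the result back to the host, print it, and release device memory.
    cudaMemcpy(&c, dev_c, sizeof(int), cudaMemcpyDeviceToHost);
    printf("2 + 7 = %d\n", c);
    cudaFree(dev_c);
    return 0;
}

Aside from the __global__ qualifier, the kernel launch syntax, and the cudaMalloc/cudaMemcpy/cudaFree runtime calls, everything above is standard C, which is exactly the point the book makes: no graphics API knowledge is required.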

Editor's Recommendation

CUDA by Example: An Introduction to General-Purpose GPU Programming (Reprint Edition) is published by Tsinghua University Press.

About the Authors

Jason Sanders is a senior software engineer on NVIDIA's CUDA platform team. He helped develop early releases of the CUDA system software and contributed to the OpenCL 1.0 specification, an industry standard for heterogeneous computing. Jason has also held positions at ATI Technologies, Apple, and Novell. Edward Kandrot is on NVIDIA's CU…

Table of Contents

foreword

preface

acknowledgments

about the authors

1 why cuda ? why now?

1.1 chapter objectives

1.2 the age of parallel processing

1.3 the rise of gpu computing

1.4 cuda

1.5 applications of cuda

1.6 chapter review

2 getting started

2.1 chapter objectives

2.2 development environment

2.3 chapter review

3 introduction to cuda c

3.1 chapter objectives

3.2 a first program

3.3 querying devices

3.4 using device properties

3.5 chapter review

4 parallel programming in cuda c

4.1 chapter objectives

4.2 cuda parallel programming

4.3 chapter review

5 thread cooperation

5.1 chapter objectives

5.2 splitting parallel blocks

5.3 shared memory and synchronization

5.4 chapter review

6 constant memory and events

6.1 chapter objectives

6.2 constant memory

6.3 measuring performance with events

6.4 chapter review

7 texture memory

7.1 chapter objectives

7.2 texture memory overview

7.3 simulating heat transfer

7.4 chapter review

8 graphics interoperability

8.1 chapter objectives

8.2 graphics interoperation

8.3 gpu ripple with graphics interoperability

8.4 heat transfer with graphics interop

8.5 directx interoperability

8.6 chapter review

9 atomics

9.1 chapter objectives

9.2 compute capability

9.3 atomic operations overview

9.4 computing histograms

9.5 chapter review

10 streams

10.1 chapter objectives

10.2 page-locked host memory

10.3 cuda streams

10.4 using a single cuda stream

10.5 using multiple cuda streams

10.6 gpu work scheduling

10.7 using multiple cuda streams effectively

10.8 chapter review

11 cuda c on multiple gpus

11.1 chapter objectives

11.2 zero-copy host memory

11.3 using multiple gpus

11.4 portable pinned memory

11.5 chapter review

12 the final countdown

12.1 chapter objectives

12.2 cuda tools

12.3 written resources

12.4 code resources

12.5 chapter review

a advanced atomics

a.1 dot product revisited

a.2 implementing a hash table

a.3 appendix review

index

Online Preview

In recent years, however, manufacturers have been forced to look for alternatives to this traditional source of increased computational power. Because of various fundamental limitations in the fabrication of integrated circuits, it is no longer feasible to rely on upward-spiraling processor clock speeds as a means for extracting additional power from existing architectures. Because of power and heat restrictions as well as a rapidly approaching physical limit to transistor size, researchers and manufacturers have begun to look elsewhere.

Outside the world of consumer computing, supercomputers have for decades extracted massive performance gains in similar ways. The performance of a processor used in a supercomputer has climbed astronomically, similar to the improvements in the personal computer CPU. However, in addition to dramatic improvements in the performance of a single processor, supercomputer manufacturers have also extracted massive leaps in performance by steadily increasing the number of processors. It is not uncommon for the fastest supercomputers to have tens or hundreds of thousands of processor cores working in tandem.

In the search for additional processing power for personal computers, the improvement in supercomputers raises a very good question: Rather than solely looking to increase the performance of a single processing core, why not put more than one in a personal computer? In this way, personal computers could continue to improve in performance without the need for continuing increases in processor clock speed.

Media Reviews

"对于处理基于图形加速器的计算系统的人员来说,本书是必不可少的读物。" ——Jack Dongarra博士(田纳西大学特聘教授和橡树岭国家实验室杰出研究员)作序推荐

Reader Comments (these do not represent the views of this site)

Review from JF**:

An easy-to-understand book; recommended for anyone getting started with GPU programming.

2011-01-04 09:04:36
Review from wendyan**:

A classic CUDA book, well worth buying.

2011-11-30 13:20:16
Review from life4li**:

A practical textbook.

2011-12-12 09:42:59
Review from 小笨笨l**:

My boyfriend says it's pretty good.

2012-02-21 22:29:58
Review from 土豆丝Y**:

The book is great. It took me this long to come back and leave a positive review, but it really is good.

2013-12-03 18:58:33
Review from an anonymous user:

My English is only average, so reading it is a bit of a struggle. Foreign authors definitely write better books than Chinese authors do.

2011-01-17 15:57:28
Review from 等待鲜**:

It's decent, just a bit basic. Still, the authors explain things clearly, so it comes fairly recommended.

2011-03-07 17:55:17
Review from an anonymous user:

One of the few books that introduce CUDA. I'm reading it now, and so far it's good.

2010-12-21 21:54:27