OpenCLNews 2012

The website was a community for professionals working with Heterogeneous Computing with a focus on Open Standards and OpenCL. This was their website.
Content is from the site's 2012 archived pages.


About and


The startup sponsor is AMD. While the community is completely independent and the blog content reflects the views of the authors, the basic charter of the site is to promote awareness of OpenCL and Heterogeneous Computing using Open Standards, and to engage customers by understanding how they want the APIs to evolve. Postings on this site are the opinions of the authors and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and AMD is not responsible for the contents of such linked sites and no endorsement is implied.


Recent news

Happenings in the OpenCL / Heterogeneous Computing Community

AMD Fusion Developer Summit 2012 Virtual Scavenger Hunt - Win an Eyefinity Desktop Gaming System

May 03, 2012 by Tony DeYoung

The AMD Fusion Developer Summit is running a Virtual Scavenger Hunt. They have hidden 10 "Easter Eggs" across the following websites:


Be the first to find all 10 "Easter Eggs" (an Easter egg is an intentional hidden message, in-joke, or special feature or functionality) and you could win an AMD-powered iBUYPOWER Gamer Extreme 579D3 Desktop PC complete with AMD Eyefinity multi-display technology. Runner-up prizes include a Hewlett-Packard dv6 laptop with VISION Technology from AMD for second place and an AMD Radeon HD 7970 series GPU for third place.


Update: Just heard through the grapevine that anyone who submits an entry with proof of finding all 10 Easter Eggs in the virtual scavenger hunt (even if not one of the first 3) will get a "thank you" discount code to AFDS for $300 off the normal registration cost.


Select presentation highlights from the upcoming AMD Fusion Developer Summit 2012

May 02, 2012 by Tony DeYoung


With 134 presentations at this year’s AMD Fusion Developer Summit, it is difficult to pick out representative highlights, but based on my own interests, here are a few topics that immediately captured my attention:

GPU Acceleration of Interactive Large Scale Data Analytics Utilizing The Aparapi Framework
The extreme volume of unstructured data being generated worldwide that must be analyzed, abstracted and understood has for years fueled extensive research to create intuitive, meaningful insights to the data. Capitalizing on human beings’ innate ability to rapidly comprehend visual imagery, PNNL’s IN-SPIRE processes this data and presents the results to users in a variety of intuitive and interactive visualizations. Within IN-SPIRE users can interactively explore complex relationships between visualizations and their own ad hoc search criteria to discover meaningful insights. This process is both highly dynamic and computationally intensive, as users are continuously drilling down or widening their focus areas while requiring the computations to be accomplished at ‘interaction speed’ where time-to-solution is critical. In this session, we will explore the use of AMD’s Aparapi to accelerate critical computational analytics and user interactions through high performance GPU computations.

Accelerating Medical Imaging Applications Using OpenCL and APUs
In this talk, we present results on using the OpenCL programming model to enable stages in a medical imaging pipeline for execution on an AMD APU. AMD’s APUs are a promising example of a heterogeneous architecture which bridges the gap between past homogeneity and future heterogeneity by placing a CPU and GPU on the same chip with a shared memory, while offering increased computation power and lower energy costs than many current architectures. This hybrid architecture is well suited for efficient hybrid CPU+GPU execution of programs with diverse computational characteristics. We first approach OpenCL as GPU developers and focus on data-parallel sections of the medical imaging pipeline. We then employ the full APU by including more stages of the pipeline as well as other applications for hybrid execution on CPU and GPU processors. We will discuss insights gained on OpenCL as a model for heterogeneous development, including programmability and performance benefits of greater than 10x relative to a single-core CPU.

Leveraging GPU to Drive Cutting Edge Augmented Reality Experiences
The promise of Augmented Reality (AR) being a major disruptive paradigm is finally being fulfilled as we see this computer vision technology being integrated across multiple vertical markets (education, gaming, ecommerce, retail) and on multiple compute form factors (mobile, tablet, PC, kiosk). The convergence of real and virtual worlds will require massive computational resources to recognize, render and track 3D assets and gesture movement in real time, as well as the introduction of physics engines to provide intuitive movement. Leveraging OpenGL for rendering capabilities and OpenCL to harness GPU power will be essential to fulfill future use cases for AR. We invite developers to hear more about the potential of AR and how Total Immersion’s D’Fusion studio platform is evolving to meet this computational demand in partnership with AMD.

AMD Fusion and Holographic Rendering
Zebra Imaging has pushed the state of the art in holographic rendering for the last several years. However, the methods traditionally employed relied on conventional rasterization, thereby not only limiting parallel lightfield generation but also tightly coupling frame rate to data complexity. Furthermore, we found that data resolution often exceeded display resolution, hence a solution was needed to match the two. The obvious highly parallel solution is raytracing; however, for our product line we must carefully balance processing power versus power consumption and form factor. In this presentation we describe our success experimenting with the AMD Fusion chips to achieve a low power, low profile, yet high performance solution to parallel lightfield generation. We also discuss our performance expectations for future AMD Fusion products.

GPU Acceleration in Chrome
The display of rich, interactive media in web browsers increasingly requires GPU acceleration for high performance. This talk will describe how the GPU is utilized in the Chrome web browser, both for APIs such as WebGL as well as for general HTML rendering. The talk will discuss the particular challenges of rendering arbitrary downloaded content in a robust manner, as well as interactions with the Chrome browser’s strong security sandbox.

GPU Accelerated Face Recognition in Photo and Video
Face recognition in photo and video (FRiPV) is a challenging task that requires a large amount of computational resources. Successful solutions of the FRiPV problem require complex, state of the art algorithms for visual analysis. One of the key elements of the face recognition solution is face detection (FD). Previously, the OpenCL version of Viewdle FD was developed using AMD APU technology. A number of challenges appeared in the implementation. In order to overcome them several techniques were used: block based processing, task queues, etc. This presentation gives in-depth details of our OpenCL FD implementation that allowed us to bring its performance to new levels. As a result of the improvements, a very effective and beneficial composition of data-parallel and task-parallel approaches was built, providing performance not achievable before. A presentation of the evolution of old to new optimization approaches in FD implementation is the topic of this talk.

One Size Doesn’t Fit All: Building High-performance Game Content Authoring Tools with Fabric Engine
Fabric Engine enables game developers to easily build tools that leverage the power of modern processor architectures such as the AMD APU. By providing a framework that allows developers to concentrate on what their application does, rather than the underlying architecture that handles performance, Fabric allows any developer to build high-performance applications that can tackle the challenges of modern production environments. In this talk we will focus on a character animation pipeline tool for games.


Upcoming OpenCL Programming Webinars in May

April 27, 2012 by Tony DeYoung

Get your OpenCL chops honed by attending these webinars during the month of May. Ask questions of experts in heterogeneous compute or just sit back and listen.

  1. May 1st, 10AM Pacific: Accelerate Rendering by an Order of Magnitude with OpenCL, Plus a View to the Multi-core and Web-enabled Future
  2. May 9th, 9AM Pacific: Advanced OpenCL Debugging with AMD gDEBugger
  3. May 22nd, 11AM Pacific: Heterogeneous Compute Features of AMD CodeAnalyst Performance Profiler

Photoshop CS6 and Premiere Pro CS6 will take advantage of OpenCL acceleration

April 24, 2012 by Tony DeYoung



Adobe and AMD announced that Adobe Photoshop CS6 and Adobe Premiere Pro CS6 will feature OpenCL acceleration to dramatically speed up critical imaging features and generate real-time results when editing. The new Adobe Mercury Graphics Engine within Photoshop CS6 utilizes both OpenCL and OpenGL to accelerate new and existing features, such as the new Blur Gallery, which runs up to 10x as fast on the upcoming “Trinity” APU with OpenCL GPU acceleration turned on.

Tom Malloy, senior vice president and chief software architect at Adobe’s Advanced Technology Labs, will deliver a keynote address at the AMD Fusion Developer Summit (AFDS) running June 11-14, 2012.


AMD gDEBugger v6.2 adds support for Linux & OpenCL 1.2 beta drivers

April 23, 2012 by Tony DeYoung

AMD gDEBugger is an OpenCL and OpenGL debugger and memory analyzer that is available as a Microsoft Visual Studio plugin on Windows and as a standalone version on Linux. It offers real-time OpenCL kernel debugging, which allows developers to step into the kernel execution directly from the API calls, debug inside the kernel, and view all variable values across different work groups and work items - all on a single computer with a single GPU.

The new v6.2 adds:

  • Linux support
  • Support for OpenCL 1.2 beta drivers
  • New standalone user interface for both Linux and Windows, with enhancements for better navigation and ease of use
  • Support for OpenCL kernel and API level debugging on AMD Radeon HD 7000 series graphics cards
  • Automatic updater that notifies of and downloads new product updates
  • Feature enhancements including support for static arrays, union variables and a Find feature
  • Stability improvements


COPRTHR v1.4 adds support for OpenCL on multi-core ARM

March 08, 2012 by Tony DeYoung



The CO-PRocessing THReads (COPRTHR) SDK (open-source GPLv3 license) provides several OpenCL related libraries and tools for developers targeting many-core compute technology and hybrid CPU/GPU/APU computing architectures. The new v1.4 adds:

  • Offline OpenCL compiler
  • Redesigned OpenCL implementation for x86_64
  • OpenCL implementation for multicore ARM processors
  • Preview of replacement for Khronos ICD loader
  • Improved build system

StagePresence uses OpenCL to extract video of person from a background & embed into presentations

March 06, 2012 by Tony DeYoung


AMD has invested in Nuvixa for OpenCL-accelerated, gesture-based videoconference applications.

The Nuvixa StagePresence video presentation tool uses a 3D depth camera, similar to Microsoft’s Kinect motion sensor, to extract a high-resolution live video of a presenter from virtually any background environment. It then embeds the speaker's live video image into the presentation, all without a green screen or special video editing software. You can also use it to bring a video conference to life, placing participants in a shared virtual space.

With the OpenCL CPU+GPU optimization, Nuvixa StagePresence can achieve up to 96 percent better performance, resulting in nearly double the frame rate possible with the CPU alone.
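The relationship between "96 percent better" and "nearly double" can be checked with a little arithmetic. The sketch below is only an illustration: the 15 frames/second baseline is a hypothetical figure chosen for the example, not a number reported by Nuvixa, and the function name is made up here.

```python
def improved_frame_rate(baseline_fps, improvement_percent):
    """Frame rate after a relative improvement given in percent."""
    return baseline_fps * (1 + improvement_percent / 100)


# Hypothetical CPU-only baseline; the article does not state one.
baseline_fps = 15.0
cpu_gpu_fps = improved_frame_rate(baseline_fps, 96)

# A 96 percent improvement is a 1.96x ratio, i.e. just short of doubling.
print(f"CPU only: {baseline_fps:.1f} fps, CPU+GPU: {cpu_gpu_fps:.1f} fps")
print(f"ratio: {cpu_gpu_fps / baseline_fps:.2f}x")
```

Whatever the real baseline is, the ratio stays at 1.96, which is why the improvement reads as "nearly double".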



Bullet 2.80 release: OpenCL rigid body pipeline, Android & deterministic Dynamica

March 05, 2012 by Tony DeYoung


The new Bullet 2.80 release includes a preview of the GPU rigid body pipeline running 100% on the GPU using OpenCL. A new AMD Radeon HD 7970 (Tahiti) can simulate 110k objects in real time at 15 to 30 frames per second. It also works on recent NVIDIA GPUs with the latest drivers.
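To put the 110k objects at 15 to 30 frames per second in perspective, the following sketch converts those figures into a per-frame time budget and a count of object updates per second. It is plain arithmetic on the numbers quoted above; the function names are invented for this example and nothing here comes from the Bullet code base.

```python
def frame_budget_ms(fps):
    """Time available to simulate one frame, in milliseconds."""
    return 1000.0 / fps


def object_updates_per_second(objects, fps):
    """Number of rigid body updates performed each second."""
    return objects * fps


OBJECTS = 110_000  # figure quoted in the release announcement

for fps in (15, 30):
    budget = frame_budget_ms(fps)
    updates = object_updates_per_second(OBJECTS, fps)
    print(f"{fps} fps: {budget:.1f} ms per frame, {updates:,} object updates per second")
```

At the upper end of the quoted range, the whole pipeline for all 110k objects has to finish in roughly a thirtieth of a second.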

This release includes an Android/NEON optimized version of Physics Effects that will be used as a handheld backend for Bullet 3.x, and includes the open source Dynamica Bullet plugin for Maya which is now deterministic and has preliminary support for soft body/cloth and convex decomposition through HACD.


PGI OpenCL 1.1 Compiler For ARM ST-Ericsson mobile devices running Android

February 28, 2012 by Tony DeYoung


The Portland Group (a wholly owned subsidiary of STMicroelectronics) announced the PGI OpenCL Compiler for ARM, an OpenCL framework that will initially target ST-Ericsson’s NovaThor U8500 SoC, which is based on a dual-core Cortex-A9 CPU coupled with an ARM Mali 400 MP GPU. The framework includes a PGI OpenCL compiler for multi-core ARM CPUs as a compute device and complements OpenCL for GPUs.

In the OpenCL programming model, the host CPU controls all operation of a compute device. The device can be a GPU, another CPU, the host CPU itself running in multi-core mode, or some other type of compute device. The PGI OpenCL framework comprises five core components:

  • PGI OpenCL device compiler - compiles OpenCL kernels for parallel execution on multi-core ARM processors
  • PGCL driver - a command-level driver for processing source files containing C99, C++ or OpenCL program units, including support for static compilation of OpenCL kernels
  • OpenCL host compilers - the PGCL driver uses the Android native development kit versions of gcc and g++ to compile OpenCL host code
  • OpenCL Platform Layer - a library of routines to query platform capabilities and create execution contexts from OpenCL host code
  • OpenCL Runtime Layer - a library of routines and an extensible runtime system used to set up and execute OpenCL kernels on multi-core ARM

The initial release provides OpenCL 1.1 embedded profile support. The PGI OpenCL framework runs on Linux/x86 compilation host platforms and is integrated with the Android NDK toolchain to generate binary executables for ST-Ericsson NovaThor platforms running the Android operating system.
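The host/device split described above can be illustrated with a toy model. This is not the OpenCL API and not PGI's implementation: it is a minimal Python sketch in which a thread pool stands in for a multi-core compute device, and the names `saxpy_kernel` and `host_run` are hypothetical. It only shows the shape of the model, where the host prepares the data, enqueues one work-item per element, and waits for the device to finish.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_kernel(global_id, a, x, y, out):
    """Work done by a single work-item: one element of a*x + y."""
    out[global_id] = a * x[global_id] + y[global_id]


def host_run(a, x, y, workers=4):
    """Host side: allocate output, enqueue one work-item per element, wait."""
    out = [0.0] * len(x)
    # The pool plays the role of the compute device (assumption for this sketch).
    with ThreadPoolExecutor(max_workers=workers) as device:
        futures = [
            device.submit(saxpy_kernel, i, a, x, y, out)
            for i in range(len(x))
        ]
        for future in futures:
            future.result()  # re-raise any error from inside a work-item
    return out


result = host_run(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
print(result)
```

Each work-item writes only to its own output index, which is what lets the items run in any order. In real OpenCL the kernel would be written in OpenCL C and the host would go through the platform and runtime layers listed above.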


White Paper: Leveraging GPGPU And OpenCL Technologies For Natural User Interfaces

February 28, 2012 by Tony DeYoung


This 10-page white paper from You i Labs examines a few aspects of a Natural User Interface (NUI) that could benefit from GPGPU computing through OpenCL, along with the associated implementation benefits. Natural User Interfaces require large amounts of processing power to provide users with a fluid experience and can be quite costly in terms of CPU bandwidth. They can readily benefit from the potential performance and efficiency offered by utilizing available GPU cycles. Many NUI devices already incorporate GPUs to provide high performance rendering, freeing the CPU to handle the user input, content control, and other aspects of the NUI environment. However, in most cases the GPU is used only for visual performance enhancement and is relatively underutilized in the normal NUI environment. Due to the parallel nature of GPU design, making use of their computing power for other tasks in a NUI environment is an optimum application for GPGPU enhancement.

The white paper reviews ten areas of NUI design that are directly suited to GPGPU, and to OpenCL in particular:

1. Visual features not traditionally handled by the GPU, such as font rendering
2. Layout of NUI elements and content frames
3. Enhanced visual features
4. Sorting of data or other data handling services
5. Compression / decompression of content stores
6. Physical properties of objects – reactive physics effects processing
7. Complex animation or motion schemes for NUI visuals to appear more natural
8. Handling streaming NUI content
9. Real time security encryption / decryption
10. You i Labs uSwish NUI engine specific features