AMD takes a major step in enabling GPGPU coding

CodeAnalyst 3.2 allows heterogeneous profiling of code

Apr 14, 2011 by Charlie Demerjian

AMD just took, and then retracted, a major step forward in the whole ‘fusion’ concept, enabling profiling across heterogeneous cores. Although it may sound minor, it is a huge step forward in the usability of the whole paradigm.

Yesterday, AMD (AMD) put up a ‘blog’ post about CodeAnalyst 3.2, the next release of the AMD profiling tool that is currently on v3.1. The page disappeared very quickly, but not before a sharp-eyed reader captured it. Before the conspiracy theorists go nuts, someone probably just hit the button to post instead of the one to schedule the release, so it will probably be back up in full soon.

The post detailed some of the features of CodeAnalyst 3.2, including Bulldozer (12h family) support, CPU/memory utilization timelines, Visual Studio integration, and the aforementioned heterogeneous profiling. The last of those is by far the biggest addition.

Although it may not sound like much, this part of the release is the key. “If you captured OpenCL information, that will also be shown on the timeline. The timeline has an easy navigation for zooming into the most minute call, while retaining a relative sense of the entire profile. Each GPU device with OpenCL activity will be displayed. A chart for each thread with OpenCL API calls will display the function durations, with double-click, two-way navigation to a detailed data table of the function traces. Kernel and data transfer events are logged and shown in the respective command queues, with the ability to see the latency involved with enqueued events waiting in parallel.”
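To put numbers on the ‘latency involved with enqueued events’ part: OpenCL's event profiling reports four timestamps per command, in nanoseconds, for when it was queued, submitted, started, and finished (the `CL_PROFILING_COMMAND_*` values). The sketch below is only the arithmetic a timeline view would need on top of those, not how CodeAnalyst does it. The `EventTimes` class, the `summarize` helper, and every number in it are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventTimes:
    """Hypothetical container for one command's profiling timestamps (ns)."""
    name: str
    queued: int   # corresponds to CL_PROFILING_COMMAND_QUEUED
    submit: int   # corresponds to CL_PROFILING_COMMAND_SUBMIT
    start: int    # corresponds to CL_PROFILING_COMMAND_START
    end: int      # corresponds to CL_PROFILING_COMMAND_END

    @property
    def wait_ns(self) -> int:
        # Time between being enqueued and the device starting work on it.
        return self.start - self.queued

    @property
    def run_ns(self) -> int:
        # Time the device actually spent executing the command.
        return self.end - self.start


# Invented timestamps for one write -> kernel -> read sequence on one queue.
events = [
    EventTimes("write_buffer", 0, 1_000, 5_000, 45_000),
    EventTimes("kernel", 2_000, 3_000, 46_000, 146_000),
    EventTimes("read_buffer", 4_000, 6_000, 147_000, 177_000),
]


def summarize(evts):
    """Return (name, wait_ns, run_ns) for each event, in queue order."""
    return [(e.name, e.wait_ns, e.run_ns) for e in evts]


for name, wait, run in summarize(events):
    print(f"{name:13s} waited {wait / 1000:7.1f} us, ran {run / 1000:7.1f} us")
```

With these invented numbers the read-back spends far longer waiting behind the kernel than it does running, which is exactly the kind of thing that is invisible without a per-queue view.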

OpenCL has the ability to pick a target for your code to run on, CPU or GPU, and have it ‘just work’, at least in theory. Given the disparity between CPU and GPU tools, sending things to the GPU usually meant looking at your code with the tool equivalent of welding goggles and a divining rod. Trying to find bottlenecks in your code across multiple types of execution units simultaneously made coding for the PS3 seem like lighthearted fun, even counting the inevitable Sony lawsuit.

Since AMD is heavily promoting GPU compute, even holding a conference on the subject, they have a vested interest in people using OpenCL and similar technologies. If coding for a device is a pain, and optimization/debugging is an advanced study in masochism, coders will just say no. AMD finally seems to understand that concept, and is actually making the tools people want and need now, with releases coming thick and fast. This is what they needed to do a few years ago, but it is still a welcome change.

From here, the next big step, possibly the final major hurdle, is to make a system that transparently parcels threads out to the appropriate device. To do that, you need to know what ‘appropriate device’ means, and the major metric there is performance. For that, you need a tool that can see both CPU and GPU performance counters, data transfer events, and queues/latencies. Now do you see the direction that CodeAnalyst 3.2 is moving us in? S|A
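For the curious, here is a minimal sketch of why ‘appropriate device’ needs all of those measurements and not just kernel run time. It assumes you already have per-device timings in hand; the `Measurement` class, the `pick_device` function, and all of the numbers are hypothetical, and no shipping AMD tool is claimed to work this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Hypothetical profile of one job on one device, all times in ns."""
    device: str
    transfer_ns: int     # moving data to and from the device
    queue_wait_ns: int   # time spent waiting in the command queue
    run_ns: int          # time spent actually executing

    @property
    def total_ns(self) -> int:
        # End-to-end cost is what the user sees, not run time alone.
        return self.transfer_ns + self.queue_wait_ns + self.run_ns


def pick_device(measurements):
    """Return the name of the device with the lowest end-to-end time."""
    return min(measurements, key=lambda m: m.total_ns).device


# Invented numbers: the GPU runs the kernel faster in both cases,
# but on the small job the transfer and queue overhead eat the gain.
small_job = [
    Measurement("cpu", 0, 1_000, 80_000),
    Measurement("gpu", 60_000, 30_000, 10_000),
]
large_job = [
    Measurement("cpu", 0, 1_000, 8_000_000),
    Measurement("gpu", 600_000, 30_000, 900_000),
]

print(pick_device(small_job))
print(pick_device(large_job))
```

In this toy case the small job stays on the CPU and the large one goes to the GPU. Drop the transfer or queue columns and the GPU wins both times on paper, which is the wrong answer for the small job.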


