CellPerformance
All things related to getting the best performance from your Cell Broadband Engine™ (CBE) processor.
CellPerformance is the unofficial stopping place for performance-related articles and discussions on the Cell Broadband Engine™ (CBE) processor.
If you are new to Cell development, check out our Getting Started page.
Would you like to be a member?

CellPerformance is an all volunteer effort dedicated to researching and publishing code and articles for the Cell community.

You don't need to be an expert to help out! From coding and technical writing through reviewing documentation and website administration, if you have an interest in participating in the Cell community, there is a place for you here.
Email Mike Acton if you want to join the team.
No Insider Info!
Although discussions on applying the Cell processor to game development are welcome here, do not ask for insider information related to Sony's Playstation 3.

The details of the hardware and development are covered by a non-disclosure agreement and under no conditions will confidential information be permitted on this site.

Playstation 3 developers are welcome to participate in the discussions, but be aware that this is a publicly accessible site and information not available to the general public may not be disclosed.

Keep it clean so that we can continue to build on the community of Cell developers both inside and outside video game development.

Thank you for your cooperation,
Mike.
Legal
Content Copyright © 2006 by Mike Acton. All Rights Reserved.

This site uses the Movable Type 3.2 content engine.

This site uses the phpBB bulletin board engine Copyright © 2001, 2005 phpBB Group.

Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc.

PowerPC is a trademark of International Business Machines Corporation.

Linux is a registered trademark of Linus Torvalds in the U.S. and other countries.

Macintosh and Mac are registered trademarks of Apple Computer, Inc.

All other trademarks are the property of their respective owners.
CellPerformance: Forums
Share your opinions and questions related to the Cell Broadband Engine (CBE) processor. (Forums hosted by Beyond3D.com)
CellPerformance is proud to be an Official Partner of the IGDA
Articles
Fast Matrix Multiplication on Cell (SMP) Systems
Daniel Hackenberg wrote to tell me about some matrix multiply code he has written for the Cell.

Cleaning House
I'm working on a plan that will make the forums better and more useful. And hopefully, I can get a little help from some friends.

Handy PS3 Linux Framebuffer Utilities
While the documentation within Sony's vsync example should be enough to get you started with writing to the framebuffer, here are a couple of handy functions to test the framebuffer settings, open the virtual terminal and get access to the framebuffer.

HowTo: Huge TLB pages on PS3 Linux
Understanding the TLB and minimizing misses is a critical part of high performance Cell programming. Unfortunately some PS3 kernels do not come with huge page support enabled. Jakub Kurzak and Alfredo Buttari step through the details of recompiling the kernel for huge page support.

Cross-compiling for PS3 Linux
In this article, I will detail the basic steps I used to get started building on a host PC and running on the PS3.

Unaligned scalar load and store on the SPU
An example of unaligned loads and stores on the SPU. The solution to this problem is to remember that the SPU does not have a scalar instruction set and does not access local memory in anything except 16-byte quadwords.

atan2 on SPU
A branch-free implementation of atan2 vector floats for the SPU.

Branch-free implementation of half-precision (16 bit) floating point
The goal of this project is to serve as an example of developing some relatively complex operations completely without branches: a software implementation of half-precision floating point numbers.

Better Performance Through Branch Elimination
An introduction to branch penalties: Why it's a good idea to avoid branchy code.

Box Overlap
A look at a function to test for overlap between 3D boxes, and how to optimize it for the CBE.

A 4x4 Matrix Inverse
A case study on converting scalar code into SIMD code for the PPU and SPU, using the matrix inverse as an example.

Avoiding Microcoded Instructions On The PPU
Executing instructions from microcode can wreak havoc on inner loop performance. Find out which instructions are microcoded and how to avoid them.

Choosing to Avoid Branches: A Small Altivec Example
An example of why fewer instructions doesn't always equal faster code.

More Techniques for Eliminating Branches
Some additional examples for eliminating integer and floating-point branches.

Programming with Branches, Patterns and Tips
GCC follows some straightforward rules that are useful to know when programming with branches.

Fast Matrix Multiplication on Cell (SMP) Systems

Daniel Hackenberg wrote to tell me about some matrix multiply code he has written for the Cell.



From his page:


This site describes a fast matrix multiplication code for Cell BE processors. It has been developed as part of a seminar paper at the Center for Information Services and High Performance Computing. The program is freely available under the GNU GPL.



Go ahead and check it out: Fast Matrix Multiplication on Cell (SMP) Systems [tu-dresden.de]

Cleaning House
UPDATE! 7 July 2007 The new CellPerformance Forums are now up and running, hosted by our friends at Beyond3D. [Thanks guys!]

I'll be fixing up the links and generally cleaning things up to point all article discussions over to the new forums. It might take a little time, so be patient - but the quality of their forums is great, and I know that the addition of the existing B3D community to our own will drive a lot of good discussion.

Remember the main articles will continue to be posted here. Hopefully, a few more than I've had time for in recent months.

We'll be back up and running full-speed shortly!

Mike.
Hey everyone! I know our forums have been hacked. You'd think that these kids would have better things to do. You'd also think that they'd appreciate exactly the kind of info we're trying to share here. Dumb.

Anyway, not worth the effort to worry about them. I'm working on a plan that will make the forums better and more useful. And hopefully, I can get a little help from some friends.

Stay tuned. It's time for me to get back to this and get all of you more of the info you want!

Mike.
Handy PS3 Linux Framebuffer Utilities
While the documentation within Sony's vsync example should be enough to get you started with writing to the framebuffer, here are a couple of handy functions to test the framebuffer settings, open the virtual terminal and get access to the framebuffer.

Open the virtual terminal:
cp_vt.h
cp_vt.c

Open the framebuffer:
cp_fb.h
cp_fb.c

Dump framebuffer info:
fb_info.c

Example output from fb_info
Example of using cp_vt and cp_fb

Files should be compiled with:
ppu-gcc -std=c99 -pedantic -W -Wall -O3
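
For readers who want a feel for what querying the framebuffer settings involves, here is a minimal sketch using the standard Linux framebuffer ioctls. It is not the code from fb_info.c or cp_fb.c; the helper name fb_query is made up for illustration, and the device path (commonly /dev/fb0) is an assumption that may differ on your system.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <linux/fb.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper (not part of cp_fb): read the variable and fixed
 * screen info of the framebuffer device at `path`.
 * Returns 0 on success, -1 if the device cannot be opened or queried. */
int fb_query(const char *path,
             struct fb_var_screeninfo *var,
             struct fb_fix_screeninfo *fix)
{
    assert(path != NULL && var != NULL && fix != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Both requests are defined in <linux/fb.h>. */
    int ok = ioctl(fd, FBIOGET_VSCREENINFO, var) == 0
          && ioctl(fd, FBIOGET_FSCREENINFO, fix) == 0;

    close(fd);
    return ok ? 0 : -1;
}
```

After a successful call, var->xres, var->yres and var->bits_per_pixel describe the visible mode, while fix->line_length and fix->smem_len give the stride in bytes and the size of the framebuffer memory, which is what you need before mapping it.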
HowTo: Huge TLB pages on PS3 Linux
Updated! (22 Mar 07) Minor edits. Added notes for YellowDog Linux. Added source code for using huge page allocation.
Updated! (30 Mar 07) A couple minor fixes. Thanks to Guénaël Renault for pointing them out!
Updated! (15 July 07) Added notes for kernel 2.6.21
Guest article: Understanding the TLB and minimizing misses is a critical part of high performance Cell programming. Unfortunately some PS3 kernels do not come with huge page support enabled. Jakub Kurzak and Alfredo Buttari step through the details of recompiling the kernel for huge page support.
The availability of huge TLB pages depends on how the Linux kernel was configured prior to compilation. The default kernel that ships with Fedora Core 5 (and most likely with any other distribution that has binary kernel packages) doesn't include this option. So, in order to have huge TLB pages, it is necessary to reconfigure the kernel, recompile it, and instruct the boot loader about the newly created kernel image. Finally, we will also show a way to allocate the huge TLB pages automatically at boot time.

[Mike Acton] This process also works with YellowDog Linux virtually unchanged.
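
As a rough illustration of what using huge pages looks like from user code once the kernel supports them, here is a minimal sketch. It is not the source code from the guest article. It assumes a hugetlbfs filesystem is already mounted and that huge pages have been reserved; the helper name huge_alloc and any path passed to it are hypothetical.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map `size` bytes backed by a file at `path`.
 * `path` is assumed to live on a mounted hugetlbfs filesystem, and `size`
 * is assumed to be a multiple of the huge page size (see the Hugepagesize
 * line in /proc/meminfo). Returns NULL on any failure. */
void *huge_alloc(const char *path, size_t size)
{
    assert(path != NULL && size > 0);

    int fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The mapping (if any) stays valid after the descriptor is closed
     * and the backing file name is removed. */
    close(fd);
    unlink(path);

    return (p == MAP_FAILED) ? NULL : p;
}
```

Release the memory with munmap(p, size) when done. If the call returns NULL, the usual suspects are a missing hugetlbfs mount or too few reserved huge pages.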
Cross-compiling for PS3 Linux
Now that the PS3 is out and multiple Linux-based distributions are available that can be installed using Open Platform [playstation.com], it's time to start developing on some publicly available hardware!

Although the PPU and SPU compilers can be installed and used on the PS3 directly, I find it much more familiar and convenient to cross-compile from my desktop and just ship the resulting executables over to the target (PS3).

In this article, I will detail the basic steps I used to get started building on a host PC and running on the PS3.
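
One quick sanity check when setting up a cross toolchain is a tiny program that reports which target it was actually built for. The sketch below is only an illustration and is not taken from the article. It relies on the compiler's predefined macros (__SPU__, __powerpc64__, __powerpc__, __PPC__), and the function names build_target and describe_build are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: name the target this translation unit was
 * compiled for, based on macros the compiler predefines. */
const char *build_target(void)
{
#if defined(__SPU__)
    return "spu";
#elif defined(__powerpc64__)
    return "ppc64";
#elif defined(__powerpc__) || defined(__PPC__)
    return "ppc";
#else
    return "other";
#endif
}

/* Write a short description such as "built for <target>" into buf. */
void describe_build(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "built for %s", build_target());
}
```

If a binary built on the host PC with a PPU cross-compiler such as ppu-gcc reports "other" here, the build is most likely picking up the host's native compiler instead of the cross toolchain.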