PowerPC (Performance Optimization With Enhanced RISC - Performance Computing), or PPC, is unrelated to the Intel x86-based PC that most of us know. PowerPC is a RISC instruction set architecture created by the 1991 Apple-IBM-Motorola alliance, known as AIM. The difference from x86 becomes obvious to anyone examining the assembly language. It is, at first sight, somewhat quirky: most assembly programmers can move from one CPU to another fairly rapidly, but the PPC requires a little more effort.
The PPC has several novel features, such as eight independent condition register fields, cr0 to cr7 (each field contains flags for less than, greater than, equal and summary overflow). The field is specified explicitly when performing a comparison, meaning that compares and branches can be distributed throughout the code in a way that can be somewhat unintuitive. One instruction that took some time to get to grips with was isel. It makes use of a condition register field and allows conditional selection of a value without a branch instruction. First, consider a comparison:
cmplw %cr7, %r9, %r5
Compare register 9 with register 5 as unsigned 32-bit values and set the appropriate bits in condition register field 7.
So now we can look at the isel instruction. It is defined in the PPC manual as:
Integer Select -
If CR[crb] is set, the contents of rA (or the immediate value 0) are copied into rD. If CR[crb] is clear, the contents of rB are copied into rD
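crb is an integer identifying the specific flag of interest within the condition register. Each of the eight fields holds four flags in the order LT, GT, EQ, SO, so flag n (counting from 0) of field crN has crb = 4 * N + n. For example, crb 30 is the EQ flag of cr7.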
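The crb arithmetic can be sketched in a few lines of C. This is only an illustration of the numbering, not code from any PPC toolchain; `cr_flag` and `crb_index` are hypothetical names introduced here.

```c
/* Position of each flag inside one 4-bit condition register field.
   PowerPC numbers bits from the most significant end, so LT comes first. */
enum cr_flag { CR_LT = 0, CR_GT = 1, CR_EQ = 2, CR_SO = 3 };

/* Hypothetical helper: the bit index to pass as the crb operand of isel
   for a given field (0..7) and flag. */
unsigned crb_index(unsigned field, enum cr_flag flag)
{
    return 4u * field + (unsigned)flag;
}
```

Calling `crb_index(7, CR_EQ)` gives 30, the "equal" flag of cr7.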
Why should this interest us? There are two reasons:
- To minimize both worst-case execution time and execution jitter, it is better to avoid branches in code wherever possible.
- Code complexity metrics suggest that the complexity of a piece of code, and its liability to failure, is often directly related to the number of branches in the code.
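So, consider the following code:
cmpwi %cr7,%r11,35
beq %cr7,Label1      ; branch if r11 == 35
li %r0,0             ; else set r0 = 0
b Label2
Label1:
li %r0,1             ; set r0 = 1
Label2:
Using isel, it can be rewritten as:
li %r10,0            ; value to select if r11 != 35
li %r5,1             ; value to select if r11 == 35
cmpwi %cr7,%r11,35
isel %r0,%r5,%r10,30 ; r0 = (cr7 EQ set) ? r5 : r10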
This rewrite saves only one instruction, but that is 20% of the fragment, and the result contains no branches or labels!
Only one possible path exists in this fragment instead of two, which means that its execution time no longer depends on which way a branch goes.
Applying this technique across an entire code base can significantly reduce the effort and complexity of code coverage analysis.
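To check that the two fragments compute the same result, their behavior can be modeled in C. This is a sketch under stated assumptions: the function names (`cr_bit`, `cmpwi_model`, `isel_model`, `with_branch`, `with_isel`) are hypothetical, the model ignores the SO flag and the special case where the rA operand is register 0, and it says nothing about what a C compiler would actually emit.

```c
#include <stdint.h>

/* Read one CR bit. PowerPC bit 0 is the most significant bit of the
   32-bit condition register, so cr7's EQ flag is bit 30. */
int cr_bit(uint32_t cr, unsigned crb)
{
    return (int)((cr >> (31u - crb)) & 1u);
}

/* Model of "cmpwi crN, rA, imm": a signed compare that writes LT, GT, EQ
   into the chosen 4-bit field. SO is left clear in this simplified model. */
uint32_t cmpwi_model(uint32_t cr, unsigned field, int32_t ra, int32_t imm)
{
    uint32_t flags = (ra < imm) ? 0x8u : (ra > imm) ? 0x4u : 0x2u;
    unsigned shift = 28u - 4u * field;
    return (cr & ~(0xFu << shift)) | (flags << shift);
}

/* Model of "isel rD, rA, rB, crb": returns rA if the CR bit is set, else rB. */
int32_t isel_model(uint32_t cr, unsigned crb, int32_t ra, int32_t rb)
{
    return cr_bit(cr, crb) ? ra : rb;
}

/* The original fragment: two paths through the code. */
int32_t with_branch(int32_t r11)
{
    if (r11 == 35)
        return 1;
    return 0;
}

/* The isel fragment: the same four steps on every call. */
int32_t with_isel(int32_t r11)
{
    int32_t r10 = 0;                            /* li    %r10,0           */
    int32_t r5 = 1;                             /* li    %r5,1            */
    uint32_t cr = cmpwi_model(0u, 7u, r11, 35); /* cmpwi %cr7,%r11,35     */
    return isel_model(cr, 30u, r5, r10);        /* isel  %r0,%r5,%r10,30  */
}
```

In this model, `with_branch` and `with_isel` return the same value for every input: 1 when r11 is 35 and 0 otherwise.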
This is just one feature of the PPC architecture. Watch this space for more snippets!