# What is the best way to optimize for fast floating point operations?



## konan (Jul 26, 2002)

What is the best way to optimize for fast floating point operations using Project Builder 2.0? 

I already have my project set to optimization Level 3, but is there anything else I can do to make floating point calculations go faster?

Konan


----------



## btoneill (Jul 26, 2002)

Turn them into vector operations and run them on the AltiVec unit.

Brian


----------



## konan (Jul 26, 2002)

Looks really interesting. Unfortunately, it would require a massive overhaul of my program (and render it OS specific, as it is currently pure ANSI C++). Are there any other methods to get a more modest speed increase in the short term?

Konan


----------



## strobe (Jul 26, 2002)

You can use the AltiVec C extensions |-)

Optimization levels in gcc are a joke; gcc sucks. 3.1 is supposed to be better, however, and you can pull it from Apple's CVS. The improvement is supposed to be pretty good.


----------



## konan (Jul 29, 2002)

Are the AltiVec extensions seamless? I.e., will I have to change my ANSI C++ code to accommodate them (other than a header perhaps)?

Can you point me to a suitable reference (tutorial perhaps)?

Konan


----------



## rharder (Jul 30, 2002)

Yeah, you'd have to change your algorithms considerably. A good practice is to have ANSI C versions of the algorithm, and then call the AltiVec versions instead if you're on a G4 processor.

Apple's got some docs here:

http://developer.apple.com/samplecode/Sample_Code/Devices_and_Hardware/Velocity_Engine.htm

-Rob


----------



## konan (Jul 30, 2002)

Looks good. I'll keep it in mind for the next version. I have been meaning to abstract my vectors in any case. I would probably wrap them in macros instead of functions to ensure top performance.

Question... what happens if someone on a G3 runs the program? Will it simply switch to software emulation or will it not work at all?

Konan


----------



## rharder (Jul 31, 2002)

If you don't do the checking yourself and try to execute G4 instructions on a G3, you'll get an illegal instruction crash.

-Rob


----------



## konan (Aug 2, 2002)

How horrible! So much for a smooth transition between the two. You'd think that the AltiVec library would kick into software mode if the G4 was not detected. 

Konan


----------



## strobe (Aug 2, 2002)

Uuh, just like you want to use FPU operations when the FPU isn't available? Until we use programmable processors you will always have to check the CPU's ISA. 

Note: AltiVec is useful for non-vector operations as well. It's very flexible. Making your vector code modular doesn't necessarily make it more apt for optimization.


----------



## monty (Aug 3, 2002)

> _Originally posted by konan_
>
> How horrible! So much for a smooth transition between the two. You'd think that the AltiVec library would kick into software mode if the G4 was not detected.
>
> Konan



The flaw in your statement is that it isn't a library! Those AltiVec language extensions look like C functions, but they're not. They translate into single PPC instructions. The G3 will be going merrily along the sequence of instructions until it suddenly hits an AltiVec instruction and craps out.
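To make that concrete, here is a minimal sketch of what such code tends to look like. The function name `add_arrays` is made up for illustration, and the vector path assumes a compiler that provides `<altivec.h>` and defines `__ALTIVEC__` when AltiVec code generation is enabled, that both pointers are 16-byte aligned, and that `n` is a multiple of 4. Note that the `#if` only decides what the compiler emits; a binary built with the vector path would still hit the illegal instruction on a G3.

```cpp
#include <cstddef>

#if defined(__ALTIVEC__)
#include <altivec.h>
#endif

// Hypothetical helper: adds b into a, element by element.
// Vector path assumptions: a and b are 16-byte aligned, n is a multiple of 4.
void add_arrays(float* a, const float* b, std::size_t n) {
#if defined(__ALTIVEC__)
    // vec_ld, vec_add and vec_st look like function calls, but each one
    // typically compiles down to a single AltiVec instruction.
    for (std::size_t i = 0; i < n; i += 4) {
        __vector float va = vec_ld(0, a + i);
        __vector float vb = vec_ld(0, b + i);
        vec_st(vec_add(va, vb), 0, a + i);
    }
#else
    // Plain ANSI C++ fallback, used when the compiler is not targeting AltiVec.
    for (std::size_t i = 0; i < n; ++i) {
        a[i] += b[i];
    }
#endif
}
```

Either path does the same arithmetic; the vector path just handles four floats per loop iteration.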

It would be theoretically possible to trap to the OS, check whether an AltiVec instruction was the one that caused the problem, and emulate it, but there would be no point. An OS trap per instruction would probably be *many* orders of magnitude slower than the ANSI C code.

It's not as horrible as you think. If you use C++ just make two subclasses of your main processor class, one for the G4 and one for everything else. Set a variable to the right one at the start of the program and you're set. If you're using straight C, use function pointers. There is a Gestalt selector to figure out if you have a G4.
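The two-subclass idea could be sketched roughly as follows. All class and function names here (`Processor`, `ScalarProcessor`, `VectorProcessor`, `make_processor`) are hypothetical, and the "vector" class uses a plain loop as a stand-in where real code would use the AltiVec extensions. The `has_altivec` flag is assumed to come from whatever check you do at startup, such as the Gestalt call.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical interface for the number-crunching part of the program.
class Processor {
public:
    virtual ~Processor() {}
    virtual void scale(float* data, std::size_t n, float k) const = 0;
    virtual bool uses_vector_unit() const = 0;
};

// Portable ANSI C++ version, safe on any CPU.
class ScalarProcessor : public Processor {
public:
    void scale(float* data, std::size_t n, float k) const override {
        for (std::size_t i = 0; i < n; ++i) data[i] *= k;
    }
    bool uses_vector_unit() const override { return false; }
};

// G4 version. The body is a stand-in: real code would process four
// floats at a time with AltiVec extensions instead of this loop.
class VectorProcessor : public Processor {
public:
    void scale(float* data, std::size_t n, float k) const override {
        for (std::size_t i = 0; i < n; ++i) data[i] *= k;
    }
    bool uses_vector_unit() const override { return true; }
};

// Pick the implementation once, at program start.
std::unique_ptr<Processor> make_processor(bool has_altivec) {
    if (has_altivec) {
        return std::unique_ptr<Processor>(new VectorProcessor());
    }
    return std::unique_ptr<Processor>(new ScalarProcessor());
}
```

The rest of the program only holds a `Processor` pointer, so the CPU check happens once instead of being scattered through the code.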


----------



## ladavacm (Aug 5, 2002)

AltiVec compatibility could be built into the kernel by catching the illegal-instruction fault on the fly and emulating the instruction; this is unnecessary bloat, IMNSHO, and probably quite slow. Or you can install a SIGILL handler and perform the emulation yourself from the handler if it triggers. That is not as easy as it seems, and it is probably cheaper to test for the presence of AltiVec once (using a SIGILL handler, e.g.) and switch to a non-AltiVec implementation at a higher level (i.e. not per failed instruction).
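A minimal sketch of the "probe once with a SIGILL handler" idea, assuming a POSIX system with `sigaction` and `sigsetjmp`. The name `instruction_is_supported` is made up, and the probe is passed in as a function pointer; on a real PPC build it would be a tiny function that executes one AltiVec instruction. This sketch is not thread-safe and is meant to be called once at startup.

```cpp
#include <setjmp.h>
#include <signal.h>

// Jump target used to escape from the signal handler.
static sigjmp_buf g_probe_env;

// If the probe triggers SIGILL, jump back out instead of crashing.
static void on_sigill(int) {
    siglongjmp(g_probe_env, 1);
}

// Hypothetical helper: runs probe() once with a temporary SIGILL handler.
// Returns true if probe() finished, false if it raised SIGILL.
bool instruction_is_supported(void (*probe)()) {
    struct sigaction new_action;
    struct sigaction old_action;
    new_action.sa_handler = on_sigill;
    sigemptyset(&new_action.sa_mask);
    new_action.sa_flags = 0;
    sigaction(SIGILL, &new_action, &old_action);

    bool supported;
    // Passing 1 saves the signal mask so siglongjmp can restore it.
    if (sigsetjmp(g_probe_env, 1) == 0) {
        probe();
        supported = true;
    } else {
        supported = false;
    }

    // Put the previous SIGILL handler back before returning.
    sigaction(SIGILL, &old_action, nullptr);
    return supported;
}
```

The result would be stored in a flag at startup and used to choose between the AltiVec and non-AltiVec code paths, so the handler never runs again during normal operation.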


----------

