Some people have asked for examples of 64-bit code.
Of course, there are many cases where 64-bit operations help to improve performance.
The simplest answer would maybe be: "Google for the MMX instruction set" :-)
As you all know, MMX was a 64-bit instruction set extension added by Intel, which gave a very good performance boost to many routines.
Let's look at some examples in more detail:
For example, an important part of JPEG or video decoding is calculating the averages of a block.
Let's make a very simple example: say you have 16 bytes and another 16 bytes, and you want to AVERAGE them into a new mixed 16 bytes. Like mixing 2 audio streams.
What you need to do for this is add each byte from pointer A to the matching byte from pointer B. And then you need to RIGHT SHIFT the result by one bit, which divides the sum by two.
Typical 68k Code might look like this
With these 8 instructions 1 byte of data was processed.
To process all 16 bytes we need to execute these instructions 16 times. This means over 100 instructions are needed.
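To make the per-byte work concrete, here is a rough sketch of the same loop in C instead of 68k asm. The function name average_bytes is made up for this illustration, and it assumes a truncating average (add, then shift right by one) as described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration: average two byte buffers,
   one byte per loop iteration. The sum is formed in a wider type
   so the 9-bit result of a + b cannot overflow before the shift. */
void average_bytes(const uint8_t *a, const uint8_t *b, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned sum = (unsigned)a[i] + (unsigned)b[i]; /* add the two bytes */
        out[i] = (uint8_t)(sum >> 1);                   /* right shift = divide by 2 */
    }
}
```

Every byte needs its own load, add, shift and store, which is why the instruction count grows so quickly.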
Now APOLLO includes the PAVGU.B instruction, which does the average of 8 bytes in a single instruction, in a single cycle.
The advantage is obvious, right?
With 3 instructions, 8 bytes are loaded, averaged, and stored.
This means Apollo can do in 6 instructions what previous 68k CPUs needed over 100 instructions for.
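If you want to get a feeling for what "8 bytes at once" means without Apollo hardware, the idea can be imitated in plain C on a 64-bit integer. This is only a sketch of the concept, not the definition of PAVGU.B: the names avg8_bytes and average16 are invented here, and it assumes a truncating average, while the real instruction may round differently.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: average 8 packed unsigned bytes at once.
   Uses the identity (a + b) >> 1 == (a & b) + ((a ^ b) >> 1) per byte.
   The mask clears the lowest bit of every byte before the shift, so no
   bit can slide from one byte lane into its neighbour. */
static uint64_t avg8_bytes(uint64_t a, uint64_t b)
{
    const uint64_t lane_mask = 0xFEFEFEFEFEFEFEFEULL;
    return (a & b) + (((a ^ b) & lane_mask) >> 1);
}

/* Average two 16-byte buffers as two 64-bit steps:
   load 8 bytes, average, store 8 bytes, then repeat once. */
void average16(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    for (int i = 0; i < 16; i += 8) {
        uint64_t x, y, r;
        memcpy(&x, a + i, 8);      /* load 8 bytes from A */
        memcpy(&y, b + i, 8);      /* load 8 bytes from B */
        r = avg8_bytes(x, y);      /* average all 8 lanes together */
        memcpy(out + i, &r, 8);    /* store 8 result bytes */
    }
}
```

In C this still takes several operations per 8 bytes. The point of a dedicated instruction is that the whole avg8_bytes step becomes one opcode.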
The advantage of the new Apollo instructions is easy to see.
The code is simpler to write, and it needs far fewer instructions.
And Apollo can do the same work with the new instructions over 10 times faster than the highest clocked 68060 could do it.
You always wanted a Gigahertz 68060?
Apollo can, as of today, give you the performance of a Gigahertz 68060 for all routines which are rewritten to benefit from the new instructions. Examples would be Datatypes, JpegViewer, Videoplayer ...
You want another simple example?
Let's look at video display conversion.
Videos are stored as compressed YUV data - applications often want to convert them to another format, RGB.
To convert a YUV pixel to RGB some mathematical calculation is needed. This involves 4 Multiplications, 2 ADDS, 2 SUBS, several COMPARE instructions and several SHIFT, AND, OR instructions.
In total, converting 1 pixel from YUV to RGB takes over 100 cycles on today's 68030 CPU.
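To show where all these operations come from, here is a rough C sketch of converting one pixel the classic way. The coefficients are an assumption (commonly used full-range YUV factors, scaled by 256), the RGB16 layout is assumed to be 5-6-5, and the names clamp_u8 and yuv_to_rgb565 are made up for this example. Real decoders may use different constants.

```c
#include <stdint.h>

/* Clamp an intermediate result to the 0..255 range (the COMPARE steps). */
static int clamp_u8(int v)
{
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

/* Hypothetical helper: convert one YUV pixel to RGB16 (5-6-5 assumed).
   Coefficients are assumed values scaled by 256:
   1.402 -> 359, 0.344 -> 88, 0.714 -> 183, 1.772 -> 454. */
uint16_t yuv_to_rgb565(uint8_t y, uint8_t u, uint8_t v)
{
    int d = (int)u - 128;                    /* SUB: centre U around zero */
    int e = (int)v - 128;                    /* SUB: centre V around zero */

    /* Four multiplications in total; division by 256 undoes the scaling. */
    int r = clamp_u8(y + (359 * e) / 256);
    int g = clamp_u8(y - (88 * d + 183 * e) / 256);
    int b = clamp_u8(y + (454 * d) / 256);

    /* SHIFT and OR steps: pack 5 bits red, 6 bits green, 5 bits blue. */
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

All of this work is for a single pixel, and it has to be repeated for every pixel of every frame.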
Now Apollo has a special instruction for this.
If you want to convert to RGB16, then Apollo can even convert 4 pixels in parallel, needing only 1 asm instruction.
As you can see, using the new instructions makes the code not only more than 100 times faster - but also makes coding significantly easier.
Does this mean that all programs need to be rewritten to benefit from the performance boost of 64-bit operations?
To get the full potential of running certain algorithms 10 or 100 times faster - YES.
But even without changing the code at all, APOLLO is often able to accelerate the execution by utilizing 64-bit operations.
This is a simple memcopy example.
Code like this is used in many old 68k programs.
The code copies 2 times 32 bits....
APOLLO is able to understand that the programmer wanted to move data as fast as possible - and Apollo is able to help.
Apollo can re-write the instructions from 2 times 32 bits to 1 time 64 bits, which will speed up the memcopy by 100%.
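The idea can be pictured in C like this. It is only an illustration of the two access patterns, with invented function names copy_2x32 and copy_1x64. On Apollo the fusing is described as happening in the CPU on the existing 68k instructions, not in source code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration: copy each 8-byte block as two 32-bit moves,
   the way the old code is described as doing it. */
void copy_2x32(uint8_t *dst, const uint8_t *src, size_t n_blocks)
{
    for (size_t i = 0; i < n_blocks; i++) {
        uint32_t first, second;
        memcpy(&first,  src + 8 * i,     4);   /* first 32-bit load  */
        memcpy(&second, src + 8 * i + 4, 4);   /* second 32-bit load */
        memcpy(dst + 8 * i,     &first,  4);   /* first 32-bit store  */
        memcpy(dst + 8 * i + 4, &second, 4);   /* second 32-bit store */
    }
}

/* The same copy with one 64-bit move per block: half as many transfers,
   identical bytes in the destination. */
void copy_1x64(uint8_t *dst, const uint8_t *src, size_t n_blocks)
{
    for (size_t i = 0; i < n_blocks; i++) {
        uint64_t block;
        memcpy(&block, src + 8 * i, 8);        /* one 64-bit load  */
        memcpy(dst + 8 * i, &block, 8);        /* one 64-bit store */
    }
}
```

Both functions leave exactly the same data in the destination, which is why such a rewrite is safe for the old program.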
Using such features APOLLO is able to accelerate and improve old existing 68k programs. Several of these "intelligent-rewrites" will be enabled in the next GOLD release of APOLLO.