
Performance and Benchmark Results!

Is Vampire Faster Than Classic PPC Cards?

Gunnar von Boehn
(Apollo Team Member)
Posts 6207
19 Dec 2016 20:50


The video performance of the Vampire Gold 2 in combination with the new RIVA is really nice.
 
Running 640x360 in high quality and in truecolor, and fully smooth is no problem.
 
I have to say Henryk did an awesome job tuning Riva and making clever use of AMMX instructions. Riva runs really great now; it's a real pleasure to use. I think Henryk is clearly one of the best 68k Amiga coders around.
 
Many of our users have reported to us that the Vampire now plays videos significantly more smoothly and at higher resolutions than their classic PPC cards were able to.
 
According to these reports the Vampire outclasses all existing AMIGA upgrade cards including 603e and 604e PPC CPU cards.


Roman S.

Posts 149
19 Dec 2016 22:09


Sometimes it is faster, sometimes it is probably much slower. What about:
       
- 3D rendering (LightWave, Real3D, ...)
- FLAC playback (there is FFMPEG on Aminet)
- Hollywood audio/video import using AVCodec plugin
- LHA compression/decompression (there is a PPC LHA compilation available)
- Quake 1 frame rate (vs Classic with a PPC and some Voodoo 5 card)
- network transfer possible to achieve (vs some Zorro III network adapter)
- disk data transfer rate (FastATA controller, or SCSI controllers vs V500 internal IDE)
- ...
   
Also keep in mind that PowerPC on Classic is no longer limited to 603e or 604e - currently you can put a 500 MHz G3 or a 400 MHz G4 CPU there (Sonnet Crescendo 7200).
Do we have any benchmarks comparing Vampire to the G3/G4?


Daniel Sevo

Posts 299
19 Dec 2016 23:13


The PPC 604e used in Cyberstorm PPC could run up to 240MHz I think. So while not running at its full potential on a Cyberstorm board, the 604e could do 6 instructions / clock cycle.
Code optimized for it should therefore be tough for the Vampire to match in "general purpose use".
However, special cases such as AMMX optimized software could very well have the edge. Altivec didn't arrive until a couple of years later.


Wawa T

Posts 695
20 Dec 2016 00:16


No matter how fast a PPC card is, the Zorro bus limits throughput to the RTG card to around 10 MB/s. That means that, at least with a PPC accelerator that lacks its own video card on a local bus, the smoothly playable framebuffer is limited to low resolutions, somewhere around 320x240 pixels.
 
  This has always been a pain, but it doesn't prove that Apollo as a CPU can be considered in the same speed range as PPC CPUs running dedicated code.
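
The bandwidth argument above can be checked with a little arithmetic. The sketch below is only an illustration: the 10 MB/s Zorro figure comes from the post, while the 16-bit pixel depth and 25 frames per second are assumptions chosen for the example, and real playback adds further overhead.

```python
# Rough framebuffer bandwidth estimate for video playback over a bus.
# ZORRO_LIMIT_MB_S is the ~10 MB/s figure quoted in the post; the pixel
# depth and frame rate below are assumptions for illustration only.

ZORRO_LIMIT_MB_S = 10.0
BYTES_PER_PIXEL = 2      # assumed 16-bit pixels
FPS = 25                 # assumed playback rate


def required_mb_per_s(width, height, bytes_per_pixel=BYTES_PER_PIXEL, fps=FPS):
    """Bytes that must cross the bus every second, in MB (10**6 bytes)."""
    return width * height * bytes_per_pixel * fps / 1_000_000


for width, height in [(320, 240), (640, 360), (640, 480)]:
    need = required_mb_per_s(width, height)
    verdict = "fits" if need <= ZORRO_LIMIT_MB_S else "exceeds the limit"
    print(f"{width}x{height}: {need:.2f} MB/s ({verdict})")
```

Under these assumptions 320x240 needs about 3.84 MB/s, while 640x360 already needs about 11.52 MB/s, which is above the quoted bus limit.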


Samuel Crow

Posts 424
20 Dec 2016 04:30


Roman S. wrote:

Sometimes it is faster, sometimes it is probably much slower. What about:
       
  - 3D rendering (LightWave, Real3D, ...)
  - FLAC playback (there is FFMPEG on Aminet)
  - Hollywood audio/video import using AVCodec plugin
  - LHA compression/decompression (there is a PPC LHA compilation available)
  - Quake 1 frame rate (vs Classic with a PPC and some Voodoo 5 card)
  - network transfer possible to achieve (vs some Zorro III network adapter)
  - disk data transfer rate (FastATA controller, or SCSI controllers vs V500 internal IDE)
  - ...

Most of these require floating point performance.  I'll comment on this later.

LHa didn't benefit much from the PPC due to lack of bitfield support on the 603e so it probably is slower on 603e.

Disk transfer performance would need an updated SCSI.device to improve but, ultimately, it is a peripheral issue rather than a matter of CPU performance.

Roman S. wrote:

  Also keep in mind that PowerPC on Classic is no longer limited to 603e or 604e - currently you can put a 500 MHz G3 or a 400 MHz G4 CPU there (Sonnet Crescendo 7200).
  Do we have any benchmarks comparing Vampire to the G3/G4?

To beat the G3/G4, we'll probably need a faster FPGA since those PPC models clock much faster.

Re:FPU performance
The integer performance sucks on 603e because it has only one integer unit (or so I've heard) and doesn't get the full benefit of its superscalar design until you start alternating between integers and floating point instructions.

Re:apples to apples?
Where the Vampire excels is in opcode fusion, which increases the range of operations that can be executed in one clock.  PPC sucks in this regard because a load-store architecture deliberately reduces the complexity of the CPU in an attempt to reach higher clock speeds.  This improves the FPU disproportionately compared to overall performance.  Also, equivalent PPC programs require about 30% more memory on average than 68k or x86 ones due to the RISC instruction set.  This reduces cache efficiency, which favors the CISC machines.

Overall, I'd say the integer performance of a Vampire will tromp the PPC, but the reverse is true for the FPU at this time.


Mr-Z EdgeOfPanic

Posts 189
20 Dec 2016 05:07


Roman S. wrote:

Sometimes it is faster, sometimes it is probably much slower. What about:
       
  - 3D rendering (LightWave, Real3D, ...)
  - FLAC playback (there is FFMPEG on Aminet)
  - Hollywood audio/video import using AVCodec plugin
  - LHA compression/decompression (there is a PPC LHA compilation available)
  - Quake 1 frame rate (vs Classic with a PPC and some Voodoo 5 card)
  - network transfer possible to achieve (vs some Zorro III network adapter)
  - disk data transfer rate (FastATA controller, or SCSI controllers vs V500 internal IDE)
  - ...
     
  Also keep in mind that PowerPC on Classic is no longer limited to 603e or 604e - currently you can put a 500 MHz G3 or a 400 MHz G4 CPU there (Sonnet Crescendo 7200).
  Do we have any benchmarks comparing Vampire to the G3/G4?

From experience, my 603e 240 MHz + BVision did not like resolutions higher than 320x240 for MPEG playback; anything higher and playback would be choppy.
I never tried ffmpeg. My A1200T has been dead in the water for many years now because of a broken Zorro IV busboard.

Back in the day, LHA compression/decompression did not work correctly on PPC; it was very unstable or badly coded.

Quake 1 should be faster on PPC, since it has more raw power than a Cyclone 3 can give us at the moment.
I saw it running just last weekend on a CSPPC 233 MHz + cv643d: it ran at 27.5 FPS (average) in demo1.dem at 320x240 with software rendering.

The CSPPC uw3 scsi controller wins easily, it can do 30-35 MB/sec.

Then there is software optimization: Riva is now heavily optimized for the Apollo core, and thus it runs very well on Gold 2 with AMMX.

Back in the day, Amp2 was by far the best MPEG player for PPC-equipped Amigas.

So to get some tests I've started to put the A4000 together again for the first time in many years.
The second CSPPC (060 50 MHz + 604e 233 MHz) I've got lying around here still seems to work. I'm now waiting for a friend to bring my old 10K RPM Cheetah disk; if it still works, it has a full OS3.9 install on it.

Once I have the system up and running I'll try different MPEG players to test.

 



Gunnar von Boehn
(Apollo Team Member)
Posts 6207
20 Dec 2016 10:00


Daniel Sevo wrote:

  The PPC 604e used in Cyberstorm PPC could run up to 240MHz I think. So while not running at its full potential on a Cyberstorm board, the 604e could do 6 instructions / clock cycle.
 

 
No, this is a misconception.
The PowerPC 604 can _NOT_ do 6 instructions per clock.
 
The maximum number of instructions the PowerPC 604 can issue and retire is 4 per cycle.
This is the same number as the Apollo 68080.
So Apollo and the 604 are equal in this regard.
 
Of course, as you know, 68K instructions can include MEMORY OPERATIONs. So 68k instructions are stronger than PowerPC instructions, and you need fewer instructions on 68k to do the same amount of work compared to PowerPC.
 
Another factor to consider is the strength of the memory unit.
The 604 can execute 1 memory access per clock.
Either 1 LOAD or STORE per cycle.
APOLLO 68080, on the other hand, can execute both a LOAD and a STORE per cycle. This means APOLLO provides up to twice the memory throughput of the PowerPC.
 

As we all know, 68k instructions are more powerful than PowerPC instructions.

Let's look, for example, at this simple 68k instruction:
ADDQ.l #1,(SP)
A PowerPC would need 3 instructions to do the equivalent work.
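
To make the instruction-count argument concrete, here is a toy model in Python, not real assembly. It treats memory as a dictionary and counts one step per instruction. The three-step load/add/store sequence is the generic load-store pattern; the PowerPC mnemonics in the comments (lwz, addi, stw) are given only for orientation, and the function names and the address 0x1000 are made up for this example.

```python
# Toy model: incrementing a value in memory on a memory-operand (CISC-style)
# machine versus a load-store (RISC-style) machine. This only counts
# instructions; it says nothing about cycles or clock speed.

def increment_memory_operand(memory, address):
    """One instruction, e.g. 68k: ADDQ.l #1,(SP)."""
    memory[address] += 1
    return 1  # instructions executed


def increment_load_store(memory, address):
    """Three instructions on a load-store machine."""
    register = memory[address]   # load   (PowerPC: lwz)
    register += 1                # add    (PowerPC: addi)
    memory[address] = register   # store  (PowerPC: stw)
    return 3  # instructions executed


stack_a = {0x1000: 41}
stack_b = {0x1000: 41}
count_a = increment_memory_operand(stack_a, 0x1000)
count_b = increment_load_store(stack_b, 0x1000)
print(stack_a[0x1000], count_a)
print(stack_b[0x1000], count_b)
```

Both paths leave the same value in memory; only the number of instructions differs. Whether that translates into a speed difference depends on clock rate and how many instructions each CPU completes per cycle.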
 
 


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
20 Dec 2016 10:24


Regarding Quake: of course, QUAKE is not a topic today, as the FPU is currently disabled in APOLLO.
  But this can change soon.
 
 
Mr-Z EdgeOfPanic wrote:

  Quake 1 well that should be faster on PPC
 

 
 
 
  One could think so, yes, as the PPC is actually strong on floating point.
 
  But there is one major design flaw in the POWERPC which can cause huge problems: the PowerPC can NOT exchange data directly between the INTEGER UNIT and the FPU. The x86 and 68k can do this.
  Quake was designed on the x86 and relies on using the INTEGER unit and the FPU together.
 
  The FPU to INTEGER data exchange comes for free on Apollo, but on PPC it has to be "emulated" at extreme cost.
  Therefore there are certainly FPU algorithms which will in fact run better on the Vampire than on POWERPC.
 
Mr-Z EdgeOfPanic wrote:

  I've seen it running just last weekend on a CSPPC 233Mhz+cv643d, it ran at 27.5 FPS (average) in demo1.dem at 320x240 with software rendering.

See, as I said: the PPC number is actually bad.
A Pentium at 233 MHz reaches about twice the FPS!


Grzegorz Wójcik (pisklak)
(Apollo Team Member)
Posts 87
21 Dec 2016 13:54


I think if we really want to compare CPU speed here, we should compare decoding speed without rendering. Otherwise the Vampire will dust the Zorro solutions. And about Quake... when we have an FPU, we will see. The comparison will be very interesting :-)


Roman S.

Posts 149
21 Dec 2016 16:09


Quake 2 on Amiga with a Sonnet PPC accelerator, software rendering; 640x480 - 14.5 FPS:
 
EXTERNAL LINK 
I seriously doubt the Vampire can beat this. But - so what? The Vampire:
 
- combines a fast CPU, memory and graphics chip in a single, small, power efficient board
- is much more readily available than Amiga PPC accelerators
- is dirt-cheap for what it already offers
- will possibly also gain FPU and digital audio capabilities
 
For me it doesn't have to be the absolute fastest solution. Really :)


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
22 Dec 2016 15:06


Roman S. wrote:

Quake 2 on Amiga with a Sonnet PPC accelerator, software rendering; 640x480 - 14.5 FPS:

Let me challenge this. ;-)

How many FPS can you reach in DOOM with this SONNET at 640x480?
Can you reach more than the Vampire?

I seriously doubt the Sonnet can beat the Vampire at this :)



Szyk Cech

Posts 191
22 Dec 2016 16:20


According to this page: EXTERNAL LINK the fastest Sonnet PPC G4 reaches 1.7 GHz (single core) or 1.3 GHz (dual core). It has its own memory, so it should easily beat the Vampire even in the single-core version. I'm not criticising your effort, just stating the facts.


Roman S.

Posts 149
22 Dec 2016 16:36


Yes, but only the PCI versions can be used in Amiga - which rules out these fast cards.

According to the driver authors, PPC access to graphics memory on a Voodoo or Radeon is rather fast. I'm sure someone will test DOOM sooner or later.


Kymon Erec Zonias

Posts 7
22 Dec 2016 18:18


Szyk Cech wrote:

According to this page: EXTERNAL LINK the fastest Sonnet PPC G4 reaches 1.7 GHz (single core) or 1.3 GHz (dual core). It has its own memory, so it should easily beat the Vampire even in the single-core version. I'm not criticising your effort, just stating the facts.

You can find great G3/G4 CPUs on Sonnet accelerators, but there is only one series
that can be used in a Mediator busboard: the PCI version with local DIMMs.
The biggest versions delivered are the G3/500 with 1M L2 cache and the G4/400 with 1M L2 cache.
The maximum usable RAM on the Sonnet is 192 MB.


Kymon Erec Zonias

Posts 7
22 Dec 2016 21:32


Gunnar von Boehn wrote:

Roman S. wrote:

  Quake 2 on Amiga with a Sonnet PPC accelerator, software rendering; 640x480 - 14.5 FPS:
 

 
  Let me challenge this. ;-)
 
  How many FPS can you reach in DOOM with this SONNET at 640x480?
  Can you reach more than the Vampire?
 
  I seriously doubt the Sonnet can beat the Vampire at this :)
 

Hi Gunnar.
I kindly asked my friend Dennis v.d. Boon who is the creator of the sonnet.library
to benchmark his SonnetPPC G3/500MHz for us.

Here are some numbers (Pics/Clips will follow)

ADoomPPC
(with optimal Resolutions as suggested in the readme)

672x480 = 35.57 fps
800x600 = 26.87 fps
1056x792 = 16.44 fps

As an addition, Dennis also compiled your SortBench for WarpOS.
Here are the SonnetPPC G3/500 results:

1k = 476.08 MB/sec
2k = 476.27 MB/sec
4k = 476.36 MB/sec
8k = 476.37 MB/sec
16k = 422.47 MB/sec
32k = 408.73 MB/sec

Rgrds
Kymon
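
One way to read the ADoomPPC figures above is as pixels drawn per second. The short sketch below does that conversion. It assumes every pixel is written exactly once per frame, which is a simplification, and the function name is made up for this example.

```python
# Pixel throughput implied by the ADoomPPC results quoted above.
# Each entry is (width, height, frames per second) as reported in the post.
results = [
    (672, 480, 35.57),
    (800, 600, 26.87),
    (1056, 792, 16.44),
]


def megapixels_per_second(width, height, fps):
    """Pixels drawn per second, in millions, assuming one write per pixel."""
    return width * height * fps / 1_000_000


for width, height, fps in results:
    rate = megapixels_per_second(width, height, fps)
    print(f"{width}x{height} @ {fps} fps: {rate:.2f} Mpixel/s")
```

Under that assumption the three results work out to roughly 11.5, 12.9 and 13.7 Mpixel/s, so the pixel rate stays in the same range while the frame rate drops with resolution.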




Niclas A
(Apollo Team Member)
Posts 219
22 Dec 2016 23:34


Nice figures Kymon.

I only found this old video from Shk of Doom (800x600) on the Apollo.
EXTERNAL LINK 
It does not look like 27 fps, but a lot has happened in almost a year.


Gregthe Canuck

Posts 274
23 Dec 2016 07:07



Thanks for the sortbench figures. Those are only about 33% faster than the latest figures Gunnar posted. This supports the clock-for-clock performance advantage that the new Apollo core has over the G3 core.

Now we just need someone to provide some Doom figures to see how that compares. Maybe wait until Gold 2 is released?




Mr-Z EdgeOfPanic

Posts 189
23 Dec 2016 08:48


Or ask one of the team members to run a quick Doom test.
I think the Vampire has a good chance to win or come really close.


Kymon Erec Zonias

Posts 7
26 Dec 2016 18:47


Kymon Erec Zonias wrote:

 
    As an addition, Dennis also compiled your SortBench for WarpOS.
    Here are the SonnetPPC G3/500 results:
   
    1k = 476.08 MB/sec
    2k = 476.27 MB/sec
    4k = 476.36 MB/sec
    8k = 476.37 MB/sec
    16k = 422.47 MB/sec
    32k = 408.73 MB/sec
   
    Rgrds
    Kymon
 

 
Dennis contacted me again. While compiling SortBench for the SonnetPPC, which works with the WarpOS environment, he forgot the -O1 argument.
 
The WarpOS optimized SortBench gives the following corrected numbers.
 
    1k = 634.82 MB/sec
    2k = 635.04 MB/sec
    4k = 635.16 MB/sec
    8k = 635.18 MB/sec
    16k = 538.31 MB/sec
    32k = 515.31 MB/sec
 
 
  Bye ;-)
  Kymon
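
For a quick comparison of the two SortBench runs, the sketch below divides the corrected (-O1) figures by the first set. All numbers are copied from the two posts; the variable and function names are made up for this example.

```python
# Speedup of the corrected (-O1) SortBench run over the first run,
# using the SonnetPPC G3/500 figures from the two posts (MB/sec).
first_run = {"1k": 476.08, "2k": 476.27, "4k": 476.36,
             "8k": 476.37, "16k": 422.47, "32k": 408.73}
optimized = {"1k": 634.82, "2k": 635.04, "4k": 635.16,
             "8k": 635.18, "16k": 538.31, "32k": 515.31}


def speedup(before, after):
    """Ratio after/before for every block size present in both runs."""
    return {size: after[size] / before[size] for size in before if size in after}


for size, ratio in speedup(first_run, optimized).items():
    print(f"{size:>3}: {ratio:.2f}x")
```

The ratio comes out at about 1.33x for the block sizes up to 8k and about 1.26x to 1.27x for 16k and 32k.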


Roger Shimada

Posts 30
27 Dec 2016 00:59


Kymon Erec Zonias wrote:
The WarpOS optimized SortBench gives the following corrected numbers.
   
    1k = 634.82 MB/sec
...
    32k = 515.31 MB/sec

From the GitHub page it seems this would have required an A3000 or A4000 and basically a PPC Sidecar.

It would be interesting to know how many (few?) people have such systems.
