



Performance and Benchmark Results!

POVRay - Come to Fish Together

Renaud Schweingruber

Posts 247
23 Jun 2018 17:18



 
 
Vampire FPGA 80k LE with big fat cache, x13 speed

 
   
Run with these parameters:
povray31fpu +A +W720 +H486 +Ipovray31:scenes/advanced/fish13/fish13.pov +Lpovray31:scenes/advanced/fish13 +Oram:test.tga
   
 
Can anyone run it on other Motorola CPUs?


Gunnar von Boehn
(Apollo Team Member)
Posts 3526
24 Jun 2018 18:42


Tuko, very nice result!

I took the liberty of removing your emulation score and would propose that we focus on comparing real 68K CPUs.


Anderson Santos

Posts 1
26 Jun 2018 15:41



Results for a Blizzard 1260 at 50MHz, using an 060 build of povray31:
 
  ----------------------------------------------------------------------------
  Pixels:          350640  Samples:          841040  Smpls/Pxl: 2.40
  Rays:          1906080  Saved:                4  Max Level: 5/5
  ----------------------------------------------------------------------------
  Ray->Shape Intersection          Tests      Succeeded  Percentage
  ----------------------------------------------------------------------------
  Box                            3858829        2021269    52.38
  Cone/Cylinder                  146871          19321    13.16
  CSG Intersection              2641040          363189    13.75
  CSG Union                      1301867          274165    21.06
  Plane                          2056369        1286824    62.58
  Quadric                        3834218        1046256    27.29
  Sphere                        35901752        8462559    23.57
  Clipping Object                229128          217392    94.88
  Bounding Box                  34884678        19654541    56.34
  Light Buffer                  54814093        31929341    58.25
  Vista Buffer                  15593114        11463479    73.52
  ----------------------------------------------------------------------------
  Calls to Noise:            799182  Calls to DNoise:        2735506
  ----------------------------------------------------------------------------
  Shadow Ray Tests:          6262714  Succeeded:              963540
  Reflected Rays:            1049312  Total Internal:              66
  Refracted Rays:              15557
  Transmitted Rays:              171
  ----------------------------------------------------------------------------
  Time For Parse:    0 hours  0 minutes  3.0 seconds (3 seconds)
  Time For Trace:    1 hours 18 minutes  6.0 seconds (4686 seconds)
  Total Time:    1 hours 18 minutes  9.0 seconds (4689 seconds)



Andy Hearn

Posts 165
26 Jun 2018 17:00


Nice one.
I doubt there'd be much difference time-wise if I did a run with my machine, but the more I think about it, the more I want to, just to see ;)
If I get a free bit of evening time, I'll see if I can do it.


Mallagan Bellator

Posts 375
26 Jun 2018 17:14


I might do this one too


Andy Hearn

Posts 165
27 Jun 2018 09:17


Results for A3k+CyberstormMk3 060@50MHz, using the FPU version of Povray off Coffin R51 with the same command line:
 
  ----------------------------------------------------------------------------
  Pixels:          350640  Samples:          841040  Smpls/Pxl: 2.40
  Rays:          1906080  Saved:                4  Max Level: 5/5
  ----------------------------------------------------------------------------
  Ray->Shape Intersection          Tests      Succeeded  Percentage
  ----------------------------------------------------------------------------
  Box                            3858829        2021269    52.38
  Cone/Cylinder                  146871          19321    13.16
  CSG Intersection              2641040          363189    13.75
  CSG Union                      1301867          274165    21.06
  Plane                          2056369        1286824    62.58
  Quadric                        3834218        1046256    27.29
  Sphere                        35901752        8461842    23.57
  Clipping Object                229128          217392    94.88
  Bounding Box                  34988796        19656929    56.18
  Light Buffer                  55348401        31877029    57.59
  Vista Buffer                  15621145        11460670    73.37
  ----------------------------------------------------------------------------
  Calls to Noise:            799182  Calls to DNoise:        2735506
  ----------------------------------------------------------------------------
  Shadow Ray Tests:          6262714  Succeeded:              963540
  Reflected Rays:            1049312  Total Internal:              66
  Refracted Rays:              15557
  Transmitted Rays:              171
  ----------------------------------------------------------------------------
  Time For Parse:    0 hours  0 minutes  3.0 seconds (3 seconds)
  Time For Trace:    1 hours  8 minutes  30.0 seconds (4110 seconds)
      Total Time:    1 hours  8 minutes  33.0 seconds (4113 seconds)
 
For a bit there, when it was about two thirds to three quarters done, I really thought I was going to crack the hour. I guess those water calcs really take it up a notch. Puzzled over the difference with the Blizz card, though.
 


Vojin Vidanovic

Posts 697
27 Jun 2018 10:05


Renaud Schweingruber wrote:

      Vampire FPGA 80k LE with big fat cache, x13 speed
   

   
Is this a V4 test case? It's nice to see the 90MHz 080 is 2-3x faster than the 50MHz 060 CPUs ...


Gregthe Canuck

Posts 269
27 Jun 2018 10:39


It has to be V4.  There aren't 80K LE on the V2. :)


Stefano Briccolani

Posts 219
27 Jun 2018 12:31


But some versions of the Cyclone 3 have more logic elements than the Cyclones used in the V2, so it isn't clear whether this is the performance of a V4 or something else.


Andy Hearn

Posts 165
27 Jun 2018 14:09


Hmmm. So an 080@92MHz takes 32 minutes and an 060@50MHz takes 68 minutes. Not quite twice as fast (clock speed) for just under half as long (time).

So looking at these basic numbers there seems to be a slight advantage for the Vampire on calcs per cycle, maybe 10% or so. The main advantage the Vampire has is its clock speed.

I reckon an 060 running at 100MHz with some really fast RAM would bring the fight back to "real hardware". But let's face it, how long are you going to want to run your machine at that level, with the potential power supply, cooling and "20+ year old hardware" issues? And even then it probably wouldn't beat an FPGA. And that's with non-AMMX-optimized code.

Will see if I can nail together an 040 machine tonight and run another Povray bench.
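
The per-cycle comparison above can be worked through as a rough sketch. It assumes that run time multiplied by clock frequency is a fair proxy for cycles spent (ignoring memory speed and cache effects) and uses the rounded figures quoted in the post: 32 minutes at 92MHz and 68 minutes at 50MHz. The function and variable names are made up for illustration.

```python
def cycles_spent(minutes: float, mhz: float) -> float:
    """Approximate clock cycles used for a run, in millions of cycles."""
    return minutes * 60 * mhz


c080 = cycles_spent(32, 92)  # Vampire 080 @ 92MHz, figure quoted in the post
c060 = cycles_spent(68, 50)  # 68060 @ 50MHz, figure quoted in the post

wall_clock_speedup = 68 / 32       # how much sooner the 080 finishes
clock_ratio = 92 / 50              # how much faster the 080 is clocked
per_clock_advantage = c060 / c080  # cycles the 060 needs per cycle the 080 needs

print(f"wall-clock speedup : {wall_clock_speedup:.2f}x")
print(f"clock ratio        : {clock_ratio:.2f}x")
print(f"per-clock advantage: {per_clock_advantage:.2f}x")
```

On these rounded numbers the 080 comes out at roughly 1.15x per clock, a little above the "maybe 10%" estimate, which still supports the point that most of the gain comes from clock speed.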


Saladriel Amrael

Posts 93
27 Jun 2018 16:20


Andy Hearn wrote:

[...]
 
  I reckon an 060 running at 100MHz with some really fast RAM would bring the fight back to "real hardware". But let's face it, how long are you going to want to run your machine at that level, with the potential power supply, cooling and "20+ year old hardware" issues? And even then it probably wouldn't beat an FPGA. And that's with non-AMMX-optimized code.

  Will see if I can nail together an 040 machine tonight and run another Povray bench.

Exactly. Going by what Gunnar has said, a version specifically optimized for the 080 FPU would be much faster even without using AMMX.




Andy Hearn

Posts 165
28 Jun 2018 21:28


Results for A3k+A3640 040@25MHz, again using the fpu version of Povray off Coffin R51 with the same command line
       
     
        ----------------------------------------------------------------------------
        (Edit - snipped as all the same as before)
        ----------------------------------------------------------------------------
        Time For Parse:    0 hours  0 minutes  18.0 seconds (18 seconds)
        Time For Trace:    4 hours 52 minutes  57.0 seconds (17577 seconds)
            Total Time:    4 hours 53 minutes  15.0 seconds (17595 seconds)
     

      Over 4 times slower than an 060? WTH? I'm re-running this (overnight) on the onboard 030+882@25 to see. I guess an 040 really wants some local RAM.
   


Saladriel Amrael

Posts 93
29 Jun 2018 13:52


AFAIK the 060 FPU is faster per clock than the 040's, so that does not surprise me.


Andy Hearn

Posts 165
29 Jun 2018 21:07


Results for A3000 stock 030+882@25MHz, fpu Povray Coffin R51 same args
 
       
              ----------------------------------------------------------------------------
              (Edit - snipped as blah blah blah)
              ----------------------------------------------------------------------------
              Time For Parse:    0 hours  0 minutes  19.0 seconds (19 seconds)
              Time For Trace:  10 hours 34 minutes  2.0 seconds (38042 seconds)
                  Total Time:  10 hours 34 minutes  21.0 seconds (38061 seconds)

    Yeah OK, so that was the 040 and not the 030 then. The 030's still not too shabby at only about double the time at the same clock.
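
To put the runs posted so far side by side, here is a small sketch that takes the total times in seconds from the result blocks above and expresses each one relative to the CyberstormMk3 060 run. The dictionary labels are shorthand chosen for this example, not names from the POV-Ray output.

```python
# Total render times in seconds, copied from the "Total Time" lines above.
totals = {
    "Blizzard 1260 060@50": 4689,
    "CyberstormMk3 060@50": 4113,
    "A3640 040@25": 17595,
    "A3000 030+882@25": 38061,
}

baseline = totals["CyberstormMk3 060@50"]

# Slowdown factor of each machine relative to the fastest 060 run.
relative = {name: secs / baseline for name, secs in totals.items()}

for name, factor in sorted(relative.items(), key=lambda kv: kv[1]):
    print(f"{name:<22} {totals[name]:>6} s  {factor:5.2f}x")

# 030 versus 040 at the same 25MHz clock.
ratio_030_vs_040 = totals["A3000 030+882@25"] / totals["A3640 040@25"]
print(f"030+882 vs 040 at 25MHz: {ratio_030_vs_040:.2f}x")
```

This gives about 4.3x for the A3640 and about 9.3x for the stock 030+882 against the 060, with the 030 taking roughly 2.2 times as long as the 040 at the same clock.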


Andy Hearn

Posts 165
03 Jul 2018 09:57


Results for an A1200 with Blizz040@40, running the 3.9 install taken from my A3k060 before I put a new R51-imaged CF card in the A3k. Povray was copied off the R51 setup to the OS3.9 CF card.

----------------------------------------------------------------------------
Time For Parse:    0 hours  0 minutes  11.0 seconds (11 seconds)
Time For Trace:    5 hours 28 minutes  33.0 seconds (19713 seconds)
    Total Time:    5 hours 28 minutes  44.0 seconds (19724 seconds)

OK, clearly something is wrong there. I think I screwed up: wrong FPU libraries or something... That might explain the A3640 results as well.


Saladriel Amrael

Posts 93
03 Jul 2018 13:28


Surely an 040@40 being slower than an 040@25 is strange.
  Are you sure the A3640 is 25MHz and not 50MHz? Your result seems to be in line with a 50MHz 040, looking at all the others.

 
 


Andy Hearn

Posts 165
03 Jul 2018 13:37


Absolutely :D
That's why I think the installation for the A1200 has some 060 libraries or something that is causing problems.

First I'm going to make a fresh OS3.9 install with no 060 libraries and see what that does. Then maybe I'll try to duplicate the Coffin R51 system drive from the A3000 onto a CompactFlash card for the A1200. I tried plugging the R51 CompactFlash card the A3000 runs from directly into the A1200, but it just kept stalling with a software failure at SetPatch. Hence trying anything else that would boot.

Mainly because I'm lazy, but also because I wanted to replicate the same software environment to generate good benchmarks.

I also need to get my PPC card to boot, as that has a 25MHz 040 on it. I got the insert-floppy screen last night, but nothing much beyond that.


Andy Hearn

Posts 165
04 Jul 2018 11:08


A1200 rev 1D4, 3.1 ROMs, Blizzard 1240 @40MHz, 128MB on a single 72-pin stick.
Built a fresh OS3.1 install on a 4GB CF card and upgraded to 3.9. No additional BoingBags. Copied the Povray directory from the Coffin OS straight to the fresh CF card and ran the bench.

----------------------------------------------------------------------------
Time For Parse:    0 hours  0 minutes  8.0 seconds (8 seconds)
Time For Trace:    5 hours 28 minutes  32.0 seconds (19712 seconds)
    Total Time:    5 hours 28 minutes  40.0 seconds (19720 seconds)

 
3 seconds faster for the parse, which makes sense, but 1 second faster for the trace? That's within the margin of error, so at least the hardware is consistent across the two software environments/libraries etc.
What I'm having trouble with is that I don't believe a Blizzard 1240@40 is half an hour slower than an A3640@25. I'm going to have to pull my A3k apart again if I can't get the PPC to run a bench @25... #HeadInHands
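
A quick sketch of why the Blizzard 1240 numbers look off, using the total times posted above. The "naive" estimate assumes render time scales inversely with clock speed for the same CPU type, which is an assumption made only for this illustration; memory speed and board design would also play a part.

```python
a3640_25mhz_total = 17595             # seconds, A3k + A3640 040@25MHz
blizzard_40mhz_totals = [19724, 19720]  # seconds, the two A1200 Blizzard 1240 runs

# How far apart the two Blizzard runs are from each other.
run_to_run_spread = max(blizzard_40mhz_totals) - min(blizzard_40mhz_totals)

# How much slower the better Blizzard run is than the A3640 run.
gap_seconds = min(blizzard_40mhz_totals) - a3640_25mhz_total
gap_minutes, gap_rest = divmod(gap_seconds, 60)

# Naive expectation: same CPU family at 40MHz instead of 25MHz.
naive_40mhz_estimate = a3640_25mhz_total * 25 / 40

print(f"run-to-run spread : {run_to_run_spread} s")
print(f"gap to A3640@25   : {gap_minutes} min {gap_rest} s")
print(f"naive 40MHz guess : {naive_40mhz_estimate:.0f} s")
```

The two Blizzard runs agree to within 4 seconds, yet both are about 35 minutes slower than the 25MHz A3640 run, where simple clock scaling would suggest a total closer to 11000 seconds.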


Gunnar von Boehn
(Apollo Team Member)
Posts 3526
03 Aug 2018 18:56


APOLLO CORE got faster.

The new GOLD 2.11 release candidate fishes much faster




Andrew Miller

Posts 142
03 Aug 2018 22:35


A 20%+ increase in speed, nice :)
