
Performance and Benchmark Results!

300Mhz Target

Vojin Vidanovic
(Needs Verification)
Posts 1916/ 1
30 Aug 2018 11:29


gregthe canuck wrote:

  I don't think it is realistic to expect that much of a clock speed jump. I believe a 14x core was demoed which is a nice bump up from the 12x on V2. But that isn't guaranteed. *Maybe* if at some point the team does a "black edition" with a higher speed grade of the chip used in the V4 they could maybe go one speed level higher? But as Majsta has noted earlier the speed grading is a bit of a crapshoot.

It's NOT LE-scalable (more space - more speed). I do expect x14 as the standard bulletproof core and x15 as a kind of overclock. A bit more toward the end of its life, plus a lot of new features, which are as important, if not more.



Gregthe Canuck

Posts 274
30 Aug 2018 13:30


Vojin Vidanovic wrote:

  It's NOT LE-scalable (more space - more speed).

 
Hi Vojin!

You missed my point. It is scalable when you can take advantage of the extra LEs for bigger caches, more comprehensive optimizations, extra pipelines/units, more branch prediction logic, etc.
 
 


Vojin Vidanovic
(Needs Verification)
Posts 1916/ 1
30 Aug 2018 13:54


gregthe canuck wrote:

  You missed my point. It is scalable when you can take advantage of the extra LEs for bigger caches, more comprehensive optimizations, extra pipelines/units, more branch prediction logic, etc.

That is exactly the difference between V4 and V2 (more cache, a slightly higher clock, faster RAM, full ApolloFPU). But that is NOT an infinite source of speed increase toward the named 300MHz target (and whose 300MHz? Pentium Pro/II/III 300MHz levels?). Feature-wise it's really a modern CPU, and our current FPGA is a blessing and a curse. A curse, since you cannot increase speed as easily as with higher-clocked CPUs.


Gregthe Canuck

Posts 274
30 Aug 2018 14:23


I never said it was an infinite source of speed. My point was and still is that having more LE (with same clock speed and same RAM speed) allows for a faster core. That is all.


Mr Niding

Posts 459
30 Aug 2018 14:31


Greg;

If I read you right:

We shouldn't focus on MHz, as there is a lot of performance to be gained through smart design and code.

I was watching Hardware Unboxed the other day, and they ran Ryzen vs Intel tests on Linux and Windows.
Linux blew Windows out of the water on most tests.


Vojin Vidanovic
(Needs Verification)
Posts 1916/ 1
30 Aug 2018 17:16


gregthe canuck wrote:

I never said it was an infinite source of speed. My point was and still is that having more LE (with same clock speed and same RAM speed) allows for a faster core. That is all.

Don't take it personally, people have cried for MHz all the time. The increase in speed on the Vamps with each core is nice and steady, so one should just be patient. And buy a V4.


Jm 68k

Posts 2
30 Aug 2018 23:48


32 cycles @ 8MHz = 4 µs
0.5 cycle @ 85 MHz = 5.88 ns
4 µs / 5.88 ns = 680
680 x 8 MHz = 5440 MHz = 68000 @ 5.4GHz!!!

Correct Gunnar?


Sean Sk

Posts 488
31 Aug 2018 00:54


An easier way to work it out is:

68000 = 32 cycles for the instruction
68080 = 0.5 cycles for the instruction

32 / 0.5 = 64

That means the 68080 could perform the same instruction 64 times in the time it takes the 68000 to do it just once!

85 MHz x 64 = 5440 MHz <--- The speed at which the 68000 would have to run to perform the instruction in the same amount of time.
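
Both calculations above can be checked with a small Python sketch. The function names are made up for illustration; the cycle counts (32 and 0.5) and clocks (8 MHz and 85 MHz) are simply the figures quoted in the two posts, not measured values.

```python
def instruction_time_ns(cycles, clock_mhz):
    """Wall-clock time of one instruction in nanoseconds."""
    return cycles / clock_mhz * 1000.0


def equivalent_clock_mhz(slow_cycles, fast_cycles, fast_clock_mhz):
    """Clock the slower CPU would need to finish the same instruction
    in the same time as the faster CPU."""
    return slow_cycles / fast_cycles * fast_clock_mhz


# Figures quoted in the thread (assumptions, not measurements).
t_68000 = instruction_time_ns(32, 8)     # 68000 at 8 MHz
t_68080 = instruction_time_ns(0.5, 85)   # 68080 at 85 MHz

print(f"68000: {t_68000:.0f} ns, 68080: {t_68080:.2f} ns")
print(f"time ratio: {t_68000 / t_68080:.0f}")
print(f"equivalent 68000 clock: {equivalent_clock_mhz(32, 0.5, 85):.0f} MHz")
```

Both routes give the same 5440 MHz figure, because the time ratio (680) times 8 MHz is the same product as the cycle ratio (64) times 85 MHz.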


Peter Slegg

Posts 22
31 Aug 2018 13:21


How would that compare with a 68060 at 50MHz ?


Don Adan

Posts 38
31 Aug 2018 14:12


jm 68k wrote:

  32 cycles @ 8MHz = 4 µs
  0.5 cycle @ 85 MHz = 5.88 ns
  4 µs / 5.88 ns = 680
  680 x 8 MHz = 5440 MHz = 68000 @ 5.4GHz!!!
 
  Correct Gunnar?
 

No. The fastest 68000 instruction needs 4 cycles. The fastest 68080 instruction needs 0.5 cycle. Comparing a single instruction does not make much sense. E.g. mulu.w needs 70 cycles on the 68000 and perhaps 2 (?) cycles on the 68080. Divu.w needs 140 cycles on the 68000 and perhaps 19 (?) or 27 (?) cycles on the 68080.
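
To see how much the ratio depends on which instruction is picked, here is a short Python sketch using only the cycle counts from the post above. The 68080 figures for mulu.w and divu.w are the poster's own guesses (marked "(?)"), so the results are illustrative only; `cycle_ratio` and `CASES` are hypothetical names.

```python
def cycle_ratio(cycles_68000, cycles_68080):
    """How many times fewer cycles the 68080 needs for one instruction."""
    return cycles_68000 / cycles_68080


# (68000 cycles, 68080 cycles) as quoted in the post; 68080 values
# for mulu.w and divu.w are unconfirmed guesses.
CASES = {
    "fastest instruction": (4, 0.5),
    "mulu.w": (70, 2),
    "divu.w (19-cycle guess)": (140, 19),
    "divu.w (27-cycle guess)": (140, 27),
}

for name, (c000, c080) in CASES.items():
    print(f"{name}: {cycle_ratio(c000, c080):.1f}x fewer cycles")
```

The ratio swings from roughly 5x to 35x depending on the instruction, which is why a single-instruction comparison says little about overall speed.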


Don Adan

Posts 38
31 Aug 2018 14:16


Peter Slegg wrote:

  How would that compare with a 68060 at 50MHz ?
 

For most code it is the same speed at the same CPU clock, but the 68080 can execute more instructions in 0.5 cycle. So it is somewhat faster if only the original 68k instruction set is used. Of course, a big difference exists for all 68k instructions that are trapped on the 68060, e.g. movep.


A1200 Coder

Posts 74
31 Aug 2018 16:51


Well, if you look at the MC68060 manual, an addi.l #data,(d16,An) costs 2 clock cycles on a 68060.

So the same instruction would be 4 times faster on the Vampire, and of course you get an additional speedup from the Vampire's higher clock speed.
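
The two effects multiply, which a minimal Python sketch makes explicit. It assumes the 0.5-cycle figure implied by "4 times faster", a 50 MHz 68060 (as in the question above) and an 85 MHz Vampire clock (the figure used earlier in the thread); `relative_speedup` is a hypothetical helper name.

```python
def relative_speedup(cycles_old, cycles_new, clock_old_mhz, clock_new_mhz):
    """Speedup for one instruction, combining cycle count and clock rate."""
    time_old = cycles_old / clock_old_mhz   # microseconds per instruction
    time_new = cycles_new / clock_new_mhz
    return time_old / time_new


# Same clock: only the cycle count matters.
print(f"{relative_speedup(2, 0.5, 50, 50):.1f}x at the same clock")

# Assumed clocks: 68060 at 50 MHz, 68080 at 85 MHz.
print(f"{relative_speedup(2, 0.5, 50, 85):.1f}x with the higher clock")
```

This only covers the one addi.l case; real code mixes many instructions, so the overall gain will differ.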


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
01 Sep 2018 07:33


Peter Slegg wrote:

How would that compare with a 68060 at 50MHz ?

The 68K family offers a large number of instructions.
And the 68K family supports a wealth of powerful addressing modes.
This makes the 68K so versatile and powerful.
An instruction on the 68000 can be 2, 4, 6, 8 or 10 bytes long. From the 68020 onward, even longer instructions are supported.

The large number of instructions makes comparing CPUs complex.

The second-fastest 68K CPU is the 68060.
The 68060 can sustain a peak of 2 instructions per cycle.
This is very good.
But the 68060 can do this only if both instructions are very short, just 2 bytes long each.

The fastest 68K CPU is the 68080.
The 68080 is designed very similarly to the 68060, but it adds a huge number of improvements to the 68060 design.
For some code the 68080 can reach a peak of 4 instructions per cycle.
The 68080 can execute several instructions per cycle for 2-byte, 4-byte and 6-byte instructions, and even two 8-byte instructions.

The 68080 supports 3 simultaneous data-cache operations per cycle (READ/WRITE/REFILL), and it handles misaligned cache accesses at no extra cycle cost.

Typically the 68080 is about 50% faster than a 68060 at the same clock rate. If you use new features like AMMX, it of course becomes much faster.


John Heritage

Posts 111
02 Sep 2018 16:07


Gunnar - where do I find a list of the byte lengths for each 68K instruction?




Philippe Flype
(Apollo Team Member)
Posts 299
02 Sep 2018 22:06


@John
 
It depends not only on the instruction, but also on the effective address mode and the length of the immediate operands.
 
D5 is much shorter than ([$1234,A2,D5.l*2],$5678),D7
 
#XXX.W is shorter than #XXX.L, same for .L/S/D/X FPU inputs.
 

For the instructions themselves, there are some tables at the end of the official Programmer's Reference Manual (starting from page 561/646):

EXTERNAL LINK 
;)
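
As a rough illustration of how the operand forms drive instruction length, here is a simplified Python sketch. It assumes an integer instruction with a single 2-byte opcode word and 16-bit displacements; the table, its keys and `instruction_length` are made up for this example and cover only a handful of addressing modes, so the manual's tables remain the reference.

```python
# Bytes of extension data that follow the 2-byte opcode word for a few
# operand forms. Simplified illustration, not a complete 68k encoder.
EXTENSION_BYTES = {
    "Dn": 0,                   # data register direct, e.g. D5
    "(d16,An)": 2,             # 16-bit displacement
    "(xxx).L": 4,              # absolute long address
    "#imm.W": 2,               # 16-bit immediate
    "#imm.L": 4,               # 32-bit immediate
    "([bd.W,An,Xn],od.W)": 6,  # full extension word + 16-bit base and outer displacements
}


def instruction_length(*operands):
    """Length in bytes, assuming a single opcode word plus operand extensions."""
    return 2 + sum(EXTENSION_BYTES[op] for op in operands)


print(instruction_length("Dn", "Dn"))                   # register to register
print(instruction_length("([bd.W,An,Xn],od.W)", "Dn"))  # memory indirect source
print(instruction_length("#imm.L", "(d16,An)"))         # e.g. addi.l #data,(d16,An)
print(instruction_length("#imm.L", "(xxx).L"))          # long immediate to absolute long
```

Instructions with a second opcode or command word (for example MULU.L or FPU operations) start from 4 bytes instead of 2, which this sketch does not model.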


Philippe Flype
(Apollo Team Member)
Posts 299
02 Sep 2018 22:15


It has been a long time since I last put my hands on MiniBench.

Since the core got an FPU and some features such as OoO, the tool needed some refurbishment.

Below is a MIPS, MFLOPS, MB/SEC battle between

  - MC68060 @ 50MHz,
  - MC68060 @ 80MHz,
  - AC68080 @ 78MHz.

EXTERNAL LINK 



Nixus Minimax

Posts 416
03 Sep 2018 08:21


Philippe Flype wrote:

  Below is a MIPS, MFLOPS, MB/SEC battle between
 
  - MC68060 @ 50MHz,
  - MC68060 @ 80MHz,
  - AC68080 @ 78MHz.

These numbers are truly impressive! The only surprise was the relatively weak MULU value. How come the 080 is markedly slower in this discipline than the 060?



Gunnar von Boehn
(Apollo Team Member)
Posts 6207
03 Sep 2018 11:36


OK, let's look at the summary results

  Amiga 4000 + Cyberstorm060MK-I @ 50 MHz
  ----------------------------------
  | CPU SCORE :   54  MIPS
  | FPU SCORE :   19  MFLOPS
  | MEM SCORE :   43  MB/Sec
  ----------------------------------
  | ALL SCORE :  116  Points


  Amiga 1200 + Apollo 1260 @ 80 MHz
  ----------------------------------
  | CPU SCORE :   86  MIPS
  | FPU SCORE :   27  MFLOPS
  | MEM SCORE :   58  MB/Sec
  ----------------------------------
  | ALL SCORE :  171  Points


  APOLLO AC 68080 @ 78 MHz
  ----------------------------------
  | CPU SCORE :  114  MIPS
  | FPU SCORE :   73  MFLOPS
  | MEM SCORE :  224  MB/Sec
  ----------------------------------
  | ALL SCORE :  411  Points
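
The summary can also be normalised per MHz to compare the cores at equal clock. The Python sketch below uses only the scores posted above; the dictionary layout and helper names are made up for illustration, and per-MHz scaling is a simplification (memory scores in particular depend on the board, not just the CPU clock).

```python
# Scores copied from the summary above; "mhz" is the CPU clock of each setup.
results = {
    "68060 @ 50 MHz": {"mhz": 50, "cpu": 54, "fpu": 19, "mem": 43, "all": 116},
    "68060 @ 80 MHz": {"mhz": 80, "cpu": 86, "fpu": 27, "mem": 58, "all": 171},
    "68080 @ 78 MHz": {"mhz": 78, "cpu": 114, "fpu": 73, "mem": 224, "all": 411},
}


def component_sum(r):
    """ALL SCORE is the plain sum of the three component scores."""
    return r["cpu"] + r["fpu"] + r["mem"]


def per_mhz(r, key):
    """Score divided by clock, a rough same-clock comparison."""
    return r[key] / r["mhz"]


for name, r in results.items():
    assert component_sum(r) == r["all"]
    print(f"{name}: {per_mhz(r, 'cpu'):.2f} MIPS/MHz, "
          f"{per_mhz(r, 'fpu'):.2f} MFLOPS/MHz")
```

The two 68060 setups land at nearly the same MIPS/MHz, so the CPU score scales with clock on the 060, while the 68080 scores noticeably higher per MHz.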




Henryk Richter
(Apollo Team Member)
Posts 128/ 1
03 Sep 2018 13:20


Nixus Minimax wrote:

  These numbers are truly impressive! The only surprise was the relatively weak MULU value. How come the 080 is markedly slower in this discipline than the 060?

Probably because the test didn't involve mulu.l d0,d1:d2 <insert grinning emoticon here>. Seriously though, it might warrant a look.


Philippe Flype
(Apollo Team Member)
Posts 299
03 Sep 2018 17:13


About MULU, the test is a dumb "MULU.L Dx,Dy"

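REPT 8              ; assembler directive: repeat the block 8 times (32 MULUs in total)
  MULU.l d1,d5      ; 32*32 -> 32, register to register
  MULU.l d1,d4
  MULU.l d1,d2
  MULU.l d1,d3
ENDR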

On the 060, the 32-bit MULU.L is very fast because Motorola removed the 64-bit support from the hardware.
They moved the 64-bit operation into the 68060 software package (ISP, a software emulation library).

So this is very slow on the 060 :


  MULU.L <ea>,Dr:Dq 32*32 -> 64
  MULS.L <ea>,Dr:Dq 32*32 -> 64

On the 080, both the 32-bit and 64-bit variants are in hardware, no software emulation involved.


  MULU.L <ea>,Dn 32*32 -> 32
  MULS.L <ea>,Dn 32*32 -> 32
  MULU.L <ea>,Dr:Dq 32*32 -> 64
  MULS.L <ea>,Dr:Dq 32*32 -> 64

Side notes:
If MiniBench ran a 64-bit MULU, the 060 would be very slow in comparison (hence, unfair).
If the 32-bit MULU test were run with the operand in memory, the Vamp with its FastRAM would probably be faster.
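
For readers unfamiliar with the two MULU.L forms, this Python sketch models their arithmetic only, not their timing. The function names are invented for illustration, and it assumes the usual convention that the first register of the pair receives the high 32 bits.

```python
MASK32 = 0xFFFFFFFF


def mulu_l_32(src, dst):
    """Model of MULU.L <ea>,Dn: keep only the low 32 bits of the product."""
    return (src * dst) & MASK32


def mulu_l_64(src, dst):
    """Model of MULU.L <ea>,Dr:Dq: full 64-bit product as (high, low) halves."""
    product = (src & MASK32) * (dst & MASK32)
    return product >> 32, product & MASK32


a, b = 0x00010000, 0x00010000        # 65536 * 65536 = 2**32
print(hex(mulu_l_32(a, b)))          # low half only, the carry out is lost
high, low = mulu_l_64(a, b)
print(hex(high), hex(low))           # the 64-bit form keeps the upper half
```

The 32-bit form silently drops the upper half, which is the part the 060 has to compute in software when the 64-bit form is requested.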


