

AMMX - Apollo 68080 MMX for AMIGA

Salteadorneo Salteador

Posts 20
16 Aug 2016 09:58


Are the registers 64-bit?
So at most two 32-bit SIMD operations per cycle?
Why MMX and not an improved instruction set of your own?
Wouldn't it be better to have your own vector core that takes the best of the known designs?

Thank you for resurrecting the 68000.


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
16 Aug 2016 12:27


Salteadorneo salteador wrote:

Are the registers 64-bit?
So at most two 32-bit SIMD operations per cycle?
Why MMX and not an improved instruction set of your own?
Wouldn't it be better to have your own vector core that takes the best of the known designs?

Thank you for resurrecting the 68000.

AMMX registers are 64-bit.

AMMX operations are 64-bit.

Why MMX?
Because there are MANY existing code examples for video and JPEG acceleration with MMX.

What is the MMX speedup?
Example 1:
For one saturated 8-bit calculation, the old 68k core needs 3 instructions.

AMMX can do 8 such calculations in 1 instruction.
This means a speedup of 8 x 3 = 24 times for this example.

Video and GFX calculations mostly use 8-bit and/or 16-bit operations.
With 16-bit operations the speedup is at least 4 times.

AMMX can give a significant boost.
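
To make the arithmetic in this example concrete, here is a minimal Python sketch of a packed saturating add on eight unsigned 8-bit lanes held in one 64-bit word. It only models the idea (x86 MMX has a comparable operation, PADDUSB); it does not reproduce actual AMMX instructions or encodings. All helper names (sat_add_u8, pack_u8x8, unpack_u8x8, packed_sat_add_u8x8, instruction_ratio) are invented for illustration, and treating lane 0 as the least significant byte is an assumption of the model, not a statement about the 68080.

```python
LANES = 8                          # eight lanes per 64-bit word (assumed layout)
LANE_BITS = 8
LANE_MAX = (1 << LANE_BITS) - 1    # 255, the largest unsigned 8-bit value


def sat_add_u8(a, b):
    """Scalar saturating add: the sum sticks at 255 instead of wrapping."""
    return min(a + b, LANE_MAX)


def unpack_u8x8(word):
    """Split a 64-bit integer into eight 8-bit lanes (lane 0 = lowest byte)."""
    return [(word >> (LANE_BITS * i)) & LANE_MAX for i in range(LANES)]


def pack_u8x8(lanes):
    """Combine eight 8-bit lanes back into one 64-bit integer."""
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & LANE_MAX) << (LANE_BITS * i)
    return word


def packed_sat_add_u8x8(x, y):
    """Model of one packed operation: eight saturating adds in a single step."""
    sums = [sat_add_u8(a, b) for a, b in zip(unpack_u8x8(x), unpack_u8x8(y))]
    return pack_u8x8(sums)


def instruction_ratio(lanes_per_op, scalar_instructions_per_value):
    """Scalar instructions replaced by one packed instruction."""
    return lanes_per_op * scalar_instructions_per_value


if __name__ == "__main__":
    x = pack_u8x8([250, 10, 0, 255, 128, 200, 1, 100])
    y = pack_u8x8([10, 20, 0, 1, 128, 100, 2, 155])
    print(unpack_u8x8(packed_sat_add_u8x8(x, y)))
    print(instruction_ratio(8, 3))   # 8-bit case from the post
    print(instruction_ratio(4, 1))   # 16-bit case: four lanes, at least 1 instruction each
```

The ratio counts instructions only; real-world speedup also depends on memory bandwidth and on how the data is laid out.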



Salteadorneo Salteador

Posts 20
16 Aug 2016 14:46


Thanks for your kind reply.
 
  I understand that like MMX, AMMX only works with integers.
Why not use 128-bit registers for operations on four 32-bit floating-point values (single precision), as in SSE, the Emotion Engine's VUs, Cell, AltiVec (VMXx, Velocity Engine) ...? Personally I find MMX poor.
I see that, like AltiVec, AMMX uses 32 registers. Could you use a similar instruction set and build your own AltiVec-style vector processor? Or does the Cyclone III used to implement this Vampire not leave enough space for that?
 
  Thanks again for all your work with the 68080.


Johannes Schäfer

Posts 47
16 Aug 2016 16:16


Reply to Salteadorneo Salteador who wrote:
      I understand that like MMX, AMMX only works with integers.
Why not use 128-bit registers for operations on four 32-bit floating-point values (single precision), as in SSE, the Emotion Engine's VUs, Cell, AltiVec (VMXx, Velocity Engine) ...? Personally I find MMX poor.
I see that, like AltiVec, AMMX uses 32 registers. Could you use a similar instruction set and build your own AltiVec-style vector processor? Or does the Cyclone III used to implement this Vampire not leave enough space for that?
   
 
 
  Maybe the roadmap now should be
   
    DevTools
    Add FPU
    Add MMU
    Add S-AGA
    Testing
   
    And finally: Deliver Vampire 500 and 1200 to the masses
   
I don't see the benefit of adding SSE/AltiVec etc. before that. Just my opinion.
   
   


Thierry Atheist

Posts 644
16 Aug 2016 18:01


If MMU ever gets added, I'd like to see it as OPTIONAL, like when people compile their own Linux and include the components that they want into it.

We really are looking at a superior computing platform emerging.... As AMIGA has always allowed for, and is still delivering!

This is as accurate today as it ever was!!!
EXTERNAL LINK 
(If that doesn't stir you, maybe you don't have a pulse.)


Marcus Gerards

Posts 58
16 Aug 2016 19:44


Thierry Atheist wrote:

  We really are looking at a superior computing platform emerging.... As AMIGA has always allowed for, and is still delivering!

Just one question: Are you "cosmos"? :D



Marcus Gerards

Posts 58
16 Aug 2016 19:49


Johannes Schäfer wrote:

   
    Maybe the roadmap now should be
   
    DevTools
    Add FPU
    Add MMU
    Add S-AGA
    Testing
   
    And finally: Deliver Vampire 500 and 1200 to the masses
   
I don't see the benefit of adding SSE/AltiVec etc. before that. Just my opinion.

I could not agree more. But Gunnar wants to do his thing and that is legitimate.

In the end, we may all get what we want.


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
16 Aug 2016 22:06


Johannes Schäfer wrote:

I understand that like MMX, AMMX only works with integers.

correct - and you need INTEGERS for video decoding!

Johannes Schäfer wrote:

Why not use 128-bit registers for operations on four 32-bit floating-point values (single precision), as in SSE,

Because this needs 10 times more FPGA space - and it does not help at all with decoding AVIs or JPEGs.
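
As an illustration of why integer arithmetic is enough for this kind of decoding, here is a minimal Python sketch of the YCbCr-to-RGB step of JPEG decoding done in 16.16 fixed point with a final clamp to 0..255. The coefficients are the standard JFIF ones; the names (fix, clamp_u8, ycbcr_to_rgb) and the choice of 16 fractional bits are assumptions made for this sketch and are not taken from any Apollo or decoder source.

```python
FRAC_BITS = 16                       # fixed-point precision (assumed for this sketch)
ONE_HALF = 1 << (FRAC_BITS - 1)      # added before shifting, for rounding


def fix(coefficient):
    """Convert a real coefficient to a fixed-point integer, once, at setup time."""
    return int(round(coefficient * (1 << FRAC_BITS)))


# Standard JFIF conversion coefficients, stored as integers.
CR_TO_R = fix(1.402)
CB_TO_G = fix(0.344136)
CR_TO_G = fix(0.714136)
CB_TO_B = fix(1.772)


def clamp_u8(value):
    """Saturate to the unsigned 8-bit range instead of wrapping around."""
    if value < 0:
        return 0
    if value > 255:
        return 255
    return value


def ycbcr_to_rgb(y, cb, cr):
    """Per-pixel conversion using only integer multiply, add and shift."""
    cb -= 128
    cr -= 128
    r = y + ((CR_TO_R * cr + ONE_HALF) >> FRAC_BITS)
    g = y - ((CB_TO_G * cb + CR_TO_G * cr + ONE_HALF) >> FRAC_BITS)
    b = y + ((CB_TO_B * cb + ONE_HALF) >> FRAC_BITS)
    return clamp_u8(r), clamp_u8(g), clamp_u8(b)


if __name__ == "__main__":
    print(ycbcr_to_rgb(128, 128, 128))   # mid grey stays grey
    print(ycbcr_to_rgb(255, 128, 255))   # red channel overflows and is clamped
```

The clamp at the end is the same kind of saturated 8-bit calculation discussed earlier in the thread, which is why packed saturating integer operations map well onto this workload.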




Salteadorneo Salteador

Posts 20
17 Aug 2016 10:37


Thanks for the reply. I had already imagined that space in the FPGA would be a problem.

I do not consider this type of feature a priority (full AGA support also seems more important), but I find these questions interesting.

Could you add some floating-point instructions, such as the 21 from 3DNow!? Or add them to the FPU to make a VFPU, as on the PSP? Could two 64-bit registers be grouped to operate as a single 128-bit register if floating-point operations were added? The PSP VFPU groups its 128 32-bit registers into 4x4, 3x3 or 2x2 matrices. Of course, this is for 3D operations.
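
The register-pairing idea in this question comes down to simple bit arithmetic: 2 x 64 bits = 128 bits = 4 x 32-bit floats. A minimal Python sketch of that layout follows. It is purely a model. The function names (floats_to_reg_pair, reg_pair_to_floats, paired_add_f32x4) and the little-endian lane order are assumptions for illustration and say nothing about how the 68080 would or could implement it.

```python
import struct


def floats_to_reg_pair(values):
    """Pack four single-precision floats into two 64-bit integers (lo, hi)."""
    if len(values) != 4:
        raise ValueError("expected exactly four values")
    raw = struct.pack("<4f", *values)     # 4 x 32 bits = 16 bytes
    lo, hi = struct.unpack("<2Q", raw)    # reinterpret as 2 x 64 bits
    return lo, hi


def reg_pair_to_floats(lo, hi):
    """Reinterpret two 64-bit integers as four single-precision floats."""
    return list(struct.unpack("<4f", struct.pack("<2Q", lo, hi)))


def paired_add_f32x4(a, b):
    """Lane-wise add of two register pairs; results are rounded back to
    single precision when they are repacked."""
    lanes_a = reg_pair_to_floats(*a)
    lanes_b = reg_pair_to_floats(*b)
    return floats_to_reg_pair([x + y for x, y in zip(lanes_a, lanes_b)])


if __name__ == "__main__":
    a = floats_to_reg_pair([1.0, 2.5, -3.0, 0.5])
    b = floats_to_reg_pair([0.5, 0.5, 3.0, 4.0])
    print(reg_pair_to_floats(*paired_add_f32x4(a, b)))
```

Storing the data this way is the cheap part. The floating-point arithmetic units that would operate on the lanes are what cost FPGA space, which is the concern raised earlier in the thread.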
 
All this only if space remains in the FPGA, which it probably does not. ;-)
 
  Thanks for everything.


Andrew Copland

Posts 113
18 Aug 2016 09:39


Yes the PSP VFPU is quite lovely, lots of dedicated registers, easy instructions. It'd be great to have.

However, I'd really just like a pipelined register based FPU, not even fully 68882 compatible as we could handle that via libs (probably).


Gregthe Canuck

Posts 274
18 Aug 2016 10:15



I wasn't aware of such a thing as a vector FPU. Very cool.

I have a feeling something like this would need a *lot* of FPGA space so likely to be far down the team's priority list. But at some point this would be a very powerful addition for sure.

Cheers!


Andrew Copland

Posts 113
18 Aug 2016 12:31


VFPU is just SIMD instructions for the FPU, so the equivalent of AltiVec/SSE/etc. in the PPC/PC world. On the PSP it was Sony's extension for the MIPS CPU they had.


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
18 Aug 2016 13:38


Guys..

MMX is designed for VIDEO, and DVD or AVI playback is something which many people want.

MMX is also designed for JPEG decoding, and JPEG decoding is something every web browser does all the time.

So speeding up VIDEO playback and speeding up JPEG decoding = speeding up web browsing, which will benefit all users.

At the same time, MMX does not need that many FPGA resources.
This means the return on investment is high.



Nadyr Nick

Posts 54
18 Aug 2016 17:46


Gunnar von Boehn wrote:

Guys..
 
MMX is designed for VIDEO, and DVD or AVI playback is something which many people want.

MMX is also designed for JPEG decoding, and JPEG decoding is something every web browser does all the time.

So speeding up VIDEO playback and speeding up JPEG decoding = speeding up web browsing, which will benefit all users.

At the same time, MMX does not need that many FPGA resources.
This means the return on investment is high.
 

Hi Gunnar,

This morning I saw the detailed specs of the Arria 10, and they look great. Can you confirm that you plan to use it in the Vampire 4000 (and eventually in a version of the Vampire 1200)? I'd like to buy one or two of these cards, for the A4000 in particular and also one for the A1200.
Thanks, best regards



Thierry Atheist

Posts 644
18 Aug 2016 20:25


nadyr nick wrote:

Hi Gunnar,

This morning I saw the detailed specs of the Arria 10, and they look great. Can you confirm that you plan to use it in the Vampire 4000 (and eventually in a version of the Vampire 1200)? I'd like to buy one or two of these cards, for the A4000 in particular and also one for the A1200.
Thanks, best regards


Forget A1200 and A4000.

The Arria 10 is going to be a STANDALONE motherboard. Doesn't need ANY OTHER equipment to use ALL AMIGA software....

1 Gigabyte of RAM (DDR2 I think).
MORE Data and Instruction cache.
Between 2 and 3 times faster* than ANY Vampire ][ out there!!!

And in the future, will have MORE internal CPU capabilities than the much smaller FPGA that is currently being used.

It will be able to play back MPEG2 video with no frames dropped!!! 720P may even be possible?!!!!

This is what EVERY SINGLE AMIGAN has been waiting for to become available, and there were about 7,000,000 Commodore Amigas sold at one time!!!

* And that's ONLY based on the FPGA speed in MHz. The extra RAM and caches, etc. may make it operate even faster, if used properly, at times!


Daniel Sevo

Posts 299
18 Aug 2016 23:33


Gunnar von Boehn wrote:

    Guys..
   
MMX is designed for VIDEO, and DVD or AVI playback is something which many people want.

MMX is also designed for JPEG decoding, and JPEG decoding is something every web browser does all the time.

So speeding up VIDEO playback and speeding up JPEG decoding = speeding up web browsing, which will benefit all users.

At the same time, MMX does not need that many FPGA resources.
This means the return on investment is high.
   
   

   
    This is the best explanation so far, thanx BigGun.
    It sure makes sense to implement something that is useful yet low on requirements.
   
    I think many of us dream about SSE so that we can start accelerating 3d (Quake III, looking at you) but the truth is, we (outside the project team) don't know how much FPGA space that would occupy. Probably more than what can be crammed into Cyclone III after AGA and FPU support.
   
    So maybe we should pause the hopes for more SIMD stuff until the other stuff is done and/or another FPGA is being used. ;-)
 
  Edit:
  But just for the record, for those who missed it:
  Post from Gunnar himself on May 24th:
 
-------------------------------
"For the next Release we are currently working on the integration of

1) fully pipelined = fast and compatible 80Bit FPU

2) 128 bit SSE compatible SIMD instructions"
-------------------------------
 
So you can assume there were plans, but compromises may have had to be made along the way, as it is an eternal fight against limits (the FPGA technology currently used, priorities, time, etc.).

   


Thierry Atheist

Posts 644
19 Aug 2016 00:58


Can SIMD speed up raytracing? How about file (de)compression? Generating fractals? Encryption? Sorting?


Andrew Copland

Posts 113
19 Aug 2016 11:33


Thierry Atheist wrote:

Can SIMD speed up raytracing? How about file (de)compression? Generating fractals? Encryption? Sorting?

Yes, to all of the above.

It depends on the specific instructions implemented, obviously: if it's all float stuff, then possibly not (en/de)cryption or file (de)compression.


Andrew Copland

Posts 113
19 Aug 2016 11:34


Daniel Sevo wrote:

    Edit:
    But just for the record, for those who missed it:
    Post from Gunnar himself on May 24th:
 
-------------------------------
"For the next Release we are currently working on the integration of

1) fully pipelined = fast and compatible 80Bit FPU

2) 128 bit SSE compatible SIMD instructions"
-------------------------------
   
So you can assume there were plans, but compromises may have had to be made along the way, as it is an eternal fight against limits (the FPGA technology currently used, priorities, time, etc.).

Ah thanks for that, you're right I had totally missed it!
