
Welcome to the Apollo Forum




Performance and Benchmark Results!

3D Performance on Vampire with OpenGL

Ozzy Boshi

Posts 31
15 Jun 2020 11:34


Hello vampires

Today it's my day off and I spent the whole morning playing with my vampire.
By playing I don't mean playing games but seeing what I can do with my V600. In the end I came up with a little C program that draws some rodoneas (rose curves) in a Coffin window.
I took the math equations from here

EXTERNAL LINK 
and I ended up representing in memory almost all the points of these roses:

https://it.wikipedia.org/wiki/Rodonea#/media/File:Rose-rhodonea-curve-7x7-chart.svg

Actually this post is not about showing off my clumsy attempts to do something good with a Vampire but to ask you how you would paint this stuff.

In my tests I used some magic inside a file called libGl.a that artzi gave me some time ago (he is very well known in the community for his Amiga ports, so you probably know him).

All the code was compiled with bebbo's gcc using fastmath and fpu enabling switches.

Now, in OpenGL I found out there is a function called glutTimerFunc(...). Its first parameter, as far as I understand, is the time to wait before executing a function, so calling glutTimerFunc again from its own callback, together with glutPostRedisplay(), should yield something like an animation.

In addition to this I used the libsdl playmusic function to have some nice background music.
The thing is, when I give glutTimerFunc's first parameter a value below roughly 100 ms, the music disappears.
This is true with flowers made of 720 points, so in other words it seems the Vampire cannot handle moving 720 points quickly while playing music.

So, in the end what I am asking the people in this forum is... how would you quickly draw a rodonea (or, more generally, 2D images rotating in a 3D space) to the screen on a Vampire? Are there alternatives to using an OpenGL implementation?
Maybe it's only my code that is messy and badly written and could be optimized, I don't know. If you are curious about this you can download the whole zip

EXTERNAL LINK 
Inside there is my C code, some libs to make it run on Amiga and an m68k executable. If you have Coffin you should just unzip it and double-click on the executable.

I know that it would be way better to rewrite all this stuff in asm but I am not very fluent in m68k asm and it would take me a lot of time.
On the other hand I find OpenGL (legacy OpenGL in this case) very elegant and well documented; furthermore there are plenty of examples and tutorials on the internet.

I know MiniGL exists for the Amiga, but I don't know how to use it. Actually I don't even know how arti compiled this libGL.a file, but it works pretty well. I assume it's just a redirection to gl*.library, and I really don't know what is inside that. Probably someone wrote a software implementation of legacy OpenGL for Amiga, but I don't know if this is somehow related to the MiniGL stored on Aminet.
I'd like to hear your comments about my frenzy.
Greets from Italy



Denis Markovic

Posts 4
21 Jun 2020 23:49


Hi Ozzy,

nice background music. In general, for just drawing single points to the screen I would probably not use OpenGL or any library at all but just hard-code everything directly on the Amiga (i.e. do all the 2D -> 3D -> 2D calculations myself and then just set the pixels directly in the display buffer, maybe with double buffering).

The reason for this is that I am not sure how efficient the OpenGL implementation is on classic Amigas (and for me a Vampire is a classic Amiga on steroids). As there is no dedicated HW support for 3D yet, an OpenGL implementation together with SDL has to be software based, so unless SDL and OpenGL are highly optimised and your code is not, a direct implementation may well be faster than using the libraries.
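
As a rough illustration of the "do the calculations yourself and set the pixels directly" idea, here is a minimal sketch that rotates points about the Y axis, applies a perspective divide and writes into an 8-bit chunky buffer. The buffer size, camera distance, scale and all names (`Vec3`, `plot_rotated`, `FB_W`, `FB_H`) are assumptions made for the example; with double buffering, `fb` would be the back buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { FB_W = 320, FB_H = 240 };      /* assumed 8-bit chunky buffer size */

typedef struct { float x, y, z; } Vec3;

/*
 * Rotate each point about the Y axis, apply a simple perspective divide and
 * set one pixel per point. sin_a and cos_a are computed once per frame by
 * the caller, so the loop itself contains no trig calls.
 */
static void plot_rotated(uint8_t *fb, const Vec3 *pts, size_t count,
                         float sin_a, float cos_a, uint8_t colour)
{
    const float cam_dist = 3.0f;      /* assumed camera distance from origin */
    const float scale = 100.0f;       /* assumed world-units-to-pixels scale */

    assert(fb != NULL && pts != NULL);

    for (size_t i = 0; i < count; i++) {
        float x = pts[i].x * cos_a + pts[i].z * sin_a;
        float z = pts[i].z * cos_a - pts[i].x * sin_a;
        float depth = z + cam_dist;
        if (depth <= 0.1f)
            continue;                 /* behind or too close to the camera */

        float inv = cam_dist / depth; /* perspective factor, 1.0 at z = 0 */
        int sx = FB_W / 2 + (int)(x * inv * scale);
        int sy = FB_H / 2 - (int)(pts[i].y * inv * scale);

        if (sx >= 0 && sx < FB_W && sy >= 0 && sy < FB_H)
            fb[sy * FB_W + sx] = colour;
    }
}
```

For a flat curve such as the rodonea, every input point would simply start with z = 0.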

I also wanted to learn a bit more about Vampire programming and maybe also refresh my Amiga coding skills a bit. If you like, we could do some experiments together to see how fast we could get your program by using more direct code instead of OpenGL.



Ozzy Boshi

Posts 31
22 Jun 2020 08:16


Hi Denis Markovic and thank you for your reply.
First of all I want to point out that the music is not mine. It's just a remix of a theme from Turrican 2 that I downloaded from amigaremix.com. I like the music and I thought it would fit my demo well, so...

I agree with you about OpenGL on Amiga. It seems very slow, but for very small things it could be useful even without HW support, and drawing some points could be such a case.
The main goal of my work is to understand what I can do with the Vampire and OpenGL and see how far I can go with them.
I posted on this forum just to see if anyone could do a little code review and tell me why adding sound support slows down the Amiga so much, because without WAV playback OpenGL seems to me a good fit for drawing a 360-point rodonea on screen and rotating it.

I am testing other alternatives though. Try this:
EXTERNAL LINK  This is the same demo for stock Amigas. Here the positions are precalculated, because as far as I can tell there is no way a stock 68000 can rotate 360 points every frame; things change a lot when you run this on a Vampire, however.
This file outputs video through the RGB port. It's just an old-school demo with no RTG and a MOD playing in the background, but it's not what I wanted in the end.
WARNING: for unknown reasons this file does not run well on CoffinOS. If you want to try it you must boot another OS; with WB 3.1 it works as expected.

I have another experiment:
EXTERNAL LINK  This is a rodonea painted in an RTG window but WITHOUT OpenGL.
In this demo I used Intuition + cybergraphics.library; all it does is create a bitmap with SetAPen() + Move() + Draw() and blit the result into a viewport.
In this latter demo the rodonea does not spin because I can't figure out how to set a framerate without OpenGL (in other words, I need an equivalent of glutTimerFunc() in this environment), but it could be a faster implementation.
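
One way to get a frame rate without GLUT is to keep a deadline and redraw whenever it has passed. The sketch below shows only that bookkeeping in portable C; `FramePacer`, `pacer_init` and `pacer_frame_due` are hypothetical names, and where the millisecond clock and the actual wait come from is left open (on AmigaOS, calls such as WaitTOF() in graphics.library or Delay() in dos.library are possible candidates, but the thread does not settle which suits an RTG window).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tracks when the next frame is due, in milliseconds on any steady clock. */
typedef struct {
    uint32_t next_due_ms;
    uint32_t interval_ms;
} FramePacer;

static void pacer_init(FramePacer *p, uint32_t now_ms, uint32_t interval_ms)
{
    assert(p != NULL && interval_ms > 0);
    p->interval_ms = interval_ms;
    p->next_due_ms = now_ms + interval_ms;
}

/*
 * Returns 1 when a frame should be drawn at time now_ms, 0 otherwise.
 * The deadline advances by a fixed interval, so one late frame does not
 * delay all later ones; if more than one interval has been missed, the
 * deadline is resynchronised instead of trying to catch up.
 */
static int pacer_frame_due(FramePacer *p, uint32_t now_ms)
{
    if ((int32_t)(now_ms - p->next_due_ms) < 0)
        return 0;                               /* not yet */
    p->next_due_ms += p->interval_ms;
    if ((int32_t)(now_ms - p->next_due_ms) >= 0)
        p->next_due_ms = now_ms + p->interval_ms;
    return 1;
}
```

The main loop would poll `pacer_frame_due()` between handling window messages and redraw the bitmap with a new rotation angle each time it returns 1.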

If you are interested in this topic and you want to play around with your Vampire with me, you can find me privately on whatever platform (Skype, IRC, Discord, etc.).

Thank you for your interest.



Gunnar von Boehn
(Apollo Team Member)
Posts 6197
22 Jun 2020 08:54


Ozzy boshi wrote:

I posted on this forum just to see if anyone could do a little code review and tell me why adding sound support slows down the Amiga so much, because without WAV playback OpenGL seems to me a good fit for drawing a 360-point rodonea on screen and rotating it.

In my experience not all frameworks have good performance.
Some frameworks are not tuned at all,
and a bad framework can waste a lot of resources.

For example:
To play a 16-bit WAVE on the recent V4 cores, all you need to do is a few MOVE instructions to start it; the rest is done entirely by the HW using DMA. This means playback of a complete WAVE music track is 100% free on Vampire if you code it using the HW.

Also, the 3D performance can reach a very good level if the right code is used.
I have seen some people writing ASM using the FPU to calculate the 3D on Vampire.
These demos rendered 3D pixel/voxel scenes exactly like you did.

They were rotating and rendering 30,000 - 50,000 pixels per frame.



Thellier Alain

Posts 141
23 Jun 2020 07:02


See
EXTERNAL LINK 
For a simple C example to draw a 3d object with GL/glut



Steve Ferrell

Posts 424
23 Jun 2020 08:20


thellier alain wrote:

  See
    EXTERNAL LINK   
    For a simple C example to draw a 3d object with GL/glut
   
 

 
Nice!  And completely portable code too.  After downloading and extracting the source code I had the EXE built in Visual Studio in around 15 seconds.
 
EXTERNAL LINK


Vladimir Repcak

Posts 359
29 Jul 2020 23:21


Ozzy boshi wrote:

Hello vampires
  Actually this post is not about showing off my clumsy attempts to do something good with a Vampire but to ask you how you would paint this stuff.

...

it seems the vampire cannot handle moving 720 points quickly while playing music.

...
 
  So, in the end what I am asking the people in this forum is... how would you quickly draw a rodonea (or, more generally, 2D images rotating in a 3D space) to the screen on a Vampire?

...

  Maybe it's only my code that is messy and badly written and could be optimized, I don't know


The only way you are going to get a great performance out of a C program is if the C code will be merely calling the ASM functions with pointers to your 3D mesh data (e.g. 99.99% of CPU work on processing 3D scene will be executing the hand-written ASM code).

I think I saw some threads in the past that talked about OpenGL/miniGL on Amiga and those APIs used generic C code for drawing, which as you can imagine, is as inefficient as humanly possible.

I'm sure you realize it would be a gargantuan effort to write an OpenGL wrapper in 68080 ASM. Not impossible, just insane.

Slightly less insane, would be having some subset of OpenGL - kinda like miniGL.

Now, purely hypothetically, with what I have written so far for my Heimdall game, I have a working set of basic flatshading functionality in Assembler.

I'm realizing now that all this code could - indeed - serve as base code for some kind of, say, VampireGL library.

For my own debugging purposes, I have a Point Cloud, WireFrame and Flatshaded (no texturing yet) drawing style.

I will keep thinking about this...


Andy Hearn

Posts 374
30 Jul 2020 10:04


or for those of us without a shred of talent for coding like me, install latest warp3D
  (picasso96 and default for everything else on install)
  then install wazp3D
  (warp3D library copy/replace and wazp3Dprefs file copy/run)
 
  i'm able to get 11fps on "StarshipW3D" on my V1200
 
  am going to play with the StormMesa OpenGL demos later, and maybe some quakeGL. don't care if it's a slide show ;) just happy that it works...

--edit--
QuakeGL @ 320x240x8 "timedemo demo2" gets 2.4fps with no bilinear filtering, and the perspective correction is a bit funky at places.

Normal Quake @ 320x240x8 "timedemo demo2" gets 27.95 fps


Vladimir Repcak

Posts 359
30 Jul 2020 12:09


Can you post a screenshot of the StarshipW3D ? Just wondering what 11 fps scene complexity looks like...

It should be 96 vertices and 170 tris, right ?


Andy Hearn

Posts 374
30 Jul 2020 12:40


sure - yes that sounds like the exact one i'm running. it's not massively complex
 
https://1drv.ms/u/s!ArUfT_xm4N1vlMlTQvlgtfhU5kbhiQ?e=EzNvqq
 
let me know if this link works or not... :)


Vladimir Repcak

Posts 359
30 Jul 2020 15:10


Thanks. Link works, but I can't seem to recognize it as a spaceship from that angle.

It isn't textured, correct? Just flatshaded?


Andy Hearn

Posts 374
30 Jul 2020 21:36


yeah it's textured:-

https://1drv.ms/v/s!ArUfT_xm4N1vlMlkiFBi1dhV9MwU7w?e=RKMbNH

i've been playing with some of the stormmesa demos. some of them really get a performance boost running on a 16bit depth window, and shrinking the workbench window size leads to a nice performance boost too:-
TexCyl

https://1drv.ms/v/s!ArUfT_xm4N1vlMlnVM_-Z-ZS2BAHBQ?e=jn7bj3


Vladimir Repcak

Posts 359
30 Jul 2020 23:33


What's the framerate if you turn off the texturing?
Assuming it allows you to do that...


Andy Hearn

Posts 374
31 Jul 2020 09:52


no option to disable texturing, just switch to "coloured poly" mode instead of textured. no change in FPS.
  lots of zbuffer options. in the options where nothing is seemingly displayed, fps goes up to 20-25fps depending on the option.
 
  only the "hideface" option makes the fps drop down - to 8fps
 
  'f' showfps=!showfps; 
  'o' optimroty=!optimroty;
 
  'z' zbuffer=!zbuffer; 
  '1'  zmode=W3D_Z_NEVER; - nothing visible - 25fps
  '2'  zmode=W3D_Z_LESS; 
  '3'  zmode=W3D_Z_GEQUAL; - nothing visible - 25fps
  '4'  zmode=W3D_Z_LEQUAL; 
  '5'  zmode=W3D_Z_GREATER; 
  '6'  zmode=W3D_Z_NOTEQUAL; 
  '7'  zmode=W3D_Z_EQUAL; 
  '8'  zmode=W3D_Z_ALWAYS; 
  'u' zupdate=!zupdate; 
  't' tridraw=!tridraw; - not sure what this does 
  'c' colored=!colored; - textured or coloured polys 
  'h' hideface=!hideface; - not obvious what difference this made - 8fps
  'r' rotate=!rotate; 
  ESC closed=TRUE;

--edit--
also ran "Cow3D" this morning. wasn't able to get more than just under 1fps - 0fps reported. hideface on/off caused the only discernible slowdown. allocating a large buffer for pre-calculated points didn't help. crashed when i changed the draw mode to "0 - big points and lines" as i was going through the options


Vladimir Repcak

Posts 359
01 Aug 2020 01:34


Andy Hearn wrote:

no option to disable texturing, just switch to "coloured poly" mode instead of textured. no change in FPS.
That is quite weird, indeed. Texturing is inevitably going to slow it down - there is just so much math happening per pixel (perhaps even a division [per pixel] for perspective-correct texturing). I simply cannot fathom how it could possibly not be slower than flatshaded, where the polygon color is kept in a register.

Andy Hearn wrote:

  lots of zbuffer options. in the options where nothing is seemingly displayed, fps goes up to 20-25fps depending on the option.
 
  only the "hideface" option makes the fps drop down - to 8fps
I'm guessing that's the back-face culling (removing non-visible polygons - the ones that are facing away from camera).
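
For readers unfamiliar with the term, back-face culling is often done by looking at the winding of the projected triangle. The sketch below shows that common test; it is not taken from the Warp3D/Wazp3D sources, the names are hypothetical, and the choice of which winding counts as "front" is an assumption stated in the comments.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y; } Vec2;   /* vertex after projection to 2D */

/*
 * Twice the signed area of the projected triangle. The sign tells which way
 * the vertices wind; which sign means "front" depends on the winding
 * convention and on whether the Y axis points up or down.
 */
static float signed_area2(const Vec2 tri[3])
{
    return (tri[1].x - tri[0].x) * (tri[2].y - tri[0].y)
         - (tri[2].x - tri[0].x) * (tri[1].y - tri[0].y);
}

/* Assumed convention: counter-clockwise with Y pointing up is front-facing. */
static int is_back_facing(const Vec2 tri[3])
{
    assert(tri != NULL);
    return signed_area2(tri) <= 0.0f;
}
```

Triangles for which `is_back_facing()` returns 1 can be skipped before any rasterisation work is spent on them.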

All those Z-Buffer flags, they are going to make a difference only when you have a really complex scene and are hitting the limits of Z-Buffer precision.
And when you do that with the current version of the library, you would hit 0.00001 fps anyway...

Andy Hearn wrote:

  --edit--
  also ran "Cow3D" this morning. wasn't able to get more than just under 1fps - 0fps reported. hideface on/off caused the only discernible slowdown. allocating a large buffer for pre-calculated points didn't help. crashed when i changed the draw mode to "0 - big points and lines" as i was going through the options

I think I saw the Cow vids before. How many polys/vertices does it have?


Vladimir Repcak

Posts 359
01 Aug 2020 01:52


Andy Hearn wrote:

  i'm able to get 11fps on "StarshipW3D" on my V1200
It's not an entirely apples-to-apples comparison, as I don't use the full-blown matrix solution (which would add a couple % of frame time to the transform stage), and I don't do Z-Buffer, but a 170-tri spaceship covering a similar area of the screen would take my engine:

2.2%  3D Transform
4.3%  Quad Set-Up
2.4%  Scanline Traversal (1,000 scanlines)
2.8%  Pixel Fill (18,000 px)
-----------------------------
11.7% of frame time -> 60 fps / 0.117 => 511 FPS

You could have a few of those ships on screen before you'd even drop to 60 fps - looks like you could have around 8 and still be at 60 fps (8 * 11.7 = 93.6% of frame time).
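
The budget arithmetic above can be reproduced with a few lines of C. This is only a worked check of the quoted figures; the function names are hypothetical. With the rounded percentages as listed, 60 / 0.117 comes out near 513, close to the 511 quoted, which may have been computed from unrounded values.

```c
#include <assert.h>
#include <stddef.h>

/* Per-stage cost as a percentage of one 60 Hz frame, as quoted in the post. */
static const double stage_percent[] = { 2.2, 4.3, 2.4, 2.8 };

/* Sum of the per-stage percentages. */
static double total_percent(const double *stages, size_t count)
{
    double sum = 0.0;
    assert(stages != NULL);
    for (size_t i = 0; i < count; i++)
        sum += stages[i];
    return sum;
}

/* Frame rate if this object were the only work done, relative to base_fps. */
static double unconstrained_fps(double percent_of_frame, double base_fps)
{
    return base_fps / (percent_of_frame / 100.0);
}

/* How many such objects fit into one frame before the budget is exceeded. */
static int objects_per_frame(double percent_of_frame)
{
    return (int)(100.0 / percent_of_frame);
}
```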

Then again, OpenGL probably handles per-pixel Z-Buffer, which must be awfully slow - just imagine doing a *run-time configurable* condition+read per pixel. In C !!!! That alone must be easily 2-3 pages of code. Boom - there goes your 170 MIPS :)

  Even if I were doing the GL library, I would simply enforce some sorting on the programmer's side - it's incredibly lazy to just tell the API "you go sort all this stuff out" (which it will - at 0.1 fps)...



Gunnar von Boehn
(Apollo Team Member)
Posts 6197
01 Aug 2020 11:07


Vladimir Repcak wrote:

Then again, OpenGL probably handles per-pixel Z-Buffer, which must be awfully slow

In my experience, a per-pixel Z-buffer check is no problem at all.
I think the real problem is naive coding of this.

If you have 10 options for how to do this, then you can have 1 routine which checks all 10 options for every pixel and repeats this check per pixel.
So if you draw 1 million pixels you also do 10 million checks...
This is what people call "naive" coding.

High performance coding would write 10 routines and decide once which one to use, and never do unnecessary checks inside the routine.
Et voila, 10 times better performance.
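
The contrast described here can be sketched in C as follows: one routine that re-examines the mode for every pixel, against a set of specialised routines selected once through a function pointer. The mode names, the reduced set of four modes and the constant per-span Z are simplifications assumed for the example, not the layout of any existing library; an optimising compiler may also hoist the switch on its own, so the gain should be measured rather than assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { Z_LESS, Z_LEQUAL, Z_GREATER, Z_ALWAYS, Z_MODE_COUNT } ZMode;

/* Naive: the mode is re-examined for every pixel of the span. */
static void span_naive(uint16_t *zbuf, uint8_t *fb, size_t n,
                       uint16_t z, uint8_t colour, ZMode mode)
{
    for (size_t i = 0; i < n; i++) {
        int pass;
        switch (mode) {
        case Z_LESS:    pass = z <  zbuf[i]; break;
        case Z_LEQUAL:  pass = z <= zbuf[i]; break;
        case Z_GREATER: pass = z >  zbuf[i]; break;
        default:        pass = 1;            break;
        }
        if (pass) { fb[i] = colour; zbuf[i] = z; }
    }
}

/* Specialised: one routine per mode, with the comparison fixed in its loop. */
typedef void (*SpanFn)(uint16_t *zbuf, uint8_t *fb, size_t n,
                       uint16_t z, uint8_t colour);

static void span_less(uint16_t *zbuf, uint8_t *fb, size_t n, uint16_t z, uint8_t c)
{
    for (size_t i = 0; i < n; i++)
        if (z < zbuf[i]) { fb[i] = c; zbuf[i] = z; }
}

static void span_lequal(uint16_t *zbuf, uint8_t *fb, size_t n, uint16_t z, uint8_t c)
{
    for (size_t i = 0; i < n; i++)
        if (z <= zbuf[i]) { fb[i] = c; zbuf[i] = z; }
}

static void span_greater(uint16_t *zbuf, uint8_t *fb, size_t n, uint16_t z, uint8_t c)
{
    for (size_t i = 0; i < n; i++)
        if (z > zbuf[i]) { fb[i] = c; zbuf[i] = z; }
}

static void span_always(uint16_t *zbuf, uint8_t *fb, size_t n, uint16_t z, uint8_t c)
{
    for (size_t i = 0; i < n; i++) { fb[i] = c; zbuf[i] = z; }
}

/* Table order must match the ZMode enum. */
static const SpanFn span_table[Z_MODE_COUNT] = {
    span_less, span_lequal, span_greater, span_always
};

/* Decide once, for example when the application changes the depth mode. */
static SpanFn select_span(ZMode mode)
{
    assert((unsigned)mode < (unsigned)Z_MODE_COUNT);
    return span_table[mode];
}
```

The renderer would call `select_span()` when the mode changes and keep the returned pointer for every span drawn afterwards.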



Markus B

Posts 209
01 Aug 2020 12:08


Vladimir Repcak wrote:

  I'm realizing now that all this code could - indeed - serve as base code for some kind of, say, VampireGL library.
 
  For my own debugging purposes, I have a Point Cloud, WireFrame and Flatshaded (no texturing yet) drawing style.
 
  I will keep thinking about this...
 

 
  If you need a game idea, take a look at something like Indianapolis 500 (https://www.youtube.com/watch?v=KmCFPrLNZZM) or the famous Formula One Grand Prix.

  I'd love to see such a game in SAGA (940x560 or 1280x720).


Andy Hearn

Posts 374
01 Aug 2020 14:39


just FYI
Cow3D is 5813Tris /2914 points at ~>1fps
running this in a nearly full-screen window on a 800x600x16bit screen. dropping down to 640x480x15bit doesn't seem to improve any performance. 8bit just displays a blank window. again, colouring-vs-texturing doesn't change much.

but this performance penalty is the price for a nearly complete (as i understand it) implementation of an early form of OpenGL - running entirely in 68k software... that's worth it in my eyes, however inefficient. but it's not something to base anything on beyond an "it works" technical demo. I know nothing about how any of this goes together - but i imagine boiling it down to its base constituents would lead to some big gains. maybe some AMMX magic here or there? ;)


Vladimir Repcak

Posts 359
01 Aug 2020 19:14



 
Gunnar von Boehn wrote:

  In my experience, per Pixel Z-buffer check is no problem at all.

  Well, even in pure ASM, you must:
  - Read current pixel's Z value - I reckon we could reuse the already computed pixel offset address and use the addressing mode with complex relative addressing - so 1 op
  - Compare against interpolated Z value (presumably in a register) - 1 op
 
  So, it could be done in 2 ops. If we have 64,000 rendered px (in 320x200), this check should take 128,000 ops, so perhaps just ~10% of frame time. That is doable. Of course, then we need Z interpolation during rasterization, which might be another few ops (probably floating point, and those ops are expensive), so in the end it can take up to 40-50% of frame time just for Z-Buffering.
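
To make the per-pixel cost concrete, here is a minimal C sketch of a span with a "less than" depth test. It swaps in 16.16 fixed-point Z interpolation (an assumption, chosen to avoid per-pixel floating point; the post itself expects floating point), and `span_zless` is a hypothetical name. Per pixel it does one Z-buffer read, one compare and one add to step Z.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Draw one horizontal span with a "nearer wins" depth test.
 * z and dz are 16.16 fixed point; the Z-buffer stores the upper 16 bits.
 * Returns the number of pixels that passed the test.
 */
static size_t span_zless(uint16_t *zbuf, uint8_t *fb, size_t n,
                         uint32_t z, int32_t dz, uint8_t colour)
{
    size_t written = 0;
    assert(zbuf != NULL && fb != NULL);

    for (size_t i = 0; i < n; i++) {
        uint16_t zi = (uint16_t)(z >> 16);   /* integer part of current Z */
        if (zi < zbuf[i]) {                  /* read + compare */
            zbuf[i] = zi;
            fb[i] = colour;
            written++;
        }
        z += (uint32_t)dz;                   /* step Z along the span */
    }
    return written;
}
```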
 
 
 
Gunnar von Boehn wrote:

  I think the real problem is naive coding of this.

  If you have 10 options for how to do this, then you can have 1 routine which checks all 10 options for every pixel and repeats this check per pixel.
  So if you draw 1 million pixels you also do 10 million checks...
  This is what people call "naive" coding.

  High performance coding would write 10 routines and decide once which one to use, and never do unnecessary checks inside the routine.
  Et voila, 10 times better performance.

  There are other alternatives. You could have a boolean flag for a compile-time Z-Buffer condition. Thus, no dozen conditions per pixel, compiler would just choose the one coder wants. Of course, that's not fully OpenGL-compliant behavior. But in reality, you don't need to mess with the Z-buffer condition, unless you have real-time 3D shadows (or some special FX) anyway. There's no need to punish 99% of use cases for 1% special case...
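
A compile-time choice of this kind could look like the sketch below, where the comparison operator is a preprocessor macro fixed when the library is built. The macro name `ZBUF_COMPARE_OP` and the routine name are hypothetical; as noted above, this trades OpenGL-style run-time configurability for a loop with no mode checks at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Build-time choice of the depth comparison. It can be overridden when
 * compiling, e.g. with -DZBUF_COMPARE_OP='<=' on a gcc command line.
 */
#ifndef ZBUF_COMPARE_OP
#define ZBUF_COMPARE_OP <      /* default: nearer fragments win */
#endif

/* Span routine whose depth test is baked in by the preprocessor. */
static void span_fixed_test(uint16_t *zbuf, uint8_t *fb, size_t n,
                            uint16_t z, uint8_t colour)
{
    assert(zbuf != NULL && fb != NULL);

    for (size_t i = 0; i < n; i++) {
        if (z ZBUF_COMPARE_OP zbuf[i]) {
            zbuf[i] = z;
            fb[i] = colour;
        }
    }
}
```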
 
 
  Alternatively, we could do run-time code generation. When the user changes the Z-buffer condition, the backend would replace the hex opcode with the selected one. Then there would be no additional performance penalty per pixel.
