Information about the Apollo CPU and FPU.
68080 Coding Tricks (Endian)
| | Gunnar von Boehn (Apollo Team Member) Posts 6254 21 Dec 2019 22:49
| 68K ASM is a wonderful language with cool options. Let's look at the code of some existing AMIGA programs and think about some tricks.
Code example 1:
    MOVE.B  -(A0),D1    ;05936: 1220
    LSL.W   #8,D1       ;05938: e149
    MOVE.B  -(A0),D1    ;0593a: 1220
    SWAP    D1          ;0593c: 4841
    MOVE.B  -(A0),D1    ;0593e: 1220
    LSL.W   #8,D1       ;05940: e149
    MOVE.B  -(A0),D1    ;05942: 1220
    MOVE.B  -(A0),D2    ;05944: 1420
    LSL.W   #8,D2       ;05946: e14a
    MOVE.B  -(A0),D2    ;05948: 1420
    SWAP    D2          ;0594a: 4842
    MOVE.B  -(A0),D2    ;0594c: 1420
    LSL.W   #8,D2       ;0594e: e14a
    MOVE.B  -(A0),D2    ;05950: 1420
    MOVE.B  -(A0),D3    ;05952: 1620
    LSL.W   #8,D3       ;05954: e14b
    MOVE.B  -(A0),D3    ;05956: 1620
    SWAP    D3          ;05958: 4843
    MOVE.B  -(A0),D3    ;0595a: 1620
    LSL.W   #8,D3       ;0595c: e14b
    MOVE.B  -(A0),D3    ;0595e: 1620
The above code can also be written as:
    moveX.L -(A0),D1
    moveX.L -(A0),D2
    moveX.L -(A0),D3
Obviously, the much shorter version is also much faster.
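To see what the long sequence actually computes, here is a minimal Python sketch that models one register block instruction by instruction. It is only a model: `mem`, the helper names, and the garbage start value are made up for illustration, and the statement that `moveX.L` yields the same value rests on the post above, not on the hardware documentation.

```python
MASK32 = 0xFFFFFFFF

def move_b(reg, byte):
    """MOVE.B <ea>,Dn: replace only the low byte of the register."""
    return (reg & 0xFFFFFF00) | (byte & 0xFF)

def lsl_w8(reg):
    """LSL.W #8,Dn: shift the low word left by 8, upper word untouched."""
    return (reg & 0xFFFF0000) | ((reg << 8) & 0xFFFF)

def swap(reg):
    """SWAP Dn: exchange the upper and lower words."""
    return ((reg << 16) | (reg >> 16)) & MASK32

def load_bytewise(mem, a0, d=0xDEADBEEF):
    """Model one seven-instruction block; returns (register, new A0).

    The register starts with garbage to show that all four bytes
    end up overwritten, as in the original code.
    """
    for step in (lsl_w8, swap, lsl_w8, None):
        a0 -= 1                      # predecrement addressing: -(A0)
        d = move_b(d, mem[a0])
        if step is not None:
            d = step(d)
    return d, a0

def load_little_endian(mem, a0):
    """Byte-swapped longword load, as the post describes for moveX.L -(A0),Dn."""
    a0 -= 4
    return int.from_bytes(mem[a0:a0 + 4], "little"), a0

mem = bytes([0x11, 0x22, 0x33, 0x44])
print(hex(load_bytewise(mem, 4)[0]))  # 0x44332211
```

A plain big-endian `MOVE.L` on the same four bytes would give `0x11223344`, so the byte-wise sequence is a little-endian load, which matches the "Endian" in the thread title.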
| |
| | Jacek Rafal Tatko
Posts 19 21 Dec 2019 23:06
| Gunned it masterfully! ;)
| |
| | Don Adan
Posts 38 22 Dec 2019 00:44
Yes, but that version of the code example is fine for the 68000. For 68020+, this is better and faster:
    move.l  -(a0),d1
    ror.w   #8,d1
    swap    d1
    ror.w   #8,d1
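The four-instruction version can be checked the same way. The sketch below models `ror.w #8` / `swap` / `ror.w #8` on a 32-bit value; the function names are hypothetical. One likely reason the byte-wise code suits the 68000 is that the 68000 raises an address error on word or longword accesses at odd addresses, while the 68020 and later accept misaligned data accesses; that reading is an assumption, since the post does not spell it out.

```python
def ror_w8(reg):
    """ROR.W #8,Dn: rotate the low word by 8 bits, i.e. swap its two bytes."""
    low = reg & 0xFFFF
    low = ((low >> 8) | (low << 8)) & 0xFFFF
    return (reg & 0xFFFF0000) | low

def swap(reg):
    """SWAP Dn: exchange the upper and lower words."""
    return ((reg << 16) | (reg >> 16)) & 0xFFFFFFFF

def byteswap32(reg):
    """ror.w #8 / swap / ror.w #8 applied to a value loaded with move.l."""
    return ror_w8(swap(ror_w8(reg)))

print(hex(byteswap32(0x11223344)))  # 0x44332211
```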
| |
| | Gunnar von Boehn (Apollo Team Member) Posts 6254 22 Dec 2019 06:25
| OK, let's look at another code snippet from an AMIGA program. Let us play detective here and find out what it does and how we could tune it:
    MOVEQ   #0,D0       ;128de: 7000
    MOVE.B  (A3)+,D0    ;128e0: 101b
    MOVEQ   #0,D1       ;128e2: 7200
    MOVE.B  D0,D1       ;128e4: 1200
    ASL.L   #8,D1       ;128e6: e181
    MOVEQ   #0,D2       ;128e8: 7400
    MOVE.B  D0,D2       ;128ea: 1400
    SWAP    D2          ;128ec: 4842
    CLR.W   D2          ;128ee: 4242
    MOVEQ   #0,D3       ;128f0: 7600
    MOVE.B  D0,D3       ;128f2: 1600
    SWAP    D3          ;128f4: 4843
    CLR.W   D3          ;128f6: 4243
    ASL.L   #8,D3       ;128f8: e183
    OR.L    D2,D3       ;128fa: 8682
    OR.L    D1,D3       ;128fc: 8681
    OR.L    D0,D3       ;128fe: 8680
    MOVE.L  D3,(A5)+    ;12900: 2ac3
The code looks to me like a compiler generated it, and it could have been written shorter:
    MOVEQ   #0,D0
    MOVE.B  (A3)+,D0
    MOVE.L  D0,D1
    ASL.L   #8,D1
    MOVE.L  D0,D2
    SWAP    D2
    MOVE.L  D0,D3
    ROR.L   #8,D3
    OR.L    D2,D3
    OR.L    D1,D3
    OR.L    D0,D3
    MOVE.L  D3,(A5)+
On APOLLO I could write it even shorter:
    MOVE.B  (A0)+,D0
    PERM    #3333,D0:D0
    MOVE.L  D0,(A5)+
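As a quick check of the detective work, the following Python sketch models the shortened shift/rotate/OR sequence. The function name is hypothetical and registers are treated as plain 32-bit integers; it only illustrates that the routine copies one source byte into all four bytes of the longword it stores.

```python
MASK32 = 0xFFFFFFFF

def replicate_shift_or(b):
    """Model of the shortened 68k sequence for one input byte."""
    d0 = b & 0xFF                              # MOVEQ #0,D0 / MOVE.B (A3)+,D0
    d1 = (d0 << 8) & MASK32                    # MOVE.L D0,D1 / ASL.L #8,D1
    d2 = ((d0 << 16) | (d0 >> 16)) & MASK32    # MOVE.L D0,D2 / SWAP D2
    d3 = ((d0 >> 8) | (d0 << 24)) & MASK32     # MOVE.L D0,D3 / ROR.L #8,D3
    return d3 | d2 | d1 | d0                   # OR.L D2,D3 / OR.L D1,D3 / OR.L D0,D3

print(hex(replicate_shift_or(0xAB)))  # 0xabababab
```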
| |
| | Don Adan
Posts 38 22 Dec 2019 21:09
Only shortest or fastest? For 68020+ the shortest can be:
    moveq   #0,d0
    move.b  (a3)+,d0
    mulu.l  #$01010101,d0
    move.l  d0,(a5)+
For the 68000:
    move.b  (a3)+,d0
    move.w  d0,-(sp)
    move.b  d0,(sp)
    move.w  (sp),(a5)+
    move.w  (sp)+,(a5)+
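Both variants can be sanity-checked with a small Python model. The multiply works because a zero-extended byte times `$01010101` is the sum of the byte shifted by 0, 8, 16 and 24 bits, and those four copies never overlap, so no carries occur. The stack variant builds the pair of equal bytes in a two-byte stack slot and writes it out twice. Function names and the `stale` parameter are hypothetical and exist only for this illustration.

```python
def replicate_mulu(b):
    """moveq #0,d0 / move.b (a3)+,d0 / mulu.l #$01010101,d0"""
    return ((b & 0xFF) * 0x01010101) & 0xFFFFFFFF

def replicate_via_stack(b, stale=0x5A):
    """Model of the 68000 sequence; returns the four bytes written via (a5)+.

    `stale` stands for whatever was left in bits 8..15 of d0, because
    move.b only replaces the low byte of the register.
    """
    d0_word = ((stale & 0xFF) << 8) | (b & 0xFF)   # move.b (a3)+,d0
    slot = bytearray(d0_word.to_bytes(2, "big"))   # move.w d0,-(sp)
    slot[0] = b & 0xFF                             # move.b d0,(sp)
    first = bytes(slot)                            # move.w (sp),(a5)+
    second = bytes(slot)                           # move.w (sp)+,(a5)+
    return first + second

print(hex(replicate_mulu(0xAB)))        # 0xabababab
print(replicate_via_stack(0xAB).hex())  # abababab
```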
| |
| | Samuel Crow
Posts 424 22 Dec 2019 21:15
| Gunnar von Boehn wrote:
|     MOVE.B  (A0)+,D0
|     PERM    #3333,D0:D0
|     MOVE.L  D0,(A5)+
|
Shouldn't the perm value be #0000 since we're loading the low byte?
| |
| | Gunnar von Boehn (Apollo Team Member) Posts 6254 22 Dec 2019 21:20
| Samuel Crow wrote:
| Gunnar von Boehn wrote:
|     MOVE.B  (A0)+,D0
|     PERM    #3333,D0:D0
|     MOVE.L  D0,(A5)+
|
Shouldn't the perm value be #0000 since we're loading the low byte? |
PERM has 2 inputs and counts the bytes from the LEFT. You select 4 destination bytes out of the 8 input bytes. VPERM does the same in 64 bits: you select an 8-byte result from a 16-byte input.
PERM counts from 0..7 from the left.
VPERM counts from 0..F from the left.
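The byte numbering can be illustrated with a small Python model. This is a sketch built only from the description above, not from the instruction set documentation: the function names are hypothetical, the selector is assumed to be read as hexadecimal digits, and the first operand is assumed to supply the leftmost bytes.

```python
def _pick(selector, count, source):
    """Select `count` bytes from `source`, one hex digit of `selector` per byte."""
    digits = [(selector >> (4 * i)) & 0xF for i in reversed(range(count))]
    if any(d >= len(source) for d in digits):
        raise ValueError("selector digit outside the input bytes")
    return int.from_bytes(bytes(source[d] for d in digits), "big")

def perm(selector, left, right):
    """4 result bytes from the 8 bytes of left:right, indexed 0..7 from the left."""
    source = left.to_bytes(4, "big") + right.to_bytes(4, "big")
    return _pick(selector, 4, source)

def vperm(selector, left, right):
    """8 result bytes from the 16 bytes of left:right, indexed 0..F from the left."""
    source = left.to_bytes(8, "big") + right.to_bytes(8, "big")
    return _pick(selector, 8, source)

d0 = 0x123456AB                      # low byte 0xAB was just loaded, the rest is stale
print(hex(perm(0x3333, d0, d0)))     # 0xabababab
print(hex(perm(0x0000, d0, d0)))     # 0x12121212
```

Under these assumptions, index 3 is the low byte of the first register, which is why the selector is 3333. With D0 as both inputs, 7777 would select the same byte, while 0000 would replicate the stale top byte.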
| |
| | Gunnar von Boehn (Apollo Team Member) Posts 6254 22 Dec 2019 21:22
| Don Adan wrote:
|
Only shortest or fastest?
|
Yes, the real goal was fastest! Of course the 68060 solution is unbeatable :)
| |
| | Samuel Crow
Posts 424 22 Dec 2019 21:35
| Gunnar von Boehn wrote:
| PERM has 2 inputs and counts the bytes from the LEFT |
Thanks Gunnar!