We've just updated MediaWiki and its underlying software. If anything doesn't look or work quite right, please mention it to us. --RanAS

Bithacks: Difference between revisions

From SnesLab
Jump to: navigation, search
m (small correction with big implications (wrote this as immediate mode even though I've never used this trick that way, turns out that doesn't work!))
(Modified formatting to allow linking to individual sections and let the site automatically generate a table of contents.)
Line 4: Line 4:
'''Note: cycle counts are intended to be a worst case measure.'''
'''Note: cycle counts are intended to be a worst case measure.'''


== Math Bithacks ==
= Math Bithacks =
'''Signed Division By 2''' ''(3 bytes / 4 cycles)''
== Signed Division By 2 ==
''3 bytes / 4 cycles''
<br>
<br>
<u>inputs:</u> A
<u>inputs:</u> A
Line 15: Line 16:
</pre>
</pre>


'''Signed Division By 2<sup>n</sup>''' ''(6+n bytes / 6+2n cycles)''
== Signed Division By 2<sup>n</sup> ==
''6+n bytes / 6+2n cycles''
<br>
<br>
<u>inputs:</u> A
<u>inputs:</u> A
Line 42: Line 44:
</pre>
</pre>


'''Absolute Value''' ''(5 bytes / 6 cycles)''
== Absolute Value ==
''5 bytes / 6 cycles''
<br>
<br>
<u>inputs:</u> A, (N Flag)
<u>inputs:</u> A, (N Flag)
Line 56: Line 59:
</pre>
</pre>


'''Absolute Value (SEC)''' ''(4 bytes / 4 cycles)''
== Absolute Value (SEC) ==
''4 bytes / 4 cycles''
<br>
<br>
<u>inputs:</u> A, (Carry Set)
<u>inputs:</u> A, (Carry Set)
Line 69: Line 73:
</pre>
</pre>


'''Magnitude/Extents Check''' ''(~7 bytes / 12 cycles)''
== Magnitude/Extents Check ==
''~7 bytes / 12 cycles''
<br>
<br>
<u>inputs:</u> A
<u>inputs:</u> A
Line 92: Line 97:
</pre>
</pre>


== Misc. Tricks ==
= Misc. Tricks =
<small>As this list grows tricks here will be consolidated into their own sections. Clever optimization tricks that aren't necessarily what someone might personally call a "bithack" are okay here as well!</small>
<small>As this list grows tricks here will be consolidated into their own sections. Clever optimization tricks that aren't necessarily what someone might personally call a "bithack" are okay here as well!</small>


'''Clear High Byte of Accumulator''' ''(1 byte / 2 cycles)''
== Clear High Byte of Accumulator ==
''1 byte / 2 cycles''
<br>
<br>
<u>inputs:</u> (none)
<u>inputs:</u> (none)
Line 105: Line 111:
</pre>
</pre>


'''Direction/Facing As Index''' ''(4 bytes / 6 cycles)''
== Direction/Facing As Index ==
''4 bytes / 6 cycles''
<br>
<br>
<u>inputs:</u> A
<u>inputs:</u> A
Line 117: Line 124:
</pre>
</pre>


'''Skip Dead Code''' ''(1-2 bytes / 2-3 cycles)''
== Skip Dead Code ==
''1-2 bytes / 2-3 cycles''
<br>
<br>
<u>inputs:</u> (none)
<u>inputs:</u> (none)
Line 138: Line 146:
</pre>
</pre>


'''Check 3 Conditions''' ''(2 bytes / 2 cycles)''
== Check 3 Conditions ==
''2 bytes / 2 cycles''
<br>
<br>
<u>inputs:</u> A
<u>inputs:</u> A

Revision as of 19:12, 25 April 2021

Bithacks are optimization tricks that utilize information in bits and bit manipulation to accomplish their tasks. Usually they work in a slightly non-obvious way, (the most famous being the fast inverse sqrt), and bit manipulation in general is harder on the 65c816. To that end here is a collection of some useful tricks.
Note: cycle counts are intended to be a worst case measure.

Math Bithacks

Signed Division By 2

3 bytes / 4 cycles
inputs: A
outputs: A

	CMP #$80
	ROR

Signed Division By 2n

6+n bytes / 6+2n cycles
inputs: A
outputs: A

; signed division by two, n times
macro SignedDiv_2N(n)
	LSR #<n>
	BIT.b #$80>><n>
	BEQ ?positive
	ORA.b #$FF00>><n>    ; sign extension
?positive:
endmacro

; -1 cycle and +n bytes, but must have N flag set before use
macro SignedDiv_2N(n)
	BMI ?negative
	LSR #<n>
	BRA ?end
?negative:
	LSR #<n>
	ORA.b #$FF00>><n>    ; sign extension
?end:
endmacro

Absolute Value

5 bytes / 6 cycles
inputs: A, (N Flag)
outputs: A

macro abs()
	BPL ?plus
	EOR #$FF
	INC
?plus:		; only 3 cycles if branch taken
endmacro

Absolute Value (SEC)

4 bytes / 4 cycles
inputs: A, (Carry Set)
outputs: A

; compared to the branching version this is 1 byte smaller
; it's either 2 cycles slower/faster depending on branch taken
	EOR #$7F
;	SEC		; the instant you add this in it becomes worse than the branching version
	SBC #$7F

Magnitude/Extents Check

~7 bytes / 12 cycles
inputs: A
outputs: (none)

; asks "Is [A] on the zero-side of value [X] or the far side?"
; good for magnitude checks, smaller *AND* faster than alternatives
; NOTE: in the event that it is exactly [X] it will have that value at branch
; doesn't need to be an indexed CMP but is most useful this way
; this can be used to combine the BPL and BMI checks for both signs into one
	SEC : SBC Extents,x
	EOR Extents,x
	BMI .zero_side
.far_side:
	; do things
.zero_side:
	; do things

Extents:
	db -$23, $23

Misc. Tricks

As this list grows tricks here will be consolidated into their own sections. Clever optimization tricks that aren't necessarily what someone might personally call a "bithack" are okay here as well!

Clear High Byte of Accumulator

1 byte / 2 cycles
inputs: (none)
outputs: A

; "Trashes" A but clears high byte
	TDC

Direction/Facing As Index

4 bytes / 6 cycles
inputs: A
outputs: A

; Ever wonder why facing flags are 0=right and 1=left? This is why. It's incredibly cheap.
	ASL
	ROL
	AND #$01

Skip Dead Code

1-2 bytes / 2-3 cycles
inputs: (none)
outputs: (none)

; If you need to skip one byte of dead code (due to a hijack or whatever reason) you can use:
	NOP		; 1 byte, 2 cycles

; But if you need to skip just 2 bytes the most efficient is:
; NOTE: many times WDM is used as a breakpoint for debugging so only do this as a final pass to speed up your code!
	WDM		; 2 bytes, 2 cycles

; Finally, if you need to skip a large amount of dead code you can use BRA/JMP instead
; JMP is as fast as BRA on the SNES CPU, but will be slightly slower on SA-1, and 1 cycle slower on SPC. So BRA is recommended
; (The extra byte used for JMP in this case doesn't matter)
	BRA +		; 2 bytes, 3 cycles
	; dead code
+

Check 3 Conditions

2 bytes / 2 cycles
inputs: A
outputs: (none)

; just the opcode as normal here (not counting the conditions), using any operand that's not immediate (#)
; it's worth noting that you can do up to 3 tests with a single opcode though!
; Just As A Reminder: the V & N flag are set by the *operand* to BIT not the result of the AND!
	BIT $00
	BMI .bit7_set
	BVS .bit6_set
	BNE .bit5_set	; assuming #$20 is in $00
.bit7_set:
.bit6_set:
.bit5_set: