why do I keep hacking 16bit DOS games? do I hate myself?
32bit programs are SO MUCH EASIER to RE, because when you see an address, you know what it means. 0x12345678 always means 0x12345678!
16bit games are full of MOV AX, [1234h] and it's like, WHAT'S DS AT THIS POINT? WHICH 1234?
there's 65536 possible memory addresses it could be!
not to mention that there's more than one way to address a given part of memory.
in 32bit and 64bit code, if you see 0x12345678, you know that some code that writes to 0x12335662 doesn't change it.
not so in 16bit games. you have plenty of ways to refer to the same address.
This is why 16bit x86 is SO much more annoying than 8-bit computers.
with 8-bit computers, you have 16-bit addresses, because 256 bytes is rarely enough memory. So they work by having some addresses which are longer. simple, right? so instead of an 8bit number, you have a 16bit number.
16bit x86 does this as well. 16bits of address is only 64kb, and that's just not enough. So you expand it to 24bits or 32bits, for "long addresses", right? same as you use in 8bit computers?
NOPE
no, you combine 16bits and 16bits and get... 20 bits.
it's a 20bit address.
segmented addressing, the solution they use, is not as simple as just adding some more bits. a 16bit segment and a 16bit offset.
so that's just a weird way of explaining a 32bit number, right?
NOPE
so what, they ignore all but the bottom 4 bits of the segment?
NO THAT WOULD MAKE SENSE
instead the full 16bit segment is used, but it's turned into a 20bit address by shifting it 4 bits over and adding in the offset.
So it's the TOP 4 bits that are important, not the bottom 4.
Okay that's fine, but wait, I said adding. Not "replacing".
Yes, all 16bits are used. So the address 0000:0000 is (linear) 0x0, and 0001:0000 is (linear) 0x10
which also means that 0001:0000 and 0000:0010 are both linear 0x10.
So you can get pointer aliasing even though both pointers HAVE DIFFERENT VALUES
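if it helps to see the math, here's a rough Python sketch of the segment:offset to linear conversion, plus a brute-force count of how many different pointers land on one address. the function names (linear, aliases) are made up for this example, and it ignores wraparound past 1MB entirely.

```python
def linear(segment: int, offset: int) -> int:
    """Real-mode address before any wrapping: segment * 16 + offset."""
    assert 0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF
    return (segment << 4) + offset


def aliases(target: int) -> list[tuple[int, int]]:
    """Every (segment, offset) pair that adds up to exactly `target`."""
    pairs = []
    for segment in range(0x10000):
        offset = target - (segment << 4)
        if 0 <= offset <= 0xFFFF:
            pairs.append((segment, offset))
    return pairs


# Two pointers with different values, one linear address.
print(hex(linear(0x0001, 0x0000)))
print(hex(linear(0x0000, 0x0010)))

# How many segment:offset pairs name linear 0x12345?
print(len(aliases(0x12345)))
```

so a byte somewhere in the middle of memory can have thousands of different segment:offset names, and comparing two far pointers as raw 32bit values tells you nothing about whether they alias.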
And if that wasn't bad enough, there's also the A20 gate nonsense. Now, the A20 gate was added with the 286, for backwards compatibility with how the 8086/8088 worked, which is that memory wrapped.
so not only are 0001:0000 and 0000:0010 the same address, so is FFFF:0020!
but don't worry, for the 286 they wanted to add more than 1 megabyte of RAM, which is the max you can address with a 20bit address, so they added the ability to disable address wrapping.
on the keyboard controller.
so now your memory wrapping changes based on what you write to the keyboard controller.
BRILLIANT IDEA
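a tiny sketch of that wrapping, in Python. this is a simplified model, not how any real emulator or chipset does it: a20_enabled is a hypothetical flag standing in for the gate, and with it off the sum just gets truncated to 20 bits like on an 8086.

```python
def linear_address(segment: int, offset: int, a20_enabled: bool) -> int:
    """Real-mode segment:offset to linear, with optional 1MB wraparound."""
    raw = (segment << 4) + offset  # can overflow into a 21st bit
    if not a20_enabled:
        raw &= 0xFFFFF             # keep only 20 address bits, so it wraps
    return raw


# Same pointer, two different answers depending on the gate.
print(hex(linear_address(0xFFFF, 0x0020, a20_enabled=False)))
print(hex(linear_address(0xFFFF, 0x0020, a20_enabled=True)))
```

with the gate off, FFFF:0020 lands on the same byte as 0001:0000. with it on, it lands just past the 1MB mark instead.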
anyway my favorite part of this A20 line thing is that it was supported by Intel chips up until Haswell, in 2013.
So in 2012 your PC with 8 gigabytes of RAM booted up with every odd megabyte of RAM mirroring the even megabyte below it.
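to see why masking one address line mirrors memory, here's a simplified model in Python. it assumes the gate just forces address bit 20 to zero on every access, and with_a20_masked is a name I made up for illustration.

```python
MIB = 1 << 20  # address bit 20 is the "1 megabyte" bit


def with_a20_masked(address: int) -> int:
    """Physical address actually reached when address bit 20 is forced to 0."""
    return address & ~MIB


# Walk the same offset through the first four megabytes.
for megabyte in range(4):
    requested = megabyte * MIB + 0x1234
    print(f"{requested:#010x} -> {with_a20_masked(requested):#010x}")
```

megabytes 1 and 3 come back pointing into megabytes 0 and 2, and that pattern repeats all the way up the address space.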
@foone Do you know if there is a Ghidra plugin dedicated to help MS-DOS 16-bit EXE decompilation and disassembling?
One day, I would like to decompile some old games to learn a bit more how they worked.
I have to admit that I dread segmented memory though 😅
@xtof yeah I don't know of one. I hope so
@foone That’s it, the world is switching to ARM. Or how about #MIPS? That’s a nice clean learning architecture.
@colincogle unfortunately I'm a retrocomputerist, and the past never changes. So I will always be mad at x86