xorpd-solutions

View on GitHub

0x00

[dec] [home] [inc]

[See this riddle on the book]

xor      eax,eax
lea      rbx,[0]
loop     $
mov      rdx,0
and      esi,0
sub      edi,edi
push     0
pop      rbp
See explanation ## Zeroing registers This snippet shows seven different ways to set various registers to zero, from `*ax` to `*dx`, and then `esi`, `edi`, `esp`, `rbp`. Let's see them one by one: ### xor eax,eax This is as simple as it sounds. Just XOR `eax` with itself. It works on any register, and width (i.e. it also works with `rax`, `edi`, etc). XOR is a notoriously slow operation, but it allows for very compact opcodes (2 bytes) if you are space-constrained: ``` $ rasm2 -a x86 -b 64 'xor eax,eax' 31c0 ``` It will take one more byte if we use the 64-bit wide register: ``` $ rasm2 -a x86 -b 64 'xor rax,rax' 4831c0 ``` ### lea rbx,[0] LEA stands for `load effective address`. This instruction loads the source value `[0]` into the 64-bit-wide `rbx` register. See the [LEA instruction's reference](https://c9x.me/x86/html/file_module_x86_id_153.html) for details. The source value uses the square brackets syntax, which means "dereference the value of the address within the square brackets", i.e. look for the content of the first 64 bits at the address `0` (i.e. like `*(void *)0` in C). In protected mode, and on a modern Linux kernel, this means accessing a reserved, and normally zero'ed area (see [Linux's memory layout](https://www.kernel.org/doc/html/latest/x86/boot.html#memory-layout)). This instruction requires memory access, and as such it is very time-inefficient. It is also less space-efficient than `xor`, as it takes 7 bytes: ``` $ rasm2 -a x86 -b 64 'lea rbx,[0]' 488d1df9ffffff ``` Even if we swap `rbx` for `ebx` it still takes 7 bytes: ``` $ rasm2 -a x86 -b 64 'lea ebx,[0]' 488d1df9ffffff ``` We have to assemble it in 32-bit mode to save one byte: ``` $ rasm2 -a x86 -b 32 'lea ebx,[0]' 8d1d00000000 ``` ### loop $ The [`LOOP`](https://c9x.me/x86/html/file_module_x86_id_161.html) instruction can be confusing: how can it set a register to zero? There is a simple explanation: `LOOP` jumps to the address indicated by its operand (`$`) until the value in `ecx` (or `cx`) reaches zero. The operand is a relative address to the `LOOP` instruction, and in our example `$` means "the current address" (this value has to be expanded by the assembler program, e.g. `nasm`). In practice, this instruction will loop on itself until `ecx` is zero before it continues to the next instruction, in fact setting `ecx` to 0. This instruction is space-efficient, as it takes only 2 bytes: ``` $ rasm2 -a x86 -b 64 'loop $' e2fe ``` However it can be time-inefficient if the value of `ecx` is large, because it will loop as many times as the value of `ecx`. The best case is when `ecx` is already zero of course. ### mov rdx,0 This is also as simple as it sounds: move 0 to `rdx`, 64-bit wide. It takes 7 bytes: ``` $ rasm2 -a x86 -b 64 'mov rdx,0' 48c7c200000000 ``` This is also quite time-efficient because an immediate value is written to a register. However, keep in mind that on modern CPUs things are less obvious than it seems, as this article shows: [Does register selection matter to performance on x86 CPUs?](https://fiigii.com/2020/02/16/Does-register-selection-matter-to-performance-on-x86-CPUs/). ### and esi,0 And this is also pretty obvious, and fast: a register is put to binary AND with 0, which results always in 0. Space efficiency is pretty good too: ``` $ rasm2 -a x86 -b 64 'and esi,0' 83e600 ``` and it gains a byte on 64-bit wide registers: ``` $ rasm2 -a x86 -b 64 'and rsi,0' 4883e600 ``` ### sub edi,edi We are again in Obvious Land: subtract the value of `edi` from `edi`, and get a nice round 0. Efficient in both time and space: ``` $ rasm2 -a x86 -b 64 'sub edi,edi' 29ff ``` plus one byte for 64-bit wide registers: ``` $ rasm2 -a x86 -b 64 'sub rdi,rdi' 4829ff ``` ### push 0; pop rbp This is more indirect: [`PUSH`](https://c9x.me/x86/html/file_module_x86_id_269.html) stores the value 0 to the top of the stack, and decrements it by the stack pointer's size (16, 32, or 64 bit on x86). Then the value is [`POP`](https://c9x.me/x86/html/file_module_x86_id_248.html)'ed into `rbp`. These two instructions in fact set `rbp` to 0. Less time-efficient: 0 is first copied to memory at the value pointed by `rsp, `rsp` is adjusted, then the value is read from memory again into `rbp`, and `rsp` is adjusted again to go back to the original value. Space: ``` $ rasm2 -a x86 -b 64 'push 0; pop rbp' 6a005d ``` Not great, not terrible: `6a00` is `push 0`, `5b` is `pop rbp`.

[dec] [home] [inc]