0x04
# toupper() and tolower()
This instruction swaps the case of ASCII letters.
This is one of the oldest and best-known ASCII tricks: the ASCII characters are
arranged so that the only difference between a lower-case and an upper-case
Latin letter is the 6th bit (counting from 1 at the least significant bit, so
the bit with value `0x20`): set to 1 for lower-case, set to 0 for upper-case.
A quick look at the 8-bit binary representation of a few ASCII letters:
a = 01100001
A = 01000001
b = 01100010
B = 01000010
Notice the difference? Just the 6th bit between lower and upper case.
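To check the table above for yourself, here is a minimal C sketch. The helper names `to_binary8` and `differing_bits` are made up for this illustration; they are not part of any library.

```c
/* Write the 8-bit binary form of c into out (out needs room for 9 chars). */
void to_binary8(unsigned char c, char out[9]) {
    for (int i = 0; i < 8; i++) {
        /* Most significant bit first: bit 7 ends up in out[0]. */
        out[i] = (c & (0x80 >> i)) ? '1' : '0';
    }
    out[8] = '\0';
}

/* XOR of two bytes leaves a 1 only in the positions where they differ. */
unsigned char differing_bits(unsigned char x, unsigned char y) {
    return (unsigned char)(x ^ y);
}
```

Calling `differing_bits('a', 'A')` or `differing_bits('b', 'B')` gives `0x20`, which is exactly the 6th bit.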
So, the trick is to do a [`XOR`](https://en.wikipedia.org/wiki/Exclusive_or)
between `al` (the lowest order byte of `rax`) and `0x20`, and here we go,
'a' becomes 'A' and 'A' becomes 'a'.
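The same XOR can be sketched in C. The function names `swap_case` and `swap_case_str` are hypothetical, and the sketch assumes every byte it is given is an ASCII letter; anything else would have its 6th bit flipped as well.

```c
/* Flip the 6th bit (0x20) of an ASCII letter, like `xor al,0x20` does. */
unsigned char swap_case(unsigned char c) {
    return (unsigned char)(c ^ 0x20);
}

/* Swap the case of every byte of a NUL-terminated string, in place.
   Assumption: the string contains ASCII letters only. */
void swap_case_str(char *s) {
    for (; *s != '\0'; s++) {
        *s = (char)swap_case((unsigned char)*s);
    }
}
```

Applying `swap_case` twice returns the original letter, since XOR with the same mask is its own inverse.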
This instruction lets you go back and forth between upper and lower case, but
there's more. As you have probably already imagined, you can implement `tolower`
by setting the 6th bit with a binary OR, and implement `toupper` by clearing
the 6th bit with an AND NOT, provided the input is known to be an ASCII letter.
In C:
```
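/* Set the 6th bit (0x20): 'A' -> 'a'. Assumes c is an ASCII letter. */
int
tolower(int c) {
    return c | 0x20;
}

/* Clear the 6th bit (0x20): 'a' -> 'A'. Assumes c is an ASCII letter. */
int
toupper(int c) {
    return c & ~0x20;
}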
```
Or, in assembly:
```
tolower:
    or al,0x20      ; set the 6th bit
    ...
toupper:
    and al,0xdf     ; clear the 6th bit (0xdf is ~0x20 in 8 bits)
```
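One caveat: the bare OR and AND versions also change bytes that are not letters. For example `'@' | 0x20` gives a backtick, and `'1' & ~0x20` gives the control character `0x11`. A possible guarded variant, under the assumption that only the 26 unaccented ASCII letters should be converted, could look like this (the names `ascii_tolower` and `ascii_toupper` are made up for this sketch):

```c
/* Set the 6th bit only when c is an upper-case ASCII letter. */
int ascii_tolower(int c) {
    return (c >= 'A' && c <= 'Z') ? (c | 0x20) : c;
}

/* Clear the 6th bit only when c is a lower-case ASCII letter. */
int ascii_toupper(int c) {
    return (c >= 'a' && c <= 'z') ? (c & ~0x20) : c;
}
```

The range check costs a couple of comparisons, but digits, punctuation and control characters pass through unchanged.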