Preface
Pointers are everywhere in C code. With a little extra work, a pointer can also carry a small amount of additional information. The technique relies on the natural alignment of data in memory.
Data is not stored at arbitrary addresses. Processors usually read memory in blocks of the machine word size, so for efficiency the compiler aligns objects in memory at integer multiples of their alignment requirement. On a 32-bit processor, for example, a 4-byte integer must be stored at an address that is divisible by 4.
In what follows, assume that integers and pointers are both 4 bytes wide.
Consider a pointer to an integer. As mentioned above, an integer can be stored at address 0x1000, 0x1004 or 0x1008, but never at 0x1001, 0x1002, 0x1003 or any other address that is not divisible by 4. In binary, every multiple of 4 ends in 00, which means that the lowest two bits of every pointer to an integer are always 0.
Those two bits therefore carry no information. The trick is to place our own data in these two bits, read it back when needed, and clear the bits again before accessing memory through the pointer.
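C does not define bitwise operators on pointer values, so we store the pointer in an unsigned integer. The standard type intended for this is uintptr_t from <stdint.h>; under the 4-byte assumption above, unsigned int has the same width.
Below is a short code snippet. The complete code is in the GitHub repository hide-data-in-ptr.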
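#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Store a 2-bit value in the two low bits of the pointer held in *p. */
void put_data(uintptr_t *p, unsigned int data)
{
    assert(data < 4);
    *p |= data;
}

/* Read back the value hidden in the two low bits. */
unsigned int get_data(uintptr_t p)
{
    return (unsigned int)(p & 3);
}

/* Clear the two low bits so the pointer can be dereferenced again. */
void cleanse_pointer(uintptr_t *p)
{
    *p &= ~(uintptr_t)3;
}

int main(void)
{
    unsigned int x = 701;
    uintptr_t p = (uintptr_t)&x;   /* the pointer, held as an integer */

    printf("Original ptr: %" PRIuPTR "\n", p);
    put_data(&p, 3);
    printf("ptr with data: %" PRIuPTR "\n", p);
    printf("data stored in ptr: %u\n", get_data(p));
    cleanse_pointer(&p);
    printf("Cleansed ptr: %" PRIuPTR "\n", p);
    printf("Dereferencing cleansed ptr: %u\n", *(unsigned int *)p);
    return 0;
}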
Sample output (the addresses will differ between machines and runs):
Original ptr: 3216722220
ptr with data: 3216722223
data stored in ptr: 3
Cleansed ptr: 3216722220
Dereferencing cleansed ptr: 701
Any value that fits in two bits can be stored in the pointer. The put_data() function sets the lowest two bits of the pointer to the value to be stored, and get_data() reads it back: it masks out every bit except the lowest two, leaving only the hidden data.
The cleanse_pointer() function sets the lowest two bits back to zero so that the pointer can be dereferenced safely. Note that although some CPUs (such as Intel's) allow access to unaligned addresses, other CPUs (such as some ARM processors) may raise an access fault. Even where unaligned access is permitted, a pointer that still carries data refers to the wrong address. So always make sure the pointer holds an aligned address before dereferencing it.
Is this applicable in practice?
Yes, it is used in practice. One example is the implementation of red-black trees in the Linux kernel (link: /torvalds/linux/blob/master/include/linux/).
The node definition of the tree is as follows:
struct rb_node {
    unsigned long __rb_parent_color;
    struct rb_node *rb_right;
    struct rb_node *rb_left;
} __attribute__((aligned(sizeof(long))));
Here unsigned long __rb_parent_color stores the following information:
The address of the parent node
The color of the node
Red is represented by 0 and black by 1.
As in the previous example, this data is hidden in the unused low bits of the parent pointer.
Let’s take a look at how the parent pointer and color information are obtained:
/* in */
#define rb_parent(r) ((struct rb_node *)((r)->__rb_parent_color & ~3))

/* in rbtree_augmented.h */
#define __rb_color(pc) ((pc) & 1)
#define rb_color(rb) __rb_color((rb)->__rb_parent_color)
"Every bit in memory is precious; we should never waste it." (Author of this article)
Summary
That is all for this article. I hope it is a useful reference for your study or work. If you have any questions, please leave a comment. Thank you for your support.