SoFunction
Updated on 2025-04-06

A detailed explanation of endianness (big-endian and little-endian) and bit fields

Big-endian and little-endian:

A type such as char in C++ occupies a single byte, so byte order is not an issue. An int, however, occupies 4 bytes (32 bits) on a 32-bit system, which raises the question of the order in which those 4 bytes are stored in memory. For example, given int maxHeight = 0x12345678 with &maxHeight = 0x0042ffc4, how exactly are the bytes laid out? To answer that, you need to understand big-endian and little-endian byte order.

Big-endian: the high-order byte of the value is stored at the low memory address, and the low-order byte is stored at the high memory address.

Little-endian: the high-order byte of the value is stored at the high memory address, and the low-order byte is stored at the low memory address.

The x86 architectures we commonly use are all little-endian. Most DSPs and ARM processors are also little-endian, although some ARM processors can be switched to big-endian mode. On such a machine, maxHeight above is stored in little-endian order. The two tables below show both layouts.

Address   0x0042ffc4   0x0042ffc5   0x0042ffc6   0x0042ffc7
Value     0x78         0x56         0x34         0x12

Figure (1): little-endian mode

Address   0x0042ffc4   0x0042ffc5   0x0042ffc6   0x0042ffc7
Value     0x12         0x34         0x56         0x78

Figure (2): big-endian mode

The tables above show the difference between big-endian and little-endian storage. Which of the two is better is not discussed here; personally, I find big-endian closer to how I normally read numbers. (Note: computer memory itself has no notion of data types such as char or int. A type in code only tells the compiler how many bytes to read from an address each time and how to assign them to the corresponding variable.)
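To see which layout a given machine actually uses, the bytes of maxHeight can be read back one at a time in increasing address order. The following is only a minimal sketch: the helper names is_little_endian, bytes_in_memory and print_layout are made up for this illustration, and std::uint32_t stands in for the 4-byte int discussed above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Returns true if the least significant byte sits at the lowest address.
bool is_little_endian() {
    const std::uint32_t probe = 1;
    unsigned char first_byte;
    std::memcpy(&first_byte, &probe, 1);  // read the byte at the lowest address
    return first_byte == 1;
}

// Copies the four bytes of `value` into `out` in increasing address order.
void bytes_in_memory(std::uint32_t value, unsigned char out[4]) {
    std::memcpy(out, &value, 4);
}

// Prints the detected byte order and the bytes of maxHeight as stored.
void print_layout() {
    const std::uint32_t maxHeight = 0x12345678;
    unsigned char bytes[4];
    bytes_in_memory(maxHeight, bytes);

    // The first byte must match one of the two tables above.
    assert(bytes[0] == (is_little_endian() ? 0x78 : 0x12));

    std::printf("%s-endian:", is_little_endian() ? "little" : "big");
    for (int i = 0; i < 4; ++i) {
        std::printf(" 0x%02x", bytes[i]);
    }
    std::printf("\n");
}
```

Calling print_layout() from main on an x86 machine should list the bytes in the order of Figure (1); a big-endian machine should list them in the order of Figure (2). Reading the same four bytes through an unsigned char view also illustrates the note above: the memory is identical, and only the type decides how many bytes are read at once.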

Bit field:

Computers represent data with binary 0s and 1s. Each 0 or 1 occupies 1 bit of storage, and 8 bits form a byte, which is the smallest unit an ordinary data type can occupy; char, for example, occupies one byte. Sometimes, though, a piece of data in a program does not need a whole byte. The state of a switch is only on or off and can be represented by 1 and 0, so a single bit is enough. Storing it in a full byte wastes the other 7 bits. For this reason the C language provides bit fields (also called bit segments; the two terms mean the same thing). The syntax is to add a colon (:) and the number of bits after the member name. The definition syntax is as follows:


struct struct_name
{
    member_type member_name : width_in_bits;
    ...
};

// Example
struct Node
{
    char a : 2;   // 2-bit field
    double i;     // ordinary member
    int c : 4;    // 4-bit field
} node;


The definition is simple. The example above defines a char member a that occupies 2 bits, an ordinary double member i, and an int member c that occupies 4 bits. Note that the bit-field width replaces the usual size of the type: a no longer occupies a full byte, and c no longer occupies the 4 bytes an int normally takes. In an actual running environment, sizeof(node) = 24 because of memory alignment. On a typical compiler where double is 8-byte aligned, a is padded out to offset 8, i takes the next 8 bytes, and c plus trailing padding takes the last 8.
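The effect of bit fields on size and value range can be checked directly. The sketch below reuses the Node layout from the example and adds two hypothetical pieces that are not in the original: a Switches struct holding eight one-bit on/off flags, and a helper store_in_4_bits that shows what a 4-bit unsigned field can hold. The exact sizes are implementation-defined, so the code prints them rather than assuming them.

```cpp
#include <cassert>
#include <cstdio>

// Same members as the Node example above.
struct Node {
    char a : 2;
    double i;
    int c : 4;
};

// Hypothetical example: eight on/off switches, one bit each.
struct Switches {
    unsigned char s0 : 1;
    unsigned char s1 : 1;
    unsigned char s2 : 1;
    unsigned char s3 : 1;
    unsigned char s4 : 1;
    unsigned char s5 : 1;
    unsigned char s6 : 1;
    unsigned char s7 : 1;
};

// A struct with a single 4-bit unsigned field, used by the helper below.
struct Nibble {
    unsigned v : 4;
};

// Stores `value` in a 4-bit unsigned field and reads it back.
// Only the low 4 bits survive, so the result is value % 16.
unsigned store_in_4_bits(unsigned value) {
    Nibble n;
    n.v = value;
    return n.v;
}

// Prints the sizes chosen by the current compiler and checks the wrap-around.
void print_bit_field_demo() {
    std::printf("sizeof(Node)     = %zu\n", sizeof(Node));
    std::printf("sizeof(Switches) = %zu\n", sizeof(Switches));

    assert(store_in_4_bits(15) == 15);  // largest value a 4-bit field holds
    assert(store_in_4_bits(17) == 1);   // 17 needs 5 bits; the top bit is dropped
}
```

On a common 64-bit build this is expected to report 24 for Node, matching the alignment explanation above, while the eight switches usually fit in a single byte. A width in bits also limits the range of values, which is why 17 does not survive a round trip through a 4-bit field.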