3 GCC Inline ASM
GCC supports embedding assembly code in C/C++ code; this is called GCC Inline ASM (GCC inline assembly). It is a very useful feature: it lets us slip instructions that cannot be expressed in C/C++ syntax directly into C/C++ code, and it also lets us write concise, efficient assembly right inside C/C++ code.
1. Basic inline assembly
Basic inline assembly in GCC is very easy to understand. Let's first look at a few simple examples:
__asm__("movl %esp,%eax"); // It looks very familiar!
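Or
__asm__("
    movl $1,%eax    # SYS_exit
    xor %ebx,%ebx
    int $0x80
");
(A string literal that spans several lines like this is accepted by old compilers such as GCC 2.96; newer GCC releases reject a bare newline inside a string literal, so with them use the next form.)
or
__asm__("movl $1,%eax\n\t"
        "xor %ebx,%ebx\n\t"
        "int $0x80");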
The format of basic inline assembly is
__asm__ __volatile__("Instruction List");
1. __asm__
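__asm__ is GCC's alternate spelling of the keyword asm, which remains available even when strict ISO C options disable plain asm. You can think of it as if it were defined by: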
#define __asm__ asm
__asm__ or asm introduces an inline assembly expression; every inline assembly expression must start with it.
2. Instruction List
The Instruction List is a sequence of assembly instructions. It can be empty: for example, __asm__ __volatile__(""); and __asm__ (""); are completely legal inline assembly expressions, although these two statements do nothing useful. But not every inline assembly expression with an empty Instruction List is meaningless. For example, __asm__ ("":::"memory"); is very meaningful: it declares to GCC, "I made changes to memory", and GCC takes this into account when compiling.
Let's take a look at the following example:
$ cat
int main(int __argc, char* __argv[])
{
    int* __p = (int*)__argc;
    (*__p) = 9999;
    //__asm__("":::"memory");
    if((*__p) == 9999)
        return 5;
    return (*__p);
}
In this code the inline assembly statement is commented out. Before it, the memory that the pointer __p points to is assigned 9999. After it, an if statement checks whether the memory pointed to by __p equals 9999. Obviously it does, and GCC is smart enough to discover this when compiling with optimization. We compile it using the following command line:
$ gcc -O -S
The -O option enables optimization; we can also specify an optimization level, for example -O2 for level 2. The -S option compiles the C/C++ source file into an assembly file with the same name as the source file, except that the extension changes from .c to .s.
Let's check the compilation result. Here we list only the assembly code of the relevant function, compiled on Red Hat 7.3 using GCC 2.96. For clarity, unrelated code is not listed.
$ cat
main:
    pushl %ebp
    movl %esp, %ebp
    movl 8(%ebp), %eax      # int* __p = (int*)__argc
    movl $9999, (%eax)      # (*__p) = 9999
    movl $5, %eax           # return 5
    popl %ebp
    ret
Comparing the C source with the compiled assembly, we find no code for the if statement in the assembly; it returns 5 directly after the assignment (*__p) = 9999. This is because GCC concludes that nothing changes the content of (*__p) between the assignment and the if statement, so the condition (*__p) == 9999 is certainly true. GCC therefore generates no code for the test and directly generates the assembly for return 5 (GCC uses eax as the register that holds the return value).
We now uncomment the inline assembly statement, recompile, and look at the result again.
$ gcc -O -S
$ cat
main:
    pushl %ebp
    movl %esp, %ebp
    movl 8(%ebp), %eax      # int* __p = (int*)__argc
    movl $9999, (%eax)      # (*__p) = 9999
#APP
                            # __asm__("":::"memory")
#NO_APP
    cmpl $9999, (%eax)      # (*__p) == 9999 ?
    jne .L3                 # false
    movl $5, %eax           # true, return 5
    jmp .L2
    .p2align 2
.L3:
    movl (%eax), %eax
.L2:
    popl %ebp
    ret
Because the inline assembly statement __asm__("":::"memory") declares to GCC that memory contents may change at the point where it appears, GCC can no longer treat the code as it did before. This time GCC dutifully generated assembly code for the if statement.
Some people may ask: why use __asm__("":::"memory") to declare memory changes to GCC? The "Instruction List" is clearly empty and performs no memory operation. Doing so only increases the amount of assembly code GCC generates.
Indeed, that inline assembly statement does nothing to memory; in fact it does nothing at all. But the program you are running is not the only thing that can affect memory contents. For example, the memory you are operating on may be memory-mapped peripheral I/O device registers. Then not only the current program but also the I/O device operates on this memory. Since both operate on the same piece of memory, neither side can take its contents for granted at any time. So when you write such programs in a high-level language like C/C++, you must make the compiler understand this; after all, high-level languages are eventually compiled into assembly code.
You may have noticed two markers in the assembly output this time: #APP and #NO_APP. GCC places the instructions from the "Instruction List" of an inline assembly statement between #APP and #NO_APP. Since the "Instruction List" in __asm__("":::"memory") is empty, there is nothing between them. Later examples will show this more clearly.
Regarding why inline assembly __asm__("":::"memory") is a statement that declares memory changes, we will discuss in detail later.
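As a minimal sketch of where such a declaration is typically placed, the statement can be wrapped in a macro and put between a store and a later load of the same location. The names COMPILER_BARRIER and store_then_reload are made up for illustration. Because the Instruction List is empty, no instruction is emitted and the sketch does not depend on the target architecture.

```c
/* Compiler-level barrier: emits no instruction, but tells GCC that
   memory may have changed, so values cached in registers must be
   reloaded after this point. (Hypothetical macro name.) */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

/* Hypothetical helper: store a value, then read it back through the
   same pointer with a barrier in between. */
int store_then_reload(int *p, int value)
{
    *p = value;
    COMPILER_BARRIER();   /* GCC may not assume *p == value past here */
    return *p;            /* a real load from memory is generated */
}
```

The return value is the same with or without the barrier when nothing else touches the memory; the difference shows up only in the generated assembly, where the load after the barrier is kept.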
We have spent a lot of space on the case where the "Instruction List" is empty, but in actual programming the "Instruction List" is usually not empty. It can contain one assembly instruction or any number of them.
When the "Instruction List" contains multiple instructions, you can put all of them in one pair of quotes, or put one or several instructions in each of several pairs of quotes. In the former case you can put each instruction on its own line. If you want to put multiple instructions on one line, you must separate them with a semicolon (;) or a newline character (\n, usually followed by \t, where \n starts a new line and \t inserts a tab). For example:
__asm__("movl %eax, %ebx
         sti
         popl %edi
         subl %ecx, %ebx");
__asm__("movl %eax, %ebx; sti
         popl %edi; subl %ecx, %ebx");
__asm__("movl %eax, %ebx; sti\n\t popl %edi
         subl %ecx, %ebx");
All of these are legal. If you put the instructions in multiple pairs of quotes, then in every pair except the last, the final instruction must be followed by a semicolon (;), (\n) or (\n\t). For example:
__asm__("movl %eax, %ebx
         sti\n"
        "popl %edi;"
        "subl %ecx, %ebx");
__asm__("movl %eax, %ebx; sti\n\t"
        "popl %edi; subl %ecx, %ebx");
__asm__("movl %eax, %ebx; sti\n\t popl %edi\n"
        "subl %ecx, %ebx");
__asm__("movl %eax, %ebx; sti\n\t popl %edi;" "subl %ecx, %ebx");
All are legal.
The above principles can be summarized as follows:
Any two instructions are either separated by a semicolon (;) or placed on two lines. Placing them on two lines can be done with \n or with an actual line break.
One or more pairs of quotes can be used, each pair can hold multiple instructions, and all instructions must be inside quotes.
In basic inline assembly, the "Instruction List" is written in exactly the same format as assembly in a standalone assembly file. You can define labels, alignment (.align n), and sections (.section name). For example:
__asm__(".align 2\n\t"
        "movl %eax, %ebx\n\t"
        "test %ebx, %ecx\n\t"
        "jne error\n\t"
        "sti\n\t"
        "error: popl %edi\n\t"
        "subl %ecx, %ebx");
The format used in the example above is the one commonly used for inline assembly in Linux code, and it is very neat. It is recommended that everyone write inline assembly in this format.
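The separator rules above can be tried safely with an instruction that has no side effects. The following is only a sketch: the function name run_nops is invented here, and it assumes a target whose assembler knows a nop instruction and treats the semicolon as a statement separator (true for the GNU assembler on x86).

```c
/* Three equivalent ways of separating two instructions.
   nop changes no register or memory, so basic inline assembly
   is harmless here. */
int run_nops(void)
{
    __asm__ __volatile__("nop; nop");      /* semicolon on one line */
    __asm__ __volatile__("nop\n\tnop");    /* \n\t inside one pair of quotes */
    __asm__ __volatile__("nop\n\t"         /* one pair of quotes per line */
                         "nop");
    return 3;   /* number of inline assembly statements executed */
}
```

Compiling this with gcc -S and looking between the #APP and #NO_APP markers shows how each spelling ends up in the assembly file.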
3. __volatile__
__volatile__ is GCC's alternate spelling of the keyword volatile. You can think of it as if it were defined by:
#define __volatile__ volatile
__volatile__ or volatile is optional. If you use it, you are telling GCC: "Don't touch the Instruction List I wrote; I need every instruction kept intact." Otherwise, when you compile with optimization (-O), GCC decides for itself whether to optimize the instructions in this inline assembly expression.
So what rule does GCC follow in making this decision? I don't know (if anyone knows, please tell me). I experimented and found that for basic inline assembly (that is, inline assembly with only an "Instruction List" and no Input/Output/Clobber, which we discuss later), GCC 2.96 keeps the "Instruction List" intact under optimization whether or not __volatile__ is used. But my experiments may not be exhaustive, so this cannot be guaranteed. (For what it is worth, the current GCC manual states that basic asm, and extended asm without output operands, are implicitly volatile, which agrees with this observation.)
To be on the safe side, if you don't want GCC's optimization to affect your inline assembly code, add __volatile__ rather than relying on how the compiler happens to behave. Even if you know the current compiler's optimization rules, you cannot guarantee that they will not change in the future, whereas the meaning of __volatile__ is constant.
2. Inline assembly with C/C++ expressions
GCC allows you to specify the inputs and outputs of the instructions in the "Instruction List" through C/C++ expressions. You do not even have to care which register is used; you can leave that entirely to GCC. This frees programmers from juggling a limited set of registers and can also improve the efficiency of the generated code.
Let's first look at a few examples:
__asm__ (" " : : : "memory" );   // The aforementioned
__asm__ ("mov %%eax, %%ebx" : "=b"(rv) : "a"(foo) : "eax", "ebx");
__asm__ __volatile__("lidt %0": "=m" (idt_descr));
__asm__("subl %2,%0\n\t"
        "sbbl %3,%1"
        : "=a" (endlow), "=d" (endhigh)
        : "g" (startlow), "g" (starthigh), "0" (endlow), "1" (endhigh));
How about that? Do they look somewhat familiar? Are you a little dizzy? It doesn't matter, you won't be dizzy after the discussion below. (Of course, you may be even dizzier ^_^.) Let the discussion begin.
The inline assembly format with C/C++ expressions is:
__asm__ __volatile__("Instruction List" : Output : Input : Clobber/Modify);
From this we can see that it differs from basic inline assembly in having 3 additional parts (Output, Input, Clobber/Modify). The 4 parts inside the parentheses are separated by colons (:).
None of these 4 parts is mandatory; any of them can be empty. The rules are:
(1) If Clobber/Modify is empty, the colon (:) before it must be omitted. For example, __asm__("mov %%eax, %%ebx" : "=b"(foo) : "a"(inp) : ) is illegal, while __asm__("mov %%eax, %%ebx" : "=b"(foo) : "a"(inp) ) is correct.
(2) If the Instruction List is empty, Input, Output, and Clobber/Modify can each be empty or non-empty. For example, __asm__ ( " " : : : "memory" ); and __asm__(" " : : ); are both legal.
(3) If Output, Input, and Clobber/Modify are all empty, the colons (:) before Output and Input may be kept or omitted. If both are omitted, the statement degenerates into basic inline assembly. Otherwise it is still inline assembly with C/C++ expressions, and registers in the "Instruction List" must be written accordingly: two percent signs (%%) are required before a register name, instead of the single percent sign (%) used in the basic format. For example, __asm__( " mov %%eax, %%ebx" : : ); __asm__( " mov %%eax, %%ebx" : ) and __asm__( " mov %eax, %ebx" ) are all correct, while __asm__( " mov %eax, %ebx" : : ); __asm__( " mov %eax, %ebx" : ) and __asm__( " mov %%eax, %%ebx" ) are all wrong.
(4) If Input and Clobber/Modify are empty but Output is not, the colon (:) before Input may be kept or omitted. For example, __asm__( " mov %%eax, %%ebx" : "=b"(foo) : ); and __asm__( " mov %%eax, %%ebx" : "=b"(foo) ) are both correct.
(5) If a later part is not empty and an earlier part is empty, the colon (:) of the earlier part must be kept; otherwise there is no way to tell which part the non-empty one is. For example, if Clobber/Modify and Output are empty and Input is not, the colon before Clobber/Modify must be omitted (the first rule) and the colon before Output must be kept, as in __asm__( " mov %%eax, %%ebx" : : "a"(foo) ). If Clobber/Modify is not empty and both Input and Output are empty, the colons before both Input and Output must be kept, as in __asm__( " mov %%eax, %%ebx" : : : "ebx" ).
From the above rules we can see another fact: whether an inline assembly statement is in the basic format or in the format with C/C++ expressions is decided by whether a colon (:) follows the "Instruction List". If there is none, it is in the basic format; otherwise it is in the format with C/C++ expressions.
The two formats have different requirements for register syntax: the basic format requires exactly one percent sign (%) before a register, the same as non-inline assembly, while the format with C/C++ expressions requires two percent signs (%%) before a register. We will discuss the reason later.
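The four-part layout and the %% rule can be seen together in a small, unprivileged sketch. The function name copy_via_eax is invented for illustration, the "r" constraint and the clobber list are explained later in the text, and the plain C branch is only a fallback assumed for non-x86 targets.

```c
/* Copies `in` to the result by way of %eax (hypothetical helper). */
int copy_via_eax(int in)
{
    int out;
#if defined(__i386__) || defined(__x86_64__)
    /* Format with C/C++ expressions: literal registers are written
       with %%, operands are referred to as %0, %1, ... */
    __asm__("movl %1, %%eax\n\t"
            "movl %%eax, %0"
            : "=r"(out)      /* Output */
            : "r"(in)        /* Input */
            : "eax");        /* Clobber/Modify: %eax is overwritten */
#else
    out = in;                /* fallback where the x86 asm does not apply */
#endif
    return out;
}
```

Writing a single percent sign before eax in this statement would be rejected, because a colon follows the Instruction List and the statement is therefore in the format with C/C++ expressions.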
1. Output
Output is used to specify the output of the current inline assembly statement. Let's take a look at this example:
__asm__("movl %%cr0, %0": "=a" (cr0));
The Output part of this inline assembly statement is "=a"(cr0), an "operation expression" that specifies one output operation. We can see that it consists of two parts: the part enclosed in parentheses, (cr0), and the part enclosed in quotes, "=a". Both parts are required for every output operation. The part in parentheses is a C/C++ expression used to store one output value of the inline assembly; its effect is equivalent to the C/C++ assignment cr0 = output_value. Therefore the expression in parentheses must be a C/C++ lvalue expression, that is, an expression that may legally appear on the left of the equal sign (=) in a C/C++ assignment. So where does the rvalue output_value come from?
The answer is the content in quotes, called the "Operation Constraint". In this example the operation constraint is "=a", which contains two constraints: the equal sign (=) and the letter a. The equal sign (=) means that the lvalue expression cr0 in the parentheses is Write-Only: it can only be used as an output of the current inline assembly, not as an input. The letter a is the abbreviation for register EAX/AX/AL and means that the value of cr0 is taken from the eax register, that is, cr0 = eax. In the end this becomes the assembly instruction movl %eax, address_of_cr0. So the operation constraint tells GCC which register the value passed to cr0 comes from.
Note also that many documents claim that the operation constraint of every output operation must contain an equal sign (=), but the GCC documentation clearly states otherwise. The equal sign (=) marks the expression as Write-Only, and there is another symbol, the plus sign (+), that marks it as Read-Write; if neither symbol appears in an operation constraint, the expression is Read-Only. An output operation must be writable, and both the equal sign (=) and the plus sign (+) mean writable, with the plus sign (+) additionally meaning readable. Therefore the operation constraint of an output operation needs either an equal sign (=) or a plus sign (+).
The difference between the two is this: the equal sign (=) means the operation expression is a pure output, while the plus sign (+) means it is both an output and an input. Either way, operation expressions constrained by the equal sign (=) or the plus sign (+) can only be placed in the Output field, not in the Input field.
Some documents also state that although the GCC documentation describes the plus sign (+) constraint, it fails in actual compilation. I do not know how older versions behave, but the plus sign (+) constraint works normally for me in GCC 2.96.
Let's take an example to see the difference between using the equal sign (=) and the plus sign (+) constraint in an output operation.
Consider first the case where the equal sign (=) constraint is used, that is, "=a" (cr0). The variable cr0 is placed in memory at -4(%ebp), so the instruction movl %eax, -4(%ebp) that follows the "Instruction List" outputs the content of %eax to the variable cr0.
Here is a case where the plus sign (+) constraint is used:
$ cat
int main(int __argc, char* __argv[])
{
    int cr0 = 5;
    __asm__ __volatile__("movl %%cr0, %0" : "+a" (cr0));
    return 0;
}
$ gcc -S
$ cat
main:
    pushl %ebp
    movl %esp, %ebp
    subl $4, %esp
    movl $5, -4(%ebp)       # cr0 = 5
    movl -4(%ebp), %eax     # input ( %eax = cr0 )
#APP
    movl %cr0, %eax
#NO_APP
    movl %eax, -4(%ebp)     # output ( cr0 = %eax )
    movl $0, %eax
    leave
    ret
From the compilation results, we can see that when using the plus sign (+) constraint, cr0 is not only used as the output, but also as the input. The registers used are all specified by the register constraint (letter a, which means using the eax register). We will discuss register constraints later.
There can be multiple output operation expressions in the Output field, and the multiple operation expressions must be separated by commas (,). For example:
__asm__("movl %%eax, %0 \n\t"
        "pushl %%ebx \n\t"
        "popl %1 \n\t"
        "movl %1, %2"
        : "+a"(cr0), "=b"(cr1), "=c"(cr2));
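The examples in this section use privileged registers such as %cr0, which cannot be run from an ordinary program. As a sketch that can be run in user mode, the two functions below contrast the plus sign (+) and the equal sign (=). The function names are invented here, "r" (a general register chosen by GCC) is explained under register constraints, "cc" tells GCC that the flags change, and the plain C branches are fallbacks assumed for non-x86 targets.

```c
/* "+r": x is read by the instruction and written back afterwards. */
int increment(int x)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__("addl $1, %0" : "+r"(x) : : "cc");
#else
    x += 1;
#endif
    return x;
}

/* "=r": out is only written; its previous value is never loaded. */
int load_seven(void)
{
    int out;
#if defined(__i386__) || defined(__x86_64__)
    __asm__("movl $7, %0" : "=r"(out));
#else
    out = 7;
#endif
    return out;
}
```

With "=r" in increment, GCC would be free to leave the old value of x out of the register, and the addition would work on whatever happened to be there.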
2. Input
The contents of the Input field are used to specify the input of the current inline assembly statement. Let's take a look at this example:
__asm__("movl %0, %%db7" : : "a" (cpu->db7));
In the example, the content of the Input field is the expression "a"(cpu->db7), called an "input expression", which represents one input to the current inline assembly.
Like an output expression, an input expression has two parts: the part in parentheses, (cpu->db7), and the part in quotes, "a". Both parts are required for an inline assembly input expression.
The expression cpu->db7 in parentheses is a C/C++ expression. It does not have to be an lvalue expression: it can be an expression that goes on the left side of a C/C++ assignment, but also one that can only go on the right side. So it can be a variable, a number, or a complex expression (such as a+b/c*d). For example, the example above can be changed to __asm__("movl %0, %%db7" : : "a" (foo)), __asm__("movl %0, %%db7" : : "a" (0x1000)) or __asm__("movl %0, %%db7" : : "a" (va*vb/vc)).
The part in quotes is the constraint. Unlike an output constraint, it does not allow the plus sign (+) or the equal sign (=), so it can only be the default Read-Only. The constraint must also say how the value is passed in; the letter a in the example means that the input cpu->db7 must enter the inline assembly through the register eax.
Let's look at an example:
$ cat
int main(int __argc, char* __argv[])
{
    int cr0 = 5;
    __asm__ __volatile__("movl %0, %%cr0"::"a" (cr0));
    return 0;
}
$ gcc -S
$ cat
main:
    pushl %ebp
    movl %esp, %ebp
    subl $4, %esp
    movl $5, -4(%ebp)       # cr0 = 5
    movl -4(%ebp), %eax     # %eax = cr0
#APP
    movl %eax, %cr0
#NO_APP
    movl $0, %eax
    leave
    ret
We can see from the compiled assembly code that before the "Instruction List", GCC loads the content of the variable cr0 into the eax register according to our input constraint "a".
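To illustrate that an input expression need not be an lvalue, here is a user-mode sketch in which the input is the expression a + b. The function name neg_sum is hypothetical, and the plain C branch is a fallback assumed for non-x86 targets.

```c
/* Returns -(a + b); the sum is computed in C and handed to the
   inline assembly as a Read-Only input. */
int neg_sum(int a, int b)
{
    int out;
#if defined(__i386__) || defined(__x86_64__)
    __asm__("movl %1, %0\n\t"   /* copy the input into the output register */
            "negl %0"           /* negate it in place */
            : "=r"(out)
            : "r"(a + b)        /* any rvalue expression is allowed here */
            : "cc");
#else
    out = -(a + b);
#endif
    return out;
}
```

GCC evaluates a + b first, loads the result into a register of its choice, and substitutes that register for %1 before the Instruction List runs.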
3. Operation Constraint
Each Input and Output expression must specify its own Operation Constraint. Let's discuss the operation constraints available on the 80386 platform.
1. Register Constraints
When an input or output needs to go through a register, you need to specify a register constraint for it. You can directly specify the name of a register, such as:
__asm__ __volatile__("movl %0, %%cr0"::"eax" (cr0));
You can also specify an abbreviation (the single-letter form listed in GCC's documented machine constraints), such as:
__asm__ __volatile__("movl %0, %%cr0"::"a" (cr0));
If you specify an abbreviation, such as the letter a, GCC decides whether to use %eax, %ax or %al based on the width of the C/C++ expression in the operation expression. For example:
unsigned short __shrt;
__asm__ ("mov %0,%%bx" : : "a"(__shrt));
Since the variable __shrt is a 16-bit short, the compiled assembly code uses the %ax register. The compiled result is:
    movw -2(%ebp), %ax      # %ax = __shrt
#APP
    mov %ax,%bx
#NO_APP
Register constraints can be used in both Input and Output operation expressions.
The abbreviations of commonly used register constraints are listed in the following table.
Constraint  Input/Output  Meaning
r           I,O           Use a general register; GCC selects whichever one it considers appropriate (not only %eax/%ebx/%ecx/%edx but also, for example, %esi and %edi)
q           I,O           Use one of %eax/%ax/%al, %ebx/%bx/%bl, %ecx/%cx/%cl, %edx/%dx/%dl, selected by GCC
a           I,O           Use %eax / %ax / %al
b           I,O           Use %ebx / %bx / %bl
c           I,O           Use %ecx / %cx / %cl
d           I,O           Use %edx / %dx / %dl
D           I,O           Use %edi / %di
S           I,O           Use %esi / %si
f           I,O           Use a floating point register
t           I,O           Use the first floating point register
u           I,O           Use the second floating point register
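The width rule for abbreviations can be checked with a user-mode sketch. The function name swap_bytes16 is invented here, and the plain C branch is a fallback assumed for non-x86 targets. Because the operand is a 16-bit unsigned short, the "a" constraint makes GCC substitute %ax for %0.

```c
/* Swaps the two bytes of a 16-bit value by rotating it 8 bits. */
unsigned short swap_bytes16(unsigned short v)
{
#if defined(__i386__) || defined(__x86_64__)
    /* v is 16 bits wide, so "a" resolves %0 to %ax (not %eax or %al). */
    __asm__("rolw $8, %0" : "+a"(v) : : "cc");
#else
    v = (unsigned short)((v << 8) | (v >> 8));
#endif
    return v;
}
```

For example, swap_bytes16(0x1234) yields 0x3412. Declaring v as an int instead would make GCC print %eax, which the 16-bit rolw instruction does not accept.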
2. Memory constraints
If the C/C++ expression of an Input/Output operation expression denotes a memory location and you do not want any register to be used, you can use a memory constraint.
For example:
__asm__ ("lidt %0" : "=m"(__idt_addr));
or
__asm__ ("lidt %0" : :"m"(__idt_addr));
Let's look at the results after they are placed in a C source file and then compiled by GCC:
$ cat
// In this example, the variable sh is an input in memory
int main(int __argc, char* __argv[])
{
    char* sh = (char*)&__argc;
    __asm__ __volatile__("lidt %0" : : "m" (sh));
    return 0;
}
$ gcc -S
$ cat
main:
    pushl %ebp
    movl %esp, %ebp
    subl $4, %esp
    leal 8(%ebp), %eax
    movl %eax, -4(%ebp)     # sh = (char*) &__argc
#APP
    lidt -4(%ebp)
#NO_APP
    movl $0, %eax
    leave
    ret
$ cat
// In this example, the variable sh is an output in memory
int main(int __argc, char* __argv[])
{
    char* sh = (char*)&__argc;
    __asm__ __volatile__("lidt %0" : "=m" (sh));
    return 0;
}
$ gcc -S
$ cat
main:
    pushl %ebp
    movl %esp, %ebp
    subl $4, %esp
    leal 8(%ebp), %eax
    movl %eax, -4(%ebp)     # sh = (char*) &__argc
#APP
    lidt -4(%ebp)
#NO_APP
    movl $0, %eax
    leave
    ret
First, you will notice that in both examples the variable sh does not go through any register; it takes part in the lidt instruction directly.
Second, if you look carefully you will find something surprising: the assembly code compiled from the two examples is identical, even though sh is used as an input in one example and as an output in the other. What is going on?
It turns out that when memory is used for input or output, GCC performs no load or store on your behalf, because no register is involved. GCC simply uses the memory operand directly. Whether the C/C++ expression is actually read or written depends entirely on the instructions you write in the "Instruction List".
In the example above the instruction is lidt, whose operand is an input operand, so the operation on sh is in fact an input operation, and placing it in the Output field does not change that. The semantically correct way to write this example is therefore to place sh in the Input field, although placing it in the Output field also executes correctly here.
So for these memory-constrained operation expressions, placing them in the Input field or the Output field makes no difference to the generated code. The reason we normally put an operation expression in the Input or Output field is so that GCC automatically moves its value in or out through a register for us, and GCC performs no such move for a memory operand. (This holds at least for unoptimized builds such as the ones above; with optimization enabled, GCC does use the Input/Output distinction to reason about whether the memory is read or written.) In any case, for the readability of the code it is best to put the expression where it actually belongs.
Constraint  Input/Output  Meaning
m           I,O           Use any memory addressing mode supported by the system; no register is needed
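A memory constraint can also be tried without a privileged instruction. The sketch below increments an int directly in memory; the function name bump_in_memory is hypothetical, and the plain C branch is a fallback assumed for non-x86 targets. The operand is declared with the plus sign (+) because incl both reads and writes it.

```c
/* Adds 1 to *counter with a single instruction operating on memory. */
void bump_in_memory(int *counter)
{
#if defined(__i386__) || defined(__x86_64__)
    /* "+m": %0 becomes a memory operand such as (%eax) or -4(%ebp);
       the value is never copied into a register by GCC. */
    __asm__("incl %0" : "+m"(*counter) : : "cc");
#else
    *counter += 1;
#endif
}
```

In the output of gcc -S, the operand of incl between #APP and #NO_APP is an address expression rather than a register name.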
3. Immediate number constraints
If the C/C++ expression of an Input operation expression is a numeric constant and you do not want any register to be used, you can use an immediate constraint.
Since immediate numbers can only be used as rvalues in C/C++, expressions that use immediate constraints can only be placed in the Input field.
For example: __asm__ __volatile__("movl %0, %%eax" : : "i" (100) );
The immediate constraint is simple and easy to understand, so we will not go into more detail here.
Constraint  Input/Output  Meaning
i           I             The input expression is an immediate number (integer); no register is needed
F           I             The input expression is an immediate number (floating point); no register is needed
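A runnable variant of the immediate example, as a sketch: the original statement overwrites %eax without telling GCC, so this version writes to an output operand instead. The function name load_hundred is invented here, and the plain C branch is a fallback assumed for non-x86 targets.

```c
/* Loads the constant 100 through an "i" (immediate) input operand. */
int load_hundred(void)
{
    int out;
#if defined(__i386__) || defined(__x86_64__)
    /* %1 is printed as $100, so the instruction becomes
       movl $100, <register chosen for out>. */
    __asm__("movl %1, %0" : "=r"(out) : "i"(100));
#else
    out = 100;
#endif
    return out;
}
```

The "i" constraint only accepts a value known at compile time; passing a variable there is rejected by GCC.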
4. General constraints
Constraint            Input/Output  Meaning
g                     I,O           Any form may be used: general register, memory, immediate number, and so on
0,1,2,3,4,5,6,7,8,9   I             Use the same register/memory as the nth operation expression
The general constraint g is very flexible. When it does not matter whether a C/C++ expression ends up in a register, in memory or as an immediate number, or when the programmer wants a flexible template so that GCC can generate different access methods for different C/C++ expressions, the general constraint g can be used. For example:
#define JUST_MOV(foo) __asm__ ("movl %0, %%eax" : : "g"(foo))
JUST_MOV(100) and JUST_MOV(var) cause the compiler to produce different code.
int main(int __argc, char* __argv[])
{
    JUST_MOV(100);
    return 0;
}
The code generated after compilation is:
main:
    pushl %ebp
    movl %esp, %ebp
#APP
    movl $100, %eax
#NO_APP
    movl $0, %eax
    popl %ebp
    ret
Obviously this uses the immediate form. Now the next example:
int main(int __argc, char* __argv[])
{
    JUST_MOV(__argc);
    return 0;
}
The code generated after compilation is:
main:
    pushl %ebp
    movl %esp, %ebp
#APP
    movl 8(%ebp), %eax
#NO_APP
    movl $0, %eax
    popl %ebp
    ret
This example is using memory.
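The same idea can be written so that the result is observable from C. This is a sketch only: the macro LOAD_INT and the two functions are invented names, the destination is an output operand instead of a hard-coded %eax, and the plain C definition is a fallback assumed for non-x86 targets.

```c
#if defined(__i386__) || defined(__x86_64__)
/* "g" lets GCC pick an immediate, a register or a memory operand
   for src, depending on what src is. */
#define LOAD_INT(dst, src) __asm__("movl %1, %0" : "=r"(dst) : "g"(src))
#else
#define LOAD_INT(dst, src) ((dst) = (src))   /* fallback off x86 */
#endif

int load_constant(void)
{
    int out;
    LOAD_INT(out, 100);     /* constant: GCC can use the immediate form */
    return out;
}

int load_variable(int value)
{
    int out;
    LOAD_INT(out, value);   /* variable: GCC may use a register or memory */
    return out;
}
```

Comparing the gcc -S output of the two functions shows the different operand forms GCC chose for %1 from one and the same template.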
The operation expressions of an inline assembly with C/C++ expressions are numbered in the order in which they are listed: the first is 0, the second is 1, and so on. GCC as described here allows at most 10 operation expressions (later GCC releases raised this limit). For example:
__asm__ ("popl %0 \n\t"
         "movl %1, %%esi \n\t"
         "movl %2, %%edi \n\t"
         : "=a"(__out)
         : "r" (__in1), "r" (__in2));
In this example, the Output operation expression where __out is located is numbered 0, "r" (__in1) is numbered 1, and "r" (__in2) is numbered 2.
Another example:
__asm__ ("movl %%eax, %%ebx" : : "a"(__in1), "b"(__in2));
In this example, "a"(__in1) is numbered 0 and "b"(__in2) is numbered 1.
If an Input operation expression uses one of the digits 0 to 9 (say 1) as its operation constraint, it is equivalent to declaring to GCC: "I want to use the same register as the Output operation expression numbered 1 (if Output operation expression 1 uses a register), or the same memory address (if Output operation expression 1 uses memory)". This description contains two limitations. First, the digits 0 to 9 can only be used in Input operation expressions. Second, the operation expression referred to (for example, number 1 when an Input operation expression uses the digit 1 as its constraint) can only be an Output operation expression.
Since GCC stipulates at most 10 Input/Output operation expressions, the digit 9 can in fact never be used as an operation constraint, because Output operation expressions are numbered before Input operation expressions. If an Input operation expression specified the digit 9 as its constraint, there would already be at least 10 Output operation expressions, and adding this Input operation expression would make at least 11, which exceeds the GCC limit.
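A digit constraint is the usual way to express two-operand x86 instructions, where the destination is also a source. The sketch below assumes nothing beyond what the text describes; the function name add_matching is invented, and the plain C branch is a fallback for non-x86 targets.

```c
/* Returns a + b using a two-operand addl. */
int add_matching(int a, int b)
{
    int sum;
#if defined(__i386__) || defined(__x86_64__)
    __asm__("addl %2, %0"
            : "=r"(sum)     /* operand 0: the Output */
            : "0"(a),       /* operand 1: must share operand 0's register */
              "r"(b)        /* operand 2: any general register */
            : "cc");
#else
    sum = a + b;
#endif
    return sum;
}
```

Before the Instruction List, GCC loads a into the register it chose for sum, so addl finds a already in %0 and only has to add b to it.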
5. Modifier Characters
The equal sign (=) and the plus sign (+) are used to modify Output operation expressions. An Output operation expression must be modified by exactly one of the two. The equal sign (=) indicates that the Output operation expression is Write-Only, and the plus sign (+) indicates that it is Read-Write. The modifier must be the first character of the constraint string. For example, "a="(foo) is illegal, while "+g"(foo) is legal.
When the plus sign (+) is used, the Output expression is equivalent to one with the equal sign (=) constraint plus an Input expression. For example,
__asm__ ("movl %0, %%eax; addl %%eax, %0" : "+b"(foo))
is equivalent to
__asm__ ("movl %1, %%eax; addl %%eax, %0" : "=b"(foo) : "b"(foo))
However, if the latter form is used, the aliases in the "Instruction List" must be changed accordingly. We will discuss aliases later.
Like the equal sign (=) and plus sign (+) modifiers, the symbol (&) can only be used to modify Output operation expressions. Using it is equivalent to declaring to GCC: "GCC shall not assign the register of this Output operation expression to any Input operation expression." The reason is that the & modifier means the Output operation expression is written before all Input operation expressions have been consumed. Let's look at the following example:
int main(int __argc, char* __argv[])
{
    int __in1 = 8, __in2 = 4, __out = 3;
    __asm__ ("popl %0 \n\t"
             "movl %1, %%esi \n\t"
             "movl %2, %%edi \n\t"
             : "=a"(__out)
             : "r" (__in1), "r" (__in2));
    return 0;
}
In this example, %0 corresponds to the Output operation expression, whose register is specified as %eax. The first instruction of the Instruction List is popl %0, which becomes popl %eax after compilation, so at this point the content of %eax has been modified. After the Instruction List, GCC stores the content of %eax into the Output variable __out with the instruction movl %eax, address_of_out. The two Input operation expressions in this example are constrained with "r", which asks GCC to pick suitable registers for them and load the contents of __in1 and __in2 into the selected registers before the Instruction List. Suppose one of them is given %eax, the register already specified for __out. If it is __in1, GCC inserts the instruction movl address_of_in1, %eax before the Instruction List, and then popl %eax modifies %eax. What %eax holds is no longer the value of the Input variable __in1, so the subsequent instruction movl %1, %%esi does not put the value of __in1 into %esi as we intended; it puts the value of __out into %esi instead.
The following is the compilation result of this example. GCC has clearly selected %eax for __in2, the same register as __out, which is not what we intended.
main:
    pushl %ebp
    movl %esp, %ebp
    subl $12, %esp
    movl $8, -4(%ebp)
    movl $4, -8(%ebp)
    movl $3, -12(%ebp)
    movl -4(%ebp), %edx     # __in1 uses register %edx
    movl -8(%ebp), %eax     # __in2 uses register %eax
#APP
    popl %eax
    movl %edx, %esi
    movl %eax, %edi
#NO_APP
    movl %eax, %eax
    movl %eax, -12(%ebp)    # __out uses register %eax
    movl $0, %eax
    leave
    ret
To avoid this, we must tell GCC to choose other registers for all Input operation expressions. The way to do this is to add the & modifier to the operation constraint of the Output operation expression "=a" (__out). Since GCC requires the equal sign (=) to come first, we write "=&a" (__out).
Here is the compilation result after adding the & modifier:
main:
    pushl %ebp
    movl %esp, %ebp
    subl $12, %esp
    movl $8, -4(%ebp)
    movl $4, -8(%ebp)
    movl $3, -12(%ebp)
    movl -4(%ebp), %edx     # __in1 uses register %edx
    movl -8(%ebp), %eax
    movl %eax, %ecx         # __in2 uses register %ecx
#APP
    popl %eax
    movl %edx, %esi
    movl %ecx, %edi
#NO_APP
    movl %eax, %eax
    movl %eax, -12(%ebp)    # __out uses register %eax
    movl $0, %eax
    leave
    ret
OK! Now it's OK, it's completely consistent with our intentions.
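The same idea can be tried without touching the stack. The following is only a sketch, not code from the original text: the function name sum_twice and the portable fallback branch are assumptions made for illustration, and the assembly path assumes GCC or a compatible compiler on x86. The output %0 is written by the first instruction while the input %2 is still needed afterwards, which is exactly the situation the & modifier is meant for.

```c
/* Sketch of an early-clobber output ("=&r"); all names are illustrative. */
int sum_twice(int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int out;
    __asm__ ("movl %1, %0\n\t"   /* out = a: %0 is written here ...        */
             "addl %2, %0\n\t"   /* ... while input %2 is still needed     */
             "addl %2, %0"       /* out = a + b + b                        */
             : "=&r" (out)       /* & keeps %0 away from the input regs    */
             : "r" (a), "r" (b)
             : "cc");            /* addl changes the condition flags       */
    return out;
#else
    return a + b + b;            /* plain C fallback on other targets      */
#endif
}
```

Without the &, GCC would be free to place out in the same register as b, and the first movl would then destroy b before it is added.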
If the constraint of an Output operation expression names one specific register, modifying it with & is only meaningful when at least one Input operation expression has an optional constraint (one that lets GCC pick among several registers, or use a non-register location), such as "r" or "g". If every Input operation expression names a fixed register or uses a memory/immediate constraint, modifying the Output operation expression with & has no effect. For example:
__asm__ ("popl %0 \n\t"
         "movl %1, %%esi \n\t"
         "movl %2, %%edi \n\t"
         : "=&a"(__out)
         : "m" (__in1), "c" (__in2));
The Output operation expression in this example does not need the & modifier, because __in1 uses the memory constraint and __in2 is assigned the fixed register %ecx, so GCC has no choice to make.
However, if an Output operation expression is modified with & and names a fixed register, you can no longer assign that register to any Input operation expression; otherwise a compilation error occurs. For example:
__asm__ ("popl %0 \n\t"
         "movl %1, %%esi \n\t"
         "movl %2, %%edi \n\t"
         : "=&a"(__out)
         : "a" (__in1), "c" (__in2));
In this example, since __out names register %eax and is modified with &, it is illegal to assign %eax to __in1 as well.
Conversely, you can give the Output operation expression an optional constraint, such as "r" or "g", so that GCC chooses which register to use or uses memory. When GCC makes its choice, it first excludes all registers already used by Input operation expressions and then selects among the remaining registers, or simply uses memory. For example:
__asm__ ("popl %0 \n\t"
         "movl %1, %%esi \n\t"
         "movl %2, %%edi \n\t"
         : "=&r"(__out)
         : "a" (__in1), "c" (__in2));
In this example, __out has the constraint "r", which leaves the choice of register to GCC, and the registers %eax and %ecx are already used by __in1 and __in2. When GCC selects a register for __out, it therefore picks only from the registers that remain, such as %ebx and %edx.
The first three modifiers can only be used in Output operation expressions. The percent sign (%) modifier is the opposite: it can only be used in Input operation expressions, and it declares to GCC: "The C/C++ expression in this Input operation expression can be interchanged with the C/C++ expression in the next Input operation expression." This modifier is generally used for commutative operations, such as addition (+), multiplication (*), bitwise AND (&), bitwise OR (|), and so on. Let's look at an example:
int main(int __argc, char* __argv[])
{
    int __in1 = 8, __in2 = 4, __out = 3;

    __asm__ ("addl %1, %0\n\t"
             : "=r"(__out)
             : "%r" (__in1), "0" (__in2));

    return 0;
}
In this example, since the instruction is an addition operation, it is equivalent to the equation __out = __in1 + __in2, and it is no different from the equation __out = __in2 + __in1. So use the percent sign modification to let GCC know that __in1 and __in2 are interchangeable, which means that GCC can automatically change the inline assembly of this example to:
__asm__ ("addl %1, %0\n\t" : "=r"(__out) : "%r" (__in2), "0" (__in1));
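To make the effect observable, the addition can be wrapped in a function that returns the result. This is a minimal sketch under assumptions: the name add_commutative and the fallback branch are invented for illustration, and the assembly path assumes GCC or a compatible compiler on x86. Whichever of the two inputs GCC ties to the output register, the sum is the same.

```c
/* Sketch of the % (commutative) modifier; names are illustrative only. */
int add_commutative(int x, int y)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int out;
    __asm__ ("addl %1, %0"
             : "=r" (out)
             : "%r" (x),   /* may be swapped with the next input         */
               "0" (y)     /* starts in the same register as operand 0   */
             : "cc");      /* addl changes the condition flags           */
    return out;
#else
    return x + y;          /* plain C fallback on other targets          */
#endif
}
```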
Modifier   Input/Output   Meaning
=          O              This Output operation expression is Write-Only
+          O              This Output operation expression is Read-Write
&          O              This Output operation expression has exclusive use of the register specified for it
%          I              The C/C++ expression in this Input operation expression can be interchanged with the C/C++ expression in the next Input operation expression
4. Placeholder
What is a placeholder? Let's take a look at the following example:
__asm__ ("addl %1, %0\n\t" : "=a"(__out) : "m" (__in1), "a" (__in2));
%0 and %1 in this example are placeholders. Each placeholder corresponds to an Input/Output operation expression. As we mentioned earlier, GCC stipulates that an inline assembly statement can have up to 10 Input/Output operation expressions, which are numbered 0 to 9 in the order in which they are listed (this is the limit in older GCC versions; current releases allow up to 30). The number in a placeholder is the number of the operation expression it stands for.
Since a percent sign (%) is used before the placeholder, in order to distinguish between placeholders and registers, GCC stipulates that in an inline assembly with C/C++ expressions, two percent signs (%%) must be used before the registers written directly in the "Instruction List".
When GCC compiles the statement, each placeholder is replaced with the register, memory address, or immediate value specified by the corresponding Input/Output operation expression. In the above example, placeholder %0 corresponds to the Output operation expression "=a"(__out), whose register is %eax, so %0 is replaced with %eax. Placeholder %1 corresponds to the Input operation expression "m"(__in1), which is a memory operand, so %1 is replaced with the memory address of the variable __in1.
Some people may think that in the above example, it is possible to write %%eax directly without using %0, just like this:
__asm__ ("addl %1, %%eax\n\t" : "=a"(__out) : "m" (__in1), "a" (__in2));
This behaves no differently from the version above that uses placeholder %0, so it may seem that the placeholder serves no purpose. Indeed, the code generated by both is exactly the same, but that does not mean the placeholder is pointless here. Without placeholders, if you one day change the register constraint of the variable __out from a to b, you must also change %%eax in the addl instruction to %%ebx, so two places need to be modified together; with placeholders, you only modify one. Omitting placeholders also hurts the clarity of the code. With a placeholder, you can tell at a glance that the second operand of the addl instruction will eventually be output to the variable __out; if you write the second operand as %%eax instead, you have to think it through to realize that. This is the most superficial benefit of placeholders, since in this case you could do without them entirely.
But in the following situations, placeholders are indispensable:
First, look at the first Input operation expression "m" (__in1) in the above example. After GCC performs the replacement, the instruction reads addl address_of_in1, %eax. What is the address of __in1? It is only known at compile time, so we cannot write the address of __in1 directly in the instruction. Using a placeholder and letting GCC substitute it during compilation solves this problem, so in this case we have to use placeholders.
Secondly, if the Output operation expression "=a"(__out) in the above example is changed to "=r"(__out), the register is chosen by GCC at compile time. Since we do not know which register will be selected when we write the code, we cannot write a register name directly in the instruction and can only rely on placeholder substitution.
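Both situations can be combined in one small function. The sketch below is an illustration under assumptions, not code from the original text: the name add_from_memory and the fallback branch are invented, and the assembly path assumes GCC or a compatible compiler on x86. Neither the address that replaces %1 nor the register that replaces %0 is known when the code is written, so both must be placeholders.

```c
/* Sketch: placeholders for a memory operand and a GCC-chosen register. */
int add_from_memory(int in1, int in2)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int out;
    __asm__ ("addl %1, %0"   /* %1 -> address of in1, %0 -> chosen register */
             : "=r" (out)    /* GCC picks the register at compile time      */
             : "m" (in1),    /* in1 is read directly from memory            */
               "0" (in2)     /* in2 starts in the same register as %0       */
             : "cc");        /* addl changes the condition flags            */
    return out;
#else
    return in1 + in2;        /* plain C fallback on other targets           */
#endif
}
```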
5. Clobber/Modify
Sometimes you want to tell GCC that the current inline assembly statement may modify certain registers or memory, so that GCC can take this into account at compile time. You can declare these registers or memory in the Clobber/Modify field.
This situation generally arises when a register appears in the "Instruction List" but is neither named by an Input/Output operation expression nor chosen by GCC for an Input/Output operation expression with an "r" or "g" constraint, and the register is modified by instructions in the "Instruction List" as a temporary for the current inline assembly. For example:
__asm__ ("movl %0, %%ebx" : : "a"(__foo) : "bx");
Register %ebx appears in the "Instruction List" and is modified by the movl instruction, but is not specified by any Input/Output operation expression, so you need to specify "bx" in the Clobber/Modify field to let GCC know this.
Because you use the "r" and "g" constraints for some Input/Output operation expressions, and let GCC choose a register for you, GCC is very clear about these registers - it knows that these registers are modified and you don't need to declare them in the Clobber/Modify field at all. But beyond that, GCC has no idea what are the remaining registers that will be modified by the current inline assembly. So if you really modify them in the current inline assembly instructions, it is best to declare them in Clobber/Modify and let GCC deal with these registers accordingly. Otherwise, it may cause inconsistency in registers, resulting in program execution errors.
Specifying these registers in the Clobber/Modify field is simple: enclose each register's name in double quotes (" "). If several registers need to be declared, separate the declarations with commas. For example:
__asm__ ("movl %0, %%ebx; popl %%ecx" : : "a"(__foo) : "bx", "cx" );
These strings include:
Declared string        Register represented
"al", "ax", "eax"      %eax
"bl", "bx", "ebx"      %ebx
"cl", "cx", "ecx"      %ecx
"dl", "dx", "edx"      %edx
"si", "esi"            %esi
"di", "edi"            %edi
As can be seen from the above table, you only need to use "ax", "bx", "cx", "dx", "si", "di", because the others are equivalent to one of them.
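Here is a small sketch of a register used purely as a temporary and therefore declared in the Clobber/Modify field. The function name copy_via_ecx and the fallback branch are assumptions made for illustration, and the assembly path assumes GCC or a compatible compiler on x86. Because "cx" is declared, GCC will not place either operand in %ecx.

```c
/* Sketch: a scratch register named in the clobber list; names are illustrative. */
int copy_via_ecx(int x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int out;
    __asm__ ("movl %1, %%ecx\n\t"   /* %ecx is written directly ...          */
             "movl %%ecx, %0"       /* ... and only used as a temporary      */
             : "=r" (out)
             : "r" (x)
             : "cx");               /* tell GCC that %ecx is modified        */
    return out;
#else
    return x;                       /* plain C fallback on other targets     */
#endif
}
```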
If you declare a register in the Clobber/Modify field of an inline assembly statement and GCC finds that the content of that register is still needed after the statement, GCC first saves the content of the register and then restores it after the code generated for the inline assembly statement. Let's look at two examples and compare them.
In this example, %ebx is declared as modified:
$ cat
int main(int __argc, char* __argv[])
{
    int in = 8;
    __asm__ ("addl %0, %%ebx"
             : /* no output */
             : "a" (in)
             : "bx");
    return 0;
}
$ gcc -O -S
$ cat
main:
    pushl %ebp
    movl %esp, %ebp
    pushl %ebx              # %ebx content is saved
    movl $8, %eax
#APP
    addl %eax, %ebx
#NO_APP
    movl $0, %eax
    movl (%esp), %ebx       # %ebx content is restored
    leave
    ret
The C source code of the following example is the same as the previous one, except that %ebx is not declared in the Clobber/Modify field.
$ cat
int main(int __argc, char* __argv[])
{
    int in = 8;
    __asm__ ("addl %0, %%ebx"
             : /* no output */
             : "a" (in) );
    return 0;
}
$ gcc -O -S
$ cat
main:
    pushl %ebp
    movl %esp, %ebp
    movl $8, %eax
#APP
    addl %eax, %ebx
#NO_APP
    movl $0, %eax
    popl %ebp
    ret
If you compare the two assembly listings carefully, you will understand the meaning of declaring a register in the Clobber/Modify field.
Also, it should be noted that if you declare a register in the Clobber/Modify field, then this register will no longer be used as the register constraint of the Input/Output operation expression of the current inline assembly statement. If the register constraint of the Input/Output operation expression is specified as "r" or "g", GCC will not select the register that has been declared in Clobber/Modify. for example:
__asm__ ("movl %0, %%ebx" : : "a"(__foo) : "ax", "bx");
In this example, since the register constraint of the Input operation expression "a" (__foo) already names the %eax register, specifying "ax" in the Clobber/Modify field is illegal, and GCC reports a compilation error.
Besides register contents, memory contents can also be modified. If the instructions in the "Instruction List" of an inline assembly statement change memory, or memory contents may change at the point where the inline assembly occurs, and the affected memory is not described by an "m" constraint in an Output operation expression, you need to put the string "memory" in the Clobber/Modify field to declare to GCC: "Memory has changed, or may have changed, here." For example:
void * memset(void * s, char c, size_t count)
{
    __asm__("cld\n\t"
            "rep\n\t"
            "stosb"
            : /* no output */
            : "a" (c), "D" (s), "c" (count)
            : "cx", "di", "memory");
    return s;
}
This example implements the standard library function memset. The stosb in its inline assembly changes memory, and the address s of the modified memory is loaded into %edi. No Output operation expression uses the "m" constraint to indicate that the content at address s has changed, so "memory" is listed in the Clobber/Modify field to tell GCC that memory content has changed.
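Note that the listing above names %ecx and %edi both as Input registers and in the Clobber/Modify field, which conflicts with the rule described earlier and is rejected by current GCC releases. One possible way to express the same routine today is sketched below. It is an assumption-laden illustration rather than the original author's code: the name fill_bytes, the local copies dst and n, and the fallback branch are all invented here. The registers that rep stosb modifies are declared as Read-Write outputs with "+", and "memory" is kept for the bytes being written.

```c
#include <stddef.h>

/* Sketch: memset-style fill using rep stosb; names are illustrative only. */
void *fill_bytes(void *s, char c, size_t count)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    void *dst = s;       /* copy that the asm is allowed to advance      */
    size_t n = count;    /* copy that the asm counts down to zero        */
    __asm__ __volatile__ ("cld\n\t"
                          "rep stosb"
                          : "+D" (dst), "+c" (n)   /* modified by stosb   */
                          : "a" (c)                /* byte value in %al   */
                          : "memory", "cc");       /* stores to *s        */
#else
    unsigned char *p = s;                          /* plain C fallback    */
    while (count--)
        *p++ = (unsigned char)c;
#endif
    return s;
}
```

Returning the untouched parameter s, instead of the advanced copy, keeps the usual memset contract.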
If the Clobber/Modify field of an inline assembly statement contains "memory", GCC guarantees the following: if the content of some memory location was loaded into a register before the inline assembly and is needed again after it, GCC reads it from memory again instead of using the copy held in the register, because by then the register copy may well be inconsistent with the content in memory.
This is only one of the things GCC guarantees when "memory" is used, not the whole story. Declaring "memory" tells GCC that memory has changed, and the consequences of that go further. Consider the example we mentioned earlier:
int main(int __argc, char* __argv[])
{
    int* __p = (int*)__argc;

    (*__p) = 9999;

    __asm__("":::"memory");

    if((*__p) == 9999)
        return 5;

    return (*__p);
}
In this example, without the inline assembly statement the condition of the if statement is completely redundant. GCC realizes this when optimizing and generates only the assembly code for return 5; it generates no code for the if statement and no code for return (*__p). Once you add the inline assembly statement, which does nothing except declare that memory has changed, GCC can no longer assume that (*__p) must equal 9999 and skip the test. It has to generate the assembly code for the if statement faithfully, along with the code for both return statements.
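The same barrier can be tried with a valid pointer. This is only a sketch: the function name store_then_reload is an assumption made for illustration, and the code assumes a compiler that understands GNU C inline assembly. The empty Instruction List emits no instructions, so the value returned is unchanged; what changes is that the compiler must reload *p after the statement instead of reusing the constant it just stored.

```c
/* Sketch: an empty asm with a "memory" clobber acts as a compiler barrier. */
int store_then_reload(int *p)
{
    *p = 9999;                              /* store before the barrier    */
    __asm__ __volatile__ ("" : : : "memory");
    return *p;                              /* read from memory again      */
}
```

Comparing the output of gcc -O -S with and without the __asm__ line shows the extra load.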
When the instructions in an inline assembly statement affect the condition flags of the eflags register (that is, the flag bits tested by Jxx and other conditional jump instructions, such as the carry flag, the zero flag, etc.), "cc" should be declared in the Clobber/Modify field. Such instructions include adc, div, popfl, btr, bts, etc. In addition, when the assembly contains a call instruction, you do not know whether the called function modifies the condition flags, so it is best to declare "cc" to be safe.
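As an illustration, here is a sketch that uses bts, one of the instructions listed above, and declares "cc" because bts writes the carry flag. The function name set_bit and the fallback branch are assumptions made for this example, and the assembly path assumes GCC or a compatible compiler on x86.

```c
/* Sketch: declaring "cc" for an instruction that changes the flags. */
unsigned set_bit(unsigned value, unsigned bit)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("btsl %1, %0"      /* set bit number %1 in %0, old bit -> CF  */
             : "+r" (value)     /* read-write operand                      */
             : "r" (bit)
             : "cc");           /* the carry flag is modified              */
    return value;
#else
    return value | (1u << (bit & 31u));   /* plain C fallback              */
#endif
}
```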
I have rarely seen the exact usage of "cc" described in related materials. Only one document mentions it, and not in the context of the i386 platform: it merely says that "cc" is processor-dependent, that not all platforms support it, and that using it on a platform that does not support it causes no compilation error. I did some experiments and found no difference between the code generated with and without "cc". It is, however, used in the relevant code of Linux 2.4. If anyone knows the details of "cc" on the i386 platform, please contact me.
In addition, you can also specify the numbers 0 to 9 in the Clobber/Modify field to declare that the register used by the nth Input/Output operation expression has changed. As we mentioned earlier, though, if you name a register for an Input/Output operation expression, or use "g", "r" or similar constraints to let GCC choose one, GCC already knows which register content has changed, so doing this makes little sense. My own experiments showed that it has no effect on the assembly code generated by GCC, at least on the i386 platform. It is not used in any of the i386-specific inline assembly code in Linux 2.4, but it is used in the S390-specific code. Since I know nothing about S390 assembly, I do not know what purpose it serves there.