SoFunction
Updated on 2025-03-11

Arrays vs. pointers in C: an assembly code analysis example

Today, while reading "Programmer Interview Book", I came across a discussion of the access efficiency of arrays versus pointers. Having some free time, I wrote a small piece of code to take a quick look at the assembly behind the C. Many people only pay attention to the C source, but in practice a problem can sometimes be solved only by analyzing the generated assembly. This article is aimed at beginners; experts can skip right past it~

The C source code is as follows:

#include ""
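int main(int argc, char* argv[])
{
    char a=1;
    char c[] = "1234567890";   /* array: all 11 bytes live in main's stack frame */
    char *p = "1234567890";    /* pointer: only the address of the literal is on the stack */
    a = c[1];                  /* element access through the array */
    a = p[1];                  /* element access through the pointer */
    return 0;
}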

Steps to view the assembly code in VC6.0:
Set a breakpoint on any line in the main function (F9) -> compile -> start debugging (F5) -> right-click in the debug window -> Go To Disassembly

Debug assembly code (commented):

4:    #include ""
5:
6:    int main(int argc, char* argv[])
7:    {
00401010   push        ebp
00401011   mov         ebp,esp
00401013   sub         esp,54h                  ; Grow the stack: reserve 0x54 bytes for local variables
00401016   push        ebx
00401017   push        esi
00401018   push        edi
00401019   lea         edi,[ebp-54h]
0040101C   mov         ecx,15h
00401021   mov         eax,0CCCCCCCCh
00401026   rep stos    dword ptr [edi]          ; Fill the 0x54 bytes (0x15 dwords) between the reserved stack top and the frame base with 0xCC. 0xCC is the opcode of int 3; in debug mode all stack variables are initialized to 0xCC to help catch uninitialized use
8:        char a=1;
00401028   mov         byte ptr [ebp-4],1       ; ebp-4 is the address allocated for variable a
9:        char c[] = "1234567890";
0040102C   mov         eax,[string "1234567890" (0042201c)]
00401031   mov         dword ptr [ebp-10h],eax  ; First, the 4 bytes "1234" are copied into array c
00401034   mov         ecx,dword ptr [string "1234567890"+4 (00422020)]
0040103A   mov         dword ptr [ebp-0Ch],ecx
0040103D   mov         dx,word ptr [string "1234567890"+8 (00422024)]
00401044   mov         word ptr [ebp-8],dx      ; Same as above: copy the 2 bytes "90" into c
00401048   mov         al,[string "1234567890"+0Ah (00422026)]
0040104D   mov         byte ptr [ebp-6],al      ; Everyone knows this one: don't forget the terminating \0
10:       char *p = "1234567890";
00401050   mov         dword ptr [ebp-14h],offset string "1234567890"  ; Store the address of the string literal in p (at ebp-14h)
11:       a = c[1];
00401057   mov         cl,byte ptr [ebp-0Fh]    ; The key point: array c is stored contiguously on the stack, so the address of any element is a fixed offset from ebp; the value is loaded straight into cl
0040105A   mov         byte ptr [ebp-4],cl      ; Complete the assignment
12:       a = p[1];
0040105D   mov         edx,dword ptr [ebp-14h]  ; Different from the array case: ebp only locates the pointer variable p, so the value of p (an address) has to be loaded first
00401060   mov         al,byte ptr [edx+1]      ; Then the character is fetched indirectly through that address
00401063   mov         byte ptr [ebp-4],al
13:       return 0;
00401066   xor         eax,eax                  ; Clear eax, which holds the return value of main
14:   }
00401068   pop         edi
00401069   pop         esi
0040106A   pop         ebx
0040106B   mov         esp,ebp
0040106D   pop         ebp
0040106E   ret
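To make the initialization of c in the listing easier to follow, here is a small C sketch of the same copy sequence. The function name init_like_listing and the buffer it fills are my own, hypothetical names, not part of the original program; the only point is that the 11 bytes of the literal are copied onto the stack in pieces of 4, 4, 2 and 1 bytes, whereas p just receives an address.

```c
#include <string.h>

/* Hypothetical helper: fills dst (at least 11 bytes) in the same
   4 + 4 + 2 + 1 byte pieces that the listing uses to initialize c. */
void init_like_listing(char *dst)
{
    static const char literal[] = "1234567890"; /* 10 characters + '\0' = 11 bytes */

    memcpy(dst,      literal,      4);  /* "1234": instructions at 0040102C and 00401031 */
    memcpy(dst + 4,  literal + 4,  4);  /* "5678": instructions at 00401034 and 0040103A */
    memcpy(dst + 8,  literal + 8,  2);  /* "90":   instructions at 0040103D and 00401044 */
    memcpy(dst + 10, literal + 10, 1);  /* '\0':   instructions at 00401048 and 0040104D */
}
```

Calling init_like_listing on a char buffer of 11 bytes leaves the same contents in it that c has after line 9 of the source.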

OK, as you can see, accessing an element through the array takes only 2 steps, while going through the pointer takes 3, because the value of the pointer has to be loaded from the stack before the character can be read. So arrays and pointers are not the same thing. People often treat the name of an array as a pointer; that view works in some situations but goes wrong in others. Let me give another simple example, one that you may well run into during development.
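Before that example, a quick aside: the claim that an array is not a pointer can also be checked in plain C, without reading any assembly. The sketch below is only an illustration under my own assumptions; all four function names are made up. It relies on two facts: sizeof applied to an array gives the size of the whole array, and an array has no separate storage holding its address, while a pointer variable does.

```c
#include <stddef.h>

/* Size of the array object: 10 characters plus the terminating '\0'. */
size_t array_object_size(void)
{
    char c[] = "1234567890";
    return sizeof c;                 /* 11: the whole array */
}

/* Size of the pointer object, no matter how long the string is. */
size_t pointer_object_size(void)
{
    const char *p = "1234567890";
    return sizeof p;                 /* size of a pointer, e.g. 4 on 32-bit x86 */
}

/* Returns 1: the array and its first element start at the same address. */
int array_name_is_its_storage(void)
{
    char c[] = "1234567890";
    return (void *)c == (void *)&c;
}

/* Returns 1: the pointer variable lives somewhere else than the string. */
int pointer_has_separate_storage(void)
{
    const char *p = "1234567890";
    return (const void *)p != (const void *)&p;
}
```

That separate storage is exactly the extra load from ebp-14h seen in the listing.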

In the file that contains main:

#include <stdio.h>
extern char chTest[10];
int main(int argc, char* argv[])
{
       printf("chTest=%s\n", chTest);
       return 0;
}

The extern declaration above says that the array chTest is defined in another source file. That file defines chTest as follows:

char chTest[10]="123456789";

The program above compiles and runs successfully. But suppose you change the extern declaration to the following:

extern char *chTest;

Now the program no longer builds. The error message is "redefinition; different types of indirection", but it does not say which line the error is on. In a large project, it is not easy to track down where the problem lies. The reason for the error should be clear by now: when chTest is referenced as a pointer, its elements are accessed differently from those of an array. Even if the program did get through compilation, it would go wrong at run time, because the code would load the first bytes of the array as if they were an address and then read through that address.
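To see why it would go wrong at run time, consider what a 32-bit x86 program would load if it treated chTest as a pointer. The sketch below is my own illustration, not code from the original project: the helper name first_bytes_as_address is hypothetical, and it assumes ASCII characters and the little-endian byte order of x86. It assembles the first four bytes of the array into the 32-bit value that a pointer load would produce.

```c
#include <stdint.h>

char chTest[10] = "123456789";

/* Hypothetical helper: the 32-bit value a little-endian machine would load
   if the first four bytes at `bytes` were read as a pointer. */
uint32_t first_bytes_as_address(const char *bytes)
{
    const unsigned char *b = (const unsigned char *)bytes;

    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

For chTest the helper returns 0x34333231, the ASCII codes of "1234" read as a number. That is not the address of any string, so printf with %s would most likely crash. A common safeguard is to keep the single declaration extern char chTest[10]; in a header that both source files include, so the compiler can check the definition against it.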

OK, the above is just my personal take, a few simple and scattered notes; I hope you will take them with a smile. If anything here is wrong, corrections are welcome!