Where is memory allocated for pointers and their data?

My question is: if I have some function

void func1(){
    char * s = "hello";
    char * c;
    int b;
    c = (char *) malloc(15);
    strcpy(c,s);
}

I think the pointer s is allocated on the stack, but where is the data "hello" stored? Does that go in the data segment of the program? As for c and b, they are uninitialized. Since c is supposed to hold some memory address and it doesn't have one yet, how does that work? And b also has no contents, so can it not be stored on the stack? Then, when we allocate memory for c on the heap with malloc, c has some memory address. How is this uninitialized variable c given the address of the first byte of that string on the heap?


Let's divide this answer into two points of view on the same subject, because the standards only complicate understanding of this topic, but they're standards anyway :).

Subject common to both parts

void func1() {
    char *s = "hello";
    char *c;
    int b;

    c = (char*)malloc(15);
    strcpy(c, s);
}

Part I: From a standardese point of view

According to the standards, there's this useful concept known as automatic storage duration: a variable's space is reserved automatically upon entering a given scope (with uninitialized values, a.k.a. garbage!), it may or may not be set or accessed within that scope, and the space is released for future use when the scope is left. Note: In C++, this also involves construction and destruction of objects.

So, in your example, you have three automatic variables:

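  • char *s, which gets initialized to whatever the address of "hello" happens to be. The string literal itself is not an automatic variable: it has static storage duration, so it exists for the whole run of the program.
  • char *c, which holds garbage until it's initialized by a later assignment.
  • int b, which holds garbage for its whole lifetime.

BTW, the standards don't specify how this storage is actually implemented (for example, with a stack).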
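As a rough illustration of these storage durations, here is an annotated variant of func1. It is only a sketch: the name func1_annotated, the int return value, the NULL check, the initialization of b and the call to free() are additions that are not part of the question's code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of func1 that returns the length of the copied string. */
int func1_annotated(void)
{
    const char *s = "hello"; /* s: automatic; the literal: static storage duration */
    char *c;                 /* automatic, indeterminate until assigned below */
    int b = 0;               /* automatic; initialized here so reading it is well defined */

    c = malloc(15);          /* the 15 bytes have allocated storage duration */
    if (c == NULL)
        return 0;            /* allocation failed; nothing to copy into */

    strcpy(c, s);            /* copies 6 bytes: "hello" plus the terminating '\0' */
    b = (int)strlen(c);

    free(c);                 /* allocated storage lives until it is freed */
    return b;
}
```

Calling func1_annotated() returns 5, the length of "hello". When the function returns, s, c and b cease to exist, while the string literal remains valid for the rest of the program.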

    Part II: From a real-world point of view

    On any decent computer architecture you will find a data structure known as the stack. The stack's purpose is to hold space that can be used and recycled by automatic variables, as well as some space for some stuff needed for recursion/function calling, and can serve as a place to hold temporary values (for optimization purposes) if the compiler decides to.

    The stack works in a PUSH/POP fashion, and on most architectures it grows downwards. Let me explain it a little better. Imagine an empty stack like this:

    [Top of the Stack]
    [Bottom of the Stack]
    

    If you, for example, PUSH an int of value 5, you get:

    [Top of the Stack]
    5
    [Bottom of the Stack]
    

    Then, if you PUSH -2:

    [Top of the Stack]
    5
    -2
    [Bottom of the Stack]
    

    And, if you POP, you retrieve -2, and the stack looks as it did before -2 was PUSHed.

    The bottom of the stack is a barrier that can be moved upon PUSHing and POPing. On most architectures, the bottom of the stack is recorded by a processor register known as the stack pointer. Think of it as an unsigned char *. You can decrease it, increase it, do pointer arithmetic on it, et cetera, all with the sole purpose of doing black magic on the stack's contents.

    Reserving (space for) automatic variables on the stack is done by decreasing the stack pointer (remember, the stack grows downwards), and releasing them is done by increasing it. Based on this, the previous theoretical PUSH -2 is shorthand for something like this in pseudo-assembly:

    SUB %SP, $4    # Subtract sizeof(int) from the stack pointer
    MOV $-2, (%SP) # Copy the value `-2` to the address pointed by the stack pointer
    

    POP whereToPop is merely the inverse:

    MOV (%SP), whereToPop # Get the value
    ADD %SP, $4           # Free the space
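The same PUSH/POP mechanics can be imitated in plain C. The following is a minimal sketch that uses an ordinary array as the "stack" and a pointer into it as the "stack pointer"; the names sim_push, sim_pop, sim_depth and STACK_SLOTS are made up for this illustration, and a real hardware stack does no bounds checking like this.

```c
#include <stddef.h>

#define STACK_SLOTS 16  /* arbitrary capacity chosen for the sketch */

static int stack_mem[STACK_SLOTS];
/* The "stack pointer": starts one past the highest slot, i.e. the stack is empty. */
static int *sp = stack_mem + STACK_SLOTS;

/* PUSH: decrease the stack pointer, then store. Returns 0 if the stack is full. */
int sim_push(int value)
{
    if (sp == stack_mem)
        return 0;
    sp -= 1;        /* like SUB %SP, $4: reserve one int-sized slot */
    *sp = value;    /* like MOV value, (%SP) */
    return 1;
}

/* POP: load, then increase the stack pointer. Returns 0 if the stack is empty. */
int sim_pop(int *where_to_pop)
{
    if (sp == stack_mem + STACK_SLOTS)
        return 0;
    *where_to_pop = *sp;  /* like MOV (%SP), whereToPop */
    sp += 1;              /* like ADD %SP, $4: free the slot */
    return 1;
}

/* Number of values currently on the simulated stack. */
size_t sim_depth(void)
{
    return (size_t)((stack_mem + STACK_SLOTS) - sp);
}
```

After sim_push(5) and sim_push(-2), the first sim_pop() yields -2 and the second yields 5. The popped values are not erased from the array; the space is simply considered free again, which is why uninitialized automatic variables contain whatever was left there before.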
    

    Now, compiling func1() may yield the following pseudo-assembly (Note: you are not expected to understand this fully):

    .rodata # Read-only data goes here!
    .STR0 = "hello" # The string literal goes here
    
    .text # Code goes here!
    func1:
        SUB %SP, $12     # Reserve the locals: sizeof(char*) + sizeof(char*) + sizeof(int), assuming 4 bytes each
        LEA .STR0, (%SP) # Copy the address (LEA, load effective address) of `.STR0` (the string literal) into the first 4-byte slot on the stack (a.k.a. `char *s`)
        PUSH $15         # Pass argument to `malloc()` (note: arguments are pushed last to first)
        CALL malloc
        ADD %SP, $4      # The caller cleans up the stack/pops arguments
        MOV %RV, 4(%SP)  # Move the return value of `malloc()` (%RV) to the second 4-byte variable allocated (`4(%SP)`, a.k.a. `char *c`)
        PUSH (%SP)       # Second argument to `strcpy()`: `s`
        PUSH 8(%SP)      # First argument to `strcpy()`: `c` (now at offset 8, because the previous PUSH moved %SP down by 4)
        CALL strcpy
        ADD %SP, $8      # Pop both arguments of `strcpy()`
        ADD %SP, $12     # Release the three local variables
        RET              # Return with no value
    

    I hope this has shed some light on the matter!


    We need to consider what memory location a variable has and what its contents are. Keep this in mind.

    For an int, the variable has a memory address and has a number as its contents.

    For a char pointer, the variable has a memory address and its contents are a pointer to a string; the actual string data is at another memory location.
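The distinction between where a pointer variable lives and what it contains can be checked directly. The sketch below is an illustration only; the function name pointer_has_own_storage and the variable addr_of_c are hypothetical. It shows that c already has an address before malloc is called, and that the assignment simply writes the heap address into that storage.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the observations described in the comments hold, 0 otherwise. */
int pointer_has_own_storage(void)
{
    char *c = NULL;          /* c exists (it has an address) before any malloc */
    char **addr_of_c = &c;   /* where the variable c itself is stored */
    int ok;

    c = malloc(15);          /* the assignment writes the heap address into c's storage */
    if (c == NULL)
        return 0;
    strcpy(c, "hello");      /* the string data goes to the heap, not into c */

    ok = (*addr_of_c == c)                       /* c's storage now holds the heap address */
      && ((void *)addr_of_c != (void *)c)        /* c's own address differs from where it points */
      && (strcmp(*addr_of_c, "hello") == 0);     /* following the address reaches the string */

    free(c);
    return ok;
}
```

So an uninitialized c is not "missing": its storage is there from the start, and it merely holds an unusable value until malloc's return value is stored in it.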

    To understand this, we need to consider two things:

    (1) the memory layout of a program
    (2) the memory layout of a function when it's been called

    Program layout [typical]. Lower memory address to higher memory address:

    code segment -- where instructions go:
      ...
      machine instructions for func1
      ...
    data segment -- where initialized global variables and constants go:
      ...
      int myglobal_initialized = 23;
      ...
      "hello" [often placed in a separate read-only section, e.g. .rodata]
      ...
    bss segment -- for uninitialized globals:
      ...
      int myglobal_tbd;
      ...
    heap segment -- where malloc data is stored (grows upward towards higher memory
    addresses):
      ...
    stack segment -- starts at top memory address and grows downward toward end
    of heap
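One way to get a feel for this layout is to print one address from each region. The following is a sketch under assumptions: print_layout is a hypothetical helper, the addresses and their relative order depend on the platform, compiler and address-space randomization, and converting a function pointer to an integer is implementation-defined. No particular output is implied.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int myglobal_initialized = 23;  /* initialized global: typically the data segment */
int myglobal_tbd;               /* uninitialized global: typically the bss segment */

/* Prints one address per region to `out`.
   Returns the number of lines printed, or -1 if malloc fails. */
int print_layout(FILE *out)
{
    const char *literal = "hello";  /* points into constant data */
    int local = 0;                  /* lives in this function's stack frame */
    int lines = 0;
    char *heap = malloc(15);        /* points into the heap */

    if (heap == NULL)
        return -1;

    /* Function pointer converted to an integer: implementation-defined. */
    fprintf(out, "code    : 0x%" PRIxPTR "\n", (uintptr_t)print_layout); lines++;
    fprintf(out, "data    : %p\n", (void *)&myglobal_initialized);       lines++;
    fprintf(out, "literal : %p\n", (void *)literal);                     lines++;
    fprintf(out, "bss     : %p\n", (void *)&myglobal_tbd);               lines++;
    fprintf(out, "heap    : %p\n", (void *)heap);                        lines++;
    fprintf(out, "stack   : %p\n", (void *)&local);                      lines++;

    free(heap);
    return lines;
}
```

Calling print_layout(stdout) prints six lines. On many desktop systems the stack address is noticeably far from the others, but that is an observation about typical platforms, not a guarantee.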

    Now here's a stack frame for a function. It will be within the stack segment somewhere. Note, this is higher memory address to lower:

    function arguments [if any]:
      arg2
      arg1
      arg0
    function's return address [where it will go when it returns]
    function's stack/local variables:
      char *s
      char *c
      int b
      char buf[20]
    

    Note that I've added a "buf". If we changed func1 to return a string pointer (e.g. "char *func1(arg0,arg1,arg2)") and we added "strcpy(buf,s)" or "strcpy(buf,c)", buf would be usable by func1. func1 could return either c or s, but not buf.

    That's because with "s" the data is stored in the data segment and persists after func1 returns. Likewise, c can be returned because the data is in the heap segment and stays allocated until it is freed.

    But buf would not work (e.g. return buf) because its data is stored in func1's stack frame, and that frame is popped off the stack when func1 returns [meaning it would appear as garbage to the caller]. In other words, data in the stack frame of a given function is available to it and to any function that it may call [and so on ...]. But this stack frame is not available to the caller of that function. That is, the stack frame data only "persists" for the lifetime of the called function.

    Here's the fully adjusted sample program:

    int myglobal_initialized = 23;
    int myglobal_tbd;
    
    char *
    func1(int arg0,int arg1,int arg2)
    {
        char *s = "hello";
        char *c;
        int b;
        char buf[20];
        char *ret;
    
        c = malloc(15);
        strcpy(c,s);
    
        strcpy(buf,s);
    
        // ret can be c, s, but _not_ buf
        ret = ...;
    
        return ret;
    }
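For experimenting, here is a compilable variant of the program above. It is a sketch with assumptions: the name func1_ret and the want_heap parameter are made up to choose between the two valid return values, the unused arguments are dropped, and the NULL check and free() call are additions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical selector: 0 returns s, anything else returns c. */
char *func1_ret(int want_heap)
{
    char *s = "hello";
    char *c;
    char buf[20];

    c = malloc(15);
    if (c == NULL)
        return NULL;
    strcpy(c, s);

    strcpy(buf, s);     /* buf is fine to use inside the function */

    if (want_heap)
        return c;       /* heap data: valid until the caller frees it */

    free(c);            /* not returned, so release it here */
    return s;           /* string literal: valid for the whole run of the program */

    /* "return buf;" would hand the caller a pointer into a dead stack frame. */
}
```

The caller has to free() the result only when want_heap is nonzero; the pointer returned for want_heap == 0 refers to the literal and must be neither freed nor written to.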
    