0. Understand the difference between global (static), stack, heap (free memory) and unallocated memory.
A pointer is just the address of a memory location. However, you must
consider what you are pointing to, and how long the pointed object “lives”.
The objects that live the longest are global objects. They are “alive”
as long as the program runs. A global object is simply any variable defined
outside a function.
Additionally, local variables defined as static are also valid as long as the program runs – they are global variables that are only visible in the block that defines them.
Additionally, local variables defined as static are also valid as long as the program runs – they are global variables that are only visible in the block that defines them.
The stack is a special memory area that holds local variables, function
parameters (which are, in fact, also local variables), and function call
information. The stack is allocated by the operating system for each thread;
there are special processor instructions (and registers) for dealing with the
function call.
We don’t need to care about allocating and releasing memory for the local variables; the compiler does this for us. However, we must remember that local variables live only as long as the block that defines them – so never return a pointer to a local variable.
We don’t need to care about allocating and releasing memory for the local variables; the compiler does this for us. However, we must remember that local variables live only as long as the block that defines them – so never return a pointer to a local variable.
The free store, or heap, is where malloc allocates memory. The
responsibility for allocating and releasing memory from the heap falls to the
programmer.
1. Prefer local variables.
Of course, you don’t always have a choice. You may not know how many
objects you have until runtime. Or the object might be too big for the stack.
But for simple cases, you don’t need the headache.
int * pArray = calloc(3, sizeof(int)); pArray[1] = 3; pArray[2] = 7; pArray[3] = 12; DoSomething(pArray, 3); free(pArray);
In this case, we know the whole array in advance, so we can keep the code simple:
int array[] = { 3, 7, 5 }; DoSomething(array, ARRAYSIZE(array));
2. Never return a pointer to a local variable from a function.
This is obvious: the variable is only valid during the function call. Of
course, you can return the address of a static local variable, since this is
valid for the whole duration of the program.
int *a(void) { int x; static int y = 12; // ... printf("x = %d", &x); // OK, x is valid for the whole duration of the printf call. return &x; // Bad, x ceases to be valid when exiting the function. return &y; // OK, y is static. }
3. Initialize any variable you define.
This piece of advice is valid regardless of the actual data type.
By convention, any pointer that doesn’t point anywhere should be
initialized with NULL.
4.Allocated memory is not initialized. You must initialize it yourself.
Well, you might argue that calloc does initialize the memory. That’s true,
int * p = malloc(sizeof(int)); memset(p, 0, sizeof(int));
is equivalent to:
int * p = calloc(1, sizeof(int));
But there are more complicated cases:
int *p1 = realloc(p, sizeof(int) * 2); p1[0] = 0;
In this case, we want to extend the allocated memory area, from enough for one integer
to two. The first integer will keep its value, but the second will not be
initialized, so we need to do it ourselves.
5. Everything that was allocated must be freed, or you’ll have a memory “leak”.
It’s always fun to watch the memory consumption for this program, using Process
Explorer or a similar tool:
size_t i; for (i = 0; i < SIZE_MAX; ++i) { int * p = malloc(sizeof(int)); if (!p) break; }
Hmm, kind of boring; let’s make it leak faster:
size_t i; for (i = 0; i < SIZE_MAX; ++i) { int * p = malloc(sizeof(int) * 100000); if (!p) break; }
It’s true that the operating system frees all allocated memory after a
program ends its executions. Still, the leaked memory means less memory for the
other programs running. It can also mean that your program will run out of
memory instead of keeping on running for days or years.
6. Freed memory looks just like allocated memory* – always set pointers to NULL after deallocating the memory.
* Just the same as, in the chemistry lab, hot glass looks just like cold glass.
Quick now: what does this program print?
int *p = calloc(1000, sizeof(int)); printf("p = %x\n", p[3]); free(p); printf("p = %x\n", p[3]);
The answer is: it depends. If you use Visual Studio, it will probably show:
p = 0 p = feeefeee
(For FeeFee, see http://www.nobugs.org/developer/win32/debug_crt_heap.html).
But it might also print:
p = 0 p = 0
Why? Because memory keeps its value until overwritten. In general,
deallocation does not overwrite the freed memory – as you know, C was designed
to be fast (or to shoot yourself in the leg quickly).
So, to prevent problems, always set pointers to NULL after freeing
them.
free(p);
p = NULL;
For advanced practitioners: sometimes even this may be enough.
int *p = calloc(1000, sizeof(int)); int *q = p + 3; printf("p = %x\n", p[3]); printf("q = %x\n", *q); free(p); p = NULL; printf("q = %x\n", *q);
Here, we have several pointers. While we set the pointer to the
beginning of the block to NULL, we
may not know who else references our memory. The code might even work
correctly, using the “ghost” values from the freed memory, until a change in a
completely different place will mysteriously modify the value pointed to by q.
So how would you debug a situation like this? I would overwrite the
whole memory block with 0 (or another suitable value) just before freeing it:
memset(p, 0, 1000);
free(p);
p = NULL;
This, of course, will not solve the problem, but it might help find the bug
faster – the value that q points to
will change after freeing p.
7. Always check that the allocation was successful.
In general, you should always check the return code of any function.
int *p = malloc(1000000000 * sizeof(int)); p[0] = 13; free(p);
What’s wrong here? It’s not the fact that we try to allocate enough
space for a billion integers, but the fact that we never check if we actually
got it.
Let’s try again:
int *p = malloc(1000000000 * sizeof(int)); if (!p) { fprintf(stderr, "Not enough memory.\n"); } else { p[0] = 13; free(p); }
Easy, right?
Let’s try a more complicated case:
typedef struct Employee { char name[10]; // ... } Employee; typedef struct Employees { size_t count; Employee *pList; } Employees; bool AddEmployee(Employees *pEmployees, const Employee *pPers) { assert(pEmployees); assert(pPers); if (pEmployees->count == 0) { assert(pEmployees->pList == NULL); if (NULL == (pEmployees->pList = malloc(sizeof(Employee)))) return false; pEmployees ->pList[0] = *pPers; ++pEmployees->count; return true; } else { assert(pEmployees->pList != NULL); if (NULL == (pEmployees->pList = realloc(pEmployees->pList, sizeof(Employee) * (pEmployees->count + 1)))) return false; pEmployees->pList[pEmployees->count] = *pPers; ++pEmployees->count; return true; } }
At first sight, this looks correct. We verify if the allocation was
successful and we stop if it didn’t work. Maybe it could’ve been written using
only realloc, but that’s not our main
concern now.
However, a quick look in the documentation tells us that realloc returns NULL on failure. What we’ve created here is a memory leak – if
there is not enough room for a new employee, we lose the list (even though it’s
still allocated).
The correct approach would be to use a temporary variable:
Employee *pNewList = NULL; assert(pEmployees->pList != NULL); if (NULL == (pNewList = realloc(pEmployees->pList, sizeof(Employee) * (pEmployees->count + 1)))) return false; pEmployees->pList = pNewList; pEmployees->pList[pEmployees->count] = *pPers; ++pEmployees->count; return true;
8. free() needs a pointer to the beginning of the allocated memory block.
Consider the following code:
size_t count = 100; size_t i = 0; int *p = malloc(count * sizeof(int)); int *q = p + count; while (p != q) { *p = i; ++p; } // ... free(p);
The result is unspecified and probably bad. Fortunately, it’s easy to fix:
size_t count = 100; size_t i = 0; int * const pStart = malloc(count * sizeof(int)); int *p = pStart; int *q = p + count; while (p != q) { *p = i; ++p; } // ... free(pStart);
The const keeps us honest, by preventing us from moving the pointer. Remember that only the const after the asterisk makes the
pointer constant.
9. Never write more than you have allocated.
This goes even if you don’t use the heap. There is still too much code like this out there:
char c[3]; sprintf(c, "Abc"); puts(c);
There are tools to find this. But it’s still the programmer’s
responsibility to keep track of allocated memory and to make sure only the
allocated memory is accessed.
The traditional C functions like sprintf
and strcpy are not very safe, since
they don’t check if you’re writing outside the allocated string. There is a strncat and strncpy, but there’s a gotcha: if the block is not enough to write
the string, the string terminator is not
added automatically.
char c[4]; strncpy(c, "Hello, world!", ARRAYSIZE(c) - 1); c[ARRAYSIZE(c) - 1] = '\0'; // Make it safe.
The newer versions of the standard add Microsoft’s safe string
functions (sprintf_s, strcat_s), but only optional.
Use them if you can.
If you don’t have them, you may want to create your own.
10. String literals are constant.
This is the type of thing that can cause problems when upgrading from an
old compiler to a newer, stricter one.
char *c = "abc"; c[0] = 'A'; printf("%s", c);
On modern compilers, this give you an access violation – you are trying
to write in a read-only memory block. Older compilers let you get away with
this. Some compilers make the strings read-only by default, but have a switch
to make them writable.
It’s perhaps surprising that most compilers let you compile this code
without warnings, even on the highest warning level. I would have expected them
to complain that c is not const char * (or const char[]).
11. Always use the correct deallocation function.
If you also use C++, you will know that whatever you’ve allocated with new has to be freed with delete and what’s been allocated with malloc/realloc/calloc should be freed
with free. But that’s not the end of
the story.
Operating systems offer various memory management functions.
For example, on Windows you have GlobalAlloc, VirtualAlloc, CoTaskMemoryAlloc
and perhaps others, each with its own counterpart for freeing the memory. If
you call the wrong function, there’s no telling what will happen (but you can
be sure that it’s not going to be good).
You may think you’re safe if you use just malloc/calloc/realloc and free. Think again.
Employee * CreateEmployee(const char *name, Gender gender) { Employee *pEmp = calloc(1, sizeof(Employee)); if (pEmp) { strcpy_s(pEmp->name, ARRAYSIZE(pEmp->name), name); pEmp->gender = gender; } return pEmp; } void f(void) { Employee *pEmp = CreateEmployee("John Doe", Gender_Male); if (pEmp) { // Process and store employee // ... free(pEmp); pEmp = NULL; } }
As long as you use the two functions in the same module, you’re
fine. But at some point you might want to move the CreateEmployee function to a library. Somebody with a different
compiler (or the same compiler but different settings) will use your library
but recompile f(). Suddenly, you’ll
have a problem, because the copy of free from
the client code will not know how the calloc
from the Employee library has allocated the memory.
The solution is to create and export a DeleteEmployee function together with the CreateEmployee function:
// Employee library Employee * CreateEmployee(const char *name, Gender gender); void DeleteEmployee(Employee *pEmp); // Main program void f(void) { Employee *pEmp = CreateEmployee("John Doe", Gender_Male); if (pEmp) { // Process and store employee // ... DeleteEmployee(pEmp); pEmp = NULL; } }
Which brings me to the next point:
12. Remember that structures containing pointers need special treatment.
What I mean by this is that you cannot just free the top level object; you need to also free any object owned by it – but not the ones that are just
referenced.
For example:
typedef struct Employee { Gender gender; char *name; char *surname; Date birthday; Employee *pManager; size_t subordinatesCount; Employee *pSubordinates; } Employee; Employee * CreateEmployee(const char *name, const char *surname, Gender gender) { Employee *pEmp = calloc(1, sizeof(Employee)); if (pEmp) { strcpy_s(pEmp->name, ARRAYSIZE(pEmp->name), name); pEmp->gender = gender; } return pEmp; } void f(void) { Employee *pEmp = CreateEmployee("John", "Doe", Gender_Male); if (pEmp) { // Process and store employee // ... free(pEmp->name); free(pEmp->surname); // Don't free manager and subordinates, those are just references, not owned by this object free(pEmp); pEmp = NULL; } }
As Employee grows more
complicated, it becomes more and more convenient to have a DeleteEmployee function that takes care of all these details. In
other words, even in plains C it is worthwhile to think object-oriented.
13. Be sure to document who owns the memory block.
Consider the following function declaration:
const char * GetLastErrorText(void);
Should the caller free the text buffer? It is a pointer to a static
buffer? Maybe the library allocates memory for the message and keeps it until
the next call?
There is only one answer: see the documentation.
No comments:
Post a Comment