Dynamic memory allocation. Memory allocation in C (malloc function) Reserved keyword for dynamic memory allocation

08.12.2021

A program can store information in the computer's main memory in two main ways. The first one uses global and local variables, including arrays, structures and classes. In the case of global and static local variables, the storage location of information is fixed for the entire duration of program execution. In the case of local variables, memory is allocated on the stack. Although Borland C++ handles these variables very efficiently, their use requires the programmer to know in advance the amount of memory that will be required during program execution.

The second way to store information is to use the Borland C++ dynamic memory allocation system. In this method, memory for storing information is allocated from a free memory area as needed and returned back, i.e. released when the need for it has disappeared. The area of free memory lies between the memory area where the program is located and the stack. This area is called the heap and is used for dynamic memory allocation requests.

The advantage of using dynamic memory is that the same memory can be used to store different information during program execution. Since memory is allocated for a specific purpose and freed when its use is completed, the same memory can be used at another point in time for other purposes in another part of the program. Another benefit of dynamic memory allocation is that it can be used to create linked lists, binary trees, and other dynamic data structures.

The core of C's dynamic memory allocation is the malloc() and free() functions, which are parts of the standard library. Whenever a memory allocation request is made by the malloc() function, a portion of the available free memory is allocated. Whenever this memory is freed using the free() function, this memory is returned back to the system.

The C++ language defines two operators for dynamic memory allocation - new and delete.

The ANSI C standard defines only four dynamic memory allocation functions: calloc(), malloc(), free(), and realloc(). However, Borland C++ contains several other dynamic memory allocation functions. When compiling code for the modern 32-bit memory model, memory is flat and typically only the four standard memory allocation functions are used.

The ANSI C standard specifies that the header information required for dynamic memory allocation is contained in the file stdlib.h. However, Borland C++ allows the use of stdlib.h or alloc.h header files. Here we use the stdlib.h header file because it provides portability. Some other dynamic memory allocation functions require alloc.h, malloc.h, or dos.h header files. You need to pay special attention to which header file is needed to use each function.

So. The third type, the most interesting in this topic for us, is the dynamic type of memory.

How did we work with arrays before? int a How do we work now? We allocate as much as needed:

#include < stdio.h> #include < stdlib.h> int main() (size_t size; // Create a pointer to int // – essentially an empty array. int *list; scanf( "%lu", &size); // Allocate memory for size elements of size int // and our "empty array" now refers to this memory. list = (int *)malloc(size * sizeof(int)); for (int i = 0 ; i< size; ++i) { scanf (" %d " < size; ++i) { printf (" %d ", *(list + i)); ) // Don't forget to clean up after yourself! free(list); ) // *

Void * malloc(size_t size);

But in general, this function allocates size bytes of uninitialized memory (not zeros, but garbage).

If the allocation was successful, a pointer to the very first byte of the allocated memory is returned.

If unsuccessful - NULL. Also errno will be equal to ENOMEM (we will look at this wonderful variable later). That is, it would be more correct to write:

#include < stdio.h> #include < stdlib.h> int main () ( size_t size; int *list; scanf ( "%lu", &size); list = (int *)malloc(size * sizeof(int)); if (list == NULL ) ( goto error; ) for (int i = 0 ; i< size; ++i) { scanf (" %d ", list + i); ) for (int i = 0 ; i< size; ++i) { printf (" %d ", *(list + i)); ) free(list); return 0 ; error: return 1 ; ) // *

There is no need to clear a NULL pointer

#include < stdlib.h> int main() (free(NULL);)

– in the same clang everything will go fine (nothing will be done), but in more exotic cases it may well crash the program.

Next to malloc and free in mana you can also see:

void * calloc(size_t count, size_t size);

Just like malloc will allocate memory for count objects of size bytes. The allocated memory is initialized with zeros.

void * realloc (void *ptr, size_t size);

Reallocates (if it can) the memory pointed to by ptr by size bytes. If there is not enough space to increase the allocated memory pointed to by ptr , realloc creates a new allocation (allocation), copies the old data pointed to by ptr , frees the old allocation, and returns a pointer to the allocated memory.

If ptr is NULL, realloc is identical to calling malloc.

If size is zero and ptr is not NULL , a piece of memory of the minimum size is allocated and the original one is freed.

void * reallocf (void *ptr, size_t size);

An idea from the FreeBSD API. Like realloc , but if it fails to reallocate, it clears the accepted pointer.

void * valloc(size_t size);

Like malloc, but allocated memory is page aligned.

In C++, as in many other languages, memory can be allocated statically (memory is allocated before program execution begins and freed after program completion) or dynamically (memory is allocated and freed during program execution).

Static memory allocation is performed for all global and local variables that have explicit declarations in the program (without the use of pointers). In this case, the memory allocation mechanism is determined by the location of the variable description in the program and the memory class specifier in the description. The type of a variable determines the size of the allocated memory area, but the mechanism for allocating memory does not depend on the type. There are two main mechanisms for static memory allocation.

· Memory for each of the global and static (declared with the static specifier) variables is allocated before the start of program execution in accordance with the type description. From the beginning to the end of program execution, these variables are associated with the memory area allocated for them. Thus, they have a global lifetime, but their scope of visibility is different.

· For local variables declared inside a block and without a specifier static memory is allocated in a different way. Before the program starts executing (when it is loaded), a fairly large memory area is allocated, called stack(sometimes the terms are used program stack or call stack to make a distinction between the stack as an abstract data type). The size of the stack depends on the development environment, for example, in MS Visual C++, by default 1 megabyte is allocated for the stack (this value can be customized). During program execution, when entering a certain block, memory is allocated on the stack for variables localized in the block (in accordance with the description of their type); when exiting the block, this memory is freed. These processes are performed automatically, which is why local variables in C++ are often called automatic.

When a function is called, memory is allocated on the stack for its local variables, parameters (the value or address of the parameter is placed on the stack), the result of the function and saving a return point - the address in the program where you need to return when the function completes. When a function exits, all data associated with it is removed from the stack.

The use of the term “stack” is easy to explain - with the accepted approach to memory allocation and freeing, the variables that are placed last on the stack (these are the variables localized in the deepest nested block) are removed from it first. That is, memory allocation and release occurs according to the LIFO principle (LAST IN – FIRST OUT, last in – first out). This is the principle of how a stack works. We will look at the stack as a dynamic data structure and its possible implementation in the next section.

In many cases, statically allocated memory leads to inefficient use of it (this is especially true for large arrays), since the statically allocated memory area is not always actually filled with data. Therefore, in C++, as in many languages, there are convenient means of dynamically generating variables. The essence of dynamic memory allocation is that memory is allocated (captured) upon request from the program and also freed upon request. In this case, the memory size can be determined by the type of the variable or explicitly specified in the request. Such variables are called dynamic. The ability to create and use dynamic variables is closely related to the pointer mechanism.

Summarizing all of the above, we can imagine the following memory allocation scheme during program execution (Figure 2.1). The location of the areas relative to each other in the figure is quite arbitrary, because The operating system takes care of the details of memory allocation.

Figure 2.1 – memory distribution diagram

To conclude this section, let's touch on one painful problem when working with the stack - the possibility of its overflow (this emergency situation is usually called Stack Overflow). The reason that gave rise to the problem is clear - the limited amount of memory that is allocated for the stack when loading a program. The most likely situations for stack overflow are large local arrays and deep nesting of recursive function calls (usually occurs when programming recursive functions is inaccurately, for example, some terminal branch is forgotten).

In order to better understand the stack overflow problem, we recommend performing this simple experiment. In function main declare an array of integers with say a million elements in size. The program will compile, but when you run it, a stack overflow error will occur. Now add the specifier to the beginning of the array description static(or take the array declaration out of the function main) – the program will work!

There is nothing miraculous about this - it’s just that now the array is not placed on the stack, but in the area of global and static variables. The memory size for this area is determined by the compiler - if the program compiled, then it will work.

However, as a rule, there is no need to declare statically generated arrays of huge sizes in a program. In most cases, dynamically allocating memory for such data will be a more efficient and flexible way.

C++ supports three main types discharge(or more "distributions") memory, two of which we are already familiar with:

Static memory allocation holds for and variables. Memory is allocated once, when the program starts, and is retained throughout the entire program.

Automatic memory allocation holds for and . Memory is allocated when entering the block containing these variables, and removed when exiting it.

Dynamic memory allocation is the topic of this lesson.

Dynamic variable allocation

Both static and automatic memory allocation have two common properties:

How does dynamic memory allocation work?

Your computer has memory (perhaps most of it) that is available for use by programs. When you run a program, your operating system loads that program into some portion of this memory. And this memory used by your program is divided into several parts, each of which performs a specific task. One part contains your code, the other is used to perform normal operations (keeping track of which functions are called, creating and destroying global and local variables, etc.). We will talk about it later. However, most of the available memory is simply sitting there, waiting for allocation requests from programs.

When you dynamically allocate memory, you are asking the operating system to reserve some of that memory for use by your program. If the OS can fulfill this request, then the address of this memory is returned back to your program. From now on, your program will be able to use this memory whenever it wishes. When you have already done everything that was necessary with this memory, it needs to be returned back to the operating system to be distributed among other requests.

Unlike static or automatic memory allocation, the program is responsible for requesting and returning dynamically allocated memory.

Freeing up memory

When you dynamically allocate a variable, you can also initialize it via or uniform initialization (in C++11):

int *ptr1 = new int (7); // use direct initialization int *ptr2 = new int ( 8 ); // use uniform initialization

When everything that was needed has already been done with a dynamically allocated variable, you need to explicitly tell C++ to free this memory. For variables this is done using operator delete:

// Assume that ptr was previously allocated using the operator new delete ptr; // return the memory pointed to by ptr back to the operating system ptr = 0; // make ptr a null pointer (use nullptr instead of 0 in C++11)

The delete operator doesn't actually delete anything. It simply returns the memory that was previously allocated back to the operating system. The operating system can then reassign this memory to another application (or the same application again).

Although it may seem that we are deleting variable, but that's not true! A pointer variable still has the same scope as before and can be assigned a new value like any other variable.

Note that deleting a pointer that does not point to dynamically allocated memory can cause problems.

Hanging signs

C++ makes no guarantees about what will happen to the contents of freed memory or the value of the deleted pointer. In most cases, the memory returned to the operating system will contain the same values it had before. liberation, and the pointer will remain pointing to only the already freed (deleted) memory.

A pointer pointing to freed memory is called hanging sign. Dereferencing or removing a dangling pointer will produce unexpected results. Consider the following program:

#include int main() ( int *ptr = new int; *ptr = 8; // place the value in the allocated memory location delete ptr; // return the memory back to the operating system. ptr is now a dangling pointer std::cout<< *ptr; // разыменование висячего указателя приведёт к неожиданным результатам delete ptr; // попытка освободить память снова приведёт к неожиданным результатам также return 0; }

#include

int main()

int * ptr = new int ; // dynamically allocate an integer variable

* ptr = 8 ; // place the value in the allocated memory cell

delete ptr ; // return memory back to the operating system. ptr is now a dangling pointer

std::cout<< * ptr ; // dereferencing a dangling pointer will lead to unexpected results

delete ptr ; // trying to free memory again will lead to unexpected results as well

return 0 ;

In the program above, the value 8, which was previously assigned to a dynamic variable, may or may not continue to be there after it is freed. It is also possible that the freed memory may have already been allocated to another application (or for the operating system's own use), and attempting to access it will cause the operating system to automatically terminate your program.

The process of freeing memory can also lead to the creation several hanging signs. Consider the following example:

#include int main() ( int *ptr = new int; // dynamically allocate an integer variable int *otherPtr = ptr; // otherPtr now points to the same allocated memory as ptr delete ptr; // return memory back to the operating system .ptr and otherPtr are now dangling pointers ptr = 0; // ptr is now nullptr // However, otherPtr is still a dangling pointer! return 0; )

#include

int main()

int * ptr = new int ; // dynamically allocate an integer variable

int * otherPtr = ptr ; // otherPtr now points to the same allocated memory as ptr

delete ptr ; // return memory back to the operating system. ptr and otherPtr are now dangling pointers

ptr = 0 ; // ptr is now nullptr

// However, otherPtr is still a dangling pointer!

return 0 ;

First, try to avoid situations where multiple pointers point to the same portion of allocated memory. If this is not possible, then make it clear which pointer "owns" the memory (and is responsible for deleting it) and which pointers simply access it.

Secondly, when you delete a pointer, and if it does not exit immediately after deletion, then it needs to be made null, i.e. assign the value 0 (or in C++11). By "going out of scope immediately upon deletion" we mean that you delete the pointer at the very end of the block in which it is declared.

Rule: Set deleted pointers to 0 (or nullptr in C++11) unless they go out of scope immediately after being deleted.

Operator new

When requesting memory from the operating system, in rare cases it may not be available (that is, it may not be available).

By default, if the new operator did not work, memory was not allocated, then exception bad_alloc. If this exception is not handled correctly (and it will be, since we haven't looked at exceptions and their handling yet), then the program will simply stop executing (crash) with an unhandled exception error.

In many cases, the process of throwing an exception with the new operator (as well as crashing the program) is undesirable, so there is an alternative form of the new operator that returns a null pointer if memory cannot be allocated. You just need to add constant std::nothrow between the new keyword and the data type:

int *value = new (std::nothrow) int; // the value pointer will become null if the dynamic allocation of an integer variable fails

In the example above, if new does not return a pointer with dynamically allocated memory, then a null pointer will be returned.

Dereferencing it is also not recommended, as this will lead to unexpected results (most likely a program crash). Therefore, the best practice is to check all memory allocation requests to ensure that the requests complete successfully and the memory is allocated:

int *value = new (std::nothrow) int; // request to allocate dynamic memory for an integer value if (!value) // handle the case when new returns null (i.e. memory is not allocated) ( // Handle this case std::cout<< "Could not allocate memory"; }

Since non-allocation of memory by the new operator is extremely rare, programmers usually forget to perform this check!

Null pointers and dynamic memory allocation

Null pointers (pointers with the value 0 or nullptr) are especially useful during the dynamic memory allocation process. Their presence as if informs us: “No memory is allocated to this pointer.” And this, in turn, can be used to perform conditional memory allocation:

// If ptr has not yet been allocated memory, then allocate it if (!ptr) ptr = new int;

Removing the null pointer has no effect. So the following is not necessary:

if (ptr) delete ptr;

if(ptr)

delete ptr ;

Instead you can simply write:

delete ptr ;

If ptr is not null, then the dynamically allocated variable will be deleted. If the pointer value is null, then nothing will happen.

Memory leak

Dynamically allocated memory has no scope, i.e. it remains allocated until it is explicitly freed or until your program finishes executing (and the operating system clears all memory buffers on its own). However, pointers used to store dynamically allocated memory addresses follow the scoping rules of regular variables. This discrepancy can cause interesting behavior. For example:

void doSomething() ( int *ptr = new int; )

Working with dynamic memory is often a bottleneck in many algorithms, unless special tricks are used.

In this article I will look at a couple of such techniques. The examples in the article differ (for example, from) in that overloading of the new and delete operators is used, and due to this, the syntactic structures will be minimalistic, and the rework of the program will be simple. The pitfalls found in the process are also described (of course, gurus who read the standard from cover to cover will not be surprised).

0. Do we need manual work with memory?

First of all, let's check how much a smart allocator can speed up memory work.

Let's write simple tests for C++ and C# (C# is known for its excellent memory manager, which divides objects into generations, uses different pools for objects of different sizes, etc.).

Class Node ( public: Node* next; ); // ... for (int i = 0; i< 10000000; i++) { Node* v = new Node(); }

Class Node ( public Node next; ) // ... for (int l = 0; l< 10000000; l++) { var v = new Node(); }

Despite all the “spherical-vacuum” nature of the example, the time difference was 10 times (62 ms versus 650 ms). In addition, the C# example is finished, and according to the rules of good manners in C++, the selected objects must be deleted, which will further increase the gap (up to 2580 ms).

1. Object pool

The obvious solution is to take a large block of memory from the OS and split it into equal blocks of size sizeof(Node), when allocating memory, take the block from the pool, and when freeing it, return it to the pool. The easiest way to organize a pool is using a singly linked list (stack).

Since the goal is minimal intervention in the program, all that can be done is to add the BlockAlloc mixin to the Node class:
class Node: public BlockAlloc

First of all, we need a pool of large blocks (pages), which we take from the OS or C-runtime. It can be organized on top of the malloc and free functions, but for greater efficiency (to skip the extra level of abstraction), we use VirtualAlloc/VirtualFree. These functions allocate memory in multiples of 4K blocks and also reserve process address space in multiples of 64K blocks. By simultaneously specifying the commit and reserve options, we jump another level of abstraction, reserving address space and allocating memory pages in a single call.

PagePool class

inline size_t align(size_t x, size_t a) ( return ((x-1) | (a-1)) + 1; ) //#define align(x, a) ((((x)-1) | ( (a)-1)) + 1) template class PagePool ( public: void* GetPage() ( void* page = VirtualAlloc(NULL, PageSize, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE); pages.push_back(page); return page; ) ~PagePool() ( for (vector ::iterator i = pages.begin(); i != pages.end(); ++i) ( VirtualFree(*i, 0, MEM_RELEASE); ) ) private: vector pages; );

Then we organize a pool of blocks of a given size

BlockPool class

template class BlockPool: PagePool ( public: BlockPool() : head(NULL) ( BlockSize = align(sizeof(T), Alignment); count = PageSize / BlockSize; ) void* AllocBlock() ( // todo: lock(this) if (!head) FormatNewPage(); void* tmp = head; head = *(void**)head; return tmp; ) void FreeBlock(void* tmp) ( // todo: lock(this) *(void**)tmp = head; head = tmp; ) private: void* head; size_t BlockSize; size_t count; void FormatNewPage() ( void* tmp = GetPage(); head = tmp; for(size_t i = 0; i< count-1; i++) { void* next = (char*)tmp + BlockSize; *(void**)tmp = next; tmp = next; } *(void**)tmp = NULL; } };

Comment // todo: lock(this) Places that require inter-thread synchronization are marked (for example, use EnterCriticalSection or boost::mutex).

I’ll explain why when “formatting” a page, the FreeBlock abstraction is not used to add a block to the pool. If something like this were written

For (size_t i = 0; i< PageSize; i += BlockSize) FreeBlock((char*)tmp+i);

Then the page, using the FIFO principle, would be marked up “in reverse”:

Several blocks requested from the pool in a row would have descending addresses. But the processor doesn’t like to go backwards, this breaks Prefetch ( UPD: Not relevant for modern processors). If you do markup in a loop
for (size_t i = PageSize-(BlockSize-(PageSize%BlockSize)); i != 0; i -= BlockSize) FreeBlock...
then the markup cycle would go backwards to the addresses.

Now that the preparations are done, we can describe the mixin class.
template class BlockAlloc ( public: static void* operator new(size_t s) ( if (s != sizeof(T)) ( return::operator new(s); ) return pool.AllocBlock(); ) static void operator delete(void * m, size_t s) ( if (s != sizeof(T)) ( ::operator delete(m); ) else if (m != NULL) ( pool.... static void* operator new(size_t, void * m) ( return m; ) // ...and the warning about missing placement delete... static void operator delete(void*, void*) ( ) private: static BlockPool pool; ); template BlockPool BlockAlloc ::pool;

I’ll explain why checks are needed if (s != sizeof(T))
When do they work? Then, when a class inherited from the base T is created/deleted.
The heirs will use the usual new/delete, but BlockAlloc can also be mixed in with them. This way, we can easily and safely determine which classes should use the pools without fear of breaking something in the program. Multiple inheritance also works great with this mixin.

Ready. We inherit Node from BlockAlloc and re-run the test.
The test time is now 120 ms. 5 times faster. But in C# the allocator is still better. It's probably not just a linked list. (If immediately after new we call delete, and thus do not waste a lot of memory, placing the data in the cache, we get 62 ms. Strange. Exactly like the .NET CLR, as if it returns freed local variables immediately to the corresponding pool, without waiting for GC)

2. Container and its colorful contents

Do you often come across classes that store a lot of different child objects, such that the lifetime of the latter is no longer than the lifetime of the parent?

For example, this could be an XmlDocument class filled with Node and Attribute classes, as well as c-strings (char*) taken from the text inside the nodes. Or a list of files and directories in the file manager that are loaded once when the directory is re-read and never change again.

As was shown in the introduction, delete is more expensive than new. The idea of the second part of the article is to allocate memory for child objects in a large block associated with the Parent object. When a parent object is deleted, the children's destructors will be called, as usual, but the memory will not need to be returned - it will be freed in one large block.

Let's create a PointerBumpAllocator class that can bite off pieces of different sizes from a large block and allocate a new large block when the old one is exhausted.

PointerBumpAllocator class

template class PointerBumpAllocator ( public: PointerBumpAllocator() : free(0) ( ) void* AllocBlock(size_t block) ( // todo: lock(this) block = align(block, Alignment); if (block > free) ( free = align (block, PageSize); head = GetPage(free); ) void* tmp = head; head = (char*)head + block; free -= block; return tmp; ) ~PointerBumpAllocator() ( for (vector ::iterator i = pages.begin(); i != pages.end(); ++i) ( VirtualFree(*i, 0, MEM_RELEASE); ) ) private: void* GetPage(size_t size) ( void* page = VirtualAlloc(NULL, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE); pages.push_back(page) ; return page; ) vector pages; void* head; size_t free; ); typedef PointerBumpAllocator<>DefaultAllocator;

Finally, let's describe a ChildObject mixin with overloaded new and delete accessing a given allocator:

Template struct ChildObject ( static void* operator new(size_t s, A& allocator) ( return allocator.AllocBlock(s); ) static void* operator new(size_t s, A* allocator) ( return allocator->AllocBlock(s); ) static void operator delete(void*, size_t) ( ) // *1 static void operator delete(void*, A*) ( ) static void operator delete(void*, A&) ( ) private: static void* operator new(size_t s ); );

In this case, in addition to adding a mixin to the child class, you will also need to correct all calls to new (or use the “factory” pattern). The syntax of the new operator will be as follows:

New (...parameters for the operator...) ChildObject (...parameters for the constructor...)

For convenience, I have specified two new operators that accept A& or A*.
If the allocator is added to the parent class as a member, the first option is more convenient:
node = new(allocator) XmlNode(nodename);
If the allocator is added as an ancestor (mixin), the second one is more convenient:
node = new(this) XmlNode(nodename);

There is no special syntax for calling delete; the compiler will call standard delete (marked *1), regardless of which new operator was used to create the object. That is, the delete syntax is normal:
delete node;

If an exception occurs in the constructor of ChildObject (or its descendant), delete is called with a signature corresponding to the signature of the new operator used to create this object (the first parameter size_t will be replaced by void*).

Placing the new operator in the private section protects against calling new without specifying an allocator.

Here's a complete example of using the Allocator-ChildObject pair:

Example

class XmlDocument: public DefaultAllocator ( public: ~XmlDocument() ( for (vector ::iterator i = nodes.begin(); i != nodes.end(); ++i) ( delete (*i); ) ) void AddNode(char* content, char* name) ( char* c = (char*)AllocBlock(strlen(content)+1); strcpy(c, content); char* n = (char*)AllocBlock(strlen(name)+1); strcpy(n, content); nodes.push_back(new(this) XmlNode(c, n)); ) class XmlNode: public ChildObject ( public: XmlNode(char* _content, char* _name) : content(_content), name(_name) ( ) private: char* content; char* name; ); private: vector nodes; );

Conclusion. The article was written 1.5 years ago for the sandbox, but alas, the moderator did not like it.