APR memory pool

Brunda K
Brunda’s Tech Notes
7 min read · Aug 19, 2023


The Apache HTTPD Web Server uses APR pools for memory allocation. In this article, I would like to talk about how APR pools work.

APR pools are the memory management implementation provided by the APR (Apache Portable Runtime) library. Every allocation is made from a given pool, which eases the memory management problems that developers run into in programming languages such as C.

When memory is allocated, it is important to release it back to the system correctly. In programming languages such as C, the programmer must ensure that

  • memory is released (that is, there is no memory leak),
  • no piece of code accesses memory that has already been released (in case memory is shared by many threads, it is possible to introduce bugs that result in one thread accessing memory that has already been freed by another thread),
  • no double frees are done.

The pool implementation provided by the APR library aims to solve the above problems by clearly defining a lifetime for all memory allocations that happen in a pool. Each pool has a lifetime: the period during which it is safe to access memory allocated from it. Once the lifetime of the pool is over, the pool is cleared or destroyed. If the pool is destroyed, the memory is released back to the OS.

The memory allocation routines provided by the APR library take the pool from which memory needs to be allocated as one of the arguments. The lifetime of the memory allocation will be the same as that of the pool from which the memory was allocated. This takes away the responsibility of freeing each chunk of allocated memory from the programmer.

The programmer, however, still has to decide how long the allocated memory needs to remain available, and allocate it from the pool whose lifetime matches. The framework for this decision is as follows:

  • If the memory is needed for the lifetime of a single request, it is allocated from a pool called request pool. (See: ap_create_request in protocol.c for request pool creation)
  • If the memory is needed for the entire lifetime of a connection, it is allocated from a pool called transaction pool or connection pool. The connection or transaction pool can be accessed using conn_rec->pool field (see: core_create_conn in core.c). A new transaction pool is created every time a new socket is accepted.
  • If the memory is needed for the entire lifetime of a child Apache Web server process, it is allocated from pchild pool. The pchild pool is created in child_main function in event.c.

At this point, we know that pools have a lifetime and that memory allocated from a pool has the same lifetime as the pool. We also understand that the programmer does not need to free each chunk of memory that has been allocated from a pool. However, the programmer needs to choose the correct pool for each allocation, depending on how long the data needs to be available to the program.
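To make the lifetime idea concrete, here is a minimal toy model in plain C. It is not APR's implementation, and every name in it (lifetime_pool, lifetime_pool_create, lifetime_palloc, lifetime_pool_destroy, lifetime_live_blocks) is hypothetical. Each pool remembers its allocations and its child pools, so destroying a pool frees everything allocated from it, much like a connection pool that outlives the request pools created under it.

```c
#include <stdlib.h>

/* Toy model only. The union keeps the payload after the header suitably aligned. */
typedef union lifetime_block {
    union lifetime_block *next;
    long double align_ld;
    long long align_ll;
} lifetime_block;

typedef struct lifetime_pool {
    lifetime_block *blocks;         /* every chunk allocated from this pool */
    struct lifetime_pool *parent;   /* NULL for a top-level pool */
    struct lifetime_pool *child;    /* first child pool */
    struct lifetime_pool *sibling;  /* next child of the same parent */
} lifetime_pool;

int lifetime_live_blocks = 0;       /* chunks currently allocated, for the demo */

lifetime_pool *lifetime_pool_create(lifetime_pool *parent) {
    lifetime_pool *p = calloc(1, sizeof *p);
    if (p == NULL) return NULL;
    p->parent = parent;
    if (parent != NULL) {           /* register with the parent */
        p->sibling = parent->child;
        parent->child = p;
    }
    return p;
}

void *lifetime_palloc(lifetime_pool *p, size_t size) {
    lifetime_block *b = malloc(sizeof *b + size);
    if (b == NULL) return NULL;
    b->next = p->blocks;            /* remember the chunk so the pool can free it */
    p->blocks = b;
    lifetime_live_blocks++;
    return b + 1;                   /* payload starts right after the header */
}

void lifetime_pool_destroy(lifetime_pool *p) {
    /* Children cannot outlive their parent; each child unlinks itself. */
    while (p->child != NULL)
        lifetime_pool_destroy(p->child);
    /* Free every chunk in one go; the caller never frees chunks itself. */
    while (p->blocks != NULL) {
        lifetime_block *next = p->blocks->next;
        free(p->blocks);
        p->blocks = next;
        lifetime_live_blocks--;
    }
    /* Unlink this pool from its parent's list of children. */
    if (p->parent != NULL) {
        lifetime_pool **link = &p->parent->child;
        while (*link != p)
            link = &(*link)->sibling;
        *link = p->sibling;
    }
    free(p);
}
```

In this model, a "connection" pool would be created with a NULL parent and each "request" pool with the connection pool as its parent. Destroying a request pool frees only that request's chunks, while destroying the connection pool frees whatever is left, including any request pools that are still alive.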

Sample allocation from an APR pool

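/* Assumes apr_initialize() has already been called. */
apr_pool_t *mp;
/* Create a memory pool; NULL means no explicit parent pool. */
apr_pool_create(&mp, NULL);

/* Allocate memory chunks from the memory pool. */
char *buf1, *buf2, *buf3;
buf1 = apr_palloc(mp, 512);
buf2 = apr_palloc(mp, 1024);
buf3 = apr_palloc(mp, 2048);

/* Destroy the entire pool; buf1, buf2 and buf3 must not be used afterwards. */
apr_pool_destroy(mp);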

The above code snippet shows that:

  • A memory pool is first created using apr_pool_create.
  • Allocations are made from the pool using apr_palloc.
  • The entire pool is destroyed using apr_pool_destroy.

Deep dive into the structure of an APR pool

At this point, it is necessary to understand the data structures and some important functions associated with APR pool. This understanding is essential to appreciate the other benefits that APR pools provide.

The type of an APR pool is apr_pool_t. It is defined in apr/memory/unix/apr_pools.c.

A pool consists of a linked list of apr_memnode_t structs called active and an allocator of type apr_allocator_t.

struct apr_pool_t {
    apr_memnode_t   *active;
    apr_allocator_t *allocator;
    ....
    ....
};

[Figure: the pool->active linked list]

active is a linked list of apr_memnode_t structs. Each apr_memnode_t struct contains a pointer to the next node on the list. It also contains a pointer into an area of memory that has been allocated. The diagram above shows the size of the memory that is allocated. The area may have been allocated on the heap with malloc, or it may be part of an anonymous mmap'ed region. The memory that is allocated is always aligned to the page size of the system.

The apr_memnode_t structure is defined in apr_allocator.h. The allocated area of memory contains the apr_memnode_t struct itself, followed by the remaining bytes that are available for use.

apr_memnode_t also contains a field called index, which indicates the size of the node. Instead of storing the size as an absolute number of bytes, it is stored in units of BOUNDARY_SIZE: index = total size allocated / BOUNDARY_SIZE. The field first_avail points to the first byte that can be returned to the caller when an allocation request is made.

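struct apr_memnode_t {
    apr_memnode_t  *next;        /* next node on the list */
    apr_memnode_t **ref;         /* reference to self */
    apr_uint32_t    index;       /* size of the node, in BOUNDARY_SIZE units */
    apr_uint32_t    free_index;  /* how much free space is left */
    char           *first_avail; /* first byte available for allocation */
    char           *endp;        /* end of the node's memory */
};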
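The layout described above, with the header and the usable bytes sharing one allocation, can be sketched as follows. This is a simplified model rather than APR's code: the sketch_ names are hypothetical, the 4096-byte boundary is an assumption, and the index is computed with the simplified formula given in this article.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_BOUNDARY_SIZE 4096u  /* assumption: 4 KiB boundary */

/* Same fields as the apr_memnode_t shown above, under a hypothetical name. */
typedef struct sketch_memnode {
    struct sketch_memnode  *next;
    struct sketch_memnode **ref;
    uint32_t index;        /* total size / SKETCH_BOUNDARY_SIZE */
    uint32_t free_index;
    char    *first_avail;  /* first byte that can be handed out */
    char    *endp;         /* one past the last usable byte */
} sketch_memnode;

/* Round size up to the next multiple of the boundary. */
size_t sketch_align_boundary(size_t size) {
    return (size + SKETCH_BOUNDARY_SIZE - 1) / SKETCH_BOUNDARY_SIZE
           * SKETCH_BOUNDARY_SIZE;
}

/* Allocate one node that can hold at least `usable` bytes after its header. */
sketch_memnode *sketch_node_create(size_t usable) {
    size_t total = sketch_align_boundary(usable + sizeof(sketch_memnode));
    sketch_memnode *node = malloc(total);
    if (node == NULL) return NULL;
    node->next = NULL;
    node->ref = NULL;
    node->index = (uint32_t)(total / SKETCH_BOUNDARY_SIZE);
    node->free_index = 0;
    /* The header sits at the start of the area; usable bytes follow it. */
    node->first_avail = (char *)node + sizeof(sketch_memnode);
    node->endp = (char *)node + total;
    return node;
}

/* Bytes still available in the node. */
size_t sketch_node_free_space(const sketch_memnode *node) {
    return (size_t)(node->endp - node->first_avail);
}
```

With these assumptions, a request for 100 usable bytes yields a single 4096-byte node with index 1, and its free space is 4096 minus the size of the header. Every allocation served from the node moves first_avail forward and shrinks that free space.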

In addition to the list of active nodes, each pool is associated with an allocator. The allocator is of type apr_allocator_t. The allocator has an array of free nodes which are again structs of apr_memnode_t type. apr_allocator_t is defined in apr/memory/unix/apr_pools.c

struct apr_allocator_t {
    apr_memnode_t *free[MAX_INDEX];
    ....
    ....
};

The free array holds pointers to apr_memnode_t nodes that are currently unused. It is ordered by the value of the index field of apr_memnode_t, so nodes of a given size are found at the matching position in the array.

Allocation algorithm

When the program requests size bytes of memory using any of the pool allocation routines, such as apr_palloc or apr_pcalloc, the allocator executes the following steps to obtain the desired memory:

  • Check whether the active node has enough memory. The first node in the pool->active list is checked to see if it has enough free space to satisfy the allocation of size bytes. If so, the current first_avail pointer is saved as mem, first_avail is advanced by size bytes, and mem is returned.
        if (size <= node_free_space(active)) {
            mem = active->first_avail;
            active->first_avail += size;
            return mem;
        }
    This is the best case: the allocation needs only pointer arithmetic, so it is fast and avoids a call to malloc/mmap.
  • If the first node does not have enough space, the next node on the active list is checked. The allocation algorithm tries to maintain the requirement that the first node always has the most free space. Each allocation request is a chance to keep the active list sorted so that this requirement holds.
  • This is done as follows: if the first node (pool->active) cannot satisfy the allocation, the next node (active->next) is checked and the pointers are adjusted so that the next node becomes the first and pool->active points to the node with the most free space. The old active node is inserted at the appropriate place, so that the active list stays sorted: the first node has the most free space, each subsequent node has less, and the last node has the least of all.
  • If the size requirement cannot be met from the active list, the allocator tries to satisfy it from the free list. The function that does this is called allocator_alloc. The free list is an array sorted on the index (size) of the memory nodes, so the appropriate slot is easy to find: index = size/BOUNDARY_SIZE.
  • If free[index] is non-NULL, that node is returned and inserted at the appropriate place on the active list.
  • If free[index] is NULL, a new memory allocation call is made.
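The fast path and the fallback in the steps above can be modelled in a few lines. This is a deliberately reduced sketch, not APR's algorithm: it omits the re-sorting of the active list and the free array, it simply pushes a fresh node to the front when the active node is full, and the bump_ names and the 8192-byte default node size are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

#define BUMP_NODE_SIZE 8192u  /* assumed default node size for this sketch */

typedef struct bump_node {
    struct bump_node *next;
    char *first_avail;        /* next byte to hand out */
    char *endp;               /* one past the end of the node */
} bump_node;

typedef struct bump_pool {
    bump_node *active;        /* node tried first on every allocation */
    unsigned node_allocs;     /* how often the slow path called malloc */
} bump_pool;

static size_t bump_align8(size_t n) { return (n + 7u) & ~(size_t)7u; }

/* Slow-path helper: get a node from the system with room for `usable` bytes. */
static bump_node *bump_node_new(size_t usable) {
    size_t hdr = bump_align8(sizeof(bump_node));
    size_t total = hdr + usable;
    if (total < BUMP_NODE_SIZE) total = BUMP_NODE_SIZE;
    bump_node *n = malloc(total);
    if (n == NULL) return NULL;
    n->next = NULL;
    n->first_avail = (char *)n + hdr;
    n->endp = (char *)n + total;
    return n;
}

void *bump_palloc(bump_pool *p, size_t size) {
    bump_node *active = p->active;
    size = bump_align8(size);
    /* Fast path: pointer arithmetic only, no call into the system allocator. */
    if (active != NULL && size <= (size_t)(active->endp - active->first_avail)) {
        void *mem = active->first_avail;
        active->first_avail += size;
        return mem;
    }
    /* Slow path: fetch a new node and make it the active one. */
    bump_node *n = bump_node_new(size);
    if (n == NULL) return NULL;
    p->node_allocs++;
    n->next = active;
    p->active = n;
    void *mem = n->first_avail;
    n->first_avail += size;
    return mem;
}

/* Release every node at once; individual chunks are never freed. */
void bump_pool_destroy(bump_pool *p) {
    while (p->active != NULL) {
        bump_node *next = p->active->next;
        free(p->active);
        p->active = next;
    }
}
```

Starting from a zero-initialised bump_pool, two allocations of 512 and 1024 bytes are served from the same node, the second one exactly 512 bytes after the first, and only one call to malloc is made. A request larger than the remaining space triggers the slow path.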

The free array in apr_allocator_t starts out empty. It is populated when apr_pool_clear is called: the nodes on the active list are moved to free. This mechanism allows memory to be reused after every apr_pool_clear instead of being requested from the system again. Memory is released to the system when apr_pool_destroy is called.
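A small model shows why parking nodes on the free array pays off. It is a sketch under stated assumptions, not APR's code: the recycle_ names, the 4096-byte boundary and the array length of 20 are all made up for illustration, and oversized requests are simply rejected.

```c
#include <stdlib.h>

#define RECYCLE_BOUNDARY  4096u  /* assumed boundary size */
#define RECYCLE_MAX_INDEX 20u    /* assumed number of size classes */

typedef struct recycle_node {
    struct recycle_node *next;
    unsigned index;              /* size class: total bytes / RECYCLE_BOUNDARY */
} recycle_node;

typedef struct recycle_allocator {
    recycle_node *free[RECYCLE_MAX_INDEX]; /* free[i]: idle nodes of class i */
    unsigned system_allocs;                /* how often malloc was needed */
} recycle_allocator;

/* Get a node of the given size class, preferring a parked one. */
recycle_node *recycle_node_get(recycle_allocator *a, unsigned index) {
    if (index == 0 || index >= RECYCLE_MAX_INDEX)
        return NULL;                       /* sketch: unsupported sizes */
    recycle_node *n = a->free[index];
    if (n != NULL) {
        a->free[index] = n->next;          /* reuse, no system call */
    } else {
        n = malloc((size_t)index * RECYCLE_BOUNDARY);
        if (n == NULL) return NULL;
        a->system_allocs++;
        n->index = index;
    }
    n->next = NULL;
    return n;
}

/* Models what clearing a pool does: park a whole list of nodes on free[]. */
void recycle_nodes_park(recycle_allocator *a, recycle_node *list) {
    while (list != NULL) {
        recycle_node *next = list->next;
        list->next = a->free[list->index];
        a->free[list->index] = list;
        list = next;
    }
}

/* Models final teardown: only now does memory go back to the system. */
void recycle_allocator_destroy(recycle_allocator *a) {
    for (unsigned i = 0; i < RECYCLE_MAX_INDEX; i++) {
        while (a->free[i] != NULL) {
            recycle_node *next = a->free[i]->next;
            free(a->free[i]);
            a->free[i] = next;
        }
    }
}
```

After a list of nodes has been parked, asking for a node of the same size class returns one of the parked nodes and system_allocs does not change, which is the reuse that clearing a pool makes possible.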

MaxMemFree directive

APR pools also provide a mechanism to tune the amount of free memory that is kept in the free array at any point. It can be set using apr_allocator_max_free_set function. The Apache directive MaxMemFree provides a way for the user to set this value for the connection pool.

Since many requests can be handled over a single connection (with Keep-Alive), it is possible that one of the requests requires a huge amount of memory. The connection pool allocates this memory while handling the request. If MaxMemFree is set to unlimited, the memory is retained (in the free array) even after the request has been handled, and it is released only when the connection is closed. To avoid this scenario, MaxMemFree can be set to a reasonable value so that large amounts of memory are not held unnecessarily.
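The effect of such a limit can be illustrated with a toy allocator that parks released nodes only while it stays under a byte cap. This is a rough model of the idea, not the accounting APR actually performs. The capped_ names are hypothetical, and a max_free of 0 is taken to mean "unlimited", mirroring how a MaxMemFree value of 0 is documented to mean no limit.

```c
#include <stdlib.h>

typedef struct capped_node {
    struct capped_node *next;
    size_t size;                 /* total bytes in this node */
} capped_node;

typedef struct capped_allocator {
    capped_node *free_list;      /* nodes kept around for reuse */
    size_t max_free;             /* cap in bytes; 0 means unlimited */
    size_t parked;               /* bytes currently on free_list */
    unsigned released;           /* nodes handed straight back to the system */
} capped_allocator;

capped_node *capped_node_new(size_t size) {
    if (size < sizeof(capped_node)) size = sizeof(capped_node);
    capped_node *n = malloc(size);
    if (n != NULL) {
        n->next = NULL;
        n->size = size;
    }
    return n;
}

/* Called when a node is no longer in use by any pool. */
void capped_node_release(capped_allocator *a, capped_node *n) {
    if (a->max_free != 0 && a->parked + n->size > a->max_free) {
        free(n);                 /* over the cap: give it back right away */
        a->released++;
        return;
    }
    n->next = a->free_list;      /* under the cap: keep it for reuse */
    a->free_list = n;
    a->parked += n->size;
}

void capped_allocator_destroy(capped_allocator *a) {
    while (a->free_list != NULL) {
        capped_node *next = a->free_list->next;
        free(a->free_list);
        a->free_list = next;
    }
    a->parked = 0;
}
```

With a cap of 16 KiB, two released 8 KiB nodes are parked, while a 1 MiB node released afterwards goes straight back to the system. With the cap set to 0, the same 1 MiB node would stay parked until the allocator is destroyed, which is the situation the paragraph above warns about.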

Benefits of using APR pools for memory allocation

Fast allocation

As seen above, APR pools provide fast allocation: in most cases an allocation needs just pointer arithmetic.

Freeing resources efficiently

APR pools handle freeing resources/memory efficiently by assigning a lifetime to the resource/memory.
