Arena allocator for BSON documents

Francesco_Ballardin · November 14, 2023, 4:34am

Hi all!

I was just wondering if there is any way to make the C driver work with an arena allocator.
Arena allocators have many benefits regarding memory management: the most important here are the ability to reduce the number of system calls needed to allocate memory, and an easier way to release memory instead of being tied to the malloc/free pairing. (If you want to know more I suggest this read).

As an example, let’s look at a simple program like this:

#include <mongoc/mongoc.h>

int main(void) {
  mongoc_init();
  mongoc_cleanup();
  return 0;
}

If compiled and run with Valgrind, we have this output:

==6188== Memcheck, a memory error detector
==6188== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==6188== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==6188== Command: exe/test-bson
==6188== 
==6188== 
==6188== HEAP SUMMARY:
==6188==     in use at exit: 0 bytes in 0 blocks
==6188==   total heap usage: 8,739 allocs, 8,739 frees, 1,376,521 bytes allocated
==6188== 
==6188== All heap blocks were freed -- no leaks are possible
==6188== 
==6188== For lists of detected and suppressed errors, rerun with: -s
==6188== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

By providing an appropriately sized arena, we could reduce those 8,739 mallocs to just a few (or even 1 if we could assume, for example, an arena page of 2MB for mongoc_init).

What would be needed is a way to dynamically tell the driver which allocator to use. I was looking into the docs, and found this: bson_mem_set_vtable, but it’s not very well documented so I’m having trouble understanding if it could work with an arena allocator.

Thank you!

Rishabh_Bisht · November 15, 2023, 10:02am

Hi @Francesco_Ballardin

You are looking in the right direction. A custom memory allocator can be set with bson_mem_set_vtable.
The C Driver (libmongoc) is expected to use the libbson allocation functions (e.g. bson_malloc and bson_free).

Related documentation - Memory Management - libbson 1.25.1
You can see an example usage in the PHP driver which uses bson_mem_set_vtable here: https://github.com/mongodb/mongo-php-driver/blob/master/php_phongo.c#L182-L186
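
To make the vtable idea concrete, here is a minimal, self-contained sketch of the general pattern: allocation calls are routed through a table of function pointers that can be swapped at runtime. It does not use libbson. The names `my_vtable_t`, `my_set_vtable`, `my_restore_vtable`, `my_malloc`, `my_free` and `demo` are hypothetical stand-ins, and libbson's actual implementation may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an allocator vtable; NOT libbson's definition. */
typedef struct {
  void *(*alloc_fn)(size_t);
  void (*free_fn)(void *);
} my_vtable_t;

/* The table starts out pointing at the system allocator. */
static const my_vtable_t default_vtable = { malloc, free };
static my_vtable_t active_vtable = { malloc, free };

void my_set_vtable(const my_vtable_t *vt) { active_vtable = *vt; }
void my_restore_vtable(void) { active_vtable = default_vtable; }

/* Wrappers that library code is expected to call instead of malloc/free. */
void *my_malloc(size_t n) { return active_vtable.alloc_fn(n); }
void my_free(void *p) { active_vtable.free_fn(p); }

/* A custom allocator that only counts calls, then defers to malloc. */
static size_t counted_allocs;
static void *counting_malloc(size_t n) {
  counted_allocs++;
  return malloc(n);
}

/* Returns how many allocations went through the custom table. */
size_t demo(void) {
  my_vtable_t vt = { counting_malloc, free };
  my_set_vtable(&vt);
  counted_allocs = 0;

  void *a = my_malloc(16); /* routed through the table: counted */
  void *b = malloc(16);    /* direct call: bypasses the table */

  my_free(a);
  free(b);
  my_restore_vtable();
  return counted_allocs;
}
```

In this sketch only calls made through the wrappers reach the custom allocator. Any code that calls malloc directly never sees the table, whichever vtable is installed.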

Francesco_Ballardin · November 15, 2023, 11:21am

Hi Rishabh!
Thanks a lot for taking the time to answer.

Could you please help me understand this better:

My expectation is that every allocation (both in libmongoc and in libbson) would use my provided vtable, just like it seems it does in the PHP driver source code that you linked.

However, by running this simple test:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <mongoc/mongoc.h>

typedef struct arena_t arena_t;
struct arena_t {
  uint8_t*  data;
  int64_t   cap;
  int64_t   pos;
  uint32_t  mcnt;
};

arena_t* arena_init (size_t cap) {
  arena_t* arena = calloc(1, sizeof(arena_t));
  arena->data = malloc(cap);
  arena->cap = cap;
  return arena;
}

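/* Bump-allocate size bytes; returns NULL when the arena is exhausted.
   Note: returned pointers are not rounded up to any alignment. */
void* arena_malloc (arena_t* arena, size_t size) {
  if (size > (size_t)(arena->cap - arena->pos)) return NULL;
  arena->mcnt++;
  void* current = arena->data + arena->pos;
  arena->pos += size;
  return current;
}

/* Allocate num * size zeroed bytes, rejecting a multiplication overflow. */
void* arena_calloc (arena_t* arena, size_t num, size_t size) {
  if (size != 0 && num > SIZE_MAX / size) return NULL;
  void* mem = arena_malloc(arena, num * size);
  if (mem) memset(mem, '\0', num * size);
  return mem;
}

/* Allocate a new block and copy the old contents; the old block is not reclaimed.
   The old size is not tracked, so size bytes are copied from oldmem. */
void* arena_realloc (arena_t* arena, void* oldmem, size_t size) {
  void* mem = arena_malloc(arena, size);
  if (mem && oldmem) memmove(mem, oldmem, size);
  return mem;
}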

void arena_free (arena_t* arena) {
  return;
}

void arena_destroy (arena_t* arena) {
  free(arena->data);
  free(arena);
}

arena_t* bson_arena;

void* bson_arena_malloc (size_t size) { return arena_malloc(bson_arena, size); }
void* bson_arena_calloc (size_t num, size_t size) { return arena_calloc(bson_arena, num, size); }
void* bson_arena_realloc (void* mem, size_t size) { return arena_realloc(bson_arena, mem, size); }
void bson_arena_free (void* mem) { return; }

int main (void) {
  bson_arena = arena_init(2 * 1024 * 1024);
  bson_mem_vtable_t bson_arena_arena_vtable = {
    .malloc = bson_arena_malloc,
    .calloc = bson_arena_calloc,
    .realloc = bson_arena_realloc,
    .free = bson_arena_free
  };
  bson_mem_set_vtable(&bson_arena_arena_vtable);

  mongoc_init();
  mongoc_cleanup();

  fprintf(stderr, "total allocs in arena: %u\n", (unsigned) bson_arena->mcnt);
  fprintf(stderr, "total bytes alloc'd in arena: %lld\n", (long long) bson_arena->pos);
  arena_destroy(bson_arena);
  bson_mem_restore_vtable();

  return 0;
}

I get this output using Valgrind memcheck:

==46== Memcheck, a memory error detector
==46== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==46== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
==46== Command: exe/test-bson
==46== 
total allocs in arena: 33
total bytes alloc'd in arena: 805
==46== 
==46== HEAP SUMMARY:
==46==     in use at exit: 0 bytes in 0 blocks
==46==   total heap usage: 8,956 allocs, 8,956 frees, 2,918,362 bytes allocated
==46== 
==46== All heap blocks were freed -- no leaks are possible
==46== 
==46== For lists of detected and suppressed errors, rerun with: -s
==46== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

So it looks like the vtable is working, but only for a tiny fraction of the total mallocs done by the libmongoc driver.

Is there something that I’m missing?
Thanks a lot!
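
A side note on the arena in the test above: a general-purpose malloc replacement is normally expected to return memory aligned for any type, and realloc should copy only as many bytes as the old block actually held. Below is a minimal sketch of a bump allocator that does both by rounding sizes up and storing a small size header in front of each block. All names (`bump_t`, `bump_hdr_t`, `bump_malloc`, and so on) are hypothetical and not part of libbson or libmongoc, and the sketch assumes a C11 compiler for `max_align_t` and `_Alignof`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; not part of libbson or libmongoc. */
#define BUMP_ALIGN (_Alignof(max_align_t))

typedef struct {
  uint8_t *data; /* backing buffer obtained from malloc */
  size_t cap;    /* size of the backing buffer in bytes */
  size_t pos;    /* offset of the next free byte, always a multiple of BUMP_ALIGN */
} bump_t;

/* Header stored in front of each block so realloc knows the old size. */
typedef union {
  size_t size;
  max_align_t align; /* pads the header so the payload stays aligned */
} bump_hdr_t;

/* Round n up to the next multiple of BUMP_ALIGN (a power of two). */
static size_t round_up(size_t n) {
  return (n + BUMP_ALIGN - 1) & ~(BUMP_ALIGN - 1);
}

/* Returns 1 on success, 0 if the backing buffer could not be allocated. */
int bump_init(bump_t *b, size_t cap) {
  b->data = malloc(cap);
  b->cap = b->data ? cap : 0;
  b->pos = 0;
  return b->data != NULL;
}

/* Returns an aligned block, or NULL when the arena is exhausted. */
void *bump_malloc(bump_t *b, size_t size) {
  if (size > SIZE_MAX - sizeof(bump_hdr_t) - BUMP_ALIGN) return NULL;
  size_t need = sizeof(bump_hdr_t) + round_up(size);
  if (need > b->cap - b->pos) return NULL;
  bump_hdr_t *h = (bump_hdr_t *)(b->data + b->pos);
  h->size = size;
  b->pos += need;
  return h + 1; /* payload starts right after the header */
}

/* Allocates a new block and copies min(old size, new size) bytes. */
void *bump_realloc(bump_t *b, void *old, size_t size) {
  void *mem = bump_malloc(b, size);
  if (mem && old) {
    size_t old_size = ((bump_hdr_t *)old - 1)->size;
    memcpy(mem, old, old_size < size ? old_size : size);
  }
  return mem;
}

/* Releases every block at once by rewinding the bump pointer. */
void bump_reset(bump_t *b) { b->pos = 0; }

void bump_destroy(bump_t *b) {
  free(b->data);
  b->data = NULL;
  b->cap = b->pos = 0;
}
```

The header costs some extra bytes per allocation, so the totals would differ from the numbers reported above. Also note that `bump_realloc` only works for pointers that came from the same arena; a pointer obtained from another allocator has no header in front of it.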