[libbson] function `bson_iter_utf8` memory management

Hi everyone! I’d like to better understand the char* returned by the function bson_iter_utf8.
The documentation says “returns a UTF-8 encoded string that has not been modified or freed”. But from my test it looks like this string is just a pointer into the bson_t buffer, so it is valid only as long as the bson_t is valid. Is that correct?

I’m doing this simple test here:

#include <stdio.h>
#include <mongoc/mongoc.h>

int main (int argc, char** argv) {
  mongoc_init ();

  const char* string = NULL;
  bson_t* document = bson_new ();
  BSON_APPEND_UTF8 (document, "key", "test_string_123");
  bson_iter_t iter;
  if (bson_iter_init_find (&iter, document, "key")) {
    string = bson_iter_utf8 (&iter, NULL);
    printf ("before destroy: %s\n", string);
  }
  bson_destroy (document);
  printf ("after destroy: %s\n", string);

  mongoc_cleanup ();

  return 0;
}

If I compile and run it, it seems to print everything correctly:

$ gcc -o exe/test tmp/test.c -I/usr/include/libbson-1.0 -I/usr/include/libmongoc-1.0 -lmongoc-1.0 -lbson-1.0  && exe/test

before destroy: test_string_123
after destroy: test_string_123

But if I run it with valgrind I get this (shortened):

$ valgrind --track-origins=yes --leak-check=full --show-leak-kinds=all exe/test

==441== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==441== Command: exe/test
==441== 
before destroy: test_string_123
==441== Invalid read of size 1
==441==    at 0x484ED16: strlen (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==441==    by 0x49CDDB0: __vfprintf_internal (vfprintf-internal.c:1517)
==441==    by 0x49B781E: printf (printf.c:33)
==441==    by 0x109331: main (in /hb/exe/test)
==441==  Address 0x74ef835 is 21 bytes inside a block of size 128 free'd
==441==    at 0x484B27F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==441==    by 0x109315: main (in /hb/exe/test)
==441==  Block was alloc'd at
==441==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==441==    by 0x49353A2: bson_malloc (in /usr/lib/x86_64-linux-gnu/libbson-1.0.so.0.0.0)
==441==    by 0x492CEE1: bson_new (in /usr/lib/x86_64-linux-gnu/libbson-1.0.so.0.0.0)
==441==    by 0x109281: main (in /hb/exe/test)
[...]
after destroy: test_string_123
==441== 
==441== HEAP SUMMARY:
==441==     in use at exit: 0 bytes in 0 blocks
==441==   total heap usage: 8,735 allocs, 8,735 frees, 1,371,850 bytes allocated
==441== 
==441== All heap blocks were freed -- no leaks are possible
==441== 

So the question is: if I need to use this string after the bson_t has been destroyed (e.g. when iterating over a cursor), do I have to make a copy of the string using bson_iter_dup_utf8?

Thank you for your help!
Have a nice day!

Yes, that is correct. The lifetime of the returned string depends on the lifetime of the bson_t.

Hi Kevin, thanks for your reply and clarification.
Have a nice weekend!
