2024/04/24 - Generate function wrapper with macro magic

A short disclaimer up front: this post discusses C/C++ macros, which I generally try to minimize or avoid altogether. However, they do have their applications and can be useful. All I want to say is: use the right amount of macros for the specific context, and consider your colleagues if you use them in production. :^)


Recently I had the need to wrap a list of C functions and execute some common task before and after each wrapped function call. Hence, the body of each wrapper function was identical. These wrappers were compiled and linked into a shared library, which was then used for LD_PRELOADing.

In this post I want to collect and archive a few approaches to generate the required boilerplate code using preprocessor macros. Progressing from top to bottom, the amount of macro code (magic?) increases.

To examine the generated code, one can use the following command with all the code examples from this post.

g++ -E file.cc | clang-format

With the -E option, g++ stops after the preprocessing stage and writes the output to stdout. Alternatively, one can invoke the preprocessor directly with cpp file.cc.

To syntax check the code examples, one can use the following command.

g++ -fsyntax-only file.cc

Version 1

The first version provides a function-definition-like macro, here called WRAP. This macro is used directly to define a wrapper function: one declares the signature in the WRAP macro, followed by curly braces containing the wrapper body.

The benefit of this approach is that the wrapper implementation is quite explicit, and a reader not familiar with the WRAP macro has a good chance of reasoning about what the code is doing. The drawback is that the body has to be repeated for each wrapper.

// Mock the common wrapper functionality. For the purpose of this discussion,
// this is just a sink accepting any function argument.
#define MOCK_WRAPPER_IMPL(ret, fn) \
  /* do common work */             \
  static ret wrap_##fn(...);

// Utility to generate wrapper boilerplate.
#define WRAP(ret, fn, ...)   \
  MOCK_WRAPPER_IMPL(ret, fn) \
                             \
  extern "C" ret fn(__VA_ARGS__)

WRAP(int, foo, const char* name) {
  return wrap_foo(name);
}

WRAP(int, bar, const char* name, const char* value) {
  return wrap_bar(name, value);
}

In the code examples, I use the MOCK_WRAPPER_IMPL macro (see footnote 1) to mock away the common pre and post work as well as the invocation of the real function.
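
As a rough sketch of what the mocked-away part could look like, the common pre and post work can also live in a small function template instead of a macro. All names below (wrap_impl, real_foo and the two counters) are hypothetical stand-ins and not taken from the original code. In an LD_PRELOAD library the real function would have to be looked up at runtime rather than called directly.

```cpp
#include <utility>

// Hypothetical stand-ins for the common work: count the pre and post steps.
static int g_pre_calls = 0;
static int g_post_calls = 0;

// Hypothetical stand-in for the real (wrapped) function.
static int real_foo(const char* name) {
  return name != nullptr ? 1 : 0;
}

// Common wrapper body: run the pre work, call the real function, run the post
// work and hand back the result. Assumes a non-void return type.
template <typename Fn, typename... Args>
static auto wrap_impl(Fn real, Args&&... args) {
  ++g_pre_calls;
  auto ret = real(std::forward<Args>(args)...);
  ++g_post_calls;
  return ret;
}

// Version 1 style macro, reduced to the signature boilerplate.
#define WRAP(ret, fn, ...) extern "C" ret fn(__VA_ARGS__)

WRAP(int, foo, const char* name) {
  return wrap_impl(real_foo, name);
}
```

Each call to foo then passes through wrap_impl exactly once, so the pre and post work is written in a single place.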

Version 2

The second version provides a simple approach to also generate the wrapper bodies by explicitly writing out a typed list of arguments and a list of argument names.

The main drawbacks are that both the wrapper definition and the body of the WRAP macro are somewhat obfuscated, which makes it hard for a reader to reason about the code. Additionally, the arguments must be specified twice when defining a wrapper.

#define MOCK_WRAPPER_IMPL(ret, fn) \
  /* do common work */             \
  static ret wrap_##fn(...);

// Utility to generate wrapper boilerplate.
#define WRAP(ret, fn, typed_args, args) \
  MOCK_WRAPPER_IMPL(ret, fn)            \
                                        \
  extern "C" ret fn typed_args {        \
    return wrap_##fn args;              \
  }

WRAP(int, foo, (const char* name), (name))
WRAP(int, bar, (const char* name, const char* value), (name, value))

Version 3

The third version provides macros to generate the list of typed arguments as well as the list of argument names. For each arity of the wrapper function, a corresponding macro (WRAP1, WRAP2, ...) generates the boilerplate code with the correct number of arguments.

The example only supports functions with one or two arguments, but the code can easily be extended.

#define ARGS0()
#define ARGS1() a0
#define ARGS2() a1, ARGS1()

#define TYPEDARGS0()
#define TYPEDARGS1(type)      type a0
#define TYPEDARGS2(type, ...) type a1, TYPEDARGS1(__VA_ARGS__)

#define MOCK_WRAPPER_IMPL(ret, fn) \
  /* do common work */             \
  static ret wrap_##fn(...);

// Utility to generate wrapper boilerplate for functions with *one* argument.
#define WRAP1(ret, fn, ty1)            \
  MOCK_WRAPPER_IMPL(ret, fn)           \
                                       \
  extern "C" ret fn(TYPEDARGS1(ty1)) { \
    return wrap_##fn(ARGS1());         \
  }

// Utility to generate wrapper boilerplate for functions with *two* arguments.
#define WRAP2(ret, fn, ty1, ty2)            \
  MOCK_WRAPPER_IMPL(ret, fn)                \
                                            \
  extern "C" ret fn(TYPEDARGS2(ty1, ty2)) { \
    return wrap_##fn(ARGS2());              \
  }

WRAP1(int, foo, const char*)
WRAP2(int, bar, const char*, const char*)
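
Extending this scheme to three arguments might look as follows. This is only a sketch: WRAP3, ARGS3 and TYPEDARGS3 simply continue the naming pattern, and the variadic mock sink is replaced by a typed, hypothetical wrap_baz so that the forwarded arguments can be observed.

```cpp
#define ARGS1() a0
#define ARGS2() a1, ARGS1()
#define ARGS3() a2, ARGS2()

#define TYPEDARGS1(type)      type a0
#define TYPEDARGS2(type, ...) type a1, TYPEDARGS1(__VA_ARGS__)
#define TYPEDARGS3(type, ...) type a2, TYPEDARGS2(__VA_ARGS__)

// Assumption: a typed stand-in instead of the variadic mock sink, so the
// order of the forwarded arguments shows up in the result.
static int wrap_baz(char c, int i, unsigned u) {
  return (c - 'a') * 100 + i * 10 + static_cast<int>(u);
}

// Utility to generate wrapper boilerplate for functions with *three* arguments.
#define WRAP3(ret, fn, ty1, ty2, ty3)            \
  extern "C" ret fn(TYPEDARGS3(ty1, ty2, ty3)) { \
    return wrap_##fn(ARGS3());                   \
  }

// Expands to:
//   extern "C" int baz(char a2, int a1, unsigned a0) {
//     return wrap_baz(a2, a1, a0);
//   }
WRAP3(int, baz, char, int, unsigned)
```

Note that the generated parameter names count down (a2, a1, a0), but the typed list and the name list use the same order, so the arguments are forwarded unchanged.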

Version 4

The fourth version improves the third one by automatically using the correct TYPEDARGS and ARGS macro depending on the arity of the wrapper function.

This is achieved by constructing the correct macro name at preprocessing time. The technique to count the number of elements in the variadic argument list __VA_ARGS__ is presented in VA_NARG.

// Get Nth argument.
#define CPP_NTH(_0, _1, _2, _3, n, ...) n
// Get number of arguments (uses gcc/clang extension for empty argument).
#define CPP_ARGC(...) CPP_NTH(_0, ##__VA_ARGS__, 3, 2, 1, 0)

// Utility to concatenate preprocessor tokens.
#define CONCAT2(lhs, rhs) lhs##rhs
#define CONCAT1(lhs, rhs) CONCAT2(lhs, rhs)

#define ARGS0()
#define ARGS1() a0
#define ARGS2() a1, ARGS1()
#define ARGS3() a2, ARGS2()

// Invoke correct ARGSn macro depending on #arguments.
#define ARGS(...) CONCAT1(ARGS, CPP_ARGC(__VA_ARGS__))()

#define TYPEDARGS0()
#define TYPEDARGS1(ty)      ty a0
#define TYPEDARGS2(ty, ...) ty a1, TYPEDARGS1(__VA_ARGS__)
#define TYPEDARGS3(ty, ...) ty a2, TYPEDARGS2(__VA_ARGS__)

// Invoke correct TYPEDARGSn macro depending on #arguments.
#define TYPEDARGS(...) CONCAT1(TYPEDARGS, CPP_ARGC(__VA_ARGS__))(__VA_ARGS__)

#define MOCK_WRAPPER_IMPL(ret, fn) \
  /* do common work */             \
  static ret wrap_##fn(...);

// Utility to generate wrapper boilerplate.
#define WRAP(ret, fn, ...)                    \
  MOCK_WRAPPER_IMPL(ret, fn)                  \
                                              \
  extern "C" ret fn(TYPEDARGS(__VA_ARGS__)) { \
    return wrap_##fn(ARGS(__VA_ARGS__));      \
  }

WRAP(int, foo, const char*)
WRAP(int, bar, const char*, const char*)
WRAP(int, baz, char, int, unsigned)
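
The dispatch on the argument count can be tried in isolation with a smaller example. The sketch below reuses CPP_NTH, CPP_ARGC and CONCAT1/CONCAT2 from above; the SUM macros are hypothetical and only serve to make the selected macro visible in a result.

```cpp
// Get Nth argument.
#define CPP_NTH(_0, _1, _2, _3, n, ...) n
// Get number of arguments (uses gcc/clang extension for empty argument).
#define CPP_ARGC(...) CPP_NTH(_0, ##__VA_ARGS__, 3, 2, 1, 0)

// Utility to concatenate preprocessor tokens.
#define CONCAT2(lhs, rhs) lhs##rhs
#define CONCAT1(lhs, rhs) CONCAT2(lhs, rhs)

// Hypothetical per-arity macros.
#define SUM1(a)       (a)
#define SUM2(a, b)    ((a) + (b))
#define SUM3(a, b, c) ((a) + (b) + (c))

// Build the name SUM<n> from the argument count, then invoke it.
#define SUM(...) CONCAT1(SUM, CPP_ARGC(__VA_ARGS__))(__VA_ARGS__)

constexpr int sum_of_one   = SUM(1);        // SUM1(1)
constexpr int sum_of_two   = SUM(1, 2);     // SUM2(1, 2)
constexpr int sum_of_three = SUM(1, 2, 3);  // SUM3(1, 2, 3)
```

Passing more arguments shifts the trailing 3, 2, 1, 0 further to the right, so a different number lands in the n slot of CPP_NTH.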

The handling of an empty argument list (CPP_ARGC() == 0) relies on a gcc extension for the ## operator in front of __VA_ARGS__. This is also supported by clang, but not by msvc. Maybe I will come back in the future to update this, if I have the need.
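
One possible direction for such an update is __VA_OPT__ from C++20, which emits its content only if variadic arguments are present. The following is a minimal sketch, assuming a compiler with __VA_OPT__ support; the names CPP_NTH_OPT and CPP_ARGC_OPT are hypothetical and it has not been checked against msvc here.

```cpp
// Hypothetical variants of CPP_NTH / CPP_ARGC based on C++20 __VA_OPT__.
#define CPP_NTH_OPT(_0, _1, _2, _3, n, ...) n
// __VA_OPT__(, ) emits the comma only if at least one argument was passed.
// The trailing `~` keeps the variadic part of CPP_NTH_OPT non-empty.
#define CPP_ARGC_OPT(...) \
  CPP_NTH_OPT(_0 __VA_OPT__(, ) __VA_ARGS__, 3, 2, 1, 0, ~)

// The arguments are only counted by the preprocessor, never evaluated.
constexpr int argc_none  = CPP_ARGC_OPT();
constexpr int argc_one   = CPP_ARGC_OPT(a);
constexpr int argc_two   = CPP_ARGC_OPT(a, b);
constexpr int argc_three = CPP_ARGC_OPT(a, b, c);

static_assert(argc_none == 0, "empty argument list counts as zero");
static_assert(argc_three == 3, "three arguments count as three");
```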

Appendix: CONCAT1 and CONCAT2?

Macro arguments are completely macro-expanded before they are substituted into a macro body, unless they are stringized or pasted with other tokens [..] - gcc

#define FOO() ABC

#define CONCAT2(lhs, rhs) lhs ## rhs
#define CONCAT1(lhs, rhs) CONCAT2(lhs, rhs)

CONCAT2(MOOOSE_, FOO())
// expands to:
//   MOOOSE_FOO()

CONCAT1(MOOOSE_, FOO())
// expands to:
//   MOOOSE_ABC

Appendix: Listify codegen data

A hacker friend once showed me some neat macro magic for organizing data relevant to code generation within a list that can be reused in multiple contexts.

I want to archive this technique here, as it fits the topic well.

The idea is that the list accepts a macro, which is then applied to each entry. The processing macros do not need to use all the elements of an entry.

The example below is somewhat arbitrary and one may not see the benefit, but it serves to demonstrate the technique. The technique really shines when dealing with larger datasets that are used in different locations to generate code.

#define EXCEPTIONS(M)            \
  M(ill_opc  , "Illegal opcode") \
  M(mem_fault, "Memory fault")   \
  M(inv_perm , "Invalid permissions")

enum exception {
#define ELEM(e, _) e,
  EXCEPTIONS(ELEM)
#undef ELEM
};

const char* describe(exception e) {
  switch (e) {
#define CASE(e, d) \
  case e:          \
    return d;
    EXCEPTIONS(CASE)
#undef CASE
  }
  // Not reached for valid enumerators; avoids falling off the end.
  return "unknown";
}

This technique can also be used to generate the function wrappers when using the WRAP macro from version 4.

#define API(M) \
    M(int, foo, const char*) \
    M(int, bar, const char*, const char*) \
    M(int, baz, char, int, unsigned)

API(WRAP)

To conclude this appendix, let me share a quote from that same great hacker friend:

This is the excel-ification of cpp code.

Footnotes

1) In the examples above, MOCK_WRAPPER_IMPL is a macro. However, it does not need to be one, as can be seen in the fn_hook.cc example.