~$ putting main in a namespace
Normally, I would simply place my main() in a file called, for example,
main.cpp or <project_name>.cpp, and that would serve as the "root" of
the project. Within this file, I would include as many headers as are
necessary to get the job done. This is generally what I see in other code
as well, whether main() is a huge function that does everything or a
simple two-line jumping-off point for a GUI application.

But I recently found that this doesn't need to be the case. main() can be
in a namespace. It can be somewhere random in your code. It just has to
be somewhere in your code. So why not hide it in a bird's nest of
namespaces?

Well, we can't do this for free. Suppose we have a simple hello world
application, but we hide main() inside a namespace.
#include <iostream>

namespace foo
{
    int main(int argc, char **argv)
    {
        std::cout << "Hello, world!" << std::endl;
        return 0;
    }
}
We can compile this successfully. On Linux, I can run g++ -c test.cpp and
I will get a valid test.o object file. However, this file cannot be
linked into an executable. If you try to run g++ test.o -o test, you will
get an error to the effect of undefined reference to 'main'.
This happens because C++ compilers mangle the names of symbols when compiling the source code. This allows functions to have the same name in source within different namespaces and scopes without resulting in a symbol collision [1].
We can check that this is happening on Linux with the nm utility to look
at the symbols. Using the above code, running nm test.o results in:
U __cxa_atexit
U __dso_handle
U _GLOBAL_OFFSET_TABLE_
0000000000000087 t _GLOBAL__sub_I__ZN3foo4mainEiPPc
000000000000003e t _Z41__static_initialization_and_destruction_0ii
0000000000000000 T _ZN3foo4mainEiPPc
U _ZNSolsEPFRSoS_E
U _ZNSt8ios_base4InitC1Ev
U _ZNSt8ios_base4InitD1Ev
U _ZSt4cout
U _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
0000000000000000 r _ZStL19piecewise_construct
0000000000000000 b _ZStL8__ioinit
U _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
As you can see, the names are quite mangled, but you can still make out in
the middle our main function, which compiled to the symbol
_ZN3foo4mainEiPPc. Note that it contains the foo namespace as part of
the name [2].
When we try to link our compiled symbols into an executable, the linker
will look for a symbol called main. Just main. And it has to actually be
called that; it can't just have the same function prototype. For example,
we can foolishly try to trick the linker by creating a new function
outside of our namespace with the correct prototype:
#include <iostream>

namespace foo
{
    int main(int argc, char **argv)
    {
        std::cout << "Hello, world!" << std::endl;
        return 0;
    }
}

int mane(int argc, char **argv)
{
    std::cout << "Neigh, world!" << std::endl;
    return 0;
}
This will compile, but it won't link. If we look at the symbols, we can see:
U __cxa_atexit
U __dso_handle
U _GLOBAL_OFFSET_TABLE_
0000000000000099 t _GLOBAL__sub_I__ZN3foo4mainEiPPc
0000000000000050 t _Z41__static_initialization_and_destruction_0ii
000000000000003e T _Z4maneiPPc
0000000000000000 T _ZN3foo4mainEiPPc
U _ZNSolsEPFRSoS_E
U _ZNSt8ios_base4InitC1Ev
U _ZNSt8ios_base4InitD1Ev
U _ZSt4cout
U _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
0000000000000000 r _ZStL19piecewise_construct
0000000000000000 b _ZStL8__ioinit
U _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
Well, _Z4maneiPPc is arguably less mangled than our foo::main()
function's symbol, but it's not what ld will be looking for.
But hold on a minute. If we made a globally scoped function with the
correct prototype and it was still mangled, how would any program ever
link? Because a main at global scope is treated specially: the compiler
leaves its name unmangled. If we rename mane to main and inspect the
symbols, we can see we get the symbol we want:
U __cxa_atexit
U __dso_handle
U _GLOBAL_OFFSET_TABLE_
0000000000000099 t _GLOBAL__sub_I__ZN3foo4mainEiPPc
000000000000003e T main
0000000000000050 t _Z41__static_initialization_and_destruction_0ii
0000000000000000 T _ZN3foo4mainEiPPc
U _ZNSolsEPFRSoS_E
U _ZNSt8ios_base4InitC1Ev
U _ZNSt8ios_base4InitD1Ev
U _ZSt4cout
U _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
0000000000000000 r _ZStL19piecewise_construct
0000000000000000 b _ZStL8__ioinit
U _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
There's our main symbol. This will link and run as expected. So how do we
get our foo::main to behave? We can use a linkage specification to
accomplish this. If we prefix the function with extern "C", we give it C
language linkage, which tells the compiler not to mangle the name of the
symbol generated from this function. So the following code:
#include <iostream>

namespace foo
{
    extern "C" int main(int argc, char **argv)
    {
        std::cout << "Hello, world!" << std::endl;
        return 0;
    }
}
generates the following symbols:
U __cxa_atexit
U __dso_handle
U _GLOBAL_OFFSET_TABLE_
0000000000000087 t _GLOBAL__sub_I_main
0000000000000000 T main
000000000000003e t _Z41__static_initialization_and_destruction_0ii
U _ZNSolsEPFRSoS_E
U _ZNSt8ios_base4InitC1Ev
U _ZNSt8ios_base4InitD1Ev
U _ZSt4cout
U _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
0000000000000000 r _ZStL19piecewise_construct
0000000000000000 b _ZStL8__ioinit
U _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
Beautiful.
I have an overly verbose, yet contrived, example on GitHub here. I've
hidden main within a bird's nest of code.
So, why would you do this? I actually found this pattern in production code I was working on. It's such a bizarre thing to do that I felt it must have a legitimate purpose [3].
My first thought was that it could be used as a shortcut to namespace
specification for types and variables. That is, if main() is within
namespace foo, and there exists an int foo::bar, then main() can refer
to it as simply bar. But you could just as easily write main() outside
of the namespace and use using namespace foo to achieve the same thing
(though I personally dislike clipping namespaces like this). Putting
main() inside the namespace is an awful "shortcut" by comparison.
Another idea I had, which is marginally more useful, is that this idiom can
be roughly equivalent to the Python if __name__ == '__main__' idiom, where
modules can be run standalone or imported as a library.

Individual Python modules can be run with, for example,
python my_module.py. Within the interpreter, a built-in, module-scoped
variable called __name__ is set to '__main__' to denote that this is the
"entry point" to the running process. If this same module were imported
into another module via import my_module, then __name__ would be set to
the module's actual name. In this case, it would be 'my_module'.
We can somewhat replicate this with a bit of extra scaffolding in C++. If
we add preprocessor guards around our namespaced main(), we can prevent
it from being compiled until we specifically want it:
// foo.cpp
#include <iostream>

namespace foo
{
#ifndef I_GOT_A_MAIN
    extern "C" int main(int argc, char **argv)
    {
        std::cout << "foo::main()" << std::endl;
        return 0;
    }
#endif
}
If I_GOT_A_MAIN is already defined, then our namespaced main function
will not be compiled. We can define our "actual" main() function in
another file and include the above code.
// bar.cpp
#define I_GOT_A_MAIN
#include <foo.cpp>

int main(int argc, char **argv)
{
    std::cout << "main()" << std::endl;
    return 0;
}
This can be compiled with g++ bar.cpp -o test -I. (I like to use angle
brackets in my #includes, hence the -I), and the resulting program prints
main(). If we remove the #define, the build fails because main is now
defined twice in the same translation unit. If we compile only foo.cpp
with g++ foo.cpp -o foo_test, we get foo::main() printed.
Our foo::main() could be used as a unit test of some sort for everything
defined within the namespaces within that file. Shipping a unit test with
every source file sounds like a reasonable thing to do. We can simply
compile individual source files as our test platform instead of relying on
third-party utilities to do so, or totally rolling our own tests.
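Here is a rough sketch of what such a self-testing source file might look like. The file name shapes.cpp, the area function, and the SHAPES_SELF_TEST macro are all invented for illustration. Note that I have flipped the guard compared to foo.cpp above: the test entry point is compiled only when the macro is defined, so the file can be built into a larger program with no extra defines. Like everything else here, it relies on the compiler accepting an extern "C" main inside a namespace.

```cpp
// shapes.cpp (hypothetical file name)
#include <cassert>
#include <iostream>

namespace shapes
{
    int area(int width, int height)
    {
        return width * height;
    }

#ifdef SHAPES_SELF_TEST
    // Compiled only when requested, for example with:
    //   g++ -DSHAPES_SELF_TEST shapes.cpp -o shapes_test
    extern "C" int main(int, char **)
    {
        // We are inside namespace shapes, so no qualification is needed.
        assert(area(2, 3) == 6);
        assert(area(0, 5) == 0);
        std::cout << "shapes: all checks passed" << std::endl;
        return 0;
    }
#endif
}
```

Making the test opt-in is a design choice on my part: it avoids having to remember a define in every real build, at the cost of needing one for the test build.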
I have another overly complicated yet contrived example on GitHub here.
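But maybe don't use this idiom in production code. It just feels dirty. It is also not strictly legal: since C++17, the standard says a program that declares the name main with C language linkage (in any namespace) is ill-formed, so this only works because compilers such as g++ let it through.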
[1] C doesn't have this issue at all, as it requires all functions with external linkage to have different names, which translate directly to symbols. See the OpenGL C API for example, where functions such as glVertex3f and glVertex3i encode their argument types in their names.
[2] This may not always be the case. It just so happens that my compiler does its naming this way.
[3] In hindsight, I no longer think it was purposefully written with a special design in mind. It is just strangely written code.