Context
Here in the backend teams at Intraway we exclusively use the GCC compiler for our C/C++ projects, targeting mostly RHEL 5 and 6 distributions. For that, we use the default version of GCC included in the Red Hat repositories, that is, GCC 4.1 and 4.4 respectively. During development, life is easier: anyone can choose the Linux distribution and the compiler they want for everyday work, as long as everything ends up compiling and running correctly on the targeted systems.
My current compiler/OS combination of choice is Clang-3.5 under Kubuntu 14.04 (64 bit), and I use Docker containers of centos4 and centos5 (using the official builds available at https://registry.hub.docker.com/_/centos/).
Of course, the requirement to support old compilers forces developers to use an old standard of the C++ language, C++03, which is 12(!) years old, instead of the newer (and more enjoyable) C++11 or C++14 standards. Thankfully, the nice people who make the Boost library make being stuck with old compilers hurt a bit less.
Problem
Last week, however, I came across a bug that was reproducible on a pre-production RHEL6 server scanning thousands of real Network Elements, but not in our local testing environment. The bug manifested as a crash due to a SEGFAULT. The logs gave no sign of what was going wrong, and a core dump loaded in GDB made no sense, which suggested that something nasty was being done in the newly added code. Usually, Valgrind would be the obvious choice, but the bug was not reproducible under it either, and it made execution unbearably slow.
It was time to try a tool that I had neither needed nor used before: Clang’s Address Sanitizer. The only showstopper was that Clang (or gcc-4.8, which also features the Address Sanitizer) was not available in my build environment.
Why Clang?
Basically, because:
- It compiles faster and uses less memory than GCC, even though it produces larger executables that may run a bit slower.
- The executables it produces do not depend on a different version of libstdc++ than the one officially available for RHEL.
Building Clang 3.4
It turns out that version 3.4.2 is the last version that compiles with the gcc4.4 bundled with RHEL6, but that is enough to compile C++14 source code and to use Address Sanitizer. Building it is quite easy. You need to download three source packages:
- Clang source code
- LLVM source code
- Compiler RT source code (only needed to include the sanitizer tools like Address Sanitizer when compiling; it isn’t necessary for release builds)
mkdir llvm_tools
cd llvm_tools
wget http://llvm.org/releases/3.4.2/llvm-3.4.2.src.tar.gz
wget http://llvm.org/releases/3.4.2/cfe-3.4.2.src.tar.gz
wget http://llvm.org/releases/3.4/compiler-rt-3.4.src.tar.gz
tar xfz llvm-3.4.2.src.tar.gz
tar xfz cfe-3.4.2.src.tar.gz
mv cfe-3.4.2.src llvm-3.4.2.src/tools/clang
tar xfz compiler-rt-3.4.src.tar.gz
mv compiler-rt-3.4.src llvm-3.4.2.src/projects/compiler-rt
mkdir build
cd build
../llvm-3.4.2.src/configure --enable-optimized
make -j 4
The Clang executable will be left under the build/Release+Asserts/bin folder. Now it is just a matter of adding that folder to the PATH environment variable and setting CXX and CC to the clang++ and clang executables respectively.
Compiling the binary
You just need to add the following to the CXXFLAGS environment variable:
export CXXFLAGS+=" -O1 -g -fsanitize=address -fno-omit-frame-pointer "
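A quick way to confirm that the flags above actually reached the compiler is to ask the preprocessor. The following is only a sketch: the helper names (BUILT_WITH_ASAN, built_with_asan, asan_build_banner) are made up for illustration, while __has_feature(address_sanitizer) is the check Clang documents and __SANITIZE_ADDRESS__ is the macro GCC defines when -fsanitize=address is in effect.

```cpp
#include <string>

// Detect at compile time whether -fsanitize=address is in effect.
// GCC (4.8 and later) defines __SANITIZE_ADDRESS__; Clang exposes the
// same information through __has_feature(address_sanitizer).
#if defined(__SANITIZE_ADDRESS__)
#  define BUILT_WITH_ASAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define BUILT_WITH_ASAN 1
#  endif
#endif
#ifndef BUILT_WITH_ASAN
#  define BUILT_WITH_ASAN 0
#endif

// Hypothetical helper: true when this translation unit was
// compiled with Address Sanitizer instrumentation.
inline bool built_with_asan() {
    return BUILT_WITH_ASAN != 0;
}

// Hypothetical helper: a line that could be logged once at startup.
inline std::string asan_build_banner() {
    return built_with_asan() ? "AddressSanitizer: enabled"
                             : "AddressSanitizer: disabled";
}
```

Logging the banner once at startup makes it obvious on the remote server whether the binary being run is the instrumented one or a regular build.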
Running the binary in the remote server
You need to copy the llvm-symbolizer executable to the remote server and set its path in the ASAN_SYMBOLIZER_PATH environment variable. If you don’t, the stack trace will not show function names.
You might also play with the ASAN_OPTIONS, to customize Address Sanitizer behavior (read https://code.google.com/p/address-sanitizer/wiki/Flags for options documentation).
export ASAN_SYMBOLIZER_PATH=/tmp/llvm-symbolizer
export ASAN_OPTIONS='detect_stack_use_after_return=1:check_initialization_order=1:symbolize=1:full_address_space=1'
So, did it find anything?
Indeed it did. The binary crashed as expected, right after printing the following message:
==10022==ERROR: AddressSanitizer: heap-use-after-free on address 0x629000146728 at pc 0xe43882 bp 0x7f136360b620 sp 0x7f136360b618
READ of size 8 at 0x629000146728 thread T183
#0 0xe43881 in boost::unordered::detail::table_impl<boost::unordered::detail::map<std::allocator<std::pair<std::pair<unsigned int, std::string> const, boost::shared_ptr<SNMP_Cache::Database_Cache_Interface> > >, std::pair<unsigned int, std::string>, boost::shared_ptr<SNMP_Cache::Database_Cache_Interface>, boost::hash<std::pair<unsigned int, std::string> >, std::equal_to<std::pair<unsigned int, std::string> > > >::place_in_bucket(boost::unordered::detail::table<boost::unordered::detail::map<std::allocator<std::pair<std::pair<unsigned int, std::string> const, boost::shared_ptr<SNMP_Cache::Database_Cache_Interface> > >, std::pair<unsigned int, std::string>, boost::shared_ptr<SNMP_Cache::Database_Cache_Interface>, boost::hash<std::pair<unsigned int, std::string> >, std::equal_to<std::pair<unsigned int, std::string> > > >&, boost::unordered::detail::ptr_bucket*) /opt/iway/deps/include/boost/unordered/detail/unique.hpp:608
#1 0xe43881 in boost::unordered::detail::table_impl<boost::unordered::detail::map<std::allocator<std::pair<std::pair<unsigned int, std::string> const, boost::shared_ptr<SNMP_Cache::Database_Cache_Interface> > >, std::pair<unsigned int, std::string>, boost::shared_ptr<SNMP_Cache::Database_Cache_Interface>, boost::hash<std::pair<unsigned int, std::string> >, std::equal_to<std::pair<unsigned int, std::string> > > >::rehash_impl(unsigned long) /opt/iway/deps/include/boost/unordered/detail/unique.hpp:598
#2 0xe4288e in boost::unordered::detail::table_impl<boost::unordered::detail::map<std::allocator<std::pair<std::pair<unsigned int, std::string> const, boost::shared_ptr<SNMP_Cache::Database_Cache_Interface> > >, std::pair<unsigned int, std::string>, boost::shared_ptr<SNMP_Cache::Database_Cache_Interface>, boost::hash<std::pair<unsigned int, std::string> >, std::equal_to<std::pair<unsigned int, std::string> > > >::operator[](std::pair<unsigned int, std::string> const&) /opt/iway/deps/include/boost/unordered/detail/unique.hpp:352
#3 0xe3e134 in boost::shared_ptr<SNMP_Cache::Typed_Database_Cache_Interface<boost::unordered::unordered_map<std::string, ALU::Qos_Policer_Profile_Data, boost::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, ALU::Qos_Policer_Profile_Data> > > > > SNMP_Cache::Typed_Database_Cache_Factory::build_typed_database_cache<boost::unordered::unordered_map<std::string, ALU::Qos_Policer_Profile_Data, boost::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, ALU::Qos_Policer_Profile_Data> > > >(boost::shared_ptr<SNMP_Cache::SNMP_Getter_Functor_Interface<boost::unordered::unordered_map<std::string, ALU::Qos_Policer_Profile_Data, boost::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, ALU::Qos_Policer_Profile_Data> > > > >, std::string const&, unsigned int, unsigned int, unsigned int) /home/intraway/workspace/dslam_scanner-cache/src/caches/Typed_Database_Cache_Factory.h:576
#4 0xe3d513 in SNMP_Cache::Cache_Builder<QoS_Profile_Scanner_Context>, ALU_Qos_Policer_Profile_Functor, boost::unordered::unordered_map<std::string, ALU::Qos_Policer_Profile_Data, boost::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, ALU::Qos_Policer_Profile_Data> > > >::build(QoS_Profile_Scanner_Context&, boost::shared_ptr<SNMP_Cache::Typed_Database_Cache_Factory_Interface>, unsigned int, unsigned int, unsigned long, bool) /home/intraway/workspace/dslam_scanner-cache/src/caches/Cache_Aux_Functions.h:66
#5 0xe37a7c in SNMP_Cache::Cache_Builder_Adv<QoS_Profile_Scanner_Context, ALU_Qos_Policer_Profile_Functor>::build(QoS_Profile_Scanner_Context&, boost::shared_ptr<SNMP_Cache::Typed_Database_Cache_Factory_Interface>, unsigned int, unsigned int, unsigned long) /home/intraway/workspace/dslam_scanner-cache/src/caches/Cache_Aux_Functions.h:102
#6 0xe2510b in Dslam_Cache_Factory::build_alu_policer_profile_cache(QoS_Profile_Scanner_Context&) const /home/intraway/workspace/dslam_scanner-cache/src/caches/Dslam_Cache_Factory.cpp:928
#7 0xcd66ef in Get_Policer_Profile_Table::execute(QoS_Profile_Scanner_Context&) const /opt/iway/deps/include/boost/unordered/detail/table.hpp:348:27
#8 0xf96ca3 in ALU_QoS_Profile_Scanner::do_snmp_queries(bool&) /home/intraway/workspace/dslam_scanner-cache/src/scanners/ALU_QoS_Profile_Scanner.cpp:65
#9 0xb2b4df in Base_Scanner<QoS_Profile_Scanner_Context>::do_run() /home/intraway/workspace/dslam_scanner-cache/src/scanners/Base_Scanner.h:47
#10 0xb3e1ad in Runnable::run() /home/intraway/workspace/dslam_scanner-cache/src/Runnable.cpp:25
#11 0xbd423a in Scanner_Executor::svc() /home/intraway/workspace/dslam_scanner-cache/src/Scanner_Executor.cpp:19
#12 0x7f139e72e386 in ACE_Task_Base::svc_run(void*) /root/builder/redhat/BUILD/ACE_wrappers/ace/Task.cpp:275
#13 0x7f139e72f950 in ACE_Thread_Adapter::invoke() /root/builder/redhat/BUILD/ACE_wrappers/ace/Thread_Adapter.cpp:98
#14 0xa08823 in __asan::AsanThread::ThreadStart(unsigned long) (/opt/iway/dslam_scanner-3.8.0.1-test/bin/scanner-clang-debug+0xa08823)
#15 0x390d007850 in start_thread (/lib64/libpthread.so.0+0x390d007850)
#16 0x3fe7ce767c in clone (/lib64/libc.so.6+0x3fe7ce767c)
This pointed at a problematic call to operator[] on a boost::unordered_map at Typed_Database_Cache_Factory.h:576. A quick search through the source code showed that this shared structure was being accessed concurrently without locking its mutex. Problem solved.
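To illustrate the kind of fix involved, here is a minimal sketch, not the actual Intraway code. As the trace shows, operator[] may insert and trigger a rehash, so even a call that looks like a read can free buckets another thread is still using. The names Cache_Entry and Cache_Registry are hypothetical, and std::map, std::mutex and std::shared_ptr (C++11) are swapped in for the Boost types only to keep the example self-contained; on a C++03 code base, boost::mutex and boost::lock_guard would presumably play the same role.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the cached object type.
struct Cache_Entry {
    explicit Cache_Entry(const std::string& n) : name(n) {}
    std::string name;
};

// Hypothetical registry shared by several scanner threads.
// Every access to the map goes through the same mutex.
class Cache_Registry {
public:
    typedef std::pair<unsigned int, std::string> Key;

    // Returns the entry for `key`, creating it on first use.
    std::shared_ptr<Cache_Entry> get_or_create(const Key& key) {
        // operator[] may insert (and, in a hash map, rehash),
        // so it must run under the lock like any other write.
        std::lock_guard<std::mutex> lock(mutex_);
        std::shared_ptr<Cache_Entry>& slot = caches_[key];
        if (!slot) {
            slot = std::make_shared<Cache_Entry>(key.second);
        }
        return slot;  // the copy is made while the lock is still held
    }

    // Number of entries currently stored.
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return caches_.size();
    }

private:
    mutable std::mutex mutex_;
    std::map<Key, std::shared_ptr<Cache_Entry> > caches_;
};
```

The important design point is that callers never see a reference into the container: they get a shared_ptr copy made under the lock, so a later insertion by another thread cannot invalidate what they hold.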
What about RHEL5?
In short, yes, you can build clang 3.4, but without support for the Sanitizer tools.
Compiling a working Clang turned out to be impossible with the default GCC 4.1, but this can be worked around by installing GCC 4.4 from the official repositories through the gcc44-c++ package (and, of course, its dependencies: binutils220, gcc44, libstdc++44-devel).
Python 2.4 is also too old to build Clang 3.4. There are two solutions for this:
- Install a newer version (which I did, since we already had a custom Python 2.6 RPM)
- Install Clang 3.2, which does not have this requirement.
But even then, the compiler-rt package will fail to compile with the message:
fatal error: ‘linux/perf_event.h’ file not found
And here we are out of luck, since RHEL5 does not ship with a kernel that is new enough to compile Compiler-RT (I even tried Clang 3.1, which was the first release to include Address Sanitizer, and got the same result).
If you simply leave the compiler-rt directory out of the LLVM source tree, Clang will build and link just fine.