with hash pair caching and tandem tree traversal. It should be much faster when both compounds have many child shapes. Fix iOS compilation, add header.
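
For readers unfamiliar with the two techniques named above, here is a minimal sketch of the general idea, not the code from this change. It assumes each compound keeps a binary AABB tree over its child shapes; `Aabb`, `Node`, `PairCache`, `build`, `tandem` and `collide` are hypothetical names invented for this illustration. Both trees are descended together so that whole subtrees are rejected with a single box test, and per-pair state is kept in a hash map keyed by the two child indices so it survives from one frame to the next.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical minimal 2D types for illustration only.
struct Aabb { float lo[2]; float hi[2]; };

inline Aabb makeBox(float x0, float y0, float x1, float y1) {
    Aabb b; b.lo[0] = x0; b.lo[1] = y0; b.hi[0] = x1; b.hi[1] = y1; return b;
}
inline bool overlaps(const Aabb& a, const Aabb& b) {
    return a.lo[0] <= b.hi[0] && b.lo[0] <= a.hi[0] &&
           a.lo[1] <= b.hi[1] && b.lo[1] <= a.hi[1];
}
inline Aabb merge(const Aabb& a, const Aabb& b) {
    return makeBox(std::min(a.lo[0], b.lo[0]), std::min(a.lo[1], b.lo[1]),
                   std::max(a.hi[0], b.hi[0]), std::max(a.hi[1], b.hi[1]));
}

struct Node {
    Aabb box;
    int child = -1;  // leaf: index of the child shape; internal node: -1
    std::unique_ptr<Node> left, right;
    bool isLeaf() const { return child >= 0; }
};

// Naive median-split build over boxes[begin, end); a real tree would be smarter.
std::unique_ptr<Node> build(const std::vector<Aabb>& boxes, int begin, int end) {
    assert(begin < end);
    auto n = std::make_unique<Node>();
    if (end - begin == 1) { n->box = boxes[begin]; n->child = begin; return n; }
    int mid = begin + (end - begin) / 2;
    n->left = build(boxes, begin, mid);
    n->right = build(boxes, mid, end);
    n->box = merge(n->left->box, n->right->box);
    return n;
}

// Tandem traversal: descend both trees at once, pruning on the first miss.
template <class F>
void tandem(const Node& a, const Node& b, F& onPair) {
    if (!overlaps(a.box, b.box)) return;
    if (a.isLeaf() && b.isLeaf()) { onPair(a.child, b.child); return; }
    if (a.isLeaf()) { tandem(a, *b.left, onPair); tandem(a, *b.right, onPair); return; }
    if (b.isLeaf()) { tandem(*a.left, b, onPair); tandem(*a.right, b, onPair); return; }
    tandem(*a.left, *b.left, onPair);  tandem(*a.left, *b.right, onPair);
    tandem(*a.right, *b.left, onPair); tandem(*a.right, *b.right, onPair);
}

// Per-pair state that persists across frames (a stand-in for cached contact data).
struct PairState { int reuseCount = 0; int lastFrame = -1; };

struct PairCache {
    std::unordered_map<std::uint64_t, PairState> entries;
    static std::uint64_t key(int i, int j) {
        return (std::uint64_t(std::uint32_t(i)) << 32) | std::uint32_t(j);
    }
};

// Returns the number of overlapping child pairs found this frame.
int collide(const Node& a, const Node& b, PairCache& cache, int frame) {
    int found = 0;
    auto onPair = [&](int i, int j) {
        PairState& s = cache.entries[PairCache::key(i, j)];  // reuse or create
        if (s.lastFrame >= 0) ++s.reuseCount;
        s.lastFrame = frame;
        ++found;
    };
    tandem(a, b, onPair);
    // Drop pairs that no longer overlap so the cache does not grow without bound.
    for (auto it = cache.entries.begin(); it != cache.entries.end();) {
        if (it->second.lastFrame != frame) it = cache.entries.erase(it);
        else ++it;
    }
    return found;
}
```

Under these assumptions, calling `collide` once per frame with the same `PairCache` reuses the entry for any child pair that is still overlapping and erases entries for pairs that have separated. The benefit over testing every child of one compound against every child of the other comes from the early return in `tandem`, which is why the gain should be largest when both compounds have many children.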