Locks on GetCurrentHostThreadID were causing performance issues
according to Visual Studio's profiler. It was consuming twice the time
as arm_interface.Run(). The cost was not in the function itself but in
the lockinig it required.
Reimplement these functions using atomics and static storage instead of
an unordered_map. This is a side effect to avoid locking and using linked
lists for reads.
Replace unordered_map with a linear search.