use atomic decrement rather than cas in pthread_exit thread count
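
for illustration, a minimal sketch of the two approaches using C11 <stdatomic.h> rather than the project's own atomic primitives. the counter and both function names are hypothetical and not taken from the source tree; the sketch only shows why a plain decrement can stand in for a cas retry loop when the only operation needed is "subtract one and learn whether this was the last thread".

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the library's live-thread counter. */
static atomic_int thread_count = 1;

/* CAS variant: retry until the decremented value is stored.
 * Returns true if the caller was the last remaining thread. */
static bool exit_count_cas(void)
{
    int old = atomic_load(&thread_count);
    while (!atomic_compare_exchange_weak(&thread_count, &old, old - 1)) {
        /* On failure, old has been reloaded with the current value. */
    }
    return old == 1;
}

/* Decrement variant: a single read-modify-write, no retry loop.
 * atomic_fetch_sub returns the value held before the subtraction. */
static bool exit_count_dec(void)
{
    return atomic_fetch_sub(&thread_count, 1) == 1;
}
```

both variants return the same answer for the same starting count; the decrement form just avoids the load and the retry path.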