use atomic decrement rather than cas in pthread_exit thread count
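
For illustration, the difference between the two approaches can be sketched with C11 atomics. This is a minimal sketch, not the code from this commit: the counter name threads_minus_1 and the helpers leave_cas and leave_dec are hypothetical, and the sketch assumes C11 <stdatomic.h> in place of the library's internal atomic primitives. It only shows why a single fetch-and-subtract can stand in for a compare-and-swap retry loop when the caller just needs to know whether it was the last thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter: number of live threads minus one. */
static atomic_int threads_minus_1;

/* CAS-loop variant: retry until the decrement lands or the count is
 * already 0. Returns true if the caller was the last thread. */
static bool leave_cas(void)
{
    int n = atomic_load(&threads_minus_1);
    while (n != 0 &&
           !atomic_compare_exchange_weak(&threads_minus_1, &n, n - 1)) {
        /* On failure, n has been reloaded with the current value. */
    }
    return n == 0;
}

/* Decrement variant: one unconditional atomic subtract. The old value
 * says whether the caller was the last thread; if so, the underflow to
 * -1 is undone before reporting it. */
static bool leave_dec(void)
{
    if (atomic_fetch_sub(&threads_minus_1, 1) == 0) {
        atomic_store(&threads_minus_1, 0);
        return true;
    }
    return false;
}
```

Under these assumptions the decrement variant has no retry loop, so it does a bounded amount of work even when many threads exit at once. The brief window in which the counter reads -1 is acceptable in this sketch only because the last thread is assumed to be the sole remaining writer.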