add missing memory barrier to pthread_join
[musl] / src / thread / syscall_cp.c