optimize mempcpy to minimize need for data saved across the call
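One plausible reading of the subject line, sketched in C. The source shows only the subject, so this is an assumption rather than the commit's actual diff, and the names `mempcpy_naive`, `mempcpy_sketch`, and `mempcpy_sketch_selfcheck` are hypothetical. The idea is that `memcpy` already returns `dest`, so building the result from that return value leaves only `n` to keep live across the call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical baseline: dest is needed after memcpy returns, so both
 * dest and n have to be preserved across the call. */
void *mempcpy_naive(void *dest, const void *src, size_t n)
{
	memcpy(dest, src, n);
	return (char *)dest + n;
}

/* Hypothetical variant: memcpy returns dest, so the result is computed
 * from the return value and only n has to survive the call. */
void *mempcpy_sketch(void *dest, const void *src, size_t n)
{
	return (char *)memcpy(dest, src, n) + n;
}

/* Small self-check: both variants copy the same bytes and return a
 * pointer one past the last byte written. */
int mempcpy_sketch_selfcheck(void)
{
	char a[8] = {0}, b[8] = {0};
	char *pa = mempcpy_naive(a, "abc", 3);
	char *pb = mempcpy_sketch(b, "abc", 3);

	assert(pa == a + 3 && pb == b + 3);
	assert(memcmp(a, b, sizeof a) == 0);
	return 1;
}
```

Whether the second form actually yields fewer saved registers or stack slots depends on the compiler and target ABI, so comparing the generated assembly for both variants is the way to confirm any gain.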