add optimized aarch64 memcpy and memset