use own mode for fpcw, fix constants for shift, xmm const assembler