optimize contended normal mutex case; add int compare-and-swap atomic
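
for illustration only, a minimal sketch of what an int compare-and-swap primitive and a lock fast path built on it might look like. this is not the code from this change: it assumes C11 <stdatomic.h>, and the names cas_int, sketch_trylock and sketch_unlock are hypothetical. the contended (waiting) path is deliberately left out.

```c
#include <stdatomic.h>

enum { UNLOCKED = 0, LOCKED = 1 };

/* Hypothetical helper: if *p equals `expected`, atomically store `desired`.
 * Returns the value that was in *p before the operation either way. */
static inline int cas_int(atomic_int *p, int expected, int desired)
{
    /* On failure the call writes the current value of *p into `expected`;
     * on success `expected` already equals the old value. */
    atomic_compare_exchange_strong(p, &expected, desired);
    return expected;
}

/* Hypothetical fast path: one CAS attempt, no waiting.
 * Returns 1 if the lock was taken, 0 if it was already held. */
static inline int sketch_trylock(atomic_int *lock)
{
    return cas_int(lock, UNLOCKED, LOCKED) == UNLOCKED;
}

/* Hypothetical release: mark the lock free again. */
static inline void sketch_unlock(atomic_int *lock)
{
    atomic_store(lock, UNLOCKED);
}
```

returning the previous value rather than a success flag lets the caller see who (or what state) currently holds the lock without a second load, which is useful when deciding how to handle the contended case.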