Hi,

I have received many complaints from users that gcc's asm intrinsics are hard to use, because they need to specify a command line -mxxx option to use them even though the enclosing function has the right target attribute:

    #include <smmintrin.h>

    int foo(...) __attribute__((__target__("sse4")));
    int foo(...)
    {
      ...
      _mm_testc_si128 (...);
      ...
    }

The reason is that in gcc those intrinsics are defined as wrappers around the gcc builtins, and their definitions are guarded by ISA macros that are only defined via a command line option or a target pragma.

There are workarounds for the problem:

1. Move the function into a separate file and compile that file with the -mxxx option.

2. Use the gcc builtins directly. The drawback is that the code won't be portable, and some data types such as __m128i are also guarded by the macros.

3. Use pragma target:

    #pragma GCC push_options
    #pragma GCC target ("sse4")
    #include <smmintrin.h>
    int foo(...)
    {
      ...
    }
    #pragma GCC pop_options

   Or use the pragma simply to wrap the header inclusion:

   my_smmintrin.h:

    #pragma GCC push_options
    #pragma GCC target ("sse4")
    #include <smmintrin.h>
    #pragma GCC pop_options

   and in the source file:

    #include "my_smmintrin.h"

    int foo(...) __attribute__(...);
    int foo(...)
    {
      ...
    }

None of the above is ideal:

1. There is an ongoing effort to optimize calls to multi-versioned functions (via runtime dispatch) using predicate hoisting and caller cloning. The end result is that target specific clones can be inlined into callers which are themselves cloned. This makes it important to avoid putting the target function into a separate file; otherwise cross-module optimization (CMO) is needed.

2. Using the pragma is ugly, and push_options and pop_options seem broken at this moment (reset_options works fine).

So the question becomes: why not allow the definitions of those wrappers in the first place? If the user includes the header, they intend to use them.

See the attached patch that does this. GCC bootstrapping shows that there is NO visible compile time change at all with this patch.

Comments?

Thanks,

David
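
P.S. For concreteness, here is a minimal sketch of workaround 3 (wrapping only the header inclusion in the pragma) combined with a hand-written runtime dispatch. It assumes an x86 target and a compiler where push_options/pop_options behave as documented. The function names covers, covers_sse41 and covers_generic are made up for illustration, and I spell the target as "sse4.1" here since that is the ISA level _mm_testc_si128 needs.

```c
#include <stddef.h>

/* Enable SSE4.1 only while the intrinsics header is parsed, so the
   wrappers and types such as __m128i get defined without -msse4.1. */
#pragma GCC push_options
#pragma GCC target("sse4.1")
#include <smmintrin.h>
#pragma GCC pop_options

/* Hypothetical SSE4.1 clone: returns 1 if every bit set in value
   is also set in mask, i.e. (~mask & value) == 0 over all 128 bits. */
__attribute__((__target__("sse4.1")))
static int covers_sse41(const int mask[4], const int value[4])
{
    __m128i m = _mm_loadu_si128((const __m128i *)mask);
    __m128i v = _mm_loadu_si128((const __m128i *)value);
    return _mm_testc_si128(m, v);
}

/* Hypothetical generic clone with the same semantics, no ISA requirement. */
static int covers_generic(const int mask[4], const int value[4])
{
    for (size_t i = 0; i < 4; i++)
        if (~mask[i] & value[i])
            return 0;
    return 1;
}

/* Hand-written runtime dispatch: pick the clone the CPU can execute. */
int covers(const int mask[4], const int value[4])
{
    if (__builtin_cpu_supports("sse4.1"))
        return covers_sse41(mask, value);
    return covers_generic(mask, value);
}
```

A caller just uses covers(mask, value); the SSE4.1 path is only taken when the CPU reports support. Note that both clones live in the same file as the dispatcher, which is what the predicate hoisting and caller cloning work would want.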