Thanks! I am trying to rewrite the patch to call __builtin_clear_padding,
but gimple_fold_builtin_clear_padding seems to work only before the SSA
passes (see the assertion quoted below). Should I remove the assertion?

On the other hand, since ATOMIC_COMPARE_EXCHANGE only works for simple
register types, excluding vector and complex types, is it enough to
generate the mask with simple bit operations?

Thanks,
xndcn

---
static bool
gimple_fold_builtin_clear_padding (gimple_stmt_iterator *gsi)
{
  ...
  /* This should be folded during the lower pass.  */
  gcc_assert (!gimple_in_ssa_p (cfun) && cfun->cfg == NULL);

Jakub Jelinek wrote on Wed, Jan 3, 2024 at 23:52:

> On Wed, Jan 03, 2024 at 11:42:58PM +0800, xndcn wrote:
> > Hi, I am new to this, and I really need your advice, thanks.
> >
> > I noticed PR71716 and I want to enable the ATOMIC_COMPARE_EXCHANGE
> > internal-fn optimization for floating-point types and for types that
> > contain padding (e.g., long double). Please correct me if I have made
> > any mistakes. Thanks!
> >
> > Firstly, regarding the concern about sNaN float/double values, it
> > seems to work well and should already be covered by
> > testsuite/gcc.dg/atomic/c11-atomic-exec-5.c
> >
> > Secondly, since ATOMIC_COMPARE_EXCHANGE is only enabled when the
> > expected var is addressable only because of the call, the padding
> > bits cannot be modified by any other stmts. So we can save all bits
> > after the ATOMIC_COMPARE_EXCHANGE call and extract the padding bits.
> > After the first iteration, the extracted padding bits can be mixed
> > with the expected var.
> >
> > Bootstrapped/regtested on x86_64-linux.
> >
> > I did some benchmarks, and there is a significant speedup for
> > float/double types, while there is no regression for the long double
> > type.
>
> If anything, this should be using clear_padding_type_may_have_padding_p
> and a call to __builtin_clear_padding.
> Code in that file uses /* ... */ style comments, so please use them
> instead of // for consistency, and furthermore comments should be
> terminated with a dot and two spaces before */
>
> Also, I don't think this is late stage3 material, so it needs to wait
> for GCC 15.
>
> Jakub
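
For readers following the thread, here is a minimal sketch of the kind of
source-level loop the discussion is about: a C11 compare-exchange loop whose
"expected" variable is a float that has its address taken only for the call.
This is an illustration written for this note, not code from the patch or
from PR71716; the function name atomic_add_float is hypothetical. float has
no padding bits, so the padding question raised for long double does not
arise here.

```c
#include <stdatomic.h>

/* Hypothetical helper, for illustration only: atomically add DELTA to
   *OBJ and return the new value.  EXPECTED is addressable only because
   it is passed to the compare-exchange call, which is the situation
   described in the thread.  */
static float
atomic_add_float (_Atomic float *obj, float delta)
{
  float expected = atomic_load (obj);
  float desired = expected + delta;

  /* On failure the call stores the current contents of *OBJ into
     EXPECTED, so DESIRED is recomputed before the next attempt.  */
  while (!atomic_compare_exchange_weak (obj, &expected, desired))
    desired = expected + delta;

  return desired;
}
```

Calling atomic_add_float (&f, 2.5f) on an _Atomic float f that holds 1.0f
leaves 3.5f in f and returns 3.5f. For a type with padding, such as long
double on x86_64, the comparison covers the whole object representation, so
the padding bits of "expected" matter; that is where
clear_padding_type_may_have_padding_p and __builtin_clear_padding come in.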