On 29/06/2022 08:18, Richard Sandiford wrote: >> + break; >> + case AARCH64_RBIT: >> + case AARCH64_RBITL: >> + case AARCH64_RBITLL: >> + if (mode == SImode) >> + icode = CODE_FOR_aarch64_rbitsi; >> + else >> + icode = CODE_FOR_aarch64_rbitdi; >> + break; >> + default: >> + gcc_unreachable (); >> + } >> + expand_insn (icode, 2, ops); >> + return target; > This needs to return ops[0].value instead, since "target" just suggests > a possible location. > > Could you add tests for a memory source and memory destination, e.g.: > > void test_clz_mem (uint32_t *a) > { > *a = __clz (*a); > } > > Without tests like that, these comments probably just sound like a paper > exercise, but they should make a difference for memory sources (previous > review) and memory destinations (this round). I had locally tested it (with rev though because clz doesn't use that code) and strangely it does seem to work for the memory destinations, but that's just a simple test. It could very well go wrong with some more complex codegen, so I'll just take your word and use ops[0].value. And yeah I didn't add the tests at the time, don't really know why, I'll chuck it down to laziness :P > >> diff --git a/gcc/config/aarch64/arm_acle.h b/gcc/config/aarch64/arm_acle.h >> index 9775a48c65825b424d3eb442384f5ab87b734fd7..a044bc74553fcf2a49b71290083f3f072fd5a2ce 100644 >> --- a/gcc/config/aarch64/arm_acle.h >> +++ b/gcc/config/aarch64/arm_acle.h >> @@ -28,6 +28,7 @@ >> #define _GCC_ARM_ACLE_H >> >> #include >> +#include >> >> #pragma GCC aarch64 "arm_acle.h" >> >> @@ -35,6 +36,58 @@ >> extern "C" { >> #endif >> >> +#define _GCC_ARM_ACLE_ROR_FN(NAME, TYPE) \ >> +__extension__ extern __inline TYPE \ >> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) \ >> +NAME (TYPE __value, uint32_t __rotate) \ >> +{ \ >> + size_t __size = sizeof (TYPE) * __CHAR_BIT__; \ >> + __rotate = __rotate % __size; \ >> + return __value >> __rotate | __value << ((__size - __rotate) % __size); \ >> +} >> + >> +_GCC_ARM_ACLE_ROR_FN (__ror, uint32_t) >> +_GCC_ARM_ACLE_ROR_FN (__rorl, unsigned long) >> +_GCC_ARM_ACLE_ROR_FN (__rorll, uint64_t) >> + >> +#undef _GCC_ARM_ACLE_ROR_FN >> + >> +#define _GCC_ARM_ACLE_DATA_FN(NAME, BUILTIN, ITYPE, RTYPE) \ >> +__extension__ extern __inline RTYPE \ >> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) \ >> +__##NAME (ITYPE __value) \ >> +{ \ >> + return __builtin_##BUILTIN (__value); \ >> +} >> + >> +_GCC_ARM_ACLE_DATA_FN (clz, clz, uint32_t, unsigned int) >> +_GCC_ARM_ACLE_DATA_FN (clzl, clzl, unsigned long, unsigned int) >> +_GCC_ARM_ACLE_DATA_FN (clzll, clzll, uint64_t, unsigned int) >> +_GCC_ARM_ACLE_DATA_FN (cls, clrsb, uint32_t, unsigned int) >> +_GCC_ARM_ACLE_DATA_FN (clsl, clrsbl, unsigned long, unsigned int) >> +_GCC_ARM_ACLE_DATA_FN (clsll, clrsbll, uint64_t, unsigned int) >> +_GCC_ARM_ACLE_DATA_FN (rev16, aarch64_rev16, uint32_t, uint32_t) >> +_GCC_ARM_ACLE_DATA_FN (rev16l, aarch64_rev16l, unsigned long, unsigned long) >> +_GCC_ARM_ACLE_DATA_FN (rev16ll, aarch64_rev16ll, uint64_t, uint64_t) >> +_GCC_ARM_ACLE_DATA_FN (rbit, aarch64_rbit, uint32_t, uint32_t) >> +_GCC_ARM_ACLE_DATA_FN (rbitl, aarch64_rbitl, unsigned long, unsigned long) >> +_GCC_ARM_ACLE_DATA_FN (rbitll, aarch64_rbitll, uint64_t, uint64_t) >> +_GCC_ARM_ACLE_DATA_FN (revsh, bswap16, int16_t, uint16_t) > The return type should be int16_t. Nice catch! > The clz and cls tests have the old return types (same as the argument > types), but I guess that's a good thing, since it shows that we avoid > the redundant zero-extend in clzll and clsll. Yeah I noticed that too when I was adding the mem tests, but I did change them though because at the time it just felt like an oversight, though I too was pleasantly surprised GCC was managing to avoid the zero-extending :) I then saw your comment and made me wonder whether I should keep the wrong return types in... I haven't but happy to change them back if you think it's a nice 'test' to have. Regards, Andre