On Thu, Jul 27, 2023 at 9:24 AM Florian Weimer <fweimer@redhat.com> wrote:

> * Sunil Pandey:
>
> > Ffsll is one of the benchmark tests in the phoronix test suite, not
> > sure how much it matters to the application. Lots of people involved
> > in phoronix benchmark testing/tracking and this kind of random perf
> > behavior wastes their time.
>
> That's a good point.  I've seen similar reports before (sadly I don't
> recall if they were specifically about ffsll).
>
> Regarding the mechanics of fixing it, if the instruction ordering and
> sizing is so sensitive, should this be an assembler implementation
> instead?


Instruction ordering and sizing pretty much always matter.  Many
instructions can't be used in low-latency code because their encodings
are too large.  Unfortunately, assemblers can't do much in this case.


>   And will the fix even work for distributions that build with
> --enable-cet, considering that there's going to be an additional 4-byte
> NOP at the start of the function?
>

The extra 4 bytes from --enable-cet can affect the performance of many
small, highly optimized routines and can potentially create perf variation
depending on how the function gets loaded into memory.  But again, this is
one of the costs of using CET.
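
To make the size sensitivity concrete, here is a minimal sketch with
made-up numbers; it is not measured from the actual ffsll code.  The
helper name blocks_spanned and the 30-byte routine size are assumptions
for illustration only.  It counts how many aligned blocks a routine
touches, before and after a 4-byte endbr64 is prepended:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of `block`-byte aligned blocks touched by
   the byte range [start, start + size).  Requires size > 0, block > 0.  */
size_t
blocks_spanned (uintptr_t start, size_t size, size_t block)
{
  uintptr_t first = start / block;              /* block holding first byte */
  uintptr_t last = (start + size - 1) / block;  /* block holding last byte */
  return (size_t) (last - first + 1);
}
```

With these assumed numbers, a 30-byte routine at a 16-byte-aligned address
touches two 16-byte blocks, while the same routine grown to 34 bytes
touches three.  Placed at offset 32 within a 64-byte cache line, the
30-byte version stays in one line and the 34-byte version spills into a
second one.  Whether that costs anything depends on the CPU and on where
the function actually lands.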


> Thanks,
> Florian
>
>