Yes, on x86-64.
I just compared the disassembly before and after commit d275970ab with
objdump. The call to __drand48_iterate spans a longer distance after
d275970ab, so I reverted that commit and found that performance
recovers a little.
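
To help separate a real regression from measurement noise, here is a
minimal sketch of a variant of the quoted test that warms up first and
averages over many calls. A single timed call can include one-time
costs (for example lazy symbol resolution on the first call through the
PLT, or cold caches). The function names, the warm-up and iteration
counts, and the use of clock_gettime instead of rdtsc are assumptions
made for this sketch, not the test that produced the numbers quoted
below.

```c
/* Must come before any #include so that lrand48 and clock_gettime
 * are declared under strict -std= modes as well. */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif

#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Monotonic wall-clock time in nanoseconds (swapped in for rdtsc). */
static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: average cost of one lrand48() call in ns.
 * The warm-up calls are not timed, so one-time costs are excluded. */
double average_ns_per_call(long warmup, long iterations) {
    /* volatile sink keeps the compiler from dropping the calls */
    volatile unsigned long sink = 0;

    for (long i = 0; i < warmup; i++) {
        sink += (unsigned long)lrand48();
    }

    uint64_t start = now_ns();
    for (long i = 0; i < iterations; i++) {
        sink += (unsigned long)lrand48();
    }
    uint64_t end = now_ns();

    (void)sink;
    return (double)(end - start) / (double)iterations;
}
```

Calling average_ns_per_call(1000, 1000000) from main and printing the
result, once per glibc build, gives a per-call figure that is less
sensitive to one-time effects than a single timed call.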

Thanks,
abush


On Mon, Apr 1, 2024 at 9:12 PM Florian Weimer <fweimer@redhat.com> wrote:

> * abush wang:
>
> > This is test:
> > ```
> > uint64_t getnsecs() {
> >     uint32_t lo, hi;
> >     __asm__ __volatile__ (
> >         "rdtsc" : "=a"(lo), "=d"(hi)
> >     );
> >     return ((uint64_t)hi << 32) | lo;
> > }
> >
> > int main() {
> >     const int num_iterations = 1;
> >     uint64_t start, end, total_time = 0;
> >
> >     start = getnsecs();
> >     for (int i = 0; i < num_iterations; i++) {
> >         (void) lrand48();
> >     }
> >     end = getnsecs();
> >     total_time += (end - start);
> >
> >     printf("Average time for lrand48: %lu cycles\n", total_time / num_iterations);
> >     return 0;
> > }
> > ```
> > before:
> > Average time for lrand48: 21418 cycles
> >
> > after:
> > Average time for lrand48: 9892 cycles
>
> Do you see this on x86-64?  So this isn't a displacement range issue?
>
> It could be that this is a random performance change due to code
> alignment, and not actually caused by the direct call distance.
>
> Thanks,
> Florian
>
>