I tried raising the iteration count like this:

    const int num_iterations = 100;

and I get:

    Average time for lrand48: 37 cycles

That is orders of magnitude fewer cycles per call than with a single
iteration. It seems the first call to lrand48 does more work than
subsequent calls.

On Tue, Apr 2, 2024 at 10:16 PM Adhemerval Zanella Netto
<adhemerval.zanella@linaro.org> wrote:

>
>
> On 01/04/24 08:47, abush wang wrote:
> > This is the test:
> > ```
> > uint64_t getnsecs() {
> >     uint32_t lo, hi;
> >     __asm__ __volatile__ (
> >         "rdtsc" : "=a"(lo), "=d"(hi)
> >     );
> >     return ((uint64_t)hi << 32) | lo;
> > }
> >
> > int main() {
> >     const int num_iterations = 1;
>
> This low number of iterations makes the benchmark pretty much useless
> on modern hardware with frequency scaling. By raising it to something
> like 1000000000 I see no variation on my workstation (Ryzen 5900).
>
> >     uint64_t start, end, total_time = 0;
> >
> >     start = getnsecs();
> >     for (int i = 0; i < num_iterations; i++) {
> >         (void) lrand48();
> >     }
> >     end = getnsecs();
> >     total_time += (end - start);
> >
> >     printf("Average time for lrand48: %lu cycles\n", total_time / num_iterations);
> >     return 0;
> > }
> > ```
> > before:
> > Average time for lrand48: 21418 cycles
> >
> > after:
> > Average time for lrand48: 9892 cycles
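
One way to check the first-call hypothesis would be to time the first lrand48() call on its own and then average over many later calls. The following is only a minimal sketch, not the benchmark used in this thread: it swaps rdtsc for clock_gettime(CLOCK_MONOTONIC), so it reports nanoseconds rather than cycles, and the helper names now_ns, time_first_call_ns and time_steady_state_ns are made up for illustration.

```c
#define _XOPEN_SOURCE 700   /* expose lrand48() and clock_gettime() */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Monotonic clock in nanoseconds. Assumption: used here instead of rdtsc
 * so the sketch is not tied to x86; it is coarser than a cycle counter. */
static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Time a single lrand48() call. To capture the first-call cost this must
 * run before any other lrand48() call in the process. */
static uint64_t time_first_call_ns(void) {
    uint64_t start = now_ns();
    (void) lrand48();
    return now_ns() - start;
}

/* Average time per call over `iterations` calls, in nanoseconds.
 * Meant to be called after time_first_call_ns(), so one-time costs
 * are already paid and only the steady state is measured. */
static double time_steady_state_ns(long iterations) {
    uint64_t start = now_ns();
    for (long i = 0; i < iterations; i++) {
        (void) lrand48();
    }
    uint64_t end = now_ns();
    return (double)(end - start) / (double)iterations;
}
```

Calling time_first_call_ns() once at the top of main and then time_steady_state_ns(1000000) would give the two numbers side by side. If the first is far larger than the second, the large figures in the single-iteration runs probably reflect one-time costs rather than the cost of lrand48 itself.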