public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: "Lucas A. M. Magalhaes" <lamm@linux.ibm.com>, libc-alpha@sourceware.org
Subject: Re: [PATCH v2] Fix time/tst-cpuclock1 intermitent failures
Date: Wed, 19 Feb 2020 18:51:00 -0000	[thread overview]
Message-ID: <37861457-ff4d-076c-9d9a-270266108aef@linaro.org> (raw)
In-Reply-To: <158213053262.30076.4354318606424637635@dhcp-9-18-235-152.br.ibm.com>



On 19/02/2020 13:42, Lucas A. M. Magalhaes wrote:
> Hi,
> Thanks for the review :).
> 
> Quoting Adhemerval Zanella (2020-02-18 09:44:08)
>>
>>
>> On 06/02/2020 11:48, Lucas A. M. Magalhaes wrote:
>>> This test fails intermittently in systems with heavy load as
>>> CLOCK_PROCESS_CPUTIME_ID is subject to scheduler pressure.  Thus the
>>> test boundaries where relaxed to keep it from fail on this systems.
>>>
>>> A refactor of the spent time checking was made with some support
>>> functions. With the advantage to represent time jitter in percent
>>> of the target.
>>>
>>> The values used by the test boundaries are all empirical.
>>>
>>> ---
>>>
>>> Hi,
>>>
>>> Please Carlos see if this is what you asked.
>>>
>>> I spent sometime gathering the spent times to find the values
>>> for the boundaries. From this I selected values that will fail less
>>> than 1% of the time, in the tested machines.
>>>
>>> Also as I found the spent time deviation completely asymmetrical
>>> I choose to separate the values in upper and lower bounds.
>>>
>>> changes from V2:
>>>       - Add support functions
>>
>> None of the files follow the code and style guideline [1].
>>
>> [1] https://sourceware.org/glibc/wiki/Style_and_Conventions
>>
> 
> OK, Sorry about that.
> 
>>>
>>>  support/Makefile     |  1 +
>>>  support/cpuclock.c   | 51 ++++++++++++++++++++++++++++++++++++++++++++
>>>  support/cpuclock.h   | 30 ++++++++++++++++++++++++++
>>>  time/tst-cpuclock1.c | 43 ++++++++++---------------------------
>>>  4 files changed, 93 insertions(+), 32 deletions(-)
>>>  create mode 100644 support/cpuclock.c
>>>  create mode 100644 support/cpuclock.h
>>>
>>> diff --git a/support/Makefile b/support/Makefile
>>> index 3325feb790..b308fe0856 100644
>>> --- a/support/Makefile
>>> +++ b/support/Makefile
>>> @@ -31,6 +31,7 @@ libsupport-routines = \
>>>    check_dns_packet \
>>>    check_hostent \
>>>    check_netent \
>>> +  cpuclock \
>>>    delayed_exit \
>>>    ignore_stderr \
>>>    next_to_fault \
>>> diff --git a/support/cpuclock.c b/support/cpuclock.c
>>> new file mode 100644
>>> index 0000000000..f40c7e8d4a
>>> --- /dev/null
>>> +++ b/support/cpuclock.c
>>> @@ -0,0 +1,51 @@
>>> +/* Support functions for cpuclock tests.
>>> +   Copyright (C) 2018-2020 Free Software Foundation, Inc.
>>> +   This file is part of the GNU C Library.
>>> +
>>> +   The GNU C Library is free software; you can redistribute it and/or
>>> +   modify it under the terms of the GNU Lesser General Public
>>> +   License as published by the Free Software Foundation; either
>>> +   version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +   Lesser General Public License for more details.
>>> +
>>> +   You should have received a copy of the GNU Lesser General Public
>>> +   License along with the GNU C Library; if not, see
>>> +   <https://www.gnu.org/licenses/>.  */
>>> +
>>> +#include "cpuclock.h"
>>> +#include <stdlib.h>
>>> +
>>> +#define T_1s 1000000000.0
>>> +
>>> +struct timespec time_normalize(struct timespec t) {
>>> +     int diff;
>>> +     diff = (t.tv_nsec / T_1s);
>>> +     t.tv_sec += diff;
>>> +     t.tv_nsec += -(diff * T_1s);
>>> +     return t;
>>> +}
>>> +
>>> +struct timespec time_add(struct timespec a, struct timespec b) {
>>> +     struct timespec s = {.tv_sec = a.tv_sec + b.tv_sec,
>>> +                          .tv_nsec = a.tv_nsec + b.tv_nsec};
>>> +     return time_normalize(s);
>>> +}
>>> +
>>> +struct timespec time_sub(struct timespec a, struct timespec b) {
>>> +     struct timespec d = {.tv_sec = a.tv_sec - b.tv_sec,
>>> +                          .tv_nsec = a.tv_nsec - b.tv_nsec};
>>> +     return time_normalize(d);
>>> +}
>>
>> I don't using float operation is the correct solution to handle both
>> invalid inputs and overflow results. We already have a working solution
>> (support/timespec-add.c and support/timespec-sub.c), why do you need
>> to add such functions?
>>
> 
> Well I actually didn't notice them, as I was suggested to implement it.
> Thanks for pointing it out.  I will remove them and use the existing
> solution.
> 
>>> +
>>> +int percent_diff_check(struct timespec base, struct timespec dev,
>>> +                    float upper_bound, float lower_bound) {
>>> +     double timea = base.tv_sec + base.tv_nsec / T_1s;
>>> +     double timeb = dev.tv_sec + dev.tv_nsec / T_1s;
>>> +     double diff = (timea - timeb) / (0.5 * (timea + timeb));
>>> +     return (diff >= -upper_bound && diff <= lower_bound );
>>> +}
>>> +
>>
>> Please add some description of what this function do.  And I think 
>> it would be better to normalize to nanoseconds instead of seconds
>> to avoid floating-point cancellation due the range difference.
> 
> I'm using double here because this "(timea + timeb)" was sometimes
> overflowing.
> 
> IMHO this will not be used to check, or compute, really small jitter,
> so floating-point cancellation should not be a problem.  But if you see
> that it's worth I could to change to long and use:
> 
> double diff = (timea - timeb) / max(timea, timeb);
> 
> instead of
> 
> double diff = (timea - timeb) / (0.5 * (timea + timeb));
> 
>> Something like:
>>
>>   /* Return true if the DIFF time is within the ratio 
> 
> IMHO it makes more sense to think in it as a relative difference rather
> than in ration.  As "diff time is x% bigger than base" instead of "diff
> time is x% of the base".  So I want to keep this approach.

But the idea of test refactoring is to check if value is in a bounded
range of another value, i.e, to check if the CPU time spent in the
created process is within an expected value.

I personally think is quite confusing the function call:

   percent_diff_check(sleeptime, diff, .6, .34)

Which is simplified to:

  -0.6 <= (sleeptime - diff)/(0.5*(sleeptime + diff) <= 0.34

And the result will be negative iff diff is larger than sleeptime. And
this might happen iff the calling process is using more than one CPU
(so total number of CPU time will be higher than total sleep time
which assumes only one CPU).

Also, for such scenarios it really awkward to define the expected
threshold. For instance, assuming child process has two threads chewing 
up CPU and the parent process sleeping for 0.5s.  The result CPU
for child would be close to 1s, and with your formula the resulting
diff will be ~ -0.66.  This is confusing to set the threshold on
such scenario.

With my suggestion, for diff time larger than base one you set a
lower positive threshold; while for diff time lower than base you
set a large higher positive threshold.

> 
>>      [upper_bound, lower_bound] of DIFF time, or false otherwise.
>>
>>      For instance:
>>
>>      struct timespec diff = { 3, 94956 };
>>      struct timespec base = { 4, 0 };
>>
>>      The call checks if the ratio of diff/base is within the
>>      bounds of (0.65, 1.0) (i.e, if 'diff' is at least 65% of
>>      the 'base' value).
>>
>>      support_timespec_check_in_range (base, diff, 1.0, 0.65); */
>>   bool support_timespec_check_ratio (struct timespec base,
>>                                      struct timespec diff,
>>                                      double upper_bound,
>>                                      double lower_bound)
>>   {
>>     assert (upper_bound >= lower_bound);
>>     uint64_t base_norm = base.tv_sec * TIMESPEC_HZ + base.tv_nsec;
>>     uint64_t diff_norm = diff.tv_sec * TIMESPEC_HZ + diff.tv_nsec;
>>     double ratio = (double) base_norm / (double) diff_norm;
>>     return ratio >= lower_bound && ratio <= upper_bound;
>>   }

In fact I think base_norm and diff_norm should be calculated as double
here.

>>    
>>
>> I think it should follow the libsupport name convention of prepending
>> 'support_' on file name. 
>>
>> The same name convention should be used on the exported interfaces
>> (support_*).
>>
>> And I see that using time_* is confusing because the underlying type
>> is a 'timespec'. On 'include/time.h' the type is used on function
>> name, I think we should do the same here.
>>
>>
> 
> Ok.
> 
>>> diff --git a/support/cpuclock.h b/support/cpuclock.h
>>> new file mode 100644
>>> index 0000000000..2505fbdcff
>>> --- /dev/null
>>> +++ b/support/cpuclock.h
>>> @@ -0,0 +1,30 @@
>>> +/* Support functions for cpuclock tests.
>>> +   Copyright (C) 2018-2020 Free Software Foundation, Inc.
>>
>> This implementation is not based on old tests, I think the copyright
>> years should be only 2020.
>>
>>> +   This file is part of the GNU C Library.
>>> +
>>> +   The GNU C Library is free software; you can redistribute it and/or
>>> +   modify it under the terms of the GNU Lesser General Public
>>> +   License as published by the Free Software Foundation; either
>>> +   version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +   Lesser General Public License for more details.
>>> +
>>> +   You should have received a copy of the GNU Lesser General Public
>>> +   License along with the GNU C Library; if not, see
>>> +   <https://www.gnu.org/licenses/>.  */
>>> +
>>> +#ifndef SUPPORT_CPUCLOCK_H
>>> +#define SUPPORT_CPUCLOCK_H
>>> +
>>> +#include <time.h>
>>> +
>>> +struct timespec time_normalize(struct timespec t);
>>> +struct timespec time_add(struct timespec a, struct timespec b);
>>> +struct timespec time_sub(struct timespec a, struct timespec b);
>>> +int percent_diff_check(struct timespec base, struct timespec dev,
>>> +                    float upper_bound, float lower_bound);
>>> +
>>> +#endif /* SUPPORT_CPUCLOCK_H */
>>
>> As before.
>>
>>> diff --git a/time/tst-cpuclock1.c b/time/tst-cpuclock1.c
>>> index 0120906f23..4051de1646 100644
>>> --- a/time/tst-cpuclock1.c
>>> +++ b/time/tst-cpuclock1.c
>>> @@ -26,6 +26,7 @@
>>>  #include <signal.h>
>>>  #include <stdint.h>
>>>  #include <sys/wait.h>
>>> +#include <support/cpuclock.h>
>>>  
>>>  /* This function is intended to rack up both user and system time.  */
>>>  static void
>>> @@ -155,19 +156,11 @@ do_test (void)
>>>    printf ("live PID %d after sleep => %ju.%.9ju\n",
>>>         child, (uintmax_t) after.tv_sec, (uintmax_t) after.tv_nsec);
>>>  
>>> -  struct timespec diff = { .tv_sec = after.tv_sec - before.tv_sec,
>>> -                        .tv_nsec = after.tv_nsec - before.tv_nsec };
>>> -  if (diff.tv_nsec < 0)
>>> -    {
>>> -      --diff.tv_sec;
>>> -      diff.tv_nsec += 1000000000;
>>> -    }
>>> -  if (diff.tv_sec != 0
>>> -      || diff.tv_nsec > 600000000
>>> -      || diff.tv_nsec < 100000000)
>>> +  struct timespec diff = time_sub(after, before);
>>> +  if (!percent_diff_check(sleeptime, diff, .3, 2))
>>>      {
>>>        printf ("before - after %ju.%.9ju outside reasonable range\n",
>>> -           (uintmax_t) diff.tv_sec, (uintmax_t) diff.tv_nsec);
>>> +         (uintmax_t) diff.tv_sec, (uintmax_t) diff.tv_nsec);
>>>        result = 1;
>>>      }
>>>  
>>> @@ -194,19 +187,11 @@ do_test (void)
>>>       }
>>>        else
>>>       {
>>> -       struct timespec d = { .tv_sec = afterns.tv_sec - after.tv_sec,
>>> -                             .tv_nsec = afterns.tv_nsec - after.tv_nsec };
>>> -       if (d.tv_nsec < 0)
>>> -         {
>>> -           --d.tv_sec;
>>> -           d.tv_nsec += 1000000000;
>>> -         }
>>> -       if (d.tv_sec > 0
>>> -           || d.tv_nsec < sleeptime.tv_nsec
>>> -           || d.tv_nsec > sleeptime.tv_nsec * 2)
>>> +       diff = time_sub(afterns, after);
>>> +       if (!percent_diff_check(sleeptime, diff, .6, .34))
>>>           {
>>>             printf ("nanosleep time %ju.%.9ju outside reasonable range\n",
>>> -                   (uintmax_t) d.tv_sec, (uintmax_t) d.tv_nsec);
>>> +                  (uintmax_t) diff.tv_sec, (uintmax_t) diff.tv_nsec);
>>>             result = 1;
>>>           }
>>>       }
>>> @@ -241,20 +226,14 @@ do_test (void)
>>>    printf ("dead PID %d => %ju.%.9ju\n",
>>>         child, (uintmax_t) dead.tv_sec, (uintmax_t) dead.tv_nsec);
>>>  
>>> -  diff.tv_sec = dead.tv_sec - after.tv_sec;
>>> -  diff.tv_nsec = dead.tv_nsec - after.tv_nsec;
>>> -  if (diff.tv_nsec < 0)
>>> -    {
>>> -      --diff.tv_sec;
>>> -      diff.tv_nsec += 1000000000;
>>> -    }
>>> -  if (diff.tv_sec != 0 || diff.tv_nsec > 200000000)
>>> +  diff = time_sub(dead, after);
>>> +  sleeptime.tv_nsec = 100000000;
>>> +  if (!percent_diff_check(sleeptime, diff, .6, .36))
>>>      {
>>>        printf ("dead - after %ju.%.9ju outside reasonable range\n",
>>> -           (uintmax_t) diff.tv_sec, (uintmax_t) diff.tv_nsec);
>>> +          (uintmax_t) diff.tv_sec, (uintmax_t) diff.tv_nsec);
>>>        result = 1;
>>>      }
>>> -
>>>    /* Now reap the child and verify that its clock is no longer valid.  */
>>>    {
>>>      int x;
>>>

  reply	other threads:[~2020-02-19 18:51 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-06 14:48 Lucas A. M. Magalhaes
2020-02-17 16:44 ` Lucas A. M. Magalhaes
2020-02-18 12:44 ` Adhemerval Zanella
2020-02-19 16:42   ` Lucas A. M. Magalhaes
2020-02-19 18:51     ` Adhemerval Zanella [this message]
2020-02-20 18:17 ` [PATCH v3] " Lucas A. M. Magalhaes
2020-03-04 19:24   ` Matheus Castanho
2020-03-06 17:31     ` Lucas A. M. Magalhaes
2020-03-10 16:20   ` [PATCH v4] " Lucas A. M. Magalhaes
2020-03-10 16:30     ` Andreas Schwab
2020-03-10 17:45     ` Carlos O'Donell
2020-03-23 17:20     ` [PATCH v5] " Lucas A. M. Magalhaes
2020-03-23 21:06       ` Carlos O'Donell
2020-03-24 19:42         ` Lucas A. M. Magalhaes
2020-03-31 18:55           ` Carlos O'Donell
2020-03-31 11:34       ` [PATCH v6] " Lucas A. M. Magalhaes
2020-03-31 19:02         ` Carlos O'Donell
2020-04-03 19:24         ` [PATCH v7] " Lucas A. M. Magalhaes
2020-04-03 20:48           ` Carlos O'Donell
2020-04-07 13:59           ` [PATCH v8] " Lucas A. M. Magalhaes
2020-04-16 17:30             ` Lucas A. M. Magalhaes
2020-04-16 19:21             ` Carlos O'Donell
2020-04-21 17:44             ` [PATCH v9] " Lucas A. M. Magalhaes
2020-05-11 17:41               ` Lucas A. M. Magalhaes
2020-05-25 11:46                 ` Lucas A. M. Magalhaes
2020-06-08 13:58                   ` Lucas A. M. Magalhaes
2020-06-08 16:52               ` Carlos O'Donell
2020-06-12 15:28               ` [PATCH v10] " Lucas A. M. Magalhaes
2020-06-25 17:26                 ` Lucas A. M. Magalhaes
2020-07-06 14:15                   ` Lucas A. M. Magalhaes
2020-07-07 20:12                 ` Carlos O'Donell
2020-07-10 23:07                   ` Tulio Magno Quites Machado Filho
2020-07-11 14:45                     ` H.J. Lu
2020-07-11 16:31                       ` H.J. Lu
2020-07-13 23:30                         ` [PATCH] Correct timespec implementation [BZ #26232] H.J. Lu
2020-07-14  2:35                           ` Carlos O'Donell
2020-07-14 11:16                           ` Florian Weimer
2020-07-14 11:42                             ` H.J. Lu
2020-07-14 12:04                               ` H.J. Lu
2020-07-14 12:18                                 ` Florian Weimer
2020-07-14 13:12                                   ` H.J. Lu
2020-07-14 13:14                                     ` Florian Weimer
2020-07-14 13:17                                       ` H.J. Lu
2020-07-15 19:38                                         ` Paul Eggert
2020-07-15 19:44                                           ` H.J. Lu
2020-07-14 13:08                               ` Lucas A. M. Magalhaes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=37861457-ff4d-076c-9d9a-270266108aef@linaro.org \
    --to=adhemerval.zanella@linaro.org \
    --cc=lamm@linux.ibm.com \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).