public inbox for gdb-patches@sourceware.org
* Re: GDB 7.9.90 available for testing
@ 2015-07-09 22:34 David Edelsohn
  2015-07-09 23:21 ` Joel Brobecker
  0 siblings, 1 reply; 15+ messages in thread
From: David Edelsohn @ 2015-07-09 22:34 UTC (permalink / raw)
  To: Joel Brobecker; +Cc: GDB Patches

Why should GDB 7.10 be considered ready for release with the large
number of regressions shown by the GDB Buildbot?  The buildbot only
recently was created and shows numerous regressions on almost all
architectures relative to when the buildslave started running.

Thanks, David

* Re: GDB 7.9.90 available for testing
  2015-07-09 22:34 GDB 7.9.90 available for testing David Edelsohn
@ 2015-07-09 23:21 ` Joel Brobecker
  2015-07-10  1:30   ` David Edelsohn
  0 siblings, 1 reply; 15+ messages in thread
From: Joel Brobecker @ 2015-07-09 23:21 UTC (permalink / raw)
  To: David Edelsohn; +Cc: GDB Patches

> Why should GDB 7.10 be considered ready for release with the large
> number of regressions shown by the GDB Buildbot?  The buildbot only
> recently was created and shows numerous regressions on almost all
> architectures relative to when the buildslave started running.

I don't follow the Buildbot day to day, unfortunately, and there are
many configurations; so I rely on everyone with regular emails on
this list to help me determine whether there might be issues blocking
for (1) cutting the branch, and (2) making the official release.
We keep the list on a wiki page that helps us track identified issues
for each release.  For instance, for 7.10, we have:
    https://sourceware.org/gdb/wiki/GDB_7.10_Release
As you can see, very little has been reported.

I am happy holding the creation of the official release up for
however long it would take to analyze the buildBot failures,
and to fix all blocking issues. Honestly, given the traffic seen
here derived from issues found by the buildBot, I thought that
people were already keeping an eye on the results.

-- 
Joel

* Re: GDB 7.9.90 available for testing
  2015-07-09 23:21 ` Joel Brobecker
@ 2015-07-10  1:30   ` David Edelsohn
  2015-07-10  3:43     ` Joel Brobecker
  0 siblings, 1 reply; 15+ messages in thread
From: David Edelsohn @ 2015-07-10  1:30 UTC (permalink / raw)
  To: Joel Brobecker; +Cc: GDB Patches

On Thu, Jul 9, 2015 at 7:21 PM, Joel Brobecker <brobecker@adacore.com> wrote:
>> Why should GDB 7.10 be considered ready for release with the large
>> number of regressions shown by the GDB Buildbot?  The buildbot only
>> recently was created and shows numerous regressions on almost all
>> architectures relative to when the buildslave started running.
>
> I don't follow the Buildbot day to day, unfortunately, and there are
> many configurations; so I rely on everyone with regular emails on
> this list to help me determine whether there might be issues blocking
> for (1) cutting the branch, and (2) making the official release.
> We keep the list on a wiki page that helps us track identified issues
> for each release.  For instance, for 7.10, we have:
>     https://sourceware.org/gdb/wiki/GDB_7.10_Release
> As you can see, very little has been reported.
>
> I am happy holding the creation of the official release up for
> however long it would take to analyze the buildBot failures,
> and to fix all blocking issues. Honestly, given the traffic seen
> here derived from issues found by the buildBot, I thought that
> people were already keeping an eye on the results.

I'm not certain if the baselines truly are accurate for all
buildslaves, but it seems strange to create a release when the
buildbot testsuite results show patches causing new failures.

- David

* Re: GDB 7.9.90 available for testing
  2015-07-10  1:30   ` David Edelsohn
@ 2015-07-10  3:43     ` Joel Brobecker
  2015-07-10 14:04       ` David Edelsohn
  0 siblings, 1 reply; 15+ messages in thread
From: Joel Brobecker @ 2015-07-10  3:43 UTC (permalink / raw)
  To: David Edelsohn; +Cc: GDB Patches

> I'm not certain if the baselines truly are accurate for all
> buildslaves, but it seems strange to create a release when the
> buildbot testsuite results show patches causing new failures.

To me, you are saying the same thing, and I don't disagree with you.
I said I didn't know that the buildBots were showing regressions.
Of course I would have held the creation of the branch if I had
known about this. But I didn't, and so here we are. Now we all know,
and the only way forward is to look at those regressions, and decide
what to do. We can and will delay the release if we have to.

Now, if you are implying that I didn't do enough as Release
Manager to prepare for this release, then that's an entirely
different discussion.

-- 
Joel

* Re: GDB 7.9.90 available for testing
  2015-07-10  3:43     ` Joel Brobecker
@ 2015-07-10 14:04       ` David Edelsohn
  2015-07-10 14:33         ` Pedro Alves
  0 siblings, 1 reply; 15+ messages in thread
From: David Edelsohn @ 2015-07-10 14:04 UTC (permalink / raw)
  To: Joel Brobecker; +Cc: GDB Patches

On Thu, Jul 9, 2015 at 11:42 PM, Joel Brobecker <brobecker@adacore.com> wrote:
>> I'm not certain if the baselines truly are accurate for all
>> buildslaves, but it seems strange to create a release when the
>> buildbot testsuite results show patches causing new failures.
>
> To me, you are saying the same thing, and I don't disagree with you.
> I said I didn't know that the buildBots were showing regressions.
> Of course I would have held the creation of the branch if I had
> known about this. But I didn't, and so here we are. Now we all know,
> and the only way forward is to look at those regressions, and decide
> what to do. We can and will delay the release if we have to.

Joel,

We are agreeing.  I was trying to provide some additional information
about interpretation of the buildbot status.

I note two things about the buildbots:

1) Their color-coded "regression status" apparently is a comparison of
the testsuite results between a "base" run and the current run, because
few or no targets have completely clean testsuite runs that could be
considered "green".  Because there has been some adjustment and tweaking
while buildbots were added, the first run was not necessarily the ideal
one to choose as the "base" run, i.e., "regressions" may be due to
changes in the measurements after the first "base" run, not to new
failing tests.

2) Separate from the "regression" status, a quick inspection of some
testsuite output in the buildbots shows that recent commits have
introduced new errors.  Even if the overall regression status is not
an accurate measure of the state of GDB on those targets, it is
disappointing that the regression count is not decreasing monotonically
in preparation for a release.

I'm not demanding STOP SHIP.  GDB may not necessarily be in a bad
state for a release.  I don't know how this compares to the regression
status of previous releases.

I hope that GDB developers will become more aware of the effects of
their patches on multiple targets.

Thanks, David

* Re: GDB 7.9.90 available for testing
  2015-07-10 14:04       ` David Edelsohn
@ 2015-07-10 14:33         ` Pedro Alves
  2015-07-10 14:56           ` David Edelsohn
  2015-07-11 13:59           ` Doug Evans
  0 siblings, 2 replies; 15+ messages in thread
From: Pedro Alves @ 2015-07-10 14:33 UTC (permalink / raw)
  To: David Edelsohn, Joel Brobecker; +Cc: GDB Patches

On 07/10/2015 03:04 PM, David Edelsohn wrote:
> On Thu, Jul 9, 2015 at 11:42 PM, Joel Brobecker <brobecker@adacore.com> wrote:
>>> I'm not certain if the baselines truly are accurate for all
>>> buildslaves, but it seems strange to create a release when the
>>> buildbot testsuite results show patches causing new failures.
>>
>> To me, you are saying the same thing, and I don't disagree with you.
>> I said I didn't know that the buildBots were showing regressions.
>> Of course I would have held the creation of the branch if I had
>> known about this. But I didn't, and so here we are. Now we all know,
>> and the only way forward is to look at those regressions, and decide
>> what to do. We can and will delay the release if we have to.
> 
> Joel,
> 
> We are agreeing.  I was trying to provide some additional information
> about interpretation of the buildbot status.
> 
> I note two things about the buildbots:
> 
> 1) Their color-coded "regression status" apparently is a comparison of
> the testsuite results between a "base" run and the current run, because
> few or no targets have completely clean testsuite runs that could be
> considered "green".  Because there has been some adjustment and tweaking
> while buildbots were added, the first run was not necessarily the ideal
> one to choose as the "base" run, i.e., "regressions" may be due to
> changes in the measurements after the first "base" run, not to new
> failing tests.

There's no single "base" run, actually.  The baseline is dynamically
adjusted at each build; it's a moving baseline, and it's per
test (single PASS/FAIL, not file).  As soon as a test PASSes, it's
added to the baseline.  That means that if some test is racy, it'll
sometimes FAIL, and then a few builds later it'll PASS, at which point
the PASS is recorded in the baseline, and then a few builds later
the test FAILs again, and so the buildbot email report mentions
the regression against the baseline.  In sum, if a test goes
FAIL -> PASS -> FAIL -> PASS on and on over builds, you'll constantly
get reports of regressions against the baseline for that racy test.
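
[A minimal Python sketch of the moving-baseline logic described above,
for illustration only; this is not the actual BuildBot code and the
names are made up:]

    def update_baseline(baseline, current_results):
        """baseline: set of test names that have PASSed at least once.
        current_results: dict of test name -> "PASS" or "FAIL"."""
        regressions = []
        for test, outcome in current_results.items():
            if outcome == "PASS":
                # Once a test PASSes, it enters the moving baseline.
                baseline.add(test)
            elif test in baseline:
                # A FAIL of a test that has ever PASSed is reported as
                # a regression, which is why a racy FAIL -> PASS -> FAIL
                # test keeps generating regression reports.
                regressions.append(test)
        return regressions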

For each build, you can find the baseline file in the corresponding
git commit pointed at in the email report.  E.g., see the "baseline"
file here:

  http://gdb-build.sergiodj.net/cgit/AIX-POWER7-plain/.git/tree/?h=master&id=42b08c842d422ae995d244efeb1a85aa8a082e7b

The gdb.thread/ FAILs you see on AIX seem to fall in that category.
From the results, it looks to me that those are caused by the AIX port
not implementing schedlock correctly.  Is anyone from IBM available
to look at these?

The gdb.cp/var-tag.exp FAILs currently reported on AIX are not really
regressions, but new FAILs.  And they are really a test problem, not
a GDB bug.  They actually depend on compiler or debug info format
used, not system.

> 
> 2) Separate from the "regression" status, a quick inspection of some
> testsuite output in the buildbots shows that recent commits have
> introduced new errors.  Even if the overall regression status is not
> an accurate measure of the state of GDB on those targets, it is
> disappointing that the regression count is not decreasing monotonically
> in preparation for a release.
> 
> I'm not demanding STOP SHIP.  GDB may not necessarily be in a bad
> state for a release.  I don't know how this compares to the regression
> status of previous releases.
> 
> I hope that GDB developers will become more aware of the effects of
> their patches on multiple targets.

This is just a misunderstanding.

Thanks,
Pedro Alves

* Re: GDB 7.9.90 available for testing
  2015-07-10 14:33         ` Pedro Alves
@ 2015-07-10 14:56           ` David Edelsohn
  2015-07-10 15:08             ` Pedro Alves
  2015-07-11 13:59           ` Doug Evans
  1 sibling, 1 reply; 15+ messages in thread
From: David Edelsohn @ 2015-07-10 14:56 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Joel Brobecker, GDB Patches

On Fri, Jul 10, 2015 at 10:33 AM, Pedro Alves <palves@redhat.com> wrote:
> On 07/10/2015 03:04 PM, David Edelsohn wrote:
>> On Thu, Jul 9, 2015 at 11:42 PM, Joel Brobecker <brobecker@adacore.com> wrote:
>>>> I'm not certain if the baselines truly are accurate for all
>>>> buildslaves, but it seems strange to create a release when the
>>>> buildbot testsuite results show patches causing new failures.
>>>
>>> To me, you are saying the same thing, and I don't disagree with you.
>>> I said I didn't know that the buildBots were showing regressions.
>>> Of course I would have held the creation of the branch if I had
>>> known about this. But I didn't, and so here we are. Now we all know,
>>> and the only way forward is to look at those regressions, and decide
>>> what to do. We can and will delay the release if we have to.
>>
>> Joel,
>>
>> We are agreeing.  I was trying to provide some additional information
>> about interpretation of the buildbot status.
>>
>> I note two things about the buildbots:
>>
>> 1) Their color-coded "regression status" apparently is a comparison of
>> the testsuite results between a "base" run and the current run, because
>> few or no targets have completely clean testsuite runs that could be
>> considered "green".  Because there has been some adjustment and tweaking
>> while buildbots were added, the first run was not necessarily the ideal
>> one to choose as the "base" run, i.e., "regressions" may be due to
>> changes in the measurements after the first "base" run, not to new
>> failing tests.
>
> There's no single "base" run, actually.  The baseline is dynamically
> adjusted at each build; it's a moving baseline, and it's per
> test (single PASS/FAIL, not file).  As soon as a test PASSes, it's
> added to the baseline.  That means that if some test is racy, it'll
> sometimes FAIL, and then a few builds later it'll PASS, at which point
> the PASS is recorded in the baseline, and then a few builds later
> the test FAILs again, and so the buildbot email report mentions
> the regression against the baseline.  In sum, if a test goes
> FAIL -> PASS -> FAIL -> PASS on and on over builds, you'll constantly
> get reports of regressions against the baseline for that racy test.

Thanks for the clarification.

>
> For each build, you can find the baseline file in the corresponding
> git commit pointed at in the email report.  E.g., see the "baseline"
> file here:
>
>   http://gdb-build.sergiodj.net/cgit/AIX-POWER7-plain/.git/tree/?h=master&id=42b08c842d422ae995d244efeb1a85aa8a082e7b
>
> The gdb.thread/ FAILs you see on AIX seem to fall in that category.
> From the results, it looks to me that those are caused by the AIX port
> not implementing schedlock correctly.  Is anyone from IBM available
> to look at these?
>
> The gdb.cp/var-tag.exp FAILs currently reported on AIX are not really
> regressions, but new FAILs.  And they are really a test problem, not
> a GDB bug.  They actually depend on compiler or debug info format
> used, not system.

My concern is more about GDB on Linux on z Systems and even GDB on
x86-64, not AIX.  AIX is weird.

Shouldn't the buildbots for z Series and x86-64 be green before a release?

Thanks, David

* Re: GDB 7.9.90 available for testing
  2015-07-10 14:56           ` David Edelsohn
@ 2015-07-10 15:08             ` Pedro Alves
  2015-07-10 15:25               ` David Edelsohn
  0 siblings, 1 reply; 15+ messages in thread
From: Pedro Alves @ 2015-07-10 15:08 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Joel Brobecker, GDB Patches

On 07/10/2015 03:56 PM, David Edelsohn wrote:

> My concern is more about GDB on Linux on z Systems and even GDB on
> x86-64, not AIX.  AIX is weird.
> 
> Shouldn't the buildbots for z Series and x86-64 be green before a release?

Ideally yes, but until the racy tests issue is fixed/kfailed and another
set of old known failures is kfail/xfailed, that can't happen.
I don't think any buildbot slave has ever been stably green yet; they
weren't green to start with.  It used to be much worse a few months ago,
we're getting there, but it requires effort, and we could use all
the help we can get.

The buildbots are quite useful, but we can't rely on greenness
alone to determine release-readiness at the moment.

Thanks,
Pedro Alves

* Re: GDB 7.9.90 available for testing
  2015-07-10 15:08             ` Pedro Alves
@ 2015-07-10 15:25               ` David Edelsohn
  2015-07-10 15:33                 ` Pedro Alves
  0 siblings, 1 reply; 15+ messages in thread
From: David Edelsohn @ 2015-07-10 15:25 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Joel Brobecker, GDB Patches

On Fri, Jul 10, 2015 at 11:08 AM, Pedro Alves <palves@redhat.com> wrote:
> On 07/10/2015 03:56 PM, David Edelsohn wrote:
>
>> My concern is more about GDB on Linux on z Systems and even GDB on
>> x86-64, not AIX.  AIX is weird.
>>
>> Shouldn't the buildbots for z Series and x86-64 be green before a release?
>
> Ideally yes, but until the racy tests issue is fixed/kfailed and another
> set of old known failures is kfail/xfailed, that can't happen.
> I don't think any buildbot slave has ever been stably green yet; they
> weren't green to start with.  It used to be much worse a few months ago,
> we're getting there, but it requires effort, and we could use all
> the help we can get.
>
> The buildbots are quite useful, but we can't rely on greenness
> alone to determine release-readiness at the moment.

Yes, I was not requesting "green".  Because we do have the buildbots
available, I thought that it was important to understand the variable
failures or instability before a release.  If all of the failures on
primary platforms are due to race conditions, then they're understood.

Thanks, David

* Re: GDB 7.9.90 available for testing
  2015-07-10 15:25               ` David Edelsohn
@ 2015-07-10 15:33                 ` Pedro Alves
  0 siblings, 0 replies; 15+ messages in thread
From: Pedro Alves @ 2015-07-10 15:33 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Joel Brobecker, GDB Patches

On 07/10/2015 04:25 PM, David Edelsohn wrote:

> Yes, I was not requesting "green".  Because we do have the buildbots
> available, I thought that it was important to understand the variable
> failures or instability before a release.  If all of the failures on
> primary platforms are due to race conditions, then they're understood.

Agreed, but what makes you think people aren't doing that?

We have a wiki page to track issues that need to be fixed before
the release, and I'm sure Joel will not release without first
asking the community if there are other issues people know
should be addressed.

Reminder: anyone, if there are issues that need fixing that are
not on the wiki page, please list them there so we don't forget:

 https://sourceware.org/gdb/wiki/GDB_7.10_Release

Thanks,
Pedro Alves

* Re: GDB 7.9.90 available for testing
  2015-07-10 14:33         ` Pedro Alves
  2015-07-10 14:56           ` David Edelsohn
@ 2015-07-11 13:59           ` Doug Evans
  2015-07-11 18:54             ` racy tests Pedro Alves
  1 sibling, 1 reply; 15+ messages in thread
From: Doug Evans @ 2015-07-11 13:59 UTC (permalink / raw)
  To: Pedro Alves; +Cc: David Edelsohn, Joel Brobecker, GDB Patches

On Fri, Jul 10, 2015 at 9:33 AM, Pedro Alves <palves@redhat.com> wrote:
> There's no single "base" run, actually.  The baseline is dynamically
> adjusted at each build; it's a moving baseline, and it's per
> test (single PASS/FAIL, not file).  As soon as a test PASSes, it's
> added to the baseline.  That means that if some test is racy, it'll
> sometimes FAIL, and then a few builds later it'll PASS, at which point
> the PASS is recorded in the baseline, and then a few builds later
> the test FAILs again, and so the buildbot email report mentions
> the regression against the baseline.  In sum, if a test goes
> FAIL -> PASS -> FAIL -> PASS on and on over builds, you'll constantly
> get reports of regressions against the baseline for that racy test.

Time for another plug to change how we manage racy tests?

E.g., if a test fails, run it again a few times.
I can think of various things to do after that.
E.g. if any of the additional runs of the test record a PASS then flag
the test as RACY, and remember this state for the next run, and rerun
the same test multiple times in the next run. If the next time all N
runs pass (or all N runs fail) then switch its state to PASS/FAIL.
That's not perfect, it's hard to be perfect with racy tests. One can
build on that, but there's a pragmatic tradeoff here between being too
complex and not doing anything at all.
I think we should do something. The above keeps the baseline
machine-generated and does minimal work to manage racy tests. A lot of
racy tests get exposed during these additional runs for me because I
don't run them in parallel and thus the system is under less load, and
it's system load that triggers a lot of the raciness.
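
[A rough Python sketch of the re-run policy outlined above, purely
illustrative; nothing like this exists in the testsuite today, and the
helper names are invented:]

    def classify(test, outcome, state, rerun, n_reruns=3):
        """state: dict of test -> "PASS"/"FAIL"/"RACY" remembered from
        the previous run; rerun(test) re-executes the test and returns
        its outcome.  Returns the new state for this test."""
        if state.get(test) == "RACY":
            # Remembered as racy: rerun N times and switch back to
            # PASS/FAIL only if every rerun agrees.
            outcomes = {rerun(test) for _ in range(n_reruns)}
            return outcomes.pop() if len(outcomes) == 1 else "RACY"
        if outcome == "FAIL":
            # A failing test gets a few more chances; any PASS among
            # the reruns flags it as racy rather than as a regression.
            if any(rerun(test) == "PASS" for _ in range(n_reruns)):
                return "RACY"
            return "FAIL"
        return "PASS"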

The ultimate goal is of course to remove racy tests, but first we need
to be more systematic in identifying them, which is one of the goals
of this process.

* racy tests
  2015-07-11 13:59           ` Doug Evans
@ 2015-07-11 18:54             ` Pedro Alves
  2015-07-14 16:05               ` Doug Evans
  2015-07-14 20:26               ` Sergio Durigan Junior
  0 siblings, 2 replies; 15+ messages in thread
From: Pedro Alves @ 2015-07-11 18:54 UTC (permalink / raw)
  To: Doug Evans; +Cc: David Edelsohn, Joel Brobecker, GDB Patches

On 07/11/2015 02:58 PM, Doug Evans wrote:
> On Fri, Jul 10, 2015 at 9:33 AM, Pedro Alves <palves@redhat.com> wrote:
>> There's no single "base" run, actually.  The baseline is dynamically
>> adjusted at each build; it's a moving baseline, and it's per
>> test (single PASS/FAIL, not file).  As soon as a test PASSes, it's
>> added to the baseline.  That means that if some test is racy, it'll
>> sometimes FAIL, and then a few builds later it'll PASS, at which point
>> the PASS is recorded in the baseline, and then a few builds again
>> later, the test FAIL again, and so the buildbot email report mentions
>> the regression against the baseline.  In sum, if a test goes
>> FAIL -> PASS -> FAIL -> PASS on and on over builds, you'll constantly
>> get reports of regressions against the baseline for that racy test.
> 
> Time for another plug to change how we manage racy tests?

I'm all for something more structured.

> E.g., if a test fails, run it again a few times.

Agreed.

> I can think of various things to do after that.
> E.g. if any of the additional runs of the test record a PASS then flag
> the test as RACY, and remember this state for the next run, and rerun
> the same test multiple times in the next run. If the next time all N
> runs pass (or all N runs fail) then switch its state to PASS/FAIL.
> That's not perfect, it's hard to be perfect with racy tests. One can
> build on that, but there's a pragmatic tradeoff here between being too
> complex and not doing anything at all.
> I think we should do something. The above keeps the baseline
> machine-generated and does minimal work to manage racy tests. A lot of
> racy tests get exposed during these additional runs for me because I
> don't run them in parallel and thus the system is under less load, and
> it's system load that triggers a lot of the raciness.
> 

One thing that I'd like is for this to be part of the testsuite
itself, rather than separate machinery the buildbot uses.  That way,
everyone benefits from it, and we all maintain/evolve it.
I think this is important, because people are often confused when
they do a test run before a patch, apply the patch, run the tests
again, and see new FAILs their patch can't explain.

E.g., we could have the testsuite machinery itself run the tests
multiple times, iff they failed.  Maybe all tests would be eligible for
this, or maybe we'd apply this only to those which are explicitly
marked racy somehow, but that's a separate policy from the framework
that actually re-runs tests.  On a parallel test run, we run
each .exp under its own separate runtest invocation, driven from
the testsuite's Makefile; we could wrap each of those invocations and
check whether it failed, and if so, rerun that exp a few times.

That may mean that only parallel mode supports this, but I'd be
myself fine with that, because we can always do

  make check -j1 FORCE_PARALLEL="1"

or some convenience for that, to get the benefits.
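
[A rough Python sketch of that wrapping idea, for illustration only;
the real testsuite drives runtest from the Makefile, and none of these
helper names exist:]

    import subprocess

    def run_exp_with_retries(exp, max_reruns=2):
        """Run one .exp under its own runtest invocation, rerunning it
        a couple of times if the resulting gdb.sum contains FAILs."""
        for attempt in range(1 + max_reruns):
            subprocess.run(["runtest", exp], check=False)
            with open("gdb.sum") as f:
                if "FAIL: " not in f.read():
                    return True   # clean run, keep this result
        return False              # still failing after the reruns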

Maybe it's possible to restart the same .exp test in a
sequential run too, from gdb_finish, say; I haven't thought much
about that.

> The ultimate goal is of course to remove racy tests, but first we need
> to be more systematic in identifying them, which is one of the goals
> of this process.

Agreed.

Thanks,
Pedro Alves

* Re: racy tests
  2015-07-11 18:54             ` racy tests Pedro Alves
@ 2015-07-14 16:05               ` Doug Evans
  2015-07-14 20:26               ` Sergio Durigan Junior
  1 sibling, 0 replies; 15+ messages in thread
From: Doug Evans @ 2015-07-14 16:05 UTC (permalink / raw)
  To: Pedro Alves; +Cc: David Edelsohn, Joel Brobecker, GDB Patches

On Sat, Jul 11, 2015 at 11:54 AM, Pedro Alves <palves@redhat.com> wrote:
> One thing that I'd like is for this to be part of the testsuite
> itself, rather than separate machinery the buildbot uses.  That way,
> everyone benefits from it, and we all maintain/evolve it.
> I think this is important, because people are often confused when
> they do a test run before a patch, apply the patch, run the tests
> again, and see new FAILs their patch can't explain.

No disagreement there.
I would build it on top of what's there now.
[I'd rather build this up in layers, and not have
overly complicated lower layers.]

The next question that arises is maintaining history.
E.g., how does one diff the results of the current run
against the current "gold standard"?

The way I do it here is to have separate files that augment the
XFAIL/KFAIL markers in the tests (it's far easier to maintain a few
files than to edit each test's .exp file), but I'm not sure it
scales well.
[E.g., I need to keep separate files for different compilers,
though there is a #include mechanism for common stuff.]

Alternatively, if a test run could take as input the gdb.sum file
from a baseline run (e.g., from an unpatched trunk), then that
could work.
Buildbot could use the previous run, and Joe-Developer
could either use as input a buildbot run's output file
or run the testsuite twice (e.g., with/without the patch-under-test).
[I wouldn't use gdb.sum specifically, I'm just using it here
for illustration's sake.]
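
[For illustration, a small Python sketch of such a diff, treating each
"PASS:"/"FAIL:" line of a .sum-style file as one result; the exact file
names and format details here are assumptions:]

    def read_results(sum_path):
        """Map test name -> outcome from a .sum-style file (lines such
        as "FAIL: gdb.base/foo.exp: bar")."""
        results = {}
        with open(sum_path) as f:
            for line in f:
                if line.startswith(("PASS: ", "FAIL: ", "KFAIL: ", "XFAIL: ")):
                    outcome, name = line.split(": ", 1)
                    results[name.strip()] = outcome
        return results

    def new_fails(baseline_sum, current_sum):
        """Tests that FAIL now but did not FAIL in the baseline run."""
        base = read_results(baseline_sum)
        cur = read_results(current_sum)
        return sorted(t for t, o in cur.items()
                      if o == "FAIL" and base.get(t) != "FAIL")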

* Re: racy tests
  2015-07-11 18:54             ` racy tests Pedro Alves
  2015-07-14 16:05               ` Doug Evans
@ 2015-07-14 20:26               ` Sergio Durigan Junior
  1 sibling, 0 replies; 15+ messages in thread
From: Sergio Durigan Junior @ 2015-07-14 20:26 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Doug Evans, David Edelsohn, Joel Brobecker, GDB Patches

On Saturday, July 11 2015, Pedro Alves wrote:

>> I can think of various things to do after that.
>> E.g. if any of the additional runs of the test record a PASS then flag
>> the test as RACY, and remember this state for the next run, and rerun
>> the same test multiple times in the next run. If the next time all N
>> runs pass (or all N runs fail) then switch its state to PASS/FAIL.
>> That's not perfect, it's hard to be perfect with racy tests. One can
>> build on that, but there's a pragmatic tradeoff here between being too
>> complex and not doing anything at all.
>> I think we should do something. The above keeps the baseline
>> machine-generated and does minimal work to manage racy tests. A lot of
>> racy tests get exposed during these additional runs for me because I
>> don't run them in parallel and thus the system is under less load, and
>> it's system load that triggers a lot of the raciness.
>> 
>
> One thing that I'd like is for this to be part of the testsuite
> itself, rather than separate machinery the buildbot uses.  That way,
> everyone benefits from it, and we all maintain/evolve it.
> I think this is important, because people are often confused when
> they do a test run before a patch, apply the patch, run the tests
> again, and see new FAILs their patch can't explain.

I have something implemented for BuildBot that would address the issue
of racy tests, but I agree that having this as part of the official
testsuite is the best approach.  I'm willing to work on this, BTW.  I
will see what I can do this week during my spare time.

> E.g., we could have the testsuite machinery itself run the tests
> multiple times, iff they failed.

I would expand this to "have the testsuite machinery itself run the
tests multiple times".  Racy tests can also PASS on the first run,
which means that we'd miss some of them if the testsuite only re-ran
those that failed.  This would have side effects, of course: the
testsuite run would take much longer, because we would be running all
tests several times...
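
[A tiny Python sketch of that idea: run everything N times and flag as
racy anything with mixed outcomes.  Illustrative only; this is not what
the testsuite or BuildBot actually does:]

    def find_racy_tests(run_all_tests, n_runs=5):
        """run_all_tests() returns a dict of test name -> "PASS"/"FAIL";
        a test with mixed outcomes across the N runs is considered racy."""
        seen = {}
        for _ in range(n_runs):
            for test, outcome in run_all_tests().items():
                seen.setdefault(test, set()).add(outcome)
        return sorted(t for t, o in seen.items() if len(o) > 1)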

BuildBot would then be able to use this to update its own xfail files.
I think running this "special test mode" once a week would be enough for
BuildBot to keep things up to date.

> Maybe all tests would be eligible for
> this, or maybe we'd apply this only to those which are explicitly
> marked racy somehow, but that's a separate policy from the framework
> that actually re-runs tests.  On a parallel test run, we run
> each .exp under its own separate runtest invocation, driven from
> the testsuite's Makefile; we could wrap each of those invocations and
> check whether it failed, and if so, rerun that exp a few times.
>
> That may mean that only parallel mode supports this, but I'd be
> myself fine with that, because we can always do
>
>   make check -j1 FORCE_PARALLEL="1"
>
> or some convenience for that, to get the benefits.

I glanced over the testsuite's Makefile, and I thought about having an
explicit "FORCE_RACY_PARALLEL" (with its do-check-racy-parallel
counterpart).

This would be a "special mode" that would take longer to complete (as
explained above), but that would also generate real results.  I think
we'd have to modify dg-extract-results.sh as well; not sure.

-- 
Sergio
GPG key ID: 237A 54B1 0287 28BF 00EF  31F4 D0EB 7628 65FC 5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

* GDB 7.9.90 available for testing
@ 2015-07-06 20:29 Joel Brobecker
  0 siblings, 0 replies; 15+ messages in thread
From: Joel Brobecker @ 2015-07-06 20:29 UTC (permalink / raw)
  To: gdb-patches


Hello,

I have just finished creating the gdb-7.9.90 pre-release.
It is available for download at the following location:

    ftp://sourceware.org/pub/gdb/snapshots/branch/gdb-7.9.90.tar.xz

A gzip'ed version is also available: gdb-7.9.90.tar.gz.

Please give it a test if you can and report any problems you might find.

On behalf of all the GDB contributors, thank you!
-- 
Joel Brobecker

