From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <40ee09cc-cfa1-4aac-d51e-120f0dbaccd9@redhat.com>
In-Reply-To: <40ee09cc-cfa1-4aac-d51e-120f0dbaccd9@redhat.com>
From: Michael Hudson-Doyle
Date: Thu, 9 Mar 2023 15:36:29 +1300
Subject: Re: release branch policy and distributions
To: "Carlos O'Donell"
Cc: libc-alpha, Sam James, Simon Chopin
Content-Type: multipart/alternative; boundary="000000000000c7d51705f66e83fe"

--000000000000c7d51705f66e83fe
Content-Type: text/plain; charset="UTF-8"

On Fri, 3 Mar 2023 at 07:04, Carlos O'Donell wrote:

> On 2/16/23 17:57, Michael Hudson-Doyle wrote:
> > Would it be unreasonable to suggest a policy where performance improvements
> > are not backported to release branches until say a month after they have
> > been included in a glibc release? I realize this would add some overhead to
> > keep track of these 'pending' backports but I personally would be happier
> > consuming the release branches directly if there was this sort of policy.
>
> Michael, Andreas, Adhemerval,
>
> Thank you all for raising this.
>

Thank you for the thought-provoking reply.

> I want to talk about outcomes.
>
> The outcome I want is for there to be fewer defects in the development branch,
> and by proxy fewer defects in the release branch.
>

Obviously this is not an outcome I would oppose, so it's interesting to
think about why I don't feel it quite gets to the point of my concerns.
More on this later :-)

> (1) Do we have evidence of an increased rate of defects?
>
> I know we have some anecdotal evidence that we recently had defects in the
> rolling release branches. Have we collected that evidence to determine what
> kind of action is required?

I don't have data, no. When I think of this sort of thing, I think of two
recentish bugs:

1) https://sourceware.org/bugzilla/show_bug.cgi?id=29611 which is the one
   where code assumed that if AVX2 was available, BMI2 was too
2) https://sourceware.org/bugzilla/show_bug.cgi?id=30065 which was confusion
   around the semantics of strncat (this didn't get backported to a release
   branch, although it came pretty close to ending up in a release)

> Do we have a gap in our hardware or testing that
> needs to be improved?
>

Well, clearly, yes there are gaps.

On the hardware front, especially in x86 land, is it realistic to cover all
possibilities? I'm very far from an expert on this stuff but I know the
kernel defines more than 300 X86_FEATURE_* macros. Lots of them are
presumably always true on all hardware glibc still supports, and they are
not all independently available, but it's still an intimidating landscape.
Maybe I am being overly pessimistic.

On the semantic front, I kind of feel the same way.
It's clearly _possible_ to have tests that cover all aspects of the
semantics of the string functions, and glibc surely has tests that cover
_most_ semantics already. But absent something like autogeneration of test
cases from some kind of formal description of these semantics -- which are
then executed on a wide range of hardware! -- I can't see how we can be
confident no gaps remain.

> A gap here could be that we need to setup x86_64 pre-commit CI with an AVX512
> system to test all the IFUNCs (which may catch nothing if the tests are
> missing).
>

That's certainly one example.

> Another gap here could be that we need to setup pre-commit CI to rebuild certain
> packages under the modified glibc (similar to Fedora Rawhide CI).
>

Again this might help catch some issues, but I doubt it would have caught
the two bugs above.

> (2) What is your distro policy for updating from the rolling release branch?
>

Poorly defined, which is something I would like to change. In practice

1) we follow the release branch for a short while after release, with the
   final update a bit before Ubuntu itself releases
2) for non-LTS releases we generally don't update glibc at all
3) for LTS releases we do occasional updates on request, basically

Security updates trump all of this of course (but are not handled by my
team).

Do you know what RHEL's policy is for glibc updates?

> While upstream gives a policy, what is your own policy?
>

Well by default, it doesn't change. A strict reading of
https://wiki.ubuntu.com/StableReleaseUpdates would suggest that a glibc
update would have to be accompanied by an explicit test case for each
change that has been included in the release branch since the previous
update.
I think glibc should be covered by the "micro release exception" though:
https://wiki.ubuntu.com/StableReleaseUpdates#New_upstream_microreleases

> Example: Fedora Rawhide CI rebuilds a number of packages using the new glibc
> we sync weekly, and we review the rebuild failures and their testsuite results
> before putting the new glibc into Rawhide (or stable Fedora releases).
> For example, rebuilding lua and running their testsuite, particularly the string
> testsuite is good at detecting further string-related optimization defects.
>

We don't do anything like this as regularly. We do a rebuild of every
package in the archive with a snapshot glibc at some point in the
development cycle, but usually only once. Each new upload of glibc, to
development or a stable release, triggers the testing of almost every other
package as well. The issue we have of course -- and I assume Fedora is the
same here -- is that these tests all run on essentially the same hardware.
We found BZ# 30065 in our rebuild testing but this was partly just luck.

> (3) How does delaying backports impact our outcome?
>
> One way is that we use this extra time to do additional testing that discovers
> defects, and then we work with the machine maintainer, IHVs, etc, and correct
> the defects.
>

Well, if the delay is past a glibc or distribution release, that might make
a difference. I do think how bad a defect is depends a lot on where it ends
up. To be parochial and concentrate on Ubuntu:

1) a bug in the development branch only of glibc is currently of very
   little impact to Ubuntu. We don't upload pre-releases to the primary
   archive.
2) a bug in a glibc release will get uploaded to the "proposed pocket" of
   the development series of Ubuntu, where a lot of testing happens (on
   homogeneous hardware though). A bug here still doesn't impact users but
   can interfere with distribution development
3) if a bug gets past the automated testing it migrates to the "release
   pocket" of the development series of Ubuntu, which can affect users of
   the development series of Ubuntu, but these people are expected to know
   what they are letting themselves in for
4) if a bug makes it into the Ubuntu release, it can affect more regular
   users, which is starting to get into bad news territory but at least it
   will only affect new installs or newly updated installs.
5) if a bug is included in a stable release update, that's... really really
   bad. It leads to bad press and a culture of people not applying updates.

Obviously we don't just unleash stable release updates on people without
any testing but as I've said a few times, the automated testing hardware is
quite homogeneous.

> This means that the action we want to take is not delaying, but some kind of
> increase in testing. In fact delaying may solve nothing if additional
> validation and verification is not carried out in that delay period.
>

Well yes. Maybe the release being deployed to distributions isn't "testing"
explicitly but it's certainly "use".

> My opinion is that delaying alone is not an outcome changing activity,

I humbly disagree on this point, as above. Delaying in and of itself is not
an outcome changing activity, but delaying past glibc and distribution
releases can be.

> and as
> a steward for the project I do not want to delay code from reach our users
> unless we can show that delay allowed the users to capture some value e.g.
> higher stability.
>

I also want to get updated code to users more quickly; that's why I started
this thread!

> How could Canonical, Gentoo or Linaro support additional upstream testing?
>

I think running the glibc testsuite on a wider range of hardware would be
the most significant thing we could do here.
We do have quite a range of hardware for testing but I wouldn't know where
to start with using it for glibc pre-commit CI, and I also doubt it's
comprehensive in a way that would be useful in this context. I wonder if
the silicon vendors have anything like this...

> Can we work together to turn on more distro-specific pre-commit CI testing?
>

I don't think "distro-specific" is quite the point here.

Cheers,
mwh

> We have patchwork, it has a REST API, and we can submit test results via that
> API, like we do today for i686 test results (Fedora Rawhide-based).
>
> --
> Cheers,
> Carlos.
>
>

--000000000000c7d51705f66e83fe--