From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 111230 invoked by alias); 9 Jul 2018 19:19:15 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 111220 invoked by uid 89); 9 Jul 2018 19:19:14 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=0.8 required=5.0 tests=AWL,BAYES_40,KAM_LAZY_DOMAIN_SECURITY autolearn=no version=3.3.2 spammy=Agent, headquarters, Headquarters, compliance X-HELO: snark.thyrsus.com Received: from thyrsus.com (HELO snark.thyrsus.com) (71.162.243.5) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 09 Jul 2018 19:19:12 +0000 Received: by snark.thyrsus.com (Postfix, from userid 1000) id 648443A4AA7; Mon, 9 Jul 2018 15:19:11 -0400 (EDT) From: esr@thyrsus.com (Eric S. Raymond) To: GCC Development , fallenpegasus@gmail.com Subject: Repo conversion troubles. Message-Id: <20180709191911.648443A4AA7@snark.thyrsus.com> Date: Mon, 09 Jul 2018 19:19:00 -0000 X-IsSubscribed: yes X-SW-Source: 2018-07/txt/msg00135.txt.bz2 Last time I did a comparison between SVN head and the git conversion tip they matched exactly. This time I have mismatches in the following files. libtool.m4 libvtv/ChangeLog libvtv/configure libvtv/testsuite/lib/libvtv.exp ltmain.sh lto-plugin/ChangeLog lto-plugin/configure lto-plugin/lto-plugin.c MAINTAINERS maintainer-scripts/ChangeLog maintainer-scripts/crontab maintainer-scripts/gcc_release Makefile.def Makefile.in Makefile.tpl zlib/configure zlib/configure.ac Now I'll explain what this means and why it's a serious problem. Reposurgeon is never confused by linear history, branching, or tagging; I have lots of regression tests for those cases. When it screws up it is invariably around branch copy operations, because there are cases near those where the data model of Subversion stream files is underspecified. 
That model was in fact entirely undocumented before I reverse-engineered it and wrote the description that now lives in the Subversion source tree. But that description is not complete; nobody, not even Subversion's designers, knows how to fill in all the corner cases.

Thus, a content mismatch like this means there was some recent branch merge to trunk in the gcc history that reposurgeon is not interpreting as intended, or more likely an operator error such as a non-Subversion directory copy followed by a commit - my analyzer can recover from most such cases but not all.

There are brute-force ways to pin down such malformations, but none of them are practical at the huge scale of this repository. The main problem here wouldn't be reposurgeon itself but the fact that Subversion checkouts on a repo this large are very slow. I've seen a single one take 12 hours; an attempt at a whole bisection run to pin down the divergence point on trunk would therefore probably cost log2 of the commit length times that, or about 18 days.

So...does that list of changed files look familiar to anyone? If we can identify the revision number of the bad commit, the odds of being able to unscramble this mess go way up. They still aren't good, not when merely loading the repository for examination takes over four hours, but they would be way better than if I were starting from zero.

This is serious. I have produced demonstrably correct history conversions of the gcc repo in the past. We may now be in a situation where I will never again be able to do that.

-- Eric S. Raymond The real point of audits is to instill fear, not to extract revenue; the IRS aims at winning through intimidation and (thereby) getting maximum voluntary compliance -- Paul Strassel, former IRS Headquarters Agent Wall St.
Journal 1980 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 103767 invoked by alias); 9 Jul 2018 19:40:13 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 103754 invoked by uid 89); 9 Jul 2018 19:40:13 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=1.8 required=5.0 tests=BAYES_50,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_PASS autolearn=no version=3.3.2 spammy=examination, familar, tip, pin X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 09 Jul 2018 19:40:09 +0000 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C59C130832D3; Mon, 9 Jul 2018 19:40:07 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-9.rdu2.redhat.com [10.10.112.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id A2CC25D750; Mon, 9 Jul 2018 19:40:06 +0000 (UTC) Subject: Re: Repo conversion troubles. To: "Eric S. Raymond" , GCC Development , fallenpegasus@gmail.com References: <20180709191911.648443A4AA7@snark.thyrsus.com> From: Jeff Law Openpgp: preference=signencrypt Message-ID: Date: Mon, 09 Jul 2018 19:40:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0 MIME-Version: 1.0 In-Reply-To: <20180709191911.648443A4AA7@snark.thyrsus.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-IsSubscribed: yes X-SW-Source: 2018-07/txt/msg00136.txt.bz2 On 07/09/2018 01:19 PM, Eric S. Raymond wrote: > Last time I did a comparison between SVN head and the git conversion > tip they matched exactly. This time I have mismatches in the following > files. 
>
> libtool.m4
> libvtv/ChangeLog
> libvtv/configure
> libvtv/testsuite/lib/libvtv.exp
> ltmain.sh
> lto-plugin/ChangeLog
> lto-plugin/configure
> lto-plugin/lto-plugin.c
> MAINTAINERS
> maintainer-scripts/ChangeLog
> maintainer-scripts/crontab
> maintainer-scripts/gcc_release
> Makefile.def
> Makefile.in
> Makefile.tpl
> zlib/configure
> zlib/configure.ac
>
> Now I'll explain what this means and why it's a serious problem.
[ ... ]

That's weird -- let's take maintainer-scripts/crontab as our victim. That file (according to the git mirror) has only changed on the trunk 3 times in the last year. They're all changes from Jakub and none look unusual at all. Just trivial-looking updates.

libvtv.exp is another interesting file. It changed twice in early May of this year. Prior to that it hadn't changed since 2015.

[ ... ]
>
> There are brute-force ways to pin down such malformations, but none of
> them are practical at the huge scale of this repository. The main
> problem here wouldn't be reposurgeon itself but the fact that Subversion
> checkouts on a repo this large are very slow. I've seen a single one
> take 12 hours; an attempt at a whole bisection run to pin down the
> divergence point on trunk would therefore probably cost log2 of the
> commit length times that, or about 18 days.

I'm not aware of any such merges, but any that occurred most likely happened after mid-April when the trunk was re-opened for development.

I'm assuming that it's only work that merges onto the trunk that's potentially problematical here.

>
> So...does that list of changed files look familiar to anyone? If we can
> identify the revision number of the bad commit, the odds of being able
> to unscramble this mess go way up. They still aren't good, not when
> merely loading the repository for examination takes over four hours,
> but they would be way better than if I were starting from zero.
They're familiar only in the sense that I know what those files are :-) Jeff From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 129205 invoked by alias); 9 Jul 2018 19:46:54 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 129150 invoked by uid 89); 9 Jul 2018 19:46:48 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=0.5 required=5.0 tests=BAYES_05,KAM_LAZY_DOMAIN_SECURITY,RCVD_IN_DNSWL_NONE autolearn=no version=3.3.2 spammy=H*Ad:D*t-online.de, H*r:encrypted, tip, mismatches X-HELO: mailout05.t-online.de Received: from mailout05.t-online.de (HELO mailout05.t-online.de) (194.25.134.82) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 09 Jul 2018 19:46:45 +0000 Received: from fwd39.aul.t-online.de (fwd39.aul.t-online.de [172.20.27.138]) by mailout05.t-online.de (Postfix) with SMTP id 23BE1427DB7D; Mon, 9 Jul 2018 21:46:43 +0200 (CEST) Received: from sweetums.local (Xjw3D6Zc8hBcC6IF1YEEi4ojdTWWax30eTlZqi+aho5yCVyXUW4P0Jzl2f23OAhZH7@[93.230.219.72]) by fwd39.t-online.de with (TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384 encrypted) esmtp id 1fcc7L-0zs0G00; Mon, 9 Jul 2018 21:46:39 +0200 Subject: Re: Repo conversion troubles. To: "Eric S. Raymond" , GCC Development , fallenpegasus@gmail.com References: <20180709191911.648443A4AA7@snark.thyrsus.com> From: Bernd Schmidt Openpgp: preference=signencrypt Message-ID: <309363c2-0a5d-29f8-7c7e-19734e7cfa7a@t-online.de> Date: Mon, 09 Jul 2018 19:46:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0 MIME-Version: 1.0 In-Reply-To: <20180709191911.648443A4AA7@snark.thyrsus.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-IsSubscribed: yes X-SW-Source: 2018-07/txt/msg00137.txt.bz2 On 07/09/2018 09:19 PM, Eric S. 
Raymond wrote: > Last time I did a comparison between SVN head and the git conversion > tip they matched exactly. This time I have mismatches in the following > files. So what are the diffs? Are we talking about small differences (like one change missing) or large-scale mismatches? Bernd From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 28551 invoked by alias); 9 Jul 2018 19:57:26 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 27392 invoked by uid 89); 9 Jul 2018 19:57:25 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=1.6 required=5.0 tests=AWL,BAYES_50,KAM_LAZY_DOMAIN_SECURITY autolearn=no version=3.3.2 spammy=twelve, mismatches, pin, subversion X-HELO: snark.thyrsus.com Received: from thyrsus.com (HELO snark.thyrsus.com) (71.162.243.5) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 09 Jul 2018 19:57:23 +0000 Received: by snark.thyrsus.com (Postfix, from userid 1000) id DE1AB3A4F0E; Mon, 9 Jul 2018 15:57:22 -0400 (EDT) Date: Mon, 09 Jul 2018 19:57:00 -0000 From: "Eric S. Raymond" To: Jeff Law Cc: GCC Development , fallenpegasus@gmail.com Subject: Re: Repo conversion troubles. Message-ID: <20180709195722.GA32057@thyrsus.com> Reply-To: esr@thyrsus.com References: <20180709191911.648443A4AA7@snark.thyrsus.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.4 (2018-02-28) X-IsSubscribed: yes X-SW-Source: 2018-07/txt/msg00139.txt.bz2 Jeff Law : > > There are brute-force ways to pin down such malformations, but none of > > them are practical at the huge scale of this repository. The main > > problem here wouldn't reposurgeon itself but the fact that Subversion > > checkouts on a repo this large are very slow. 
I've seen a single one
> > take 12 hours; an attempt at a whole bisection run to pin down the
> > divergence point on trunk would therefore probably cost log2 of the
> > commit length times that, or about 18 days.
>
> I'm not aware of any such merges, but any that occurred most likely
> happened after mid-April when the trunk was re-opened for development.

I agree it can't have been earlier than that, or I'd have hit this rock sooner. I'd bet on the problem having arisen within the last six weeks.

> I'm assuming that it's only work that merges onto the trunk that's
> potentially problematical here.

Yes. It is possible there are also content mismatches on branches - I haven't run that check yet, it takes an absurd amount of time to complete - but not much point in worrying about that if we can't get trunk right.

I'm pretty certain things were still good at r256000. I've started that check running. Not expecting results in less than twelve hours.

--
Eric S. Raymond

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.
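The bisection estimate quoted above is plain arithmetic, and a small sketch makes it easy to see how much a known-good starting point helps. The function names below are hypothetical, and the "bad" endpoint r262451 is only an assumption standing in for a recent trunk revision (it is a revision number cited later in this thread, not a confirmed bad commit). The per-check cost is a parameter; the thread mentions roughly 12 hours per checkout, and how the 18-day figure was derived is not stated.

```python
def bisection_steps(last_good: int, first_bad: int) -> int:
    """Worst-case number of checks needed to find the first bad revision.

    The first bad revision lies in the half-open range (last_good, first_bad],
    which holds first_bad - last_good candidates. Each check halves the range,
    so the worst case is ceil(log2(candidates)), computed here with integers.
    """
    candidates = first_bad - last_good
    if candidates < 1:
        raise ValueError("first_bad must be greater than last_good")
    return (candidates - 1).bit_length()


def bisection_hours(last_good: int, first_bad: int, hours_per_check: float) -> float:
    """Total worst-case wall-clock hours, assuming every check costs the same."""
    return bisection_steps(last_good, first_bad) * hours_per_check


# Assumed endpoints: r256000 believed good (per the thread), r262451 assumed bad.
steps = bisection_steps(256000, 262451)
hours = bisection_hours(256000, 262451, 12)
print(f"{steps} checks, {hours} hours, {hours / 24} days")
```

Under those assumptions the range holds 6451 candidate revisions, which needs 13 checks, or 156 hours at 12 hours each. Bisecting from revision 0 instead would need 19 checks, so a confirmed-good r256000 saves six checks.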
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 34533 invoked by alias); 9 Jul 2018 19:59:59 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 34524 invoked by uid 89); 9 Jul 2018 19:59:58 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_00,KAM_LAZY_DOMAIN_SECURITY autolearn=no version=3.3.2 spammy=mismatches, tip, largescale, bernd X-HELO: snark.thyrsus.com Received: from thyrsus.com (HELO snark.thyrsus.com) (71.162.243.5) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 09 Jul 2018 19:59:57 +0000 Received: by snark.thyrsus.com (Postfix, from userid 1000) id 293583A4AA7; Mon, 9 Jul 2018 15:59:57 -0400 (EDT) Date: Mon, 09 Jul 2018 19:59:00 -0000 From: "Eric S. Raymond" To: Bernd Schmidt Cc: GCC Development , fallenpegasus@gmail.com Subject: Re: Repo conversion troubles. Message-ID: <20180709195957.GB32057@thyrsus.com> Reply-To: esr@thyrsus.com References: <20180709191911.648443A4AA7@snark.thyrsus.com> <309363c2-0a5d-29f8-7c7e-19734e7cfa7a@t-online.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <309363c2-0a5d-29f8-7c7e-19734e7cfa7a@t-online.de> User-Agent: Mutt/1.9.4 (2018-02-28) X-IsSubscribed: yes X-SW-Source: 2018-07/txt/msg00140.txt.bz2 Bernd Schmidt : > On 07/09/2018 09:19 PM, Eric S. Raymond wrote: > > Last time I did a comparison between SVN head and the git conversion > > tip they matched exactly. This time I have mismatches in the following > > files. > > So what are the diffs? Are we talking about small differences (like one > change missing) or large-scale mismatches? Large-scale, I'm afraid. The context diff is about a GLOC. -- Eric S. 
Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 39117 invoked by alias); 9 Jul 2018 20:01:16 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 38528 invoked by uid 89); 9 Jul 2018 20:00:55 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=0.8 required=5.0 tests=BAYES_50,SPF_HELO_PASS,TIME_LIMIT_EXCEEDED autolearn=unavailable version=3.3.2 spammy=Christmas, christmas, twelve, mismatches X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 09 Jul 2018 20:00:28 +0000 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.24]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 376433084038; Mon, 9 Jul 2018 20:00:27 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-9.rdu2.redhat.com [10.10.112.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id 512FD308BDAC; Mon, 9 Jul 2018 20:00:26 +0000 (UTC) Subject: Re: Repo conversion troubles. 
To: esr@thyrsus.com
Cc: GCC Development , fallenpegasus@gmail.com
References: <20180709191911.648443A4AA7@snark.thyrsus.com> <20180709195722.GA32057@thyrsus.com>
From: Jeff Law
Openpgp: preference=signencrypt
Message-ID: <2a0f8893-b289-daa9-7dc4-513572796722@redhat.com>
Date: Mon, 09 Jul 2018 20:01:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0
MIME-Version: 1.0
In-Reply-To: <20180709195722.GA32057@thyrsus.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
X-SW-Source: 2018-07/txt/msg00141.txt.bz2

On 07/09/2018 01:57 PM, Eric S. Raymond wrote:
> Jeff Law :
>>> There are brute-force ways to pin down such malformations, but none of
>>> them are practical at the huge scale of this repository. The main
>>> problem here wouldn't be reposurgeon itself but the fact that Subversion
>>> checkouts on a repo this large are very slow. I've seen a single one
>>> take 12 hours; an attempt at a whole bisection run to pin down the
>>> divergence point on trunk would therefore probably cost log2 of the
>>> commit length times that, or about 18 days.
>>
>> I'm not aware of any such merges, but any that occurred most likely
>> happened after mid-April when the trunk was re-opened for development.
>
> I agree it can't have been earlier than that, or I'd have hit this rock
> sooner. I'd bet on the problem having arisen within the last six weeks.
>
>> I'm assuming that it's only work that merges onto the trunk that's
>> potentially problematical here.
>
> Yes. It is possible there are also content mismatches on branches - I
> haven't run that check yet, it takes an absurd amount of time to complete
> - but not much point in worrying about that if we can't get trunk right.
>
> I'm pretty certain things were still good at r256000. I've started that
> check running. Not expecting results in less than twelve hours.

r256000 would be roughly Christmas 2017.
I'd be very surprised if any merges to the trunk happened between that point and early April. We're essentially in regression bugfixes only during that timeframe. Not a time for branch->trunk merging :-) jeff From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 50956 invoked by alias); 9 Jul 2018 20:04:26 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 50783 invoked by uid 89); 9 Jul 2018 20:04:25 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=0.2 required=5.0 tests=AWL,BAYES_50,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=examination, mismatches, pin, tip X-HELO: mail-wm0-f49.google.com Received: from mail-wm0-f49.google.com (HELO mail-wm0-f49.google.com) (74.125.82.49) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 09 Jul 2018 20:04:23 +0000 Received: by mail-wm0-f49.google.com with SMTP id z13-v6so21985736wma.5 for ; Mon, 09 Jul 2018 13:04:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:user-agent:in-reply-to:references:mime-version :content-transfer-encoding:subject:to:from:message-id; bh=EG6UtUPRkb2T/5LqmKOduuCkZIllysmKsrnb0Tg12Zk=; b=H9aOxDL+pJsL8mPJkbmg+en/2ncXANCtpch0nHYIUGCtBW0qyEYfp7QaEartEas9cB I1upvydcUcQoBvG8CGpm4Td+5DFMnD+8uZ7BW8+7Klk5CV+BHWnjDxgZdemXHB7diFqg 2mQJ4LVaXu/dPFXMQefDBdOb53YlcGhchdEeug/f6W2pzdDJjGlOBDWTYhWXPojDBWot nErnOaUDfdIZRCl+98ZldTe1QW67PbVFqot1jjhZgilxWPo3lS3yRVhgIue0cky30/1m EXVNOLP1LAJVtmuH3I/iTTLmSPXYfs8cTxEByswf9wzJ7S06iOBUQqO9VslQj36HJxZO LsHw== Return-Path: Received: from [192.168.178.32] (p2E530C99.dip0.t-ipconnect.de. 
[46.83.12.153]) by smtp.gmail.com with ESMTPSA id q70-v6sm27838429wmd.45.2018.07.09.13.04.19 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 09 Jul 2018 13:04:20 -0700 (PDT) Date: Mon, 09 Jul 2018 20:04:00 -0000 User-Agent: K-9 Mail for Android In-Reply-To: <20180709191911.648443A4AA7@snark.thyrsus.com> References: <20180709191911.648443A4AA7@snark.thyrsus.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Subject: Re: Repo conversion troubles. To: gcc@gcc.gnu.org,esr@thyrsus.com,GCC Development ,fallenpegasus@gmail.com From: Richard Biener Message-ID: <770521DA-2D67-407B-9AA8-C5F978364121@gmail.com> X-IsSubscribed: yes X-SW-Source: 2018-07/txt/msg00143.txt.bz2 On July 9, 2018 9:19:11 PM GMT+02:00, esr@thyrsus.com wrote: >Last time I did a comparison between SVN head and the git conversion >tip they matched exactly. This time I have mismatches in the following >files. > >libtool.m4 >libvtv/ChangeLog >libvtv/configure >libvtv/testsuite/lib/libvtv.exp >ltmain.sh >lto-plugin/ChangeLog >lto-plugin/configure >lto-plugin/lto-plugin.c >MAINTAINERS >maintainer-scripts/ChangeLog >maintainer-scripts/crontab >maintainer-scripts/gcc_release >Makefile.def >Makefile.in >Makefile.tpl >zlib/configure >zlib/configure.ac > >Now I'll explain what this means and why it's a serious problem. > >Reposurgeon is never confused by linear history, branching, or >tagging; I have lots of regression tests for those cases. When it >screws up it is invariably around branch copy operations, because >there are cases near those where the data model of Subversion stream >files is underspecified. That model was in fact entirely undocumented >before I reverse-engineered it and wrote the description that now >lives in the Subversion source tree. But that description is not >complete; nobody, not even Subversion's designers, knows how to fill >in all the corner cases. 
>
>Thus, a content mismatch like this means there was some recent branch
>merge to trunk in the gcc history that reposurgeon is not interpreting
>as intended, or more likely an operator error such as a non-Subversion
>directory copy followed by a commit - my analyzer can recover from
>most such cases but not all.
>
>There are brute-force ways to pin down such malformations, but none of
>them are practical at the huge scale of this repository. The main
>problem here wouldn't be reposurgeon itself but the fact that Subversion
>checkouts on a repo this large are very slow. I've seen a single one
>take 12 hours; an attempt at a whole bisection run to pin down the
>divergence point on trunk would therefore probably cost log2 of the
>commit length times that, or about 18 days.

12 hours from remote I guess? The subversion repository is available through rsync so you can create a local mirror to work from (we've been doing that at suse for years).

Richard.

>
>So...does that list of changed files look familiar to anyone? If we can
>identify the revision number of the bad commit, the odds of being able
>to unscramble this mess go way up. They still aren't good, not when
>merely loading the repository for examination takes over four hours,
>but they would be way better than if I were starting from zero.
>
>This is serious. I have produced demonstrably correct history
>conversions of the gcc repo in the past. We may now be in a situation
>where I will never again be able to do that.
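The quoted message describes comparing an SVN head checkout against the git conversion tip and listing the files that differ. The thread does not say how that comparison was actually run; the following is only a minimal sketch of one way to do it, assuming both trees are already checked out locally. The function names and the choice of SHA-256 are illustrative assumptions, not reposurgeon's method.

```python
import hashlib
from pathlib import Path

# Version-control metadata directories that should not take part in the comparison.
SKIP_DIRS = {".svn", ".git"}


def tree_digests(root):
    """Map each file's path (relative to root, POSIX style) to its SHA-256 digest."""
    root = Path(root)
    digests = {}
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        if SKIP_DIRS.intersection(rel.parts):
            continue  # skip anything inside .svn or .git
        if path.is_file():
            digests[rel.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def mismatched_files(svn_checkout, git_checkout):
    """Return sorted relative paths whose content differs or that exist on one side only."""
    svn = tree_digests(svn_checkout)
    git = tree_digests(git_checkout)
    return sorted(p for p in svn.keys() | git.keys() if svn.get(p) != git.get(p))
```

Calling `mismatched_files("/path/to/svn-trunk", "/path/to/git-tip")` (both paths hypothetical) would yield a list in the same shape as the one posted at the top of the thread. Reading whole files into memory keeps the sketch short; a tree the size of gcc's would probably want chunked hashing.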
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 55148 invoked by alias); 9 Jul 2018 20:06:19 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 55137 invoked by uid 89); 9 Jul 2018 20:06:19 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=0.9 required=5.0 tests=AWL,BAYES_20,KAM_LAZY_DOMAIN_SECURITY autolearn=no version=3.3.2 spammy=Christmas, christmas, twelve, merges X-HELO: snark.thyrsus.com Received: from thyrsus.com (HELO snark.thyrsus.com) (71.162.243.5) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 09 Jul 2018 20:06:17 +0000 Received: by snark.thyrsus.com (Postfix, from userid 1000) id 327283A4AA7; Mon, 9 Jul 2018 16:06:17 -0400 (EDT) Date: Mon, 09 Jul 2018 20:06:00 -0000 From: "Eric S. Raymond" To: Jeff Law Cc: GCC Development , fallenpegasus@gmail.com Subject: Re: Repo conversion troubles. Message-ID: <20180709200617.GD32057@thyrsus.com> Reply-To: esr@thyrsus.com References: <20180709191911.648443A4AA7@snark.thyrsus.com> <20180709195722.GA32057@thyrsus.com> <2a0f8893-b289-daa9-7dc4-513572796722@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <2a0f8893-b289-daa9-7dc4-513572796722@redhat.com> User-Agent: Mutt/1.9.4 (2018-02-28) X-IsSubscribed: yes X-SW-Source: 2018-07/txt/msg00144.txt.bz2 Jeff Law : > > I'm pretty certain things were still good at r256000. I've started that > > check running. Not expecting results in less than twelve hours. > r256000 would be roughly Christmas 2017. I'd be very surprised if any > merges to the trunk happened between that point and early April. We're > essentially in regression bugfixes only during that timeframe. Not a > time for branch->trunk merging :-) Thanks, that's useful to know. 
That means if the r256000 check passes I can jump forward to 1 Apr reasonably expecting that one to pass too. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 16111 invoked by alias); 9 Jul 2018 20:20:41 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 16101 invoked by uid 89); 9 Jul 2018 20:20:40 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_00,KAM_LAZY_DOMAIN_SECURITY autolearn=no version=3.3.2 spammy=H*f:sk:770521D, H*i:sk:770521D, his, visit X-HELO: snark.thyrsus.com Received: from thyrsus.com (HELO snark.thyrsus.com) (71.162.243.5) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 09 Jul 2018 20:20:39 +0000 Received: by snark.thyrsus.com (Postfix, from userid 1000) id 198223A4AA7; Mon, 9 Jul 2018 16:20:39 -0400 (EDT) Date: Mon, 09 Jul 2018 20:20:00 -0000 From: "Eric S. Raymond" To: Richard Biener Cc: gcc@gcc.gnu.org, fallenpegasus@gmail.com Subject: Re: Repo conversion troubles. Message-ID: <20180709202039.GA1897@thyrsus.com> Reply-To: esr@thyrsus.com References: <20180709191911.648443A4AA7@snark.thyrsus.com> <770521DA-2D67-407B-9AA8-C5F978364121@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <770521DA-2D67-407B-9AA8-C5F978364121@gmail.com> User-Agent: Mutt/1.9.4 (2018-02-28) X-IsSubscribed: yes X-SW-Source: 2018-07/txt/msg00145.txt.bz2 Richard Biener : > 12 hours from remote I guess? The subversion repository is available through rsync so you can create a local mirror to work from (we've been doing that at suse for years) I'm saying I see rsync plus local checkout take 10-12 hours. 
I asked Jason about this and his response was basically "Well...we don't do that often."

You probably never see this case. Update from a remote is much faster.

I'm trying to do a manual correctness check via update to commit 256000 now.

--
Eric S. Raymond

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 37711 invoked by alias); 10 Jul 2018 01:13:56 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 37694 invoked by uid 89); 10 Jul 2018 01:13:56 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=0.8 required=5.0 tests=BAYES_50,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=Christmas, christmas, twelve, HCc:D*t-online.de
X-HELO: linux-libre.fsfla.org
Received: from linux-libre.fsfla.org (HELO linux-libre.fsfla.org) (208.118.235.54) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 10 Jul 2018 01:13:54 +0000
Received: from free.home (home.lxoliva.fsfla.org [172.31.160.164]) by linux-libre.fsfla.org (8.15.2/8.15.2/Debian-3) with ESMTP id w6A1DiQo021490; Tue, 10 Jul 2018 01:13:50 GMT
Received: from livre (livre.home [172.31.160.2]) by free.home (8.15.2/8.15.2) with ESMTP id w6A1Dbep098644; Mon, 9 Jul 2018 22:13:37 -0300
From: Alexandre Oliva
To: "Eric S. Raymond"
Cc: Bernd Schmidt , GCC Development , fallenpegasus@gmail.com
Subject: Re: Repo conversion troubles.
References: <20180709191911.648443A4AA7@snark.thyrsus.com> <309363c2-0a5d-29f8-7c7e-19734e7cfa7a@t-online.de> <20180709195957.GB32057@thyrsus.com>
Date: Tue, 10 Jul 2018 01:13:00 -0000
In-Reply-To: <20180709195957.GB32057@thyrsus.com> (Eric S.
Raymond's message of "Mon, 9 Jul 2018 15:59:57 -0400") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-IsSubscribed: yes X-SW-Source: 2018-07/txt/msg00146.txt.bz2 On Jul 9, 2018, Jeff Law wrote: > On 07/09/2018 01:57 PM, Eric S. Raymond wrote: >> Jeff Law : >>> I'm not aware of any such merges, but any that occurred most likely >>> happened after mid-April when the trunk was re-opened for development. >> I'm pretty certain things were still good at r256000. I've started that >> check running. Not expecting results in less than twelve hours. > r256000 would be roughly Christmas 2017. When was the RAID/LVM disk corruption incident? Could it possibly have left any of our svn repo metadata in a corrupted way that confuses reposurgeon, and that leads to such huge differences? On Jul 9, 2018, "Eric S. Raymond" wrote: > Bernd Schmidt : >> So what are the diffs? Are we talking about small differences (like one >> change missing) or large-scale mismatches? > Large-scale, I'm afraid. The context diff is about a GLOC. -- Alexandre Oliva, freedom fighter https://FSFLA.org/blogs/lxo Be the change, be Free! 
FSF Latin America board member GNU Toolchain Engineer Free Software Evangelist From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 21483 invoked by alias); 10 Jul 2018 04:57:25 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 21466 invoked by uid 89); 10 Jul 2018 04:57:23 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=256000, thids, 10-12, H*f:sk:770521D X-HELO: mail-wr1-f42.google.com Received: from mail-wr1-f42.google.com (HELO mail-wr1-f42.google.com) (209.85.221.42) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 10 Jul 2018 04:57:21 +0000 Received: by mail-wr1-f42.google.com with SMTP id j5-v6so6561396wrr.8 for ; Mon, 09 Jul 2018 21:57:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:user-agent:in-reply-to:references:mime-version :content-transfer-encoding:subject:to:cc:from:message-id; bh=ohNCIGUzbbSYiyn4Xdh8N5MjA1fivlp5SCU/kb/kTDc=; b=YvhrVgeMItW5bI1lyRNbLPzpnePj0wOqEAfc9W5ZtZ8yJG3Rq5DQg4bVDKt+rOk0QQ E/qQ2LvkOL/MOzGZDv7am7h2d/BB3eFebNGcKv0kI0vNiJg2PrCkQmXJOJX5eBdg1QQa 1RqJNnaZEi3inZ7I9zvsOvBP5lUDvWU1oXn2ZOsW6wCm6x7avqtSz/sj6WgMwoUPDjdk gfSvaA6Otlek3ISO8SIE9EkFmKXkU/jUQNJRVIcEBxdCXsNQhl4uqWzOwq4Qrpds/tPs 50YmIoSYONxS8NpkLN4af0mHyamE+9k76P5fGiOD3GW3dqyT/Dkv17pLezazfqxeaLNl mTBw== Return-Path: Received: from [192.168.178.32] (p2E530C99.dip0.t-ipconnect.de. 
[46.83.12.153]) by smtp.gmail.com with ESMTPSA id u4-v6sm13423282wmc.1.2018.07.09.21.57.18 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 09 Jul 2018 21:57:18 -0700 (PDT)
Date: Tue, 10 Jul 2018 04:57:00 -0000
User-Agent: K-9 Mail for Android
In-Reply-To: <20180709202039.GA1897@thyrsus.com>
References: <20180709191911.648443A4AA7@snark.thyrsus.com> <770521DA-2D67-407B-9AA8-C5F978364121@gmail.com> <20180709202039.GA1897@thyrsus.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: Repo conversion troubles.
To: esr@thyrsus.com,"Eric S. Raymond"
CC: gcc@gcc.gnu.org,fallenpegasus@gmail.com
From: Richard Biener
Message-ID: <2A556E5E-4278-4461-93C5-BBE1B5CA43DE@gmail.com>
X-IsSubscribed: yes
X-SW-Source: 2018-07/txt/msg00149.txt.bz2

On July 9, 2018 10:20:39 PM GMT+02:00, "Eric S. Raymond" wrote:
>Richard Biener :
>> 12 hours from remote I guess? The subversion repository is available
>> through rsync so you can create a local mirror to work from (we've been
>> doing that at suse for years)
>
>I'm saying I see rsync plus local checkout take 10-12 hours.

For a fresh rsync I can guess that's true. But it works incrementally just fine and quickly for me...

>I asked Jason about this and his response was basically "Well...we don't
>do that often."
>
>You probably never see this case. Update from a remote is much
>faster.
>
>I'm trying to do a manual correctness check via update to commit 256000
>now.
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 130497 invoked by alias); 10 Jul 2018 08:20:19 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 130481 invoked by uid 89); 10 Jul 2018 08:20:19 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=Hx-languages-length:1223 X-HELO: mail-io0-f182.google.com Received: from mail-io0-f182.google.com (HELO mail-io0-f182.google.com) (209.85.223.182) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 10 Jul 2018 08:20:12 +0000 Received: by mail-io0-f182.google.com with SMTP id l14-v6so9597349iob.7 for ; Tue, 10 Jul 2018 01:20:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=G2ZqEcPQwW8dP3GpodIoD5UlG1SVlPRK7rmS+59om1I=; b=jyk6vS8yxOMClbXPRotc7iMAio+zhnkCfrbu13fpdgkldOCkVi7HeDJW8Et56U3863 623DAuoTaX7X3AlSFPYcol5047D7dQt0Sb+QLMXcmProDOY71oAUG4kPIcdeQ9JAYBra SoKf/87eb8ZyK2LPp5DOKhvdafIPErIT5OjUYCN7WByAbUtcS6R0e7DWjWtNfFNYwnF3 p8/Be/VN26DYcod1arryExKnui/CwRdOWwlo15iFlfO4g5+LX/jdh8Kpn0gsgNKNW62U eMSkgv6dqxuJzBSB+IHLPICMqnigkOGIgRmrWj3VF5lPM838JAYDW8qIC4IWCW8KjXK5 DyRA== MIME-Version: 1.0 References: <20180709191911.648443A4AA7@snark.thyrsus.com> <309363c2-0a5d-29f8-7c7e-19734e7cfa7a@t-online.de> <20180709195957.GB32057@thyrsus.com> In-Reply-To: <20180709195957.GB32057@thyrsus.com> From: Jonathan Wakely Date: Tue, 10 Jul 2018 08:20:00 -0000 Message-ID: Subject: Re: Repo conversion troubles. 
To: Eric Raymond Cc: Bernd Schmidt , "gcc@gcc.gnu.org" , fallenpegasus@gmail.com Content-Type: text/plain; charset="UTF-8" X-IsSubscribed: yes X-SW-Source: 2018-07/txt/msg00155.txt.bz2

On Mon, 9 Jul 2018 at 21:00, Eric S. Raymond wrote:
>
> Bernd Schmidt :
> > On 07/09/2018 09:19 PM, Eric S. Raymond wrote:
> > > Last time I did a comparison between SVN head and the git conversion
> > > tip they matched exactly. This time I have mismatches in the following
> > > files.
> >
> > So what are the diffs? Are we talking about small differences (like one
> > change missing) or large-scale mismatches?
>
> Large-scale, I'm afraid. The context diff is about a GLOC.

I don't see how that's possible. Most of those files are tiny, or
change very rarely, so I don't see how that large a diff can happen.

Take zlib/configure.ac and zlib/configure, there's only been one
change in the past 18 months: https://gcc.gnu.org/r261739
That change didn't touch the other files in the list.

libtool.m4 has one change in the past 2 years (just a few days ago):
https://gcc.gnu.org/r262451
That was also tiny, and didn't touch the other files.

maintainer-scripts/crontab only has one change in the past 6 months:
https://gcc.gnu.org/r259637
That was a tiny change, and didn't touch any other files.

None of those were merges from any other branch.
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 102002 invoked by alias); 10 Jul 2018 08:34:24 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 101954 invoked by uid 89); 10 Jul 2018 08:34:23 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=H*i:sk:CAH6eHd, jwakelygccgmailcom, jwakely.gcc@gmail.com, U*jwakely.gcc X-HELO: mail-it0-f42.google.com Received: from mail-it0-f42.google.com (HELO mail-it0-f42.google.com) (209.85.214.42) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 10 Jul 2018 08:34:20 +0000 Received: by mail-it0-f42.google.com with SMTP id w16-v6so7940087ita.0 for ; Tue, 10 Jul 2018 01:34:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=4PdbA6/HcmwniDDffoHig7UswRa2zeqNhXTx7I5VfMA=; b=rP1wYaPs/p4WnTTHs9P2KzKcHSBkk5ItsYrAM0vGhsJoUy5DlRBzQtfKmQejJ9yXAQ Gx52b5MAB7nGOn4n4IdMftigemk+mQ/8iP5B97fHJyApCCCmZc8Wn772OdZi8w7bx+2l Hb7UpJj9evV/MXAw/obWYfwxMtAHLmy+LaG8vGMSGDyk/pME6Xg0z9aIBgm/GhWdafyl miypvXzgkT1mli3ylN1SZ80W7u9E8oLsI0u8ElgtlGyj8aKovoF9Sq76HoNqXg4lMbvW /WVxw699CcpciZzXVtsuIvQLJTub5iHCKixlnv1QldL+fOrNFXJWNRlUvl6PxUL4bOYi naJw== MIME-Version: 1.0 References: <20180709191911.648443A4AA7@snark.thyrsus.com> <309363c2-0a5d-29f8-7c7e-19734e7cfa7a@t-online.de> <20180709195957.GB32057@thyrsus.com> In-Reply-To: From: Jonathan Wakely Date: Tue, 10 Jul 2018 08:34:00 -0000 Message-ID: Subject: Re: Repo conversion troubles. 
To: Eric Raymond Cc: Bernd Schmidt , "gcc@gcc.gnu.org" , fallenpegasus@gmail.com Content-Type: text/plain; charset="UTF-8" X-IsSubscribed: yes X-SW-Source: 2018-07/txt/msg00156.txt.bz2

On Tue, 10 Jul 2018 at 09:19, Jonathan Wakely wrote:
>
> On Mon, 9 Jul 2018 at 21:00, Eric S. Raymond wrote:
> >
> > Bernd Schmidt :
> > > On 07/09/2018 09:19 PM, Eric S. Raymond wrote:
> > > > Last time I did a comparison between SVN head and the git conversion
> > > > tip they matched exactly. This time I have mismatches in the following
> > > > files.
> > >
> > > So what are the diffs? Are we talking about small differences (like one
> > > change missing) or large-scale mismatches?
> >
> > Large-scale, I'm afraid. The context diff is about a GLOC.
>
> I don't see how that's possible. Most of those files are tiny, or
> change very rarely, so I don't see how that large a diff can happen.
>
> Take zlib/configure.ac and zlib/configure, there's only been one
> change in the past 18 months: https://gcc.gnu.org/r261739
> That change didn't touch the other files in the list.
>
> libtool.m4 has one change in the past 2 years (just a few days ago):
> https://gcc.gnu.org/r262451
> That was also tiny, and didn't touch the other files.
>
> maintainer-scripts/crontab only has one change in the past 6 months:
> https://gcc.gnu.org/r259637
> That was a tiny change, and didn't touch any other files.
>
> None of those were merges from any other branch.

libtool.m4
ltmain.sh

Changed by https://gcc.gnu.org/r262451

libvtv/ChangeLog
libvtv/configure
libvtv/testsuite/lib/libvtv.exp

Changed by https://gcc.gnu.org/r257809 https://gcc.gnu.org/r259462
https://gcc.gnu.org/r259487 https://gcc.gnu.org/r259837
https://gcc.gnu.org/r259838 (but mostly one line changes).

lto-plugin/ChangeLog
lto-plugin/configure
lto-plugin/lto-plugin.c

Changed by https://gcc.gnu.org/r259462 and https://gcc.gnu.org/r260960

MAINTAINERS

This file sees a fair bit of churn, but all one line changes.
https://gcc.gnu.org/viewcvs/gcc/trunk/MAINTAINERS?view=log

maintainer-scripts/ChangeLog
maintainer-scripts/crontab
maintainer-scripts/gcc_release

Changed by https://gcc.gnu.org/r257045 and https://gcc.gnu.org/r259637
and https://gcc.gnu.org/r259881

Makefile.def
Makefile.in
Makefile.tpl

Changed by https://gcc.gnu.org/r261717 (which didn't touch any other
files) but also by some large changes, which might have been merges:
https://gcc.gnu.org/r255195 (large removal of feature)
https://gcc.gnu.org/r259669 https://gcc.gnu.org/r259755
https://gcc.gnu.org/r261304 (another large feature removal)
https://gcc.gnu.org/r262267

zlib/configure
zlib/configure.ac

Changed by https://gcc.gnu.org/r261739

There's no single change that touched all of them. Not even two or
three changes that seem to have anything in common, except for
autoconf regeneration, which happens frequently throughout GCC's
history.

From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 18670 invoked by alias); 10 Jul 2018 10:48:30 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 11426 invoked by uid 89); 10 Jul 2018 10:47:24 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_00,KAM_LAZY_DOMAIN_SECURITY,KAM_SHORT autolearn=no version=3.3.2 spammy=jwakelygccgmailcom, jwakely.gcc@gmail.com, H*f:CAH6eHdQnaMbJ8, H*i:sk:322wLT5 X-HELO: snark.thyrsus.com Received: from thyrsus.com (HELO snark.thyrsus.com) (71.162.243.5) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 10 Jul 2018 10:47:22 +0000 Received: by snark.thyrsus.com (Postfix, from userid 1000) id 8D21D3A4AA7; Tue, 10 Jul 2018 06:47:21 -0400 (EDT) Date: Tue, 10 Jul 2018 10:48:00 -0000 From: "Eric S. Raymond" To: Jonathan Wakely Cc: Bernd Schmidt , "gcc@gcc.gnu.org" , fallenpegasus@gmail.com Subject: Re: Repo conversion troubles.
Message-ID: <20180710104721.GA12256@thyrsus.com> Reply-To: esr@thyrsus.com References: <20180709191911.648443A4AA7@snark.thyrsus.com> <309363c2-0a5d-29f8-7c7e-19734e7cfa7a@t-online.de> <20180709195957.GB32057@thyrsus.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.4 (2018-02-28) X-IsSubscribed: yes X-SW-Source: 2018-07/txt/msg00158.txt.bz2

Jonathan Wakely :
> On Tue, 10 Jul 2018 at 09:19, Jonathan Wakely wrote:
> >
> > On Mon, 9 Jul 2018 at 21:00, Eric S. Raymond wrote:
> > >
> > > Bernd Schmidt :
> > > > On 07/09/2018 09:19 PM, Eric S. Raymond wrote:
> > > > > Last time I did a comparison between SVN head and the git conversion
> > > > > tip they matched exactly. This time I have mismatches in the following
> > > > > files.
> > > >
> > > > So what are the diffs? Are we talking about small differences (like one
> > > > change missing) or large-scale mismatches?
> > >
> > > Large-scale, I'm afraid. The context diff is about a GLOC.
> >
> > I don't see how that's possible. Most of those files are tiny, or
> > change very rarely, so I don't see how that large a diff can happen.
> >
> > Take zlib/configure.ac and zlib/configure, there's only been one
> > change in the past 18 months: https://gcc.gnu.org/r261739
> > That change didn't touch the other files in the list.
> >
> > libtool.m4 has one change in the past 2 years (just a few days ago):
> > https://gcc.gnu.org/r262451
> > That was also tiny, and didn't touch the other files.
> >
> > maintainer-scripts/crontab only has one change in the past 6 months:
> > https://gcc.gnu.org/r259637
> > That was a tiny change, and didn't touch any other files.
> >
> > None of those were merges from any other branch.
>
> libtool.m4
> ltmain.sh
>
> Changed by https://gcc.gnu.org/r262451
>
> libvtv/ChangeLog
> libvtv/configure
> libvtv/testsuite/lib/libvtv.exp
>
> Changed by https://gcc.gnu.org/r257809 https://gcc.gnu.org/r259462
> https://gcc.gnu.org/r259487 https://gcc.gnu.org/r259837
> https://gcc.gnu.org/r259838 (but mostly one line changes).
>
> lto-plugin/ChangeLog
> lto-plugin/configure
> lto-plugin/lto-plugin.c
>
> Changed by https://gcc.gnu.org/r259462 and https://gcc.gnu.org/r260960
>
> MAINTAINERS
>
> This file sees a fair bit of churn, but all one line changes.
> https://gcc.gnu.org/viewcvs/gcc/trunk/MAINTAINERS?view=log
>
> maintainer-scripts/ChangeLog
> maintainer-scripts/crontab
> maintainer-scripts/gcc_release
>
> Changed by https://gcc.gnu.org/r257045 and https://gcc.gnu.org/r259637
> and https://gcc.gnu.org/r259881
>
> Makefile.def
> Makefile.in
> Makefile.tpl
>
> Changed by https://gcc.gnu.org/r261717 (which didn't touch any other
> files) but also by some large changes, which might have been merges:
> https://gcc.gnu.org/r255195 (large removal of feature)
> https://gcc.gnu.org/r259669 https://gcc.gnu.org/r259755
> https://gcc.gnu.org/r261304 (another large feature removal)
> https://gcc.gnu.org/r262267
>
> zlib/configure
> zlib/configure.ac
>
> Changed by https://gcc.gnu.org/r261739
>
> There's no single change that touched all of them. Not even two or
> three changes that seem to have anything in common, except for
> autoconf regeneration, which happens frequently throughout GCC's
> history.

I don't know what's going on either, yet. I'm trying to identify the
earliest point of content mismatch now.

Thanks for all this data. It may help a lot.
-- 
Eric S. Raymond

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.
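
The per-group revision lists quoted above can be cross-checked mechanically. What follows is only a minimal sketch, not anything reposurgeon or the GCC tooling does: the group labels, the TOUCHED_BY table, and the helper names are made up for illustration, and the revision numbers are just the ones quoted in this thread (MAINTAINERS is left out because only a log URL was given for it, and the lists may not be exhaustive).

```python
from collections import defaultdict

# Revisions quoted in the thread for each group of mismatched files.
# Group labels are illustrative; MAINTAINERS is omitted because the
# thread gives only a log URL for it, not a revision list.
TOUCHED_BY = {
    "libtool": {262451},
    "libvtv": {257809, 259462, 259487, 259837, 259838},
    "lto-plugin": {259462, 260960},
    "maintainer-scripts": {257045, 259637, 259881},
    "Makefile": {255195, 259669, 259755, 261304, 261717, 262267},
    "zlib": {261739},
}


def common_to_all(touched_by):
    """Return the revisions that appear in every group's list."""
    return set.intersection(*touched_by.values())


def shared_revisions(touched_by):
    """Map each revision listed for more than one group to those groups."""
    groups_by_rev = defaultdict(set)
    for group, revs in touched_by.items():
        for rev in revs:
            groups_by_rev[rev].add(group)
    return {
        rev: sorted(groups)
        for rev, groups in groups_by_rev.items()
        if len(groups) > 1
    }


if __name__ == "__main__":
    print("common to all groups:", sorted(common_to_all(TOUCHED_BY)))
    print("listed for several groups:", shared_revisions(TOUCHED_BY))
```

On the numbers quoted here, no revision is common to every group, and r259462 is the only one listed for more than one group (libvtv and lto-plugin). That matches the observation that no single change touched all of the mismatched files.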
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 92644 invoked by alias); 10 Jul 2018 11:22:04 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 86006 invoked by uid 89); 10 Jul 2018 11:20:19 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-0.4 required=5.0 tests=BAYES_00,KAM_COUK,RCVD_IN_DNSWL_NONE,SPF_SOFTFAIL autolearn=no version=3.3.2 spammy=average, WiFi, fancy, wifi X-HELO: know-smtprelay-omd-3.server.virginmedia.net Received: from know-smtprelay-omd-3.server.virginmedia.net (HELO know-smtprelay-omd-3.server.virginmedia.net) (81.104.62.35) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 10 Jul 2018 11:20:17 +0000 Received: from localhost ([86.9.48.152]) by cmsmtp with ESMTPA id cqgnft78WbtAccqgofGsXO; Tue, 10 Jul 2018 12:20:15 +0100 From: Philip Martin To: "Eric S. Raymond" Cc: Richard Biener , gcc@gcc.gnu.org, fallenpegasus@gmail.com Subject: Re: Repo conversion troubles. References: <20180709191911.648443A4AA7@snark.thyrsus.com> <770521DA-2D67-407B-9AA8-C5F978364121@gmail.com> <20180709202039.GA1897@thyrsus.com> Date: Tue, 10 Jul 2018 11:22:00 -0000 In-Reply-To: <20180709202039.GA1897@thyrsus.com> (Eric S. Raymond's message of "Mon, 9 Jul 2018 16:20:39 -0400") Message-ID: <87zhyznxqq.fsf@codematters.co.uk> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-IsSubscribed: yes X-SW-Source: 2018-07/txt/msg00160.txt.bz2 "Eric S. Raymond" writes: > I'm saying I see rsync plus local checkout take 10-12 hours. The rsync is a one-off cost. Once you have the repository locally you can checkout any individual revision much more quickly. I have a local copy of the gcc repository and a checkout of gcc trunk from localhost takes about 40 seconds. I'm not using fancy hardware. 
I can even check it out across my very average WiFi in just over 60 seconds. -- Philip From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 48998 invoked by alias); 20 Jul 2018 21:36:13 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 48580 invoked by uid 89); 20 Jul 2018 21:36:12 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE,SPF_PASS,URIBL_RED autolearn=ham version=3.3.2 spammy=Hx-languages-length:730, his X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 20 Jul 2018 21:36:11 +0000 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-MBX-03.mgc.mentorg.com) by relay1.mentorg.com with esmtps (TLSv1.2:ECDHE-RSA-AES256-SHA384:256) id 1fgd4L-0006Wa-P5 from joseph_myers@mentor.com ; Fri, 20 Jul 2018 14:36:09 -0700 Received: from digraph.polyomino.org.uk (137.202.0.87) by SVR-IES-MBX-03.mgc.mentorg.com (139.181.222.3) with Microsoft SMTP Server (TLS) id 15.0.1320.4; Fri, 20 Jul 2018 22:36:06 +0100 Received: from jsm28 (helo=localhost) by digraph.polyomino.org.uk with local-esmtp (Exim 4.86_2) (envelope-from ) id 1fgd4I-0000T4-0R; Fri, 20 Jul 2018 21:36:06 +0000 Date: Fri, 20 Jul 2018 21:43:00 -0000 From: Joseph Myers To: "Eric S. Raymond" CC: Richard Biener , , Subject: Re: Repo conversion troubles. In-Reply-To: <20180709202039.GA1897@thyrsus.com> Message-ID: References: <20180709191911.648443A4AA7@snark.thyrsus.com> <770521DA-2D67-407B-9AA8-C5F978364121@gmail.com> <20180709202039.GA1897@thyrsus.com> User-Agent: Alpine 2.20 (DEB 67 2015-01-07) MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" X-SW-Source: 2018-07/txt/msg00314.txt.bz2 On Mon, 9 Jul 2018, Eric S. 
Raymond wrote: > Richard Biener : > > 12 hours from remote I guess? The subversion repository is available through rsync so you can create a local mirror to work from (we've been doing that at suse for years) > > I'm saying I see rsync plus local checkout take 10-12 hours. I asked Jason > about this and his response was basically "Well...we don't do that often." Isn't that a local checkout *of top-level of the repository*, i.e. checking out all branches and tags? Which is indeed something developers would never normally do - they'd just check out the particular branches they're working on. -- Joseph S. Myers joseph@codesourcery.com From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 70186 invoked by alias); 20 Jul 2018 21:43:50 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 70176 invoked by uid 89); 20 Jul 2018 21:43:49 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE,SPF_PASS,URIBL_RED autolearn=ham version=3.3.2 spammy=christmas, Christmas, raid, restoration X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 20 Jul 2018 21:43:48 +0000 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-MBX-03.mgc.mentorg.com) by relay1.mentorg.com with esmtps (TLSv1.2:ECDHE-RSA-AES256-SHA384:256) id 1fgdBh-0000C2-MX from joseph_myers@mentor.com ; Fri, 20 Jul 2018 14:43:45 -0700 Received: from digraph.polyomino.org.uk (137.202.0.87) by SVR-IES-MBX-03.mgc.mentorg.com (139.181.222.3) with Microsoft SMTP Server (TLS) id 15.0.1320.4; Fri, 20 Jul 2018 22:43:42 +0100 Received: from jsm28 (helo=localhost) by digraph.polyomino.org.uk with local-esmtp (Exim 4.86_2) (envelope-from ) id 1fgdBd-0000a0-IH; Fri, 20 Jul 2018 21:43:41 
+0000 Date: Fri, 20 Jul 2018 21:48:00 -0000 From: Joseph Myers To: Alexandre Oliva CC: "Eric S. Raymond" , Bernd Schmidt , GCC Development , Subject: Re: Repo conversion troubles. In-Reply-To: Message-ID: References: <20180709191911.648443A4AA7@snark.thyrsus.com> <309363c2-0a5d-29f8-7c7e-19734e7cfa7a@t-online.de> <20180709195957.GB32057@thyrsus.com> User-Agent: Alpine 2.20 (DEB 67 2015-01-07) MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" X-SW-Source: 2018-07/txt/msg00315.txt.bz2 On Mon, 9 Jul 2018, Alexandre Oliva wrote: > On Jul 9, 2018, Jeff Law wrote: > > > On 07/09/2018 01:57 PM, Eric S. Raymond wrote: > >> Jeff Law : > >>> I'm not aware of any such merges, but any that occurred most likely > >>> happened after mid-April when the trunk was re-opened for development. > > >> I'm pretty certain things were still good at r256000. I've started that > >> check running. Not expecting results in less than twelve hours. > > > r256000 would be roughly Christmas 2017. > > When was the RAID/LVM disk corruption incident? Could it possibly have > left any of our svn repo metadata in a corrupted way that confuses > reposurgeon, and that leads to such huge differences? That was 14/15 Aug 2017, and all the SVN revision data up to r251080 were restored from backup within 24 hours or so. I found no signs of damage to revisions from the 24 hours or so between r251080 and the time of the corruption when I examined diffs for all those revisions by hand at that time. (If anyone rsynced corrupted old revisions from the repository during the window of corruption, those corrupted old revisions might remain in their rsynced repository copy because the restoration preserved file times and size, just fixing corrupted contents.) -- Joseph S. 
Myers joseph@codesourcery.com From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 86223 invoked by alias); 20 Jul 2018 21:48:53 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 86140 invoked by uid 89); 20 Jul 2018 21:48:52 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE,SPF_PASS,URIBL_RED autolearn=ham version=3.3.2 spammy=Hx-languages-length:491, *complete X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 20 Jul 2018 21:48:51 +0000 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-MBX-03.mgc.mentorg.com) by relay1.mentorg.com with esmtps (TLSv1.2:ECDHE-RSA-AES256-SHA384:256) id 1fgdGc-00013g-9z from joseph_myers@mentor.com ; Fri, 20 Jul 2018 14:48:50 -0700 Received: from digraph.polyomino.org.uk (137.202.0.87) by SVR-IES-MBX-03.mgc.mentorg.com (139.181.222.3) with Microsoft SMTP Server (TLS) id 15.0.1320.4; Fri, 20 Jul 2018 22:48:46 +0100 Received: from jsm28 (helo=localhost) by digraph.polyomino.org.uk with local-esmtp (Exim 4.86_2) (envelope-from ) id 1fgdGX-0000dE-Ou; Fri, 20 Jul 2018 21:48:45 +0000 Date: Fri, 20 Jul 2018 22:06:00 -0000 From: Joseph Myers To: Jonathan Wakely CC: Eric Raymond , Bernd Schmidt , "gcc@gcc.gnu.org" , Subject: Re: Repo conversion troubles. In-Reply-To: Message-ID: References: <20180709191911.648443A4AA7@snark.thyrsus.com> <309363c2-0a5d-29f8-7c7e-19734e7cfa7a@t-online.de> <20180709195957.GB32057@thyrsus.com> User-Agent: Alpine 2.20 (DEB 67 2015-01-07) MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" X-SW-Source: 2018-07/txt/msg00316.txt.bz2 On Tue, 10 Jul 2018, Jonathan Wakely wrote: > > Large-scale, I'm afraid. The context diff is about a GLOC. 
> > I don't see how that's possible. Most of those files are tiny, or > change very rarely, so I don't see how that large a diff can happen. Concretely, the *complete GCC source tree* (trunk, that is) is under 1 GB. A complete diff generating the whole source tree from nothing would only be about 15 MLOC. -- Joseph S. Myers joseph@codesourcery.com From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 35763 invoked by alias); 20 Jul 2018 23:47:53 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 35748 invoked by uid 89); 20 Jul 2018 23:47:52 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_00,KAM_LAZY_DOMAIN_SECURITY,URIBL_RED autolearn=no version=3.3.2 spammy=Engineering, icei.org, esr, UD:catb.org X-HELO: snark.thyrsus.com Received: from thyrsus.com (HELO snark.thyrsus.com) (71.162.243.5) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 20 Jul 2018 23:47:51 +0000 Received: by snark.thyrsus.com (Postfix, from userid 1000) id 190833A4AA7; Fri, 20 Jul 2018 19:47:51 -0400 (EDT) Date: Fri, 20 Jul 2018 23:48:00 -0000 From: "Eric S. Raymond" To: Joseph Myers Cc: Richard Biener , gcc@gcc.gnu.org, fallenpegasus@gmail.com Subject: Re: Repo conversion troubles. Message-ID: <20180720234751.GB3840@thyrsus.com> Reply-To: esr@thyrsus.com References: <20180709191911.648443A4AA7@snark.thyrsus.com> <770521DA-2D67-407B-9AA8-C5F978364121@gmail.com> <20180709202039.GA1897@thyrsus.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.4 (2018-02-28) X-IsSubscribed: yes X-SW-Source: 2018-07/txt/msg00321.txt.bz2 Joseph Myers : > On Mon, 9 Jul 2018, Eric S. Raymond wrote: > > > Richard Biener : > > > 12 hours from remote I guess? 
The subversion repository is available through rsync so you can create a local mirror to work from (we've been doing that at suse for years) > > > > I'm saying I see rsync plus local checkout take 10-12 hours. I asked Jason > > about this and his response was basically "Well...we don't do that often." > > Isn't that a local checkout *of top-level of the repository*, i.e. > checking out all branches and tags? Which is indeed something developers > would never normally do - they'd just check out the particular branches > they're working on. It is. I have to check out all tags and branches to validate the conversion. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 39442 invoked by alias); 20 Jul 2018 23:48:51 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 39431 invoked by uid 89); 20 Jul 2018 23:48:51 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_00,KAM_LAZY_DOMAIN_SECURITY,URIBL_RED autolearn=no version=3.3.2 spammy=Engineering, Hx-languages-length:1429, wwwcatborg, icei.org X-HELO: snark.thyrsus.com Received: from thyrsus.com (HELO snark.thyrsus.com) (71.162.243.5) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 20 Jul 2018 23:48:49 +0000 Received: by snark.thyrsus.com (Postfix, from userid 1000) id 24CBE3A4AA7; Fri, 20 Jul 2018 19:48:49 -0400 (EDT) Date: Sat, 21 Jul 2018 02:04:00 -0000 From: "Eric S. Raymond" To: Joseph Myers Cc: Alexandre Oliva , Bernd Schmidt , GCC Development , fallenpegasus@gmail.com Subject: Re: Repo conversion troubles. 
Message-ID: <20180720234849.GC3840@thyrsus.com> Reply-To: esr@thyrsus.com References: <20180709191911.648443A4AA7@snark.thyrsus.com> <309363c2-0a5d-29f8-7c7e-19734e7cfa7a@t-online.de> <20180709195957.GB32057@thyrsus.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.4 (2018-02-28) X-IsSubscribed: yes X-SW-Source: 2018-07/txt/msg00322.txt.bz2 Joseph Myers : > On Mon, 9 Jul 2018, Alexandre Oliva wrote: > > > On Jul 9, 2018, Jeff Law wrote: > > > > > On 07/09/2018 01:57 PM, Eric S. Raymond wrote: > > >> Jeff Law : > > >>> I'm not aware of any such merges, but any that occurred most likely > > >>> happened after mid-April when the trunk was re-opened for development. > > > > >> I'm pretty certain things were still good at r256000. I've started that > > >> check running. Not expecting results in less than twelve hours. > > > > > r256000 would be roughly Christmas 2017. > > > > When was the RAID/LVM disk corruption incident? Could it possibly have > > left any of our svn repo metadata in a corrupted way that confuses > > reposurgeon, and that leads to such huge differences? > > That was 14/15 Aug 2017, and all the SVN revision data up to r251080 were > restored from backup within 24 hours or so. I found no signs of damage to > revisions from the 24 hours or so between r251080 and the time of the > corruption when I examined diffs for all those revisions by hand at that > time. Agreed. I don't think that incident is at the root of the problems. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own.