* Test GCC conversion with reposurgeon available @ 2019-12-17 21:32 Joseph Myers 2019-12-17 23:33 ` Bernd Schmidt ` (3 more replies) 0 siblings, 4 replies; 67+ messages in thread From: Joseph Myers @ 2019-12-17 21:32 UTC (permalink / raw) To: gcc; +Cc: esr I've made test conversions of the GCC repository with reposurgeon available (gcc.gnu.org / sourceware.org account required to access these git+ssh repositories, it doesn't need to be one in the gcc group or to have shell access). More information about the repositories, conversion choices made and known issues is given below, and, as noted there, I'm running another conversion now with fixes for some of those issues and the remaining listed issues not fixed in that conversion are being actively worked on. git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-1a.git git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-1b.git The two repositories have exactly the same objects (thus, exactly the same commit graph). The only difference is that the 1a conversion has branches and tags named the same as in SVN (or as similarly as possible; tags in branches/st/tags/ in SVN become tags in refs/tags/st/ in git), whereas the 1b conversion has refs rearranged as suggested by Richard (meaning most are not fetched by default, so you may wish to clone with --mirror to inspect them more closely). We can of course do a different rearrangement if desired. The repositories also include refs/deleted refs for each commit that deleted a tag or branch in SVN (to be precise, the ref points to a commit deleting all the tag or branch contents, so preserving the original commit message for the deletion; its parent is thus the final state of the tag or branch before deletion). 
We may or may not want these in the final conversion, but it seems useful to have them at this point for verification purposes (in particular, I intend to implement a check that the final state of each tag or branch before deletion is correct, as a further check that the conversion machinery is working correctly). The repositories don't include refs for the version of history from the old git-svn mirror, but I have a script to add them (in refs/git-old/ and refs/git-svn-old/) for the benefit of people wishing to interpret old commit hashes after the conversion and to make things more convenient for people wishing to rebase active git-only branches onto the new version of the history. The script is independent of reposurgeon; it's just a single "git fetch" command (which should be followed by "git gc --aggressive"). The repositories include all the non-deleted branches and tags in the SVN repository (and, outside refs/deleted/, that is the exact set of branches and tags present). For this purpose, the file branches/st/README in the SVN repository is considered to have its own branch. reposurgeon generates a "root" branch for commits to paths not part of any branch; this is not included in these repositories (has been deleted at the git level) because I don't believe it contains anything plausibly relevant in git. The commits to branches/st/README were moved to their own branch, as noted; all other commits that end up in "root" are either commits wrongly creating a branch or tag at top level rather than /branches or /tags, commits deleting such branches or tags created in the wrong place, or changes to the SVN /hooks directory. As far as I know, all issues affecting commit tree contents have been fixed, as have some previously noted issues with some merge commits having too many parents, and incorrect attributions seen in an earlier conversion of Richard's. 
Tree contents are verified correct at every non-deleted branch tip and tag (I intend to do such validation for deleted branches and tags as well, but haven't yet implemented it). For comparisons, the following methodology applies: empty directories are removed from the SVN checkout, because git doesn't store empty directories; .cvsignore files are excluded from the comparison, since reposurgeon doesn't include them (but if people want them in the git history, they could easily be included); .gitignore files are excluded from the comparison, since reposurgeon generates one based on svn:ignore properties (or SVN defaults, in the absence of such properties) where the repository doesn't have one checked in (where there *is* a .gitignore file checked into SVN, it's preferred over the auto-generated one); cases of SVN keyword expansion are excluded manually (only two branches have files with SVN keyword expansion enabled). Every branch with SVN ancestry based on the first commit of /trunk has first-parent ancestry in git going back to that commit, as expected. This includes libstdcxx_so_7-2-branch, which was created via creating the directory for the branch and then copying only the libstdc++-v3 subdirectory from trunk rather than directly copying the whole of /trunk to /branches/libstdcxx_so_7-2-branch; reposurgeon has detected that case automatically and created an appropriate parent link to the relevant trunk commit. Some parts of Richard's commit message improvements are present, but most aren't because of an issue accessing Bugzilla (which also affected some of the improvements not involving accessing Bugzilla, as the script terminated early). This should be fixed for my next conversion run. As discussed, Richard's improvements only add new summary lines, with the original commit message following them. Known issues (all either already fixed or understood and currently being worked on): 1. 
Some cherry-picks are showing up as merges (this is the only issue I could find in my checks, manual and automated, that affects the commit graph; I couldn't find any issues affecting tree contents, first-parent ancestry or the set of refs present). Being worked on by Julien "_FrnchFrgg_" RIVAUD. 2. Branch creation or recreation commits have attribution taken from some ChangeLog file in the branch when it should come from the SVN committer. Being worked on by Eric S. Raymond. 3. There are still some merge commits with too many parents, although the cases Richard found have all been fixed (and all those parents in the cases I found are genuinely ancestors of the merge commit in question, so it's essentially a cosmetic issue that there are some that are redundant - it won't affect anything in default "git log" output other than the "Merge:" line, for example, as "git log" orders by commit timestamp by default). Being worked on by Julien "_FrnchFrgg_" RIVAUD. 4. Only files called ChangeLog are used to extract attributions, not ChangeLog.<branch> (fixed for my next conversion run, currently running at the git-fast-import stage). 5. Most of Richard's commit message improvements aren't present (fixed for my next conversion run, currently running at the git-fast-import stage). Points for consideration: 1. Do we want some kind of rearrangement of refs as in the 1b repository or not? 2. Should the final converted repository contain refs/deleted/ refs or not? 3. Where an attribution comes from an author map rather than a ChangeLog file, do we wish to use the existing author map or do people prefer using names from that map but with @gcc.gnu.org addresses (and @gnu.org for usernames that only committed in the gcc2 period)? -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
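The comparison methodology described in this message (ignore empty directories, leave .cvsignore and .gitignore out of the comparison) can be sketched roughly as below. This is only an illustration, not the validation script actually used for the conversion: the function names, the use of SHA-256 digests, and the skipping of `.git`/`.svn` metadata directories are all assumptions, and the manual exclusion of files with SVN keyword expansion is not modelled.

```python
import hashlib
import os

# Files left out of the comparison, as described in the message.
EXCLUDED_NAMES = {".cvsignore", ".gitignore"}
# Assumption: checkout metadata directories are not part of the tree contents.
SKIPPED_DIRS = {".git", ".svn"}


def file_digest(path):
    """Hash file contents; for a symlink, hash the link target instead."""
    if os.path.islink(path):
        data = os.readlink(path).encode()
    else:
        with open(path, "rb") as fh:
            data = fh.read()
    return hashlib.sha256(data).hexdigest()


def tree_manifest(root):
    """Map each relative file path under root to a digest of its contents.

    Only files are recorded, so empty directories drop out automatically,
    matching the fact that git does not store them.
    """
    manifest = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in SKIPPED_DIRS]
        for name in filenames:
            if name in EXCLUDED_NAMES:
                continue
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = file_digest(path)
    return manifest


def compare_trees(svn_root, git_root):
    """Report paths that exist on one side only or differ in content."""
    svn, git = tree_manifest(svn_root), tree_manifest(git_root)
    return {
        "only_in_svn": sorted(set(svn) - set(git)),
        "only_in_git": sorted(set(git) - set(svn)),
        "differing": sorted(p for p in set(svn) & set(git) if svn[p] != git[p]),
    }
```

A branch tip or tag would count as verified under this sketch when all three lists returned by `compare_trees` are empty.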
* Re: Test GCC conversion with reposurgeon available 2019-12-17 21:32 Test GCC conversion with reposurgeon available Joseph Myers @ 2019-12-17 23:33 ` Bernd Schmidt 2019-12-18 0:51 ` Eric S. Raymond 2019-12-18 0:52 ` Joseph Myers 2019-12-18 13:10 ` Jason Merrill ` (2 subsequent siblings) 3 siblings, 2 replies; 67+ messages in thread From: Bernd Schmidt @ 2019-12-17 23:33 UTC (permalink / raw) To: Joseph Myers, gcc; +Cc: esr On 12/17/19 10:32 PM, Joseph Myers wrote: > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-1a.git It seems that permission bits are not reproduced entirely correctly. For example, contrib/check_GNU_style_lib.py went from -rwxr-xr-x in svn (and the git-svn repository) to -rw-r--r-- in this new git repository. I vote for including .cvsignore files. Their absence makes diff comparisons of "git ls-tree" on specific revisions needlessly noisy. Bernd ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-17 23:33 ` Bernd Schmidt @ 2019-12-18 0:51 ` Eric S. Raymond 2019-12-18 0:52 ` Joseph Myers 1 sibling, 0 replies; 67+ messages in thread From: Eric S. Raymond @ 2019-12-18 0:51 UTC (permalink / raw) To: Bernd Schmidt; +Cc: Joseph Myers, gcc Bernd Schmidt <bernds_cb1@t-online.de>: > I vote for including .cvsignore files. Their absence makes diff comparisons > of "git ls-tree" on specific revisions needlessly noisy. A few minutes ago I implemented and pushed a --cvsignores read option for Subversion dumps. That should do what you want. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-17 23:33 ` Bernd Schmidt 2019-12-18 0:51 ` Eric S. Raymond @ 2019-12-18 0:52 ` Joseph Myers 2019-12-18 3:28 ` Joseph Myers 1 sibling, 1 reply; 67+ messages in thread From: Joseph Myers @ 2019-12-18 0:52 UTC (permalink / raw) To: Bernd Schmidt; +Cc: gcc, esr On Wed, 18 Dec 2019, Bernd Schmidt wrote: > On 12/17/19 10:32 PM, Joseph Myers wrote: > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-1a.git > > It seems that permission bits are not reproduced entirely correctly. For > example, contrib/check_GNU_style_lib.py went from -rwxr-xr-x in svn (and the > git-svn repository) to -rw-r--r-- in this new git repository. Thanks, I've reduced this to a minimal test for Eric, so hopefully it should be resolved soon. I've also implemented comparison of execute permissions in my script checking branch tips, so future conversions will be fully checked in that regard (including the one I'm running right now, although since that one is with unfixed reposurgeon the comparison will just show exactly where there are problems - which should help show whether there are any cases of permission issues different from my minimal test; all cases I see on trunk / master match the pattern of that minimal test). > I vote for including .cvsignore files. Their absence makes diff comparisons of > "git ls-tree" on specific revisions needlessly noisy. This has been implemented, so future conversion runs (again, not the one running right now, which just addresses issues 4 and 5 from my list and is based on r279452 rather than r279402 so includes two days' more changes from SVN) will have .cvsignore files included in the git repository. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-18 0:52 ` Joseph Myers @ 2019-12-18 3:28 ` Joseph Myers 2019-12-18 14:36 ` Joseph Myers 0 siblings, 1 reply; 67+ messages in thread From: Joseph Myers @ 2019-12-18 3:28 UTC (permalink / raw) To: Bernd Schmidt; +Cc: gcc, esr On Wed, 18 Dec 2019, Joseph Myers wrote: > On Wed, 18 Dec 2019, Bernd Schmidt wrote: > > > On 12/17/19 10:32 PM, Joseph Myers wrote: > > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-1a.git > > > > It seems that permission bits are not reproduced entirely correctly. For > > example, contrib/check_GNU_style_lib.py went from -rwxr-xr-x in svn (and the > > git-svn repository) to -rw-r--r-- in this new git repository. > > Thanks, I've reduced this to a minimal test for Eric, so hopefully it > should be resolved soon. I believe I have a fix for this; I'm now running a full GCC conversion with that fix (and with some fixes related to merge commits). -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-18 3:28 ` Joseph Myers @ 2019-12-18 14:36 ` Joseph Myers 0 siblings, 0 replies; 67+ messages in thread From: Joseph Myers @ 2019-12-18 14:36 UTC (permalink / raw) To: Bernd Schmidt; +Cc: gcc, esr On Wed, 18 Dec 2019, Joseph Myers wrote: > On Wed, 18 Dec 2019, Joseph Myers wrote: > > > On Wed, 18 Dec 2019, Bernd Schmidt wrote: > > > > > On 12/17/19 10:32 PM, Joseph Myers wrote: > > > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-1a.git > > > > > > It seems that permission bits are not reproduced entirely correctly. For > > > example, contrib/check_GNU_style_lib.py went from -rwxr-xr-x in svn (and the > > > git-svn repository) to -rw-r--r-- in this new git repository. > > > > Thanks, I've reduced this to a minimal test for Eric, so hopefully it > > should be resolved soon. > > I believe I have a fix for this; I'm now running a full GCC conversion > with that fix (and with some fixes related to merge commits). The full validation of all branch tips and tags is still running, but I've confirmed that this conversion now has execute permissions on master exactly the same as in SVN trunk, that the spurious merges I previously saw have disappeared and that .cvsignore now appears in the history as requested. I'll upload new repository conversions later today. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
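The execute-permission check discussed in this sub-thread could look something like the following sketch. It assumes the git side is read from the text output of `git ls-tree -r <rev>` (one `<mode> <type> <object>` entry, a tab, then the path, with mode 100755 marking an executable blob) and that the SVN side is already available as a set of executable paths; the function names are hypothetical and this is not the checking script mentioned in the thread.

```python
def git_executables(ls_tree_output):
    """Return the paths recorded with mode 100755 in `git ls-tree -r` output.

    Note: without `-z`, git quotes unusual path names; this sketch does not
    undo that quoting.
    """
    paths = set()
    for line in ls_tree_output.splitlines():
        if not line:
            continue
        meta, path = line.split("\t", 1)
        mode, objtype, _object_id = meta.split()
        if objtype == "blob" and mode == "100755":
            paths.add(path)
    return paths


def permission_mismatches(svn_executables, git_executable_paths):
    """Paths whose execute bit differs between the two sides."""
    return {
        "lost_in_git": sorted(set(svn_executables) - set(git_executable_paths)),
        "gained_in_git": sorted(set(git_executable_paths) - set(svn_executables)),
    }
```

With the problem Bernd reported, `contrib/check_GNU_style_lib.py` would appear under `lost_in_git`, since it was executable in SVN but came out as a non-executable blob in the first conversion.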
* Re: Test GCC conversion with reposurgeon available 2019-12-17 21:32 Test GCC conversion with reposurgeon available Joseph Myers 2019-12-17 23:33 ` Bernd Schmidt @ 2019-12-18 13:10 ` Jason Merrill 2019-12-18 18:16 ` Joseph Myers 2019-12-18 21:55 ` Joseph Myers 2020-01-09 9:44 ` GIT conversion: question about tags & release branches Martin Liška 3 siblings, 1 reply; 67+ messages in thread From: Jason Merrill @ 2019-12-18 13:10 UTC (permalink / raw) To: Joseph Myers; +Cc: gcc Mailing List, Eric Raymond On Tue, Dec 17, 2019 at 4:39 PM Joseph Myers <joseph@codesourcery.com> wrote: > Points for consideration: > > 1. Do we want some kind of rearrangement of refs as in the 1b > repository or not? Maybe? How much space does that save in a clone? How much work does a partial clone add on the server, since the server needs to pack up the objects for the partial clone rather than just transmitting its own packs? > 2. Should the final converted repository contain refs/deleted/ refs or > not? I think not. > 3. Where an attribution comes from an author map rather than a > ChangeLog file, do we wish to use the existing author map or do people > prefer using names from that map but with @gcc.gnu.org addresses (and > @gnu.org for usernames that only committed in the gcc2 period)? I lean toward the latter. Jason ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-18 13:10 ` Jason Merrill @ 2019-12-18 18:16 ` Joseph Myers 2019-12-19 5:50 ` Jason Merrill 0 siblings, 1 reply; 67+ messages in thread From: Joseph Myers @ 2019-12-18 18:16 UTC (permalink / raw) To: Jason Merrill; +Cc: gcc Mailing List, Eric Raymond On Wed, 18 Dec 2019, Jason Merrill wrote: > On Tue, Dec 17, 2019 at 4:39 PM Joseph Myers <joseph@codesourcery.com> > wrote: > > > Points for consideration: > > > > 1. Do we want some kind of rearrangement of refs as in the 1b > > repository or not? > > > > Maybe? How much space does that save in a clone? How much work does a > partial clone add on the server, since the server needs to pack up the > objects for the partial clone rather than just transmitting its own packs? I haven't measured work on the server, and timing individual clones is liable to a lot of variation from variable load there, but for a single clone --mirror of the 1b repository (so all refs, including refs/deleted/) I got real 13m16.473s user 16m45.429s sys 0m33.901s and 1360 MB objects directory, but for a clone without --mirror (so only a limited subset of refs and the server needing to build a pack) real 15m5.554s user 12m11.771s sys 0m26.914s and 950 MB objects directory. Adding the objects from the existing git-svn mirror (presumably also under refs not fetched by default) increases repository size by about 300 MB, based on a previous test of doing that (most blob and tree objects will be shared between the two versions of the history, but all the commit objects are separate). > > 3. Where an attribution comes from an author map rather than a > > ChangeLog file, do we wish to use the existing author map or do people > > prefer using names from that map but with @gcc.gnu.org addresses (and > > @gnu.org for usernames that only committed in the gcc2 period)? > > I lean toward the latter. I'll plan to change the author map to default to @gcc.gnu.org and @gnu.org addresses. -- Joseph S. 
Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-18 18:16 ` Joseph Myers @ 2019-12-19 5:50 ` Jason Merrill 2019-12-19 15:55 ` Joseph Myers 0 siblings, 1 reply; 67+ messages in thread From: Jason Merrill @ 2019-12-19 5:50 UTC (permalink / raw) To: Joseph Myers; +Cc: gcc Mailing List, Eric Raymond On Wed, Dec 18, 2019 at 1:17 PM Joseph Myers <joseph@codesourcery.com> wrote: > On Wed, 18 Dec 2019, Jason Merrill wrote: > > > On Tue, Dec 17, 2019 at 4:39 PM Joseph Myers <joseph@codesourcery.com> > > wrote: > > > > > Points for consideration: > > > > > > 1. Do we want some kind of rearrangement of refs as in the 1b > > > repository or not? > > > > > > > Maybe? How much space does that save in a clone? How much work does a > > partial clone add on the server, since the server needs to pack up the > > objects for the partial clone rather than just transmitting its own packs? > > I haven't measured work on the server, and timing individual clones is > liable to a lot of variation from variable load there, but for a single > clone --mirror of the 1b repository (so all refs, including refs/deleted/) > I got > > real 13m16.473s > user 16m45.429s > sys 0m33.901s > > and 1360 MB objects directory, but for a clone without --mirror (so only a > limited subset of refs and the server needing to build a pack) > > real 15m5.554s > user 12m11.771s > sys 0m26.914s > > and 950 MB objects directory. Adding the objects from the existing > git-svn mirror (presumably also under refs not fetched by default) > increases repository size by about 300 MB, based on a previous test of > doing that (most blob and tree objects will be shared between the two > versions of the history, but all the commit objects are separate). So a 30% space savings; that's pretty significant. Though I wonder how much of that is refs/dead and refs/deleted, which seem unnecessary to carry over to git at all. 
I wonder if it would make sense to put them in a separate repository that refers to the main gcc.git? Jason ^ permalink raw reply [flat|nested] 67+ messages in thread
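The "30%" figure follows directly from the two objects-directory sizes quoted above (1360 MB for the `--mirror` clone, 950 MB for the default clone). A small worked check, with a hypothetical helper name:

```python
def clone_saving(full_mb, partial_mb):
    """Return (absolute saving in MB, saving as a fraction of the full clone)."""
    saved = full_mb - partial_mb
    return saved, saved / full_mb


saved_mb, fraction = clone_saving(1360, 950)
print(f"{saved_mb} MB saved, {fraction:.1%} of the mirror clone")
# prints "410 MB saved, 30.1% of the mirror clone"
```

Note that the timings quoted alongside these sizes come from single runs, so they say little about server-side cost either way.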
* Re: Test GCC conversion with reposurgeon available 2019-12-19 5:50 ` Jason Merrill @ 2019-12-19 15:55 ` Joseph Myers 0 siblings, 0 replies; 67+ messages in thread From: Joseph Myers @ 2019-12-19 15:55 UTC (permalink / raw) To: Jason Merrill; +Cc: gcc Mailing List, Eric Raymond On Thu, 19 Dec 2019, Jason Merrill wrote: > So a 30% space savings; that's pretty significant. Though I wonder how > much of that is refs/dead and refs/deleted, which seem unnecessary to carry > over to git at all. I wonder if it would make sense to put them in a > separate repository that refers to the main gcc.git? refs/dead is definitely relevant sometimes; that's old development branches. refs/deleted is less clearly relevant. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-17 21:32 Test GCC conversion with reposurgeon available Joseph Myers 2019-12-17 23:33 ` Bernd Schmidt 2019-12-18 13:10 ` Jason Merrill @ 2019-12-18 21:55 ` Joseph Myers 2019-12-19 0:36 ` Bernd Schmidt ` (2 more replies) 2020-01-09 9:44 ` GIT conversion: question about tags & release branches Martin Liška 3 siblings, 3 replies; 67+ messages in thread From: Joseph Myers @ 2019-12-18 21:55 UTC (permalink / raw) To: gcc; +Cc: esr On Tue, 17 Dec 2019, Joseph Myers wrote: > I've made test conversions of the GCC repository with reposurgeon > available (gcc.gnu.org / sourceware.org account required to access > these git+ssh repositories, it doesn't need to be one in the gcc group > or to have shell access). More information about the repositories, > conversion choices made and known issues is given below, and, as noted > there, I'm running another conversion now with fixes for some of those > issues and the remaining listed issues not fixed in that conversion > are being actively worked on. > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-1a.git > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-1b.git There are now four more repositories available. git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-2a.git git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-2b.git git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-3a.git git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-3b.git The 2a and 2b repositories are similar to 1a and 1b, but have fixes to issues 4 and 5 I listed (they include attribution extraction from ChangeLog.<branch> files and all of Richard's commit message improvements). The 3a and 3b repositories have further improvements to the conversion machinery. Verification shows that 3a and 3b have execute permissions on files exactly the same as in SVN, at all (non-deleted) branch tips and tags. .cvsignore files are present as requested. 
There are mergeinfo improvements to avoid spuriously marking cherry-picks as merge commits and to avoid having excessive numbers of merge parents in some cases, but the mergeinfo improvements should be considered a work in progress; further improvements have since been implemented in reposurgeon to allow many more merges to result in git merge commits without false positives for cherry-picks, and these will be in the next trial conversion after these ones. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-18 21:55 ` Joseph Myers @ 2019-12-19 0:36 ` Bernd Schmidt 2019-12-19 0:58 ` Joseph Myers 2019-12-19 13:51 ` Test GCC conversions (publicly) available Mark Wielaard 2019-12-19 16:29 ` Test GCC conversion with reposurgeon available Joseph Myers 2 siblings, 1 reply; 67+ messages in thread From: Bernd Schmidt @ 2019-12-19 0:36 UTC (permalink / raw) To: Joseph Myers, gcc; +Cc: esr On 12/18/19 10:55 PM, Joseph Myers wrote: > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-3a.git I cloned this one and started trying random things again. The previous one had some strange-looking merge commits, but it sounded like that was a known issue, and indeed the ones I had seen were fixed in this new version. I decided to write a small script to check whether there were any merge commits with more than two parents, and there are a few which have three. One commit seems to occur as a parent in all of these: 422854db0e8605867e0834035aa2b1da1b71cbfb. An example is b743e467e43e6211f2c2537f1f07bbceb4d3aa61, apparently from spu-4_5-branch. No idea whether there is an issue or whether this is worth looking at, but I figured I'd point it out at least. Bernd ^ permalink raw reply [flat|nested] 67+ messages in thread
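A check like the one Bernd describes can be sketched as follows. This is not his script, which was not posted; it assumes the input is the text produced by `git rev-list --parents --all`, where each line holds a commit hash followed by the hashes of its parents, and the function names are made up for illustration.

```python
from collections import Counter


def octopus_merges(rev_list_output, max_parents=2):
    """Map each commit with more than max_parents parents to its parent list."""
    merges = {}
    for line in rev_list_output.splitlines():
        fields = line.split()
        # fields[0] is the commit itself; the rest are its parents.
        if len(fields) - 1 > max_parents:
            merges[fields[0]] = fields[1:]
    return merges


def common_parents(merges):
    """Parents that occur in every one of the given merge commits."""
    counts = Counter(p for parents in merges.values() for p in set(parents))
    return sorted(p for p, n in counts.items() if n == len(merges))
```

Running `common_parents` over the result of `octopus_merges` is one way to spot a single commit recurring as a parent of every three-parent merge, as reported in this message.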
* Re: Test GCC conversion with reposurgeon available 2019-12-19 0:36 ` Bernd Schmidt @ 2019-12-19 0:58 ` Joseph Myers 0 siblings, 0 replies; 67+ messages in thread From: Joseph Myers @ 2019-12-19 0:58 UTC (permalink / raw) To: Bernd Schmidt; +Cc: gcc, esr On Thu, 19 Dec 2019, Bernd Schmidt wrote: > On 12/18/19 10:55 PM, Joseph Myers wrote: > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-3a.git > > I cloned this one and started trying random things again. > The previous one had some strange-looking merge commits, but it sounded like > that was a known issue, and indeed the ones I had seen were fixed in this new > version. > I decided to write a small script to check whether there were any merge > commits with more than two parents, and there are a few which have three. One > commit seems to occur as a parent in all of these: > 422854db0e8605867e0834035aa2b1da1b71cbfb. An example is > b743e467e43e6211f2c2537f1f07bbceb4d3aa61, apparently from spu-4_5-branch. > > No idea whether there is an issue or whether this is worth looking at, but I > figured I'd point it out at least. b743e467e43e6211f2c2537f1f07bbceb4d3aa61 is r152464, from named-addr-spaces-branch. This merge is the first one that added an svn:mergeinfo property on named-addr-spaces-branch, which previously had only svnmerge-integrated (the svn:mergeinfo property was copied from the one that was on trunk at that time). That svn:mergeinfo property specifies merges from /branches/cxx0x-lambdas-branch, /branches/lto and /trunk, and the merge parents are for the first two of those. For /trunk, it only specifies two revisions, which is clearly not a valid merge and this version of reposurgeon avoids creating merge commits for cherry-picks. svnmerge-integrated specifies /trunk:1-151687,151691-152437, which would be a valid merge from trunk (there are no trunk revisions in the gap) but reposurgeon only looks at svnmerge-integrated if svn:mergeinfo is empty. 
I'll file a request that it take the union of revision ranges from the two properties, which ought to be easy to implement. Once it recognises that commit as a merge from trunk, it should automatically discard the other merge parents because it now avoids adding a merge parent that is an ancestor of another merge parent. I.e. this is an artifact of someone having done a merge with svnmerge.py that brought in svn:mergeinfo from another branch where SVN's native merge tracking had been used, and should be straightforward to fix. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
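The proposed union of revision ranges can be illustrated with a short sketch. It is not reposurgeon code: the helper names are hypothetical, only the comma-separated `N` / `N-M` range list after the `/path:` prefix is parsed, and a trailing `*` (used by svn:mergeinfo for non-inheritable ranges) is simply dropped.

```python
def parse_ranges(text):
    """Parse a range list such as '1-151687,151691-152437' into (lo, hi) pairs."""
    ranges = []
    for item in text.split(","):
        item = item.strip().rstrip("*")
        if not item:
            continue
        lo, _, hi = item.partition("-")
        ranges.append((int(lo), int(hi or lo)))
    return ranges


def union_ranges(*range_lists):
    """Union of several range lists, merging overlapping or adjacent ranges."""
    merged = []
    for lo, hi in sorted(r for ranges in range_lists for r in ranges):
        if merged and lo <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def gaps(ranges):
    """Revision ranges lying between consecutive entries of a sorted, merged list."""
    return [(a_hi + 1, b_lo - 1) for (_, a_hi), (b_lo, _) in zip(ranges, ranges[1:])]
```

For the svnmerge-integrated value quoted above, `gaps(parse_ranges("1-151687,151691-152437"))` gives `[(151688, 151690)]`; the message notes that no trunk revisions fall in that gap, which is why the range still describes a complete merge from trunk.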
* Re: Test GCC conversions (publicly) available 2019-12-18 21:55 ` Joseph Myers 2019-12-19 0:36 ` Bernd Schmidt @ 2019-12-19 13:51 ` Mark Wielaard 2019-12-19 14:06 ` Eric S. Raymond 2019-12-19 16:29 ` Test GCC conversion with reposurgeon available Joseph Myers 2 siblings, 1 reply; 67+ messages in thread From: Mark Wielaard @ 2019-12-19 13:51 UTC (permalink / raw) To: Joseph Myers, gcc Hi, On Wed, 2019-12-18 at 21:55 +0000, Joseph Myers wrote: > On Tue, 17 Dec 2019, Joseph Myers wrote: > > I've made test conversions of the GCC repository with reposurgeon > > available (gcc.gnu.org / sourceware.org account required to access > > these git+ssh repositories, it doesn't need to be one in the gcc group > > or to have shell access). More information about the repositories, > > conversion choices made and known issues is given below, and, as noted > > there, I'm running another conversion now with fixes for some of those > > issues and the remaining listed issues not fixed in that conversion > > are being actively worked on. > > > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-1a.git > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-1b.git > > There are now four more repositories available. > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-2a.git > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-2b.git > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-3a.git > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-3b.git For those without gcc.gnu.org accounts I made mirrors at: https://code.wildebeest.org/git/gcc/ There is also a mirror of Maxim's gcc-reparent branch: https://code.wildebeest.org/git/gcc/gcc-reparent/ And the traditional gcc git-svn mirror: https://code.wildebeest.org/git/mirror/gcc/ The last two seem to be kept up to date with current svn. Cloning should work through both git:// and (smart) https:// protocols. Please be gentle. This is on a little machine in my basement. Each full clone also takes 1GB+ of data. 
So you might want to look at them just through the cgit web interface. But hopefully it is still useful. I will likely delete these again after we have picked a conversion we actually want to use. Do we already have a new date for when we are making that decision? Thanks, Mark ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversions (publicly) available 2019-12-19 13:51 ` Test GCC conversions (publicly) available Mark Wielaard @ 2019-12-19 14:06 ` Eric S. Raymond 2019-12-19 14:40 ` Joseph Myers 0 siblings, 1 reply; 67+ messages in thread From: Eric S. Raymond @ 2019-12-19 14:06 UTC (permalink / raw) To: Mark Wielaard; +Cc: Joseph Myers, gcc Mark Wielaard <mark@klomp.org>: > Do we already have a new date for when we are making that decision? I believe Joseph was planning on Dec 31st. My team's part will be ready - the enabling reposurgeon changes should be done in a week or so, with most of that being RFEs that could be dropped if there were real time pressure. There are other problems that might cause a delay beyond the 31st, however. Best if I let Joseph and Richard explain those. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversions (publicly) available 2019-12-19 14:06 ` Eric S. Raymond @ 2019-12-19 14:40 ` Joseph Myers 2019-12-19 16:00 ` Eric S. Raymond 0 siblings, 1 reply; 67+ messages in thread From: Joseph Myers @ 2019-12-19 14:40 UTC (permalink / raw) To: Eric S. Raymond; +Cc: Mark Wielaard, gcc On Thu, 19 Dec 2019, Eric S. Raymond wrote: > There are other problems that might cause a delay beyond the > 31st, however. Best if I let Joseph and Richard explain those. I presume that's referring to the checkme: bug annotations where the PR numbers in commit messages seem suspicious. I don't think that's something to delay the conversion unless we're clearly close to having a complete review of all those cases done; at the point of the final conversion we should simply change the script not to modify commit messages in any remaining unresolved suspicious cases. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversions (publicly) available 2019-12-19 14:40 ` Joseph Myers @ 2019-12-19 16:00 ` Eric S. Raymond 2019-12-19 16:03 ` Richard Earnshaw (lists) 0 siblings, 1 reply; 67+ messages in thread From: Eric S. Raymond @ 2019-12-19 16:00 UTC (permalink / raw) To: Joseph Myers; +Cc: Mark Wielaard, gcc Joseph Myers <joseph@codesourcery.com>: > On Thu, 19 Dec 2019, Eric S. Raymond wrote: > > > There are other problems that might cause a delay beyond the > > 31st, however. Best if I let Joseph and Richard explain those. > > I presume that's referring to the checkme: bug annotations where the PR > numbers in commit messages seem suspicious. I don't think that's > something to delay the conversion unless we're clearly close to having a > complete review of all those cases done; at the point of the final > conversion we should simply change the script not to modify commit > messages in any remaining unresolved suspicious cases. No, I was thinking more of rearnsha bailing out to handle a family emergency and muttering something about not being back for a couple of weeks. If that's been resolved I haven't heard about it. The only conversion blocker that I know is still live is the wrong attributions in some ChangeLog cases. I'm sure we'll get that fixed soon; at this point I'm more worried about getting the test suite to run clean again. The scenario I want to avoid is the one where you get a conversion that looks production-ready before I get my tests cleaned up, you deploy it - and then I find something during the remainder of my cleanup that implies a problem with your conversion. A complicating factor is that I'm getting stale. I've been going hammer and tongs at this for nearly three months now, and that's not counting all the previous time on the Go translation. My defect rate is going up. I need a vacation or to work on something else for a while and I can't have that yet. Never mind. We'll get this done. -- <a href="http://www.catb.org/~esr/">Eric S. 
Raymond</a>
* Re: Test GCC conversions (publicly) available 2019-12-19 16:00 ` Eric S. Raymond @ 2019-12-19 16:03 ` Richard Earnshaw (lists) 2019-12-19 16:08 ` Eric S. Raymond 0 siblings, 1 reply; 67+ messages in thread From: Richard Earnshaw (lists) @ 2019-12-19 16:03 UTC (permalink / raw) To: esr, Joseph Myers; +Cc: Mark Wielaard, gcc On 19/12/2019 16:00, Eric S. Raymond wrote: > Joseph Myers <joseph@codesourcery.com>: >> On Thu, 19 Dec 2019, Eric S. Raymond wrote: >> >>> There are other problems that might cause a delay beyond the >>> 31st, however. Best if I let Joseph and Richard explain those. >> >> I presume that's referring to the checkme: bug annotations where the PR >> numbers in commit messages seem suspicious. I don't think that's >> something to delay the conversion unless we're clearly close to having a >> complete review of all those cases done; at the point of the final >> conversion we should simply change the script not to modify commit >> messages in any remaining unresolved suspicious cases. > > No, I was thinking more of rearnsha bailing out to handle a family emergency > and muttering something about not being back for a couple of weeks. If that's > been resolved I haven't heard about it. I don't think that should affect things, as I think Joseph has a good handle on what needs to be done and I think I've handed over everything that's needed w.r.t. the commit summary reprocessing script. Joseph, holler quickly if you think there's anything else you might need... R. ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversions (publicly) available 2019-12-19 16:03 ` Richard Earnshaw (lists) @ 2019-12-19 16:08 ` Eric S. Raymond 0 siblings, 0 replies; 67+ messages in thread From: Eric S. Raymond @ 2019-12-19 16:08 UTC (permalink / raw) To: Richard Earnshaw (lists); +Cc: Joseph Myers, Mark Wielaard, gcc Richard Earnshaw (lists) <Richard.Earnshaw@arm.com>: > > No, I was thinking more of rearnsha bailing out to handle a family emergency > > and muttering something about not being back for a couple of weeks. If that's > > been resolved I haven't heard about it. > > I don't think that should affect things, as I think Joseph has a good handle > on what needs to be done and I think I've handed over everything that's > needed w.r.t. the commit summary reprocessing script. OK, that's good to know. I wish you good fortune with the emergency. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-18 21:55 ` Joseph Myers 2019-12-19 0:36 ` Bernd Schmidt 2019-12-19 13:51 ` Test GCC conversions (publicly) available Mark Wielaard @ 2019-12-19 16:29 ` Joseph Myers 2019-12-22 13:57 ` Joseph Myers 2 siblings, 1 reply; 67+ messages in thread From: Joseph Myers @ 2019-12-19 16:29 UTC (permalink / raw) To: gcc; +Cc: esr On Wed, 18 Dec 2019, Joseph Myers wrote: > There are now four more repositories available. > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-2a.git > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-2b.git > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-3a.git > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-3b.git And two more. git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-4a.git git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-4b.git The main changes in this version are: * More mergeinfo improvements, so more valid merges should be detected and represented as such in git. (This doesn't yet have the changes to handle the case of both svnmerge-integrated and svn:mergeinfo having relevant merge information.) * More bug data used with Richard's script (but not the most recent whitelisting / PR corrections). * @gcc.gnu.org / @gnu.org addresses are preferred in the author map, to avoid anachronistic credits of commits to addresses at the wrong company (this also means that the git "committer" information consistently uses such addresses, which is certainly logically correct). I preserved timezone annotations in the map for existing addresses but accidentally did that in a way that resulted in those existing addresses, when used in ChangeLog entries, being mapped to the @gcc.gnu.org / @gnu.org ones; I'll fix that for the next conversion run so the addresses from the ChangeLog entries are properly preserved in such cases but still use the timezone annotations from gcc.map. -- Joseph S. 
Myers joseph@codesourcery.com
* Re: Test GCC conversion with reposurgeon available 2019-12-19 16:29 ` Test GCC conversion with reposurgeon available Joseph Myers @ 2019-12-22 13:57 ` Joseph Myers 2019-12-23 17:27 ` Roman Zhuykov ` (2 more replies) 0 siblings, 3 replies; 67+ messages in thread From: Joseph Myers @ 2019-12-22 13:57 UTC (permalink / raw) To: gcc; +Cc: esr On Thu, 19 Dec 2019, Joseph Myers wrote: > And two more. > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-4a.git > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-4b.git Two more. git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-5a.git git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-5b.git The main changes are: * The case of both svnmerge-integrated and svn:mergeinfo being set is now handled properly, so the commit Bernd found is interpreted as a merge from trunk to named-addr-spaces-branch and has exactly two parents as expected, with the parents corresponding to the merges from other branches to trunk being optimized away. * The author map used now avoids timezone-only entries also remapping email addresses, so the email addresses from the ChangeLogs are used whenever a commit adds ChangeLog entries from exactly one author. * When commits add ChangeLog entries from more than one author (e.g. merges done in CVS), the committer is now used as the author rather than selecting one of the authors from the ChangeLog entries. * The latest whitelisting / PR corrections are used with Richard's script (430 checkme: entries remain). * One fix to the ref renaming in gcc-reposurgeon-5b.git so that the tag gcc-3_2-rhl8-3_2-7 properly ends up in vendors rather than prereleases. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-22 13:57 ` Joseph Myers @ 2019-12-23 17:27 ` Roman Zhuykov 2019-12-24 11:50 ` Joseph Myers 2019-12-24 10:57 ` Maxim Kuvyrkov 2019-12-28 16:30 ` Joseph Myers 2 siblings, 1 reply; 67+ messages in thread From: Roman Zhuykov @ 2019-12-23 17:27 UTC (permalink / raw) To: Joseph Myers, gcc; +Cc: esr 22.12.2019 16:56, Joseph Myers wrote: > On Thu, 19 Dec 2019, Joseph Myers wrote: > >> And two more. >> >> git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-4a.git >> git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-4b.git > Two more. > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-5a.git > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-5b.git > > The main changes are: > > * The author map used now avoids timezone-only entries also remapping > email addresses, so the email addresses from the ChangeLogs are used > whenever a commit adds ChangeLog entries from exactly one author.
Hello. Sorry if I have missed part of the discussion and am describing a known issue with the "author map", but in gcc-reposurgeon-5b.git I see this:
gcc-reposurgeon-5b.git$ git log --pretty=format:"%h%x09 %ae%x09%ad" --author=zhroma
ff96b83 zhroma@ispras.ru Fri Dec 20 15:40:46 2019 +0000
17a6791 zhroma@ispras.ru Fri Dec 13 17:33:38 2019 +0000
86f05d3 zhroma@ispras.ru Fri Dec 13 17:17:31 2019 +0000
7103c52 zhroma@ispras.ru Fri Dec 13 17:02:53 2019 +0000
1e035ae zhroma@ispras.ru Tue Apr 23 13:14:57 2019 +0000
cea94d2 zhroma@gcc.gnu.org Tue Apr 23 12:53:43 2019 +0000
28635d5 zhroma@ispras.ru Mon Apr 22 16:05:36 2019 +0000
2f59c1e zhroma@ispras.ru Fri Mar 29 12:44:01 2019 -0600
1cc4c09 zhroma@ispras.ru Fri Feb 10 16:00:30 2012 +0400
9839b87 zhroma@ispras.ru Mon Jul 25 13:43:01 2011 +0400
gcc-reposurgeon-5b.git$ git log --pretty=format:"%h%x09 %ae%x09%ad" --author=zhroma releases/gcc-8
40cc006 zhroma@ispras.ru Fri Dec 20 15:52:02 2019 +0000
4cf66d9 zhroma@ispras.ru Fri Dec 20 15:07:58 2019 +0000
9240950 zhroma@gcc.gnu.org Fri Apr 26 16:04:54 2019 +0000
1cc4c09 zhroma@ispras.ru Fri Feb 10 16:00:30 2012 +0400
9839b87 zhroma@ispras.ru Mon Jul 25 13:43:01 2011 +0400
I've never used zhroma@gcc.gnu.org email in ChangeLog files. So, it seems odd that it is used in r270511 (my first commit as maintainer), but not in next r270512 or later commits. Moreover, it is also used once in r270609 on the 8 branch. I think it's better to use zhroma@ispras.ru everywhere. Another option may be using @gcc.gnu.org everywhere since r270511. I have also seen this discussion, https://gcc.gnu.org/ml/gcc/2019-09/msg00216.html, but it was about the wwwdocs repo. The committer field is correct in the repo: "Roman Zhuykov <zhroma@gcc.gnu.org>" is used everywhere. -- Roman ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-23 17:27 ` Roman Zhuykov @ 2019-12-24 11:50 ` Joseph Myers 2019-12-24 15:55 ` Segher Boessenkool 0 siblings, 1 reply; 67+ messages in thread From: Joseph Myers @ 2019-12-24 11:50 UTC (permalink / raw) To: Roman Zhuykov; +Cc: gcc, esr On Mon, 23 Dec 2019, Roman Zhuykov wrote: > I've never used zhroma@gcc.gnu.org email in ChangeLog files. So, it seems odd > that it is used in r270511 (my first commit as maintainer), but not in next That's because that commit also edits ChangeLog entries from other authors. When a commit adds / edits ChangeLog entries for more than one author (the difference between purely editing an existing entry and adding a new one, possibly under an existing date/author header, for a multi-author commit, is not something that can reliably be determined automatically), the conversion falls back to using the committer identity instead of picking one of the multiple relevant authors from the ChangeLog files. -- Joseph S. Myers jsm@polyomino.org.uk ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-24 11:50 ` Joseph Myers @ 2019-12-24 15:55 ` Segher Boessenkool 2019-12-24 17:17 ` Joseph Myers 0 siblings, 1 reply; 67+ messages in thread From: Segher Boessenkool @ 2019-12-24 15:55 UTC (permalink / raw) To: Joseph Myers; +Cc: Roman Zhuykov, gcc, esr On Tue, Dec 24, 2019 at 11:50:30AM +0000, Joseph Myers wrote: > On Mon, 23 Dec 2019, Roman Zhuykov wrote: > > I've never used zhroma@gcc.gnu.org email in ChangeLog files. So, it seems odd > > that it is used in r270511 (my first commit as maintainer), but not in next > > That's because that commit also edits ChangeLog entries from other > authors. When a commit adds / edits ChangeLog entries for more than one > author (the difference between purely editing an existing entry and adding > a new one, possibly under an existing date/author header, for a > multi-author commit, is not something that can reliably be determined > automatically), the conversion falls back to using the committer identity > instead of picking one of the multiple relevant authors from the ChangeLog > files. There is only one relevant author in r270511. It edits a few wrong path names in the previous changelog entries. People often do similar things (like fixing the commit date :-) ) Either never use <account>@gcc.gnu.org, or always use it, don't do the worst of both worlds? Maxim's conversion has this just fine: (from gcc-reparent): commit 6d4633c4d15a92b88332c1e0cbc7f5c1c93c1a8a Author: Roman Zhuykov <zhroma@ispras.ru> AuthorDate: Tue Apr 23 12:53:43 2019 +0000 Commit: Roman Zhuykov <zhroma@ispras.ru> CommitDate: Tue Apr 23 12:53:43 2019 +0000 modulo-sched: fix branch scheduling issue (PR84032) PR rtl-optimization/84032 * modulo-sched.c (ps_insn_find_column): Change condition so that branch will always be the last insn in a row inside partial schedule. testsuite: PR rtl-optimization/84032 * gcc.dg/pr84032.c: New test. 
git-svn-id: https://gcc.gnu.org/svn/gcc/trunk@270511 138bc75d-0d04-0410-961f (It also gets different author and committer right, and changing email addresses over time). Segher ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-24 15:55 ` Segher Boessenkool @ 2019-12-24 17:17 ` Joseph Myers 2019-12-24 18:14 ` Segher Boessenkool 0 siblings, 1 reply; 67+ messages in thread From: Joseph Myers @ 2019-12-24 17:17 UTC (permalink / raw) To: Segher Boessenkool; +Cc: Roman Zhuykov, gcc, esr On Tue, 24 Dec 2019, Segher Boessenkool wrote: > > That's because that commit also edits ChangeLog entries from other > > authors. When a commit adds / edits ChangeLog entries for more than one > > author (the difference between purely editing an existing entry and adding > > a new one, possibly under an existing date/author header, for a > > multi-author commit, is not something that can reliably be determined > > automatically), the conversion falls back to using the committer identity > > instead of picking one of the multiple relevant authors from the ChangeLog > > files. > > There is only one relevant author in r270511. It edits a few wrong path > names in the previous changelog entries. People often do similar things > (like fixing the commit date :-) ) Distinguishing "edits a previous ChangeLog entry" from "adds a new entry under a previous ChangeLog header for a change included in the commit" is a human judgement. (It's necessary to consider the case of a ChangeLog header not included in the new lines added by the commit when looking for an attribution in a ChangeLog file because multiple ChangeLog stanzas, separated by blank lines, under a single date/author header, is common usage for multiple consecutive commits by the same author, and sometimes non-consecutive commits depending on how they did merging.) > Either never use <account>@gcc.gnu.org, or always use it, don't do the > worst of both worlds? 
The heuristics here are to use an attribution from ChangeLog for the author where unambiguous, but to use the committer (always @gcc.gnu.org / @gnu.org [*], so avoiding attributions at the wrong company even where people were using multiple addresses simultaneously for different changes) as author if in doubt. I think that's better than either always or never using @gcc.gnu.org. git tools computing commit statistics can generally handle multiple email addresses with the same name automatically, and .mailmap can be used to specify more detailed mappings for past commits for at least git shortlog. [*] Except in the special case of "rolfh" from the gcc2 history, where we identified the author of the changes they committed but not who that user was as a committer, so gcc.map specifies that author identity. -- Joseph S. Myers jsm@polyomino.org.uk ^ permalink raw reply [flat|nested] 67+ messages in thread
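The fallback rule described in this message can be sketched roughly as follows. This is only an illustration of the stated heuristic, not the code reposurgeon or the gcc-conversion scripts actually use; the `Identity` type, the `choose_author` function, and the example.org address are hypothetical.

```python
from typing import List, NamedTuple


class Identity(NamedTuple):
    """A name/email pair (hypothetical helper type for this sketch)."""
    name: str
    email: str


def choose_author(changelog_authors: List[Identity], committer: Identity) -> Identity:
    """Pick the git author for one converted commit.

    Use the ChangeLog attribution when it is unambiguous (exactly one
    distinct author across the ChangeLog files touched by the commit);
    otherwise fall back to the committer identity.
    """
    distinct = set(changelog_authors)
    if len(distinct) == 1:
        return next(iter(distinct))
    # Zero attributions or more than one: treat as "in doubt".
    return committer


if __name__ == "__main__":
    committer = Identity("Roman Zhuykov", "zhroma@gcc.gnu.org")
    own = Identity("Roman Zhuykov", "zhroma@ispras.ru")
    other = Identity("Some Other Author", "other@example.org")  # made-up identity

    print(choose_author([own, own], committer))    # ChangeLog identity wins
    print(choose_author([own, other], committer))  # falls back to committer
    print(choose_author([], committer))            # no ChangeLog touched
```

Under this rule a commit that also touches another author's ChangeLog entry gets the committer address as author, which matches the behaviour reported for r270511 earlier in the thread.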
* Re: Test GCC conversion with reposurgeon available 2019-12-24 17:17 ` Joseph Myers @ 2019-12-24 18:14 ` Segher Boessenkool 2019-12-25 11:03 ` Roman Zhuykov 0 siblings, 1 reply; 67+ messages in thread From: Segher Boessenkool @ 2019-12-24 18:14 UTC (permalink / raw) To: Joseph Myers; +Cc: Roman Zhuykov, gcc, esr On Tue, Dec 24, 2019 at 05:16:54PM +0000, Joseph Myers wrote: > On Tue, 24 Dec 2019, Segher Boessenkool wrote: > > > That's because that commit also edits ChangeLog entries from other > > > authors. When a commit adds / edits ChangeLog entries for more than one > > > author (the difference between purely editing an existing entry and adding > > > a new one, possibly under an existing date/author header, for a > > > multi-author commit, is not something that can reliably be determined > > > automatically), the conversion falls back to using the committer identity > > > instead of picking one of the multiple relevant authors from the ChangeLog > > > files. > > > > There is only one relevant author in r270511. It edits a few wrong path > > names in the previous changelog entries. People often do similar things > > (like fixing the commit date :-) ) > > Distinguishing "edits a previous ChangeLog entry" from "adds a new entry > under a previous ChangeLog header for a change included in the commit" is > a human judgement. We are doing only one conversion here, the one of the GCC repo. The heuristic works, we checked it did. > > Either never use <account>@gcc.gnu.org, or always use it, don't do the > > worst of both worlds? > > The heuristics here are to use an attribution from ChangeLog for the > author where unambiguous, but to use the committer (always @gcc.gnu.org / > @gnu.org [*], so avoiding attributions at the wrong company even where > people were using multiple addresses simultaneously for different changes) > as author if in doubt. You never need that, and it is worse to use two different schemes than to choose either. 
I would have chosen the "<account>@gcc.gnu.org" scheme, because it is simple and *correct*. Other people wanted the nicer names. Maxim's conversion gets that correct. Please copy it. If your tool isn't sure what to do, use human intervention. For example, make up a heuristic, and check that exhaustively. We have only one repo to convert! And people do *not* have the same email address for the whole lifetime of the repo. This would mean I can never again contribute to GCC if I start using a different email address after the conversion! Segher ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-24 18:14 ` Segher Boessenkool @ 2019-12-25 11:03 ` Roman Zhuykov 2019-12-25 11:20 ` Joseph Myers ` (2 more replies) 0 siblings, 3 replies; 67+ messages in thread From: Roman Zhuykov @ 2019-12-25 11:03 UTC (permalink / raw) To: Segher Boessenkool, Joseph Myers Cc: gcc, Alexander Monakov, Maxim Kuvyrkov, esr First of all, thanks to everyone who spent time making the conversion better and better. Here is my 2c; I have briefly studied my colleagues' trunk history in Maxim's gcc-pretty vs gcc-reposurgeon-5b. 1) In gcc-pretty timezone info is lost in both author/committer date (keeping UTC time correct, certainly). Examples are r278990 and r289989. Probably git-svn causes this; the current read-only git mirror is also without timezones. Not sure we need that info, but reposurgeon is more correct here. 2) Some thoughts about the script for summarizing commit log messages: 2a) Why do r143753 and r150680 not have a "re PR..." summary instead of "[multiple changes]"? 2b) Conversely, r155892 should mention two PRs; even "[multiple changes]" would be better here, IMHO. 2c) In r130050 and r155902 we have "Rename too ... " in summary, not sure how to make it better. 2d) r146882 could have a better summary if we somehow organize ChangeLog priority (gcc/ChangeLog is more important than the testsuite one). 3) About author emails, see below 24.12.2019 21:14, Segher Boessenkool wrote: > On Tue, Dec 24, 2019 at 05:16:54PM +0000, Joseph Myers wrote: >> On Tue, 24 Dec 2019, Segher Boessenkool wrote: >>>> That's because that commit also edits ChangeLog entries from other >>>> authors.
When a commit adds / edits ChangeLog entries for more than one >>>> author (the difference between purely editing an existing entry and adding >>>> a new one, possibly under an existing date/author header, for a >>>> multi-author commit, is not something that can reliably be determined >>>> automatically), the conversion falls back to using the committer identity >>>> instead of picking one of the multiple relevant authors from the ChangeLog >>>> files. >>> There is only one relevant author in r270511. It edits a few wrong path >>> names in the previous changelog entries. People often do similar things >>> (like fixing the commit date :-) ) >> Distinguishing "edits a previous ChangeLog entry" from "adds a new entry >> under a previous ChangeLog header for a change included in the commit" is >> a human judgement. > We are doing only one conversion here, the one of the GCC repo. The > heuristic works, we checked it did. > >>> Either never use <account>@gcc.gnu.org, or always use it, don't do the >>> worst of both worlds? >> The heuristics here are to use an attribution from ChangeLog for the >> author where unambiguous, but to use the committer (always @gcc.gnu.org / >> @gnu.org [*], so avoiding attributions at the wrong company even where >> people were using multiple addresses simultaneously for different changes) >> as author if in doubt. > You never need that, and it is worse to use two different schemes than to > choose either. > > I would have chosen the "<account>@gcc.gnu.org" scheme, because it is > simple and *correct*. Other people wanted the nicer names. Maxim's > conversion gets that correct. Please copy it. > IMHO Segher is a bit categorical in the discussion, but I'd be glad to see a brief description of Maxim's approach to managing emails; gcc-pretty shows better results.
Speaking about the script counting authors from ChangeLog files, even if we drop the "edits a previous ChangeLog entry" issue, it still sometimes does not work as Joseph described: 3a) In r155892, r155893 and r259314 Alex is not counted as the only author, for no apparent reason. 3b) In r139854, r141108 and r196252 the script selected the author successfully, while actually there is more than one. 3c) Maybe here we can also somehow organize ChangeLog priority (again, gcc/ChangeLog is more important than the testsuite one). There are a lot of examples where the testsuite/ChangeLog entry has another author: r145055, r150680, r155889, r155894, r155890, r163904, r180186 and r183325. 3d) If we fix 3b+3c we can also look at r143753, r155890 and r155895. 3e) r155891, r207422, r183627 and r234218 are examples of commits which don't touch any ChangeLog files for different reasons. Seems unsolvable in current approach. -- Roman ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-25 11:03 ` Roman Zhuykov @ 2019-12-25 11:20 ` Joseph Myers 2019-12-25 12:23 ` Eric S. Raymond 2019-12-25 14:32 ` Andreas Schwab 2019-12-27 14:37 ` Richard Earnshaw 2 siblings, 1 reply; 67+ messages in thread From: Joseph Myers @ 2019-12-25 11:20 UTC (permalink / raw) To: Roman Zhuykov Cc: Segher Boessenkool, gcc, Alexander Monakov, Maxim Kuvyrkov, esr, Richard.Earnshaw, rearnsha On Wed, 25 Dec 2019, Roman Zhuykov wrote: > 2) Some thoughts about the script for summarizing commit log messages: > 2a) Why do r143753 and r150680 not have a "re PR..." summary instead of "[multiple > changes]"? > 2b) Conversely, r155892 should mention two PRs; even "[multiple changes]" > would be better here, IMHO. > 2c) In r130050 and r155902 we have "Rename too ... " in summary, not sure how > to make it better. > 2d) r146882 could have a better summary if we somehow organize ChangeLog priority > (gcc/ChangeLog is more important than the testsuite one). Richard is best placed to comment on these. His script can provide a complete new summary line if the automatically-generated one seems bad. > 3a) In r155892, r155893 and r259314 Alex is not counted as the only author, > for no apparent reason. The first two look like cases where the only difference is in the number of spaces between name and email in the attributions in different ChangeLog files. Should be straightforward to fix by doing more parsing / normalization before deciding whether attributions are the same. The third is a case where the heuristic is applied that if a commit only changes ChangeLog files and nothing else, attributions should not be extracted from those ChangeLog files because it's particularly likely in that case that someone else's ChangeLog entries may be being edited. > 3b) In r139854, r141108 and r196252 the script selected the author successfully, > while actually there is more than one.
These are all cases covered by the request-for-enhancement issue for adding Co-Authored-by: when the ChangeLog header names multiple authors, as the corresponding de facto git idiom for that case. > 3e) r155891, r207422, r183627 and r234218 are examples of commits which don't > touch any ChangeLog files for different reasons. Seems unsolvable in current > approach. If a ChangeLog file isn't touched, indeed we don't have a good basis for using an author identity other than the committer identity (especially given that some people used multiple email addresses simultaneously, with different ones used for different kinds of commits, and objected to having one with the wrong affiliation associated with a commit they made in connection with a different affiliation). -- Joseph S. Myers jsm@polyomino.org.uk ^ permalink raw reply [flat|nested] 67+ messages in thread
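The "more parsing / normalization" suggested above for the whitespace-only differences (item 3a) might look like the sketch below. It assumes the usual GNU ChangeLog header layout of a date followed by a name and an address in angle brackets; the regular expression, the function names, and the sample identity are illustrative assumptions, not the parser reposurgeon actually uses.

```python
import re
from typing import Optional, Tuple

# Assumed header shape: "YYYY-MM-DD  Name  <email>", with any amount of
# whitespace between the three fields.
_HEADER = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+(.+?)\s*<([^>]+)>\s*$")


def parse_header(line: str) -> Optional[Tuple[str, str]]:
    """Return a normalized (name, email) pair, or None if not a header."""
    match = _HEADER.match(line)
    if match is None:
        return None
    _date, name, email = match.groups()
    # Collapse runs of whitespace so spacing differences do not matter.
    return (" ".join(name.split()), email.strip())


def same_attribution(header_a: str, header_b: str) -> bool:
    """True if both lines are headers naming the same person and address."""
    a, b = parse_header(header_a), parse_header(header_b)
    return a is not None and a == b


if __name__ == "__main__":
    # Hypothetical headers differing only in spacing.
    one = "2010-01-14  Jane Doe  <jane@example.org>"
    two = "2010-01-14  Jane Doe <jane@example.org>"
    print(same_attribution(one, two))
```

Comparing the normalized pairs rather than the raw header text means two ChangeLog files that differ only in spacing count as one author.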
* Re: Test GCC conversion with reposurgeon available 2019-12-25 11:20 ` Joseph Myers @ 2019-12-25 12:23 ` Eric S. Raymond 0 siblings, 0 replies; 67+ messages in thread From: Eric S. Raymond @ 2019-12-25 12:23 UTC (permalink / raw) To: Joseph Myers Cc: Roman Zhuykov, Segher Boessenkool, gcc, Alexander Monakov, Maxim Kuvyrkov, Richard.Earnshaw, rearnsha Joseph Myers <jsm@polyomino.org.uk>: > These are all cases covered by the request-for-enhancement issue for > adding Co-Authored-by: when the ChangeLog header names multiple authors, > as the corresponding de facto git idiom for that case. I apologize, but I am growing doubtful I can deliver that. Even if I can, it may take longer than your conversion schedule allows given that we've only got five days on the clock. Here are the problems: 1. I don't have a reduced test case to validate parsing against. 2. The ChangeLog-parsing code is fragile and difficult to modify. This is inherent - the syntactic cues it's working with are weak and false matches are all too easy. I've got to have 1 before I can even try to deal with 2. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-25 11:03 ` Roman Zhuykov 2019-12-25 11:20 ` Joseph Myers @ 2019-12-25 14:32 ` Andreas Schwab 2019-12-25 14:41 ` Joseph Myers 2019-12-25 19:19 ` Eric S. Raymond 2019-12-27 14:37 ` Richard Earnshaw 2 siblings, 2 replies; 67+ messages in thread From: Andreas Schwab @ 2019-12-25 14:32 UTC (permalink / raw) To: Roman Zhuykov Cc: Segher Boessenkool, Joseph Myers, gcc, Alexander Monakov, Maxim Kuvyrkov, esr On Dez 25 2019, Roman Zhuykov wrote: > 1) In gcc-pretty timezone info is lost in both author/committer date > (keeping UTC time correct, certainly). Since svn doesn't record time zones you cannot lose them, only fabricate them. > Not sure we need that info, but reposurgeon is more correct here. Definitely not. I have never authored or committed any revision in the -0800 time zone. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-25 14:32 ` Andreas Schwab @ 2019-12-25 14:41 ` Joseph Myers 2019-12-25 15:10 ` Andreas Schwab 2019-12-25 19:19 ` Eric S. Raymond 1 sibling, 1 reply; 67+ messages in thread From: Joseph Myers @ 2019-12-25 14:41 UTC (permalink / raw) To: Andreas Schwab Cc: Roman Zhuykov, Segher Boessenkool, gcc, Alexander Monakov, Maxim Kuvyrkov, esr On Wed, 25 Dec 2019, Andreas Schwab wrote: > > Not sure we need that info, but reposurgeon is more correct here. > > Definitely not. I have never authored or committed any revision in the > -0800 time zone. If reposurgeon is defaulting to the local time where the conversion is run, there's a strong argument it should default to UTC to be deterministic. I've made a note to make the gcc-conversion machinery use TZ=UTC0 to avoid such issues. Timezones for any email address can be specified in gcc.map for any authors wishing to have an appropriate timezone used for their commits. -- Joseph S. Myers jsm@polyomino.org.uk ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-25 14:41 ` Joseph Myers @ 2019-12-25 15:10 ` Andreas Schwab 2019-12-25 15:36 ` Joseph Myers 0 siblings, 1 reply; 67+ messages in thread From: Andreas Schwab @ 2019-12-25 15:10 UTC (permalink / raw) To: Joseph Myers Cc: Roman Zhuykov, Segher Boessenkool, gcc, Alexander Monakov, Maxim Kuvyrkov, esr On Dez 25 2019, Joseph Myers wrote: > Timezones for any email address can be specified in gcc.map for any > authors wishing to have an appropriate timezone used for their commits. But that should not be used for unrelated authors. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-25 15:10 ` Andreas Schwab @ 2019-12-25 15:36 ` Joseph Myers 2019-12-25 17:15 ` Segher Boessenkool ` (2 more replies) 0 siblings, 3 replies; 67+ messages in thread From: Joseph Myers @ 2019-12-25 15:36 UTC (permalink / raw) To: Andreas Schwab Cc: Roman Zhuykov, Segher Boessenkool, gcc, Alexander Monakov, Maxim Kuvyrkov, esr On Wed, 25 Dec 2019, Andreas Schwab wrote: > On Dez 25 2019, Joseph Myers wrote: > > > Timezones for any email address can be specified in gcc.map for any > > authors wishing to have an appropriate timezone used for their commits. > > But that should not be used for unrelated authors. It's not. On investigation, I think you are referring to the conversion of r269472. That was committed for you by Jim Wilson and thus has you as author and Jim Wilson as committer and Jim Wilson's timezone entry has been applied. So the argument here is that the author's timezone information should be applied to the author date, and the committer's timezone information should be applied to the committer date. I expect that should be straightforward (although when coming from SVN, there's also an argument that we only have committer dates so the committer timezone is the relevant one to apply). -- Joseph S. Myers jsm@polyomino.org.uk ^ permalink raw reply [flat|nested] 67+ messages in thread
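The proposal in this message (the author's timezone on the author date, the committer's timezone on the committer date, UTC when nothing is known) can be sketched as below. This is a rough illustration only: the function name, the offsets table, and the example.org addresses are hypothetical, and fixed UTC offsets in minutes stand in for whatever form the timezone annotations in gcc.map actually take.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Tuple

UTC = timezone.utc


def convert_dates(
    svn_time_utc: datetime,
    author_email: str,
    committer_email: str,
    offset_minutes: Dict[str, int],
) -> Tuple[datetime, datetime]:
    """Return (author_date, committer_date) for one converted commit.

    SVN stores a single UTC timestamp per revision, so both dates are the
    same instant; only the displayed offset differs. An address without a
    known offset is rendered in UTC, which keeps the result deterministic.
    """
    def render(email: str) -> datetime:
        minutes = offset_minutes.get(email)
        zone = UTC if minutes is None else timezone(timedelta(minutes=minutes))
        return svn_time_utc.astimezone(zone)

    return render(author_email), render(committer_email)


if __name__ == "__main__":
    # Made-up revision time and addresses, for illustration only.
    stamp = datetime(2019, 3, 8, 12, 0, 0, tzinfo=UTC)
    offsets = {"committer@example.org": -8 * 60}
    author_date, committer_date = convert_dates(
        stamp, "author@example.org", "committer@example.org", offsets
    )
    print(author_date.isoformat())     # rendered in UTC (no entry for author)
    print(committer_date.isoformat())  # rendered at -08:00
    print(author_date == committer_date)  # same instant either way
```

With this split, a committer's timezone entry no longer leaks onto an author who has none, which is the situation described for r269472.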
* Re: Test GCC conversion with reposurgeon available 2019-12-25 15:36 ` Joseph Myers @ 2019-12-25 17:15 ` Segher Boessenkool 2019-12-25 19:33 ` Eric S. Raymond 2019-12-25 19:40 ` Eric S. Raymond 2019-12-27 21:29 ` Andreas Schwab 2 siblings, 1 reply; 67+ messages in thread From: Segher Boessenkool @ 2019-12-25 17:15 UTC (permalink / raw) To: Joseph Myers Cc: Andreas Schwab, Roman Zhuykov, gcc, Alexander Monakov, Maxim Kuvyrkov, esr On Wed, Dec 25, 2019 at 03:36:38PM +0000, Joseph Myers wrote: > On Wed, 25 Dec 2019, Andreas Schwab wrote: > > > On Dez 25 2019, Joseph Myers wrote: > > > > > Timezones for any email address can be specified in gcc.map for any > > > authors wishing to have an appropriate timezone used for their commits. > > > > But that should not be used for unrelated authors. > > It's not. > > On investigation, I think you are referring to the conversion of r269472. > That was committed for you by Jim Wilson and thus has you as author and > Jim Wilson as committer and Jim Wilson's timezone entry has been applied. > So the argument here is that the author's timezone information should be > applied to the author date, and the committer's timezone information > should be applied to the committer date. I expect that should be > straightforward (although when coming from SVN, there's also an argument > that we only have committer dates so the committer timezone is the > relevant one to apply). Or we could just not make up any time zone at all. The information isn't there, what is gained by faking something? Having people's real names is obviously useful. Showing the email address they used when they did the patch (which can be an indication of affiliation, for example) can also be useful, but less so, and is harder to get right. But the timezone some patch was made in (or committed in)? The goal is not to pretend we never used SVN. The goal is to have a Git repo that is as useful as possible for us. 
For me, that means the stuff inherited from the older repos should be just that: exactly what was there before. With annoyances like real name fixed, perhaps, and maybe actual errors fixed (although I never in practice saw *any* error that made anything even the slightest bit harder to do). But no lipstick. Segher ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-25 17:15 ` Segher Boessenkool @ 2019-12-25 19:33 ` Eric S. Raymond 2019-12-26 21:03 ` Vincent Lefevre 0 siblings, 1 reply; 67+ messages in thread From: Eric S. Raymond @ 2019-12-25 19:33 UTC (permalink / raw) To: Segher Boessenkool Cc: Joseph Myers, Andreas Schwab, Roman Zhuykov, gcc, Alexander Monakov, Maxim Kuvyrkov Segher Boessenkool <segher@kernel.crashing.org>: > The goal is not to pretend we never used SVN. One of *my* goals is that the illusion of git back to the beginning of time should be as consistent as possible. > The goal is to have a Git repo that is as useful as possible for us. Exactly. I've already written about minimizing cognitive friction. Here's why you want to get timezones right: there are going to be times when the order of commits is significant information for a developer's understanding of what happened. But without a timezone you only know the actual time of a commit to 24-hour resolution. There is no way we'll get this perfect. But there is more wrong and less wrong, and reposurgeon tries hard to be less wrong. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-25 19:33 ` Eric S. Raymond @ 2019-12-26 21:03 ` Vincent Lefevre 2019-12-26 21:31 ` Eric S. Raymond 0 siblings, 1 reply; 67+ messages in thread From: Vincent Lefevre @ 2019-12-26 21:03 UTC (permalink / raw) To: gcc On 2019-12-25 14:33:45 -0500, Eric S. Raymond wrote: > Segher Boessenkool <segher@kernel.crashing.org>: > > The goal is not to pretend we never used SVN. > > One of *my* goals is that the illusion of git back to the beginning of > time should be as consistent as possible. > > > The goal is to have a Git repo that is as useful as possible for us. > > Exactly. I've already written about minimizing cognitive friction. > > Here's why you want to get timezones right: there are going to be times > when the order of commits is significant information for a developer's > understanding of what happened. But without a timezone you only know > the actual time of a commit to 24-hour resolution. I don't understand what you mean. What matters for the order of commits is the global time, and this is what SVN stores. SVN does not store timezone information, i.e. it has no idea what local time the user had, but I don't think this is important information. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-26 21:03 ` Vincent Lefevre @ 2019-12-26 21:31 ` Eric S. Raymond 2019-12-26 22:25 ` Toon Moene 2019-12-26 22:57 ` Vincent Lefevre 0 siblings, 2 replies; 67+ messages in thread From: Eric S. Raymond @ 2019-12-26 21:31 UTC (permalink / raw) To: gcc Vincent Lefevre <vincent+gcc@vinc17.org>: > > Here's why you want to get timezones right: there are going to be times > > when the order of commits is significant information for a developer's > > understanding of what happened. But without a timezone you only know > > the actual time of a commit to 24-hour resolution. > > I don't understand what you mean. What matters for the order of > commits is the global time, and this is what SVN stores. SVN does not > store timezone information, i.e. it has no idea what local time > the user had, but I don't think this is important information. UTC time plus a timezone offset is what git stores. That's not the locus of the problem. In Subversion-land there's never any doubt about the sequence of commits; the revision numbers tell you that. In Git-land you have to go by timestamps, and if a timezone entry is wrong it can skew the displayed time. Me, I don't understand why version-control systems designed for distributed use don't ignore timezones entirely and display all times in UTC - relative time is surely more important than the commit time's relationship to solar noon wherever the keyboard happened to be. But I don't make these decisions. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-26 21:31 ` Eric S. Raymond @ 2019-12-26 22:25 ` Toon Moene 2019-12-26 22:32 ` Eric S. Raymond 2019-12-26 22:57 ` Vincent Lefevre 1 sibling, 1 reply; 67+ messages in thread From: Toon Moene @ 2019-12-26 22:25 UTC (permalink / raw) To: esr, gcc On 12/26/19 10:30 PM, Eric S. Raymond wrote: > Me, I don't understand why version-control systems designed for distributed > use don't ignore timezones entirely and display all times in UTC - relative > time is surely more important than the commit time's relationship to solar > noon wherever the keyboard happened to be. But I don't make these decisions. So we are going to base this world wide free software endeavor on a source code system that doesn't keep time by UTC ? My God - imagine if weather forecasting was done this way. -- Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/ Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-26 22:25 ` Toon Moene @ 2019-12-26 22:32 ` Eric S. Raymond 2019-12-27 14:40 ` Segher Boessenkool 0 siblings, 1 reply; 67+ messages in thread From: Eric S. Raymond @ 2019-12-26 22:32 UTC (permalink / raw) To: Toon Moene; +Cc: gcc Toon Moene <toon@moene.org>: > On 12/26/19 10:30 PM, Eric S. Raymond wrote: > > > Me, I don't understand why version-control systems designed for distributed > > use don't ignore timezones entirely and display all times in UTC - relative > > time is surely more important than the commit time's relationship to solar > > noon wherever the keyboard happened to be. But I don't make these decisions. > > So we are going to base this world wide free software endeavor on a source > code system that doesn't keep time by UTC ? They all *do* keep time by UTC. What confuses me is why they ever try to *display* anything other than UTC. It seems pointless to me to ever display local time in clients, but they do it anyway. Without that complication, there would be no need to track user timezones. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-26 22:32 ` Eric S. Raymond @ 2019-12-27 14:40 ` Segher Boessenkool 0 siblings, 0 replies; 67+ messages in thread From: Segher Boessenkool @ 2019-12-27 14:40 UTC (permalink / raw) To: Eric S. Raymond; +Cc: Toon Moene, gcc On Thu, Dec 26, 2019 at 05:32:52PM -0500, Eric S. Raymond wrote: > Toon Moene <toon@moene.org>: > > So we are going to base this world wide free software endeavor on a source > > code system that doesn't keep time by UTC ? > > They all *do* keep time by UTC. (Git stores unix time, instead -- close enough ;-) ) > What confuses me is why they every try to *display* anything other than UTC. That depends on what you yourself configured (log.date, blame.date). > It seems pointless to me to ever display local time in clients, but they do it > anyway. Many people only work with their local team in their local office. For GCC we do not care, and we could always store +0000 as timezone offset -- we do not *have* the correct data, and at least UTC is more useful to display. Segher ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-26 21:31 ` Eric S. Raymond 2019-12-26 22:25 ` Toon Moene @ 2019-12-26 22:57 ` Vincent Lefevre 2019-12-26 23:38 ` Eric S. Raymond 1 sibling, 1 reply; 67+ messages in thread From: Vincent Lefevre @ 2019-12-26 22:57 UTC (permalink / raw) To: gcc On 2019-12-26 16:30:15 -0500, Eric S. Raymond wrote: > Vincent Lefevre <vincent+gcc@vinc17.org>: > > > Here's why you want to get timezones right: there are going to be times > > > when the order of commits is significant information for a developer's > > > understanding of what happened. But without a timezone you only know > > > the actual time of a commit to 24-hour resolution. > > > > I don't understand what you mean. What matters for the order of > > commits is the global time, and this is what SVN stores. SVN does not > > store timezone information, i.e. it has no idea what local time > > the user had, but I don't think this is important information. > > UTC time plus a timezone offset is what git stores. That's not the > locus of the problem. > > In Subversion-land there's never any doubt about the sequence of commits; > the revision numbers tell you that. In Git-land you have to go by timestamps, > and if a timezone entry is wrong it can skew the displayed time. What matters is that the date is correct. I don't think the timezone matters (that's why SVN doesn't store timezone information, I assume), possibly except for the committer himself (?). For instance, 2019-11-27 02:32:02 +0100 and 2019-11-27 01:32:02 +0000 correspond to the same date. So, each one (as stored in the repository) is fine if you want to be able to know the order of commits. What is displayed then is actually a user-config issue. The conversion utility can't solve this issue, since after conversion, committers will be able to use any timezone they like. 
> Me, I don't understand why version-control systems designed for distributed > use don't ignore timezones entirely and display all times in UTC - relative > time is surely more important than the commit time's relationship to solar > noon wherever the keyboard happened to be. But I don't make these decisions. I agree, at least being able to display all times in a *fixed* timezone (chosen by the user), as this could be easier for the user to know when recent commits occur (by "recent", this can be less than 24 hours ago). For UTC, you can use: TZ=UTC git log --date=iso-local The date format can be stored in ~/.gitconfig, but unfortunately not local timezone information. In this case, the timezones of the commits chosen by the conversion utility will not matter at all. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 67+ messages in thread
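Vincent's point that the two timestamps name the same instant can be checked with a few lines of Python. This is a small sketch using only the standard library; the helper names `parse` and `to_utc` and the three-entry `stamps` list are invented for the illustration, and only the first two timestamps come from the message above.

```python
from datetime import datetime, timezone

FMT = "%Y-%m-%d %H:%M:%S %z"  # e.g. "2019-11-27 02:32:02 +0100"


def parse(stamp):
    """Parse a timestamp with a numeric UTC offset into an aware datetime."""
    return datetime.strptime(stamp, FMT)


def to_utc(stamp):
    """Re-render the same instant with a +0000 offset."""
    return parse(stamp).astimezone(timezone.utc).strftime(FMT)


local = parse("2019-11-27 02:32:02 +0100")
utc = parse("2019-11-27 01:32:02 +0000")

# Aware datetimes compare by instant, so the offset does not affect equality.
print(local == utc)
print(to_utc("2019-11-27 02:32:02 +0100"))

# Sorting on the parsed value orders entries by real time, whatever offset
# each one carries; sorting the raw strings would not.
stamps = [
    "2019-11-27 03:00:00 +0100",
    "2019-11-26 18:30:00 -0800",
    "2019-11-27 01:32:02 +0000",
]
ordered = sorted(stamps, key=parse)
print(ordered)
```

The stored offset changes how a time is displayed, not where the commit falls in the sequence, which is why a wrong guess by the conversion affects presentation only.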
* Re: Test GCC conversion with reposurgeon available 2019-12-26 22:57 ` Vincent Lefevre @ 2019-12-26 23:38 ` Eric S. Raymond 0 siblings, 0 replies; 67+ messages in thread From: Eric S. Raymond @ 2019-12-26 23:38 UTC (permalink / raw) To: gcc Vincent Lefevre <vincent+gcc@vinc17.org>: > What matters is that the date is correct. I don't think the timezone > matters (that's why SVN doesn't store timezone information, I assume), > possibly except for the committer himself (?). For instance, Subversion doesn't store timezone because all commits are considered to have occurred at UTC time on a central repository. I think time as well as date matters because sometimes it could be information of significance what order commits were in even if they were on the same day. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-25 15:36 ` Joseph Myers 2019-12-25 17:15 ` Segher Boessenkool @ 2019-12-25 19:40 ` Eric S. Raymond 2019-12-27 21:29 ` Andreas Schwab 2 siblings, 0 replies; 67+ messages in thread From: Eric S. Raymond @ 2019-12-25 19:40 UTC (permalink / raw) To: Joseph Myers Cc: Andreas Schwab, Roman Zhuykov, Segher Boessenkool, gcc, Alexander Monakov, Maxim Kuvyrkov Joseph Myers <jsm@polyomino.org.uk>: > On Wed, 25 Dec 2019, Andreas Schwab wrote: > > > On Dez 25 2019, Joseph Myers wrote: > > > > > Timezones for any email address can be specified in gcc.map for any > > > authors wishing to have an appropriate timezone used for their commits. > > > > But that should not be used for unrelated authors. > > It's not. > > On investigation, I think you are referring to the conversion of r269472. > That was committed for you by Jim Wilson and thus has you as author and > Jim Wilson as committer and Jim Wilson's timezone entry has been applied. > So the argument here is that the author's timezone information should be > applied to the author date, and the committer's timezone information > should be applied to the committer date. I expect that should be > straightforward (although when coming from SVN, there's also an argument > that we only have committer dates so the committer timezone is the > relevant one to apply). There's also an FSF policy about Changelogs that's relevant, I think. Git sometimes fills in the author field from the committer, and Changelog parsing is done only after translation. That's probably the source of this bug. If anybody cares enough to file a bug with a test load attached, I can probably fix this. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-25 15:36 ` Joseph Myers 2019-12-25 17:15 ` Segher Boessenkool 2019-12-25 19:40 ` Eric S. Raymond @ 2019-12-27 21:29 ` Andreas Schwab 2019-12-27 21:43 ` Joseph Myers 2 siblings, 1 reply; 67+ messages in thread From: Andreas Schwab @ 2019-12-27 21:29 UTC (permalink / raw) To: Joseph Myers Cc: Roman Zhuykov, Segher Boessenkool, gcc, Alexander Monakov, Maxim Kuvyrkov, esr On Dez 25 2019, Joseph Myers wrote: > On investigation, I think you are referring to the conversion of r269472. > That was committed for you by Jim Wilson and thus has you as author and > Jim Wilson as committer and Jim Wilson's timezone entry has been applied. > So the argument here is that the author's timezone information should be > applied to the author date, and the committer's timezone information > should be applied to the committer date. I expect that should be > straightforward (although when coming from SVN, there's also an argument > that we only have committer dates so the committer timezone is the > relevant one to apply). SVN also only has a committer, so the fabricated author should not be influenced by the committer. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different." ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-27 21:29 ` Andreas Schwab @ 2019-12-27 21:43 ` Joseph Myers 0 siblings, 0 replies; 67+ messages in thread From: Joseph Myers @ 2019-12-27 21:43 UTC (permalink / raw) To: Andreas Schwab Cc: Roman Zhuykov, Segher Boessenkool, gcc, Alexander Monakov, Maxim Kuvyrkov, esr On Fri, 27 Dec 2019, Andreas Schwab wrote: > SVN also only has a committer, so the fabricated author should not be > influenced by the committer. That issue has been fixed. -- Joseph S. Myers jsm@polyomino.org.uk ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-25 14:32 ` Andreas Schwab 2019-12-25 14:41 ` Joseph Myers @ 2019-12-25 19:19 ` Eric S. Raymond 2019-12-27 21:30 ` Andreas Schwab 1 sibling, 1 reply; 67+ messages in thread From: Eric S. Raymond @ 2019-12-25 19:19 UTC (permalink / raw) To: Andreas Schwab Cc: Roman Zhuykov, Segher Boessenkool, Joseph Myers, gcc, Alexander Monakov, Maxim Kuvyrkov Andreas Schwab <schwab@linux-m68k.org>: > Definitely not. I have never authored or committed any revision in the > -0800 time zone. That's easily fixed by adding a timezone entry to your author-map entry - CET, is it? That will prevent reposurgeon from making any attempt to deduce your timezone. It would be interesting to know how reposurgeon got misled. Most likely it was by a Changelog entry. Reposurgeon watches as these are being processed to see if it can pin an email address to a single timezone by looking up its TLD in the IANA database. I don't know how that could land you in California, though. Maybe I ought to be logging timezone deductions so we can trace them back. Has anyone else seen wrong timezone attributions? -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-25 19:19 ` Eric S. Raymond @ 2019-12-27 21:30 ` Andreas Schwab 2019-12-28 2:43 ` Eric S. Raymond 0 siblings, 1 reply; 67+ messages in thread From: Andreas Schwab @ 2019-12-27 21:30 UTC (permalink / raw) To: Eric S. Raymond Cc: Roman Zhuykov, Segher Boessenkool, Joseph Myers, gcc, Alexander Monakov, Maxim Kuvyrkov On Dez 25 2019, Eric S. Raymond wrote: > That's easily fixed by adding a timezone entry to your author-map > entry - CET, is it? The time zone is not constant. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different." ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-27 21:30 ` Andreas Schwab @ 2019-12-28 2:43 ` Eric S. Raymond 0 siblings, 0 replies; 67+ messages in thread From: Eric S. Raymond @ 2019-12-28 2:43 UTC (permalink / raw) To: Andreas Schwab Cc: Roman Zhuykov, Segher Boessenkool, Joseph Myers, gcc, Alexander Monakov, Maxim Kuvyrkov Andreas Schwab <schwab@linux-m68k.org>: > On Dez 25 2019, Eric S. Raymond wrote: > > > That's easily fixed by adding a timezone entry to your author-map > > entry - CET, is it? > > The time zone is not constant. Congratulations, you have broken one of reposurgeon's assumptions. It is possible to use reposurgeon's DSL to set committer TZ on a selected set of commits; if you want to work up a patch for the lift script we'll take it. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-25 11:03 ` Roman Zhuykov 2019-12-25 11:20 ` Joseph Myers 2019-12-25 14:32 ` Andreas Schwab @ 2019-12-27 14:37 ` Richard Earnshaw 2 siblings, 0 replies; 67+ messages in thread From: Richard Earnshaw @ 2019-12-27 14:37 UTC (permalink / raw) To: Roman Zhuykov, Segher Boessenkool, Joseph Myers Cc: gcc, Alexander Monakov, Maxim Kuvyrkov, esr On 25/12/2019 11:02, Roman Zhuykov wrote: > First of all thanks to everyone who spent time making the conversion > better and better. Here is my 2c, I have studied a little my colleagues > trunk history in Maxim's gcc-pretty vs gcc-reposurgeon-5b. > > 1) In gcc-pretty timezone info is lost in both author/commiter date > (keeping UTC time correct, certainly). Examples are r278990 and r289989. > Probably git-svn causes this, current read-only git mirror is also > without timezone. Not sure we need that info, but reposurgeon is more > correct here. > > 2) Some thoughts about script for summarizing commit log messages: > 2a) Why r143753 and r150680 not have "re PR..." summary instead of > "[multiple changes]" ? Both of these commits have more than one hunk with different authors, that triggers the heuristic to detect multiple independent changes that have been merged into a single commit (this is most common on branches, where it is common to aggregate a large number of backport commits - for those, using the first PR found as a key is rarely right even if there is only one PR mentioned). As Joseph said, providing a specific rule for these cases is possible. > 2b) On the contrary r155892 have to mention two PRs, even "[multiple > changes]" is better here, IMHO. For this one, the heuristic is that the PRs are likely to be related and that either *could* form a summary. We choose the first one mentioned. Looking at the commit log I cannot tell whether this is two independent fixes rolled into a single commit or two bugs that relate to the same change. 
> 2c) In r130050 and r155902 we have "Rename too ... " in summary, not > sure how to make it better. Well, we *could* try to extract the target function name for the rename of the target, but I'm not about to try that right now. > 2d) r146882 can have better summary if we somehow organize ChangeLog > priority (gcc/ChangeLog is more important that testsuite one). Commit logs follow conventions quite weakly (if they followed them more strongly we wouldn't need the script at all, since all of them would have a summary line already ;-) As such, parsing them is not easy. The real question you need to answer is along the lines of: is 20071210-2.c: New testcase. a *less* useful summary than gcc/testsuite/Changelog: (which is what will appear if we do nothing in this case)? Unless the answer to that is 'yes', my script is working as intended, which is to try to produce something more useful than we would get by default. If we had another year, it might be possible to develop some AI that read the commit log, searched the mailing list for the relevant email for the commit and then proceeded to solve the halting problem as something trivial along the side, but we aren't going to wait that long ;) More importantly, if you see commits with particularly egregious summaries in the trial conversions, please file a ticket at https://gitlab.com/esr/gcc-conversion.git and we can look into what suitable action might be needed. R. > > 3) About author emails, see below > 24.12.2019 21:14, Segher Boessenkool wrote: >> On Tue, Dec 24, 2019 at 05:16:54PM +0000, Joseph Myers wrote: >>> On Tue, 24 Dec 2019, Segher Boessenkool wrote: >>>>> That's because that commit also edits ChangeLog entries from other >>>>> authors. 
When a commit adds / edits ChangeLog entries for more >>>>> than one >>>>> author (the difference between purely editing an existing entry and >>>>> adding >>>>> a new one, possibly under an existing date/author header, for a >>>>> multi-author commit, is not something that can reliably be determined >>>>> automatically), the conversion falls back to using the committer >>>>> identity >>>>> instead of picking one of the multiple relevant authors from the >>>>> ChangeLog >>>>> files. >>>> There is only one relevant author in r270511. It edits a few wrong >>>> path >>>> names in the previous changelog entries. People often do similar >>>> things >>>> (like fixing the commit date :-) ) >>> Distinguishing "edits a previous ChangeLog entry" from "adds a new entry >>> under a previous ChangeLog header for a change included in the >>> commit" is >>> a human judgement. >> We are doing only one conversion here, the one of the GCC repo. The >> heuristic works, we checked it did. >> >>>> Either never use <account>@gcc.gnu.org, or always use it, don't do the >>>> worst of both worlds? >>> The heuristics here are to use an attribution from ChangeLog for the >>> author where unambiguous, but to use the committer (always >>> @gcc.gnu.org / >>> @gnu.org [*], so avoiding attributions at the wrong company even where >>> people were using multiple addresses simultaneously for different >>> changes) >>> as author if in doubt. >> You never need that, and it is worse to use two different schemes than to >> choose either. >> >> I would have chosen the "<account>@gcc.gnu.org" scheme, because it is >> simple and *correct*. Other people wanted the nicer names. Maxim's >> conversion gets that correct. Please copy it. >> > IMHO Segher is a bit categorical is the discussion, but I'll be glad to > see brief description of Maxim's approach to manage emails, gcc-pretty > shows better results. 
> Speaking about the script counting authors from ChangeLog files, even if > we drop an "edits a previous ChangeLog entry" issue, it still sometimes > work not as Joseph described: > 3a) In r155892, r155893 and r259314 Alex is not counted as the only > author without any reason. > 3b) In r139854, r141108 and r196252 script selected the author > successfully, while actually there are more that one. > 3c) Maybe here we can also somehow organize ChangeLog priority (again, > gcc/ChangeLog is more important that testsuite one). There are a lot of > examples, when testsuite/ChangeLog entry have another author: r145055, > r150680, r155889, r155894, r155890, r163904, r180186 and r183325. > 3d) If we fix 3b+3c we can also look at r143753, r155890 and r155895. > 3e) r155891, r207422, r183627 and r234218 are examples of commits which > don't touch any ChangeLog files for different reasons. Seems unsolvable > in current approach. > > -- > Roman ^ permalink raw reply [flat|nested] 67+ messages in thread
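The two summary rules Richard describes in this message (several differently-authored hunks give "[multiple changes]"; otherwise the first PR mentioned is used) could be sketched roughly like this. It is a toy reading of the description, not Richard's script: the regular expressions, the `summarize` function and the sample logs are all assumptions made for the example.

```python
import re

# Assumed shape of a ChangeLog-style header: "YYYY-MM-DD  Name  <email>".
HEADER = re.compile(r"^\d{4}-\d{2}-\d{2}[ \t]+(.+?)[ \t]+<[^>]+>[ \t]*$", re.MULTILINE)
# Assumed shape of a bug reference: "PR component/number".
PR_REF = re.compile(r"\bPR\s+([A-Za-z0-9+_-]+/\d+)")


def summarize(log):
    """Return a generated summary line, or None to leave the log untouched."""
    authors = {name.strip() for name in HEADER.findall(log)}
    if len(authors) > 1:
        # Hunks from different authors: treat as independent changes merged together.
        return "[multiple changes]"
    prs = PR_REF.findall(log)
    if prs:
        # Several PRs under one author are assumed related; use the first mentioned.
        return "re PR " + prs[0]
    return None


single = (
    "2009-01-01  Alice Example  <alice@example.org>\n\n"
    "\tPR middle-end/11111\n\tPR target/22222\n\t* foo.c (bar): Fix.\n"
)
mixed = single + "\n2009-01-02  Bob Example  <bob@example.org>\n\n\t* baz.c: Update.\n"

print(summarize(single))
print(summarize(mixed))
```

Cases such as r155892, where mentioning both PRs would read better, are exactly where a per-commit override is needed rather than a smarter rule.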
* Re: Test GCC conversion with reposurgeon available 2019-12-22 13:57 ` Joseph Myers 2019-12-23 17:27 ` Roman Zhuykov @ 2019-12-24 10:57 ` Maxim Kuvyrkov 2019-12-28 16:30 ` Joseph Myers 2 siblings, 0 replies; 67+ messages in thread From: Maxim Kuvyrkov @ 2019-12-24 10:57 UTC (permalink / raw) To: Joseph S. Myers; +Cc: gcc, esr > On Dec 22, 2019, at 4:56 PM, Joseph Myers <joseph@codesourcery.com> wrote: > > On Thu, 19 Dec 2019, Joseph Myers wrote: > >> And two more. >> >> git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-4a.git >> git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-4b.git > > Two more. > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-5a.git > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-5b.git > > The main changes are: > > * The case of both svnmerge-integrated and svn:mergeinfo being set is now > handled properly, so the commit Bernd found is interpreted as a merge from > trunk to named-addr-spaces-branch and has exactly two parents as expected, > with the parents corresponding to the merges from other branches to trunk > being optimized away. > > * The author map used now avoids timezone-only entries also remapping > email addresses, so the email addresses from the ChangeLogs are used > whenever a commit adds ChangeLog entries from exactly one author. > > * When commits add ChangeLog entries from more than one author (e.g. > merges done in CVS), the committer is now used as the author rather than > selecting one of the authors from the ChangeLog entries. > > * The latest whitelisting / PR corrections are used with Richard's script > (430 checkme: entries remain). > > * One fix to the ref renaming in gcc-reposurgeon-5b.git so that the tag > gcc-3_2-rhl8-3_2-7 properly ends up in vendors rather than prereleases. I'll spend next couple of days comparing Joseph's gcc-reposurgeon-5a.git conversion against my gcc-pretty.git and gcc-reparent.git conversions, and will post results along with the scripts to this mailing list. 
Regarding gcc-pretty.git and gcc-reparent.git conversions, I have the following comments so far: Q1: Why are there missing branches for stuff that didn't originate at trunk@1? A1: Indeed, that's by design / configuration. The scripts start with trunk@1 and build a parent DAG from that node. If desired, it is trivial to add more initial "root" commits to include these missing branches. Q2: Why are entries from branches/st/tags treated as branches, not as tags? A2: Because I opted to not special-case these to simplify comparison of different conversions. Tags/* entries are converted to git annotated tags in a separate pass, and it is trivial to add handling for branches/st/tags there. Q3: Why do reparented branches in gcc-reparent.git repo have merge commits at the point of reparenting? A3: That's an artifact of svn-git machinery my scripts are using. I haven't looked at this in depth. Q4: Is it possible to integrate Richard E.'s script to rewrite commit log messages? A4: Yes, absolutely. The scripts have a pass to rewrite commit author/committer entries, and log rewrite easily fits in there. It would be very helpful to have a version of Richard's script that runs on per-commit basis, suitable for "git filter-branch" consumption. Regards, -- Maxim Kuvyrkov https://www.linaro.org ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-22 13:57 ` Joseph Myers 2019-12-23 17:27 ` Roman Zhuykov 2019-12-24 10:57 ` Maxim Kuvyrkov @ 2019-12-28 16:30 ` Joseph Myers 2020-01-03 12:38 ` Joseph Myers 2 siblings, 1 reply; 67+ messages in thread From: Joseph Myers @ 2019-12-28 16:30 UTC (permalink / raw) To: gcc; +Cc: esr On Sun, 22 Dec 2019, Joseph Myers wrote: > Two more. > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-5a.git > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-5b.git Two more. git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-6a.git git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-6b.git These have accumulated mostly minor improvements. There are no automatically-generated .gitignore files any more. There are no merge commits on master any more. There are various improvements relating to ChangeLog handling, though some more such improvements went into reposurgeon after this conversion run started and thus aren't included (and this conversion doesn't include Richard's list of typo fixes for attributions). There are many more whitelistings / PR fixups for generated commit summaries (so all revisions 200000 and later with a checkme have had it resolved). Regarding the ChangeLog improvements, the specific common patterns Jakub noted that appear at the start and end of the alphabet are worked around in bugdb.py for this conversion and handled properly in reposurgeon in changes that went in after this conversion started. Of the three authors remaining that look like they follow those patterns, one should be handled by the code now in reposurgeon (just not by the workaround in bugdb.py) and the other two are cases where I'll add fixup entries in bugdb.py (they do things like repeat the whole date twice). I'll also add a fix in bugdb.py for the cases where the name is inside "" or () in the ChangeLog entry as that can be handled fully automatically. -- Joseph S. 
Myers jsm@polyomino.org.uk ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2019-12-28 16:30 ` Joseph Myers @ 2020-01-03 12:38 ` Joseph Myers 2020-01-06 23:58 ` Andrew Pinski 2020-01-09 12:22 ` Joseph Myers 0 siblings, 2 replies; 67+ messages in thread From: Joseph Myers @ 2020-01-03 12:38 UTC (permalink / raw) To: gcc; +Cc: esr On Sat, 28 Dec 2019, Joseph Myers wrote: > Two more. > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-6a.git > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-6b.git Two more. git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-7a.git git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-7b.git These have further accumulated improvements, especially identifying more merge commits, many improved author attributions, and many improved commit summaries / PR number fixups from Richard's scripts. We're working on identifying further cases where author attributions can be safely improved. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2020-01-03 12:38 ` Joseph Myers @ 2020-01-06 23:58 ` Andrew Pinski 2020-01-07 0:30 ` Joseph Myers 2020-01-07 0:44 ` Richard Earnshaw 2020-01-09 12:22 ` Joseph Myers 1 sibling, 2 replies; 67+ messages in thread From: Andrew Pinski @ 2020-01-06 23:58 UTC (permalink / raw) To: Joseph Myers; +Cc: GCC Mailing List, Eric Raymond On Fri, Jan 3, 2020 at 4:38 AM Joseph Myers <joseph@codesourcery.com> wrote: > > On Sat, 28 Dec 2019, Joseph Myers wrote: > > > Two more. > > > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-6a.git > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-6b.git > > Two more. > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-7a.git > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-7b.git > > These have further accumulated improvements, especially identifying more > merge commits, many improved author attributions, and many improved commit > summaries / PR number fixups from Richard's scripts. We're working on > identifying further cases where author attributions can be safely > improved. A few comments about my commits (and others). * SVN r133438 had the wrong PR # used but it was fixed with SVN r133439. * SVN r134947, only has one commit associated with it but with new testcases too. ** Maybe just use the non testsuite/ChangeLog reference as the subject line. ** Also I Noticed the author for that revision is detected as pinskia@gcc.gnu.org but that is because I used different cases for the emails in the changelog. *** Maybe always using lower case for the email part. * SVN r160418, does not detect me as the main author or even detect Shujing's email. 
** the changelog entry below what was added had a minor whitespace change to it * SVN r211205, does not detect my email address correctly ** I had a typo in the date format (014-06-03 when it should have been 2014-06-03) I don't care if these minor issues don't get fixed, but I suspect fixing them will help fix other issues; I don't know if these have been fixed yet. Thanks, Andrew Pinski > > -- > Joseph S. Myers > joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2020-01-06 23:58 ` Andrew Pinski @ 2020-01-07 0:30 ` Joseph Myers 2020-01-07 0:44 ` Richard Earnshaw 1 sibling, 0 replies; 67+ messages in thread From: Joseph Myers @ 2020-01-07 0:30 UTC (permalink / raw) To: Andrew Pinski; +Cc: GCC Mailing List, Eric Raymond On Mon, 6 Jan 2020, Andrew Pinski wrote: > ** Also I Noticed the author for that revision is detected as > pinskia@gcc.gnu.org but that is because I used different cases for the > emails in the changelog. In my review of possibly suspect authors I'm concentrating on cases where the author name needs review (based on either reposurgeon reporting a ChangeLog header it can't parse, or differences in author name from Maxim's conversion, as an indication that the choice of name is worthy of review), not those where the name is probably OK but it might be possible to improve the choice of email address. > * SVN r160418, does not detect me as the main author or even detect > Shujing's email. > ** the changelog entry below what was added had a minor whitespace change to it > * SVN r211205, does not detect my email address correctly Both of those are included in the manual fixups I've done for suspect cases (since that conversion was generated). I've reviewed all the (about 500) cases of reposurgeon reporting ChangeLog headers it can't parse (with a wide range of typos in date formats etc.), other than those Richard set up fixes for before I started that review, and all the (about 1200) most likely to be significant cases of differences in authors (names) from Maxim's conversion (cases where the commit isn't ChangeLog-only or marked as a backport, resulting in about 400 author improvements, the rest being cases where I thought the author from reposurgeon was the most appropriate one; ChangeLog-missing cases had been reviewed earlier). -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2020-01-06 23:58 ` Andrew Pinski 2020-01-07 0:30 ` Joseph Myers @ 2020-01-07 0:44 ` Richard Earnshaw 1 sibling, 0 replies; 67+ messages in thread From: Richard Earnshaw @ 2020-01-07 0:44 UTC (permalink / raw) To: Andrew Pinski, Joseph Myers; +Cc: GCC Mailing List, Eric Raymond On 06/01/2020 23:57, Andrew Pinski wrote: > On Fri, Jan 3, 2020 at 4:38 AM Joseph Myers <joseph@codesourcery.com> wrote: >> >> On Sat, 28 Dec 2019, Joseph Myers wrote: >> >>> Two more. >>> >>> git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-6a.git >>> git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-6b.git >> >> Two more. >> >> git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-7a.git >> git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-7b.git >> >> These have further accumulated improvements, especially identifying more >> merge commits, many improved author attributions, and many improved commit >> summaries / PR number fixups from Richard's scripts. We're working on >> identifying further cases where author attributions can be safely >> improved. > > A few comments about my commits (and others). > * SVN r133438 had the wrong PR # used but it was fixed with SVN r133439. I've added a fixup. > * SVN r134947, only has one commit associated with it but with new > testcases too. This is due to the emails not matching exactly. It's not really practical to do case-independent comparisons, so I've added a forced email for the author and fixed the summary manually. > ** Maybe just use the non testsuite/ChangeLog reference as the subject line. The scanner for this doesn't look at ChangeLog files, only at the commit message itself. Again, it's not really feasible to tease testsuite/non-testsuite changes out reliably, so I don't try. > ** Also I Noticed the author for that revision is detected as > pinskia@gcc.gnu.org but that is because I used different cases for the > emails in the changelog. Which led to all of the above. R.
> *** Maybe always using lower case for the email part. > * SVN r160418, does not detect me as the main author or even detect > Shujing's email. > ** the changelog entry below what was added had a minor whitespace change to it > * SVN r211205, does not detect my email address correctly > ** I had a typo in the date format (014-06-03 when it should have been > 2014-06-03) > > I don't care if these minor issues don't get fixed, but I suspect > fixing them will help fix other issues; I don't know if these have > been fixed yet. > > Thanks, > Andrew Pinski > > >> >> -- >> Joseph S. Myers >> joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
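Editorial sketch, not part of the original exchange: Andrew's suggestion of lower-casing the email part before comparing can be illustrated in a few lines of shell. This is only a minimal sketch of the idea; it is not how reposurgeon or Richard's scripts match authors (Richard notes above that case-independent comparison was not practical there), and the helper names normalize_email and same_email are hypothetical.

```shell
# Hypothetical helper: lower-case an address so comparisons ignore case.
normalize_email() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: succeed if two addresses are equal ignoring case.
same_email() {
    [ "$(normalize_email "$1")" = "$(normalize_email "$2")" ]
}

# Usage (addresses are made-up case variants, for illustration only):
#   same_email 'Pinskia@GCC.gnu.org' 'pinskia@gcc.gnu.org' && echo match
```

Strictly speaking only the domain part of an address is guaranteed to be case-insensitive, so folding the whole address is a simplification.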
* Re: Test GCC conversion with reposurgeon available 2020-01-03 12:38 ` Joseph Myers 2020-01-06 23:58 ` Andrew Pinski @ 2020-01-09 12:22 ` Joseph Myers 2020-01-09 21:57 ` Joseph Myers 1 sibling, 1 reply; 67+ messages in thread From: Joseph Myers @ 2020-01-09 12:22 UTC (permalink / raw) To: gcc; +Cc: esr On Fri, 3 Jan 2020, Joseph Myers wrote: > On Sat, 28 Dec 2019, Joseph Myers wrote: > > > Two more. > > > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-6a.git > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-6b.git > > Two more. > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-7a.git > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-7b.git Here's a test conversion with the conversion machinery in what should be essentially final form. This is like the "b" versions (dead and vendor branches present but not fetched by default), with the addition of refs from the existing git mirror as refs/git-old/* and refs/git-svn-old/* (not fetched by default). git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-8.git -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: Test GCC conversion with reposurgeon available 2020-01-09 12:22 ` Joseph Myers @ 2020-01-09 21:57 ` Joseph Myers 0 siblings, 0 replies; 67+ messages in thread From: Joseph Myers @ 2020-01-09 21:57 UTC (permalink / raw) To: gcc; +Cc: esr On Thu, 9 Jan 2020, Joseph Myers wrote: > Here's a test conversion with the conversion machinery in what should be > essentially final form. This is like the "b" versions (dead and vendor > branches present but not fetched by default), with the addition of refs > from the existing git mirror as refs/git-old/* and refs/git-svn-old/* (not > fetched by default). > > git+ssh://gcc.gnu.org/home/gccadmin/gcc-reposurgeon-8.git Hooks are now set up and ready for testing commits to this repository, including integration with gcc-cvs and libstdc++-cvs mailing lists and Bugzilla. I recommend only referencing test bugs you open for the purpose in commits, not real bugs, to avoid confusing people, as the messages that end up in Bugzilla don't make it obvious that this is a test conversion (the messages on the -cvs mailing lists are more obvious, as they say "gcc-reposurgeon-8" in their subject headers). Commits made in this repository will not end up in the real conversion. gitweb URLs in messages from this conversion won't actually work, because they point to the real gitweb (which currently points to the git-svn mirror). All commits should generate commit emails. Only commits to master and release branches should generate Bugzilla updates. master and release branches do not allow merge commits, other branches do. Branches in refs/users/<user>/heads and refs/vendors/<vendor>/heads allow non-fast-forward pushes and branch deletion, other branches don't. 
Branch updates or new branches based on the history from the git-svn mirror (with 3cf0d8938a953ef13e57239613d42686f152b4fe, the initial git-svn commit, in their ancestry) are disallowed; this avoids someone accidentally pushing such a branch to a namespace that git fetches by default and causing everyone to fetch a GB of extra history as a result. Thus, for continued development based on such a branch you should start by rebasing (not merging) onto the new version of the history, and then the rebased branch can be pushed to one of the supported namespaces (under refs/users/<user>/heads/ if treating it as a user branch, under refs/heads/devel/ for a development branch that gets fetched by default). The hook configuration is something that seemed reasonable as a starting point, not necessarily what we will have as the final configuration. The configuration file for the AdaCore hooks is project.config on the refs/meta/config branch (but there are local patches to those hooks at present, and as with other refs, changes to refs/meta/config will not persist to the final converted repository). -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
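Editorial sketch, not part of the original message: before pushing, one way to tell whether a local branch still descends from the old git-svn history is to test for the initial git-svn commit in its ancestry. The commit hash is the one quoted above; the helper name has_ancestor is hypothetical, and the check only works in a clone that actually contains the old history. This is a sketch of a client-side check, not the server hook itself.

```shell
# Initial git-svn commit named in the message above.
OLD_ROOT=3cf0d8938a953ef13e57239613d42686f152b4fe

# Hypothetical helper: succeed if REF (default HEAD) has ANCESTOR in its
# history. An unknown commit makes git fail, which is treated as "no".
has_ancestor() {
    git merge-base --is-ancestor "$1" "${2:-HEAD}" 2>/dev/null
}

# Usage, with placeholder names for the branch and the two bases:
#   if has_ancestor "$OLD_ROOT" my-branch; then
#       echo "my-branch is based on the git-svn mirror; rebase first"
#       # git rebase --onto <new-base> <old-base> my-branch
#   fi
```

Here <old-base> is the git-svn commit the branch was forked from and <new-base> is the corresponding commit in the converted history; finding that pair is left to the reader.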
* GIT conversion: question about tags & release branches 2019-12-17 21:32 Test GCC conversion with reposurgeon available Joseph Myers ` (2 preceding siblings ...) 2019-12-18 21:55 ` Joseph Myers @ 2020-01-09 9:44 ` Martin Liška 2020-01-09 10:51 ` Richard Earnshaw (lists) 2020-01-09 11:46 ` Martin Jambor 3 siblings, 2 replies; 67+ messages in thread From: Martin Liška @ 2020-01-09 9:44 UTC (permalink / raw) To: Joseph Myers, gcc; +Cc: esr Hi. I have a question about release branches and release tags. For the current git mirror, we do have release tags living on release branches. Example: commit 64e1a4df1bc9dbf4cedb3a842c4eaff6b3425a66 Author: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4> Date: Mon Aug 12 08:40:24 2019 +0000 * BASE-VER: Set to 9.2.1. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@274276 138bc75d-0d04-0410-961f-82ee72b054a4 commit 3e7b85061947bdc7c7465743ba90734566860821 (tag: gcc-9_2_0-release) <- THE TAG Author: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4> Date: Mon Aug 12 07:38:49 2019 +0000 Update ChangeLog and version files for release git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@274274 138bc75d-0d04-0410-961f-82ee72b054a4 commit fc3f35e10b6ca627727d71c74fd5e76785226200 Author: gccadmin <gccadmin@138bc75d-0d04-0410-961f-82ee72b054a4> Date: Mon Aug 12 00:16:21 2019 +0000 Daily bump. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@274271 138bc75d-0d04-0410-961f-82ee72b054a4 while the reposurgeon git has: commit a9044428b313402507aa047a17e6ea10f63b2b8b Author: Jakub Jelinek <jakub@redhat.com> Date: Mon Aug 12 10:40:24 2019 +0200 * BASE-VER: Set to 9.2.1. From-SVN: r274276 commit d46878c3cce3be8f6c8878be8af326adecbb8ec6 <- THE TAG IS MISSING HERE Author: Jakub Jelinek <jakub@gcc.gnu.org> Date: Mon Aug 12 09:38:49 2019 +0200 Update ChangeLog and version files for release From-SVN: r274274 That's when I do git log parent/gcc-9-branch (git log origin/releases/gcc-9 respectively). 
And git log releases/gcc-9.2.0: commit 56bc3061f168c39a85117d4daefc2d5c0e4edb91 (tag: releases/gcc-9.2.0) <- THE TAG Author: Jakub Jelinek <jakub@gcc.gnu.org> Date: Mon Aug 12 09:38:59 2019 +0200 Tagging source as tags/gcc_9_2_0_release From-SVN: r274275 commit d46878c3cce3be8f6c8878be8af326adecbb8ec6 Author: Jakub Jelinek <jakub@gcc.gnu.org> Date: Mon Aug 12 09:38:49 2019 +0200 Update ChangeLog and version files for release From-SVN: r274274 I see it useful to have the release tags on release branches. Thoughts? Martin ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: GIT conversion: question about tags & release branches 2020-01-09 9:44 ` GIT conversion: question about tags & release branches Martin Liška 2020-01-09 10:51 ` Richard Earnshaw (lists) 2020-01-09 11:06 ` Martin Liška 2020-01-09 11:46 ` Martin Jambor 1 sibling, 1 reply; 67+ messages in thread From: Richard Earnshaw (lists) @ 2020-01-09 10:51 UTC (permalink / raw) To: Martin Liška, Joseph Myers, gcc; +Cc: esr On 09/01/2020 09:44, Martin Liška wrote: > Hi. > > I have question about release branches and release tags. For the current > git mirror, we do have release tags living on release branches. Example: > > commit 64e1a4df1bc9dbf4cedb3a842c4eaff6b3425a66 > Author: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4> > Date: Mon Aug 12 08:40:24 2019 +0000 > > * BASE-VER: Set to 9.2.1. > git-svn-id: > svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@274276 > 138bc75d-0d04-0410-961f-82ee72b054a4 > > commit 3e7b85061947bdc7c7465743ba90734566860821 (tag: gcc-9_2_0-release) > <- THE TAG > Author: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4> > Date: Mon Aug 12 07:38:49 2019 +0000 > > Update ChangeLog and version files for release > git-svn-id: > svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@274274 > 138bc75d-0d04-0410-961f-82ee72b054a4 > > commit fc3f35e10b6ca627727d71c74fd5e76785226200 > Author: gccadmin <gccadmin@138bc75d-0d04-0410-961f-82ee72b054a4> > Date: Mon Aug 12 00:16:21 2019 +0000 > > Daily bump. > git-svn-id: > svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@274271 > 138bc75d-0d04-0410-961f-82ee72b054a4 > > while the reposurgeon git has: > > commit a9044428b313402507aa047a17e6ea10f63b2b8b > Author: Jakub Jelinek <jakub@redhat.com> > Date: Mon Aug 12 10:40:24 2019 +0200 > > * BASE-VER: Set to 9.2.1. 
>    From-SVN: r274276 > > commit d46878c3cce3be8f6c8878be8af326adecbb8ec6 <- THE TAG IS MISSING HERE > Author: Jakub Jelinek <jakub@gcc.gnu.org> > Date:  Mon Aug 12 09:38:49 2019 +0200 > >    Update ChangeLog and version files for release >    From-SVN: r274274 > > That's when I do git log parent/gcc-9-branch (git log > origin/releases/gcc-9 respectively). > And git log releases/gcc-9.2.0: > > commit 56bc3061f168c39a85117d4daefc2d5c0e4edb91 (tag: > releases/gcc-9.2.0) <- THE TAG > Author: Jakub Jelinek <jakub@gcc.gnu.org> > Date:  Mon Aug 12 09:38:59 2019 +0200 > >    Tagging source as tags/gcc_9_2_0_release >    From-SVN: r274275 > > commit d46878c3cce3be8f6c8878be8af326adecbb8ec6 > Author: Jakub Jelinek <jakub@gcc.gnu.org> > Date:  Mon Aug 12 09:38:49 2019 +0200 > >    Update ChangeLog and version files for release >    From-SVN: r274274 > > I see it useful to have the release tags on release branches. > Thoughts? > Martin SVN makes tags using copy operations. By default this tends to result in the copies hanging off the side of the main branch and so the tags are not in a particularly git-friendly location. Reposurgeon has code now to try to spot this and collapse them back to the branch revision they were copied from. I think Joseph's next candidate conversion should show this fixed. However, some CVS-era tags cannot be fixed up in this way because of slight differences in the content between the branch and the tag (tagging was not an atomic operation and race conditions meant that changes were being made to the repo while the tag was taking place). For those cases the tag will always be on a side branch. Anyway, please check Joseph's next candidate to see if this shows what you expect -- I think it should be out later today. R. ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: GIT conversion: question about tags & release branches 2020-01-09 10:51 ` Richard Earnshaw (lists) @ 2020-01-09 11:06 ` Martin Liška 2020-01-09 11:31 ` Eric S. Raymond 0 siblings, 1 reply; 67+ messages in thread From: Martin Liška @ 2020-01-09 11:06 UTC (permalink / raw) To: Richard Earnshaw (lists), Joseph Myers, gcc; +Cc: esr On 1/9/20 11:51 AM, Richard Earnshaw (lists) wrote: > SVN makes tags using copy operations. By default this tends to result in the copies hanging off the side of the main branch and so the tags are not in a particularly git-friendly location. Reposurgeon has code now to try to spot this and collapse them back to the branch revision they were copied from. I think Joseph's next candidate conversion should show this fixed. Great, thanks for explanation. > > However, some CVS-era tags cannot be fixed up in this way because of slight differences in the content between the branch and the tag (tagging was not an atomic operation and race conditions meant that changes were being made to the repo while the tag was taking place). For those cases the tag will always be on a side branch. I do care for SVN era only. > > Anyway, please check Joseph's next candidate to see if this shows what you expect -- I think it should be out later today. I'll check it once it's published. Martin ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: GIT conversion: question about tags & release branches 2020-01-09 11:06 ` Martin Liška @ 2020-01-09 11:31 ` Eric S. Raymond 0 siblings, 0 replies; 67+ messages in thread From: Eric S. Raymond @ 2020-01-09 11:31 UTC (permalink / raw) To: Martin Liška; +Cc: Richard Earnshaw (lists), Joseph Myers, gcc Martin Liška <mliska@suse.cz>: > > Anyway, please check Joseph's next candidate to see if this shows what you expect -- I think it should be out later today. > > I'll check it once it's published. Everybody: time is growing short before the final conversion, so if you see anything that looks wrong or anomalous please send up a rocket *immediately*. The faster you let us know, the more likely it is we'll be able to nip in with a fix while that is still possible. -- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a> ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: GIT conversion: question about tags & release branches 2020-01-09 9:44 ` GIT conversion: question about tags & release branches Martin Liška 2020-01-09 10:51 ` Richard Earnshaw (lists) @ 2020-01-09 11:46 ` Martin Jambor 2020-01-09 11:50 ` Martin Liška 2020-01-09 11:57 ` Richard Earnshaw (lists) 1 sibling, 2 replies; 67+ messages in thread From: Martin Jambor @ 2020-01-09 11:46 UTC (permalink / raw) To: Martin Liška, Joseph Myers, gcc; +Cc: esr Hi, On Thu, Jan 09 2020, Martin Liška wrote: > Hi. > > I have question about release branches and release tags. For the current > git mirror, we do have release tags living on release branches. Example: > > commit 64e1a4df1bc9dbf4cedb3a842c4eaff6b3425a66 > Author: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4> > Date: Mon Aug 12 08:40:24 2019 +0000 > > * BASE-VER: Set to 9.2.1. > > > git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@274276 138bc75d-0d04-0410-961f-82ee72b054a4 > > commit 3e7b85061947bdc7c7465743ba90734566860821 (tag: gcc-9_2_0-release) <- THE TAG > Author: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4> > Date: Mon Aug 12 07:38:49 2019 +0000 > > Update ChangeLog and version files for release > > git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@274274 138bc75d-0d04-0410-961f-82ee72b054a4 > > commit fc3f35e10b6ca627727d71c74fd5e76785226200 > Author: gccadmin <gccadmin@138bc75d-0d04-0410-961f-82ee72b054a4> > Date: Mon Aug 12 00:16:21 2019 +0000 > > Daily bump. > > git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@274271 138bc75d-0d04-0410-961f-82ee72b054a4 > > while the reposurgeon git has: > > commit a9044428b313402507aa047a17e6ea10f63b2b8b > Author: Jakub Jelinek <jakub@redhat.com> > Date: Mon Aug 12 10:40:24 2019 +0200 > > * BASE-VER: Set to 9.2.1. 
> > From-SVN: r274276 > > commit d46878c3cce3be8f6c8878be8af326adecbb8ec6 <- THE TAG IS MISSING HERE > Author: Jakub Jelinek <jakub@gcc.gnu.org> > Date: Mon Aug 12 09:38:49 2019 +0200 > > Update ChangeLog and version files for release > > From-SVN: r274274 > I use the release tags every now and then so this caught my attention but I do not understand what the problem is? In the gcc-reposurgeon-7a conversion, there is a tag called gcc_9_2_0_release: mjambor@virgil:/space/mjambor/gcc/newconv/gcc-reposurgeon-7a.git$ git log -1 gcc_9_2_0_release commit 56bc3061f168c39a85117d4daefc2d5c0e4edb91 (tag: gcc_9_2_0_release) Author: Jakub Jelinek <jakub@gcc.gnu.org> Date: Mon Aug 12 09:38:59 2019 +0200 Tagging source as tags/gcc_9_2_0_release From-SVN: r274275 Even when I query the commit directly, it shows it is tagged: mjambor@virgil:/space/mjambor/gcc/newconv/gcc-reposurgeon-7a.git$ git log -1 56bc3061f168c39a85117d4daefc2d5c0e4edb91 commit 56bc3061f168c39a85117d4daefc2d5c0e4edb91 (tag: gcc_9_2_0_release) Author: Jakub Jelinek <jakub@gcc.gnu.org> Date: Mon Aug 12 09:38:59 2019 +0200 Tagging source as tags/gcc_9_2_0_release From-SVN: r274275 in gcc-reposurgeon-7b it is called releases/gcc-9.2.0 commit 56bc3061f168c39a85117d4daefc2d5c0e4edb91 (tag: releases/gcc-9.2.0) Author: Jakub Jelinek <jakub@gcc.gnu.org> Date: Mon Aug 12 09:38:59 2019 +0200 Tagging source as tags/gcc_9_2_0_release From-SVN: r274275 It seems that reposurgeon conversion has a commit representing the revision r274275 whereas git mirror does not, but that does not seem to be too bad? > That's when I do git log parent/gcc-9-branch (git log origin/releases/gcc-9 respectively). 
> And git log releases/gcc-9.2.0: > > commit 56bc3061f168c39a85117d4daefc2d5c0e4edb91 (tag: releases/gcc-9.2.0) <- THE TAG > Author: Jakub Jelinek <jakub@gcc.gnu.org> > Date: Mon Aug 12 09:38:59 2019 +0200 > > Tagging source as tags/gcc_9_2_0_release > > From-SVN: r274275 > > commit d46878c3cce3be8f6c8878be8af326adecbb8ec6 > Author: Jakub Jelinek <jakub@gcc.gnu.org> > Date: Mon Aug 12 09:38:49 2019 +0200 > > Update ChangeLog and version files for release > > From-SVN: r274274 > > I see it useful to have the release tags on release branches. > Thoughts? Thinking about it, assuming the reposurgeon is the way to go, did we decide whether it's going to be the a or b variant? I like the tags in the B version better, but it does not seem to have all the branches, I mean: mjambor@virgil:/space/mjambor/gcc/newconv/gcc-reposurgeon-7a.git$ git branch | wc -l 536 mjambor@virgil:/space/mjambor/gcc/newconv/gcc-reposurgeon-7b.git$ git branch | wc -l 38 I did clone both with --mirror. Thanks, Martin ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: GIT conversion: question about tags & release branches 2020-01-09 11:46 ` Martin Jambor @ 2020-01-09 11:50 ` Martin Liška 2020-01-09 12:37 ` Joseph Myers 2020-01-09 11:57 ` Richard Earnshaw (lists) 1 sibling, 1 reply; 67+ messages in thread From: Martin Liška @ 2020-01-09 11:50 UTC (permalink / raw) To: Martin Jambor, Joseph Myers, gcc; +Cc: esr On 1/9/20 12:45 PM, Martin Jambor wrote: > I use the release tags every now and then so this caught my attention > but I do not understand what the problem is? Problem is that if you do: $ git log origin/releases/gcc-9 you will not find the 9.2.0 tag. Which is a useful information when you seek for a presence of a revision in a release. Martin ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: GIT conversion: question about tags & release branches 2020-01-09 11:50 ` Martin Liška @ 2020-01-09 12:37 ` Joseph Myers 2020-01-09 13:38 ` Martin Liška 0 siblings, 1 reply; 67+ messages in thread From: Joseph Myers @ 2020-01-09 12:37 UTC (permalink / raw) To: Martin Liška; +Cc: Martin Jambor, gcc, esr On Thu, 9 Jan 2020, Martin Liška wrote: > On 1/9/20 12:45 PM, Martin Jambor wrote: > > I use the release tags every now and then so this caught my attention > > but I do not understand what the problem is? > > Problem is that if you do: > $ git log origin/releases/gcc-9 > > you will not find the 9.2.0 tag. Which is a useful information when > you seek for a presence of a revision in a release. With my latest test conversion I confirm you now get: [...] commit ffc0d9a6f63ff89be3113ec43f6cad0474ac902b Author: Jakub Jelinek <jakub@redhat.com> Date: Mon Aug 12 10:40:24 2019 +0200 * BASE-VER: Set to 9.2.1. From-SVN: r274276 commit c1b649b21087229d715ba61fe4df12b2b4c40eea (tag: releases/gcc-9.2.0) Author: Jakub Jelinek <jakub@gcc.gnu.org> Date: Mon Aug 12 09:38:49 2019 +0200 Update ChangeLog and version files for release From-SVN: r274274 [...] which I think is what's wanted here. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 67+ messages in thread
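Editorial sketch, not part of the original message: the property Martin asks for (a release tag reachable from its release branch) can be checked without reading git log output by listing the tags merged into the branch. The helper names tags_on_branch and tag_is_on_branch are hypothetical; the ref names in the usage comment are the ones shown in this thread and may differ in other conversions.

```shell
# Hypothetical helper: list every tag reachable from BRANCH.
tags_on_branch() {
    git tag -l --merged "$1"
}

# Hypothetical helper: succeed if TAG is reachable from BRANCH.
tag_is_on_branch() {
    tags_on_branch "$2" | grep -Fqx -- "$1"
}

# Usage in a clone of the test conversion:
#   tag_is_on_branch releases/gcc-9.2.0 origin/releases/gcc-9 &&
#       echo "9.2.0 tag sits on the gcc-9 release branch"
```

A tag left on a side commit, as in the earlier 7a/7b conversions, would not be listed, which is the situation Martin described.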
* Re: GIT conversion: question about tags & release branches 2020-01-09 12:37 ` Joseph Myers @ 2020-01-09 13:38 ` Martin Liška 0 siblings, 0 replies; 67+ messages in thread From: Martin Liška @ 2020-01-09 13:38 UTC (permalink / raw) To: Joseph Myers; +Cc: Martin Jambor, gcc, esr On 1/9/20 1:37 PM, Joseph Myers wrote: > On Thu, 9 Jan 2020, Martin Liška wrote: > >> On 1/9/20 12:45 PM, Martin Jambor wrote: >>> I use the release tags every now and then so this caught my attention >>> but I do not understand what the problem is? >> >> Problem is that if you do: >> $ git log origin/releases/gcc-9 >> >> you will not find the 9.2.0 tag. Which is a useful information when >> you seek for a presence of a revision in a release. > > With my latest test conversion I confirm you now get: > > [...] > > commit ffc0d9a6f63ff89be3113ec43f6cad0474ac902b > Author: Jakub Jelinek <jakub@redhat.com> > Date: Mon Aug 12 10:40:24 2019 +0200 > > * BASE-VER: Set to 9.2.1. > > From-SVN: r274276 > > commit c1b649b21087229d715ba61fe4df12b2b4c40eea (tag: releases/gcc-9.2.0) > Author: Jakub Jelinek <jakub@gcc.gnu.org> > Date: Mon Aug 12 09:38:49 2019 +0200 > > Update ChangeLog and version files for release > > From-SVN: r274274 > > [...] > > which I think is what's wanted here. > Yep, I like it. Thanks, Martin ^ permalink raw reply [flat|nested] 67+ messages in thread
* Re: GIT conversion: question about tags & release branches 2020-01-09 11:46 ` Martin Jambor 2020-01-09 11:50 ` Martin Liška @ 2020-01-09 11:57 ` Richard Earnshaw (lists) 2020-01-09 11:59 ` Richard Earnshaw (lists) 1 sibling, 1 reply; 67+ messages in thread From: Richard Earnshaw (lists) @ 2020-01-09 11:57 UTC (permalink / raw) To: Martin Jambor, Martin Liška, Joseph Myers, gcc; +Cc: esr On 09/01/2020 11:45, Martin Jambor wrote: > Hi, > > On Thu, Jan 09 2020, Martin Liška wrote: >> Hi. >> >> I have question about release branches and release tags. For the current >> git mirror, we do have release tags living on release branches. Example: >> >> commit 64e1a4df1bc9dbf4cedb3a842c4eaff6b3425a66 >> Author: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4> >> Date: Mon Aug 12 08:40:24 2019 +0000 >> >> * BASE-VER: Set to 9.2.1. >> >> >> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@274276 138bc75d-0d04-0410-961f-82ee72b054a4 >> >> commit 3e7b85061947bdc7c7465743ba90734566860821 (tag: gcc-9_2_0-release) <- THE TAG >> Author: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4> >> Date: Mon Aug 12 07:38:49 2019 +0000 >> >> Update ChangeLog and version files for release >> >> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@274274 138bc75d-0d04-0410-961f-82ee72b054a4 >> >> commit fc3f35e10b6ca627727d71c74fd5e76785226200 >> Author: gccadmin <gccadmin@138bc75d-0d04-0410-961f-82ee72b054a4> >> Date: Mon Aug 12 00:16:21 2019 +0000 >> >> Daily bump. >> >> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@274271 138bc75d-0d04-0410-961f-82ee72b054a4 >> >> while the reposurgeon git has: >> >> commit a9044428b313402507aa047a17e6ea10f63b2b8b >> Author: Jakub Jelinek <jakub@redhat.com> >> Date: Mon Aug 12 10:40:24 2019 +0200 >> >> * BASE-VER: Set to 9.2.1. 
>> >> From-SVN: r274276 >> >> commit d46878c3cce3be8f6c8878be8af326adecbb8ec6 <- THE TAG IS MISSING HERE >> Author: Jakub Jelinek <jakub@gcc.gnu.org> >> Date: Mon Aug 12 09:38:49 2019 +0200 >> >> Update ChangeLog and version files for release >> >> From-SVN: r274274 >> > > I use the release tags every now and then so this caught my attention > but I do not understand what the problem is? > > In the gcc-reposurgeon-7a conversion, there is a tag called > gcc_9_2_0_release: > > mjambor@virgil:/space/mjambor/gcc/newconv/gcc-reposurgeon-7a.git$ git log -1 gcc_9_2_0_release > commit 56bc3061f168c39a85117d4daefc2d5c0e4edb91 (tag: gcc_9_2_0_release) > Author: Jakub Jelinek <jakub@gcc.gnu.org> > Date: Mon Aug 12 09:38:59 2019 +0200 > > Tagging source as tags/gcc_9_2_0_release > > From-SVN: r274275 > > Even when I query the commit directly, it shows it is tagged: > > mjambor@virgil:/space/mjambor/gcc/newconv/gcc-reposurgeon-7a.git$ git log -1 56bc3061f168c39a85117d4daefc2d5c0e4edb91 > commit 56bc3061f168c39a85117d4daefc2d5c0e4edb91 (tag: gcc_9_2_0_release) > Author: Jakub Jelinek <jakub@gcc.gnu.org> > Date: Mon Aug 12 09:38:59 2019 +0200 > > Tagging source as tags/gcc_9_2_0_release > > From-SVN: r274275 > > > in gcc-reposurgeon-7b it is called releases/gcc-9.2.0 > > commit 56bc3061f168c39a85117d4daefc2d5c0e4edb91 (tag: releases/gcc-9.2.0) > Author: Jakub Jelinek <jakub@gcc.gnu.org> > Date: Mon Aug 12 09:38:59 2019 +0200 > > Tagging source as tags/gcc_9_2_0_release > > From-SVN: r274275 > > It seems that reposurgeon conversion has a commit representing the > revision r274275 whereas git mirror does not, but that does not seem to > be too bad? > >> That's when I do git log parent/gcc-9-branch (git log origin/releases/gcc-9 respectively). 
>> And git log releases/gcc-9.2.0: >> >> commit 56bc3061f168c39a85117d4daefc2d5c0e4edb91 (tag: releases/gcc-9.2.0) <- THE TAG >> Author: Jakub Jelinek <jakub@gcc.gnu.org> >> Date: Mon Aug 12 09:38:59 2019 +0200 >> >> Tagging source as tags/gcc_9_2_0_release >> >> From-SVN: r274275 >> >> commit d46878c3cce3be8f6c8878be8af326adecbb8ec6 >> Author: Jakub Jelinek <jakub@gcc.gnu.org> >> Date: Mon Aug 12 09:38:49 2019 +0200 >> >> Update ChangeLog and version files for release >> >> From-SVN: r274274 >> >> I see it useful to have the release tags on release branches. >> Thoughts? > > Thinking about it, assuming the reposurgeon is the way to go, did we > decide whether it's going to be the a or b variant? I like the tags in > the B version better, but it does not seem to have all the branches, I > mean: > > mjambor@virgil:/space/mjambor/gcc/newconv/gcc-reposurgeon-7a.git$ git branch | wc -l > 536 > > mjambor@virgil:/space/mjambor/gcc/newconv/gcc-reposurgeon-7b.git$ git branch | wc -l > 38 > > > I did clone both with --mirror. > > Thanks, > > Martin > > The branches are in there, but they're in a namespace which means GIT does not list them by default. If you run git for-each-ref --format='%(refname)' On a mirror clone you'll see all the various tags and branches that are really there. We'll add documentation on how to get these 'private' branches/tags into a normal clone - normally it will involve adding another 'fetch' rule to your configuration. R. ^ permalink raw reply [flat|nested] 67+ messages in thread
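Editorial sketch, not part of the original message: the extra 'fetch' rule Richard mentions is an additional refspec on the remote. The helper below is a minimal sketch of adding one; the name add_fetch_namespace is hypothetical, and the namespaces used in the usage comment (vendors, users) are taken from the hook description earlier in this thread. The local layout under refs/remotes/ is an assumption, not a documented project convention.

```shell
# Hypothetical helper: make "git fetch REMOTE" also fetch refs/NAMESPACE/*,
# storing the result under refs/remotes/REMOTE/NAMESPACE/.
add_fetch_namespace() {
    remote=$1
    namespace=$2
    git config --add "remote.$remote.fetch" \
        "+refs/$namespace/*:refs/remotes/$remote/$namespace/*"
}

# Usage in a normal (non-mirror) clone:
#   add_fetch_namespace origin vendors
#   git fetch origin
#   git for-each-ref --format='%(refname)' refs/remotes/origin/vendors
```

A narrower refspec limited to one vendor or user avoids pulling in every such branch.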
* Re: GIT conversion: question about tags & release branches 2020-01-09 11:57 ` Richard Earnshaw (lists) @ 2020-01-09 11:59 ` Richard Earnshaw (lists) 0 siblings, 0 replies; 67+ messages in thread From: Richard Earnshaw (lists) @ 2020-01-09 11:59 UTC (permalink / raw) To: Martin Jambor, Martin Liška, Joseph Myers, gcc; +Cc: esr On 09/01/2020 11:57, Richard Earnshaw (lists) wrote: > On 09/01/2020 11:45, Martin Jambor wrote: >> Hi, >> >> On Thu, Jan 09 2020, Martin Liška wrote: >>> Hi. >>> >>> I have question about release branches and release tags. For the current >>> git mirror, we do have release tags living on release branches. Example: >>> >>> commit 64e1a4df1bc9dbf4cedb3a842c4eaff6b3425a66 >>> Author: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4> >>> Date: Mon Aug 12 08:40:24 2019 +0000 >>> >>> * BASE-VER: Set to 9.2.1. >>> git-svn-id: >>> svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@274276 >>> 138bc75d-0d04-0410-961f-82ee72b054a4 >>> >>> commit 3e7b85061947bdc7c7465743ba90734566860821 (tag: >>> gcc-9_2_0-release) <- THE TAG >>> Author: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4> >>> Date: Mon Aug 12 07:38:49 2019 +0000 >>> >>> Update ChangeLog and version files for release >>> git-svn-id: >>> svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@274274 >>> 138bc75d-0d04-0410-961f-82ee72b054a4 >>> >>> commit fc3f35e10b6ca627727d71c74fd5e76785226200 >>> Author: gccadmin <gccadmin@138bc75d-0d04-0410-961f-82ee72b054a4> >>> Date: Mon Aug 12 00:16:21 2019 +0000 >>> >>> Daily bump. >>> git-svn-id: >>> svn+ssh://gcc.gnu.org/svn/gcc/branches/gcc-9-branch@274271 >>> 138bc75d-0d04-0410-961f-82ee72b054a4 >>> >>> while the reposurgeon git has: >>> >>> commit a9044428b313402507aa047a17e6ea10f63b2b8b >>> Author: Jakub Jelinek <jakub@redhat.com> >>> Date: Mon Aug 12 10:40:24 2019 +0200 >>> >>> * BASE-VER: Set to 9.2.1. 
>>>      From-SVN: r274276 >>> >>> commit d46878c3cce3be8f6c8878be8af326adecbb8ec6 <- THE TAG IS >>> MISSING HERE >>> Author: Jakub Jelinek <jakub@gcc.gnu.org> >>> Date:  Mon Aug 12 09:38:49 2019 +0200 >>> >>>      Update ChangeLog and version files for release >>>      From-SVN: r274274 >>> >> >> I use the release tags every now and then so this caught my attention >> but I do not understand what the problem is? >> >> In the gcc-reposurgeon-7a conversion, there is a tag called >> gcc_9_2_0_release: >> >> mjambor@virgil:/space/mjambor/gcc/newconv/gcc-reposurgeon-7a.git$ git >> log -1 gcc_9_2_0_release >> commit 56bc3061f168c39a85117d4daefc2d5c0e4edb91 (tag: gcc_9_2_0_release) >> Author: Jakub Jelinek <jakub@gcc.gnu.org> >> Date:  Mon Aug 12 09:38:59 2019 +0200 >> >>     Tagging source as tags/gcc_9_2_0_release >>     From-SVN: r274275 >> >> Even when I query the commit directly, it shows it is tagged: >> >> mjambor@virgil:/space/mjambor/gcc/newconv/gcc-reposurgeon-7a.git$ git >> log -1 56bc3061f168c39a85117d4daefc2d5c0e4edb91 >> commit 56bc3061f168c39a85117d4daefc2d5c0e4edb91 (tag: gcc_9_2_0_release) >> Author: Jakub Jelinek <jakub@gcc.gnu.org> >> Date:  Mon Aug 12 09:38:59 2019 +0200 >> >>     Tagging source as tags/gcc_9_2_0_release >>     From-SVN: r274275 >> >> >> in gcc-reposurgeon-7b it is called releases/gcc-9.2.0 >> >> commit 56bc3061f168c39a85117d4daefc2d5c0e4edb91 (tag: releases/gcc-9.2.0) >> Author: Jakub Jelinek <jakub@gcc.gnu.org> >> Date:  Mon Aug 12 09:38:59 2019 +0200 >> >>     Tagging source as tags/gcc_9_2_0_release >>     From-SVN: r274275 >> >> It seems that reposurgeon conversion has a commit representing the >> revision r274275 whereas git mirror does not, but that does not seem to >> be too bad? >> >>> That's when I do git log parent/gcc-9-branch (git log >>> origin/releases/gcc-9 respectively). 
>>> And git log releases/gcc-9.2.0: >>> >>> commit 56bc3061f168c39a85117d4daefc2d5c0e4edb91 (tag: >>> releases/gcc-9.2.0) <- THE TAG >>> Author: Jakub Jelinek <jakub@gcc.gnu.org> >>> Date:  Mon Aug 12 09:38:59 2019 +0200 >>> >>>      Tagging source as tags/gcc_9_2_0_release >>>      From-SVN: r274275 >>> >>> commit d46878c3cce3be8f6c8878be8af326adecbb8ec6 >>> Author: Jakub Jelinek <jakub@gcc.gnu.org> >>> Date:  Mon Aug 12 09:38:49 2019 +0200 >>> >>>      Update ChangeLog and version files for release >>>      From-SVN: r274274 >>> >>> I see it useful to have the release tags on release branches. >>> Thoughts? >> >> Thinking about it, assuming the reposurgeon is the way to go, did we >> decide whether it's going to be the a or b variant? I like the tags in >> the B version better, but it does not seem to have all the branches, I >> mean: >> >> mjambor@virgil:/space/mjambor/gcc/newconv/gcc-reposurgeon-7a.git$ git >> branch | wc -l >> 536 >> >> mjambor@virgil:/space/mjambor/gcc/newconv/gcc-reposurgeon-7b.git$ git >> branch | wc -l >> 38 >> >> >> I did clone both with --mirror. >> >> Thanks, >> >> Martin >> >> > > The branches are in there, but they're in a namespace which means GIT > does not list them by default. > > If you run > > git for-each-ref --format='%(refname)' > > On a mirror clone you'll see all the various tags and branches that are > really there. > > We'll add documentation on how to get these 'private' branches/tags into > a normal clone - normally it will involve adding another 'fetch' rule to > your configuration. > > > R. git for-each-ref --format='%(refname)'|grep heads|grep -v deleted|wc -l 536 So they're all there. R. ^ permalink raw reply [flat|nested] 67+ messages in thread
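The grep pipeline above matches "heads" anywhere in a ref name. As a rough alternative sketch, the function below groups ref names by their first two path components, which gives a quick picture of how many refs sit in each top-level namespace. The name `summarise_refs` is hypothetical, and the function makes no assumption about which namespaces the conversion actually uses.

```shell
# Read ref names on stdin (one per line) and print a count per
# top-level namespace, e.g. "refs/heads" or "refs/tags".
summarise_refs() {
    awk -F/ 'NF >= 2 { count[$1 "/" $2]++ }
             END { for (ns in count) print count[ns], ns }' |
        sort -k2
}

# Usage (inside a mirror clone):
#   git for-each-ref --format='%(refname)' | summarise_refs
```

Comparing the per-namespace counts between two conversions is one way to confirm that refs were moved between namespaces rather than dropped.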
end of thread, other threads:[~2020-01-09 21:57 UTC | newest]
Thread overview: 67+ messages
2019-12-17 21:32 Test GCC conversion with reposurgeon available Joseph Myers
2019-12-17 23:33 ` Bernd Schmidt
2019-12-18 0:51 ` Eric S. Raymond
2019-12-18 0:52 ` Joseph Myers
2019-12-18 3:28 ` Joseph Myers
2019-12-18 14:36 ` Joseph Myers
2019-12-18 13:10 ` Jason Merrill
2019-12-18 18:16 ` Joseph Myers
2019-12-19 5:50 ` Jason Merrill
2019-12-19 15:55 ` Joseph Myers
2019-12-18 21:55 ` Joseph Myers
2019-12-19 0:36 ` Bernd Schmidt
2019-12-19 0:58 ` Joseph Myers
2019-12-19 13:51 ` Test GCC conversions (publicly) available Mark Wielaard
2019-12-19 14:06 ` Eric S. Raymond
2019-12-19 14:40 ` Joseph Myers
2019-12-19 16:00 ` Eric S. Raymond
2019-12-19 16:03 ` Richard Earnshaw (lists)
2019-12-19 16:08 ` Eric S. Raymond
2019-12-19 16:29 ` Test GCC conversion with reposurgeon available Joseph Myers
2019-12-22 13:57 ` Joseph Myers
2019-12-23 17:27 ` Roman Zhuykov
2019-12-24 11:50 ` Joseph Myers
2019-12-24 15:55 ` Segher Boessenkool
2019-12-24 17:17 ` Joseph Myers
2019-12-24 18:14 ` Segher Boessenkool
2019-12-25 11:03 ` Roman Zhuykov
2019-12-25 11:20 ` Joseph Myers
2019-12-25 12:23 ` Eric S. Raymond
2019-12-25 14:32 ` Andreas Schwab
2019-12-25 14:41 ` Joseph Myers
2019-12-25 15:10 ` Andreas Schwab
2019-12-25 15:36 ` Joseph Myers
2019-12-25 17:15 ` Segher Boessenkool
2019-12-25 19:33 ` Eric S. Raymond
2019-12-26 21:03 ` Vincent Lefevre
2019-12-26 21:31 ` Eric S. Raymond
2019-12-26 22:25 ` Toon Moene
2019-12-26 22:32 ` Eric S. Raymond
2019-12-27 14:40 ` Segher Boessenkool
2019-12-26 22:57 ` Vincent Lefevre
2019-12-26 23:38 ` Eric S. Raymond
2019-12-25 19:40 ` Eric S. Raymond
2019-12-27 21:29 ` Andreas Schwab
2019-12-27 21:43 ` Joseph Myers
2019-12-25 19:19 ` Eric S. Raymond
2019-12-27 21:30 ` Andreas Schwab
2019-12-28 2:43 ` Eric S. Raymond
2019-12-27 14:37 ` Richard Earnshaw
2019-12-24 10:57 ` Maxim Kuvyrkov
2019-12-28 16:30 ` Joseph Myers
2020-01-03 12:38 ` Joseph Myers
2020-01-06 23:58 ` Andrew Pinski
2020-01-07 0:30 ` Joseph Myers
2020-01-07 0:44 ` Richard Earnshaw
2020-01-09 12:22 ` Joseph Myers
2020-01-09 21:57 ` Joseph Myers
2020-01-09 9:44 ` GIT conversion: question about tags & release branches Martin Liška
2020-01-09 10:51 ` Richard Earnshaw (lists)
2020-01-09 11:06 ` Martin Liška
2020-01-09 11:31 ` Eric S. Raymond
2020-01-09 11:46 ` Martin Jambor
2020-01-09 11:50 ` Martin Liška
2020-01-09 12:37 ` Joseph Myers
2020-01-09 13:38 ` Martin Liška
2020-01-09 11:57 ` Richard Earnshaw (lists)
2020-01-09 11:59 ` Richard Earnshaw (lists)