public inbox for gcc-patches@gcc.gnu.org
* [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git
@ 2019-05-14 16:11 Maxim Kuvyrkov
  2019-05-14 21:20 ` Segher Boessenkool
                   ` (4 more replies)
  0 siblings, 5 replies; 103+ messages in thread
From: Maxim Kuvyrkov @ 2019-05-14 16:11 UTC
  To: GCC Patches; +Cc: Jason Merrill, Paolo Bonzini

[-- Attachment #1: Type: text/plain, Size: 1410 bytes --]

This patch adds scripts to contrib/ to migrate the full history of GCC's Subversion repository to Git.  My hope is that these scripts will finally allow the GCC project to migrate to Git.

The result of the conversion is at https://github.com/maxim-kuvyrkov/gcc/branches/all .  Branches with "@rev" suffixes represent branch points.  The conversion is still running, so not all branches may appear right away.
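
As a rough illustration of the "@rev" naming (a sketch, not part of the patch), a spec such as "branches/example-branch@1000" can be split into a branch path and a branch-point revision with the same cut calls that svn-git-branch.sh uses.  The branch name and revision number here are made up for the example, and split_spec is a hypothetical helper name.

```shell
#!/bin/bash
# Split a "path@rev" spec into its branch path and optional revision,
# mirroring the cut invocations in svn-git-branch.sh.
split_spec ()
{
    local spec="$1"
    local branch rev
    branch=$(echo "$spec" | cut -d@ -f 1)
    # -s makes cut print nothing when the line has no "@", so rev stays
    # empty for a plain branch name.
    rev=$(echo "$spec" | cut -s -d@ -f 2)
    echo "branch=$branch rev=${rev:-none}"
}

# Hypothetical specs: a branch point and a plain branch.
split_spec "branches/example-branch@1000"
split_spec "trunk"
```

A spec without a revision means "convert the branch up to its current head"; a spec with one means "convert up to that branch point".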

The scripts are not specific to the GCC repo and are usable for other projects.  In particular, they should be able to convert downstream GCC svn repos.

The scripts convert svn history branch by branch.  They rely on git-svn to convert individual branches, which is a job it does well.  When asked to convert the entire GCC repo at once, however, git-svn is either very slow or goes into an infinite loop.
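
To make the per-branch approach concrete, here is a minimal sketch (not part of the patch) of the temporary svn-remote stanza that svn-git-branch.sh appends to .git/config before running "git svn fetch" on a single branch.  The function name svn_remote_stanza and the example branch are assumptions for illustration; the sketch only prints the stanza and does not touch any repository.

```shell
#!/bin/bash
# Print the four-line config stanza for one branch spec.  The url comes
# from the first argument here; treat that as an assumption of this sketch.
svn_remote_stanza ()
{
    local svntop="$1" spec="$2"
    # Strip an optional "@rev" suffix to get the svn path to fetch.
    local branch="${spec%@*}"
    cat <<EOF

[svn-remote "$spec"]
	url = $svntop
	fetch = $branch:refs/remotes/svn/$spec
EOF
}

# Hypothetical branch point used only as an example.
svn_remote_stanza "svn+ssh://gcc.gnu.org/svn/gcc" "branches/example-branch@1000"
```

The stanza is exactly four lines long (one blank line plus three config lines), which is why the script can drop it again afterwards with "head -n -4".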

There are 3 scripts:

- svn-git-repo.sh: top-level script to convert the entire repo or a part of it (e.g., branches/),
- svn-list-branches.sh: helper script to output branches and their parents in bottom-up order,
- svn-git-branch.sh: helper script to convert a single branch.
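
The way the three scripts fit together can be sketched as follows (an illustration under stated assumptions, not the patch itself).  The list of branches is one line per branch, with the branch first and its chain of parent@rev branch points after it, and parents are listed before their children.  The top-level loop reads each line into an array and hands it to the per-branch converter.  Here convert_branch is a stand-in that only reports what it would do, and the branch names are hypothetical.

```shell
#!/bin/bash
# Stand-in for svn-git-branch.sh: report the spec and its parents.
convert_branch ()
{
    local spec="$1"
    shift
    echo "convert $spec (parents: ${*:-none})"
}

# Read branch lines from stdin and convert them one at a time, in the
# same style as the "while read -a params" loop in svn-git-repo.sh.
convert_all ()
{
    local params
    while read -r -a params; do
        convert_branch "${params[@]}"
    done
}

# Hypothetical branch list in bottom-up order: parent first.
printf '%s\n' \
    "trunk" \
    "branches/example-branch trunk@1000" | convert_all
```

Because parents come first, the converter can assume that the git ref for a branch point already exists by the time a child branch is processed.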

Whenever possible, svn-git-branch.sh uses existing git branches as caches.
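
As a small sketch of the cache lookup (not part of the patch), the refs that svn-git-branch.sh probes for an already-converted branch can be listed like this.  The helper name cache_candidates and the example branch are assumptions; the remotes "origin" and "extra" and the stripping of the "branches/" prefix follow the script.

```shell
#!/bin/bash
# List the git refs that may already hold a converted copy of an svn
# branch.  The script checks them in this order and keeps the last hit.
cache_candidates ()
{
    local branch="$1" remote
    for remote in origin extra; do
        # "branches/foo" in svn corresponds to "foo" in the git mirror.
        echo "refs/remotes/$remote/${branch#branches/}"
    done
}

# Hypothetical branch name used only as an example.
cache_candidates "branches/example-branch"
```

In the script itself each candidate is tested with "git rev-parse", and a hit seeds refs/remotes/cache/ so that git-svn only has to fetch what the existing mirror lacks.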

What are your questions and comments?

The attached is a cleaned-up version, which hasn't been fully tested yet; typos and other silly mistakes are likely.  OK to commit after testing?

--
Maxim Kuvyrkov
www.linaro.org



[-- Attachment #2: 0001-Contrib-SVN-Git-conversion-scripts.patch --]
[-- Type: application/octet-stream, Size: 7877 bytes --]

From 3dbfff128c125f9d3307dff1e5b44f8135620f2b Mon Sep 17 00:00:00 2001
From: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Date: Tue, 14 May 2019 13:12:36 +0000
Subject: [PATCH] [Contrib] SVN -> Git conversion scripts

	* svn-git-repo.sh, svn-list-branches.sh, svn-git-branch.sh: New scripts.

Change-Id: I437d70c3bc431261bde5eec29e7207d4ebb0fe01
---
 contrib/svn-git-branch.sh    |  92 +++++++++++++++++++++++++++++
 contrib/svn-git-repo.sh      |  57 ++++++++++++++++++
 contrib/svn-list-branches.sh | 108 +++++++++++++++++++++++++++++++++++
 3 files changed, 257 insertions(+)
 create mode 100755 contrib/svn-git-branch.sh
 create mode 100755 contrib/svn-git-repo.sh
 create mode 100755 contrib/svn-list-branches.sh

diff --git a/contrib/svn-git-branch.sh b/contrib/svn-git-branch.sh
new file mode 100755
index 00000000000..70f8deb0a4f
--- /dev/null
+++ b/contrib/svn-git-branch.sh
@@ -0,0 +1,92 @@
+#!/bin/bash
+
+set -euf -o pipefail
+
+svntop="svn+ssh://gcc.gnu.org/svn/gcc"
+verbose=0
+
+while test $# -gt 0; do
+    case "$1" in
+	--svntop) svntop="$2"; shift ;;
+	--verbose) verbose=$(($verbose+1)) ;;
+	--) shift; break ;;
+	*) break ;;
+    esac
+    shift
+done
+
+info ()
+{
+    if [ $verbose -ge 1 ]; then
+	echo "INFO: $*" >&2
+    fi
+}
+
+if [ $verbose -ge 2 ]; then
+    set -x
+fi
+
+info "converting $*"
+
+spec="$1"
+shift 1
+
+branch=$(echo "$spec" | cut -d@ -f 1)
+rev=$(echo "$spec" | cut -s -d@ -f 2)
+
+for parent in "$@"; do
+    parent_branch=$(echo "$parent" | cut -d@ -f 1)
+    parent_rev=$(echo "$parent" | cut -s -d@ -f 2)
+
+    sha1=$(git rev-parse refs/remotes/svn/$parent)
+    mkdir -p $(dirname .git/refs/remotes/svn/$branch@$parent_rev)
+    echo $sha1 > .git/refs/remotes/svn/$branch@$parent_rev
+done
+
+sha1=""
+for i in origin extra; do
+    if git rev-parse refs/remotes/$i/${branch#branches/} >/dev/null 2>&1; then
+	sha1=$(git rev-parse refs/remotes/$i/${branch#branches/})
+    fi
+done
+
+if [ x"$sha1" != x"" ]; then
+    if ! git rev-parse refs/remotes/cache/$branch >/dev/null 2>&1 \
+	   || [ x"$rev" != x"" ]; then
+	mkdir -p $(dirname .git/refs/remotes/cache/$branch)
+	echo $sha1 > .git/refs/remotes/cache/$branch
+
+	cat >> .git/config <<EOF
+
+[svn-remote "cache-$branch"]
+	url = $svntop
+	fetch = $branch:refs/remotes/cache/$branch
+EOF
+	git svn fetch --log-window-size=10000 "cache-$branch"
+
+	if [ x"$rev" != x"" ]; then
+	    sha1=$(git svn find-rev --before r$rev refs/remotes/cache/$branch)
+	else
+	    sha1=$(git rev-parse refs/remotes/cache/$branch)
+	fi
+
+	head -n -4 .git/config > .git/newconfig
+	mv .git/newconfig .git/config
+    else
+	sha1=$(git rev-parse refs/remotes/cache/$branch)
+    fi
+
+    mkdir -p $(dirname .git/refs/remotes/svn/$spec)
+    echo $sha1 > .git/refs/remotes/svn/$spec
+fi
+
+cat >> .git/config <<EOF
+
+[svn-remote "$spec"]
+	url = $svntop
+	fetch = $branch:refs/remotes/svn/$spec
+EOF
+
+git svn fetch ${rev:+-r BASE:$rev} --log-window-size=10000 "$spec"
+head -n -4 .git/config > .git/newconfig
+mv .git/newconfig .git/config
diff --git a/contrib/svn-git-repo.sh b/contrib/svn-git-repo.sh
new file mode 100755
index 00000000000..bdabd079c67
--- /dev/null
+++ b/contrib/svn-git-repo.sh
@@ -0,0 +1,57 @@
+#!/bin/bash
+
+set -euf -o pipefail
+
+branchfile=""
+giturl=https://gcc.gnu.org/git/gcc.git
+reference=""
+repo=""
+svnpaths=()
+svntop=svn+ssh://gcc.gnu.org/svn/gcc
+
+while test $# -gt 0; do
+    case "$1" in
+	--branchfile) branchfile="$2"; shift ;;
+	--giturl) giturl="$2"; shift ;;
+	--reference) reference="$2"; shift ;;
+	--repo) repo="$2"; shift ;;
+	--svnpath) svnpaths+=("$2"); shift ;;
+	--svntop) svntop="$2"; shift ;;
+	--verbose) set -x ;;
+	*) echo "ERROR: wrong argument: $1"; exit 1 ;;
+    esac
+    shift
+done
+
+top=$(cd "$(dirname "$0")" && pwd)
+
+if [ x"$repo" = x"" ]; then
+    echo "ERROR: no --repo /path/to/repo"
+    exit 1
+elif [ -d "$repo" ]; then
+    echo "ERROR: directory $repo exists"
+    echo "       rename it to $repo.bak if it is a good starting point"
+    exit 1
+fi
+
+if [ -d "$repo.bak" ]; then
+    rsync -a --del "$repo.bak/" "$repo/"
+else
+    git clone ${reference:+--reference "$reference"} "$giturl" "$repo"
+    git -C "$repo" config --add remote.origin.fetch '+refs/remotes/*:refs/remotes/extra/*'
+    git -C "$repo" remote update -p
+    rsync -a --del "$repo/" "$repo.bak/"
+fi
+
+(
+    if [ x"$branchfile" = x"" ]; then
+	branchfile=$(mktemp)
+	"$top/svn-list-branches.sh" --svntop "$svntop" "${svnpaths[@]}" > "$branchfile" &
+    fi
+    tail -f -n +1 "$branchfile"
+) | while read -a params; do
+    (
+	cd "$repo"
+	"$top/svn-git-branch.sh" --svntop "$svntop" "${params[@]}" < /dev/null
+    )
+done
diff --git a/contrib/svn-list-branches.sh b/contrib/svn-list-branches.sh
new file mode 100755
index 00000000000..1a3378f27b8
--- /dev/null
+++ b/contrib/svn-list-branches.sh
@@ -0,0 +1,108 @@
+#!/bin/bash
+
+set -euf -o pipefail
+
+svntop="svn+ssh://gcc.gnu.org/svn/gcc"
+verbose=0
+
+while test $# -gt 0; do
+    case "$1" in
+	--svntop) svntop="$2"; shift ;;
+	--verbose) verbose=$(($verbose+1)) ;;
+	--) shift; break ;;
+	*) break ;;
+    esac
+    shift
+done
+
+info ()
+{
+    if [ $verbose -ge 1 ]; then
+	echo "INFO: $*" >&2
+    fi
+}
+
+error ()
+{
+    echo "ERROR: $*" >&2
+    exit 1
+}
+
+if [ $verbose -ge 2 ]; then
+    set -x
+fi
+
+queue=()
+for i in "$@"; do
+    i=$(echo "$i" | sed -e "s#^/*##g" -e "s#/*\$##g")
+    queue+=("$i")
+done
+
+tmp=$(mktemp)
+
+declare -A printed
+
+while [ "${#queue[@]}" -gt 0 ]; do
+    path="${queue[0]}"
+    queue=("${queue[@]:1}")
+
+    info "Processing $path"
+
+    svn ls "$svntop/$path" > $tmp
+    if grep -v -q "/\$" $tmp; then
+	copy_log=$(svn log -v --stop-on-copy "$svntop/$path" \
+		       | grep -A3 -e "------------------------------------------------------------------------" \
+		       | tail -n3 \
+		       | grep "^   A /.* (from /.*:[0-9]*)\$" \
+		       | tail -n 1)
+	if [ x"$copy_log" != x"" ]; then
+	    realpath="$(echo "$copy_log" | sed -e "s#^   A /\(.*\) (from /\(.*\):\([0-9]*\))\$#\1#")"
+	    parent="$(echo "$copy_log" | sed -e "s#^   A /\(.*\) (from /\(.*\):\([0-9]*\))\$#\2#")"
+	    parent_rev="$(echo "$copy_log" | sed -e "s#^   A /\(.*\) (from /\(.*\):\([0-9]*\))\$#\3#")"
+
+	    parent="$parent@$parent_rev"
+	    if [ x"${printed[$parent]+set}" != x"set" ]; then
+		queue=("$parent" "$path" "${queue[@]}")
+		info "Backtracking to $parent"
+		continue
+	    fi
+
+	    if [ x"$realpath" != x"${path%@*}" ]; then
+		rev=$(echo "$path" | cut -s -d@ -f 2)
+		if [ x"$rev" != x"" ]; then
+		    error "path $path is versioned and not a branch"
+		fi
+		path="$realpath"
+	    fi
+	fi
+
+	if [ x"${printed[$path]+set}" != x"set" ]; then
+	    printed[$path]=1
+
+	    printf "%s" "$path"
+	    while true; do
+		copy_log=$(svn log -v --stop-on-copy "$svntop/$path" \
+			       | grep -A3 -e "------------------------------------------------------------------------" \
+			       | tail -n3 \
+			       | grep "^   A /.* (from /.*:[0-9]*)\$" \
+			       | tail -n 1)
+		if [ x"$copy_log" = x"" ]; then
+		    break
+		fi
+
+		parent="$(echo "$copy_log" | sed -e "s#^   A /\(.*\) (from /\(.*\):\([0-9]*\))\$#\2#")"
+		parent_rev="$(echo "$copy_log" | sed -e "s#^   A /\(.*\) (from /\(.*\):\([0-9]*\))\$#\3#")"
+
+		path="$parent@$parent_rev"
+
+		printf " %s" "$path"
+	    done
+	    printf "\n"
+	fi
+    else
+	while read newpath; do
+	    newpath=$(echo "$path/$newpath" | sed -e "s#^/##g" -e "s#/\$##g")
+	    queue=("${queue[@]}" "$newpath")
+	done < $tmp
+    fi
+done
-- 
2.17.1


Thread overview: 103+ messages
2019-05-14 16:11 [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git Maxim Kuvyrkov
2019-05-14 21:20 ` Segher Boessenkool
2019-05-15  8:34   ` Maxim Kuvyrkov
2019-05-15 18:47     ` Segher Boessenkool
2019-05-16  9:44       ` Maxim Kuvyrkov
2019-05-15 11:19 ` Richard Biener
2019-05-15 12:08   ` Maxim Kuvyrkov
2019-05-15 18:42     ` Eric Gallager
2019-05-16  0:33       ` Paul Koning
2019-05-16  9:53         ` Maxim Kuvyrkov
2019-05-16 16:22   ` Jeff Law
2019-05-16 16:40     ` Maxim Kuvyrkov
2019-05-16 18:36       ` Ramana Radhakrishnan
2019-05-16 19:07         ` Jeff Law
2019-05-16 22:04           ` Jonathan Wakely
2019-05-17 11:33             ` Martin Liška
2019-05-16 23:54       ` Joseph Myers
2019-05-17  8:19         ` Richard Sandiford
2019-05-17 19:51           ` Segher Boessenkool
2019-05-17 20:59             ` Steve Ellcey
2019-05-17 21:23             ` Jason Merrill
2019-05-20 22:42           ` Joseph Myers
2019-05-21 14:24             ` Richard Earnshaw (lists)
2019-05-21 14:45               ` Jeff Law
2019-05-21 15:02                 ` Richard Earnshaw (lists)
2019-05-21 16:44             ` Segher Boessenkool
2019-05-23 22:33               ` Joseph Myers
2019-05-24  8:58                 ` Segher Boessenkool
2019-05-24 12:02                   ` Florian Weimer
2019-05-29  1:50                   ` Joseph Myers
2019-05-29 13:04                     ` Segher Boessenkool
2019-05-31  0:16                       ` Joseph Myers
2019-06-02 23:13                         ` Segher Boessenkool
2019-06-03 22:33                           ` Joseph Myers
2019-06-03 22:49                             ` Segher Boessenkool
2019-06-05 18:04                             ` Jason Merrill
2019-06-06 10:14                               ` Richard Earnshaw (lists)
2019-06-06 23:41                                 ` Joseph Myers
2019-06-06 23:50                                   ` Ian Lance Taylor
2019-06-07  9:32                                     ` Richard Earnshaw (lists)
2019-06-06 23:36                               ` Joseph Myers
2019-07-22  9:05                                 ` Maxim Kuvyrkov
2019-05-16 23:06 ` Joseph Myers
2019-05-17 12:22   ` Martin Liška
2019-05-17 12:39     ` Jakub Jelinek
2019-05-19  7:35       ` Martin Liška
2019-05-19  8:11         ` Segher Boessenkool
2019-05-19 19:21           ` Marek Polacek
2019-05-19 19:46             ` Andreas Schwab
2019-05-19 19:54             ` Segher Boessenkool
2019-05-19 20:01               ` Andrew Pinski
2019-05-19 20:06                 ` Marek Polacek
2019-05-20  7:29                   ` Martin Liška
2019-05-20 13:56                 ` Florian Weimer
2019-05-20 14:18                   ` Segher Boessenkool
2019-05-20 14:25                   ` Jakub Jelinek
2019-05-20 14:26                   ` Andreas Schwab
2019-05-20 14:29                     ` Jakub Jelinek
2019-05-20 14:36                       ` Andreas Schwab
2019-05-20 15:04                       ` Segher Boessenkool
2019-05-17 14:59     ` Maxim Kuvyrkov
2019-05-19  7:09       ` Martin Liška
2019-05-17 14:56   ` Maxim Kuvyrkov
2019-05-17 13:07 ` Jason Merrill
2019-05-17 15:08   ` Maxim Kuvyrkov
2019-05-20 22:48   ` Joseph Myers
2019-05-28 10:44 ` Maxim Kuvyrkov
2019-07-16 10:21   ` Maxim Kuvyrkov
2019-07-16 12:40     ` Jason Merrill
2019-07-16 14:27       ` Maxim Kuvyrkov
2019-07-20 11:24         ` Maxim Kuvyrkov
2019-07-22  9:35         ` Maxim Kuvyrkov
2019-08-01 20:43           ` Jason Merrill
2019-08-02  8:41             ` Maxim Kuvyrkov
2019-08-02  8:57               ` Richard Biener
2019-08-02 10:27               ` Martin Liška
2019-08-02 10:54                 ` Maxim Kuvyrkov
2019-08-02 11:01                   ` Martin Liška
2019-08-02 11:06                     ` Richard Biener
2019-08-02 11:35                       ` Martin Liška
2019-08-02 22:31                         ` Jason Merrill
2019-08-05 13:20                           ` Martin Liška
2019-08-05 15:20                             ` Monotonically increasing counter (was Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git) Jason Merrill
2019-08-05 15:34                               ` Jakub Jelinek
2019-08-05 15:45                                 ` Richard Earnshaw (lists)
2019-08-05 18:22                                 ` Jason Merrill
2019-08-14 18:49                                   ` Jason Merrill
2019-09-19 19:29                                     ` Jason Merrill
2019-09-21 18:18                                       ` Segher Boessenkool
2019-09-21 20:31                                         ` Nicholas Krause
2019-09-21 21:32                                         ` Jason Merrill
2019-09-22  0:20                                           ` Segher Boessenkool
2019-08-02 14:35                       ` [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git Segher Boessenkool
2019-08-02 14:55                       ` Maxim Kuvyrkov
2019-08-05 16:43                       ` Mike Stump
2019-08-05  8:24               ` Maxim Kuvyrkov
2019-08-06 11:16                 ` Maxim Kuvyrkov
2019-08-23  8:27                   ` Maxim Kuvyrkov
2019-08-23 22:08                     ` Joseph Myers
2019-09-13  7:20                       ` Maxim Kuvyrkov
2019-08-02  8:35           ` Maxim Kuvyrkov
2019-08-02 14:14             ` Maxim Kuvyrkov
2019-08-02 15:47               ` Segher Boessenkool
