* Error when accessing git read-only archive
@ 2021-09-13 12:52 Thomas Koenig
2021-09-13 13:01 ` Jonathan Wakely
0 siblings, 1 reply; 10+ messages in thread
From: Thomas Koenig @ 2021-09-13 12:52 UTC (permalink / raw)
To: gcc mailing list
Hi,
I just got an error when accessing the gcc git pages at
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git , it is:
This page contains the following errors:
error on line 91 at column 6: XML declaration allowed only at the start
of the document
Below is a rendering of the page up to the first error.
Just to let you know (and it would be nice if this could be fixed :-)
Best regards
Thomas
* Re: Error when accessing git read-only archive
2021-09-13 12:52 Error when accessing git read-only archive Thomas Koenig
@ 2021-09-13 13:01 ` Jonathan Wakely
2021-09-13 13:03 ` Jonathan Wakely
0 siblings, 1 reply; 10+ messages in thread
From: Jonathan Wakely @ 2021-09-13 13:01 UTC (permalink / raw)
To: Thomas Koenig; +Cc: gcc mailing list
On Mon, 13 Sept 2021 at 13:53, Thomas Koenig via Gcc <gcc@gcc.gnu.org> wrote:
>
> Hi,
>
> I just got an error when accessing the gcc git pages at
> https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git , it is:
>
> This page contains the following errors:
> error on line 91 at column 6: XML declaration allowed only at the start
> of the document
> Below is a rendering of the page up to the first error.
The web server seems to restart the page in the middle of the HTML,
the content contains:
</tr>
<tr class="light">
Content-type: text/html
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">
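The browser's complaint can be reproduced in isolation. As a minimal sketch (Python is used only for illustration, and the markup below is a made-up stand-in, not the actual gitweb output), an XML parser rejects any document in which an XML declaration appears after the document has already started:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the broken page: a second XML declaration
# shows up in the middle of a document that has already started.
BROKEN_PAGE = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    "<html><body><table><tr>\n"
    '<?xml version="1.0" encoding="utf-8"?>\n'
    "</tr></table></body></html>"
)

# The same page without the stray declaration.
CLEAN_PAGE = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    "<html><body><table><tr>\n"
    "</tr></table></body></html>"
)


def parses_as_xml(text: str) -> bool:
    """Return True if text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False
```

Here `parses_as_xml(BROKEN_PAGE)` is false while `parses_as_xml(CLEAN_PAGE)` is true, which matches a browser giving up at the point where the second page begins.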
* Re: Error when accessing git read-only archive
2021-09-13 13:01 ` Jonathan Wakely
@ 2021-09-13 13:03 ` Jonathan Wakely
2021-09-15 13:18 ` Jonathan Wakely
2021-09-15 13:21 ` David Malcolm
0 siblings, 2 replies; 10+ messages in thread
From: Jonathan Wakely @ 2021-09-13 13:03 UTC (permalink / raw)
To: Thomas Koenig; +Cc: gcc mailing list
On Mon, 13 Sept 2021 at 14:01, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>
> On Mon, 13 Sept 2021 at 13:53, Thomas Koenig via Gcc <gcc@gcc.gnu.org> wrote:
> >
> > Hi,
> >
> > I just got an error when accessing the gcc git pages at
> > https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git , it is:
> >
> > This page contains the following errors:
> > error on line 91 at column 6: XML declaration allowed only at the start
> > of the document
> > Below is a rendering of the page up to the first error.
>
> The web server seems to restart the page in the middle of the HTML,
> the content contains:
>
> </tr>
> <tr class="light">
> Content-type: text/html
>
> <?xml version="1.0" encoding="utf-8"?>
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">
Ah, the "second" page it's trying to display (in the middle of the
first) is an error:
<div class="page_body">
<br /><br />
500 - Internal Server Error
<br />
<hr />
Wide character in subroutine entry at /var/www/git/gitweb.cgi line 2208.
</div>
* Re: Error when accessing git read-only archive
2021-09-13 13:03 ` Jonathan Wakely
@ 2021-09-15 13:18 ` Jonathan Wakely
2021-09-15 13:21 ` David Malcolm
1 sibling, 0 replies; 10+ messages in thread
From: Jonathan Wakely @ 2021-09-15 13:18 UTC (permalink / raw)
To: Thomas Koenig; +Cc: gcc mailing list, Jan-Benedict Glaw
On Mon, 13 Sept 2021 at 14:03, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>
> On Mon, 13 Sept 2021 at 14:01, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> >
> > On Mon, 13 Sept 2021 at 13:53, Thomas Koenig via Gcc <gcc@gcc.gnu.org> wrote:
> > >
> > > Hi,
> > >
> > > I just got an error when accessing the gcc git pages at
> > > https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git , it is:
> > >
> > > This page contains the following errors:
> > > error on line 91 at column 6: XML declaration allowed only at the start
> > > of the document
> > > Below is a rendering of the page up to the first error.
> >
> > The web server seems to restart the page in the middle of the HTML,
> > the content contains:
> >
> > </tr>
> > <tr class="light">
> > Content-type: text/html
> >
> > <?xml version="1.0" encoding="utf-8"?>
> > <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> > <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">
>
> Ah, the "second" page it's trying to display (in the middle of the
> first) is an error:
>
> <div class="page_body">
> <br /><br />
> 500 - Internal Server Error
> <br />
> <hr />
> Wide character in subroutine entry at /var/www/git/gitweb.cgi line 2208.
>
> </div>
Jan-Benedict managed to push a commit with a non-ASCII author email,
which gitweb can't handle.
f42e95a830ab48e59389065ce79a013a519646f1 says "@ług-owl.de"
* Re: Error when accessing git read-only archive
2021-09-13 13:03 ` Jonathan Wakely
2021-09-15 13:18 ` Jonathan Wakely
@ 2021-09-15 13:21 ` David Malcolm
2021-09-15 13:37 ` Jan-Benedict Glaw
1 sibling, 1 reply; 10+ messages in thread
From: David Malcolm @ 2021-09-15 13:21 UTC (permalink / raw)
To: Jonathan Wakely, Thomas Koenig; +Cc: gcc mailing list
On Mon, 2021-09-13 at 14:03 +0100, Jonathan Wakely via Gcc wrote:
> On Mon, 13 Sept 2021 at 14:01, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> >
> > On Mon, 13 Sept 2021 at 13:53, Thomas Koenig via Gcc <gcc@gcc.gnu.org> wrote:
> > >
> > > Hi,
> > >
> > > I just got an error when accessing the gcc git pages at
> > > https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git , it is:
> > >
> > > This page contains the following errors:
> > > error on line 91 at column 6: XML declaration allowed only at the start
> > > of the document
> > > Below is a rendering of the page up to the first error.
> >
> > The web server seems to restart the page in the middle of the HTML,
> > the content contains:
> >
> > </tr>
> > <tr class="light">
> > Content-type: text/html
> >
> > <?xml version="1.0" encoding="utf-8"?>
> > <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> > <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">
>
> Ah, the "second" page it's trying to display (in the middle of the
> first) is an error:
>
> <div class="page_body">
> <br /><br />
> 500 - Internal Server Error
> <br />
> <hr />
> Wide character in subroutine entry at /var/www/git/gitweb.cgi line 2208.
>
> </div>
Summarizing some notes from IRC:
The last commit it manages to print successfully in that log seems to
be:
c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
so it appears that:
f42e95a830ab48e59389065ce79a013a519646f1
is triggering the issue, and indeed
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f42e95a830ab48e59389065ce79a013a519646f1
fails in a similar way, whereas other commits work.
It appears to be due to the "ł" character in the email address of the
Author, in that:
commit c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
Author: Jan-Benedict Glaw <jbglaw@lug-owl.de>
works, whereas:
commit f42e95a830ab48e59389065ce79a013a519646f1
Author: Jan-Benedict Glaw <jbglaw@ług-owl.de>
doesn't.
git show f42e95a830ab48e59389065ce79a013a519646f1 | hexdump -C
shows:
00000030 41 75 74 68 6f 72 3a 20 4a 61 6e 2d 42 65 6e 65 |Author: Jan-Bene|
00000040 64 69 63 74 20 47 6c 61 77 20 3c 6a 62 67 6c 61 |dict Glaw <jbgla|
00000050 77 40 c5 82 75 67 2d 6f 77 6c 2e 64 65 3e 0a 44 |w@..ug-owl.de>.D|
00000060 61 74 65 3a 20 20 20 4d 6f 6e 20 53 65 70 20 31 |ate: Mon Sep 1|
i.e. we have the two bytes 0xc5 0x82, which is the UTF-8 encoding of "ł".
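The byte sequence can be double-checked without the repository. A minimal sketch in Python (chosen only for illustration) that encodes U+0142 and compares it with the plain ASCII letter:

```python
# U+0142 is LATIN SMALL LETTER L WITH STROKE, the character in the bad address.
stroke_l = "\u0142"
plain_l = "l"

stroke_bytes = stroke_l.encode("utf-8")
plain_bytes = plain_l.encode("utf-8")

# Print each encoding as space-separated hex, like the hexdump columns.
print(" ".join(f"{b:02x}" for b in stroke_bytes))  # c5 82
print(" ".join(f"{b:02x}" for b in plain_bytes))   # 6c
```

The two-byte result is what shows up as ".." in the ASCII column of the hexdump, since neither byte is a printable ASCII character.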
$ git format-patch c012297c9d5dfb177adf1423bdd05e5f4b87e5ec^^..c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
0001-Fix-multi-statment-macro.patch
0002-cr16-elf-is-now-obsoleted.patch
$ file *.patch
0001-Fix-multi-statment-macro.patch: unified diff output, UTF-8 Unicode text
0002-cr16-elf-is-now-obsoleted.patch: unified diff output, ASCII text
Hope this is helpful
Dave
* Re: Error when accessing git read-only archive
2021-09-15 13:21 ` David Malcolm
@ 2021-09-15 13:37 ` Jan-Benedict Glaw
2021-09-15 13:43 ` Jonathan Wakely
0 siblings, 1 reply; 10+ messages in thread
From: Jan-Benedict Glaw @ 2021-09-15 13:37 UTC (permalink / raw)
To: David Malcolm; +Cc: Jonathan Wakely, Thomas Koenig, gcc mailing list
Hi David, Jonathan and all others,
On Wed, 2021-09-15 09:21:04 -0400, David Malcolm via Gcc <gcc@gcc.gnu.org> wrote:
> On Mon, 2021-09-13 at 14:03 +0100, Jonathan Wakely via Gcc wrote:
> > On Mon, 13 Sept 2021 at 14:01, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> > > On Mon, 13 Sept 2021 at 13:53, Thomas Koenig via Gcc <gcc@gcc.gnu.org> wrote:
>
> Summarizing some notes from IRC:
>
> The last commit it manages to print successfully in that log seems to
> be:
> c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
> so it appears that:
> f42e95a830ab48e59389065ce79a013a519646f1
> is triggering the issue, and indeed
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f42e95a830ab48e59389065ce79a013a519646f1
> fails in a similar way, whereas other commits work.
>
> It appears to be due to the "ł" character in the email address of the
> Author, in that:
>
> commit c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
> Author: Jan-Benedict Glaw <jbglaw@lug-owl.de>
>
> works, whereas:
>
> commit f42e95a830ab48e59389065ce79a013a519646f1
> Author: Jan-Benedict Glaw <jbglaw@ług-owl.de>
That was indeed me, after moving my GCC repo to a different machine
and adding an explicit user.email (as this wasn't automatically
picking up a proper domain.) The "ł" was a typo (AltGr key still
pressed while typing the "l" after having entered the "@" which
requires it on a German keyboard layout.)
So I broke it. Any way to make sure something like this doesn't
occur again?
Sorry for the inconvenience!
Jan-Benedict
* Re: Error when accessing git read-only archive
2021-09-15 13:37 ` Jan-Benedict Glaw
@ 2021-09-15 13:43 ` Jonathan Wakely
2021-09-15 14:10 ` Mark Wielaard
2021-09-15 14:14 ` Jan-Benedict Glaw
0 siblings, 2 replies; 10+ messages in thread
From: Jonathan Wakely @ 2021-09-15 13:43 UTC (permalink / raw)
To: Jan-Benedict Glaw; +Cc: David Malcolm, Thomas Koenig, gcc mailing list
On Wed, 15 Sept 2021 at 14:37, Jan-Benedict Glaw <jbglaw@lug-owl.de> wrote:
>
> Hi David, Jonathan and all others,
>
> On Wed, 2021-09-15 09:21:04 -0400, David Malcolm via Gcc <gcc@gcc.gnu.org> wrote:
> > On Mon, 2021-09-13 at 14:03 +0100, Jonathan Wakely via Gcc wrote:
> > > On Mon, 13 Sept 2021 at 14:01, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> > > > On Mon, 13 Sept 2021 at 13:53, Thomas Koenig via Gcc <gcc@gcc.gnu.org> wrote:
> >
> > Summarizing some notes from IRC:
> >
> > The last commit it manages to print successfully in that log seems to
> > be:
> > c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
> > so it appears that:
> > f42e95a830ab48e59389065ce79a013a519646f1
> > is triggering the issue, and indeed
> > https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f42e95a830ab48e59389065ce79a013a519646f1
> > fails in a similar way, whereas other commits work.
> >
> > It appears to be due to the "ł" character in the email address of the
> > Author, in that:
> >
> > commit c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
> > Author: Jan-Benedict Glaw <jbglaw@lug-owl.de>
> >
> > works, whereas:
> >
> > commit f42e95a830ab48e59389065ce79a013a519646f1
> > Author: Jan-Benedict Glaw <jbglaw@ług-owl.de>
>
> That was indeed me, after moving my GCC repo to a different machine
> and adding an explicit user.email (as this wasn't automatically
> picking up a proper domain.) The "ł" was a typo (AltGr key still
> pressed while typing the "l" after having entered the "@" which
> requires it on a German keyboard layout.)
>
> So I broke it. Any way to make sure something like this doesn't
> occur again?
We could add a check to the git hooks (and gcc-verify alias) to reject
non-ASCII email addresses, since they're probably mistakes.
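A rough sketch of what such a check could look like, written in Python purely for illustration. This is not the actual GCC hook code, and the function names `is_ascii_email` and `check_author` are hypothetical:

```python
def is_ascii_email(email: str) -> bool:
    """Return True if every character of the address is 7-bit ASCII."""
    return email.isascii()


def check_author(author_email: str) -> None:
    """Raise ValueError for an address that a check of this kind would reject."""
    if not is_ascii_email(author_email):
        raise ValueError(
            f"non-ASCII character in email address: {author_email!r}"
        )
```

In a real hook the address would have to be read from the pushed commits, for example from the output of `git log -1 --format=%ae <rev>`; that plumbing is left out here.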
And we should report it to Gitweb (if it isn't already fixed upstream)
and get a fix into the version used on gcc.gnu.org.
* Re: Error when accessing git read-only archive
2021-09-15 13:43 ` Jonathan Wakely
@ 2021-09-15 14:10 ` Mark Wielaard
2021-09-15 18:34 ` Jan-Benedict Glaw
2021-09-15 14:14 ` Jan-Benedict Glaw
1 sibling, 1 reply; 10+ messages in thread
From: Mark Wielaard @ 2021-09-15 14:10 UTC (permalink / raw)
To: Jonathan Wakely, Jan-Benedict Glaw; +Cc: Thomas Koenig, gcc mailing list
Hi,
On Wed, 2021-09-15 at 14:43 +0100, Jonathan Wakely via Gcc wrote:
> On Wed, 15 Sept 2021 at 14:37, Jan-Benedict Glaw <jbglaw@lug-owl.de> wrote:
> > On Wed, 2021-09-15 09:21:04 -0400, David Malcolm via Gcc <gcc@gcc.gnu.org> wrote:
> > > It appears to be due to the "ł" character in the email address of the
> > > Author, in that:
> > >
> > > commit c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
> > > Author: Jan-Benedict Glaw <jbglaw@lug-owl.de>
> > >
> > > works, whereas:
> > >
> > > commit f42e95a830ab48e59389065ce79a013a519646f1
> > > Author: Jan-Benedict Glaw <jbglaw@ług-owl.de>
> >
> > That was indeed me, after moving my GCC repo to a different machine
> > and adding an explicit user.email (as this wasn't automatically
> > picking up a proper domain.) The "ł" was a typo (AltGr key still
> > pressed while typing the "l" after having entered the "@" which
> > requires it on a German keyboard layout.)
> >
> > So I broke it. Any way to make sure something like this doesn't
> > occur again?
>
> We could add a check to the git hooks (and gcc-verify alias) to reject
> non-ASCII email addresses, since they're probably mistakes.
>
> And we should report it to Gitweb (if it isn't already fixed upstream)
> and get a fix into the version used on gcc.gnu.org.
The issue is the gravatar support, which calls md5_hex($email).
For now I disabled gravatar support on sourceware.org/gcc.gnu.org in
/etc/gitweb.conf
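The underlying point is that MD5 is defined over bytes, not characters, so a string holding a character such as "ł" has to be encoded before it is hashed. The following is a Python analogue for illustration only, not the gitweb code; the function name `gravatar_url` and the URL layout are assumptions modelled on the gravatar URL that gitweb builds:

```python
import hashlib


def gravatar_url(email: str, size: int) -> str:
    """Build a Gravatar-style URL; the text is encoded to UTF-8 before hashing."""
    digest = hashlib.md5(email.encode("utf-8")).hexdigest()
    return f"//www.gravatar.com/avatar/{digest}?s={size}"


def hashes_without_encoding(email: str) -> bool:
    """Return True if hashlib accepts the text string directly (it does not)."""
    try:
        hashlib.md5(email)  # type: ignore[arg-type]
        return True
    except TypeError:
        return False
```

Python refuses every unencoded text string here, so the mistake shows up immediately. Perl's `md5_hex` only fails once a string contains a character that does not fit in a single byte, which is why the problem stayed hidden until this commit appeared.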
Cheers,
Mark
* Re: Error when accessing git read-only archive
2021-09-15 13:43 ` Jonathan Wakely
2021-09-15 14:10 ` Mark Wielaard
@ 2021-09-15 14:14 ` Jan-Benedict Glaw
1 sibling, 0 replies; 10+ messages in thread
From: Jan-Benedict Glaw @ 2021-09-15 14:14 UTC (permalink / raw)
To: Jonathan Wakely; +Cc: David Malcolm, Thomas Koenig, gcc mailing list
Hi Jonathan!
On Wed, 2021-09-15 14:43:45 +0100, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> On Wed, 15 Sept 2021 at 14:37, Jan-Benedict Glaw <jbglaw@lug-owl.de> wrote:
[UTF-8 in committer's email addresses]
> > So I broke it. Any way to make sure something like this doesn't
> > occur again?
>
> We could add a check to the git hooks (and gcc-verify alias) to reject
> non-ASCII email addresses, since they're probably mistakes.
It was indeed a typo for me, but others might, in the long run,
actually use IDNs. Should they prepare their commits using Punycode?
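For reference, a sketch of what a Punycode conversion of the domain part could look like. It is written in Python for illustration and relies on the standard library's `idna` codec, which implements the older IDNA 2003 rules; the helper name `to_ascii_address` is hypothetical, and the local part is left untouched:

```python
def to_ascii_address(address: str) -> str:
    """Convert the domain part of an address to its ASCII (Punycode) form."""
    local, _, domain = address.rpartition("@")
    # The "idna" codec encodes each dot-separated label; labels that are
    # already ASCII pass through unchanged.
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"
```

An address whose domain is already ASCII comes back unchanged, while a label containing "ł" is returned in its `xn--` form.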
> And we should report it to Gitweb (if it isn't already fixed upstream)
> and get a fix into the version used on gcc.gnu.org.
I hope the local fix is already forwarded. That was quite a Brown
Paperbag typo. :(
Sorry,
Jan-Benedict
* Re: Error when accessing git read-only archive
2021-09-15 14:10 ` Mark Wielaard
@ 2021-09-15 18:34 ` Jan-Benedict Glaw
0 siblings, 0 replies; 10+ messages in thread
From: Jan-Benedict Glaw @ 2021-09-15 18:34 UTC (permalink / raw)
To: Mark Wielaard; +Cc: Jonathan Wakely, Thomas Koenig, gcc mailing list
Hi,
On Wed, 2021-09-15 16:10:50 +0200, Mark Wielaard <mark@klomp.org> wrote:
[UTF-8 email address containing a 'ł']
> The issue is the gravatar support, which calls md5_hex($email).
> For now I disabled gravatar support on sourceware.org/gcc.gnu.org in
> /etc/gitweb.conf
I am not a Perl guy, but it seems this works (tested locally):
--- a/gitweb/gitweb.perl 2021-09-15 20:23:13.788195846 +0200
+++ b/gitweb/gitweb.perl 2021-09-15 20:24:19.911806868 +0200
@@ -2193,7 +2193,7 @@
my $size = shift;
$avatar_cache{$email} ||=
"//www.gravatar.com/avatar/" .
- md5_hex($email) . "?s=";
+ md5_hex(utf8::is_utf8($email)? Encode::encode_utf8($email): $email) . "?s=";
return $avatar_cache{$email} . $size;
}
I'll send that to the Git mailing list and ask for verification.
Thanks,
Jan-Benedict