* Error when accessing git read-only archive
From: Thomas Koenig @ 2021-09-13 12:52 UTC
  To: gcc mailing list

Hi,

I just got an error when accessing the gcc git pages at
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git , it is:

This page contains the following errors:
error on line 91 at column 6: XML declaration allowed only at the start
of the document
Below is a rendering of the page up to the first error.

Just to let you know (and it would be nice if this could be fixed :-)

Best regards

	Thomas

* Re: Error when accessing git read-only archive
From: Jonathan Wakely @ 2021-09-13 13:01 UTC
  To: Thomas Koenig; +Cc: gcc mailing list

On Mon, 13 Sept 2021 at 13:53, Thomas Koenig via Gcc <gcc@gcc.gnu.org> wrote:
>
> Hi,
>
> I just got an error when accessing the gcc git pages at
> https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git , it is:
>
> This page contains the following errors:
> error on line 91 at column 6: XML declaration allowed only at the start
> of the document
> Below is a rendering of the page up to the first error.

The web server seems to restart the page in the middle of the HTML,
the content contains:

</tr>
<tr class="light">
Content-type: text/html

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">

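For readers who want to reproduce the browser's complaint, the following is a minimal Python sketch, not something posted in the thread. It assumes a shortened, made-up page (`BROKEN_PAGE`) in which a second CGI header and XML declaration appear inside an open `<tr>`, and it uses the standard-library parser rather than whatever parser the browser used, so the exact error wording and position may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the broken gitweb response: a second
# document (CGI header plus XML declaration) begins inside an open <tr>.
BROKEN_PAGE = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<html><body><table>\n'
    '<tr class="light">\n'
    'Content-type: text/html\n'
    '\n'
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<html></html>\n'
    '</tr></table></body></html>\n'
)


def first_xml_error(document: str):
    """Return (line, column) of the first XML parse error, or None if well-formed."""
    try:
        # Parse from bytes so the encoding declaration is honoured as written.
        ET.fromstring(document.encode("utf-8"))
    except ET.ParseError as err:
        return err.position
    return None


if __name__ == "__main__":
    print(first_xml_error(BROKEN_PAGE))
```

A strict XML parser stops at the second declaration, which is why an XHTML page served this way renders only up to that point.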
* Re: Error when accessing git read-only archive
From: Jonathan Wakely @ 2021-09-13 13:03 UTC
  To: Thomas Koenig; +Cc: gcc mailing list

On Mon, 13 Sept 2021 at 14:01, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>
> On Mon, 13 Sept 2021 at 13:53, Thomas Koenig via Gcc <gcc@gcc.gnu.org> wrote:
> >
> > Hi,
> >
> > I just got an error when accessing the gcc git pages at
> > https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git , it is:
> >
> > This page contains the following errors:
> > error on line 91 at column 6: XML declaration allowed only at the start
> > of the document
> > Below is a rendering of the page up to the first error.
>
> The web server seems to restart the page in the middle of the HTML,
> the content contains:
>
> </tr>
> <tr class="light">
> Content-type: text/html
>
> <?xml version="1.0" encoding="utf-8"?>
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">

Ah, the "second" page it's trying to display (in the middle of the
first) is an error:

<div class="page_body">
<br /><br />
500 - Internal Server Error
<br />
<hr />
Wide character in subroutine entry at /var/www/git/gitweb.cgi line 2208.

</div>

* Re: Error when accessing git read-only archive
From: Jonathan Wakely @ 2021-09-15 13:18 UTC
  To: Thomas Koenig; +Cc: gcc mailing list, Jan-Benedict Glaw

On Mon, 13 Sept 2021 at 14:03, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>
> On Mon, 13 Sept 2021 at 14:01, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> >
> > On Mon, 13 Sept 2021 at 13:53, Thomas Koenig via Gcc <gcc@gcc.gnu.org> wrote:
> > >
> > > Hi,
> > >
> > > I just got an error when accessing the gcc git pages at
> > > https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git , it is:
> > >
> > > This page contains the following errors:
> > > error on line 91 at column 6: XML declaration allowed only at the start
> > > of the document
> > > Below is a rendering of the page up to the first error.
> >
> > The web server seems to restart the page in the middle of the HTML,
> > the content contains:
> >
> > </tr>
> > <tr class="light">
> > Content-type: text/html
> >
> > <?xml version="1.0" encoding="utf-8"?>
> > <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> > <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">
>
> Ah, the "second" page it's trying to display (in the middle of the
> first) is an error:
>
> <div class="page_body">
> <br /><br />
> 500 - Internal Server Error
> <br />
> <hr />
> Wide character in subroutine entry at /var/www/git/gitweb.cgi line 2208.
>
> </div>

Jan-Benedict managed to push a commit with a non-ASCII author email,
which gitweb can't handle.

f42e95a830ab48e59389065ce79a013a519646f1 says "@ług-owl.de"

* Re: Error when accessing git read-only archive
From: David Malcolm @ 2021-09-15 13:21 UTC
  To: Jonathan Wakely, Thomas Koenig; +Cc: gcc mailing list

On Mon, 2021-09-13 at 14:03 +0100, Jonathan Wakely via Gcc wrote:
> On Mon, 13 Sept 2021 at 14:01, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> >
> > On Mon, 13 Sept 2021 at 13:53, Thomas Koenig via Gcc <gcc@gcc.gnu.org> wrote:
> > >
> > > Hi,
> > >
> > > I just got an error when accessing the gcc git pages at
> > > https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git , it is:
> > >
> > > This page contains the following errors:
> > > error on line 91 at column 6: XML declaration allowed only at the start
> > > of the document
> > > Below is a rendering of the page up to the first error.
> >
> > The web server seems to restart the page in the middle of the HTML,
> > the content contains:
> >
> > </tr>
> > <tr class="light">
> > Content-type: text/html
> >
> > <?xml version="1.0" encoding="utf-8"?>
> > <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> > <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">
>
> Ah, the "second" page it's trying to display (in the middle of the
> first) is an error:
>
> <div class="page_body">
> <br /><br />
> 500 - Internal Server Error
> <br />
> <hr />
> Wide character in subroutine entry at /var/www/git/gitweb.cgi line 2208.
>
> </div>

Summarizing some notes from IRC:

The last commit it manages to print successfully in that log seems to
be:
  c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
so it appears that:
  f42e95a830ab48e59389065ce79a013a519646f1
is triggering the issue, and indeed
  https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f42e95a830ab48e59389065ce79a013a519646f1
fails in a similar way, whereas other commits work.

It appears to be due to the "ł" character in the email address of the
Author, in that:

  commit c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
  Author: Jan-Benedict Glaw <jbglaw@lug-owl.de>

works, whereas:

  commit f42e95a830ab48e59389065ce79a013a519646f1
  Author: Jan-Benedict Glaw <jbglaw@ług-owl.de>

doesn't.

  git show f42e95a830ab48e59389065ce79a013a519646f1 | hexdump -C

shows:

00000030  41 75 74 68 6f 72 3a 20  4a 61 6e 2d 42 65 6e 65  |Author: Jan-Bene|
00000040  64 69 63 74 20 47 6c 61  77 20 3c 6a 62 67 6c 61  |dict Glaw <jbgla|
00000050  77 40 c5 82 75 67 2d 6f  77 6c 2e 64 65 3e 0a 44  |w@..ug-owl.de>.D|
00000060  61 74 65 3a 20 20 20 4d  6f 6e 20 53 65 70 20 31  |ate:   Mon Sep 1|

i.e. we have the two bytes 0xc5 0x82, which is the UTF-8 encoding of
"ł".

$ git format-patch c012297c9d5dfb177adf1423bdd05e5f4b87e5ec^^..c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
0001-Fix-multi-statment-macro.patch
0002-cr16-elf-is-now-obsoleted.patch

$ file *.patch
0001-Fix-multi-statment-macro.patch:  unified diff output, UTF-8 Unicode text
0002-cr16-elf-is-now-obsoleted.patch: unified diff output, ASCII text

Hope this is helpful
Dave

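The byte-level reading of the hexdump can be checked with a short Python sketch. This is an illustration added for readers, not part of David's message; `AUTHOR_LINE` is rebuilt by hand from the bytes shown in the dump, and the helper name `non_ascii_offsets` is made up for this example.

```python
from typing import List

# The two bytes highlighted in the hexdump.
SUSPECT_BYTES = bytes([0xC5, 0x82])

# Author line as stored in the commit, rebuilt from the hexdump bytes.
AUTHOR_LINE = b"Author: Jan-Benedict Glaw <jbglaw@" + SUSPECT_BYTES + b"ug-owl.de>"


def non_ascii_offsets(raw: bytes) -> List[int]:
    """Return the offsets of every byte outside the 7-bit ASCII range."""
    return [i for i, byte in enumerate(raw) if byte > 0x7F]


# Decoding the pair as UTF-8 yields a single character.
decoded = SUSPECT_BYTES.decode("utf-8")

if __name__ == "__main__":
    print(decoded, hex(ord(decoded)))
    print(non_ascii_offsets(AUTHOR_LINE))
```

The offsets are relative to the start of the "Author:" line, which begins at 0x30 in the dump, so offset 34 corresponds to address 0x52.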
* Re: Error when accessing git read-only archive
From: Jan-Benedict Glaw @ 2021-09-15 13:37 UTC
  To: David Malcolm; +Cc: Jonathan Wakely, Thomas Koenig, gcc mailing list

Hi David, Jonathan and all others,

On Wed, 2021-09-15 09:21:04 -0400, David Malcolm via Gcc <gcc@gcc.gnu.org> wrote:
> On Mon, 2021-09-13 at 14:03 +0100, Jonathan Wakely via Gcc wrote:
> > On Mon, 13 Sept 2021 at 14:01, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> > > On Mon, 13 Sept 2021 at 13:53, Thomas Koenig via Gcc <gcc@gcc.gnu.org> wrote:
>
> Summarizing some notes from IRC:
>
> The last commit it manages to print successfully in that log seems to
> be:
>   c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
> so it appears that:
>   f42e95a830ab48e59389065ce79a013a519646f1
> is triggering the issue, and indeed
>   https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f42e95a830ab48e59389065ce79a013a519646f1
> fails in a similar way, whereas other commits work.
>
> It appears to be due to the "ł" character in the email address of the
> Author, in that:
>
>   commit c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
>   Author: Jan-Benedict Glaw <jbglaw@lug-owl.de>
>
> works, whereas:
>
>   commit f42e95a830ab48e59389065ce79a013a519646f1
>   Author: Jan-Benedict Glaw <jbglaw@ług-owl.de>

That was indeed me, after moving my GCC repo to a different machine
and adding an explicit user.email (as this wasn't automatically
picking up a proper domain.) The "ł" was a typo (AltGr key still
pressed while typing the "l" after having entered the "@" which
requires it on a German keyboard layout.)

So I broke it. Any way to make sure something like this doesn't
occur again?

Sorry for the inconvenience!

Jan-Benedict

* Re: Error when accessing git read-only archive
From: Jonathan Wakely @ 2021-09-15 13:43 UTC
  To: Jan-Benedict Glaw; +Cc: David Malcolm, Thomas Koenig, gcc mailing list

On Wed, 15 Sept 2021 at 14:37, Jan-Benedict Glaw <jbglaw@lug-owl.de> wrote:
>
> Hi David, Jonathan and all others,
>
> On Wed, 2021-09-15 09:21:04 -0400, David Malcolm via Gcc <gcc@gcc.gnu.org> wrote:
> > On Mon, 2021-09-13 at 14:03 +0100, Jonathan Wakely via Gcc wrote:
> > > On Mon, 13 Sept 2021 at 14:01, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> > > > On Mon, 13 Sept 2021 at 13:53, Thomas Koenig via Gcc <gcc@gcc.gnu.org> wrote:
> >
> > Summarizing some notes from IRC:
> >
> > The last commit it manages to print successfully in that log seems to
> > be:
> >   c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
> > so it appears that:
> >   f42e95a830ab48e59389065ce79a013a519646f1
> > is triggering the issue, and indeed
> >   https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f42e95a830ab48e59389065ce79a013a519646f1
> > fails in a similar way, whereas other commits work.
> >
> > It appears to be due to the "ł" character in the email address of the
> > Author, in that:
> >
> >   commit c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
> >   Author: Jan-Benedict Glaw <jbglaw@lug-owl.de>
> >
> > works, whereas:
> >
> >   commit f42e95a830ab48e59389065ce79a013a519646f1
> >   Author: Jan-Benedict Glaw <jbglaw@ług-owl.de>
>
> That was indeed me, after moving my GCC repo to a different machine
> and adding an explicit user.email (as this wasn't automatically
> picking up a proper domain.) The "ł" was a typo (AltGr key still
> pressed while typing the "l" after having entered the "@" which
> requires it on a German keyboard layout.)
>
> So I broke it. Any way to make sure something like this doesn't
> occur again?

We could add a check to the git hooks (and gcc-verify alias) to reject
non-ASCII email addresses, since they're probably mistakes.

And we should report it to Gitweb (if it isn't already fixed upstream)
and get a fix into the version used on gcc.gnu.org.

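As a rough idea of what such a check might look like, here is a minimal Python sketch. It is not the actual GCC hook code; the function name `email_is_ascii` and the "Name <email>" input format are assumptions made for illustration. Only the email part is tested, so non-ASCII author names would still be accepted.

```python
import re

# Matches "Name <email>" identities in the form git prints them.
_IDENT_RE = re.compile(r"^(?P<name>.*?)\s*<(?P<email>[^<>]*)>$")


def email_is_ascii(identity: str) -> bool:
    """True if the email part of a 'Name <email>' identity is pure ASCII."""
    match = _IDENT_RE.match(identity.strip())
    if match is None:
        return False  # treat an unparsable identity as a failure
    return match.group("email").isascii()


if __name__ == "__main__":
    for ident in (
        "Jan-Benedict Glaw <jbglaw@lug-owl.de>",
        "Jan-Benedict Glaw <jbglaw@\u0142ug-owl.de>",
    ):
        print(ident, "ok" if email_is_ascii(ident) else "REJECT")
```

A hook could feed this function identities obtained from something like `git log --format='%an <%ae>'` over the pushed range and refuse the push on the first rejection; how that would be wired into the existing hooks is left open here.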
* Re: Error when accessing git read-only archive
From: Mark Wielaard @ 2021-09-15 14:10 UTC
  To: Jonathan Wakely, Jan-Benedict Glaw; +Cc: Thomas Koenig, gcc mailing list

Hi,

On Wed, 2021-09-15 at 14:43 +0100, Jonathan Wakely via Gcc wrote:
> On Wed, 15 Sept 2021 at 14:37, Jan-Benedict Glaw <jbglaw@lug-owl.de> wrote:
> > On Wed, 2021-09-15 09:21:04 -0400, David Malcolm via Gcc <gcc@gcc.gnu.org> wrote:
> > > It appears to be due to the "ł" character in the email address of
> > > the Author, in that:
> > >
> > >   commit c012297c9d5dfb177adf1423bdd05e5f4b87e5ec
> > >   Author: Jan-Benedict Glaw <jbglaw@lug-owl.de>
> > >
> > > works, whereas:
> > >
> > >   commit f42e95a830ab48e59389065ce79a013a519646f1
> > >   Author: Jan-Benedict Glaw <jbglaw@ług-owl.de>
> >
> > That was indeed me, after moving my GCC repo to a different machine
> > and adding an explicit user.email (as this wasn't automatically
> > picking up a proper domain.) The "ł" was a typo (AltGr key still
> > pressed while typing the "l" after having entered the "@" which
> > requires it on a German keyboard layout.)
> >
> > So I broke it. Any way to make sure something like this doesn't
> > occur again?
>
> We could add a check to the git hooks (and gcc-verify alias) to reject
> non-ASCII email addresses, since they're probably mistakes.
>
> And we should report it to Gitweb (if it isn't already fixed upstream)
> and get a fix into the version used on gcc.gnu.org.

The issue is the gravatar support, which calls md5_hex($email).
For now I disabled gravatar support on sourceware.org/gcc.gnu.org in
/etc/gitweb.conf

Cheers,

Mark

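The failure mode can be imitated in Python, with the caveat that this is an analogy rather than a reproduction of the Perl behaviour. The sketch assumes that a byte-oriented digest can only accept characters that fit in one byte, and models that with a Latin-1 encode step; `md5_hex_bytes_only` is a made-up name.

```python
import hashlib


def md5_hex_bytes_only(text: str) -> str:
    """Hash text as a byte-oriented digest sees it: one byte per character.

    Characters above U+00FF do not fit in a byte, so the encode step
    raises UnicodeEncodeError, standing in for the "Wide character" error.
    """
    return hashlib.md5(text.encode("latin-1")).hexdigest()


if __name__ == "__main__":
    print(md5_hex_bytes_only("jbglaw@lug-owl.de"))
    try:
        md5_hex_bytes_only("jbglaw@\u0142ug-owl.de")
    except UnicodeEncodeError as err:
        print("cannot hash:", err.reason)
```

The ASCII address hashes without trouble, while the address containing "ł" (U+0142) fails before the digest is ever computed, which matches the observation that only the one commit broke the page.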
* Re: Error when accessing git read-only archive
From: Jan-Benedict Glaw @ 2021-09-15 18:34 UTC
  To: Mark Wielaard; +Cc: Jonathan Wakely, Thomas Koenig, gcc mailing list

Hi,

On Wed, 2021-09-15 16:10:50 +0200, Mark Wielaard <mark@klomp.org> wrote:

[UTF-8 email address containing a 'ł']

> The issue is the gravatar support, which calls md5_hex($email).
> For now I disabled gravatar support on sourceware.org/gcc.gnu.org in
> /etc/gitweb.conf

I am not a Perl guy, but it seems this works (tested locally):

--- a/gitweb/gitweb.perl	2021-09-15 20:23:13.788195846 +0200
+++ b/gitweb/gitweb.perl	2021-09-15 20:24:19.911806868 +0200
@@ -2193,7 +2193,7 @@
 	my $size = shift;
 	$avatar_cache{$email} ||=
 		"//www.gravatar.com/avatar/" .
-			md5_hex($email) . "?s=";
+			md5_hex(utf8::is_utf8($email)? Encode::encode_utf8($email): $email) . "?s=";
 	return $avatar_cache{$email} . $size;
 }

I'll send that to the GIT mailing list and ask for verification.

Thanks,
  Jan-Benedict

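The idea behind the patch, encoding the address to UTF-8 bytes before hashing, can be sketched in Python as follows. This is an illustrative translation, not gitweb code: `gravatar_url` and `_avatar_cache` merely mirror the names visible in the diff, and unlike the Perl version this sketch always encodes instead of checking a flag first.

```python
import hashlib

# email -> URL prefix, mirroring %avatar_cache in the patch context.
_avatar_cache = {}


def gravatar_url(email: str, size: int) -> str:
    """Build an avatar URL, hashing the UTF-8 bytes of the address."""
    if email not in _avatar_cache:
        # Encoding first turns every character into a defined byte sequence,
        # so the digest never sees a character wider than one byte.
        digest = hashlib.md5(email.encode("utf-8")).hexdigest()
        _avatar_cache[email] = "//www.gravatar.com/avatar/" + digest + "?s="
    return _avatar_cache[email] + str(size)


if __name__ == "__main__":
    print(gravatar_url("jbglaw@\u0142ug-owl.de", 16))
```

For a pure-ASCII address the UTF-8 bytes equal the ASCII bytes, so existing avatar URLs would stay the same under this approach.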
* Re: Error when accessing git read-only archive
From: Jan-Benedict Glaw @ 2021-09-15 14:14 UTC
  To: Jonathan Wakely; +Cc: David Malcolm, Thomas Koenig, gcc mailing list

Hi Jonathan!

On Wed, 2021-09-15 14:43:45 +0100, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> On Wed, 15 Sept 2021 at 14:37, Jan-Benedict Glaw <jbglaw@lug-owl.de> wrote:

[UTF-8 in committer's email addresses]

> > So I broke it. Any way to make sure something like this doesn't
> > occur again?
>
> We could add a check to the git hooks (and gcc-verify alias) to reject
> non-ASCII email addresses, since they're probably mistakes.

It was indeed a typo for me, but others might, in the long run,
actually use IDNs. Should they prepare their commits using Punycode?

> And we should report it to Gitweb (if it isn't already fixed upstream)
> and get a fix into the version used on gcc.gnu.org.

I hope the local fix is already forwarded. That was quite a Brown
Paperbag typo. :(

Sorry,
  Jan-Benedict

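To make the Punycode question concrete, here is a small Python sketch of converting only the domain part of an address to its ASCII form. It is an illustration, not a recommendation from the thread; `to_ascii_email` is a made-up helper, and it relies on Python's built-in `idna` codec, which implements the older IDNA 2003 rules.

```python
def to_ascii_email(address: str) -> str:
    """Convert the domain of an email address to its ASCII (Punycode) form.

    The local part is left untouched; IDNA only covers host names.
    """
    if "@" not in address:
        raise ValueError("not an email address: %r" % address)
    local, _, domain = address.rpartition("@")
    return local + "@" + domain.encode("idna").decode("ascii")


if __name__ == "__main__":
    print(to_ascii_email("jbglaw@\u0142ug-owl.de"))
    print(to_ascii_email("jbglaw@lug-owl.de"))
```

An address written this way would pass an ASCII-only hook check, at the cost of being harder to read in `git log`.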