From: Martin Liška
To: Joel Brobecker
Cc: Jakub Jelinek, Ian Lance Taylor, gcc-patches, Jonathan Wakely
Subject: Re: Patch RFA: Support non-ASCII file names in git-changelog
Date: Mon, 4 Jan 2021 12:01:58 +0100

On 12/24/20 1:16 PM, Joel Brobecker wrote:
>>> I have no idea who that is (if it is a single user at all,
>>> if it isn't any user with git write permissions).
>>
>> CCing Joel, he should help us how to set a git config
>> that will be used by the server hooks.
>
> I am not sure that requiring both the server and the user to agree
> on a non-default configuration value would be a practical idea.

I agree with that, but I was unable to find a way to "decode" the filenames.

>
> From what I understand of the problem, I think the proper fix
> is really to adapt the git-changelog script to avoid the need
> for any assumption about the user's configuration. In particular,
> how does the script get the list of files?

On the server we use:

git diff HEAD~ --name-status

which works fine with the -z option:

Mcontrib/gcc-changelog/git_repository.pyAšpatně.txt

Without it, the path is quoted as well:

git diff HEAD~ --name-status | cat
M contrib/gcc-changelog/git_repository.py
A "\305\241patn\304\233.txt"

> Poking around, it looks like
> you guys are using the GitPython module, which I'm not familiar with,
> unfortunately. But as a reference point, the git-hooks simply use
> the -z option to get the information in raw format, and thus avoid
> the problem of filename quoting entirely. Does GitPython support
> something similar? For instance, browsing the GitPython documentation,
> I found attributes a_raw_path and b_raw_path. Could that be the
> solution (instead of using a_path and b_path)?

Thanks for looking into it. Unfortunately, for a file called "špatně.txt",
a_rawpath and b_rawpath both come back still quoted:

b'"\\305\\241patn\\304\\233.txt"'
b'"\\305\\241patn\\304\\233.txt"'

>
> Either way, the solution will be independent of the git-hooks,
> as I don't think they are actually involved, here.
>

Anyway, I'm going to update the server hook first, and I'll create an issue
for GitPython.

Thanks for the help,
Martin
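
For reference, the two approaches discussed above can be sketched in Python roughly as follows. This is only a minimal sketch, not the code used by the server hook or by GitPython: the function names parse_name_status_z, unquote_git_path and changed_files are hypothetical, file names are assumed to be UTF-8, and rename/copy entries (which carry two paths per entry) are not handled.

```python
import re
import subprocess
from typing import List, Tuple

# Single-character escapes used in git's C-style quoted paths.
_SIMPLE_ESCAPES = {
    b"a": b"\a", b"b": b"\b", b"f": b"\f", b"n": b"\n",
    b"r": b"\r", b"t": b"\t", b"v": b"\v", b'"': b'"', b"\\": b"\\",
}


def parse_name_status_z(raw: bytes) -> List[Tuple[str, str]]:
    """Split NUL-separated `--name-status -z` output into (status, path) pairs.

    Assumption: only single-path entries (A, M, D, ...), no renames or copies.
    """
    fields = raw.split(b"\0")
    if fields and fields[-1] == b"":
        fields.pop()  # drop the empty field after the final NUL terminator
    return [
        (fields[i].decode("ascii"), fields[i + 1].decode("utf-8"))
        for i in range(0, len(fields) - 1, 2)
    ]


def unquote_git_path(quoted: bytes) -> str:
    """Undo git's C-style quoting, e.g. b'"\\305\\241patn\\304\\233.txt"'."""
    if not (len(quoted) >= 2 and quoted.startswith(b'"') and quoted.endswith(b'"')):
        return quoted.decode("utf-8")  # path was not quoted at all

    def replace(match):
        token = match.group(1)
        if len(token) == 3:  # three-digit octal escape for one byte
            return bytes([int(token, 8)])
        return _SIMPLE_ESCAPES.get(token, token)

    body = quoted[1:-1]
    return re.sub(rb"\\([0-3][0-7]{2}|.)", replace, body).decode("utf-8")


def changed_files(repo_dir: str) -> List[Tuple[str, str]]:
    """Run the same command as above with -z and parse its raw output."""
    result = subprocess.run(
        ["git", "diff", "HEAD~", "--name-status", "-z"],
        cwd=repo_dir, check=True, capture_output=True,
    )
    return parse_name_status_z(result.stdout)
```

Parsing the -z output avoids the quoting question entirely, so it does not depend on how core.quotePath is set on either side. The unquoting helper would only be needed where a tool hands back the already-quoted form, as with the a_rawpath value shown above.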