From: Andrew Burgess
To: gcc-patches@gcc.gnu.org
Cc: Andrew Burgess
Subject: [PATCH] libiberty/buildargv: POSIX behaviour for backslash handling
Date: Wed, 6 Dec 2023 16:50:48 +0000
Message-Id: <24a8d878590403540bc9b579ba58805985a4d2f7.1701881419.git.aburgess@redhat.com>

GDB makes use of the libiberty function buildargv for splitting the
inferior (program being debugged) argument string in the case where
the inferior is not being started under a shell.

I have recently been working to improve this area of GDB, and have
tracked down some of the unexpected behaviour to the libiberty
function buildargv, and how it handles backslash escapes.

For reference, I've been mostly reading:

  https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html

The issues that I would like to fix are:

  1. Backslashes within single quotes should not be treated as an
  escape, thus: '\a' should split to \a, retaining the backslash.

  2. Backslashes within double quotes should only act as an escape if
  they are immediately before one of the characters $ (dollar),
  ` (backtick), " (double quote), \ (backslash), or \n (newline).
  In all other cases a backslash should not be treated as an escape
  character.  Thus: "\a" should split to \a, but "\$" should split
  to $.

  3. A backslash-newline sequence should be treated as a line
  continuation: both the backslash and the newline should be removed.

I've updated libiberty and also added some tests.  All the existing
libiberty tests continue to pass, but I'm not sure if there is more
testing that should be done.  buildargv is used within lto-wrapper.cc,
so maybe there's some testing folk can suggest that I run?
---
 libiberty/argv.c                      |  8 +++++--
 libiberty/testsuite/test-expandargv.c | 34 +++++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/libiberty/argv.c b/libiberty/argv.c
index c2823d3e4ba..6bae4ca2ee9 100644
--- a/libiberty/argv.c
+++ b/libiberty/argv.c
@@ -224,9 +224,13 @@ char **buildargv (const char *input)
               if (bsquote)
                 {
                   bsquote = 0;
-                  *arg++ = *input;
+                  if (*input != '\n')
+                    *arg++ = *input;
                 }
-              else if (*input == '\\')
+              else if (*input == '\\'
+                       && !squote
+                       && (!dquote
+                           || strchr ("$`\"\\\n", *(input + 1)) != NULL))
                 {
                   bsquote = 1;
                 }
diff --git a/libiberty/testsuite/test-expandargv.c b/libiberty/testsuite/test-expandargv.c
index 30f2337ef77..b8dcc6a269a 100644
--- a/libiberty/testsuite/test-expandargv.c
+++ b/libiberty/testsuite/test-expandargv.c
@@ -142,6 +142,40 @@ const char *test_data[] = {
   "b",
   0,

+  /* Test 7 - No backslash removal within single quotes.  */
+  "'a\\$VAR' '\\\"'",    /* Test 7 data */
+  ARGV0,
+  "@test-expandargv-7.lst",
+  0,
+  ARGV0,
+  "a\\$VAR",
+  "\\\"",
+  0,
+
+  /* Test 8 - Remove backslash / newline pairs.  */
+  "\"ab\\\ncd\" ef\\\ngh",    /* Test 8 data */
+  ARGV0,
+  "@test-expandargv-8.lst",
+  0,
+  ARGV0,
+  "abcd",
+  "efgh",
+  0,
+
+  /* Test 9 - Backslash within double quotes.  */
+  "\"\\$VAR\" \"\\`\" \"\\\"\" \"\\\\\" \"\\n\" \"\\t\"",    /* Test 9 data */
+  ARGV0,
+  "@test-expandargv-9.lst",
+  0,
+  ARGV0,
+  "$VAR",
+  "`",
+  "\"",
+  "\\",
+  "\\n",
+  "\\t",
+  0,
+
   0 /* Test done marker, don't remove. */
 };

base-commit: 458e7c937924bbcef80eb006af0b61420dbfc1c1
-- 
2.25.4
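
For anyone who wants to experiment with the three rules outside
libiberty, here is a minimal standalone sketch in C.  It is not the
libiberty implementation: sketch_split, sketch_nth_arg, and the
fixed-size buffers are hypothetical names and simplifications chosen
for illustration only.  The explicit end-of-string check before the
strchr call is an addition of this sketch, not something taken from
the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define SKETCH_MAX_ARGS 8
#define SKETCH_ARG_LEN 64

/* Hypothetical helper, not part of libiberty: split INPUT into ARGS
   using the three backslash rules from the commit message.  Returns
   the number of arguments found.  Over-long arguments are silently
   truncated; unterminated quotes simply end at the end of INPUT.  */
int
sketch_split (const char *input, char args[][SKETCH_ARG_LEN], int max_args)
{
  int argc = 0;

  while (argc < max_args)
    {
      int squote = 0, dquote = 0, bsquote = 0;
      size_t len = 0;

      while (isspace ((unsigned char) *input))
        input++;
      if (*input == '\0')
        break;

      for (; *input != '\0'; input++)
        {
          char c = *input;

          /* Unquoted, unescaped whitespace ends the argument.  */
          if (!squote && !dquote && !bsquote && isspace ((unsigned char) c))
            break;

          if (bsquote)
            {
              bsquote = 0;
              if (c == '\n')
                continue;       /* Rule 3: drop backslash-newline.  */
            }
          else if (c == '\\' && !squote
                   && (!dquote
                       || (input[1] != '\0'
                           && strchr ("$`\"\\\n", input[1]) != NULL)))
            {
              bsquote = 1;      /* Rules 1 and 2: a real escape.  */
              continue;
            }
          else if (squote)
            {
              if (c == '\'')
                {
                  squote = 0;
                  continue;
                }
            }
          else if (dquote)
            {
              if (c == '"')
                {
                  dquote = 0;
                  continue;
                }
            }
          else if (c == '\'')
            {
              squote = 1;
              continue;
            }
          else if (c == '"')
            {
              dquote = 1;
              continue;
            }

          /* Anything that reaches here is a literal character.  */
          if (len + 1 < SKETCH_ARG_LEN)
            args[argc][len++] = c;
        }
      args[argc][len] = '\0';
      argc++;
    }
  return argc;
}

/* Hypothetical convenience wrapper: return argument N of INPUT, or
   NULL if INPUT has fewer than N + 1 arguments.  The result lives in
   static storage and is overwritten by the next call.  */
const char *
sketch_nth_arg (const char *input, int n)
{
  static char args[SKETCH_MAX_ARGS][SKETCH_ARG_LEN];
  int argc = sketch_split (input, args, SKETCH_MAX_ARGS);

  return (n >= 0 && n < argc) ? args[n] : NULL;
}
```

With this sketch, the input "\a" "\$" (both in double quotes) splits
so that argument 0 keeps its backslash while argument 1 is just $,
which is the behaviour described in item 2 of the list.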