public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v2] gas: retain whitespace between strings
@ 2022-03-21 14:15 Jan Beulich
  2022-03-23  1:06 ` Alan Modra
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2022-03-21 14:15 UTC (permalink / raw)
  To: Binutils

Macro arguments may be separated by commas or just whitespace. Macro
arguments may also be quoted (where one level of quotes is removed in
the course of determining the values for the respective formal
parameters). Furthermore this quote removal knows _two_ somewhat odd
escaping mechanisms: One, apparently in existence forever, is that a
pair of quotes counts as the escaping of a quote, with the pair being
transformed to a single quote in the course of quote removal. The other
(introduced by c06ae4f232e6) looks more usual on the surface in that it
deals with \" sequences, but it _retains_ the escaping \. Hence only the
former mechanism is suitable when the value to be used by the macro body
is to contain a quote. Yet this results in ambiguity of what "a""b" is
intended to mean; elsewhere (e.g. for .ascii) it represents two
successive string literals. However, in any event is the above different
from "a" "b": I don't think this can be viewed the same as "a""b" when
processing macro arguments.

Change the scrubber to retain such whitespace, by making the processing
of strings more similar to that of symbols. And indeed this appears to
make sense when taking into account that for quite a while gas has been
supporting quoted symbol names.

Taking a more general view, however, the change doesn't go quite far
enough. There are further cases where significant whitespace is removed
by the scrubber. The new testcase enumerates a few in its ".if 0"
section. I'm afraid the only way that I see to deal with this would be
to significantly simplify the scrubber, such that it wouldn't do much
more than collapse sequences of unquoted whitespace into a single blank.
To be honest problems in this area aren't really surprising when seeing
that there's hardly any checking of .macro use throughout the testsuite
(and in particular in the [relatively] generic tests under all/).
---
v2: Avoid the new test failing on (at least) tic4x and tic54x. Update
    .macro doc.

--- a/gas/app.c
+++ b/gas/app.c
@@ -1088,10 +1088,10 @@ do_scrub_chars (size_t (*get) (char *, s
 	      /* PUT didn't jump out.  We could just break, but we
 		 know what will happen, so optimize a bit.  */
 	      ch = GET ();
-	      old_state = 3;
+	      old_state = 9;
 	    }
-	  else if (state == 9)
-	    old_state = 3;
+	  else if (state == 3)
+	    old_state = 9;
 	  else
 	    old_state = state;
 	  state = 5;
--- a/gas/doc/as.texi
+++ b/gas/doc/as.texi
@@ -6186,6 +6186,27 @@ Note: this problem of correctly identify
 also applies to the identifiers used in @code{.irp} (@pxref{Irp})
 and @code{.irpc} (@pxref{Irpc}) as well.
 
+Another issue can occur with the actual arguments passed during macro
+invocation: Multiple arguments can be separated by blanks or commas.  To have
+arguments actually contain blanks or commas (or potentially other non-alpha-
+numeric characters), individual arguments will need to be enclosed in either
+parentheses @code{()}, square brackets @code{[]}, or double quote @code{"}
+characters.  The latter may be the only viable option in certain situations,
+as only double quotes are actually stripped while establishing arguments.  It
+may be important to be aware of two escaping models used when processing such
+quoted argument strings: For one two adjacent double quotes represent a single
+double quote in the resulting argument, going along the lines of the stripping
+of the enclosing quotes.  But then double quotes can also be escaped by a
+backslash @code{\}, but this backslash will not be retained in the resulting
+actual argument as then seen / used while expanding the macro.
+
+As a consequence to the first of these escaping mechanisms two string literals
+intended to be representing separate macro arguments need to be separated by
+white space (or, better yet, by a comma).  To state it differently, such
+adjacent string literals - even if separated only by a blank - will not be
+concatenated when determining macro arguments, even if they're only separated
+by white space.  This is unlike certain other pseudo ops, e.g. @code{.ascii}.
+
 @item .endm
 @cindex @code{endm} directive
 Mark the end of a macro definition.
--- a/gas/testsuite/gas/all/gas.exp
+++ b/gas/testsuite/gas/all/gas.exp
@@ -461,6 +461,8 @@ if [is_elf_format] {
 
 run_dump_test quoted-sym-names
 
+run_list_test macro "-alm"
+
 run_list_test pr20312
 
 load_lib gas-dg.exp
--- /dev/null
+++ b/gas/testsuite/gas/all/macro.l
@@ -0,0 +1,25 @@
+# This should match the output of gas -alm macro.s.
+
+.*: Assembler messages:
+.*:10: Error: too many positional arguments
+.*macro.s.*
+
+
+[ 	]*[1-9][0-9]*[ 	]+\.macro[ 	]+m[ 	]+arg1,[ 	]*arg2[ 	]*
+#...
+[ 	]*[1-9][0-9]*[ 	]+\.endm[ 	]*
+[ 	]*[1-9][0-9]*[ 	]+
+[ 	]*[1-9][0-9]*[ 	]+m[ 	]+1,[ 	]*2
+[ 	]*[1-9][0-9]*[ 	]+.... 0+10*[ 	]+>  .byte 1
+[ 	]*[1-9][0-9]*[ 	]+.... 0+20*[ 	]+>  .byte 2
+[ 	]*[1-9][0-9]*[ 	]+m[ 	]+3[ 	]+4
+[ 	]*[1-9][0-9]*[ 	]+.... 0+30*[ 	]+>  .byte 3
+[ 	]*[1-9][0-9]*[ 	]+.... 0+40*[ 	]+>  .byte 4
+[ 	]*[1-9][0-9]*[ 	]+m[ 	]+"5",[ 	]*"6"
+[ 	]*[1-9][0-9]*[ 	]+.... 0+50*[ 	]+>  .byte 5
+[ 	]*[1-9][0-9]*[ 	]+.... 0+60*[ 	]+>  .byte 6
+[ 	]*[1-9][0-9]*[ 	]+m[ 	]+"7"[ 	]+"8"
+[ 	]*[1-9][0-9]*[ 	]+.... 0+70*[ 	]+>  .byte 7
+[ 	]*[1-9][0-9]*[ 	]+.... 0+80*[ 	]+>  .byte 8
+[ 	]*[1-9][0-9]*[ 	]+m[ 	]+""[ 	]+""[ 	]+""
+#pass
--- /dev/null
+++ b/gas/testsuite/gas/all/macro.s
@@ -0,0 +1,16 @@
+	.macro m arg1, arg2
+	.byte \arg1
+	.byte \arg2
+	.endm
+
+	m 1, 2
+	m 3 4
+	m "5", "6"
+	m "7" "8"
+	m "" "" ""
+
+	.if 0
+	m 1 +2
+	m (3) +4
+	m (5) (6)
+	.endif


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] gas: retain whitespace between strings
  2022-03-21 14:15 [PATCH v2] gas: retain whitespace between strings Jan Beulich
@ 2022-03-23  1:06 ` Alan Modra
  2022-03-23  7:00   ` Jan Beulich
  0 siblings, 1 reply; 6+ messages in thread
From: Alan Modra @ 2022-03-23  1:06 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Mon, Mar 21, 2022 at 03:15:16PM +0100, Jan Beulich via Binutils wrote:
> section. I'm afraid the only way that I see to deal with this would be
> to significantly simplify the scrubber, such that it wouldn't do much
> more than collapse sequences of unquoted whitespace into a single blank.

Agreed.  I said much the same a long time ago.  Patch is OK.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] gas: retain whitespace between strings
  2022-03-23  1:06 ` Alan Modra
@ 2022-03-23  7:00   ` Jan Beulich
  2022-03-23 22:23     ` Alan Modra
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Beulich @ 2022-03-23  7:00 UTC (permalink / raw)
  To: Alan Modra; +Cc: Binutils

On 23.03.2022 02:06, Alan Modra wrote:
> On Mon, Mar 21, 2022 at 03:15:16PM +0100, Jan Beulich via Binutils wrote:
>> section. I'm afraid the only way that I see to deal with this would be
>> to significantly simplify the scrubber, such that it wouldn't do much
>> more than collapse sequences of unquoted whitespace into a single blank.
> 
> Agreed.  I said much the same a long time ago.

In this case, are there reasons speaking against doing so (besides the
effort it takes)? Fear of regressions perhaps (which may be acceptable
to deal with by retaining the present scrubbing logic as a non-default
mode, to be engaged by a command line option for people to use as a
workaround)?

>  Patch is OK.

Thanks.

Jan


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] gas: retain whitespace between strings
  2022-03-23  7:00   ` Jan Beulich
@ 2022-03-23 22:23     ` Alan Modra
  2022-03-24  1:31       ` Hans-Peter Nilsson
  0 siblings, 1 reply; 6+ messages in thread
From: Alan Modra @ 2022-03-23 22:23 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Binutils

On Wed, Mar 23, 2022 at 08:00:15AM +0100, Jan Beulich wrote:
> On 23.03.2022 02:06, Alan Modra wrote:
> > On Mon, Mar 21, 2022 at 03:15:16PM +0100, Jan Beulich via Binutils wrote:
> >> section. I'm afraid the only way that I see to deal with this would be
> >> to significantly simplify the scrubber, such that it wouldn't do much
> >> more than collapse sequences of unquoted whitespace into a single blank.
> > 
> > Agreed.  I said much the same a long time ago.
> 
> In this case, are there reasons speaking against doing so (besides the
> effort it takes)?

I don't think so.  The original intent regarding the gas preprocessor
was to not preprocess gcc output (#NO_APP) *and* for the assembler to
not accept unnecessary whitespace from gcc.  User assembly would be
preprocessed to remove unnecessary whitespace so that most gas code
would not need SKIP_WHITESPACE.  I think this was a bad design,
because the preprocessor needed to know too much info about syntax.
In any case the ideal proved difficult to achieve, so we have
SKIP_WHITESPACE all over the place anyway.

We may need a few more instances of SKIP_WHITESPACE if app.c is
simplified.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] gas: retain whitespace between strings
  2022-03-23 22:23     ` Alan Modra
@ 2022-03-24  1:31       ` Hans-Peter Nilsson
  2022-03-24  8:54         ` Andreas Schwab
  0 siblings, 1 reply; 6+ messages in thread
From: Hans-Peter Nilsson @ 2022-03-24  1:31 UTC (permalink / raw)
  To: Alan Modra; +Cc: Jan Beulich, Binutils

On Thu, 24 Mar 2022, Alan Modra via Binutils wrote:
> On Wed, Mar 23, 2022 at 08:00:15AM +0100, Jan Beulich wrote:
> > On 23.03.2022 02:06, Alan Modra wrote:
> > > On Mon, Mar 21, 2022 at 03:15:16PM +0100, Jan Beulich via Binutils wrote:
> > >> section. I'm afraid the only way that I see to deal with this would be
> > >> to significantly simplify the scrubber, such that it wouldn't do much
> > >> more than collapse sequences of unquoted whitespace into a single blank.
> > >
> > > Agreed.  I said much the same a long time ago.
> >
> > In this case, are there reasons speaking against doing so (besides the
> > effort it takes)?
>
> I don't think so.  The original intent regarding the gas preprocessor
> was to not preprocess gcc output (#NO_APP) *and* for the assembler to
> not accept unnecessary whitespace from gcc.  User assembly would be
> preprocessed to remove unnecessary whitespace so that most gas code
> would not need SKIP_WHITESPACE.  I think this was a bad design,
> because the preprocessor needed to know too much info about syntax.
> In any case the ideal proved difficult to achieve, so we have
> SKIP_WHITESPACE all over the place anyway.

For the record (you know it, I know it, Jan may know it, but
probably not many other people), few gcc targets actually emit a
valid "#NO_APP\n" (exactly) as the *very first* line from
gcc-generated assembly, and as such go through the scrubbing
process for also for all gcc-generated assembly code.

Looking closer, I see it's just one target at the moment,
because you also need to set targetm.asm_file_start_app_off to
true; that is, besides the '#define ASM_APP_OFF "#NO_APP\n"'.

brgds, H-P

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] gas: retain whitespace between strings
  2022-03-24  1:31       ` Hans-Peter Nilsson
@ 2022-03-24  8:54         ` Andreas Schwab
  0 siblings, 0 replies; 6+ messages in thread
From: Andreas Schwab @ 2022-03-24  8:54 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: Alan Modra, Binutils

On Mär 23 2022, Hans-Peter Nilsson wrote:

> Looking closer, I see it's just one target at the moment,
> because you also need to set targetm.asm_file_start_app_off to
> true; that is, besides the '#define ASM_APP_OFF "#NO_APP\n"'.

There is also the target hook TARGET_ASM_FILE_START_APP_OFF, which is
defined by m68k and vax.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-03-24  8:54 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-21 14:15 [PATCH v2] gas: retain whitespace between strings Jan Beulich
2022-03-23  1:06 ` Alan Modra
2022-03-23  7:00   ` Jan Beulich
2022-03-23 22:23     ` Alan Modra
2022-03-24  1:31       ` Hans-Peter Nilsson
2022-03-24  8:54         ` Andreas Schwab

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).