From: Richard Sandiford <richard.sandiford@arm.com>
To: Tamar Christina <Tamar.Christina@arm.com>
Cc: Mikael Morin <morin-mikael@orange.fr>,
Richard Earnshaw <Richard.Earnshaw@arm.com>,
"gcc-patches\@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
nd <nd@arm.com>
Subject: Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.
Date: Wed, 14 Jun 2023 20:41:28 +0100 [thread overview]
Message-ID: <mptttv9de6v.fsf@arm.com> (raw)
In-Reply-To: <VI1PR08MB532501BE8432A4B569870958FF55A@VI1PR08MB5325.eurprd08.prod.outlook.com> (Tamar Christina's message of "Tue, 13 Jun 2023 16:26:35 +0100")
Tamar Christina <Tamar.Christina@arm.com> writes:
> +The syntax rules are as follows:
> +@itemize @bullet
> +@item
> +Templates must start with @samp{@{@@} to use the new syntax.
> +
> +@item
> +@samp{@{@@} is followed by a layout in parentheses which is @samp{cons:}
s/parentheses/square brackets/
> +followed by a comma-separated list of @code{match_operand}/@code{match_scratch}
> +operand numbers, then a semicolon, followed by the same for attributes
> +(@samp{attrs:}). Operand modifiers can be placed in this section group as well.
How about:
Operand modifiers like @code{=} and @code{+} can be placed before
an operand number.
> +Both sections are optional (so you can use only @samp{cons}, or only
> +@samp{attrs}, or both), and @samp{cons} must come before @samp{attrs} if
> +present.
> +
> +@item
> +Each alternative begins with any amount of whitespace.
> +
> +@item
> +Following the whitespace is a comma-separated list of "constraints" and/or
> +"attributes" within brackets @code{[]}, with sections separated by a semicolon.
> +
> +@item
> +Should you want to copy the previous asm line, the symbol @code{^} can be used.
> +This allows less copy pasting between alternative and reduces the number of
> +lines to update on changes.
> +
> +@item
> +When using C functions for output, the idiom @samp{* return @var{function};}
> +can be replaced with the shorthand @samp{<< @var{function};}.
> +
> +@item
> +Following the closing @samp{]} is any amount of whitespace, and then the actual
> +asm output.
> +
> +@item
> +Spaces are allowed in the list (they will simply be removed).
> +
> +@item
> +All constraint alternatives should be specified. For example, a list of
> +of three blank alternatives should be written @samp{[,,]} rather than
> +@samp{[]}.
> +
> +@item
> +All attribute alternatives should be non-empty, with @samp{*}
> +representing the default attribute value. For example, a list of three
> +default attribute values should be written @samp{[*,*,*]} rather than
> +@samp{[]}.
> +
> +
Nit: too many blank lines.
> +@item
> +Within an @samp{@{@@} block both multiline and singleline C comments are
> +allowed, but when used outside of a C block they must be the only non-whitespace
> +blocks on the line.
> +
> +@item
> +Within an @samp{@{@@} block, any iterators that do not get expanded will result
> +in an error. If for some reason it is required to have @code{<} or @code{>} in
> +the output then these must be escaped using @backslashchar{}.
> +
> +@item
> +It is possible to use the @samp{attrs} list to specify some attributes and to
> +use the normal @code{set_attr} syntax to specify other attributes. There must
> +not be any overlap between the two lists.
> +
> +In other words, the following is valid:
> +@smallexample
> +@group
> +(define_insn_and_split ""
> + [(set (match_operand:SI 0 "nonimmediate_operand")
> + (match_operand:SI 1 "aarch64_mov_operand"))]
> + ""
> + @{@@ [cons: 0, 1; attrs: type, arch, length]@}
> + @dots{}
> + [(set_attr "foo" "mov_imm")]
> +)
> +@end group
> +@end smallexample
> +
> +but this is not valid:
> +@smallexample
> +@group
> +(define_insn_and_split ""
> + [(set (match_operand:SI 0 "nonimmediate_operand")
> + (match_operand:SI 1 "aarch64_mov_operand"))]
> + ""
> + @{@@ [cons: 0, 1; attrs: type, arch, length]@}
> + @dots{}
> + [(set_attr "arch" "bar")
> + (set_attr "foo" "mov_imm")]
> +)
> +@end group
> +@end smallexample
> +
> +because you can't mix and match new and old syntax.
Maybe “because it specifies @code{arch} twice”? Suggesting that because
“new” and “old” tend not to age well.
> +/* Add constraints to an rtx. This function is similar to remove_constraints.
> + Errors if adding the constraints would overwrite existing constraints. */
> +
> +static void
> +add_constraints (rtx part, file_location loc, vec_conlist &cons)
> +{
> + const char *format_ptr;
> +
> + if (part == NULL_RTX)
> + return;
> +
> + /* If match_op or match_scr, check if we have the right one, and if so, copy
> + over the constraint list. */
> + if (GET_CODE (part) == MATCH_OPERAND || GET_CODE (part) == MATCH_SCRATCH)
> + {
> + int field = GET_CODE (part) == MATCH_OPERAND ? 2 : 1;
> + unsigned id = XINT (part, 0);
> +
> + if (id >= cons.size ())
> + fatal_at (loc, "could not find match_operand/scratch with id %d", id);
Is this an error? I thought it should be treated like...
> +
> + if (cons[id].idx == -1)
> + return;
...cons[id].idx == -1 is here. I.e. I think they could be combined to:
if (ids >= cons.size () || cons[id].idx == -1)
return;
> +
> + if (XSTR (part, field)[0] != '\0')
> + {
> + error_at (loc, "can't mix normal and compact constraint syntax");
> + return;
> + }
> + XSTR (part, field) = cons[id].out ();
> + cons[id].idx = -1;
> + }
> +
> + format_ptr = GET_RTX_FORMAT (GET_CODE (part));
> +
> + /* Recursively search the rtx. */
> + for (int i = 0; i < GET_RTX_LENGTH (GET_CODE (part)); i++)
> + switch (*format_ptr++)
> + {
> + case 'e':
> + case 'u':
> + add_constraints (XEXP (part, i), loc, cons);
> + break;
> + case 'E':
> + if (XVEC (part, i) != NULL)
> + for (int j = 0; j < XVECLEN (part, i); j++)
> + add_constraints (XVECEXP (part, i, j), loc, cons);
> + break;
> + default:
> + continue;
> + }
> +}
> +
> +/* Add ATTRS to definition X's attribute list. */
> +
> +static void
> +add_attributes (rtx x, vec_conlist &attrs)
> +{
> + unsigned int attr_index = GET_CODE (x) == DEFINE_INSN ? 4 : 3;
> + rtvec orig = XVEC (x, attr_index);
> + size_t n_curr = orig ? XVECLEN (x, attr_index) : 0;
> + rtvec copy = rtvec_alloc (n_curr + attrs.size ());
> +
> + /* Create a shallow copy of existing entries. */
> + memcpy (©->elem[attrs.size ()], &orig->elem[0], sizeof (rtx) * n_curr);
Sorry for not noticing last time, but I think this should strictly
be guarded by:
if (orig)
to avoid calculating &orig->elem[0] on a null pointer.
> + XVEC (x, attr_index) = copy;
> +
> + /* Create the new elements. */
> + for (unsigned i = 0; i < attrs.size (); i++)
> + {
> + rtx attr = rtx_alloc (SET_ATTR);
> + XSTR (attr, 0) = xstrdup (attrs[i].name.c_str ());
> + XSTR (attr, 1) = attrs[i].out ();
> + XVECEXP (x, attr_index, i) = attr;
> + }
> +}
> +
> +/* Consumes spaces and tabs. */
> +
> +static inline void
> +skip_spaces (const char **str)
> +{
> + while (ISBLANK (**str))
> + (*str)++;
> +}
> +
> +/* Consumes the given character, if it's there. */
> +
> +static inline bool
> +expect_char (const char **str, char c)
> +{
> + if (**str != c)
> + return false;
> + (*str)++;
> + return true;
> +}
> +
> +/* Parses the section layout that follows a "{@" if using new syntax. Builds
> + a vector for a single section. E.g. if we have "attrs: length, arch]..."
> + then list will have two elements, the first for "length" and the second
> + for "arch". */
> +
> +static void
> +parse_section_layout (file_location loc, const char **templ, const char *label,
> + vec_conlist &list, bool numeric)
> +{
> + const char *name_start;
> + size_t label_len = strlen (label);
> + if (strncmp (label, *templ, label_len) == 0)
> + {
> + *templ += label_len;
> +
> + /* Gather the names. */
> + while (**templ != ';' && **templ != ']')
> + {
> + skip_spaces (templ);
> + name_start = *templ;
> + int len = 0;
> + char val = (*templ)[len];
> + while (val != ',' && val != ';' && val != ']')
> + {
> + if (val == 0 || val == '\n')
> + fatal_at (loc, "missing ']'");
> + val = (*templ)[++len];
> + }
> + *templ += len;
> + if (val == ',')
> + (*templ)++;
> + list.push_back (conlist (name_start, len, numeric));
> + }
> + }
> +}
> +
> +/* Parse a section, a section is defined as a named space separated list, e.g.
> +
> + foo: a, b, c
> +
> + is a section named "foo" with entries a, b and c. */
> +
> +static void
> +parse_section (const char **templ, unsigned int n_elems, unsigned int alt_no,
> + vec_conlist &list, file_location loc, const char *name)
> +{
> + unsigned int i;
> +
> + /* Go through the list, one character at a time, adding said character
> + to the correct string. */
> + for (i = 0; **templ != ']' && **templ != ';'; (*templ)++)
> + if (!ISBLANK (**templ))
> + {
> + list[i].add (**templ);
> + if (**templ == ',')
> + {
> + ++i;
> + if (i == n_elems)
> + fatal_at (loc, "too many %ss in alternative %d: expected %d",
> + name, alt_no, n_elems);
> + }
> + if (**templ == 0 || **templ == '\n')
> + fatal_at (loc, "missing ']'");
> + }
I think it'd be more obvious for the NIL and EOL check to come first,
so that we don't do anything with invalid chars:
/* Go through the list, one character at a time, adding said character
to the correct string. */
for (i = 0; **templ != ']' && **templ != ';'; (*templ)++)
if (!ISBLANK (**templ))
{
if (**templ == 0 || **templ == '\n')
fatal_at (loc, "missing ']'");
list[i].add (**templ);
if (**templ == ',')
{
++i;
if (i == n_elems)
fatal_at (loc, "too many %ss in alternative %d: expected %d",
name, alt_no, n_elems);
}
}
> +
> + if (i + 1 < n_elems)
> + fatal_at (loc, "too few %ss in alternative %d: expected %d, got %d",
> + name, alt_no, n_elems, i);
> +
> + list[i].add (',');
> +}
> +
> +/* The compact syntax has more convience syntaxes. As such we post process
> + the lines to get them back to something the normal syntax understands. */
> +
> +static void
> +preprocess_compact_syntax (file_location loc, int alt_no, std::string &line,
> + std::string &last_line)
> +{
> + /* Check if we're copying the last statement. */
> + if (line.find ("^") == 0 && line.size () == 1)
> + {
> + if (last_line.empty ())
> + fatal_at (loc, "found instruction to copy previous line (^) in"
> + "alternative %d but no previous line to copy", alt_no);
> + line = last_line;
> + return;
> + }
> +
> + std::string result;
> + std::string buffer;
> + /* Check if we have << which means return c statement. */
> + if (line.find ("<<") == 0)
> + {
> + result.append ("* return ");
> + const char *chunk = line.c_str () + 2;
> + skip_spaces (&chunk);
> + result.append (chunk);
> + }
> + else
> + result.append (line);
> +
> + line = result;
> + return;
> +}
> +
> +/* Converts an rtx from compact syntax to normal syntax if possible. */
> +
> +static void
> +convert_syntax (rtx x, file_location loc)
> +{
> + int alt_no;
> + unsigned int templ_index;
> + const char *templ;
> + vec_conlist tconvec, convec, attrvec;
> +
> + templ_index = GET_CODE (x) == DEFINE_INSN ? 3 : 2;
> +
> + templ = XTMPL (x, templ_index);
> +
> + /* Templates with constraints start with "{@". */
> + if (strncmp ("*{@", templ, 3))
> + return;
> +
> + /* Get the layout for the template. */
> + templ += 3;
> + skip_spaces (&templ);
> +
> + if (!expect_char (&templ, '['))
> + fatal_at (loc, "expecing `[' to begin section list");
> +
> + parse_section_layout (loc, &templ, "cons:", tconvec, true);
> + convec.resize (tconvec.size ());
IMO it'd be better to leave the last line out, since there's nothing
particularly special about tconvec.size() for this mapping.
> +
> + /* Check for any duplicate cons entries and sort based on i. */
> + for (auto e : tconvec)
> + {
> + unsigned idx = e.idx;
> + if (idx >= convec.size ())
> + convec.resize (idx + 1);
> +
> + if (convec[idx].idx >= 0)
> + fatal_at (loc, "duplicate cons number found: %d", idx);
> + convec[idx] = e;
> + }
> + tconvec.clear ();
> +
> + if (*templ != ']')
> + {
> + if (*templ == ';')
> + skip_spaces (&(++templ));
> + parse_section_layout (loc, &templ, "attrs:", attrvec, false);
> + }
> +
> + if (!expect_char (&templ, ']'))
> + fatal_at (loc, "expecting `]` to end section list - section list must have "
> + "cons first, attrs second");
> +
> + /* We will write the un-constrainified template into new_templ. */
> + std::string new_templ;
> + new_templ.append ("@");
> +
> + /* Skip to the first proper line. */
> + while (*templ++ != '\n');
This seems to have reverted to an earlier version, but it has the problems
discussed previously: it never terminates if the string doesn't have a '\n',
and it allows anything to come between the ']' and the '\n'.
I think it should be:
/* Skip to the first proper line. */
skip_spaces (&templ);
if (*templ == 0)
fatal_at (loc, "'{@...}' blocks must have at least one alternative");
if (*templ != '\n')
fatal_at (loc, "unexpected character '%c' after ']'", *templ);
templ++;
> +
> + alt_no = 0;
> + std::string last_line;
> +
> + /* Process the alternatives. */
> + while (*(templ - 1) != '\0')
> + {
> + /* Skip leading whitespace. */
> + std::string buffer;
> + skip_spaces (&templ);
> +
> + /* Check if we're at the end. */
> + if (templ[0] == '}' && templ[1] == '\0')
> + break;
> +
> + if (expect_char (&templ, '['))
> + {
> + new_templ += '\n';
> + new_templ.append (buffer);
> + /* Parse the constraint list, then the attribute list. */
> + if (convec.size () > 0)
> + parse_section (&templ, convec.size (), alt_no, convec, loc,
> + "constraint");
> +
> + if (attrvec.size () > 0)
> + {
> + if (convec.size () > 0 && !expect_char (&templ, ';'))
> + fatal_at (loc, "expected `;' to separate constraints "
> + "and attributes in alternative %d", alt_no);
> +
> + parse_section (&templ, attrvec.size (), alt_no,
> + attrvec, loc, "attribute");
> + }
> +
> + if (!expect_char (&templ, ']'))
> + fatal_at (loc, "expected end of constraint/attribute list but "
> + "missing an ending `]' in alternative %d", alt_no);
> + }
> + else if (templ[0] == '/' && templ[1] == '/')
> + {
> + templ += 2;
> + /* Glob till newline or end of string. */
> + while (*templ != '\n' || *templ != '\0')
> + templ++;
> + templ++;
This should be deleted, since it will skip over '\0' as well as '\n'.
The loop below handles '\n' correctly.
> +
> + /* Skip any newlines or whitespaces needed. */
> + while (ISSPACE(*templ))
> + templ++;
> + continue;
> + }
> + else if (templ[0] == '/' && templ[1] == '*')
> + {
> + templ += 2;
> + /* Glob till newline or end of multiline comment. */
> + while (templ[0] != 0 && templ[0] != '*' && templ[1] != '/')
> + templ++;
> + templ += 2;
Same problem about moving past '\0' here. But the break condition would
stop on things like "*]" or "//", not just "*/". I think it should be:
for (; templ[0] != 0; ++templ)
if (templ[0] == '*' && templ[1] == '/')
{
templ += 2;
break;
}
> +
> + /* Skip any newlines or whitespaces needed. */
> + while (ISSPACE(*templ))
> + templ++;
> + continue;
> + }
> + else
> + fatal_at (loc, "expected constraint/attribute list at beginning of "
> + "alternative %d but missing a starting `['", alt_no);
> +
> + /* Skip whitespace between list and asm. */
> + ++templ;
Looks like this line should be deleted. We only get here after
expect_char (&templ, ']'), which has already skipped the ']'.
OK for trunk with those changes, thanks.
Richard
> + skip_spaces (&templ);
> +
> + /* Copy asm to new template. */
> + std::string line;
> + while (*templ != '\n' && *templ != '\0')
> + line += *templ++;
> +
> + /* Apply any pre-processing needed to the line. */
> + preprocess_compact_syntax (loc, alt_no, line, last_line);
> + new_templ.append (line);
> + last_line = line;
> +
> + /* Normal "*..." syntax expects the closing quote to be on the final
> + line of asm, whereas we allow the closing "}" to be on its own line.
> + Postpone copying the '\n' until we know that there is another
> + alternative in the list. */
> + while (ISSPACE (*templ))
> + templ++;
> + ++alt_no;
> + }
> +
> + /* Write the constraints and attributes into their proper places. */
> + if (convec.size () > 0)
> + add_constraints (x, loc, convec);
> +
> + if (attrvec.size () > 0)
> + add_attributes (x, attrvec);
> +
> + /* Copy over the new un-constrainified template. */
> + XTMPL (x, templ_index) = xstrdup (new_templ.c_str ());
> +
> + /* Register for later checks during iterator expansions. */
> + compact_syntax.add (x);
> +}
> +
> /* Process a top level rtx in some way, queuing as appropriate. */
>
> static void
> @@ -553,10 +1036,12 @@ process_rtx (rtx desc, file_location loc)
> switch (GET_CODE (desc))
> {
> case DEFINE_INSN:
> + convert_syntax (desc, loc);
> queue_pattern (desc, &define_insn_tail, loc);
> break;
>
> case DEFINE_COND_EXEC:
> + convert_syntax (desc, loc);
> queue_pattern (desc, &define_cond_exec_tail, loc);
> break;
>
> @@ -631,6 +1116,7 @@ process_rtx (rtx desc, file_location loc)
> attr = XVEC (desc, split_code + 1);
> PUT_CODE (desc, DEFINE_INSN);
> XVEC (desc, 4) = attr;
> + convert_syntax (desc, loc);
>
> /* Queue them. */
> insn_elem = queue_pattern (desc, &define_insn_tail, loc);
next prev parent reply other threads:[~2023-06-14 19:41 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-18 16:30 [PATCH] RFC: " Tamar Christina
2023-04-21 17:18 ` Richard Sandiford
2023-04-24 8:33 ` Richard Sandiford
2023-05-16 13:56 ` Richard Earnshaw (lists)
2023-04-24 9:05 ` Tamar Christina
2023-04-24 9:37 ` Richard Sandiford
2023-06-05 13:51 ` [PATCH v2] machine descriptor: " Tamar Christina
2023-06-05 20:35 ` Richard Sandiford
2023-06-06 7:47 ` Richard Sandiford
2023-06-06 12:00 ` Tamar Christina
2023-06-06 12:49 ` Richard Sandiford
2023-06-06 16:13 ` Richard Earnshaw (lists)
2023-06-08 9:58 ` Tamar Christina
2023-06-08 10:12 ` Andreas Schwab
2023-06-08 10:29 ` Richard Earnshaw (lists)
2023-06-08 10:33 ` Richard Earnshaw (lists)
2023-06-08 14:24 ` Richard Sandiford
2023-06-08 16:49 ` Mikael Morin
2023-06-13 15:26 ` Tamar Christina
2023-06-14 19:41 ` Richard Sandiford [this message]
2023-06-15 6:24 ` Richard Sandiford
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=mptttv9de6v.fsf@arm.com \
--to=richard.sandiford@arm.com \
--cc=Richard.Earnshaw@arm.com \
--cc=Tamar.Christina@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=morin-mikael@orange.fr \
--cc=nd@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).