From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 733BB3858C5E for ; Wed, 14 Jun 2023 19:41:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 733BB3858C5E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8458F1FB; Wed, 14 Jun 2023 12:42:14 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 4155F3F5A1; Wed, 14 Jun 2023 12:41:29 -0700 (PDT) From: Richard Sandiford To: Tamar Christina Mail-Followup-To: Tamar Christina ,Mikael Morin , Richard Earnshaw , "gcc-patches\@gcc.gnu.org" , nd , richard.sandiford@arm.com Cc: Mikael Morin , Richard Earnshaw , "gcc-patches\@gcc.gnu.org" , nd Subject: Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions. References: <30f06911-e19a-a39b-9b56-803cc9ef0ec1@orange.fr> Date: Wed, 14 Jun 2023 20:41:28 +0100 In-Reply-To: (Tamar Christina's message of "Tue, 13 Jun 2023 16:26:35 +0100") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-21.7 required=5.0 tests=BAYES_00,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Tamar Christina writes: > +The syntax rules are as follows: > +@itemize @bullet > +@item > +Templates must start with @samp{@{@@} to use the new syntax. > + > +@item > +@samp{@{@@} is followed by a layout in parentheses which is @samp{cons:} s/parentheses/square brackets/ > +followed by a comma-separated list of @code{match_operand}/@code{match_s= cratch} > +operand numbers, then a semicolon, followed by the same for attributes > +(@samp{attrs:}). Operand modifiers can be placed in this section group = as well. How about: Operand modifiers like @code{=3D} and @code{+} can be placed before an operand number. > +Both sections are optional (so you can use only @samp{cons}, or only > +@samp{attrs}, or both), and @samp{cons} must come before @samp{attrs} if > +present. > + > +@item > +Each alternative begins with any amount of whitespace. > + > +@item > +Following the whitespace is a comma-separated list of "constraints" and/= or > +"attributes" within brackets @code{[]}, with sections separated by a sem= icolon. > + > +@item > +Should you want to copy the previous asm line, the symbol @code{^} can b= e used. > +This allows less copy pasting between alternative and reduces the number= of > +lines to update on changes. > + > +@item > +When using C functions for output, the idiom @samp{* return @var{functio= n};} > +can be replaced with the shorthand @samp{<< @var{function};}. > + > +@item > +Following the closing @samp{]} is any amount of whitespace, and then the= actual > +asm output. > + > +@item > +Spaces are allowed in the list (they will simply be removed). > + > +@item > +All constraint alternatives should be specified. For example, a list of > +of three blank alternatives should be written @samp{[,,]} rather than > +@samp{[]}. > + > +@item > +All attribute alternatives should be non-empty, with @samp{*} > +representing the default attribute value. For example, a list of three > +default attribute values should be written @samp{[*,*,*]} rather than > +@samp{[]}. > + > + Nit: too many blank lines. > +@item > +Within an @samp{@{@@} block both multiline and singleline C comments are > +allowed, but when used outside of a C block they must be the only non-wh= itespace > +blocks on the line. > + > +@item > +Within an @samp{@{@@} block, any iterators that do not get expanded will= result > +in an error. If for some reason it is required to have @code{<} or @cod= e{>} in > +the output then these must be escaped using @backslashchar{}. > + > +@item > +It is possible to use the @samp{attrs} list to specify some attributes a= nd to > +use the normal @code{set_attr} syntax to specify other attributes. Ther= e must > +not be any overlap between the two lists. > + > +In other words, the following is valid: > +@smallexample > +@group > +(define_insn_and_split "" > + [(set (match_operand:SI 0 "nonimmediate_operand") > + (match_operand:SI 1 "aarch64_mov_operand"))] > + "" > + @{@@ [cons: 0, 1; attrs: type, arch, length]@} > + @dots{} > + [(set_attr "foo" "mov_imm")] > +) > +@end group > +@end smallexample > + > +but this is not valid: > +@smallexample > +@group > +(define_insn_and_split "" > + [(set (match_operand:SI 0 "nonimmediate_operand") > + (match_operand:SI 1 "aarch64_mov_operand"))] > + "" > + @{@@ [cons: 0, 1; attrs: type, arch, length]@} > + @dots{} > + [(set_attr "arch" "bar") > + (set_attr "foo" "mov_imm")] > +) > +@end group > +@end smallexample > + > +because you can't mix and match new and old syntax. Maybe =E2=80=9Cbecause it specifies @code{arch} twice=E2=80=9D? Suggesting= that because =E2=80=9Cnew=E2=80=9D and =E2=80=9Cold=E2=80=9D tend not to age well. > +/* Add constraints to an rtx. This function is similar to remove_constr= aints. > + Errors if adding the constraints would overwrite existing constraints= . */ > + > +static void > +add_constraints (rtx part, file_location loc, vec_conlist &cons) > +{ > + const char *format_ptr; > + > + if (part =3D=3D NULL_RTX) > + return; > + > + /* If match_op or match_scr, check if we have the right one, and if so= , copy > + over the constraint list. */ > + if (GET_CODE (part) =3D=3D MATCH_OPERAND || GET_CODE (part) =3D=3D MAT= CH_SCRATCH) > + { > + int field =3D GET_CODE (part) =3D=3D MATCH_OPERAND ? 2 : 1; > + unsigned id =3D XINT (part, 0); > + > + if (id >=3D cons.size ()) > + fatal_at (loc, "could not find match_operand/scratch with id %d",= id); Is this an error? I thought it should be treated like... > + > + if (cons[id].idx =3D=3D -1) > + return; ...cons[id].idx =3D=3D -1 is here. I.e. I think they could be combined to: if (ids >=3D cons.size () || cons[id].idx =3D=3D -1) return; > + > + if (XSTR (part, field)[0] !=3D '\0') > + { > + error_at (loc, "can't mix normal and compact constraint syntax"= ); > + return; > + } > + XSTR (part, field) =3D cons[id].out (); > + cons[id].idx =3D -1; > + } > + > + format_ptr =3D GET_RTX_FORMAT (GET_CODE (part)); > + > + /* Recursively search the rtx. */ > + for (int i =3D 0; i < GET_RTX_LENGTH (GET_CODE (part)); i++) > + switch (*format_ptr++) > + { > + case 'e': > + case 'u': > + add_constraints (XEXP (part, i), loc, cons); > + break; > + case 'E': > + if (XVEC (part, i) !=3D NULL) > + for (int j =3D 0; j < XVECLEN (part, i); j++) > + add_constraints (XVECEXP (part, i, j), loc, cons); > + break; > + default: > + continue; > + } > +} > + > +/* Add ATTRS to definition X's attribute list. */ > + > +static void > +add_attributes (rtx x, vec_conlist &attrs) > +{ > + unsigned int attr_index =3D GET_CODE (x) =3D=3D DEFINE_INSN ? 4 : 3; > + rtvec orig =3D XVEC (x, attr_index); > + size_t n_curr =3D orig ? XVECLEN (x, attr_index) : 0; > + rtvec copy =3D rtvec_alloc (n_curr + attrs.size ()); > + > + /* Create a shallow copy of existing entries. */ > + memcpy (©->elem[attrs.size ()], &orig->elem[0], sizeof (rtx) * n_c= urr); Sorry for not noticing last time, but I think this should strictly be guarded by: if (orig) to avoid calculating &orig->elem[0] on a null pointer. > + XVEC (x, attr_index) =3D copy; > + > + /* Create the new elements. */ > + for (unsigned i =3D 0; i < attrs.size (); i++) > + { > + rtx attr =3D rtx_alloc (SET_ATTR); > + XSTR (attr, 0) =3D xstrdup (attrs[i].name.c_str ()); > + XSTR (attr, 1) =3D attrs[i].out (); > + XVECEXP (x, attr_index, i) =3D attr; > + } > +} > + > +/* Consumes spaces and tabs. */ > + > +static inline void > +skip_spaces (const char **str) > +{ > + while (ISBLANK (**str)) > + (*str)++; > +} > + > +/* Consumes the given character, if it's there. */ > + > +static inline bool > +expect_char (const char **str, char c) > +{ > + if (**str !=3D c) > + return false; > + (*str)++; > + return true; > +} > + > +/* Parses the section layout that follows a "{@" if using new syntax. Bu= ilds > + a vector for a single section. E.g. if we have "attrs: length, arch].= .." > + then list will have two elements, the first for "length" and the seco= nd > + for "arch". */ > + > +static void > +parse_section_layout (file_location loc, const char **templ, const char = *label, > + vec_conlist &list, bool numeric) > +{ > + const char *name_start; > + size_t label_len =3D strlen (label); > + if (strncmp (label, *templ, label_len) =3D=3D 0) > + { > + *templ +=3D label_len; > + > + /* Gather the names. */ > + while (**templ !=3D ';' && **templ !=3D ']') > + { > + skip_spaces (templ); > + name_start =3D *templ; > + int len =3D 0; > + char val =3D (*templ)[len]; > + while (val !=3D ',' && val !=3D ';' && val !=3D ']') > + { > + if (val =3D=3D 0 || val =3D=3D '\n') > + fatal_at (loc, "missing ']'"); > + val =3D (*templ)[++len]; > + } > + *templ +=3D len; > + if (val =3D=3D ',') > + (*templ)++; > + list.push_back (conlist (name_start, len, numeric)); > + } > + } > +} > + > +/* Parse a section, a section is defined as a named space separated list= , e.g. > + > + foo: a, b, c > + > + is a section named "foo" with entries a, b and c. */ > + > +static void > +parse_section (const char **templ, unsigned int n_elems, unsigned int al= t_no, > + vec_conlist &list, file_location loc, const char *name) > +{ > + unsigned int i; > + > + /* Go through the list, one character at a time, adding said character > + to the correct string. */ > + for (i =3D 0; **templ !=3D ']' && **templ !=3D ';'; (*templ)++) > + if (!ISBLANK (**templ)) > + { > + list[i].add (**templ); > + if (**templ =3D=3D ',') > + { > + ++i; > + if (i =3D=3D n_elems) > + fatal_at (loc, "too many %ss in alternative %d: expected %d= ", > + name, alt_no, n_elems); > + } > + if (**templ =3D=3D 0 || **templ =3D=3D '\n') > + fatal_at (loc, "missing ']'"); > + } I think it'd be more obvious for the NIL and EOL check to come first, so that we don't do anything with invalid chars: /* Go through the list, one character at a time, adding said character to the correct string. */ for (i =3D 0; **templ !=3D ']' && **templ !=3D ';'; (*templ)++) if (!ISBLANK (**templ)) { if (**templ =3D=3D 0 || **templ =3D=3D '\n') fatal_at (loc, "missing ']'"); list[i].add (**templ); if (**templ =3D=3D ',') { ++i; if (i =3D=3D n_elems) fatal_at (loc, "too many %ss in alternative %d: expected %d", name, alt_no, n_elems); } } > + > + if (i + 1 < n_elems) > + fatal_at (loc, "too few %ss in alternative %d: expected %d, got %d", > + name, alt_no, n_elems, i); > + > + list[i].add (','); > +} > + > +/* The compact syntax has more convience syntaxes. As such we post proc= ess > + the lines to get them back to something the normal syntax understands= . */ > + > +static void > +preprocess_compact_syntax (file_location loc, int alt_no, std::string &l= ine, > + std::string &last_line) > +{ > + /* Check if we're copying the last statement. */ > + if (line.find ("^") =3D=3D 0 && line.size () =3D=3D 1) > + { > + if (last_line.empty ()) > + fatal_at (loc, "found instruction to copy previous line (^) in" > + "alternative %d but no previous line to copy", alt= _no); > + line =3D last_line; > + return; > + } > + > + std::string result; > + std::string buffer; > + /* Check if we have << which means return c statement. */ > + if (line.find ("<<") =3D=3D 0) > + { > + result.append ("* return "); > + const char *chunk =3D line.c_str () + 2; > + skip_spaces (&chunk); > + result.append (chunk); > + } > + else > + result.append (line); > + > + line =3D result; > + return; > +} > + > +/* Converts an rtx from compact syntax to normal syntax if possible. */ > + > +static void > +convert_syntax (rtx x, file_location loc) > +{ > + int alt_no; > + unsigned int templ_index; > + const char *templ; > + vec_conlist tconvec, convec, attrvec; > + > + templ_index =3D GET_CODE (x) =3D=3D DEFINE_INSN ? 3 : 2; > + > + templ =3D XTMPL (x, templ_index); > + > + /* Templates with constraints start with "{@". */ > + if (strncmp ("*{@", templ, 3)) > + return; > + > + /* Get the layout for the template. */ > + templ +=3D 3; > + skip_spaces (&templ); > + > + if (!expect_char (&templ, '[')) > + fatal_at (loc, "expecing `[' to begin section list"); > + > + parse_section_layout (loc, &templ, "cons:", tconvec, true); > + convec.resize (tconvec.size ()); IMO it'd be better to leave the last line out, since there's nothing particularly special about tconvec.size() for this mapping. > + > + /* Check for any duplicate cons entries and sort based on i. */ > + for (auto e : tconvec) > + { > + unsigned idx =3D e.idx; > + if (idx >=3D convec.size ()) > + convec.resize (idx + 1); > + > + if (convec[idx].idx >=3D 0) > + fatal_at (loc, "duplicate cons number found: %d", idx); > + convec[idx] =3D e; > + } > + tconvec.clear (); > + > + if (*templ !=3D ']') > + { > + if (*templ =3D=3D ';') > + skip_spaces (&(++templ)); > + parse_section_layout (loc, &templ, "attrs:", attrvec, false); > + } > + > + if (!expect_char (&templ, ']')) > + fatal_at (loc, "expecting `]` to end section list - section list mus= t have " > + "cons first, attrs second"); > + > + /* We will write the un-constrainified template into new_templ. */ > + std::string new_templ; > + new_templ.append ("@"); > + > + /* Skip to the first proper line. */ > + while (*templ++ !=3D '\n'); This seems to have reverted to an earlier version, but it has the problems discussed previously: it never terminates if the string doesn't have a '\n', and it allows anything to come between the ']' and the '\n'. I think it should be: /* Skip to the first proper line. */ skip_spaces (&templ); if (*templ =3D=3D 0) fatal_at (loc, "'{@...}' blocks must have at least one alternative"); if (*templ !=3D '\n') fatal_at (loc, "unexpected character '%c' after ']'", *templ); templ++; > + > + alt_no =3D 0; > + std::string last_line; > + > + /* Process the alternatives. */ > + while (*(templ - 1) !=3D '\0') > + { > + /* Skip leading whitespace. */ > + std::string buffer; > + skip_spaces (&templ); > + > + /* Check if we're at the end. */ > + if (templ[0] =3D=3D '}' && templ[1] =3D=3D '\0') > + break; > + > + if (expect_char (&templ, '[')) > + { > + new_templ +=3D '\n'; > + new_templ.append (buffer); > + /* Parse the constraint list, then the attribute list. */ > + if (convec.size () > 0) > + parse_section (&templ, convec.size (), alt_no, convec, loc, > + "constraint"); > + > + if (attrvec.size () > 0) > + { > + if (convec.size () > 0 && !expect_char (&templ, ';')) > + fatal_at (loc, "expected `;' to separate constraints " > + "and attributes in alternative %d", alt_no= ); > + > + parse_section (&templ, attrvec.size (), alt_no, > + attrvec, loc, "attribute"); > + } > + > + if (!expect_char (&templ, ']')) > + fatal_at (loc, "expected end of constraint/attribute list but= " > + "missing an ending `]' in alternative %d", alt= _no); > + } > + else if (templ[0] =3D=3D '/' && templ[1] =3D=3D '/') > + { > + templ +=3D 2; > + /* Glob till newline or end of string. */ > + while (*templ !=3D '\n' || *templ !=3D '\0') > + templ++; > + templ++; This should be deleted, since it will skip over '\0' as well as '\n'. The loop below handles '\n' correctly. > + > + /* Skip any newlines or whitespaces needed. */ > + while (ISSPACE(*templ)) > + templ++; > + continue; > + } > + else if (templ[0] =3D=3D '/' && templ[1] =3D=3D '*') > + { > + templ +=3D 2; > + /* Glob till newline or end of multiline comment. */ > + while (templ[0] !=3D 0 && templ[0] !=3D '*' && templ[1] !=3D '/= ') > + templ++; > + templ +=3D 2; Same problem about moving past '\0' here. But the break condition would stop on things like "*]" or "//", not just "*/". I think it should be: for (; templ[0] !=3D 0; ++templ) if (templ[0] =3D=3D '*' && templ[1] =3D=3D '/') { templ +=3D 2; break; } > + > + /* Skip any newlines or whitespaces needed. */ > + while (ISSPACE(*templ)) > + templ++; > + continue; > + } > + else > + fatal_at (loc, "expected constraint/attribute list at beginning o= f " > + "alternative %d but missing a starting `['", alt_n= o); > + > + /* Skip whitespace between list and asm. */ > + ++templ; Looks like this line should be deleted. We only get here after expect_char (&templ, ']'), which has already skipped the ']'. OK for trunk with those changes, thanks. Richard > + skip_spaces (&templ); > + > + /* Copy asm to new template. */ > + std::string line; > + while (*templ !=3D '\n' && *templ !=3D '\0') > + line +=3D *templ++; > + > + /* Apply any pre-processing needed to the line. */ > + preprocess_compact_syntax (loc, alt_no, line, last_line); > + new_templ.append (line); > + last_line =3D line; > + > + /* Normal "*..." syntax expects the closing quote to be on the fin= al > + line of asm, whereas we allow the closing "}" to be on its own l= ine. > + Postpone copying the '\n' until we know that there is another > + alternative in the list. */ > + while (ISSPACE (*templ)) > + templ++; > + ++alt_no; > + } > + > + /* Write the constraints and attributes into their proper places. */ > + if (convec.size () > 0) > + add_constraints (x, loc, convec); > + > + if (attrvec.size () > 0) > + add_attributes (x, attrvec); > + > + /* Copy over the new un-constrainified template. */ > + XTMPL (x, templ_index) =3D xstrdup (new_templ.c_str ()); > + > + /* Register for later checks during iterator expansions. */ > + compact_syntax.add (x); > +} > + > /* Process a top level rtx in some way, queuing as appropriate. */ > > static void > @@ -553,10 +1036,12 @@ process_rtx (rtx desc, file_location loc) > switch (GET_CODE (desc)) > { > case DEFINE_INSN: > + convert_syntax (desc, loc); > queue_pattern (desc, &define_insn_tail, loc); > break; > > case DEFINE_COND_EXEC: > + convert_syntax (desc, loc); > queue_pattern (desc, &define_cond_exec_tail, loc); > break; > > @@ -631,6 +1116,7 @@ process_rtx (rtx desc, file_location loc) > attr =3D XVEC (desc, split_code + 1); > PUT_CODE (desc, DEFINE_INSN); > XVEC (desc, 4) =3D attr; > + convert_syntax (desc, loc); > > /* Queue them. */ > insn_elem =3D queue_pattern (desc, &define_insn_tail, loc);