From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 761C33858D32 for ; Mon, 24 Apr 2023 09:37:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 761C33858D32 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id BA008D75; Mon, 24 Apr 2023 02:38:00 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 250FE3F5A1; Mon, 24 Apr 2023 02:37:16 -0700 (PDT) From: Richard Sandiford To: Tamar Christina Mail-Followup-To: Tamar Christina ,"gcc-patches\@gcc.gnu.org" , nd , Richard Earnshaw , richard.sandiford@arm.com Cc: "gcc-patches\@gcc.gnu.org" , nd , Richard Earnshaw Subject: Re: [PATCH] RFC: New compact syntax for insn and insn_split in Machine Descriptions References: Date: Mon, 24 Apr 2023 10:37:14 +0100 In-Reply-To: (Tamar Christina's message of "Mon, 24 Apr 2023 09:05:42 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-24.5 required=5.0 tests=BAYES_00,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Tamar Christina writes: >> -----Original Message----- >> From: Richard Sandiford >> Sent: Friday, April 21, 2023 6:19 PM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> >> Subject: Re: [PATCH] RFC: New compact syntax for insn and insn_split in >> Machine Descriptions >> >> Tamar Christina writes: >> > Hi All, >> > >> > This patch adds support for a compact syntax for specifying >> > constraints in instruction patterns. Credit for the idea goes to Richard >> Earnshaw. >> > >> > I am sending up this RFC to get feedback for it's inclusion in GCC 14. >> > With this new syntax we want a clean break from the current >> > limitations to make something that is hopefully easier to use and maintain. >> > >> > The idea behind this compact syntax is that often times it's quite >> > hard to correlate the entries in the constrains list, attributes and instruction >> lists. >> > >> > One has to count and this often is tedious. Additionally when >> > changing a single line in the insn multiple lines in a diff change, >> > making it harder to see what's going on. >> > >> > This new syntax takes into account many of the common things that are >> done in MD >> > files. It's also worth saying that this version is intended to deal with the >> > common case of a string based alternatives. For C chunks we have some >> ideas >> > but those are not intended to be addressed here. >> > >> > It's easiest to explain with an example: >> > >> > normal syntax: >> > >> > (define_insn_and_split "*movsi_aarch64" >> > [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r, r,w, m, m, >> r, r, r, w,r,w, w") >> > (match_operand:SI 1 "aarch64_mov_operand" " >> r,r,k,M,n,Usv,m,m,rZ,w,Usw,Usa,Ush,rZ,w,w,Ds"))] >> > "(register_operand (operands[0], SImode) >> > || aarch64_reg_or_zero (operands[1], SImode))" >> > "@ >> > mov\\t%w0, %w1 >> > mov\\t%w0, %w1 >> > mov\\t%w0, %w1 >> > mov\\t%w0, %1 >> > # >> > * return aarch64_output_sve_cnt_immediate (\"cnt\", \"%x0\", >> operands[1]); >> > ldr\\t%w0, %1 >> > ldr\\t%s0, %1 >> > str\\t%w1, %0 >> > str\\t%s1, %0 >> > adrp\\t%x0, %A1\;ldr\\t%w0, [%x0, %L1] >> > adr\\t%x0, %c1 >> > adrp\\t%x0, %A1 >> > fmov\\t%s0, %w1 >> > fmov\\t%w0, %s1 >> > fmov\\t%s0, %s1 >> > * return aarch64_output_scalar_simd_mov_immediate (operands[1], >> SImode);" >> > "CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL >> (operands[1]), SImode) >> > && REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))" >> > [(const_int 0)] >> > "{ >> > aarch64_expand_mov_immediate (operands[0], operands[1]); >> > DONE; >> > }" >> > ;; The "mov_imm" type for CNT is just a placeholder. >> > [(set_attr "type" >> "mov_reg,mov_reg,mov_reg,mov_imm,mov_imm,mov_imm,load_4, >> > >> load_4,store_4,store_4,load_4,adr,adr,f_mcr,f_mrc,fmov,neon_move") >> > (set_attr "arch" "*,*,*,*,*,sve,*,fp,*,fp,*,*,*,fp,fp,fp,simd") >> > (set_attr "length" "4,4,4,4,*, 4,4, 4,4, 4,8,4,4, 4, 4, 4, 4") >> > ] >> > ) >> > >> > New syntax: >> > >> > (define_insn_and_split "*movsi_aarch64" >> > [(set (match_operand:SI 0 "nonimmediate_operand") >> > (match_operand:SI 1 "aarch64_mov_operand"))] >> > "(register_operand (operands[0], SImode) >> > || aarch64_reg_or_zero (operands[1], SImode))" >> > "@@ (cons: 0 1; attrs: type arch length) >> > [=r, r ; mov_reg , * , 4] mov\t%w0, %w1 >> > [k , r ; mov_reg , * , 4] ^ >> > [r , k ; mov_reg , * , 4] ^ >> > [r , M ; mov_imm , * , 4] mov\t%w0, %1 >> > [r , n ; mov_imm , * , *] # >> > [r , Usv; mov_imm , sve , 4] << aarch64_output_sve_cnt_immediate ('cnt', >> '%x0', operands[1]); >> > [r , m ; load_4 , * , 4] ldr\t%w0, %1 >> > [w , m ; load_4 , fp , 4] ldr\t%s0, %1 >> > [m , rZ ; store_4 , * , 4] str\t%w1, %0 >> > [m , w ; store_4 , fp , 4] str\t%s1, %0 >> > [r , Usw; load_4 , * , 8] adrp\t%x0, %A1;ldr\t%w0, [%x0, %L1] >> > [r , Usa; adr , * , 4] adr\t%x0, %c1 >> > [r , Ush; adr , * , 4] adrp\t%x0, %A1 >> > [w , rZ ; f_mcr , fp , 4] fmov\t%s0, %w1 >> > [r , w ; f_mrc , fp , 4] fmov\t%w0, %s1 >> > [w , w ; fmov , fp , 4] fmov\t%s0, %s1 >> > [w , Ds ; neon_move, simd, 4] << >> aarch64_output_scalar_simd_mov_immediate (operands[1], SImode);" >> > "CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL >> (operands[1]), SImode) >> > && REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))" >> > [(const_int 0)] >> > { >> > aarch64_expand_mov_immediate (operands[0], operands[1]); >> > DONE; >> > } >> > ;; The "mov_imm" type for CNT is just a placeholder. >> > ) >> > >> > The patch contains some more rewritten examples for both Arm and >> > AArch64. I have included them for examples in this RFC but the final >> > version posted in GCC 14 will have these split out. >> > >> > The main syntax rules are as follows (See docs for full rules): >> > - Template must start with "@@" to use the new syntax. >> > - "@@" is followed by a layout in parentheses which is "cons:" followed by >> > a list of match_operand/match_scratch IDs, then a semicolon, then the >> > same for attributes ("attrs:"). Both sections are optional (so you can >> > use only cons, or only attrs, or both), and cons must come before attrs >> > if present. >> > - Each alternative begins with any amount of whitespace. >> > - Following the whitespace is a comma-separated list of constraints and/or >> > attributes within brackets [], with sections separated by a semicolon. >> > - Following the closing ']' is any amount of whitespace, and then the actual >> > asm output. >> > - Spaces are allowed in the list (they will simply be removed). >> > - All alternatives should be specified: a blank list should be >> > "[,,]", "[,,;,]" etc., not "[]" or "" (however genattr may segfault if >> > you leave certain attributes empty, I have found). >> > - The actual constraint string in the match_operand or match_scratch, and >> > the attribute string in the set_attr, must be blank or an empty string >> > (you can't combine the old and new syntaxes). >> > - The common idion * return can be shortened by using <<. >> > - Any unexpanded iterators left during processing will result in an error at >> > compile time. If for some reason <> is needed in the output then these >> > must be escaped using \. >> > - Inside a @@ block '' is treated as "" when there are multiple characters >> > inside the single quotes. This version does not handle multi byte literals >> > like specifying characters as their numerical encoding, like \003 nor does >> > it handle unicode, especially multibyte encodings. This feature may be >> more >> > trouble than it's worth so have no finished it off, however this means one >> > can use 'foo' instead of \"foo\" to denote a multicharacter string. >> > - Inside an @@ block any unexpanded iterators will result in a compile time >> > fault instead of incorrect assembly being generated at runtime. If the >> > literal <> is needed in the output this needs to be escaped with \<\>. >> > - This check is not performed inside C blocks (lines starting with *). >> > - Instead of copying the previous instruction again in the next pattern, one >> > can use ^ to refer to the previous asm string. >> >> Thanks for doing this. The new syntax seems like a clear improvement for >> complex patterns like movs. >> >> Some comments/suggestions: >> >> - From a style perspective, out-of-order constraints should IMO be strongly >> discouraged. The asm string uses %0, %1, %2 etc. to refer to operands, >> and having that directly after a list that puts the constraints in >> a different order (such as [%2, %0, %1]) would IMO be very confusing. >> >> I agree there might be cases where dropping constraints makes sense. >> But I think in general we should encourage all constraints to be >> specified, and be specified in order. And that's likely to be the >> natural choice in an overwhelming majority of cases anyway. >> >> So how about having a simpler syntax for the first line when all >> constraints are specified in order? Maybe just "cons" (without the >> colon or numbers). >> >> - I'm not too keen on the '' thing. It sounded from internal >> discussion like backslashes and quoting were a problem generally. >> >> Would it work to quote the new form in {@ ... } instead? There should >> be no compatibility problem with that, since @ isn't a standard C++ >> lexing token. > > Fair enough, did you mean {@} or @'string' ? I meant quote that whole asm block in {@...} rather than "@@...". I.e.: {@ [cons: =0, 1; attrs: type, arch, length] [r, r ; mov_reg , * , 4] mov\t%w0, %w1 [k , r ; mov_reg , * , 4] ^ [r , k ; mov_reg , * , 4] ^ [r , M ; mov_imm , * , 4] mov\t%w0, %1 [r , n ; mov_imm , * , *] # [r , Usv; mov_imm , sve , 4] << aarch64_output_sve_cnt_immediate ("cnt", "%x0", operands[1]); [r , m ; load_4 , * , 4] ldr\t%w0, %1 [w , m ; load_4 , fp , 4] ldr\t%s0, %1 [m , rZ ; store_4 , * , 4] str\t%w1, %0 [m , w ; store_4 , fp , 4] str\t%s1, %0 [r , Usw; load_4 , * , 8] adrp\t%x0, %A1;ldr\t%w0, [%x0, %L1] [r , Usa; adr , * , 4] adr\t%x0, %c1 [r , Ush; adr , * , 4] adrp\t%x0, %A1 [w , rZ ; f_mcr , fp , 4] fmov\t%s0, %w1 [r , w ; f_mrc , fp , 4] fmov\t%w0, %s1 [w , w ; fmov , fp , 4] fmov\t%s0, %s1 [w , Ds ; neon_move, simd, 4] << arch64_output_scalar_simd_mov_immediate (operands[1], SImode); } That will also help if we do want to support C++ code blocks in future. Thanks, Richard