From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 5674 invoked by alias); 29 May 2003 17:13:27 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 5523 invoked from network); 29 May 2003 17:13:24 -0000
Received: from unknown (HELO cam-admin0.cambridge.arm.com) (193.131.176.54)
  by sources.redhat.com with SMTP; 29 May 2003 17:13:24 -0000
Received: from pc960.cambridge.arm.com (pc960.cambridge.arm.com [10.1.205.4])
  by cam-admin0.cambridge.arm.com (8.9.3/8.9.3) with ESMTP id SAA07633;
  Thu, 29 May 2003 18:13:20 +0100 (BST)
Received: from pc960.cambridge.arm.com (rearnsha@localhost)
  by pc960.cambridge.arm.com (8.11.6/8.9.3) with ESMTP id h4THDK903420;
  Thu, 29 May 2003 18:13:20 +0100
Message-Id: <200305291713.h4THDK903420@pc960.cambridge.arm.com>
X-Authentication-Warning: pc960.cambridge.arm.com: rearnsha owned process doing -bs
To: fnf@intrinsity.com
cc: Richard.Earnshaw@arm.com, echristo@redhat.com (Eric Christopher), gcc@gcc.gnu.org
Reply-To: Richard.Earnshaw@arm.com
Organization: ARM Ltd.
X-Telephone: +44 1223 400569 (direct+voicemail), +44 1223 400400 (switchbd)
X-Fax: +44 1223 400410
X-Address: ARM Ltd., 110 Fulbourn Road, Cherry Hinton, Cambridge CB1 9NJ.
Subject: Re: md description for intruction that modifies multiple operands
In-reply-to: Your message of "Thu, 29 May 2003 11:27:49 CDT." <20030529162749.9EA592F2980@beeville.vert.intrinsity.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Thu, 29 May 2003 17:37:00 -0000
From: Richard Earnshaw
X-SW-Source: 2003-05/txt/msg02369.txt.bz2

> >> I don't think so. Though it looks like you might want to define a single
> >> unspec number for the pattern and maybe use a parallel? *guesses*
>
> I thought each unspec had to have a unique number.

That was probably a cut-and-paste error on my part. But an unspec is just a black-box operation to the compiler.
The number is only there as a discriminator to resolve potential
ambiguities. So two insn patterns that match insns of the form

    insn_a: (set x (unspec [(y) (z)] 0))
    insn_b: (set a (unspec [(b)] 0))

are not illegal, though it is bad style.

>
> Where are you suggesting placing a "parallel"?

A define_insn is an implicit parallel when there are multiple statements.
Only define_expand needs an explicit parallel.

>
> > You are probably better off if you only use match_dup to match inputs to
> > inputs and outputs to outputs. Use tied register allocation for inputs to
> > outputs. Ties are best done using adjacent number pairs. Hence something
> > like:
>
> Thanks much for the example. I didn't see anything in the docs about
> "tied register allocation". What specifically does this mean? Is it
> a way to get registers allocated in sequence?

It's a way to ensure that an input operand is allocated to the same register
as an output operand. Look for "0 in constraint" in the documentation (the
machine description section).

>
> Perhaps I should give a more realistic code example and *.md entry.
>
> The hardware handles vectors of 512 bits each, which can be organized as
> a 4x4 matrix of 16 32-bit ints. We typedef a "matrix_t" to be a V16SI type.
> Here is an actual code example:
>
>   typedef int matrix_t __attribute__((__mode__(V16SI)));
>
>   matrix_t foo (matrix_t t0, matrix_t t1, matrix_t t2, matrix_t t3)
>   {
>     __BLOCK4_M (t0, t1, t2, t3);
>     return (t0);
>   }
>
> This example takes four matrix_t (V16SI) types as function arguments,
> passed in hardware registers $m0, $m1, $m2, and $m3, for t0, t1, t2,
> and t3 respectively. The __BLOCK4_M builtin takes four matrix_t
> operands, does some matrix arithmetic on them, and returns the results
> left in the four operands. One restriction is that the block4
> operands have to be allocated to sequential hardware registers.
>
> Here is the actual md file entry I put in based on your example:
>
>   (define_insn "fm_block4"
>     [(set (match_operand:V16SI 0 "register_operand" "=v")
>           (unspec:V16SI [(match_operand:V16SI 3 "register_operand" "2")
>                          (match_operand:V16SI 5 "register_operand" "4")
>                          (match_operand:V16SI 7 "register_operand" "6")] 460))
>      (set (match_operand:V16SI 2 "register_operand" "=v")
>           (unspec:V16SI [(match_operand:V16SI 1 "register_operand" "0")
>                          (match_dup 5) (match_dup 7)] 461))
>      (set (match_operand:V16SI 4 "register_operand" "=v")
>           (unspec:V16SI [(match_dup 1) (match_dup 3) (match_dup 7)] 462))
>      (set (match_operand:V16SI 6 "register_operand" "=v")
>           (unspec:V16SI [(match_dup 1) (match_dup 3) (match_dup 7)] 463))]
>     "TARGET_FM"
>     "block4.m\\t%0,%2,%4,%6"
>     [(set_attr "type" "fm")])
>
> For the above example, running "cc1 -da -O2 x.c" generates the
> following rtl file and then the compiler gets a segfault due to the
> set of a "(nil)". BTW, first matrix hardware register is 176, first
> pseudo reg is 200.
> Note also I deleted some extraneous instructions
> like NOTES:
>
> (insn 3 2 4 (nil) (set (reg/v:V16SI 206 [ t0 ])
>         (reg:V16SI 176 $m0 [ t0 ])) -1 (nil)
>     (nil))
>
> (insn 4 3 5 (nil) (set (reg/v:V16SI 207 [ t1 ])
>         (reg:V16SI 177 $m1 [ t1 ])) -1 (nil)
>     (nil))
>
> (insn 5 4 6 (nil) (set (reg/v:V16SI 208 [ t2 ])
>         (reg:V16SI 178 $m2 [ t2 ])) -1 (nil)
>     (nil))
>
> (insn 6 5 7 (nil) (set (reg/v:V16SI 209 [ t3 ])
>         (reg:V16SI 179 $m3 [ t3 ])) -1 (nil)
>     (nil))
>
> (insn 12 10 14 (nil) (parallel [
>             (set (reg/v:V16SI 206 [ t0 ])
>                 (unspec:V16SI [
>                         (reg/v:V16SI 209 [ t3 ])
>                         (reg/v:V16SI 209 [ t3 ])
>                         (reg/v:V16SI 207 [ t1 ])
>                     ] 460))
>             (set (reg/v:V16SI 208 [ t2 ])
>                 (unspec:V16SI [
>                         (reg/v:V16SI 207 [ t1 ])
>                         (reg/v:V16SI 209 [ t3 ])
>                         (reg/v:V16SI 207 [ t1 ])
>                     ] 461))
>             (set (nil)
>                 (unspec:V16SI [
>                         (reg/v:V16SI 207 [ t1 ])
>                         (reg/v:V16SI 209 [ t3 ])
>                         (reg/v:V16SI 207 [ t1 ])
>                     ] 462))
>             (set (reg/v:V16SI 208 [ t2 ])
>                 (unspec:V16SI [
>                         (reg/v:V16SI 207 [ t1 ])
>                         (reg/v:V16SI 209 [ t3 ])
>                         (reg/v:V16SI 207 [ t1 ])
>                     ] 463))
>         ]) -1 (nil)
>     (nil))
>
> (insn 16 15 17 (nil) (set (reg:V16SI 205 [ ])
>         (reg/v:V16SI 206 [ t0 ])) -1 (nil)
>     (nil))
>
> (jump_insn 17 16 18 (nil) (set (pc)
>         (label_ref 22)) -1 (nil)
>     (nil))
>

This looks like an expansion problem. How are you calling gen_fm_block4()?
You need to pass 8 arguments to it now, something like

    gen_fm_block4(t0, t0, t1, t1, t2, t2, t3, t3);

> I do much appreciate all the help. I've been a gdb hacker for the
> last 14 years and a gcc hacker for all of about 2 months. :-)

I'm nearer the reverse. Expect me to call in the favour sometime :-)

R.
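
For readers following the operand numbering, here is a minimal sketch of the
tied-operand layout discussed above, cut down to two registers. The pattern
name "fm_block2", the "block2.m" mnemonic and the unspec numbers 470 and 471
are hypothetical, invented purely for illustration; only the V16SI mode, the
"v" constraint and the TARGET_FM condition are taken from the quoted
fm_block4 entry. Outputs get the even operand numbers, each input is tied to
the output just before it with a digit constraint, and match_dup is used only
to repeat an input among the inputs.

```
;; Hypothetical two-register variant of the quoted fm_block4 pattern.
;; "fm_block2", "block2.m" and unspec numbers 470/471 are assumptions.
(define_insn "fm_block2"
  ;; Output 0; input 1 is tied to it by the "0" constraint.
  [(set (match_operand:V16SI 0 "register_operand" "=v")
        (unspec:V16SI [(match_operand:V16SI 1 "register_operand" "0")
                       (match_operand:V16SI 3 "register_operand" "2")] 470))
   ;; Output 2; input 3 (declared above) is tied to it by the "2" constraint.
   ;; The inputs are repeated here with match_dup, input to input only.
   (set (match_operand:V16SI 2 "register_operand" "=v")
        (unspec:V16SI [(match_dup 1) (match_dup 3)] 471))]
  "TARGET_FM"
  "block2.m\\t%0,%2"
  [(set_attr "type" "fm")])
```

With this numbering the generated function would take one argument per
match_operand, in operand order, so an expander would call it as
gen_fm_block2 (t0, t0, t1, t1), mirroring the eight-argument call suggested
for gen_fm_block4. Note that a tie only forces an input and its output into
the same register; it does not by itself make the registers sequential.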