RE: PATCH: Add XOP 128-bit and 256-bit support for upcoming AMD Orochi processor.

From: "rajagopal, dwarak" <dwarak.rajagopal@amd.com>
To: "'Jan Hubicka'" <hubicka@ucw.cz>,
    "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Cc: "Harle, Christophe" <christophe.harle@amd.com>,
    "Jagasia, Harsha" <harsha.jagasia@amd.com>,
    "'Jan Hubicka'" <hubicka@ucw.cz>
Subject: RE: PATCH: Add XOP 128-bit and 256-bit support for upcoming AMD Orochi processor.
Date: Mon, 12 Oct 2009 16:54:00 -0000
Message-ID: <1C8DE0332CB01445BF7ADEDE3DDD57071F74C4@sausexmbp02.amd.com>

Hi Honza,
I will be taking over this patch from Harsha and wrapping it up so that I can check it in.

It would be great if you could answer Harsha's comments below so that I have more clarity.

Thanks,
Dwarak

Hi Honza,

I have gone through your feedback on the XOP patch. It makes sense, and I will fix the patch as per your suggestions.

However, I am not entirely sure I understand the two comments below. Since Mike Meissner added these patterns, I mostly inherited them from him.

Can you please elaborate or provide some more guidance on what I need to do?
(Please see my comments below)

> > +;; We don't have a straight 32-bit parallel multiply on XOP, so fake it with a
> > +;; multiply/add.  In general, we expect the define_split to occur before
> > +;; register allocation, so we have to handle the corner case where the target
> > +;; is the same as one of the inputs.
> > +(define_insn_and_split "*xop_mulv4si3"
> > +  [(set (match_operand:V4SI 0 "register_operand" "=&x")
> > +	(mult:V4SI (match_operand:V4SI 1 "register_operand" "%x")
> > +		   (match_operand:V4SI 2 "nonimmediate_operand" "xm")))]
> > +  "TARGET_XOP"
> > +  "#"
> > +  "&& (reload_completed
> > +       || (!reg_mentioned_p (operands[0], operands[1])
> > +	   && !reg_mentioned_p (operands[0], operands[2])))"
> 
> What happens when regs are mentioned?
> There are other cases 2 memory operand multiply-add splitting testing
> these, are we somehow making sure this conditional will always hold and
> we won't ICE not being able to satisfy the conditions?

Actually, I am not even sure this xop_mulv4si3 pattern is needed: XOP now implies SSE 4.2 and AVX, so we can just generate the AVX or SSE 4.1 mulv4si3 patterns when -mxop is used. Can I simply remove the xop_mulv4si3 pattern then?

As for your reference to "other cases 2 memory operand multiply-add splitting", I assume you are referring to the vpmac/d* define_splits.

Unlike SSE5, the XOP vpmac/d* instructions no longer require the destination register to be the same as the third source operand, and only the second source can be a memory operand. I also don't see anything in the manual requiring the destination register to differ from the source 1, source 2, or source 3 operands individually.

Should I then remove the condition below from the pmac/d* patterns?

    (!reg_mentioned_p (operands[0], operands[1])
     && !reg_mentioned_p (operands[0], operands[2]))
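
To make the concern behind the reg_mentioned_p checks concrete, here is a small C model of what can go wrong when a multiply is split into two steps and the destination overlaps an input. It is only a sketch under stated assumptions: the split body is not quoted in this thread, so the two-step sequence (clear the destination, then multiply-accumulate into it) is a guess based on the "fake it with a multiply/add" comment, and v4si, model_macc and model_split_mul are hypothetical names, not GCC or XOP identifiers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a V4SI register: four 32-bit lanes.  Unsigned lanes are
   used so that wrap-around keeps the low 32 bits without undefined
   behaviour. */
typedef struct { uint32_t lane[4]; } v4si;

/* Model of a four-operand multiply-accumulate: dst = a * b + acc per lane.
   All sources are read before dst is written, as a single instruction
   would do, so dst may alias any source here. */
static void model_macc(v4si *dst, const v4si *a, const v4si *b,
                       const v4si *acc)
{
    assert(dst && a && b && acc);
    v4si tmp;
    for (int i = 0; i < 4; i++)
        tmp.lane[i] = a->lane[i] * b->lane[i] + acc->lane[i];
    *dst = tmp;
}

/* ASSUMED split sequence for a plain multiply:
     step 1: dst = 0
     step 2: dst = a * b + dst
   If dst is the same register as a or b, step 1 destroys that input
   before step 2 reads it. */
static void model_split_mul(v4si *dst, const v4si *a, const v4si *b)
{
    assert(dst && a && b);
    memset(dst, 0, sizeof *dst);
    model_macc(dst, a, b, dst);
}
```

With distinct registers the model returns the lane-wise product. Calling model_split_mul(&a, &a, &b) yields all zeros, which is the corner case that the early-clobber "=&x" constraint and the pre-reload split condition appear to be guarding against. The single-step model_macc has no such hazard, which matches the observation that the instruction itself does not need the destination to differ from its sources.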

> 
> > +;; The following are for the various unpack insns which doesn't need the first
> > +;; source operand, so we can just use the output operand for the first operand.
> > +;; This allows either of the other two operands to be a memory operand.  We
> > +;; can't just use the first operand as an argument to the normal pperm because
> > +;; then an output only argument, suddenly becomes an input operand.
> > +(define_insn "xop_pperm_zero_v16qi_v8hi"
> > +  [(set (match_operand:V8HI 0 "register_operand" "=x,x")
> > +	(zero_extend:V8HI
> > +	 (vec_select:V8QI
> > +	  (match_operand:V16QI 1 "nonimmediate_operand" "xm,x")
> > +	  (match_operand 2 "" ""))))	;; parallel with const_int's
> > +   (use (match_operand:V16QI 3 "nonimmediate_operand" "x,xm"))]
> 
> Hmm, there is no unspec or something that would make it clear that we can
> not ever somehow simplify into this form with operand 2 being something
> different than parallel with const_ints.  I think this needs new
> predicate.

I can define a new predicate for it in predicates.md, but I am not sure how exactly to represent the "parallel with const ints" part.

Any suggestions?
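
One possible shape for the "parallel with const_ints" check, sketched in plain C so that it can be compiled on its own. The types below are hypothetical stand-ins and not GCC's real rtx: mock_code, mock_rtx and mock_parallel_of_const_ints are names made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for RTL codes and expressions; NOT GCC's rtx. */
enum mock_code { MOCK_CONST_INT, MOCK_REG, MOCK_PARALLEL };

struct mock_rtx {
    enum mock_code code;
    long value;                     /* meaningful for MOCK_CONST_INT */
    size_t n_elems;                 /* meaningful for MOCK_PARALLEL */
    const struct mock_rtx *elems;   /* meaningful for MOCK_PARALLEL */
};

/* Return true only if OP is a parallel whose elements are all constant
   integers.  Anything else (a register, a parallel containing a
   non-constant element) is rejected. */
static bool mock_parallel_of_const_ints(const struct mock_rtx *op)
{
    assert(op != NULL);
    if (op->code != MOCK_PARALLEL)
        return false;
    for (size_t i = 0; i < op->n_elems; i++)
        if (op->elems[i].code != MOCK_CONST_INT)
            return false;
    return true;
}
```

In predicates.md the same test would presumably be a define_predicate restricted with (match_code "parallel") whose C body walks the vector with XVECLEN and XVECEXP and checks each element with CONST_INT_P. The predicate name, and whether the selector values should also be range-checked, are left open here.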

Thanks,
Harsha
