inline assembly vs. intrinsic functions

public inbox for gcc@gcc.gnu.org

* inline assembly vs. intrinsic functions
@ 2010-10-26  5:54 roy rosen
  2010-10-26 13:31 ` Ian Lance Taylor
  0 siblings, 1 reply; 15+ messages in thread
From: roy rosen @ 2010-10-26  5:54 UTC (permalink / raw)
  To: gcc

Hi,

I am trying to demonstrate my port's capabilities.
I am writing an application which needs to use instructions like max
a,b,c,d,e,f where a,b,c are inputs and d,e,f are outputs.
Is it possible to write an intrinsic function for that?
I think not, because I would need to pass d,e,f by reference,
which means that they would be in memory and not in registers as
the instruction requires.
Is there any port with such an example?
So I thought of implementing it with inline assembly, but there I
encounter a different problem: the compiler does not understand the
instruction given in inline assembly and therefore does not
parallelize it with other insns.
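
For reference, GCC's extended asm can tie several outputs to registers, which is what makes inline assembly attractive for a six-operand instruction. The following is a minimal sketch, assuming a GCC-compatible compiler. `three_in_three_out` is a hypothetical wrapper name, and the asm template is deliberately empty so that the sketch builds on any target; a real port would put its instruction text there. The matching constraints are used only so that the empty template yields defined values; a real instruction would more likely take plain "r" inputs.

```c
/* Hypothetical wrapper around a 3-input, 3-output instruction.
   The asm template is empty so this builds on any GCC target. */
static inline void three_in_three_out(int a, int b, int c,
                                      int *d, int *e, int *f)
{
    int r0, r1, r2;

    /* "=r" requests a register for each output.  The matching
       constraints "0", "1", "2" place each input in the same register
       as the corresponding output, so with an empty template the
       outputs simply equal the inputs. */
    __asm__ ("" : "=r" (r0), "=r" (r1), "=r" (r2)
                : "0" (a), "1" (b), "2" (c));

    *d = r0;
    *e = r1;
    *f = r2;
}
```

Once the wrapper is inlined the pointer arguments disappear and the three results stay in registers. The compiler still treats the asm body as opaque text, which is the scheduling limitation described above.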

Is there any other solution for that which I don't see?

Thanks, Roy.


* Re: inline assembly vs. intrinsic functions
  2010-10-26  5:54 inline assembly vs. intrinsic functions roy rosen
@ 2010-10-26 13:31 ` Ian Lance Taylor
  2010-10-26 13:36   ` roy rosen
  2011-03-17 11:34   ` roy rosen
  0 siblings, 2 replies; 15+ messages in thread
From: Ian Lance Taylor @ 2010-10-26 13:31 UTC (permalink / raw)
  To: roy rosen; +Cc: gcc

roy rosen <roy.1rosen@gmail.com> writes:

> I am trying to demonstrate my port capabilities.
> I am writing an application which needs to use instructions like max
> a,b,c,d,e,f where a,b,c are inputs and d,e,f are outputs.
> Is that possible to write an intrinsic function for that?
> I think not because that means that I need to pass d,e,f by reference
> which means that they would be in memory and not in a register as
> meant by the instruction.

That is correct.  An intrinsic function is a normal function.  If you
want it to have multiple outputs, you need to pass in addresses, or you
need to have it return a struct.

I'm a bit curious as to why a function named max would have multiple
outputs.

> Is there any port with such an example?

Not to my knowledge.  I wrote a private port in which some intrinsics
returned a struct, and to keep everything out of memory I added
additional intrinsics to retrieve elements of the struct.  It's awkward
to use but the resulting code is fine.

> So, I thought of implementing that with inline assembly but here I
> encounter a different problem: The compiler does not understand the
> instruction given in inline assembly and therefore it does not
> parallelize it with other insns.

Yes.

> Is there any other solution for that which I don't see?

I can't think of anything.

Ian


* Re: inline assembly vs. intrinsic functions
  2010-10-26 13:31 ` Ian Lance Taylor
@ 2010-10-26 13:36   ` roy rosen
  2010-10-26 15:01     ` roy rosen
  2011-03-17 11:34   ` roy rosen
  1 sibling, 1 reply; 15+ messages in thread
From: roy rosen @ 2010-10-26 13:36 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc

I didn't give the full details of the instruction, but for example a
max instruction which takes an array and returns both the max value and
its index in the array needs to return more than one value.
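
At the C level, the two shapes available for such a multi-output function (outputs passed by address, or a struct return) can be compared side by side. This is a plain C sketch, not code from any port: `max_result`, `max_by_address` and `max_by_struct` are hypothetical names, and both functions assume the array has at least one element.

```c
#include <stddef.h>

/* Hypothetical result type: the maximum value and its position. */
struct max_result {
    int value;
    size_t index;
};

/* Shape 1: the extra outputs are passed by address.
   Requires n >= 1. */
static void max_by_address(const int *v, size_t n, int *value, size_t *index)
{
    *value = v[0];
    *index = 0;
    for (size_t i = 1; i < n; i++) {
        if (v[i] > *value) {
            *value = v[i];
            *index = i;
        }
    }
}

/* Shape 2: all outputs are returned together in a struct.
   Requires n >= 1. */
static struct max_result max_by_struct(const int *v, size_t n)
{
    struct max_result r = { v[0], 0 };
    for (size_t i = 1; i < n; i++) {
        if (v[i] > r.value) {
            r.value = v[i];
            r.index = i;
        }
    }
    return r;
}
```

The struct form avoids taking the address of the outputs, which is the property that matters when the results are meant to stay in registers.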


* Re: inline assembly vs. intrinsic functions
  2010-10-26 13:36   ` roy rosen
@ 2010-10-26 15:01     ` roy rosen
  2010-10-26 16:12       ` Ian Lance Taylor
  0 siblings, 1 reply; 15+ messages in thread
From: roy rosen @ 2010-10-26 15:01 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc

If I want the compiler to understand the inline assembly, is it
possible to write a define_insn which would match the pattern that GCC
creates for the inline assembly, so that GCC would 'know' some
attributes of this insn and would be able to parallelize it?


* Re: inline assembly vs. intrinsic functions
  2010-10-26 15:01     ` roy rosen
@ 2010-10-26 16:12       ` Ian Lance Taylor
  2010-11-15 16:05         ` roy rosen
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Lance Taylor @ 2010-10-26 16:12 UTC (permalink / raw)
  To: roy rosen; +Cc: gcc

roy rosen <roy.1rosen@gmail.com> writes:

> If I want the compiler to understand the inline assembly is it
> possible to write define_insn which would match the pattern that GCC
> creates for the inline assembly and then GCC would be able to 'know'
> some attributes about this insn and would be able to parallelize it?

No, sorry.  Inline asms are not looked up in the MD file.

Ian


* Re: inline assembly vs. intrinsic functions
  2010-10-26 16:12       ` Ian Lance Taylor
@ 2010-11-15 16:05         ` roy rosen
  2010-11-15 16:06           ` Joern Rennecke
  2010-11-15 19:11           ` Ian Lance Taylor
  0 siblings, 2 replies; 15+ messages in thread
From: roy rosen @ 2010-11-15 16:05 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc

Is there any other way to give attributes to inline assembly insns?


* Re: inline assembly vs. intrinsic functions
  2010-11-15 16:05         ` roy rosen
@ 2010-11-15 16:06           ` Joern Rennecke
  2010-11-15 16:55             ` roy rosen
  2010-11-15 19:11           ` Ian Lance Taylor
  1 sibling, 1 reply; 15+ messages in thread
From: Joern Rennecke @ 2010-11-15 16:06 UTC (permalink / raw)
  To: roy rosen; +Cc: Ian Lance Taylor, gcc

Quoting roy rosen <roy.1rosen@gmail.com>:

> Is there any other way to give attributes to inline assembly insns?

See define_asm_attributes.


* Re: inline assembly vs. intrinsic functions
  2010-11-15 16:06           ` Joern Rennecke
@ 2010-11-15 16:55             ` roy rosen
  0 siblings, 0 replies; 15+ messages in thread
From: roy rosen @ 2010-11-15 16:55 UTC (permalink / raw)
  To: Joern Rennecke; +Cc: Ian Lance Taylor, gcc

But this only lets you set default attributes.
I want to set real attributes so that the compiler can tell
which insns can be parallelized with one another.
Is there a different way?
Are you saying that an inline assembly statement stays as is and
is not touched by the compiler, no matter what I do?


* Re: inline assembly vs. intrinsic functions
  2010-11-15 16:05         ` roy rosen
  2010-11-15 16:06           ` Joern Rennecke
@ 2010-11-15 19:11           ` Ian Lance Taylor
  1 sibling, 0 replies; 15+ messages in thread
From: Ian Lance Taylor @ 2010-11-15 19:11 UTC (permalink / raw)
  To: roy rosen; +Cc: gcc

roy rosen <roy.1rosen@gmail.com> writes:

> Is there any other way to give attributes to inline assembly insns?

Not that I know of.  It would be a useful feature in some cases, though
difficult to document.

For specific cases a backend can normally do better by providing builtin
functions.

Ian


* Re: inline assembly vs. intrinsic functions
  2010-10-26 13:31 ` Ian Lance Taylor
  2010-10-26 13:36   ` roy rosen
@ 2011-03-17 11:34   ` roy rosen
  2011-03-21 23:26     ` Ian Lance Taylor
  1 sibling, 1 reply; 15+ messages in thread
From: roy rosen @ 2011-03-17 11:34 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc

2010/10/26 Ian Lance Taylor <iant@google.com>:
> roy rosen <roy.1rosen@gmail.com> writes:
>
>> I am trying to demonstrate my port capabilities.
>> I am writing an application which needs to use instructions like max
>> a,b,c,d,e,f where a,b,c are inputs and d,e,f are outputs.
>> Is that possible to write an intrinsic function for that?
>> I think not because that means that I need to pass d,e,f by reference
>> which means that they would be in memory and not in a register as
>> meant by the instruction.
>
> That is correct.  An intrinsic function is a normal function.  If you
> want it to have multiple outputs, you need to pass in addresses, or you
> need to have it return a struct.
>
> I'm a bit curious as to why a function named max would have multiple
> outputs.
>
>> Is there any port with such an example?
>
> Not to my knowledge.  I wrote a private port in which some intrinsics
> returned a struct, and to keep everything out of memory I added
> additional intrinsics to retrieve elements of the struct.  It's awkward
> to use but the resulting code is fine.
>
Can you please explain how this solution should work?
Would code with memory accesses be generated first, and would later
optimizations then rewrite it to use registers directly?

Roy.



* Re: inline assembly vs. intrinsic functions
  2011-03-17 11:34   ` roy rosen
@ 2011-03-21 23:26     ` Ian Lance Taylor
  2011-03-24 13:50       ` roy rosen
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Lance Taylor @ 2011-03-21 23:26 UTC (permalink / raw)
  To: roy rosen; +Cc: gcc

roy rosen <roy.1rosen@gmail.com> writes:

> 2010/10/26 Ian Lance Taylor <iant@google.com>:
>> roy rosen <roy.1rosen@gmail.com> writes:
>>
>>> I am trying to demonstrate my port capabilities.
>>> I am writing an application which needs to use instructions like max
>>> a,b,c,d,e,f where a,b,c are inputs and d,e,f are outputs.
>>> Is that possible to write an intrinsic function for that?
>>> I think not because that means that I need to pass d,e,f by reference
>>> which means that they would be in memory and not in a register as
>>> meant by the instruction.
>>
>> That is correct.  An intrinsic function is a normal function.  If you
>> want it to have multiple outputs, you need to pass in addresses, or you
>> need to have it return a struct.
>>
>> I'm a bit curious as to why a function named max would have multiple
>> outputs.
>>
>>> Is there any port with such an example?
>>
>> Not to my knowledge.  I wrote a private port in which some intrinsics
>> returned a struct, and to keep everything out of memory I added
>> additional intrinsics to retrieve elements of the struct.  It's awkward
>> to use but the resulting code is fine.
>>
> Can you please explain how this solution should work?
> First a code with memory accesses would be generated and then
> optimizations would optimize it to use registers directly?

You build a RECORD_TYPE holding the fields you want to return.  You
define the appropriate builtin functions to return that record type.
You define another builtin function for each field, which takes the
RECORD_TYPE as its argument and returns the type of the field.  In
TARGET_FOLD_BUILTIN you convert the per-field functions into
COMPONENT_REFs.
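
From the user's side, the scheme described above might look like the following. This is a sketch in plain C that only emulates the shape of the interface: `port_maxidx_t`, `port_maxidx`, `port_maxidx_value` and `port_maxidx_index` are hypothetical names standing in for the record type and the builtins, and an ordinary compiler sees ordinary functions here. In a real port the accessors would be folded to COMPONENT_REFs in TARGET_FOLD_BUILTIN.

```c
/* Stand-in for the RECORD_TYPE the backend would build. */
struct port_maxidx_t {
    int value;   /* larger of the two inputs */
    int index;   /* 0 if it came from a, 1 if it came from b */
};

/* Stand-in for the builtin that maps to the multi-output instruction. */
static inline struct port_maxidx_t port_maxidx(int a, int b)
{
    struct port_maxidx_t r;
    r.value = (b > a) ? b : a;
    r.index = (b > a) ? 1 : 0;
    return r;
}

/* Stand-ins for the per-field builtins: each takes the record as its
   argument and returns the type of one field. */
static inline int port_maxidx_value(struct port_maxidx_t r)
{
    return r.value;
}

static inline int port_maxidx_index(struct port_maxidx_t r)
{
    return r.index;
}
```

A caller would write `r = port_maxidx(a, b);` and then read `port_maxidx_value(r)` and `port_maxidx_index(r)`, never taking the address of `r`.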

Ian


* Re: inline assembly vs. intrinsic functions
  2011-03-21 23:26     ` Ian Lance Taylor
@ 2011-03-24 13:50       ` roy rosen
  2011-03-24 17:38         ` Ian Lance Taylor
  0 siblings, 1 reply; 15+ messages in thread
From: roy rosen @ 2011-03-24 13:50 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc

2011/3/22 Ian Lance Taylor <iant@google.com>:
> roy rosen <roy.1rosen@gmail.com> writes:
>
>> 2010/10/26 Ian Lance Taylor <iant@google.com>:
>>> roy rosen <roy.1rosen@gmail.com> writes:
>>>
>>>> I am trying to demonstrate my port capabilities.
>>>> I am writing an application which needs to use instructions like max
>>>> a,b,c,d,e,f where a,b,c are inputs and d,e,f are outputs.
>>>> Is that possible to write an intrinsic function for that?
>>>> I think not because that means that I need to pass d,e,f by reference
>>>> which means that they would be in memory and not in a register as
>>>> meant by the instruction.
>>>
>>> That is correct.  An intrinsic function is a normal function.  If you
>>> want it to have multiple outputs, you need to pass in addresses, or you
>>> need to have it return a struct.
>>>
>>> I'm a bit curious as to why a function named max would have multiple
>>> outputs.
>>>
>>>> Is there any port with such an example?
>>>
>>> Not to my knowledge.  I wrote a private port in which some intrinsics
>>> returned a struct, and to keep everything out of memory I added
>>> additional intrinsics to retrieve elements of the struct.  It's awkward
>>> to use but the resulting code is fine.
>>>
>> Can you please explain how this solution should work?
>> First a code with memory accesses would be generated and then
>> optimizations would optimize it to use registers directly?
>
> You build a RECORD_TYPE holding the fields you want to return.  You
> define the appropriate builtin functions to return that record type.

How is that done? Using define_insn? How do I tell it to return a struct?
Is there an example I can look at?

Roy.

> You define another builtin function for each field, which takes the
> RECORD_TYPE as its argument and returns the type of the field.  In
> TARGET_FOLD_BUILTIN you convert the per-field functions into
> COMPONENT_REFs.
>
> Ian
>


* Re: inline assembly vs. intrinsic functions
  2011-03-24 13:50       ` roy rosen
@ 2011-03-24 17:38         ` Ian Lance Taylor
  2011-03-28 11:36           ` roy rosen
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Lance Taylor @ 2011-03-24 17:38 UTC (permalink / raw)
  To: roy rosen; +Cc: gcc

roy rosen <roy.1rosen@gmail.com> writes:

>> You build a RECORD_TYPE holding the fields you want to return.  You
>> define the appropriate builtin functions to return that record type.
>
> How is that done? using define_insn? How do I tell it to return a struct?
> Is there an example I can look at?

A RECORD_TYPE is what gcc generates when you define a struct in your
source code.  For an example of a backend building a struct, see, e.g.,
ix86_build_builtin_va_list_abi.

When you define your builtin functions in TARGET_INIT_BUILTINS you
specify the argument types and the return type, typically by building a
FUNCTION_TYPE and passing it to add_builtin_function.  To define a
builtin which returns a struct, just arrange for the return type of the
FUNCTION_TYPE that you pass to add_builtin_function to be the RECORD_TYPE
that you built.

Ian


* Re: inline assembly vs. intrinsic functions
  2011-03-24 17:38         ` Ian Lance Taylor
@ 2011-03-28 11:36           ` roy rosen
  2011-03-28 17:37             ` Ian Lance Taylor
  0 siblings, 1 reply; 15+ messages in thread
From: roy rosen @ 2011-03-28 11:36 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc

2011/3/24 Ian Lance Taylor <iant@google.com>:
> roy rosen <roy.1rosen@gmail.com> writes:
>
>>> You build a RECORD_TYPE holding the fields you want to return.  You
>>> define the appropriate builtin functions to return that record type.
>>
>> How is that done? using define_insn? How do I tell it to return a struct?
>> Is there an example I can look at?
>
> A RECORD_TYPE is what gcc generates when you define a struct in your
> source code.  For an example of a backend building a struct, see, e.g.,
> ix86_build_builtin_va_list_abi.
>
> When you define your builtin functions in TARGET_INIT_BUILTINS you
> specify the argument types and the return type, typically by building a
> FUNCTION_TYPE and passing it to add_builtin_function.  To define a
> builtin which returns a struct, just arrange for the return type of the
> FUNCTION_TYPE that you pass to add_builtin_function be the RECORD_TYPE
> that you built.

I understood this part.
What I don't understand is:
In addition to adding the builtin function, I have a define_insn in the
md file for each builtin, for example:

(define_insn "A_ssodssxx2w"
  [(set (match_operand:SI 0 "register_operand" "=d ")
        (unspec:SI [(match_operand:SI 1 "register_operand" "d ")
                    (match_operand:SI 2 "register_operand" "d ")]
                   UNSPEC_A_SSODSSXX2W))]
  ""
  "ssodssxx.2w %2,%1,%0 %!"
)

How do I create something equivalent whose RTL set expression targets
the structure?

Thanks, Roy.

>
> Ian
>


* Re: inline assembly vs. intrinsic functions
  2011-03-28 11:36           ` roy rosen
@ 2011-03-28 17:37             ` Ian Lance Taylor
  0 siblings, 0 replies; 15+ messages in thread
From: Ian Lance Taylor @ 2011-03-28 17:37 UTC (permalink / raw)
  To: roy rosen; +Cc: gcc

roy rosen <roy.1rosen@gmail.com> writes:

> 2011/3/24 Ian Lance Taylor <iant@google.com>:
>> roy rosen <roy.1rosen@gmail.com> writes:
>>
>>>> You build a RECORD_TYPE holding the fields you want to return.  You
>>>> define the appropriate builtin functions to return that record type.
>>>
>>> How is that done? using define_insn? How do I tell it to return a struct?
>>> Is there an example I can look at?
>>
>> A RECORD_TYPE is what gcc generates when you define a struct in your
>> source code.  For an example of a backend building a struct, see, e.g.,
>> ix86_build_builtin_va_list_abi.
>>
>> When you define your builtin functions in TARGET_INIT_BUILTINS you
>> specify the argument types and the return type, typically by building a
>> FUNCTION_TYPE and passing it to add_builtin_function.  To define a
>> builtin which returns a struct, just arrange for the return type of the
>> FUNCTION_TYPE that you pass to add_builtin_function be the RECORD_TYPE
>> that you built.
>
> I understood this part.
> What I don't understand is:
> In addition to adding the builtin function, in the md file I have a
> define_insn for each built in, for example:
>
> (define_insn "A_ssodssxx2w"
> 	[(set (match_operand:SI 0 "register_operand" "=d ")
> 	(unspec:SI [(match_operand:SI 1 "register_operand" "d ")
> 	(match_operand:SI 2 "register_operand" "d ")]
> 	UNSPEC_A_SSODSSXX2W))]
> 	""
> 	"ssodssxx.2w %2,%1,%0 %!"
> )
>
> How do I create something equivalent which would have an rtl set
> expression to the structure.

At the RTL level the structure doesn't matter.  Your instruction
presumably sets some registers.  A register is just a register, it
doesn't have a type.

In TARGET_EXPAND_BUILTIN you need to pick up the two registers and
assemble them into a PARALLEL.  You're going to build REGs to pass to
the insn pattern.  Then do something along the lines of:

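  /* Pair each result register with its byte offset in the struct
     (adjust the constants to the real field offsets).  */
  ret0 = gen_rtx_EXPR_LIST (VOIDmode, reg0, const0_rtx);
  ret1 = gen_rtx_EXPR_LIST (VOIDmode, reg1, const1_rtx);
  /* The PARALLEL holds the EXPR_LISTs, not the bare REGs.  */
  ret = gen_rtx_PARALLEL (??mode, gen_rtvec (2, ret0, ret1));
  tree nt = build_qualified_type (struct_type, TYPE_QUAL_CONST);
  target = assign_temp (nt, 0, 0, 1);
  /* Copy the registers into the struct temporary.  */
  emit_group_store (target, ret, struct_type, ??);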
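
As a mental model only (this is not GCC code), the PARALLEL above pairs each register with an offset into the struct, and the group store copies each register to its offset in the temporary. The plain C sketch below imitates that with hypothetical names (`pair_t`, `piece`, `group_store`). It assumes the second operand of each EXPR_LIST is a byte offset into the struct.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical two-field result, standing in for the RECORD_TYPE. */
struct pair_t {
    int first;
    int second;
};

/* One (register, offset) pair, standing in for an EXPR_LIST. */
struct piece {
    const void *reg;   /* where the value currently lives */
    size_t size;       /* size of the value in bytes */
    size_t offset;     /* byte offset inside the target struct */
};

/* Copy every piece to its offset in the target, in the way a group
   store fills the struct temporary from the result registers. */
static void group_store(void *target, const struct piece *pieces, size_t n)
{
    for (size_t i = 0; i < n; i++)
        memcpy((char *)target + pieces[i].offset,
               pieces[i].reg, pieces[i].size);
}
```

In this model the offsets come from `offsetof(struct pair_t, first)` and `offsetof(struct pair_t, second)`, which is why the constants in the EXPR_LISTs need to match the layout of the record type.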

Ian

