* [PATCH, alpha]: Add earlyclobber to sqrtt/sqrtf insns.
@ 2017-04-14 12:30 Uros Bizjak
2017-04-14 12:38 ` Florian Weimer
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Uros Bizjak @ 2017-04-14 12:30 UTC (permalink / raw)
To: libc-alpha; +Cc: Richard Henderson
Add earlyclobber to sqrtt/sqrtf insns.
When using software completions, we have to prevent assembler to match
input and output operands of sqrtt/sqrtf insn. Add earlyclobber to
output operand to avoid unwanted operand matching.
2017-04-14 Uros Bizjak <ubizjak@gmail.com>
* sysdeps/alpha/fpu/math_private.h (__ieee754_sqrt): Add
earlyclobber to output operand of sqrt insn.
(__ieee754_sqrtf): Ditto.
diff --git a/sysdeps/alpha/fpu/math_private.h b/sysdeps/alpha/fpu/math_private.h
index 9e06e25..1e97c86 100644
--- a/sysdeps/alpha/fpu/math_private.h
+++ b/sysdeps/alpha/fpu/math_private.h
@@ -27,9 +27,9 @@ __ieee754_sqrt (double d)
{
double ret;
# ifdef _IEEE_FP_INEXACT
- asm ("sqrtt/suid %1,%0" : "=f"(ret) : "f"(d));
+ asm ("sqrtt/suid %1,%0" : "=&f"(ret) : "f"(d));
# else
- asm ("sqrtt/sud %1,%0" : "=f"(ret) : "f"(d));
+ asm ("sqrtt/sud %1,%0" : "=&f"(ret) : "f"(d));
# endif
return ret;
}
@@ -39,9 +39,9 @@ __ieee754_sqrtf (float d)
{
float ret;
# ifdef _IEEE_FP_INEXACT
- asm ("sqrts/suid %1,%0" : "=f"(ret) : "f"(d));
+ asm ("sqrts/suid %1,%0" : "=&f"(ret) : "f"(d));
# else
- asm ("sqrts/sud %1,%0" : "=f"(ret) : "f"(d));
+ asm ("sqrts/sud %1,%0" : "=&f"(ret) : "f"(d));
# endif
return ret;
}
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, alpha]: Add earlyclobber to sqrtt/sqrtf insns.
2017-04-14 12:30 [PATCH, alpha]: Add earlyclobber to sqrtt/sqrtf insns Uros Bizjak
@ 2017-04-14 12:38 ` Florian Weimer
2017-04-14 12:47 ` Uros Bizjak
2017-04-14 13:55 ` Richard Henderson
2017-04-26 13:44 ` Richard Henderson
2 siblings, 1 reply; 8+ messages in thread
From: Florian Weimer @ 2017-04-14 12:38 UTC (permalink / raw)
To: Uros Bizjak; +Cc: libc-alpha, Richard Henderson
* Uros Bizjak:
> Add earlyclobber to sqrtt/sqrtf insns.
>
> When using software completions, we have to prevent assembler to match
> input and output operands of sqrtt/sqrtf insn. Add earlyclobber to
> output operand to avoid unwanted operand matching.
Is this so that the trap handler can recover the original input?
In this case, please add this to the commit message for those of us
who are not familiar with the Alpha architecture.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, alpha]: Add earlyclobber to sqrtt/sqrtf insns.
2017-04-14 12:38 ` Florian Weimer
@ 2017-04-14 12:47 ` Uros Bizjak
2017-04-14 12:54 ` Florian Weimer
0 siblings, 1 reply; 8+ messages in thread
From: Uros Bizjak @ 2017-04-14 12:47 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha, Richard Henderson
On Fri, Apr 14, 2017 at 2:38 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
> * Uros Bizjak:
>
>> Add earlyclobber to sqrtt/sqrtf insns.
>>
>> When using software completions, we have to prevent assembler to match
>> input and output operands of sqrtt/sqrtf insn. Add earlyclobber to
>> output operand to avoid unwanted operand matching.
>
> Is this so that the trap handler can recover the original input?
>
> In this case, please add this to the commit message for those of us
> who are not familiar with the Alpha architecture.
I don't know the low-level details, perhaps Richard can explain these
requirements.
The sqrtt simply returns zero for subnormal operand with software
completions when input and output reg are matched.
Uros.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, alpha]: Add earlyclobber to sqrtt/sqrtf insns.
2017-04-14 12:47 ` Uros Bizjak
@ 2017-04-14 12:54 ` Florian Weimer
2017-04-14 13:02 ` Uros Bizjak
0 siblings, 1 reply; 8+ messages in thread
From: Florian Weimer @ 2017-04-14 12:54 UTC (permalink / raw)
To: Uros Bizjak; +Cc: libc-alpha, Richard Henderson
* Uros Bizjak:
> On Fri, Apr 14, 2017 at 2:38 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
>> * Uros Bizjak:
>>
>>> Add earlyclobber to sqrtt/sqrtf insns.
>>>
>>> When using software completions, we have to prevent assembler to match
>>> input and output operands of sqrtt/sqrtf insn. Add earlyclobber to
>>> output operand to avoid unwanted operand matching.
>>
>> Is this so that the trap handler can recover the original input?
>>
>> In this case, please add this to the commit message for those of us
>> who are not familiar with the Alpha architecture.
>
> I don't know the low-level details, perhaps Richard can explain these
> requirements.
>
> The sqrtt simply returns zero for subnormal operand with software
> completions when input and output reg are matched.
Did you see that in an emulator, or on actual hardware?
The architecture manual suggests that we also need to take measures to
limit the trap shadow.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, alpha]: Add earlyclobber to sqrtt/sqrtf insns.
2017-04-14 12:54 ` Florian Weimer
@ 2017-04-14 13:02 ` Uros Bizjak
0 siblings, 0 replies; 8+ messages in thread
From: Uros Bizjak @ 2017-04-14 13:02 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha, Richard Henderson
On Fri, Apr 14, 2017 at 2:54 PM, Florian Weimer <fw@deneb.enyo.de> wrote:
>>>> Add earlyclobber to sqrtt/sqrtf insns.
>>>>
>>>> When using software completions, we have to prevent assembler to match
>>>> input and output operands of sqrtt/sqrtf insn. Add earlyclobber to
>>>> output operand to avoid unwanted operand matching.
>>>
>>> Is this so that the trap handler can recover the original input?
>>>
>>> In this case, please add this to the commit message for those of us
>>> who are not familiar with the Alpha architecture.
>>
>> I don't know the low-level details, perhaps Richard can explain these
>> requirements.
>>
>> The sqrtt simply returns zero for subnormal operand with software
>> completions when input and output reg are matched.
>
> Did you see that in an emulator, or on actual hardware?
I have seen this on actual hardware, AlphaServer ES40, EV68AL Tsunami.
Without the patch, there are many instances of problematic insns,
simple grep sqrt[ts] of libm dump shows
~ $ grep sqrt[ts] aaa
6e94: 6b bd eb 53 sqrtt/sud $f11,$f11
6ef0: 70 bd f0 53 sqrtt/sud $f16,$f16
cb8c: 60 bd e0 53 sqrtt/sud $f0,$f0
cc14: 60 bd e0 53 sqrtt/sud $f0,$f0
d28c: 70 bd e2 53 sqrtt/sud $f2,$f16
d350: 70 bd e2 53 sqrtt/sud $f2,$f16
d588: 62 bd e2 53 sqrtt/sud $f2,$f2
d60c: 62 bd e2 53 sqrtt/sud $f2,$f2
dad0: 70 bd e2 53 sqrtt/sud $f2,$f16
db64: 70 bd e2 53 sqrtt/sud $f2,$f16
ddbc: 62 bd e2 53 sqrtt/sud $f2,$f2
de24: 62 bd e2 53 sqrtt/sud $f2,$f2
e028: 62 bd e2 53 sqrtt/sud $f2,$f2
e628: 71 bd f1 53 sqrtt/sud $f17,$f17
12290: 6a bd ea 53 sqrtt/sud $f10,$f10
133ec: 6b bd eb 53 sqrtt/sud $f11,$f11
13460: 6b bd eb 53 sqrtt/sud $f11,$f11
1dff4: 60 bd e0 53 sqrtt/sud $f0,$f0
23de4: 62 bd e2 53 sqrtt/sud $f2,$f2
23f00: 62 bd e2 53 sqrtt/sud $f2,$f2
24000: 70 bd f0 53 sqrtt/sud $f16,$f16
24040: 64 bd ea 53 sqrtt/sud $f10,$f4
24084: 64 bd ea 53 sqrtt/sud $f10,$f4
24124: 6a bd ea 53 sqrtt/sud $f10,$f10
2414c: 70 bd e0 53 sqrtt/sud $f0,$f16
24168: 61 bd e1 53 sqrtt/sud $f1,$f1
25cac: 63 bd ea 53 sqrtt/sud $f10,$f3
25d94: 71 bd e5 53 sqrtt/sud $f5,$f17
25eb4: 65 bd e5 53 sqrtt/sud $f5,$f5
25f24: 63 bd e3 53 sqrtt/sud $f3,$f3
25f84: 6b bd eb 53 sqrtt/sud $f11,$f11
25f98: 6a bd ea 53 sqrtt/sud $f10,$f10
25fa8: 6b bd eb 53 sqrtt/sud $f11,$f11
2607c: 6c bd ec 53 sqrtt/sud $f12,$f12
26094: 6a bd ea 53 sqrtt/sud $f10,$f10
2613c: 6b bd eb 53 sqrtt/sud $f11,$f11
26150: 63 bd eb 53 sqrtt/sud $f11,$f3
2995c: 6b b9 ea 53 sqrts/sud $f10,$f11
29ad0: 6b b9 ea 53 sqrts/sud $f10,$f11
29c00: 70 b9 f0 53 sqrts/sud $f16,$f16
29c60: 6b b9 eb 53 sqrts/sud $f11,$f11
29d50: 6d b9 ea 53 sqrts/sud $f10,$f13
29e18: 6b b9 ea 53 sqrts/sud $f10,$f11
2aa00: 6a bd ea 53 sqrtt/sud $f10,$f10
2aebc: 62 b9 e2 53 sqrts/sud $f2,$f2
2af6c: 62 b9 e2 53 sqrts/sud $f2,$f2
2b1a8: 62 b9 e2 53 sqrts/sud $f2,$f2
2b230: 62 b9 e2 53 sqrts/sud $f2,$f2
2b630: 70 b9 e2 53 sqrts/sud $f2,$f16
2b6c8: 70 b9 e2 53 sqrts/sud $f2,$f16
2b940: 62 b9 e2 53 sqrts/sud $f2,$f2
2b9b8: 62 b9 e2 53 sqrts/sud $f2,$f2
2d29c: 60 b9 f0 53 sqrts/sud $f16,$f0
2e528: 6a b9 ea 53 sqrts/sud $f10,$f10
2fb68: 6b b9 eb 53 sqrts/sud $f11,$f11
2fbfc: 6b b9 eb 53 sqrts/sud $f11,$f11
33948: 60 b9 e0 53 sqrts/sud $f0,$f0
37c30: 63 b9 e3 53 sqrts/sud $f3,$f3
37d64: 63 b9 e3 53 sqrts/sud $f3,$f3
37e64: 70 b9 f0 53 sqrts/sud $f16,$f16
37eac: 62 b9 ea 53 sqrts/sud $f10,$f2
37ed8: 62 b9 ea 53 sqrts/sud $f10,$f2
37f88: 6a b9 ea 53 sqrts/sud $f10,$f10
37fa8: 70 b9 e0 53 sqrts/sud $f0,$f16
37fb8: 61 b9 e1 53 sqrts/sud $f1,$f1
3909c: 65 b9 ea 53 sqrts/sud $f10,$f5
39184: 65 b9 e7 53 sqrts/sud $f7,$f5
39294: 67 b9 e7 53 sqrts/sud $f7,$f7
39310: 65 b9 e5 53 sqrts/sud $f5,$f5
39370: 6b b9 eb 53 sqrts/sud $f11,$f11
39384: 65 b9 e5 53 sqrts/sud $f5,$f5
39390: 6a b9 ea 53 sqrts/sud $f10,$f10
39458: 6c b9 ec 53 sqrts/sud $f12,$f12
39470: 6a b9 ea 53 sqrts/sud $f10,$f10
39514: 6a b9 ea 53 sqrts/sud $f10,$f10
39528: 65 b9 ea 53 sqrts/sud $f10,$f5
7429c: 6a bd f0 53 sqrtt/sud $f16,$f10
742f8: 6a bd f0 53 sqrtt/sud $f16,$f10
Uros.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, alpha]: Add earlyclobber to sqrtt/sqrtf insns.
2017-04-14 12:30 [PATCH, alpha]: Add earlyclobber to sqrtt/sqrtf insns Uros Bizjak
2017-04-14 12:38 ` Florian Weimer
@ 2017-04-14 13:55 ` Richard Henderson
2017-04-14 18:01 ` Uros Bizjak
2017-04-26 13:44 ` Richard Henderson
2 siblings, 1 reply; 8+ messages in thread
From: Richard Henderson @ 2017-04-14 13:55 UTC (permalink / raw)
To: Uros Bizjak, libc-alpha
On 04/14/2017 05:30 AM, Uros Bizjak wrote:
> Add earlyclobber to sqrtt/sqrtf insns.
>
> When using software completions, we have to prevent assembler to match
> input and output operands of sqrtt/sqrtf insn. Add earlyclobber to
> output operand to avoid unwanted operand matching.
>
> 2017-04-14 Uros Bizjak <ubizjak@gmail.com>
>
> * sysdeps/alpha/fpu/math_private.h (__ieee754_sqrt): Add
> earlyclobber to output operand of sqrt insn.
> (__ieee754_sqrtf): Ditto.
>
> diff --git a/sysdeps/alpha/fpu/math_private.h b/sysdeps/alpha/fpu/math_private.h
> index 9e06e25..1e97c86 100644
> --- a/sysdeps/alpha/fpu/math_private.h
> +++ b/sysdeps/alpha/fpu/math_private.h
> @@ -27,9 +27,9 @@ __ieee754_sqrt (double d)
> {
> double ret;
> # ifdef _IEEE_FP_INEXACT
> - asm ("sqrtt/suid %1,%0" : "=f"(ret) : "f"(d));
> + asm ("sqrtt/suid %1,%0" : "=&f"(ret) : "f"(d));
> # else
> - asm ("sqrtt/sud %1,%0" : "=f"(ret) : "f"(d));
> + asm ("sqrtt/sud %1,%0" : "=&f"(ret) : "f"(d));
> # endif
Hmm. This is surprising because any host that has sqrtt also has exact traps,
and so trap shadows and recovery of the input shouldn't be an issue.
That said, do we actually still support a gcc that doesn't have these as
builtins? That would be more ideal than inline asm.
r~
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, alpha]: Add earlyclobber to sqrtt/sqrtf insns.
2017-04-14 13:55 ` Richard Henderson
@ 2017-04-14 18:01 ` Uros Bizjak
0 siblings, 0 replies; 8+ messages in thread
From: Uros Bizjak @ 2017-04-14 18:01 UTC (permalink / raw)
To: Richard Henderson; +Cc: libc-alpha
On Fri, Apr 14, 2017 at 3:55 PM, Richard Henderson <rth@twiddle.net> wrote:
> On 04/14/2017 05:30 AM, Uros Bizjak wrote:
>>
>> Add earlyclobber to sqrtt/sqrtf insns.
>>
>> When using software completions, we have to prevent assembler to match
>> input and output operands of sqrtt/sqrtf insn. Add earlyclobber to
>> output operand to avoid unwanted operand matching.
>>
>> 2017-04-14 Uros Bizjak <ubizjak@gmail.com>
>>
>> * sysdeps/alpha/fpu/math_private.h (__ieee754_sqrt): Add
>> earlyclobber to output operand of sqrt insn.
>> (__ieee754_sqrtf): Ditto.
>>
>> diff --git a/sysdeps/alpha/fpu/math_private.h
>> b/sysdeps/alpha/fpu/math_private.h
>> index 9e06e25..1e97c86 100644
>> --- a/sysdeps/alpha/fpu/math_private.h
>> +++ b/sysdeps/alpha/fpu/math_private.h
>> @@ -27,9 +27,9 @@ __ieee754_sqrt (double d)
>> {
>> double ret;
>> # ifdef _IEEE_FP_INEXACT
>> - asm ("sqrtt/suid %1,%0" : "=f"(ret) : "f"(d));
>> + asm ("sqrtt/suid %1,%0" : "=&f"(ret) : "f"(d));
>> # else
>> - asm ("sqrtt/sud %1,%0" : "=f"(ret) : "f"(d));
>> + asm ("sqrtt/sud %1,%0" : "=&f"(ret) : "f"(d));
>> # endif
>
>
> Hmm. This is surprising because any host that has sqrtt also has exact
> traps, and so trap shadows and recovery of the input shouldn't be an issue.
I have tried with a sqrtt/sud $f0, $f0, where the value in $f0 was
0x0....10. I can confirm that the returned result was 0.
> That said, do we actually still support a gcc that doesn't have these as
> builtins? That would be more ideal than inline asm.
Compiling
double test_sqrt (double d)
{
return __builtin_sqrt (d);
}
with -O2 -mcpu=ev6 resulted in:
test_sqrt:
.frame $30,16,$26,0
.mask 0x4000000,-16
$LFB0:
.cfi_startproc
ldah $29,0($27) !gpdisp!1
lda $29,0($29) !gpdisp!1
$test_sqrt..ng:
sqrtt $f16,$f0
lda $30,-16($30)
.cfi_def_cfa_offset 16
stq $26,0($30)
.cfi_offset 26, -16
.prologue 1
cmpteq $f0,$f0,$f10
fbeq $f10,$L4
$L2:
ldq $26,0($30)
lda $30,16($30)
.cfi_remember_state
.cfi_restore 26
.cfi_def_cfa_offset 0
ret $31,($26),1
$L4:
.cfi_restore_state
ldq $27,sqrt($29) !literal!2
jsr $26,($27),sqrt !lituse_jsr!2
ldah $29,0($26) !gpdisp!3
lda $29,0($29) !gpdisp!3
br $31,$L2
__builtin_sqrt expands with a call to sqrt(), this could possibly
create a recursive call loop if the builtin is used in libc to
implement sqrt.
Uros.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, alpha]: Add earlyclobber to sqrtt/sqrtf insns.
2017-04-14 12:30 [PATCH, alpha]: Add earlyclobber to sqrtt/sqrtf insns Uros Bizjak
2017-04-14 12:38 ` Florian Weimer
2017-04-14 13:55 ` Richard Henderson
@ 2017-04-26 13:44 ` Richard Henderson
2 siblings, 0 replies; 8+ messages in thread
From: Richard Henderson @ 2017-04-26 13:44 UTC (permalink / raw)
To: Uros Bizjak, libc-alpha
On 04/14/2017 02:30 PM, Uros Bizjak wrote:
> Add earlyclobber to sqrtt/sqrtf insns.
>
> When using software completions, we have to prevent assembler to match
> input and output operands of sqrtt/sqrtf insn. Add earlyclobber to
> output operand to avoid unwanted operand matching.
>
> 2017-04-14 Uros Bizjak<ubizjak@gmail.com>
>
> * sysdeps/alpha/fpu/math_private.h (__ieee754_sqrt): Add
> earlyclobber to output operand of sqrt insn.
> (__ieee754_sqrtf): Ditto.
Pushed to master.
r~
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2017-04-26 13:44 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-14 12:30 [PATCH, alpha]: Add earlyclobber to sqrtt/sqrtf insns Uros Bizjak
2017-04-14 12:38 ` Florian Weimer
2017-04-14 12:47 ` Uros Bizjak
2017-04-14 12:54 ` Florian Weimer
2017-04-14 13:02 ` Uros Bizjak
2017-04-14 13:55 ` Richard Henderson
2017-04-14 18:01 ` Uros Bizjak
2017-04-26 13:44 ` Richard Henderson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).