public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/67553] New: Saturating SSE/AVX instructions do not get optimized
From: tmb99 at gmx dot net @ 2015-09-11 16:41 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67553

            Bug ID: 67553
           Summary: Saturating SSE/AVX instructions do not get optimized
           Product: gcc
           Version: 5.2.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tmb99 at gmx dot net
  Target Milestone: ---

Compiling this code with -O3 -mavx

    __m128i v0 = _mm_setzero_si128();
    __m128i v2 = _mm_setzero_si128();
    __m128i sum = _mm_adds_epi16(v0,v2);
    __m128i dif = _mm_subs_epi16(v0,v2);

results in the following badly optimized assembly code:

    vpxor   %xmm0, %xmm0, %xmm0
    vpsubsw %xmm0, %xmm0, %xmm1
    vpaddsw %xmm0, %xmm0, %xmm0

IMHO the adds and subs instructions should be eliminated by the optimizer.
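For reference, here is a minimal scalar sketch of what one 16-bit lane of the saturating add and subtract computes, assuming the usual clamp-to-range semantics of `_mm_adds_epi16` and `_mm_subs_epi16`. The helper names (`sat_add_i16`, `sat_sub_i16`, `check_zero_inputs_fold`) are hypothetical and are not part of GCC or the intrinsics headers. The sketch only illustrates why the report expects folding: with all-zero inputs every lane is the constant 0.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 16-bit lane of a signed saturating add:
   compute in a wider type, then clamp to the int16_t range. */
static int16_t sat_add_i16(int16_t a, int16_t b)
{
    int32_t wide = (int32_t)a + (int32_t)b;
    if (wide > INT16_MAX) return INT16_MAX;
    if (wide < INT16_MIN) return INT16_MIN;
    return (int16_t)wide;
}

/* Scalar model of one 16-bit lane of a signed saturating subtract. */
static int16_t sat_sub_i16(int16_t a, int16_t b)
{
    int32_t wide = (int32_t)a - (int32_t)b;
    if (wide > INT16_MAX) return INT16_MAX;
    if (wide < INT16_MIN) return INT16_MIN;
    return (int16_t)wide;
}

/* Hypothetical self-check: with both operands zero, as in the report,
   each lane of sum and dif is a compile-time constant 0. */
static void check_zero_inputs_fold(void)
{
    assert(sat_add_i16(0, 0) == 0);
    assert(sat_sub_i16(0, 0) == 0);
}
```

Calling `check_zero_inputs_fold()` from a test harness exercises the zero-input case. The clamping branches only matter for operands near the ends of the range, which the reported snippet never reaches.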
* [Bug rtl-optimization/67553] Saturating SSE/AVX instructions do not get optimized
From: tmb99 at gmx dot net @ 2015-09-11 19:24 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67553

--- Comment #2 from tmb99 at gmx dot net ---
Seems to be the same for most saturating instructions:

    __m128i v0 = _mm_setzero_si128();
    __m128i v2 = _mm_setzero_si128();
    __m128i sum = _mm_adds_epi16(v0,v2);
    __m128i dif = _mm_subs_epi8(v0,v2);
    __m128i hsum = _mm_hadds_epi16(v0,v2);
    __m128i hdif = _mm_hsubs_epi16(v0,v2);
    __m128i pacu = _mm_packus_epi16(v0,v2);
    __m128i pacs = _mm_packs_epi32(v0,v2);

compiles to:

    vpxor     %xmm0, %xmm0, %xmm0
    vpxor     %xmm2, %xmm2, %xmm2
    vphsubsw  %xmm0, %xmm0, %xmm4
    vpackuswb %xmm0, %xmm0, %xmm3
    vphaddsw  %xmm0, %xmm0, %xmm5
    vpsubsb   %xmm2, %xmm2, %xmm2
    vpxor     %xmm1, %xmm1, %xmm1
    vpaddsw   %xmm0, %xmm0, %xmm0
    vpackssdw %xmm1, %xmm1, %xmm1

Also: 3 setzero/vpxor instructions instead of just one.
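The pack instructions in the comment above follow the same pattern. Below is a rough scalar sketch of one lane of `_mm_packus_epi16` (signed 16-bit to unsigned 8-bit) and `_mm_packs_epi32` (signed 32-bit to signed 16-bit), assuming the usual saturating-narrowing semantics. The function names are hypothetical illustrations and do not correspond to anything in GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one lane of an unsigned-saturating pack:
   a signed 16-bit input is clamped to the range 0..255. */
static uint8_t packus_i16(int16_t x)
{
    if (x < 0) return 0;
    if (x > UINT8_MAX) return UINT8_MAX;
    return (uint8_t)x;
}

/* Scalar model of one lane of a signed-saturating pack:
   a signed 32-bit input is clamped to the int16_t range. */
static int16_t packs_i32(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Hypothetical self-check: packing zero lanes yields zero lanes,
   so the results in the comment are constants as well. */
static void check_zero_inputs_pack(void)
{
    assert(packus_i16(0) == 0);
    assert(packs_i32(0) == 0);
}
```

As with the add and subtract case, only the zero-input path is relevant to the report. The clamping branches are included so the model stays faithful for other inputs.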
* [Bug rtl-optimization/67553] Saturating SSE/AVX instructions do not get optimized
From: rguenth at gcc dot gnu.org @ 2015-09-14 11:37 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67553

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |x86_64-*-*, i?86-*-*

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
The backend probably uses unspecs for this (the intrinsics use builtins).  I
wonder if the backend could enable fixed-point types/modes more generally.