From: David Edelsohn
Date: Tue, 11 Jan 2022 14:25:50 -0500
Subject: Re: [PATCH v4 2/3] rs6000: Support SSE4.1 "round" intrinsics
To: "Paul A. Clarke", Segher Boessenkool, Bill Schmidt
Cc: GCC Patches
List-Id: Gcc-patches mailing list

Suppress exceptions (when specified), by saving, manipulating, and
restoring the FPSCR. Similarly, save, set, and restore the
floating-point rounding mode when required.

No attempt is made to optimize writing the FPSCR (by checking if the
new value would be the same), other than using lighter weight
instructions when possible. Note that explicit instruction scheduling
"barriers" are added to prevent floating-point computations from being
moved before or after the explicit FPSCR manipulations. (That these are
required has been reported as an issue in GCC: PR102783.)

The scalar versions naively use the parallel versions to compute the
single scalar result and then construct the remainder of the result.

Of minor note, the values of _MM_FROUND_TO_NEG_INF and
_MM_FROUND_TO_ZERO are swapped from the corresponding values on x86 so
as to match the corresponding rounding mode values in the Power ISA.

Move implementations of _mm_ceil* and _mm_floor* into _mm_round*, and
convert _mm_ceil* and _mm_floor* into macros. This matches the current
analogous implementations in config/i386/smmintrin.h.

Function signatures match the analogous functions in
config/i386/smmintrin.h.

Add tests for _mm_round_pd, _mm_round_ps, _mm_round_sd, _mm_round_ss,
modeled after the very similar "floor" and "ceil" tests.
Include basic tests, plus tests at the boundaries for floating-point
representation, positive and negative. Test all of the parameterized
rounding modes as well as the C99 rounding modes and interactions
between the two. Exceptions are not explicitly tested.

2021-10-18  Paul A. Clarke

gcc
	* config/rs6000/smmintrin.h (_mm_round_pd, _mm_round_ps,
	_mm_round_sd, _mm_round_ss, _MM_FROUND_TO_NEAREST_INT,
	_MM_FROUND_TO_ZERO, _MM_FROUND_TO_POS_INF, _MM_FROUND_TO_NEG_INF,
	_MM_FROUND_CUR_DIRECTION, _MM_FROUND_RAISE_EXC, _MM_FROUND_NO_EXC,
	_MM_FROUND_NINT, _MM_FROUND_FLOOR, _MM_FROUND_CEIL, _MM_FROUND_TRUNC,
	_MM_FROUND_RINT, _MM_FROUND_NEARBYINT): New.
	* config/rs6000/smmintrin.h (_mm_ceil_pd, _mm_ceil_ps, _mm_ceil_sd,
	_mm_ceil_ss, _mm_floor_pd, _mm_floor_ps, _mm_floor_sd,
	_mm_floor_ss): Convert from function to macro.

gcc/testsuite
	* gcc.target/powerpc/sse4_1-round3.h: New.
	* gcc.target/powerpc/sse4_1-roundpd.c: New.
	* gcc.target/powerpc/sse4_1-roundps.c: New.
	* gcc.target/powerpc/sse4_1-roundsd.c: New.
	* gcc.target/powerpc/sse4_1-roundss.c: New.

Okay.

Thanks, David