From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-416650-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 14182 invoked by alias); 1 Mar 2013 21:58:55 -0000
Received: (qmail 14144 invoked by uid 48); 1 Mar 2013 21:58:40 -0000
From: "burnus at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/56504] New: -mveclibabi=... Support AMD's LibM 3.0 (successor of ACML)
Date: Fri, 01 Mar 2013 21:58:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: middle-end
X-Bugzilla-Keywords:
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: burnus at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields:
Message-ID: <bug-56504-4@http.gcc.gnu.org/bugzilla/>
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-bugs.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2013-03/txt/msg00092.txt.bz2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56504

             Bug #: 56504
           Summary: -mveclibabi=... Support AMD's LibM 3.0 (successor of
                    ACML)
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: burnus@gcc.gnu.org


GCC currently supports:

       -mveclibabi=type
           Specifies the ABI type to use for vectorizing intrinsics
           [...] and acml for the AMD math core library. [...]

           [...]    and "__vrd2_sin",
           "__vrd2_cos", "__vrd2_exp", "__vrd2_log", "__vrd2_log2",
           "__vrd2_log10", "__vrs4_sinf", "__vrs4_cosf", "__vrs4_expf",
           "__vrs4_logf", "__vrs4_log2f", "__vrs4_log10f" and
           "__vrs4_powf" for the corresponding function type when
           -mveclibabi=acml is used.

The current AMD LibM version, however, provides many more vector functions:
http://developer.amd.com/tools/cpu-development/libm/


From the release notes:

Vector Functions 
----------------
         Exponential
         -----------
            * vrs4_expf, vrs4_exp2f, vrs4_exp10f, vrs4_expm1f
            * vrsa_expf, vrsa_exp2f, vrsa_exp10f, vrsa_expm1f
            * vrd2_exp, vrd2_exp2, vrd2_exp10, vrd2_expm1
            * vrda_exp, vrda_exp2, vrda_exp10, vrda_expm1

         Logarithmic
         -----------
            * vrs4_logf, vrs4_log2f, vrs4_log10f, vrs4_log1pf
            * vrsa_logf, vrsa_log2f, vrsa_log10f, vrsa_log1pf
            * vrd2_log, vrd2_log2, vrd2_log10, vrd2_log1p
            * vrda_log, vrda_log2, vrda_log10, vrda_log1p

         Trigonometric
         -------------
            * vrs4_cosf, vrs4_sinf
            * vrsa_cosf, vrsa_sinf
            * vrd2_cos, vrd2_sin
            * vrda_cos, vrda_sin
            * vrd2_sincos, vrda_sincos
            * vrs4_sincosf, vrsa_sincosf
            * vrd2_tan, vrs4_tanf
            * vrd2_cosh


         Power
         -----
            * vrs4_cbrtf, vrd2_cbrt, vrs4_powf, vrs4_powxf
            * vrsa_cbrtf, vrda_cbrt, vrsa_powf, vrsa_powxf
            * vrd2_pow


The vector functions have the familiar signatures (cf. include/amdlibm.h):
    __m128d amd_vrd2_exp    (__m128d x);
    __m128  amd_vrs4_expf   (__m128  x);
    etc.

The array versions, by contrast, use:
    void amd_vrsa_expf      (int len, float  *src, float  *dst);
    void amd_vrda_exp2      (int len, double *src, double *dst);

    void amd_vrda_exp       (int len, double *src, double *dst);

Unfortunately, no further documentation is available; for example, it does
not say whether src and dst may point to the same array.



Note that AMD LibM now uses "amd_" as the prefix for the vector functions. It
still provides the old names as weak symbols, but only the following:

0000000000000340 W __vrd2_cos
00000000000000e0 W __vrd2_exp
00000000000001a0 W __vrd2_log
00000000000001c0 W __vrd2_log10
00000000000001b0 W __vrd2_log2
0000000000000330 W __vrd2_sin
0000000000000390 W __vrs4_cosf
00000000000000a0 W __vrs4_expf
0000000000000200 W __vrs4_log10f
00000000000001f0 W __vrs4_log2f
00000000000001e0 W __vrs4_logf
00000000000003a0 W __vrs4_sinf