From: Satish Vasudeva
Date: Thu, 24 Feb 2022 08:42:56 -0800
Subject: Re: Libatomic 16B
To: gcc-help@gcc.gnu.org

I looked into this further. It seems that libat_load_16_i1 implements the
16-byte load as "lock cmpxchg16b (%rdi)". This assumes that the CPU does
not support 16-byte loads in a single transaction.

How can I compile libatomic to use intrinsics for the 16-byte load instead
of LOCK cmpxchg?

Appreciate your response.

Satish

On Wed, Feb 23, 2022 at 8:42 AM Satish Vasudeva <satish.vasudeva@cohesity.com> wrote:

> Hi Team,
>
> I was looking at the hotspots in our software stack and interestingly I
> see libat_load_16_i1 seems to be one of the top in the list.
>
> I am trying to understand why that is the case. My suspicion is some kind
> of lock usage for 16B atomic accesses.
>
> I came across this discussion but frankly I am still confused.
> https://gcc.gnu.org/legacy-ml/gcc-patches/2017-01/msg02344.html
>
> Do you think the overhead of libat_load_16_i1 is due to spinlock usage?
> Also reading some other Intel CPU docs, it seems like the CPU does support
> loading 16B in single access. In that case can we optimize this for
> performance?
>
> Thanks and appreciate your help.
>
> Satish
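To illustrate the "lock cmpxchg16b" observation above, here is a minimal sketch of how an atomic load can be built from nothing but a compare-and-swap. It is an illustration under assumptions, not libatomic's source: load_via_cas is a hypothetical name, and the sketch uses an 8-byte value so that it builds with GCC or Clang without linking libatomic. The same idea applied to a 16-byte value would yield the instruction seen in the profile.

```c
#include <stdint.h>

/* Hypothetical helper (not libatomic's code): read *p using only a
   compare-and-swap, with no plain atomic load instruction. */
static uint64_t load_via_cas(uint64_t *p)
{
    uint64_t expected = 0;

    /* Try to replace 0 with 0.
       - If *p is 0, the swap "succeeds" and stores 0 over 0, so the
         value is unchanged and expected stays 0.
       - Otherwise the swap fails and the builtin copies the current
         value of *p into expected.
       Either way, expected ends up holding the value of *p. */
    __atomic_compare_exchange_n(p, &expected, 0,
                                0 /* strong, not weak */,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}
```

Calling load_via_cas(&x) returns the current value of x and leaves x unchanged. Because the compare-and-swap is a locked read-modify-write, every such "load" needs the cache line in exclusive state even though nothing changes, so concurrent readers contend with each other. That would be consistent with the function appearing as a hotspot without any spinlock being involved.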