From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id AD60538582A8 for ; Fri, 16 Feb 2024 13:52:12 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AD60538582A8 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org AD60538582A8 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1708091535; cv=none; b=DV+g9Z1N6pjhti9SZ03NJkKKb29/8E37905bUzeiJSpm9xyWlwPCiTasF2RaNzHReFhBtOt6qiv6azNwJkfO0KdqX4uRyjQXAVkk3AzXOeO1o0HQDpJ242qg7AIikgs3GaUm43zovZk6j8VoJaQJ4iswUYhJjeQKUO2+ULqOdO0= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1708091535; c=relaxed/simple; bh=kDNLfF+Pnu/Nn1iAOVDm9uh/PKUQorUYvGi44egjip8=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=nifxHch85By36m/u+a3bp1FmrIKjqMEZKgZmAzgCquUbeVilK6apGI+lmcGkBjev75yUjTNnOFvWis/LMzabKM//zc4jPcYn70FCoKKvSM1vTb1ZS1VDiWPhFmpf/g7AVIVwj8O8sZhqnNeSeGDFNQUtBtyduQi04kLOtZ9WXE4= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1708091532; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=J7JpQHj4I0CBUGjJJCjjVZ5RHVPoqWtg3r6HlQwN7a8=; b=AFap69rEZfDmEmsS7Hv/RJfHV6PlswlMa7RzXygo0KX6qlb1GecLWY4/7GwhNNTVFiaFVE ktErIaLpTiAEVh11QYnUVBF2wqlUqg1UFF/DCliwHt8MX9wU6hKXLY+pU3ZgLLPHPle2cL lkyVklhg+XQtymC086NS2YDyfOoM/U8= Received: from mail-yw1-f197.google.com (mail-yw1-f197.google.com [209.85.128.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-591-Y6IgkF3_NJ2-_MOLoKLSMg-1; Fri, 16 Feb 2024 08:52:10 -0500 X-MC-Unique: Y6IgkF3_NJ2-_MOLoKLSMg-1 Received: by mail-yw1-f197.google.com with SMTP id 00721157ae682-607a628209eso38834057b3.3 for ; Fri, 16 Feb 2024 05:52:10 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1708091530; x=1708696330; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=J7JpQHj4I0CBUGjJJCjjVZ5RHVPoqWtg3r6HlQwN7a8=; b=fvJsqiU04rgZ30mOTP5G5RRsRYC43r83VSVskknXEVjZ9RKTzajfn58cMg7sWz3Z/F vjHFzUdI2NaG+OVu9ej/qg2/TQpMcFUH9XFnASDLxLJ8cFUz9EyBpytyj7N6AF+dgDI4 QyRN7qmuKKCAmnvt7vyb7tA96rw3tWj5Qe4l4o2JebLSIlQkrnc49ni0MVGebmmPckZV RSygb+BeIgtYU+0JSY/FFYw9WXza0RTEf5JaWLjiESztW4kAVBerkNLEK2yVf0VlrOUh 5F4Sde6g5GjUyFgRgEJjVFqCQNutsXG3eZJfdlTC88gSAxGpJgkuZuGUvVl40+j4kmqg YtzA== X-Forwarded-Encrypted: i=1; AJvYcCUKZ3x9Dfk8e5DpHsh9qZD4PhTV9rwDx5GzKocrPeqIiHImTtCOIivyASj06Ziw+hkaLiuhWhUFX8SRCRrEMY1a1jsj8go= X-Gm-Message-State: AOJu0YwqxVDP31yrOGPck4o6ecuvXbH+Nqg6kS3t/WMwDdCacFBuUa2T XMFmGxuf6mP1PVfQGvfvcc7QKm8bGwnzN3tktVOegB5bjXe1skht9RXWMi0Mm1FGGpZeuHDLiHz f/4oDH+RTTaGQl/oV2QOGKu6v5MGEsvh3yk9nRr51Hui4nwqjheD5u4qoGk3qkPNzg2vLQeyeFJ fcj6vwHQeYE3m8ZemZYvA8rM5nMpo= X-Received: by 2002:a81:4c41:0:b0:607:a085:b434 with SMTP id z62-20020a814c41000000b00607a085b434mr4707303ywa.8.1708091530422; Fri, 16 Feb 2024 05:52:10 -0800 (PST) X-Google-Smtp-Source: AGHT+IFXPSFGtEFdaeJqqP+b2mCI+sQnGGcMcmZbgkHb16pkVqfsttN+HaP08QM/2jzuYtXbQcWrSEml3A6eow8/gHo= X-Received: by 2002:a81:4c41:0:b0:607:a085:b434 with SMTP id z62-20020a814c41000000b00607a085b434mr4707294ywa.8.1708091530133; Fri, 16 Feb 2024 05:52:10 -0800 (PST) MIME-Version: 1.0 References: <9f2bdd3a973d98199bb5b322baa575ab2fba8a58.camel@xry111.site> In-Reply-To: From: Jonathan Wakely Date: Fri, 16 Feb 2024 13:51:54 +0000 Message-ID: Subject: Re: [PATCH] libstdc++: atomic: Add missing clear_padding in __atomic_float constructor To: xndcn Cc: Jakub Jelinek , Xi Ruoyao , "H.J. Lu" , GCC Patches , libstdc++@gcc.gnu.org X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-6.7 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Fri, 16 Feb 2024 at 12:38, Jonathan Wakely wrote: > > On Fri, 2 Feb 2024 at 16:52, xndcn wrote: > > > > Thank you for your careful review! > > > > > But we don't need a new one if it's going to be used in exactly one test and if the new option does the same thing for all targets that run the test. > > Got it, thanks. Now add option "-latomic" directly, but it still rely > > on the trick "[atomic_link_flags [get_multilibs]]" > > > > > No, because the patch is supposed to prevent the infinite loop, and so there's no need to stop it looping after 10s. It won't loop at all. > > Thanks, deleted. > > > > > We only need to clear padding for long double, not float and double, right? > > Yes, actually there is a check "if constexpr > > (__atomic_impl::__maybe_has_padding<_Fp>())". > > But "__atomic_impl::__clear_padding(_M_fp); " is indeed simply, so fixed here. > > > > > Why can't we run this on all targets? > > Got it, now target option deleted. > > > > > There's no reason to use __builtin_memset here, just include and use std::memcpy. > > Thanks, fixed. > > > > > It definitely does have padding, just say "long double has padding bits on x86" > > Thanks, fixed. > > > > So here comes the latest patch: > > > Thanks. I've applied the patch to my tree, but the new test fails > pretty reliably. > > The infinite loop in std::atomic::fetch_add is fixed by > clearing padding in the constructor, but the test fails on the > compare_exchange_weak or compare_exchange_strong lines here: > > > > + as.fetch_add(t); > > + long double s = f + t; > > + t = as.load(); > > + VERIFY(s == t); // padding ignored on float comparing > > + fill_padding(s); > > + VERIFY(as.compare_exchange_weak(s, f)); // padding cleared on cmpexchg > > + fill_padding(f); > > + VERIFY(as.compare_exchange_strong(f, t)); // padding cleared on cmpexchg > > > > > I think the problem is here in __atomic_impl::__compare_exchange: > > if (__atomic_compare_exchange(__pval, __pexp, __pi, > __is_weak, int(__s), int(__f))) > return true; > > Even though padding in *__pexp and *__pi has been cleared, the value > of *__pval after a successful __atomic_compare_exchange has non-zero > padding. That means that the next compare_exchange will fail, because > we assume that the stored value always has zeroed padding bits. > > Here's a gdb session showing that __atomic_compare_exchange stores a > value with non-zero padding: > > Breakpoint 2, test01 () at compare_exchange_padding.cc:43 > 43 long double s2 = s; > (gdb) n > 44 fill_padding(s2); > (gdb) > 45 while (!as.compare_exchange_weak(s2, f)) // padding cleared > on compexchg > (gdb) p/x as._M_fp > $11 = 0x40008000000000000000 > (gdb) step > std::__atomic_float::compare_exchange_weak > (this=0x7fffffffd8c0, __expected=@0x7fffffffd8a0: 2, __desired=0.5, > __order=std::memory_order::seq_cst) at > /home/jwakely/gcc/14/include/c++/14.0.1/bits/atomic_base.h:1387 > 1387 return compare_exchange_weak(__expected, __desired, __order, > (gdb) step > std::__atomic_float::compare_exchange_weak > (this=0x7fffffffd8c0, __expected=@0x7fffffffd8a0: 2, __desired=0.5, > __success=std::memory_order::seq_cst, > __failure=std::memory_order::seq_cst) at > /home/jwakely/gcc/14/include/c++/14.0.1/bits/atomic_base.h:1347 > 1347 return __atomic_impl::compare_exchange_weak(&_M_fp, > (gdb) step > std::__atomic_impl::compare_exchange_weak > (__check_padding=false, __failure=std::memory_order::seq_cst, > __success=std::memory_order::seq_cst, __desired=0.5, > __expected=@0x7fffffffd8a0: 2, __ptr=0x7fffffffd8c0) > at /home/jwakely/gcc/14/include/c++/14.0.1/bits/atomic_base.h:1123 > 1123 return __atomic_impl::__compare_exchange<_AtomicRef>( > (gdb) > std::__atomic_impl::__compare_exchange > (__f=std::memory_order::seq_cst, __s=std::memory_order::seq_cst, > __is_weak=true, > __i=, __e=@0x7fffffffd8a0: 2, > __val=@0x7fffffffd8c0: 2) at > /home/jwakely/gcc/14/include/c++/14.0.1/bits/atomic_base.h:994 > 994 __glibcxx_assert(__is_valid_cmpexch_failure_order(__f)); > (gdb) n > 997 _Tp* const __pval = std::__addressof(__val); > (gdb) > 1008 _Vp* const __pi = __atomic_impl::__clear_padding(__i); > (gdb) > 1010 _Vp __exp = __e; > (gdb) > 1012 _Vp* const __pexp = __atomic_impl::__clear_padding(__exp); > (gdb) > 1016 if (__atomic_compare_exchange(__pval, __pexp, __pi, > (gdb) p/x *__pval > $12 = 0x40008000000000000000 > (gdb) p/x *__pexp > $13 = 0x40008000000000000000 > (gdb) p/x *__pi > $14 = 0x3ffe8000000000000000 > (gdb) n > 1018 return true; > (gdb) p/x *__pval > $15 = 0x7ffff7bf3ffe8000000000000000 > (gdb) > > We stored *__pi which has zero padding, but the result in *__pval has > non-zero padding. This doesn't seem to be gdb being misleading by > loading *__pval into a FP register which doesn't preserve the zero > padding, because if I do this then it fails: > > as.fetch_add(t); > VERIFY(as.load() == s); > __builtin_clear_padding(&s); > VERIFY( std::memcmp(&s, &as, sizeof(s)) == 0 ); > > So the value stored by fetch_add (which uses compare_exchange_weak in > a loop) really doesn't have zero padding bits. > > I'm not sure what's going on here yet, maybe __atomic_compare_exchange > recognizes that the pointers are long double* and so only copies the > non-padding bits into the value? Ah, although __atomic_compare_exchange only takes pointers, the compiler replaces that with a call to __atomic_compare_exchange_n which takes the newval by value, which presumably uses an 80-bit FP register and so the padding bits become indeterminate again. Maybe we can make std::atomic actually store a 16-byte struct containing an opaque char buffer, which we store a long double into. That should stop FP registers being used for it. We never need an actual long double, because we only access the value using the __atomic_xxx built-ins. Maybe we need that for every std::atomic where T has padding bits, because even for struct S { int i; char c; } passing S by value could be scalarized and only copy the value representation, not preserve the padding that we've carefully cleared. > > I think I'll push the fix to the __atomic_float constructor, but with > a simplified test case that doesn't include the > compare_exchange_{weak,strong} calls, until I figure out how to fix > it.