From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qk1-x735.google.com (mail-qk1-x735.google.com [IPv6:2607:f8b0:4864:20::735]) by sourceware.org (Postfix) with ESMTPS id 1ED8F3857C60 for ; Mon, 10 May 2021 21:43:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 1ED8F3857C60 Received: by mail-qk1-x735.google.com with SMTP id 197so16890503qkl.12 for ; Mon, 10 May 2021 14:43:01 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=9fwzqrxMtOZDWzT+H16QaTnpyg1wUaNM9GbcZl2q37g=; b=caaLP0YQ6Hh+DEJLUASl47zESj1WrbojzIo4bPUEowTxmJiJZl2+x1KBocC7QLYC8w PWnvtGw2fMYDlpb54I6WAMhQMiTXNMMcyoMGjYaYVrQTB+sV00iNnkTgaMWYN1l0CF7X Ez0oUoOueo0AaluoGfxExl29h4S5wS7cxVjdq4Uzb1J5E5483T+dUf5e0XB/9tqxkn3P Ke/M/HIPBsENObykn9QqqErKVMsM7UoQrZj5tE9th4J73mz33wPy+72eL0/yHn9L+O0g ZNEHBztn38RTDOhHwt6oBE5lN/1aG8XmWAHcAnHG8ryonbysv7M3HiesvzNQMemF2G2F 7xPQ== X-Gm-Message-State: AOAM5323lbPXF7h2NxFQt4j77kKFNa8PiJTDgZD1bvHdXx2VCcj5qecv T5DT124SVsweu6hX2HSubu9EPHwLlS+uOQ== X-Google-Smtp-Source: ABdhPJzOJSQ44ocdC01c60J2DHw7Ur1Qp0sf9yvn7KrXqF9IoWzYjEZWaeXuyM3kA4pwTD3SzZ7sJw== X-Received: by 2002:a37:89c7:: with SMTP id l190mr26154376qkd.361.1620682980419; Mon, 10 May 2021 14:43:00 -0700 (PDT) Received: from [192.168.1.4] ([177.194.37.86]) by smtp.gmail.com with ESMTPSA id e12sm12856015qtj.81.2021.05.10.14.42.59 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 10 May 2021 14:43:00 -0700 (PDT) Subject: Re: [PATCH v2 4/4] libio: Fix race access to _IO_FLAGS2_NEED_LOCK To: Florian Weimer Cc: libc-alpha@sourceware.org References: <20210510170426.2533768-1-adhemerval.zanella@linaro.org> <20210510170426.2533768-4-adhemerval.zanella@linaro.org> <875yzq7a8s.fsf@oldenburg.str.redhat.com> <04d487bf-6c41-97ed-df4e-7d2eac9e1337@linaro.org> <87k0o65rf8.fsf@oldenburg.str.redhat.com> <66b7f9ac-f74a-2dc9-1c63-7b0f381fdc1f@linaro.org> <878s4m5qpx.fsf@oldenburg.str.redhat.com> From: Adhemerval Zanella Message-ID: <58e9c589-5005-11db-86ac-8495c3190795@linaro.org> Date: Mon, 10 May 2021 18:42:57 -0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1 MIME-Version: 1.0 In-Reply-To: <878s4m5qpx.fsf@oldenburg.str.redhat.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-6.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 10 May 2021 21:43:02 -0000 On 10/05/2021 16:33, Florian Weimer wrote: > * Adhemerval Zanella: > >> On 10/05/2021 16:18, Florian Weimer wrote: >>> * Adhemerval Zanella: >>> >>>> I see that my first assumption of moving the _IO_FLAGS2_NEED_LOCK after >>>> the lock should be suffice. The data race is not really related to >>>> _IO_FLAGS2_NEED_LOCK set in fact, but rather the small window where >>>> the _flags2 read and update with _IO_FLAGS2_NEED_LOCK might lost some >>>> bits update. >>> >>> Right. >>> >>>> Maybe the comment: >>>> >>>> /* The _IO_FLAGS2_NEED_LOCK is required here to enable the non-atomic >>>> fast path on function that call _IO_need_lock. However it requires >>>> to be set with stream locked to avoid a potential data race. */ >>> >>> s/fast path/path/ I suppose, and “it necessary to set it with the stream >>> locked”. >> >> >> /* The _IO_FLAGS2_NEED_LOCK is required here to enable the non-atomic >> path on function that call _IO_need_lock. However it is necessary >> to set it with the stream locked to avoid a potential data race. */ > > Meh, I'm confused. Aren't we disabling the non-atomic path? With this discussion it seems to me that the _IO_FLAGS2_NEED_LOCK set on flockfile is indeed superfluous. On the code: { flockfile (arq); write (STDERR_FILENO, "(1)\n", sizeof ("(1)\n")); feof (arq); funlockfile (arq); } { write (STDERR_FILENO, "(2)\n", sizeof ("(2)\n")); feof (arq); } The second feof will always take the lock because _IO_FLAGS2_NEED_LOCK is sticky, even though this is fine to use the fast non-atomic path (I am assuming single-thread here). > >>> Why not? The single-thread optimization would still avoid the atomic >>> compare-and-set, which should be a significant win on AArch64 and POWER. >>> There is more work to be done for the single-threaded case compared to >>> _IO_FLAGS2_NEED_LOCK, but all that is without atomics, so I think those >>> should be reasonably fast. >> >> Because it would solve only for single-thread case, for multithread >> case the single-thread optimization won't give the function that >> call _IO_need_lock the correct information to elide the lock. >> >> The seqeuence: >> >> flockfile (arq); >> >> feof (arq); >> ferror (arq); >> ... >> >> funlockfile (arq); >> >> Should only incur is atomic operation on flockfile and funlockfile >> in both single and multi-thread case. BY tying the feof taking the >> lock to single-thread optimization it incur in extra atomic operations >> on multi-thread case. > > There are two relevant locking optimizations: flockfile acquires the > lock for the current thread. At that point, the ownership check for the > recursive lock succeeds. That check doesn't use an atomic load, so it's > fast. > > The other optimization is generally enabled via __fsetlocking (arq, > FSETLOCKING_BYCALLER), which enables the _IO_USER_LOCK flag that > disables locking in feof and ferror as well (but not for flockfile and > funlockfile). > > The latter is perhaps not necessary because the application could just > call flockfile. This is indeed messy, ideally I think we would like something like: int _IO_feof (FILE *fp) { __flockfile (fp); int result = _IO_feof_unlocked (); __funlockfile (fp); return result; } void __flockfile (FILE *stream) { _IO_lock_lock (*stream->lock); } static inline void _IO_lock_lock (_IO_lock_t *lock) { void *self = THREAD_SELF; if (lock->owner != self) { lll_lock (name->lock, LLL_PRIVATE); lock->owner = self; } ++lock->cnt; } The __flockfile should provide single-thread optimization through the _IO_lock_lock (and the recursive lock). However we can't really use __flockfile internally because of __fsetlocking (FSETLOCKING_BYCALLER), so we need a wrapper to check for _IO_USER_LOCK (which is provided by the _IO_need_lock wrapper). The _IO_need_lock and _unlocked call really seems unnecessary, at least on Linux that already provides a recursive lock with owner.