Date: Wed, 14 Dec 2022 11:57:30 -0500
From: Carlos O'Donell
Organization: Red Hat
Subject: Re: [PATCH v3] x86: Prevent SIGSEGV in memcmp-sse2 when data is concurrently modified [BZ #29863]
To: "H.J. Lu", Noah Goldstein
Cc: libc-alpha@sourceware.org, carlos@systemhalted.org
References: <20221214001147.2814047-1-goldstein.w.n@gmail.com> <20221214073633.2876766-1-goldstein.w.n@gmail.com>

On 12/14/22 11:09, H.J. Lu via Libc-alpha wrote:
> On Tue, Dec 13, 2022 at 11:36 PM Noah Goldstein wrote:
>>
>> In the case of INCORRECT usage of `memcmp(a, b, N)`, where `a` and `b`
>> are concurrently modified as `memcmp` runs, there can be a SIGSEGV
>> in `L(ret_nonzero_vec_end_0)` because the sequential logic
>> assumes that `(rdx - 32 + rax)` is a positive 32-bit integer.
>>
>> To be clear, this change does not mean such usage of `memcmp` is
>> supported. The program behaviour is undefined (UB) in the
>> presence of data races, and `memcmp` is incorrect when the values
>> of `a` and/or `b` are modified concurrently (data race). This UB
>> may manifest itself as a SIGSEGV. That being said, if we can
>> allow the idiomatic use cases, like those in yottadb with
>> opportunistic concurrency control (OCC), to execute without a
>> SIGSEGV, at no cost to regular use cases, then we can aim to
>> minimize harm to those existing users.
>>
>> The fix replaces the 32-bit `addl %edx, %eax` with the 64-bit variant
>> `addq %rdx, %rax`. The one extra byte of code size from using the
>> 64-bit instruction doesn't contribute to overall code size, as the
>> next target is aligned and has multiple bytes of `nop` padding
>> before it.
>> As well, all the logic between the add and `ret` still
>> fits in the same fetch block, so the cost of this change is
>> basically zero.
>>
>> The relevant sequential logic can be seen in the following
>> pseudo-code:
>> ```
>> /*
>>  * rsi = a
>>  * rdi = b
>>  * rdx = len - 32
>>  */
>> /* cmp a[0:15] and b[0:15]. Since length is known to be [17, 32]
>>    in this case, this check is also assumed to cover a[0:(31 - len)]
>>    and b[0:(31 - len)]. */
>> movups   (%rsi), %xmm0
>> movups   (%rdi), %xmm1
>> PCMPEQ   %xmm0, %xmm1
>> pmovmskb %xmm1, %eax
>> subl     %ecx, %eax
>> jnz      L(END_NEQ)
>>
>> /* cmp a[len-16:len-1] and b[len-16:len-1]. */
>> movups   16(%rsi, %rdx), %xmm0
>> movups   16(%rdi, %rdx), %xmm1
>> PCMPEQ   %xmm0, %xmm1
>> pmovmskb %xmm1, %eax
>> subl     %ecx, %eax
>> jnz      L(END_NEQ2)
>> ret
>>
>> L(END_NEQ2):
>> /* Position first mismatch. */
>> bsfl     %eax, %eax
>>
>> /* The sequential version is able to assume this value is a
>>    positive 32-bit value because the first check included bytes in
>>    range a[0:(31 - len)] and b[0:(31 - len)], so `eax` must be
>>    greater than `31 - len`, and the minimum value of `edx` + `eax` is
>>    `(len - 32) + (32 - len) >= 0`. In the concurrent case, however,
>>    `a` or `b` could have been changed, so a mismatch in `eax` less
>>    than or equal to `(31 - len)` is possible (the new low bound is
>>    `(16 - len)`). This can result in a negative 32-bit signed
>>    integer, which, when zero-extended to 64 bits, is a large value
>>    that is out of bounds. */
>> addl     %edx, %eax
>>
>> /* Crash here because the 32-bit negative number in `eax` zero-
>>    extends to an out-of-bounds 64-bit offset. */
>> movzbl   16(%rdi, %rax), %ecx
>> movzbl   16(%rsi, %rax), %eax
>> ```
>>
>> The fix is quite simple: just make the `addl %edx, %eax` 64-bit
>> (i.e. `addq %rdx, %rax`). This prevents the 32-bit zero extension,
>> and since `eax` is still bounded below by `16 - len`, `rdx + rax`
>> is bounded below by `(len - 32) + (16 - len) >= -16`.
>> Since we have a
>> fixed offset of `16` in the memory access, this must be in bounds.
>> ---
>>  sysdeps/x86_64/multiarch/memcmp-sse2.S | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/sysdeps/x86_64/multiarch/memcmp-sse2.S b/sysdeps/x86_64/multiarch/memcmp-sse2.S
>> index afd450d020..34e60e567d 100644
>> --- a/sysdeps/x86_64/multiarch/memcmp-sse2.S
>> +++ b/sysdeps/x86_64/multiarch/memcmp-sse2.S
>> @@ -308,7 +308,7 @@ L(ret_nonzero_vec_end_0):
>>  	setg	%dl
>>  	leal	-1(%rdx, %rdx), %eax
>>  # else
>> -	addl	%edx, %eax
>> +	addq	%rdx, %rax
>
> Please add some comments here and a testcase.

I do not recommend a test case. I would accept Noah's v3 as-is.

Even with a two- or three-thread test, you'd have to run for long enough to statistically trigger the change at the right time, and that means spending developer CPU resources to test that a specific IFUNC works under UB conditions. This slows down development cycle time for a use case we don't actually support.

I support Noah making this change because it doesn't have a performance impact, and I'm empathetic to fixing a developer-reported issue. I wouldn't go any further than that.

My strong suggestion to yottadb is that they need to write their own hand-crafted memcmp to ensure the entire algorithm operates under the *must-only-read-bytes-once* conditions. If you read bytes *twice* due to overlapped loads, you may have problems if you read inconsistent values where you didn't expect them. The algorithm limitations are quite severe IMO, and any mix of C and asm here could result in the compiler exposing visible UB to the application author.

>>  	movzbl	(VEC_SIZE * -1 + SIZE_OFFSET)(%rsi, %rax), %ecx
>>  	movzbl	(VEC_SIZE * -1 + SIZE_OFFSET)(%rdi, %rax), %eax
>>  	subl	%ecx, %eax
>> --
>> 2.34.1
>>

-- 
Cheers,
Carlos.