From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ot1-x331.google.com (mail-ot1-x331.google.com [IPv6:2607:f8b0:4864:20::331]) by sourceware.org (Postfix) with ESMTPS id 94AB13857404 for ; Mon, 21 Mar 2022 20:19:54 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 94AB13857404 Received: by mail-ot1-x331.google.com with SMTP id i23-20020a9d6117000000b005cb58c354e6so5788251otj.10 for ; Mon, 21 Mar 2022 13:19:54 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:cc:references:from:in-reply-to :content-transfer-encoding; bh=AoNy6Fs+FyVKcwXXC+2KIa/0X8955nVogq8Fc0PzTro=; b=VxAGHxZHZn2elxxMPVXP3Gm2LGb7zXT0sMiOFKkeVWpeYfiSv3U3JxXPkp0wZRp0sr pRNxkMAcdbZ6OkJKktvu8EdzKQj+i9dFy41oIfpFIVPwHT56d/CYvkyuJ3Vq5hLgw6G6 R3w+YsE3Vfyxyvi38ePUC0ATuv7q2ueokrKLdzjshSwdAde5RUjEfLTqnZN+9XpPdmus t3LyW+G8Uw4axRgmYz7uey6l5Ds052v/ZMdPQv0fzH94y/tKhuIpIpZxMHX1SVpLxYg3 4YDJYmRg8AhUqMO/1uLn9WPgUblPiUQJSb8RrPl5xkWiw/XVWiQ80vvjxV+N1EuI33aL R65g== X-Gm-Message-State: AOAM530e2owUoGi9SYbf6nMlXKuBYVphgb/WRnDR7ei1eB0MI7wDtYR1 B1CmLGbnM2jayxO7ev1yX6uxfg== X-Google-Smtp-Source: ABdhPJxWU6FHYx+unzX6do1iNBEeZ7enN/DbVePSTRk3XDdPs1dh6VUJcb0txxjLX5A112NQi8Yq4Q== X-Received: by 2002:a9d:6b0a:0:b0:5b2:3355:5ec8 with SMTP id g10-20020a9d6b0a000000b005b233555ec8mr8691837otp.51.1647893993819; Mon, 21 Mar 2022 13:19:53 -0700 (PDT) Received: from ?IPV6:2804:431:c7ca:2d55:f04a:67c7:cbf3:571d? ([2804:431:c7ca:2d55:f04a:67c7:cbf3:571d]) by smtp.gmail.com with ESMTPSA id o18-20020a9d7652000000b005cbf6f5d7c5sm3208283otl.21.2022.03.21.13.19.51 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 21 Mar 2022 13:19:53 -0700 (PDT) Message-ID: <0599084c-49bc-23a3-da7e-ccb197d4c88c@linaro.org> Date: Mon, 21 Mar 2022 17:19:49 -0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0 Subject: Re: rwlock different behavior on glibc 2.28 then on previous versions Content-Language: en-US To: David Mozes Cc: Christian Hoff , Florian Weimer , "libc-help@sourceware.org" , Carlos O'Donell References: <87czis1tgk.fsf@oldenburg.str.redhat.com> <8681b86f-963c-7c01-3d7a-9c4cb3da52c8@gmx.net> <6D2D3001-2308-4336-829F-8369DF7D2514@silk.us> From: Adhemerval Zanella In-Reply-To: <6D2D3001-2308-4336-829F-8369DF7D2514@silk.us> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.8 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-help@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-help mailing list List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Mar 2022 20:19:56 -0000 On 21/03/2022 16:46, David Mozes wrote: > > Thanks Adhemerval, > For the suggestion! >   >  I suggest to remove the check just on the unlock function.I tested it with very high load and with application with 3k active threads.  >  Can you on a case or bug(dedlock or something) that removing thIs check will expose? I am no sure, so I would suggest you to maybe start a discussion on libc-alpha. However __pthread_rwlock_wrunlock and __pthread_rwlock_rdunlock have different semantic allowing unlocking write locks on different threads will require a lot of analysis on why it would trigger any more regression (and sorry, but stating it works on our environment is not suffice). > Thx > David >> On Mar 21, 2022, at 7:02 PM, Adhemerval Zanella wrote: >> >> As Florian has put, this is essentially a misuse of the API has always been >> undefined.  The rwlock implementation uses  __cur_write to both make a fast >> deadlock check and to know whether to handle the lock as a writer or reader >> one.  The former is a implementation detail that we might change (although >> it is useful in some case), but to handle the former will most likely to >> refactor and change the rdlock algorithm (just removing the check as you are >> proposing is not suffice). >> >> Since it is not a bug in our implementation, I think you will need to discuss >> it further on the libc-alpha to see if other are willing to actually make >> this change. I foresee that the algorithm change won't be simple to keep >> interface requirements and allow the specific case you are asking. In fact >> I think it might not be actually possible without change the rwlock semantic >> (not sure though). >> >> Carlos has fixed some issue on the rwlock, he might have some comments. >> >> On 21/03/2022 11:01, David Mozes wrote: >>> Hello Christain, >>> Yes, Thread Sanitizer caught it. >>> However, we can't run it all the time on the production code. >>> We need to make a small change of the unlock code to have async programing and backward compatibility for older code. >>> >>> Thx >>> David   >>>     >>> -----Original Message----- >>> From: Christian Hoff >>> Sent: Monday, March 21, 2022 8:20 AM >>> To: David Mozes ; Florian Weimer >>> Cc: libc-help@sourceware.org >>> Subject: Re: rwlock different behavior on glibc 2.28 then on previous versions >>> >>> Hallo David, >>> >>> did you try to run your program with Helgrind already? Helgrind was >>> designed to catch problems like this. Maybe Thread Sanitizer would also >>> detect the issue. >>> >>> >>> Best regards, >>> >>>    Christian >>> >>> On 3/20/22 5:44 PM, David Mozes wrote: >>>> Hi, >>>> Where are we with this? >>>> Also, the current implementation of the __pthread_rwlock_unlock  is problematic. >>>> On the one hand, it checks the API agreement. However, on the case of violation, the user don't have any induction >>>> And function return OK >>>> See example code below: >>>> >>>> >>>> struct threadInfo{ >>>>     pthread_t* tid; >>>>     pthread_rwlock_t *rwlock; >>>> }; >>>> >>>> void *lockerThreadFunction(void *vargp) >>>> { >>>>     km_status stat = KM_OK; >>>>     struct threadInfo* Tinfo = (struct threadInfo*)vargp; >>>> >>>>     do { >>>>         stat = pthread_rwlock_trywrlock(Tinfo->rwlock); >>>>         if (KM_OK == stat) { >>>>             printf("[INFO] - FIRST thread 0x%x locked %p \n", *(Tinfo->tid), Tinfo->rwlock); >>>>             sleep(10); >>>>             stat = pthread_rwlock_unlock(Tinfo->rwlock); >>>>             if (KM_OK == stat) { >>>>                 printf("[INFO] FIRST thread 0x%x released %p \n", *(Tinfo->tid), Tinfo->rwlock); >>>>                 sleep(1); >>>>             } >>>>         } >>>>     }while(KM_OK != stat); >>>>     return NULL; >>>> } >>>> >>>> >>>> void *secondThreadFunction(void *vargp) >>>> { >>>>     km_status stat = KM_OK; >>>>     struct threadInfo* Tinfo = (struct threadInfo*)vargp; >>>> >>>>     sleep(5); >>>>     do { >>>>         stat = pthread_rwlock_unlock(Tinfo->rwlock); >>>>         if (KM_OK == stat) { >>>>             printf("[INFO] SECOND thread 0x%x released %p \n", *(Tinfo->tid), Tinfo->rwlock); >>>>             sleep(1); >>>>         } else >>>>             printf("[INFO] SECOND thread 0x%x try_to_released %p erro status is: %d \n", *(Tinfo->tid), Tinfo->rwlock,stat); >>>> >>>>     }while(KM_OK != stat); >>>>     return NULL; >>>> } >>>> >>>> void *thirdThreadFunction(void *vargp) >>>> { >>>>     km_status stat = KM_OK; >>>>     struct threadInfo* Tinfo = (struct threadInfo*)vargp; >>>> >>>>     sleep(2); >>>>     do { >>>>         stat = pthread_rwlock_trywrlock(Tinfo->rwlock); >>>>         if (KM_OK == stat) { >>>>             printf("[INFO] THIRD thread 0X%x locked %x\n", *(Tinfo->tid), Tinfo->rwlock); >>>>             sleep(5); >>>>         } >>>>         else { >>>>             printf("[ERROR]: THIRD thread 0x%x failed Locking %p status is %d\n", *(Tinfo->tid), Tinfo->rwlock,stat); >>>>             sleep(1); >>>>         } >>>>     }while(KM_OK != stat); >>>>     return NULL; >>>> } >>>> >>>> void main() >>>> { >>>>     km_status stat = KM_OK; >>>>     /* km_rwlock lock; */ >>>>     pthread_rwlock_t rwlock; >>>>     pthread_t lockerTid = 0, releaseTid = 0, thirdTid=0; >>>>     struct threadInfo firstTinfo, secondTinfo, thirdTinfo; >>>> >>>>    /* stat = km_rwlock_init(&rwlock, KM_TRUE); */ >>>>     stat = pthread_rwlock_init (&rwlock, NULL); >>>> >>>>     if (KM_OK == stat) { >>>>         printf("read\\writer lock initialized\n"); >>>>         printf("Before Thread\n"); >>>> >>>>         firstTinfo.tid  = &lockerTid; >>>>         firstTinfo.rwlock = &rwlock; >>>>         secondTinfo.tid  = &releaseTid; >>>>         secondTinfo.rwlock = &rwlock; >>>>         thirdTinfo.tid  = &thirdTid; >>>>         thirdTinfo.rwlock = &rwlock; >>>> >>>> >>>>         pthread_create(&lockerTid, NULL, lockerThreadFunction, &firstTinfo); >>>>         pthread_create(&releaseTid,  NULL, secondThreadFunction, &secondTinfo); >>>>         pthread_create(&thirdTid, NULL, thirdThreadFunction, &thirdTinfo); >>>>         printf("After Thread %x\n", lockerTid); >>>> >>>> pthread_join(lockerTid, NULL); >>>>         pthread_join(releaseTid, NULL); >>>>         pthread_join(thirdTid, NULL); >>>> >>>> >>>> >>>> the output we saw is this: >>>> >>>> >>>> the ouput we saw is this: >>>> >>>> [root@Rocky85Mozes thread_lock]# ./rw_test >>>> read\writer lock initialized >>>> Before Thread >>>> [INFO] - FIRST thread 0x461f700 locked 0x7ffcda053cc0 >>>> After Thread 461f700 >>>> [ERROR]: THIRD thread 0x361d700 failed Locking 0x7ffcda053cc0 status is 16  -> first thread take the rwlock >>>> [ERROR]: THIRD thread 0x361d700 failed Locking 0x7ffcda053cc0 status is 16 >>>> [ERROR]: THIRD thread 0x361d700 failed Locking 0x7ffcda053cc0 status is 16 >>>> [INFO] SECOND thread 0x3e1e700 released 0x7ffcda053cc0   -> Second thread unlock the rwlock -- > and get OK!! >>>> [ERROR]: THIRD thread 0x361d700 failed Locking 0x7ffcda053cc0 status is 16 >>>> [ERROR]: THIRD thread 0x361d700 failed Locking 0x7ffcda053cc0 status is 16 >>>> [ERROR]: THIRD thread 0x361d700 failed Locking 0x7ffcda053cc0 status is 16 >>>> [ERROR]: THIRD thread 0x361d700 failed Locking 0x7ffcda053cc0 status is 16 >>>> [ERROR]: THIRD thread 0x361d700 failed Locking 0x7ffcda053cc0 status is 16 >>>> [INFO] FIRST thread 0x461f700 released 0x7ffcda053cc0 >>>> [ERROR]: THIRD thread 0x361d700 failed Locking 0x7ffcda053cc0 status is 16 >>>> >>>> >>>> -----Original Message----- >>>> From: David Mozes >>>> Sent: Sunday, March 13, 2022 12:23 PM >>>> To: Florian Weimer >>>> Cc: libc-help@sourceware.org >>>> Subject: RE: rwlock different behavior on glibc 2.28 then on previous versions >>>> >>>> Thx Florin, >>>> >>>> >>>> -----Original Message----- >>>> From: Florian Weimer >>>> Sent: Friday, March 11, 2022 3:59 PM >>>> To: David Mozes >>>> Cc: libc-help@sourceware.org >>>> Subject: Re: rwlock different behavior on glibc 2.28 then on previous versions >>>> >>>> * David Mozes: >>>> >>>>> On the previous version of Glibc (before the commit) We could release >>>>> write lock with different thread as the one that owned the lock (If we >>>>> like) . >>>> Sorry, this misuse of the API has always been undefined according to >>>> POSIX (“Results are undefined if the read-write lock rwlock is not held >>>> by the calling thread.”). >>>> >>>>>> Correct, but practically we can do it on the user's responsibility. >>>>>> Do you know if this "if" solved some synchronization issue or just enforced the undefined  API? >>>>> This is help very much for async program that has a many threads that >>>>> communicate with some target.  And we like to define a call back >>>>> function instead of blocking on aio thread that send the data in order >>>>> to release the rwlock.this improves the performance dramatically . >>>> Could you use a different synchronization object (perhaps a condition >>>> variable)? >>>> >>>>>> If we use a different synchronization object like semaphores, condition variable (by the way they use mutex under the same API agreement),  we will need to write self rwlock , which has less performance than other undefined states. >>>>>> If we like to keep the API agreement, why not define the different async programming unlock functions? >>>> Thanks >>>> Florian >>>> >>>> Thx >>>> David