From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 47705 invoked by alias); 14 Feb 2018 07:37:45 -0000 Mailing-List: contact cygwin-help@cygwin.com; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: cygwin-owner@cygwin.com Mail-Followup-To: cygwin@cygwin.com Received: (qmail 47689 invoked by uid 89); 14 Feb 2018 07:37:44 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,FREEMAIL_REPLYTO_END_DIGIT,RCVD_IN_DNSWL_NONE,SPF_PASS,T_RP_MATCHES_RCVD autolearn=no version=3.3.2 spammy=H*x:KHTML, H*x:Chrome, H*x:AppleWebKit, H*x:Safari X-HELO: sonic312-23.consmr.mail.gq1.yahoo.com Received: from sonic312-23.consmr.mail.gq1.yahoo.com (HELO sonic312-23.consmr.mail.gq1.yahoo.com) (98.137.69.204) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 14 Feb 2018 07:37:42 +0000 Received: from sonic.gate.mail.ne1.yahoo.com by sonic312.consmr.mail.gq1.yahoo.com with HTTP; Wed, 14 Feb 2018 07:37:41 +0000 Date: Wed, 14 Feb 2018 07:37:00 -0000 From: "Xiaofeng Liu via cygwin" Reply-To: Xiaofeng Liu Reply-To: Xiaofeng Liu To: "cygwin@cygwin.com" Message-ID: <1541824013.551552.1518593858391@mail.yahoo.com> In-Reply-To: <1460534026.545542.1518591843449@mail.yahoo.com> References: <1543396632.5417641.1512146709346.ref@mail.yahoo.com> <1543396632.5417641.1512146709346@mail.yahoo.com> <20171201171536.GA4325@calimero.vinschen.de> <1264797847.540865.1518590850864@mail.yahoo.com> <1460534026.545542.1518591843449@mail.yahoo.com> Subject: Re: mixed usage of lock protection and lock-free List template class in thread.h MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-SW-Source: 2018-02/txt/msg00158.txt.bz2 Sorry, I need send again for another try for the formatting. (Yahoo deleted my spaces. Please reformat the code in your C++ editor if yo= u want to use the code. Sorry!) Here is the sample code that will hang in cygwin: test-thread.cpp to compile:=20 g++ -std=3Dc++0x test-thread.cpp -lpthread=20 In this code, mutex and cond need be created and destroyed very frequently,= which could corrupt the static list object owned by some classes in thread= .h. In my test, I have a computer of 8 threads to run cygwin, and the hang = could happen when cond/mutex objects are created and destroyed for the orde= r of 1 millions times within a few minutes. Is this practical? Yes. My code for my product used a lot of std::future wh= ich use one mutex and one mutex for each object. The future objects are cre= ated and destroyed on the flight, so are for mutex and cond variables. I c= an also observe that the peak memory kept increasing to a few hundred MB, a= nd I suspect there is a MEMORY LEAK in cygwin kernel.=20 I hope the format will be good. If not, I will try again. Thanks.=20 Xiaofeng -----------------------------test-thread.cpp------------------- #include =20 #include =20 #include =20 #include =20 #include =20 #include =20 #include =20 #include =20 #include =20 #include =20 #include =20 #include =20 #include =20 #include =20 using namespace std;=20 template=20 class Queue {=20 typedef std::deque Container;=20 public:=20 Queue(int _capacity =3D std::numeric_limits::max()) : capacity(_capaci= ty), closed(false) {}=20 bool enqueue(T* d)=20 {=20 if (closed) return false;=20 std::unique_lock lock(mutex);=20 if (d =3D=3D 0) { closed =3D true; empty_cv.notify_all(); return true; }=20 while (data.size() >=3D capacity)=20 full_cv.wait(lock);=20 data.push_back(d);=20 empty_cv.notify_one();=20 return true;=20 }=20 T* dequeue()=20 {=20 std::unique_lock lock(mutex);=20 while (data.empty() && !closed)=20 empty_cv.wait(lock);=20 if (data.size()) {=20 T* d =3D data.front();=20 data.pop_front();=20 full_cv.notify_one();=20 return d;=20 } else return 0;=20 }=20 size_t size() const { return data.size(); }=20 private:=20 std::mutex mutex;=20 std::condition_variable full_cv, empty_cv;=20 uint32_t capacity;=20 bool closed;=20 Container data;=20 };=20 struct Node {=20 int data;=20 };=20 struct Job {=20 Node* node;=20 Queue* recycle;=20 };=20 static Queue jobs;=20 static void* handle_job(void* arg)=20 {=20 long ithr =3D (long)arg;=20 unsigned long seed =3D time(0) + ithr*1000000;=20 int NS =3D 1000;=20 while (Job* j =3D jobs.dequeue()) {=20 struct timespec ts;=20 ts.tv_sec =3D 0;=20 seed =3D seed * 1103515245 + 12345;=20 ts.tv_nsec =3D seed%NS;=20 nanosleep(&ts, 0);=20 j->recycle->enqueue(j->node);=20 delete j;=20 }=20 }=20 struct Task {=20 Queue recycle;=20 int size; // number of sub jobs=20 Task(int N) : size(N)=20 {=20 for (int i =3D 0; idata =3D i;=20 Job* job =3D new Job;=20 job->node =3D t;=20 job->recycle =3D &recycle;=20 jobs.enqueue(job);=20 }=20 }=20 ~Task()=20 {=20 int i =3D 0;=20 while (Node* t =3D recycle.dequeue()) {=20 delete t;=20 if (++i =3D=3D size) break;=20 }=20 }=20 };=20 static string timestamp()=20 {=20 time_t t;=20 struct tm tmp;=20 char buf[80];=20 t =3D time(NULL);=20 localtime_r(&t, &tmp);=20 strftime(buf, sizeof(buf), "%d-%m-%Y %I:%M:%S", &tmp);=20 return buf;=20 }=20 static void* create_task(void* arg)=20 {=20 long ithr =3D (long)arg;=20 int TASK_NUM =3D 1000000;=20 int one_percent =3D TASK_NUM/100;=20 int TASK_SUB_JOB_NUM =3D 1;=20 int NS =3D 1000;=20 unsigned long seed =3D time(0) + ithr*10000;=20 int i =3D 0;=20 for (; i < TASK_NUM; ++i) {=20 struct timespec ts;=20 ts.tv_sec =3D 0;=20 seed =3D seed * 1103515245 + 12345;=20 ts.tv_nsec =3D seed%NS;=20 nanosleep(&ts, 0);=20 if (i%one_percent =3D=3D 0) {=20 fprintf(stderr, "%s: create_task[%d]: %d%% done\n", timestamp().c_str(), it= hr, i/one_percent);=20 }=20 Task task(TASK_SUB_JOB_NUM);=20 }=20 fprintf(stderr, "%s: create_task[%d]: %d%% done\n", timestamp().c_str(), it= hr, i/one_percent);=20 }=20 int main()=20 {=20 int NTHR_HANDLE_JOB =3D 4, NTHR_CREATE_TASK =3D 4;=20 std::vector threads(NTHR_HANDLE_JOB+NTHR_CREATE_TASK);=20 int k =3D 0;=20 for (long i =3D 0; i < NTHR_HANDLE_JOB; ++i) {=20 pthread_create(&threads[k++], NULL, handle_job, (void*)i);=20 }=20 for (long i =3D 0; i < NTHR_CREATE_TASK; ++i) {=20 pthread_create(&threads[k++], NULL, create_task, (void*)i);=20 }=20 // wait for create_task thread=20 for (size_t i =3D NTHR_HANDLE_JOB; i < threads.size(); ++i) {=20 pthread_join(threads[i], NULL);=20 }=20 jobs.enqueue(0);=20 // wait for handle_job thread=20 for (size_t i =3D 0; i < NTHR_HANDLE_JOB; ++i)=20 pthread_join(threads[i], NULL);=20 }=20 ----------------------------end of test-thread.cpp-------------------------- To: cygwin@cygwin.com Sent: Friday, December 1, 2017 9:15 AM Subject: Re: mixed usage of lock protection and lock-free List template cla= ss in thread.h On Dec 1 16:45, Xiaofeng Liu via cygwin wrote: > Lock protection and lock-free should never be mixed !=20 > =E2=80=8Bhttps://cygwin.com/git/gitweb.cgi?p=3Dnewlib-cygwin.git;a=3Dblob= ;f=3Dwinsup/cygwin/thread.h;hb=3D1f42dc2bcf58d3b8629eb13d53de3f69fc314b47#l= 110 >=20 > 110 template inline void 111 List_insert (list_node *&= head, list_node *node) 112 { 113 if (!node) 114 return; 115 do 116 = node->next =3D head; 117 while (InterlockedCompareExchangePointer ((P= VOID volatile *) &head, 118 nod= e, node->next) !=3D node->next); 119 } 120 121 template = inline void 122 List_remove (fast_mutex &mx, list_node *&head, list_node *n= ode) 123 { 124 if (!node) 125 return; 126 mx.lock (); 127 if (hea= d) 128 { 129 if (InterlockedCompareExchangePointer ((PVOID volati= le *) &head, 130 node->next, n= ode) !=3D node) 131 { 132 list_node *cur =3D head; 133 1= 34 while (cur->next && node !=3D cur->next) 135 cur = =3D cur->next; 136 if (node =3D=3D cur->next) 137 cur= ->next =3D cur->next->next; 138 } 139 } 140 mx.unlock (); 141= } > The symptom I met is a job hang with the following stack: > #0 0x000000007711c2ea in ntdll!ZwWaitForMultipleObjects () from /cygdriv= e/c/Windows/SYSTEM32/ntdll.dll > #1 0x000007fefd111430 in KERNELBASE!GetCurrentProcess () from /cygdrive/= c/Windows/system32/KERNELBASE.dll > #2 0x0000000076fc06c0 in WaitForMultipleObjects () from /cygdrive/c/Wind= ows/system32/kernel32.dll > #3 0x00000001800458ac in cygwait(void*, _LARGE_INTEGER*, unsigned int) (= ) from /usr/bin/cygwin1.dll > #4 0x000000018013d029 in pthread_cond::~pthread_cond() () from /usr/bin/= cygwin1.dll > #5 0x000000018013d0dd in pthread_cond::~pthread_cond() () from /usr/bin/= cygwin1.dll > #6 0x0000000180141196 in pthread_cond_destroy () from /usr/bin/cygwin1.d= ll > #7 0x0000000180116d5b in _sigfe () from /usr/bin/cygwin1.dll > #8 0x0000000100908e38 in std::_Sp_counted_ptr_inplace, std::allocator, void ()>, std::a= llocator, (__gnu_cxx::_Lock_policy)2>::_M_dispose() () > The problem with the current implementation for concurrent insert and del= ete is explained at WikipediaNon-blocking linked list >=20 > My question is how to solve this ? Adding lock protection in List_insert = (removing lock-freee) or implementing a complete lock-free List based on Ha= rris's solution to use two CAS? First of all, please, please, please fix your MUA! Just like your mails to cygwin-patches a couple of weeks ago, your mails are pretty much unreadable due to broken line wrapping. Back to business: This code is working since 2003. So, is that just a theoretical problem, or a practical one? If the latter, what is broken exactly? However, since you're asking, a lockless implementation where appropriate is always welcome. Thanks, Corinna --=20 Corinna Vinschen Please, send mails regarding Cygwin to Cygwin Maintainer cygwin AT cygwin DOT com Red Hat -- Problem reports: http://cygwin.com/problems.html FAQ: http://cygwin.com/faq/ Documentation: http://cygwin.com/docs.html Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple