On 7/20/20 3:19 PM, Kwok Cheung Yeung wrote: > On 01/07/2020 3:28 pm, Tom de Vries wrote: >> So, I think gcc needs a copy of (some of) the >> gcc/testsuite/gcc.dg/ia64-sync-*.c tests for effective target >> sync_char_short. >> >> However, since this patch only adds partial support, we cannot enable >> sync_char_short for nvptx yet.  So, if you stick to partial support, you >> should add a char/short copy of ia64-sync-3.c to gcc.target/nvptx (which >> ideally could be an include of a generic test-case that is active for >> sync_char_short only, with mention that it can be removed once >> sync_char_short is enabled for nvptx). >> > > I have added gcc.target/nvptx/sync.c, which is a version of > ia64-sync-3.c extended to test chars and shorts too. I've: - added the short/char part of that as gcc.dg/ia64-async-5.c - included the sync_int_long ones of gcc.dg/ia64-async-* in gcc.target/nvptx - did the same for gcc.dg/ia64-async-5.c as part of this patch > I kept the original > int and long tests because sync_int_long isn't indicated as being > supported on nvptx either. > Yep, I proposed a patch to enable that: https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551842.html . >> I looked at the implementation, and it looks ok to me, though I think we >> need to make explicit in a comment what the assumptions are: >> - that we have read and write access to the entire word, and >> - that the word is not volatile. >> > > I've added some extra comments in the implementation. Like I said > previously, the loop accounts for the larger word being volatile. > Right, I known what that loop is intending to do. The loop though may manifest worst-case as a hang, so I've mentioned that in the comment. >> As for the oacc test-case, you could add the __int128 bit, perhaps along >> the lines of how things are done in >> libgomp/testsuite/libgomp.c++/target-8.C ? >> > > I've added a extra test for __int128 types in my libgomp testcase that > runs if 128-bit types are supported. > > I've tested that there are no regressions with the patch on standalone > nvptx, and that the new reduction-16.c testcase passes with both nvptx > and AMD GCN offloading. > > Is this version okay for master and og10? > Pushed as attached to master. Thanks, - Tom