I changed it to use 128-byte alignment to match the GPU cache-lines. Committed to OG12. Andrew On 11/01/2023 18:05, Andrew Stubbs wrote: > This patch fixes a runtime issue I encountered with the AMD GCN Unified > Shared Memory implementation. > > We were using regular malloc'd memory configured into USM mode, but > there were random intermittent crashes. I can't be completely sure, but > my best guess is that the HSA driver is using malloc internally from the > same heap, and therefore using memory on the same page as the offload > kernel. What I do know is that I could make the crashes go away by > simply padding the USM allocations before and after. > > With this patch USM allocations are now completely separated from the > system heap. The custom allocator is probably less optimal is some > use-cases, but does have the advantage that all the metadata is stored > in a side-table that won't ever cause any pages to migrate back to > main-memory unnecessarily. It's still possible for the user program to > use USM memory in a way that causes it to thrash, and this might have > been the ultimate cause of the crashes, but there's not much we can do > about that here. > > I've broken the allocator out into a new file because I anticipate it > being needed in more than one place, but I didn't put full > data-isolation on it yet. > > I'll rebase, merge, and repost all of the OpenMP memory patches sometime > soonish. > > Andrew