I've committed this patch to add a missing vector operator on amdgcn.

The architecture doesn't have a 64-bit "not" instruction, so we didn't have an insn for it. The vectorizer didn't like that, and as a result the v64df_pow function ended up with a 2MB stack frame. This is a problem when you typically have over 3000 threads and only want to allocate 32k of stack space each!

Andrew
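
For a sense of scale, the stack figures quoted above can be worked through in a few lines of C. This is only an illustrative sketch: it treats "over 3000" as exactly 3000 threads, reads "32k" and "2MB" as binary units, and the helper names are hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Figures quoted in the message; reading them as binary units and using
   exactly 3000 threads are assumptions made for this illustration. */
#define THREAD_COUNT      3000u                    /* "over 3000 threads" */
#define STACK_BUDGET      (32ull * 1024)           /* "32k" per thread    */
#define POW_FRAME_SIZE    (2ull * 1024 * 1024)     /* "2MB" stack frame   */

/* Total stack memory if every thread reserves per_thread bytes. */
uint64_t total_stack_bytes(uint64_t per_thread, unsigned threads)
{
    return per_thread * threads;
}

/* How many per-thread budgets a single frame of frame_size would consume. */
uint64_t frame_to_budget_ratio(uint64_t frame_size, uint64_t budget)
{
    return frame_size / budget;
}
```

With these assumptions, total_stack_bytes(STACK_BUDGET, THREAD_COUNT) is about 98 MB in total, while giving every thread room for one 2MB frame would need roughly 6.3 GB, and a single such frame is 64 times the intended per-thread budget.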