From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48) id 471253857806; Wed, 16 Mar 2022 09:42:07 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 471253857806
From: "tansheng at spacesoftwares dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/104951] New: avx512fintrin.h(9146): error: identifier "__builtin_ia32_rndscaless_round" is undefined
Date: Wed, 16 Mar 2022 09:42:07 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: c++
X-Bugzilla-Version: 9.4.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: tansheng at spacesoftwares dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 16 Mar 2022 09:42:07 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104951

Bug ID: 104951
Summary: avx512fintrin.h(9146): error: identifier "__builtin_ia32_rndscaless_round" is undefined
Product: gcc
Version: 9.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: tansheng at spacesoftwares dot com
Target Milestone: ---

Compiler error: I was trying to compile PaddlePaddle (both v2.2 and v2.1) with GCC 9.4.0. The log below shows the whole process and the relevant information.
It seems this is a problem in the compiler, not in the definitions or flags of the project.

-----------------
Basic information:
1) PaddlePaddle 2.2.2
2) CPU: i5-11400H
3) GPU: NVIDIA GeForce RTX 3050 Laptop GPU, cuda_11.4.3_470.82.01_linux, cudnn-11.4-linux-x64-v8.2.4.15
4) System environment: AsusLaptop TUF Gaming (FX506HCB) + Ubuntu 20.04 + Python 3.8.10
-----------------

GCC + make log:
https://paddle-inference.readthedocs.io/en/latest/user_guides/source_compile.html#ubuntu-18-04

(1) Download and configure

git clone https://github.com/PaddlePaddle/Paddle.git
cd Paddle
git checkout release/2.2
mkdir build_cuda && cd build_cuda
cmake .. -DPY_VERSION=3 -DWITH_TESTING=OFF -DWITH_MKL=ON -DWITH_GPU=ON -DON_INFER=ON ..

(2) make

make -j12
[ 0%] Built target extern_gflags
[ 1%] Built target extern_zlib
[ 1%] Built target extern_lapack
[ 1%] Built target extern_utf8proc
[ 1%] Built target extern_warpctc
[ 1%] Built target extern_dlpack
[ 1%] Built target extern_boost
[ 2%] Built target extern_threadpool
[ 3%] Built target extern_eigen3
[ 3%] copy_if_different /home/mc/ocr/Paddle/build_cuda/paddle/fluid/operators/jit/kernels.h
[ 3%] Built target copy_kernels_command
[ 3%] copy_if_different /home/mc/ocr/Paddle/build_cuda/paddle/fluid/inference/api/paddle_inference_pass.h
[ 3%] copy_if_different /home/mc/ocr/Paddle/build_cuda/paddle/fluid/pybind/pybind.h
[ 3%] Built target download_externalError
[ 3%] Built target extern_xbyak
[ 3%] Built target extern_gloo
[ 3%] Built target copy_paddle_inference_pass_command
[ 3%] Built target copy_pybind_command
[ 3%] Built target extern_cryptopp
[ 3%] Built target extern_pocketfft
[ 3%] Built target extern_protobuf
[ 3%] Built target extern_pybind
[ 4%] Built target extern_mkldnn
[ 4%] Built target profiler_py_proto_init
[ 4%] Built target extern_glog
[ 4%] Built target extern_mklml
Consolidate compiler generated dependencies of target heter_service_proto
Consolidate
compiler generated dependencies of target error_codes_proto
[ 4%] Built target framework_py_proto_init
Consolidate compiler generated dependencies of target data_feed_proto
Consolidate compiler generated dependencies of target framework_proto
Consolidate compiler generated dependencies of target external_error_proto
Copy generated python proto into directory paddle/fluid/proto/profiler.
[ 4%] Built target mkldnn_cmd
[ 4%] Built target trainer_py_proto
[ 5%] Built target distributed_strategy_py_proto
[ 5%] Built target pass_desc_py_proto
Consolidate compiler generated dependencies of target cudnn_workspace_helper
Consolidate compiler generated dependencies of target mkldnn
[ 5%] Built target mkldnn
[ 5%] Built target profiler_py_proto
[ 5%] Built target fleet_proto_init
Consolidate compiler generated dependencies of target monitor
Consolidate compiler generated dependencies of target version
Consolidate compiler generated dependencies of target denormal
Consolidate compiler generated dependencies of target shape_range_info_proto
[ 5%] Built target cudnn_workspace_helper
[ 5%] Built target version
[ 5%] Built target denormal
[ 5%] Built target error_codes_proto
[ 5%] Built target heter_service_proto
[ 5%] Built target framework_proto
[ 5%] Built target data_feed_proto
[ 5%] Built target external_error_proto
Copy generated python proto into directory paddle/fluid/proto.
[ 5%] Built target shape_range_info_proto
Consolidate compiler generated dependencies of target timer
Consolidate compiler generated dependencies of target pass_desc_proto
[ 5%] Built target pass_desc_proto
Consolidate compiler generated dependencies of target flags
Consolidate compiler generated dependencies of target string_array
Consolidate compiler generated dependencies of target op_def_proto
[ 5%] Built target op_def_proto
[ 5%] Built target monitor
[ 5%] Built target timer
Consolidate compiler generated dependencies of target zero_copy_tensor_dummy
[ 6%] Built target zero_copy_tensor_dummy
Consolidate compiler generated dependencies of target profiler_proto
[ 6%] Built target profiler_proto
Consolidate compiler generated dependencies of target trainer_desc_proto
[ 6%] Built target trainer_desc_proto
Consolidate compiler generated dependencies of target errors
Consolidate compiler generated dependencies of target op_version_proto
[ 6%] Built target errors
[ 6%] Built target op_version_proto
Consolidate compiler generated dependencies of target table_printer
[ 7%] Built target table_printer
[ 7%] Built target string_array
[ 7%] Built target extern_xxhash
Consolidate compiler generated dependencies of target paddle_pass_builder
Consolidate compiler generated dependencies of target op_def_api
[ 7%] Built target third_party
[ 7%] Built target op_def_api
[ 7%] Built target framework_py_proto
Consolidate compiler generated dependencies of target prune
Consolidate compiler generated dependencies of target op_version_registry
Consolidate compiler generated dependencies of target activation_functions
[ 7%] Built target paddle_pass_builder
[ 7%] Built target activation_functions
[ 7%] Built target flags
Consolidate compiler generated dependencies of target enforce
[ 7%] Built target enforce
Consolidate compiler generated dependencies of target imperative_profiler
[ 7%] Built target imperative_profiler
Consolidate compiler generated dependencies of target cuda_stream
Consolidate compiler generated dependencies of target imperative_flag
[ 7%] Built target imperative_flag
Consolidate compiler generated dependencies of target place
Consolidate compiler generated dependencies of target cpu_info
Consolidate compiler generated dependencies of target dynamic_loader
Consolidate compiler generated dependencies of target string_helper
[ 7%] Built target string_helper
Consolidate compiler generated dependencies of target stream_callback_manager
[ 7%] Built target cpu_info
Consolidate compiler generated dependencies of target ddim
Consolidate compiler generated dependencies of target pretty_log
[ 7%] Built target pretty_log
Consolidate compiler generated dependencies of target threadpool
Consolidate compiler generated dependencies of target stringpiece
[ 7%] Built target stringpiece
Consolidate compiler generated dependencies of target eigen_function
Consolidate compiler generated dependencies of target fs
Consolidate compiler generated dependencies of target shell
Consolidate compiler generated dependencies of target attribute
[ 7%] Built target prune
[ 7%] Built target attribute
[ 7%] Built target fs
[ 7%] Built target ddim
[ 7%] Built target threadpool
[ 7%] Built target place
[ 7%] Built target dynamic_loader
[ 7%] Built target stream_callback_manager
[ 7%] Built target shell
[ 7%] Built target op_version_registry
Consolidate compiler generated dependencies of target cuda_profiler
[ 7%] Built target cuda_profiler
Consolidate compiler generated dependencies of target workqueue
[ 7%] Built target workqueue
Consolidate compiler generated dependencies of target benchmark
Consolidate compiler generated dependencies of target dynload_lapack
[ 7%] Built target dynload_lapack
[ 7%] Built target benchmark
Consolidate compiler generated dependencies of target dynload_mklml
Consolidate compiler generated dependencies of target generator
Consolidate compiler generated dependencies of target data_loader
[ 7%] Built target dynload_mklml
[ 7%]
Built target generator
[ 7%] Built target data_loader
Consolidate compiler generated dependencies of target memory_block
Consolidate compiler generated dependencies of target cblas
[ 7%] Built target cblas
Consolidate compiler generated dependencies of target dynload_warpctc
[ 7%] Built target memory_block
Consolidate compiler generated dependencies of target cpu_helper
Consolidate compiler generated dependencies of target allocator
Consolidate compiler generated dependencies of target paddle_crypto
[ 7%] Built target cpu_helper
[ 7%] Built target allocator
[ 7%] Built target dynload_warpctc
[ 7%] Built target paddle_crypto
Consolidate compiler generated dependencies of target best_fit_allocator
Consolidate compiler generated dependencies of target retry_allocator
[ 7%] Built target best_fit_allocator
[ 7%] Built target retry_allocator
Consolidate compiler generated dependencies of target op_proto_maker
[ 7%] Built target op_proto_maker
Consolidate compiler generated dependencies of target locked_allocator
Consolidate compiler generated dependencies of target cpu_allocator
Consolidate compiler generated dependencies of target buffered_allocator
Consolidate compiler generated dependencies of target pinned_allocator
Consolidate compiler generated dependencies of target thread_local_allocator
[ 7%] Built target locked_allocator
[ 7%] Built target cpu_allocator
[ 7%] Built target buffered_allocator
[ 7%] Built target pinned_allocator
Consolidate compiler generated dependencies of target aligned_allocator
Consolidate compiler generated dependencies of target mmap_allocator
[ 7%] Built target thread_local_allocator
[ 7%] Built target mmap_allocator
[ 7%] Built target aligned_allocator
Consolidate compiler generated dependencies of target auto_growth_best_fit_allocator
Consolidate compiler generated dependencies of target op_call_stack
[ 7%] Built target auto_growth_best_fit_allocator
[ 7%] Built target op_call_stack
Consolidate compiler generated dependencies of target
dynload_cuda
[ 7%] Built target cuda_stream
[ 7%] Built target dynload_cuda
[ 7%] Built target eigen_function
Consolidate compiler generated dependencies of target jit_kernel_base
[ 7%] Built target jit_kernel_base
Consolidate compiler generated dependencies of target gpu_info
Consolidate compiler generated dependencies of target device_tracer
[ 7%] Built target gpu_info
[ 7%] Built target device_tracer
Consolidate compiler generated dependencies of target jit_kernel_refer
Consolidate compiler generated dependencies of target jit_kernel_mkl
[ 7%] Built target jit_kernel_refer
[ 7%] Built target jit_kernel_mkl
Consolidate compiler generated dependencies of target jit_kernel_intrinsic
Consolidate compiler generated dependencies of target device_memory_aligment
Consolidate compiler generated dependencies of target cuda_resource_pool
[ 7%] Built target jit_kernel_intrinsic
[ 7%] Built target device_memory_aligment
[ 7%] Built target cuda_resource_pool
Consolidate compiler generated dependencies of target system_allocator
[ 7%] Built target system_allocator
Consolidate compiler generated dependencies of target cuda_device_guard
[ 7%] Built target cuda_device_guard
Consolidate compiler generated dependencies of target jit_kernel_mix
[ 7%] Built target jit_kernel_mix
Consolidate compiler generated dependencies of target buddy_allocator
[ 7%] Built target buddy_allocator
Consolidate compiler generated dependencies of target cuda_allocator
[ 7%] Built target cuda_allocator
Consolidate compiler generated dependencies of target profiler
[ 8%] Built target profiler
Consolidate compiler generated dependencies of target naive_best_fit_allocator
[ 8%] Built target naive_best_fit_allocator
Consolidate compiler generated dependencies of target allocator_strategy
[ 8%] Built target allocator_strategy
Consolidate compiler generated dependencies of target allocator_facade
Consolidate compiler generated dependencies of target jit_kernel_jitcode
[ 8%] Built target allocator_facade
Consolidate compiler generated dependencies of target cuda_graph
[ 9%] Built target jit_kernel_jitcode
[ 9%] Built target cuda_graph
Consolidate compiler generated dependencies of target jit_kernel_helper
[ 9%] Built target jit_kernel_helper
Consolidate compiler generated dependencies of target malloc
[ 9%] Built target malloc
Consolidate compiler generated dependencies of target device_context
[ 10%] Built target device_context
Consolidate compiler generated dependencies of target data_type
Consolidate compiler generated dependencies of target op_kernel_type
[ 10%] Built target data_type
Consolidate compiler generated dependencies of target device_code
[ 10%] Built target op_kernel_type
[ 10%] Built target device_code
Consolidate compiler generated dependencies of target no_need_buffer_vars_inference
Consolidate compiler generated dependencies of target memcpy
[ 10%] Built target no_need_buffer_vars_inference
[ 10%] Built target memcpy
Consolidate compiler generated dependencies of target shape_inference
[ 10%] Built target shape_inference
Consolidate compiler generated dependencies of target mkldnn_axpy_handler
[ 10%] Built target mkldnn_axpy_handler
Consolidate compiler generated dependencies of target sequence_padding
Consolidate compiler generated dependencies of target blas
Consolidate compiler generated dependencies of target cos_sim_functor
Consolidate compiler generated dependencies of target im2col
[ 10%] Built target blas
[ 10%] Built target sequence_padding
Consolidate compiler generated dependencies of target sequence_scale
[ 10%] Built target cos_sim_functor
[ 10%] Built target im2col
Consolidate compiler generated dependencies of target cross_entropy
Consolidate compiler generated dependencies of target depthwise_conv
[ 10%] Built target sequence_scale
[ 10%] Built target cross_entropy
Consolidate compiler generated dependencies of target concat_and_split
Consolidate compiler generated dependencies of target lstm_compute
[ 10%] Built target
concat_and_split
[ 10%] Building CUDA object paddle/fluid/operators/math/CMakeFiles/lstm_compute.dir/lstm_compute.cu.o
Consolidate compiler generated dependencies of target maxouting
Consolidate compiler generated dependencies of target sample_prob
[ 10%] Building CXX object paddle/fluid/operators/math/CMakeFiles/fc.dir/fc.cc.o
[ 10%] Building CXX object paddle/fluid/operators/math/CMakeFiles/lapack_function.dir/lapack_function.cc.o
[ 10%] Building CXX object paddle/fluid/operators/math/CMakeFiles/matrix_bit_code.dir/matrix_bit_code.cc.o
[ 10%] Built target depthwise_conv
[ 10%] Built target maxouting
Consolidate compiler generated dependencies of target sampler
[ 10%] Built target sampler
Consolidate compiler generated dependencies of target heter_wrapper
[ 10%] Built target heter_wrapper
[ 10%] Building CUDA object paddle/fluid/operators/math/CMakeFiles/prelu.dir/prelu.cu.o
Consolidate compiler generated dependencies of target collective_helper
[ 10%] Building CXX object paddle/fluid/operators/math/CMakeFiles/unpooling.dir/unpooling.cc.o
[ 10%] Building CXX object paddle/fluid/operators/math/CMakeFiles/vol2col.dir/vol2col.cc.o
Consolidate compiler generated dependencies of target pooling
Consolidate compiler generated dependencies of target sequence2batch
[ 10%] Built target pooling
[ 10%] Built target sequence2batch
[ 10%] Building CUDA object paddle/fluid/operators/math/CMakeFiles/vol2col.dir/vol2col.cu.o
[ 10%] Building CUDA object paddle/fluid/operators/math/CMakeFiles/fc.dir/fc.cu.o
[ 10%] Built target sample_prob
[ 10%] Building CUDA object paddle/fluid/operators/math/CMakeFiles/bert_encoder_functor.dir/bert_encoder_functor.cu.o
[ 10%] Built target collective_helper
[ 10%] Building CXX object paddle/fluid/operators/math/CMakeFiles/matrix_inverse.dir/matrix_inverse.cc.o
[ 10%] Building CXX object paddle/fluid/operators/math/CMakeFiles/segment_pooling.dir/segment_pooling.cc.o
[ 10%] Linking CXX static library liblapack_function.a
[ 10%] Built target
lapack_function
[ 10%] Building CXX object paddle/fluid/operators/math/CMakeFiles/matrix_inverse.dir/matrix_inverse.cu.cc.o
/usr/lib/gcc/x86_64-linux-gnu/9/include/avx512fintrin.h(9146): error: identifier "__builtin_ia32_rndscaless_round" is undefined
/usr/lib/gcc/x86_64-linux-gnu/9/include/avx512fintrin.h(9155): error: identifier "__builtin_ia32_rndscalesd_round" is undefined
/usr/lib/gcc/x86_64-linux-gnu/9/include/avx512fintrin.h(14797): error: identifier "__builtin_ia32_rndscaless_round" is undefined
/usr/lib/gcc/x86_64-linux-gnu/9/include/avx512fintrin.h(14806): error: identifier "__builtin_ia32_rndscalesd_round" is undefined
/usr/lib/gcc/x86_64-linux-gnu/9/include/avx512dqintrin.h(1365): error: identifier "__builtin_ia32_fpclassss" is undefined
/usr/lib/gcc/x86_64-linux-gnu/9/include/avx512dqintrin.h(1372): error: identifier "__builtin_ia32_fpclasssd" is undefined
6 errors detected in the compilation of "/home/mc/ocr/Paddle/paddle/fluid/operators/math/lstm_compute.cu".
make[2]: *** [paddle/fluid/operators/math/CMakeFiles/lstm_compute.dir/build.make:90: paddle/fluid/operators/math/CMakeFiles/lstm_compute.dir/lstm_compute.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:77899: paddle/fluid/operators/math/CMakeFiles/lstm_compute.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 11%] Building CUDA object paddle/fluid/operators/math/CMakeFiles/segment_pooling.dir/segment_pooling.cu.o
[ 11%] Building CUDA object paddle/fluid/operators/math/CMakeFiles/unpooling.dir/unpooling.cu.o
[ 11%] Linking CXX static library libmatrix_bit_code.a
[ 11%] Built target matrix_bit_code
[ 11%] Linking CXX static library libmatrix_inverse.a
[ 11%] Built target matrix_inverse
[ 11%] Linking CXX static library libvol2col.a
[ 11%] Built target vol2col
[ 11%] Linking CXX static library libprelu.a
[ 11%] Built target prelu
[ 11%] Linking CXX static library libunpooling.a
[ 11%] Built target unpooling
[ 11%] Linking CXX static library libfc.a
[ 11%] Linking CXX static library libbert_encoder_functor.a
[ 11%] Built target bert_encoder_functor
[ 11%] Built target fc
[ 11%] Linking CXX static library libsegment_pooling.a
[ 11%] Built target segment_pooling
make: *** [Makefile:136: all] Error 2

-----------------------------------------
The information below is from PaddlePaddle 2.1
-----------------------------------------
With the same configuration, I tried a different Paddle version, but it did not help; Paddle 2.1 produces similar errors, mainly in the CUDA operator functions under fluid/operators/math. I am not sure whether this is caused by the CUDA version.

[ 13%] Built target blas
[ 13%] Building CUDA object paddle/fluid/operators/math/CMakeFiles/depthwise_conv.dir/depthwise_conv.cu.o
/usr/lib/gcc/x86_64-linux-gnu/9/include/avx512fintrin.h(9146): error: identifier "__builtin_ia32_rndscaless_round" is undefined
/usr/lib/gcc/x86_64-linux-gnu/9/include/avx512fintrin.h(9155): error: identifier
"__builtin_ia32_rndscalesd_round" is undefined /usr/lib/gcc/x86_64-linux-gnu/9/include/avx512fintrin.h(14797): error: identifier "__builtin_ia32_rndscaless_round" is undefined /usr/lib/gcc/x86_64-linux-gnu/9/include/avx512fintrin.h(14806): error: identifier "__builtin_ia32_rndscalesd_round" is undefined /usr/lib/gcc/x86_64-linux-gnu/9/include/avx512dqintrin.h(1365): error: identifier "__builtin_ia32_fpclassss" is undefined /usr/lib/gcc/x86_64-linux-gnu/9/include/avx512dqintrin.h(1372): error: identifier "__builtin_ia32_fpclasssd" is undefined=