From mboxrd@z Thu Jan 1 00:00:00 1970
From: Noah Goldstein
To: libc-alpha@sourceware.org
Subject: [PATCH v5 6/6] x86: Reduce code size of mem{move|pcpy|cpy}-ssse3
Date: Thu, 14 Apr 2022 11:47:40 -0500
Message-Id: <20220414164739.3146735-6-goldstein.w.n@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220414164739.3146735-1-goldstein.w.n@gmail.com>
References: <20220325183625.1170867-2-goldstein.w.n@gmail.com>
 <20220414164739.3146735-1-goldstein.w.n@gmail.com>

The goal is to remove most SSSE3 functions, as SSE4, AVX2, and EVEX are
generally preferable. memcpy/memmove is one exception where avoiding
unaligned loads with `palignr` is important for some targets.

This commit replaces memmove-ssse3 with a better optimized and lower
code footprint version. It also aliases memcpy to memmove. Aside from
this function, all other SSSE3 functions should be safe to remove.

The performance is not changed drastically, although it shows overall
improvements without any major regressions or gains.
bench-memcpy geometric_mean(N=50) New / Original: 0.957
bench-memcpy-random geometric_mean(N=50) New / Original: 0.912
bench-memcpy-large geometric_mean(N=50) New / Original: 0.892

Benchmarks were run on a Zhaoxin KX-6840 @ 2000MHz. See attached
numbers for all results.

More importantly, this saves 7246 bytes of code size in memmove and an
additional 10741 bytes by reusing the memmove code for memcpy (17987
bytes saved in total), as well as an additional 896 bytes of rodata
for the jump table entries.
---
Results For: bench-memcpy

length, align1, align2, dst > src, New Time / Old Time
1, 0, 0, 0, 0.946 1, 0, 0, 1, 0.946 1, 32, 0, 0, 0.948 1, 32, 0, 1, 1.185 1, 0, 32, 0, 0.982 1, 0, 32, 1, 1.14 1, 32, 32, 0, 0.981 1, 32, 32, 1, 1.057 1, 2048, 0, 0, 0.945 1, 2048, 0, 1, 0.945 2, 0, 0, 0, 1.041 2, 0, 0, 1, 1.041 2, 1, 0, 0, 1.044 2, 1, 0, 1, 1.044 2, 33, 0, 0, 1.044 2, 33, 0, 1, 1.044 2, 0, 1, 0, 1.041 2, 0, 1, 1, 1.041 2, 0, 33, 0, 1.042 2, 0, 33, 1, 1.041 2, 1, 1, 0, 1.041 2, 1, 1, 1, 1.041 2, 33, 33, 0, 1.041 2, 33, 33, 1, 1.041 2, 2048, 0, 0, 1.042 2, 2048, 0, 1, 1.041 2, 2049, 0, 0, 1.044 2, 2049, 0, 1, 1.044 2, 2048, 1, 0, 1.041 2, 2048, 1, 1, 1.042 2, 2049, 1, 0, 1.042 2, 2049, 1, 1, 1.042 4, 0, 0, 0, 0.962 4, 0, 0, 1, 0.962 4, 2, 0, 0, 0.98 4, 2, 0, 1, 0.984 4, 34, 0, 0, 0.986 4, 34, 0, 1, 0.987 4, 0, 2, 0, 0.962 4, 0, 2, 1, 0.962 4, 0, 34, 0, 0.962 4, 0, 34, 1, 0.962 4, 2, 2, 0, 0.962 4, 2, 2, 1, 0.962 4, 34, 34, 0, 0.962 4, 34, 34, 1, 0.962 4, 2048, 0, 0, 0.962 4, 2048, 0, 1, 0.962 4, 2050, 0, 0, 0.996 4, 2050, 0, 1, 1.0 4, 2048, 2, 0, 0.962 4, 2048, 2, 1, 0.962 4, 2050, 2, 0, 0.962 4, 2050, 2, 1, 0.962 8, 0, 0, 0, 0.962 8, 0, 0, 1, 0.962 8, 3, 0, 0, 1.0 8, 3, 0, 1, 1.0 8, 35, 0, 0, 1.001 8, 35, 0, 1, 1.0 8, 0, 3, 0, 0.962 8, 0, 3, 1, 0.962 8, 0, 35, 0, 0.962 8, 0, 35, 1, 0.962 8, 3, 3, 0, 0.962 8, 3, 3, 1, 0.962 8, 35, 35, 0, 0.962 8, 35, 35, 1, 0.962 8, 2048, 0, 0, 0.962 8, 2048, 0, 1, 0.962 8, 2051, 0, 0, 1.0 8, 2051, 0, 1, 1.0 8, 2048, 3, 0, 0.962 8, 2048, 3, 1, 0.962 8,
2051, 3, 0, 0.962 8, 2051, 3, 1, 0.962 16, 0, 0, 0, 0.798 16, 0, 0, 1, 0.799 16, 4, 0, 0, 0.801 16, 4, 0, 1, 0.801 16, 36, 0, 0, 0.801 16, 36, 0, 1, 0.801 16, 0, 4, 0, 0.798 16, 0, 4, 1, 0.799 16, 0, 36, 0, 0.799 16, 0, 36, 1, 0.799 16, 4, 4, 0, 0.799 16, 4, 4, 1, 0.799 16, 36, 36, 0, 0.799 16, 36, 36, 1, 0.799 16, 2048, 0, 0, 0.799 16, 2048, 0, 1, 0.799 16, 2052, 0, 0, 0.801 16, 2052, 0, 1, 0.801 16, 2048, 4, 0, 0.798 16, 2048, 4, 1, 0.799 16, 2052, 4, 0, 0.799 16, 2052, 4, 1, 0.799 32, 0, 0, 0, 0.472 32, 0, 0, 1, 0.472 32, 5, 0, 0, 0.472 32, 5, 0, 1, 0.472 32, 37, 0, 0, 0.962 32, 37, 0, 1, 0.962 32, 0, 5, 0, 0.472 32, 0, 5, 1, 0.472 32, 0, 37, 0, 1.021 32, 0, 37, 1, 1.021 32, 5, 5, 0, 0.472 32, 5, 5, 1, 0.472 32, 37, 37, 0, 1.011 32, 37, 37, 1, 1.011 32, 2048, 0, 0, 0.472 32, 2048, 0, 1, 0.472 32, 2053, 0, 0, 0.472 32, 2053, 0, 1, 0.472 32, 2048, 5, 0, 0.472 32, 2048, 5, 1, 0.472 32, 2053, 5, 0, 0.472 32, 2053, 5, 1, 0.472 64, 0, 0, 0, 1.0 64, 0, 0, 1, 1.0 64, 6, 0, 0, 0.862 64, 6, 0, 1, 0.862 64, 38, 0, 0, 0.912 64, 38, 0, 1, 0.912 64, 0, 6, 0, 0.896 64, 0, 6, 1, 0.896 64, 0, 38, 0, 0.906 64, 0, 38, 1, 0.906 64, 6, 6, 0, 0.91 64, 6, 6, 1, 0.91 64, 38, 38, 0, 0.883 64, 38, 38, 1, 0.883 64, 2048, 0, 0, 1.0 64, 2048, 0, 1, 1.0 64, 2054, 0, 0, 0.862 64, 2054, 0, 1, 0.862 64, 2048, 6, 0, 0.887 64, 2048, 6, 1, 0.887 64, 2054, 6, 0, 0.887 64, 2054, 6, 1, 0.887 128, 0, 0, 0, 0.857 128, 0, 0, 1, 0.857 128, 7, 0, 0, 0.875 128, 7, 0, 1, 0.875 128, 39, 0, 0, 0.892 128, 39, 0, 1, 0.892 128, 0, 7, 0, 1.183 128, 0, 7, 1, 1.183 128, 0, 39, 0, 1.113 128, 0, 39, 1, 1.113 128, 7, 7, 0, 0.692 128, 7, 7, 1, 0.692 128, 39, 39, 0, 1.104 128, 39, 39, 1, 1.104 128, 2048, 0, 0, 0.857 128, 2048, 0, 1, 0.857 128, 2055, 0, 0, 0.875 128, 2055, 0, 1, 0.875 128, 2048, 7, 0, 0.959 128, 2048, 7, 1, 0.959 128, 2055, 7, 0, 1.036 128, 2055, 7, 1, 1.036 256, 0, 0, 0, 0.889 256, 0, 0, 1, 0.889 256, 8, 0, 0, 0.966 256, 8, 0, 1, 0.966 256, 40, 0, 0, 0.983 256, 40, 0, 1, 0.983 256, 0, 8, 0, 1.29 256, 0, 
8, 1, 1.29 256, 0, 40, 0, 1.274 256, 0, 40, 1, 1.274 256, 8, 8, 0, 0.865 256, 8, 8, 1, 0.865 256, 40, 40, 0, 1.477 256, 40, 40, 1, 1.477 256, 2048, 0, 0, 0.889 256, 2048, 0, 1, 0.889 256, 2056, 0, 0, 0.966 256, 2056, 0, 1, 0.966 256, 2048, 8, 0, 0.952 256, 2048, 8, 1, 0.952 256, 2056, 8, 0, 0.878 256, 2056, 8, 1, 0.878 512, 0, 0, 0, 1.077 512, 0, 0, 1, 1.077 512, 9, 0, 0, 1.0 512, 9, 0, 1, 1.0 512, 41, 0, 0, 0.954 512, 41, 0, 1, 0.954 512, 0, 9, 0, 1.191 512, 0, 9, 1, 1.191 512, 0, 41, 0, 1.181 512, 0, 41, 1, 1.181 512, 9, 9, 0, 0.765 512, 9, 9, 1, 0.765 512, 41, 41, 0, 0.905 512, 41, 41, 1, 0.905 512, 2048, 0, 0, 1.077 512, 2048, 0, 1, 1.077 512, 2057, 0, 0, 1.0 512, 2057, 0, 1, 1.0 512, 2048, 9, 0, 1.0 512, 2048, 9, 1, 1.0 512, 2057, 9, 0, 0.733 512, 2057, 9, 1, 0.733 1024, 0, 0, 0, 1.143 1024, 0, 0, 1, 1.143 1024, 10, 0, 0, 1.015 1024, 10, 0, 1, 1.015 1024, 42, 0, 0, 1.045 1024, 42, 0, 1, 1.045 1024, 0, 10, 0, 1.126 1024, 0, 10, 1, 1.126 1024, 0, 42, 0, 1.114 1024, 0, 42, 1, 1.114 1024, 10, 10, 0, 0.89 1024, 10, 10, 1, 0.89 1024, 42, 42, 0, 0.986 1024, 42, 42, 1, 0.986 1024, 2048, 0, 0, 1.143 1024, 2048, 0, 1, 1.143 1024, 2058, 0, 0, 1.015 1024, 2058, 0, 1, 1.015 1024, 2048, 10, 0, 1.03 1024, 2048, 10, 1, 1.03 1024, 2058, 10, 0, 0.854 1024, 2058, 10, 1, 0.854 2048, 0, 0, 0, 1.005 2048, 0, 0, 1, 1.005 2048, 11, 0, 0, 1.013 2048, 11, 0, 1, 1.014 2048, 43, 0, 0, 1.044 2048, 43, 0, 1, 1.044 2048, 0, 11, 0, 1.002 2048, 0, 11, 1, 1.003 2048, 0, 43, 0, 1.003 2048, 0, 43, 1, 1.003 2048, 11, 11, 0, 0.92 2048, 11, 11, 1, 0.92 2048, 43, 43, 0, 1.0 2048, 43, 43, 1, 1.0 2048, 2048, 0, 0, 1.005 2048, 2048, 0, 1, 1.005 2048, 2059, 0, 0, 0.904 2048, 2059, 0, 1, 0.904 2048, 2048, 11, 0, 1.0 2048, 2048, 11, 1, 1.0 2048, 2059, 11, 0, 0.979 2048, 2059, 11, 1, 0.979 4096, 0, 0, 0, 1.014 4096, 0, 0, 1, 1.014 4096, 12, 0, 0, 0.855 4096, 12, 0, 1, 0.855 4096, 44, 0, 0, 0.857 4096, 44, 0, 1, 0.857 4096, 0, 12, 0, 0.932 4096, 0, 12, 1, 0.932 4096, 0, 44, 0, 0.932 4096, 0, 44, 1, 0.933 
4096, 12, 12, 0, 0.999 4096, 12, 12, 1, 0.999 4096, 44, 44, 0, 1.051 4096, 44, 44, 1, 1.051 4096, 2048, 0, 0, 1.014 4096, 2048, 0, 1, 1.014 4096, 2060, 0, 0, 0.967 4096, 2060, 0, 1, 0.967 4096, 2048, 12, 0, 0.769 4096, 2048, 12, 1, 0.769 4096, 2060, 12, 0, 0.943 4096, 2060, 12, 1, 0.943 8192, 0, 0, 0, 1.045 8192, 0, 0, 1, 1.046 8192, 13, 0, 0, 0.885 8192, 13, 0, 1, 0.885 8192, 45, 0, 0, 0.887 8192, 45, 0, 1, 0.887 8192, 0, 13, 0, 0.942 8192, 0, 13, 1, 0.942 8192, 0, 45, 0, 0.942 8192, 0, 45, 1, 0.942 8192, 13, 13, 0, 1.03 8192, 13, 13, 1, 1.029 8192, 45, 45, 0, 1.048 8192, 45, 45, 1, 1.049 8192, 2048, 0, 0, 1.048 8192, 2048, 0, 1, 1.048 8192, 2061, 0, 0, 1.011 8192, 2061, 0, 1, 1.011 8192, 2048, 13, 0, 0.789 8192, 2048, 13, 1, 0.788 8192, 2061, 13, 0, 0.991 8192, 2061, 13, 1, 0.992 16384, 0, 0, 0, 1.026 16384, 0, 0, 1, 1.011 16384, 14, 0, 0, 0.943 16384, 14, 0, 1, 0.95 16384, 46, 0, 0, 0.856 16384, 46, 0, 1, 0.86 16384, 0, 14, 0, 0.815 16384, 0, 14, 1, 0.817 16384, 0, 46, 0, 0.859 16384, 0, 46, 1, 0.867 16384, 14, 14, 0, 0.987 16384, 14, 14, 1, 0.979 16384, 46, 46, 0, 1.027 16384, 46, 46, 1, 1.031 16384, 2048, 0, 0, 1.078 16384, 2048, 0, 1, 1.084 16384, 2062, 0, 0, 0.851 16384, 2062, 0, 1, 0.85 16384, 2048, 14, 0, 0.935 16384, 2048, 14, 1, 0.932 16384, 2062, 14, 0, 1.015 16384, 2062, 14, 1, 1.012 32768, 0, 0, 0, 0.978 32768, 0, 0, 1, 0.979 32768, 15, 0, 0, 1.006 32768, 15, 0, 1, 1.006 32768, 47, 0, 0, 1.004 32768, 47, 0, 1, 1.004 32768, 0, 15, 0, 1.045 32768, 0, 15, 1, 1.045 32768, 0, 47, 0, 1.011 32768, 0, 47, 1, 1.011 32768, 15, 15, 0, 0.977 32768, 15, 15, 1, 0.977 32768, 47, 47, 0, 0.96 32768, 47, 47, 1, 0.96 32768, 2048, 0, 0, 0.978 32768, 2048, 0, 1, 0.978 32768, 2063, 0, 0, 1.004 32768, 2063, 0, 1, 1.004 32768, 2048, 15, 0, 1.036 32768, 2048, 15, 1, 1.036 32768, 2063, 15, 0, 0.978 32768, 2063, 15, 1, 0.978 65536, 0, 0, 0, 0.981 65536, 0, 0, 1, 0.981 65536, 16, 0, 0, 0.987 65536, 16, 0, 1, 0.987 65536, 48, 0, 0, 0.968 65536, 48, 0, 1, 0.968 65536, 0, 16, 0, 
1.014 65536, 0, 16, 1, 1.014 65536, 0, 48, 0, 0.984 65536, 0, 48, 1, 0.984 65536, 16, 16, 0, 1.01 65536, 16, 16, 1, 1.01 65536, 48, 48, 0, 0.968 65536, 48, 48, 1, 0.968 65536, 2048, 0, 0, 0.982 65536, 2048, 0, 1, 0.982 65536, 2064, 0, 0, 0.987 65536, 2064, 0, 1, 0.987 65536, 2048, 16, 0, 1.012 65536, 2048, 16, 1, 1.012 65536, 2064, 16, 0, 1.007 65536, 2064, 16, 1, 1.007 0, 0, 0, 0, 0.867 0, 2048, 0, 0, 0.867 0, 4095, 0, 0, 0.868 0, 0, 4095, 0, 0.866 1, 1, 0, 0, 1.108 1, 0, 1, 0, 0.946 1, 1, 1, 0, 0.946 1, 2049, 0, 0, 0.947 1, 2048, 1, 0, 0.945 1, 2049, 1, 0, 0.945 1, 4095, 0, 0, 1.482 1, 0, 4095, 0, 0.981 2, 2, 0, 0, 1.044 2, 0, 2, 0, 1.041 2, 2, 2, 0, 1.041 2, 2050, 0, 0, 1.044 2, 2048, 2, 0, 1.042 2, 2050, 2, 0, 1.041 2, 4095, 0, 0, 1.057 2, 0, 4095, 0, 1.022 3, 0, 0, 0, 0.899 3, 3, 0, 0, 0.902 3, 0, 3, 0, 0.9 3, 3, 3, 0, 0.9 3, 2048, 0, 0, 0.9 3, 2051, 0, 0, 0.902 3, 2048, 3, 0, 0.9 3, 2051, 3, 0, 0.9 3, 4095, 0, 0, 0.261 3, 0, 4095, 0, 0.211 4, 4, 0, 0, 0.965 4, 0, 4, 0, 0.962 4, 4, 4, 0, 0.962 4, 2052, 0, 0, 0.969 4, 2048, 4, 0, 0.962 4, 2052, 4, 0, 0.962 4, 4095, 0, 0, 1.971 4, 0, 4095, 0, 1.988 5, 0, 0, 0, 0.898 5, 5, 0, 0, 0.9 5, 0, 5, 0, 0.898 5, 5, 5, 0, 0.898 5, 2048, 0, 0, 0.898 5, 2053, 0, 0, 0.9 5, 2048, 5, 0, 0.898 5, 2053, 5, 0, 0.898 5, 4095, 0, 0, 0.935 5, 0, 4095, 0, 1.02 6, 0, 0, 0, 0.898 6, 6, 0, 0, 0.9 6, 0, 6, 0, 0.898 6, 6, 6, 0, 0.898 6, 2048, 0, 0, 0.898 6, 2054, 0, 0, 0.9 6, 2048, 6, 0, 0.898 6, 2054, 6, 0, 0.898 6, 4095, 0, 0, 0.935 6, 0, 4095, 0, 1.021 7, 0, 0, 0, 0.898 7, 7, 0, 0, 0.9 7, 0, 7, 0, 0.898 7, 7, 7, 0, 0.898 7, 2048, 0, 0, 0.898 7, 2055, 0, 0, 0.9 7, 2048, 7, 0, 0.898 7, 2055, 7, 0, 0.898 7, 4095, 0, 0, 0.935 7, 0, 4095, 0, 1.021 8, 8, 0, 0, 1.001 8, 0, 8, 0, 0.962 8, 8, 8, 0, 0.962 8, 2056, 0, 0, 1.0 8, 2048, 8, 0, 0.962 8, 2056, 8, 0, 0.962 8, 4095, 0, 0, 1.971 8, 0, 4095, 0, 1.988 9, 0, 0, 0, 0.898 9, 9, 0, 0, 0.9 9, 0, 9, 0, 0.899 9, 9, 9, 0, 0.899 9, 2048, 0, 0, 0.899 9, 2057, 0, 0, 0.9 9, 2048, 9, 0, 0.899 9, 2057, 9, 
0, 0.899 9, 4095, 0, 0, 0.935 9, 0, 4095, 0, 1.019 10, 0, 0, 0, 0.898 10, 10, 0, 0, 0.9 10, 0, 10, 0, 0.899 10, 10, 10, 0, 0.899 10, 2048, 0, 0, 0.899 10, 2058, 0, 0, 0.9 10, 2048, 10, 0, 0.899 10, 2058, 10, 0, 0.899 10, 4095, 0, 0, 0.935 10, 0, 4095, 0, 1.02 11, 0, 0, 0, 0.898 11, 11, 0, 0, 0.9 11, 0, 11, 0, 0.899 11, 11, 11, 0, 0.899 11, 2048, 0, 0, 0.899 11, 2059, 0, 0, 0.9 11, 2048, 11, 0, 0.899 11, 2059, 11, 0, 0.899 11, 4095, 0, 0, 0.935 11, 0, 4095, 0, 1.02 12, 0, 0, 0, 0.898 12, 12, 0, 0, 0.9 12, 0, 12, 0, 0.899 12, 12, 12, 0, 0.899 12, 2048, 0, 0, 0.899 12, 2060, 0, 0, 0.9 12, 2048, 12, 0, 0.899 12, 2060, 12, 0, 0.899 12, 4095, 0, 0, 0.935 12, 0, 4095, 0, 1.018 13, 0, 0, 0, 0.897 13, 13, 0, 0, 0.901 13, 0, 13, 0, 0.898 13, 13, 13, 0, 0.898 13, 2048, 0, 0, 0.898 13, 2061, 0, 0, 0.9 13, 2048, 13, 0, 0.898 13, 2061, 13, 0, 0.898 13, 4095, 0, 0, 0.935 13, 0, 4095, 0, 1.019 14, 0, 0, 0, 0.897 14, 14, 0, 0, 0.9 14, 0, 14, 0, 0.898 14, 14, 14, 0, 0.898 14, 2048, 0, 0, 0.898 14, 2062, 0, 0, 0.9 14, 2048, 14, 0, 0.898 14, 2062, 14, 0, 0.898 14, 4095, 0, 0, 0.935 14, 0, 4095, 0, 1.02 15, 0, 0, 0, 0.897 15, 15, 0, 0, 0.901 15, 0, 15, 0, 0.898 15, 15, 15, 0, 0.898 15, 2048, 0, 0, 0.898 15, 2063, 0, 0, 0.9 15, 2048, 15, 0, 0.898 15, 2063, 15, 0, 0.898 15, 4095, 0, 0, 0.935 15, 0, 4095, 0, 1.02 16, 16, 0, 0, 0.801 16, 0, 16, 0, 0.799 16, 16, 16, 0, 0.799 16, 2064, 0, 0, 0.801 16, 2048, 16, 0, 0.799 16, 2064, 16, 0, 0.799 16, 4095, 0, 0, 1.818 16, 0, 4095, 0, 1.957 17, 0, 0, 0, 0.798 17, 17, 0, 0, 0.801 17, 0, 17, 0, 0.799 17, 17, 17, 0, 0.799 17, 2048, 0, 0, 0.799 17, 2065, 0, 0, 0.801 17, 2048, 17, 0, 0.799 17, 2065, 17, 0, 0.799 17, 4095, 0, 0, 0.938 17, 0, 4095, 0, 1.021 18, 0, 0, 0, 0.798 18, 18, 0, 0, 0.801 18, 0, 18, 0, 0.799 18, 18, 18, 0, 0.799 18, 2048, 0, 0, 0.799 18, 2066, 0, 0, 0.801 18, 2048, 18, 0, 0.799 18, 2066, 18, 0, 0.799 18, 4095, 0, 0, 0.938 18, 0, 4095, 0, 1.021 19, 0, 0, 0, 0.798 19, 19, 0, 0, 0.801 19, 0, 19, 0, 0.799 19, 19, 19, 0, 0.799 19, 
2048, 0, 0, 0.799 19, 2067, 0, 0, 0.801 19, 2048, 19, 0, 0.799 19, 2067, 19, 0, 0.799 19, 4095, 0, 0, 0.938 19, 0, 4095, 0, 1.021 20, 0, 0, 0, 0.798 20, 20, 0, 0, 0.801 20, 0, 20, 0, 0.799 20, 20, 20, 0, 0.799 20, 2048, 0, 0, 0.799 20, 2068, 0, 0, 0.801 20, 2048, 20, 0, 0.799 20, 2068, 20, 0, 0.799 20, 4095, 0, 0, 0.937 20, 0, 4095, 0, 1.021 21, 0, 0, 0, 0.798 21, 21, 0, 0, 0.801 21, 0, 21, 0, 0.799 21, 21, 21, 0, 0.799 21, 2048, 0, 0, 0.799 21, 2069, 0, 0, 0.801 21, 2048, 21, 0, 0.799 21, 2069, 21, 0, 0.799 21, 4095, 0, 0, 0.938 21, 0, 4095, 0, 1.021 22, 0, 0, 0, 0.798 22, 22, 0, 0, 0.801 22, 0, 22, 0, 0.799 22, 22, 22, 0, 0.799 22, 2048, 0, 0, 0.799 22, 2070, 0, 0, 0.801 22, 2048, 22, 0, 0.799 22, 2070, 22, 0, 0.799 22, 4095, 0, 0, 0.938 22, 0, 4095, 0, 1.021 23, 0, 0, 0, 0.798 23, 23, 0, 0, 0.801 23, 0, 23, 0, 0.799 23, 23, 23, 0, 0.799 23, 2048, 0, 0, 0.799 23, 2071, 0, 0, 0.801 23, 2048, 23, 0, 0.799 23, 2071, 23, 0, 0.799 23, 4095, 0, 0, 0.938 23, 0, 4095, 0, 1.021 24, 0, 0, 0, 0.798 24, 24, 0, 0, 0.801 24, 0, 24, 0, 0.799 24, 24, 24, 0, 0.799 24, 2048, 0, 0, 0.799 24, 2072, 0, 0, 0.801 24, 2048, 24, 0, 0.799 24, 2072, 24, 0, 0.799 24, 4095, 0, 0, 0.937 24, 0, 4095, 0, 1.021 25, 0, 0, 0, 0.501 25, 25, 0, 0, 0.502 25, 0, 25, 0, 0.502 25, 25, 25, 0, 0.501 25, 2048, 0, 0, 0.501 25, 2073, 0, 0, 0.502 25, 2048, 25, 0, 0.502 25, 2073, 25, 0, 0.501 25, 4095, 0, 0, 0.974 25, 0, 4095, 0, 0.98 26, 0, 0, 0, 0.501 26, 26, 0, 0, 0.502 26, 0, 26, 0, 0.502 26, 26, 26, 0, 0.501 26, 2048, 0, 0, 0.501 26, 2074, 0, 0, 0.502 26, 2048, 26, 0, 0.502 26, 2074, 26, 0, 0.501 26, 4095, 0, 0, 0.974 26, 0, 4095, 0, 1.0 27, 0, 0, 0, 0.501 27, 27, 0, 0, 0.502 27, 0, 27, 0, 0.502 27, 27, 27, 0, 0.501 27, 2048, 0, 0, 0.501 27, 2075, 0, 0, 0.502 27, 2048, 27, 0, 0.502 27, 2075, 27, 0, 0.501 27, 4095, 0, 0, 0.974 27, 0, 4095, 0, 1.0 28, 0, 0, 0, 0.501 28, 28, 0, 0, 0.502 28, 0, 28, 0, 0.502 28, 28, 28, 0, 0.501 28, 2048, 0, 0, 0.501 28, 2076, 0, 0, 0.502 28, 2048, 28, 0, 0.502 28, 2076, 28, 
0, 0.502 28, 4095, 0, 0, 0.974 28, 0, 4095, 0, 1.0 29, 0, 0, 0, 0.472 29, 29, 0, 0, 0.472 29, 0, 29, 0, 0.472 29, 29, 29, 0, 0.472 29, 2048, 0, 0, 0.472 29, 2077, 0, 0, 0.472 29, 2048, 29, 0, 0.472 29, 2077, 29, 0, 0.472 29, 4095, 0, 0, 0.974 29, 0, 4095, 0, 1.0 30, 0, 0, 0, 0.472 30, 30, 0, 0, 0.472 30, 0, 30, 0, 0.472 30, 30, 30, 0, 0.472 30, 2048, 0, 0, 0.472 30, 2078, 0, 0, 0.472 30, 2048, 30, 0, 0.472 30, 2078, 30, 0, 0.472 30, 4095, 0, 0, 0.974 30, 0, 4095, 0, 1.0 31, 0, 0, 0, 0.472 31, 31, 0, 0, 0.472 31, 0, 31, 0, 0.472 31, 31, 31, 0, 0.472 31, 2048, 0, 0, 0.472 31, 2079, 0, 0, 0.472 31, 2048, 31, 0, 0.472 31, 2079, 31, 0, 0.472 31, 4095, 0, 0, 0.974 31, 0, 4095, 0, 1.0 48, 0, 0, 0, 1.0 48, 0, 0, 1, 1.0 48, 3, 0, 0, 1.0 48, 3, 0, 1, 1.0 48, 0, 3, 0, 1.0 48, 0, 3, 1, 1.0 48, 3, 3, 0, 1.0 48, 3, 3, 1, 1.0 48, 2048, 0, 0, 1.0 48, 2048, 0, 1, 1.0 48, 2051, 0, 0, 1.0 48, 2051, 0, 1, 1.0 48, 2048, 3, 0, 1.0 48, 2048, 3, 1, 1.0 48, 2051, 3, 0, 1.0 48, 2051, 3, 1, 1.0 80, 0, 0, 0, 0.781 80, 0, 0, 1, 0.782 80, 5, 0, 0, 0.976 80, 5, 0, 1, 0.976 80, 0, 5, 0, 1.232 80, 0, 5, 1, 1.232 80, 5, 5, 0, 1.542 80, 5, 5, 1, 1.543 80, 2048, 0, 0, 0.781 80, 2048, 0, 1, 0.782 80, 2053, 0, 0, 0.976 80, 2053, 0, 1, 0.976 80, 2048, 5, 0, 1.093 80, 2048, 5, 1, 1.093 80, 2053, 5, 0, 1.371 80, 2053, 5, 1, 1.371 96, 0, 0, 0, 0.758 96, 0, 0, 1, 0.758 96, 6, 0, 0, 0.929 96, 6, 0, 1, 0.929 96, 0, 6, 0, 1.204 96, 0, 6, 1, 1.204 96, 6, 6, 0, 1.559 96, 6, 6, 1, 1.562 96, 2048, 0, 0, 0.758 96, 2048, 0, 1, 0.758 96, 2054, 0, 0, 0.929 96, 2054, 0, 1, 0.929 96, 2048, 6, 0, 1.068 96, 2048, 6, 1, 1.068 96, 2054, 6, 0, 1.562 96, 2054, 6, 1, 1.562 112, 0, 0, 0, 0.736 112, 0, 0, 1, 0.736 112, 7, 0, 0, 0.675 112, 7, 0, 1, 0.675 112, 0, 7, 0, 0.778 112, 0, 7, 1, 0.778 112, 7, 7, 0, 0.909 112, 7, 7, 1, 0.909 112, 2048, 0, 0, 0.736 112, 2048, 0, 1, 0.736 112, 2055, 0, 0, 0.675 112, 2055, 0, 1, 0.675 112, 2048, 7, 0, 0.778 112, 2048, 7, 1, 0.778 112, 2055, 7, 0, 0.909 112, 2055, 7, 1, 0.909 144, 0, 0, 0, 
0.857 144, 0, 0, 1, 0.857 144, 9, 0, 0, 0.939 144, 9, 0, 1, 0.939 144, 0, 9, 0, 1.137 144, 0, 9, 1, 1.137 144, 9, 9, 0, 1.514 144, 9, 9, 1, 1.514 144, 2048, 0, 0, 0.857 144, 2048, 0, 1, 0.857 144, 2057, 0, 0, 0.939 144, 2057, 0, 1, 0.939 144, 2048, 9, 0, 0.922 144, 2048, 9, 1, 0.922 144, 2057, 9, 0, 1.514 144, 2057, 9, 1, 1.514 160, 0, 0, 0, 0.698 160, 0, 0, 1, 0.698 160, 10, 0, 0, 0.91 160, 10, 0, 1, 0.91 160, 0, 10, 0, 1.211 160, 0, 10, 1, 1.212 160, 10, 10, 0, 1.357 160, 10, 10, 1, 1.357 160, 2048, 0, 0, 0.698 160, 2048, 0, 1, 0.698 160, 2058, 0, 0, 0.91 160, 2058, 0, 1, 0.91 160, 2048, 10, 0, 0.923 160, 2048, 10, 1, 0.923 160, 2058, 10, 0, 1.357 160, 2058, 10, 1, 1.357 176, 0, 0, 0, 0.796 176, 0, 0, 1, 0.796 176, 11, 0, 0, 0.804 176, 11, 0, 1, 0.804 176, 0, 11, 0, 0.774 176, 0, 11, 1, 0.774 176, 11, 11, 0, 0.814 176, 11, 11, 1, 0.814 176, 2048, 0, 0, 0.796 176, 2048, 0, 1, 0.796 176, 2059, 0, 0, 0.804 176, 2059, 0, 1, 0.804 176, 2048, 11, 0, 0.774 176, 2048, 11, 1, 0.774 176, 2059, 11, 0, 0.814 176, 2059, 11, 1, 0.814 192, 0, 0, 0, 0.778 192, 0, 0, 1, 0.778 192, 12, 0, 0, 0.881 192, 12, 0, 1, 0.881 192, 0, 12, 0, 1.167 192, 0, 12, 1, 1.167 192, 12, 12, 0, 0.841 192, 12, 12, 1, 0.841 192, 2048, 0, 0, 0.778 192, 2048, 0, 1, 0.778 192, 2060, 0, 0, 0.881 192, 2060, 0, 1, 0.881 192, 2048, 12, 0, 0.889 192, 2048, 12, 1, 0.889 192, 2060, 12, 0, 0.906 192, 2060, 12, 1, 0.906 208, 0, 0, 0, 0.833 208, 0, 0, 1, 0.833 208, 13, 0, 0, 0.921 208, 13, 0, 1, 0.921 208, 0, 13, 0, 1.003 208, 0, 13, 1, 0.85 208, 13, 13, 0, 1.333 208, 13, 13, 1, 1.333 208, 2048, 0, 0, 0.834 208, 2048, 0, 1, 0.833 208, 2061, 0, 0, 0.921 208, 2061, 0, 1, 0.921 208, 2048, 13, 0, 0.833 208, 2048, 13, 1, 0.833 208, 2061, 13, 0, 1.333 208, 2061, 13, 1, 1.333 224, 0, 0, 0, 0.93 224, 0, 0, 1, 0.93 224, 14, 0, 0, 1.0 224, 14, 0, 1, 1.0 224, 0, 14, 0, 1.15 224, 0, 14, 1, 1.15 224, 14, 14, 0, 1.452 224, 14, 14, 1, 1.452 224, 2048, 0, 0, 0.93 224, 2048, 0, 1, 0.93 224, 2062, 0, 0, 1.0 224, 2062, 0, 1, 1.0 224, 
2048, 14, 0, 0.833 224, 2048, 14, 1, 0.833 224, 2062, 14, 0, 1.452 224, 2062, 14, 1, 1.452 240, 0, 0, 0, 0.909 240, 0, 0, 1, 0.909 240, 15, 0, 0, 0.797 240, 15, 0, 1, 0.797 240, 0, 15, 0, 0.771 240, 0, 15, 1, 0.771 240, 15, 15, 0, 0.93 240, 15, 15, 1, 0.93 240, 2048, 0, 0, 0.909 240, 2048, 0, 1, 0.909 240, 2063, 0, 0, 0.797 240, 2063, 0, 1, 0.797 240, 2048, 15, 0, 0.771 240, 2048, 15, 1, 0.771 240, 2063, 15, 0, 0.93 240, 2063, 15, 1, 0.93 272, 0, 0, 0, 0.9 272, 0, 0, 1, 0.9 272, 17, 0, 0, 1.015 272, 17, 0, 1, 1.015 272, 0, 17, 0, 0.927 272, 0, 17, 1, 0.927 272, 17, 17, 0, 0.892 272, 17, 17, 1, 0.892 272, 2048, 0, 0, 0.9 272, 2048, 0, 1, 0.9 272, 2065, 0, 0, 1.015 272, 2065, 0, 1, 1.015 272, 2048, 17, 0, 0.927 272, 2048, 17, 1, 0.927 272, 2065, 17, 0, 0.878 272, 2065, 17, 1, 0.878 288, 0, 0, 0, 0.882 288, 0, 0, 1, 0.882 288, 18, 0, 0, 0.803 288, 18, 0, 1, 0.803 288, 0, 18, 0, 0.768 288, 0, 18, 1, 0.768 288, 18, 18, 0, 0.882 288, 18, 18, 1, 0.882 288, 2048, 0, 0, 0.882 288, 2048, 0, 1, 0.882 288, 2066, 0, 0, 0.803 288, 2066, 0, 1, 0.803 288, 2048, 18, 0, 0.768 288, 2048, 18, 1, 0.768 288, 2066, 18, 0, 0.882 288, 2066, 18, 1, 0.882 304, 0, 0, 0, 0.865 304, 0, 0, 1, 0.866 304, 19, 0, 0, 0.944 304, 19, 0, 1, 0.944 304, 0, 19, 0, 0.943 304, 0, 19, 1, 0.943 304, 19, 19, 0, 0.956 304, 19, 19, 1, 0.956 304, 2048, 0, 0, 0.865 304, 2048, 0, 1, 0.865 304, 2067, 0, 0, 0.944 304, 2067, 0, 1, 0.944 304, 2048, 19, 0, 0.943 304, 2048, 19, 1, 0.943 304, 2067, 19, 0, 0.947 304, 2067, 19, 1, 0.947 320, 0, 0, 0, 0.944 320, 0, 0, 1, 0.944 320, 20, 0, 0, 0.962 320, 20, 0, 1, 0.962 320, 0, 20, 0, 1.214 320, 0, 20, 1, 1.214 320, 20, 20, 0, 1.365 320, 20, 20, 1, 1.365 320, 2048, 0, 0, 0.944 320, 2048, 0, 1, 0.944 320, 2068, 0, 0, 0.962 320, 2068, 0, 1, 0.962 320, 2048, 20, 0, 0.914 320, 2048, 20, 1, 0.914 320, 2068, 20, 0, 1.365 320, 2068, 20, 1, 1.365 336, 0, 0, 0, 1.0 336, 0, 0, 1, 1.0 336, 21, 0, 0, 0.986 336, 21, 0, 1, 0.986 336, 0, 21, 0, 0.853 336, 0, 21, 1, 0.853 336, 21, 21, 0, 
0.843 336, 21, 21, 1, 0.843 336, 2048, 0, 0, 1.0 336, 2048, 0, 1, 1.0 336, 2069, 0, 0, 0.986 336, 2069, 0, 1, 0.986 336, 2048, 21, 0, 0.853 336, 2048, 21, 1, 0.853 336, 2069, 21, 0, 0.831 336, 2069, 21, 1, 0.831 352, 0, 0, 0, 0.98 352, 0, 0, 1, 0.98 352, 22, 0, 0, 0.811 352, 22, 0, 1, 0.811 352, 0, 22, 0, 0.882 352, 0, 22, 1, 0.882 352, 22, 22, 0, 1.1 352, 22, 22, 1, 1.1 352, 2048, 0, 0, 0.98 352, 2048, 0, 1, 0.98 352, 2070, 0, 0, 0.811 352, 2070, 0, 1, 0.811 352, 2048, 22, 0, 0.882 352, 2048, 22, 1, 0.882 352, 2070, 22, 0, 1.1 352, 2070, 22, 1, 1.1 368, 0, 0, 0, 1.058 368, 0, 0, 1, 1.058 368, 23, 0, 0, 1.0 368, 23, 0, 1, 1.0 368, 0, 23, 0, 0.948 368, 0, 23, 1, 0.948 368, 23, 23, 0, 0.723 368, 23, 23, 1, 0.723 368, 2048, 0, 0, 1.058 368, 2048, 0, 1, 1.058 368, 2071, 0, 0, 1.0 368, 2071, 0, 1, 1.0 368, 2048, 23, 0, 0.948 368, 2048, 23, 1, 0.948 368, 2071, 23, 0, 0.701 368, 2071, 23, 1, 0.701 384, 0, 0, 0, 1.012 384, 0, 0, 1, 1.012 384, 24, 0, 0, 1.04 384, 24, 0, 1, 1.04 384, 0, 24, 0, 1.154 384, 0, 24, 1, 1.154 384, 24, 24, 0, 1.423 384, 24, 24, 1, 1.423 384, 2048, 0, 0, 1.012 384, 2048, 0, 1, 1.012 384, 2072, 0, 0, 1.04 384, 2072, 0, 1, 1.04 384, 2048, 24, 0, 0.91 384, 2048, 24, 1, 0.91 384, 2072, 24, 0, 1.423 384, 2072, 24, 1, 1.423 400, 0, 0, 0, 0.948 400, 0, 0, 1, 0.948 400, 25, 0, 0, 0.957 400, 25, 0, 1, 0.957 400, 0, 25, 0, 1.054 400, 0, 25, 1, 1.097 400, 25, 25, 0, 0.885 400, 25, 25, 1, 0.885 400, 2048, 0, 0, 0.948 400, 2048, 0, 1, 0.948 400, 2073, 0, 0, 0.957 400, 2073, 0, 1, 0.957 400, 2048, 25, 0, 0.94 400, 2048, 25, 1, 0.94 400, 2073, 25, 0, 0.908 400, 2073, 25, 1, 0.908 416, 0, 0, 0, 1.017 416, 0, 0, 1, 1.017 416, 26, 0, 0, 0.903 416, 26, 0, 1, 0.903 416, 0, 26, 0, 0.881 416, 0, 26, 1, 0.881 416, 26, 26, 0, 1.035 416, 26, 26, 1, 1.035 416, 2048, 0, 0, 1.017 416, 2048, 0, 1, 1.017 416, 2074, 0, 0, 0.903 416, 2074, 0, 1, 0.903 416, 2048, 26, 0, 0.881 416, 2048, 26, 1, 0.881 416, 2074, 26, 0, 1.035 416, 2074, 26, 1, 1.035 432, 0, 0, 0, 1.0 432, 0, 0, 1, 1.0 
432, 27, 0, 0, 0.933 432, 27, 0, 1, 0.933 432, 0, 27, 0, 0.941 432, 0, 27, 1, 0.941 432, 27, 27, 0, 0.953 432, 27, 27, 1, 0.954 432, 2048, 0, 0, 1.0 432, 2048, 0, 1, 1.0 432, 2075, 0, 0, 0.933 432, 2075, 0, 1, 0.933 432, 2048, 27, 0, 0.941 432, 2048, 27, 1, 0.941 432, 2075, 27, 0, 0.93 432, 2075, 27, 1, 0.93 448, 0, 0, 0, 0.984 448, 0, 0, 1, 0.984 448, 28, 0, 0, 0.896 448, 28, 0, 1, 0.896 448, 0, 28, 0, 1.244 448, 0, 28, 1, 1.244 448, 28, 28, 0, 1.333 448, 28, 28, 1, 1.333 448, 2048, 0, 0, 0.984 448, 2048, 0, 1, 0.984 448, 2076, 0, 0, 0.896 448, 2076, 0, 1, 0.896 448, 2048, 28, 0, 0.988 448, 2048, 28, 1, 0.988 448, 2076, 28, 0, 1.333 448, 2076, 28, 1, 1.333 464, 0, 0, 0, 1.083 464, 0, 0, 1, 1.083 464, 29, 0, 0, 0.978 464, 29, 0, 1, 0.978 464, 0, 29, 0, 0.924 464, 0, 29, 1, 0.924 464, 29, 29, 0, 0.901 464, 29, 29, 1, 0.901 464, 2048, 0, 0, 1.083 464, 2048, 0, 1, 1.083 464, 2077, 0, 0, 0.978 464, 2077, 0, 1, 0.978 464, 2048, 29, 0, 0.924 464, 2048, 29, 1, 0.924 464, 2077, 29, 0, 0.89 464, 2077, 29, 1, 0.89 480, 0, 0, 0, 1.066 480, 0, 0, 1, 1.066 480, 30, 0, 0, 0.9 480, 30, 0, 1, 0.9 480, 0, 30, 0, 0.88 480, 0, 30, 1, 0.88 480, 30, 30, 0, 1.083 480, 30, 30, 1, 1.083 480, 2048, 0, 0, 1.066 480, 2048, 0, 1, 1.066 480, 2078, 0, 0, 0.9 480, 2078, 0, 1, 0.9 480, 2048, 30, 0, 0.88 480, 2048, 30, 1, 0.88 480, 2078, 30, 0, 1.083 480, 2078, 30, 1, 1.083 496, 0, 0, 0, 1.032 496, 0, 0, 1, 1.032 496, 31, 0, 0, 0.95 496, 31, 0, 1, 0.95 496, 0, 31, 0, 1.011 496, 0, 31, 1, 1.011 496, 31, 31, 0, 0.973 496, 31, 31, 1, 0.973 496, 2048, 0, 0, 1.032 496, 2048, 0, 1, 1.032 496, 2079, 0, 0, 0.95 496, 2079, 0, 1, 0.95 496, 2048, 31, 0, 1.011 496, 2048, 31, 1, 1.011 496, 2079, 31, 0, 0.941 496, 2079, 31, 1, 0.941 1024, 32, 0, 0, 1.143 1024, 32, 0, 1, 1.143 1024, 0, 32, 0, 1.143 1024, 0, 32, 1, 1.143 1024, 32, 32, 0, 1.143 1024, 32, 32, 1, 1.143 1024, 2080, 0, 0, 1.143 1024, 2080, 0, 1, 1.143 1024, 2048, 32, 0, 1.143 1024, 2048, 32, 1, 1.143 1024, 2080, 32, 0, 1.143 1024, 2080, 32, 1, 1.143 
1056, 0, 0, 0, 1.165 1056, 0, 0, 1, 1.162 1056, 33, 0, 0, 1.067 1056, 33, 0, 1, 1.067 1056, 0, 33, 0, 0.977 1056, 0, 33, 1, 0.977 1056, 33, 33, 0, 1.043 1056, 33, 33, 1, 1.043 1056, 2048, 0, 0, 1.168 1056, 2048, 0, 1, 1.168 1056, 2081, 0, 0, 1.067 1056, 2081, 0, 1, 1.067 1056, 2048, 33, 0, 0.977 1056, 2048, 33, 1, 0.977 1056, 2081, 33, 0, 1.0 1056, 2081, 33, 1, 1.0 1088, 0, 0, 0, 1.171 1088, 0, 0, 1, 1.171 1088, 34, 0, 0, 1.041 1088, 34, 0, 1, 1.041 1088, 0, 34, 0, 1.079 1088, 0, 34, 1, 1.079 1088, 34, 34, 0, 0.966 1088, 34, 34, 1, 0.966 1088, 2048, 0, 0, 1.171 1088, 2048, 0, 1, 1.171 1088, 2082, 0, 0, 1.041 1088, 2082, 0, 1, 1.041 1088, 2048, 34, 0, 0.994 1088, 2048, 34, 1, 0.994 1088, 2082, 34, 0, 0.966 1088, 2082, 34, 1, 0.966 1120, 0, 0, 0, 1.154 1120, 0, 0, 1, 1.151 1120, 35, 0, 0, 1.051 1120, 35, 0, 1, 1.051 1120, 0, 35, 0, 1.0 1120, 0, 35, 1, 1.0 1120, 35, 35, 0, 1.068 1120, 35, 35, 1, 1.068 1120, 2048, 0, 0, 1.151 1120, 2048, 0, 1, 1.151 1120, 2083, 0, 0, 1.051 1120, 2083, 0, 1, 1.051 1120, 2048, 35, 0, 1.0 1120, 2048, 35, 1, 1.0 1120, 2083, 35, 0, 1.027 1120, 2083, 35, 1, 1.027 1152, 0, 0, 0, 1.159 1152, 0, 0, 1, 1.159 1152, 36, 0, 0, 1.034 1152, 36, 0, 1, 1.034 1152, 0, 36, 0, 1.07 1152, 0, 36, 1, 1.07 1152, 36, 36, 0, 0.967 1152, 36, 36, 1, 0.967 1152, 2048, 0, 0, 1.159 1152, 2048, 0, 1, 1.159 1152, 2084, 0, 0, 1.034 1152, 2084, 0, 1, 1.034 1152, 2048, 36, 0, 0.984 1152, 2048, 36, 1, 0.984 1152, 2084, 36, 0, 0.967 1152, 2084, 36, 1, 0.967 1184, 0, 0, 0, 1.157 1184, 0, 0, 1, 1.157 1184, 37, 0, 0, 1.066 1184, 37, 0, 1, 1.066 1184, 0, 37, 0, 0.993 1184, 0, 37, 1, 0.993 1184, 37, 37, 0, 1.08 1184, 37, 37, 1, 1.081 1184, 2048, 0, 0, 1.157 1184, 2048, 0, 1, 1.157 1184, 2085, 0, 0, 1.066 1184, 2085, 0, 1, 1.066 1184, 2048, 37, 0, 0.993 1184, 2048, 37, 1, 0.993 1184, 2085, 37, 0, 1.04 1184, 2085, 37, 1, 1.04 1216, 0, 0, 0, 1.139 1216, 0, 0, 1, 1.139 1216, 38, 0, 0, 1.024 1216, 38, 0, 1, 1.024 1216, 0, 38, 0, 1.086 1216, 0, 38, 1, 1.087 1216, 38, 38, 0, 1.0 1216, 
38, 38, 1, 1.0 1216, 2048, 0, 0, 1.138 1216, 2048, 0, 1, 1.138 1216, 2086, 0, 0, 1.024 1216, 2086, 0, 1, 1.024 1216, 2048, 38, 0, 1.01 1216, 2048, 38, 1, 1.01 1216, 2086, 38, 0, 1.0 1216, 2086, 38, 1, 1.0 1248, 0, 0, 0, 1.175 1248, 0, 0, 1, 1.174 1248, 39, 0, 0, 1.074 1248, 39, 0, 1, 1.074 1248, 0, 39, 0, 0.975 1248, 0, 39, 1, 0.985 1248, 39, 39, 0, 1.064 1248, 39, 39, 1, 1.064 1248, 2048, 0, 0, 1.179 1248, 2048, 0, 1, 1.178 1248, 2087, 0, 0, 1.074 1248, 2087, 0, 1, 1.074 1248, 2048, 39, 0, 0.985 1248, 2048, 39, 1, 0.985 1248, 2087, 39, 0, 1.026 1248, 2087, 39, 1, 1.026 1280, 0, 0, 0, 0.992 1280, 0, 0, 1, 0.992 1280, 40, 0, 0, 1.051 1280, 40, 0, 1, 1.051 1280, 0, 40, 0, 1.044 1280, 0, 40, 1, 1.044 1280, 40, 40, 0, 1.252 1280, 40, 40, 1, 1.252 1280, 2048, 0, 0, 0.992 1280, 2048, 0, 1, 0.992 1280, 2088, 0, 0, 1.051 1280, 2088, 0, 1, 1.051 1280, 2048, 40, 0, 0.946 1280, 2048, 40, 1, 0.946 1280, 2088, 40, 0, 1.252 1280, 2088, 40, 1, 1.252 1312, 0, 0, 0, 0.969 1312, 0, 0, 1, 0.969 1312, 41, 0, 0, 0.988 1312, 41, 0, 1, 0.988 1312, 0, 41, 0, 0.837 1312, 0, 41, 1, 0.837 1312, 41, 41, 0, 1.025 1312, 41, 41, 1, 1.025 1312, 2048, 0, 0, 0.969 1312, 2048, 0, 1, 0.969 1312, 2089, 0, 0, 0.988 1312, 2089, 0, 1, 0.987 1312, 2048, 41, 0, 0.837 1312, 2048, 41, 1, 0.837 1312, 2089, 41, 0, 0.975 1312, 2089, 41, 1, 0.975 1344, 0, 0, 0, 0.987 1344, 0, 0, 1, 0.988 1344, 42, 0, 0, 1.031 1344, 42, 0, 1, 1.031 1344, 0, 42, 0, 1.033 1344, 0, 42, 1, 1.033 1344, 42, 42, 0, 0.982 1344, 42, 42, 1, 0.982 1344, 2048, 0, 0, 0.992 1344, 2048, 0, 1, 0.992 1344, 2090, 0, 0, 1.031 1344, 2090, 0, 1, 1.031 1344, 2048, 42, 0, 0.943 1344, 2048, 42, 1, 0.943 1344, 2090, 42, 0, 0.982 1344, 2090, 42, 1, 0.982 1376, 0, 0, 0, 1.016 1376, 0, 0, 1, 1.016 1376, 43, 0, 0, 1.005 1376, 43, 0, 1, 1.005 1376, 0, 43, 0, 0.829 1376, 0, 43, 1, 0.829 1376, 43, 43, 0, 1.024 1376, 43, 43, 1, 1.024 1376, 2048, 0, 0, 1.005 1376, 2048, 0, 1, 1.013 1376, 2091, 0, 0, 1.005 1376, 2091, 0, 1, 1.005 1376, 2048, 43, 0, 0.829 1376, 
2048, 43, 1, 0.829 1376, 2091, 43, 0, 0.98 1376, 2091, 43, 1, 0.98 1408, 0, 0, 0, 0.988 1408, 0, 0, 1, 0.988 1408, 44, 0, 0, 1.015 1408, 44, 0, 1, 1.015 1408, 0, 44, 0, 1.023 1408, 0, 44, 1, 1.03 1408, 44, 44, 0, 0.998 1408, 44, 44, 1, 0.994 1408, 2048, 0, 0, 0.988 1408, 2048, 0, 1, 0.988 1408, 2092, 0, 0, 1.015 1408, 2092, 0, 1, 1.015 1408, 2048, 44, 0, 0.955 1408, 2048, 44, 1, 0.955 1408, 2092, 44, 0, 0.999 1408, 2092, 44, 1, 0.994 1440, 0, 0, 0, 0.986 1440, 0, 0, 1, 0.986 1440, 45, 0, 0, 1.008 1440, 45, 0, 1, 1.008 1440, 0, 45, 0, 0.814 1440, 0, 45, 1, 0.814 1440, 45, 45, 0, 1.006 1440, 45, 45, 1, 1.006 1440, 2048, 0, 0, 0.986 1440, 2048, 0, 1, 0.986 1440, 2093, 0, 0, 1.008 1440, 2093, 0, 1, 1.008 1440, 2048, 45, 0, 0.814 1440, 2048, 45, 1, 0.814 1440, 2093, 45, 0, 0.966 1440, 2093, 45, 1, 0.966 1472, 0, 0, 0, 0.993 1472, 0, 0, 1, 0.992 1472, 46, 0, 0, 1.045 1472, 46, 0, 1, 1.045 1472, 0, 46, 0, 1.026 1472, 0, 46, 1, 1.026 1472, 46, 46, 0, 0.966 1472, 46, 46, 1, 0.966 1472, 2048, 0, 0, 0.999 1472, 2048, 0, 1, 0.997 1472, 2094, 0, 0, 1.045 1472, 2094, 0, 1, 1.045 1472, 2048, 46, 0, 0.939 1472, 2048, 46, 1, 0.939 1472, 2094, 46, 0, 0.966 1472, 2094, 46, 1, 0.966 1504, 0, 0, 0, 0.991 1504, 0, 0, 1, 0.991 1504, 47, 0, 0, 0.999 1504, 47, 0, 1, 0.999 1504, 0, 47, 0, 0.826 1504, 0, 47, 1, 0.826 1504, 47, 47, 0, 1.023 1504, 47, 47, 1, 1.023 1504, 2048, 0, 0, 0.993 1504, 2048, 0, 1, 0.993 1504, 2095, 0, 0, 0.999 1504, 2095, 0, 1, 0.999 1504, 2048, 47, 0, 0.826 1504, 2048, 47, 1, 0.826 1504, 2095, 47, 0, 0.993 1504, 2095, 47, 1, 0.993 1536, 0, 0, 0, 0.994 1536, 0, 0, 1, 0.993 1536, 48, 0, 0, 1.019 1536, 48, 0, 1, 1.019 1536, 0, 48, 0, 1.025 1536, 0, 48, 1, 1.025 1536, 48, 48, 0, 0.993 1536, 48, 48, 1, 0.993 1536, 2048, 0, 0, 0.994 1536, 2048, 0, 1, 0.994 1536, 2096, 0, 0, 1.019 1536, 2096, 0, 1, 1.019 1536, 2048, 48, 0, 1.025 1536, 2048, 48, 1, 1.025 1536, 2096, 48, 0, 0.994 1536, 2096, 48, 1, 0.994 1568, 0, 0, 0, 0.994 1568, 0, 0, 1, 0.994 1568, 49, 0, 0, 0.903 1568, 49, 
0, 1, 0.903 1568, 0, 49, 0, 1.147 1568, 0, 49, 1, 1.147 1568, 49, 49, 0, 1.461 1568, 49, 49, 1, 1.46 1568, 2048, 0, 0, 0.994 1568, 2048, 0, 1, 0.993 1568, 2097, 0, 0, 0.903 1568, 2097, 0, 1, 0.903 1568, 2048, 49, 0, 1.09 1568, 2048, 49, 1, 1.09 1568, 2097, 49, 0, 1.46 1568, 2097, 49, 1, 1.46 1600, 0, 0, 0, 0.981 1600, 0, 0, 1, 0.981 1600, 50, 0, 0, 1.022 1600, 50, 0, 1, 1.022 1600, 0, 50, 0, 1.017 1600, 0, 50, 1, 1.017 1600, 50, 50, 0, 0.973 1600, 50, 50, 1, 0.973 1600, 2048, 0, 0, 0.981 1600, 2048, 0, 1, 0.981 1600, 2098, 0, 0, 1.022 1600, 2098, 0, 1, 1.022 1600, 2048, 50, 0, 0.961 1600, 2048, 50, 1, 0.961 1600, 2098, 50, 0, 0.973 1600, 2098, 50, 1, 0.973 1632, 0, 0, 0, 1.018 1632, 0, 0, 1, 1.018 1632, 51, 0, 0, 0.893 1632, 51, 0, 1, 0.893 1632, 0, 51, 0, 1.134 1632, 0, 51, 1, 1.134 1632, 51, 51, 0, 1.444 1632, 51, 51, 1, 1.444 1632, 2048, 0, 0, 1.019 1632, 2048, 0, 1, 1.019 1632, 2099, 0, 0, 0.893 1632, 2099, 0, 1, 0.893 1632, 2048, 51, 0, 1.079 1632, 2048, 51, 1, 1.079 1632, 2099, 51, 0, 1.449 1632, 2099, 51, 1, 1.449 1664, 0, 0, 0, 1.006 1664, 0, 0, 1, 1.006 1664, 52, 0, 0, 0.982 1664, 52, 0, 1, 0.986 1664, 0, 52, 0, 1.004 1664, 0, 52, 1, 1.004 1664, 52, 52, 0, 0.976 1664, 52, 52, 1, 0.976 1664, 2048, 0, 0, 1.006 1664, 2048, 0, 1, 1.006 1664, 2100, 0, 0, 0.983 1664, 2100, 0, 1, 0.983 1664, 2048, 52, 0, 0.946 1664, 2048, 52, 1, 0.946 1664, 2100, 52, 0, 0.976 1664, 2100, 52, 1, 0.976 1696, 0, 0, 0, 0.99 1696, 0, 0, 1, 0.99 1696, 53, 0, 0, 0.884 1696, 53, 0, 1, 0.884 1696, 0, 53, 0, 1.141 1696, 0, 53, 1, 1.141 1696, 53, 53, 0, 1.43 1696, 53, 53, 1, 1.428 1696, 2048, 0, 0, 0.994 1696, 2048, 0, 1, 0.993 1696, 2101, 0, 0, 0.884 1696, 2101, 0, 1, 0.884 1696, 2048, 53, 0, 1.088 1696, 2048, 53, 1, 1.088 1696, 2101, 53, 0, 1.429 1696, 2101, 53, 1, 1.429 1728, 0, 0, 0, 0.978 1728, 0, 0, 1, 0.977 1728, 54, 0, 0, 1.032 1728, 54, 0, 1, 1.033 1728, 0, 54, 0, 1.0 1728, 0, 54, 1, 1.0 1728, 54, 54, 0, 0.96 1728, 54, 54, 1, 0.96 1728, 2048, 0, 0, 0.976 1728, 2048, 0, 1, 0.976 
1728, 2102, 0, 0, 1.033 1728, 2102, 0, 1, 1.033 1728, 2048, 54, 0, 0.947 1728, 2048, 54, 1, 0.947 1728, 2102, 54, 0, 0.96 1728, 2102, 54, 1, 0.96 1760, 0, 0, 0, 1.019 1760, 0, 0, 1, 1.022 1760, 55, 0, 0, 0.9 1760, 55, 0, 1, 0.9 1760, 0, 55, 0, 1.125 1760, 0, 55, 1, 1.125 1760, 55, 55, 0, 1.438 1760, 55, 55, 1, 1.439 1760, 2048, 0, 0, 1.015 1760, 2048, 0, 1, 1.015 1760, 2103, 0, 0, 0.9 1760, 2103, 0, 1, 0.9 1760, 2048, 55, 0, 1.073 1760, 2048, 55, 1, 1.074 1760, 2103, 55, 0, 1.435 1760, 2103, 55, 1, 1.44 1792, 0, 0, 0, 1.003 1792, 0, 0, 1, 1.002 1792, 56, 0, 0, 1.028 1792, 56, 0, 1, 1.028 1792, 0, 56, 0, 1.014 1792, 0, 56, 1, 1.015 1792, 56, 56, 0, 1.191 1792, 56, 56, 1, 1.191 1792, 2048, 0, 0, 1.003 1792, 2048, 0, 1, 1.003 1792, 2104, 0, 0, 1.028 1792, 2104, 0, 1, 1.028 1792, 2048, 56, 0, 0.963 1792, 2048, 56, 1, 0.963 1792, 2104, 56, 0, 1.191 1792, 2104, 56, 1, 1.191 1824, 0, 0, 0, 1.001 1824, 0, 0, 1, 1.001 1824, 57, 0, 0, 0.891 1824, 57, 0, 1, 0.891 1824, 0, 57, 0, 1.114 1824, 0, 57, 1, 1.114 1824, 57, 57, 0, 1.407 1824, 57, 57, 1, 1.407 1824, 2048, 0, 0, 1.001 1824, 2048, 0, 1, 1.001 1824, 2105, 0, 0, 0.891 1824, 2105, 0, 1, 0.891 1824, 2048, 57, 0, 1.064 1824, 2048, 57, 1, 1.064 1824, 2105, 57, 0, 1.407 1824, 2105, 57, 1, 1.407 1856, 0, 0, 0, 0.991 1856, 0, 0, 1, 0.991 1856, 58, 0, 0, 1.042 1856, 58, 0, 1, 1.042 1856, 0, 58, 0, 1.007 1856, 0, 58, 1, 1.007 1856, 58, 58, 0, 0.98 1856, 58, 58, 1, 0.972 1856, 2048, 0, 0, 0.992 1856, 2048, 0, 1, 0.992 1856, 2106, 0, 0, 1.042 1856, 2106, 0, 1, 1.042 1856, 2048, 58, 0, 0.954 1856, 2048, 58, 1, 0.954 1856, 2106, 58, 0, 0.98 1856, 2106, 58, 1, 0.972 1888, 0, 0, 0, 0.993 1888, 0, 0, 1, 0.992 1888, 59, 0, 0, 0.883 1888, 59, 0, 1, 0.883 1888, 0, 59, 0, 1.124 1888, 0, 59, 1, 1.125 1888, 59, 59, 0, 1.413 1888, 59, 59, 1, 1.413 1888, 2048, 0, 0, 0.986 1888, 2048, 0, 1, 0.991 1888, 2107, 0, 0, 0.883 1888, 2107, 0, 1, 0.883 1888, 2048, 59, 0, 1.076 1888, 2048, 59, 1, 1.076 1888, 2107, 59, 0, 1.413 1888, 2107, 59, 1, 1.413 
1920, 0, 0, 0, 1.0 1920, 0, 0, 1, 1.0 1920, 60, 0, 0, 1.033 1920, 60, 0, 1, 1.034 1920, 0, 60, 0, 0.996 1920, 0, 60, 1, 0.997 1920, 60, 60, 0, 0.968 1920, 60, 60, 1, 0.968 1920, 2048, 0, 0, 1.0 1920, 2048, 0, 1, 1.0 1920, 2108, 0, 0, 1.034 1920, 2108, 0, 1, 1.034 1920, 2048, 60, 0, 0.949 1920, 2048, 60, 1, 0.949 1920, 2108, 60, 0, 0.968 1920, 2108, 60, 1, 0.968 1952, 0, 0, 0, 1.004 1952, 0, 0, 1, 1.004 1952, 61, 0, 0, 0.897 1952, 61, 0, 1, 0.898 1952, 0, 61, 0, 1.118 1952, 0, 61, 1, 1.118 1952, 61, 61, 0, 1.387 1952, 61, 61, 1, 1.387 1952, 2048, 0, 0, 1.004 1952, 2048, 0, 1, 1.004 1952, 2109, 0, 0, 0.898 1952, 2109, 0, 1, 0.898 1952, 2048, 61, 0, 1.071 1952, 2048, 61, 1, 1.071 1952, 2109, 61, 0, 1.387 1952, 2109, 61, 1, 1.387 1984, 0, 0, 0, 0.993 1984, 0, 0, 1, 0.993 1984, 62, 0, 0, 1.025 1984, 62, 0, 1, 1.025 1984, 0, 62, 0, 1.005 1984, 0, 62, 1, 1.007 1984, 62, 62, 0, 0.982 1984, 62, 62, 1, 0.982 1984, 2048, 0, 0, 0.993 1984, 2048, 0, 1, 0.993 1984, 2110, 0, 0, 1.025 1984, 2110, 0, 1, 1.025 1984, 2048, 62, 0, 0.96 1984, 2048, 62, 1, 0.96 1984, 2110, 62, 0, 0.982 1984, 2110, 62, 1, 0.982 2016, 0, 0, 0, 0.999 2016, 0, 0, 1, 0.999 2016, 63, 0, 0, 0.889 2016, 63, 0, 1, 0.89 2016, 0, 63, 0, 1.093 2016, 0, 63, 1, 1.094 2016, 63, 63, 0, 1.362 2016, 63, 63, 1, 1.363 2016, 2048, 0, 0, 1.0 2016, 2048, 0, 1, 1.0 2016, 2111, 0, 0, 0.965 2016, 2111, 0, 1, 0.965 2016, 2048, 63, 0, 1.049 2016, 2048, 63, 1, 1.049 2016, 2111, 63, 0, 1.405 2016, 2111, 63, 1, 1.405 2048, 32, 0, 0, 1.01 2048, 32, 0, 1, 1.01 2048, 0, 32, 0, 1.005 2048, 0, 32, 1, 1.005 2048, 32, 32, 0, 1.005 2048, 32, 32, 1, 1.005 2048, 0, 1, 0, 0.983 2048, 0, 1, 1, 0.984 2048, 1, 0, 0, 1.039 2048, 1, 0, 1, 1.039 2048, 32, 1, 0, 1.063 2048, 32, 1, 1, 1.063 2048, 1, 32, 0, 0.94 2048, 1, 32, 1, 0.94 2048, 2048, 1, 0, 0.981 2048, 2048, 1, 1, 0.981 2048, 2049, 0, 0, 0.904 2048, 2049, 0, 1, 0.904 2112, 0, 0, 0, 0.996 2112, 0, 0, 1, 0.996 2112, 1, 0, 0, 1.031 2112, 1, 0, 1, 1.031 2112, 33, 0, 0, 1.01 2112, 33, 0, 1, 1.01 
2112, 0, 1, 0, 0.972 2112, 0, 1, 1, 0.972 2112, 0, 33, 0, 0.988 2112, 0, 33, 1, 0.988 2112, 1, 1, 0, 0.914 2112, 1, 1, 1, 0.914 2112, 33, 33, 0, 0.983 2112, 33, 33, 1, 0.983 2112, 2048, 0, 0, 0.993 2112, 2048, 0, 1, 0.991 2112, 2049, 0, 0, 1.031 2112, 2049, 0, 1, 1.031 2112, 2048, 1, 0, 0.955 2112, 2048, 1, 1, 0.955 2112, 2049, 1, 0, 0.906 2112, 2049, 1, 1, 0.906 2112, 33, 1, 0, 1.163 2112, 33, 1, 1, 1.164 2112, 1, 33, 0, 1.046 2112, 1, 33, 1, 1.046 2176, 0, 0, 0, 0.985 2176, 0, 0, 1, 0.985 2176, 2, 0, 0, 1.023 2176, 2, 0, 1, 1.023 2176, 34, 0, 0, 1.0 2176, 34, 0, 1, 1.0 2176, 0, 2, 0, 0.984 2176, 0, 2, 1, 0.985 2176, 0, 34, 0, 0.986 2176, 0, 34, 1, 0.993 2176, 2, 2, 0, 0.928 2176, 2, 2, 1, 0.928 2176, 34, 34, 0, 1.004 2176, 34, 34, 1, 1.004 2176, 2048, 0, 0, 0.985 2176, 2048, 0, 1, 0.985 2176, 2050, 0, 0, 1.023 2176, 2050, 0, 1, 1.023 2176, 2048, 2, 0, 0.802 2176, 2048, 2, 1, 0.802 2176, 2050, 2, 0, 0.894 2176, 2050, 2, 1, 0.894 2176, 2, 1, 0, 1.068 2176, 2, 1, 1, 1.068 2176, 1, 2, 0, 0.976 2176, 1, 2, 1, 0.976 2176, 34, 1, 0, 1.077 2176, 34, 1, 1, 1.077 2176, 1, 34, 0, 0.978 2176, 1, 34, 1, 0.978 2176, 2050, 1, 0, 1.061 2176, 2050, 1, 1, 1.061 2176, 2049, 2, 0, 0.971 2176, 2049, 2, 1, 0.971 2240, 0, 0, 0, 0.994 2240, 0, 0, 1, 0.994 2240, 3, 0, 0, 1.038 2240, 3, 0, 1, 1.039 2240, 35, 0, 0, 1.019 2240, 35, 0, 1, 1.019 2240, 0, 3, 0, 0.979 2240, 0, 3, 1, 0.98 2240, 0, 35, 0, 0.991 2240, 0, 35, 1, 0.991 2240, 3, 3, 0, 0.931 2240, 3, 3, 1, 0.931 2240, 35, 35, 0, 0.999 2240, 35, 35, 1, 0.999 2240, 2048, 0, 0, 0.995 2240, 2048, 0, 1, 0.995 2240, 2051, 0, 0, 1.039 2240, 2051, 0, 1, 1.039 2240, 2048, 3, 0, 0.799 2240, 2048, 3, 1, 0.799 2240, 2051, 3, 0, 0.889 2240, 2051, 3, 1, 0.889 2240, 3, 1, 0, 1.06 2240, 3, 1, 1, 1.06 2240, 1, 3, 0, 0.968 2240, 1, 3, 1, 0.968 2240, 35, 1, 0, 1.071 2240, 35, 1, 1, 1.071 2240, 1, 35, 0, 0.971 2240, 1, 35, 1, 0.971 2240, 2051, 1, 0, 1.057 2240, 2051, 1, 1, 1.057 2240, 2049, 3, 0, 0.966 2240, 2049, 3, 1, 0.966 2304, 0, 0, 0, 0.988 2304, 
0, 0, 1, 0.988 2304, 4, 0, 0, 1.031 2304, 4, 0, 1, 1.032 2304, 36, 0, 0, 1.011 2304, 36, 0, 1, 1.011 2304, 0, 4, 0, 0.968 2304, 0, 4, 1, 0.967 2304, 0, 36, 0, 0.988 2304, 0, 36, 1, 0.988 2304, 4, 4, 0, 0.931 2304, 4, 4, 1, 0.931 2304, 36, 36, 0, 0.992 2304, 36, 36, 1, 0.992 2304, 2048, 0, 0, 0.988 2304, 2048, 0, 1, 0.988 2304, 2052, 0, 0, 1.032 2304, 2052, 0, 1, 1.032 2304, 2048, 4, 0, 0.793 2304, 2048, 4, 1, 0.793 2304, 2052, 4, 0, 0.884 2304, 2052, 4, 1, 0.884 2304, 4, 1, 0, 0.989 2304, 4, 1, 1, 0.989 2304, 1, 4, 0, 0.897 2304, 1, 4, 1, 0.898 2304, 36, 1, 0, 1.057 2304, 36, 1, 1, 1.057 2304, 1, 36, 0, 0.966 2304, 1, 36, 1, 0.966 2304, 2052, 1, 0, 1.052 2304, 2052, 1, 1, 1.052 2304, 2049, 4, 0, 0.955 2304, 2049, 4, 1, 0.955 2368, 0, 0, 0, 0.999 2368, 0, 0, 1, 1.0 2368, 5, 0, 0, 1.024 2368, 5, 0, 1, 1.025 2368, 37, 0, 0, 1.0 2368, 37, 0, 1, 1.0 2368, 0, 5, 0, 0.98 2368, 0, 5, 1, 0.981 2368, 0, 37, 0, 0.986 2368, 0, 37, 1, 0.981 2368, 5, 5, 0, 0.944 2368, 5, 5, 1, 0.944 2368, 37, 37, 0, 1.003 2368, 37, 37, 1, 1.003 2368, 2048, 0, 0, 1.002 2368, 2048, 0, 1, 1.002 2368, 2053, 0, 0, 1.025 2368, 2053, 0, 1, 1.025 2368, 2048, 5, 0, 0.801 2368, 2048, 5, 1, 0.801 2368, 2053, 5, 0, 0.907 2368, 2053, 5, 1, 0.907 2368, 5, 1, 0, 1.071 2368, 5, 1, 1, 1.071 2368, 1, 5, 0, 0.973 2368, 1, 5, 1, 0.973 2368, 37, 1, 0, 1.07 2368, 37, 1, 1, 1.07 2368, 1, 37, 0, 0.974 2368, 1, 37, 1, 0.974 2368, 2053, 1, 0, 1.065 2368, 2053, 1, 1, 1.065 2368, 2049, 5, 0, 0.967 2368, 2049, 5, 1, 0.967 2432, 0, 0, 0, 0.968 2432, 0, 0, 1, 1.002 2432, 6, 0, 0, 1.032 2432, 6, 0, 1, 1.033 2432, 38, 0, 0, 1.021 2432, 38, 0, 1, 1.021 2432, 0, 6, 0, 0.973 2432, 0, 6, 1, 0.976 2432, 0, 38, 0, 0.986 2432, 0, 38, 1, 0.986 2432, 6, 6, 0, 0.926 2432, 6, 6, 1, 0.926 2432, 38, 38, 0, 1.0 2432, 38, 38, 1, 1.0 2432, 2048, 0, 0, 1.005 2432, 2048, 0, 1, 1.004 2432, 2054, 0, 0, 1.032 2432, 2054, 0, 1, 1.033 2432, 2048, 6, 0, 0.797 2432, 2048, 6, 1, 0.797 2432, 2054, 6, 0, 0.898 2432, 2054, 6, 1, 0.898 2432, 6, 1, 0, 1.058 
2432, 6, 1, 1, 1.058 2432, 1, 6, 0, 0.96 2432, 1, 6, 1, 0.96 2432, 38, 1, 0, 1.062 2432, 38, 1, 1, 1.062 2432, 1, 38, 0, 0.963 2432, 1, 38, 1, 0.963 2432, 2054, 1, 0, 1.054 2432, 2054, 1, 1, 1.054 2432, 2049, 6, 0, 0.957 2432, 2049, 6, 1, 0.957 2496, 0, 0, 0, 1.013 2496, 0, 0, 1, 1.013 2496, 7, 0, 0, 1.025 2496, 7, 0, 1, 1.026 2496, 39, 0, 0, 1.013 2496, 39, 0, 1, 1.013 2496, 0, 7, 0, 0.964 2496, 0, 7, 1, 0.966 2496, 0, 39, 0, 0.979 2496, 0, 39, 1, 0.979 2496, 7, 7, 0, 0.925 2496, 7, 7, 1, 0.925 2496, 39, 39, 0, 0.989 2496, 39, 39, 1, 0.989 2496, 2048, 0, 0, 1.013 2496, 2048, 0, 1, 1.013 2496, 2055, 0, 0, 1.026 2496, 2055, 0, 1, 1.026 2496, 2048, 7, 0, 0.792 2496, 2048, 7, 1, 0.792 2496, 2055, 7, 0, 0.93 2496, 2055, 7, 1, 0.93 2496, 7, 1, 0, 0.982 2496, 7, 1, 1, 0.982 2496, 1, 7, 0, 0.893 2496, 1, 7, 1, 0.893 2496, 39, 1, 0, 1.048 2496, 39, 1, 1, 1.049 2496, 1, 39, 0, 0.958 2496, 1, 39, 1, 0.958 2496, 2055, 1, 0, 1.042 2496, 2055, 1, 1, 1.042 2496, 2049, 7, 0, 0.947 2496, 2049, 7, 1, 0.947 2560, 0, 0, 0, 0.993 2560, 0, 0, 1, 0.993 2560, 8, 0, 0, 1.031 2560, 8, 0, 1, 1.032 2560, 40, 0, 0, 1.029 2560, 40, 0, 1, 1.029 2560, 0, 8, 0, 0.992 2560, 0, 8, 1, 0.992 2560, 0, 40, 0, 0.981 2560, 0, 40, 1, 0.98 2560, 8, 8, 0, 0.943 2560, 8, 8, 1, 0.942 2560, 40, 40, 0, 1.141 2560, 40, 40, 1, 1.141 2560, 2048, 0, 0, 0.993 2560, 2048, 0, 1, 0.993 2560, 2056, 0, 0, 1.032 2560, 2056, 0, 1, 1.032 2560, 2048, 8, 0, 0.812 2560, 2048, 8, 1, 0.812 2560, 2056, 8, 0, 0.912 2560, 2056, 8, 1, 0.912 2560, 8, 1, 0, 1.069 2560, 8, 1, 1, 1.069 2560, 1, 8, 0, 0.974 2560, 1, 8, 1, 0.974 2560, 40, 1, 0, 1.068 2560, 40, 1, 1, 1.068 2560, 1, 40, 0, 0.996 2560, 1, 40, 1, 0.996 2560, 2056, 1, 0, 1.063 2560, 2056, 1, 1, 1.063 2560, 2049, 8, 0, 0.969 2560, 2049, 8, 1, 0.969 2624, 0, 0, 0, 0.997 2624, 0, 0, 1, 0.997 2624, 9, 0, 0, 1.008 2624, 9, 0, 1, 1.012 2624, 41, 0, 0, 1.044 2624, 41, 0, 1, 1.044 2624, 0, 9, 0, 0.988 2624, 0, 9, 1, 0.99 2624, 0, 41, 0, 0.99 2624, 0, 41, 1, 0.99 2624, 9, 9, 0, 0.943 
2624, 9, 9, 1, 0.943 2624, 41, 41, 0, 0.993 2624, 41, 41, 1, 0.993 2624, 2048, 0, 0, 0.998 2624, 2048, 0, 1, 0.998 2624, 2057, 0, 0, 1.012 2624, 2057, 0, 1, 1.012 2624, 2048, 9, 0, 0.81 2624, 2048, 9, 1, 0.81 2624, 2057, 9, 0, 0.907 2624, 2057, 9, 1, 0.907 2624, 9, 1, 0, 1.085 2624, 9, 1, 1, 1.084 2624, 1, 9, 0, 0.962 2624, 1, 9, 1, 0.963 2624, 41, 1, 0, 1.078 2624, 41, 1, 1, 1.078 2624, 1, 41, 0, 0.962 2624, 1, 41, 1, 0.962 2624, 2057, 1, 0, 1.081 2624, 2057, 1, 1, 1.081 2624, 2049, 9, 0, 0.959 2624, 2049, 9, 1, 0.959 2688, 0, 0, 0, 0.995 2688, 0, 0, 1, 0.995 2688, 10, 0, 0, 1.003 2688, 10, 0, 1, 1.006 2688, 42, 0, 0, 1.036 2688, 42, 0, 1, 1.036 2688, 0, 10, 0, 0.978 2688, 0, 10, 1, 0.979 2688, 0, 42, 0, 0.978 2688, 0, 42, 1, 0.977 2688, 10, 10, 0, 0.942 2688, 10, 10, 1, 0.942 2688, 42, 42, 0, 0.989 2688, 42, 42, 1, 0.989 2688, 2048, 0, 0, 0.995 2688, 2048, 0, 1, 0.995 2688, 2058, 0, 0, 1.006 2688, 2058, 0, 1, 1.006 2688, 2048, 10, 0, 0.804 2688, 2048, 10, 1, 0.804 2688, 2058, 10, 0, 0.905 2688, 2058, 10, 1, 0.905 2688, 10, 1, 0, 0.985 2688, 10, 1, 1, 0.985 2688, 1, 10, 0, 0.892 2688, 1, 10, 1, 0.892 2688, 42, 1, 0, 1.048 2688, 42, 1, 1, 1.048 2688, 1, 42, 0, 0.958 2688, 1, 42, 1, 0.958 2688, 2058, 1, 0, 1.046 2688, 2058, 1, 1, 1.046 2688, 2049, 10, 0, 0.948 2688, 2049, 10, 1, 0.948 2752, 0, 0, 0, 0.998 2752, 0, 0, 1, 0.993 2752, 11, 0, 0, 0.96 2752, 11, 0, 1, 0.96 2752, 43, 0, 0, 0.979 2752, 43, 0, 1, 0.979 2752, 0, 11, 0, 0.939 2752, 0, 11, 1, 0.939 2752, 0, 43, 0, 0.93 2752, 0, 43, 1, 0.93 2752, 11, 11, 0, 0.949 2752, 11, 11, 1, 0.949 2752, 43, 43, 0, 1.007 2752, 43, 43, 1, 1.007 2752, 2048, 0, 0, 0.993 2752, 2048, 0, 1, 0.994 2752, 2059, 0, 0, 0.96 2752, 2059, 0, 1, 0.96 2752, 2048, 11, 0, 0.77 2752, 2048, 11, 1, 0.77 2752, 2059, 11, 0, 0.916 2752, 2059, 11, 1, 0.916 2752, 11, 1, 0, 1.0 2752, 11, 1, 1, 1.0 2752, 1, 11, 0, 0.933 2752, 1, 11, 1, 0.933 2752, 43, 1, 0, 1.028 2752, 43, 1, 1, 1.028 2752, 1, 43, 0, 0.925 2752, 1, 43, 1, 0.925 2752, 2059, 1, 0, 0.995 
2752, 2059, 1, 1, 0.995 2752, 2049, 11, 0, 0.929 2752, 2049, 11, 1, 0.929 2816, 0, 0, 0, 1.004 2816, 0, 0, 1, 1.004 2816, 12, 0, 0, 0.897 2816, 12, 0, 1, 0.894 2816, 44, 0, 0, 0.914 2816, 44, 0, 1, 0.914 2816, 0, 12, 0, 0.877 2816, 0, 12, 1, 0.874 2816, 0, 44, 0, 0.871 2816, 0, 44, 1, 0.87 2816, 12, 12, 0, 0.948 2816, 12, 12, 1, 0.948 2816, 44, 44, 0, 1.009 2816, 44, 44, 1, 1.009 2816, 2048, 0, 0, 1.005 2816, 2048, 0, 1, 1.005 2816, 2060, 0, 0, 0.894 2816, 2060, 0, 1, 0.894 2816, 2048, 12, 0, 0.715 2816, 2048, 12, 1, 0.713 2816, 2060, 12, 0, 0.915 2816, 2060, 12, 1, 0.915 2816, 12, 1, 0, 0.918 2816, 12, 1, 1, 0.917 2816, 1, 12, 0, 0.863 2816, 1, 12, 1, 0.863 2816, 44, 1, 0, 0.944 2816, 44, 1, 1, 0.943 2816, 1, 44, 0, 0.861 2816, 1, 44, 1, 0.861 2816, 2060, 1, 0, 0.919 2816, 2060, 1, 1, 0.924 2816, 2049, 12, 0, 0.86 2816, 2049, 12, 1, 0.86 2880, 0, 0, 0, 0.989 2880, 0, 0, 1, 0.989 2880, 13, 0, 0, 0.967 2880, 13, 0, 1, 0.967 2880, 45, 0, 0, 0.987 2880, 45, 0, 1, 0.987 2880, 0, 13, 0, 0.925 2880, 0, 13, 1, 0.925 2880, 0, 45, 0, 0.927 2880, 0, 45, 1, 0.927 2880, 13, 13, 0, 0.944 2880, 13, 13, 1, 0.944 2880, 45, 45, 0, 1.003 2880, 45, 45, 1, 1.003 2880, 2048, 0, 0, 0.989 2880, 2048, 0, 1, 0.989 2880, 2061, 0, 0, 0.967 2880, 2061, 0, 1, 0.967 2880, 2048, 13, 0, 0.76 2880, 2048, 13, 1, 0.76 2880, 2061, 13, 0, 0.91 2880, 2061, 13, 1, 0.91 2880, 13, 1, 0, 0.922 2880, 13, 1, 1, 0.922 2880, 1, 13, 0, 0.859 2880, 1, 13, 1, 0.859 2880, 45, 1, 0, 1.013 2880, 45, 1, 1, 1.013 2880, 1, 45, 0, 0.92 2880, 1, 45, 1, 0.92 2880, 2061, 1, 0, 0.984 2880, 2061, 1, 1, 0.984 2880, 2049, 13, 0, 0.918 2880, 2049, 13, 1, 0.918 2944, 0, 0, 0, 1.014 2944, 0, 0, 1, 1.015 2944, 14, 0, 0, 0.961 2944, 14, 0, 1, 0.961 2944, 46, 0, 0, 0.979 2944, 46, 0, 1, 0.979 2944, 0, 14, 0, 0.934 2944, 0, 14, 1, 0.937 2944, 0, 46, 0, 0.924 2944, 0, 46, 1, 0.921 2944, 14, 14, 0, 0.953 2944, 14, 14, 1, 0.953 2944, 46, 46, 0, 1.009 2944, 46, 46, 1, 1.009 2944, 2048, 0, 0, 1.015 2944, 2048, 0, 1, 1.015 2944, 2062, 0, 
0, 0.961 2944, 2062, 0, 1, 0.961 2944, 2048, 14, 0, 0.769 2944, 2048, 14, 1, 0.769 2944, 2062, 14, 0, 0.923 2944, 2062, 14, 1, 0.923 2944, 14, 1, 0, 0.999 2944, 14, 1, 1, 0.999 2944, 1, 14, 0, 0.927 2944, 1, 14, 1, 0.927 2944, 46, 1, 0, 1.027 2944, 46, 1, 1, 1.027 2944, 1, 46, 0, 0.918 2944, 1, 46, 1, 0.918 2944, 2062, 1, 0, 0.995 2944, 2062, 1, 1, 0.995 2944, 2049, 14, 0, 0.922 2944, 2049, 14, 1, 0.922 3008, 0, 0, 0, 0.998 3008, 0, 0, 1, 0.997 3008, 15, 0, 0, 0.953 3008, 15, 0, 1, 0.953 3008, 47, 0, 0, 0.996 3008, 47, 0, 1, 0.996 3008, 0, 15, 0, 0.933 3008, 0, 15, 1, 0.929 3008, 0, 47, 0, 0.933 3008, 0, 47, 1, 0.933 3008, 15, 15, 0, 0.95 3008, 15, 15, 1, 0.949 3008, 47, 47, 0, 1.003 3008, 47, 47, 1, 1.003 3008, 2048, 0, 0, 0.998 3008, 2048, 0, 1, 0.998 3008, 2063, 0, 0, 0.953 3008, 2063, 0, 1, 0.953 3008, 2048, 15, 0, 0.766 3008, 2048, 15, 1, 0.766 3008, 2063, 15, 0, 0.916 3008, 2063, 15, 1, 0.916 3008, 15, 1, 0, 0.996 3008, 15, 1, 1, 0.996 3008, 1, 15, 0, 0.927 3008, 1, 15, 1, 0.927 3008, 47, 1, 0, 1.026 3008, 47, 1, 1, 1.026 3008, 1, 47, 0, 0.918 3008, 1, 47, 1, 0.918 3008, 2063, 1, 0, 0.994 3008, 2063, 1, 1, 0.994 3008, 2049, 15, 0, 0.925 3008, 2049, 15, 1, 0.925 3072, 0, 0, 0, 1.015 3072, 0, 0, 1, 1.016 3072, 16, 0, 0, 1.045 3072, 16, 0, 1, 1.045 3072, 48, 0, 0, 1.045 3072, 48, 0, 1, 1.045 3072, 0, 16, 0, 1.049 3072, 0, 16, 1, 1.049 3072, 0, 48, 0, 1.049 3072, 0, 48, 1, 1.049 3072, 16, 16, 0, 1.016 3072, 16, 16, 1, 1.015 3072, 48, 48, 0, 1.015 3072, 48, 48, 1, 1.016 3072, 2048, 0, 0, 1.016 3072, 2048, 0, 1, 1.016 3072, 2064, 0, 0, 1.045 3072, 2064, 0, 1, 1.045 3072, 2048, 16, 0, 1.049 3072, 2048, 16, 1, 1.049 3072, 2064, 16, 0, 1.016 3072, 2064, 16, 1, 1.016 3072, 16, 1, 0, 0.815 3072, 16, 1, 1, 0.815 3072, 1, 16, 0, 0.872 3072, 1, 16, 1, 0.872 3072, 48, 1, 0, 1.017 3072, 48, 1, 1, 1.017 3072, 1, 48, 0, 0.872 3072, 1, 48, 1, 0.872 3072, 2064, 1, 0, 0.815 3072, 2064, 1, 1, 0.815 3072, 2049, 16, 0, 0.872 3072, 2049, 16, 1, 0.872 3136, 0, 0, 0, 0.995 3136, 0, 0, 
1, 0.996 3136, 17, 0, 0, 0.949 3136, 17, 0, 1, 0.949 3136, 49, 0, 0, 0.987 3136, 49, 0, 1, 0.987 3136, 0, 17, 0, 0.922 3136, 0, 17, 1, 0.919 3136, 0, 49, 0, 0.931 3136, 0, 49, 1, 0.931 3136, 17, 17, 0, 1.122 3136, 17, 17, 1, 1.119 3136, 49, 49, 0, 0.987 3136, 49, 49, 1, 0.987 3136, 2048, 0, 0, 0.997 3136, 2048, 0, 1, 0.997 3136, 2065, 0, 0, 0.949 3136, 2065, 0, 1, 0.949 3136, 2048, 17, 0, 0.896 3136, 2048, 17, 1, 0.896 3136, 2065, 17, 0, 1.122 3136, 2065, 17, 1, 1.12 3136, 17, 1, 0, 1.185 3136, 17, 1, 1, 1.185 3136, 1, 17, 0, 1.124 3136, 1, 17, 1, 1.124 3136, 49, 1, 0, 1.11 3136, 49, 1, 1, 1.109 3136, 1, 49, 0, 1.044 3136, 1, 49, 1, 1.044 3136, 2065, 1, 0, 1.147 3136, 2065, 1, 1, 1.147 3136, 2049, 17, 0, 1.103 3136, 2049, 17, 1, 1.103 3200, 0, 0, 0, 1.006 3200, 0, 0, 1, 1.006 3200, 18, 0, 0, 0.978 3200, 18, 0, 1, 0.978 3200, 50, 0, 0, 0.998 3200, 50, 0, 1, 0.998 3200, 0, 18, 0, 0.932 3200, 0, 18, 1, 0.932 3200, 0, 50, 0, 0.93 3200, 0, 50, 1, 0.93 3200, 18, 18, 0, 1.11 3200, 18, 18, 1, 1.11 3200, 50, 50, 0, 0.994 3200, 50, 50, 1, 0.994 3200, 2048, 0, 0, 1.007 3200, 2048, 0, 1, 1.007 3200, 2066, 0, 0, 0.978 3200, 2066, 0, 1, 0.978 3200, 2048, 18, 0, 0.894 3200, 2048, 18, 1, 0.894 3200, 2066, 18, 0, 1.11 3200, 2066, 18, 1, 1.11 3200, 18, 1, 0, 1.002 3200, 18, 1, 1, 1.002 3200, 1, 18, 0, 0.917 3200, 1, 18, 1, 0.917 3200, 50, 1, 0, 0.963 3200, 50, 1, 1, 0.964 3200, 1, 50, 0, 0.888 3200, 1, 50, 1, 0.888 3200, 2066, 1, 0, 1.002 3200, 2066, 1, 1, 1.002 3200, 2049, 18, 0, 0.914 3200, 2049, 18, 1, 0.914 3264, 0, 0, 0, 0.994 3264, 0, 0, 1, 0.994 3264, 19, 0, 0, 0.959 3264, 19, 0, 1, 0.959 3264, 51, 0, 0, 0.994 3264, 51, 0, 1, 0.994 3264, 0, 19, 0, 0.927 3264, 0, 19, 1, 0.927 3264, 0, 51, 0, 0.927 3264, 0, 51, 1, 0.927 3264, 19, 19, 0, 1.1 3264, 19, 19, 1, 1.099 3264, 51, 51, 0, 0.982 3264, 51, 51, 1, 0.982 3264, 2048, 0, 0, 0.994 3264, 2048, 0, 1, 0.994 3264, 2067, 0, 0, 0.959 3264, 2067, 0, 1, 0.959 3264, 2048, 19, 0, 0.891 3264, 2048, 19, 1, 0.891 3264, 2067, 19, 0, 1.099 
3264, 2067, 19, 1, 1.099 3264, 19, 1, 0, 0.977 3264, 19, 1, 1, 0.976 3264, 1, 19, 0, 0.921 3264, 1, 19, 1, 0.921 3264, 51, 1, 0, 0.959 3264, 51, 1, 1, 0.959 3264, 1, 51, 0, 0.886 3264, 1, 51, 1, 0.886 3264, 2067, 1, 0, 0.976 3264, 2067, 1, 1, 0.976 3264, 2049, 19, 0, 0.917 3264, 2049, 19, 1, 0.917 3328, 0, 0, 0, 0.997 3328, 0, 0, 1, 0.993 3328, 20, 0, 0, 0.955 3328, 20, 0, 1, 0.955 3328, 52, 0, 0, 0.99 3328, 52, 0, 1, 0.99 3328, 0, 20, 0, 0.925 3328, 0, 20, 1, 0.927 3328, 0, 52, 0, 0.933 3328, 0, 52, 1, 0.933 3328, 20, 20, 0, 1.11 3328, 20, 20, 1, 1.11 3328, 52, 52, 0, 0.988 3328, 52, 52, 1, 0.988 3328, 2048, 0, 0, 0.996 3328, 2048, 0, 1, 0.993 3328, 2068, 0, 0, 0.955 3328, 2068, 0, 1, 0.955 3328, 2048, 20, 0, 0.9 3328, 2048, 20, 1, 0.9 3328, 2068, 20, 0, 1.109 3328, 2068, 20, 1, 1.109 3328, 20, 1, 0, 0.996 3328, 20, 1, 1, 0.996 3328, 1, 20, 0, 0.927 3328, 1, 20, 1, 0.927 3328, 52, 1, 0, 0.972 3328, 52, 1, 1, 0.972 3328, 1, 52, 0, 0.901 3328, 1, 52, 1, 0.901 3328, 2068, 1, 0, 0.996 3328, 2068, 1, 1, 0.996 3328, 2049, 20, 0, 0.924 3328, 2049, 20, 1, 0.924 3392, 0, 0, 0, 0.996 3392, 0, 0, 1, 1.0 3392, 21, 0, 0, 0.964 3392, 21, 0, 1, 0.964 3392, 53, 0, 0, 0.999 3392, 53, 0, 1, 0.999 3392, 0, 21, 0, 0.932 3392, 0, 21, 1, 0.932 3392, 0, 53, 0, 0.93 3392, 0, 53, 1, 0.93 3392, 21, 21, 0, 1.113 3392, 21, 21, 1, 1.113 3392, 53, 53, 0, 0.983 3392, 53, 53, 1, 0.983 3392, 2048, 0, 0, 1.0 3392, 2048, 0, 1, 1.0 3392, 2069, 0, 0, 0.964 3392, 2069, 0, 1, 0.964 3392, 2048, 21, 0, 0.896 3392, 2048, 21, 1, 0.896 3392, 2069, 21, 0, 1.113 3392, 2069, 21, 1, 1.113 3392, 21, 1, 0, 0.994 3392, 21, 1, 1, 0.994 3392, 1, 21, 0, 0.918 3392, 1, 21, 1, 0.918 3392, 53, 1, 0, 0.972 3392, 53, 1, 1, 0.972 3392, 1, 53, 0, 0.891 3392, 1, 53, 1, 0.891 3392, 2069, 1, 0, 0.994 3392, 2069, 1, 1, 0.994 3392, 2049, 21, 0, 0.915 3392, 2049, 21, 1, 0.915 3456, 0, 0, 0, 0.995 3456, 0, 0, 1, 0.995 3456, 22, 0, 0, 0.965 3456, 22, 0, 1, 0.965 3456, 54, 0, 0, 0.996 3456, 54, 0, 1, 0.996 3456, 0, 22, 0, 0.927 
3456, 0, 22, 1, 0.927 3456, 0, 54, 0, 0.927 3456, 0, 54, 1, 0.927 3456, 22, 22, 0, 1.106 3456, 22, 22, 1, 1.107 3456, 54, 54, 0, 0.98 3456, 54, 54, 1, 0.98 3456, 2048, 0, 0, 0.995 3456, 2048, 0, 1, 0.995 3456, 2070, 0, 0, 0.965 3456, 2070, 0, 1, 0.965 3456, 2048, 22, 0, 0.893 3456, 2048, 22, 1, 0.893 3456, 2070, 22, 0, 1.107 3456, 2070, 22, 1, 1.107 3456, 22, 1, 0, 0.988 3456, 22, 1, 1, 0.988 3456, 1, 22, 0, 0.915 3456, 1, 22, 1, 0.915 3456, 54, 1, 0, 0.963 3456, 54, 1, 1, 0.963 3456, 1, 54, 0, 0.887 3456, 1, 54, 1, 0.887 3456, 2070, 1, 0, 0.988 3456, 2070, 1, 1, 0.988 3456, 2049, 22, 0, 0.911 3456, 2049, 22, 1, 0.911 3520, 0, 0, 0, 1.016 3520, 0, 0, 1, 1.016 3520, 23, 0, 0, 0.957 3520, 23, 0, 1, 0.957 3520, 55, 0, 0, 0.991 3520, 55, 0, 1, 0.991 3520, 0, 23, 0, 0.918 3520, 0, 23, 1, 0.929 3520, 0, 55, 0, 0.935 3520, 0, 55, 1, 0.934 3520, 23, 23, 0, 1.111 3520, 23, 23, 1, 1.111 3520, 55, 55, 0, 0.994 3520, 55, 55, 1, 0.994 3520, 2048, 0, 0, 1.016 3520, 2048, 0, 1, 1.016 3520, 2071, 0, 0, 0.957 3520, 2071, 0, 1, 0.957 3520, 2048, 23, 0, 0.903 3520, 2048, 23, 1, 0.902 3520, 2071, 23, 0, 1.111 3520, 2071, 23, 1, 1.111 3520, 23, 1, 0, 0.997 3520, 23, 1, 1, 0.997 3520, 1, 23, 0, 0.926 3520, 1, 23, 1, 0.927 3520, 55, 1, 0, 0.976 3520, 55, 1, 1, 0.976 3520, 1, 55, 0, 0.902 3520, 1, 55, 1, 0.902 3520, 2071, 1, 0, 0.997 3520, 2071, 1, 1, 0.997 3520, 2049, 23, 0, 0.924 3520, 2049, 23, 1, 0.924 3584, 0, 0, 0, 1.005 3584, 0, 0, 1, 1.004 3584, 24, 0, 0, 0.985 3584, 24, 0, 1, 0.979 3584, 56, 0, 0, 1.006 3584, 56, 0, 1, 1.006 3584, 0, 24, 0, 0.931 3584, 0, 24, 1, 0.931 3584, 0, 56, 0, 0.93 3584, 0, 56, 1, 0.93 3584, 24, 24, 0, 1.111 3584, 24, 24, 1, 1.11 3584, 56, 56, 0, 1.102 3584, 56, 56, 1, 1.101 3584, 2048, 0, 0, 1.006 3584, 2048, 0, 1, 1.005 3584, 2072, 0, 0, 0.983 3584, 2072, 0, 1, 0.977 3584, 2048, 24, 0, 0.896 3584, 2048, 24, 1, 0.897 3584, 2072, 24, 0, 1.111 3584, 2072, 24, 1, 1.111 3584, 24, 1, 0, 1.004 3584, 24, 1, 1, 1.004 3584, 1, 24, 0, 0.921 3584, 1, 24, 1, 0.921 
3584, 56, 1, 0, 0.97 3584, 56, 1, 1, 0.97 3584, 1, 56, 0, 0.891 3584, 1, 56, 1, 0.891 3584, 2072, 1, 0, 1.004 3584, 2072, 1, 1, 1.004 3584, 2049, 24, 0, 0.918 3584, 2049, 24, 1, 0.918 3648, 0, 0, 0, 1.012 3648, 0, 0, 1, 1.012 3648, 25, 0, 0, 0.96 3648, 25, 0, 1, 0.96 3648, 57, 0, 0, 0.988 3648, 57, 0, 1, 0.988 3648, 0, 25, 0, 0.927 3648, 0, 25, 1, 0.927 3648, 0, 57, 0, 0.927 3648, 0, 57, 1, 0.927 3648, 25, 25, 0, 1.1 3648, 25, 25, 1, 1.1 3648, 57, 57, 0, 0.986 3648, 57, 57, 1, 0.986 3648, 2048, 0, 0, 1.012 3648, 2048, 0, 1, 1.012 3648, 2073, 0, 0, 0.96 3648, 2073, 0, 1, 0.96 3648, 2048, 25, 0, 0.895 3648, 2048, 25, 1, 0.894 3648, 2073, 25, 0, 1.103 3648, 2073, 25, 1, 1.103 3648, 25, 1, 0, 1.032 3648, 25, 1, 1, 1.032 3648, 1, 25, 0, 0.9 3648, 1, 25, 1, 0.901 3648, 57, 1, 0, 0.974 3648, 57, 1, 1, 0.974 3648, 1, 57, 0, 0.888 3648, 1, 57, 1, 0.888 3648, 2073, 1, 0, 1.032 3648, 2073, 1, 1, 1.032 3648, 2049, 25, 0, 0.895 3648, 2049, 25, 1, 0.896 3712, 0, 0, 0, 0.996 3712, 0, 0, 1, 0.996 3712, 26, 0, 0, 0.959 3712, 26, 0, 1, 0.959 3712, 58, 0, 0, 0.995 3712, 58, 0, 1, 0.995 3712, 0, 26, 0, 0.92 3712, 0, 26, 1, 0.919 3712, 0, 58, 0, 0.931 3712, 0, 58, 1, 0.931 3712, 26, 26, 0, 1.103 3712, 26, 26, 1, 1.101 3712, 58, 58, 0, 0.99 3712, 58, 58, 1, 0.989 3712, 2048, 0, 0, 0.997 3712, 2048, 0, 1, 0.997 3712, 2074, 0, 0, 0.959 3712, 2074, 0, 1, 0.959 3712, 2048, 26, 0, 0.901 3712, 2048, 26, 1, 0.901 3712, 2074, 26, 0, 1.103 3712, 2074, 26, 1, 1.103 3712, 26, 1, 0, 1.001 3712, 26, 1, 1, 1.001 3712, 1, 26, 0, 0.928 3712, 1, 26, 1, 0.928 3712, 58, 1, 0, 0.974 3712, 58, 1, 1, 0.974 3712, 1, 58, 0, 0.903 3712, 1, 58, 1, 0.902 3712, 2074, 1, 0, 1.001 3712, 2074, 1, 1, 1.001 3712, 2049, 26, 0, 0.925 3712, 2049, 26, 1, 0.925 3776, 0, 0, 0, 1.003 3776, 0, 0, 1, 1.003 3776, 27, 0, 0, 0.964 3776, 27, 0, 1, 0.963 3776, 59, 0, 0, 1.004 3776, 59, 0, 1, 1.004 3776, 0, 27, 0, 0.931 3776, 0, 27, 1, 0.931 3776, 0, 59, 0, 0.929 3776, 0, 59, 1, 0.929 3776, 27, 27, 0, 1.097 3776, 27, 27, 1, 1.097 
3776, 59, 59, 0, 0.992 3776, 59, 59, 1, 0.992 3776, 2048, 0, 0, 1.003 3776, 2048, 0, 1, 1.003 3776, 2075, 0, 0, 0.964 3776, 2075, 0, 1, 0.963 3776, 2048, 27, 0, 0.898 3776, 2048, 27, 1, 0.898 3776, 2075, 27, 0, 1.097 3776, 2075, 27, 1, 1.097 3776, 27, 1, 0, 0.991 3776, 27, 1, 1, 0.991 3776, 1, 27, 0, 0.919 3776, 1, 27, 1, 0.919 3776, 59, 1, 0, 0.979 3776, 59, 1, 1, 0.979 3776, 1, 59, 0, 0.894 3776, 1, 59, 1, 0.894 3776, 2075, 1, 0, 0.991 3776, 2075, 1, 1, 0.991 3776, 2049, 27, 0, 0.916 3776, 2049, 27, 1, 0.917 3840, 0, 0, 0, 0.998 3840, 0, 0, 1, 0.998 3840, 28, 0, 0, 0.968 3840, 28, 0, 1, 0.968 3840, 60, 0, 0, 1.001 3840, 60, 0, 1, 1.001 3840, 0, 28, 0, 0.927 3840, 0, 28, 1, 0.927 3840, 0, 60, 0, 0.927 3840, 0, 60, 1, 0.927 3840, 28, 28, 0, 1.094 3840, 28, 28, 1, 1.094 3840, 60, 60, 0, 0.982 3840, 60, 60, 1, 0.982 3840, 2048, 0, 0, 0.998 3840, 2048, 0, 1, 0.998 3840, 2076, 0, 0, 0.968 3840, 2076, 0, 1, 0.968 3840, 2048, 28, 0, 0.896 3840, 2048, 28, 1, 0.896 3840, 2076, 28, 0, 1.094 3840, 2076, 28, 1, 1.094 3840, 28, 1, 0, 0.99 3840, 28, 1, 1, 0.99 3840, 1, 28, 0, 0.91 3840, 1, 28, 1, 0.91 3840, 60, 1, 0, 0.969 3840, 60, 1, 1, 0.969 3840, 1, 60, 0, 0.89 3840, 1, 60, 1, 0.891 3840, 2076, 1, 0, 0.99 3840, 2076, 1, 1, 0.99 3840, 2049, 28, 0, 0.906 3840, 2049, 28, 1, 0.906 3904, 0, 0, 0, 1.001 3904, 0, 0, 1, 0.998 3904, 29, 0, 0, 0.961 3904, 29, 0, 1, 0.961 3904, 61, 0, 0, 0.997 3904, 61, 0, 1, 0.997 3904, 0, 29, 0, 0.92 3904, 0, 29, 1, 0.926 3904, 0, 61, 0, 0.933 3904, 0, 61, 1, 0.933 3904, 29, 29, 0, 1.103 3904, 29, 29, 1, 1.103 3904, 61, 61, 0, 0.995 3904, 61, 61, 1, 0.995 3904, 2048, 0, 0, 0.998 3904, 2048, 0, 1, 0.998 3904, 2077, 0, 0, 0.961 3904, 2077, 0, 1, 0.961 3904, 2048, 29, 0, 0.904 3904, 2048, 29, 1, 0.904 3904, 2077, 29, 0, 1.102 3904, 2077, 29, 1, 1.102 3904, 29, 1, 0, 1.0 3904, 29, 1, 1, 1.0 3904, 1, 29, 0, 0.911 3904, 1, 29, 1, 0.911 3904, 61, 1, 0, 0.98 3904, 61, 1, 1, 0.98 3904, 1, 61, 0, 0.904 3904, 1, 61, 1, 0.904 3904, 2077, 1, 0, 1.0 3904, 2077, 
1, 1, 1.0 3904, 2049, 29, 0, 0.906 3904, 2049, 29, 1, 0.907 3968, 0, 0, 0, 1.003 3968, 0, 0, 1, 1.003 3968, 30, 0, 0, 0.969 3968, 30, 0, 1, 0.969 3968, 62, 0, 0, 1.005 3968, 62, 0, 1, 1.006 3968, 0, 30, 0, 0.931 3968, 0, 30, 1, 0.931 3968, 0, 62, 0, 0.93 3968, 0, 62, 1, 0.93 3968, 30, 30, 0, 1.103 3968, 30, 30, 1, 1.103 3968, 62, 62, 0, 0.99 3968, 62, 62, 1, 0.99 3968, 2048, 0, 0, 1.004 3968, 2048, 0, 1, 1.004 3968, 2078, 0, 0, 0.968 3968, 2078, 0, 1, 0.969 3968, 2048, 30, 0, 0.899 3968, 2048, 30, 1, 0.899 3968, 2078, 30, 0, 1.105 3968, 2078, 30, 1, 1.105 3968, 30, 1, 0, 0.993 3968, 30, 1, 1, 0.993 3968, 1, 30, 0, 0.914 3968, 1, 30, 1, 0.913 3968, 62, 1, 0, 0.978 3968, 62, 1, 1, 0.978 3968, 1, 62, 0, 0.895 3968, 1, 62, 1, 0.895 3968, 2078, 1, 0, 0.993 3968, 2078, 1, 1, 0.993 3968, 2049, 30, 0, 0.911 3968, 2049, 30, 1, 0.911 4032, 0, 0, 0, 0.995 4032, 0, 0, 1, 0.995 4032, 31, 0, 0, 0.967 4032, 31, 0, 1, 0.967 4032, 63, 0, 0, 1.003 4032, 63, 0, 1, 1.002 4032, 0, 31, 0, 0.927 4032, 0, 31, 1, 0.927 4032, 0, 63, 0, 0.927 4032, 0, 63, 1, 0.927 4032, 31, 31, 0, 1.09 4032, 31, 31, 1, 1.09 4032, 63, 63, 0, 0.987 4032, 63, 63, 1, 0.987 4032, 2048, 0, 0, 0.995 4032, 2048, 0, 1, 0.995 4032, 2079, 0, 0, 0.967 4032, 2079, 0, 1, 0.967 4032, 2048, 31, 0, 0.897 4032, 2048, 31, 1, 0.897 4032, 2079, 31, 0, 1.09 4032, 2079, 31, 1, 1.09 4032, 31, 1, 0, 0.989 4032, 31, 1, 1, 0.989 4032, 1, 31, 0, 0.922 4032, 1, 31, 1, 0.923 4032, 63, 1, 0, 0.971 4032, 63, 1, 1, 0.972 4032, 1, 63, 0, 0.892 4032, 1, 63, 1, 0.892 4032, 2079, 1, 0, 0.988 4032, 2079, 1, 1, 0.988 4032, 2049, 31, 0, 0.919 4032, 2049, 31, 1, 0.919 4096, 32, 0, 0, 1.014 4096, 32, 0, 1, 1.014 4096, 64, 0, 0, 1.014 4096, 64, 0, 1, 1.014 4096, 0, 32, 0, 1.013 4096, 0, 32, 1, 1.013 4096, 0, 64, 0, 1.013 4096, 0, 64, 1, 1.013 4096, 32, 32, 0, 1.014 4096, 32, 32, 1, 1.014 4096, 64, 64, 0, 1.014 4096, 64, 64, 1, 1.014 4096, 2080, 0, 0, 1.014 4096, 2080, 0, 1, 1.014 4096, 2048, 32, 0, 1.014 4096, 2048, 32, 1, 1.014 4096, 2080, 32, 0, 
1.014 4096, 2080, 32, 1, 1.014 4096, 32, 1, 0, 0.975 4096, 32, 1, 1, 0.975 4096, 1, 32, 0, 0.769 4096, 1, 32, 1, 0.769 4096, 64, 1, 0, 0.858 4096, 64, 1, 1, 0.858 4096, 1, 64, 0, 0.769 4096, 1, 64, 1, 0.769 4096, 2080, 1, 0, 0.829 4096, 2080, 1, 1, 0.829 4096, 2049, 32, 0, 0.886 4096, 2049, 32, 1, 0.886 4160, 0, 0, 0, 1.003 4160, 0, 0, 1, 1.003 4160, 33, 0, 0, 1.004 4160, 33, 0, 1, 1.004 4160, 65, 0, 0, 0.999 4160, 65, 0, 1, 0.999 4160, 0, 33, 0, 0.931 4160, 0, 33, 1, 0.931 4160, 0, 65, 0, 0.765 4160, 0, 65, 1, 0.765 4160, 33, 33, 0, 0.998 4160, 33, 33, 1, 0.998 4160, 65, 65, 0, 0.942 4160, 65, 65, 1, 0.942 4160, 2048, 0, 0, 1.003 4160, 2048, 0, 1, 1.003 4160, 2081, 0, 0, 1.005 4160, 2081, 0, 1, 1.005 4160, 2048, 33, 0, 0.899 4160, 2048, 33, 1, 0.899 4160, 2081, 33, 0, 1.002 4160, 2081, 33, 1, 1.002 4160, 33, 1, 0, 1.114 4160, 33, 1, 1, 1.114 4160, 1, 33, 0, 1.01 4160, 1, 33, 1, 1.01 4160, 65, 1, 0, 1.077 4160, 65, 1, 1, 1.077 4160, 1, 65, 0, 0.935 4160, 1, 65, 1, 0.936 4160, 2081, 1, 0, 1.077 4160, 2081, 1, 1, 1.077 4160, 2049, 33, 0, 1.008 4160, 2049, 33, 1, 1.007 4224, 0, 0, 0, 1.014 4224, 0, 0, 1, 1.014 4224, 34, 0, 0, 1.0 4224, 34, 0, 1, 1.0 4224, 66, 0, 0, 1.001 4224, 66, 0, 1, 1.001 4224, 0, 34, 0, 0.928 4224, 0, 34, 1, 0.928 4224, 0, 66, 0, 0.762 4224, 0, 66, 1, 0.762 4224, 34, 34, 0, 0.998 4224, 34, 34, 1, 0.998 4224, 66, 66, 0, 0.959 4224, 66, 66, 1, 0.959 4224, 2048, 0, 0, 1.014 4224, 2048, 0, 1, 1.014 4224, 2082, 0, 0, 1.001 4224, 2082, 0, 1, 1.001 4224, 2048, 34, 0, 0.899 4224, 2048, 34, 1, 0.898 4224, 2082, 34, 0, 0.998 4224, 2082, 34, 1, 0.997 4224, 34, 1, 0, 1.024 4224, 34, 1, 1, 1.024 4224, 1, 34, 0, 0.923 4224, 1, 34, 1, 0.923 4224, 66, 1, 0, 1.013 4224, 66, 1, 1, 1.013 4224, 1, 66, 0, 0.917 4224, 1, 66, 1, 0.917 4224, 2082, 1, 0, 1.022 4224, 2082, 1, 1, 1.022 4224, 2049, 34, 0, 0.92 4224, 2049, 34, 1, 0.92 4288, 0, 0, 0, 0.999 4288, 0, 0, 1, 0.999 4288, 35, 0, 0, 0.995 4288, 35, 0, 1, 0.996 4288, 67, 0, 0, 0.998 4288, 67, 0, 1, 0.998 4288, 0, 35, 
0, 0.917 4288, 0, 35, 1, 0.919 4288, 0, 67, 0, 0.767 4288, 0, 67, 1, 0.767 4288, 35, 35, 0, 1.004 4288, 35, 35, 1, 1.004 4288, 67, 67, 0, 0.995 4288, 67, 67, 1, 0.995 4288, 2048, 0, 0, 0.999 4288, 2048, 0, 1, 0.999 4288, 2083, 0, 0, 0.995 4288, 2083, 0, 1, 0.995 4288, 2048, 35, 0, 0.905 4288, 2048, 35, 1, 0.904 4288, 2083, 35, 0, 1.004 4288, 2083, 35, 1, 1.004 4288, 35, 1, 0, 1.032 4288, 35, 1, 1, 1.033 4288, 1, 35, 0, 0.928 4288, 1, 35, 1, 0.928 4288, 67, 1, 0, 1.019 4288, 67, 1, 1, 1.019 4288, 1, 67, 0, 0.924 4288, 1, 67, 1, 0.924 4288, 2083, 1, 0, 1.03 4288, 2083, 1, 1, 1.031 4288, 2049, 35, 0, 0.925 4288, 2049, 35, 1, 0.925 4352, 0, 0, 0, 1.005 4352, 0, 0, 1, 1.006 4352, 36, 0, 0, 1.006 4352, 36, 0, 1, 1.007 4352, 68, 0, 0, 1.006 4352, 68, 0, 1, 1.007 4352, 0, 36, 0, 0.929 4352, 0, 36, 1, 0.928 4352, 0, 68, 0, 0.766 4352, 0, 68, 1, 0.765 4352, 36, 36, 0, 0.998 4352, 36, 36, 1, 0.998 4352, 68, 68, 0, 0.964 4352, 68, 68, 1, 0.964 4352, 2048, 0, 0, 1.006 4352, 2048, 0, 1, 1.006 4352, 2084, 0, 0, 1.007 4352, 2084, 0, 1, 1.007 4352, 2048, 36, 0, 0.897 4352, 2048, 36, 1, 0.898 4352, 2084, 36, 0, 0.998 4352, 2084, 36, 1, 0.998 4352, 36, 1, 0, 1.031 4352, 36, 1, 1, 1.031 4352, 1, 36, 0, 0.924 4352, 1, 36, 1, 0.925 4352, 68, 1, 0, 0.999 4352, 68, 1, 1, 0.999 4352, 1, 68, 0, 0.922 4352, 1, 68, 1, 0.922 4352, 2084, 1, 0, 1.032 4352, 2084, 1, 1, 1.03 4352, 2049, 36, 0, 0.923 4352, 2049, 36, 1, 0.923 4416, 0, 0, 0, 0.997 4416, 0, 0, 1, 0.997 4416, 37, 0, 0, 1.001 4416, 37, 0, 1, 1.002 4416, 69, 0, 0, 1.004 4416, 69, 0, 1, 1.003 4416, 0, 37, 0, 0.928 4416, 0, 37, 1, 0.927 4416, 0, 69, 0, 0.762 4416, 0, 69, 1, 0.763 4416, 37, 37, 0, 0.994 4416, 37, 37, 1, 0.994 4416, 69, 69, 0, 0.959 4416, 69, 69, 1, 0.959 4416, 2048, 0, 0, 0.997 4416, 2048, 0, 1, 0.997 4416, 2085, 0, 0, 1.002 4416, 2085, 0, 1, 1.001 4416, 2048, 37, 0, 0.9 4416, 2048, 37, 1, 0.9 4416, 2085, 37, 0, 0.994 4416, 2085, 37, 1, 0.994 4416, 37, 1, 0, 1.024 4416, 37, 1, 1, 1.025 4416, 1, 37, 0, 0.922 4416, 1, 37, 1, 
0.922 4416, 69, 1, 0, 1.008 4416, 69, 1, 1, 1.009 4416, 1, 69, 0, 0.913 4416, 1, 69, 1, 0.912 4416, 2085, 1, 0, 1.025 4416, 2085, 1, 1, 1.024 4416, 2049, 37, 0, 0.92 4416, 2049, 37, 1, 0.919 4480, 0, 0, 0, 1.0 4480, 0, 0, 1, 0.998 4480, 38, 0, 0, 0.996 4480, 38, 0, 1, 0.996 4480, 70, 0, 0, 0.992 4480, 70, 0, 1, 0.992 4480, 0, 38, 0, 0.919 4480, 0, 38, 1, 0.916 4480, 0, 70, 0, 0.767 4480, 0, 70, 1, 0.767 4480, 38, 38, 0, 1.002 4480, 38, 38, 1, 1.002 4480, 70, 70, 0, 0.963 4480, 70, 70, 1, 0.963 4480, 2048, 0, 0, 0.998 4480, 2048, 0, 1, 0.998 4480, 2086, 0, 0, 0.996 4480, 2086, 0, 1, 0.996 4480, 2048, 38, 0, 0.907 4480, 2048, 38, 1, 0.907 4480, 2086, 38, 0, 1.002 4480, 2086, 38, 1, 1.002 4480, 38, 1, 0, 1.023 4480, 38, 1, 1, 1.024 4480, 1, 38, 0, 0.914 4480, 1, 38, 1, 0.913 4480, 70, 1, 0, 1.01 4480, 70, 1, 1, 1.011 4480, 1, 70, 0, 0.922 4480, 1, 70, 1, 0.922 4480, 2086, 1, 0, 1.024 4480, 2086, 1, 1, 1.024 4480, 2049, 38, 0, 0.911 4480, 2049, 38, 1, 0.91 4544, 0, 0, 0, 1.002 4544, 0, 0, 1, 1.002 4544, 39, 0, 0, 1.007 4544, 39, 0, 1, 1.007 4544, 71, 0, 0, 1.01 4544, 71, 0, 1, 1.008 4544, 0, 39, 0, 0.93 4544, 0, 39, 1, 0.93 4544, 0, 71, 0, 0.766 4544, 0, 71, 1, 0.766 4544, 39, 39, 0, 1.001 4544, 39, 39, 1, 1.001 4544, 71, 71, 0, 0.966 4544, 71, 71, 1, 0.966 4544, 2048, 0, 0, 1.002 4544, 2048, 0, 1, 1.002 4544, 2087, 0, 0, 1.008 4544, 2087, 0, 1, 1.008 4544, 2048, 39, 0, 0.901 4544, 2048, 39, 1, 0.902 4544, 2087, 39, 0, 1.001 4544, 2087, 39, 1, 1.001 4544, 39, 1, 0, 1.032 4544, 39, 1, 1, 1.032 4544, 1, 39, 0, 0.925 4544, 1, 39, 1, 0.925 4544, 71, 1, 0, 0.997 4544, 71, 1, 1, 0.998 4544, 1, 71, 0, 0.921 4544, 1, 71, 1, 0.922 4544, 2087, 1, 0, 1.032 4544, 2087, 1, 1, 1.032 4544, 2049, 39, 0, 0.924 4544, 2049, 39, 1, 0.923 4608, 0, 0, 0, 0.999 4608, 0, 0, 1, 0.998 4608, 40, 0, 0, 1.013 4608, 40, 0, 1, 1.012 4608, 72, 0, 0, 1.013 4608, 72, 0, 1, 1.013 4608, 0, 40, 0, 0.925 4608, 0, 40, 1, 0.926 4608, 0, 72, 0, 0.765 4608, 0, 72, 1, 0.765 4608, 40, 40, 0, 1.085 4608, 40, 40, 
1, 1.086 4608, 72, 72, 0, 0.966 4608, 72, 72, 1, 0.966 4608, 2048, 0, 0, 0.999 4608, 2048, 0, 1, 0.999 4608, 2088, 0, 0, 1.012 4608, 2088, 0, 1, 1.013 4608, 2048, 40, 0, 0.898 4608, 2048, 40, 1, 0.898 4608, 2088, 40, 0, 1.087 4608, 2088, 40, 1, 1.087 4608, 40, 1, 0, 1.006 4608, 40, 1, 1, 1.007 4608, 1, 40, 0, 0.919 4608, 1, 40, 1, 0.919 4608, 72, 1, 0, 1.012 4608, 72, 1, 1, 1.012 4608, 1, 72, 0, 0.914 4608, 1, 72, 1, 0.914 4608, 2088, 1, 0, 1.006 4608, 2088, 1, 1, 1.007 4608, 2049, 40, 0, 0.916 4608, 2049, 40, 1, 0.916 4672, 0, 0, 0, 1.014 4672, 0, 0, 1, 1.014 4672, 41, 0, 0, 1.002 4672, 41, 0, 1, 1.002 4672, 73, 0, 0, 0.976 4672, 73, 0, 1, 0.975 4672, 0, 41, 0, 0.919 4672, 0, 41, 1, 0.919 4672, 0, 73, 0, 0.772 4672, 0, 73, 1, 0.772 4672, 41, 41, 0, 1.012 4672, 41, 41, 1, 1.012 4672, 73, 73, 0, 0.973 4672, 73, 73, 1, 0.973 4672, 2048, 0, 0, 1.014 4672, 2048, 0, 1, 1.014 4672, 2089, 0, 0, 1.003 4672, 2089, 0, 1, 1.002 4672, 2048, 41, 0, 0.907 4672, 2048, 41, 1, 0.908 4672, 2089, 41, 0, 1.012 4672, 2089, 41, 1, 1.012 4672, 41, 1, 0, 1.02 4672, 41, 1, 1, 1.02 4672, 1, 41, 0, 0.916 4672, 1, 41, 1, 0.914 4672, 73, 1, 0, 1.024 4672, 73, 1, 1, 1.024 4672, 1, 73, 0, 0.927 4672, 1, 73, 1, 0.927 4672, 2089, 1, 0, 1.019 4672, 2089, 1, 1, 1.02 4672, 2049, 41, 0, 0.912 4672, 2049, 41, 1, 0.912 4736, 0, 0, 0, 1.007 4736, 0, 0, 1, 1.006 4736, 42, 0, 0, 1.012 4736, 42, 0, 1, 1.013 4736, 74, 0, 0, 0.976 4736, 74, 0, 1, 0.975 4736, 0, 42, 0, 0.93 4736, 0, 42, 1, 0.931 4736, 0, 74, 0, 0.769 4736, 0, 74, 1, 0.77 4736, 42, 42, 0, 1.007 4736, 42, 42, 1, 1.007 4736, 74, 74, 0, 0.965 4736, 74, 74, 1, 0.965 4736, 2048, 0, 0, 1.006 4736, 2048, 0, 1, 1.007 4736, 2090, 0, 0, 1.012 4736, 2090, 0, 1, 1.013 4736, 2048, 42, 0, 0.902 4736, 2048, 42, 1, 0.901 4736, 2090, 42, 0, 1.007 4736, 2090, 42, 1, 1.007 4736, 42, 1, 0, 1.032 4736, 42, 1, 1, 1.032 4736, 1, 42, 0, 0.919 4736, 1, 42, 1, 0.919 4736, 74, 1, 0, 1.017 4736, 74, 1, 1, 1.018 4736, 1, 74, 0, 0.919 4736, 1, 74, 1, 0.918 4736, 2090, 1, 0, 
1.031 4736, 2090, 1, 1, 1.031 4736, 2049, 42, 0, 0.916 4736, 2049, 42, 1, 0.916 4800, 0, 0, 0, 1.012 4800, 0, 0, 1, 1.012 4800, 43, 0, 0, 1.008 4800, 43, 0, 1, 1.009 4800, 75, 0, 0, 0.99 4800, 75, 0, 1, 0.99 4800, 0, 43, 0, 0.929 4800, 0, 43, 1, 0.927 4800, 0, 75, 0, 0.768 4800, 0, 75, 1, 0.768 4800, 43, 43, 0, 1.004 4800, 43, 43, 1, 1.004 4800, 75, 75, 0, 0.965 4800, 75, 75, 1, 0.965 4800, 2048, 0, 0, 1.012 4800, 2048, 0, 1, 1.012 4800, 2091, 0, 0, 1.009 4800, 2091, 0, 1, 1.008 4800, 2048, 43, 0, 0.901 4800, 2048, 43, 1, 0.901 4800, 2091, 43, 0, 1.004 4800, 2091, 43, 1, 1.004 4800, 43, 1, 0, 1.026 4800, 43, 1, 1, 1.026 4800, 1, 43, 0, 0.923 4800, 1, 43, 1, 0.922 4800, 75, 1, 0, 0.993 4800, 75, 1, 1, 0.991 4800, 1, 75, 0, 0.921 4800, 1, 75, 1, 0.92 4800, 2091, 1, 0, 1.026 4800, 2091, 1, 1, 1.026 4800, 2049, 43, 0, 0.92 4800, 2049, 43, 1, 0.919 4864, 0, 0, 0, 0.999 4864, 0, 0, 1, 0.999 4864, 44, 0, 0, 0.998 4864, 44, 0, 1, 0.998 4864, 76, 0, 0, 0.981 4864, 76, 0, 1, 0.981 4864, 0, 44, 0, 0.916 4864, 0, 44, 1, 0.918 4864, 0, 76, 0, 0.772 4864, 0, 76, 1, 0.771 4864, 44, 44, 0, 1.006 4864, 44, 44, 1, 1.005 4864, 76, 76, 0, 0.97 4864, 76, 76, 1, 0.97 4864, 2048, 0, 0, 0.999 4864, 2048, 0, 1, 0.999 4864, 2092, 0, 0, 0.997 4864, 2092, 0, 1, 0.997 4864, 2048, 44, 0, 0.908 4864, 2048, 44, 1, 0.907 4864, 2092, 44, 0, 1.005 4864, 2092, 44, 1, 1.005 4864, 44, 1, 0, 0.893 4864, 44, 1, 1, 0.893 4864, 1, 44, 0, 0.922 4864, 1, 44, 1, 0.921 4864, 76, 1, 0, 0.866 4864, 76, 1, 1, 0.866 4864, 1, 76, 0, 0.919 4864, 1, 76, 1, 0.919 4864, 2092, 1, 0, 0.893 4864, 2092, 1, 1, 0.893 4864, 2049, 44, 0, 0.919 4864, 2049, 44, 1, 0.919 4928, 0, 0, 0, 1.005 4928, 0, 0, 1, 1.005 4928, 45, 0, 0, 1.005 4928, 45, 0, 1, 1.005 4928, 77, 0, 0, 0.97 4928, 77, 0, 1, 0.97 4928, 0, 45, 0, 0.931 4928, 0, 45, 1, 0.932 4928, 0, 77, 0, 0.771 4928, 0, 77, 1, 0.771 4928, 45, 45, 0, 1.0 4928, 45, 45, 1, 1.0 4928, 77, 77, 0, 0.972 4928, 77, 77, 1, 0.972 4928, 2048, 0, 0, 1.005 4928, 2048, 0, 1, 1.005 4928, 2093, 
0, 0, 1.005 4928, 2093, 0, 1, 1.005 4928, 2048, 45, 0, 0.904 4928, 2048, 45, 1, 0.905 4928, 2093, 45, 0, 1.0 4928, 2093, 45, 1, 1.0 4928, 45, 1, 0, 1.024 4928, 45, 1, 1, 1.024 4928, 1, 45, 0, 0.913 4928, 1, 45, 1, 0.912 4928, 77, 1, 0, 0.996 4928, 77, 1, 1, 0.996 4928, 1, 77, 0, 0.925 4928, 1, 77, 1, 0.925 4928, 2093, 1, 0, 1.025 4928, 2093, 1, 1, 1.024 4928, 2049, 45, 0, 0.916 4928, 2049, 45, 1, 0.911 4992, 0, 0, 0, 1.0 4992, 0, 0, 1, 1.0 4992, 46, 0, 0, 1.009 4992, 46, 0, 1, 1.009 4992, 78, 0, 0, 0.992 4992, 78, 0, 1, 0.992 4992, 0, 46, 0, 0.908 4992, 0, 46, 1, 0.908 4992, 0, 78, 0, 0.751 4992, 0, 78, 1, 0.752 4992, 46, 46, 0, 0.997 4992, 46, 46, 1, 0.997 4992, 78, 78, 0, 0.968 4992, 78, 78, 1, 0.969 4992, 2048, 0, 0, 1.0 4992, 2048, 0, 1, 1.001 4992, 2094, 0, 0, 1.008 4992, 2094, 0, 1, 1.009 4992, 2048, 46, 0, 0.883 4992, 2048, 46, 1, 0.883 4992, 2094, 46, 0, 0.997 4992, 2094, 46, 1, 0.997 4992, 46, 1, 0, 1.025 4992, 46, 1, 1, 1.025 4992, 1, 46, 0, 0.923 4992, 1, 46, 1, 0.923 4992, 78, 1, 0, 1.0 4992, 78, 1, 1, 1.001 4992, 1, 78, 0, 0.92 4992, 1, 78, 1, 0.92 4992, 2094, 1, 0, 1.025 4992, 2094, 1, 1, 1.026 4992, 2049, 46, 0, 0.92 4992, 2049, 46, 1, 0.921 5056, 0, 0, 0, 1.002 5056, 0, 0, 1, 1.001 5056, 47, 0, 0, 1.006 5056, 47, 0, 1, 1.006 5056, 79, 0, 0, 0.99 5056, 79, 0, 1, 0.988 5056, 0, 47, 0, 0.917 5056, 0, 47, 1, 0.916 5056, 0, 79, 0, 0.771 5056, 0, 79, 1, 0.772 5056, 47, 47, 0, 1.006 5056, 47, 47, 1, 1.006 5056, 79, 79, 0, 0.972 5056, 79, 79, 1, 0.973 5056, 2048, 0, 0, 1.003 5056, 2048, 0, 1, 1.001 5056, 2095, 0, 0, 1.005 5056, 2095, 0, 1, 1.004 5056, 2048, 47, 0, 0.908 5056, 2048, 47, 1, 0.909 5056, 2095, 47, 0, 1.006 5056, 2095, 47, 1, 1.006 5056, 47, 1, 0, 1.032 5056, 47, 1, 1, 1.034 5056, 1, 47, 0, 0.926 5056, 1, 47, 1, 0.926 5056, 79, 1, 0, 1.003 5056, 79, 1, 1, 1.004 5056, 1, 79, 0, 0.927 5056, 1, 79, 1, 0.927 5056, 2095, 1, 0, 1.034 5056, 2095, 1, 1, 1.033 5056, 2049, 47, 0, 0.924 5056, 2049, 47, 1, 0.923 5120, 0, 0, 0, 1.003 5120, 0, 0, 1, 1.004 
5120, 48, 0, 0, 1.068 5120, 48, 0, 1, 1.068 5120, 80, 0, 0, 1.068 5120, 80, 0, 1, 1.068 5120, 0, 48, 0, 1.065 5120, 0, 48, 1, 1.064 5120, 0, 80, 0, 1.065 5120, 0, 80, 1, 1.065 5120, 48, 48, 0, 1.004 5120, 48, 48, 1, 1.005 5120, 80, 80, 0, 1.005 5120, 80, 80, 1, 1.005 5120, 2048, 0, 0, 1.005 5120, 2048, 0, 1, 1.005 5120, 2096, 0, 0, 1.068 5120, 2096, 0, 1, 1.068 5120, 2048, 48, 0, 1.066 5120, 2048, 48, 1, 1.065 5120, 2096, 48, 0, 1.005 5120, 2096, 48, 1, 1.005 5120, 48, 1, 0, 1.032 5120, 48, 1, 1, 1.032 5120, 1, 48, 0, 0.899 5120, 1, 48, 1, 0.899 5120, 80, 1, 0, 0.844 5120, 80, 1, 1, 0.843 5120, 1, 80, 0, 0.892 5120, 1, 80, 1, 0.892 5120, 2096, 1, 0, 0.856 5120, 2096, 1, 1, 0.856 5120, 2049, 48, 0, 0.898 5120, 2049, 48, 1, 0.898

Results For: bench-memcpy-large
length, align1, align2, dst > src, New Time / Old Time
65543, 0, 0, 0, 0.977 65543, 0, 0, 1, 0.976 65551, 0, 3, 0, 1.01 65551, 0, 3, 1, 1.011 65567, 3, 0, 0, 1.02 65567, 3, 0, 1, 1.02 65599, 3, 5, 0, 1.056 65599, 3, 5, 1, 1.057 65536, 0, 127, 0, 1.043 65536, 0, 127, 1, 1.043 65536, 0, 255, 0, 1.07 65536, 0, 255, 1, 1.071 65536, 0, 256, 0, 0.978 65536, 0, 256, 1, 0.979 65536, 0, 4064, 0, 1.017 65536, 0, 4064, 1, 1.018 131079, 0, 0, 0, 0.979 131079, 0, 0, 1, 0.979 131087, 0, 3, 0, 1.016 131087, 0, 3, 1, 1.016 131103, 3, 0, 0, 1.022 131103, 3, 0, 1, 1.022 131135, 3, 5, 0, 1.063 131135, 3, 5, 1, 1.063 131072, 0, 127, 0, 1.048 131072, 0, 127, 1, 1.048 131072, 0, 255, 0, 1.074 131072, 0, 255, 1, 1.074 131072, 0, 256, 0, 0.982 131072, 0, 256, 1, 0.982 131072, 0, 4064, 0, 1.018 131072, 0, 4064, 1, 1.019 262151, 0, 0, 0, 0.984 262151, 0, 0, 1, 0.984 262159, 0, 3, 0, 1.024 262159, 0, 3, 1, 1.024 262175, 3, 0, 0, 1.03 262175, 3, 0, 1, 1.03 262207, 3, 5, 0, 1.068 262207, 3, 5, 1, 1.069 262144, 0, 127, 0, 1.056 262144, 0, 127, 1, 1.056 262144, 0, 255, 0, 1.078 262144, 0, 255, 1, 1.078 262144, 0, 256, 0, 0.986 262144, 0, 256, 1, 0.986 262144, 0, 4064, 0, 1.02 262144, 0, 4064, 1, 1.02 524295, 0, 0, 0, 0.692 524295, 0, 0, 1,
0.692 524303, 0, 3, 0, 0.736 524303, 0, 3, 1, 0.736 524319, 3, 0, 0, 0.759 524319, 3, 0, 1, 0.759 524351, 3, 5, 0, 0.758 524351, 3, 5, 1, 0.759 524288, 0, 127, 0, 1.057 524288, 0, 127, 1, 1.057 524288, 0, 255, 0, 1.079 524288, 0, 255, 1, 1.079 524288, 0, 256, 0, 0.987 524288, 0, 256, 1, 0.987 524288, 0, 4064, 0, 1.02 524288, 0, 4064, 1, 1.02 1048583, 0, 0, 0, 0.948 1048583, 0, 0, 1, 0.949 1048591, 0, 3, 0, 0.734 1048591, 0, 3, 1, 0.735 1048607, 3, 0, 0, 0.758 1048607, 3, 0, 1, 0.757 1048639, 3, 5, 0, 0.757 1048639, 3, 5, 1, 0.757 1048576, 0, 127, 0, 0.761 1048576, 0, 127, 1, 0.763 1048576, 0, 255, 0, 0.751 1048576, 0, 255, 1, 0.751 1048576, 0, 256, 0, 0.93 1048576, 0, 256, 1, 0.93 1048576, 0, 4064, 0, 0.93 1048576, 0, 4064, 1, 0.93 2097159, 0, 0, 0, 0.928 2097159, 0, 0, 1, 0.931 2097167, 0, 3, 0, 0.735 2097167, 0, 3, 1, 0.734 2097183, 3, 0, 0, 0.759 2097183, 3, 0, 1, 0.76 2097215, 3, 5, 0, 0.758 2097215, 3, 5, 1, 0.757 2097152, 0, 127, 0, 0.77 2097152, 0, 127, 1, 0.77 2097152, 0, 255, 0, 0.745 2097152, 0, 255, 1, 0.745 2097152, 0, 256, 0, 0.924 2097152, 0, 256, 1, 0.925 2097152, 0, 4064, 0, 0.926 2097152, 0, 4064, 1, 0.927 4194311, 0, 0, 0, 0.886 4194311, 0, 0, 1, 0.89 4194319, 0, 3, 0, 0.746 4194319, 0, 3, 1, 0.745 4194335, 3, 0, 0, 0.816 4194335, 3, 0, 1, 0.816 4194367, 3, 5, 0, 0.78 4194367, 3, 5, 1, 0.781 4194304, 0, 127, 0, 0.792 4194304, 0, 127, 1, 0.791 4194304, 0, 255, 0, 0.803 4194304, 0, 255, 1, 0.799 4194304, 0, 256, 0, 0.865 4194304, 0, 256, 1, 0.863 4194304, 0, 4064, 0, 0.953 4194304, 0, 4064, 1, 0.95 8388615, 0, 0, 0, 0.876 8388615, 0, 0, 1, 0.877 8388623, 0, 3, 0, 0.762 8388623, 0, 3, 1, 0.762 8388639, 3, 0, 0, 0.871 8388639, 3, 0, 1, 0.87 8388671, 3, 5, 0, 0.805 8388671, 3, 5, 1, 0.808 8388608, 0, 127, 0, 0.824 8388608, 0, 127, 1, 0.823 8388608, 0, 255, 0, 0.858 8388608, 0, 255, 1, 0.857 8388608, 0, 256, 0, 0.843 8388608, 0, 256, 1, 0.84 8388608, 0, 4064, 0, 0.981 8388608, 0, 4064, 1, 0.981 16777223, 0, 0, 0, 0.881 16777223, 0, 0, 1, 0.882 16777231, 
0, 3, 0, 0.765 16777231, 0, 3, 1, 0.765 16777247, 3, 0, 0, 0.87 16777247, 3, 0, 1, 0.87 16777279, 3, 5, 0, 0.807 16777279, 3, 5, 1, 0.811 16777216, 0, 127, 0, 0.827 16777216, 0, 127, 1, 0.827 16777216, 0, 255, 0, 0.858 16777216, 0, 255, 1, 0.857 16777216, 0, 256, 0, 0.848 16777216, 0, 256, 1, 0.844 16777216, 0, 4064, 0, 0.98 16777216, 0, 4064, 1, 0.981 33554439, 0, 0, 0, 0.883 33554439, 0, 0, 1, 0.884 33554447, 0, 3, 0, 0.767 33554447, 0, 3, 1, 0.766 33554463, 3, 0, 0, 0.87 33554463, 3, 0, 1, 0.87 33554495, 3, 5, 0, 0.809 33554495, 3, 5, 1, 0.813 33554432, 0, 127, 0, 0.829 33554432, 0, 127, 1, 0.829 33554432, 0, 255, 0, 0.857 33554432, 0, 255, 1, 0.857 33554432, 0, 256, 0, 0.85 33554432, 0, 256, 1, 0.846 33554432, 0, 4064, 0, 0.981 33554432, 0, 4064, 1, 0.981

Results For: bench-memcpy-random
length, New Time / Old Time
32768, 0.888
65536, 0.906
131072, 0.915
262144, 0.919
524288, 0.921
1048576, 0.929

 sysdeps/x86_64/multiarch/Makefile        |    1 -
 sysdeps/x86_64/multiarch/memcpy-ssse3.S  | 3151 ----------------------
 sysdeps/x86_64/multiarch/memmove-ssse3.S |  384 ++-
 3 files changed, 380 insertions(+), 3156 deletions(-)
 delete mode 100644 sysdeps/x86_64/multiarch/memcpy-ssse3.S

diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile
index 303fb5d734..e7ea963fc0 100644
--- a/sysdeps/x86_64/multiarch/Makefile
+++ b/sysdeps/x86_64/multiarch/Makefile
@@ -16,7 +16,6 @@ sysdep_routines += \
   memcmpeq-avx2-rtm \
   memcmpeq-evex \
   memcmpeq-sse2 \
-  memcpy-ssse3 \
   memmove-avx-unaligned-erms \
   memmove-avx-unaligned-erms-rtm \
   memmove-avx512-no-vzeroupper \
diff --git a/sysdeps/x86_64/multiarch/memcpy-ssse3.S b/sysdeps/x86_64/multiarch/memcpy-ssse3.S
deleted file mode 100644
index 65644d3a09..0000000000
--- a/sysdeps/x86_64/multiarch/memcpy-ssse3.S
+++ /dev/null
@@ -1,3151 +0,0 @@
-/* memcpy with SSSE3
-   Copyright (C) 2010-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-
-#if IS_IN (libc)
-
-#include "asm-syntax.h"
-
-#ifndef MEMCPY
-# define MEMCPY		__memcpy_ssse3
-# define MEMCPY_CHK	__memcpy_chk_ssse3
-# define MEMPCPY	__mempcpy_ssse3
-# define MEMPCPY_CHK	__mempcpy_chk_ssse3
-#endif
-
-#define JMPTBL(I, B)	I - B
-
-/* Branch to an entry in a jump table.  TABLE is a jump table with
-   relative offsets.  INDEX is a register contains the index into the
-   jump table.  SCALE is the scale of INDEX. */
-#define BRANCH_TO_JMPTBL_ENTRY(TABLE, INDEX, SCALE) \
-  lea		TABLE(%rip), %r11;			\
-  movslq	(%r11, INDEX, SCALE), INDEX;		\
-  lea		(%r11, INDEX), INDEX;			\
-  _CET_NOTRACK jmp *INDEX;				\
-  ud2
-
-	.section .text.ssse3,"ax",@progbits
-#if !defined USE_AS_MEMPCPY && !defined USE_AS_MEMMOVE
-ENTRY (MEMPCPY_CHK)
-	cmp	%RDX_LP, %RCX_LP
-	jb	HIDDEN_JUMPTARGET (__chk_fail)
-END (MEMPCPY_CHK)
-
-ENTRY (MEMPCPY)
-	mov	%RDI_LP, %RAX_LP
-	add	%RDX_LP, %RAX_LP
-	jmp	L(start)
-END (MEMPCPY)
-#endif
-
-#if !defined USE_AS_BCOPY
-ENTRY (MEMCPY_CHK)
-	cmp	%RDX_LP, %RCX_LP
-	jb	HIDDEN_JUMPTARGET (__chk_fail)
-END (MEMCPY_CHK)
-#endif
-
-ENTRY (MEMCPY)
-	mov	%RDI_LP, %RAX_LP
-#ifdef USE_AS_MEMPCPY
-	add	%RDX_LP, %RAX_LP
-#endif
-
-#ifdef __ILP32__
-	/* Clear the upper 32 bits.
*/ - mov %edx, %edx -#endif - -#ifdef USE_AS_MEMMOVE - cmp %rsi, %rdi - jb L(copy_forward) - je L(write_0bytes) - cmp $79, %rdx - jbe L(copy_forward) - jmp L(copy_backward) -L(copy_forward): -#endif -L(start): - cmp $79, %rdx - lea L(table_less_80bytes)(%rip), %r11 - ja L(80bytesormore) - movslq (%r11, %rdx, 4), %r9 - add %rdx, %rsi - add %rdx, %rdi - add %r11, %r9 - _CET_NOTRACK jmp *%r9 - ud2 - - .p2align 4 -L(80bytesormore): -#ifndef USE_AS_MEMMOVE - cmp %dil, %sil - jle L(copy_backward) -#endif - - movdqu (%rsi), %xmm0 - mov %rdi, %rcx - and $-16, %rdi - add $16, %rdi - mov %rcx, %r8 - sub %rdi, %rcx - add %rcx, %rdx - sub %rcx, %rsi - -#ifdef SHARED_CACHE_SIZE_HALF - mov $SHARED_CACHE_SIZE_HALF, %RCX_LP -#else - mov __x86_shared_cache_size_half(%rip), %RCX_LP -#endif - cmp %rcx, %rdx - mov %rsi, %r9 - ja L(large_page_fwd) - and $0xf, %r9 - jz L(shl_0) -#ifdef DATA_CACHE_SIZE_HALF - mov $DATA_CACHE_SIZE_HALF, %RCX_LP -#else - mov __x86_data_cache_size_half(%rip), %RCX_LP -#endif - BRANCH_TO_JMPTBL_ENTRY (L(shl_table), %r9, 4) - - .p2align 4 -L(copy_backward): - movdqu -16(%rsi, %rdx), %xmm0 - add %rdx, %rsi - lea -16(%rdi, %rdx), %r8 - add %rdx, %rdi - - mov %rdi, %rcx - and $0xf, %rcx - xor %rcx, %rdi - sub %rcx, %rdx - sub %rcx, %rsi - -#ifdef SHARED_CACHE_SIZE_HALF - mov $SHARED_CACHE_SIZE_HALF, %RCX_LP -#else - mov __x86_shared_cache_size_half(%rip), %RCX_LP -#endif - - cmp %rcx, %rdx - mov %rsi, %r9 - ja L(large_page_bwd) - and $0xf, %r9 - jz L(shl_0_bwd) -#ifdef DATA_CACHE_SIZE_HALF - mov $DATA_CACHE_SIZE_HALF, %RCX_LP -#else - mov __x86_data_cache_size_half(%rip), %RCX_LP -#endif - BRANCH_TO_JMPTBL_ENTRY (L(shl_table_bwd), %r9, 4) - - .p2align 4 -L(shl_0): - sub $16, %rdx - movdqa (%rsi), %xmm1 - add $16, %rsi - movdqa %xmm1, (%rdi) - add $16, %rdi - cmp $128, %rdx - movdqu %xmm0, (%r8) - ja L(shl_0_gobble) - cmp $64, %rdx - jb L(shl_0_less_64bytes) - movaps (%rsi), %xmm4 - movaps 16(%rsi), %xmm1 - movaps 32(%rsi), %xmm2 - movaps 48(%rsi), %xmm3 - movaps 
%xmm4, (%rdi) - movaps %xmm1, 16(%rdi) - movaps %xmm2, 32(%rdi) - movaps %xmm3, 48(%rdi) - sub $64, %rdx - add $64, %rsi - add $64, %rdi -L(shl_0_less_64bytes): - add %rdx, %rsi - add %rdx, %rdi - BRANCH_TO_JMPTBL_ENTRY (L(table_less_80bytes), %rdx, 4) - - .p2align 4 -L(shl_0_gobble): -#ifdef DATA_CACHE_SIZE_HALF - cmp $DATA_CACHE_SIZE_HALF, %RDX_LP -#else - cmp __x86_data_cache_size_half(%rip), %RDX_LP -#endif - lea -128(%rdx), %rdx - jae L(shl_0_gobble_mem_loop) -L(shl_0_gobble_cache_loop): - movdqa (%rsi), %xmm4 - movaps 0x10(%rsi), %xmm1 - movaps 0x20(%rsi), %xmm2 - movaps 0x30(%rsi), %xmm3 - - movdqa %xmm4, (%rdi) - movaps %xmm1, 0x10(%rdi) - movaps %xmm2, 0x20(%rdi) - movaps %xmm3, 0x30(%rdi) - - sub $128, %rdx - movaps 0x40(%rsi), %xmm4 - movaps 0x50(%rsi), %xmm5 - movaps 0x60(%rsi), %xmm6 - movaps 0x70(%rsi), %xmm7 - lea 0x80(%rsi), %rsi - movaps %xmm4, 0x40(%rdi) - movaps %xmm5, 0x50(%rdi) - movaps %xmm6, 0x60(%rdi) - movaps %xmm7, 0x70(%rdi) - lea 0x80(%rdi), %rdi - - jae L(shl_0_gobble_cache_loop) - cmp $-0x40, %rdx - lea 0x80(%rdx), %rdx - jl L(shl_0_cache_less_64bytes) - - movdqa (%rsi), %xmm4 - sub $0x40, %rdx - movdqa 0x10(%rsi), %xmm1 - - movdqa %xmm4, (%rdi) - movdqa %xmm1, 0x10(%rdi) - - movdqa 0x20(%rsi), %xmm4 - movdqa 0x30(%rsi), %xmm1 - add $0x40, %rsi - - movdqa %xmm4, 0x20(%rdi) - movdqa %xmm1, 0x30(%rdi) - add $0x40, %rdi -L(shl_0_cache_less_64bytes): - add %rdx, %rsi - add %rdx, %rdi - BRANCH_TO_JMPTBL_ENTRY (L(table_less_80bytes), %rdx, 4) - - .p2align 4 -L(shl_0_gobble_mem_loop): - prefetcht0 0x1c0(%rsi) - prefetcht0 0x280(%rsi) - - movdqa (%rsi), %xmm0 - movdqa 0x10(%rsi), %xmm1 - movdqa 0x20(%rsi), %xmm2 - movdqa 0x30(%rsi), %xmm3 - movdqa 0x40(%rsi), %xmm4 - movdqa 0x50(%rsi), %xmm5 - movdqa 0x60(%rsi), %xmm6 - movdqa 0x70(%rsi), %xmm7 - lea 0x80(%rsi), %rsi - sub $0x80, %rdx - movdqa %xmm0, (%rdi) - movdqa %xmm1, 0x10(%rdi) - movdqa %xmm2, 0x20(%rdi) - movdqa %xmm3, 0x30(%rdi) - movdqa %xmm4, 0x40(%rdi) - movdqa %xmm5, 0x50(%rdi) - 
movdqa %xmm6, 0x60(%rdi) - movdqa %xmm7, 0x70(%rdi) - lea 0x80(%rdi), %rdi - - jae L(shl_0_gobble_mem_loop) - cmp $-0x40, %rdx - lea 0x80(%rdx), %rdx - jl L(shl_0_mem_less_64bytes) - - movdqa (%rsi), %xmm0 - sub $0x40, %rdx - movdqa 0x10(%rsi), %xmm1 - - movdqa %xmm0, (%rdi) - movdqa %xmm1, 0x10(%rdi) - - movdqa 0x20(%rsi), %xmm0 - movdqa 0x30(%rsi), %xmm1 - add $0x40, %rsi - - movdqa %xmm0, 0x20(%rdi) - movdqa %xmm1, 0x30(%rdi) - add $0x40, %rdi -L(shl_0_mem_less_64bytes): - cmp $0x20, %rdx - jb L(shl_0_mem_less_32bytes) - movdqa (%rsi), %xmm0 - sub $0x20, %rdx - movdqa 0x10(%rsi), %xmm1 - add $0x20, %rsi - movdqa %xmm0, (%rdi) - movdqa %xmm1, 0x10(%rdi) - add $0x20, %rdi -L(shl_0_mem_less_32bytes): - add %rdx, %rdi - add %rdx, %rsi - BRANCH_TO_JMPTBL_ENTRY (L(table_less_80bytes), %rdx, 4) - - .p2align 4 -L(shl_0_bwd): - sub $16, %rdx - movdqa -0x10(%rsi), %xmm1 - sub $16, %rsi - movdqa %xmm1, -0x10(%rdi) - sub $16, %rdi - cmp $0x80, %rdx - movdqu %xmm0, (%r8) - ja L(shl_0_gobble_bwd) - cmp $64, %rdx - jb L(shl_0_less_64bytes_bwd) - movaps -0x10(%rsi), %xmm0 - movaps -0x20(%rsi), %xmm1 - movaps -0x30(%rsi), %xmm2 - movaps -0x40(%rsi), %xmm3 - movaps %xmm0, -0x10(%rdi) - movaps %xmm1, -0x20(%rdi) - movaps %xmm2, -0x30(%rdi) - movaps %xmm3, -0x40(%rdi) - sub $64, %rdx - sub $0x40, %rsi - sub $0x40, %rdi -L(shl_0_less_64bytes_bwd): - BRANCH_TO_JMPTBL_ENTRY (L(table_less_80bytes), %rdx, 4) - - .p2align 4 -L(shl_0_gobble_bwd): -#ifdef DATA_CACHE_SIZE_HALF - cmp $DATA_CACHE_SIZE_HALF, %RDX_LP -#else - cmp __x86_data_cache_size_half(%rip), %RDX_LP -#endif - lea -128(%rdx), %rdx - jae L(shl_0_gobble_mem_bwd_loop) -L(shl_0_gobble_bwd_loop): - movdqa -0x10(%rsi), %xmm0 - movaps -0x20(%rsi), %xmm1 - movaps -0x30(%rsi), %xmm2 - movaps -0x40(%rsi), %xmm3 - - movdqa %xmm0, -0x10(%rdi) - movaps %xmm1, -0x20(%rdi) - movaps %xmm2, -0x30(%rdi) - movaps %xmm3, -0x40(%rdi) - - sub $0x80, %rdx - movaps -0x50(%rsi), %xmm4 - movaps -0x60(%rsi), %xmm5 - movaps -0x70(%rsi), %xmm6 - movaps 
-0x80(%rsi), %xmm7 - lea -0x80(%rsi), %rsi - movaps %xmm4, -0x50(%rdi) - movaps %xmm5, -0x60(%rdi) - movaps %xmm6, -0x70(%rdi) - movaps %xmm7, -0x80(%rdi) - lea -0x80(%rdi), %rdi - - jae L(shl_0_gobble_bwd_loop) - cmp $-0x40, %rdx - lea 0x80(%rdx), %rdx - jl L(shl_0_gobble_bwd_less_64bytes) - - movdqa -0x10(%rsi), %xmm0 - sub $0x40, %rdx - movdqa -0x20(%rsi), %xmm1 - - movdqa %xmm0, -0x10(%rdi) - movdqa %xmm1, -0x20(%rdi) - - movdqa -0x30(%rsi), %xmm0 - movdqa -0x40(%rsi), %xmm1 - sub $0x40, %rsi - - movdqa %xmm0, -0x30(%rdi) - movdqa %xmm1, -0x40(%rdi) - sub $0x40, %rdi -L(shl_0_gobble_bwd_less_64bytes): - BRANCH_TO_JMPTBL_ENTRY (L(table_less_80bytes), %rdx, 4) - - .p2align 4 -L(shl_0_gobble_mem_bwd_loop): - prefetcht0 -0x1c0(%rsi) - prefetcht0 -0x280(%rsi) - movdqa -0x10(%rsi), %xmm0 - movdqa -0x20(%rsi), %xmm1 - movdqa -0x30(%rsi), %xmm2 - movdqa -0x40(%rsi), %xmm3 - movdqa -0x50(%rsi), %xmm4 - movdqa -0x60(%rsi), %xmm5 - movdqa -0x70(%rsi), %xmm6 - movdqa -0x80(%rsi), %xmm7 - lea -0x80(%rsi), %rsi - sub $0x80, %rdx - movdqa %xmm0, -0x10(%rdi) - movdqa %xmm1, -0x20(%rdi) - movdqa %xmm2, -0x30(%rdi) - movdqa %xmm3, -0x40(%rdi) - movdqa %xmm4, -0x50(%rdi) - movdqa %xmm5, -0x60(%rdi) - movdqa %xmm6, -0x70(%rdi) - movdqa %xmm7, -0x80(%rdi) - lea -0x80(%rdi), %rdi - - jae L(shl_0_gobble_mem_bwd_loop) - cmp $-0x40, %rdx - lea 0x80(%rdx), %rdx - jl L(shl_0_mem_bwd_less_64bytes) - - movdqa -0x10(%rsi), %xmm0 - sub $0x40, %rdx - movdqa -0x20(%rsi), %xmm1 - - movdqa %xmm0, -0x10(%rdi) - movdqa %xmm1, -0x20(%rdi) - - movdqa -0x30(%rsi), %xmm0 - movdqa -0x40(%rsi), %xmm1 - sub $0x40, %rsi - - movdqa %xmm0, -0x30(%rdi) - movdqa %xmm1, -0x40(%rdi) - sub $0x40, %rdi -L(shl_0_mem_bwd_less_64bytes): - cmp $0x20, %rdx - jb L(shl_0_mem_bwd_less_32bytes) - movdqa -0x10(%rsi), %xmm0 - sub $0x20, %rdx - movdqa -0x20(%rsi), %xmm1 - sub $0x20, %rsi - movdqa %xmm0, -0x10(%rdi) - movdqa %xmm1, -0x20(%rdi) - sub $0x20, %rdi -L(shl_0_mem_bwd_less_32bytes): - BRANCH_TO_JMPTBL_ENTRY 
(L(table_less_80bytes), %rdx, 4) - - .p2align 4 -L(shl_1): - lea (L(shl_1_loop_L1)-L(shl_1))(%r9), %r9 - cmp %rcx, %rdx - movaps -0x01(%rsi), %xmm1 - jb L(L1_fwd) - lea (L(shl_1_loop_L2)-L(shl_1_loop_L1))(%r9), %r9 -L(L1_fwd): - lea -64(%rdx), %rdx - _CET_NOTRACK jmp *%r9 - ud2 -L(shl_1_loop_L2): - prefetchnta 0x1c0(%rsi) -L(shl_1_loop_L1): - sub $64, %rdx - movaps 0x0f(%rsi), %xmm2 - movaps 0x1f(%rsi), %xmm3 - movaps 0x2f(%rsi), %xmm4 - movaps 0x3f(%rsi), %xmm5 - movdqa %xmm5, %xmm6 - palignr $1, %xmm4, %xmm5 - lea 64(%rsi), %rsi - palignr $1, %xmm3, %xmm4 - palignr $1, %xmm2, %xmm3 - lea 64(%rdi), %rdi - palignr $1, %xmm1, %xmm2 - movdqa %xmm6, %xmm1 - movdqa %xmm2, -0x40(%rdi) - movaps %xmm3, -0x30(%rdi) - jb L(shl_1_end) - movaps %xmm4, -0x20(%rdi) - movaps %xmm5, -0x10(%rdi) - _CET_NOTRACK jmp *%r9 - ud2 -L(shl_1_end): - movaps %xmm4, -0x20(%rdi) - lea 64(%rdx), %rdx - movaps %xmm5, -0x10(%rdi) - add %rdx, %rdi - movdqu %xmm0, (%r8) - add %rdx, %rsi - BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4) - - .p2align 4 -L(shl_1_bwd): - lea (L(shl_1_bwd_loop_L1)-L(shl_1_bwd))(%r9), %r9 - cmp %rcx, %rdx - movaps -0x01(%rsi), %xmm1 - jb L(L1_bwd) - lea (L(shl_1_bwd_loop_L2)-L(shl_1_bwd_loop_L1))(%r9), %r9 -L(L1_bwd): - lea -64(%rdx), %rdx - _CET_NOTRACK jmp *%r9 - ud2 -L(shl_1_bwd_loop_L2): - prefetchnta -0x1c0(%rsi) -L(shl_1_bwd_loop_L1): - movaps -0x11(%rsi), %xmm2 - sub $0x40, %rdx - movaps -0x21(%rsi), %xmm3 - movaps -0x31(%rsi), %xmm4 - movaps -0x41(%rsi), %xmm5 - lea -0x40(%rsi), %rsi - palignr $1, %xmm2, %xmm1 - palignr $1, %xmm3, %xmm2 - palignr $1, %xmm4, %xmm3 - palignr $1, %xmm5, %xmm4 - - movaps %xmm1, -0x10(%rdi) - movaps %xmm5, %xmm1 - - movaps %xmm2, -0x20(%rdi) - lea -0x40(%rdi), %rdi - - movaps %xmm3, 0x10(%rdi) - jb L(shl_1_bwd_end) - movaps %xmm4, (%rdi) - _CET_NOTRACK jmp *%r9 - ud2 -L(shl_1_bwd_end): - movaps %xmm4, (%rdi) - lea 64(%rdx), %rdx - movdqu %xmm0, (%r8) - BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4) - - .p2align 4 
-L(shl_2): - lea (L(shl_2_loop_L1)-L(shl_2))(%r9), %r9 - cmp %rcx, %rdx - movaps -0x02(%rsi), %xmm1 - jb L(L2_fwd) - lea (L(shl_2_loop_L2)-L(shl_2_loop_L1))(%r9), %r9 -L(L2_fwd): - lea -64(%rdx), %rdx - _CET_NOTRACK jmp *%r9 - ud2 -L(shl_2_loop_L2): - prefetchnta 0x1c0(%rsi) -L(shl_2_loop_L1): - sub $64, %rdx - movaps 0x0e(%rsi), %xmm2 - movaps 0x1e(%rsi), %xmm3 - movaps 0x2e(%rsi), %xmm4 - movaps 0x3e(%rsi), %xmm5 - movdqa %xmm5, %xmm6 - palignr $2, %xmm4, %xmm5 - lea 64(%rsi), %rsi - palignr $2, %xmm3, %xmm4 - palignr $2, %xmm2, %xmm3 - lea 64(%rdi), %rdi - palignr $2, %xmm1, %xmm2 - movdqa %xmm6, %xmm1 - movdqa %xmm2, -0x40(%rdi) - movaps %xmm3, -0x30(%rdi) - jb L(shl_2_end) - movaps %xmm4, -0x20(%rdi) - movaps %xmm5, -0x10(%rdi) - _CET_NOTRACK jmp *%r9 - ud2 -L(shl_2_end): - movaps %xmm4, -0x20(%rdi) - lea 64(%rdx), %rdx - movaps %xmm5, -0x10(%rdi) - add %rdx, %rdi - movdqu %xmm0, (%r8) - add %rdx, %rsi - BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4) - - .p2align 4 -L(shl_2_bwd): - lea (L(shl_2_bwd_loop_L1)-L(shl_2_bwd))(%r9), %r9 - cmp %rcx, %rdx - movaps -0x02(%rsi), %xmm1 - jb L(L2_bwd) - lea (L(shl_2_bwd_loop_L2)-L(shl_2_bwd_loop_L1))(%r9), %r9 -L(L2_bwd): - lea -64(%rdx), %rdx - _CET_NOTRACK jmp *%r9 - ud2 -L(shl_2_bwd_loop_L2): - prefetchnta -0x1c0(%rsi) -L(shl_2_bwd_loop_L1): - movaps -0x12(%rsi), %xmm2 - sub $0x40, %rdx - movaps -0x22(%rsi), %xmm3 - movaps -0x32(%rsi), %xmm4 - movaps -0x42(%rsi), %xmm5 - lea -0x40(%rsi), %rsi - palignr $2, %xmm2, %xmm1 - palignr $2, %xmm3, %xmm2 - palignr $2, %xmm4, %xmm3 - palignr $2, %xmm5, %xmm4 - - movaps %xmm1, -0x10(%rdi) - movaps %xmm5, %xmm1 - - movaps %xmm2, -0x20(%rdi) - lea -0x40(%rdi), %rdi - - movaps %xmm3, 0x10(%rdi) - jb L(shl_2_bwd_end) - movaps %xmm4, (%rdi) - _CET_NOTRACK jmp *%r9 - ud2 -L(shl_2_bwd_end): - movaps %xmm4, (%rdi) - lea 64(%rdx), %rdx - movdqu %xmm0, (%r8) - BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4) - - .p2align 4 -L(shl_3): - lea (L(shl_3_loop_L1)-L(shl_3))(%r9), 
%r9 - cmp %rcx, %rdx - movaps -0x03(%rsi), %xmm1 - jb L(L3_fwd) - lea (L(shl_3_loop_L2)-L(shl_3_loop_L1))(%r9), %r9 -L(L3_fwd): - lea -64(%rdx), %rdx - _CET_NOTRACK jmp *%r9 - ud2 -L(shl_3_loop_L2): - prefetchnta 0x1c0(%rsi) -L(shl_3_loop_L1): - sub $64, %rdx - movaps 0x0d(%rsi), %xmm2 - movaps 0x1d(%rsi), %xmm3 - movaps 0x2d(%rsi), %xmm4 - movaps 0x3d(%rsi), %xmm5 - movdqa %xmm5, %xmm6 - palignr $3, %xmm4, %xmm5 - lea 64(%rsi), %rsi - palignr $3, %xmm3, %xmm4 - palignr $3, %xmm2, %xmm3 - lea 64(%rdi), %rdi - palignr $3, %xmm1, %xmm2 - movdqa %xmm6, %xmm1 - movdqa %xmm2, -0x40(%rdi) - movaps %xmm3, -0x30(%rdi) - jb L(shl_3_end) - movaps %xmm4, -0x20(%rdi) - movaps %xmm5, -0x10(%rdi) - _CET_NOTRACK jmp *%r9 - ud2 -L(shl_3_end): - movaps %xmm4, -0x20(%rdi) - lea 64(%rdx), %rdx - movaps %xmm5, -0x10(%rdi) - add %rdx, %rdi - movdqu %xmm0, (%r8) - add %rdx, %rsi - BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4) - - .p2align 4 -L(shl_3_bwd): - lea (L(shl_3_bwd_loop_L1)-L(shl_3_bwd))(%r9), %r9 - cmp %rcx, %rdx - movaps -0x03(%rsi), %xmm1 - jb L(L3_bwd) - lea (L(shl_3_bwd_loop_L2)-L(shl_3_bwd_loop_L1))(%r9), %r9 -L(L3_bwd): - lea -64(%rdx), %rdx - _CET_NOTRACK jmp *%r9 - ud2 -L(shl_3_bwd_loop_L2): - prefetchnta -0x1c0(%rsi) -L(shl_3_bwd_loop_L1): - movaps -0x13(%rsi), %xmm2 - sub $0x40, %rdx - movaps -0x23(%rsi), %xmm3 - movaps -0x33(%rsi), %xmm4 - movaps -0x43(%rsi), %xmm5 - lea -0x40(%rsi), %rsi - palignr $3, %xmm2, %xmm1 - palignr $3, %xmm3, %xmm2 - palignr $3, %xmm4, %xmm3 - palignr $3, %xmm5, %xmm4 - - movaps %xmm1, -0x10(%rdi) - movaps %xmm5, %xmm1 - - movaps %xmm2, -0x20(%rdi) - lea -0x40(%rdi), %rdi - - movaps %xmm3, 0x10(%rdi) - jb L(shl_3_bwd_end) - movaps %xmm4, (%rdi) - _CET_NOTRACK jmp *%r9 - ud2 -L(shl_3_bwd_end): - movaps %xmm4, (%rdi) - lea 64(%rdx), %rdx - movdqu %xmm0, (%r8) - BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4) - - .p2align 4 -L(shl_4): - lea (L(shl_4_loop_L1)-L(shl_4))(%r9), %r9 - cmp %rcx, %rdx - movaps -0x04(%rsi), %xmm1 - jb 
L(L4_fwd)
-	lea	(L(shl_4_loop_L2)-L(shl_4_loop_L1))(%r9), %r9
-L(L4_fwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_4_loop_L2):
-	prefetchnta 0x1c0(%rsi)
-L(shl_4_loop_L1):
-	sub	$64, %rdx
-	movaps	0x0c(%rsi), %xmm2
-	movaps	0x1c(%rsi), %xmm3
-	movaps	0x2c(%rsi), %xmm4
-	movaps	0x3c(%rsi), %xmm5
-	movdqa	%xmm5, %xmm6
-	palignr	$4, %xmm4, %xmm5
-	lea	64(%rsi), %rsi
-	palignr	$4, %xmm3, %xmm4
-	palignr	$4, %xmm2, %xmm3
-	lea	64(%rdi), %rdi
-	palignr	$4, %xmm1, %xmm2
-	movdqa	%xmm6, %xmm1
-	movdqa	%xmm2, -0x40(%rdi)
-	movaps	%xmm3, -0x30(%rdi)
-	jb	L(shl_4_end)
-	movaps	%xmm4, -0x20(%rdi)
-	movaps	%xmm5, -0x10(%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_4_end):
-	movaps	%xmm4, -0x20(%rdi)
-	lea	64(%rdx), %rdx
-	movaps	%xmm5, -0x10(%rdi)
-	add	%rdx, %rdi
-	movdqu	%xmm0, (%r8)
-	add	%rdx, %rsi
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_4_bwd):
-	lea	(L(shl_4_bwd_loop_L1)-L(shl_4_bwd))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x04(%rsi), %xmm1
-	jb	L(L4_bwd)
-	lea	(L(shl_4_bwd_loop_L2)-L(shl_4_bwd_loop_L1))(%r9), %r9
-L(L4_bwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_4_bwd_loop_L2):
-	prefetchnta -0x1c0(%rsi)
-L(shl_4_bwd_loop_L1):
-	movaps	-0x14(%rsi), %xmm2
-	sub	$0x40, %rdx
-	movaps	-0x24(%rsi), %xmm3
-	movaps	-0x34(%rsi), %xmm4
-	movaps	-0x44(%rsi), %xmm5
-	lea	-0x40(%rsi), %rsi
-	palignr	$4, %xmm2, %xmm1
-	palignr	$4, %xmm3, %xmm2
-	palignr	$4, %xmm4, %xmm3
-	palignr	$4, %xmm5, %xmm4
-
-	movaps	%xmm1, -0x10(%rdi)
-	movaps	%xmm5, %xmm1
-
-	movaps	%xmm2, -0x20(%rdi)
-	lea	-0x40(%rdi), %rdi
-
-	movaps	%xmm3, 0x10(%rdi)
-	jb	L(shl_4_bwd_end)
-	movaps	%xmm4, (%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_4_bwd_end):
-	movaps	%xmm4, (%rdi)
-	lea	64(%rdx), %rdx
-	movdqu	%xmm0, (%r8)
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_5):
-	lea	(L(shl_5_loop_L1)-L(shl_5))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x05(%rsi), %xmm1
-	jb	L(L5_fwd)
-	lea	(L(shl_5_loop_L2)-L(shl_5_loop_L1))(%r9), %r9
-L(L5_fwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_5_loop_L2):
-	prefetchnta 0x1c0(%rsi)
-L(shl_5_loop_L1):
-	sub	$64, %rdx
-	movaps	0x0b(%rsi), %xmm2
-	movaps	0x1b(%rsi), %xmm3
-	movaps	0x2b(%rsi), %xmm4
-	movaps	0x3b(%rsi), %xmm5
-	movdqa	%xmm5, %xmm6
-	palignr	$5, %xmm4, %xmm5
-	lea	64(%rsi), %rsi
-	palignr	$5, %xmm3, %xmm4
-	palignr	$5, %xmm2, %xmm3
-	lea	64(%rdi), %rdi
-	palignr	$5, %xmm1, %xmm2
-	movdqa	%xmm6, %xmm1
-	movdqa	%xmm2, -0x40(%rdi)
-	movaps	%xmm3, -0x30(%rdi)
-	jb	L(shl_5_end)
-	movaps	%xmm4, -0x20(%rdi)
-	movaps	%xmm5, -0x10(%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_5_end):
-	movaps	%xmm4, -0x20(%rdi)
-	lea	64(%rdx), %rdx
-	movaps	%xmm5, -0x10(%rdi)
-	add	%rdx, %rdi
-	movdqu	%xmm0, (%r8)
-	add	%rdx, %rsi
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_5_bwd):
-	lea	(L(shl_5_bwd_loop_L1)-L(shl_5_bwd))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x05(%rsi), %xmm1
-	jb	L(L5_bwd)
-	lea	(L(shl_5_bwd_loop_L2)-L(shl_5_bwd_loop_L1))(%r9), %r9
-L(L5_bwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_5_bwd_loop_L2):
-	prefetchnta -0x1c0(%rsi)
-L(shl_5_bwd_loop_L1):
-	movaps	-0x15(%rsi), %xmm2
-	sub	$0x40, %rdx
-	movaps	-0x25(%rsi), %xmm3
-	movaps	-0x35(%rsi), %xmm4
-	movaps	-0x45(%rsi), %xmm5
-	lea	-0x40(%rsi), %rsi
-	palignr	$5, %xmm2, %xmm1
-	palignr	$5, %xmm3, %xmm2
-	palignr	$5, %xmm4, %xmm3
-	palignr	$5, %xmm5, %xmm4
-
-	movaps	%xmm1, -0x10(%rdi)
-	movaps	%xmm5, %xmm1
-
-	movaps	%xmm2, -0x20(%rdi)
-	lea	-0x40(%rdi), %rdi
-
-	movaps	%xmm3, 0x10(%rdi)
-	jb	L(shl_5_bwd_end)
-	movaps	%xmm4, (%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_5_bwd_end):
-	movaps	%xmm4, (%rdi)
-	lea	64(%rdx), %rdx
-	movdqu	%xmm0, (%r8)
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_6):
-	lea	(L(shl_6_loop_L1)-L(shl_6))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x06(%rsi), %xmm1
-	jb	L(L6_fwd)
-	lea	(L(shl_6_loop_L2)-L(shl_6_loop_L1))(%r9), %r9
-L(L6_fwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_6_loop_L2):
-	prefetchnta 0x1c0(%rsi)
-L(shl_6_loop_L1):
-	sub	$64, %rdx
-	movaps	0x0a(%rsi), %xmm2
-	movaps	0x1a(%rsi), %xmm3
-	movaps	0x2a(%rsi), %xmm4
-	movaps	0x3a(%rsi), %xmm5
-	movdqa	%xmm5, %xmm6
-	palignr	$6, %xmm4, %xmm5
-	lea	64(%rsi), %rsi
-	palignr	$6, %xmm3, %xmm4
-	palignr	$6, %xmm2, %xmm3
-	lea	64(%rdi), %rdi
-	palignr	$6, %xmm1, %xmm2
-	movdqa	%xmm6, %xmm1
-	movdqa	%xmm2, -0x40(%rdi)
-	movaps	%xmm3, -0x30(%rdi)
-	jb	L(shl_6_end)
-	movaps	%xmm4, -0x20(%rdi)
-	movaps	%xmm5, -0x10(%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_6_end):
-	movaps	%xmm4, -0x20(%rdi)
-	lea	64(%rdx), %rdx
-	movaps	%xmm5, -0x10(%rdi)
-	add	%rdx, %rdi
-	movdqu	%xmm0, (%r8)
-	add	%rdx, %rsi
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_6_bwd):
-	lea	(L(shl_6_bwd_loop_L1)-L(shl_6_bwd))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x06(%rsi), %xmm1
-	jb	L(L6_bwd)
-	lea	(L(shl_6_bwd_loop_L2)-L(shl_6_bwd_loop_L1))(%r9), %r9
-L(L6_bwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_6_bwd_loop_L2):
-	prefetchnta -0x1c0(%rsi)
-L(shl_6_bwd_loop_L1):
-	movaps	-0x16(%rsi), %xmm2
-	sub	$0x40, %rdx
-	movaps	-0x26(%rsi), %xmm3
-	movaps	-0x36(%rsi), %xmm4
-	movaps	-0x46(%rsi), %xmm5
-	lea	-0x40(%rsi), %rsi
-	palignr	$6, %xmm2, %xmm1
-	palignr	$6, %xmm3, %xmm2
-	palignr	$6, %xmm4, %xmm3
-	palignr	$6, %xmm5, %xmm4
-
-	movaps	%xmm1, -0x10(%rdi)
-	movaps	%xmm5, %xmm1
-
-	movaps	%xmm2, -0x20(%rdi)
-	lea	-0x40(%rdi), %rdi
-
-	movaps	%xmm3, 0x10(%rdi)
-	jb	L(shl_6_bwd_end)
-	movaps	%xmm4, (%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_6_bwd_end):
-	movaps	%xmm4, (%rdi)
-	lea	64(%rdx), %rdx
-	movdqu	%xmm0, (%r8)
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_7):
-	lea	(L(shl_7_loop_L1)-L(shl_7))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x07(%rsi), %xmm1
-	jb	L(L7_fwd)
-	lea	(L(shl_7_loop_L2)-L(shl_7_loop_L1))(%r9), %r9
-L(L7_fwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_7_loop_L2):
-	prefetchnta 0x1c0(%rsi)
-L(shl_7_loop_L1):
-	sub	$64, %rdx
-	movaps	0x09(%rsi), %xmm2
-	movaps	0x19(%rsi), %xmm3
-	movaps	0x29(%rsi), %xmm4
-	movaps	0x39(%rsi), %xmm5
-	movdqa	%xmm5, %xmm6
-	palignr	$7, %xmm4, %xmm5
-	lea	64(%rsi), %rsi
-	palignr	$7, %xmm3, %xmm4
-	palignr	$7, %xmm2, %xmm3
-	lea	64(%rdi), %rdi
-	palignr	$7, %xmm1, %xmm2
-	movdqa	%xmm6, %xmm1
-	movdqa	%xmm2, -0x40(%rdi)
-	movaps	%xmm3, -0x30(%rdi)
-	jb	L(shl_7_end)
-	movaps	%xmm4, -0x20(%rdi)
-	movaps	%xmm5, -0x10(%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_7_end):
-	movaps	%xmm4, -0x20(%rdi)
-	lea	64(%rdx), %rdx
-	movaps	%xmm5, -0x10(%rdi)
-	add	%rdx, %rdi
-	movdqu	%xmm0, (%r8)
-	add	%rdx, %rsi
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_7_bwd):
-	lea	(L(shl_7_bwd_loop_L1)-L(shl_7_bwd))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x07(%rsi), %xmm1
-	jb	L(L7_bwd)
-	lea	(L(shl_7_bwd_loop_L2)-L(shl_7_bwd_loop_L1))(%r9), %r9
-L(L7_bwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_7_bwd_loop_L2):
-	prefetchnta -0x1c0(%rsi)
-L(shl_7_bwd_loop_L1):
-	movaps	-0x17(%rsi), %xmm2
-	sub	$0x40, %rdx
-	movaps	-0x27(%rsi), %xmm3
-	movaps	-0x37(%rsi), %xmm4
-	movaps	-0x47(%rsi), %xmm5
-	lea	-0x40(%rsi), %rsi
-	palignr	$7, %xmm2, %xmm1
-	palignr	$7, %xmm3, %xmm2
-	palignr	$7, %xmm4, %xmm3
-	palignr	$7, %xmm5, %xmm4
-
-	movaps	%xmm1, -0x10(%rdi)
-	movaps	%xmm5, %xmm1
-
-	movaps	%xmm2, -0x20(%rdi)
-	lea	-0x40(%rdi), %rdi
-
-	movaps	%xmm3, 0x10(%rdi)
-	jb	L(shl_7_bwd_end)
-	movaps	%xmm4, (%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_7_bwd_end):
-	movaps	%xmm4, (%rdi)
-	lea	64(%rdx), %rdx
-	movdqu	%xmm0, (%r8)
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_8):
-	lea	(L(shl_8_loop_L1)-L(shl_8))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x08(%rsi), %xmm1
-	jb	L(L8_fwd)
-	lea	(L(shl_8_loop_L2)-L(shl_8_loop_L1))(%r9), %r9
-L(L8_fwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-L(shl_8_loop_L2):
-	prefetchnta 0x1c0(%rsi)
-L(shl_8_loop_L1):
-	sub	$64, %rdx
-	movaps	0x08(%rsi), %xmm2
-	movaps	0x18(%rsi), %xmm3
-	movaps	0x28(%rsi), %xmm4
-	movaps	0x38(%rsi), %xmm5
-	movdqa	%xmm5, %xmm6
-	palignr	$8, %xmm4, %xmm5
-	lea	64(%rsi), %rsi
-	palignr	$8, %xmm3, %xmm4
-	palignr	$8, %xmm2, %xmm3
-	lea	64(%rdi), %rdi
-	palignr	$8, %xmm1, %xmm2
-	movdqa	%xmm6, %xmm1
-	movdqa	%xmm2, -0x40(%rdi)
-	movaps	%xmm3, -0x30(%rdi)
-	jb	L(shl_8_end)
-	movaps	%xmm4, -0x20(%rdi)
-	movaps	%xmm5, -0x10(%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-	.p2align 4
-L(shl_8_end):
-	lea	64(%rdx), %rdx
-	movaps	%xmm4, -0x20(%rdi)
-	add	%rdx, %rsi
-	movaps	%xmm5, -0x10(%rdi)
-	add	%rdx, %rdi
-	movdqu	%xmm0, (%r8)
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_8_bwd):
-	lea	(L(shl_8_bwd_loop_L1)-L(shl_8_bwd))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x08(%rsi), %xmm1
-	jb	L(L8_bwd)
-	lea	(L(shl_8_bwd_loop_L2)-L(shl_8_bwd_loop_L1))(%r9), %r9
-L(L8_bwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_8_bwd_loop_L2):
-	prefetchnta -0x1c0(%rsi)
-L(shl_8_bwd_loop_L1):
-	movaps	-0x18(%rsi), %xmm2
-	sub	$0x40, %rdx
-	movaps	-0x28(%rsi), %xmm3
-	movaps	-0x38(%rsi), %xmm4
-	movaps	-0x48(%rsi), %xmm5
-	lea	-0x40(%rsi), %rsi
-	palignr	$8, %xmm2, %xmm1
-	palignr	$8, %xmm3, %xmm2
-	palignr	$8, %xmm4, %xmm3
-	palignr	$8, %xmm5, %xmm4
-
-	movaps	%xmm1, -0x10(%rdi)
-	movaps	%xmm5, %xmm1
-
-	movaps	%xmm2, -0x20(%rdi)
-	lea	-0x40(%rdi), %rdi
-
-	movaps	%xmm3, 0x10(%rdi)
-	jb	L(shl_8_bwd_end)
-	movaps	%xmm4, (%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_8_bwd_end):
-	movaps	%xmm4, (%rdi)
-	lea	64(%rdx), %rdx
-	movdqu	%xmm0, (%r8)
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_9):
-	lea	(L(shl_9_loop_L1)-L(shl_9))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x09(%rsi), %xmm1
-	jb	L(L9_fwd)
-	lea	(L(shl_9_loop_L2)-L(shl_9_loop_L1))(%r9), %r9
-L(L9_fwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_9_loop_L2):
-	prefetchnta 0x1c0(%rsi)
-L(shl_9_loop_L1):
-	sub	$64, %rdx
-	movaps	0x07(%rsi), %xmm2
-	movaps	0x17(%rsi), %xmm3
-	movaps	0x27(%rsi), %xmm4
-	movaps	0x37(%rsi), %xmm5
-	movdqa	%xmm5, %xmm6
-	palignr	$9, %xmm4, %xmm5
-	lea	64(%rsi), %rsi
-	palignr	$9, %xmm3, %xmm4
-	palignr	$9, %xmm2, %xmm3
-	lea	64(%rdi), %rdi
-	palignr	$9, %xmm1, %xmm2
-	movdqa	%xmm6, %xmm1
-	movdqa	%xmm2, -0x40(%rdi)
-	movaps	%xmm3, -0x30(%rdi)
-	jb	L(shl_9_end)
-	movaps	%xmm4, -0x20(%rdi)
-	movaps	%xmm5, -0x10(%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_9_end):
-	movaps	%xmm4, -0x20(%rdi)
-	lea	64(%rdx), %rdx
-	movaps	%xmm5, -0x10(%rdi)
-	add	%rdx, %rdi
-	movdqu	%xmm0, (%r8)
-	add	%rdx, %rsi
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_9_bwd):
-	lea	(L(shl_9_bwd_loop_L1)-L(shl_9_bwd))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x09(%rsi), %xmm1
-	jb	L(L9_bwd)
-	lea	(L(shl_9_bwd_loop_L2)-L(shl_9_bwd_loop_L1))(%r9), %r9
-L(L9_bwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_9_bwd_loop_L2):
-	prefetchnta -0x1c0(%rsi)
-L(shl_9_bwd_loop_L1):
-	movaps	-0x19(%rsi), %xmm2
-	sub	$0x40, %rdx
-	movaps	-0x29(%rsi), %xmm3
-	movaps	-0x39(%rsi), %xmm4
-	movaps	-0x49(%rsi), %xmm5
-	lea	-0x40(%rsi), %rsi
-	palignr	$9, %xmm2, %xmm1
-	palignr	$9, %xmm3, %xmm2
-	palignr	$9, %xmm4, %xmm3
-	palignr	$9, %xmm5, %xmm4
-
-	movaps	%xmm1, -0x10(%rdi)
-	movaps	%xmm5, %xmm1
-
-	movaps	%xmm2, -0x20(%rdi)
-	lea	-0x40(%rdi), %rdi
-
-	movaps	%xmm3, 0x10(%rdi)
-	jb	L(shl_9_bwd_end)
-	movaps	%xmm4, (%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_9_bwd_end):
-	movaps	%xmm4, (%rdi)
-	lea	64(%rdx), %rdx
-	movdqu	%xmm0, (%r8)
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_10):
-	lea	(L(shl_10_loop_L1)-L(shl_10))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x0a(%rsi), %xmm1
-	jb	L(L10_fwd)
-	lea	(L(shl_10_loop_L2)-L(shl_10_loop_L1))(%r9), %r9
-L(L10_fwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_10_loop_L2):
-	prefetchnta 0x1c0(%rsi)
-L(shl_10_loop_L1):
-	sub	$64, %rdx
-	movaps	0x06(%rsi), %xmm2
-	movaps	0x16(%rsi), %xmm3
-	movaps	0x26(%rsi), %xmm4
-	movaps	0x36(%rsi), %xmm5
-	movdqa	%xmm5, %xmm6
-	palignr	$10, %xmm4, %xmm5
-	lea	64(%rsi), %rsi
-	palignr	$10, %xmm3, %xmm4
-	palignr	$10, %xmm2, %xmm3
-	lea	64(%rdi), %rdi
-	palignr	$10, %xmm1, %xmm2
-	movdqa	%xmm6, %xmm1
-	movdqa	%xmm2, -0x40(%rdi)
-	movaps	%xmm3, -0x30(%rdi)
-	jb	L(shl_10_end)
-	movaps	%xmm4, -0x20(%rdi)
-	movaps	%xmm5, -0x10(%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_10_end):
-	movaps	%xmm4, -0x20(%rdi)
-	lea	64(%rdx), %rdx
-	movaps	%xmm5, -0x10(%rdi)
-	add	%rdx, %rdi
-	movdqu	%xmm0, (%r8)
-	add	%rdx, %rsi
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_10_bwd):
-	lea	(L(shl_10_bwd_loop_L1)-L(shl_10_bwd))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x0a(%rsi), %xmm1
-	jb	L(L10_bwd)
-	lea	(L(shl_10_bwd_loop_L2)-L(shl_10_bwd_loop_L1))(%r9), %r9
-L(L10_bwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_10_bwd_loop_L2):
-	prefetchnta -0x1c0(%rsi)
-L(shl_10_bwd_loop_L1):
-	movaps	-0x1a(%rsi), %xmm2
-	sub	$0x40, %rdx
-	movaps	-0x2a(%rsi), %xmm3
-	movaps	-0x3a(%rsi), %xmm4
-	movaps	-0x4a(%rsi), %xmm5
-	lea	-0x40(%rsi), %rsi
-	palignr	$10, %xmm2, %xmm1
-	palignr	$10, %xmm3, %xmm2
-	palignr	$10, %xmm4, %xmm3
-	palignr	$10, %xmm5, %xmm4
-
-	movaps	%xmm1, -0x10(%rdi)
-	movaps	%xmm5, %xmm1
-
-	movaps	%xmm2, -0x20(%rdi)
-	lea	-0x40(%rdi), %rdi
-
-	movaps	%xmm3, 0x10(%rdi)
-	jb	L(shl_10_bwd_end)
-	movaps	%xmm4, (%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_10_bwd_end):
-	movaps	%xmm4, (%rdi)
-	lea	64(%rdx), %rdx
-	movdqu	%xmm0, (%r8)
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_11):
-	lea	(L(shl_11_loop_L1)-L(shl_11))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x0b(%rsi), %xmm1
-	jb	L(L11_fwd)
-	lea	(L(shl_11_loop_L2)-L(shl_11_loop_L1))(%r9), %r9
-L(L11_fwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_11_loop_L2):
-	prefetchnta 0x1c0(%rsi)
-L(shl_11_loop_L1):
-	sub	$64, %rdx
-	movaps	0x05(%rsi), %xmm2
-	movaps	0x15(%rsi), %xmm3
-	movaps	0x25(%rsi), %xmm4
-	movaps	0x35(%rsi), %xmm5
-	movdqa	%xmm5, %xmm6
-	palignr	$11, %xmm4, %xmm5
-	lea	64(%rsi), %rsi
-	palignr	$11, %xmm3, %xmm4
-	palignr	$11, %xmm2, %xmm3
-	lea	64(%rdi), %rdi
-	palignr	$11, %xmm1, %xmm2
-	movdqa	%xmm6, %xmm1
-	movdqa	%xmm2, -0x40(%rdi)
-	movaps	%xmm3, -0x30(%rdi)
-	jb	L(shl_11_end)
-	movaps	%xmm4, -0x20(%rdi)
-	movaps	%xmm5, -0x10(%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_11_end):
-	movaps	%xmm4, -0x20(%rdi)
-	lea	64(%rdx), %rdx
-	movaps	%xmm5, -0x10(%rdi)
-	add	%rdx, %rdi
-	movdqu	%xmm0, (%r8)
-	add	%rdx, %rsi
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_11_bwd):
-	lea	(L(shl_11_bwd_loop_L1)-L(shl_11_bwd))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x0b(%rsi), %xmm1
-	jb	L(L11_bwd)
-	lea	(L(shl_11_bwd_loop_L2)-L(shl_11_bwd_loop_L1))(%r9), %r9
-L(L11_bwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_11_bwd_loop_L2):
-	prefetchnta -0x1c0(%rsi)
-L(shl_11_bwd_loop_L1):
-	movaps	-0x1b(%rsi), %xmm2
-	sub	$0x40, %rdx
-	movaps	-0x2b(%rsi), %xmm3
-	movaps	-0x3b(%rsi), %xmm4
-	movaps	-0x4b(%rsi), %xmm5
-	lea	-0x40(%rsi), %rsi
-	palignr	$11, %xmm2, %xmm1
-	palignr	$11, %xmm3, %xmm2
-	palignr	$11, %xmm4, %xmm3
-	palignr	$11, %xmm5, %xmm4
-
-	movaps	%xmm1, -0x10(%rdi)
-	movaps	%xmm5, %xmm1
-
-	movaps	%xmm2, -0x20(%rdi)
-	lea	-0x40(%rdi), %rdi
-
-	movaps	%xmm3, 0x10(%rdi)
-	jb	L(shl_11_bwd_end)
-	movaps	%xmm4, (%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_11_bwd_end):
-	movaps	%xmm4, (%rdi)
-	lea	64(%rdx), %rdx
-	movdqu	%xmm0, (%r8)
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_12):
-	lea	(L(shl_12_loop_L1)-L(shl_12))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x0c(%rsi), %xmm1
-	jb	L(L12_fwd)
-	lea	(L(shl_12_loop_L2)-L(shl_12_loop_L1))(%r9), %r9
-L(L12_fwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_12_loop_L2):
-	prefetchnta 0x1c0(%rsi)
-L(shl_12_loop_L1):
-	sub	$64, %rdx
-	movaps	0x04(%rsi), %xmm2
-	movaps	0x14(%rsi), %xmm3
-	movaps	0x24(%rsi), %xmm4
-	movaps	0x34(%rsi), %xmm5
-	movdqa	%xmm5, %xmm6
-	palignr	$12, %xmm4, %xmm5
-	lea	64(%rsi), %rsi
-	palignr	$12, %xmm3, %xmm4
-	palignr	$12, %xmm2, %xmm3
-	lea	64(%rdi), %rdi
-	palignr	$12, %xmm1, %xmm2
-	movdqa	%xmm6, %xmm1
-	movdqa	%xmm2, -0x40(%rdi)
-	movaps	%xmm3, -0x30(%rdi)
-	jb	L(shl_12_end)
-	movaps	%xmm4, -0x20(%rdi)
-	movaps	%xmm5, -0x10(%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_12_end):
-	movaps	%xmm4, -0x20(%rdi)
-	lea	64(%rdx), %rdx
-	movaps	%xmm5, -0x10(%rdi)
-	add	%rdx, %rdi
-	movdqu	%xmm0, (%r8)
-	add	%rdx, %rsi
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_12_bwd):
-	lea	(L(shl_12_bwd_loop_L1)-L(shl_12_bwd))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x0c(%rsi), %xmm1
-	jb	L(L12_bwd)
-	lea	(L(shl_12_bwd_loop_L2)-L(shl_12_bwd_loop_L1))(%r9), %r9
-L(L12_bwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_12_bwd_loop_L2):
-	prefetchnta -0x1c0(%rsi)
-L(shl_12_bwd_loop_L1):
-	movaps	-0x1c(%rsi), %xmm2
-	sub	$0x40, %rdx
-	movaps	-0x2c(%rsi), %xmm3
-	movaps	-0x3c(%rsi), %xmm4
-	movaps	-0x4c(%rsi), %xmm5
-	lea	-0x40(%rsi), %rsi
-	palignr	$12, %xmm2, %xmm1
-	palignr	$12, %xmm3, %xmm2
-	palignr	$12, %xmm4, %xmm3
-	palignr	$12, %xmm5, %xmm4
-
-	movaps	%xmm1, -0x10(%rdi)
-	movaps	%xmm5, %xmm1
-
-	movaps	%xmm2, -0x20(%rdi)
-	lea	-0x40(%rdi), %rdi
-
-	movaps	%xmm3, 0x10(%rdi)
-	jb	L(shl_12_bwd_end)
-	movaps	%xmm4, (%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_12_bwd_end):
-	movaps	%xmm4, (%rdi)
-	lea	64(%rdx), %rdx
-	movdqu	%xmm0, (%r8)
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_13):
-	lea	(L(shl_13_loop_L1)-L(shl_13))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x0d(%rsi), %xmm1
-	jb	L(L13_fwd)
-	lea	(L(shl_13_loop_L2)-L(shl_13_loop_L1))(%r9), %r9
-L(L13_fwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_13_loop_L2):
-	prefetchnta 0x1c0(%rsi)
-L(shl_13_loop_L1):
-	sub	$64, %rdx
-	movaps	0x03(%rsi), %xmm2
-	movaps	0x13(%rsi), %xmm3
-	movaps	0x23(%rsi), %xmm4
-	movaps	0x33(%rsi), %xmm5
-	movdqa	%xmm5, %xmm6
-	palignr	$13, %xmm4, %xmm5
-	lea	64(%rsi), %rsi
-	palignr	$13, %xmm3, %xmm4
-	palignr	$13, %xmm2, %xmm3
-	lea	64(%rdi), %rdi
-	palignr	$13, %xmm1, %xmm2
-	movdqa	%xmm6, %xmm1
-	movdqa	%xmm2, -0x40(%rdi)
-	movaps	%xmm3, -0x30(%rdi)
-	jb	L(shl_13_end)
-	movaps	%xmm4, -0x20(%rdi)
-	movaps	%xmm5, -0x10(%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_13_end):
-	movaps	%xmm4, -0x20(%rdi)
-	lea	64(%rdx), %rdx
-	movaps	%xmm5, -0x10(%rdi)
-	add	%rdx, %rdi
-	movdqu	%xmm0, (%r8)
-	add	%rdx, %rsi
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_13_bwd):
-	lea	(L(shl_13_bwd_loop_L1)-L(shl_13_bwd))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x0d(%rsi), %xmm1
-	jb	L(L13_bwd)
-	lea	(L(shl_13_bwd_loop_L2)-L(shl_13_bwd_loop_L1))(%r9), %r9
-L(L13_bwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_13_bwd_loop_L2):
-	prefetchnta -0x1c0(%rsi)
-L(shl_13_bwd_loop_L1):
-	movaps	-0x1d(%rsi), %xmm2
-	sub	$0x40, %rdx
-	movaps	-0x2d(%rsi), %xmm3
-	movaps	-0x3d(%rsi), %xmm4
-	movaps	-0x4d(%rsi), %xmm5
-	lea	-0x40(%rsi), %rsi
-	palignr	$13, %xmm2, %xmm1
-	palignr	$13, %xmm3, %xmm2
-	palignr	$13, %xmm4, %xmm3
-	palignr	$13, %xmm5, %xmm4
-
-	movaps	%xmm1, -0x10(%rdi)
-	movaps	%xmm5, %xmm1
-
-	movaps	%xmm2, -0x20(%rdi)
-	lea	-0x40(%rdi), %rdi
-
-	movaps	%xmm3, 0x10(%rdi)
-	jb	L(shl_13_bwd_end)
-	movaps	%xmm4, (%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_13_bwd_end):
-	movaps	%xmm4, (%rdi)
-	lea	64(%rdx), %rdx
-	movdqu	%xmm0, (%r8)
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_14):
-	lea	(L(shl_14_loop_L1)-L(shl_14))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x0e(%rsi), %xmm1
-	jb	L(L14_fwd)
-	lea	(L(shl_14_loop_L2)-L(shl_14_loop_L1))(%r9), %r9
-L(L14_fwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_14_loop_L2):
-	prefetchnta 0x1c0(%rsi)
-L(shl_14_loop_L1):
-	sub	$64, %rdx
-	movaps	0x02(%rsi), %xmm2
-	movaps	0x12(%rsi), %xmm3
-	movaps	0x22(%rsi), %xmm4
-	movaps	0x32(%rsi), %xmm5
-	movdqa	%xmm5, %xmm6
-	palignr	$14, %xmm4, %xmm5
-	lea	64(%rsi), %rsi
-	palignr	$14, %xmm3, %xmm4
-	palignr	$14, %xmm2, %xmm3
-	lea	64(%rdi), %rdi
-	palignr	$14, %xmm1, %xmm2
-	movdqa	%xmm6, %xmm1
-	movdqa	%xmm2, -0x40(%rdi)
-	movaps	%xmm3, -0x30(%rdi)
-	jb	L(shl_14_end)
-	movaps	%xmm4, -0x20(%rdi)
-	movaps	%xmm5, -0x10(%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_14_end):
-	movaps	%xmm4, -0x20(%rdi)
-	lea	64(%rdx), %rdx
-	movaps	%xmm5, -0x10(%rdi)
-	add	%rdx, %rdi
-	movdqu	%xmm0, (%r8)
-	add	%rdx, %rsi
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_14_bwd):
-	lea	(L(shl_14_bwd_loop_L1)-L(shl_14_bwd))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x0e(%rsi), %xmm1
-	jb	L(L14_bwd)
-	lea	(L(shl_14_bwd_loop_L2)-L(shl_14_bwd_loop_L1))(%r9), %r9
-L(L14_bwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_14_bwd_loop_L2):
-	prefetchnta -0x1c0(%rsi)
-L(shl_14_bwd_loop_L1):
-	movaps	-0x1e(%rsi), %xmm2
-	sub	$0x40, %rdx
-	movaps	-0x2e(%rsi), %xmm3
-	movaps	-0x3e(%rsi), %xmm4
-	movaps	-0x4e(%rsi), %xmm5
-	lea	-0x40(%rsi), %rsi
-	palignr	$14, %xmm2, %xmm1
-	palignr	$14, %xmm3, %xmm2
-	palignr	$14, %xmm4, %xmm3
-	palignr	$14, %xmm5, %xmm4
-
-	movaps	%xmm1, -0x10(%rdi)
-	movaps	%xmm5, %xmm1
-
-	movaps	%xmm2, -0x20(%rdi)
-	lea	-0x40(%rdi), %rdi
-
-	movaps	%xmm3, 0x10(%rdi)
-	jb	L(shl_14_bwd_end)
-	movaps	%xmm4, (%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_14_bwd_end):
-	movaps	%xmm4, (%rdi)
-	lea	64(%rdx), %rdx
-	movdqu	%xmm0, (%r8)
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_15):
-	lea	(L(shl_15_loop_L1)-L(shl_15))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x0f(%rsi), %xmm1
-	jb	L(L15_fwd)
-	lea	(L(shl_15_loop_L2)-L(shl_15_loop_L1))(%r9), %r9
-L(L15_fwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_15_loop_L2):
-	prefetchnta 0x1c0(%rsi)
-L(shl_15_loop_L1):
-	sub	$64, %rdx
-	movaps	0x01(%rsi), %xmm2
-	movaps	0x11(%rsi), %xmm3
-	movaps	0x21(%rsi), %xmm4
-	movaps	0x31(%rsi), %xmm5
-	movdqa	%xmm5, %xmm6
-	palignr	$15, %xmm4, %xmm5
-	lea	64(%rsi), %rsi
-	palignr	$15, %xmm3, %xmm4
-	palignr	$15, %xmm2, %xmm3
-	lea	64(%rdi), %rdi
-	palignr	$15, %xmm1, %xmm2
-	movdqa	%xmm6, %xmm1
-	movdqa	%xmm2, -0x40(%rdi)
-	movaps	%xmm3, -0x30(%rdi)
-	jb	L(shl_15_end)
-	movaps	%xmm4, -0x20(%rdi)
-	movaps	%xmm5, -0x10(%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_15_end):
-	movaps	%xmm4, -0x20(%rdi)
-	lea	64(%rdx), %rdx
-	movaps	%xmm5, -0x10(%rdi)
-	add	%rdx, %rdi
-	movdqu	%xmm0, (%r8)
-	add	%rdx, %rsi
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(shl_15_bwd):
-	lea	(L(shl_15_bwd_loop_L1)-L(shl_15_bwd))(%r9), %r9
-	cmp	%rcx, %rdx
-	movaps	-0x0f(%rsi), %xmm1
-	jb	L(L15_bwd)
-	lea	(L(shl_15_bwd_loop_L2)-L(shl_15_bwd_loop_L1))(%r9), %r9
-L(L15_bwd):
-	lea	-64(%rdx), %rdx
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_15_bwd_loop_L2):
-	prefetchnta -0x1c0(%rsi)
-L(shl_15_bwd_loop_L1):
-	movaps	-0x1f(%rsi), %xmm2
-	sub	$0x40, %rdx
-	movaps	-0x2f(%rsi), %xmm3
-	movaps	-0x3f(%rsi), %xmm4
-	movaps	-0x4f(%rsi), %xmm5
-	lea	-0x40(%rsi), %rsi
-	palignr	$15, %xmm2, %xmm1
-	palignr	$15, %xmm3, %xmm2
-	palignr	$15, %xmm4, %xmm3
-	palignr	$15, %xmm5, %xmm4
-
-	movaps	%xmm1, -0x10(%rdi)
-	movaps	%xmm5, %xmm1
-
-	movaps	%xmm2, -0x20(%rdi)
-	lea	-0x40(%rdi), %rdi
-
-	movaps	%xmm3, 0x10(%rdi)
-	jb	L(shl_15_bwd_end)
-	movaps	%xmm4, (%rdi)
-	_CET_NOTRACK jmp *%r9
-	ud2
-L(shl_15_bwd_end):
-	movaps	%xmm4, (%rdi)
-	lea	64(%rdx), %rdx
-	movdqu	%xmm0, (%r8)
-	BRANCH_TO_JMPTBL_ENTRY(L(table_less_80bytes), %rdx, 4)
-
-	.p2align 4
-L(write_72bytes):
-	movdqu	-72(%rsi), %xmm0
-	movdqu	-56(%rsi), %xmm1
-	mov	-40(%rsi), %r8
-	mov	-32(%rsi), %r9
-	mov	-24(%rsi), %r10
-	mov	-16(%rsi), %r11
-	mov	-8(%rsi), %rcx
-	movdqu	%xmm0, -72(%rdi)
-	movdqu	%xmm1, -56(%rdi)
-	mov	%r8, -40(%rdi)
-	mov	%r9, -32(%rdi)
-	mov	%r10, -24(%rdi)
-	mov	%r11, -16(%rdi)
-	mov	%rcx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_64bytes):
-	movdqu	-64(%rsi), %xmm0
-	mov	-48(%rsi), %rcx
-	mov	-40(%rsi), %r8
-	mov	-32(%rsi), %r9
-	mov	-24(%rsi), %r10
-	mov	-16(%rsi), %r11
-	mov	-8(%rsi), %rdx
-	movdqu	%xmm0, -64(%rdi)
-	mov	%rcx, -48(%rdi)
-	mov	%r8, -40(%rdi)
-	mov	%r9, -32(%rdi)
-	mov	%r10, -24(%rdi)
-	mov	%r11, -16(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_56bytes):
-	movdqu	-56(%rsi), %xmm0
-	mov	-40(%rsi), %r8
-	mov	-32(%rsi), %r9
-	mov	-24(%rsi), %r10
-	mov	-16(%rsi), %r11
-	mov	-8(%rsi), %rcx
-	movdqu	%xmm0, -56(%rdi)
-	mov	%r8, -40(%rdi)
-	mov	%r9, -32(%rdi)
-	mov	%r10, -24(%rdi)
-	mov	%r11, -16(%rdi)
-	mov	%rcx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_48bytes):
-	mov	-48(%rsi), %rcx
-	mov	-40(%rsi), %r8
-	mov	-32(%rsi), %r9
-	mov	-24(%rsi), %r10
-	mov	-16(%rsi), %r11
-	mov	-8(%rsi), %rdx
-	mov	%rcx, -48(%rdi)
-	mov	%r8, -40(%rdi)
-	mov	%r9, -32(%rdi)
-	mov	%r10, -24(%rdi)
-	mov	%r11, -16(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_40bytes):
-	mov	-40(%rsi), %r8
-	mov	-32(%rsi), %r9
-	mov	-24(%rsi), %r10
-	mov	-16(%rsi), %r11
-	mov	-8(%rsi), %rdx
-	mov	%r8, -40(%rdi)
-	mov	%r9, -32(%rdi)
-	mov	%r10, -24(%rdi)
-	mov	%r11, -16(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_32bytes):
-	mov	-32(%rsi), %r9
-	mov	-24(%rsi), %r10
-	mov	-16(%rsi), %r11
-	mov	-8(%rsi), %rdx
-	mov	%r9, -32(%rdi)
-	mov	%r10, -24(%rdi)
-	mov	%r11, -16(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_24bytes):
-	mov	-24(%rsi), %r10
-	mov	-16(%rsi), %r11
-	mov	-8(%rsi), %rdx
-	mov	%r10, -24(%rdi)
-	mov	%r11, -16(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_16bytes):
-	mov	-16(%rsi), %r11
-	mov	-8(%rsi), %rdx
-	mov	%r11, -16(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_8bytes):
-	mov	-8(%rsi), %rdx
-	mov	%rdx, -8(%rdi)
-L(write_0bytes):
-	ret
-
-	.p2align 4
-L(write_73bytes):
-	movdqu	-73(%rsi), %xmm0
-	movdqu	-57(%rsi), %xmm1
-	mov	-41(%rsi), %rcx
-	mov	-33(%rsi), %r9
-	mov	-25(%rsi), %r10
-	mov	-17(%rsi), %r11
-	mov	-9(%rsi), %r8
-	mov	-4(%rsi), %edx
-	movdqu	%xmm0, -73(%rdi)
-	movdqu	%xmm1, -57(%rdi)
-	mov	%rcx, -41(%rdi)
-	mov	%r9, -33(%rdi)
-	mov	%r10, -25(%rdi)
-	mov	%r11, -17(%rdi)
-	mov	%r8, -9(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_65bytes):
-	movdqu	-65(%rsi), %xmm0
-	movdqu	-49(%rsi), %xmm1
-	mov	-33(%rsi), %r9
-	mov	-25(%rsi), %r10
-	mov	-17(%rsi), %r11
-	mov	-9(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	movdqu	%xmm0, -65(%rdi)
-	movdqu	%xmm1, -49(%rdi)
-	mov	%r9, -33(%rdi)
-	mov	%r10, -25(%rdi)
-	mov	%r11, -17(%rdi)
-	mov	%rcx, -9(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_57bytes):
-	movdqu	-57(%rsi), %xmm0
-	mov	-41(%rsi), %r8
-	mov	-33(%rsi), %r9
-	mov	-25(%rsi), %r10
-	mov	-17(%rsi), %r11
-	mov	-9(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	movdqu	%xmm0, -57(%rdi)
-	mov	%r8, -41(%rdi)
-	mov	%r9, -33(%rdi)
-	mov	%r10, -25(%rdi)
-	mov	%r11, -17(%rdi)
-	mov	%rcx, -9(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_49bytes):
-	movdqu	-49(%rsi), %xmm0
-	mov	-33(%rsi), %r9
-	mov	-25(%rsi), %r10
-	mov	-17(%rsi), %r11
-	mov	-9(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	movdqu	%xmm0, -49(%rdi)
-	mov	%r9, -33(%rdi)
-	mov	%r10, -25(%rdi)
-	mov	%r11, -17(%rdi)
-	mov	%rcx, -9(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_41bytes):
-	mov	-41(%rsi), %r8
-	mov	-33(%rsi), %r9
-	mov	-25(%rsi), %r10
-	mov	-17(%rsi), %r11
-	mov	-9(%rsi), %rcx
-	mov	-1(%rsi), %dl
-	mov	%r8, -41(%rdi)
-	mov	%r9, -33(%rdi)
-	mov	%r10, -25(%rdi)
-	mov	%r11, -17(%rdi)
-	mov	%rcx, -9(%rdi)
-	mov	%dl, -1(%rdi)
-	ret
-
-	.p2align 4
-L(write_33bytes):
-	mov	-33(%rsi), %r9
-	mov	-25(%rsi), %r10
-	mov	-17(%rsi), %r11
-	mov	-9(%rsi), %rcx
-	mov	-1(%rsi), %dl
-	mov	%r9, -33(%rdi)
-	mov	%r10, -25(%rdi)
-	mov	%r11, -17(%rdi)
-	mov	%rcx, -9(%rdi)
-	mov	%dl, -1(%rdi)
-	ret
-
-	.p2align 4
-L(write_25bytes):
-	mov	-25(%rsi), %r10
-	mov	-17(%rsi), %r11
-	mov	-9(%rsi), %rcx
-	mov	-1(%rsi), %dl
-	mov	%r10, -25(%rdi)
-	mov	%r11, -17(%rdi)
-	mov	%rcx, -9(%rdi)
-	mov	%dl, -1(%rdi)
-	ret
-
-	.p2align 4
-L(write_17bytes):
-	mov	-17(%rsi), %r11
-	mov	-9(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%r11, -17(%rdi)
-	mov	%rcx, -9(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_9bytes):
-	mov	-9(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%rcx, -9(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_1bytes):
-	mov	-1(%rsi), %dl
-	mov	%dl, -1(%rdi)
-	ret
-
-	.p2align 4
-L(write_74bytes):
-	movdqu	-74(%rsi), %xmm0
-	movdqu	-58(%rsi), %xmm1
-	mov	-42(%rsi), %r8
-	mov	-34(%rsi), %r9
-	mov	-26(%rsi), %r10
-	mov	-18(%rsi), %r11
-	mov	-10(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	movdqu	%xmm0, -74(%rdi)
-	movdqu	%xmm1, -58(%rdi)
-	mov	%r8, -42(%rdi)
-	mov	%r9, -34(%rdi)
-	mov	%r10, -26(%rdi)
-	mov	%r11, -18(%rdi)
-	mov	%rcx, -10(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_66bytes):
-	movdqu	-66(%rsi), %xmm0
-	movdqu	-50(%rsi), %xmm1
-	mov	-42(%rsi), %r8
-	mov	-34(%rsi), %r9
-	mov	-26(%rsi), %r10
-	mov	-18(%rsi), %r11
-	mov	-10(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	movdqu	%xmm0, -66(%rdi)
-	movdqu	%xmm1, -50(%rdi)
-	mov	%r8, -42(%rdi)
-	mov	%r9, -34(%rdi)
-	mov	%r10, -26(%rdi)
-	mov	%r11, -18(%rdi)
-	mov	%rcx, -10(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_58bytes):
-	movdqu	-58(%rsi), %xmm1
-	mov	-42(%rsi), %r8
-	mov	-34(%rsi), %r9
-	mov	-26(%rsi), %r10
-	mov	-18(%rsi), %r11
-	mov	-10(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	movdqu	%xmm1, -58(%rdi)
-	mov	%r8, -42(%rdi)
-	mov	%r9, -34(%rdi)
-	mov	%r10, -26(%rdi)
-	mov	%r11, -18(%rdi)
-	mov	%rcx, -10(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_50bytes):
-	movdqu	-50(%rsi), %xmm0
-	mov	-34(%rsi), %r9
-	mov	-26(%rsi), %r10
-	mov	-18(%rsi), %r11
-	mov	-10(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	movdqu	%xmm0, -50(%rdi)
-	mov	%r9, -34(%rdi)
-	mov	%r10, -26(%rdi)
-	mov	%r11, -18(%rdi)
-	mov	%rcx, -10(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_42bytes):
-	mov	-42(%rsi), %r8
-	mov	-34(%rsi), %r9
-	mov	-26(%rsi), %r10
-	mov	-18(%rsi), %r11
-	mov	-10(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%r8, -42(%rdi)
-	mov	%r9, -34(%rdi)
-	mov	%r10, -26(%rdi)
-	mov	%r11, -18(%rdi)
-	mov	%rcx, -10(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_34bytes):
-	mov	-34(%rsi), %r9
-	mov	-26(%rsi), %r10
-	mov	-18(%rsi), %r11
-	mov	-10(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%r9, -34(%rdi)
-	mov	%r10, -26(%rdi)
-	mov	%r11, -18(%rdi)
-	mov	%rcx, -10(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_26bytes):
-	mov	-26(%rsi), %r10
-	mov	-18(%rsi), %r11
-	mov	-10(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%r10, -26(%rdi)
-	mov	%r11, -18(%rdi)
-	mov	%rcx, -10(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_18bytes):
-	mov	-18(%rsi), %r11
-	mov	-10(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%r11, -18(%rdi)
-	mov	%rcx, -10(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_10bytes):
-	mov	-10(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%rcx, -10(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_2bytes):
-	mov	-2(%rsi), %dx
-	mov	%dx, -2(%rdi)
-	ret
-
-	.p2align 4
-L(write_75bytes):
-	movdqu	-75(%rsi), %xmm0
-	movdqu	-59(%rsi), %xmm1
-	mov	-43(%rsi), %r8
-	mov	-35(%rsi), %r9
-	mov	-27(%rsi), %r10
-	mov	-19(%rsi), %r11
-	mov	-11(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	movdqu	%xmm0, -75(%rdi)
-	movdqu	%xmm1, -59(%rdi)
-	mov	%r8, -43(%rdi)
-	mov	%r9, -35(%rdi)
-	mov	%r10, -27(%rdi)
-	mov	%r11, -19(%rdi)
-	mov	%rcx, -11(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_67bytes):
-	movdqu	-67(%rsi), %xmm0
-	movdqu	-59(%rsi), %xmm1
-	mov	-43(%rsi), %r8
-	mov	-35(%rsi), %r9
-	mov	-27(%rsi), %r10
-	mov	-19(%rsi), %r11
-	mov	-11(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	movdqu	%xmm0, -67(%rdi)
-	movdqu	%xmm1, -59(%rdi)
-	mov	%r8, -43(%rdi)
-	mov	%r9, -35(%rdi)
-	mov	%r10, -27(%rdi)
-	mov	%r11, -19(%rdi)
-	mov	%rcx, -11(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_59bytes):
-	movdqu	-59(%rsi), %xmm0
-	mov	-43(%rsi), %r8
-	mov	-35(%rsi), %r9
-	mov	-27(%rsi), %r10
-	mov	-19(%rsi), %r11
-	mov	-11(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	movdqu	%xmm0, -59(%rdi)
-	mov	%r8, -43(%rdi)
-	mov	%r9, -35(%rdi)
-	mov	%r10, -27(%rdi)
-	mov	%r11, -19(%rdi)
-	mov	%rcx, -11(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_51bytes):
-	movdqu	-51(%rsi), %xmm0
-	mov	-35(%rsi), %r9
-	mov	-27(%rsi), %r10
-	mov	-19(%rsi), %r11
-	mov	-11(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	movdqu	%xmm0, -51(%rdi)
-	mov	%r9, -35(%rdi)
-	mov	%r10, -27(%rdi)
-	mov	%r11, -19(%rdi)
-	mov	%rcx, -11(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_43bytes):
-	mov	-43(%rsi), %r8
-	mov	-35(%rsi), %r9
-	mov	-27(%rsi), %r10
-	mov	-19(%rsi), %r11
-	mov	-11(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%r8, -43(%rdi)
-	mov	%r9, -35(%rdi)
-	mov	%r10, -27(%rdi)
-	mov	%r11, -19(%rdi)
-	mov	%rcx, -11(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_35bytes):
-	mov	-35(%rsi), %r9
-	mov	-27(%rsi), %r10
-	mov	-19(%rsi), %r11
-	mov	-11(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%r9, -35(%rdi)
-	mov	%r10, -27(%rdi)
-	mov	%r11, -19(%rdi)
-	mov	%rcx, -11(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_27bytes):
-	mov	-27(%rsi), %r10
-	mov	-19(%rsi), %r11
-	mov	-11(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%r10, -27(%rdi)
-	mov	%r11, -19(%rdi)
-	mov	%rcx, -11(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_19bytes):
-	mov	-19(%rsi), %r11
-	mov	-11(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%r11, -19(%rdi)
-	mov	%rcx, -11(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_11bytes):
-	mov	-11(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%rcx, -11(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_3bytes):
-	mov	-3(%rsi), %dx
-	mov	-2(%rsi), %cx
-	mov	%dx, -3(%rdi)
-	mov	%cx, -2(%rdi)
-	ret
-
-	.p2align 4
-L(write_76bytes):
-	movdqu	-76(%rsi), %xmm0
-	movdqu	-60(%rsi), %xmm1
-	mov	-44(%rsi), %r8
-	mov	-36(%rsi), %r9
-	mov	-28(%rsi), %r10
-	mov	-20(%rsi), %r11
-	mov	-12(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	movdqu	%xmm0, -76(%rdi)
-	movdqu	%xmm1, -60(%rdi)
-	mov	%r8, -44(%rdi)
-	mov	%r9, -36(%rdi)
-	mov	%r10, -28(%rdi)
-	mov	%r11, -20(%rdi)
-	mov	%rcx, -12(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_68bytes):
-	movdqu	-68(%rsi), %xmm0
-	movdqu	-52(%rsi), %xmm1
-	mov	-36(%rsi), %r9
-	mov	-28(%rsi), %r10
-	mov	-20(%rsi), %r11
-	mov	-12(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	movdqu	%xmm0, -68(%rdi)
-	movdqu	%xmm1, -52(%rdi)
-	mov	%r9, -36(%rdi)
-	mov	%r10, -28(%rdi)
-	mov	%r11, -20(%rdi)
-	mov	%rcx, -12(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_60bytes):
-	movdqu	-60(%rsi), %xmm0
-	mov	-44(%rsi), %r8
-	mov	-36(%rsi), %r9
-	mov	-28(%rsi), %r10
-	mov	-20(%rsi), %r11
-	mov	-12(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	movdqu	%xmm0, -60(%rdi)
-	mov	%r8, -44(%rdi)
-	mov	%r9, -36(%rdi)
-	mov	%r10, -28(%rdi)
-	mov	%r11, -20(%rdi)
-	mov	%rcx, -12(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_52bytes):
-	movdqu	-52(%rsi), %xmm0
-	mov	-36(%rsi), %r9
-	mov	-28(%rsi), %r10
-	mov	-20(%rsi), %r11
-	mov	-12(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	movdqu	%xmm0, -52(%rdi)
-	mov	%r9, -36(%rdi)
-	mov	%r10, -28(%rdi)
-	mov	%r11, -20(%rdi)
-	mov	%rcx, -12(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_44bytes):
-	mov	-44(%rsi), %r8
-	mov	-36(%rsi), %r9
-	mov	-28(%rsi), %r10
-	mov	-20(%rsi), %r11
-	mov	-12(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%r8, -44(%rdi)
-	mov	%r9, -36(%rdi)
-	mov	%r10, -28(%rdi)
-	mov	%r11, -20(%rdi)
-	mov	%rcx, -12(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_36bytes):
-	mov	-36(%rsi), %r9
-	mov	-28(%rsi), %r10
-	mov	-20(%rsi), %r11
-	mov	-12(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%r9, -36(%rdi)
-	mov	%r10, -28(%rdi)
-	mov	%r11, -20(%rdi)
-	mov	%rcx, -12(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_28bytes):
-	mov	-28(%rsi), %r10
-	mov	-20(%rsi), %r11
-	mov	-12(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%r10, -28(%rdi)
-	mov	%r11, -20(%rdi)
-	mov	%rcx, -12(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_20bytes):
-	mov	-20(%rsi), %r11
-	mov	-12(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%r11, -20(%rdi)
-	mov	%rcx, -12(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_12bytes):
-	mov	-12(%rsi), %rcx
-	mov	-4(%rsi), %edx
-	mov	%rcx, -12(%rdi)
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_4bytes):
-	mov	-4(%rsi), %edx
-	mov	%edx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_77bytes):
-	movdqu	-77(%rsi), %xmm0
-	movdqu	-61(%rsi), %xmm1
-	mov	-45(%rsi), %r8
-	mov	-37(%rsi), %r9
-	mov	-29(%rsi), %r10
-	mov	-21(%rsi), %r11
-	mov	-13(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	movdqu	%xmm0, -77(%rdi)
-	movdqu	%xmm1, -61(%rdi)
-	mov	%r8, -45(%rdi)
-	mov	%r9, -37(%rdi)
-	mov	%r10, -29(%rdi)
-	mov	%r11, -21(%rdi)
-	mov	%rcx, -13(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_69bytes):
-	movdqu	-69(%rsi), %xmm0
-	movdqu	-53(%rsi), %xmm1
-	mov	-37(%rsi), %r9
-	mov	-29(%rsi), %r10
-	mov	-21(%rsi), %r11
-	mov	-13(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	movdqu	%xmm0, -69(%rdi)
-	movdqu	%xmm1, -53(%rdi)
-	mov	%r9, -37(%rdi)
-	mov	%r10, -29(%rdi)
-	mov	%r11, -21(%rdi)
-	mov	%rcx, -13(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_61bytes):
-	movdqu	-61(%rsi), %xmm0
-	mov	-45(%rsi), %r8
-	mov	-37(%rsi), %r9
-	mov	-29(%rsi), %r10
-	mov	-21(%rsi), %r11
-	mov	-13(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	movdqu	%xmm0, -61(%rdi)
-	mov	%r8, -45(%rdi)
-	mov	%r9, -37(%rdi)
-	mov	%r10, -29(%rdi)
-	mov	%r11, -21(%rdi)
-	mov	%rcx, -13(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_53bytes):
-	movdqu	-53(%rsi), %xmm0
-	mov	-45(%rsi), %r8
-	mov	-37(%rsi), %r9
-	mov	-29(%rsi), %r10
-	mov	-21(%rsi), %r11
-	mov	-13(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	movdqu	%xmm0, -53(%rdi)
-	mov	%r9, -37(%rdi)
-	mov	%r10, -29(%rdi)
-	mov	%r11, -21(%rdi)
-	mov	%rcx, -13(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_45bytes):
-	mov	-45(%rsi), %r8
-	mov	-37(%rsi), %r9
-	mov	-29(%rsi), %r10
-	mov	-21(%rsi), %r11
-	mov	-13(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	mov	%r8, -45(%rdi)
-	mov	%r9, -37(%rdi)
-	mov	%r10, -29(%rdi)
-	mov	%r11, -21(%rdi)
-	mov	%rcx, -13(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_37bytes):
-	mov	-37(%rsi), %r9
-	mov	-29(%rsi), %r10
-	mov	-21(%rsi), %r11
-	mov	-13(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	mov	%r9, -37(%rdi)
-	mov	%r10, -29(%rdi)
-	mov	%r11, -21(%rdi)
-	mov	%rcx, -13(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_29bytes):
-	mov	-29(%rsi), %r10
-	mov	-21(%rsi), %r11
-	mov	-13(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	mov	%r10, -29(%rdi)
-	mov	%r11, -21(%rdi)
-	mov	%rcx, -13(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_21bytes):
-	mov	-21(%rsi), %r11
-	mov	-13(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	mov	%r11, -21(%rdi)
-	mov	%rcx, -13(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_13bytes):
-	mov	-13(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	mov	%rcx, -13(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_5bytes):
-	mov	-5(%rsi), %edx
-	mov	-4(%rsi), %ecx
-	mov	%edx, -5(%rdi)
-	mov	%ecx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_78bytes):
-	movdqu	-78(%rsi), %xmm0
-	movdqu	-62(%rsi), %xmm1
-	mov	-46(%rsi), %r8
-	mov	-38(%rsi), %r9
-	mov	-30(%rsi), %r10
-	mov	-22(%rsi), %r11
-	mov	-14(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	movdqu	%xmm0, -78(%rdi)
-	movdqu	%xmm1, -62(%rdi)
-	mov	%r8, -46(%rdi)
-	mov	%r9, -38(%rdi)
-	mov	%r10, -30(%rdi)
-	mov	%r11, -22(%rdi)
-	mov	%rcx, -14(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_70bytes):
-	movdqu	-70(%rsi), %xmm0
-	movdqu	-54(%rsi), %xmm1
-	mov	-38(%rsi), %r9
-	mov	-30(%rsi), %r10
-	mov	-22(%rsi), %r11
-	mov	-14(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	movdqu	%xmm0, -70(%rdi)
-	movdqu	%xmm1, -54(%rdi)
-	mov	%r9, -38(%rdi)
-	mov	%r10, -30(%rdi)
-	mov	%r11, -22(%rdi)
-	mov	%rcx, -14(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_62bytes):
-	movdqu	-62(%rsi), %xmm0
-	mov	-46(%rsi), %r8
-	mov	-38(%rsi), %r9
-	mov	-30(%rsi), %r10
-	mov	-22(%rsi), %r11
-	mov	-14(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	movdqu	%xmm0, -62(%rdi)
-	mov	%r8, -46(%rdi)
-	mov	%r9, -38(%rdi)
-	mov	%r10, -30(%rdi)
-	mov	%r11, -22(%rdi)
-	mov	%rcx, -14(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_54bytes):
-	movdqu	-54(%rsi), %xmm0
-	mov	-38(%rsi), %r9
-	mov	-30(%rsi), %r10
-	mov	-22(%rsi), %r11
-	mov	-14(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	movdqu	%xmm0, -54(%rdi)
-	mov	%r9, -38(%rdi)
-	mov	%r10, -30(%rdi)
-	mov	%r11, -22(%rdi)
-	mov	%rcx, -14(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_46bytes):
-	mov	-46(%rsi), %r8
-	mov	-38(%rsi), %r9
-	mov	-30(%rsi), %r10
-	mov	-22(%rsi), %r11
-	mov	-14(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	mov	%r8, -46(%rdi)
-	mov	%r9, -38(%rdi)
-	mov	%r10, -30(%rdi)
-	mov	%r11, -22(%rdi)
-	mov	%rcx, -14(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_38bytes):
-	mov	-38(%rsi), %r9
-	mov	-30(%rsi), %r10
-	mov	-22(%rsi), %r11
-	mov	-14(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	mov	%r9, -38(%rdi)
-	mov	%r10, -30(%rdi)
-	mov	%r11, -22(%rdi)
-	mov	%rcx, -14(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_30bytes):
-	mov	-30(%rsi), %r10
-	mov	-22(%rsi), %r11
-	mov	-14(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	mov	%r10, -30(%rdi)
-	mov	%r11, -22(%rdi)
-	mov	%rcx, -14(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_22bytes):
-	mov	-22(%rsi), %r11
-	mov	-14(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	mov	%r11, -22(%rdi)
-	mov	%rcx, -14(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_14bytes):
-	mov	-14(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	mov	%rcx, -14(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_6bytes):
-	mov	-6(%rsi), %edx
-	mov	-4(%rsi), %ecx
-	mov	%edx, -6(%rdi)
-	mov	%ecx, -4(%rdi)
-	ret
-
-	.p2align 4
-L(write_79bytes):
-	movdqu	-79(%rsi), %xmm0
-	movdqu	-63(%rsi), %xmm1
-	mov	-47(%rsi), %r8
-	mov	-39(%rsi), %r9
-	mov	-31(%rsi), %r10
-	mov	-23(%rsi), %r11
-	mov	-15(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	movdqu	%xmm0, -79(%rdi)
-	movdqu	%xmm1, -63(%rdi)
-	mov	%r8, -47(%rdi)
-	mov	%r9, -39(%rdi)
-	mov	%r10, -31(%rdi)
-	mov	%r11, -23(%rdi)
-	mov	%rcx, -15(%rdi)
-	mov	%rdx, -8(%rdi)
-	ret
-
-	.p2align 4
-L(write_71bytes):
-	movdqu	-71(%rsi), %xmm0
-	movdqu	-55(%rsi), %xmm1
-	mov	-39(%rsi), %r9
-	mov	-31(%rsi), %r10
-	mov	-23(%rsi), %r11
-	mov	-15(%rsi), %rcx
-	mov	-8(%rsi), %rdx
-	movdqu	%xmm0, -71(%rdi)
-	movdqu	%xmm1, -55(%rdi)
-	mov	%r9, -39(%rdi)
-	mov	%r10,
-31(%rdi) - mov %r11, -23(%rdi) - mov %rcx, -15(%rdi) - mov %rdx, -8(%rdi) - ret - - .p2align 4 -L(write_63bytes): - movdqu -63(%rsi), %xmm0 - mov -47(%rsi), %r8 - mov -39(%rsi), %r9 - mov -31(%rsi), %r10 - mov -23(%rsi), %r11 - mov -15(%rsi), %rcx - mov -8(%rsi), %rdx - movdqu %xmm0, -63(%rdi) - mov %r8, -47(%rdi) - mov %r9, -39(%rdi) - mov %r10, -31(%rdi) - mov %r11, -23(%rdi) - mov %rcx, -15(%rdi) - mov %rdx, -8(%rdi) - ret - - .p2align 4 -L(write_55bytes): - movdqu -55(%rsi), %xmm0 - mov -39(%rsi), %r9 - mov -31(%rsi), %r10 - mov -23(%rsi), %r11 - mov -15(%rsi), %rcx - mov -8(%rsi), %rdx - movdqu %xmm0, -55(%rdi) - mov %r9, -39(%rdi) - mov %r10, -31(%rdi) - mov %r11, -23(%rdi) - mov %rcx, -15(%rdi) - mov %rdx, -8(%rdi) - ret - - .p2align 4 -L(write_47bytes): - mov -47(%rsi), %r8 - mov -39(%rsi), %r9 - mov -31(%rsi), %r10 - mov -23(%rsi), %r11 - mov -15(%rsi), %rcx - mov -8(%rsi), %rdx - mov %r8, -47(%rdi) - mov %r9, -39(%rdi) - mov %r10, -31(%rdi) - mov %r11, -23(%rdi) - mov %rcx, -15(%rdi) - mov %rdx, -8(%rdi) - ret - - .p2align 4 -L(write_39bytes): - mov -39(%rsi), %r9 - mov -31(%rsi), %r10 - mov -23(%rsi), %r11 - mov -15(%rsi), %rcx - mov -8(%rsi), %rdx - mov %r9, -39(%rdi) - mov %r10, -31(%rdi) - mov %r11, -23(%rdi) - mov %rcx, -15(%rdi) - mov %rdx, -8(%rdi) - ret - - .p2align 4 -L(write_31bytes): - mov -31(%rsi), %r10 - mov -23(%rsi), %r11 - mov -15(%rsi), %rcx - mov -8(%rsi), %rdx - mov %r10, -31(%rdi) - mov %r11, -23(%rdi) - mov %rcx, -15(%rdi) - mov %rdx, -8(%rdi) - ret - - .p2align 4 -L(write_23bytes): - mov -23(%rsi), %r11 - mov -15(%rsi), %rcx - mov -8(%rsi), %rdx - mov %r11, -23(%rdi) - mov %rcx, -15(%rdi) - mov %rdx, -8(%rdi) - ret - - .p2align 4 -L(write_15bytes): - mov -15(%rsi), %rcx - mov -8(%rsi), %rdx - mov %rcx, -15(%rdi) - mov %rdx, -8(%rdi) - ret - - .p2align 4 -L(write_7bytes): - mov -7(%rsi), %edx - mov -4(%rsi), %ecx - mov %edx, -7(%rdi) - mov %ecx, -4(%rdi) - ret - - .p2align 4 -L(large_page_fwd): - movdqu (%rsi), %xmm1 - lea 16(%rsi), 
%rsi - movdqu %xmm0, (%r8) - movntdq %xmm1, (%rdi) - lea 16(%rdi), %rdi - lea -0x90(%rdx), %rdx -#ifdef USE_AS_MEMMOVE - mov %rsi, %r9 - sub %rdi, %r9 - cmp %rdx, %r9 - jae L(memmove_is_memcpy_fwd) - shl $2, %rcx - cmp %rcx, %rdx - jb L(ll_cache_copy_fwd_start) -L(memmove_is_memcpy_fwd): -#endif -L(large_page_loop): - movdqu (%rsi), %xmm0 - movdqu 0x10(%rsi), %xmm1 - movdqu 0x20(%rsi), %xmm2 - movdqu 0x30(%rsi), %xmm3 - movdqu 0x40(%rsi), %xmm4 - movdqu 0x50(%rsi), %xmm5 - movdqu 0x60(%rsi), %xmm6 - movdqu 0x70(%rsi), %xmm7 - lea 0x80(%rsi), %rsi - - sub $0x80, %rdx - movntdq %xmm0, (%rdi) - movntdq %xmm1, 0x10(%rdi) - movntdq %xmm2, 0x20(%rdi) - movntdq %xmm3, 0x30(%rdi) - movntdq %xmm4, 0x40(%rdi) - movntdq %xmm5, 0x50(%rdi) - movntdq %xmm6, 0x60(%rdi) - movntdq %xmm7, 0x70(%rdi) - lea 0x80(%rdi), %rdi - jae L(large_page_loop) - cmp $-0x40, %rdx - lea 0x80(%rdx), %rdx - jl L(large_page_less_64bytes) - - movdqu (%rsi), %xmm0 - movdqu 0x10(%rsi), %xmm1 - movdqu 0x20(%rsi), %xmm2 - movdqu 0x30(%rsi), %xmm3 - lea 0x40(%rsi), %rsi - - movntdq %xmm0, (%rdi) - movntdq %xmm1, 0x10(%rdi) - movntdq %xmm2, 0x20(%rdi) - movntdq %xmm3, 0x30(%rdi) - lea 0x40(%rdi), %rdi - sub $0x40, %rdx -L(large_page_less_64bytes): - add %rdx, %rsi - add %rdx, %rdi - sfence - BRANCH_TO_JMPTBL_ENTRY (L(table_less_80bytes), %rdx, 4) - -#ifdef USE_AS_MEMMOVE - .p2align 4 -L(ll_cache_copy_fwd_start): - prefetcht0 0x1c0(%rsi) - prefetcht0 0x200(%rsi) - movdqu (%rsi), %xmm0 - movdqu 0x10(%rsi), %xmm1 - movdqu 0x20(%rsi), %xmm2 - movdqu 0x30(%rsi), %xmm3 - movdqu 0x40(%rsi), %xmm4 - movdqu 0x50(%rsi), %xmm5 - movdqu 0x60(%rsi), %xmm6 - movdqu 0x70(%rsi), %xmm7 - lea 0x80(%rsi), %rsi - - sub $0x80, %rdx - movaps %xmm0, (%rdi) - movaps %xmm1, 0x10(%rdi) - movaps %xmm2, 0x20(%rdi) - movaps %xmm3, 0x30(%rdi) - movaps %xmm4, 0x40(%rdi) - movaps %xmm5, 0x50(%rdi) - movaps %xmm6, 0x60(%rdi) - movaps %xmm7, 0x70(%rdi) - lea 0x80(%rdi), %rdi - jae L(ll_cache_copy_fwd_start) - cmp $-0x40, %rdx - lea 
0x80(%rdx), %rdx - jl L(large_page_ll_less_fwd_64bytes) - - movdqu (%rsi), %xmm0 - movdqu 0x10(%rsi), %xmm1 - movdqu 0x20(%rsi), %xmm2 - movdqu 0x30(%rsi), %xmm3 - lea 0x40(%rsi), %rsi - - movaps %xmm0, (%rdi) - movaps %xmm1, 0x10(%rdi) - movaps %xmm2, 0x20(%rdi) - movaps %xmm3, 0x30(%rdi) - lea 0x40(%rdi), %rdi - sub $0x40, %rdx -L(large_page_ll_less_fwd_64bytes): - add %rdx, %rsi - add %rdx, %rdi - BRANCH_TO_JMPTBL_ENTRY (L(table_less_80bytes), %rdx, 4) - -#endif - .p2align 4 -L(large_page_bwd): - movdqu -0x10(%rsi), %xmm1 - lea -16(%rsi), %rsi - movdqu %xmm0, (%r8) - movdqa %xmm1, -0x10(%rdi) - lea -16(%rdi), %rdi - lea -0x90(%rdx), %rdx -#ifdef USE_AS_MEMMOVE - mov %rdi, %r9 - sub %rsi, %r9 - cmp %rdx, %r9 - jae L(memmove_is_memcpy_bwd) - cmp %rcx, %r9 - jb L(ll_cache_copy_bwd_start) -L(memmove_is_memcpy_bwd): -#endif -L(large_page_bwd_loop): - movdqu -0x10(%rsi), %xmm0 - movdqu -0x20(%rsi), %xmm1 - movdqu -0x30(%rsi), %xmm2 - movdqu -0x40(%rsi), %xmm3 - movdqu -0x50(%rsi), %xmm4 - movdqu -0x60(%rsi), %xmm5 - movdqu -0x70(%rsi), %xmm6 - movdqu -0x80(%rsi), %xmm7 - lea -0x80(%rsi), %rsi - - sub $0x80, %rdx - movntdq %xmm0, -0x10(%rdi) - movntdq %xmm1, -0x20(%rdi) - movntdq %xmm2, -0x30(%rdi) - movntdq %xmm3, -0x40(%rdi) - movntdq %xmm4, -0x50(%rdi) - movntdq %xmm5, -0x60(%rdi) - movntdq %xmm6, -0x70(%rdi) - movntdq %xmm7, -0x80(%rdi) - lea -0x80(%rdi), %rdi - jae L(large_page_bwd_loop) - cmp $-0x40, %rdx - lea 0x80(%rdx), %rdx - jl L(large_page_less_bwd_64bytes) - - movdqu -0x10(%rsi), %xmm0 - movdqu -0x20(%rsi), %xmm1 - movdqu -0x30(%rsi), %xmm2 - movdqu -0x40(%rsi), %xmm3 - lea -0x40(%rsi), %rsi - - movntdq %xmm0, -0x10(%rdi) - movntdq %xmm1, -0x20(%rdi) - movntdq %xmm2, -0x30(%rdi) - movntdq %xmm3, -0x40(%rdi) - lea -0x40(%rdi), %rdi - sub $0x40, %rdx -L(large_page_less_bwd_64bytes): - sfence - BRANCH_TO_JMPTBL_ENTRY (L(table_less_80bytes), %rdx, 4) - -#ifdef USE_AS_MEMMOVE - .p2align 4 -L(ll_cache_copy_bwd_start): - prefetcht0 -0x1c0(%rsi) - prefetcht0 
-0x200(%rsi) - movdqu -0x10(%rsi), %xmm0 - movdqu -0x20(%rsi), %xmm1 - movdqu -0x30(%rsi), %xmm2 - movdqu -0x40(%rsi), %xmm3 - movdqu -0x50(%rsi), %xmm4 - movdqu -0x60(%rsi), %xmm5 - movdqu -0x70(%rsi), %xmm6 - movdqu -0x80(%rsi), %xmm7 - lea -0x80(%rsi), %rsi - - sub $0x80, %rdx - movaps %xmm0, -0x10(%rdi) - movaps %xmm1, -0x20(%rdi) - movaps %xmm2, -0x30(%rdi) - movaps %xmm3, -0x40(%rdi) - movaps %xmm4, -0x50(%rdi) - movaps %xmm5, -0x60(%rdi) - movaps %xmm6, -0x70(%rdi) - movaps %xmm7, -0x80(%rdi) - lea -0x80(%rdi), %rdi - jae L(ll_cache_copy_bwd_start) - cmp $-0x40, %rdx - lea 0x80(%rdx), %rdx - jl L(large_page_ll_less_bwd_64bytes) - - movdqu -0x10(%rsi), %xmm0 - movdqu -0x20(%rsi), %xmm1 - movdqu -0x30(%rsi), %xmm2 - movdqu -0x40(%rsi), %xmm3 - lea -0x40(%rsi), %rsi - - movaps %xmm0, -0x10(%rdi) - movaps %xmm1, -0x20(%rdi) - movaps %xmm2, -0x30(%rdi) - movaps %xmm3, -0x40(%rdi) - lea -0x40(%rdi), %rdi - sub $0x40, %rdx -L(large_page_ll_less_bwd_64bytes): - BRANCH_TO_JMPTBL_ENTRY (L(table_less_80bytes), %rdx, 4) -#endif - -END (MEMCPY) - - .section .rodata.ssse3,"a",@progbits - .p2align 3 -L(table_less_80bytes): - .int JMPTBL (L(write_0bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_1bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_2bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_3bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_4bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_5bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_6bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_7bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_8bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_9bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_10bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_11bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_12bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_13bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_14bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_15bytes), 
L(table_less_80bytes)) - .int JMPTBL (L(write_16bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_17bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_18bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_19bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_20bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_21bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_22bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_23bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_24bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_25bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_26bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_27bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_28bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_29bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_30bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_31bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_32bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_33bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_34bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_35bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_36bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_37bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_38bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_39bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_40bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_41bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_42bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_43bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_44bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_45bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_46bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_47bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_48bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_49bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_50bytes), L(table_less_80bytes)) - .int JMPTBL 
(L(write_51bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_52bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_53bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_54bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_55bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_56bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_57bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_58bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_59bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_60bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_61bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_62bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_63bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_64bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_65bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_66bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_67bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_68bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_69bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_70bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_71bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_72bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_73bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_74bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_75bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_76bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_77bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_78bytes), L(table_less_80bytes)) - .int JMPTBL (L(write_79bytes), L(table_less_80bytes)) - - .p2align 3 -L(shl_table): - .int JMPTBL (L(shl_0), L(shl_table)) - .int JMPTBL (L(shl_1), L(shl_table)) - .int JMPTBL (L(shl_2), L(shl_table)) - .int JMPTBL (L(shl_3), L(shl_table)) - .int JMPTBL (L(shl_4), L(shl_table)) - .int JMPTBL (L(shl_5), L(shl_table)) - .int JMPTBL (L(shl_6), L(shl_table)) - .int JMPTBL (L(shl_7), L(shl_table)) - .int JMPTBL (L(shl_8), L(shl_table)) - .int 
JMPTBL (L(shl_9), L(shl_table)) - .int JMPTBL (L(shl_10), L(shl_table)) - .int JMPTBL (L(shl_11), L(shl_table)) - .int JMPTBL (L(shl_12), L(shl_table)) - .int JMPTBL (L(shl_13), L(shl_table)) - .int JMPTBL (L(shl_14), L(shl_table)) - .int JMPTBL (L(shl_15), L(shl_table)) - - .p2align 3 -L(shl_table_bwd): - .int JMPTBL (L(shl_0_bwd), L(shl_table_bwd)) - .int JMPTBL (L(shl_1_bwd), L(shl_table_bwd)) - .int JMPTBL (L(shl_2_bwd), L(shl_table_bwd)) - .int JMPTBL (L(shl_3_bwd), L(shl_table_bwd)) - .int JMPTBL (L(shl_4_bwd), L(shl_table_bwd)) - .int JMPTBL (L(shl_5_bwd), L(shl_table_bwd)) - .int JMPTBL (L(shl_6_bwd), L(shl_table_bwd)) - .int JMPTBL (L(shl_7_bwd), L(shl_table_bwd)) - .int JMPTBL (L(shl_8_bwd), L(shl_table_bwd)) - .int JMPTBL (L(shl_9_bwd), L(shl_table_bwd)) - .int JMPTBL (L(shl_10_bwd), L(shl_table_bwd)) - .int JMPTBL (L(shl_11_bwd), L(shl_table_bwd)) - .int JMPTBL (L(shl_12_bwd), L(shl_table_bwd)) - .int JMPTBL (L(shl_13_bwd), L(shl_table_bwd)) - .int JMPTBL (L(shl_14_bwd), L(shl_table_bwd)) - .int JMPTBL (L(shl_15_bwd), L(shl_table_bwd)) - -#endif diff --git a/sysdeps/x86_64/multiarch/memmove-ssse3.S b/sysdeps/x86_64/multiarch/memmove-ssse3.S index 295430b1ef..215583e7bd 100644 --- a/sysdeps/x86_64/multiarch/memmove-ssse3.S +++ b/sysdeps/x86_64/multiarch/memmove-ssse3.S @@ -1,4 +1,380 @@ -#define USE_AS_MEMMOVE -#define MEMCPY __memmove_ssse3 -#define MEMCPY_CHK __memmove_chk_ssse3 -#include "memcpy-ssse3.S" +#include + +#ifndef MEMMOVE +# define MEMMOVE __memmove_ssse3 +# define MEMMOVE_CHK __memmove_chk_ssse3 +# define MEMCPY __memcpy_ssse3 +# define MEMCPY_CHK __memcpy_chk_ssse3 +# define MEMPCPY __mempcpy_ssse3 +# define MEMPCPY_CHK __mempcpy_chk_ssse3 +#endif + + .section .text.ssse3, "ax", @progbits +ENTRY(MEMPCPY_CHK) + cmp %RDX_LP, %RCX_LP + jb HIDDEN_JUMPTARGET(__chk_fail) +END(MEMPCPY_CHK) + +ENTRY(MEMPCPY) + mov %RDI_LP, %RAX_LP + add %RDX_LP, %RAX_LP + jmp L(start) +END(MEMPCPY) + +ENTRY(MEMMOVE_CHK) + cmp %RDX_LP, %RCX_LP + jb 
HIDDEN_JUMPTARGET(__chk_fail) +END(MEMMOVE_CHK) + +ENTRY_P2ALIGN(MEMMOVE, 6) + movq %rdi, %rax +L(start): + cmpq $16, %rdx + jb L(copy_0_15) + + /* These loads are always useful. */ + movups 0(%rsi), %xmm0 + movups -16(%rsi, %rdx), %xmm7 + cmpq $32, %rdx + ja L(more_2x_vec) + + movups %xmm0, 0(%rdi) + movups %xmm7, -16(%rdi, %rdx) + ret + + .p2align 4,, 4 +L(copy_0_15): + cmpl $4, %edx + jb L(copy_0_3) + cmpl $8, %edx + jb L(copy_4_7) + movq 0(%rsi), %rcx + movq -8(%rsi, %rdx), %rsi + movq %rcx, 0(%rdi) + movq %rsi, -8(%rdi, %rdx) + ret + + .p2align 4,, 4 +L(copy_4_7): + movl 0(%rsi), %ecx + movl -4(%rsi, %rdx), %esi + movl %ecx, 0(%rdi) + movl %esi, -4(%rdi, %rdx) + ret + + .p2align 4,, 4 +L(copy_0_3): + decl %edx + jl L(copy_0_0) + movb (%rsi), %cl + je L(copy_1_1) + + movzwl -1(%rsi, %rdx), %esi + movw %si, -1(%rdi, %rdx) +L(copy_1_1): + movb %cl, (%rdi) +L(copy_0_0): + ret + + .p2align 4,, 4 +L(copy_4x_vec): + movups 16(%rsi), %xmm1 + movups -32(%rsi, %rdx), %xmm2 + + movups %xmm0, 0(%rdi) + movups %xmm1, 16(%rdi) + movups %xmm2, -32(%rdi, %rdx) + movups %xmm7, -16(%rdi, %rdx) +L(nop): + ret + + .p2align 4 +L(more_2x_vec): + cmpq $64, %rdx + jbe L(copy_4x_vec) + + /* We use rcx later to get alignr value. */ + movq %rdi, %rcx + + /* Backward copy for overlap + dst > src for memmove safety. */ + subq %rsi, %rcx + cmpq %rdx, %rcx + jb L(copy_backward) + + /* Load tail. */ + + /* -16(%rsi, %rdx) already loaded into xmm7. */ + movups -32(%rsi, %rdx), %xmm8 + movups -48(%rsi, %rdx), %xmm9 + + /* Get misalignment. */ + andl $0xf, %ecx + + movq %rsi, %r9 + addq %rcx, %rsi + andq $-16, %rsi + /* Get first vec for `palignr`. */ + movaps (%rsi), %xmm1 + + /* We have loaded (%rsi) so safe to do this store before the + loop. 
*/ + movups %xmm0, (%rdi) + +#ifdef SHARED_CACHE_SIZE_HALF + cmp $SHARED_CACHE_SIZE_HALF, %RDX_LP +#else + cmp __x86_shared_cache_size_half(%rip), %rdx +#endif + ja L(large_memcpy) + + leaq -64(%rdi, %rdx), %r8 + andq $-16, %rdi + movl $48, %edx + + leaq L(loop_fwd_start)(%rip), %r9 + sall $6, %ecx + addq %r9, %rcx + jmp * %rcx + + .p2align 4,, 8 +L(copy_backward): + testq %rcx, %rcx + jz L(nop) + + /* Preload tail. */ + + /* (%rsi) already loaded into xmm0. */ + movups 16(%rsi), %xmm4 + movups 32(%rsi), %xmm5 + + movq %rdi, %r8 + subq %rdi, %rsi + leaq -49(%rdi, %rdx), %rdi + andq $-16, %rdi + addq %rdi, %rsi + andq $-16, %rsi + + movaps 48(%rsi), %xmm6 + + + leaq L(loop_bkwd_start)(%rip), %r9 + andl $0xf, %ecx + sall $6, %ecx + addq %r9, %rcx + jmp * %rcx + + .p2align 4,, 8 +L(large_memcpy): + movups -64(%r9, %rdx), %xmm10 + movups -80(%r9, %rdx), %xmm11 + + sall $5, %ecx + leal (%rcx, %rcx, 2), %r8d + leaq -96(%rdi, %rdx), %rcx + andq $-16, %rdi + leaq L(large_loop_fwd_start)(%rip), %rdx + addq %r8, %rdx + jmp * %rdx + + + /* Instead of a typical jump table all 16 loops are exactly + 64-bytes in size. So, we can just jump to first loop + r8 * + 64. Before modifying any loop ensure all their sizes match! + */ + .p2align 6 +L(loop_fwd_start): +L(loop_fwd_0x0): + movaps 16(%rsi), %xmm1 + movaps 32(%rsi), %xmm2 + movaps 48(%rsi), %xmm3 + movaps %xmm1, 16(%rdi) + movaps %xmm2, 32(%rdi) + movaps %xmm3, 48(%rdi) + addq %rdx, %rdi + addq %rdx, %rsi + cmpq %rdi, %r8 + ja L(loop_fwd_0x0) +L(end_loop_fwd): + movups %xmm9, 16(%r8) + movups %xmm8, 32(%r8) + movups %xmm7, 48(%r8) + ret + + /* Extactly 64 bytes if `jmp L(end_loop_fwd)` is long encoding. + 60 bytes otherwise. 
*/ +#define ALIGNED_LOOP_FWD(align_by); \ + .p2align 6; \ +L(loop_fwd_ ## align_by): \ + movaps 16(%rsi), %xmm0; \ + movaps 32(%rsi), %xmm2; \ + movaps 48(%rsi), %xmm3; \ + movaps %xmm3, %xmm4; \ + palignr $align_by, %xmm2, %xmm3; \ + palignr $align_by, %xmm0, %xmm2; \ + palignr $align_by, %xmm1, %xmm0; \ + movaps %xmm4, %xmm1; \ + movaps %xmm0, 16(%rdi); \ + movaps %xmm2, 32(%rdi); \ + movaps %xmm3, 48(%rdi); \ + addq %rdx, %rdi; \ + addq %rdx, %rsi; \ + cmpq %rdi, %r8; \ + ja L(loop_fwd_ ## align_by); \ + jmp L(end_loop_fwd); + + /* Must be in descending order. */ + ALIGNED_LOOP_FWD (0xf) + ALIGNED_LOOP_FWD (0xe) + ALIGNED_LOOP_FWD (0xd) + ALIGNED_LOOP_FWD (0xc) + ALIGNED_LOOP_FWD (0xb) + ALIGNED_LOOP_FWD (0xa) + ALIGNED_LOOP_FWD (0x9) + ALIGNED_LOOP_FWD (0x8) + ALIGNED_LOOP_FWD (0x7) + ALIGNED_LOOP_FWD (0x6) + ALIGNED_LOOP_FWD (0x5) + ALIGNED_LOOP_FWD (0x4) + ALIGNED_LOOP_FWD (0x3) + ALIGNED_LOOP_FWD (0x2) + ALIGNED_LOOP_FWD (0x1) + + .p2align 6 +L(large_loop_fwd_start): +L(large_loop_fwd_0x0): + movaps 16(%rsi), %xmm1 + movaps 32(%rsi), %xmm2 + movaps 48(%rsi), %xmm3 + movaps 64(%rsi), %xmm4 + movaps 80(%rsi), %xmm5 + movntps %xmm1, 16(%rdi) + movntps %xmm2, 32(%rdi) + movntps %xmm3, 48(%rdi) + movntps %xmm4, 64(%rdi) + movntps %xmm5, 80(%rdi) + addq $80, %rdi + addq $80, %rsi + cmpq %rdi, %rcx + ja L(large_loop_fwd_0x0) + + /* Ensure no icache line split on tail. */ + .p2align 4 +L(end_large_loop_fwd): + sfence + movups %xmm11, 16(%rcx) + movups %xmm10, 32(%rcx) + movups %xmm9, 48(%rcx) + movups %xmm8, 64(%rcx) + movups %xmm7, 80(%rcx) + ret + + + /* Size > 64 bytes and <= 96 bytes. 32-byte align between ensure + 96-byte spacing between each. 
*/ +#define ALIGNED_LARGE_LOOP_FWD(align_by); \ + .p2align 5; \ +L(large_loop_fwd_ ## align_by): \ + movaps 16(%rsi), %xmm0; \ + movaps 32(%rsi), %xmm2; \ + movaps 48(%rsi), %xmm3; \ + movaps 64(%rsi), %xmm4; \ + movaps 80(%rsi), %xmm5; \ + movaps %xmm5, %xmm6; \ + palignr $align_by, %xmm4, %xmm5; \ + palignr $align_by, %xmm3, %xmm4; \ + palignr $align_by, %xmm2, %xmm3; \ + palignr $align_by, %xmm0, %xmm2; \ + palignr $align_by, %xmm1, %xmm0; \ + movaps %xmm6, %xmm1; \ + movntps %xmm0, 16(%rdi); \ + movntps %xmm2, 32(%rdi); \ + movntps %xmm3, 48(%rdi); \ + movntps %xmm4, 64(%rdi); \ + movntps %xmm5, 80(%rdi); \ + addq $80, %rdi; \ + addq $80, %rsi; \ + cmpq %rdi, %rcx; \ + ja L(large_loop_fwd_ ## align_by); \ + jmp L(end_large_loop_fwd); + + /* Must be in descending order. */ + ALIGNED_LARGE_LOOP_FWD (0xf) + ALIGNED_LARGE_LOOP_FWD (0xe) + ALIGNED_LARGE_LOOP_FWD (0xd) + ALIGNED_LARGE_LOOP_FWD (0xc) + ALIGNED_LARGE_LOOP_FWD (0xb) + ALIGNED_LARGE_LOOP_FWD (0xa) + ALIGNED_LARGE_LOOP_FWD (0x9) + ALIGNED_LARGE_LOOP_FWD (0x8) + ALIGNED_LARGE_LOOP_FWD (0x7) + ALIGNED_LARGE_LOOP_FWD (0x6) + ALIGNED_LARGE_LOOP_FWD (0x5) + ALIGNED_LARGE_LOOP_FWD (0x4) + ALIGNED_LARGE_LOOP_FWD (0x3) + ALIGNED_LARGE_LOOP_FWD (0x2) + ALIGNED_LARGE_LOOP_FWD (0x1) + + + .p2align 6 +L(loop_bkwd_start): +L(loop_bkwd_0x0): + movaps 32(%rsi), %xmm1 + movaps 16(%rsi), %xmm2 + movaps 0(%rsi), %xmm3 + movaps %xmm1, 32(%rdi) + movaps %xmm2, 16(%rdi) + movaps %xmm3, 0(%rdi) + subq $48, %rdi + subq $48, %rsi + cmpq %rdi, %r8 + jb L(loop_bkwd_0x0) +L(end_loop_bkwd): + movups %xmm7, -16(%r8, %rdx) + movups %xmm0, 0(%r8) + movups %xmm4, 16(%r8) + movups %xmm5, 32(%r8) + + ret + + + /* Extactly 64 bytes if `jmp L(end_loop_bkwd)` is long encoding. + 60 bytes otherwise. 
*/ +#define ALIGNED_LOOP_BKWD(align_by); \ + .p2align 6; \ +L(loop_bkwd_ ## align_by): \ + movaps 32(%rsi), %xmm1; \ + movaps 16(%rsi), %xmm2; \ + movaps 0(%rsi), %xmm3; \ + palignr $align_by, %xmm1, %xmm6; \ + palignr $align_by, %xmm2, %xmm1; \ + palignr $align_by, %xmm3, %xmm2; \ + movaps %xmm6, 32(%rdi); \ + movaps %xmm1, 16(%rdi); \ + movaps %xmm2, 0(%rdi); \ + subq $48, %rdi; \ + subq $48, %rsi; \ + movaps %xmm3, %xmm6; \ + cmpq %rdi, %r8; \ + jb L(loop_bkwd_ ## align_by); \ + jmp L(end_loop_bkwd); + + /* Must be in descending order. */ + ALIGNED_LOOP_BKWD (0xf) + ALIGNED_LOOP_BKWD (0xe) + ALIGNED_LOOP_BKWD (0xd) + ALIGNED_LOOP_BKWD (0xc) + ALIGNED_LOOP_BKWD (0xb) + ALIGNED_LOOP_BKWD (0xa) + ALIGNED_LOOP_BKWD (0x9) + ALIGNED_LOOP_BKWD (0x8) + ALIGNED_LOOP_BKWD (0x7) + ALIGNED_LOOP_BKWD (0x6) + ALIGNED_LOOP_BKWD (0x5) + ALIGNED_LOOP_BKWD (0x4) + ALIGNED_LOOP_BKWD (0x3) + ALIGNED_LOOP_BKWD (0x2) + ALIGNED_LOOP_BKWD (0x1) +END(MEMMOVE) + +strong_alias (MEMMOVE, MEMCPY) +strong_alias (MEMMOVE_CHK, MEMCPY_CHK) -- 2.25.1