From: Duke Abbaddon
Date: Fri, 19 Jan 2024 05:58:18 +0000
Subject: SOLVE (c)RS Multi-line Packed-Bit Int SiMD Maths : Relevance HDR, WCG, ML Machine Learning (Most advantaged ADDER Maths)
To: amd.gcc@amd.com

Multi-line Packed-Bit Int SiMD Maths : Relevance HDR, WCG, ML Machine Learning (Most advantaged ADDER Maths)

The rules of multiple
Maths operations with lower Bit widths packed into one SiMD register: 256Bit is the example, but 64Bit, 128Bit & 512Bit can be used.

In all methods you use packed bits per save, so a single line save or load: Parallel, No RAM thrashing.

You cannot flow a 16Bit block into another segment (the next 16Bit block).

You can however use the 9th bit as a separator, & rolling an addition into that bit means a more accurate result! In 32Bit you do 3 * 8Bit & 1 * 4Bit; in this example the 4Bit op has 5Bit results & the 8Bit ops have 9Bit results (3 * 9 + 5 = 32). This is preferable!

Table (operand width : number of operations), e.g. one 2Bit, 3Bit or 4Bit operation with three 8Bit operations:

32Bit    4 : 1, 8 : 3
64Bit    4 : 2, 8 : 6
         2 : 1, 7 : 8
         3 : 1, 8 : 1, 16 : 3

Addition is the only place where 16Bit * 4 = 64Bit works easily, but when you ADD or SUBTRACT you can only roll within each 16Bit segment, up to its boundary, & not into the higher or lower segment.

A: In order to multiply you need adaptable rules for division & multiplication.
B: You need a dividable Maths unit with AND, OR & NOT gates to segment the registered Mul SiMD Unit.

In the case of + & * you need to use single line rule addition (no overflow per pixel), & either many AND-OR / NOT gate layers or Parallel 16Bit blocks.

You can however, painful as it is, Multi Load & Zero the remainder registers, & AND, OR, XOR or NOT the remainder to 00000 on higher depth instructions & so remain pure!

8Bit blocks are a bit small & we use HDR & WCG, so mostly pointless! We can however 8Bit Write a patch of palette & subdivide our colour palette & Light Shadow Curves in anything over 8Bit depth colour.

In the case of the Intel 8Bit * 8 Inferencing unit: 16Bit Colour is probably in (WCG 8 * 8) + (HDR 8 * 8) Segments.

In any case Addition is fortunately what we need! So with ADD we can use SiMD & Integer Today.
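The two addition rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a definitive implementation: the names FIELDS_32, pack, unpack, add_packed & add_u16x4 are hypothetical, the field order (low to high) is assumed, & the masked add used for the 16Bit * 4 case is the common mask-and-XOR trick swapped in to stand for the AND / OR / NOT gate layer. The separator bits must be zero before the add, so the first part covers one addition per pack.

```python
# Assumed layout from the 32Bit example: three 8Bit values in 9Bit fields
# plus one 4Bit value in a 5Bit field (3 * 9 + 5 = 32), listed low to high.
FIELDS_32 = [(8, 9), (8, 9), (8, 9), (4, 5)]  # (value bits, field bits)


def pack(values, fields=FIELDS_32):
    """Pack small unsigned ints into one word, leaving a zero separator bit above each."""
    word, shift = 0, 0
    for value, (value_bits, field_bits) in zip(values, fields):
        if not 0 <= value < (1 << value_bits):
            raise ValueError(f"{value} does not fit in {value_bits} bits")
        word |= value << shift
        shift += field_bits
    return word


def unpack(word, fields=FIELDS_32):
    """Split a word into its fields; each result includes its separator (carry) bit."""
    values, shift = [], 0
    for _, field_bits in fields:
        values.append((word >> shift) & ((1 << field_bits) - 1))
        shift += field_bits
    return values


def add_packed(a, b):
    """One ordinary 32Bit add performs all four small additions at once.

    Each carry lands in its own separator bit, so it never reaches the next field.
    """
    return (a + b) & 0xFFFFFFFF


LANE_HIGH_BITS = 0x8000_8000_8000_8000  # top bit of each 16Bit lane
LANE_LOW_BITS = 0x7FFF_7FFF_7FFF_7FFF   # lower 15 bits of each 16Bit lane


def add_u16x4(a, b):
    """Add four 16Bit lanes held in 64Bit words; each lane wraps modulo 2**16.

    No separator bits here, so carries are stopped with masks instead.
    """
    low = (a & LANE_LOW_BITS) + (b & LANE_LOW_BITS)  # carries stop at bit 15 of each lane
    return low ^ ((a ^ b) & LANE_HIGH_BITS)          # add the top bits with no carry out


if __name__ == "__main__":
    x = pack([200, 100, 255, 9])
    y = pack([100, 27, 255, 15])
    print(unpack(add_packed(x, y)))
    print(hex(add_u16x4(0xFFFF_0001_8000_1234, 0x0001_0001_8000_0001)))
```

With the separator bit the 8Bit fields keep their full 9Bit sums (for example 255 + 255 = 510), while the masked 16Bit lanes wrap inside their own segment rather than spilling into the neighbour.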
Rupert S

https://science.n-helix.com/2018/01/integer-floats-with-remainder-theory.html
https://science.n-helix.com/2021/11/parallel-execution.html
https://science.n-helix.com/2022/10/ml.html
https://science.n-helix.com/2021/03/brain-bit-precision-int32-fp32-int16.html
https://science.n-helix.com/2023/06/map.html