[Patch,AVR] PR54222: Add fixed point support
From: Georg-Johann Lay @ 2012-08-10 15:53 UTC (permalink / raw)
To: gcc-patches
Cc: Denis Chertykov, Eric Weddington, Sean D'Epagnier, Joerg Wunsch

[-- Attachment #1: Type: text/plain, Size: 4838 bytes --]

This patch adds fixed-point support to the avr target.  It is based on the work of Sean, see
http://lists.gnu.org/archive/html/avr-gcc-list/2012-07/msg00030.html

This patch has several changes compared to Sean's patch:

* Additions and subtractions are merged with the existing integer-mode operations by means of mode iterators.  The output routines are generic enough to handle fixed-point, too.  The changes were minimal, e.g. some new fixed-point constraints.

* Similarly for shifts and comparisons.

* The patch implements neither NEG nor ABS.  The standard requires saturation, e.g. negating -0x80 must not wrap back to -0x80.  This is work still to be done, but just a matter of optimization.

* avr-modes.def adjusts TAmode and UTAmode to be 64-bit modes.  The GCC default of 128 bits is too extreme for AVR.  Besides that, the libgcc machinery won't generate TA/UTA functions because it thinks the mode is too big (it does not account for ADJUST_BYTESIZE), and the respective functions come out empty.  TA/UTA have 48 fractional bits so that the user can pick two different resolutions of 64-bit Accum types.

* To make TA/UTA work, avr-lib.h needs some hand-made defines.

* There are no middle-end changes.  The original patch changed rtl.h, cse.c, fold-const.c and varasm.c.

The patch works out fine.  However, because of PR53923, which shreds the AVR port, no reasonable testing is currently possible.  Work still to be done is better testing once PR53923 is fixed and the AVR port works properly again.  And there are many possible tweaks, e.g. the NEG implementation mentioned above, but that can be done later.

Ok for trunk?
Johann

libgcc/
	PR target/54222
	* config/avr/lib1funcs-fixed.S: New file.
	* config/avr/lib1funcs.S: Include it.
	Undefine some divmodsi after they are used.
	(neg2, neg4): New macros.
	* config/avr/avr-lib.h (TA, UTA): Adjust according to gcc's
	avr-modes.def.
	* config/avr/t-avr (LIB1ASMFUNCS): Add:
	_fractqqsf, _fractuqqsf, _fracthqsf, _fractuhqsf, _fracthasf,
	_fractuhasf, _fractsasf, _fractusasf,
	_fractsfqq, _fractsfuqq, _fractsfhq, _fractsfuhq, _fractsfha,
	_fractsfsa,
	_mulqq3, _muluqq3, _mulhq3, _muluhq3, _mulha3, _muluha3,
	_mulsa3, _mulusa3,
	_divqq3, _udivuqq3, _divhq3, _udivuhq3, _divha3, _udivuha3,
	_divsa3, _udivusa3.

gcc/
	PR target/54222
	* avr-modes.def (HA, SA, DA, TA, UTA): Adjust modes.
	* avr/avr-fixed.md: New file.
	* avr/avr.md: Include it.
	(cc): Add: minus.
	(adjust_len): Add: minus, minus64, ufract, sfract.
	(ALL1, ALL2, ALL4, ORDERED234): New mode iterators.
	(MOVMODE): Add: QQ, UQQ, HQ, UHQ, HA, UHA, SQ, USQ, SA, USA.
	(MPUSH): Add: HQ, UHQ, HA, UHA, SQ, USQ, SA, USA.
	(pushqi1, xload8_A, xload_8, movqi_insn, *reload_inqi, addqi3,
	subqi3, ashlqi3, *ashlqi3, ashrqi3, lshrqi3, *lshrqi3, *cmpqi,
	cbranchqi4, *cpse.eq): Generalize to handle all 8-bit modes in ALL1.
	(*movhi, reload_inhi, addhi3, *addhi3, addhi3_clobber, subhi3,
	ashlhi3, *ashlhi3_const, ashrhi3, *ashrhi3_const, lshrhi3,
	*lshrhi3_const, *cmphi, cbranchhi4): Generalize to handle all
	16-bit modes in ALL2.
	(subhi3, casesi, strlenhi): Add clobber when expanding minus:HI.
	(*movsi, *reload_insi, addsi3, subsi3, ashlsi3, *ashlsi3_const,
	ashrsi3, *ashrhi3_const, *ashrsi3_const, lshrsi3, *lshrsi3_const,
	*reversed_tstsi, *cmpsi, cbranchsi4): Generalize to handle all
	32-bit modes in ALL4.
	* avr-dimode.md (ALL8): New mode iterator.
	(adddi3, adddi3_insn, adddi3_const_insn, subdi3, subdi3_insn,
	subdi3_const_insn, cbranchdi4, compare_di2, compare_const_di2,
	ashrdi3, lshrdi3, rotldi3, ashldi3_insn, ashrdi3_insn,
	lshrdi3_insn, rotldi3_insn): Generalize to handle all 64-bit
	modes in ALL8.
	* config/avr/avr-protos.h (avr_to_int_mode): New prototype.
	(avr_out_fract, avr_out_minus, avr_out_minus64): New prototypes.
	* config/avr/avr.c (TARGET_FIXED_POINT_SUPPORTED_P): Return true.
	(avr_scalar_mode_supported_p): Allow if ALL_FIXED_POINT_MODE_P.
	(avr_builtin_setjmp_frame_value): Use gen_subhi3 and return new
	pseudo instead of gen_rtx_MINUS.
	(avr_print_operand, avr_operand_rtx_cost): Handle: CONST_FIXED.
	(notice_update_cc): Handle: CC_MINUS.
	(output_movqi): Generalize to handle respective fixed-point modes.
	(output_movhi, output_movsisf, avr_2word_insn_p): Ditto.
	(avr_out_compare, avr_out_plus_1): Also handle fixed-point modes.
	(avr_assemble_integer): Ditto.
	(output_reload_in_const, output_reload_insisf): Ditto.
	(avr_out_fract, avr_out_minus, avr_out_minus64): New functions.
	(avr_to_int_mode): New function.
	(adjust_insn_length): Handle: ADJUST_LEN_SFRACT, ADJUST_LEN_UFRACT,
	ADJUST_LEN_MINUS, ADJUST_LEN_MINUS64.
	* config/avr/predicates.md (const0_operand): Allow const_fixed.
	(const_operand, const_or_immediate_operand): New.
	(nonmemory_or_const_operand): New.
	* config/avr/constraints.md (Ynn, Y00, Y01, Y02, Ym1, Ym2, YIJ):
	New constraints.

[-- Attachment #2: fixed-48.diff --]
[-- Type: text/x-patch, Size: 147887 bytes --]

Index: gcc/config/avr/predicates.md
===================================================================
--- gcc/config/avr/predicates.md	(revision 190299)
+++ gcc/config/avr/predicates.md	(working copy)
@@ -74,7 +74,7 @@ (define_predicate "nox_general_operand"
 ;; Return 1 if OP is the zero constant for MODE.
 (define_predicate "const0_operand"
-  (and (match_code "const_int,const_double")
+  (and (match_code "const_int,const_fixed,const_double")
        (match_test "op == CONST0_RTX (mode)")))
 
 ;; Return 1 if OP is the one constant integer for MODE.
@@ -248,3 +248,21 @@ (define_predicate "s16_operand"
 (define_predicate "o16_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), -(1<<16), -1)")))
+
+;; Const int, fixed, or double operand
+(define_predicate "const_operand"
+  (ior (match_code "const_fixed")
+       (match_code "const_double")
+       (match_operand 0 "const_int_operand")))
+
+;; Nonmemory, const fixed, or const double operand
+(define_predicate "nonmemory_or_const_operand"
+  (ior (match_code "const_fixed")
+       (match_code "const_double")
+       (match_operand 0 "nonmemory_operand")))
+
+;; Immediate, const fixed, or const double operand
+(define_predicate "const_or_immediate_operand"
+  (ior (match_code "const_fixed")
+       (match_code "const_double")
+       (match_operand 0 "immediate_operand")))
Index: gcc/config/avr/avr-fixed.md
===================================================================
--- gcc/config/avr/avr-fixed.md	(revision 0)
+++ gcc/config/avr/avr-fixed.md	(revision 0)
@@ -0,0 +1,334 @@
+;; This file contains instructions that support fixed-point operations
+;; for Atmel AVR micro controllers.
+;; Copyright (C) 2012
+;; Free Software Foundation, Inc.
+;; Contributed by Sean D'Epagnier (sean@depagnier.com)
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_mode_iterator ALL1Q [(QQ "") (UQQ "")])
+(define_mode_iterator ALL2Q [(HQ "") (UHQ "")])
+(define_mode_iterator ALL2A [(HA "") (UHA "")])
+(define_mode_iterator ALL2QA [(HQ "") (UHQ "")
+                              (HA "") (UHA "")])
+(define_mode_iterator ALL4A [(SA "") (USA "")])
+
+;;; Conversions
+
+(define_mode_iterator FIXED_A
+  [(QQ "") (UQQ "")
+   (HQ "") (UHQ "") (HA "") (UHA "")
+   (SQ "") (USQ "") (SA "") (USA "")
+   (DQ "") (UDQ "") (DA "") (UDA "")
+   (TA "") (UTA "")
+   (QI "") (HI "") (SI "") (DI "")])
+
+;; Same so that we can build cross products
+
+(define_mode_iterator FIXED_B
+  [(QQ "") (UQQ "")
+   (HQ "") (UHQ "") (HA "") (UHA "")
+   (SQ "") (USQ "") (SA "") (USA "")
+   (DQ "") (UDQ "") (DA "") (UDA "")
+   (TA "") (UTA "")
+   (QI "") (HI "") (SI "") (DI "")])
+
+(define_insn "fract<FIXED_B:mode><FIXED_A:mode>2"
+  [(set (match_operand:FIXED_A 0 "register_operand" "=r")
+        (fract_convert:FIXED_A
+         (match_operand:FIXED_B 1 "register_operand" "r")))]
+  "<FIXED_B:MODE>mode != <FIXED_A:MODE>mode"
+  {
+    return avr_out_fract (insn, operands, true, NULL);
+  }
+  [(set_attr "cc" "clobber")
+   (set_attr "adjust_len" "sfract")])
+
+(define_insn "fractuns<FIXED_B:mode><FIXED_A:mode>2"
+  [(set (match_operand:FIXED_A 0 "register_operand" "=r")
+        (unsigned_fract_convert:FIXED_A
+         (match_operand:FIXED_B 1 "register_operand" "r")))]
+  "<FIXED_B:MODE>mode != <FIXED_A:MODE>mode"
+  {
+    return avr_out_fract (insn, operands, false, NULL);
+  }
+  [(set_attr "cc" "clobber")
+   (set_attr "adjust_len" "ufract")])
+
+;******************************************************************************
+; mul
+
+(define_insn "mulqq3"
+  [(set (match_operand:QQ 0 "register_operand" "=r")
+        (mult:QQ (match_operand:QQ 1 "register_operand" "a")
+                 (match_operand:QQ 2 "register_operand" "a")))]
+  "AVR_HAVE_MUL"
+  "fmuls %1,%2\;mov %0,r1\;clr __zero_reg__"
+  [(set_attr "length" "3")
+   (set_attr "cc" "clobber")])
+
+(define_insn "muluqq3"
+  [(set (match_operand:UQQ 0 "register_operand" "=r")
+        (mult:UQQ
          (match_operand:UQQ 1 "register_operand" "r")
+          (match_operand:UQQ 2 "register_operand" "r")))]
+  "AVR_HAVE_MUL"
+  "mul %1,%2\;mov %0,r1\;clr __zero_reg__"
+  [(set_attr "length" "3")
+   (set_attr "cc" "clobber")])
+
+;; (reg:ALL2Q 20) not clobbered on the enhanced core.
+;; Use registers from 16-23 so we can use FMULS.
+;; All call-used registers clobbered otherwise - normal library call.
+
+;; "mulhq3" "muluhq3"
+(define_expand "mul<mode>3"
+  [(set (reg:ALL2Q 22)
+        (match_operand:ALL2Q 1 "register_operand" ""))
+   (set (reg:ALL2Q 20)
+        (match_operand:ALL2Q 2 "register_operand" ""))
+   (parallel [(set (reg:ALL2Q 18)
+                   (mult:ALL2Q (reg:ALL2Q 22)
+                               (reg:ALL2Q 20)))
+              (clobber (reg:ALL2Q 22))])
+   (set (match_operand:ALL2Q 0 "register_operand" "")
+        (reg:ALL2Q 18))]
+  "AVR_HAVE_MUL")
+
+;; "*mulhq3_enh_call" "*muluhq3_enh_call"
+(define_insn "*mul<mode>3_enh_call"
+  [(set (reg:ALL2Q 18)
+        (mult:ALL2Q (reg:ALL2Q 22)
+                    (reg:ALL2Q 20)))
+   (clobber (reg:ALL2Q 22))]
+  "AVR_HAVE_MUL"
+  "%~call __mul<mode>3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+; Special calls, with and without MUL.
+;; "mulha3" "muluha3"
+(define_expand "mul<mode>3"
+  [(set (reg:ALL2A 22)
+        (match_operand:ALL2A 1 "register_operand" ""))
+   (set (reg:ALL2A 20)
+        (match_operand:ALL2A 2 "register_operand" ""))
+   (parallel [(set (reg:ALL2A 18)
+                   (mult:ALL2A (reg:ALL2A 22)
+                               (reg:ALL2A 20)))
+              (clobber (reg:ALL2A 22))])
+   (set (match_operand:ALL2A 0 "register_operand" "")
+        (reg:ALL2A 18))]
+  ""
+  {
+    if (!AVR_HAVE_MUL)
+      {
+        emit_insn (gen_mul<mode>3_call (operands[0], operands[1], operands[2]));
+        DONE;
+      }
+  })
+
+;; "*mulha3_enh" "*muluha3_enh"
+(define_insn "*mul<mode>3_enh"
+  [(set (reg:ALL2A 18)
+        (mult:ALL2A (reg:ALL2A 22)
+                    (reg:ALL2A 20)))
+   (clobber (reg:ALL2A 22))]
+  "AVR_HAVE_MUL"
+  "%~call __mul<mode>3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+;; Without MUL.  Clobbers both inputs, needs a separate output register.
+
+;; "mulha3_call" "muluha3_call"
+(define_expand "mul<mode>3_call"
+  [(set (reg:ALL2A 24)
+        (match_operand:ALL2A 1 "register_operand" ""))
+   (set (reg:ALL2A 22)
+        (match_operand:ALL2A 2 "register_operand" ""))
+   (parallel [(set (reg:ALL2A 18)
+                   (mult:ALL2A (reg:ALL2A 22)
+                               (reg:ALL2A 24)))
+              (clobber (reg:ALL2A 22))
+              (clobber (reg:ALL2A 24))])
+   (set (match_operand:ALL2A 0 "register_operand" "")
+        (reg:ALL2A 18))]
+  "!AVR_HAVE_MUL")
+
+;; "*mulha3_call" "*muluha3_call"
+(define_insn "*mul<mode>3_call"
+  [(set (reg:ALL2A 18)
+        (mult:ALL2A (reg:ALL2A 22)
+                    (reg:ALL2A 24)))
+   (clobber (reg:ALL2A 22))
+   (clobber (reg:ALL2A 24))]
+  "!AVR_HAVE_MUL"
+  "%~call __mul<mode>3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+;; On the enhanced core, don't clobber either input and use a separate output.
+;; R15 is needed as a zero register since R1 aka. __zero_reg__ is used for MUL.
+
+;; "mulsa3" "mulusa3"
+(define_expand "mul<mode>3"
+  [(set (reg:ALL4A 16)
+        (match_operand:ALL4A 1 "register_operand" ""))
+   (set (reg:ALL4A 20)
+        (match_operand:ALL4A 2 "register_operand" ""))
+   (parallel [(set (reg:ALL4A 24)
+                   (mult:ALL4A (reg:ALL4A 16)
+                               (reg:ALL4A 20)))
+              (clobber (reg:QI 15))])
+   (set (match_operand:ALL4A 0 "register_operand" "")
+        (reg:ALL4A 24))]
+  ""
+  {
+    if (!AVR_HAVE_MUL)
+      {
+        emit_insn (gen_mul<mode>3_call (operands[0], operands[1], operands[2]));
+        DONE;
+      }
+  })
+
+;; "*mulsa3_enh" "*mulusa3_enh"
+(define_insn "*mul<mode>3_enh"
+  [(set (reg:ALL4A 24)
+        (mult:ALL4A (reg:ALL4A 16)
+                    (reg:ALL4A 20)))
+   (clobber (reg:QI 15))]
+  "AVR_HAVE_MUL"
+  "%~call __mul<mode>3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+;; Without MUL.  Clobbers both inputs, needs a separate output,
+;; needs two more scratch registers.
+
+;; "mulsa3_call" "mulusa3_call"
+(define_expand "mul<mode>3_call"
+  [(set (reg:ALL4A 18)
+        (match_operand:ALL4A 1 "register_operand" ""))
+   (set (reg:ALL4A 24)
+        (match_operand:ALL4A 2 "register_operand" ""))
+   (parallel [(set (reg:ALL4A 14)
+                   (mult:ALL4A (reg:ALL4A 18)
+                               (reg:ALL4A 24)))
+              (clobber (reg:ALL4A 18))
+              (clobber (reg:ALL4A 24))
+              (clobber (reg:HI 22))])
+   (set (match_operand:ALL4A 0 "register_operand" "")
+        (reg:ALL4A 14))]
+  "!AVR_HAVE_MUL")
+
+;; "*mulsa3_call" "*mulusa3_call"
+(define_insn "*mul<mode>3_call"
+  [(set (reg:ALL4A 14)
+        (mult:ALL4A (reg:ALL4A 18)
+                    (reg:ALL4A 24)))
+   (clobber (reg:ALL4A 18))
+   (clobber (reg:ALL4A 24))
+   (clobber (reg:HI 22))]
+  "!AVR_HAVE_MUL"
+  "%~call __mul<mode>3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+; / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
+; div
+
+;; Expand signed and unsigned in one shot
+(define_code_iterator usdiv [udiv div])
+
+;; "divqq3" "udivuqq3"
+(define_expand "<code><mode>3"
+  [(set (reg:ALL1Q 25)
+        (match_operand:ALL1Q 1 "register_operand" ""))
+   (set (reg:ALL1Q 22)
+        (match_operand:ALL1Q 2 "register_operand" ""))
+   (parallel [(set (reg:ALL1Q 24)
+                   (usdiv:ALL1Q (reg:ALL1Q 25)
+                                (reg:ALL1Q 22)))
+              (clobber (reg:ALL1Q 25))])
+   (set (match_operand:ALL1Q 0 "register_operand" "")
+        (reg:ALL1Q 24))])
+
+;; "*divqq3_call" "*udivuqq3_call"
+(define_insn "*<code><mode>3_call"
+  [(set (reg:ALL1Q 24)
+        (usdiv:ALL1Q (reg:ALL1Q 25)
+                     (reg:ALL1Q 22)))
+   (clobber (reg:ALL1Q 25))]
+  ""
+  "%~call __<code><mode>3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+;; "divhq3" "udivuhq3" "divha3" "udivuha3"
+(define_expand "<code><mode>3"
+  [(set (reg:ALL2QA 26)
+        (match_operand:ALL2QA 1 "register_operand" ""))
+   (set (reg:ALL2QA 22)
+        (match_operand:ALL2QA 2 "register_operand" ""))
+   (parallel [(set (reg:ALL2QA 24)
+                   (usdiv:ALL2QA (reg:ALL2QA 26)
+                                 (reg:ALL2QA 22)))
+              (clobber (reg:ALL2QA 26))
+              (clobber (reg:QI 21))])
+   (set
      (match_operand:ALL2QA 0 "register_operand" "")
+        (reg:ALL2QA 24))])
+
+;; "*divhq3_call" "*udivuhq3_call"
+;; "*divha3_call" "*udivuha3_call"
+(define_insn "*<code><mode>3_call"
+  [(set (reg:ALL2QA 24)
+        (usdiv:ALL2QA (reg:ALL2QA 26)
+                      (reg:ALL2QA 22)))
+   (clobber (reg:ALL2QA 26))
+   (clobber (reg:QI 21))]
+  ""
+  "%~call __<code><mode>3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+;; Note the first parameter gets passed in already offset by 2 bytes
+
+;; "divsa3" "udivusa3"
+(define_expand "<code><mode>3"
+  [(set (reg:ALL4A 24)
+        (match_operand:ALL4A 1 "register_operand" ""))
+   (set (reg:ALL4A 18)
+        (match_operand:ALL4A 2 "register_operand" ""))
+   (parallel [(set (reg:ALL4A 22)
+                   (usdiv:ALL4A (reg:ALL4A 24)
+                                (reg:ALL4A 18)))
+              (clobber (reg:HI 26))
+              (clobber (reg:HI 30))])
+   (set (match_operand:ALL4A 0 "register_operand" "")
+        (reg:ALL4A 22))])
+
+;; "*divsa3_call" "*udivusa3_call"
+(define_insn "*<code><mode>3_call"
+  [(set (reg:ALL4A 22)
+        (usdiv:ALL4A (reg:ALL4A 24)
+                     (reg:ALL4A 18)))
+   (clobber (reg:HI 26))
+   (clobber (reg:HI 30))]
+  ""
+  "%~call __<code><mode>3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
Index: gcc/config/avr/avr-dimode.md
===================================================================
--- gcc/config/avr/avr-dimode.md	(revision 190299)
+++ gcc/config/avr/avr-dimode.md	(working copy)
@@ -47,44 +47,58 @@ (define_constants
   [(ACC_A 18)
    (ACC_B 10)])
 
+;; Supported modes that are 8 bytes wide
+(define_mode_iterator ALL8 [(DI "")
+                            (DQ "") (UDQ "")
+                            (DA "") (UDA "")
+                            (TA "") (UTA "")])
+
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;; Addition
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 
-(define_expand "adddi3"
-  [(parallel [(match_operand:DI 0 "general_operand" "")
-              (match_operand:DI 1 "general_operand" "")
-              (match_operand:DI 2 "general_operand" "")])]
+;; "adddi3"
+;; "adddq3" "addudq3"
+;; "addda3" "adduda3"
+;; "addta3" "adduta3"
+(define_expand
 "add<mode>3"
+  [(parallel [(match_operand:ALL8 0 "general_operand" "")
+              (match_operand:ALL8 1 "general_operand" "")
+              (match_operand:ALL8 2 "general_operand" "")])]
   "avr_have_dimode"
   {
-    rtx acc_a = gen_rtx_REG (DImode, ACC_A);
+    rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A);
 
     emit_move_insn (acc_a, operands[1]);
 
-    if (s8_operand (operands[2], VOIDmode))
+    if (DImode == <MODE>mode
+        && s8_operand (operands[2], VOIDmode))
       {
        emit_move_insn (gen_rtx_REG (QImode, REG_X), operands[2]);
        emit_insn (gen_adddi3_const8_insn ());
       }
-    else if (CONST_INT_P (operands[2])
-             || CONST_DOUBLE_P (operands[2]))
+    else if (const_operand (operands[2], GET_MODE (operands[2])))
       {
-        emit_insn (gen_adddi3_const_insn (operands[2]));
+        emit_insn (gen_add<mode>3_const_insn (operands[2]));
       }
     else
       {
-        emit_move_insn (gen_rtx_REG (DImode, ACC_B), operands[2]);
-        emit_insn (gen_adddi3_insn ());
+        emit_move_insn (gen_rtx_REG (<MODE>mode, ACC_B), operands[2]);
+        emit_insn (gen_add<mode>3_insn ());
       }
     emit_move_insn (operands[0], acc_a);
     DONE;
   })
 
-(define_insn "adddi3_insn"
-  [(set (reg:DI ACC_A)
-        (plus:DI (reg:DI ACC_A)
-                 (reg:DI ACC_B)))]
+;; "adddi3_insn"
+;; "adddq3_insn" "addudq3_insn"
+;; "addda3_insn" "adduda3_insn"
+;; "addta3_insn" "adduta3_insn"
+(define_insn "add<mode>3_insn"
+  [(set (reg:ALL8 ACC_A)
+        (plus:ALL8 (reg:ALL8 ACC_A)
+                   (reg:ALL8 ACC_B)))]
   "avr_have_dimode"
   "%~call __adddi3"
   [(set_attr "adjust_len" "call")
@@ -99,10 +113,14 @@ (define_insn "adddi3_const8_insn"
   [(set_attr "adjust_len" "call")
    (set_attr "cc" "clobber")])
 
-(define_insn "adddi3_const_insn"
-  [(set (reg:DI ACC_A)
-        (plus:DI (reg:DI ACC_A)
-                 (match_operand:DI 0 "const_double_operand" "n")))]
+;; "adddi3_const_insn"
+;; "adddq3_const_insn" "addudq3_const_insn"
+;; "addda3_const_insn" "adduda3_const_insn"
+;; "addta3_const_insn" "adduta3_const_insn"
+(define_insn "add<mode>3_const_insn"
+  [(set (reg:ALL8 ACC_A)
+        (plus:ALL8 (reg:ALL8 ACC_A)
+                   (match_operand:ALL8 0 "const_operand" "n Ynn")))]
   "avr_have_dimode
    &&
 !s8_operand (operands[0], VOIDmode)"
   {
@@ -116,30 +134,62 @@ (define_insn "adddi3_const_insn"
 ;; Subtraction
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 
-(define_expand "subdi3"
-  [(parallel [(match_operand:DI 0 "general_operand" "")
-              (match_operand:DI 1 "general_operand" "")
-              (match_operand:DI 2 "general_operand" "")])]
+;; "subdi3"
+;; "subdq3" "subudq3"
+;; "subda3" "subuda3"
+;; "subta3" "subuta3"
+(define_expand "sub<mode>3"
+  [(parallel [(match_operand:ALL8 0 "general_operand" "")
+              (match_operand:ALL8 1 "general_operand" "")
+              (match_operand:ALL8 2 "general_operand" "")])]
   "avr_have_dimode"
   {
-    rtx acc_a = gen_rtx_REG (DImode, ACC_A);
+    rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A);
 
     emit_move_insn (acc_a, operands[1]);
-    emit_move_insn (gen_rtx_REG (DImode, ACC_B), operands[2]);
-    emit_insn (gen_subdi3_insn ());
+
+    if (const_operand (operands[2], GET_MODE (operands[2])))
+      {
+        emit_insn (gen_sub<mode>3_const_insn (operands[2]));
+      }
+    else
+      {
+        emit_move_insn (gen_rtx_REG (<MODE>mode, ACC_B), operands[2]);
+        emit_insn (gen_sub<mode>3_insn ());
+      }
+
     emit_move_insn (operands[0], acc_a);
     DONE;
   })
 
-(define_insn "subdi3_insn"
-  [(set (reg:DI ACC_A)
-        (minus:DI (reg:DI ACC_A)
-                  (reg:DI ACC_B)))]
+;; "subdi3_insn"
+;; "subdq3_insn" "subudq3_insn"
+;; "subda3_insn" "subuda3_insn"
+;; "subta3_insn" "subuta3_insn"
+(define_insn "sub<mode>3_insn"
+  [(set (reg:ALL8 ACC_A)
+        (minus:ALL8 (reg:ALL8 ACC_A)
+                    (reg:ALL8 ACC_B)))]
   "avr_have_dimode"
   "%~call __subdi3"
   [(set_attr "adjust_len" "call")
    (set_attr "cc" "set_czn")])
 
+;; "subdi3_const_insn"
+;; "subdq3_const_insn" "subudq3_const_insn"
+;; "subda3_const_insn" "subuda3_const_insn"
+;; "subta3_const_insn" "subuta3_const_insn"
+(define_insn "sub<mode>3_const_insn"
+  [(set (reg:ALL8 ACC_A)
+        (minus:ALL8 (reg:ALL8 ACC_A)
+                    (match_operand:ALL8 0 "const_operand" "n Ynn")))]
+  "avr_have_dimode"
+  {
+    return avr_out_minus64 (operands[0], NULL);
+  }
+  [(set_attr "adjust_len" "minus64")
+   (set_attr
 "cc" "clobber")])
+
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;; Negation
@@ -180,15 +230,19 @@ (define_expand "conditional_jump"
                  (pc)))]
   "avr_have_dimode")
 
-(define_expand "cbranchdi4"
-  [(parallel [(match_operand:DI 1 "register_operand" "")
-              (match_operand:DI 2 "nonmemory_operand" "")
+;; "cbranchdi4"
+;; "cbranchdq4" "cbranchudq4"
+;; "cbranchda4" "cbranchuda4"
+;; "cbranchta4" "cbranchuta4"
+(define_expand "cbranch<mode>4"
+  [(parallel [(match_operand:ALL8 1 "register_operand" "")
+              (match_operand:ALL8 2 "nonmemory_operand" "")
               (match_operator 0 "ordered_comparison_operator" [(cc0)
                                                                (const_int 0)])
              (label_ref (match_operand 3 "" ""))])]
   "avr_have_dimode"
   {
-    rtx acc_a = gen_rtx_REG (DImode, ACC_A);
+    rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A);
 
     emit_move_insn (acc_a, operands[1]);
 
@@ -197,25 +251,28 @@ (define_expand "cbranchdi4"
        emit_move_insn (gen_rtx_REG (QImode, REG_X), operands[2]);
        emit_insn (gen_compare_const8_di2 ());
       }
-    else if (CONST_INT_P (operands[2])
-             || CONST_DOUBLE_P (operands[2]))
+    else if (const_operand (operands[2], GET_MODE (operands[2])))
      {
-        emit_insn (gen_compare_const_di2 (operands[2]));
+        emit_insn (gen_compare_const_<mode>2 (operands[2]));
      }
    else
      {
-        emit_move_insn (gen_rtx_REG (DImode, ACC_B), operands[2]);
-        emit_insn (gen_compare_di2 ());
+        emit_move_insn (gen_rtx_REG (<MODE>mode, ACC_B), operands[2]);
+        emit_insn (gen_compare_<mode>2 ());
      }
    emit_jump_insn (gen_conditional_jump (operands[0], operands[3]));
    DONE;
  })
 
-(define_insn "compare_di2"
+;; "compare_di2"
+;; "compare_dq2" "compare_udq2"
+;; "compare_da2" "compare_uda2"
+;; "compare_ta2" "compare_uta2"
+(define_insn "compare_<mode>2"
  [(set (cc0)
-        (compare (reg:DI ACC_A)
-                 (reg:DI ACC_B)))]
+        (compare (reg:ALL8 ACC_A)
+                 (reg:ALL8 ACC_B)))]
   "avr_have_dimode"
   "%~call __cmpdi2"
  [(set_attr "adjust_len" "call")
@@ -230,10 +287,14 @@ (define_insn "compare_const8_di2"
  [(set_attr "adjust_len" "call")
   (set_attr "cc" "compare")])
 
-(define_insn
 "compare_const_di2"
+;; "compare_const_di2"
+;; "compare_const_dq2" "compare_const_udq2"
+;; "compare_const_da2" "compare_const_uda2"
+;; "compare_const_ta2" "compare_const_uta2"
+(define_insn "compare_const_<mode>2"
  [(set (cc0)
-        (compare (reg:DI ACC_A)
-                 (match_operand:DI 0 "const_double_operand" "n")))
+        (compare (reg:ALL8 ACC_A)
+                 (match_operand:ALL8 0 "const_operand" "n Ynn")))
   (clobber (match_scratch:QI 1 "=&d"))]
   "avr_have_dimode
    && !s8_operand (operands[0], VOIDmode)"
@@ -258,25 +319,41 @@ (define_code_iterator di_shifts
 ;; "ashrdi3"
 ;; "lshrdi3"
 ;; "rotldi3"
-(define_expand "<code_stdname>di3"
-  [(parallel [(match_operand:DI 0 "general_operand" "")
-              (di_shifts:DI (match_operand:DI 1 "general_operand" "")
-                            (match_operand:QI 2 "general_operand" ""))])]
+;; "ashldq3" "ashrdq3" "lshrdq3" "rotldq3"
+;; "ashlda3" "ashrda3" "lshrda3" "rotlda3"
+;; "ashlta3" "ashrta3" "lshrta3" "rotlta3"
+;; "ashludq3" "ashrudq3" "lshrudq3" "rotludq3"
+;; "ashluda3" "ashruda3" "lshruda3" "rotluda3"
+;; "ashluta3" "ashruta3" "lshruta3" "rotluta3"
+(define_expand "<code_stdname><mode>3"
+  [(parallel [(match_operand:ALL8 0 "general_operand" "")
+              (di_shifts:ALL8 (match_operand:ALL8 1 "general_operand" "")
+                              (match_operand:QI 2 "general_operand" ""))])]
   "avr_have_dimode"
   {
-    rtx acc_a = gen_rtx_REG (DImode, ACC_A);
+    rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A);
 
     emit_move_insn (acc_a, operands[1]);
    emit_move_insn (gen_rtx_REG (QImode, 16), operands[2]);
-    emit_insn (gen_<code_stdname>di3_insn ());
+    emit_insn (gen_<code_stdname><mode>3_insn ());
    emit_move_insn (operands[0], acc_a);
    DONE;
  })
 
-(define_insn "<code_stdname>di3_insn"
-  [(set (reg:DI ACC_A)
-        (di_shifts:DI (reg:DI ACC_A)
-                      (reg:QI 16)))]
+;; "ashldi3_insn"
+;; "ashrdi3_insn"
+;; "lshrdi3_insn"
+;; "rotldi3_insn"
+;; "ashldq3_insn" "ashrdq3_insn" "lshrdq3_insn" "rotldq3_insn"
+;; "ashlda3_insn" "ashrda3_insn" "lshrda3_insn" "rotlda3_insn"
+;; "ashlta3_insn" "ashrta3_insn" "lshrta3_insn" "rotlta3_insn"
+;; "ashludq3_insn"
 "ashrudq3_insn" "lshrudq3_insn" "rotludq3_insn"
+;; "ashluda3_insn" "ashruda3_insn" "lshruda3_insn" "rotluda3_insn"
+;; "ashluta3_insn" "ashruta3_insn" "lshruta3_insn" "rotluta3_insn"
+(define_insn "<code_stdname><mode>3_insn"
+  [(set (reg:ALL8 ACC_A)
+        (di_shifts:ALL8 (reg:ALL8 ACC_A)
+                        (reg:QI 16)))]
   "avr_have_dimode"
   "%~call __<code_stdname>di3"
   [(set_attr "adjust_len" "call")
Index: gcc/config/avr/avr.md
===================================================================
--- gcc/config/avr/avr.md	(revision 190299)
+++ gcc/config/avr/avr.md	(working copy)
@@ -88,10 +88,10 @@ (define_c_enum "unspecv"
 (include "predicates.md")
 (include "constraints.md")
- 
+
 ;; Condition code settings.
 (define_attr "cc" "none,set_czn,set_zn,set_n,compare,clobber,
-                   out_plus, out_plus_noclobber,ldi"
+                   out_plus, out_plus_noclobber,ldi,minus"
   (const_string "none"))
 
 (define_attr "type" "branch,branch1,arith,xcall"
@@ -139,8 +139,10 @@ (define_attr "length" ""
 (define_attr "adjust_len"
   "out_bitop, out_plus, out_plus_noclobber, plus64, addto_sp,
+   minus, minus64,
    tsthi, tstpsi, tstsi, compare, compare64, call,
    mov8, mov16, mov24, mov32, reload_in16, reload_in24, reload_in32,
+   ufract, sfract,
    xload, movmem, load_lpm,
    ashlqi, ashrqi, lshrqi,
    ashlhi, ashrhi, lshrhi,
@@ -225,8 +227,20 @@ (define_mode_iterator QISI [(QI "") (HI
 (define_mode_iterator QIDI [(QI "") (HI "") (PSI "") (SI "") (DI "")])
 (define_mode_iterator HISI [(HI "") (PSI "") (SI "")])
 
+(define_mode_iterator ALL1 [(QI "") (QQ "") (UQQ "")])
+(define_mode_iterator ALL2 [(HI "") (HQ "") (UHQ "") (HA "") (UHA "")])
+(define_mode_iterator ALL4 [(SI "") (SQ "") (USQ "") (SA "") (USA "")])
+
 ;; All supported move-modes
-(define_mode_iterator MOVMODE [(QI "") (HI "") (SI "") (SF "") (PSI "")])
+(define_mode_iterator MOVMODE [(QI "") (HI "") (SI "") (SF "") (PSI "")
+                               (QQ "") (UQQ "")
+                               (HQ "") (UHQ "") (HA "") (UHA "")
+                               (SQ "") (USQ "") (SA "") (USA "")])
+
+;; Supported ordered modes that are 2, 3, 4 bytes wide
+(define_mode_iterator ORDERED234 [(HI "") (SI "") (PSI "")
+                                  (HQ "") (UHQ "") (HA "") (UHA "")
+                                  (SQ "") (USQ "") (SA "") (USA "")])
 
 ;; Define code iterators
 ;; Define two incarnations so that we can build the cross product.
@@ -317,9 +331,11 @@ (define_expand "nonlocal_goto"
     DONE;
   })
 
-(define_insn "pushqi1"
-  [(set (mem:QI (post_dec:HI (reg:HI REG_SP)))
-        (match_operand:QI 0 "reg_or_0_operand" "r,L"))]
+;; "pushqi1"
+;; "pushqq1" "pushuqq1"
+(define_insn "push<mode>1"
+  [(set (mem:ALL1 (post_dec:HI (reg:HI REG_SP)))
+        (match_operand:ALL1 0 "reg_or_0_operand" "r,Y00"))]
   ""
   "@
	push %0
@@ -334,7 +350,9 @@ (define_mode_iterator MPUSH
    (PSI "") (SI "") (CSI "")
    (DI "") (CDI "")
-   (SF "") (SC "")])
+   (SF "") (SC "")
+   (HQ "") (UHQ "") (HA "") (UHA "")
+   (SQ "") (USQ "") (SA "") (USA "")])
 
 (define_expand "push<mode>1"
   [(match_operand:MPUSH 0 "" "")]
@@ -422,12 +440,14 @@ (define_insn "load_<mode>_clobber"
   (set_attr "cc" "clobber")])
 
-(define_insn_and_split "xload8_A"
-  [(set (match_operand:QI 0 "register_operand" "=r")
-        (match_operand:QI 1 "memory_operand" "m"))
+;; "xload8qi_A"
+;; "xload8qq_A" "xload8uqq_A"
+(define_insn_and_split "xload8<mode>_A"
+  [(set (match_operand:ALL1 0 "register_operand" "=r")
+        (match_operand:ALL1 1 "memory_operand" "m"))
   (clobber (reg:HI REG_Z))]
   "can_create_pseudo_p()
-   && !avr_xload_libgcc_p (QImode)
+   && !avr_xload_libgcc_p (<MODE>mode)
    && avr_mem_memx_p (operands[1])
    && REG_P (XEXP (operands[1], 0))"
   { gcc_unreachable(); }
@@ -441,16 +461,16 @@ (define_insn_and_split "xload8_A"
   emit_move_insn (reg_z, simplify_gen_subreg (HImode, addr, PSImode, 0));
   emit_move_insn (hi8, simplify_gen_subreg (QImode, addr, PSImode, 2));
-    insn = emit_insn (gen_xload_8 (operands[0], hi8));
+    insn = emit_insn (gen_xload<mode>_8 (operands[0], hi8));
    set_mem_addr_space (SET_SRC (single_set (insn)), MEM_ADDR_SPACE (operands[1]));
    DONE;
  })
 
-;; "xloadqi_A"
-;; "xloadhi_A"
+;; "xloadqi_A" "xloadqq_A" "xloaduqq_A"
+;; "xloadhi_A" "xloadhq_A" "xloaduhq_A" "xloadha_A"
 "xloaduha_A"
+;; "xloadsi_A" "xloadsq_A" "xloadusq_A" "xloadsa_A" "xloadusa_A"
 ;; "xloadpsi_A"
-;; "xloadsi_A"
 ;; "xloadsf_A"
 (define_insn_and_split "xload<mode>_A"
   [(set (match_operand:MOVMODE 0 "register_operand" "=r")
@@ -488,11 +508,13 @@ (define_insn_and_split "xload<mode>_A"
 ;; Move value from address space memx to a register
 ;; These insns must be prior to respective generic move insn.
 
-(define_insn "xload_8"
-  [(set (match_operand:QI 0 "register_operand" "=&r,r")
-        (mem:QI (lo_sum:PSI (match_operand:QI 1 "register_operand" "r,r")
-                            (reg:HI REG_Z))))]
-  "!avr_xload_libgcc_p (QImode)"
+;; "xloadqi_8"
+;; "xloadqq_8" "xloaduqq_8"
+(define_insn "xload<mode>_8"
+  [(set (match_operand:ALL1 0 "register_operand" "=&r,r")
+        (mem:ALL1 (lo_sum:PSI (match_operand:QI 1 "register_operand" "r,r")
+                              (reg:HI REG_Z))))]
+  "!avr_xload_libgcc_p (<MODE>mode)"
   {
     return avr_out_xload (insn, operands, NULL);
   }
@@ -504,11 +526,11 @@ (define_insn "xload_8"
 ;; R21:Z : 24-bit source address
 ;; R22   : 1-4 byte output
 
-;; "xload_qi_libgcc"
-;; "xload_hi_libgcc"
-;; "xload_psi_libgcc"
-;; "xload_si_libgcc"
+;; "xload_qi_libgcc" "xload_qq_libgcc" "xload_uqq_libgcc"
+;; "xload_hi_libgcc" "xload_hq_libgcc" "xload_uhq_libgcc" "xload_ha_libgcc" "xload_uha_libgcc"
+;; "xload_si_libgcc" "xload_sq_libgcc" "xload_usq_libgcc" "xload_sa_libgcc" "xload_usa_libgcc"
 ;; "xload_sf_libgcc"
+;; "xload_psi_libgcc"
 (define_insn "xload_<mode>_libgcc"
   [(set (reg:MOVMODE 22)
         (mem:MOVMODE (lo_sum:PSI (reg:QI 21)
@@ -528,9 +550,9 @@ (define_insn "xload_<mode>_libgcc"
 
 ;; General move expanders
 
-;; "movqi"
-;; "movhi"
-;; "movsi"
+;; "movqi" "movqq" "movuqq"
+;; "movhi" "movhq" "movuhq" "movha" "movuha"
+;; "movsi" "movsq" "movusq" "movsa" "movusa"
 ;; "movsf"
 ;; "movpsi"
 (define_expand "mov<mode>"
@@ -546,8 +568,7 @@ (define_expand "mov<mode>"
   /* One of the operands has to be in a register.
 */
   if (!register_operand (dest, <MODE>mode)
-      && !(register_operand (src, <MODE>mode)
-           || src == CONST0_RTX (<MODE>mode)))
+      && !reg_or_0_operand (src, <MODE>mode))
    {
      operands[1] = src = copy_to_mode_reg (<MODE>mode, src);
    }
@@ -560,7 +581,9 @@ (define_expand "mov<mode>"
      src = replace_equiv_address (src, copy_to_mode_reg (PSImode, addr));
 
      if (!avr_xload_libgcc_p (<MODE>mode))
-        emit_insn (gen_xload8_A (dest, src));
+        /* ; No <mode> here because gen_xload8<mode>_A only iterates over ALL1.
+           ; insn-emit does not depend on the mode, it's all about operands.  */
+        emit_insn (gen_xload8qi_A (dest, src));
      else
        emit_insn (gen_xload<mode>_A (dest, src));
 
@@ -627,12 +650,13 @@ (define_expand "mov<mode>"
 ;; are call-saved registers, and most of LD_REGS are call-used registers,
 ;; so this may still be a win for registers live across function calls.
 
-(define_insn "movqi_insn"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=r ,d,Qm,r ,q,r,*r")
-        (match_operand:QI 1 "nox_general_operand"   "rL,i,rL,Qm,r,q,i"))]
-  "register_operand (operands[0], QImode)
-   || register_operand (operands[1], QImode)
-   || const0_rtx == operands[1]"
+;; "movqi_insn"
+;; "movqq_insn" "movuqq_insn"
+(define_insn "mov<mode>_insn"
+  [(set (match_operand:ALL1 0 "nonimmediate_operand" "=r ,d ,Qm ,r ,q,r,*r")
+        (match_operand:ALL1 1 "nox_general_operand" "r Y00,n Ynn,r Y00,Qm,r,q,i"))]
+  "register_operand (operands[0], <MODE>mode)
+   || reg_or_0_operand (operands[1], <MODE>mode)"
   {
     return output_movqi (insn, operands, NULL);
   }
@@ -643,9 +667,11 @@ (define_insn "movqi_insn"
 
 ;; This is used in peephole2 to optimize loading immediate constants
 ;; if a scratch register from LD_REGS happens to be available.
-(define_insn "*reload_inqi" - [(set (match_operand:QI 0 "register_operand" "=l") - (match_operand:QI 1 "immediate_operand" "i")) +;; "*reload_inqi" +;; "*reload_inqq" "*reload_inuqq" +(define_insn "*reload_in<mode>" + [(set (match_operand:ALL1 0 "register_operand" "=l") + (match_operand:ALL1 1 "const_operand" "i")) (clobber (match_operand:QI 2 "register_operand" "=&d"))] "reload_completed" "ldi %2,lo8(%1) @@ -655,14 +681,15 @@ (define_insn "*reload_inqi" (define_peephole2 [(match_scratch:QI 2 "d") - (set (match_operand:QI 0 "l_register_operand" "") - (match_operand:QI 1 "immediate_operand" ""))] - "(operands[1] != const0_rtx - && operands[1] != const1_rtx - && operands[1] != constm1_rtx)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (clobber (match_dup 2))])] - "") + (set (match_operand:ALL1 0 "l_register_operand" "") + (match_operand:ALL1 1 "const_operand" ""))] + ; No need for a clobber reg for 0x0, 0x01 or 0xff + "!satisfies_constraint_Y00 (operands[1]) + && !satisfies_constraint_Y01 (operands[1]) + && !satisfies_constraint_Ym1 (operands[1])" + [(parallel [(set (match_dup 0) + (match_dup 1)) + (clobber (match_dup 2))])]) ;;============================================================================ ;; move word (16 bit) @@ -693,18 +720,20 @@ (define_insn "movhi_sp_r" (define_peephole2 [(match_scratch:QI 2 "d") - (set (match_operand:HI 0 "l_register_operand" "") - (match_operand:HI 1 "immediate_operand" ""))] - "(operands[1] != const0_rtx - && operands[1] != constm1_rtx)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (clobber (match_dup 2))])] - "") + (set (match_operand:ALL2 0 "l_register_operand" "") + (match_operand:ALL2 1 "const_or_immediate_operand" ""))] + "operands[1] != CONST0_RTX (<MODE>mode)" + [(parallel [(set (match_dup 0) + (match_dup 1)) + (clobber (match_dup 2))])]) ;; '*' because it is not used in rtl generation, only in above peephole -(define_insn "*reload_inhi" - [(set (match_operand:HI 0 "register_operand" "=r") - (match_operand:HI 1 
"immediate_operand" "i")) +;; "*reload_inhi" +;; "*reload_inhq" "*reload_inuhq" +;; "*reload_inha" "*reload_inuha" +(define_insn "*reload_in<mode>" + [(set (match_operand:ALL2 0 "l_register_operand" "=l") + (match_operand:ALL2 1 "immediate_operand" "i")) (clobber (match_operand:QI 2 "register_operand" "=&d"))] "reload_completed" { @@ -712,14 +741,16 @@ (define_insn "*reload_inhi" } [(set_attr "length" "4") (set_attr "adjust_len" "reload_in16") - (set_attr "cc" "none")]) + (set_attr "cc" "clobber")]) -(define_insn "*movhi" - [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,m ,d,*r,q,r") - (match_operand:HI 1 "nox_general_operand" "r,L,m,rL,i,i ,r,q"))] - "register_operand (operands[0], HImode) - || register_operand (operands[1], HImode) - || const0_rtx == operands[1]" +;; "*movhi" +;; "*movhq" "*movuhq" +;; "*movha" "*movuha" +(define_insn "*mov<mode>" + [(set (match_operand:ALL2 0 "nonimmediate_operand" "=r,r ,r,m ,d,*r,q,r") + (match_operand:ALL2 1 "nox_general_operand" "r,Y00,m,r Y00,i,i ,r,q"))] + "register_operand (operands[0], <MODE>mode) + || reg_or_0_operand (operands[1], <MODE>mode)" { return output_movhi (insn, operands, NULL); } @@ -728,28 +759,30 @@ (define_insn "*movhi" (set_attr "cc" "none,none,clobber,clobber,none,clobber,none,none")]) (define_peephole2 ; movw - [(set (match_operand:QI 0 "even_register_operand" "") - (match_operand:QI 1 "even_register_operand" "")) - (set (match_operand:QI 2 "odd_register_operand" "") - (match_operand:QI 3 "odd_register_operand" ""))] + [(set (match_operand:ALL1 0 "even_register_operand" "") + (match_operand:ALL1 1 "even_register_operand" "")) + (set (match_operand:ALL1 2 "odd_register_operand" "") + (match_operand:ALL1 3 "odd_register_operand" ""))] "(AVR_HAVE_MOVW && REGNO (operands[0]) == REGNO (operands[2]) - 1 && REGNO (operands[1]) == REGNO (operands[3]) - 1)" - [(set (match_dup 4) (match_dup 5))] + [(set (match_dup 4) + (match_dup 5))] { operands[4] = gen_rtx_REG (HImode, REGNO (operands[0])); 
operands[5] = gen_rtx_REG (HImode, REGNO (operands[1])); }) (define_peephole2 ; movw_r - [(set (match_operand:QI 0 "odd_register_operand" "") - (match_operand:QI 1 "odd_register_operand" "")) - (set (match_operand:QI 2 "even_register_operand" "") - (match_operand:QI 3 "even_register_operand" ""))] + [(set (match_operand:ALL1 0 "odd_register_operand" "") + (match_operand:ALL1 1 "odd_register_operand" "")) + (set (match_operand:ALL1 2 "even_register_operand" "") + (match_operand:ALL1 3 "even_register_operand" ""))] "(AVR_HAVE_MOVW && REGNO (operands[2]) == REGNO (operands[0]) - 1 && REGNO (operands[3]) == REGNO (operands[1]) - 1)" - [(set (match_dup 4) (match_dup 5))] + [(set (match_dup 4) + (match_dup 5))] { operands[4] = gen_rtx_REG (HImode, REGNO (operands[2])); operands[5] = gen_rtx_REG (HImode, REGNO (operands[3])); @@ -801,19 +834,21 @@ (define_insn "*movpsi" (define_peephole2 ; *reload_insi [(match_scratch:QI 2 "d") - (set (match_operand:SI 0 "l_register_operand" "") - (match_operand:SI 1 "const_int_operand" "")) + (set (match_operand:ALL4 0 "l_register_operand" "") + (match_operand:ALL4 1 "immediate_operand" "")) (match_dup 2)] - "(operands[1] != const0_rtx - && operands[1] != constm1_rtx)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (clobber (match_dup 2))])] - "") + "operands[1] != CONST0_RTX (<MODE>mode)" + [(parallel [(set (match_dup 0) + (match_dup 1)) + (clobber (match_dup 2))])]) ;; '*' because it is not used in rtl generation. 
+;; "*reload_insi" +;; "*reload_insq" "*reload_inusq" +;; "*reload_insa" "*reload_inusa" (define_insn "*reload_insi" - [(set (match_operand:SI 0 "register_operand" "=r") - (match_operand:SI 1 "const_int_operand" "n")) + [(set (match_operand:ALL4 0 "register_operand" "=r") + (match_operand:ALL4 1 "immediate_operand" "n Ynn")) (clobber (match_operand:QI 2 "register_operand" "=&d"))] "reload_completed" { @@ -824,12 +859,14 @@ (define_insn "*reload_insi" (set_attr "cc" "clobber")]) -(define_insn "*movsi" - [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r ,Qm,!d,r") - (match_operand:SI 1 "nox_general_operand" "r,L,Qm,rL,i ,i"))] - "register_operand (operands[0], SImode) - || register_operand (operands[1], SImode) - || const0_rtx == operands[1]" +;; "*movsi" +;; "*movsq" "*movusq" +;; "*movsa" "*movusa" +(define_insn "*mov<mode>" + [(set (match_operand:ALL4 0 "nonimmediate_operand" "=r,r ,r ,Qm ,!d,r") + (match_operand:ALL4 1 "nox_general_operand" "r,Y00,Qm,r Y00,i ,i"))] + "register_operand (operands[0], <MODE>mode) + || reg_or_0_operand (operands[1], <MODE>mode)" { return output_movsisf (insn, operands, NULL); } @@ -844,8 +881,7 @@ (define_insn "*movsf" [(set (match_operand:SF 0 "nonimmediate_operand" "=r,r,r ,Qm,!d,r") (match_operand:SF 1 "nox_general_operand" "r,G,Qm,rG,F ,F"))] "register_operand (operands[0], SFmode) - || register_operand (operands[1], SFmode) - || operands[1] == CONST0_RTX (SFmode)" + || reg_or_0_operand (operands[1], SFmode)" { return output_movsisf (insn, operands, NULL); } @@ -861,8 +897,7 @@ (define_peephole2 ; *reload_insf "operands[1] != CONST0_RTX (SFmode)" [(parallel [(set (match_dup 0) (match_dup 1)) - (clobber (match_dup 2))])] - "") + (clobber (match_dup 2))])]) ;; '*' because it is not used in rtl generation. 
(define_insn "*reload_insf" @@ -1015,9 +1050,10 @@ (define_expand "strlenhi" (set (match_dup 4) (plus:HI (match_dup 4) (const_int -1))) - (set (match_operand:HI 0 "register_operand" "") - (minus:HI (match_dup 4) - (match_dup 5)))] + (parallel [(set (match_operand:HI 0 "register_operand" "") + (minus:HI (match_dup 4) + (match_dup 5))) + (clobber (scratch:QI))])] "" { rtx addr; @@ -1043,10 +1079,12 @@ (define_insn "*strlenhi" ;+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ ; add bytes -(define_insn "addqi3" - [(set (match_operand:QI 0 "register_operand" "=r,d,r,r,r,r") - (plus:QI (match_operand:QI 1 "register_operand" "%0,0,0,0,0,0") - (match_operand:QI 2 "nonmemory_operand" "r,i,P,N,K,Cm2")))] +;; "addqi3" +;; "addqq3" "adduqq3" +(define_insn "add<mode>3" + [(set (match_operand:ALL1 0 "register_operand" "=r,d ,r ,r ,r ,r") + (plus:ALL1 (match_operand:ALL1 1 "register_operand" "%0,0 ,0 ,0 ,0 ,0") + (match_operand:ALL1 2 "nonmemory_operand" "r,n Ynn,Y01,Ym1,Y02,Ym2")))] "" "@ add %0,%2 @@ -1058,11 +1096,13 @@ (define_insn "addqi3" [(set_attr "length" "1,1,1,1,2,2") (set_attr "cc" "set_czn,set_czn,set_zn,set_zn,set_zn,set_zn")]) - -(define_expand "addhi3" - [(set (match_operand:HI 0 "register_operand" "") - (plus:HI (match_operand:HI 1 "register_operand" "") - (match_operand:HI 2 "nonmemory_operand" "")))] +;; "addhi3" +;; "addhq3" "adduhq3" +;; "addha3" "adduha3" +(define_expand "add<mode>3" + [(set (match_operand:ALL2 0 "register_operand" "") + (plus:ALL2 (match_operand:ALL2 1 "register_operand" "") + (match_operand:ALL2 2 "nonmemory_or_const_operand" "")))] "" { if (CONST_INT_P (operands[2])) @@ -1079,6 +1119,12 @@ (define_expand "addhi3" DONE; } } + + if (CONST_FIXED == GET_CODE (operands[2])) + { + emit_insn (gen_add<mode>3_clobber (operands[0], operands[1], operands[2])); + DONE; + } }) @@ -1124,24 +1170,22 @@ (define_insn "*addhi3_sp" [(set_attr "length" "6") (set_attr "adjust_len" "addto_sp")]) -(define_insn "*addhi3" - [(set 
(match_operand:HI 0 "register_operand" "=r,d,!w,d") - (plus:HI (match_operand:HI 1 "register_operand" "%0,0,0 ,0") - (match_operand:HI 2 "nonmemory_operand" "r,s,IJ,n")))] +;; "*addhi3" +;; "*addhq3" "*adduhq3" +;; "*addha3" "*adduha3" +(define_insn "*add<mode>3" + [(set (match_operand:ALL2 0 "register_operand" "=r,d,!w ,d") + (plus:ALL2 (match_operand:ALL2 1 "register_operand" "%0,0,0 ,0") + (match_operand:ALL2 2 "nonmemory_or_const_operand" "r,s,IJ YIJ,n Ynn")))] "" { - static const char * const asm_code[] = - { - "add %A0,%A2\;adc %B0,%B2", - "subi %A0,lo8(-(%2))\;sbci %B0,hi8(-(%2))", - "", - "" - }; - - if (*asm_code[which_alternative]) - return asm_code[which_alternative]; - - return avr_out_plus_noclobber (operands, NULL, NULL); + if (REG_P (operands[2])) + return "add %A0,%A2\;adc %B0,%B2"; + else if (CONST_INT_P (operands[2]) + || CONST_FIXED == GET_CODE (operands[2])) + return avr_out_plus_noclobber (operands, NULL, NULL); + else + return "subi %A0,lo8(-(%2))\;sbci %B0,hi8(-(%2))"; } [(set_attr "length" "2,2,2,2") (set_attr "adjust_len" "*,*,out_plus_noclobber,out_plus_noclobber") @@ -1152,41 +1196,44 @@ (define_insn "*addhi3" ;; itself because that insn is special to reload. 
(define_peephole2 ; addhi3_clobber - [(set (match_operand:HI 0 "d_register_operand" "") - (match_operand:HI 1 "const_int_operand" "")) - (set (match_operand:HI 2 "l_register_operand" "") - (plus:HI (match_dup 2) - (match_dup 0)))] + [(set (match_operand:ALL2 0 "d_register_operand" "") + (match_operand:ALL2 1 "const_operand" "")) + (set (match_operand:ALL2 2 "l_register_operand" "") + (plus:ALL2 (match_dup 2) + (match_dup 0)))] "peep2_reg_dead_p (2, operands[0])" [(parallel [(set (match_dup 2) - (plus:HI (match_dup 2) - (match_dup 1))) + (plus:ALL2 (match_dup 2) + (match_dup 1))) (clobber (match_dup 3))])] { - operands[3] = simplify_gen_subreg (QImode, operands[0], HImode, 0); + operands[3] = simplify_gen_subreg (QImode, operands[0], <MODE>mode, 0); }) ;; Same, but with reload to NO_LD_REGS ;; Combine *reload_inhi with *addhi3 (define_peephole2 ; addhi3_clobber - [(parallel [(set (match_operand:HI 0 "l_register_operand" "") - (match_operand:HI 1 "const_int_operand" "")) + [(parallel [(set (match_operand:ALL2 0 "l_register_operand" "") + (match_operand:ALL2 1 "const_operand" "")) (clobber (match_operand:QI 2 "d_register_operand" ""))]) - (set (match_operand:HI 3 "l_register_operand" "") - (plus:HI (match_dup 3) - (match_dup 0)))] + (set (match_operand:ALL2 3 "l_register_operand" "") + (plus:ALL2 (match_dup 3) + (match_dup 0)))] "peep2_reg_dead_p (2, operands[0])" [(parallel [(set (match_dup 3) - (plus:HI (match_dup 3) - (match_dup 1))) + (plus:ALL2 (match_dup 3) + (match_dup 1))) (clobber (match_dup 2))])]) -(define_insn "addhi3_clobber" - [(set (match_operand:HI 0 "register_operand" "=!w,d,r") - (plus:HI (match_operand:HI 1 "register_operand" "%0,0,0") - (match_operand:HI 2 "const_int_operand" "IJ,n,n"))) - (clobber (match_scratch:QI 3 "=X,X,&d"))] +;; "addhi3_clobber" +;; "addhq3_clobber" "adduhq3_clobber" +;; "addha3_clobber" "adduha3_clobber" +(define_insn "add<mode>3_clobber" + [(set (match_operand:ALL2 0 "register_operand" "=!w ,d ,r") + (plus:ALL2 
(match_operand:ALL2 1 "register_operand" "%0 ,0 ,0") + (match_operand:ALL2 2 "const_operand" "IJ YIJ,n Ynn,n Ynn"))) + (clobber (match_scratch:QI 3 "=X ,X ,&d"))] "" { gcc_assert (REGNO (operands[0]) == REGNO (operands[1])); @@ -1198,29 +1245,24 @@ (define_insn "addhi3_clobber" (set_attr "cc" "out_plus")]) -(define_insn "addsi3" - [(set (match_operand:SI 0 "register_operand" "=r,d ,d,r") - (plus:SI (match_operand:SI 1 "register_operand" "%0,0 ,0,0") - (match_operand:SI 2 "nonmemory_operand" "r,s ,n,n"))) - (clobber (match_scratch:QI 3 "=X,X ,X,&d"))] +;; "addsi3" +;; "addsq3" "addusq3" +;; "addsa3" "addusa3" +(define_insn "add<mode>3" + [(set (match_operand:ALL4 0 "register_operand" "=r,d ,r") + (plus:ALL4 (match_operand:ALL4 1 "register_operand" "%0,0 ,0") + (match_operand:ALL4 2 "nonmemory_operand" "r,i ,n Ynn"))) + (clobber (match_scratch:QI 3 "=X,X ,&d"))] "" { - static const char * const asm_code[] = - { - "add %A0,%A2\;adc %B0,%B2\;adc %C0,%C2\;adc %D0,%D2", - "subi %0,lo8(-(%2))\;sbci %B0,hi8(-(%2))\;sbci %C0,hlo8(-(%2))\;sbci %D0,hhi8(-(%2))", - "", - "" - }; - - if (*asm_code[which_alternative]) - return asm_code[which_alternative]; + if (REG_P (operands[2])) + return "add %A0,%A2\;adc %B0,%B2\;adc %C0,%C2\;adc %D0,%D2"; return avr_out_plus (operands, NULL, NULL); } - [(set_attr "length" "4,4,4,8") - (set_attr "adjust_len" "*,*,out_plus,out_plus") - (set_attr "cc" "set_n,set_czn,out_plus,out_plus")]) + [(set_attr "length" "4,4,8") + (set_attr "adjust_len" "*,out_plus,out_plus") + (set_attr "cc" "set_n,out_plus,out_plus")]) (define_insn "*addpsi3_zero_extend.qi" [(set (match_operand:PSI 0 "register_operand" "=r") @@ -1329,27 +1371,38 @@ (define_insn "*subpsi3_sign_extend.hi" ;----------------------------------------------------------------------------- ; sub bytes -(define_insn "subqi3" - [(set (match_operand:QI 0 "register_operand" "=r,d") - (minus:QI (match_operand:QI 1 "register_operand" "0,0") - (match_operand:QI 2 "nonmemory_operand" "r,i")))] + +;; 
"subqi3" +;; "subqq3" "subuqq3" +(define_insn "sub<mode>3" + [(set (match_operand:ALL1 0 "register_operand" "=r,d ,r ,r ,r ,r") + (minus:ALL1 (match_operand:ALL1 1 "register_operand" "0,0 ,0 ,0 ,0 ,0") + (match_operand:ALL1 2 "nonmemory_or_const_operand" "r,n Ynn,Y01,Ym1,Y02,Ym2")))] "" "@ sub %0,%2 - subi %0,lo8(%2)" - [(set_attr "length" "1,1") - (set_attr "cc" "set_czn,set_czn")]) + subi %0,lo8(%2) + dec %0 + inc %0 + dec %0\;dec %0 + inc %0\;inc %0" + [(set_attr "length" "1,1,1,1,2,2") + (set_attr "cc" "set_czn,set_czn,set_zn,set_zn,set_zn,set_zn")]) -(define_insn "subhi3" - [(set (match_operand:HI 0 "register_operand" "=r,d") - (minus:HI (match_operand:HI 1 "register_operand" "0,0") - (match_operand:HI 2 "nonmemory_operand" "r,i")))] +;; "subhi3" +;; "subhq3" "subuhq3" +;; "subha3" "subuha3" +(define_insn "sub<mode>3" + [(set (match_operand:ALL2 0 "register_operand" "=r,d ,*r") + (minus:ALL2 (match_operand:ALL2 1 "register_operand" "0,0 ,0") + (match_operand:ALL2 2 "nonmemory_or_const_operand" "r,i Ynn,Ynn"))) + (clobber (match_scratch:QI 3 "=X,X ,&d"))] "" - "@ - sub %A0,%A2\;sbc %B0,%B2 - subi %A0,lo8(%2)\;sbci %B0,hi8(%2)" - [(set_attr "length" "2,2") - (set_attr "cc" "set_czn,set_czn")]) + { + return avr_out_minus (operands, NULL, NULL); + } + [(set_attr "adjust_len" "minus") + (set_attr "cc" "minus")]) (define_insn "*subhi3_zero_extend1" [(set (match_operand:HI 0 "register_operand" "=r") @@ -1373,13 +1426,23 @@ (define_insn "*subhi3.sign_extend2" [(set_attr "length" "5") (set_attr "cc" "clobber")]) -(define_insn "subsi3" - [(set (match_operand:SI 0 "register_operand" "=r") - (minus:SI (match_operand:SI 1 "register_operand" "0") - (match_operand:SI 2 "register_operand" "r")))] +;; "subsi3" +;; "subsq3" "subusq3" +;; "subsa3" "subusa3" +(define_insn "sub<mode>3" + [(set (match_operand:ALL4 0 "register_operand" "=r,d ,r") + (minus:ALL4 (match_operand:ALL4 1 "register_operand" "0,0 ,0") + (match_operand:ALL4 2 "nonmemory_or_const_operand" "r,n Ynn,Ynn"))) + 
(clobber (match_scratch:QI 3 "=X,X ,&d"))] "" - "sub %0,%2\;sbc %B0,%B2\;sbc %C0,%C2\;sbc %D0,%D2" + { + if (REG_P (operands[2])) + return "sub %0,%2\;sbc %B0,%B2\;sbc %C0,%C2\;sbc %D0,%D2"; + + return avr_out_minus (operands, NULL, NULL); + } [(set_attr "length" "4") + (set_attr "adjust_len" "*,minus,minus") (set_attr "cc" "set_czn")]) (define_insn "*subsi3_zero_extend" @@ -3303,44 +3366,58 @@ (define_insn_and_split "*rotb<mode>" ;;<< << << << << << << << << << << << << << << << << << << << << << << << << << ;; arithmetic shift left -(define_expand "ashlqi3" - [(set (match_operand:QI 0 "register_operand" "") - (ashift:QI (match_operand:QI 1 "register_operand" "") - (match_operand:QI 2 "nop_general_operand" "")))]) +;; "ashlqi3" +;; "ashlqq3" "ashluqq3" +(define_expand "ashl<mode>3" + [(set (match_operand:ALL1 0 "register_operand" "") + (ashift:ALL1 (match_operand:ALL1 1 "register_operand" "") + (match_operand:QI 2 "nop_general_operand" "")))]) (define_split ; ashlqi3_const4 - [(set (match_operand:QI 0 "d_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 4)))] + [(set (match_operand:ALL1 0 "d_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 4)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int -16)))] - "") + [(set (match_dup 1) + (rotate:QI (match_dup 1) + (const_int 4))) + (set (match_dup 1) + (and:QI (match_dup 1) + (const_int -16)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) (define_split ; ashlqi3_const5 - [(set (match_operand:QI 0 "d_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 5)))] + [(set (match_operand:ALL1 0 "d_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 5)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 1))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int -32)))] - "") + [(set (match_dup 1) (rotate:QI (match_dup 1) 
(const_int 4))) + (set (match_dup 1) (ashift:QI (match_dup 1) (const_int 1))) + (set (match_dup 1) (and:QI (match_dup 1) (const_int -32)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) (define_split ; ashlqi3_const6 - [(set (match_operand:QI 0 "d_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 6)))] + [(set (match_operand:ALL1 0 "d_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 6)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 2))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int -64)))] - "") + [(set (match_dup 1) (rotate:QI (match_dup 1) (const_int 4))) + (set (match_dup 1) (ashift:QI (match_dup 1) (const_int 2))) + (set (match_dup 1) (and:QI (match_dup 1) (const_int -64)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) -(define_insn "*ashlqi3" - [(set (match_operand:QI 0 "register_operand" "=r,r,r,r,!d,r,r") - (ashift:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0 ,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,K,n ,n,Qm")))] +;; "*ashlqi3" +;; "*ashlqq3" "*ashluqq3" +(define_insn "*ashl<mode>3" + [(set (match_operand:ALL1 0 "register_operand" "=r,r,r,r,!d,r,r") + (ashift:ALL1 (match_operand:ALL1 1 "register_operand" "0,0,0,0,0 ,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,K,n ,n,Qm")))] "" { return ashlqi3_out (insn, operands, NULL); @@ -3349,10 +3426,10 @@ (define_insn "*ashlqi3" (set_attr "adjust_len" "ashlqi") (set_attr "cc" "clobber,none,set_czn,set_czn,set_czn,set_czn,clobber")]) -(define_insn "ashlhi3" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r,r,r") - (ashift:HI (match_operand:HI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +(define_insn "ashl<mode>3" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r,r,r") + (ashift:ALL2 (match_operand:ALL2 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 
"nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return ashlhi3_out (insn, operands, NULL); @@ -3377,8 +3454,7 @@ (define_insn_and_split "*ashl<extend_su> "" [(set (match_dup 0) (ashift:QI (match_dup 1) - (match_dup 2)))] - "") + (match_dup 2)))]) ;; ??? Combiner does not recognize that it could split the following insn; ;; presumably because he has no register handy? @@ -3443,10 +3519,13 @@ (define_peephole2 }) -(define_insn "ashlsi3" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r") - (ashift:SI (match_operand:SI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +;; "ashlsi3" +;; "ashlsq3" "ashlusq3" +;; "ashlsa3" "ashlusa3" +(define_insn "ashl<mode>3" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r,r,r,r") + (ashift:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return ashlsi3_out (insn, operands, NULL); @@ -3458,55 +3537,65 @@ (define_insn "ashlsi3" ;; Optimize if a scratch register from LD_REGS happens to be available. 
(define_peephole2 ; ashlqi3_l_const4 - [(set (match_operand:QI 0 "l_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 4))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 4))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) (set (match_dup 1) (const_int -16)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 ; ashlqi3_l_const5 - [(set (match_operand:QI 0 "l_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 5))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 5))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 1))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) + (set (match_dup 2) (ashift:QI (match_dup 2) (const_int 1))) (set (match_dup 1) (const_int -32)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 ; ashlqi3_l_const6 - [(set (match_operand:QI 0 "l_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 6))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 6))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 2))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) + (set (match_dup 2) (ashift:QI (match_dup 2) (const_int 2))) (set (match_dup 1) (const_int -64)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI 
(match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:HI 0 "register_operand" "") - (ashift:HI (match_operand:HI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL2 0 "register_operand" "") + (ashift:ALL2 (match_operand:ALL2 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (ashift:HI (match_dup 1) (match_dup 2))) - (clobber (match_dup 3))])] - "") - -(define_insn "*ashlhi3_const" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r") - (ashift:HI (match_operand:HI 1 "register_operand" "0,0,r,0,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] + [(parallel [(set (match_dup 0) + (ashift:ALL2 (match_dup 1) + (match_dup 2))) + (clobber (match_dup 3))])]) + +;; "*ashlhi3_const" +;; "*ashlhq3_const" "*ashluhq3_const" +;; "*ashlha3_const" "*ashluha3_const" +(define_insn "*ashl<mode>3_const" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r") + (ashift:ALL2 (match_operand:ALL2 1 "register_operand" "0,0,r,0,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] "reload_completed" { return ashlhi3_out (insn, operands, NULL); @@ -3517,19 +3606,24 @@ (define_insn "*ashlhi3_const" (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:SI 0 "register_operand" "") - (ashift:SI (match_operand:SI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL4 0 "register_operand" "") + (ashift:ALL4 (match_operand:ALL4 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (ashift:SI (match_dup 1) (match_dup 2))) + [(parallel [(set (match_dup 0) + (ashift:ALL4 (match_dup 1) + (match_dup 2))) (clobber (match_dup 3))])] "") -(define_insn 
"*ashlsi3_const" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r") - (ashift:SI (match_operand:SI 1 "register_operand" "0,0,r,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,&d"))] +;; "*ashlsi3_const" +;; "*ashlsq3_const" "*ashlusq3_const" +;; "*ashlsa3_const" "*ashlusa3_const" +(define_insn "*ashl<mode>3_const" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r") + (ashift:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,r,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,&d"))] "reload_completed" { return ashlsi3_out (insn, operands, NULL); @@ -3580,10 +3674,12 @@ (define_insn "*ashlpsi3" ;; >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> ;; arithmetic shift right -(define_insn "ashrqi3" - [(set (match_operand:QI 0 "register_operand" "=r,r,r,r,r ,r ,r") - (ashiftrt:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0 ,0 ,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,K,C03 C04 C05,C06 C07,Qm")))] +;; "ashrqi3" +;; "ashrqq3" "ashruqq3" +(define_insn "ashr<mode>3" + [(set (match_operand:ALL1 0 "register_operand" "=r,r,r,r,r ,r ,r") + (ashiftrt:ALL1 (match_operand:ALL1 1 "register_operand" "0,0,0,0,0 ,0 ,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,K,C03 C04 C05,C06 C07,Qm")))] "" { return ashrqi3_out (insn, operands, NULL); @@ -3592,10 +3688,13 @@ (define_insn "ashrqi3" (set_attr "adjust_len" "ashrqi") (set_attr "cc" "clobber,none,set_czn,set_czn,set_czn,clobber,clobber")]) -(define_insn "ashrhi3" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r,r,r") - (ashiftrt:HI (match_operand:HI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +;; "ashrhi3" +;; "ashrhq3" "ashruhq3" +;; "ashrha3" "ashruha3" +(define_insn "ashr<mode>3" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r,r,r") + (ashiftrt:ALL2 (match_operand:ALL2 1 
"register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return ashrhi3_out (insn, operands, NULL); @@ -3616,10 +3715,13 @@ (define_insn "ashrpsi3" [(set_attr "adjust_len" "ashrpsi") (set_attr "cc" "clobber")]) -(define_insn "ashrsi3" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r") - (ashiftrt:SI (match_operand:SI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +;; "ashrsi3" +;; "ashrsq3" "ashrusq3" +;; "ashrsa3" "ashrusa3" +(define_insn "ashr<mode>3" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r,r,r,r") + (ashiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return ashrsi3_out (insn, operands, NULL); @@ -3632,19 +3734,23 @@ (define_insn "ashrsi3" (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:HI 0 "register_operand" "") - (ashiftrt:HI (match_operand:HI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL2 0 "register_operand" "") + (ashiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (ashiftrt:HI (match_dup 1) (match_dup 2))) - (clobber (match_dup 3))])] - "") - -(define_insn "*ashrhi3_const" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r") - (ashiftrt:HI (match_operand:HI 1 "register_operand" "0,0,r,0,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] + [(parallel [(set (match_dup 0) + (ashiftrt:ALL2 (match_dup 1) + (match_dup 2))) + (clobber (match_dup 3))])]) + +;; "*ashrhi3_const" +;; "*ashrhq3_const" "*ashruhq3_const" +;; "*ashrha3_const" "*ashruha3_const" +(define_insn "*ashr<mode>3_const" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r") + (ashiftrt:ALL2 (match_operand:ALL2 1 "register_operand" 
"0,0,r,0,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] "reload_completed" { return ashrhi3_out (insn, operands, NULL); @@ -3655,19 +3761,23 @@ (define_insn "*ashrhi3_const" (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:SI 0 "register_operand" "") - (ashiftrt:SI (match_operand:SI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL4 0 "register_operand" "") + (ashiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (ashiftrt:SI (match_dup 1) (match_dup 2))) - (clobber (match_dup 3))])] - "") - -(define_insn "*ashrsi3_const" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r") - (ashiftrt:SI (match_operand:SI 1 "register_operand" "0,0,r,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,&d"))] + [(parallel [(set (match_dup 0) + (ashiftrt:ALL4 (match_dup 1) + (match_dup 2))) + (clobber (match_dup 3))])]) + +;; "*ashrsi3_const" +;; "*ashrsq3_const" "*ashrusq3_const" +;; "*ashrsa3_const" "*ashrusa3_const" +(define_insn "*ashr<mode>3_const" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r") + (ashiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,r,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,&d"))] "reload_completed" { return ashrsi3_out (insn, operands, NULL); @@ -3679,44 +3789,59 @@ (define_insn "*ashrsi3_const" ;; >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> ;; logical shift right -(define_expand "lshrqi3" - [(set (match_operand:QI 0 "register_operand" "") - (lshiftrt:QI (match_operand:QI 1 "register_operand" "") - (match_operand:QI 2 "nop_general_operand" "")))]) +;; "lshrqi3" +;; "lshrqq3 "lshruqq3" +(define_expand "lshr<mode>3" + [(set (match_operand:ALL1 0 "register_operand" "") + (lshiftrt:ALL1 
(match_operand:ALL1 1 "register_operand" "") + (match_operand:QI 2 "nop_general_operand" "")))]) (define_split ; lshrqi3_const4 - [(set (match_operand:QI 0 "d_register_operand" "") - (lshiftrt:QI (match_dup 0) - (const_int 4)))] + [(set (match_operand:ALL1 0 "d_register_operand" "") + (lshiftrt:ALL1 (match_dup 0) + (const_int 4)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int 15)))] - "") + [(set (match_dup 1) + (rotate:QI (match_dup 1) + (const_int 4))) + (set (match_dup 1) + (and:QI (match_dup 1) + (const_int 15)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) (define_split ; lshrqi3_const5 - [(set (match_operand:QI 0 "d_register_operand" "") - (lshiftrt:QI (match_dup 0) - (const_int 5)))] - "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 1))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int 7)))] - "") + [(set (match_operand:ALL1 0 "d_register_operand" "") + (lshiftrt:ALL1 (match_dup 0) + (const_int 5)))] + "" + [(set (match_dup 1) (rotate:QI (match_dup 1) (const_int 4))) + (set (match_dup 1) (lshiftrt:QI (match_dup 1) (const_int 1))) + (set (match_dup 1) (and:QI (match_dup 1) (const_int 7)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) (define_split ; lshrqi3_const6 [(set (match_operand:QI 0 "d_register_operand" "") (lshiftrt:QI (match_dup 0) (const_int 6)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 2))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int 3)))] - "") + [(set (match_dup 1) (rotate:QI (match_dup 1) (const_int 4))) + (set (match_dup 1) (lshiftrt:QI (match_dup 1) (const_int 2))) + (set (match_dup 1) (and:QI (match_dup 1) (const_int 3)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) -(define_insn "*lshrqi3" - [(set (match_operand:QI 0 "register_operand" "=r,r,r,r,!d,r,r") - 
(lshiftrt:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0 ,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,K,n ,n,Qm")))] +;; "*lshrqi3" +;; "*lshrqq3" +;; "*lshruqq3" +(define_insn "*lshr<mode>3" + [(set (match_operand:ALL1 0 "register_operand" "=r,r,r,r,!d,r,r") + (lshiftrt:ALL1 (match_operand:ALL1 1 "register_operand" "0,0,0,0,0 ,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,K,n ,n,Qm")))] "" { return lshrqi3_out (insn, operands, NULL); @@ -3725,10 +3850,13 @@ (define_insn "*lshrqi3" (set_attr "adjust_len" "lshrqi") (set_attr "cc" "clobber,none,set_czn,set_czn,set_czn,set_czn,clobber")]) -(define_insn "lshrhi3" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r,r,r") - (lshiftrt:HI (match_operand:HI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +;; "lshrhi3" +;; "lshrhq3" "lshruhq3" +;; "lshrha3" "lshruha3" +(define_insn "lshr<mode>3" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r,r,r") + (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return lshrhi3_out (insn, operands, NULL); @@ -3749,10 +3877,13 @@ (define_insn "lshrpsi3" [(set_attr "adjust_len" "lshrpsi") (set_attr "cc" "clobber")]) -(define_insn "lshrsi3" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r") - (lshiftrt:SI (match_operand:SI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +;; "lshrsi3" +;; "lshrsq3" "lshrusq3" +;; "lshrsa3" "lshrusa3" +(define_insn "lshr<mode>3" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r,r,r,r") + (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return lshrsi3_out (insn, operands, NULL); @@ -3764,55 +3895,65 @@ (define_insn "lshrsi3" ;; Optimize if a scratch register from LD_REGS happens to be available. 
(define_peephole2 ; lshrqi3_l_const4 - [(set (match_operand:QI 0 "l_register_operand" "") - (lshiftrt:QI (match_dup 0) - (const_int 4))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (lshiftrt:ALL1 (match_dup 0) + (const_int 4))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) (set (match_dup 1) (const_int 15)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 ; lshrqi3_l_const5 - [(set (match_operand:QI 0 "l_register_operand" "") - (lshiftrt:QI (match_dup 0) - (const_int 5))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (lshiftrt:ALL1 (match_dup 0) + (const_int 5))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 1))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) + (set (match_dup 2) (lshiftrt:QI (match_dup 2) (const_int 1))) (set (match_dup 1) (const_int 7)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 ; lshrqi3_l_const6 - [(set (match_operand:QI 0 "l_register_operand" "") - (lshiftrt:QI (match_dup 0) - (const_int 6))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (lshiftrt:ALL1 (match_dup 0) + (const_int 6))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 2))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) + (set (match_dup 2) (lshiftrt:QI (match_dup 2) (const_int 2))) (set (match_dup 1) (const_int 3)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) 
(and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:HI 0 "register_operand" "") - (lshiftrt:HI (match_operand:HI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL2 0 "register_operand" "") + (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (lshiftrt:HI (match_dup 1) (match_dup 2))) - (clobber (match_dup 3))])] - "") - -(define_insn "*lshrhi3_const" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r") - (lshiftrt:HI (match_operand:HI 1 "register_operand" "0,0,r,0,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] + [(parallel [(set (match_dup 0) + (lshiftrt:ALL2 (match_dup 1) + (match_dup 2))) + (clobber (match_dup 3))])]) + +;; "*lshrhi3_const" +;; "*lshrhq3_const" "*lshruhq3_const" +;; "*lshrha3_const" "*lshruha3_const" +(define_insn "*lshr<mode>3_const" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r") + (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "0,0,r,0,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] "reload_completed" { return lshrhi3_out (insn, operands, NULL); @@ -3823,19 +3964,23 @@ (define_insn "*lshrhi3_const" (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:SI 0 "register_operand" "") - (lshiftrt:SI (match_operand:SI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL4 0 "register_operand" "") + (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (lshiftrt:SI (match_dup 1) (match_dup 2))) - (clobber (match_dup 3))])] - "") - -(define_insn "*lshrsi3_const" - [(set (match_operand:SI 0 
"register_operand" "=r,r,r,r") - (lshiftrt:SI (match_operand:SI 1 "register_operand" "0,0,r,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,&d"))] + [(parallel [(set (match_dup 0) + (lshiftrt:ALL4 (match_dup 1) + (match_dup 2))) + (clobber (match_dup 3))])]) + +;; "*lshrsi3_const" +;; "*lshrsq3_const" "*lshrusq3_const" +;; "*lshrsa3_const" "*lshrusa3_const" +(define_insn "*lshr<mode>3_const" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r") + (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,r,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,&d"))] "reload_completed" { return lshrsi3_out (insn, operands, NULL); @@ -4278,24 +4423,29 @@ (define_insn "*negated_tstsi" [(set_attr "cc" "compare") (set_attr "length" "4")]) -(define_insn "*reversed_tstsi" +;; "*reversed_tstsi" +;; "*reversed_tstsq" "*reversed_tstusq" +;; "*reversed_tstsa" "*reversed_tstusa" +(define_insn "*reversed_tst<mode>" [(set (cc0) - (compare (const_int 0) - (match_operand:SI 0 "register_operand" "r"))) - (clobber (match_scratch:QI 1 "=X"))] - "" - "cp __zero_reg__,%A0 - cpc __zero_reg__,%B0 - cpc __zero_reg__,%C0 - cpc __zero_reg__,%D0" + (compare (match_operand:ALL4 0 "const0_operand" "Y00") + (match_operand:ALL4 1 "register_operand" "r"))) + (clobber (match_scratch:QI 2 "=X"))] + "" + "cp __zero_reg__,%A1 + cpc __zero_reg__,%B1 + cpc __zero_reg__,%C1 + cpc __zero_reg__,%D1" [(set_attr "cc" "compare") (set_attr "length" "4")]) -(define_insn "*cmpqi" +;; "*cmpqi" +;; "*cmpqq" "*cmpuqq" +(define_insn "*cmp<mode>" [(set (cc0) - (compare (match_operand:QI 0 "register_operand" "r,r,d") - (match_operand:QI 1 "nonmemory_operand" "L,r,i")))] + (compare (match_operand:ALL1 0 "register_operand" "r ,r,d") + (match_operand:ALL1 1 "nonmemory_operand" "Y00,r,i")))] "" "@ tst %0 @@ -4313,11 +4463,14 @@ (define_insn "*cmpqi_sign_extend" [(set_attr "cc" "compare") (set_attr "length" "1")]) 
-(define_insn "*cmphi" +;; "*cmphi" +;; "*cmphq" "*cmpuhq" +;; "*cmpha" "*cmpuha" +(define_insn "*cmp<mode>" [(set (cc0) - (compare (match_operand:HI 0 "register_operand" "!w,r,r,d ,r ,d,r") - (match_operand:HI 1 "nonmemory_operand" "L ,L,r,s ,s ,M,n"))) - (clobber (match_scratch:QI 2 "=X ,X,X,&d,&d ,X,&d"))] + (compare (match_operand:ALL2 0 "register_operand" "!w ,r ,r,d ,r ,d,r") + (match_operand:ALL2 1 "nonmemory_operand" "Y00,Y00,r,s ,s ,M,n Ynn"))) + (clobber (match_scratch:QI 2 "=X ,X ,X,&d,&d ,X,&d"))] "" { switch (which_alternative) @@ -4330,11 +4483,15 @@ (define_insn "*cmphi" return "cp %A0,%A1\;cpc %B0,%B1"; case 3: + if (<MODE>mode != HImode) + break; return reg_unused_after (insn, operands[0]) ? "subi %A0,lo8(%1)\;sbci %B0,hi8(%1)" : "ldi %2,hi8(%1)\;cpi %A0,lo8(%1)\;cpc %B0,%2"; case 4: + if (<MODE>mode != HImode) + break; return "ldi %2,lo8(%1)\;cp %A0,%2\;ldi %2,hi8(%1)\;cpc %B0,%2"; } @@ -4374,11 +4531,14 @@ (define_insn "*cmppsi" (set_attr "length" "3,3,5,6,3,7") (set_attr "adjust_len" "tstpsi,*,*,*,compare,compare")]) -(define_insn "*cmpsi" +;; "*cmpsi" +;; "*cmpsq" "*cmpusq" +;; "*cmpsa" "*cmpusa" +(define_insn "*cmp<mode>" [(set (cc0) - (compare (match_operand:SI 0 "register_operand" "r,r ,d,r ,r") - (match_operand:SI 1 "nonmemory_operand" "L,r ,M,M ,n"))) - (clobber (match_scratch:QI 2 "=X,X ,X,&d,&d"))] + (compare (match_operand:ALL4 0 "register_operand" "r ,r ,d,r ,r") + (match_operand:ALL4 1 "nonmemory_operand" "Y00,r ,M,M ,n Ynn"))) + (clobber (match_scratch:QI 2 "=X ,X ,X,&d,&d"))] "" { if (0 == which_alternative) @@ -4398,55 +4558,33 @@ (define_insn "*cmpsi" ;; ---------------------------------------------------------------------- ;; Conditional jump instructions -(define_expand "cbranchsi4" - [(parallel [(set (cc0) - (compare (match_operand:SI 1 "register_operand" "") - (match_operand:SI 2 "nonmemory_operand" ""))) - (clobber (match_scratch:QI 4 ""))]) +;; "cbranchqi4" +;; "cbranchqq4" "cbranchuqq4" +(define_expand "cbranch<mode>4" + 
[(set (cc0) + (compare (match_operand:ALL1 1 "register_operand" "") + (match_operand:ALL1 2 "nonmemory_operand" ""))) (set (pc) (if_then_else - (match_operator 0 "ordered_comparison_operator" [(cc0) - (const_int 0)]) - (label_ref (match_operand 3 "" "")) - (pc)))] - "") - -(define_expand "cbranchpsi4" - [(parallel [(set (cc0) - (compare (match_operand:PSI 1 "register_operand" "") - (match_operand:PSI 2 "nonmemory_operand" ""))) - (clobber (match_scratch:QI 4 ""))]) - (set (pc) - (if_then_else (match_operator 0 "ordered_comparison_operator" [(cc0) - (const_int 0)]) - (label_ref (match_operand 3 "" "")) - (pc)))] - "") + (match_operator 0 "ordered_comparison_operator" [(cc0) + (const_int 0)]) + (label_ref (match_operand 3 "" "")) + (pc)))]) -(define_expand "cbranchhi4" +;; "cbranchhi4" "cbranchhq4" "cbranchuhq4" "cbranchha4" "cbranchuha4" +;; "cbranchsi4" "cbranchsq4" "cbranchusq4" "cbranchsa4" "cbranchusa4" +;; "cbranchpsi4" +(define_expand "cbranch<mode>4" [(parallel [(set (cc0) - (compare (match_operand:HI 1 "register_operand" "") - (match_operand:HI 2 "nonmemory_operand" ""))) + (compare (match_operand:ORDERED234 1 "register_operand" "") + (match_operand:ORDERED234 2 "nonmemory_operand" ""))) (clobber (match_scratch:QI 4 ""))]) (set (pc) (if_then_else - (match_operator 0 "ordered_comparison_operator" [(cc0) - (const_int 0)]) - (label_ref (match_operand 3 "" "")) - (pc)))] - "") - -(define_expand "cbranchqi4" - [(set (cc0) - (compare (match_operand:QI 1 "register_operand" "") - (match_operand:QI 2 "nonmemory_operand" ""))) - (set (pc) - (if_then_else - (match_operator 0 "ordered_comparison_operator" [(cc0) - (const_int 0)]) - (label_ref (match_operand 3 "" "")) - (pc)))] - "") + (match_operator 0 "ordered_comparison_operator" [(cc0) + (const_int 0)]) + (label_ref (match_operand 3 "" "")) + (pc)))]) ;; Test a single bit in a QI/HI/SImode register. 
@@ -4477,7 +4615,7 @@ (define_insn "*sbrx_branch<mode>" (const_int 4)))) (set_attr "cc" "clobber")]) -;; Same test based on Bitwise AND RTL. Keep this incase gcc changes patterns. +;; Same test based on bitwise AND. Keep this in case gcc changes patterns. ;; or for old peepholes. ;; Fixme - bitwise Mask will not work for DImode @@ -4492,12 +4630,12 @@ (define_insn "*sbrx_and_branch<mode>" (label_ref (match_operand 3 "" "")) (pc)))] "" -{ + { HOST_WIDE_INT bitnumber; bitnumber = exact_log2 (GET_MODE_MASK (<MODE>mode) & INTVAL (operands[2])); operands[2] = GEN_INT (bitnumber); return avr_out_sbxx_branch (insn, operands); -} + } [(set (attr "length") (if_then_else (and (ge (minus (pc) (match_dup 3)) (const_int -2046)) (le (minus (pc) (match_dup 3)) (const_int 2046))) @@ -4837,9 +4975,10 @@ (define_insn "*tablejump" (define_expand "casesi" - [(set (match_dup 6) - (minus:HI (subreg:HI (match_operand:SI 0 "register_operand" "") 0) - (match_operand:HI 1 "register_operand" ""))) + [(parallel [(set (match_dup 6) + (minus:HI (subreg:HI (match_operand:SI 0 "register_operand" "") 0) + (match_operand:HI 1 "register_operand" ""))) + (clobber (scratch:QI))]) (parallel [(set (cc0) (compare (match_dup 6) (match_operand:HI 2 "register_operand" ""))) @@ -5201,8 +5340,8 @@ (define_peephole ; "*dec-and-branchqi!=- (define_peephole ; "*cpse.eq" [(set (cc0) - (compare (match_operand:QI 1 "register_operand" "r,r") - (match_operand:QI 2 "reg_or_0_operand" "r,L"))) + (compare (match_operand:ALL1 1 "register_operand" "r,r") + (match_operand:ALL1 2 "reg_or_0_operand" "r,Y00"))) (set (pc) (if_then_else (eq (cc0) (const_int 0)) @@ -5236,8 +5375,8 @@ (define_peephole ; "*cpse.eq" (define_peephole ; "*cpse.ne" [(set (cc0) - (compare (match_operand:QI 1 "register_operand" "") - (match_operand:QI 2 "reg_or_0_operand" ""))) + (compare (match_operand:ALL1 1 "register_operand" "") + (match_operand:ALL1 2 "reg_or_0_operand" ""))) (set (pc) (if_then_else (ne (cc0) (const_int 0)) @@ -5246,7 +5385,7 @@ 
(define_peephole ; "*cpse.ne" "!AVR_HAVE_JMP_CALL || !avr_current_device->errata_skip" { - if (operands[2] == const0_rtx) + if (operands[2] == CONST0_RTX (<MODE>mode)) operands[2] = zero_reg_rtx; return 3 == avr_jump_mode (operands[0], insn) @@ -6265,4 +6404,8 @@ (define_insn_and_split "*extzv.qihi2" }) \f +;; Fixed-point instructions +(include "avr-fixed.md") + +;; Operations on 64-bit registers (include "avr-dimode.md") Index: gcc/config/avr/avr-modes.def =================================================================== --- gcc/config/avr/avr-modes.def (revision 190299) +++ gcc/config/avr/avr-modes.def (working copy) @@ -1 +1,27 @@ FRACTIONAL_INT_MODE (PSI, 24, 3); + +/* On 8 bit machines it requires fewer instructions for fixed point + routines if the decimal place is on a byte boundary which is not + the default for signed accum types. */ + +ADJUST_IBIT (HA, 7); +ADJUST_FBIT (HA, 8); + +ADJUST_IBIT (SA, 15); +ADJUST_FBIT (SA, 16); + +ADJUST_IBIT (DA, 31); +ADJUST_FBIT (DA, 32); + +/* Make TA and UTA 64 bits wide. + 128 bit wide modes would be insane on an 8-bit machine.
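[Editorial aside, not part of the patch: the ADJUST_IBIT/ADJUST_FBIT pairs above put the binary point of the signed accum modes on a byte boundary. A minimal host-side model of the resulting HA layout (1 sign bit, 7 integer bits, 8 fractional bits; the names and Q8.8 framing are illustrative, not from the patch) looks like this:]

```c
#include <stdint.h>

/* Hypothetical host-side model of the adjusted "short accum" (HA)
   format: the raw value is a scaled two's-complement 16-bit integer,
   with the binary point exactly between the two bytes.  */

#define HA_FBITS 8

int16_t ha_from_double (double x)
{
  return (int16_t) (x * (1 << HA_FBITS));   /* scale into Q8.8 */
}

double ha_to_double (int16_t raw)
{
  return (double) raw / (1 << HA_FBITS);    /* scale back out */
}
```

With the point byte-aligned, the integer and fractional parts each occupy whole registers, which is what saves instructions on an 8-bit machine.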
*/ + +ADJUST_BYTESIZE (TA, 8); +ADJUST_ALIGNMENT (TA, 1); +ADJUST_IBIT (TA, 15); +ADJUST_FBIT (TA, 48); + +ADJUST_BYTESIZE (UTA, 8); +ADJUST_ALIGNMENT (UTA, 1); +ADJUST_IBIT (UTA, 16); +ADJUST_FBIT (UTA, 48); Index: gcc/config/avr/avr-protos.h =================================================================== --- gcc/config/avr/avr-protos.h (revision 190299) +++ gcc/config/avr/avr-protos.h (working copy) @@ -79,6 +79,9 @@ extern const char* avr_load_lpm (rtx, rt extern bool avr_rotate_bytes (rtx operands[]); +extern const char* avr_out_fract (rtx, rtx[], bool, int*); +extern rtx avr_to_int_mode (rtx); + extern void expand_prologue (void); extern void expand_epilogue (bool); extern bool avr_emit_movmemhi (rtx*); @@ -92,6 +95,8 @@ extern const char* avr_out_plus (rtx*, i extern const char* avr_out_plus_noclobber (rtx*, int*, int*); extern const char* avr_out_plus64 (rtx, int*); extern const char* avr_out_addto_sp (rtx*, int*); +extern const char* avr_out_minus (rtx*, int*, int*); +extern const char* avr_out_minus64 (rtx, int*); extern const char* avr_out_xload (rtx, rtx*, int*); extern const char* avr_out_movmem (rtx, rtx*, int*); extern const char* avr_out_insert_bits (rtx*, int*); Index: gcc/config/avr/constraints.md =================================================================== --- gcc/config/avr/constraints.md (revision 190299) +++ gcc/config/avr/constraints.md (working copy) @@ -192,3 +192,47 @@ (define_constraint "C0f" "32-bit integer constant where no nibble equals 0xf." (and (match_code "const_int") (match_test "!avr_has_nibble_0xf (op)"))) + +;; CONST_FIXED is no element of 'n' so cook our own. +;; "i" or "s" would match but because the insn uses iterators that cover +;; INT_MODE, "i" or "s" is not always possible. + +(define_constraint "Ynn" + "Fixed-point constant known at compile time." 
+ (match_code "const_fixed")) + +(define_constraint "Y00" + "Fixed-point or integer constant with bit representation 0x0" + (and (match_code "const_fixed,const_int") + (match_test "op == CONST0_RTX (GET_MODE (op))"))) + +(define_constraint "Y01" + "Fixed-point or integer constant with bit representation 0x1" + (ior (and (match_code "const_fixed") + (match_test "1 == INTVAL (avr_to_int_mode (op))")) + (match_test "satisfies_constraint_P (op)"))) + +(define_constraint "Ym1" + "Fixed-point or integer constant with bit representation -0x1" + (ior (and (match_code "const_fixed") + (match_test "-1 == INTVAL (avr_to_int_mode (op))")) + (match_test "satisfies_constraint_N (op)"))) + +(define_constraint "Y02" + "Fixed-point or integer constant with bit representation 0x2" + (ior (and (match_code "const_fixed") + (match_test "2 == INTVAL (avr_to_int_mode (op))")) + (match_test "satisfies_constraint_K (op)"))) + +(define_constraint "Ym2" + "Fixed-point or integer constant with bit representation -0x2" + (ior (and (match_code "const_fixed") + (match_test "-2 == INTVAL (avr_to_int_mode (op))")) + (match_test "satisfies_constraint_Cm2 (op)"))) + +;; Similar to "IJ" used with ADIW/SBIW, but for CONST_FIXED. + +(define_constraint "YIJ" + "Fixed-point constant from @minus{}0x003f to 0x003f." + (and (match_code "const_fixed") + (match_test "IN_RANGE (INTVAL (avr_to_int_mode (op)), -63, 63)"))) Index: gcc/config/avr/avr.c =================================================================== --- gcc/config/avr/avr.c (revision 190299) +++ gcc/config/avr/avr.c (working copy) @@ -49,6 +49,10 @@ #include "params.h" #include "df.h" +#ifndef CONST_FIXED_P +#define CONST_FIXED_P(X) (CONST_FIXED == GET_CODE (X)) +#endif + /* Maximal allowed offset for an address in the LD command */ #define MAX_LD_OFFSET(MODE) (64 - (signed)GET_MODE_SIZE (MODE)) @@ -264,6 +268,23 @@ avr_popcount_each_byte (rtx xval, int n_ return true; } + +/* Access some RTX as INT_MODE. 
If X is a CONST_FIXED we can get + the bit representation of X by "casting" it to CONST_INT. */ + +rtx +avr_to_int_mode (rtx x) +{ + enum machine_mode mode = GET_MODE (x); + + return VOIDmode == mode + ? x + : simplify_gen_subreg (int_mode_for_mode (mode), x, mode, 0); +} + + +/* Implement `TARGET_OPTION_OVERRIDE'. */ + static void avr_option_override (void) { @@ -389,9 +410,14 @@ avr_regno_reg_class (int r) } +/* Implement `TARGET_SCALAR_MODE_SUPPORTED_P'. */ + static bool avr_scalar_mode_supported_p (enum machine_mode mode) { + if (ALL_FIXED_POINT_MODE_P (mode)) + return true; + if (PSImode == mode) return true; @@ -715,6 +741,8 @@ avr_initial_elimination_offset (int from } } + +/* Implement `TARGET_BUILTIN_SETJMP_FRAME_VALUE'. */ /* Actual start of frame is virtual_stack_vars_rtx this is offset from frame pointer by +STARTING_FRAME_OFFSET. Using saved frame = virtual_stack_vars_rtx - STARTING_FRAME_OFFSET @@ -723,10 +751,13 @@ avr_initial_elimination_offset (int from static rtx avr_builtin_setjmp_frame_value (void) { - return gen_rtx_MINUS (Pmode, virtual_stack_vars_rtx, - gen_int_mode (STARTING_FRAME_OFFSET, Pmode)); + rtx xval = gen_reg_rtx (Pmode); + emit_insn (gen_subhi3 (xval, virtual_stack_vars_rtx, + gen_int_mode (STARTING_FRAME_OFFSET, Pmode))); + return xval; } + /* Return contents of MEM at frame pointer + stack size + 1 (+2 if 3 byte PC). This is return address of function. 
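[Editorial aside: avr_to_int_mode above reinterprets a fixed-point RTX as an integer of the same width precisely because the arithmetic on the raw bits is identical. A hedged host-side sketch (Q8.8 as a stand-in for HA; not code from the patch) of why addition can share the integer output routines:]

```c
#include <stdint.h>

/* Fixed-point add acts on the raw two's-complement bits exactly like
   the integer add of the same width -- this is what lets the patch
   fold the fixed-point modes into the existing integer mode iterators
   instead of duplicating the output routines.  */

int16_t q88_add (int16_t a, int16_t b)
{
  return (int16_t) (a + b);   /* same byte-wise sequence as an HImode add */
}
```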
*/ rtx @@ -2081,6 +2112,14 @@ avr_print_operand (FILE *file, rtx x, in /* Use normal symbol for direct address no linker trampoline needed */ output_addr_const (file, x); } + else if (GET_CODE (x) == CONST_FIXED) + { + HOST_WIDE_INT ival = INTVAL (avr_to_int_mode (x)); + if (code != 0) + output_operand_lossage ("Unsupported code '%c'for fixed-point:", + code); + fprintf (file, HOST_WIDE_INT_PRINT_DEC, ival); + } else if (GET_CODE (x) == CONST_DOUBLE) { long val; @@ -2116,6 +2155,7 @@ notice_update_cc (rtx body ATTRIBUTE_UNU case CC_OUT_PLUS: case CC_OUT_PLUS_NOCLOBBER: + case CC_MINUS: case CC_LDI: { rtx *op = recog_data.operand; @@ -2139,6 +2179,11 @@ notice_update_cc (rtx body ATTRIBUTE_UNU cc = (enum attr_cc) icc; break; + case CC_MINUS: + avr_out_minus (op, &len_dummy, &icc); + cc = (enum attr_cc) icc; + break; + case CC_LDI: cc = (op[1] == CONST0_RTX (GET_MODE (op[0])) @@ -2779,9 +2824,11 @@ output_movqi (rtx insn, rtx operands[], if (real_l) *real_l = 1; - if (register_operand (dest, QImode)) + gcc_assert (1 == GET_MODE_SIZE (GET_MODE (dest))); + + if (REG_P (dest)) { - if (register_operand (src, QImode)) /* mov r,r */ + if (REG_P (src)) /* mov r,r */ { if (test_hard_reg_class (STACK_REG, dest)) return "out %0,%1"; @@ -2803,7 +2850,7 @@ output_movqi (rtx insn, rtx operands[], rtx xop[2]; xop[0] = dest; - xop[1] = src == const0_rtx ? zero_reg_rtx : src; + xop[1] = src == CONST0_RTX (GET_MODE (dest)) ? zero_reg_rtx : src; return out_movqi_mr_r (insn, xop, real_l); } @@ -2825,6 +2872,8 @@ output_movhi (rtx insn, rtx xop[], int * return avr_out_lpm (insn, xop, plen); } + gcc_assert (2 == GET_MODE_SIZE (GET_MODE (dest))); + if (REG_P (dest)) { if (REG_P (src)) /* mov r,r */ @@ -2843,7 +2892,6 @@ output_movhi (rtx insn, rtx xop[], int * return TARGET_NO_INTERRUPTS ? 
avr_asm_len ("out __SP_H__,%B1" CR_TAB "out __SP_L__,%A1", xop, plen, -2) - : avr_asm_len ("in __tmp_reg__,__SREG__" CR_TAB "cli" CR_TAB "out __SP_H__,%B1" CR_TAB @@ -2880,7 +2928,7 @@ output_movhi (rtx insn, rtx xop[], int * rtx xop[2]; xop[0] = dest; - xop[1] = src == const0_rtx ? zero_reg_rtx : src; + xop[1] = src == CONST0_RTX (GET_MODE (dest)) ? zero_reg_rtx : src; return out_movhi_mr_r (insn, xop, plen); } @@ -3403,9 +3451,10 @@ output_movsisf (rtx insn, rtx operands[] if (!l) l = &dummy; - if (register_operand (dest, VOIDmode)) + gcc_assert (4 == GET_MODE_SIZE (GET_MODE (dest))); + if (REG_P (dest)) { - if (register_operand (src, VOIDmode)) /* mov r,r */ + if (REG_P (src)) /* mov r,r */ { if (true_regnum (dest) > true_regnum (src)) { @@ -3440,10 +3489,10 @@ output_movsisf (rtx insn, rtx operands[] { return output_reload_insisf (operands, NULL_RTX, real_l); } - else if (GET_CODE (src) == MEM) + else if (MEM_P (src)) return out_movsi_r_mr (insn, operands, real_l); /* mov r,m */ } - else if (GET_CODE (dest) == MEM) + else if (MEM_P (dest)) { const char *templ; @@ -4126,14 +4175,25 @@ avr_out_compare (rtx insn, rtx *xop, int rtx xval = xop[1]; /* MODE of the comparison. */ - enum machine_mode mode = GET_MODE (xreg); + enum machine_mode mode; /* Number of bytes to operate on. */ - int i, n_bytes = GET_MODE_SIZE (mode); + int i, n_bytes = GET_MODE_SIZE (GET_MODE (xreg)); /* Value (0..0xff) held in clobber register xop[2] or -1 if unknown. */ int clobber_val = -1; + /* Map fixed mode operands to integer operands with the same binary + representation. They are easier to handle in the remainder. */ + + if (CONST_FIXED == GET_CODE (xval)) + { + xreg = avr_to_int_mode (xop[0]); + xval = avr_to_int_mode (xop[1]); + } + + mode = GET_MODE (xreg); + gcc_assert (REG_P (xreg)); gcc_assert ((CONST_INT_P (xval) && n_bytes <= 4) || (const_double_operand (xval, VOIDmode) && n_bytes == 8)); @@ -5884,6 +5944,9 @@ avr_out_plus_1 (rtx *xop, int *plen, enu /* MODE of the operation. 
*/ enum machine_mode mode = GET_MODE (xop[0]); + /* INT_MODE of the same size. */ + enum machine_mode imode = int_mode_for_mode (mode); + /* Number of bytes to operate on. */ int i, n_bytes = GET_MODE_SIZE (mode); @@ -5908,8 +5971,11 @@ avr_out_plus_1 (rtx *xop, int *plen, enu *pcc = (MINUS == code) ? CC_SET_CZN : CC_CLOBBER; + if (CONST_FIXED_P (xval)) + xval = avr_to_int_mode (xval); + if (MINUS == code) - xval = simplify_unary_operation (NEG, mode, xval, mode); + xval = simplify_unary_operation (NEG, imode, xval, imode); op[2] = xop[3]; @@ -5920,7 +5986,7 @@ avr_out_plus_1 (rtx *xop, int *plen, enu { /* We operate byte-wise on the destination. */ rtx reg8 = simplify_gen_subreg (QImode, xop[0], mode, i); - rtx xval8 = simplify_gen_subreg (QImode, xval, mode, i); + rtx xval8 = simplify_gen_subreg (QImode, xval, imode, i); /* 8-bit value to operate with this byte. */ unsigned int val8 = UINTVAL (xval8) & GET_MODE_MASK (QImode); @@ -5941,7 +6007,7 @@ avr_out_plus_1 (rtx *xop, int *plen, enu && i + 2 <= n_bytes && test_hard_reg_class (ADDW_REGS, reg8)) { - rtx xval16 = simplify_gen_subreg (HImode, xval, mode, i); + rtx xval16 = simplify_gen_subreg (HImode, xval, imode, i); unsigned int val16 = UINTVAL (xval16) & GET_MODE_MASK (HImode); /* Registers R24, X, Y, Z can use ADIW/SBIW with constants < 64 @@ -6085,6 +6151,41 @@ avr_out_plus_noclobber (rtx *xop, int *p } +/* Output subtraction of register XOP[0] and compile time constant XOP[2]: + + XOP[0] = XOP[0] - XOP[2] + + This is basically the same as `avr_out_plus' except that we subtract. + It's needed because (minus x const) is not mapped to (plus x -const) + for the fixed point modes. 
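[Editorial aside: since (minus x const) is not canonicalized to (plus x -const) for fixed-point modes, avr_out_minus negates the constant's bit pattern itself and then delegates to the addition path. A host-side sketch of that identity, under the same illustrative Q8.8 assumption (names are hypothetical):]

```c
#include <stdint.h>

/* Subtracting a fixed-point constant equals adding the two's-complement
   negation of its raw bits, so the subtraction output routine can
   reuse the existing addition output routine unchanged.  */

int16_t q88_sub_const (int16_t x, int16_t raw_const)
{
  int16_t neg = (int16_t) (-raw_const);   /* negate the bit pattern */
  return (int16_t) (x + neg);             /* then take the add path  */
}
```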
*/ + +const char* +avr_out_minus (rtx *xop, int *plen, int *pcc) +{ + rtx op[4]; + + if (pcc) + *pcc = (int) CC_SET_CZN; + + if (REG_P (xop[2])) + return avr_asm_len ("sub %A0,%A2" CR_TAB + "sbc %B0,%B2", xop, plen, -2); + + if (!CONST_INT_P (xop[2]) + && !CONST_FIXED_P (xop[2])) + return avr_asm_len ("subi %A0,lo8(%2)" CR_TAB + "sbci %B0,hi8(%2)", xop, plen, -2); + + op[0] = avr_to_int_mode (xop[0]); + op[1] = avr_to_int_mode (xop[1]); + op[2] = gen_int_mode (-INTVAL (avr_to_int_mode (xop[2])), + GET_MODE (op[0])); + op[3] = xop[3]; + + return avr_out_plus (op, plen, pcc); +} + + /* Prepare operands of adddi3_const_insn to be used with avr_out_plus_1. */ const char* @@ -6103,6 +6204,19 @@ avr_out_plus64 (rtx addend, int *plen) return ""; } + +/* Prepare operands of subdi3_const_insn to be used with avr_out_plus64. */ + +const char* +avr_out_minus64 (rtx subtrahend, int *plen) +{ + rtx xneg = avr_to_int_mode (subtrahend); + xneg = simplify_unary_operation (NEG, DImode, xneg, DImode); + + return avr_out_plus64 (xneg, plen); +} + + /* Output bit operation (IOR, AND, XOR) with register XOP[0] and compile time constant XOP[2]: @@ -6442,6 +6556,319 @@ avr_rotate_bytes (rtx operands[]) return true; } + +/* Outputs instructions needed for fixed point type conversion. + This includes converting between any fixed point type, as well + as converting to any integer type. Conversion between integer + types is not supported. + + The number of instructions generated depends on the types + being converted and the registers assigned to them. + + The number of instructions required to complete the conversion + is least if the registers for source and destination are overlapping + and are aligned at the decimal place as actual movement of data is + completely avoided. In some cases, the conversion may already be + complete without any instructions needed. + + When converting to signed types from signed types, sign extension + is implemented. 
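[Editorial aside: the comment above notes that conversions are cheapest when source and destination are aligned at the binary point. With byte-aligned points, widening HA (Q8.8) to SA (Q16.16) is pure byte movement plus sign extension, with no bit shifting; a hedged host-side sketch (illustrative names, not patch code):]

```c
#include <stdint.h>

/* Widening Q8.8 to Q16.16: the fraction gains one zero byte and the
   integer part is sign-extended -- on the target this is just byte
   moves, the case avr_out_fract handles with no shift instructions.  */

int32_t q88_to_q1616 (int16_t x)
{
  return (int32_t) x << 8;   /* append one fractional byte of zeros */
}
```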
+ + Converting signed fractional types requires a bit shift if converting + to or from any unsigned fractional type because the decimal place is + shifted by 1 bit. When the destination is a signed fractional, the sign + is stored in either the carry or T bit. */ + +const char* +avr_out_fract (rtx insn, rtx operands[], bool intsigned, int *plen) +{ + int i; + bool sbit[2]; + /* ilen: Length of integral part (in bytes) + flen: Length of fractional part (in bytes) + tlen: Length of operand (in bytes) + blen: Length of operand (in bits) */ + int ilen[2], flen[2], tlen[2], blen[2]; + int rdest, rsource, offset; + int start, end, dir; + bool sign_in_T = false, sign_in_Carry = false; + int clrword = -1, lastclr = 0, clr = 0; + rtx xop[6]; + + const int dest = 0; + const int src = 1; + + xop[dest] = operands[dest]; + xop[src] = operands[src]; + + if (plen) + *plen = 0; + + /* Determine format (integer and fractional parts) + of types needing conversion. */ + + for (i = 0; i < 2; i++) + { + enum machine_mode mode = GET_MODE (xop[i]); + + tlen[i] = GET_MODE_SIZE (mode); + blen[i] = GET_MODE_BITSIZE (mode); + + if (SCALAR_INT_MODE_P (mode)) + { + sbit[i] = intsigned; + ilen[i] = GET_MODE_SIZE (mode); + flen[i] = 0; + } + else if (ALL_SCALAR_FIXED_POINT_MODE_P (mode)) + { + sbit[i] = SIGNED_SCALAR_FIXED_POINT_MODE_P (mode); + ilen[i] = (GET_MODE_IBIT (mode) + 1) / 8; + flen[i] = (GET_MODE_FBIT (mode) + 1) / 8; + } + else + fatal_insn ("unsupported fixed-point conversion", insn); + } + + rdest = REGNO (xop[dest]); + rsource = REGNO (xop[src]); + offset = flen[src] - flen[dest]; + + /* For bit index */ + + xop[2] = GEN_INT (blen[dest] - 1); + xop[3] = GEN_INT (blen[src] - 1); + + /* Store the sign bit if the destination is a signed fract and the source + has a sign in the integer part. */ + + if (sbit[dest] && ilen[dest] == 0 && sbit[src] && ilen[src] > 0) + { + /* Position of MSB in the source operand. 
*/ + + xop[4] = GEN_INT (blen[1] - 1); + + /* To avoid using BST and BLD if the source and destination registers + overlap or the source is unused after, we can use LSL to store the + sign bit in carry since we don't need the integral part of the source. + Restoring the sign from carry saves one BLD instruction below. */ + + if (reg_unused_after (insn, xop[src]) + || (rdest < rsource + tlen[src] + && rdest + tlen[dest] > rsource)) + { + avr_asm_len ("lsl %T1%t4", xop, plen, 1); + sign_in_Carry = true; + } + else + { + avr_asm_len ("bst %T1%T4", xop, plen, 1); + sign_in_T = true; + } + } + + /* Pick the correct direction to shift bytes. */ + + if (rdest < rsource + offset) + { + dir = 1; + start = 0; + end = tlen[dest]; + } + else + { + dir = -1; + start = tlen[dest] - 1; + end = -1; + } + + /* Perform conversion by moving registers into place, clearing + destination registers that do not overlap with any source. */ + + for (i = start; i != end; i += dir) + { + int destloc = rdest + i; + int sourceloc = rsource + i + offset; + + /* Source register location is outside range of source register, + so clear this byte in the dest. */ + + if (sourceloc < rsource + || sourceloc >= rsource + tlen[src]) + { + if (AVR_HAVE_MOVW + && i + dir != end + && (sourceloc + dir < rsource + || sourceloc + dir >= rsource + tlen[src]) + && ((dir == 1 && !(destloc % 2) && !(sourceloc % 2)) + || (dir == -1 && (destloc % 2) && (sourceloc % 2))) + && clrword != -1) + { + /* Use already cleared word to clear two bytes at a time. */ + + int even_i = i & ~1; + int even_clrword = clrword & ~1; + + xop[4] = GEN_INT (8 * even_i); + xop[5] = GEN_INT (8 * even_clrword); + avr_asm_len ("movw %T0%t4,%T0%t5", xop, plen, 1); + i += dir; + } + else + { + /* Do not clear the register if it is going to get + sign extended with a MOV later. 
*/ + + if (sbit[dest] && sbit[src] + && i != tlen[dest] - 1 + && i >= flen[dest]) + continue; + + xop[4] = GEN_INT (8 * i); + avr_asm_len ("clr %T0%t4", xop, plen, 1); + + /* If the last byte was cleared too, we have a cleared + word we can MOVW to clear two bytes at a time. */ + + if (lastclr) + clrword = i; + + clr = 1; + } + } + else if (destloc == sourceloc) + { + /* Source byte is already in destination: Nothing needed. */ + + continue; + } + else + { + /* Registers do not line up and source register location + is within range: Perform move, shifting with MOV or MOVW. */ + + if (AVR_HAVE_MOVW + && i + dir != end + && sourceloc + dir >= rsource + && sourceloc + dir < rsource + tlen[src] + && ((dir == 1 && !(destloc % 2) && !(sourceloc % 2)) + || (dir == -1 && (destloc % 2) && (sourceloc % 2)))) + { + int even_i = i & ~1; + int even_i_plus_offset = (i + offset) & ~1; + + xop[4] = GEN_INT (8 * even_i); + xop[5] = GEN_INT (8 * even_i_plus_offset); + avr_asm_len ("movw %T0%t4,%T1%t5", xop, plen, 1); + i += dir; + } + else + { + xop[4] = GEN_INT (8 * i); + xop[5] = GEN_INT (8 * (i + offset)); + avr_asm_len ("mov %T0%t4,%T1%t5", xop, plen, 1); + } + } + + lastclr = clr; + clr = 0; + } + + /* Perform sign extension if source and dest are both signed, + and there are more integer parts in dest than in source. */ + + if (sbit[dest] && sbit[src] && ilen[dest] > ilen[src]) + { + xop[4] = GEN_INT (blen[src] - 1 - 8 * offset); + avr_asm_len ("sbrc %T0%T4", xop, plen, 1); + + /* Register was previously cleared, so can become 0xff and extended. */ + + avr_asm_len ("com %T0%t2", xop, plen, 1); + + /* Sign extend additional bytes by MOV and MOVW. 
*/ + + start = tlen[dest] - 2; + end = flen[dest] + ilen[src] - 1; + + for (i = start; i != end; i--) + { + if (AVR_HAVE_MOVW && i != start && i-1 != end) + { + i--; + xop[4] = GEN_INT (8 * i); + xop[5] = GEN_INT (8 * (tlen[dest] - 2)); + avr_asm_len ("movw %T0%t4,%T0%t5", xop, plen, 1); + } + else + { + xop[4] = GEN_INT (8 * i); + xop[5] = GEN_INT (8 * (tlen[dest] - 1)); + avr_asm_len ("mov %T0%t4,%T0%t5", xop, plen, 1); + } + } + } + + /* If destination is a signed fract, and the source was not, a shift + by 1 bit is needed. Also restore sign from carry or T. */ + + if (sbit[dest] && !ilen[dest] && (!sbit[src] || ilen[src])) + { + /* We have flen[src] non-zero fractional bytes to shift. + Because of the right shift, handle one byte more so that the + LSB won't be lost. */ + + int nonzero = flen[src] + 1; + + /* If the LSB is in the T flag and there are no fractional + bits, the high byte is zero and no shift needed. */ + + if (flen[src] == 0 && sign_in_T) + nonzero = 0; + + start = flen[dest] - 1; + end = start - nonzero; + + for (i = start; i > end && i >= 0; i--) + { + xop[4] = GEN_INT (8 * i); + if (i == start && !sign_in_Carry) + avr_asm_len ("lsr %T0%t4", xop, plen, 1); + else + avr_asm_len ("ror %T0%t4", xop, plen, 1); + } + + if (sign_in_T) + { + xop[4] = GEN_INT (blen[dest] - 1); + avr_asm_len ("bld %T0%T4", xop, plen, 1); + } + } + else if (sbit[src] && !ilen[src] && (!sbit[dest] || ilen[dest])) + { + /* If source was a signed fract and dest was not, shift 1 bit + other way. */ + + start = flen[dest] - flen[src]; + + if (start < 0) + start = 0; + + for (i = start; i < flen[dest]; i++) + { + xop[4] = GEN_INT (8 * i); + + if (i == start) + avr_asm_len ("lsl %T0%t4", xop, plen, 1); + else + avr_asm_len ("rol %T0%t4", xop, plen, 1); + } + } + + return ""; +} + + /* Modifies the length assigned to instruction INSN LEN is the initially computed length of the insn. 
*/ @@ -6489,6 +6916,8 @@ adjust_insn_length (rtx insn, int len) case ADJUST_LEN_OUT_PLUS: avr_out_plus (op, &len, NULL); break; case ADJUST_LEN_PLUS64: avr_out_plus64 (op[0], &len); break; + case ADJUST_LEN_MINUS: avr_out_minus (op, &len, NULL); break; + case ADJUST_LEN_MINUS64: avr_out_minus64 (op[0], &len); break; case ADJUST_LEN_OUT_PLUS_NOCLOBBER: avr_out_plus_noclobber (op, &len, NULL); break; @@ -6502,6 +6931,9 @@ adjust_insn_length (rtx insn, int len) case ADJUST_LEN_XLOAD: avr_out_xload (insn, op, &len); break; case ADJUST_LEN_LOAD_LPM: avr_load_lpm (insn, op, &len); break; + case ADJUST_LEN_SFRACT: avr_out_fract (insn, op, true, &len); break; + case ADJUST_LEN_UFRACT: avr_out_fract (insn, op, false, &len); break; + case ADJUST_LEN_TSTHI: avr_out_tsthi (insn, op, &len); break; case ADJUST_LEN_TSTPSI: avr_out_tstpsi (insn, op, &len); break; case ADJUST_LEN_TSTSI: avr_out_tstsi (insn, op, &len); break; @@ -6683,6 +7115,20 @@ avr_assemble_integer (rtx x, unsigned in return true; } + else if (CONST_FIXED_P (x)) + { + unsigned n; + + /* varasm fails to handle big fixed modes that don't fit in hwi. */ + + for (n = 0; n < size; n++) + { + rtx xn = simplify_gen_subreg (QImode, x, GET_MODE (x), n); + default_assemble_integer (xn, 1, aligned_p); + } + + return true; + } return default_assemble_integer (x, size, aligned_p); } @@ -7489,6 +7935,7 @@ avr_operand_rtx_cost (rtx x, enum machin return 0; case CONST_INT: + case CONST_FIXED: case CONST_DOUBLE: return COSTS_N_INSNS (GET_MODE_SIZE (mode)); @@ -7518,6 +7965,7 @@ avr_rtx_costs_1 (rtx x, int codearg, int switch (code) { case CONST_INT: + case CONST_FIXED: case CONST_DOUBLE: case SYMBOL_REF: case CONST: @@ -8788,6 +9236,8 @@ avr_2word_insn_p (rtx insn) return false; case CODE_FOR_movqi_insn: + case CODE_FOR_movuqq_insn: + case CODE_FOR_movqq_insn: { rtx set = single_set (insn); rtx src = SET_SRC (set); @@ -8796,7 +9246,7 @@ avr_2word_insn_p (rtx insn) /* Factor out LDS and STS from movqi_insn. 
*/ if (MEM_P (dest) - && (REG_P (src) || src == const0_rtx)) + && (REG_P (src) || src == CONST0_RTX (GET_MODE (dest)))) { return CONSTANT_ADDRESS_P (XEXP (dest, 0)); } @@ -9021,7 +9471,7 @@ output_reload_in_const (rtx *op, rtx clo if (NULL_RTX == clobber_reg && !test_hard_reg_class (LD_REGS, dest) - && (! (CONST_INT_P (src) || CONST_DOUBLE_P (src)) + && (! (CONST_INT_P (src) || CONST_FIXED_P (src) || CONST_DOUBLE_P (src)) || !avr_popcount_each_byte (src, n_bytes, (1 << 0) | (1 << 1) | (1 << 8)))) { @@ -9048,6 +9498,7 @@ output_reload_in_const (rtx *op, rtx clo ldreg_p = test_hard_reg_class (LD_REGS, xdest[n]); if (!CONST_INT_P (src) + && !CONST_FIXED_P (src) && !CONST_DOUBLE_P (src)) { static const char* const asm_code[][2] = @@ -9239,6 +9690,7 @@ output_reload_insisf (rtx *op, rtx clobb if (AVR_HAVE_MOVW && !test_hard_reg_class (LD_REGS, op[0]) && (CONST_INT_P (op[1]) + || CONST_FIXED_P (op[1]) || CONST_DOUBLE_P (op[1]))) { int len_clr, len_noclr; @@ -10834,6 +11286,9 @@ avr_fold_builtin (tree fndecl, int n_arg #undef TARGET_SCALAR_MODE_SUPPORTED_P #define TARGET_SCALAR_MODE_SUPPORTED_P avr_scalar_mode_supported_p +#undef TARGET_FIXED_POINT_SUPPORTED_P +#define TARGET_FIXED_POINT_SUPPORTED_P hook_bool_void_true + #undef TARGET_ADDR_SPACE_SUBSET_P #define TARGET_ADDR_SPACE_SUBSET_P avr_addr_space_subset_p Index: libgcc/config/avr/avr-lib.h =================================================================== --- libgcc/config/avr/avr-lib.h (revision 190299) +++ libgcc/config/avr/avr-lib.h (working copy) @@ -4,3 +4,79 @@ #define DI SI typedef int QItype __attribute__ ((mode (QI))); #endif + +/* fixed-bit.h does not define functions for TA and UTA because + that part is wrapped in #if MIN_UNITS_PER_WORD > 4. + This would lead to empty functions for TA and UTA. + Thus, supply appropriate defines as if HAVE_[U]TA == 1. 
+ #define HAVE_[U]TA 1 won't work because avr-modes.def + uses ADJUST_BYTESIZE(TA,8) and fixed-bit.h is not generic enough + to arrange for such changes of the mode size. */ + +typedef unsigned _Fract UTAtype __attribute__ ((mode (UTA))); + +#if defined (UTA_MODE) +#define FIXED_SIZE 8 /* in bytes */ +#define INT_C_TYPE UDItype +#define UINT_C_TYPE UDItype +#define HINT_C_TYPE USItype +#define HUINT_C_TYPE USItype +#define MODE_NAME UTA +#define MODE_NAME_S uta +#define MODE_UNSIGNED 1 +#endif + +#if defined (FROM_UTA) +#define FROM_TYPE 4 /* ID for fixed-point */ +#define FROM_MODE_NAME UTA +#define FROM_MODE_NAME_S uta +#define FROM_INT_C_TYPE UDItype +#define FROM_SINT_C_TYPE DItype +#define FROM_UINT_C_TYPE UDItype +#define FROM_MODE_UNSIGNED 1 +#define FROM_FIXED_SIZE 8 /* in bytes */ +#elif defined (TO_UTA) +#define TO_TYPE 4 /* ID for fixed-point */ +#define TO_MODE_NAME UTA +#define TO_MODE_NAME_S uta +#define TO_INT_C_TYPE UDItype +#define TO_SINT_C_TYPE DItype +#define TO_UINT_C_TYPE UDItype +#define TO_MODE_UNSIGNED 1 +#define TO_FIXED_SIZE 8 /* in bytes */ +#endif + +/* Same for TAmode */ + +typedef _Fract TAtype __attribute__ ((mode (TA))); + +#if defined (TA_MODE) +#define FIXED_SIZE 8 /* in bytes */ +#define INT_C_TYPE DItype +#define UINT_C_TYPE UDItype +#define HINT_C_TYPE SItype +#define HUINT_C_TYPE USItype +#define MODE_NAME TA +#define MODE_NAME_S ta +#define MODE_UNSIGNED 0 +#endif + +#if defined (FROM_TA) +#define FROM_TYPE 4 /* ID for fixed-point */ +#define FROM_MODE_NAME TA +#define FROM_MODE_NAME_S ta +#define FROM_INT_C_TYPE DItype +#define FROM_SINT_C_TYPE DItype +#define FROM_UINT_C_TYPE UDItype +#define FROM_MODE_UNSIGNED 0 +#define FROM_FIXED_SIZE 8 /* in bytes */ +#elif defined (TO_TA) +#define TO_TYPE 4 /* ID for fixed-point */ +#define TO_MODE_NAME TA +#define TO_MODE_NAME_S ta +#define TO_INT_C_TYPE DItype +#define TO_SINT_C_TYPE DItype +#define TO_UINT_C_TYPE UDItype +#define TO_MODE_UNSIGNED 0 +#define TO_FIXED_SIZE 8 /* in 
bytes */ +#endif Index: libgcc/config/avr/lib1funcs-fixed.S =================================================================== --- libgcc/config/avr/lib1funcs-fixed.S (revision 0) +++ libgcc/config/avr/lib1funcs-fixed.S (revision 0) @@ -0,0 +1,1019 @@ +/* -*- Mode: Asm -*- */ +;; Copyright (C) 2012 +;; Free Software Foundation, Inc. +;; Contributed by Sean D'Epagnier (sean@depagnier.com) + +;; This file is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published by the +;; Free Software Foundation; either version 3, or (at your option) any +;; later version. + +;; In addition to the permissions in the GNU General Public License, the +;; Free Software Foundation gives you unlimited permission to link the +;; compiled version of this file into combinations with other programs, +;; and to distribute those combinations without any restriction coming +;; from the use of this file. (The General Public License restrictions +;; do apply in other respects; for example, they cover modification of +;; the file, and distribution when not linked into a combine +;; executable.) + +;; This file is distributed in the hope that it will be useful, but +;; WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +;; General Public License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with this program; see the file COPYING. If not, write to +;; the Free Software Foundation, 51 Franklin Street, Fifth Floor, +;; Boston, MA 02110-1301, USA. 
+ +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Fixed point library routines for AVR +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +.section .text.libgcc.fixed, "ax", @progbits + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Conversions to float +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +#if defined (L_fractqqsf) +DEFUN __fractqqsf + clr r25 + sbrc r24, 7 ; if negative + com r25 ; sign extend + mov r23, r24 ; move in place + mov r24, r25 ; sign extend lower byte + lsl r23 + clr r22 + XJMP __fractsasf ; call larger conversion +ENDF __fractqqsf +#endif /* defined (L_fractqqsf) */ + +#if defined (L_fractuqqsf) +DEFUN __fractuqqsf + clr r22 + mov r23, r24 + clr r24 + clr r25 + XJMP __fractsasf ; call larger conversion +ENDF __fractuqqsf +#endif /* defined (L_fractuqqsf) */ + +#if defined (L_fracthqsf) +DEFUN __fracthqsf + wmov 22, 24 ; put fractional part in place + clr r25 + sbrc r23, 7 ; if negative + com r25 ; sign extend + mov r24, r25 ; sign extend lower byte + lsl r22 + rol r23 + XJMP __fractsasf ; call larger conversion +ENDF __fracthqsf +#endif /* defined (L_fracthqsf) */ + +#if defined (L_fractuhqsf) +DEFUN __fractuhqsf + wmov 22, 24 ; put fractional part in place + clr r24 + clr r25 + XJMP __fractsasf ; call larger conversion +ENDF __fractuhqsf +#endif /* defined (L_fractuhqsf) */ + +#if defined (L_fracthasf) +DEFUN __fracthasf + clr r22 + mov r23, r24 ; move into place + mov r24, r25 + clr r25 + sbrc r24, 7 ; if negative + com r25 ; sign extend + XJMP __fractsasf ; call larger conversion +ENDF __fracthasf +#endif /* defined (L_fracthasf) */ + +#if defined (L_fractuhasf) +DEFUN __fractuhasf + clr r22 + mov r23, r24 ; move into place + XJMP __fractsasf ; call larger conversion +ENDF __fractuhasf +#endif /* defined (L_fractuhasf) */ + +#if defined (L_fractsasf) +DEFUN __fractsasf + XCALL __floatsisf + cpse r25, __zero_reg__ ; skip if zero + subi r25, 0x08 ; adjust exponent + ret +ENDF __fractsasf +#endif /* defined (L_fractsasf) */ + 
+#if defined (L_fractusasf) +DEFUN __fractusasf + XCALL __floatunsisf + cpse r25, __zero_reg__ ; skip if zero + subi r25, 0x08 ; adjust exponent + ret +ENDF __fractusasf +#endif /* defined (L_fractusasf) */ + +#if defined (L_fractsfqq) /* Conversions from float. */ +DEFUN __fractsfqq + subi r25, -11 ; adjust exponent + subi r24, 128 + XJMP __fixsfsi +ENDF __fractsfqq +#endif /* defined (L_fractsfqq) */ + +#if defined (L_fractsfuqq) +DEFUN __fractsfuqq + subi r25, -12 ; adjust exponent + XJMP __fixsfsi +ENDF __fractsfuqq +#endif /* defined (L_fractsfuqq) */ + +#if defined (L_fractsfhq) +DEFUN __fractsfhq + subi r25, -15 ; adjust exponent + subi r24, 128 + XJMP __fixsfsi +ENDF __fractsfhq +#endif /* defined (L_fractsfhq) */ + +#if defined (L_fractsfuhq) +DEFUN __fractsfuhq + subi r25, -16 ; adjust exponent + XJMP __fixsfsi +ENDF __fractsfuhq +#endif /* defined (L_fractsfuhq) */ + +#if defined (L_fractsfha) +DEFUN __fractsfha +ENDF __fractsfha + +DEFUN __fractsfuha + subi r25, -12 ; adjust exponent + XJMP __fixsfsi +ENDF __fractsfuha +#endif /* defined (L_fractsfha) */ + +#if defined (L_fractsfsa) +DEFUN __fractsfsa +ENDF __fractsfsa + +DEFUN __fractsfusa + subi r25, -8 ; adjust exponent + XJMP __fixsfsi +ENDF __fractsfusa +#endif /* defined (L_fractsfsa) */ + +/* For multiplication the functions here are called directly from + avr-fixed.md patterns, instead of using the standard libcall mechanisms. + This can produce better code because GCC knows exactly which + of the call-used registers (not all of them) are clobbered.
*/ + +#undef r_arg1L +#undef r_arg1H +#undef r_arg2L +#undef r_arg2H +#undef r_resL +#undef r_resH + +/* mulqq and muluqq open coded on the enhanced core */ +#if !defined (__AVR_HAVE_MUL__) +/******************************************************* + Fractional Multiplication 8 x 8 +*******************************************************/ +#define r_arg2 r22 /* multiplicand */ +#define r_arg1 r24 /* multiplier */ +#define r_res __tmp_reg__ /* result */ + +#if defined (L_mulqq3) +DEFUN __mulqq3 + mov r_res, r_arg1 + eor r_res, r_arg2 + bst r_res, 7 + lsl r_arg1 + lsl r_arg2 + brcc 0f + neg r_arg2 +0: + XCALL __muluqq3 + lsr r_arg1 + brtc 1f + neg r_arg1 +1: + ret + +ENDF __mulqq3 +#endif /* defined (L_mulqq3) */ + +#if defined (L_muluqq3) +DEFUN __muluqq3 + clr r_res ; clear result +__muluqq3_loop: + lsr r_arg2 ; shift multiplicand + sbrc r_arg1,7 + add r_res,r_arg2 + breq 1f ; while multiplicand != 0 + lsl r_arg1 + brne __muluqq3_loop ; exit if multiplier = 0 +1: + mov r_arg1,r_res ; result to return register + ret +#undef r_arg2 +#undef r_arg1 +#undef r_res + +ENDF __muluqq3 +#endif /* defined (L_muluqq3) */ +#endif /* !defined (__AVR_HAVE_MUL__) */ + +/******************************************************* + Fractional Multiplication 16 x 16 +*******************************************************/ + +#if defined (__AVR_HAVE_MUL__) +#define r_arg1L r22 /* multiplier Low */ +#define r_arg1H r23 /* multiplier High */ +#define r_arg2L r20 /* multiplicand Low */ +#define r_arg2H r21 /* multiplicand High */ +#define r_resL r18 /* result Low */ +#define r_resH r19 /* result High */ + +#if defined (L_mulhq3) +DEFUN __mulhq3 + fmuls r_arg1H, r_arg2H + movw r_resL, r0 + fmulsu r_arg2H, r_arg1L + clr r_arg1L + sbc r_resH, r_arg1L + add r_resL, r1 + adc r_resH, r_arg1L + fmulsu r_arg1H, r_arg2L + sbc r_resH, r_arg1L + add r_resL, r1 + adc r_resH, r_arg1L + clr __zero_reg__ + ret +ENDF __mulhq3 +#endif /* defined (L_mulhq3) */ + +#if defined (L_muluhq3) +DEFUN __muluhq3 + 
mul r_arg1H, r_arg2H + movw r_resL, r0 + mul r_arg1H, r_arg2L + add r_resL, r1 + clr __zero_reg__ + adc r_resH, __zero_reg__ + mul r_arg1L, r_arg2H + add r_resL, r1 + clr __zero_reg__ + adc r_resH, __zero_reg__ + ret +ENDF __muluhq3 + +#endif /* defined (L_muluhq3) */ + +#undef r_arg1L +#undef r_arg1H +#undef r_arg2L +#undef r_arg2H +#undef r_resL +#undef r_resH + +#else /* defined (__AVR_HAVE_MUL__) */ + +#define r_arg1L 24 /* multiplier Low */ +#define r_arg1H 25 /* multiplier High */ +#define r_arg2L 22 /* multiplicand Low */ +#define r_arg2H 23 /* multiplicand High */ +#define r_resL 0 /* __tmp_reg__ result Low */ +#define r_resH 1 /* __zero_reg__ result High */ + +#if defined (L_mulhq3) +DEFUN __mulhq3 + mov r_resL, r_arg1H + eor r_resL, r_arg2H + bst r_resL, 7 + lsl r_arg1L + rol r_arg1H + lsl r_arg2L + rol r_arg2H + brcc 0f + neg2 r_arg2L +0: + XCALL __muluhq3 + lsr r_arg1H + ror r_arg1L + brtc mulhq3_exit + neg2 r_arg1L +mulhq3_exit: + ret +ENDF __mulhq3 +#endif /* defined (L_mulhq3) */ + +#if defined (L_muluhq3) +DEFUN __muluhq3 + clr r_resL ; clear result +__muluhq3_loop: + lsr r_arg2H ; shift multiplicand + ror r_arg2L + sbrs r_arg1H,7 + rjmp 0f + add r_resL,r_arg2L ; result + multiplicand + adc r_resH,r_arg2H +0: + lsl r_arg1L ; shift multiplier + rol r_arg1H + brne __muluhq3_loop + cpi r_arg1L, 0 + brne __muluhq3_loop ; exit multiplier = 0 + wmov r_arg1L,r_resL ; result to return register + clr __zero_reg__ ; zero the zero reg + ret +ENDF __muluhq3 +#endif /* defined (L_muluhq3) */ + +#undef r_arg1L +#undef r_arg1H +#undef r_arg2L +#undef r_arg2H +#undef r_resL +#undef r_resH + +#endif /* defined (__AVR_HAVE_MUL__) */ + +/******************************************************* + Fixed Multiplication 8.8 x 8.8 +*******************************************************/ + +#if defined (__AVR_HAVE_MUL__) +#define r_arg1L r22 /* multiplier Low */ +#define r_arg1H r23 /* multiplier High */ +#define r_arg2L r20 /* multiplicand Low */ +#define r_arg2H r21 /* 
multiplicand High */ +#define r_resL r18 /* result Low */ +#define r_resH r19 /* result High */ + +#if defined (L_mulha3) +DEFUN __mulha3 + mul r_arg1L, r_arg2L + mov r_resL, r1 + muls r_arg1H, r_arg2H + mov r_resH, r0 + mulsu r_arg1H, r_arg2L + add r_resL, r0 + adc r_resH, r1 + mulsu r_arg2H, r_arg1L + add r_resL, r0 + adc r_resH, r1 + clr __zero_reg__ + ret +ENDF __mulha3 +#endif /* defined (L_mulha3) */ + +#if defined (L_muluha3) +DEFUN __muluha3 + mul r_arg1L, r_arg2L + mov r_resL, r1 + mul r_arg1H, r_arg2H + mov r_resH, r0 + mul r_arg1H, r_arg2L + add r_resL, r0 + adc r_resH, r1 + mul r_arg1L, r_arg2H + add r_resL, r0 + adc r_resH, r1 + clr __zero_reg__ + ret +ENDF __muluha3 +#endif /* defined (L_muluha3) */ + +#undef r_arg1L +#undef r_arg1H +#undef r_arg2L +#undef r_arg2H +#undef r_resL +#undef r_resH + +#else /* defined (__AVR_HAVE_MUL__) */ + +#define r_arg1L 24 /* multiplier Low */ +#define r_arg1H 25 /* multiplier High */ +#define r_arg2L 22 /* multiplicand Low */ +#define r_arg2H 23 /* multiplicand High */ +#define r_resL 18 /* result Low */ +#define r_resH 19 /* result High */ +#define r_scratchL 0 /* scratch Low */ +#define r_scratchH 1 + +#if defined (L_mulha3) +DEFUN __mulha3 + mov r_resL, r_arg1H + eor r_resL, r_arg2H + bst r_resL, 7 + sbrs r_arg1H, 7 + rjmp 1f + neg2 r_arg1L +1: + sbrs r_arg2H, 7 + rjmp 2f + neg2 r_arg2L +2: + XCALL __muluha3 + brtc __mulha3_exit + neg2 r_resL +__mulha3_exit: + ret +ENDF __mulha3 +#endif /* defined (L_mulha3) */ + +#if defined (L_muluha3) +DEFUN __muluha3 + clr r_resL ; clear result + clr r_resH + wmov 0, r_arg1L ; save multiplicand +__muluha3_loop1: + sbrs r_arg2H,0 + rjmp 0f + add r_resL,r_arg1L ; result + multiplicand + adc r_resH,r_arg1H +0: + lsl r_arg1L ; shift multiplicand + rol r_arg1H + sbiw r_arg1L,0 + breq __muluha3_loop1_done ; exit multiplicand = 0 + lsr r_arg2H + brne __muluha3_loop1 ; exit multiplier = 0 +__muluha3_loop1_done: + wmov r_arg1L, r_scratchL ; restore multiplicand +__muluha3_loop2: + lsr 
r_arg1H ; shift multiplicand + ror r_arg1L + sbiw r_arg1L,0 + breq __muluha3_exit ; exit if multiplicand = 0 + sbrs r_arg2L,7 + rjmp 0f + add r_resL,r_arg1L ; result + multiplicand + adc r_resH,r_arg1H +0: + lsl r_arg2L + brne __muluha3_loop2 ; exit if multiplier = 0 +__muluha3_exit: + clr __zero_reg__ ; got clobbered + ret +ENDF __muluha3 +#endif /* defined (L_muluha3) */ + +#undef r_arg1L +#undef r_arg1H +#undef r_arg2L +#undef r_arg2H +#undef r_resL +#undef r_resH +#undef r_scratchL +#undef r_scratchH + +#endif /* defined (__AVR_HAVE_MUL__) */ + + +/******************************************************* + Fixed Multiplication 16.16 x 16.16 +*******************************************************/ + +#if defined (__AVR_HAVE_MUL__) +/* uses nonstandard registers because mulus only works from 16-23 */ +#define r_clr r15 + +#define r_arg1L r16 /* multiplier Low */ +#define r_arg1H r17 +#define r_arg1HL r18 +#define r_arg1HH r19 /* multiplier High */ + +#define r_arg2L r20 /* multiplicand Low */ +#define r_arg2H r21 +#define r_arg2HL r22 +#define r_arg2HH r23 /* multiplicand High */ + +#define r_resL r24 /* result Low */ +#define r_resH r25 +#define r_resHL r26 +#define r_resHH r27 /* result High */ + +#if defined (L_mulsa3) +DEFUN __mulsa3 + clr r_clr + clr r_resH + clr r_resHL + clr r_resHH + mul r_arg1H, r_arg2L + mov r_resL, r1 + mul r_arg1L, r_arg2H + add r_resL, r1 + adc r_resH, r_clr + mul r_arg1L, r_arg2HL + add r_resL, r0 + adc r_resH, r1 + adc r_resHL, r_clr + mul r_arg1H, r_arg2H + add r_resL, r0 + adc r_resH, r1 + adc r_resHL, r_clr + mul r_arg1HL, r_arg2L + add r_resL, r0 + adc r_resH, r1 + adc r_resHL, r_clr + mulsu r_arg2HH, r_arg1L + sbc r_resHH, r_clr + add r_resH, r0 + adc r_resHL, r1 + adc r_resHH, r_clr + mul r_arg1H, r_arg2HL + add r_resH, r0 + adc r_resHL, r1 + adc r_resHH, r_clr + mul r_arg1HL, r_arg2H + add r_resH, r0 + adc r_resHL, r1 + adc r_resHH, r_clr + mulsu r_arg1HH, r_arg2L + sbc r_resHH, r_clr + add r_resH, r0 + adc r_resHL, r1 + adc 
r_resHH, r_clr + mulsu r_arg2HH, r_arg1H + add r_resHL, r0 + adc r_resHH, r1 + mul r_arg1HL, r_arg2HL + add r_resHL, r0 + adc r_resHH, r1 + mulsu r_arg1HH, r_arg2H + add r_resHL, r0 + adc r_resHH, r1 + mulsu r_arg2HH, r_arg1HL + add r_resHH, r0 + mulsu r_arg1HH, r_arg2HL + add r_resHH, r0 + clr __zero_reg__ + ret +ENDF __mulsa3 +#endif /* L_mulsa3 */ + +#if defined (L_mulusa3) +DEFUN __mulusa3 + clr r_clr + clr r_resH + clr r_resHL + clr r_resHH + mul r_arg1H, r_arg2L + mov r_resL, r1 + mul r_arg1L, r_arg2H + add r_resL, r1 + adc r_resH, r_clr + mul r_arg1L, r_arg2HL + add r_resL, r0 + adc r_resH, r1 + adc r_resHL, r_clr + mul r_arg1H, r_arg2H + add r_resL, r0 + adc r_resH, r1 + adc r_resHL, r_clr + mul r_arg1HL, r_arg2L + add r_resL, r0 + adc r_resH, r1 + adc r_resHL, r_clr + mul r_arg1L, r_arg2HH + add r_resH, r0 + adc r_resHL, r1 + adc r_resHH, r_clr + mul r_arg1H, r_arg2HL + add r_resH, r0 + adc r_resHL, r1 + adc r_resHH, r_clr + mul r_arg1HL, r_arg2H + add r_resH, r0 + adc r_resHL, r1 + adc r_resHH, r_clr + mul r_arg1HH, r_arg2L + add r_resH, r0 + adc r_resHL, r1 + adc r_resHH, r_clr + mul r_arg1H, r_arg2HH + add r_resHL, r0 + adc r_resHH, r1 + mul r_arg1HL, r_arg2HL + add r_resHL, r0 + adc r_resHH, r1 + mul r_arg1HH, r_arg2H + add r_resHL, r0 + adc r_resHH, r1 + mul r_arg1HL, r_arg2HH + add r_resHH, r0 + mul r_arg1HH, r_arg2HL + add r_resHH, r0 + clr __zero_reg__ + ret +ENDF __mulusa3 +#endif /* L_mulusa3 */ + +#undef r_arg1L +#undef r_arg1H +#undef r_arg1HL +#undef r_arg1HH + +#undef r_arg2L +#undef r_arg2H +#undef r_arg2HL +#undef r_arg2HH + +#undef r_resL +#undef r_resH +#undef r_resHL +#undef r_resHH +#undef r_clr + +#else /* defined (__AVR_HAVE_MUL__) */ + +#define r_arg1L 18 /* multiplier Low */ +#define r_arg1H 19 +#define r_arg1HL 20 +#define r_arg1HH 21 /* multiplier High */ + +;; These registers needed for SBIW */ +#define r_arg2L 24 /* multiplicand Low */ +#define r_arg2H 25 +#define r_arg2HL 26 +#define r_arg2HH 27 /* multiplicand High */ + 
+#define r_resL 14 /* result Low */ +#define r_resH 15 +#define r_resHL 16 +#define r_resHH 17 /* result High */ + +#define r_scratchL 0 /* scratch Low */ +#define r_scratchH 1 +#define r_scratchHL 22 +#define r_scratchHH 23 /* scratch High */ + +#if defined (L_mulsa3) +DEFUN __mulsa3 + mov r_resL, r_arg1HH + eor r_resL, r_arg2HH + bst r_resL, 7 + sbrs r_arg1HH, 7 + rjmp 1f + neg4 r_arg1L +1: + sbrs r_arg2HH, 7 + rjmp 2f + neg4 r_arg2L +2: + XCALL __mulusa3 + brtc __mulsa3_exit + neg4 r_resL +__mulsa3_exit: + ret +ENDF __mulsa3 +#endif /* defined (L_mulsa3) */ + +#if defined (L_mulusa3) +DEFUN __mulusa3 + clr r_resL ; clear result + clr r_resH + wmov r_resHL, r_resL + wmov r_scratchL, r_arg1L ; save multiplicand + wmov r_scratchHL, r_arg1HL +__mulusa3_loop1: + sbrs r_arg2HL,0 + rjmp 0f + add r_resL,r_arg1L ; result + multiplicand + adc r_resH,r_arg1H + adc r_resHL,r_arg1HL + adc r_resHH,r_arg1HH +0: + lsl r_arg1L ; shift multiplicand + rol r_arg1H + rol r_arg1HL + rol r_arg1HH + lsr r_arg2HH + ror r_arg2HL + sbiw r_arg2HL,0 + brne __mulusa3_loop1 ; exit multiplier = 0 + wmov r_arg1L, r_scratchL ; restore multiplicand + wmov r_arg1HL, r_scratchHL +__mulusa3_loop2: + lsr r_arg1HH ; shift multiplicand + ror r_arg1HL + ror r_arg1H + ror r_arg1L + sbrs r_arg2H,7 + rjmp 1f + add r_resL,r_arg1L ; result + multiplicand + adc r_resH,r_arg1H + adc r_resHL,r_arg1HL + adc r_resHH,r_arg1HH +1: + lsl r_arg2L + rol r_arg2H + sbiw r_arg2L,0 + brne __mulusa3_loop2 ; exit if multiplier = 0 + clr __zero_reg__ ; got clobbered + ret +ENDF __mulusa3 +#endif /* defined (L_mulusa3) */ + +#undef r_scratchL +#undef r_scratchH +#undef r_scratchHL +#undef r_scratchHH + +#undef r_arg1L +#undef r_arg1H +#undef r_arg1HL +#undef r_arg1HH + +#undef r_arg2L +#undef r_arg2H +#undef r_arg2HL +#undef r_arg2HH + +#undef r_resL +#undef r_resH +#undef r_resHL +#undef r_resHH + +#endif /* defined (__AVR_HAVE_MUL__) */ + +/******************************************************* + Fractional Division 8 / 8 
+*******************************************************/ + +#define r_divd r25 /* dividend */ +#define r_quo r24 /* quotient */ +#define r_div r22 /* divisor */ + +#if defined (L_divqq3) +DEFUN __divqq3 + mov r0, r_divd + eor r0, r_div + sbrc r_div, 7 + neg r_div + sbrc r_divd, 7 + neg r_divd + cp r_divd, r_div + breq __divqq3_minus1 ; if equal return -1 + XCALL __udivuqq3 + lsr r_quo + sbrc r0, 7 ; negate result if needed + neg r_quo + ret +__divqq3_minus1: + ldi r_quo, 0x80 + ret +ENDF __divqq3 +#endif /* defined (L_divqq3) */ + +#if defined (L_udivuqq3) +DEFUN __udivuqq3 + clr r_quo ; clear quotient + inc __zero_reg__ ; init loop counter, used per shift +__udivuqq3_loop: + lsl r_divd ; shift dividend + brcs 0f ; dividend overflow + cp r_divd,r_div ; compare dividend & divisor + brcc 0f ; dividend >= divisor + rol r_quo ; shift quotient (with CARRY) + rjmp __udivuqq3_cont +0: + sub r_divd,r_div ; restore dividend + lsl r_quo ; shift quotient (without CARRY) +__udivuqq3_cont: + lsl __zero_reg__ ; shift loop-counter bit + brne __udivuqq3_loop + com r_quo ; complement result + ; because C flag was complemented in loop + ret +ENDF __udivuqq3 +#endif /* defined (L_udivuqq3) */ + +#undef r_divd +#undef r_quo +#undef r_div + + +/******************************************************* + Fractional Division 16 / 16 +*******************************************************/ +#define r_divdL 26 /* dividend Low */ +#define r_divdH 27 /* dividend High */ +#define r_quoL 24 /* quotient Low */ +#define r_quoH 25 /* quotient High */ +#define r_divL 22 /* divisor Low */ +#define r_divH 23 /* divisor High */ +#define r_cnt 21 + +#if defined (L_divhq3) +DEFUN __divhq3 + mov r0, r_divdH + eor r0, r_divH + sbrs r_divH, 7 + rjmp 1f + neg2 r_divL +1: + sbrs r_divdH, 7 + rjmp 2f + neg2 r_divdL +2: + cp r_divdL, r_divL + cpc r_divdH, r_divH + breq __divhq3_minus1 ; if equal return -1 + XCALL __udivuhq3 + lsr r_quoH + ror r_quoL + brpl 9f + ;; negate result if needed + neg2 r_quoL +9: + ret
+__divhq3_minus1: + ldi r_quoH, 0x80 + clr r_quoL + ret +ENDF __divhq3 +#endif /* defined (L_divhq3) */ + +#if defined (L_udivuhq3) +DEFUN __udivuhq3 + sub r_quoH,r_quoH ; clear quotient and carry + ;; FALLTHRU +ENDF __udivuhq3 + +DEFUN __udivuha3_common + clr r_quoL ; clear quotient + ldi r_cnt,16 ; init loop counter +__udivuhq3_loop: + rol r_divdL ; shift dividend (with CARRY) + rol r_divdH + brcs __udivuhq3_ep ; dividend overflow + cp r_divdL,r_divL ; compare dividend & divisor + cpc r_divdH,r_divH + brcc __udivuhq3_ep ; dividend >= divisor + rol r_quoL ; shift quotient (with CARRY) + rjmp __udivuhq3_cont +__udivuhq3_ep: + sub r_divdL,r_divL ; restore dividend + sbc r_divdH,r_divH + lsl r_quoL ; shift quotient (without CARRY) +__udivuhq3_cont: + rol r_quoH ; shift quotient + dec r_cnt ; decrement loop counter + brne __udivuhq3_loop + com r_quoL ; complement result + com r_quoH ; because C flag was complemented in loop + ret +ENDF __udivuha3_common +#endif /* defined (L_udivuhq3) */ + +/******************************************************* + Fixed Division 8.8 / 8.8 +*******************************************************/ +#if defined (L_divha3) +DEFUN __divha3 + mov r0, r_divdH + eor r0, r_divH + sbrs r_divH, 7 + rjmp 1f + neg2 r_divL +1: + sbrs r_divdH, 7 + rjmp 2f + neg2 r_divdL +2: + XCALL __udivuha3 + sbrs r0, 7 ; negate result if needed + ret + neg2 r_quoL + ret +ENDF __divha3 +#endif /* defined (L_divha3) */ + +#if defined (L_udivuha3) +DEFUN __udivuha3 + mov r_quoH, r_divdL + mov r_divdL, r_divdH + clr r_divdH + lsl r_quoH ; shift quotient into carry + XJMP __udivuha3_common ; same as fractional after rearrange +ENDF __udivuha3 +#endif /* defined (L_udivuha3) */ + +#undef r_divdL +#undef r_divdH +#undef r_quoL +#undef r_quoH +#undef r_divL +#undef r_divH +#undef r_cnt + +/******************************************************* + Fixed Division 16.16 / 16.16 +*******************************************************/ + +#define r_arg1L 24 /* arg1 gets 
passed already in place */ +#define r_arg1H 25 +#define r_arg1HL 26 +#define r_arg1HH 27 +#define r_divdL 26 /* dividend Low */ +#define r_divdH 27 +#define r_divdHL 30 +#define r_divdHH 31 /* dividend High */ +#define r_quoL 22 /* quotient Low */ +#define r_quoH 23 +#define r_quoHL 24 +#define r_quoHH 25 /* quotient High */ +#define r_divL 18 /* divisor Low */ +#define r_divH 19 +#define r_divHL 20 +#define r_divHH 21 /* divisor High */ +#define r_cnt __zero_reg__ /* loop count (0 after the loop!) */ + +#if defined (L_divsa3) +DEFUN __divsa3 + mov r0, r_arg1HH + eor r0, r_divHH + sbrs r_divHH, 7 + rjmp 1f + neg4 r_divL +1: + sbrs r_arg1HH, 7 + rjmp 2f + neg4 r_arg1L +2: + XCALL __udivusa3 + sbrs r0, 7 ; negate result if needed + ret + neg4 r_quoL + ret +ENDF __divsa3 +#endif /* defined (L_divsa3) */ + +#if defined (L_udivusa3) +DEFUN __udivusa3 + ldi r_divdHL, 32 ; init loop counter + mov r_cnt, r_divdHL + clr r_divdHL + clr r_divdHH + wmov r_quoL, r_divdHL + lsl r_quoHL ; shift quotient into carry + rol r_quoHH +__udivusa3_loop: + rol r_divdL ; shift dividend (with CARRY) + rol r_divdH + rol r_divdHL + rol r_divdHH + brcs __udivusa3_ep ; dividend overflow + cp r_divdL,r_divL ; compare dividend & divisor + cpc r_divdH,r_divH + cpc r_divdHL,r_divHL + cpc r_divdHH,r_divHH + brcc __udivusa3_ep ; dividend >= divisor + rol r_quoL ; shift quotient (with CARRY) + rjmp __udivusa3_cont +__udivusa3_ep: + sub r_divdL,r_divL ; restore dividend + sbc r_divdH,r_divH + sbc r_divdHL,r_divHL + sbc r_divdHH,r_divHH + lsl r_quoL ; shift quotient (without CARRY) +__udivusa3_cont: + rol r_quoH ; shift quotient + rol r_quoHL + rol r_quoHH + dec r_cnt ; decrement loop counter + brne __udivusa3_loop + com r_quoL ; complement result + com r_quoH ; because C flag was complemented in loop + com r_quoHL + com r_quoHH + ret +ENDF __udivusa3 +#endif /* defined (L_udivusa3) */ + +#undef r_divdL +#undef r_divdH +#undef r_divdHL +#undef r_divdHH +#undef r_quoL +#undef r_quoH +#undef r_quoHL 
+#undef r_quoHH +#undef r_divL +#undef r_divH +#undef r_divHL +#undef r_divHH +#undef r_cnt Index: libgcc/config/avr/lib1funcs.S =================================================================== --- libgcc/config/avr/lib1funcs.S (revision 190299) +++ libgcc/config/avr/lib1funcs.S (working copy) @@ -91,6 +91,32 @@ see the files COPYING3 and COPYING.RUNTI .endfunc .endm +;; Negate a 2-byte value held in consecutive registers +.macro neg2 reg + com \reg+1 + neg \reg + sbci \reg+1, -1 +.endm + +;; Negate a 4-byte value held in consecutive registers +.macro neg4 reg + com \reg+3 + com \reg+2 + com \reg+1 +.if \reg >= 16 + neg \reg + sbci \reg+1, -1 + sbci \reg+2, -1 + sbci \reg+3, -1 +.else + com \reg + adc \reg, __zero_reg__ + adc \reg+1, __zero_reg__ + adc \reg+2, __zero_reg__ + adc \reg+3, __zero_reg__ +.endif +.endm + \f .section .text.libgcc.mul, "ax", @progbits @@ -1247,6 +1273,22 @@ __divmodsi4_exit: ENDF __divmodsi4 #endif /* defined (L_divmodsi4) */ +#undef r_remHH +#undef r_remHL +#undef r_remH +#undef r_remL + +#undef r_arg1HH +#undef r_arg1HL +#undef r_arg1H +#undef r_arg1L + +#undef r_arg2HH +#undef r_arg2HL +#undef r_arg2H +#undef r_arg2L + +#undef r_cnt /******************************************************* Division 64 / 64 @@ -2794,3 +2836,5 @@ ENDF __fmul #undef B1 #undef C0 #undef C1 + +#include "lib1funcs-fixed.S" Index: libgcc/config/avr/t-avr =================================================================== --- libgcc/config/avr/t-avr (revision 190299) +++ libgcc/config/avr/t-avr (working copy) @@ -55,6 +55,24 @@ LIB1ASMFUNCS = \ _cmpdi2 _cmpdi2_s8 \ _fmul _fmuls _fmulsu +# Fixed point routines in avr/lib1funcs-fixed.S +LIB1ASMFUNCS += \ + _fractqqsf _fractuqqsf \ + _fracthqsf _fractuhqsf \ + _fracthasf _fractuhasf \ + _fractsasf _fractusasf \ + _fractsfqq _fractsfuqq \ + _fractsfhq _fractsfuhq \ + _fractsfha _fractsfsa \ + _mulqq3 _muluqq3 \ + _mulhq3 _muluhq3 \ + _mulha3 _muluha3 \ + _mulsa3 _mulusa3 \ + _divqq3 _udivuqq3 \ + _divhq3 
_udivuhq3 \ + _divha3 _udivuha3 \ + _divsa3 _udivusa3 + LIB2FUNCS_EXCLUDE = \ _moddi3 _umoddi3 \ _clz ^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: [Patch,AVR] PR54222: Add fixed point support 2012-08-10 15:53 [Patch,AVR] PR54222: Add fixed point support Georg-Johann Lay @ 2012-08-10 16:09 ` Weddington, Eric 2012-08-10 17:06 ` Georg-Johann Lay 0 siblings, 1 reply; 12+ messages in thread From: Weddington, Eric @ 2012-08-10 16:09 UTC (permalink / raw) To: Georg-Johann Lay, gcc-patches Cc: Denis Chertykov, Sean D'Epagnier, Joerg Wunsch > -----Original Message----- > From: Georg-Johann Lay > Sent: Friday, August 10, 2012 9:52 AM > To: gcc-patches@gcc.gnu.org > Cc: Denis Chertykov; Weddington, Eric; Sean D'Epagnier; Joerg Wunsch > Subject: [Patch,AVR] PR54222: Add fixed point support > > This patch adds fixed point support to the avr target. Hi Johann, Thanks for doing this work to get Sean's patch into GCC! :-) > It's based on the work of Sean, see > > http://lists.gnu.org/archive/html/avr-gcc-list/2012-07/msg00030.html > > This patch has several changes compared to Sean's patch: > <snip> > * The patch neither implements NEG nor ABS. The standard requires > saturation, e.g. -0x80 must not become 0x80 again. This is work > still to be done, but just a matter of optimization. Can I assume that you have plans to do this in the near-ish future? <snip> > The patch works out fine. However, because of PR53923 which > shreds the AVR port, currently no reasonable testing is possible. > > Work to be done is better testing after PR53923 is fixed and the > AVR port works properly again. Is there a reasonable time frame as to when PR53923 will be fixed? My only real concern is that this is a major feature addition and the AVR port is currently broken. Thanks, Eric Weddington ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Patch,AVR] PR54222: Add fixed point support 2012-08-10 16:09 ` Weddington, Eric @ 2012-08-10 17:06 ` Georg-Johann Lay 2012-08-10 17:56 ` Weddington, Eric 0 siblings, 1 reply; 12+ messages in thread From: Georg-Johann Lay @ 2012-08-10 17:06 UTC (permalink / raw) To: Weddington, Eric; +Cc: gcc-patches, Denis Chertykov Weddington, Eric wrote: >> From: Georg-Johann Lay >> The patch works out fine. However, because of PR53923 which >> shreds the AVR port, currently no reasonable testing is possible. >> >> Work to be done is better testing after PR53923 is fixed and the >> AVR port works properly again. > > Is there a reasonable time frame as to when PR53923 will be fixed? The first step would be to bisect and find the patch that led to PR53923. It was not a change in the avr BE, so the question goes to the authors of the respective patch. Up to now I didn't even try to bisect; that would take years on the host that I have available... > My only real concern is that this is a major feature addition and > the AVR port is currently broken. I don't know if it's the avr port or some parts of the middle end that don't cooperate with avr. Johann ^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: [Patch,AVR] PR54222: Add fixed point support 2012-08-10 17:06 ` Georg-Johann Lay @ 2012-08-10 17:56 ` Weddington, Eric 2012-08-10 21:34 ` Georg-Johann Lay 0 siblings, 1 reply; 12+ messages in thread From: Weddington, Eric @ 2012-08-10 17:56 UTC (permalink / raw) To: Georg-Johann Lay; +Cc: gcc-patches, Denis Chertykov > -----Original Message----- > From: Georg-Johann Lay > Sent: Friday, August 10, 2012 11:06 AM > To: Weddington, Eric > Cc: gcc-patches@gcc.gnu.org; Denis Chertykov > Subject: Re: [Patch,AVR] PR54222: Add fixed point support > > The first step would be to bisect and find the patch that lead to > PR53923. It was not a change in the avr BE, so the question > goes to the authors of the respective patch. > > Up to now I didn't even try to bisect; that would take years on > the host that I have available... > > > My only real concern is that this is a major feature addition and > > the AVR port is currently broken. > > I don't know if it's the avr port or some parts of the middle end > that don't cooperate with avr. > I would really, really love to see fixed point support added in, especially since I know that Sean has worked on it for quite a while, and you've also done a lot of work in getting the patches in shape to get them committed. But, if the AVR port is currently broken (by whomever, and whatever patch) and a major feature like this can't be tested to make sure it doesn't break anything else in the AVR backend, then I'm hesitant to approve (even though I really want to approve). I'll defer to Denis on this one. Or, until the AVR port is fixed and can be tested properly. Oh, an afterthought: Can this patch work on a snapshot *before* the breakage from PR53923? I see that that bug report was only from a month ago, and doing that might give us some indication that the patch won't break anything else on the AVR port. Eric Weddington ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Patch,AVR] PR54222: Add fixed point support 2012-08-10 17:56 ` Weddington, Eric @ 2012-08-10 21:34 ` Georg-Johann Lay 2012-08-12 9:13 ` Denis Chertykov 0 siblings, 1 reply; 12+ messages in thread From: Georg-Johann Lay @ 2012-08-10 21:34 UTC (permalink / raw) To: Weddington, Eric; +Cc: gcc-patches, Denis Chertykov Weddington, Eric schrieb: >> From: Georg-Johann Lay >> >> The first step would be to bisect and find the patch that lead to >> PR53923. It was not a change in the avr BE, so the question goes >> to the authors of the respective patch. >> >> Up to now I didn't even try to bisect; that would take years on the >> host that I have available... >> >>> My only real concern is that this is a major feature addition and >>> the AVR port is currently broken. >> >> I don't know if it's the avr port or some parts of the middle end >> that don't cooperate with avr. > > I would really, really love to see fixed point support added in, > especially since I know that Sean has worked on it for quite a while, > and you've also done a lot of work in getting the patches in shape to > get them committed. > > But, if the AVR port is currently broken (by whomever, and whatever > patch) and a major feature like this can't be tested to make sure it > doesn't break anything else in the AVR backend, then I'm hesitant to > approve (even though I really want to approve). I don't understand enough of DF to fix PR53923. The insn that leads to the ICE is (in df-problems.c:dead_debug_insert_temp):

(insn 328 886 866 37 (set (reg:SF 16 r16)
        (unspec:SF [
                (reg:SF 16 r16)
                (reg/v:SF 4 r4 [orig:162 b ] [162])
            ] UNSPEC_COPYSIGN)) libgcc2-mulsc3.c:1307 322 {copysignsf3}
     (expr_list:REG_DEAD (reg/v:SF 4 r4 [orig:162 b ] [162])
        (expr_list:REG_DEAD (reg/v:SF 4 r4 [orig:162 b ] [162])
            (expr_list:REG_EQUAL (unspec:SF [
                        (const_double:SF 0 [0] 0.0 [0x0.0p+0])
                        (reg/v:SF 4 r4 [orig:162 b ] [162])
                    ] UNSPEC_COPYSIGN)
                (nil)))))

The only odd thing is that REG_DEAD r4 is there twice.
Dunno if that is ok or confuses DF. > I'll defer to Denis on this one. Or, until the AVR port is fixed and > can be tested properly. I already asked at gcc-help for that issue, but no response :-( http://gcc.gnu.org/ml/gcc-help/2012-08/msg00014.html > Oh, an afterthought: Can this patch work on a snapshot *before* the > breakage from PR53923? Yes, of course. It's pure avr stuff. The patch would even work fine with 4.7, there is no reason why it should break, provided it works fine with trunk. > I see that that bug report was only from a month ago, > and doing that might give us some indication that the > patch won't break anything else on the AVR port. The avr port is still the same as 4.7, except PR53344 which just changes text output in avr_assembler_integer to use binutils PR13503 (byte relocs for __memx addresses). Johann ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Patch,AVR] PR54222: Add fixed point support 2012-08-10 21:34 ` Georg-Johann Lay @ 2012-08-12 9:13 ` Denis Chertykov 2012-08-13 9:28 ` Georg-Johann Lay 0 siblings, 1 reply; 12+ messages in thread From: Denis Chertykov @ 2012-08-12 9:13 UTC (permalink / raw) To: Georg-Johann Lay; +Cc: Weddington, Eric, gcc-patches 2012/8/11 Georg-Johann Lay <avr@gjlay.de>: > Weddington, Eric schrieb: >>> >>> From: Georg-Johann Lay >>> >>> >>> The first step would be to bisect and find the patch that lead to >>> PR53923. It was not a change in the avr BE, so the question goes >>> to the authors of the respective patch. >>> >>> Up to now I didn't even try to bisect; that would take years on the >>> host that I have available... >>> >>>> My only real concern is that this is a major feature addition and >>>> the AVR port is currently broken. >>> >>> >>> I don't know if it's the avr port or some parts of the middle end that >>> don't cooperate with avr. >> >> >> I would really, really love to see fixed point support added in, >> especially since I know that Sean has worked on it for quite a while, >> and you've also done a lot of work in getting the patches in shape to >> get them committed. >> >> But, if the AVR port is currently broken (by whomever, and whatever >> patch) and a major feature like this can't be tested to make sure it >> doesn't break anything else in the AVR backend, then I'm hesitant to >> approve (even though I really want to approve). > > > I don't understand enough of DF to fix PR53923. The insn that leads > to the ICE is (in df-problems.c:dead_debug_insert_temp): > Today I have updated GCC svn tree and successfully compiled avr-gcc. The libgcc2-mulsc3.c from PR53923 also compiled without bugs. Denis. PS: May be I'm doing something wrong ? (I had too long vacations) ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Patch,AVR] PR54222: Add fixed point support 2012-08-12 9:13 ` Denis Chertykov @ 2012-08-13 9:28 ` Georg-Johann Lay 2012-08-21 16:10 ` Denis Chertykov 0 siblings, 1 reply; 12+ messages in thread From: Georg-Johann Lay @ 2012-08-13 9:28 UTC (permalink / raw) To: Denis Chertykov; +Cc: Weddington, Eric, gcc-patches Denis Chertykov wrote: > 2012/8/11 Georg-Johann Lay <avr@gjlay.de>: >> Weddington, Eric schrieb: >>>> From: Georg-Johann Lay >>>> >>>> >>>> The first step would be to bisect and find the patch that lead to >>>> PR53923. It was not a change in the avr BE, so the question goes >>>> to the authors of the respective patch. >>>> >>>> Up to now I didn't even try to bisect; that would take years on the >>>> host that I have available... >>>> >>>>> My only real concern is that this is a major feature addition and >>>>> the AVR port is currently broken. >>>> >>>> I don't know if it's the avr port or some parts of the middle end that >>>> don't cooperate with avr. >>> >>> I would really, really love to see fixed point support added in, >>> especially since I know that Sean has worked on it for quite a while, >>> and you've also done a lot of work in getting the patches in shape to >>> get them committed. >>> >>> But, if the AVR port is currently broken (by whomever, and whatever >>> patch) and a major feature like this can't be tested to make sure it >>> doesn't break anything else in the AVR backend, then I'm hesitant to >>> approve (even though I really want to approve). >> >> I don't understand enough of DF to fix PR53923. The insn that leads >> to the ICE is (in df-problems.c:dead_debug_insert_temp): >> > > Today I have updated GCC svn tree and successfully compiled avr-gcc. > The libgcc2-mulsc3.c from PR53923 also compiled without bugs. > > Denis. > > PS: May be I'm doing something wrong ? 
(I had too long vacations) I am configuring with --target=avr --disable-nls --with-dwarf2 --enable-languages=c,c++ --enable-target-optspace=yes --enable-checking=yes,rtl Build GCC is "gcc version 4.3.2". Build and host are i686-pc-linux-gnu. Maybe it's different on a 64-bit computer, but I only have a 32-bit host. Johann ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Patch,AVR] PR54222: Add fixed point support 2012-08-13 9:28 ` Georg-Johann Lay @ 2012-08-21 16:10 ` Denis Chertykov 2012-08-23 14:50 ` Georg-Johann Lay 0 siblings, 1 reply; 12+ messages in thread From: Denis Chertykov @ 2012-08-21 16:10 UTC (permalink / raw) To: Georg-Johann Lay; +Cc: Weddington, Eric, gcc-patches 2012/8/13 Georg-Johann Lay <avr@gjlay.de>: > Denis Chertykov wrote: >> 2012/8/11 Georg-Johann Lay <avr@gjlay.de>: >>> Weddington, Eric schrieb: >>>>> From: Georg-Johann Lay >>>>> >>>>> >>>>> The first step would be to bisect and find the patch that lead to >>>>> PR53923. It was not a change in the avr BE, so the question goes >>>>> to the authors of the respective patch. >>>>> >>>>> Up to now I didn't even try to bisect; that would take years on the >>>>> host that I have available... >>>>> >>>>>> My only real concern is that this is a major feature addition and >>>>>> the AVR port is currently broken. >>>>> >>>>> I don't know if it's the avr port or some parts of the middle end that >>>>> don't cooperate with avr. >>>> >>>> I would really, really love to see fixed point support added in, >>>> especially since I know that Sean has worked on it for quite a while, >>>> and you've also done a lot of work in getting the patches in shape to >>>> get them committed. >>>> >>>> But, if the AVR port is currently broken (by whomever, and whatever >>>> patch) and a major feature like this can't be tested to make sure it >>>> doesn't break anything else in the AVR backend, then I'm hesitant to >>>> approve (even though I really want to approve). >>> >>> I don't understand enough of DF to fix PR53923. The insn that leads >>> to the ICE is (in df-problems.c:dead_debug_insert_temp): >>> >> >> Today I have updated GCC svn tree and successfully compiled avr-gcc. >> The libgcc2-mulsc3.c from also compiled without bugs. >> >> Denis. >> >> PS: May be I'm doing something wrong ? 
(I had too long vacations) > > I am configuring with --target=avr --disable-nls --with-dwarf2 > --enable-languages=c,c++ --enable-target-optspace=yes --enable-checking=yes,rtl > > Build GCC is "gcc version 4.3.2". > Build and host are i686-pc-linux-gnu. > > Maybe it's different on a 64-bit computer, but I only have 32-bit host. > I have been debugging PR53923 and in my opinion it's not an AVR port bug. Please commit fixed point support. Denis. PS: sorry for the delay ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Patch,AVR] PR54222: Add fixed point support 2012-08-21 16:10 ` Denis Chertykov @ 2012-08-23 14:50 ` Georg-Johann Lay 2012-08-23 16:42 ` Weddington, Eric 2012-08-24 11:58 ` Denis Chertykov 0 siblings, 2 replies; 12+ messages in thread From: Georg-Johann Lay @ 2012-08-23 14:50 UTC (permalink / raw) To: Denis Chertykov; +Cc: Weddington, Eric, gcc-patches [-- Attachment #1: Type: text/plain, Size: 7067 bytes --] Denis Chertykov wrote: > 2012/8/13 Georg-Johann Lay: >> Denis Chertykov wrote: >>> 2012/8/11 Georg-Johann Lay: >>>> Weddington, Eric schrieb: >>>>>> From: Georg-Johann Lay >>>>>> >>>>>> >>>>>> The first step would be to bisect and find the patch that lead to >>>>>> PR53923. It was not a change in the avr BE, so the question goes >>>>>> to the authors of the respective patch. >>>>>> >>>>>> Up to now I didn't even try to bisect; that would take years on the >>>>>> host that I have available... >>>>>> >>>>>>> My only real concern is that this is a major feature addition and >>>>>>> the AVR port is currently broken. >>>>>> I don't know if it's the avr port or some parts of the middle end that >>>>>> don't cooperate with avr. >>>>> I would really, really love to see fixed point support added in, >>>>> especially since I know that Sean has worked on it for quite a while, >>>>> and you've also done a lot of work in getting the patches in shape to >>>>> get them committed. >>>>> >>>>> But, if the AVR port is currently broken (by whomever, and whatever >>>>> patch) and a major feature like this can't be tested to make sure it >>>>> doesn't break anything else in the AVR backend, then I'm hesitant to >>>>> approve (even though I really want to approve). >>>> I don't understand enough of DF to fix PR53923. The insn that leads >>>> to the ICE is (in df-problems.c:dead_debug_insert_temp): >>>> >>> Today I have updated GCC svn tree and successfully compiled avr-gcc. >>> The libgcc2-mulsc3.c from also compiled without bugs. >>> >>> Denis. 
>>> >>> PS: May be I'm doing something wrong ? (I had too long vacations) >> I am configuring with --target=avr --disable-nls --with-dwarf2 >> --enable-languages=c,c++ --enable-target-optspace=yes --enable-checking=yes,rtl >> >> Build GCC is "gcc version 4.3.2". >> Build and host are i686-pc-linux-gnu. >> >> Maybe it's different on a 64-bit computer, but I only have 32-bit host. >> > > I have debugging PR53923 and on my opinion it's not an AVR port bug. > Please commit fixed point support. > > Denis. Hi, here is an updated patch. Some functions are reworked and there is some code cleanup. The test results look good; there are no additional regressions. The new test cases in gcc.dg/fixed-point pass except some convert-*.c for two reasons: * Some test cases have a loss of precision and therefore fail. One failure is that 0x3fffffffc0000000 is compared against 0x4000000000000000 and thus fails. Presumably it's a rounding error from float. I'd say this is not critical. * PR54330: This leads to wrong code for __satfractudadq and the wrong code is already present in .expand. From a distance this looks like a middle-end or tree-ssa problem. The new patch implements TARGET_BUILD_BUILTIN_VA_LIST. The rationale is that avr-fixed.md adjusts some modes but these changes are not reflected by the built-in macros made by gcc. This leads to wrong code in libgcc because it deduces the type layout from these built-in defines. Thus, the respective nodes must be patched *before* built-in macros are emitted. The changes to LIB2FUNCS_EXCLUDE currently have no effect; this needs http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01580.html which is currently under review. Ok to install? Johann libgcc/ PR target/54222 * config/avr/lib1funcs-fixed.S: New file. * config/avr/lib1funcs.S: Include it. Undefine some divmodsi after they are used. (neg2, neg4): New macros. (__mulqihi3,__umulqihi3,__mulhi3): Rewrite non-MUL variants. (__mulhisi3,__umulhisi3,__mulsi3): Rewrite non-MUL variants.
(__umulhisi3): Speed up MUL variant if there is enough flash. * config/avr/avr-lib.h (TA, UTA): Adjust according to gcc's avr-modes.def. * config/avr/t-avr (LIB1ASMFUNCS): Add: _fractqqsf, _fractuqqsf, _fracthqsf, _fractuhqsf, _fracthasf, _fractuhasf, _fractsasf, _fractusasf, _fractsfqq, _fractsfuqq, _fractsfhq, _fractsfuhq, _fractsfha, _fractsfsa, _mulqq3, _muluqq3, _mulhq3, _muluhq3, _mulha3, _muluha3, _mulsa3, _mulusa3, _divqq3, _udivuqq3, _divhq3, _udivuhq3, _divha3, _udivuha3, _divsa3, _udivusa3. (LIB2FUNCS_EXCLUDE): Add supported functions. gcc/ PR target/54222 * avr-modes.def (HA, SA, DA, TA, UTA): Adjust modes. * avr/avr-fixed.md: New file. * avr/avr.md: Include it. (cc): Add: minus. (adjust_len): Add: minus, minus64, ufract, sfract. (ALL1, ALL2, ALL4, ORDERED234): New mode iterators. (MOVMODE): Add: QQ, UQQ, HQ, UHQ, HA, UHA, SQ, USQ, SA, USA. (MPUSH): Add: HQ, UHQ, HA, UHA, SQ, USQ, SA, USA. (pushqi1, xload8_A, xload_8, movqi_insn, *reload_inqi, addqi3, subqi3, ashlqi3, *ashlqi3, ashrqi3, lshrqi3, *lshrqi3, *cmpqi, cbranchqi4, *cpse.eq): Generalize to handle all 8-bit modes in ALL1. (*movhi, reload_inhi, addhi3, *addhi3, addhi3_clobber, subhi3, ashlhi3, *ashlhi3_const, ashrhi3, *ashirhi3_const, lshrhi3, *lshrhi3_const, *cmphi, cbranchhi4): Generalize to handle all 16-bit modes in ALL2. (subhi3, casesi, strlenhi): Add clobber when expanding minus:HI. (*movsi, *reload_insi, addsi3, subsi3, ashlsi3, *ashlsi3_const, ashrsi3, *ashrhi3_const, *ashrsi3_const, lshrsi3, *lshrsi3_const, *reversed_tstsi, *cmpsi, cbranchsi4): Generalize to handle all 32-bit modes in ALL4. * avr-dimode.md (ALL8): New mode iterator. (adddi3, adddi3_insn, adddi3_const_insn, subdi3, subdi3_insn, subdi3_const_insn, cbranchdi4, compare_di2, compare_const_di2, ashrdi3, lshrdi3, rotldi3, ashldi3_insn, ashrdi3_insn, lshrdi3_insn, rotldi3_insn): Generalize to handle all 64-bit modes in ALL8. * config/avr/avr-protos.h (avr_to_int_mode): New prototype. 
(avr_out_fract, avr_out_minus, avr_out_minus64): New prototypes. * config/avr/avr.c (TARGET_FIXED_POINT_SUPPORTED_P): Define to... (avr_fixed_point_supported_p): ...this new static function. (TARGET_BUILD_BUILTIN_VA_LIST): Define to... (avr_build_builtin_va_list): ...this new static function. (avr_adjust_type_node): New static function. (avr_scalar_mode_supported_p): Allow if ALL_FIXED_POINT_MODE_P. (avr_builtin_setjmp_frame_value): Use gen_subhi3 and return new pseudo instead of gen_rtx_MINUS. (avr_print_operand, avr_operand_rtx_cost): Handle: CONST_FIXED. (notice_update_cc): Handle: CC_MINUS. (output_movqi): Generalize to handle respective fixed-point modes. (output_movhi, output_movsisf, avr_2word_insn_p): Ditto. (avr_out_compare, avr_out_plus_1): Also handle fixed-point modes. (avr_assemble_integer): Ditto. (output_reload_in_const, output_reload_insisf): Ditto. (avr_compare_pattern): Skip all modes > 4 bytes. (avr_2word_insn_p): Skip movuqq_insn, movqq_insn. (avr_out_fract, avr_out_minus, avr_out_minus64): New functions. (avr_to_int_mode): New function. (adjust_insn_length): Handle: ADJUST_LEN_SFRACT, ADJUST_LEN_UFRACT, ADJUST_LEN_MINUS, ADJUST_LEN_MINUS64. * config/avr/predicates.md (const0_operand): Allow const_fixed. (const_operand, const_or_immediate_operand): New. (nonmemory_or_const_operand): New. * config/avr/constraints.md (Ynn, Y00, Y01, Y02, Ym1, Ym2, YIJ): New constraints. * config/avr/avr.h (LONG_LONG_ACCUM_TYPE_SIZE): Define. [-- Attachment #2: fixed-48-7.diff --] [-- Type: text/x-patch, Size: 160227 bytes --] Index: gcc/config/avr/predicates.md =================================================================== --- gcc/config/avr/predicates.md (revision 190535) +++ gcc/config/avr/predicates.md (working copy) @@ -74,7 +74,7 @@ (define_predicate "nox_general_operand" ;; Return 1 if OP is the zero constant for MODE. 
(define_predicate "const0_operand" - (and (match_code "const_int,const_double") + (and (match_code "const_int,const_fixed,const_double") (match_test "op == CONST0_RTX (mode)"))) ;; Return 1 if OP is the one constant integer for MODE. @@ -248,3 +248,21 @@ (define_predicate "s16_operand" (define_predicate "o16_operand" (and (match_code "const_int") (match_test "IN_RANGE (INTVAL (op), -(1<<16), -1)"))) + +;; Const int, fixed, or double operand +(define_predicate "const_operand" + (ior (match_code "const_fixed") + (match_code "const_double") + (match_operand 0 "const_int_operand"))) + +;; Const int, const fixed, or const double operand +(define_predicate "nonmemory_or_const_operand" + (ior (match_code "const_fixed") + (match_code "const_double") + (match_operand 0 "nonmemory_operand"))) + +;; Immediate, const fixed, or const double operand +(define_predicate "const_or_immediate_operand" + (ior (match_code "const_fixed") + (match_code "const_double") + (match_operand 0 "immediate_operand"))) Index: gcc/config/avr/avr-fixed.md =================================================================== --- gcc/config/avr/avr-fixed.md (revision 0) +++ gcc/config/avr/avr-fixed.md (revision 0) @@ -0,0 +1,287 @@ +;; This file contains instructions that support fixed-point operations +;; for Atmel AVR micro controllers. +;; Copyright (C) 2012 +;; Free Software Foundation, Inc. +;; +;; Contributed by Sean D'Epagnier (sean@depagnier.com) +;; Georg-Johann Lay (avr@gjlay.de) + +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. +;; +;; GCC is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. 
+;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; <http://www.gnu.org/licenses/>. + +(define_mode_iterator ALL1Q [(QQ "") (UQQ "")]) +(define_mode_iterator ALL2Q [(HQ "") (UHQ "")]) +(define_mode_iterator ALL2A [(HA "") (UHA "")]) +(define_mode_iterator ALL2QA [(HQ "") (UHQ "") + (HA "") (UHA "")]) +(define_mode_iterator ALL4A [(SA "") (USA "")]) + +;;; Conversions + +(define_mode_iterator FIXED_A + [(QQ "") (UQQ "") + (HQ "") (UHQ "") (HA "") (UHA "") + (SQ "") (USQ "") (SA "") (USA "") + (DQ "") (UDQ "") (DA "") (UDA "") + (TA "") (UTA "") + (QI "") (HI "") (SI "") (DI "")]) + +;; Same so that be can build cross products + +(define_mode_iterator FIXED_B + [(QQ "") (UQQ "") + (HQ "") (UHQ "") (HA "") (UHA "") + (SQ "") (USQ "") (SA "") (USA "") + (DQ "") (UDQ "") (DA "") (UDA "") + (TA "") (UTA "") + (QI "") (HI "") (SI "") (DI "")]) + +(define_insn "fract<FIXED_B:mode><FIXED_A:mode>2" + [(set (match_operand:FIXED_A 0 "register_operand" "=r") + (fract_convert:FIXED_A + (match_operand:FIXED_B 1 "register_operand" "r")))] + "<FIXED_B:MODE>mode != <FIXED_A:MODE>mode" + { + return avr_out_fract (insn, operands, true, NULL); + } + [(set_attr "cc" "clobber") + (set_attr "adjust_len" "sfract")]) + +(define_insn "fractuns<FIXED_B:mode><FIXED_A:mode>2" + [(set (match_operand:FIXED_A 0 "register_operand" "=r") + (unsigned_fract_convert:FIXED_A + (match_operand:FIXED_B 1 "register_operand" "r")))] + "<FIXED_B:MODE>mode != <FIXED_A:MODE>mode" + { + return avr_out_fract (insn, operands, false, NULL); + } + [(set_attr "cc" "clobber") + (set_attr "adjust_len" "ufract")]) + +;****************************************************************************** +; mul + +;; "mulqq3" "muluqq3" +(define_expand "mul<mode>3" + [(parallel [(match_operand:ALL1Q 0 "register_operand" "") + (match_operand:ALL1Q 1 "register_operand" "") + (match_operand:ALL1Q 2 "register_operand" "")])] + "" + { + emit_insn 
(AVR_HAVE_MUL + ? gen_mul<mode>3_enh (operands[0], operands[1], operands[2]) + : gen_mul<mode>3_nomul (operands[0], operands[1], operands[2])); + DONE; + }) + +(define_insn "mulqq3_enh" + [(set (match_operand:QQ 0 "register_operand" "=r") + (mult:QQ (match_operand:QQ 1 "register_operand" "a") + (match_operand:QQ 2 "register_operand" "a")))] + "AVR_HAVE_MUL" + "fmuls %1,%2\;dec r1\;brvs 0f\;inc r1\;0:\;mov %0,r1\;clr __zero_reg__" + [(set_attr "length" "6") + (set_attr "cc" "clobber")]) + +(define_insn "muluqq3_enh" + [(set (match_operand:UQQ 0 "register_operand" "=r") + (mult:UQQ (match_operand:UQQ 1 "register_operand" "r") + (match_operand:UQQ 2 "register_operand" "r")))] + "AVR_HAVE_MUL" + "mul %1,%2\;mov %0,r1\;clr __zero_reg__" + [(set_attr "length" "3") + (set_attr "cc" "clobber")]) + +(define_expand "mulqq3_nomul" + [(set (reg:QQ 24) + (match_operand:QQ 1 "register_operand" "")) + (set (reg:QQ 25) + (match_operand:QQ 2 "register_operand" "")) + ;; "*mulqq3.call" + (parallel [(set (reg:QQ 23) + (mult:QQ (reg:QQ 24) + (reg:QQ 25))) + (clobber (reg:QI 22)) + (clobber (reg:HI 24))]) + (set (match_operand:QQ 0 "register_operand" "") + (reg:QQ 23))] + "!AVR_HAVE_MUL") + +(define_expand "muluqq3_nomul" + [(set (reg:UQQ 22) + (match_operand:UQQ 1 "register_operand" "")) + (set (reg:UQQ 24) + (match_operand:UQQ 2 "register_operand" "")) + ;; "*umulqihi3.call" + (parallel [(set (reg:HI 24) + (mult:HI (zero_extend:HI (reg:QI 22)) + (zero_extend:HI (reg:QI 24)))) + (clobber (reg:QI 21)) + (clobber (reg:HI 22))]) + (set (match_operand:UQQ 0 "register_operand" "") + (reg:UQQ 25))] + "!AVR_HAVE_MUL") + +(define_insn "*mulqq3.call" + [(set (reg:QQ 23) + (mult:QQ (reg:QQ 24) + (reg:QQ 25))) + (clobber (reg:QI 22)) + (clobber (reg:HI 24))] + "!AVR_HAVE_MUL" + "%~call __mulqq3" + [(set_attr "type" "xcall") + (set_attr "cc" "clobber")]) + + +;; "mulhq3" "muluhq3" +;; "mulha3" "muluha3" +(define_expand "mul<mode>3" + [(set (reg:ALL2QA 18) + (match_operand:ALL2QA 1 
"register_operand" "")) + (set (reg:ALL2QA 26) + (match_operand:ALL2QA 2 "register_operand" "")) + ;; "*mulhq3.call.enh" + (parallel [(set (reg:ALL2QA 24) + (mult:ALL2QA (reg:ALL2QA 18) + (reg:ALL2QA 26))) + (clobber (reg:HI 22))]) + (set (match_operand:ALL2QA 0 "register_operand" "") + (reg:ALL2QA 24))] + "AVR_HAVE_MUL") + +;; "*mulhq3.call" "*muluhq3.call" +;; "*mulha3.call" "*muluha3.call" +(define_insn "*mul<mode>3.call" + [(set (reg:ALL2QA 24) + (mult:ALL2QA (reg:ALL2QA 18) + (reg:ALL2QA 26))) + (clobber (reg:HI 22))] + "AVR_HAVE_MUL" + "%~call __mul<mode>3" + [(set_attr "type" "xcall") + (set_attr "cc" "clobber")]) + + +;; On the enhanced core, don't clobber either input and use a separate output + +;; "mulsa3" "mulusa3" +(define_expand "mul<mode>3" + [(set (reg:ALL4A 16) + (match_operand:ALL4A 1 "register_operand" "")) + (set (reg:ALL4A 20) + (match_operand:ALL4A 2 "register_operand" "")) + (set (reg:ALL4A 24) + (mult:ALL4A (reg:ALL4A 16) + (reg:ALL4A 20))) + (set (match_operand:ALL4A 0 "register_operand" "") + (reg:ALL4A 24))] + "AVR_HAVE_MUL") + +;; "*mulsa3.call" "*mulusa3.call" +(define_insn "*mul<mode>3.call" + [(set (reg:ALL4A 24) + (mult:ALL4A (reg:ALL4A 16) + (reg:ALL4A 20)))] + "AVR_HAVE_MUL" + "%~call __mul<mode>3" + [(set_attr "type" "xcall") + (set_attr "cc" "clobber")]) + +; / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / +; div + +(define_code_iterator usdiv [udiv div]) + +;; "divqq3" "udivuqq3" +(define_expand "<code><mode>3" + [(set (reg:ALL1Q 25) + (match_operand:ALL1Q 1 "register_operand" "")) + (set (reg:ALL1Q 22) + (match_operand:ALL1Q 2 "register_operand" "")) + (parallel [(set (reg:ALL1Q 24) + (usdiv:ALL1Q (reg:ALL1Q 25) + (reg:ALL1Q 22))) + (clobber (reg:QI 25))]) + (set (match_operand:ALL1Q 0 "register_operand" "") + (reg:ALL1Q 24))]) + +;; "*divqq3.call" "*udivuqq3.call" +(define_insn "*<code><mode>3.call" + [(set (reg:ALL1Q 24) + (usdiv:ALL1Q (reg:ALL1Q 25) + (reg:ALL1Q 22))) + (clobber (reg:QI 25))] + "" 
+ "%~call __<code><mode>3" + [(set_attr "type" "xcall") + (set_attr "cc" "clobber")]) + +;; "divhq3" "udivuhq3" +;; "divha3" "udivuha3" +(define_expand "<code><mode>3" + [(set (reg:ALL2QA 26) + (match_operand:ALL2QA 1 "register_operand" "")) + (set (reg:ALL2QA 22) + (match_operand:ALL2QA 2 "register_operand" "")) + (parallel [(set (reg:ALL2QA 24) + (usdiv:ALL2QA (reg:ALL2QA 26) + (reg:ALL2QA 22))) + (clobber (reg:HI 26)) + (clobber (reg:QI 21))]) + (set (match_operand:ALL2QA 0 "register_operand" "") + (reg:ALL2QA 24))]) + +;; "*divhq3.call" "*udivuhq3.call" +;; "*divha3.call" "*udivuha3.call" +(define_insn "*<code><mode>3.call" + [(set (reg:ALL2QA 24) + (usdiv:ALL2QA (reg:ALL2QA 26) + (reg:ALL2QA 22))) + (clobber (reg:HI 26)) + (clobber (reg:QI 21))] + "" + "%~call __<code><mode>3" + [(set_attr "type" "xcall") + (set_attr "cc" "clobber")]) + +;; Note the first parameter gets passed in already offset by 2 bytes + +;; "divsa3" "udivusa3" +(define_expand "<code><mode>3" + [(set (reg:ALL4A 24) + (match_operand:ALL4A 1 "register_operand" "")) + (set (reg:ALL4A 18) + (match_operand:ALL4A 2 "register_operand" "")) + (parallel [(set (reg:ALL4A 22) + (usdiv:ALL4A (reg:ALL4A 24) + (reg:ALL4A 18))) + (clobber (reg:HI 26)) + (clobber (reg:HI 30))]) + (set (match_operand:ALL4A 0 "register_operand" "") + (reg:ALL4A 22))]) + +;; "*divsa3.call" "*udivusa3.call" +(define_insn "*<code><mode>3.call" + [(set (reg:ALL4A 22) + (usdiv:ALL4A (reg:ALL4A 24) + (reg:ALL4A 18))) + (clobber (reg:HI 26)) + (clobber (reg:HI 30))] + "" + "%~call __<code><mode>3" + [(set_attr "type" "xcall") + (set_attr "cc" "clobber")]) Index: gcc/config/avr/avr-dimode.md =================================================================== --- gcc/config/avr/avr-dimode.md (revision 190535) +++ gcc/config/avr/avr-dimode.md (working copy) @@ -47,44 +47,58 @@ (define_constants [(ACC_A 18) (ACC_B 10)]) +;; Supported modes that are 8 bytes wide +(define_mode_iterator ALL8 [(DI "") + (DQ "") (UDQ "") + (DA "") (UDA "") 
+ (TA "") (UTA "")]) + ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; Addition ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; -(define_expand "adddi3" - [(parallel [(match_operand:DI 0 "general_operand" "") - (match_operand:DI 1 "general_operand" "") - (match_operand:DI 2 "general_operand" "")])] +;; "adddi3" +;; "adddq3" "addudq3" +;; "addda3" "adduda3" +;; "addta3" "adduta3" +(define_expand "add<mode>3" + [(parallel [(match_operand:ALL8 0 "general_operand" "") + (match_operand:ALL8 1 "general_operand" "") + (match_operand:ALL8 2 "general_operand" "")])] "avr_have_dimode" { - rtx acc_a = gen_rtx_REG (DImode, ACC_A); + rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A); emit_move_insn (acc_a, operands[1]); - if (s8_operand (operands[2], VOIDmode)) + if (DImode == <MODE>mode + && s8_operand (operands[2], VOIDmode)) { emit_move_insn (gen_rtx_REG (QImode, REG_X), operands[2]); emit_insn (gen_adddi3_const8_insn ()); } - else if (CONST_INT_P (operands[2]) - || CONST_DOUBLE_P (operands[2])) + else if (const_operand (operands[2], GET_MODE (operands[2]))) { - emit_insn (gen_adddi3_const_insn (operands[2])); + emit_insn (gen_add<mode>3_const_insn (operands[2])); } else { - emit_move_insn (gen_rtx_REG (DImode, ACC_B), operands[2]); - emit_insn (gen_adddi3_insn ()); + emit_move_insn (gen_rtx_REG (<MODE>mode, ACC_B), operands[2]); + emit_insn (gen_add<mode>3_insn ()); } emit_move_insn (operands[0], acc_a); DONE; }) -(define_insn "adddi3_insn" - [(set (reg:DI ACC_A) - (plus:DI (reg:DI ACC_A) - (reg:DI ACC_B)))] +;; "adddi3_insn" +;; "adddq3_insn" "addudq3_insn" +;; "addda3_insn" "adduda3_insn" +;; "addta3_insn" "adduta3_insn" +(define_insn "add<mode>3_insn" + [(set (reg:ALL8 ACC_A) + (plus:ALL8 (reg:ALL8 ACC_A) + (reg:ALL8 ACC_B)))] "avr_have_dimode" "%~call __adddi3" [(set_attr "adjust_len" "call") @@ -99,10 +113,14 @@ (define_insn "adddi3_const8_insn" [(set_attr "adjust_len" "call") (set_attr "cc" "clobber")]) -(define_insn 
"adddi3_const_insn" - [(set (reg:DI ACC_A) - (plus:DI (reg:DI ACC_A) - (match_operand:DI 0 "const_double_operand" "n")))] +;; "adddi3_const_insn" +;; "adddq3_const_insn" "addudq3_const_insn" +;; "addda3_const_insn" "adduda3_const_insn" +;; "addta3_const_insn" "adduta3_const_insn" +(define_insn "add<mode>3_const_insn" + [(set (reg:ALL8 ACC_A) + (plus:ALL8 (reg:ALL8 ACC_A) + (match_operand:ALL8 0 "const_operand" "n Ynn")))] "avr_have_dimode && !s8_operand (operands[0], VOIDmode)" { @@ -116,30 +134,62 @@ (define_insn "adddi3_const_insn" ;; Subtraction ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; -(define_expand "subdi3" - [(parallel [(match_operand:DI 0 "general_operand" "") - (match_operand:DI 1 "general_operand" "") - (match_operand:DI 2 "general_operand" "")])] +;; "subdi3" +;; "subdq3" "subudq3" +;; "subda3" "subuda3" +;; "subta3" "subuta3" +(define_expand "sub<mode>3" + [(parallel [(match_operand:ALL8 0 "general_operand" "") + (match_operand:ALL8 1 "general_operand" "") + (match_operand:ALL8 2 "general_operand" "")])] "avr_have_dimode" { - rtx acc_a = gen_rtx_REG (DImode, ACC_A); + rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A); emit_move_insn (acc_a, operands[1]); - emit_move_insn (gen_rtx_REG (DImode, ACC_B), operands[2]); - emit_insn (gen_subdi3_insn ()); + + if (const_operand (operands[2], GET_MODE (operands[2]))) + { + emit_insn (gen_sub<mode>3_const_insn (operands[2])); + } + else + { + emit_move_insn (gen_rtx_REG (<MODE>mode, ACC_B), operands[2]); + emit_insn (gen_sub<mode>3_insn ()); + } + emit_move_insn (operands[0], acc_a); DONE; }) -(define_insn "subdi3_insn" - [(set (reg:DI ACC_A) - (minus:DI (reg:DI ACC_A) - (reg:DI ACC_B)))] +;; "subdi3_insn" +;; "subdq3_insn" "subudq3_insn" +;; "subda3_insn" "subuda3_insn" +;; "subta3_insn" "subuta3_insn" +(define_insn "sub<mode>3_insn" + [(set (reg:ALL8 ACC_A) + (minus:ALL8 (reg:ALL8 ACC_A) + (reg:ALL8 ACC_B)))] "avr_have_dimode" "%~call __subdi3" [(set_attr "adjust_len" "call") (set_attr 
"cc" "set_czn")]) +;; "subdi3_const_insn" +;; "subdq3_const_insn" "subudq3_const_insn" +;; "subda3_const_insn" "subuda3_const_insn" +;; "subta3_const_insn" "subuta3_const_insn" +(define_insn "sub<mode>3_const_insn" + [(set (reg:ALL8 ACC_A) + (minus:ALL8 (reg:ALL8 ACC_A) + (match_operand:ALL8 0 "const_operand" "n Ynn")))] + "avr_have_dimode" + { + return avr_out_minus64 (operands[0], NULL); + } + [(set_attr "adjust_len" "minus64") + (set_attr "cc" "clobber")]) + ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; Negation @@ -180,15 +230,19 @@ (define_expand "conditional_jump" (pc)))] "avr_have_dimode") -(define_expand "cbranchdi4" - [(parallel [(match_operand:DI 1 "register_operand" "") - (match_operand:DI 2 "nonmemory_operand" "") +;; "cbranchdi4" +;; "cbranchdq4" "cbranchudq4" +;; "cbranchda4" "cbranchuda4" +;; "cbranchta4" "cbranchuta4" +(define_expand "cbranch<mode>4" + [(parallel [(match_operand:ALL8 1 "register_operand" "") + (match_operand:ALL8 2 "nonmemory_operand" "") (match_operator 0 "ordered_comparison_operator" [(cc0) (const_int 0)]) (label_ref (match_operand 3 "" ""))])] "avr_have_dimode" { - rtx acc_a = gen_rtx_REG (DImode, ACC_A); + rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A); emit_move_insn (acc_a, operands[1]); @@ -197,25 +251,28 @@ (define_expand "cbranchdi4" emit_move_insn (gen_rtx_REG (QImode, REG_X), operands[2]); emit_insn (gen_compare_const8_di2 ()); } - else if (CONST_INT_P (operands[2]) - || CONST_DOUBLE_P (operands[2])) + else if (const_operand (operands[2], GET_MODE (operands[2]))) { - emit_insn (gen_compare_const_di2 (operands[2])); + emit_insn (gen_compare_const_<mode>2 (operands[2])); } else { - emit_move_insn (gen_rtx_REG (DImode, ACC_B), operands[2]); - emit_insn (gen_compare_di2 ()); + emit_move_insn (gen_rtx_REG (<MODE>mode, ACC_B), operands[2]); + emit_insn (gen_compare_<mode>2 ()); } emit_jump_insn (gen_conditional_jump (operands[0], operands[3])); DONE; }) -(define_insn "compare_di2" +;; "compare_di2" +;; 
"compare_dq2" "compare_udq2" +;; "compare_da2" "compare_uda2" +;; "compare_ta2" "compare_uta2" +(define_insn "compare_<mode>2" [(set (cc0) - (compare (reg:DI ACC_A) - (reg:DI ACC_B)))] + (compare (reg:ALL8 ACC_A) + (reg:ALL8 ACC_B)))] "avr_have_dimode" "%~call __cmpdi2" [(set_attr "adjust_len" "call") @@ -230,10 +287,14 @@ (define_insn "compare_const8_di2" [(set_attr "adjust_len" "call") (set_attr "cc" "compare")]) -(define_insn "compare_const_di2" +;; "compare_const_di2" +;; "compare_const_dq2" "compare_const_udq2" +;; "compare_const_da2" "compare_const_uda2" +;; "compare_const_ta2" "compare_const_uta2" +(define_insn "compare_const_<mode>2" [(set (cc0) - (compare (reg:DI ACC_A) - (match_operand:DI 0 "const_double_operand" "n"))) + (compare (reg:ALL8 ACC_A) + (match_operand:ALL8 0 "const_operand" "n Ynn"))) (clobber (match_scratch:QI 1 "=&d"))] "avr_have_dimode && !s8_operand (operands[0], VOIDmode)" @@ -254,29 +315,39 @@ (define_code_iterator di_shifts ;; Shift functions from libgcc are called without defining these insns, ;; but with them we can describe their reduced register footprint. 
-;; "ashldi3" -;; "ashrdi3" -;; "lshrdi3" -;; "rotldi3" -(define_expand "<code_stdname>di3" - [(parallel [(match_operand:DI 0 "general_operand" "") - (di_shifts:DI (match_operand:DI 1 "general_operand" "") - (match_operand:QI 2 "general_operand" ""))])] +;; "ashldi3" "ashrdi3" "lshrdi3" "rotldi3" +;; "ashldq3" "ashrdq3" "lshrdq3" "rotldq3" +;; "ashlda3" "ashrda3" "lshrda3" "rotlda3" +;; "ashlta3" "ashrta3" "lshrta3" "rotlta3" +;; "ashludq3" "ashrudq3" "lshrudq3" "rotludq3" +;; "ashluda3" "ashruda3" "lshruda3" "rotluda3" +;; "ashluta3" "ashruta3" "lshruta3" "rotluta3" +(define_expand "<code_stdname><mode>3" + [(parallel [(match_operand:ALL8 0 "general_operand" "") + (di_shifts:ALL8 (match_operand:ALL8 1 "general_operand" "") + (match_operand:QI 2 "general_operand" ""))])] "avr_have_dimode" { - rtx acc_a = gen_rtx_REG (DImode, ACC_A); + rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A); emit_move_insn (acc_a, operands[1]); emit_move_insn (gen_rtx_REG (QImode, 16), operands[2]); - emit_insn (gen_<code_stdname>di3_insn ()); + emit_insn (gen_<code_stdname><mode>3_insn ()); emit_move_insn (operands[0], acc_a); DONE; }) -(define_insn "<code_stdname>di3_insn" - [(set (reg:DI ACC_A) - (di_shifts:DI (reg:DI ACC_A) - (reg:QI 16)))] +;; "ashldi3_insn" "ashrdi3_insn" "lshrdi3_insn" "rotldi3_insn" +;; "ashldq3_insn" "ashrdq3_insn" "lshrdq3_insn" "rotldq3_insn" +;; "ashlda3_insn" "ashrda3_insn" "lshrda3_insn" "rotlda3_insn" +;; "ashlta3_insn" "ashrta3_insn" "lshrta3_insn" "rotlta3_insn" +;; "ashludq3_insn" "ashrudq3_insn" "lshrudq3_insn" "rotludq3_insn" +;; "ashluda3_insn" "ashruda3_insn" "lshruda3_insn" "rotluda3_insn" +;; "ashluta3_insn" "ashruta3_insn" "lshruta3_insn" "rotluta3_insn" +(define_insn "<code_stdname><mode>3_insn" + [(set (reg:ALL8 ACC_A) + (di_shifts:ALL8 (reg:ALL8 ACC_A) + (reg:QI 16)))] "avr_have_dimode" "%~call __<code_stdname>di3" [(set_attr "adjust_len" "call") Index: gcc/config/avr/avr.md =================================================================== --- 
gcc/config/avr/avr.md (revision 190535) +++ gcc/config/avr/avr.md (working copy) @@ -88,10 +88,10 @@ (define_c_enum "unspecv" (include "predicates.md") (include "constraints.md") - + ;; Condition code settings. (define_attr "cc" "none,set_czn,set_zn,set_n,compare,clobber, - out_plus, out_plus_noclobber,ldi" + out_plus, out_plus_noclobber,ldi,minus" (const_string "none")) (define_attr "type" "branch,branch1,arith,xcall" @@ -139,8 +139,10 @@ (define_attr "length" "" (define_attr "adjust_len" "out_bitop, out_plus, out_plus_noclobber, plus64, addto_sp, + minus, minus64, tsthi, tstpsi, tstsi, compare, compare64, call, mov8, mov16, mov24, mov32, reload_in16, reload_in24, reload_in32, + ufract, sfract, xload, movmem, load_lpm, ashlqi, ashrqi, lshrqi, ashlhi, ashrhi, lshrhi, @@ -225,8 +227,20 @@ (define_mode_iterator QISI [(QI "") (HI (define_mode_iterator QIDI [(QI "") (HI "") (PSI "") (SI "") (DI "")]) (define_mode_iterator HISI [(HI "") (PSI "") (SI "")]) +(define_mode_iterator ALL1 [(QI "") (QQ "") (UQQ "")]) +(define_mode_iterator ALL2 [(HI "") (HQ "") (UHQ "") (HA "") (UHA "")]) +(define_mode_iterator ALL4 [(SI "") (SQ "") (USQ "") (SA "") (USA "")]) + ;; All supported move-modes -(define_mode_iterator MOVMODE [(QI "") (HI "") (SI "") (SF "") (PSI "")]) +(define_mode_iterator MOVMODE [(QI "") (HI "") (SI "") (SF "") (PSI "") + (QQ "") (UQQ "") + (HQ "") (UHQ "") (HA "") (UHA "") + (SQ "") (USQ "") (SA "") (USA "")]) + +;; Supported ordered modes that are 2, 3, 4 bytes wide +(define_mode_iterator ORDERED234 [(HI "") (SI "") (PSI "") + (HQ "") (UHQ "") (HA "") (UHA "") + (SQ "") (USQ "") (SA "") (USA "")]) ;; Define code iterators ;; Define two incarnations so that we can build the cross product. 
@@ -317,9 +331,11 @@ (define_expand "nonlocal_goto" DONE; }) -(define_insn "pushqi1" - [(set (mem:QI (post_dec:HI (reg:HI REG_SP))) - (match_operand:QI 0 "reg_or_0_operand" "r,L"))] +;; "pushqi1" +;; "pushqq1" "pushuqq1" +(define_insn "push<mode>1" + [(set (mem:ALL1 (post_dec:HI (reg:HI REG_SP))) + (match_operand:ALL1 0 "reg_or_0_operand" "r,Y00"))] "" "@ push %0 @@ -334,7 +350,11 @@ (define_mode_iterator MPUSH (PSI "") (SI "") (CSI "") (DI "") (CDI "") - (SF "") (SC "")]) + (SF "") (SC "") + (HA "") (UHA "") (HQ "") (UHQ "") + (SA "") (USA "") (SQ "") (USQ "") + (DA "") (UDA "") (DQ "") (UDQ "") + (TA "") (UTA "")]) (define_expand "push<mode>1" [(match_operand:MPUSH 0 "" "")] @@ -422,12 +442,14 @@ (define_insn "load_<mode>_clobber" (set_attr "cc" "clobber")]) -(define_insn_and_split "xload8_A" - [(set (match_operand:QI 0 "register_operand" "=r") - (match_operand:QI 1 "memory_operand" "m")) +;; "xload8qi_A" +;; "xload8qq_A" "xload8uqq_A" +(define_insn_and_split "xload8<mode>_A" + [(set (match_operand:ALL1 0 "register_operand" "=r") + (match_operand:ALL1 1 "memory_operand" "m")) (clobber (reg:HI REG_Z))] "can_create_pseudo_p() - && !avr_xload_libgcc_p (QImode) + && !avr_xload_libgcc_p (<MODE>mode) && avr_mem_memx_p (operands[1]) && REG_P (XEXP (operands[1], 0))" { gcc_unreachable(); } @@ -441,16 +463,16 @@ (define_insn_and_split "xload8_A" emit_move_insn (reg_z, simplify_gen_subreg (HImode, addr, PSImode, 0)); emit_move_insn (hi8, simplify_gen_subreg (QImode, addr, PSImode, 2)); - insn = emit_insn (gen_xload_8 (operands[0], hi8)); + insn = emit_insn (gen_xload<mode>_8 (operands[0], hi8)); set_mem_addr_space (SET_SRC (single_set (insn)), MEM_ADDR_SPACE (operands[1])); DONE; }) -;; "xloadqi_A" -;; "xloadhi_A" +;; "xloadqi_A" "xloadqq_A" "xloaduqq_A" +;; "xloadhi_A" "xloadhq_A" "xloaduhq_A" "xloadha_A" "xloaduha_A" +;; "xloadsi_A" "xloadsq_A" "xloadusq_A" "xloadsa_A" "xloadusa_A" ;; "xloadpsi_A" -;; "xloadsi_A" ;; "xloadsf_A" (define_insn_and_split "xload<mode>_A" 
[(set (match_operand:MOVMODE 0 "register_operand" "=r") @@ -488,11 +510,13 @@ (define_insn_and_split "xload<mode>_A" ;; Move value from address space memx to a register ;; These insns must be prior to respective generic move insn. -(define_insn "xload_8" - [(set (match_operand:QI 0 "register_operand" "=&r,r") - (mem:QI (lo_sum:PSI (match_operand:QI 1 "register_operand" "r,r") - (reg:HI REG_Z))))] - "!avr_xload_libgcc_p (QImode)" +;; "xloadqi_8" +;; "xloadqq_8" "xloaduqq_8" +(define_insn "xload<mode>_8" + [(set (match_operand:ALL1 0 "register_operand" "=&r,r") + (mem:ALL1 (lo_sum:PSI (match_operand:QI 1 "register_operand" "r,r") + (reg:HI REG_Z))))] + "!avr_xload_libgcc_p (<MODE>mode)" { return avr_out_xload (insn, operands, NULL); } @@ -504,11 +528,11 @@ (define_insn "xload_8" ;; R21:Z : 24-bit source address ;; R22 : 1-4 byte output -;; "xload_qi_libgcc" -;; "xload_hi_libgcc" -;; "xload_psi_libgcc" -;; "xload_si_libgcc" +;; "xload_qi_libgcc" "xload_qq_libgcc" "xload_uqq_libgcc" +;; "xload_hi_libgcc" "xload_hq_libgcc" "xload_uhq_libgcc" "xload_ha_libgcc" "xload_uha_libgcc" +;; "xload_si_libgcc" "xload_sq_libgcc" "xload_usq_libgcc" "xload_sa_libgcc" "xload_usa_libgcc" ;; "xload_sf_libgcc" +;; "xload_psi_libgcc" (define_insn "xload_<mode>_libgcc" [(set (reg:MOVMODE 22) (mem:MOVMODE (lo_sum:PSI (reg:QI 21) @@ -528,9 +552,9 @@ (define_insn "xload_<mode>_libgcc" ;; General move expanders -;; "movqi" -;; "movhi" -;; "movsi" +;; "movqi" "movqq" "movuqq" +;; "movhi" "movhq" "movuhq" "movha" "movuha" +;; "movsi" "movsq" "movusq" "movsa" "movusa" ;; "movsf" ;; "movpsi" (define_expand "mov<mode>" @@ -546,8 +570,7 @@ (define_expand "mov<mode>" /* One of the operands has to be in a register. 
*/ if (!register_operand (dest, <MODE>mode) - && !(register_operand (src, <MODE>mode) - || src == CONST0_RTX (<MODE>mode))) + && !reg_or_0_operand (src, <MODE>mode)) { operands[1] = src = copy_to_mode_reg (<MODE>mode, src); } @@ -560,7 +583,9 @@ (define_expand "mov<mode>" src = replace_equiv_address (src, copy_to_mode_reg (PSImode, addr)); if (!avr_xload_libgcc_p (<MODE>mode)) - emit_insn (gen_xload8_A (dest, src)); + /* ; No <mode> here because gen_xload8<mode>_A only iterates over ALL1. + ; insn-emit does not depend on the mode, it's all about operands. */ + emit_insn (gen_xload8qi_A (dest, src)); else emit_insn (gen_xload<mode>_A (dest, src)); @@ -627,12 +652,13 @@ (define_expand "mov<mode>" ;; are call-saved registers, and most of LD_REGS are call-used registers, ;; so this may still be a win for registers live across function calls. -(define_insn "movqi_insn" - [(set (match_operand:QI 0 "nonimmediate_operand" "=r ,d,Qm,r ,q,r,*r") - (match_operand:QI 1 "nox_general_operand" "rL,i,rL,Qm,r,q,i"))] - "register_operand (operands[0], QImode) - || register_operand (operands[1], QImode) - || const0_rtx == operands[1]" +;; "movqi_insn" +;; "movqq_insn" "movuqq_insn" +(define_insn "mov<mode>_insn" + [(set (match_operand:ALL1 0 "nonimmediate_operand" "=r ,d ,Qm ,r ,q,r,*r") + (match_operand:ALL1 1 "nox_general_operand" "r Y00,n Ynn,r Y00,Qm,r,q,i"))] + "register_operand (operands[0], <MODE>mode) + || reg_or_0_operand (operands[1], <MODE>mode)" { return output_movqi (insn, operands, NULL); } @@ -643,9 +669,11 @@ (define_insn "movqi_insn" ;; This is used in peephole2 to optimize loading immediate constants ;; if a scratch register from LD_REGS happens to be available. 
-(define_insn "*reload_inqi" - [(set (match_operand:QI 0 "register_operand" "=l") - (match_operand:QI 1 "immediate_operand" "i")) +;; "*reload_inqi" +;; "*reload_inqq" "*reload_inuqq" +(define_insn "*reload_in<mode>" + [(set (match_operand:ALL1 0 "register_operand" "=l") + (match_operand:ALL1 1 "const_operand" "i")) (clobber (match_operand:QI 2 "register_operand" "=&d"))] "reload_completed" "ldi %2,lo8(%1) @@ -655,14 +683,15 @@ (define_insn "*reload_inqi" (define_peephole2 [(match_scratch:QI 2 "d") - (set (match_operand:QI 0 "l_register_operand" "") - (match_operand:QI 1 "immediate_operand" ""))] - "(operands[1] != const0_rtx - && operands[1] != const1_rtx - && operands[1] != constm1_rtx)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (clobber (match_dup 2))])] - "") + (set (match_operand:ALL1 0 "l_register_operand" "") + (match_operand:ALL1 1 "const_operand" ""))] + ; No need for a clobber reg for 0x0, 0x01 or 0xff + "!satisfies_constraint_Y00 (operands[1]) + && !satisfies_constraint_Y01 (operands[1]) + && !satisfies_constraint_Ym1 (operands[1])" + [(parallel [(set (match_dup 0) + (match_dup 1)) + (clobber (match_dup 2))])]) ;;============================================================================ ;; move word (16 bit) @@ -693,18 +722,20 @@ (define_insn "movhi_sp_r" (define_peephole2 [(match_scratch:QI 2 "d") - (set (match_operand:HI 0 "l_register_operand" "") - (match_operand:HI 1 "immediate_operand" ""))] - "(operands[1] != const0_rtx - && operands[1] != constm1_rtx)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (clobber (match_dup 2))])] - "") + (set (match_operand:ALL2 0 "l_register_operand" "") + (match_operand:ALL2 1 "const_or_immediate_operand" ""))] + "operands[1] != CONST0_RTX (<MODE>mode)" + [(parallel [(set (match_dup 0) + (match_dup 1)) + (clobber (match_dup 2))])]) ;; '*' because it is not used in rtl generation, only in above peephole -(define_insn "*reload_inhi" - [(set (match_operand:HI 0 "register_operand" "=r") - (match_operand:HI 1 
"immediate_operand" "i")) +;; "*reload_inhi" +;; "*reload_inhq" "*reload_inuhq" +;; "*reload_inha" "*reload_inuha" +(define_insn "*reload_in<mode>" + [(set (match_operand:ALL2 0 "l_register_operand" "=l") + (match_operand:ALL2 1 "immediate_operand" "i")) (clobber (match_operand:QI 2 "register_operand" "=&d"))] "reload_completed" { @@ -712,14 +743,16 @@ (define_insn "*reload_inhi" } [(set_attr "length" "4") (set_attr "adjust_len" "reload_in16") - (set_attr "cc" "none")]) + (set_attr "cc" "clobber")]) -(define_insn "*movhi" - [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,m ,d,*r,q,r") - (match_operand:HI 1 "nox_general_operand" "r,L,m,rL,i,i ,r,q"))] - "register_operand (operands[0], HImode) - || register_operand (operands[1], HImode) - || const0_rtx == operands[1]" +;; "*movhi" +;; "*movhq" "*movuhq" +;; "*movha" "*movuha" +(define_insn "*mov<mode>" + [(set (match_operand:ALL2 0 "nonimmediate_operand" "=r,r ,r,m ,d,*r,q,r") + (match_operand:ALL2 1 "nox_general_operand" "r,Y00,m,r Y00,i,i ,r,q"))] + "register_operand (operands[0], <MODE>mode) + || reg_or_0_operand (operands[1], <MODE>mode)" { return output_movhi (insn, operands, NULL); } @@ -728,28 +761,30 @@ (define_insn "*movhi" (set_attr "cc" "none,none,clobber,clobber,none,clobber,none,none")]) (define_peephole2 ; movw - [(set (match_operand:QI 0 "even_register_operand" "") - (match_operand:QI 1 "even_register_operand" "")) - (set (match_operand:QI 2 "odd_register_operand" "") - (match_operand:QI 3 "odd_register_operand" ""))] + [(set (match_operand:ALL1 0 "even_register_operand" "") + (match_operand:ALL1 1 "even_register_operand" "")) + (set (match_operand:ALL1 2 "odd_register_operand" "") + (match_operand:ALL1 3 "odd_register_operand" ""))] "(AVR_HAVE_MOVW && REGNO (operands[0]) == REGNO (operands[2]) - 1 && REGNO (operands[1]) == REGNO (operands[3]) - 1)" - [(set (match_dup 4) (match_dup 5))] + [(set (match_dup 4) + (match_dup 5))] { operands[4] = gen_rtx_REG (HImode, REGNO (operands[0])); 
operands[5] = gen_rtx_REG (HImode, REGNO (operands[1])); }) (define_peephole2 ; movw_r - [(set (match_operand:QI 0 "odd_register_operand" "") - (match_operand:QI 1 "odd_register_operand" "")) - (set (match_operand:QI 2 "even_register_operand" "") - (match_operand:QI 3 "even_register_operand" ""))] + [(set (match_operand:ALL1 0 "odd_register_operand" "") + (match_operand:ALL1 1 "odd_register_operand" "")) + (set (match_operand:ALL1 2 "even_register_operand" "") + (match_operand:ALL1 3 "even_register_operand" ""))] "(AVR_HAVE_MOVW && REGNO (operands[2]) == REGNO (operands[0]) - 1 && REGNO (operands[3]) == REGNO (operands[1]) - 1)" - [(set (match_dup 4) (match_dup 5))] + [(set (match_dup 4) + (match_dup 5))] { operands[4] = gen_rtx_REG (HImode, REGNO (operands[2])); operands[5] = gen_rtx_REG (HImode, REGNO (operands[3])); @@ -801,19 +836,21 @@ (define_insn "*movpsi" (define_peephole2 ; *reload_insi [(match_scratch:QI 2 "d") - (set (match_operand:SI 0 "l_register_operand" "") - (match_operand:SI 1 "const_int_operand" "")) + (set (match_operand:ALL4 0 "l_register_operand" "") + (match_operand:ALL4 1 "immediate_operand" "")) (match_dup 2)] - "(operands[1] != const0_rtx - && operands[1] != constm1_rtx)" - [(parallel [(set (match_dup 0) (match_dup 1)) - (clobber (match_dup 2))])] - "") + "operands[1] != CONST0_RTX (<MODE>mode)" + [(parallel [(set (match_dup 0) + (match_dup 1)) + (clobber (match_dup 2))])]) ;; '*' because it is not used in rtl generation. 
+;; "*reload_insi" +;; "*reload_insq" "*reload_inusq" +;; "*reload_insa" "*reload_inusa" (define_insn "*reload_insi" - [(set (match_operand:SI 0 "register_operand" "=r") - (match_operand:SI 1 "const_int_operand" "n")) + [(set (match_operand:ALL4 0 "register_operand" "=r") + (match_operand:ALL4 1 "immediate_operand" "n Ynn")) (clobber (match_operand:QI 2 "register_operand" "=&d"))] "reload_completed" { @@ -824,12 +861,14 @@ (define_insn "*reload_insi" (set_attr "cc" "clobber")]) -(define_insn "*movsi" - [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r ,Qm,!d,r") - (match_operand:SI 1 "nox_general_operand" "r,L,Qm,rL,i ,i"))] - "register_operand (operands[0], SImode) - || register_operand (operands[1], SImode) - || const0_rtx == operands[1]" +;; "*movsi" +;; "*movsq" "*movusq" +;; "*movsa" "*movusa" +(define_insn "*mov<mode>" + [(set (match_operand:ALL4 0 "nonimmediate_operand" "=r,r ,r ,Qm ,!d,r") + (match_operand:ALL4 1 "nox_general_operand" "r,Y00,Qm,r Y00,i ,i"))] + "register_operand (operands[0], <MODE>mode) + || reg_or_0_operand (operands[1], <MODE>mode)" { return output_movsisf (insn, operands, NULL); } @@ -844,8 +883,7 @@ (define_insn "*movsf" [(set (match_operand:SF 0 "nonimmediate_operand" "=r,r,r ,Qm,!d,r") (match_operand:SF 1 "nox_general_operand" "r,G,Qm,rG,F ,F"))] "register_operand (operands[0], SFmode) - || register_operand (operands[1], SFmode) - || operands[1] == CONST0_RTX (SFmode)" + || reg_or_0_operand (operands[1], SFmode)" { return output_movsisf (insn, operands, NULL); } @@ -861,8 +899,7 @@ (define_peephole2 ; *reload_insf "operands[1] != CONST0_RTX (SFmode)" [(parallel [(set (match_dup 0) (match_dup 1)) - (clobber (match_dup 2))])] - "") + (clobber (match_dup 2))])]) ;; '*' because it is not used in rtl generation. 
(define_insn "*reload_insf" @@ -1015,9 +1052,10 @@ (define_expand "strlenhi" (set (match_dup 4) (plus:HI (match_dup 4) (const_int -1))) - (set (match_operand:HI 0 "register_operand" "") - (minus:HI (match_dup 4) - (match_dup 5)))] + (parallel [(set (match_operand:HI 0 "register_operand" "") + (minus:HI (match_dup 4) + (match_dup 5))) + (clobber (scratch:QI))])] "" { rtx addr; @@ -1043,10 +1081,12 @@ (define_insn "*strlenhi" ;+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ ; add bytes -(define_insn "addqi3" - [(set (match_operand:QI 0 "register_operand" "=r,d,r,r,r,r") - (plus:QI (match_operand:QI 1 "register_operand" "%0,0,0,0,0,0") - (match_operand:QI 2 "nonmemory_operand" "r,i,P,N,K,Cm2")))] +;; "addqi3" +;; "addqq3" "adduqq3" +(define_insn "add<mode>3" + [(set (match_operand:ALL1 0 "register_operand" "=r,d ,r ,r ,r ,r") + (plus:ALL1 (match_operand:ALL1 1 "register_operand" "%0,0 ,0 ,0 ,0 ,0") + (match_operand:ALL1 2 "nonmemory_operand" "r,n Ynn,Y01,Ym1,Y02,Ym2")))] "" "@ add %0,%2 @@ -1058,11 +1098,13 @@ (define_insn "addqi3" [(set_attr "length" "1,1,1,1,2,2") (set_attr "cc" "set_czn,set_czn,set_zn,set_zn,set_zn,set_zn")]) - -(define_expand "addhi3" - [(set (match_operand:HI 0 "register_operand" "") - (plus:HI (match_operand:HI 1 "register_operand" "") - (match_operand:HI 2 "nonmemory_operand" "")))] +;; "addhi3" +;; "addhq3" "adduhq3" +;; "addha3" "adduha3" +(define_expand "add<mode>3" + [(set (match_operand:ALL2 0 "register_operand" "") + (plus:ALL2 (match_operand:ALL2 1 "register_operand" "") + (match_operand:ALL2 2 "nonmemory_or_const_operand" "")))] "" { if (CONST_INT_P (operands[2])) @@ -1079,6 +1121,12 @@ (define_expand "addhi3" DONE; } } + + if (CONST_FIXED == GET_CODE (operands[2])) + { + emit_insn (gen_add<mode>3_clobber (operands[0], operands[1], operands[2])); + DONE; + } }) @@ -1124,24 +1172,22 @@ (define_insn "*addhi3_sp" [(set_attr "length" "6") (set_attr "adjust_len" "addto_sp")]) -(define_insn "*addhi3" - [(set 
(match_operand:HI 0 "register_operand" "=r,d,!w,d") - (plus:HI (match_operand:HI 1 "register_operand" "%0,0,0 ,0") - (match_operand:HI 2 "nonmemory_operand" "r,s,IJ,n")))] +;; "*addhi3" +;; "*addhq3" "*adduhq3" +;; "*addha3" "*adduha3" +(define_insn "*add<mode>3" + [(set (match_operand:ALL2 0 "register_operand" "=r,d,!w ,d") + (plus:ALL2 (match_operand:ALL2 1 "register_operand" "%0,0,0 ,0") + (match_operand:ALL2 2 "nonmemory_or_const_operand" "r,s,IJ YIJ,n Ynn")))] "" { - static const char * const asm_code[] = - { - "add %A0,%A2\;adc %B0,%B2", - "subi %A0,lo8(-(%2))\;sbci %B0,hi8(-(%2))", - "", - "" - }; - - if (*asm_code[which_alternative]) - return asm_code[which_alternative]; - - return avr_out_plus_noclobber (operands, NULL, NULL); + if (REG_P (operands[2])) + return "add %A0,%A2\;adc %B0,%B2"; + else if (CONST_INT_P (operands[2]) + || CONST_FIXED == GET_CODE (operands[2])) + return avr_out_plus_noclobber (operands, NULL, NULL); + else + return "subi %A0,lo8(-(%2))\;sbci %B0,hi8(-(%2))"; } [(set_attr "length" "2,2,2,2") (set_attr "adjust_len" "*,*,out_plus_noclobber,out_plus_noclobber") @@ -1152,41 +1198,44 @@ (define_insn "*addhi3" ;; itself because that insn is special to reload. 
(define_peephole2 ; addhi3_clobber - [(set (match_operand:HI 0 "d_register_operand" "") - (match_operand:HI 1 "const_int_operand" "")) - (set (match_operand:HI 2 "l_register_operand" "") - (plus:HI (match_dup 2) - (match_dup 0)))] + [(set (match_operand:ALL2 0 "d_register_operand" "") + (match_operand:ALL2 1 "const_operand" "")) + (set (match_operand:ALL2 2 "l_register_operand" "") + (plus:ALL2 (match_dup 2) + (match_dup 0)))] "peep2_reg_dead_p (2, operands[0])" [(parallel [(set (match_dup 2) - (plus:HI (match_dup 2) - (match_dup 1))) + (plus:ALL2 (match_dup 2) + (match_dup 1))) (clobber (match_dup 3))])] { - operands[3] = simplify_gen_subreg (QImode, operands[0], HImode, 0); + operands[3] = simplify_gen_subreg (QImode, operands[0], <MODE>mode, 0); }) ;; Same, but with reload to NO_LD_REGS ;; Combine *reload_inhi with *addhi3 (define_peephole2 ; addhi3_clobber - [(parallel [(set (match_operand:HI 0 "l_register_operand" "") - (match_operand:HI 1 "const_int_operand" "")) + [(parallel [(set (match_operand:ALL2 0 "l_register_operand" "") + (match_operand:ALL2 1 "const_operand" "")) (clobber (match_operand:QI 2 "d_register_operand" ""))]) - (set (match_operand:HI 3 "l_register_operand" "") - (plus:HI (match_dup 3) - (match_dup 0)))] + (set (match_operand:ALL2 3 "l_register_operand" "") + (plus:ALL2 (match_dup 3) + (match_dup 0)))] "peep2_reg_dead_p (2, operands[0])" [(parallel [(set (match_dup 3) - (plus:HI (match_dup 3) - (match_dup 1))) + (plus:ALL2 (match_dup 3) + (match_dup 1))) (clobber (match_dup 2))])]) -(define_insn "addhi3_clobber" - [(set (match_operand:HI 0 "register_operand" "=!w,d,r") - (plus:HI (match_operand:HI 1 "register_operand" "%0,0,0") - (match_operand:HI 2 "const_int_operand" "IJ,n,n"))) - (clobber (match_scratch:QI 3 "=X,X,&d"))] +;; "addhi3_clobber" +;; "addhq3_clobber" "adduhq3_clobber" +;; "addha3_clobber" "adduha3_clobber" +(define_insn "add<mode>3_clobber" + [(set (match_operand:ALL2 0 "register_operand" "=!w ,d ,r") + (plus:ALL2 
(match_operand:ALL2 1 "register_operand" "%0 ,0 ,0") + (match_operand:ALL2 2 "const_operand" "IJ YIJ,n Ynn,n Ynn"))) + (clobber (match_scratch:QI 3 "=X ,X ,&d"))] "" { gcc_assert (REGNO (operands[0]) == REGNO (operands[1])); @@ -1198,29 +1247,24 @@ (define_insn "addhi3_clobber" (set_attr "cc" "out_plus")]) -(define_insn "addsi3" - [(set (match_operand:SI 0 "register_operand" "=r,d ,d,r") - (plus:SI (match_operand:SI 1 "register_operand" "%0,0 ,0,0") - (match_operand:SI 2 "nonmemory_operand" "r,s ,n,n"))) - (clobber (match_scratch:QI 3 "=X,X ,X,&d"))] +;; "addsi3" +;; "addsq3" "addusq3" +;; "addsa3" "addusa3" +(define_insn "add<mode>3" + [(set (match_operand:ALL4 0 "register_operand" "=r,d ,r") + (plus:ALL4 (match_operand:ALL4 1 "register_operand" "%0,0 ,0") + (match_operand:ALL4 2 "nonmemory_operand" "r,i ,n Ynn"))) + (clobber (match_scratch:QI 3 "=X,X ,&d"))] "" { - static const char * const asm_code[] = - { - "add %A0,%A2\;adc %B0,%B2\;adc %C0,%C2\;adc %D0,%D2", - "subi %0,lo8(-(%2))\;sbci %B0,hi8(-(%2))\;sbci %C0,hlo8(-(%2))\;sbci %D0,hhi8(-(%2))", - "", - "" - }; - - if (*asm_code[which_alternative]) - return asm_code[which_alternative]; + if (REG_P (operands[2])) + return "add %A0,%A2\;adc %B0,%B2\;adc %C0,%C2\;adc %D0,%D2"; return avr_out_plus (operands, NULL, NULL); } - [(set_attr "length" "4,4,4,8") - (set_attr "adjust_len" "*,*,out_plus,out_plus") - (set_attr "cc" "set_n,set_czn,out_plus,out_plus")]) + [(set_attr "length" "4,4,8") + (set_attr "adjust_len" "*,out_plus,out_plus") + (set_attr "cc" "set_n,out_plus,out_plus")]) (define_insn "*addpsi3_zero_extend.qi" [(set (match_operand:PSI 0 "register_operand" "=r") @@ -1329,27 +1373,38 @@ (define_insn "*subpsi3_sign_extend.hi" ;----------------------------------------------------------------------------- ; sub bytes -(define_insn "subqi3" - [(set (match_operand:QI 0 "register_operand" "=r,d") - (minus:QI (match_operand:QI 1 "register_operand" "0,0") - (match_operand:QI 2 "nonmemory_operand" "r,i")))] + +;; 
"subqi3" +;; "subqq3" "subuqq3" +(define_insn "sub<mode>3" + [(set (match_operand:ALL1 0 "register_operand" "=r,d ,r ,r ,r ,r") + (minus:ALL1 (match_operand:ALL1 1 "register_operand" "0,0 ,0 ,0 ,0 ,0") + (match_operand:ALL1 2 "nonmemory_or_const_operand" "r,n Ynn,Y01,Ym1,Y02,Ym2")))] "" "@ sub %0,%2 - subi %0,lo8(%2)" - [(set_attr "length" "1,1") - (set_attr "cc" "set_czn,set_czn")]) + subi %0,lo8(%2) + dec %0 + inc %0 + dec %0\;dec %0 + inc %0\;inc %0" + [(set_attr "length" "1,1,1,1,2,2") + (set_attr "cc" "set_czn,set_czn,set_zn,set_zn,set_zn,set_zn")]) -(define_insn "subhi3" - [(set (match_operand:HI 0 "register_operand" "=r,d") - (minus:HI (match_operand:HI 1 "register_operand" "0,0") - (match_operand:HI 2 "nonmemory_operand" "r,i")))] +;; "subhi3" +;; "subhq3" "subuhq3" +;; "subha3" "subuha3" +(define_insn "sub<mode>3" + [(set (match_operand:ALL2 0 "register_operand" "=r,d ,*r") + (minus:ALL2 (match_operand:ALL2 1 "register_operand" "0,0 ,0") + (match_operand:ALL2 2 "nonmemory_or_const_operand" "r,i Ynn,Ynn"))) + (clobber (match_scratch:QI 3 "=X,X ,&d"))] "" - "@ - sub %A0,%A2\;sbc %B0,%B2 - subi %A0,lo8(%2)\;sbci %B0,hi8(%2)" - [(set_attr "length" "2,2") - (set_attr "cc" "set_czn,set_czn")]) + { + return avr_out_minus (operands, NULL, NULL); + } + [(set_attr "adjust_len" "minus") + (set_attr "cc" "minus")]) (define_insn "*subhi3_zero_extend1" [(set (match_operand:HI 0 "register_operand" "=r") @@ -1373,13 +1428,23 @@ (define_insn "*subhi3.sign_extend2" [(set_attr "length" "5") (set_attr "cc" "clobber")]) -(define_insn "subsi3" - [(set (match_operand:SI 0 "register_operand" "=r") - (minus:SI (match_operand:SI 1 "register_operand" "0") - (match_operand:SI 2 "register_operand" "r")))] +;; "subsi3" +;; "subsq3" "subusq3" +;; "subsa3" "subusa3" +(define_insn "sub<mode>3" + [(set (match_operand:ALL4 0 "register_operand" "=r,d ,r") + (minus:ALL4 (match_operand:ALL4 1 "register_operand" "0,0 ,0") + (match_operand:ALL4 2 "nonmemory_or_const_operand" "r,n Ynn,Ynn"))) + 
(clobber (match_scratch:QI 3 "=X,X ,&d"))] "" - "sub %0,%2\;sbc %B0,%B2\;sbc %C0,%C2\;sbc %D0,%D2" + { + if (REG_P (operands[2])) + return "sub %0,%2\;sbc %B0,%B2\;sbc %C0,%C2\;sbc %D0,%D2"; + + return avr_out_minus (operands, NULL, NULL); + } [(set_attr "length" "4") + (set_attr "adjust_len" "*,minus,minus") (set_attr "cc" "set_czn")]) (define_insn "*subsi3_zero_extend" @@ -1515,8 +1580,18 @@ (define_insn "*addsi3.lt0" adc %A0,__zero_reg__\;adc %B0,__zero_reg__\;adc %C0,__zero_reg__\;adc %D0,__zero_reg__" [(set_attr "length" "6") (set_attr "cc" "clobber")]) - +(define_insn "*umulqihi3.call" + [(set (reg:HI 24) + (mult:HI (zero_extend:HI (reg:QI 22)) + (zero_extend:HI (reg:QI 24)))) + (clobber (reg:QI 21)) + (clobber (reg:HI 22))] + "!AVR_HAVE_MUL" + "%~call __umulqihi3" + [(set_attr "type" "xcall") + (set_attr "cc" "clobber")]) + ;; "umulqihi3" ;; "mulqihi3" (define_insn "<extend_u>mulqihi3" @@ -3303,44 +3378,58 @@ (define_insn_and_split "*rotb<mode>" ;;<< << << << << << << << << << << << << << << << << << << << << << << << << << ;; arithmetic shift left -(define_expand "ashlqi3" - [(set (match_operand:QI 0 "register_operand" "") - (ashift:QI (match_operand:QI 1 "register_operand" "") - (match_operand:QI 2 "nop_general_operand" "")))]) +;; "ashlqi3" +;; "ashlqq3" "ashluqq3" +(define_expand "ashl<mode>3" + [(set (match_operand:ALL1 0 "register_operand" "") + (ashift:ALL1 (match_operand:ALL1 1 "register_operand" "") + (match_operand:QI 2 "nop_general_operand" "")))]) (define_split ; ashlqi3_const4 - [(set (match_operand:QI 0 "d_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 4)))] + [(set (match_operand:ALL1 0 "d_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 4)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int -16)))] - "") + [(set (match_dup 1) + (rotate:QI (match_dup 1) + (const_int 4))) + (set (match_dup 1) + (and:QI (match_dup 1) + (const_int -16)))] + { + 
operands[1] = avr_to_int_mode (operands[0]); + }) (define_split ; ashlqi3_const5 - [(set (match_operand:QI 0 "d_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 5)))] + [(set (match_operand:ALL1 0 "d_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 5)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 1))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int -32)))] - "") + [(set (match_dup 1) (rotate:QI (match_dup 1) (const_int 4))) + (set (match_dup 1) (ashift:QI (match_dup 1) (const_int 1))) + (set (match_dup 1) (and:QI (match_dup 1) (const_int -32)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) (define_split ; ashlqi3_const6 - [(set (match_operand:QI 0 "d_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 6)))] + [(set (match_operand:ALL1 0 "d_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 6)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 2))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int -64)))] - "") + [(set (match_dup 1) (rotate:QI (match_dup 1) (const_int 4))) + (set (match_dup 1) (ashift:QI (match_dup 1) (const_int 2))) + (set (match_dup 1) (and:QI (match_dup 1) (const_int -64)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) -(define_insn "*ashlqi3" - [(set (match_operand:QI 0 "register_operand" "=r,r,r,r,!d,r,r") - (ashift:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0 ,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,K,n ,n,Qm")))] +;; "*ashlqi3" +;; "*ashlqq3" "*ashluqq3" +(define_insn "*ashl<mode>3" + [(set (match_operand:ALL1 0 "register_operand" "=r,r,r,r,!d,r,r") + (ashift:ALL1 (match_operand:ALL1 1 "register_operand" "0,0,0,0,0 ,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,K,n ,n,Qm")))] "" { return ashlqi3_out (insn, operands, NULL); @@ -3349,10 +3438,10 @@ (define_insn "*ashlqi3" 
(set_attr "adjust_len" "ashlqi") (set_attr "cc" "clobber,none,set_czn,set_czn,set_czn,set_czn,clobber")]) -(define_insn "ashlhi3" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r,r,r") - (ashift:HI (match_operand:HI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +(define_insn "ashl<mode>3" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r,r,r") + (ashift:ALL2 (match_operand:ALL2 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return ashlhi3_out (insn, operands, NULL); @@ -3377,8 +3466,7 @@ (define_insn_and_split "*ashl<extend_su> "" [(set (match_dup 0) (ashift:QI (match_dup 1) - (match_dup 2)))] - "") + (match_dup 2)))]) ;; ??? Combiner does not recognize that it could split the following insn; ;; presumably because he has no register handy? @@ -3443,10 +3531,13 @@ (define_peephole2 }) -(define_insn "ashlsi3" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r") - (ashift:SI (match_operand:SI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +;; "ashlsi3" +;; "ashlsq3" "ashlusq3" +;; "ashlsa3" "ashlusa3" +(define_insn "ashl<mode>3" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r,r,r,r") + (ashift:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return ashlsi3_out (insn, operands, NULL); @@ -3458,55 +3549,65 @@ (define_insn "ashlsi3" ;; Optimize if a scratch register from LD_REGS happens to be available. 
(define_peephole2 ; ashlqi3_l_const4 - [(set (match_operand:QI 0 "l_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 4))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 4))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) (set (match_dup 1) (const_int -16)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 ; ashlqi3_l_const5 - [(set (match_operand:QI 0 "l_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 5))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 5))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 1))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) + (set (match_dup 2) (ashift:QI (match_dup 2) (const_int 1))) (set (match_dup 1) (const_int -32)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 ; ashlqi3_l_const6 - [(set (match_operand:QI 0 "l_register_operand" "") - (ashift:QI (match_dup 0) - (const_int 6))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (ashift:ALL1 (match_dup 0) + (const_int 6))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 2))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) + (set (match_dup 2) (ashift:QI (match_dup 2) (const_int 2))) (set (match_dup 1) (const_int -64)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI 
(match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:HI 0 "register_operand" "") - (ashift:HI (match_operand:HI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL2 0 "register_operand" "") + (ashift:ALL2 (match_operand:ALL2 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (ashift:HI (match_dup 1) (match_dup 2))) - (clobber (match_dup 3))])] - "") - -(define_insn "*ashlhi3_const" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r") - (ashift:HI (match_operand:HI 1 "register_operand" "0,0,r,0,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] + [(parallel [(set (match_dup 0) + (ashift:ALL2 (match_dup 1) + (match_dup 2))) + (clobber (match_dup 3))])]) + +;; "*ashlhi3_const" +;; "*ashlhq3_const" "*ashluhq3_const" +;; "*ashlha3_const" "*ashluha3_const" +(define_insn "*ashl<mode>3_const" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r") + (ashift:ALL2 (match_operand:ALL2 1 "register_operand" "0,0,r,0,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] "reload_completed" { return ashlhi3_out (insn, operands, NULL); @@ -3517,19 +3618,24 @@ (define_insn "*ashlhi3_const" (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:SI 0 "register_operand" "") - (ashift:SI (match_operand:SI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL4 0 "register_operand" "") + (ashift:ALL4 (match_operand:ALL4 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (ashift:SI (match_dup 1) (match_dup 2))) + [(parallel [(set (match_dup 0) + (ashift:ALL4 (match_dup 1) + (match_dup 2))) (clobber (match_dup 3))])] "") -(define_insn 
"*ashlsi3_const" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r") - (ashift:SI (match_operand:SI 1 "register_operand" "0,0,r,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,&d"))] +;; "*ashlsi3_const" +;; "*ashlsq3_const" "*ashlusq3_const" +;; "*ashlsa3_const" "*ashlusa3_const" +(define_insn "*ashl<mode>3_const" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r") + (ashift:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,r,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,&d"))] "reload_completed" { return ashlsi3_out (insn, operands, NULL); @@ -3580,10 +3686,12 @@ (define_insn "*ashlpsi3" ;; >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> ;; arithmetic shift right -(define_insn "ashrqi3" - [(set (match_operand:QI 0 "register_operand" "=r,r,r,r,r ,r ,r") - (ashiftrt:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0 ,0 ,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,K,C03 C04 C05,C06 C07,Qm")))] +;; "ashrqi3" +;; "ashrqq3" "ashruqq3" +(define_insn "ashr<mode>3" + [(set (match_operand:ALL1 0 "register_operand" "=r,r,r,r,r ,r ,r") + (ashiftrt:ALL1 (match_operand:ALL1 1 "register_operand" "0,0,0,0,0 ,0 ,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,K,C03 C04 C05,C06 C07,Qm")))] "" { return ashrqi3_out (insn, operands, NULL); @@ -3592,10 +3700,13 @@ (define_insn "ashrqi3" (set_attr "adjust_len" "ashrqi") (set_attr "cc" "clobber,none,set_czn,set_czn,set_czn,clobber,clobber")]) -(define_insn "ashrhi3" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r,r,r") - (ashiftrt:HI (match_operand:HI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +;; "ashrhi3" +;; "ashrhq3" "ashruhq3" +;; "ashrha3" "ashruha3" +(define_insn "ashr<mode>3" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r,r,r") + (ashiftrt:ALL2 (match_operand:ALL2 1 
"register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return ashrhi3_out (insn, operands, NULL); @@ -3616,10 +3727,13 @@ (define_insn "ashrpsi3" [(set_attr "adjust_len" "ashrpsi") (set_attr "cc" "clobber")]) -(define_insn "ashrsi3" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r") - (ashiftrt:SI (match_operand:SI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +;; "ashrsi3" +;; "ashrsq3" "ashrusq3" +;; "ashrsa3" "ashrusa3" +(define_insn "ashr<mode>3" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r,r,r,r") + (ashiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return ashrsi3_out (insn, operands, NULL); @@ -3632,19 +3746,23 @@ (define_insn "ashrsi3" (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:HI 0 "register_operand" "") - (ashiftrt:HI (match_operand:HI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL2 0 "register_operand" "") + (ashiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (ashiftrt:HI (match_dup 1) (match_dup 2))) - (clobber (match_dup 3))])] - "") - -(define_insn "*ashrhi3_const" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r") - (ashiftrt:HI (match_operand:HI 1 "register_operand" "0,0,r,0,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] + [(parallel [(set (match_dup 0) + (ashiftrt:ALL2 (match_dup 1) + (match_dup 2))) + (clobber (match_dup 3))])]) + +;; "*ashrhi3_const" +;; "*ashrhq3_const" "*ashruhq3_const" +;; "*ashrha3_const" "*ashruha3_const" +(define_insn "*ashr<mode>3_const" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r") + (ashiftrt:ALL2 (match_operand:ALL2 1 "register_operand" 
"0,0,r,0,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] "reload_completed" { return ashrhi3_out (insn, operands, NULL); @@ -3655,19 +3773,23 @@ (define_insn "*ashrhi3_const" (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:SI 0 "register_operand" "") - (ashiftrt:SI (match_operand:SI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL4 0 "register_operand" "") + (ashiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (ashiftrt:SI (match_dup 1) (match_dup 2))) - (clobber (match_dup 3))])] - "") - -(define_insn "*ashrsi3_const" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r") - (ashiftrt:SI (match_operand:SI 1 "register_operand" "0,0,r,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,&d"))] + [(parallel [(set (match_dup 0) + (ashiftrt:ALL4 (match_dup 1) + (match_dup 2))) + (clobber (match_dup 3))])]) + +;; "*ashrsi3_const" +;; "*ashrsq3_const" "*ashrusq3_const" +;; "*ashrsa3_const" "*ashrusa3_const" +(define_insn "*ashr<mode>3_const" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r") + (ashiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,r,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,&d"))] "reload_completed" { return ashrsi3_out (insn, operands, NULL); @@ -3679,44 +3801,59 @@ (define_insn "*ashrsi3_const" ;; >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> ;; logical shift right -(define_expand "lshrqi3" - [(set (match_operand:QI 0 "register_operand" "") - (lshiftrt:QI (match_operand:QI 1 "register_operand" "") - (match_operand:QI 2 "nop_general_operand" "")))]) +;; "lshrqi3" +;; "lshrqq3 "lshruqq3" +(define_expand "lshr<mode>3" + [(set (match_operand:ALL1 0 "register_operand" "") + (lshiftrt:ALL1 
(match_operand:ALL1 1 "register_operand" "") + (match_operand:QI 2 "nop_general_operand" "")))]) (define_split ; lshrqi3_const4 - [(set (match_operand:QI 0 "d_register_operand" "") - (lshiftrt:QI (match_dup 0) - (const_int 4)))] + [(set (match_operand:ALL1 0 "d_register_operand" "") + (lshiftrt:ALL1 (match_dup 0) + (const_int 4)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int 15)))] - "") + [(set (match_dup 1) + (rotate:QI (match_dup 1) + (const_int 4))) + (set (match_dup 1) + (and:QI (match_dup 1) + (const_int 15)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) (define_split ; lshrqi3_const5 - [(set (match_operand:QI 0 "d_register_operand" "") - (lshiftrt:QI (match_dup 0) - (const_int 5)))] - "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 1))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int 7)))] - "") + [(set (match_operand:ALL1 0 "d_register_operand" "") + (lshiftrt:ALL1 (match_dup 0) + (const_int 5)))] + "" + [(set (match_dup 1) (rotate:QI (match_dup 1) (const_int 4))) + (set (match_dup 1) (lshiftrt:QI (match_dup 1) (const_int 1))) + (set (match_dup 1) (and:QI (match_dup 1) (const_int 7)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) (define_split ; lshrqi3_const6 [(set (match_operand:QI 0 "d_register_operand" "") (lshiftrt:QI (match_dup 0) (const_int 6)))] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 2))) - (set (match_dup 0) (and:QI (match_dup 0) (const_int 3)))] - "") + [(set (match_dup 1) (rotate:QI (match_dup 1) (const_int 4))) + (set (match_dup 1) (lshiftrt:QI (match_dup 1) (const_int 2))) + (set (match_dup 1) (and:QI (match_dup 1) (const_int 3)))] + { + operands[1] = avr_to_int_mode (operands[0]); + }) -(define_insn "*lshrqi3" - [(set (match_operand:QI 0 "register_operand" "=r,r,r,r,!d,r,r") - 
(lshiftrt:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0 ,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,K,n ,n,Qm")))] +;; "*lshrqi3" +;; "*lshrqq3" +;; "*lshruqq3" +(define_insn "*lshr<mode>3" + [(set (match_operand:ALL1 0 "register_operand" "=r,r,r,r,!d,r,r") + (lshiftrt:ALL1 (match_operand:ALL1 1 "register_operand" "0,0,0,0,0 ,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,K,n ,n,Qm")))] "" { return lshrqi3_out (insn, operands, NULL); @@ -3725,10 +3862,13 @@ (define_insn "*lshrqi3" (set_attr "adjust_len" "lshrqi") (set_attr "cc" "clobber,none,set_czn,set_czn,set_czn,set_czn,clobber")]) -(define_insn "lshrhi3" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r,r,r") - (lshiftrt:HI (match_operand:HI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +;; "lshrhi3" +;; "lshrhq3" "lshruhq3" +;; "lshrha3" "lshruha3" +(define_insn "lshr<mode>3" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r,r,r") + (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return lshrhi3_out (insn, operands, NULL); @@ -3749,10 +3889,13 @@ (define_insn "lshrpsi3" [(set_attr "adjust_len" "lshrpsi") (set_attr "cc" "clobber")]) -(define_insn "lshrsi3" - [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r") - (lshiftrt:SI (match_operand:SI 1 "register_operand" "0,0,0,r,0,0,0") - (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] +;; "lshrsi3" +;; "lshrsq3" "lshrusq3" +;; "lshrsa3" "lshrusa3" +(define_insn "lshr<mode>3" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r,r,r,r") + (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,0,r,0,0,0") + (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))] "" { return lshrsi3_out (insn, operands, NULL); @@ -3764,55 +3907,65 @@ (define_insn "lshrsi3" ;; Optimize if a scratch register from LD_REGS happens to be available. 
(define_peephole2 ; lshrqi3_l_const4 - [(set (match_operand:QI 0 "l_register_operand" "") - (lshiftrt:QI (match_dup 0) - (const_int 4))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (lshiftrt:ALL1 (match_dup 0) + (const_int 4))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) (set (match_dup 1) (const_int 15)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 ; lshrqi3_l_const5 - [(set (match_operand:QI 0 "l_register_operand" "") - (lshiftrt:QI (match_dup 0) - (const_int 5))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (lshiftrt:ALL1 (match_dup 0) + (const_int 5))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 1))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) + (set (match_dup 2) (lshiftrt:QI (match_dup 2) (const_int 1))) (set (match_dup 1) (const_int 7)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 ; lshrqi3_l_const6 - [(set (match_operand:QI 0 "l_register_operand" "") - (lshiftrt:QI (match_dup 0) - (const_int 6))) + [(set (match_operand:ALL1 0 "l_register_operand" "") + (lshiftrt:ALL1 (match_dup 0) + (const_int 6))) (match_scratch:QI 1 "d")] "" - [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4))) - (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 2))) + [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4))) + (set (match_dup 2) (lshiftrt:QI (match_dup 2) (const_int 2))) (set (match_dup 1) (const_int 3)) - (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))] - "") + (set (match_dup 2) 
(and:QI (match_dup 2) (match_dup 1)))] + { + operands[2] = avr_to_int_mode (operands[0]); + }) (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:HI 0 "register_operand" "") - (lshiftrt:HI (match_operand:HI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL2 0 "register_operand" "") + (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (lshiftrt:HI (match_dup 1) (match_dup 2))) - (clobber (match_dup 3))])] - "") - -(define_insn "*lshrhi3_const" - [(set (match_operand:HI 0 "register_operand" "=r,r,r,r,r") - (lshiftrt:HI (match_operand:HI 1 "register_operand" "0,0,r,0,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] + [(parallel [(set (match_dup 0) + (lshiftrt:ALL2 (match_dup 1) + (match_dup 2))) + (clobber (match_dup 3))])]) + +;; "*lshrhi3_const" +;; "*lshrhq3_const" "*lshruhq3_const" +;; "*lshrha3_const" "*lshruha3_const" +(define_insn "*lshr<mode>3_const" + [(set (match_operand:ALL2 0 "register_operand" "=r,r,r,r,r") + (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "0,0,r,0,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,K,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,X,&d"))] "reload_completed" { return lshrhi3_out (insn, operands, NULL); @@ -3823,19 +3976,23 @@ (define_insn "*lshrhi3_const" (define_peephole2 [(match_scratch:QI 3 "d") - (set (match_operand:SI 0 "register_operand" "") - (lshiftrt:SI (match_operand:SI 1 "register_operand" "") - (match_operand:QI 2 "const_int_operand" "")))] + (set (match_operand:ALL4 0 "register_operand" "") + (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "") + (match_operand:QI 2 "const_int_operand" "")))] "" - [(parallel [(set (match_dup 0) (lshiftrt:SI (match_dup 1) (match_dup 2))) - (clobber (match_dup 3))])] - "") - -(define_insn "*lshrsi3_const" - [(set (match_operand:SI 0 
"register_operand" "=r,r,r,r") - (lshiftrt:SI (match_operand:SI 1 "register_operand" "0,0,r,0") - (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) - (clobber (match_scratch:QI 3 "=X,X,X,&d"))] + [(parallel [(set (match_dup 0) + (lshiftrt:ALL4 (match_dup 1) + (match_dup 2))) + (clobber (match_dup 3))])]) + +;; "*lshrsi3_const" +;; "*lshrsq3_const" "*lshrusq3_const" +;; "*lshrsa3_const" "*lshrusa3_const" +(define_insn "*lshr<mode>3_const" + [(set (match_operand:ALL4 0 "register_operand" "=r,r,r,r") + (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,r,0") + (match_operand:QI 2 "const_int_operand" "L,P,O,n"))) + (clobber (match_scratch:QI 3 "=X,X,X,&d"))] "reload_completed" { return lshrsi3_out (insn, operands, NULL); @@ -4278,24 +4435,29 @@ (define_insn "*negated_tstsi" [(set_attr "cc" "compare") (set_attr "length" "4")]) -(define_insn "*reversed_tstsi" +;; "*reversed_tstsi" +;; "*reversed_tstsq" "*reversed_tstusq" +;; "*reversed_tstsa" "*reversed_tstusa" +(define_insn "*reversed_tst<mode>" [(set (cc0) - (compare (const_int 0) - (match_operand:SI 0 "register_operand" "r"))) - (clobber (match_scratch:QI 1 "=X"))] - "" - "cp __zero_reg__,%A0 - cpc __zero_reg__,%B0 - cpc __zero_reg__,%C0 - cpc __zero_reg__,%D0" + (compare (match_operand:ALL4 0 "const0_operand" "Y00") + (match_operand:ALL4 1 "register_operand" "r"))) + (clobber (match_scratch:QI 2 "=X"))] + "" + "cp __zero_reg__,%A1 + cpc __zero_reg__,%B1 + cpc __zero_reg__,%C1 + cpc __zero_reg__,%D1" [(set_attr "cc" "compare") (set_attr "length" "4")]) -(define_insn "*cmpqi" +;; "*cmpqi" +;; "*cmpqq" "*cmpuqq" +(define_insn "*cmp<mode>" [(set (cc0) - (compare (match_operand:QI 0 "register_operand" "r,r,d") - (match_operand:QI 1 "nonmemory_operand" "L,r,i")))] + (compare (match_operand:ALL1 0 "register_operand" "r ,r,d") + (match_operand:ALL1 1 "nonmemory_operand" "Y00,r,i")))] "" "@ tst %0 @@ -4313,11 +4475,14 @@ (define_insn "*cmpqi_sign_extend" [(set_attr "cc" "compare") (set_attr "length" "1")]) 
-(define_insn "*cmphi" +;; "*cmphi" +;; "*cmphq" "*cmpuhq" +;; "*cmpha" "*cmpuha" +(define_insn "*cmp<mode>" [(set (cc0) - (compare (match_operand:HI 0 "register_operand" "!w,r,r,d ,r ,d,r") - (match_operand:HI 1 "nonmemory_operand" "L ,L,r,s ,s ,M,n"))) - (clobber (match_scratch:QI 2 "=X ,X,X,&d,&d ,X,&d"))] + (compare (match_operand:ALL2 0 "register_operand" "!w ,r ,r,d ,r ,d,r") + (match_operand:ALL2 1 "nonmemory_operand" "Y00,Y00,r,s ,s ,M,n Ynn"))) + (clobber (match_scratch:QI 2 "=X ,X ,X,&d,&d ,X,&d"))] "" { switch (which_alternative) @@ -4330,11 +4495,15 @@ (define_insn "*cmphi" return "cp %A0,%A1\;cpc %B0,%B1"; case 3: + if (<MODE>mode != HImode) + break; return reg_unused_after (insn, operands[0]) ? "subi %A0,lo8(%1)\;sbci %B0,hi8(%1)" : "ldi %2,hi8(%1)\;cpi %A0,lo8(%1)\;cpc %B0,%2"; case 4: + if (<MODE>mode != HImode) + break; return "ldi %2,lo8(%1)\;cp %A0,%2\;ldi %2,hi8(%1)\;cpc %B0,%2"; } @@ -4374,11 +4543,14 @@ (define_insn "*cmppsi" (set_attr "length" "3,3,5,6,3,7") (set_attr "adjust_len" "tstpsi,*,*,*,compare,compare")]) -(define_insn "*cmpsi" +;; "*cmpsi" +;; "*cmpsq" "*cmpusq" +;; "*cmpsa" "*cmpusa" +(define_insn "*cmp<mode>" [(set (cc0) - (compare (match_operand:SI 0 "register_operand" "r,r ,d,r ,r") - (match_operand:SI 1 "nonmemory_operand" "L,r ,M,M ,n"))) - (clobber (match_scratch:QI 2 "=X,X ,X,&d,&d"))] + (compare (match_operand:ALL4 0 "register_operand" "r ,r ,d,r ,r") + (match_operand:ALL4 1 "nonmemory_operand" "Y00,r ,M,M ,n Ynn"))) + (clobber (match_scratch:QI 2 "=X ,X ,X,&d,&d"))] "" { if (0 == which_alternative) @@ -4398,55 +4570,33 @@ (define_insn "*cmpsi" ;; ---------------------------------------------------------------------- ;; Conditional jump instructions -(define_expand "cbranchsi4" - [(parallel [(set (cc0) - (compare (match_operand:SI 1 "register_operand" "") - (match_operand:SI 2 "nonmemory_operand" ""))) - (clobber (match_scratch:QI 4 ""))]) +;; "cbranchqi4" +;; "cbranchqq4" "cbranchuqq4" +(define_expand "cbranch<mode>4" + 
[(set (cc0) + (compare (match_operand:ALL1 1 "register_operand" "") + (match_operand:ALL1 2 "nonmemory_operand" ""))) (set (pc) (if_then_else - (match_operator 0 "ordered_comparison_operator" [(cc0) - (const_int 0)]) - (label_ref (match_operand 3 "" "")) - (pc)))] - "") - -(define_expand "cbranchpsi4" - [(parallel [(set (cc0) - (compare (match_operand:PSI 1 "register_operand" "") - (match_operand:PSI 2 "nonmemory_operand" ""))) - (clobber (match_scratch:QI 4 ""))]) - (set (pc) - (if_then_else (match_operator 0 "ordered_comparison_operator" [(cc0) - (const_int 0)]) - (label_ref (match_operand 3 "" "")) - (pc)))] - "") + (match_operator 0 "ordered_comparison_operator" [(cc0) + (const_int 0)]) + (label_ref (match_operand 3 "" "")) + (pc)))]) -(define_expand "cbranchhi4" +;; "cbranchhi4" "cbranchhq4" "cbranchuhq4" "cbranchha4" "cbranchuha4" +;; "cbranchsi4" "cbranchsq4" "cbranchusq4" "cbranchsa4" "cbranchusa4" +;; "cbranchpsi4" +(define_expand "cbranch<mode>4" [(parallel [(set (cc0) - (compare (match_operand:HI 1 "register_operand" "") - (match_operand:HI 2 "nonmemory_operand" ""))) + (compare (match_operand:ORDERED234 1 "register_operand" "") + (match_operand:ORDERED234 2 "nonmemory_operand" ""))) (clobber (match_scratch:QI 4 ""))]) (set (pc) (if_then_else - (match_operator 0 "ordered_comparison_operator" [(cc0) - (const_int 0)]) - (label_ref (match_operand 3 "" "")) - (pc)))] - "") - -(define_expand "cbranchqi4" - [(set (cc0) - (compare (match_operand:QI 1 "register_operand" "") - (match_operand:QI 2 "nonmemory_operand" ""))) - (set (pc) - (if_then_else - (match_operator 0 "ordered_comparison_operator" [(cc0) - (const_int 0)]) - (label_ref (match_operand 3 "" "")) - (pc)))] - "") + (match_operator 0 "ordered_comparison_operator" [(cc0) + (const_int 0)]) + (label_ref (match_operand 3 "" "")) + (pc)))]) ;; Test a single bit in a QI/HI/SImode register. 
@@ -4477,7 +4627,7 @@ (define_insn "*sbrx_branch<mode>" (const_int 4)))) (set_attr "cc" "clobber")]) -;; Same test based on Bitwise AND RTL. Keep this incase gcc changes patterns. +;; Same test based on bitwise AND. Keep this in case gcc changes patterns. ;; or for old peepholes. ;; Fixme - bitwise Mask will not work for DImode @@ -4492,12 +4642,12 @@ (define_insn "*sbrx_and_branch<mode>" (label_ref (match_operand 3 "" "")) (pc)))] "" -{ + { HOST_WIDE_INT bitnumber; bitnumber = exact_log2 (GET_MODE_MASK (<MODE>mode) & INTVAL (operands[2])); operands[2] = GEN_INT (bitnumber); return avr_out_sbxx_branch (insn, operands); -} + } [(set (attr "length") (if_then_else (and (ge (minus (pc) (match_dup 3)) (const_int -2046)) (le (minus (pc) (match_dup 3)) (const_int 2046))) @@ -4837,9 +4987,10 @@ (define_insn "*tablejump" (define_expand "casesi" - [(set (match_dup 6) - (minus:HI (subreg:HI (match_operand:SI 0 "register_operand" "") 0) - (match_operand:HI 1 "register_operand" ""))) + [(parallel [(set (match_dup 6) + (minus:HI (subreg:HI (match_operand:SI 0 "register_operand" "") 0) + (match_operand:HI 1 "register_operand" ""))) + (clobber (scratch:QI))]) (parallel [(set (cc0) (compare (match_dup 6) (match_operand:HI 2 "register_operand" ""))) @@ -5201,8 +5352,8 @@ (define_peephole ; "*dec-and-branchqi!=- (define_peephole ; "*cpse.eq" [(set (cc0) - (compare (match_operand:QI 1 "register_operand" "r,r") - (match_operand:QI 2 "reg_or_0_operand" "r,L"))) + (compare (match_operand:ALL1 1 "register_operand" "r,r") + (match_operand:ALL1 2 "reg_or_0_operand" "r,Y00"))) (set (pc) (if_then_else (eq (cc0) (const_int 0)) @@ -5236,8 +5387,8 @@ (define_peephole ; "*cpse.eq" (define_peephole ; "*cpse.ne" [(set (cc0) - (compare (match_operand:QI 1 "register_operand" "") - (match_operand:QI 2 "reg_or_0_operand" ""))) + (compare (match_operand:ALL1 1 "register_operand" "") + (match_operand:ALL1 2 "reg_or_0_operand" ""))) (set (pc) (if_then_else (ne (cc0) (const_int 0)) @@ -5246,7 +5397,7 @@ 
 (define_peephole ; "*cpse.ne"
   "!AVR_HAVE_JMP_CALL || !avr_current_device->errata_skip"
   {
-    if (operands[2] == const0_rtx)
+    if (operands[2] == CONST0_RTX (<MODE>mode))
       operands[2] = zero_reg_rtx;
 
     return 3 == avr_jump_mode (operands[0], insn)
@@ -6265,4 +6416,8 @@ (define_insn_and_split "*extzv.qihi2"
 })
 
 \f
+;; Fixed-point instructions
+(include "avr-fixed.md")
+
+;; Operations on 64-bit registers
 (include "avr-dimode.md")

Index: gcc/config/avr/avr-modes.def
===================================================================
--- gcc/config/avr/avr-modes.def	(revision 190535)
+++ gcc/config/avr/avr-modes.def	(working copy)
@@ -1 +1,28 @@
 FRACTIONAL_INT_MODE (PSI, 24, 3);
+
+/* On 8-bit machines, fixed-point routines need fewer instructions
+   if the decimal place is on a byte boundary, which is not the
+   default for signed accum types.  */
+
+ADJUST_IBIT (HA, 7);
+ADJUST_FBIT (HA, 8);
+
+ADJUST_IBIT (SA, 15);
+ADJUST_FBIT (SA, 16);
+
+ADJUST_IBIT (DA, 31);
+ADJUST_FBIT (DA, 32);
+
+/* Make TA and UTA 64 bits wide.
+   128-bit wide modes would be insane on an 8-bit machine.
+   This needs special treatment in avr.c and avr-lib.h.  */
+
+ADJUST_BYTESIZE (TA, 8);
+ADJUST_ALIGNMENT (TA, 1);
+ADJUST_IBIT (TA, 15);
+ADJUST_FBIT (TA, 48);
+
+ADJUST_BYTESIZE (UTA, 8);
+ADJUST_ALIGNMENT (UTA, 1);
+ADJUST_IBIT (UTA, 16);
+ADJUST_FBIT (UTA, 48);

Index: gcc/config/avr/avr-protos.h
===================================================================
--- gcc/config/avr/avr-protos.h	(revision 190535)
+++ gcc/config/avr/avr-protos.h	(working copy)
@@ -79,6 +79,9 @@ extern const char* avr_load_lpm (rtx, rt
 extern bool avr_rotate_bytes (rtx operands[]);
 
+extern const char* avr_out_fract (rtx, rtx[], bool, int*);
+extern rtx avr_to_int_mode (rtx);
+
 extern void expand_prologue (void);
 extern void expand_epilogue (bool);
 extern bool avr_emit_movmemhi (rtx*);
@@ -92,6 +95,8 @@ extern const char* avr_out_plus (rtx*, i
 extern const char* avr_out_plus_noclobber (rtx*, int*, int*);
 extern const char* avr_out_plus64 (rtx, int*);
 extern const char* avr_out_addto_sp (rtx*, int*);
+extern const char* avr_out_minus (rtx*, int*, int*);
+extern const char* avr_out_minus64 (rtx, int*);
 extern const char* avr_out_xload (rtx, rtx*, int*);
 extern const char* avr_out_movmem (rtx, rtx*, int*);
 extern const char* avr_out_insert_bits (rtx*, int*);

Index: gcc/config/avr/constraints.md
===================================================================
--- gcc/config/avr/constraints.md	(revision 190535)
+++ gcc/config/avr/constraints.md	(working copy)
@@ -192,3 +192,47 @@ (define_constraint "C0f"
   "32-bit integer constant where no nibble equals 0xf."
   (and (match_code "const_int")
        (match_test "!avr_has_nibble_0xf (op)")))
+
+;; CONST_FIXED is not covered by 'n', so cook up our own.
+;; "i" or "s" would match, but because the insns use iterators that
+;; also cover INT_MODEs, "i" or "s" is not always possible.
+
+(define_constraint "Ynn"
+  "Fixed-point constant known at compile time."
+ (match_code "const_fixed")) + +(define_constraint "Y00" + "Fixed-point or integer constant with bit representation 0x0" + (and (match_code "const_fixed,const_int") + (match_test "op == CONST0_RTX (GET_MODE (op))"))) + +(define_constraint "Y01" + "Fixed-point or integer constant with bit representation 0x1" + (ior (and (match_code "const_fixed") + (match_test "1 == INTVAL (avr_to_int_mode (op))")) + (match_test "satisfies_constraint_P (op)"))) + +(define_constraint "Ym1" + "Fixed-point or integer constant with bit representation -0x1" + (ior (and (match_code "const_fixed") + (match_test "-1 == INTVAL (avr_to_int_mode (op))")) + (match_test "satisfies_constraint_N (op)"))) + +(define_constraint "Y02" + "Fixed-point or integer constant with bit representation 0x2" + (ior (and (match_code "const_fixed") + (match_test "2 == INTVAL (avr_to_int_mode (op))")) + (match_test "satisfies_constraint_K (op)"))) + +(define_constraint "Ym2" + "Fixed-point or integer constant with bit representation -0x2" + (ior (and (match_code "const_fixed") + (match_test "-2 == INTVAL (avr_to_int_mode (op))")) + (match_test "satisfies_constraint_Cm2 (op)"))) + +;; Similar to "IJ" used with ADIW/SBIW, but for CONST_FIXED. + +(define_constraint "YIJ" + "Fixed-point constant from @minus{}0x003f to 0x003f." + (and (match_code "const_fixed") + (match_test "IN_RANGE (INTVAL (avr_to_int_mode (op)), -63, 63)"))) Index: gcc/config/avr/avr.c =================================================================== --- gcc/config/avr/avr.c (revision 190535) +++ gcc/config/avr/avr.c (working copy) @@ -49,6 +49,10 @@ #include "params.h" #include "df.h" +#ifndef CONST_FIXED_P +#define CONST_FIXED_P(X) (CONST_FIXED == GET_CODE (X)) +#endif + /* Maximal allowed offset for an address in the LD command */ #define MAX_LD_OFFSET(MODE) (64 - (signed)GET_MODE_SIZE (MODE)) @@ -264,6 +268,23 @@ avr_popcount_each_byte (rtx xval, int n_ return true; } + +/* Access some RTX as INT_MODE. 
If X is a CONST_FIXED we can get + the bit representation of X by "casting" it to CONST_INT. */ + +rtx +avr_to_int_mode (rtx x) +{ + enum machine_mode mode = GET_MODE (x); + + return VOIDmode == mode + ? x + : simplify_gen_subreg (int_mode_for_mode (mode), x, mode, 0); +} + + +/* Implement `TARGET_OPTION_OVERRIDE'. */ + static void avr_option_override (void) { @@ -389,9 +410,14 @@ avr_regno_reg_class (int r) } +/* Implement `TARGET_SCALAR_MODE_SUPPORTED_P'. */ + static bool avr_scalar_mode_supported_p (enum machine_mode mode) { + if (ALL_FIXED_POINT_MODE_P (mode)) + return true; + if (PSImode == mode) return true; @@ -715,6 +741,58 @@ avr_initial_elimination_offset (int from } } + +/* Helper for the function below. */ + +static void +avr_adjust_type_node (tree *node, enum machine_mode mode, int sat_p) +{ + *node = make_node (FIXED_POINT_TYPE); + TYPE_SATURATING (*node) = sat_p; + TYPE_UNSIGNED (*node) = UNSIGNED_FIXED_POINT_MODE_P (mode); + TYPE_IBIT (*node) = GET_MODE_IBIT (mode); + TYPE_FBIT (*node) = GET_MODE_FBIT (mode); + TYPE_PRECISION (*node) = GET_MODE_BITSIZE (mode); + TYPE_ALIGN (*node) = 8; + SET_TYPE_MODE (*node, mode); + + layout_type (*node); +} + + +/* Implement `TARGET_BUILD_BUILTIN_VA_LIST'. */ + +static tree +avr_build_builtin_va_list (void) +{ + /* avr-modes.def adjusts [U]TA to be 64-bit modes with 48 fractional bits. + This is more appropriate for the 8-bit machine AVR than 128-bit modes. + The ADJUST_IBIT/FBIT are handled in toplev:init_adjust_machine_modes() + which is auto-generated by genmodes, but the compiler assigns [U]DAmode + to the long long accum modes instead of the desired [U]TAmode. + + Fix this now, right after node setup in tree.c:build_common_tree_nodes(). + This must run before c-cppbuiltin.c:builtin_define_fixed_point_constants() + which built-in defines macros like __ULLACCUM_FBIT__ that are used by + libgcc to detect IBIT and FBIT. 
*/ + + avr_adjust_type_node (&ta_type_node, TAmode, 0); + avr_adjust_type_node (&uta_type_node, UTAmode, 0); + avr_adjust_type_node (&sat_ta_type_node, TAmode, 1); + avr_adjust_type_node (&sat_uta_type_node, UTAmode, 1); + + unsigned_long_long_accum_type_node = uta_type_node; + long_long_accum_type_node = ta_type_node; + sat_unsigned_long_long_accum_type_node = sat_uta_type_node; + sat_long_long_accum_type_node = sat_ta_type_node; + + /* Dispatch to the default handler. */ + + return std_build_builtin_va_list (); +} + + +/* Implement `TARGET_BUILTIN_SETJMP_FRAME_VALUE'. */ /* Actual start of frame is virtual_stack_vars_rtx this is offset from frame pointer by +STARTING_FRAME_OFFSET. Using saved frame = virtual_stack_vars_rtx - STARTING_FRAME_OFFSET @@ -723,10 +801,13 @@ avr_initial_elimination_offset (int from static rtx avr_builtin_setjmp_frame_value (void) { - return gen_rtx_MINUS (Pmode, virtual_stack_vars_rtx, - gen_int_mode (STARTING_FRAME_OFFSET, Pmode)); + rtx xval = gen_reg_rtx (Pmode); + emit_insn (gen_subhi3 (xval, virtual_stack_vars_rtx, + gen_int_mode (STARTING_FRAME_OFFSET, Pmode))); + return xval; } + /* Return contents of MEM at frame pointer + stack size + 1 (+2 if 3 byte PC). This is return address of function. 
   */
 
rtx
@@ -1580,7 +1661,7 @@ avr_legitimate_address_p (enum machine_m
                               MEM, strict);
 
   if (strict
-      && DImode == mode
+      && GET_MODE_SIZE (mode) > 4
       && REG_X == REGNO (x))
     {
       ok = false;
@@ -2081,6 +2162,14 @@ avr_print_operand (FILE *file, rtx x, in
 	  /* Use normal symbol for direct address no linker trampoline needed */
 	  output_addr_const (file, x);
 	}
+      else if (GET_CODE (x) == CONST_FIXED)
+	{
+	  HOST_WIDE_INT ival = INTVAL (avr_to_int_mode (x));
+	  if (code != 0)
+	    output_operand_lossage ("Unsupported code '%c' for fixed-point:",
+				    code);
+	  fprintf (file, HOST_WIDE_INT_PRINT_DEC, ival);
+	}
       else if (GET_CODE (x) == CONST_DOUBLE)
 	{
 	  long val;
@@ -2116,6 +2205,7 @@ notice_update_cc (rtx body ATTRIBUTE_UNU
 
     case CC_OUT_PLUS:
     case CC_OUT_PLUS_NOCLOBBER:
+    case CC_MINUS:
     case CC_LDI:
       {
 	rtx *op = recog_data.operand;
@@ -2139,6 +2229,11 @@ notice_update_cc (rtx body ATTRIBUTE_UNU
 	  cc = (enum attr_cc) icc;
 	  break;
 
+	case CC_MINUS:
+	  avr_out_minus (op, &len_dummy, &icc);
+	  cc = (enum attr_cc) icc;
+	  break;
+
 	case CC_LDI:
 
 	  cc = (op[1] == CONST0_RTX (GET_MODE (op[0]))
@@ -2779,9 +2874,11 @@ output_movqi (rtx insn, rtx operands[],
   if (real_l)
     *real_l = 1;
 
-  if (register_operand (dest, QImode))
+  gcc_assert (1 == GET_MODE_SIZE (GET_MODE (dest)));
+
+  if (REG_P (dest))
     {
-      if (register_operand (src, QImode)) /* mov r,r */
+      if (REG_P (src)) /* mov r,r */
 	{
 	  if (test_hard_reg_class (STACK_REG, dest))
 	    return "out %0,%1";
@@ -2803,7 +2900,7 @@ output_movqi (rtx insn, rtx operands[],
       rtx xop[2];
 
       xop[0] = dest;
-      xop[1] = src == const0_rtx ? zero_reg_rtx : src;
+      xop[1] = src == CONST0_RTX (GET_MODE (dest)) ? zero_reg_rtx : src;
 
       return out_movqi_mr_r (insn, xop, real_l);
     }
@@ -2825,6 +2922,8 @@ output_movhi (rtx insn, rtx xop[], int *
       return avr_out_lpm (insn, xop, plen);
     }
 
+  gcc_assert (2 == GET_MODE_SIZE (GET_MODE (dest)));
+
   if (REG_P (dest))
     {
       if (REG_P (src)) /* mov r,r */
@@ -2843,7 +2942,6 @@ output_movhi (rtx insn, rtx xop[], int *
	      return TARGET_NO_INTERRUPTS ?
avr_asm_len ("out __SP_H__,%B1" CR_TAB "out __SP_L__,%A1", xop, plen, -2) - : avr_asm_len ("in __tmp_reg__,__SREG__" CR_TAB "cli" CR_TAB "out __SP_H__,%B1" CR_TAB @@ -2880,7 +2978,7 @@ output_movhi (rtx insn, rtx xop[], int * rtx xop[2]; xop[0] = dest; - xop[1] = src == const0_rtx ? zero_reg_rtx : src; + xop[1] = src == CONST0_RTX (GET_MODE (dest)) ? zero_reg_rtx : src; return out_movhi_mr_r (insn, xop, plen); } @@ -3403,9 +3501,10 @@ output_movsisf (rtx insn, rtx operands[] if (!l) l = &dummy; - if (register_operand (dest, VOIDmode)) + gcc_assert (4 == GET_MODE_SIZE (GET_MODE (dest))); + if (REG_P (dest)) { - if (register_operand (src, VOIDmode)) /* mov r,r */ + if (REG_P (src)) /* mov r,r */ { if (true_regnum (dest) > true_regnum (src)) { @@ -3440,10 +3539,10 @@ output_movsisf (rtx insn, rtx operands[] { return output_reload_insisf (operands, NULL_RTX, real_l); } - else if (GET_CODE (src) == MEM) + else if (MEM_P (src)) return out_movsi_r_mr (insn, operands, real_l); /* mov r,m */ } - else if (GET_CODE (dest) == MEM) + else if (MEM_P (dest)) { const char *templ; @@ -4126,14 +4225,25 @@ avr_out_compare (rtx insn, rtx *xop, int rtx xval = xop[1]; /* MODE of the comparison. */ - enum machine_mode mode = GET_MODE (xreg); + enum machine_mode mode; /* Number of bytes to operate on. */ - int i, n_bytes = GET_MODE_SIZE (mode); + int i, n_bytes = GET_MODE_SIZE (GET_MODE (xreg)); /* Value (0..0xff) held in clobber register xop[2] or -1 if unknown. */ int clobber_val = -1; + /* Map fixed mode operands to integer operands with the same binary + representation. They are easier to handle in the remainder. 
*/ + + if (CONST_FIXED == GET_CODE (xval)) + { + xreg = avr_to_int_mode (xop[0]); + xval = avr_to_int_mode (xop[1]); + } + + mode = GET_MODE (xreg); + gcc_assert (REG_P (xreg)); gcc_assert ((CONST_INT_P (xval) && n_bytes <= 4) || (const_double_operand (xval, VOIDmode) && n_bytes == 8)); @@ -4143,7 +4253,7 @@ avr_out_compare (rtx insn, rtx *xop, int /* Comparisons == +/-1 and != +/-1 can be done similar to camparing against 0 by ORing the bytes. This is one instruction shorter. - Notice that DImode comparisons are always against reg:DI 18 + Notice that 64-bit comparisons are always against reg:ALL8 18 (ACC_A) and therefore don't use this. */ if (!test_hard_reg_class (LD_REGS, xreg) @@ -5884,6 +5994,9 @@ avr_out_plus_1 (rtx *xop, int *plen, enu /* MODE of the operation. */ enum machine_mode mode = GET_MODE (xop[0]); + /* INT_MODE of the same size. */ + enum machine_mode imode = int_mode_for_mode (mode); + /* Number of bytes to operate on. */ int i, n_bytes = GET_MODE_SIZE (mode); @@ -5908,8 +6021,11 @@ avr_out_plus_1 (rtx *xop, int *plen, enu *pcc = (MINUS == code) ? CC_SET_CZN : CC_CLOBBER; + if (CONST_FIXED_P (xval)) + xval = avr_to_int_mode (xval); + if (MINUS == code) - xval = simplify_unary_operation (NEG, mode, xval, mode); + xval = simplify_unary_operation (NEG, imode, xval, imode); op[2] = xop[3]; @@ -5920,7 +6036,7 @@ avr_out_plus_1 (rtx *xop, int *plen, enu { /* We operate byte-wise on the destination. */ rtx reg8 = simplify_gen_subreg (QImode, xop[0], mode, i); - rtx xval8 = simplify_gen_subreg (QImode, xval, mode, i); + rtx xval8 = simplify_gen_subreg (QImode, xval, imode, i); /* 8-bit value to operate with this byte. 
*/ unsigned int val8 = UINTVAL (xval8) & GET_MODE_MASK (QImode); @@ -5941,7 +6057,7 @@ avr_out_plus_1 (rtx *xop, int *plen, enu && i + 2 <= n_bytes && test_hard_reg_class (ADDW_REGS, reg8)) { - rtx xval16 = simplify_gen_subreg (HImode, xval, mode, i); + rtx xval16 = simplify_gen_subreg (HImode, xval, imode, i); unsigned int val16 = UINTVAL (xval16) & GET_MODE_MASK (HImode); /* Registers R24, X, Y, Z can use ADIW/SBIW with constants < 64 @@ -6085,6 +6201,41 @@ avr_out_plus_noclobber (rtx *xop, int *p } +/* Output subtraction of register XOP[0] and compile time constant XOP[2]: + + XOP[0] = XOP[0] - XOP[2] + + This is basically the same as `avr_out_plus' except that we subtract. + It's needed because (minus x const) is not mapped to (plus x -const) + for the fixed point modes. */ + +const char* +avr_out_minus (rtx *xop, int *plen, int *pcc) +{ + rtx op[4]; + + if (pcc) + *pcc = (int) CC_SET_CZN; + + if (REG_P (xop[2])) + return avr_asm_len ("sub %A0,%A2" CR_TAB + "sbc %B0,%B2", xop, plen, -2); + + if (!CONST_INT_P (xop[2]) + && !CONST_FIXED_P (xop[2])) + return avr_asm_len ("subi %A0,lo8(%2)" CR_TAB + "sbci %B0,hi8(%2)", xop, plen, -2); + + op[0] = avr_to_int_mode (xop[0]); + op[1] = avr_to_int_mode (xop[1]); + op[2] = gen_int_mode (-INTVAL (avr_to_int_mode (xop[2])), + GET_MODE (op[0])); + op[3] = xop[3]; + + return avr_out_plus (op, plen, pcc); +} + + /* Prepare operands of adddi3_const_insn to be used with avr_out_plus_1. */ const char* @@ -6103,6 +6254,19 @@ avr_out_plus64 (rtx addend, int *plen) return ""; } + +/* Prepare operands of subdi3_const_insn to be used with avr_out_plus64. 
*/ + +const char* +avr_out_minus64 (rtx subtrahend, int *plen) +{ + rtx xneg = avr_to_int_mode (subtrahend); + xneg = simplify_unary_operation (NEG, DImode, xneg, DImode); + + return avr_out_plus64 (xneg, plen); +} + + /* Output bit operation (IOR, AND, XOR) with register XOP[0] and compile time constant XOP[2]: @@ -6442,6 +6606,349 @@ avr_rotate_bytes (rtx operands[]) return true; } + +/* Outputs instructions needed for fixed point type conversion. + This includes converting between any fixed point type, as well + as converting to any integer type. Conversion between integer + types is not supported. + + The number of instructions generated depends on the types + being converted and the registers assigned to them. + + The number of instructions required to complete the conversion + is least if the registers for source and destination are overlapping + and are aligned at the decimal place as actual movement of data is + completely avoided. In some cases, the conversion may already be + complete without any instructions needed. + + When converting to signed types from signed types, sign extension + is implemented. + + Converting signed fractional types requires a bit shift if converting + to or from any unsigned fractional type because the decimal place is + shifted by 1 bit. When the destination is a signed fractional, the sign + is stored in either the carry or T bit. 
*/ + +const char* +avr_out_fract (rtx insn, rtx operands[], bool intsigned, int *plen) +{ + int i; + bool sbit[2]; + /* ilen: Length of integral part (in bytes) + flen: Length of fractional part (in bytes) + tlen: Length of operand (in bytes) + blen: Length of operand (in bits) */ + int ilen[2], flen[2], tlen[2], blen[2]; + int rdest, rsource, offset; + int start, end, dir; + bool sign_in_T = false, sign_in_Carry = false, sign_done = false; + bool widening_sign_extend = false; + int clrword = -1, lastclr = 0, clr = 0; + rtx xop[6]; + + const int dest = 0; + const int src = 1; + + xop[dest] = operands[dest]; + xop[src] = operands[src]; + + if (plen) + *plen = 0; + + /* Determine format (integer and fractional parts) + of types needing conversion. */ + + for (i = 0; i < 2; i++) + { + enum machine_mode mode = GET_MODE (xop[i]); + + tlen[i] = GET_MODE_SIZE (mode); + blen[i] = GET_MODE_BITSIZE (mode); + + if (SCALAR_INT_MODE_P (mode)) + { + sbit[i] = intsigned; + ilen[i] = GET_MODE_SIZE (mode); + flen[i] = 0; + } + else if (ALL_SCALAR_FIXED_POINT_MODE_P (mode)) + { + sbit[i] = SIGNED_SCALAR_FIXED_POINT_MODE_P (mode); + ilen[i] = (GET_MODE_IBIT (mode) + 1) / 8; + flen[i] = (GET_MODE_FBIT (mode) + 1) / 8; + } + else + fatal_insn ("unsupported fixed-point conversion", insn); + } + + /* Perform sign extension if source and dest are both signed, + and there are more integer parts in dest than in source. */ + + widening_sign_extend = sbit[dest] && sbit[src] && ilen[dest] > ilen[src]; + + rdest = REGNO (xop[dest]); + rsource = REGNO (xop[src]); + offset = flen[src] - flen[dest]; + + /* Position of MSB resp. sign bit. */ + + xop[2] = GEN_INT (blen[dest] - 1); + xop[3] = GEN_INT (blen[src] - 1); + + /* Store the sign bit if the destination is a signed fract and the source + has a sign in the integer part. 
*/ + + if (sbit[dest] && ilen[dest] == 0 && sbit[src] && ilen[src] > 0) + { + /* To avoid using BST and BLD if the source and destination registers + overlap or the source is unused after, we can use LSL to store the + sign bit in carry since we don't need the integral part of the source. + Restoring the sign from carry saves one BLD instruction below. */ + + if (reg_unused_after (insn, xop[src]) + || (rdest < rsource + tlen[src] + && rdest + tlen[dest] > rsource)) + { + avr_asm_len ("lsl %T1%t3", xop, plen, 1); + sign_in_Carry = true; + } + else + { + avr_asm_len ("bst %T1%T3", xop, plen, 1); + sign_in_T = true; + } + } + + /* Pick the correct direction to shift bytes. */ + + if (rdest < rsource + offset) + { + dir = 1; + start = 0; + end = tlen[dest]; + } + else + { + dir = -1; + start = tlen[dest] - 1; + end = -1; + } + + /* Perform conversion by moving registers into place, clearing + destination registers that do not overlap with any source. */ + + for (i = start; i != end; i += dir) + { + int destloc = rdest + i; + int sourceloc = rsource + i + offset; + + /* Source register location is outside range of source register, + so clear this byte in the dest. */ + + if (sourceloc < rsource + || sourceloc >= rsource + tlen[src]) + { + if (AVR_HAVE_MOVW + && i + dir != end + && (sourceloc + dir < rsource + || sourceloc + dir >= rsource + tlen[src]) + && ((dir == 1 && !(destloc % 2) && !(sourceloc % 2)) + || (dir == -1 && (destloc % 2) && (sourceloc % 2))) + && clrword != -1) + { + /* Use already cleared word to clear two bytes at a time. */ + + int even_i = i & ~1; + int even_clrword = clrword & ~1; + + xop[4] = GEN_INT (8 * even_i); + xop[5] = GEN_INT (8 * even_clrword); + avr_asm_len ("movw %T0%t4,%T0%t5", xop, plen, 1); + i += dir; + } + else + { + if (i == tlen[dest] - 1 + && widening_sign_extend + && blen[src] - 1 - 8 * offset < 0) + { + /* The SBRC below that sign-extends would come + up with a negative bit number because the sign + bit is out of reach. 
Also avoid some early-clobber
+                     situations because of premature CLR.  */
+
+                  if (reg_unused_after (insn, xop[src]))
+                    avr_asm_len ("lsl %T1%t3" CR_TAB
+                                 "sbc %T0%t2,%T0%t2", xop, plen, 2);
+                  else
+                    avr_asm_len ("mov __tmp_reg__,%T1%t3" CR_TAB
+                                 "lsl __tmp_reg__"        CR_TAB
+                                 "sbc %T0%t2,%T0%t2", xop, plen, 3);
+                  sign_done = true;
+
+                  continue;
+                }
+
+              /* Do not clear the register if it is going to get
+                 sign extended with a MOV later.  */
+
+              if (sbit[dest] && sbit[src]
+                  && i != tlen[dest] - 1
+                  && i >= flen[dest])
+                {
+                  continue;
+                }
+
+              xop[4] = GEN_INT (8 * i);
+              avr_asm_len ("clr %T0%t4", xop, plen, 1);
+
+              /* If the last byte was cleared too, we have a cleared
+                 word we can MOVW to clear two bytes at a time.  */
+
+              if (lastclr)
+                clrword = i;
+
+              clr = 1;
+            }
+        }
+      else if (destloc == sourceloc)
+        {
+          /* Source byte is already in destination:  Nothing needed.  */
+
+          continue;
+        }
+      else
+        {
+          /* Registers do not line up and source register location
+             is within range:  Perform move, shifting with MOV or MOVW.  */
+
+          if (AVR_HAVE_MOVW
+              && i + dir != end
+              && sourceloc + dir >= rsource
+              && sourceloc + dir < rsource + tlen[src]
+              && ((dir == 1 && !(destloc % 2) && !(sourceloc % 2))
+                  || (dir == -1 && (destloc % 2) && (sourceloc % 2))))
+            {
+              int even_i = i & ~1;
+              int even_i_plus_offset = (i + offset) & ~1;
+
+              xop[4] = GEN_INT (8 * even_i);
+              xop[5] = GEN_INT (8 * even_i_plus_offset);
+              avr_asm_len ("movw %T0%t4,%T1%t5", xop, plen, 1);
+              i += dir;
+            }
+          else
+            {
+              xop[4] = GEN_INT (8 * i);
+              xop[5] = GEN_INT (8 * (i + offset));
+              avr_asm_len ("mov %T0%t4,%T1%t5", xop, plen, 1);
+            }
+        }
+
+      lastclr = clr;
+      clr = 0;
+    }
+
+  /* Perform sign extension if source and dest are both signed,
+     and there are more integer parts in dest than in source.  */
+
+  if (widening_sign_extend)
+    {
+      if (!sign_done)
+        {
+          xop[4] = GEN_INT (blen[src] - 1 - 8 * offset);
+
+          /* Register was cleared above, so can become 0xff and extended.
+ Note: Instead of the CLR/SBRC/COM the sign extension could + be performed after the LSL below by means of a SBC if only + one byte has to be shifted left. */ + + avr_asm_len ("sbrc %T0%T4" CR_TAB + "com %T0%t2", xop, plen, 2); + } + + /* Sign extend additional bytes by MOV and MOVW. */ + + start = tlen[dest] - 2; + end = flen[dest] + ilen[src] - 1; + + for (i = start; i != end; i--) + { + if (AVR_HAVE_MOVW && i != start && i-1 != end) + { + i--; + xop[4] = GEN_INT (8 * i); + xop[5] = GEN_INT (8 * (tlen[dest] - 2)); + avr_asm_len ("movw %T0%t4,%T0%t5", xop, plen, 1); + } + else + { + xop[4] = GEN_INT (8 * i); + xop[5] = GEN_INT (8 * (tlen[dest] - 1)); + avr_asm_len ("mov %T0%t4,%T0%t5", xop, plen, 1); + } + } + } + + /* If destination is a signed fract, and the source was not, a shift + by 1 bit is needed. Also restore sign from carry or T. */ + + if (sbit[dest] && !ilen[dest] && (!sbit[src] || ilen[src])) + { + /* We have flen[src] non-zero fractional bytes to shift. + Because of the right shift, handle one byte more so that the + LSB won't be lost. */ + + int nonzero = flen[src] + 1; + + /* If the LSB is in the T flag and there are no fractional + bits, the high byte is zero and no shift needed. */ + + if (flen[src] == 0 && sign_in_T) + nonzero = 0; + + start = flen[dest] - 1; + end = start - nonzero; + + for (i = start; i > end && i >= 0; i--) + { + xop[4] = GEN_INT (8 * i); + if (i == start && !sign_in_Carry) + avr_asm_len ("lsr %T0%t4", xop, plen, 1); + else + avr_asm_len ("ror %T0%t4", xop, plen, 1); + } + + if (sign_in_T) + { + avr_asm_len ("bld %T0%T2", xop, plen, 1); + } + } + else if (sbit[src] && !ilen[src] && (!sbit[dest] || ilen[dest])) + { + /* If source was a signed fract and dest was not, shift 1 bit + other way. 
*/ + + start = flen[dest] - flen[src]; + + if (start < 0) + start = 0; + + for (i = start; i < flen[dest]; i++) + { + xop[4] = GEN_INT (8 * i); + + if (i == start) + avr_asm_len ("lsl %T0%t4", xop, plen, 1); + else + avr_asm_len ("rol %T0%t4", xop, plen, 1); + } + } + + return ""; +} + + /* Modifies the length assigned to instruction INSN LEN is the initially computed length of the insn. */ @@ -6489,6 +6996,8 @@ adjust_insn_length (rtx insn, int len) case ADJUST_LEN_OUT_PLUS: avr_out_plus (op, &len, NULL); break; case ADJUST_LEN_PLUS64: avr_out_plus64 (op[0], &len); break; + case ADJUST_LEN_MINUS: avr_out_minus (op, &len, NULL); break; + case ADJUST_LEN_MINUS64: avr_out_minus64 (op[0], &len); break; case ADJUST_LEN_OUT_PLUS_NOCLOBBER: avr_out_plus_noclobber (op, &len, NULL); break; @@ -6502,6 +7011,9 @@ adjust_insn_length (rtx insn, int len) case ADJUST_LEN_XLOAD: avr_out_xload (insn, op, &len); break; case ADJUST_LEN_LOAD_LPM: avr_load_lpm (insn, op, &len); break; + case ADJUST_LEN_SFRACT: avr_out_fract (insn, op, true, &len); break; + case ADJUST_LEN_UFRACT: avr_out_fract (insn, op, false, &len); break; + case ADJUST_LEN_TSTHI: avr_out_tsthi (insn, op, &len); break; case ADJUST_LEN_TSTPSI: avr_out_tstpsi (insn, op, &len); break; case ADJUST_LEN_TSTSI: avr_out_tstsi (insn, op, &len); break; @@ -6683,6 +7195,20 @@ avr_assemble_integer (rtx x, unsigned in return true; } + else if (CONST_FIXED_P (x)) + { + unsigned n; + + /* varasm fails to handle big fixed modes that don't fit in hwi. 
*/ + + for (n = 0; n < size; n++) + { + rtx xn = simplify_gen_subreg (QImode, x, GET_MODE (x), n); + default_assemble_integer (xn, 1, aligned_p); + } + + return true; + } return default_assemble_integer (x, size, aligned_p); } @@ -7489,6 +8015,7 @@ avr_operand_rtx_cost (rtx x, enum machin return 0; case CONST_INT: + case CONST_FIXED: case CONST_DOUBLE: return COSTS_N_INSNS (GET_MODE_SIZE (mode)); @@ -7518,6 +8045,7 @@ avr_rtx_costs_1 (rtx x, int codearg, int switch (code) { case CONST_INT: + case CONST_FIXED: case CONST_DOUBLE: case SYMBOL_REF: case CONST: @@ -8446,11 +8974,17 @@ avr_compare_pattern (rtx insn) if (pattern && NONJUMP_INSN_P (insn) && SET_DEST (pattern) == cc0_rtx - && GET_CODE (SET_SRC (pattern)) == COMPARE - && DImode != GET_MODE (XEXP (SET_SRC (pattern), 0)) - && DImode != GET_MODE (XEXP (SET_SRC (pattern), 1))) + && GET_CODE (SET_SRC (pattern)) == COMPARE) { - return pattern; + enum machine_mode mode0 = GET_MODE (XEXP (SET_SRC (pattern), 0)); + enum machine_mode mode1 = GET_MODE (XEXP (SET_SRC (pattern), 1)); + + /* The 64-bit comparisons have fixed operands ACC_A and ACC_B. + They must not be swapped, thus skip them. */ + + if ((mode0 == VOIDmode || GET_MODE_SIZE (mode0) <= 4) + && (mode1 == VOIDmode || GET_MODE_SIZE (mode1) <= 4)) + return pattern; } return NULL_RTX; @@ -8788,6 +9322,8 @@ avr_2word_insn_p (rtx insn) return false; case CODE_FOR_movqi_insn: + case CODE_FOR_movuqq_insn: + case CODE_FOR_movqq_insn: { rtx set = single_set (insn); rtx src = SET_SRC (set); @@ -8796,7 +9332,7 @@ avr_2word_insn_p (rtx insn) /* Factor out LDS and STS from movqi_insn. */ if (MEM_P (dest) - && (REG_P (src) || src == const0_rtx)) + && (REG_P (src) || src == CONST0_RTX (GET_MODE (dest)))) { return CONSTANT_ADDRESS_P (XEXP (dest, 0)); } @@ -9021,7 +9557,7 @@ output_reload_in_const (rtx *op, rtx clo if (NULL_RTX == clobber_reg && !test_hard_reg_class (LD_REGS, dest) - && (! (CONST_INT_P (src) || CONST_DOUBLE_P (src)) + && (! 
(CONST_INT_P (src) || CONST_FIXED_P (src) || CONST_DOUBLE_P (src)) || !avr_popcount_each_byte (src, n_bytes, (1 << 0) | (1 << 1) | (1 << 8)))) { @@ -9048,6 +9584,7 @@ output_reload_in_const (rtx *op, rtx clo ldreg_p = test_hard_reg_class (LD_REGS, xdest[n]); if (!CONST_INT_P (src) + && !CONST_FIXED_P (src) && !CONST_DOUBLE_P (src)) { static const char* const asm_code[][2] = @@ -9239,6 +9776,7 @@ output_reload_insisf (rtx *op, rtx clobb if (AVR_HAVE_MOVW && !test_hard_reg_class (LD_REGS, op[0]) && (CONST_INT_P (op[1]) + || CONST_FIXED_P (op[1]) || CONST_DOUBLE_P (op[1]))) { int len_clr, len_noclr; @@ -10834,6 +11372,12 @@ avr_fold_builtin (tree fndecl, int n_arg #undef TARGET_SCALAR_MODE_SUPPORTED_P #define TARGET_SCALAR_MODE_SUPPORTED_P avr_scalar_mode_supported_p +#undef TARGET_BUILD_BUILTIN_VA_LIST +#define TARGET_BUILD_BUILTIN_VA_LIST avr_build_builtin_va_list + +#undef TARGET_FIXED_POINT_SUPPORTED_P +#define TARGET_FIXED_POINT_SUPPORTED_P hook_bool_void_true + #undef TARGET_ADDR_SPACE_SUBSET_P #define TARGET_ADDR_SPACE_SUBSET_P avr_addr_space_subset_p Index: gcc/config/avr/avr.h =================================================================== --- gcc/config/avr/avr.h (revision 190535) +++ gcc/config/avr/avr.h (working copy) @@ -261,6 +261,7 @@ enum #define FLOAT_TYPE_SIZE 32 #define DOUBLE_TYPE_SIZE 32 #define LONG_DOUBLE_TYPE_SIZE 32 +#define LONG_LONG_ACCUM_TYPE_SIZE 64 #define DEFAULT_SIGNED_CHAR 1 Index: libgcc/config/avr/avr-lib.h =================================================================== --- libgcc/config/avr/avr-lib.h (revision 190620) +++ libgcc/config/avr/avr-lib.h (working copy) @@ -4,3 +4,79 @@ #define DI SI typedef int QItype __attribute__ ((mode (QI))); #endif + +/* fixed-bit.h does not define functions for TA and UTA because + that part is wrapped in #if MIN_UNITS_PER_WORD > 4. + This would lead to empty functions for TA and UTA. + Thus, supply appropriate defines as if HAVE_[U]TA == 1. 
+ #define HAVE_[U]TA 1 won't work because avr-modes.def + uses ADJUST_BYTESIZE(TA,8) and fixed-bit.h is not generic enough + to arrange for such changes of the mode size. */ + +typedef unsigned _Fract UTAtype __attribute__ ((mode (UTA))); + +#if defined (UTA_MODE) +#define FIXED_SIZE 8 /* in bytes */ +#define INT_C_TYPE UDItype +#define UINT_C_TYPE UDItype +#define HINT_C_TYPE USItype +#define HUINT_C_TYPE USItype +#define MODE_NAME UTA +#define MODE_NAME_S uta +#define MODE_UNSIGNED 1 +#endif + +#if defined (FROM_UTA) +#define FROM_TYPE 4 /* ID for fixed-point */ +#define FROM_MODE_NAME UTA +#define FROM_MODE_NAME_S uta +#define FROM_INT_C_TYPE UDItype +#define FROM_SINT_C_TYPE DItype +#define FROM_UINT_C_TYPE UDItype +#define FROM_MODE_UNSIGNED 1 +#define FROM_FIXED_SIZE 8 /* in bytes */ +#elif defined (TO_UTA) +#define TO_TYPE 4 /* ID for fixed-point */ +#define TO_MODE_NAME UTA +#define TO_MODE_NAME_S uta +#define TO_INT_C_TYPE UDItype +#define TO_SINT_C_TYPE DItype +#define TO_UINT_C_TYPE UDItype +#define TO_MODE_UNSIGNED 1 +#define TO_FIXED_SIZE 8 /* in bytes */ +#endif + +/* Same for TAmode */ + +typedef _Fract TAtype __attribute__ ((mode (TA))); + +#if defined (TA_MODE) +#define FIXED_SIZE 8 /* in bytes */ +#define INT_C_TYPE DItype +#define UINT_C_TYPE UDItype +#define HINT_C_TYPE SItype +#define HUINT_C_TYPE USItype +#define MODE_NAME TA +#define MODE_NAME_S ta +#define MODE_UNSIGNED 0 +#endif + +#if defined (FROM_TA) +#define FROM_TYPE 4 /* ID for fixed-point */ +#define FROM_MODE_NAME TA +#define FROM_MODE_NAME_S ta +#define FROM_INT_C_TYPE DItype +#define FROM_SINT_C_TYPE DItype +#define FROM_UINT_C_TYPE UDItype +#define FROM_MODE_UNSIGNED 0 +#define FROM_FIXED_SIZE 8 /* in bytes */ +#elif defined (TO_TA) +#define TO_TYPE 4 /* ID for fixed-point */ +#define TO_MODE_NAME TA +#define TO_MODE_NAME_S ta +#define TO_INT_C_TYPE DItype +#define TO_SINT_C_TYPE DItype +#define TO_UINT_C_TYPE UDItype +#define TO_MODE_UNSIGNED 0 +#define TO_FIXED_SIZE 8 /* in 
bytes */ +#endif Index: libgcc/config/avr/lib1funcs-fixed.S =================================================================== --- libgcc/config/avr/lib1funcs-fixed.S (revision 0) +++ libgcc/config/avr/lib1funcs-fixed.S (revision 0) @@ -0,0 +1,874 @@ +/* -*- Mode: Asm -*- */ +;; Copyright (C) 2012 +;; Free Software Foundation, Inc. +;; Contributed by Sean D'Epagnier (sean@depagnier.com) +;; Georg-Johann Lay (avr@gjlay.de) + +;; This file is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published by the +;; Free Software Foundation; either version 3, or (at your option) any +;; later version. + +;; In addition to the permissions in the GNU General Public License, the +;; Free Software Foundation gives you unlimited permission to link the +;; compiled version of this file into combinations with other programs, +;; and to distribute those combinations without any restriction coming +;; from the use of this file. (The General Public License restrictions +;; do apply in other respects; for example, they cover modification of +;; the file, and distribution when not linked into a combine +;; executable.) + +;; This file is distributed in the hope that it will be useful, but +;; WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +;; General Public License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with this program; see the file COPYING. If not, write to +;; the Free Software Foundation, 51 Franklin Street, Fifth Floor, +;; Boston, MA 02110-1301, USA. 
+ +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Fixed point library routines for AVR +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +.section .text.libgcc.fixed, "ax", @progbits + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Conversions to float +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +#if defined (L_fractqqsf) +DEFUN __fractqqsf + ;; Move in place for SA -> SF conversion + clr r22 + mov r23, r24 + lsl r23 + ;; Sign-extend + sbc r24, r24 + mov r25, r24 + XJMP __fractsasf +ENDF __fractqqsf +#endif /* L_fractqqsf */ + +#if defined (L_fractuqqsf) +DEFUN __fractuqqsf + ;; Move in place for USA -> SF conversion + clr r22 + mov r23, r24 + ;; Zero-extend + clr r24 + clr r25 + XJMP __fractusasf +ENDF __fractuqqsf +#endif /* L_fractuqqsf */ + +#if defined (L_fracthqsf) +DEFUN __fracthqsf + ;; Move in place for SA -> SF conversion + wmov 22, 24 + lsl r22 + rol r23 + ;; Sign-extend + sbc r24, r24 + mov r25, r24 + XJMP __fractsasf +ENDF __fracthqsf +#endif /* L_fracthqsf */ + +#if defined (L_fractuhqsf) +DEFUN __fractuhqsf + ;; Move in place for USA -> SF conversion + wmov 22, 24 + ;; Zero-extend + clr r24 + clr r25 + XJMP __fractusasf +ENDF __fractuhqsf +#endif /* L_fractuhqsf */ + +#if defined (L_fracthasf) +DEFUN __fracthasf + ;; Move in place for SA -> SF conversion + clr r22 + mov r23, r24 + mov r24, r25 + ;; Sign-extend + lsl r25 + sbc r25, r25 + XJMP __fractsasf +ENDF __fracthasf +#endif /* L_fracthasf */ + +#if defined (L_fractuhasf) +DEFUN __fractuhasf + ;; Move in place for USA -> SF conversion + clr r22 + mov r23, r24 + mov r24, r25 + ;; Zero-extend + clr r25 + XJMP __fractusasf +ENDF __fractuhasf +#endif /* L_fractuhasf */ + + +#if defined (L_fractsqsf) +DEFUN __fractsqsf + XCALL __floatsisf + ;; Divide non-zero results by 2^31 to move the + ;; decimal point into place + tst r25 + breq 0f + subi r24, exp_lo (31) + sbci r25, exp_hi (31) +0: ret +ENDF __fractsqsf +#endif /* L_fractsqsf */ + +#if defined (L_fractusqsf) +DEFUN __fractusqsf + 
XCALL __floatunsisf + ;; Divide non-zero results by 2^32 to move the + ;; decimal point into place + cpse r25, __zero_reg__ + subi r25, exp_hi (32) + ret +ENDF __fractusqsf +#endif /* L_fractusqsf */ + +#if defined (L_fractsasf) +DEFUN __fractsasf + XCALL __floatsisf + ;; Divide non-zero results by 2^16 to move the + ;; decimal point into place + cpse r25, __zero_reg__ + subi r25, exp_hi (16) + ret +ENDF __fractsasf +#endif /* L_fractsasf */ + +#if defined (L_fractusasf) +DEFUN __fractusasf + XCALL __floatunsisf + ;; Divide non-zero results by 2^16 to move the + ;; decimal point into place + cpse r25, __zero_reg__ + subi r25, exp_hi (16) + ret +ENDF __fractusasf +#endif /* L_fractusasf */ + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Conversions from float +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +#if defined (L_fractsfqq) +DEFUN __fractsfqq + ;; Multiply with 2^{24+7} to get a QQ result in r25 + subi r24, exp_lo (-31) + sbci r25, exp_hi (-31) + XCALL __fixsfsi + mov r24, r25 + ret +ENDF __fractsfqq +#endif /* L_fractsfqq */ + +#if defined (L_fractsfuqq) +DEFUN __fractsfuqq + ;; Multiply with 2^{24+8} to get a UQQ result in r25 + subi r25, exp_hi (-32) + XCALL __fixunssfsi + mov r24, r25 + ret +ENDF __fractsfuqq +#endif /* L_fractsfuqq */ + +#if defined (L_fractsfha) +DEFUN __fractsfha + ;; Multiply with 2^24 to get a HA result in r25:r24 + subi r25, exp_hi (-24) + XJMP __fixsfsi +ENDF __fractsfha +#endif /* L_fractsfha */ + +#if defined (L_fractsfuha) +DEFUN __fractsfuha + ;; Multiply with 2^24 to get a UHA result in r25:r24 + subi r25, exp_hi (-24) + XJMP __fixunssfsi +ENDF __fractsfuha +#endif /* L_fractsfuha */ + +#if defined (L_fractsfhq) +DEFUN __fractsfsq +ENDF __fractsfsq + +DEFUN __fractsfhq + ;; Multiply with 2^{16+15} to get a HQ result in r25:r24 + ;; resp. 
with 2^31 to get a SQ result in r25:r22 + subi r24, exp_lo (-31) + sbci r25, exp_hi (-31) + XJMP __fixsfsi +ENDF __fractsfhq +#endif /* L_fractsfhq */ + +#if defined (L_fractsfuhq) +DEFUN __fractsfusq +ENDF __fractsfusq + +DEFUN __fractsfuhq + ;; Multiply with 2^{16+16} to get a UHQ result in r25:r24 + ;; resp. with 2^32 to get a USQ result in r25:r22 + subi r25, exp_hi (-32) + XJMP __fixunssfsi +ENDF __fractsfuhq +#endif /* L_fractsfuhq */ + +#if defined (L_fractsfsa) +DEFUN __fractsfsa + ;; Multiply with 2^16 to get a SA result in r25:r22 + subi r25, exp_hi (-16) + XJMP __fixsfsi +ENDF __fractsfsa +#endif /* L_fractsfsa */ + +#if defined (L_fractsfusa) +DEFUN __fractsfusa + ;; Multiply with 2^16 to get a USA result in r25:r22 + subi r25, exp_hi (-16) + XJMP __fixunssfsi +ENDF __fractsfusa +#endif /* L_fractsfusa */ + + +;; For multiplication the functions here are called directly from +;; avr-fixed.md instead of using the standard libcall mechanisms. +;; This can make better code because GCC knows exactly which +;; of the call-used registers (not all of them) are clobbered. */ + +/******************************************************* + Fractional Multiplication 8 x 8 without MUL +*******************************************************/ + +#if defined (L_mulqq3) && !defined (__AVR_HAVE_MUL__) +;;; R23 = R24 * R25 +;;; Clobbers: __tmp_reg__, R22, R24, R25 +;;; Rounding: ??? +DEFUN __mulqq3 + XCALL __fmuls + ;; TR 18037 requires that (-1) * (-1) does not overflow + ;; The only input that can produce -1 is (-1)^2. + dec r23 + brvs 0f + inc r23 +0: ret +ENDF __mulqq3 +#endif /* L_mulqq3 && ! 
HAVE_MUL */ + +/******************************************************* + Fractional Multiply .16 x .16 with and without MUL +*******************************************************/ + +#if defined (L_mulhq3) +;;; Same code with and without MUL, but the interfaces differ: +;;; no MUL: (R25:R24) = (R22:R23) * (R24:R25) +;;; Clobbers: ABI, called by optabs +;;; MUL: (R25:R24) = (R19:R18) * (R27:R26) +;;; Clobbers: __tmp_reg__, R22, R23 +;;; Rounding: -0.5 LSB <= error <= 0.5 LSB +DEFUN __mulhq3 + XCALL __mulhisi3 + ;; Shift result into place + lsl r23 + rol r24 + rol r25 + brvs 1f + ;; Round + sbrc r23, 7 + adiw r24, 1 + ret +1: ;; Overflow. TR 18037 requires (-1)^2 not to overflow + ldi r24, lo8 (0x7fff) + ldi r25, hi8 (0x7fff) + ret +ENDF __mulhq3 +#endif /* defined (L_mulhq3) */ + +#if defined (L_muluhq3) +;;; Same code with and without MUL, but the interfaces differ: +;;; no MUL: (R25:R24) *= (R23:R22) +;;; Clobbers: ABI, called by optabs +;;; MUL: (R25:R24) = (R19:R18) * (R27:R26) +;;; Clobbers: __tmp_reg__, R22, R23 +;;; Rounding: -0.5 LSB < error <= 0.5 LSB +DEFUN __muluhq3 + XCALL __umulhisi3 + ;; Round + sbrc r23, 7 + adiw r24, 1 + ret +ENDF __muluhq3 +#endif /* L_muluhq3 */ + + +/******************************************************* + Fixed Multiply 8.8 x 8.8 with and without MUL +*******************************************************/ + +#if defined (L_mulha3) +;;; Same code with and without MUL, but the interfaces differ: +;;; no MUL: (R25:R24) = (R22:R23) * (R24:R25) +;;; Clobbers: ABI, called by optabs +;;; MUL: (R25:R24) = (R19:R18) * (R27:R26) +;;; Clobbers: __tmp_reg__, R22, R23 +;;; Rounding: -0.5 LSB <= error <= 0.5 LSB +DEFUN __mulha3 + XCALL __mulhisi3 + XJMP __muluha3_round +ENDF __mulha3 +#endif /* L_mulha3 */ + +#if defined (L_muluha3) +;;; Same code with and without MUL, but the interfaces differ: +;;; no MUL: (R25:R24) *= (R23:R22) +;;; Clobbers: ABI, called by optabs +;;; MUL: (R25:R24) = (R19:R18) * (R27:R26) +;;; Clobbers: __tmp_reg__, 
R22, R23 +;;; Rounding: -0.5 LSB < error <= 0.5 LSB +DEFUN __muluha3 + XCALL __umulhisi3 + XJMP __muluha3_round +ENDF __muluha3 +#endif /* L_muluha3 */ + +#if defined (L_muluha3_round) +DEFUN __muluha3_round + ;; Shift result into place + mov r25, r24 + mov r24, r23 + ;; Round + sbrc r22, 7 + adiw r24, 1 + ret +ENDF __muluha3_round +#endif /* L_muluha3_round */ + + +/******************************************************* + Fixed Multiplication 16.16 x 16.16 +*******************************************************/ + +#if defined (__AVR_HAVE_MUL__) + +;; Multiplier +#define A0 16 +#define A1 A0+1 +#define A2 A1+1 +#define A3 A2+1 + +;; Multiplicand +#define B0 20 +#define B1 B0+1 +#define B2 B1+1 +#define B3 B2+1 + +;; Result +#define C0 24 +#define C1 C0+1 +#define C2 C1+1 +#define C3 C2+1 + +#if defined (L_mulusa3) +;;; (C3:C0) = (A3:A0) * (B3:B0) +;;; Clobbers: __tmp_reg__ +;;; Rounding: -0.5 LSB < error <= 0.5 LSB +DEFUN __mulusa3 + ;; Some of the MUL instructions have LSBs outside the result. + ;; Don't ignore these LSBs in order to tame rounding error. + ;; Use C2/C3 for these LSBs. + + clr C0 + clr C1 + mul A0, B0 $ movw C2, r0 + + mul A1, B0 $ add C3, r0 $ adc C0, r1 + mul A0, B1 $ add C3, r0 $ adc C0, r1 $ rol C1 + + ;; Round + sbrc C3, 7 + adiw C0, 1 + + ;; The following MULs don't have LSBs outside the result. + ;; C2/C3 is the high part. 
+ + mul A0, B2 $ add C0, r0 $ adc C1, r1 $ sbc C2, C2 + mul A1, B1 $ add C0, r0 $ adc C1, r1 $ sbci C2, 0 + mul A2, B0 $ add C0, r0 $ adc C1, r1 $ sbci C2, 0 + neg C2 + + mul A0, B3 $ add C1, r0 $ adc C2, r1 $ sbc C3, C3 + mul A1, B2 $ add C1, r0 $ adc C2, r1 $ sbci C3, 0 + mul A2, B1 $ add C1, r0 $ adc C2, r1 $ sbci C3, 0 + mul A3, B0 $ add C1, r0 $ adc C2, r1 $ sbci C3, 0 + neg C3 + + mul A1, B3 $ add C2, r0 $ adc C3, r1 + mul A2, B2 $ add C2, r0 $ adc C3, r1 + mul A3, B1 $ add C2, r0 $ adc C3, r1 + + mul A2, B3 $ add C3, r0 + mul A3, B2 $ add C3, r0 + + clr __zero_reg__ + ret +ENDF __mulusa3 +#endif /* L_mulusa3 */ + +#if defined (L_mulsa3) +;;; (C3:C0) = (A3:A0) * (B3:B0) +;;; Clobbers: __tmp_reg__ +;;; Rounding: -0.5 LSB <= error <= 0.5 LSB +DEFUN __mulsa3 + XCALL __mulusa3 + tst B3 + brpl 1f + sub C2, A0 + sbc C3, A1 +1: sbrs A3, 7 + ret + sub C2, B0 + sbc C3, B1 + ret +ENDF __mulsa3 +#endif /* L_mulsa3 */ + +#undef A0 +#undef A1 +#undef A2 +#undef A3 +#undef B0 +#undef B1 +#undef B2 +#undef B3 +#undef C0 +#undef C1 +#undef C2 +#undef C3 + +#else /* __AVR_HAVE_MUL__ */ + +#define A0 18 +#define A1 A0+1 +#define A2 A0+2 +#define A3 A0+3 + +#define B0 22 +#define B1 B0+1 +#define B2 B0+2 +#define B3 B0+3 + +#define C0 22 +#define C1 C0+1 +#define C2 C0+2 +#define C3 C0+3 + +;; __tmp_reg__ +#define CC0 0 +;; __zero_reg__ +#define CC1 1 +#define CC2 16 +#define CC3 17 + +#define AA0 26 +#define AA1 AA0+1 +#define AA2 30 +#define AA3 AA2+1 + +#if defined (L_mulsa3) +;;; (R25:R22) *= (R21:R18) +;;; Clobbers: ABI, called by optabs +;;; Rounding: -1 LSB <= error <= 1 LSB +DEFUN __mulsa3 + push B0 + push B1 + bst B3, 7 + XCALL __mulusa3 + ;; A survived in 31:30:27:26 + rcall 1f + pop AA1 + pop AA0 + bst AA3, 7 +1: brtc 9f + ;; 1-extend A/B + sub C2, AA0 + sbc C3, AA1 +9: ret +ENDF __mulsa3 +#endif /* L_mulsa3 */ + +#if defined (L_mulusa3) +;;; (R25:R22) *= (R21:R18) +;;; Clobbers: ABI, called by optabs and __mulsua +;;; Rounding: -1 LSB <= error <= 1 LSB +;;; Does not 
clobber T and A[] survives in 26, 27, 30, 31 +DEFUN __mulusa3 + push CC2 + push CC3 + ; clear result + clr __tmp_reg__ + wmov CC2, CC0 + ; save multiplicand + wmov AA0, A0 + wmov AA2, A2 + rjmp 3f + + ;; Loop the integral part + +1: ;; CC += A * 2^n; n >= 0 + add CC0,A0 $ adc CC1,A1 $ adc CC2,A2 $ adc CC3,A3 + +2: ;; A <<= 1 + lsl A0 $ rol A1 $ rol A2 $ rol A3 + +3: ;; IBIT(B) >>= 1 + ;; Carry = n-th bit of B; n >= 0 + lsr B3 + ror B2 + brcs 1b + sbci B3, 0 + brne 2b + + ;; Loop the fractional part + ;; B2/B3 is 0 now, use as guard bits for rounding + ;; Restore multiplicand + wmov A0, AA0 + wmov A2, AA2 + rjmp 5f + +4: ;; CC += A:Guard * 2^n; n < 0 + add B3,B2 $ adc CC0,A0 $ adc CC1,A1 $ adc CC2,A2 $ adc CC3,A3 +5: + ;; A:Guard >>= 1 + lsr A3 $ ror A2 $ ror A1 $ ror A0 $ ror B2 + + ;; FBIT(B) <<= 1 + ;; Carry = n-th bit of B; n < 0 + lsl B0 + rol B1 + brcs 4b + sbci B0, 0 + brne 5b + + ;; Move result into place and round + lsl B3 + wmov C2, CC2 + wmov C0, CC0 + clr __zero_reg__ + adc C0, __zero_reg__ + adc C1, __zero_reg__ + adc C2, __zero_reg__ + adc C3, __zero_reg__ + + ;; Epilogue + pop CC3 + pop CC2 + ret +ENDF __mulusa3 +#endif /* L_mulusa3 */ + +#undef A0 +#undef A1 +#undef A2 +#undef A3 +#undef B0 +#undef B1 +#undef B2 +#undef B3 +#undef C0 +#undef C1 +#undef C2 +#undef C3 +#undef AA0 +#undef AA1 +#undef AA2 +#undef AA3 +#undef CC0 +#undef CC1 +#undef CC2 +#undef CC3 + +#endif /* __AVR_HAVE_MUL__ */ + +/******************************************************* + Fractional Division 8 / 8 +*******************************************************/ + +#define r_divd r25 /* dividend */ +#define r_quo r24 /* quotient */ +#define r_div r22 /* divisor */ + +#if defined (L_divqq3) +DEFUN __divqq3 + mov r0, r_divd + eor r0, r_div + sbrc r_div, 7 + neg r_div + sbrc r_divd, 7 + neg r_divd + cp r_divd, r_div + breq __divqq3_minus1 ; if equal return -1 + XCALL __udivuqq3 + lsr r_quo + sbrc r0, 7 ; negate result if needed + neg r_quo + ret +__divqq3_minus1: + ldi r_quo, 
0x80 + ret +ENDF __divqq3 +#endif /* defined (L_divqq3) */ + +#if defined (L_udivuqq3) +DEFUN __udivuqq3 + clr r_quo ; clear quotient + inc __zero_reg__ ; init loop counter, used per shift +__udivuqq3_loop: + lsl r_divd ; shift dividend + brcs 0f ; dividend overflow + cp r_divd,r_div ; compare dividend & divisor + brcc 0f ; dividend >= divisor + rol r_quo ; shift quotient (with CARRY) + rjmp __udivuqq3_cont +0: + sub r_divd,r_div ; restore dividend + lsl r_quo ; shift quotient (without CARRY) +__udivuqq3_cont: + lsl __zero_reg__ ; shift loop-counter bit + brne __udivuqq3_loop + com r_quo ; complement result + ; because C flag was complemented in loop + ret +ENDF __udivuqq3 +#endif /* defined (L_udivuqq3) */ + +#undef r_divd +#undef r_quo +#undef r_div + + +/******************************************************* + Fractional Division 16 / 16 +*******************************************************/ +#define r_divdL 26 /* dividend Low */ +#define r_divdH 27 /* dividend Hig */ +#define r_quoL 24 /* quotient Low */ +#define r_quoH 25 /* quotient High */ +#define r_divL 22 /* divisor */ +#define r_divH 23 /* divisor */ +#define r_cnt 21 + +#if defined (L_divhq3) +DEFUN __divhq3 + mov r0, r_divdH + eor r0, r_divH + sbrs r_divH, 7 + rjmp 1f + NEG2 r_divL +1: + sbrs r_divdH, 7 + rjmp 2f + NEG2 r_divdL +2: + cp r_divdL, r_divL + cpc r_divdH, r_divH + breq __divhq3_minus1 ; if equal return -1 + XCALL __udivuhq3 + lsr r_quoH + ror r_quoL + brpl 9f + ;; negate result if needed + NEG2 r_quoL +9: + ret +__divhq3_minus1: + ldi r_quoH, 0x80 + clr r_quoL + ret +ENDF __divhq3 +#endif /* defined (L_divhq3) */ + +#if defined (L_udivuhq3) +DEFUN __udivuhq3 + sub r_quoH,r_quoH ; clear quotient and carry + ;; FALLTHRU +ENDF __udivuhq3 + +DEFUN __udivuha3_common + clr r_quoL ; clear quotient + ldi r_cnt,16 ; init loop counter +__udivuhq3_loop: + rol r_divdL ; shift dividend (with CARRY) + rol r_divdH + brcs __udivuhq3_ep ; dividend overflow + cp r_divdL,r_divL ; compare dividend & 
divisor + cpc r_divdH,r_divH + brcc __udivuhq3_ep ; dividend >= divisor + rol r_quoL ; shift quotient (with CARRY) + rjmp __udivuhq3_cont +__udivuhq3_ep: + sub r_divdL,r_divL ; restore dividend + sbc r_divdH,r_divH + lsl r_quoL ; shift quotient (without CARRY) +__udivuhq3_cont: + rol r_quoH ; shift quotient + dec r_cnt ; decrement loop counter + brne __udivuhq3_loop + com r_quoL ; complement result + com r_quoH ; because C flag was complemented in loop + ret +ENDF __udivuha3_common +#endif /* defined (L_udivuhq3) */ + +/******************************************************* + Fixed Division 8.8 / 8.8 +*******************************************************/ +#if defined (L_divha3) +DEFUN __divha3 + mov r0, r_divdH + eor r0, r_divH + sbrs r_divH, 7 + rjmp 1f + NEG2 r_divL +1: + sbrs r_divdH, 7 + rjmp 2f + NEG2 r_divdL +2: + XCALL __udivuha3 + sbrs r0, 7 ; negate result if needed + ret + NEG2 r_quoL + ret +ENDF __divha3 +#endif /* defined (L_divha3) */ + +#if defined (L_udivuha3) +DEFUN __udivuha3 + mov r_quoH, r_divdL + mov r_divdL, r_divdH + clr r_divdH + lsl r_quoH ; shift quotient into carry + XJMP __udivuha3_common ; same as fractional after rearrange +ENDF __udivuha3 +#endif /* defined (L_udivuha3) */ + +#undef r_divdL +#undef r_divdH +#undef r_quoL +#undef r_quoH +#undef r_divL +#undef r_divH +#undef r_cnt + +/******************************************************* + Fixed Division 16.16 / 16.16 +*******************************************************/ + +#define r_arg1L 24 /* arg1 gets passed already in place */ +#define r_arg1H 25 +#define r_arg1HL 26 +#define r_arg1HH 27 +#define r_divdL 26 /* dividend Low */ +#define r_divdH 27 +#define r_divdHL 30 +#define r_divdHH 31 /* dividend High */ +#define r_quoL 22 /* quotient Low */ +#define r_quoH 23 +#define r_quoHL 24 +#define r_quoHH 25 /* quotient High */ +#define r_divL 18 /* divisor Low */ +#define r_divH 19 +#define r_divHL 20 +#define r_divHH 21 /* divisor High */ +#define r_cnt __zero_reg__ /* loop 
count (0 after the loop!) */ + +#if defined (L_divsa3) +DEFUN __divsa3 + mov r0, r_arg1HH + eor r0, r_divHH + sbrs r_divHH, 7 + rjmp 1f + NEG4 r_divL +1: + sbrs r_arg1HH, 7 + rjmp 2f + NEG4 r_arg1L +2: + XCALL __udivusa3 + sbrs r0, 7 ; negate result if needed + ret + NEG4 r_quoL + ret +ENDF __divsa3 +#endif /* defined (L_divsa3) */ + +#if defined (L_udivusa3) +DEFUN __udivusa3 + ldi r_divdHL, 32 ; init loop counter + mov r_cnt, r_divdHL + clr r_divdHL + clr r_divdHH + wmov r_quoL, r_divdHL + lsl r_quoHL ; shift quotient into carry + rol r_quoHH +__udivusa3_loop: + rol r_divdL ; shift dividend (with CARRY) + rol r_divdH + rol r_divdHL + rol r_divdHH + brcs __udivusa3_ep ; dividend overflow + cp r_divdL,r_divL ; compare dividend & divisor + cpc r_divdH,r_divH + cpc r_divdHL,r_divHL + cpc r_divdHH,r_divHH + brcc __udivusa3_ep ; dividend >= divisor + rol r_quoL ; shift quotient (with CARRY) + rjmp __udivusa3_cont +__udivusa3_ep: + sub r_divdL,r_divL ; restore dividend + sbc r_divdH,r_divH + sbc r_divdHL,r_divHL + sbc r_divdHH,r_divHH + lsl r_quoL ; shift quotient (without CARRY) +__udivusa3_cont: + rol r_quoH ; shift quotient + rol r_quoHL + rol r_quoHH + dec r_cnt ; decrement loop counter + brne __udivusa3_loop + com r_quoL ; complement result + com r_quoH ; because C flag was complemented in loop + com r_quoHL + com r_quoHH + ret +ENDF __udivusa3 +#endif /* defined (L_udivusa3) */ + +#undef r_arg1L +#undef r_arg1H +#undef r_arg1HL +#undef r_arg1HH +#undef r_divdL +#undef r_divdH +#undef r_divdHL +#undef r_divdHH +#undef r_quoL +#undef r_quoH +#undef r_quoHL +#undef r_quoHH +#undef r_divL +#undef r_divH +#undef r_divHL +#undef r_divHH +#undef r_cnt Index: libgcc/config/avr/lib1funcs.S =================================================================== --- libgcc/config/avr/lib1funcs.S (revision 190620) +++ libgcc/config/avr/lib1funcs.S (working copy) @@ -91,6 +91,35 @@ see the files COPYING3 and COPYING.RUNTI .endfunc .endm +;; Negate a 2-byte value held in 
consecutive registers +.macro NEG2 reg + com \reg+1 + neg \reg + sbci \reg+1, -1 +.endm + +;; Negate a 4-byte value held in consecutive registers +.macro NEG4 reg + com \reg+3 + com \reg+2 + com \reg+1 +.if \reg >= 16 + neg \reg + sbci \reg+1, -1 + sbci \reg+2, -1 + sbci \reg+3, -1 +.else + com \reg + adc \reg, __zero_reg__ + adc \reg+1, __zero_reg__ + adc \reg+2, __zero_reg__ + adc \reg+3, __zero_reg__ +.endif +.endm + +#define exp_lo(N) hlo8 ((N) << 23) +#define exp_hi(N) hhi8 ((N) << 23) + \f .section .text.libgcc.mul, "ax", @progbits @@ -126,175 +155,246 @@ ENDF __mulqi3 #endif /* defined (L_mulqi3) */ -#if defined (L_mulqihi3) -DEFUN __mulqihi3 - clr r25 - sbrc r24, 7 - dec r25 - clr r23 - sbrc r22, 7 - dec r22 - XJMP __mulhi3 -ENDF __mulqihi3: -#endif /* defined (L_mulqihi3) */ + +/******************************************************* + Widening Multiplication 16 = 8 x 8 without MUL + Multiplication 16 x 16 without MUL +*******************************************************/ + +#define A0 r22 +#define A1 r23 +#define B0 r24 +#define BB0 r20 +#define B1 r25 +;; Output overlaps input, thus expand result in CC0/1 +#define C0 r24 +#define C1 r25 +#define CC0 __tmp_reg__ +#define CC1 R21 #if defined (L_umulqihi3) +;;; R25:R24 = (unsigned int) R22 * (unsigned int) R24 +;;; (C1:C0) = (unsigned int) A0 * (unsigned int) B0 +;;; Clobbers: __tmp_reg__, R21..R23 DEFUN __umulqihi3 - clr r25 - clr r23 - XJMP __mulhi3 + clr A1 + clr B1 + XJMP __mulhi3 ENDF __umulqihi3 -#endif /* defined (L_umulqihi3) */ +#endif /* L_umulqihi3 */ -/******************************************************* - Multiplication 16 x 16 without MUL -*******************************************************/ -#if defined (L_mulhi3) -#define r_arg1L r24 /* multiplier Low */ -#define r_arg1H r25 /* multiplier High */ -#define r_arg2L r22 /* multiplicand Low */ -#define r_arg2H r23 /* multiplicand High */ -#define r_resL __tmp_reg__ /* result Low */ -#define r_resH r21 /* result High */ +#if defined 
(L_mulqihi3) +;;; R25:R24 = (signed int) R22 * (signed int) R24 +;;; (C1:C0) = (signed int) A0 * (signed int) B0 +;;; Clobbers: __tmp_reg__, R20..R23 +DEFUN __mulqihi3 + ;; Sign-extend B0 + clr B1 + sbrc B0, 7 + com B1 + ;; The multiplication runs twice as fast if A1 is zero, thus: + ;; Zero-extend A0 + clr A1 +#ifdef __AVR_HAVE_JMP_CALL__ + ;; Store B0 * sign of A + clr BB0 + sbrc A0, 7 + mov BB0, B0 + call __mulhi3 +#else /* have no CALL */ + ;; Skip sign-extension of A if A >= 0 + ;; Same size as with the first alternative but avoids errata skip + ;; and is faster if A >= 0 + sbrs A0, 7 + rjmp __mulhi3 + ;; If A < 0 store B + mov BB0, B0 + rcall __mulhi3 +#endif /* HAVE_JMP_CALL */ + ;; 1-extend A after the multiplication + sub C1, BB0 + ret +ENDF __mulqihi3 +#endif /* L_mulqihi3 */ +#if defined (L_mulhi3) +;;; R25:R24 = R23:R22 * R25:R24 +;;; (C1:C0) = (A1:A0) * (B1:B0) +;;; Clobbers: __tmp_reg__, R21..R23 DEFUN __mulhi3 - clr r_resH ; clear result - clr r_resL ; clear result -__mulhi3_loop: - sbrs r_arg1L,0 - rjmp __mulhi3_skip1 - add r_resL,r_arg2L ; result + multiplicand - adc r_resH,r_arg2H -__mulhi3_skip1: - add r_arg2L,r_arg2L ; shift multiplicand - adc r_arg2H,r_arg2H - - cp r_arg2L,__zero_reg__ - cpc r_arg2H,__zero_reg__ - breq __mulhi3_exit ; while multiplicand != 0 - - lsr r_arg1H ; gets LSB of multiplier - ror r_arg1L - sbiw r_arg1L,0 - brne __mulhi3_loop ; exit if multiplier = 0 -__mulhi3_exit: - mov r_arg1H,r_resH ; result to return register - mov r_arg1L,r_resL - ret -ENDF __mulhi3 -#undef r_arg1L -#undef r_arg1H -#undef r_arg2L -#undef r_arg2H -#undef r_resL -#undef r_resH + ;; Clear result + clr CC0 + clr CC1 + rjmp 3f +1: + ;; Bit n of A is 1 --> C += B << n + add CC0, B0 + adc CC1, B1 +2: + lsl B0 + rol B1 +3: + ;; If B == 0 we are ready + sbiw B0, 0 + breq 9f + + ;; Carry = n-th bit of A + lsr A1 + ror A0 + ;; If bit n of A is set, then go add B * 2^n to C + brcs 1b + + ;; Carry = 0 --> The ROR above acts like CP A0, 0 + ;; Thus, it is 
sufficient to CPC the high part to test A against 0 + cpc A1, __zero_reg__ + ;; Only proceed if A != 0 + brne 2b +9: + ;; Move Result into place + mov C0, CC0 + mov C1, CC1 + ret +ENDF __mulhi3 +#endif /* L_mulhi3 */ -#endif /* defined (L_mulhi3) */ +#undef A0 +#undef A1 +#undef B0 +#undef BB0 +#undef B1 +#undef C0 +#undef C1 +#undef CC0 +#undef CC1 + +\f +#define A0 22 +#define A1 A0+1 +#define A2 A0+2 +#define A3 A0+3 + +#define B0 18 +#define B1 B0+1 +#define B2 B0+2 +#define B3 B0+3 + +#define CC0 26 +#define CC1 CC0+1 +#define CC2 30 +#define CC3 CC2+1 + +#define C0 22 +#define C1 C0+1 +#define C2 C0+2 +#define C3 C0+3 /******************************************************* Widening Multiplication 32 = 16 x 16 without MUL *******************************************************/ -#if defined (L_mulhisi3) -DEFUN __mulhisi3 -;;; FIXME: This is dead code (noone calls it) - mov_l r18, r24 - mov_h r19, r25 - clr r24 - sbrc r23, 7 - dec r24 - mov r25, r24 - clr r20 - sbrc r19, 7 - dec r20 - mov r21, r20 - XJMP __mulsi3 -ENDF __mulhisi3 -#endif /* defined (L_mulhisi3) */ - #if defined (L_umulhisi3) DEFUN __umulhisi3 -;;; FIXME: This is dead code (noone calls it) - mov_l r18, r24 - mov_h r19, r25 - clr r24 - clr r25 - mov_l r20, r24 - mov_h r21, r25 + wmov B0, 24 + ;; Zero-extend B + clr B2 + clr B3 + ;; Zero-extend A + wmov A2, B2 XJMP __mulsi3 ENDF __umulhisi3 -#endif /* defined (L_umulhisi3) */ +#endif /* L_umulhisi3 */ + +#if defined (L_mulhisi3) +DEFUN __mulhisi3 + wmov B0, 24 + ;; Sign-extend B + lsl r25 + sbc B2, B2 + mov B3, B2 +#ifdef __AVR_ERRATA_SKIP_JMP_CALL__ + ;; Sign-extend A + clr A2 + sbrc A1, 7 + com A2 + mov A3, A2 + XJMP __mulsi3 +#else /* no __AVR_ERRATA_SKIP_JMP_CALL__ */ + ;; Zero-extend A and __mulsi3 will run at least twice as fast + ;; compared to a sign-extended A. + clr A2 + clr A3 + sbrs A1, 7 + XJMP __mulsi3 + ;; If A < 0 then perform the B * 0xffff.... 
before the + ;; very multiplication by initializing the high part of the + ;; result CC with -B. + wmov CC2, A2 + sub CC2, B0 + sbc CC3, B1 + XJMP __mulsi3_helper +#endif /* __AVR_ERRATA_SKIP_JMP_CALL__ */ +ENDF __mulhisi3 +#endif /* L_mulhisi3 */ + -#if defined (L_mulsi3) /******************************************************* Multiplication 32 x 32 without MUL *******************************************************/ -#define r_arg1L r22 /* multiplier Low */ -#define r_arg1H r23 -#define r_arg1HL r24 -#define r_arg1HH r25 /* multiplier High */ - -#define r_arg2L r18 /* multiplicand Low */ -#define r_arg2H r19 -#define r_arg2HL r20 -#define r_arg2HH r21 /* multiplicand High */ - -#define r_resL r26 /* result Low */ -#define r_resH r27 -#define r_resHL r30 -#define r_resHH r31 /* result High */ +#if defined (L_mulsi3) DEFUN __mulsi3 - clr r_resHH ; clear result - clr r_resHL ; clear result - clr r_resH ; clear result - clr r_resL ; clear result -__mulsi3_loop: - sbrs r_arg1L,0 - rjmp __mulsi3_skip1 - add r_resL,r_arg2L ; result + multiplicand - adc r_resH,r_arg2H - adc r_resHL,r_arg2HL - adc r_resHH,r_arg2HH -__mulsi3_skip1: - add r_arg2L,r_arg2L ; shift multiplicand - adc r_arg2H,r_arg2H - adc r_arg2HL,r_arg2HL - adc r_arg2HH,r_arg2HH - - lsr r_arg1HH ; gets LSB of multiplier - ror r_arg1HL - ror r_arg1H - ror r_arg1L - brne __mulsi3_loop - sbiw r_arg1HL,0 - cpc r_arg1H,r_arg1L - brne __mulsi3_loop ; exit if multiplier = 0 -__mulsi3_exit: - mov_h r_arg1HH,r_resHH ; result to return register - mov_l r_arg1HL,r_resHL - mov_h r_arg1H,r_resH - mov_l r_arg1L,r_resL - ret -ENDF __mulsi3 + ;; Clear result + clr CC2 + clr CC3 + ;; FALLTHRU +ENDF __mulsi3 -#undef r_arg1L -#undef r_arg1H -#undef r_arg1HL -#undef r_arg1HH - -#undef r_arg2L -#undef r_arg2H -#undef r_arg2HL -#undef r_arg2HH - -#undef r_resL -#undef r_resH -#undef r_resHL -#undef r_resHH +DEFUN __mulsi3_helper + clr CC0 + clr CC1 + rjmp 3f + +1: ;; If bit n of A is set, then add B * 2^n to the result in CC + ;; 
CC += B
+	add  CC0,B0  $  adc  CC1,B1  $  adc  CC2,B2  $  adc  CC3,B3
+
+2:	;; B <<= 1
+	lsl  B0      $  rol  B1      $  rol  B2      $  rol  B3
+
+3:	;; A >>= 1:  Carry = n-th bit of A
+	lsr  A3      $  ror  A2      $  ror  A1      $  ror  A0
+
+	brcs 1b
+	;; Only continue if A != 0
+	sbci A1, 0
+	brne 2b
+	sbiw A2, 0
+	brne 2b
+
+	;; All bits of A are consumed:  Copy result to return register C
+	wmov C0, CC0
+	wmov C2, CC2
+	ret
+ENDF __mulsi3_helper
+#endif /* L_mulsi3 */
-#endif /* defined (L_mulsi3) */

+#undef A0
+#undef A1
+#undef A2
+#undef A3
+#undef B0
+#undef B1
+#undef B2
+#undef B3
+#undef C0
+#undef C1
+#undef C2
+#undef C3
+#undef CC0
+#undef CC1
+#undef CC2
+#undef CC3

 #endif /* !defined (__AVR_HAVE_MUL__) */

 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
@@ -316,7 +416,7 @@ ENDF __mulsi3
 #define C3 C0+3

 /*******************************************************
-     Widening Multiplication 32 = 16 x 16
+     Widening Multiplication 32 = 16 x 16  with MUL
 *******************************************************/

 #if defined (L_mulhisi3)
@@ -364,7 +464,17 @@ DEFUN __umulhisi3
     mul  A1, B1
     movw C2, r0
     mul  A0, B1
+#ifdef __AVR_HAVE_JMP_CALL__
+    ;; This function is used by many other routines, often multiple times.
+    ;; Therefore, if the flash size is not too limited, avoid the RCALL
+    ;; and invest 6 bytes to speed things up.
+ add C1, r0 + adc C2, r1 + clr __zero_reg__ + adc C3, __zero_reg__ +#else rcall 1f +#endif mul A1, B0 1: add C1, r0 adc C2, r1 @@ -375,7 +485,7 @@ ENDF __umulhisi3 #endif /* L_umulhisi3 */ /******************************************************* - Widening Multiplication 32 = 16 x 32 + Widening Multiplication 32 = 16 x 32 with MUL *******************************************************/ #if defined (L_mulshisi3) @@ -425,7 +535,7 @@ ENDF __muluhisi3 #endif /* L_muluhisi3 */ /******************************************************* - Multiplication 32 x 32 + Multiplication 32 x 32 with MUL *******************************************************/ #if defined (L_mulsi3) @@ -468,7 +578,7 @@ ENDF __mulsi3 #endif /* __AVR_HAVE_MUL__ */ /******************************************************* - Multiplication 24 x 24 + Multiplication 24 x 24 with MUL *******************************************************/ #if defined (L_mulpsi3) @@ -1247,6 +1357,19 @@ __divmodsi4_exit: ENDF __divmodsi4 #endif /* defined (L_divmodsi4) */ +#undef r_remHH +#undef r_remHL +#undef r_remH +#undef r_remL +#undef r_arg1HH +#undef r_arg1HL +#undef r_arg1H +#undef r_arg1L +#undef r_arg2HH +#undef r_arg2HL +#undef r_arg2H +#undef r_arg2L +#undef r_cnt /******************************************************* Division 64 / 64 @@ -2757,9 +2880,7 @@ DEFUN __fmulsu_exit XJMP __fmul 1: XCALL __fmul ;; C = -C iff A0.7 = 1 - com C1 - neg C0 - sbci C1, -1 + NEG2 C0 ret ENDF __fmulsu_exit #endif /* L_fmulsu */ @@ -2794,3 +2915,5 @@ ENDF __fmul #undef B1 #undef C0 #undef C1 + +#include "lib1funcs-fixed.S" Index: libgcc/config/avr/t-avr =================================================================== --- libgcc/config/avr/t-avr (revision 190620) +++ libgcc/config/avr/t-avr (working copy) @@ -2,6 +2,7 @@ LIB1ASMSRC = avr/lib1funcs.S LIB1ASMFUNCS = \ _mulqi3 \ _mulhi3 \ + _mulqihi3 _umulqihi3 \ _mulpsi3 _mulsqipsi3 \ _mulhisi3 \ _umulhisi3 \ @@ -55,6 +56,24 @@ LIB1ASMFUNCS = \ _cmpdi2 _cmpdi2_s8 \ _fmul _fmuls 
 _fmulsu

+# Fixed point routines in avr/lib1funcs-fixed.S
+LIB1ASMFUNCS += \
+	_fractqqsf _fractuqqsf \
+	_fracthqsf _fractuhqsf _fracthasf _fractuhasf \
+	_fractsasf _fractusasf _fractsqsf _fractusqsf \
+	\
+	_fractsfqq _fractsfuqq \
+	_fractsfhq _fractsfuhq _fractsfha _fractsfuha \
+	_fractsfsa _fractsfusa \
+	_mulqq3 \
+	_mulhq3 _muluhq3 \
+	_mulha3 _muluha3 _muluha3_round \
+	_mulsa3 _mulusa3 \
+	_divqq3 _udivuqq3 \
+	_divhq3 _udivuhq3 \
+	_divha3 _udivuha3 \
+	_divsa3 _udivusa3
+
 LIB2FUNCS_EXCLUDE = \
	_moddi3 _umoddi3 \
	_clz
@@ -81,3 +100,52 @@ libgcc-objects += $(patsubst %,%$(objext
 ifeq ($(enable_shared),yes)
 libgcc-s-objects += $(patsubst %,%_s$(objext),$(hiintfuncs16))
 endif
+
+
+# Filter out supported conversions from fixed-bit.c
+
+conv_XY=$(conv)$(mode1)$(mode2)
+conv_X=$(conv)$(mode)
+
+# Conversions supported by the compiler
+
+convf_modes = QI UQI QQ UQQ \
+	      HI UHI HQ UHQ HA UHA \
+	      SI USI SQ USQ SA USA \
+	      DI UDI DQ UDQ DA UDA \
+	      TI UTI TQ UTQ TA UTA
+
+LIB2FUNCS_EXCLUDE += \
+	$(foreach conv,_fract _fractuns,\
+		$(foreach mode1,$(convf_modes),\
+			$(foreach mode2,$(convf_modes),$(conv_XY))))
+
+# Conversions supported by lib1funcs-fixed.S
+
+conv_to_sf_modes   = QQ UQQ HQ UHQ HA UHA SQ USQ SA USA
+conv_from_sf_modes = QQ UQQ HQ UHQ HA UHA SA USA
+
+LIB2FUNCS_EXCLUDE += \
+	$(foreach conv,_fract, \
+		$(foreach mode1,$(conv_to_sf_modes), \
+			$(foreach mode2,SF,$(conv_XY))))
+
+LIB2FUNCS_EXCLUDE += \
+	$(foreach conv,_fract,\
+		$(foreach mode1,SF,\
+			$(foreach mode2,$(conv_from_sf_modes),$(conv_XY))))
+
+# Arithmetic supported by the compiler
+
+allfix_modes = QQ UQQ HQ UHQ HA UHA SQ USQ SA USA DA UDA DQ UDQ TQ UTQ TA UTA
+
+LIB2FUNCS_EXCLUDE += \
+	$(foreach conv,_add _sub,\
+		$(foreach mode,$(allfix_modes),$(conv_X)3))
+
+LIB2FUNCS_EXCLUDE += \
+	$(foreach conv,_lshr _ashl _ashr _cmp,\
+		$(foreach mode,$(allfix_modes),$(conv_X)))
+
+#(error $(LIB2FUNCS_EXCLUDE))
+

^ permalink raw reply	[flat|nested] 12+ messages in thread
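An aside for readers following the patch: every routine above is ordinary integer arithmetic with an implicit scale by a power of two, and the rounding step in `__muluha3_round` is just a test of the most significant discarded bit. A host-side C model of the unsigned 8.8 accum (UHAmode) case makes this concrete; the type and helper names here are illustrative only and are not part of the patch, which of course operates on AVR registers:

```c
#include <stdint.h>

/* Model of an unsigned 8.8 accum (UHAmode): 8 integer bits and 8
   fractional bits, i.e. a 16-bit integer scaled by 2^-8. */
typedef uint16_t uha_t;

/* __fractuhasf scales by 2^-8; the exp_lo()/exp_hi() tricks in the .S
   file adjust the float exponent instead of performing a divide.  */
static float uha_to_float(uha_t a) { return (float)a / 256.0f; }

/* __fractsfuha scales by 2^8 before the float-to-integer conversion.  */
static uha_t float_to_uha(float f) { return (uha_t)(f * 256.0f); }

/* __muluha3: widen to a 16 x 16 -> 32 product (a 16.16 intermediate),
   keep the middle 16 bits, and round on the top bit of the discarded
   low byte -- the sbrc/adiw pair in __muluha3_round.  */
static uha_t uha_mul(uha_t a, uha_t b)
{
    uint32_t p = (uint32_t)a * b;               /* 16.16 intermediate */
    return (uha_t)((p >> 8) + ((p >> 7) & 1));  /* shift + round      */
}
```

The signed and wider formats in the patch follow the same pattern, only with different widths and shift counts.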
* RE: [Patch,AVR] PR54222: Add fixed point support
  2012-08-23 14:50       ` Georg-Johann Lay
@ 2012-08-23 16:42         ` Weddington, Eric
  2012-08-23 17:25           ` Georg-Johann Lay
  2012-08-24 11:58           ` Denis Chertykov
  1 sibling, 1 reply; 12+ messages in thread
From: Weddington, Eric @ 2012-08-23 16:42 UTC (permalink / raw)
To: Georg-Johann Lay, Denis Chertykov; +Cc: gcc-patches

> -----Original Message-----
> From: Georg-Johann Lay
> Sent: Thursday, August 23, 2012 8:49 AM
> To: Denis Chertykov
> Cc: Weddington, Eric; gcc-patches@gcc.gnu.org
> Subject: Re: [Patch,AVR] PR54222: Add fixed point support
>
> Hi, here is an updated patch.
>
> Some functions are reworked and there is some code clean-up.
>
> The test results look good; there are no additional regressions.
>
> The new test cases in gcc.dg/fixed-point pass except some convert-*.c,
> for two reasons:
>
> * Some test cases have a loss of precision and therefore fail.
>   One fail is that 0x3fffffffc0000000 is compared against
>   0x4000000000000000 and thus fails.  Presumably it's a rounding
>   error from float.  I'd say this is not critical.
>
> * PR54330: This leads to wrong code for __satfractudadq, and the
>   wrong code is already present in .expand.  From the distance
>   this looks like a middle-end or tree-ssa problem.
>
> The new patch implements TARGET_BUILD_BUILTIN_VA_LIST.
> The rationale is that avr-fixed.md adjusts some modes, but these
> changes are not reflected by the built-in macros made by gcc.
> This leads to wrong code in libgcc because it deduces the
> type layout from these built-in defines.  Thus, the respective
> nodes must be patched *before* the built-in macros are emitted.
>
> The changes to LIB2FUNCS_EXCLUDE currently have no effect;
> this needs http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01580.html,
> which is currently under review.
>
> Ok to install?

Hi Johann,

I have no objections to the patch, but I think it is also best to wait
for Denis to approve as well.

Based on your analysis of the test case failure you mentioned above, do
you think we need to have some other new test cases for the AVR
fixed-point support?

BTW, I appreciate all the great work that you've done on this! :-)

Eric

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [Patch,AVR] PR54222: Add fixed point support
  2012-08-23 16:42 ` Weddington, Eric
@ 2012-08-23 17:25   ` Georg-Johann Lay
  0 siblings, 0 replies; 12+ messages in thread
From: Georg-Johann Lay @ 2012-08-23 17:25 UTC (permalink / raw)
To: Weddington, Eric; +Cc: Denis Chertykov, gcc-patches

Weddington, Eric wrote:
>
>> Georg-Johann Lay:
>>
>> Hi, here is an updated patch.
>>
>> Some functions are reworked and there is some code clean-up.
>>
>> The test results look good; there are no additional regressions.
>>
>> The new test cases in gcc.dg/fixed-point pass, except some convert-*.c,
>> for two reasons:
>>
>> * Some test cases have a loss of precision and therefore fail. One failure
>>   is that 0x3fffffffc0000000 is compared against 0x4000000000000000 and
>>   thus fails. Presumably it's a rounding error from float. I'd say this is
>>   not critical.
>>
>> * PR54330: This leads to wrong code for __satfractudadq, and the wrong code
>>   is already present in .expand. From a distance this looks like a
>>   middle-end or tree-ssa problem.
>>
>> The new patch implements TARGET_BUILD_BUILTIN_VA_LIST. The rationale is
>> that avr-fixed.md adjusts some modes, but these changes are not reflected
>> by the built-in macros generated by gcc. This leads to wrong code in libgcc,
>> because it deduces the type layout from these built-in defines. Thus, the
>> respective nodes must be patched *before* the built-in macros are emitted.
>>
>> The changes to LIB2FUNCS_EXCLUDE currently have no effect; this needs
>> http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01580.html, which is currently
>> under review.
>>
>> Ok to install?
>
> Hi Johann,
>
> I have no objections to the patch, but I think it is also best to wait for
> Denis to approve as well.
>
> Based on your analysis of the test case failure you mentioned above, do you
> think we need to have some other new test cases for the AVR fixed-point
> support?
PR54330 does not look avr-related, and I don't think it makes sense to work around it by cutting down test coverage.

That problem reminds me of a middle-end bug where C's undefined signed overflow was applied to the IR. This is wrong, because signed overflow is only undefined if it comes from the C source, but not if it comes from IR optimizers that mess with signed/unsigned arithmetic, shifts, comparisons, carry handling, etc. on the IR.

The rounding error is no reason to cut down coverage, either. The convert tests operate on powers of 2, which can all be represented exactly by float and by fixed-point, so there should not be a rounding error, not even a small one. But maybe it's not a rounding error at all, just a saturation artifact, because a value is saturated to 0xffffffff when converting to/from float. I don't know whether there are other failing values that don't have 2^32-1 in their representation -- I did not track down all these errors. The test cases are insanely big, and it takes time to follow what's going on on a small target when there is no debugger.

What turned out to be unreliable is the out_fixed routine. I fixed the problems I found and tried to add comments to make the code more comprehensible, but there may be more problems. I intend to rewrite that routine from scratch and replace the original version altogether, but it might take some weeks until I have time to return to it.

Johann

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: [Patch,AVR] PR54222: Add fixed point support
  2012-08-23 14:50 ` Georg-Johann Lay
  2012-08-23 16:42   ` Weddington, Eric
@ 2012-08-24 11:58   ` Denis Chertykov
  1 sibling, 0 replies; 12+ messages in thread
From: Denis Chertykov @ 2012-08-24 11:58 UTC (permalink / raw)
To: Georg-Johann Lay; +Cc: Weddington, Eric, gcc-patches

2012/8/23 Georg-Johann Lay <avr@gjlay.de>:
> Denis Chertykov wrote:
>> 2012/8/13 Georg-Johann Lay:
>>> Denis Chertykov wrote:
>>>> 2012/8/11 Georg-Johann Lay:
>>>>> Weddington, Eric schrieb:
>>>>>>> From: Georg-Johann Lay
>>>>>>>
>>>>>>> The first step would be to bisect and find the patch that led to
>>>>>>> PR53923. It was not a change in the avr BE, so the question goes
>>>>>>> to the authors of the respective patch.
>>>>>>>
>>>>>>> Up to now I didn't even try to bisect; that would take years on the
>>>>>>> host that I have available...
>>>>>>>
>>>>>>>> My only real concern is that this is a major feature addition and
>>>>>>>> the AVR port is currently broken.
>>>>>>>
>>>>>>> I don't know if it's the avr port or some parts of the middle end
>>>>>>> that don't cooperate with avr.
>>>>>>
>>>>>> I would really, really love to see fixed point support added in,
>>>>>> especially since I know that Sean has worked on it for quite a while,
>>>>>> and you've also done a lot of work in getting the patches in shape to
>>>>>> get them committed.
>>>>>>
>>>>>> But, if the AVR port is currently broken (by whomever, and whatever
>>>>>> patch) and a major feature like this can't be tested to make sure it
>>>>>> doesn't break anything else in the AVR backend, then I'm hesitant to
>>>>>> approve (even though I really want to approve).
>>>>>
>>>>> I don't understand enough of DF to fix PR53923. The insn that leads
>>>>> to the ICE is (in df-problems.c:dead_debug_insert_temp):
>>>>>
>>>> Today I have updated the GCC svn tree and successfully compiled avr-gcc.
>>>> The libgcc2-mulsc3.c also compiled without bugs.
>>>>
>>>> Denis.
>>>>
>>>> PS: Maybe I'm doing something wrong? (I had too long vacations.)
>>>
>>> I am configuring with --target=avr --disable-nls --with-dwarf2
>>> --enable-languages=c,c++ --enable-target-optspace=yes
>>> --enable-checking=yes,rtl
>>>
>>> Build GCC is "gcc version 4.3.2".
>>> Build and host are i686-pc-linux-gnu.
>>>
>>> Maybe it's different on a 64-bit computer, but I only have a 32-bit host.
>>>
>>
>> I have been debugging PR53923, and in my opinion it's not an AVR port bug.
>> Please commit the fixed point support.
>>
>> Denis.
>
> Hi, here is an updated patch.
>
> Some functions are reworked and there is some code clean-up.
>
> The test results look good; there are no additional regressions.
>
> The new test cases in gcc.dg/fixed-point pass, except some convert-*.c,
> for two reasons:
>
> * Some test cases have a loss of precision and therefore fail.
>   One failure is that 0x3fffffffc0000000 is compared against
>   0x4000000000000000 and thus fails. Presumably it's a rounding
>   error from float. I'd say this is not critical.
>
> * PR54330: This leads to wrong code for __satfractudadq, and the
>   wrong code is already present in .expand. From a distance
>   this looks like a middle-end or tree-ssa problem.
>
> The new patch implements TARGET_BUILD_BUILTIN_VA_LIST.
> The rationale is that avr-fixed.md adjusts some modes, but these
> changes are not reflected by the built-in macros generated by gcc.
> This leads to wrong code in libgcc, because it deduces the
> type layout from these built-in defines. Thus, the respective
> nodes must be patched *before* the built-in macros are emitted.
>
> The changes to LIB2FUNCS_EXCLUDE currently have no effect;
> this needs http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01580.html,
> which is currently under review.
>
> Ok to install?
>

Please commit.

Denis.

^ permalink raw reply	[flat|nested] 12+ messages in thread
end of thread, other threads:[~2012-08-24 11:58 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-10 15:53 [Patch,AVR] PR54222: Add fixed point support Georg-Johann Lay
2012-08-10 16:09 ` Weddington, Eric
2012-08-10 17:06   ` Georg-Johann Lay
2012-08-10 17:56     ` Weddington, Eric
2012-08-10 21:34       ` Georg-Johann Lay
2012-08-12  9:13         ` Denis Chertykov
2012-08-13  9:28           ` Georg-Johann Lay
2012-08-21 16:10             ` Denis Chertykov
2012-08-23 14:50               ` Georg-Johann Lay
2012-08-23 16:42                 ` Weddington, Eric
2012-08-23 17:25                   ` Georg-Johann Lay
2012-08-24 11:58                   ` Denis Chertykov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).