Applying Simulator "Warmup" Techniques in SID
External Specification
==============================================

Simulating program execution using SID can be a useful way to collect
information about the behaviour and performance of an application. For most
existing ports, SID provides cycle counting, an interface for collecting data
for analysis using gprof, and a variety of options for generating trace
output.

While simulation using these features is useful, it also imposes extra
overhead on the simulator, which can significantly slow down the simulation.
In addition, it is probable that only a portion of the application being
simulated is of interest for analysis. For example, when executing
applications written in C, the C runtime startup code is probably not of
interest.

It would be useful to enable detailed modelling for only the portions of the
application which are of interest. The extra overhead of detailed modelling
would then apply to a smaller portion of the simulation. The remaining
portion would run without this overhead, so the simulation as a whole would
finish more quickly.

In SID, the following options result in extra overhead for the simulator:

    --trace-extract
    --trace-semantics
    --trace-disassemble
    --trace-core
    --ulog-level
    --ulog-mode
    --wrap
    --verbose
    --trace-counter
    --final-insn-count
    --gprof
    --insn-count=N   where N is a small integer

All of these options could be enabled or disabled on demand during the
simulation to provide information only about the parts of the application
which are of interest.

Proposed Interfaces For Controlled Modelling in SID
===================================================

In order for this technique to be useful for analyzing real applications and
algorithms, precise control is necessary. We propose a combination of SID
command line options and a syscall instruction.

SID Command Line Options
------------------------

We propose the addition of new SID command line options:

--warmup

    This option starts SID in "warmup" mode regardless of the specification
    of the other options listed above. Warmup mode means that all of these
    options are set to the values which provide maximum simulation speed.

    As with most other SID options, if specified before the first --board,
    --warmup applies to all --boards; otherwise, it applies only to the
    previous --board.

--profile-config=<name>:<options>

    where
      <name>    is the name being assigned to this group of options
      <options> is one or more options from the list above

    This option associates the given set of options with the given profile
    name. This profile name may be referenced on a --profile-func option
    (see below) in order to activate that set of options for a given function
    or functions. This profile name may also be specified using a syscall
    instruction (see below) to activate a set of options within a function.

    This option may be specified more than once in order to define several
    configuration profiles. The position of the option on the command line is
    irrelevant. The named profile is available for use by any --board.

--profile-func=<functions>:<name>

    where
      <functions> is a comma-separated list of function names
      <name>      is the name of a profile configuration specified on a
                  --profile-config option (see above)

    This option automatically reconfigures SID with the specified options
    whenever one of the listed functions is entered and restores the previous
    configuration when the function exits. This can be used for gathering
    information about the execution of specific functions within an
    application.

    --profile-func may be specified more than once to provide different
    configurations for different functions. As with most other SID options,
    if specified before the first --board, it applies to all --boards;
    otherwise, it applies only to the previous --board.

    ***NOTE: This option will only work for ports in which the cpu component
    drives cg-caller, cg-callee, cg-jump and cg-return pins on calls and
    branches. Currently, no port does this.

--warmup-func=<functions>

    where
      <functions> is a comma-separated list of function names

    This option automatically returns SID to warmup mode whenever one of the
    listed functions is entered and restores the previous configuration when
    the function exits.

    --warmup-func may be specified more than once. As with most other SID
    options, if specified before the first --board, it applies to all
    --boards; otherwise, it applies only to the previous --board.

    ***NOTE: This option will only work for ports in which the cpu component
    drives cg-caller, cg-callee, cg-jump and cg-return pins on calls and
    branches. Currently, no port does this.

Syscall Instruction
-------------------

For finer control at the instruction level, we propose a system call. A
system call number which is currently not in use would be selected. We
propose the use of syscall number 0 since it is likely that it can be
specified on the system call instruction of all existing ports. The API for
this system call would be:

On Entry:

    Argument 1: Configuration mode

        0 -- warmup: The above options are automatically set to the values
             which result in maximum simulation speed for the cpu making the
             call.

        1 -- set: Used to set specific configuration options. Argument 2 is
             a pointer to a NUL-terminated string containing the name of a
             profile configuration specified by --profile-config. The
             syscall dynamically reconfigures SID to reflect the options
             specified by the given profile configuration for the cpu which
             executes the syscall.

        2 -- reset: Used to restore a previous configuration setting.
             Argument 2 is a configuration handle returned from a previous
             call to this syscall insn by the cpu making the call. The
             configuration is restored to the state represented by that
             handle.

On Exit:

    Return Value:
        The configuration handle of the previous configuration.
        This value may be used in a subsequent call to restore a previous
        configuration for the cpu making the call. If there is an error, the
        handle of the current configuration will be returned.

    Error code:
        0 if there was no error.
        If mode was 1 (set), a nonzero value indicates that the profile
        configuration name was not valid.
        If mode was 2 (reset), a nonzero value indicates that the
        configuration handle was not valid.

This system call may be made in a function specified on a --profile-func
option (see above). If so, the configuration which existed prior to calling
the function will automatically be restored when the function exits.

System call access from C/C++
-----------------------------

The libgloss implementation for a given port may provide access to the system
call from C/C++ by implementing the following function:

    #define _SID_CONFIG_WARMUP 0
    #define _SID_CONFIG_SET    1
    #define _SID_CONFIG_RESET  2
    unsigned _Sid_config (unsigned mode, ...);

The arguments to _Sid_config correspond directly to the interface of the
system call instruction above:

  o If 'mode' is _SID_CONFIG_WARMUP, then no further arguments are expected.

  o If 'mode' is _SID_CONFIG_SET, then a second argument of type
    'const char *' is expected to contain the name of a configuration profile
    specified on --profile-config, and a third argument of type 'unsigned *'
    is expected for returning the error code.

  o If 'mode' is _SID_CONFIG_RESET, then a second argument of type 'unsigned'
    is expected to contain the configuration handle, and a third argument of
    type 'unsigned *' is expected for returning the error code.

The return value will be a handle for the previous configuration. If there
are errors, then the handle of the current configuration will be returned.

If provided, the implementation of _Sid_config should simply make the
appropriate system call for the target.

Examples:
=========

NOTES:
------

o The examples below are for the xstormy16.
  Other ports will use different --board and --cpu flags and may use a
  different system call interface.

o The examples using --profile-func and --warmup-func depend upon the cpu
  component of the port driving the cg-caller, cg-callee, cg-jump and
  cg-return pins on calls and branches. Currently, no port does this.

o The examples using _Sid_config depend on the existence of the function
  _Sid_config in the libgloss implementation for the port. Currently, no
  port implements this function. Similar examples could be constructed using
  'asm' statements to make the system calls directly.

Example 1: Profile the entire simulation
----------------------------------------

sid --gprof=gprof.out,cycles=1 --trace-disassemble --trace-counter \
    --board=basic --cpu=xstormy16 --memory-region=0x0,0x100000 --load=a.out

This example behaves as SID does today. It collects gprof data and dumps a
trace of the disassembly and cycle count for the entire simulation.

Example 2: Trace and Profile part of one function as specified on the command line
----------------------------------------------------------------------------------

sid --warmup \
    --profile-config=myprofile:"--gprof=gprof.out,cycles=1 --trace-disassemble --trace-counter" \
    --board=basic --cpu=xstormy16 --memory-region=0x0,0x100000 --load=a.out

If function f in a.out looks as follows:

        .data
pname:
        # The profile name must be NUL-terminated, hence .asciz
        .asciz  "myprofile"
        .text
        .p2align 1
        .globl  f
        .type   f, @function
f:
        # enable profiling as specified by myprofile
        mov     r1,#1
        mov.w   r2,#pname
        .byte   0x01
        .byte   0x00
        ....
        # code to be tested goes here
        ....
        # Restore the simulation to warmup mode
        mov     r1,#0
        .byte   0x01
        .byte   0x00
        # return 0
        mov     r2,#0
        ret

This example will collect gprof data and dump a trace of the disassembly and
cycle count during the execution of the function f, and then restore the
simulator to the previous state (warmup mode) before returning.

Example 2a: Trace and Profile part of one function as specified on the command line
-----------------------------------------------------------------------------------

This example is the same as Example 2, except that it uses the 'reset' mode
of the syscall to restore the previous configuration after profiling.

sid --warmup \
    --profile-config=myprofile:"--gprof=gprof.out,cycles=1 --trace-disassemble --trace-counter" \
    --board=basic --cpu=xstormy16 --memory-region=0x0,0x100000 --load=a.out

If function f in a.out looks as follows:

        .data
pname:
        # The profile name must be NUL-terminated, hence .asciz
        .asciz  "myprofile"
        .text
        .p2align 1
        .globl  f
        .type   f, @function
f:
        # enable profiling as specified by myprofile
        mov     r1,#1
        mov.w   r2,#pname
        .byte   0x01
        .byte   0x00
        # handle for previous configuration is in r2
        ....
        # code to be tested goes here
        # must preserve r2
        ....
        # Restore the simulation to the previous mode
        # r2 contains the handle of the previous configuration
        mov     r1,#2
        .byte   0x01
        .byte   0x00
        # return 0
        mov     r2,#0
        ret

Example 3: Collect gprof=<file>,cycles=1 data for one function and then
restore the previous configuration
-----------------------------------------------------------------------

sid --warmup \
    --board=basic --cpu=xstormy16 --memory-region=0x0,0x100000 \
    --profile-func=test_function:myprofile \
    --profile-config=myprofile:"--gprof=gprof.out,cycles=1 --final-insn-count" \
    --load=a.out

test.c:
-------
int
main (void)
{
  non_profiled_function ();
  test_function ();
  another_non_profiled_function ();
  return 0;
}

This example will collect gprof data during the execution of test_function
and then restore the simulator to warmup mode when test_function returns.
Note that no source code changes are necessary in this example.

Example 4: Cache Priming
------------------------

It may be useful in some situations to run a function once without profiling
in order to prime the instruction cache and/or data cache before turning on
profiling:

sid --warmup \
    --profile-config=myprofile:"--gprof=gprof.out,cycles=1 --final-insn-count" \
    --board=basic --cpu=xstormy16 --memory-region=0x0,0x100000 --load=a.out

test.c:
-------
int
main (void)
{
  unsigned ix;
  unsigned set_error;    /* error codes are returned through 'unsigned *' */
  unsigned reset_error;

  /* Warm up the test function.  */
  test_function ();

  /* Reconfigure for profiling.  */
  ix = _Sid_config (_SID_CONFIG_SET, "myprofile", &set_error);
  test_function ();

  /* Restore the previous configuration.  */
  _Sid_config (_SID_CONFIG_RESET, ix, &reset_error);
  return 0;
}

Example 5: Excluding a function from profiling
----------------------------------------------

It may be useful to exclude one or more functions from profiling. For
example, one might like to exclude functions which set up the state of the
application being profiled. The example below profiles 'main' and all
functions which it calls except for 'setup_function':

sid --warmup \
    --profile-config=myprofile:"--gprof=gprof.out,cycles=1 --final-insn-count" \
    --profile-func=main:myprofile \
    --warmup-func=setup_function \
    --board=basic --cpu=xstormy16 --memory-region=0x0,0x100000 --load=a.out

test.c:
-------
int
main (void)
{
  int i;

  for (i = 0; i < 100; ++i)
    {
      setup_function (i);
      test_function ();
    }
  return 0;
}
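
Sketch: Checking the error codes from _Sid_config
-------------------------------------------------

Example 4 ignores the error codes returned through the third argument. The
sketch below shows one way the set/reset pair could be wrapped so that those
codes are checked. Because no port currently implements _Sid_config, the
sketch includes a host-side stand-in for it so that the calling pattern can be
tried outside the simulator. The stand-in is an assumption, not the libgloss
implementation: the handle numbering (0 for warmup mode, i+1 for the i-th
profile), the fixed profile table, the treatment of an unknown mode, and the
helper run_profiled are all hypothetical and are not part of this
specification.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

#define _SID_CONFIG_WARMUP 0
#define _SID_CONFIG_SET    1
#define _SID_CONFIG_RESET  2

/* ASSUMPTION: host-side model only.  Handle 0 stands for warmup mode and
   handle i+1 stands for profiles[i].  A real port would make the syscall.  */
static const char *const profiles[] = { "myprofile" };
#define NUM_PROFILES (sizeof profiles / sizeof profiles[0])
unsigned current_handle = 0;

unsigned
_Sid_config (unsigned mode, ...)
{
  va_list ap;
  unsigned previous = current_handle;  /* returned in every case */
  unsigned *error;
  unsigned i;

  va_start (ap, mode);
  switch (mode)
    {
    case _SID_CONFIG_WARMUP:
      current_handle = 0;
      break;
    case _SID_CONFIG_SET:
      {
        const char *name = va_arg (ap, const char *);
        error = va_arg (ap, unsigned *);
        assert (name != NULL && error != NULL);
        *error = 1;                    /* nonzero: profile name not valid */
        for (i = 0; i < NUM_PROFILES; ++i)
          if (strcmp (name, profiles[i]) == 0)
            {
              current_handle = i + 1;
              *error = 0;
            }
        break;
      }
    case _SID_CONFIG_RESET:
      {
        unsigned handle = va_arg (ap, unsigned);
        error = va_arg (ap, unsigned *);
        assert (error != NULL);
        *error = handle > NUM_PROFILES; /* nonzero: handle not valid */
        if (*error == 0)
          current_handle = handle;
        break;
      }
    default:
      /* ASSUMPTION: an unknown mode leaves the configuration unchanged.  */
      break;
    }
  va_end (ap);
  return previous;
}

/* Hypothetical helper: run FN under PROFILE, then restore the previous
   configuration.  FN always runs.  Returns 0 on success, 1 if the profile
   could not be selected (FN ran unprofiled), 2 if the restore failed.  */
int
run_profiled (const char *profile, void (*fn) (void))
{
  unsigned error = 0;
  unsigned previous = _Sid_config (_SID_CONFIG_SET, profile, &error);

  if (error != 0)
    {
      fn ();
      return 1;
    }
  fn ();
  _Sid_config (_SID_CONFIG_RESET, previous, &error);
  return error != 0 ? 2 : 0;
}

/* Sample function under test; records which handle was active.  */
unsigned handle_seen_by_test_function;

void
test_function (void)
{
  handle_seen_by_test_function = current_handle;
}
```

With this model, run_profiled ("myprofile", test_function) runs test_function
under the profile and then returns to the configuration that was active
before the call. Passing a name that was not defined leaves the configuration
untouched and still runs the function, so application behaviour does not
depend on whether profiling could be enabled.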