Introduction
------------

This tapset provides a framework for injecting faults into the kernel for
testing. SystemTap scripts use the framework to perform the actual fault
injection. The framework processes the command line arguments and controls
the fault injection process.

The following generic parameters are used to set up fault injection:

a) failtimes   - maximum number of times the process can be failed
b) interval    - number of successful hits between potential failures
c) probability - probability of potential failure
d) taskfilter  - fail all processes or filter processes on pid
e) space       - number of successful hits before the first failure
f) verbosity   - control amount of output generated by the script
g) totaltime   - duration of fault injection session
h) debug       - print debug information for the script
i) pid         - process IDs of processes to inject failures into. This can
                 also be specified using the -x option.

These parameters are registered in the tapset using the fij_add_option()
function, which also sets the script-specific default values and provides
help text. The generic parameters are appended to the params[] array and can
be accessed using params["variable_name"]. If you do not specify a parameter
on the command line, its default value is used. Using fij_load_param(), your
script can also assign script-specific default values to generic parameters.

You can define mandatory parameters, which are specific to the script and
depend on the kernel subsystem under test. These parameters must be specified
on the command line when the script is run. Examples are device numbers and
inode numbers, which cannot be given default values. Such parameters are
registered using the fij_add_necessary_option() function, which appends the
variable to the mandatoryparams[] array. If these parameters are not specified
on the command line, an error is reported and the script is aborted.
The variable can be accessed at params["variable_name"].

The framework controls fault injection through the fij_should_fail() and
fij_done_fail() functions. Your script should probe the kernel routine that is
subject to fault injection. The user-defined probe handler invokes
fij_should_fail(), which returns 1 if it is time to inject a failure, or 0
otherwise. Your script can inject the fault in various ways, such as faking an
error by changing the return value or modifying data structures.
fij_done_fail() must be called immediately after the fault is injected to
inform the tapset. It must not be called if no fault was injected.

fij_logger() is a wrapper for the SystemTap log() function with an added
verbosity parameter. The message is displayed only if the value of the global
fij_verbosity is greater than or equal to the parameter passed to the
function.

How to use the tapset
---------------------

1) Write a begin probe that adds user-defined parameters and default values.
2) Write probes for fault injection. Call fij_should_fail() before injecting
   the fault and fij_done_fail() after the fault is injected.

Description of code flow
------------------------

1) begin(less than -1000) in the user script [OPTIONAL] - Preinitialization.
   As of now, this is not necessary.
2) begin(-1000) in the tapset - Initialises counters and registers all
   generic parameters with global defaults.
3) begin in the user script - User-defined default parameters are supplied
   here. Any script-specific parameters are also registered at this stage.
4) begin(1000) in the tapset - Command line arguments are parsed and
   parameters are assigned appropriate values.
5) begin(more than 1000) in the user script [OPTIONAL] - This can be used to
   copy argument values from the params[] array to local/global variables for
   easy referencing.
6) The script starts executing. It is interrupted every 10 milliseconds to
   check whether it has run for the stipulated length of time.
7) When function/statement probes are hit, the script must invoke
   fij_should_fail() to check whether the conditions for failure have been
   satisfied.
8) Fail the function using a suitable method (changing return values,
   setting fake values in variables, and so on).
9) Call fij_done_fail() to inform the tapset that the fault has been
   injected.
10) The script exits either when it calls exit() or when the timeout is hit.
    At this point, the statistics of the experiment are printed.
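
The handshake in steps 7 to 9 can be illustrated with a small stand-alone
model. The sketch below is written in Python, not SystemTap, and is not the
tapset's implementation. The FaultGate class, its method names, and the run()
helper are hypothetical. It assumes that probability is a percentage, that
space counts hits before the first failure only, and that interval restarts
after each injected failure. The real tapset may combine these parameters
differently.

```python
import random


class FaultGate:
    """Toy model of the should-fail / done-fail handshake (not the tapset's code)."""

    def __init__(self, failtimes=1, interval=0, probability=100, space=0, seed=None):
        self.failtimes = failtimes      # maximum number of injected failures
        self.interval = interval        # successful hits required between failures
        self.probability = probability  # ASSUMPTION: a percentage in 0..100
        self.space_left = space         # successful hits required before the first failure
        self.interval_left = 0          # successful hits still owed since the last failure
        self.failures = 0               # failures injected so far
        self.rng = random.Random(seed)

    def should_fail(self):
        """Return 1 if this hit should be failed, 0 otherwise."""
        if self.failures >= self.failtimes:
            return 0
        if self.space_left > 0:
            self.space_left -= 1
            return 0
        if self.interval_left > 0:
            self.interval_left -= 1
            return 0
        if self.rng.random() * 100 >= self.probability:
            return 0
        return 1

    def done_fail(self):
        """Record an injected failure; call only after a fault was really injected."""
        self.failures += 1
        self.interval_left = self.interval


def run(gate, hits):
    """Simulate `hits` probe hits and return the decision taken at each one."""
    decisions = []
    for _ in range(hits):
        decision = gate.should_fail()
        if decision:
            gate.done_fail()  # reported immediately after the simulated fault
        decisions.append(decision)
    return decisions


if __name__ == "__main__":
    # space=2 skips two hits, interval=1 skips one hit between failures,
    # failtimes=2 stops after the second failure.
    print(run(FaultGate(failtimes=2, interval=1, probability=100, space=2), 7))
    # prints [0, 0, 1, 0, 1, 0, 0]
```

In this model, the counters advance only when done_fail() is called. This is
one way to see why the tapset requires fij_done_fail() after every injected
fault and forbids it when no fault was injected.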