dpcc - DProbes C compiler |
dpcc [−achp?] [-Iinclude_dir] [-Drpn_def] [-o outfile] filename |
The DProbes C compiler, dpcc , provides a high-level language interface to the IBM Dynamic Probes debugging facility, dprobes. The dprobes facility itself provides an assembly-like language based on Reverse Polish Notation (RPN) for writing user-defined probe-handlers, and allows a certain limited set of objects in the probed program to be specified symbolically, namely global variables and functions. dpcc allows probe-handlers to be written using a language comprising a substantial subset of ANSI C, and allows most probed program structures, including stack-based objects such as function parameters and locals, to be used symbolically in arbitrary ’probe expressions’. dpcc supports all probe and module types provided by dprobes i.e. user, kernel, and kernel module ’breakpoint’ and ’watchpoint’ probes. The C language implemented by dpcc supports a large number of ANSI C language features, and adds several others, most notably exceptions and try/catch exception handling. See the LANGUAGE REFERENCE section for an exhaustive description of the supported language, as well as a summary of how the DProbes C language differs from ANSI C. |
The dpcc command-line has the following options: |
−a |
print abstract syntax tree for the program (used for debugging). |
||
−c |
generate comments in the generated RPN code. |
||
−D |
pass the give define to the preprocessor. |
||
−h |
print help. |
||
−I |
include the given directory in the search path for files included in the program. |
||
−o |
generate the RPN output to the given output file. The default output file is the input filename with the extension (e.g .dpc) replaced by .rpn. |
||
−p |
ignore preprocessor errors. This is useful for ignoring preprocessor errors. |
||
−? |
print help. |
This man page serves as a reference manual and also contains a ’quick start’ and tutorial. Here’s an outline of its organization: Quick Start Compiling Probes Inserting and Testing Probes Language Reference Language Summary Semantic Differences From ANSI C Compilation Phases Language Details Pragma List Supported Types Variable Scope and Storage Global Variable Syntax Operators Functions Conditional Statements and Expressions Looping Statements Exceptions and Exception Handling Stack Traces Library Functions Built-in Functions Adding Your Own Built-in Functions Tutorial Preliminaries Examples Diagnostics Known Problems and Bugs See Also |
Compiling Probes To compile a dprobes C program, invoke dpcc on the source file as such: dpcc myprog.dpc dprobes C programs can include other files, such as the .dph files found in the ./include directory of the dpcc distribution. dpcc by default will look only in the directory containing the program being compiled, so if a program includes files located in other directories, -I options should be used: dpcc -Iinclude -Imyincludes myprog.dpc The output dprobes RPN file will be myprog.rpn. You can optionally specify an output file name: dpcc -o myprog.rpn myhack.dpc where myhack.dpc is the source file and myprog.rpn the resulting RPN file. If the -o option isn’t specified, the output RPN file will be myprog.rpn. Other options include: |
-D |
pass a define to the preprocessor. |
|||
-p |
ignore preprocessor errors. |
|||
-c |
include comments in the generated RPN code. |
Many of the example probes make use of some DProbes C ’header’ files contained in the include directory. Currently, the header files contained in the include directory are: |
defs.dph |
needed for exception handling and the exit(XXX) builtin function. |
regs.dph |
needed for builtin register manipulation functions. |
string.dph |
contains definitions for printing utility functions. |
To allow the compiler to find these files, the -I option should be used. Thus to compile an example probe, from the distribution directory: dpcc -Iinclude include/some-example.dpc The output will be in the current directory as some-example.rpn Inserting and Testing Probes The RPN file generated by the HLL compiler can be directly inserted using the dprobes insert command e.g. for a user-mode probe: dprobes -i some-example.rpn For a kernel probe: dprobes -i some-example.rpn -s /usr/src/linux/System.map If you get a ’probe not allowed’ error message from the dprobes command e.g. ’probe not allowed on opcode 0xcc’, first remove the current probes: dprobes -r -a then recompile the probe and insert again. The triggerprobe program include with the compiler allows the user-mode example probes to be triggered: ./triggerprobe Executing triggerprobe will run test code that triggers the example probes. See the probe source and triggerprobe.c for specifics. In all cases, the output of the probe can be found in the system log, /var/log/messages , for example. However, many of the examples make use of the utility functions definde in "string.dph", which are designed to display output to a ’debugging console’. This output can be viewed in real-time by running the ’tailprint’ Perl script included with the compiler: ./tailprint When the probe fires and executes one of the print/printd/printu/printx utility functions defined in "string.dph", the output will be immediately printed to the console window running tailprint. In practice, it’s most useful to have one console window running tailprint, and the other actually tailing the system log (tail -f /var/log/messages), in order to see output from probes that don’t use the tailprint as well as output from those that do. NOTE: If the system log on your system is not /var/log/messages, the tailprint script must be edited to reflect the actual location of the system log i.e. replace all occurrences of /var/log/messages with the correct location. |
Language Summary The C HLL language is basically ANSI C, with a few additions and several omissions, and some minor semantic differences. The grammar in c_hll_lang.y is derived from the grammar maintained by Jutta Degener, which can be viewed at http://www.lysator.liu.se/c/ANSI-C-grammar-y.html. Additions to ANSI C: |
- |
exception handling (try/catch) |
|||
- |
ROL/ROR (<<</>>>) |
Omissions: |
- |
typedefs |
|||
- |
function pointers |
|||
- |
function prototypes |
|||
- |
register keyword |
|||
- |
volatile keyword |
|||
- |
auto keyword |
|||
- |
float, double, long long |
|||
- |
variable arg functions (ellipsis) |
|||
- |
array/struct initializers |
|||
- |
static blocks |
|||
- |
bitfields |
|||
- |
"old" style function definitions |
|||
- |
labeled statements and goto |
Note that dprobes C doesn’t provide an equivalent to malloc() ; there is no provision to dynamically allocate objects at run-time, so struct/union objects can only be created at compile-time either as global or static variables, or on the stack as local variables. Probe program expressions use the gdb C parser and thus may accept a slightly different grammar in some cases, and they also don’t support the following: |
- |
floats, doubles |
|||
- |
function calls |
|||
- |
bitfields |
Probe expressions do however support the use of typedefs, so most probe expressions can be expressed using the same type definitions found in the source code being probed. Basically, anything that can be expressed using a gdb expression can be expressed in the same way in a probe expression. Semantic Differences From ANSI C |
- |
extern storage class has a syntax change to allow dprobes global variables (gvars) to be defined. For example, |
extern(2) int var; means that the variable var is stored starting at dprobes gvar index 2. extern gvar variables are visible between probe programs and are the closest thing the dprobes C language has to true global variables. The extern keyword is ignored when applied to anything else e.g. functions i.e. functions can never be visible outside of the program they’re defined in. Only primitive types can be extern, because global vars are cooperative, need a specified index, and it would be too error prone for the programmer to know actual gvar indices with the complication of struct lengths etc. The compiler can’t figure out which gvar indices a variable refers to unless explicitly told. |
- |
static storage class means the variable is stored in the lvars array of the probe, and is what all non-automatic variables default to. In ANSI C, non-automatic variables do not default to static. |
Compilation Phases Before dpcc ever sees the source file, it’s first passed through the gcc C preprocessor, which expands #include and #define statements in the source file, before passing it on to the dpcc compiler. If there isn’t a C preprocessor available, the -p option of dpcc allows the preprocessing phase to fail without causing the failure of the compile as a whole. The -I dpcc command-line option allows directories beyond the current directory to be searched for include files specified in #include directives in the source file. The -D option allows defines to be passed on to the preprocessor from the command-line. The output of dpcc is a dprobes RPN file that will subsequently be compiled by the dprobes RPN compiler (via the dprobes insert command, for instance). Language Details A probe program consists of a single source file which contains one or more ’probepoints’ or ’watchpoints’. The general format of a probe program is: File pragmas Global/static variables Probe pragmas (per probe/watch-point) Probe-handler function(s) (one per probe/watch-point) Support functions These can occur in any order, with the exception that global/static variables must be declared before use. Statements can exist only within functions and there must be one and only one probe-handler function per probepoint. Probe pragmas apply to the probe-handler function following them. A probe program can contain any number of functions, probe-handlers, and variables unassociated with any function or probe definition (free variables). Functions are visible only within the same ’program’ i.e. they have file scope, and can be defined in any order within the dprobes program file. Probe-handler functions are the entry points to the dprobes "program" and as such can’t be invoked by any other function. There is no equivalent to main() in a dprobes C program, or rather the function of main() as an entry point is assumed by one or more probe-handler functions which serve as entry points into the dprobes program. Note that there is no argc, argv equivalent for probe functions i.e. there is no way to pass arguments to a probe function. There are several required pragmas, which must be correct in order for a program to compile successfully. For probepoints, these are: #pragma MODNAME(modname) #pragma PROBEPOINT_HANDLER(entry point function) #pragma MODTYPE(user | kernel | kmod) #pragma PROBEPOINT_LOCATION("function name or filename:line number") For watchpoints, these are: #pragma MODNAME(modname) #pragma PROBEPOINT_HANDLER(entry point function) #pragma MODTYPE(user | kernel | kmod) #pragma WATCHPOINT_LOCATION(location) #pragma WATCHPOINT_TYPE(X | RW | W | IO) In particular, the MODNAME should name the image containing debug symbols for the module being probed. Modules being probed should be compiled using the -g option, and for this version of the DProbes compiler, must NOT be compiled with the -fomit-frame-pointers flag. To compile a kernel image with debugging symbols, add the -g compile flag and remove the -fomit-frame-pointers flag from the CFLAGS* variables in the top-level kernel Makefile and recompile. For kernel probes, the file you’ll name in the MODNAME pragma is the vmlinux file in the top-level kernel source directory (the bootable image is stripped of all debug info in a final phase of the make, so you won’t actually be running with a 10 M kernel!). Pragma List The DProbes C compiler supports the following pragmas, most of which translate fairly directly to the corresponding dprobes.lang RPN header statements: File pragmas (can only appear once per program file): #pragma MODNAME(modname) #pragma SYMBOLS(symbol_file) #pragma MODTYPE(user | kernel | kmod) #pragma MAJOR(major code) #pragma JMPMAX(number) #pragma LOGMAX(number) #pragma EX_LOGMAX(number) #pragma PRINTSTACKTRACE(yes | no) #pragma PROBE_GROUPDEF(groupdef) #pragma PROBE_TYPEDEF(typedef) Probe pragmas (can appear once per probe): #pragma PROBEPOINT_HANDLER(entry point function) #pragma PROBEPOINT_LOCATION("function name or filename:line number") #pragma PROBEPOINT_OPCODE(opcode) #pragma WATCHPOINT_LOCATION(location) #pragma WATCHPOINT_TYPE(X | RW | W | IO) #pragma MINOR(minor code) #pragma PASSCOUNT(number) #pragma MAXHITS(number) #pragma EX_MASK(number) #pragma PROBE_GROUP(group) #pragma PROBE_TYPE(type) #pragma LOGONFAULT(yes | no) The PROBEPOINT_HANDLER pragma names the ’entry-point’ function for a probe. This function must be a parameterless, void-returning function defined following this pragma. A given probe must name one and only one handler function. The PROBEPOINT_LOCATION can be either a function name or a "filename:line number" indicating the line number in the source file corresponding to where the probe should be applied (typically the filename:line number version is what you want to use to log the value of a local variable). In either case, the PROBEPOINT_LOCATION will be used to calculate an offset and opcode for probepoints, thus there is no need for an explicit offet pragma, although the opcode can be explicitly specified using the PROBEPOINT_OPCODE pragma if necessary. The SYMBOLS pragma can be used if the symbols should be found somewhere other than in the module itself. The PRINTSTACKTRACE pragma is used mainly to turn off stack tracing. By default, if a program contains try/catch blocks, stack tracing will be enabled, unless otherwise directed via this pragma. Note that there are no pragmas defined to set the number of local or global variables (i.e. dprobes lvar/gvar variables) - these values will be calculated and set by the compiler, and are inaccessible to user programs. Supported Types The dprobes C language supports a limited set of variable types: integral ( char, short, int, long, as well as signed and unsigned variants), enums, structs, and unions. Pointers to the built-in integral types (including void ) and derived types are supported. Arrays of built-in and derived types are also supported. Note that the granularity of ’memory’ locations in the dprobes C language is the same as the machine word size, (e.g. 32 bits for Intel x86), so all variables, including chars, occupy 32 bits regardless of their size in the language. Typedefs, function pointers, variable argument functions, floating point types, long long types, and bitfields are not yet supported. Pointers are defined in the dprobes C grammar and are also used to store addresses of objects in the probed program, so in addition to keeping track of their source, the compiler makes sure they’re wide enough to hold addresses in the target architecture. Variable Scope and Storage Within a probe program, variables declared outside of any function default to the ’static’ storage class, and are visible only to functions within the given probe program file. Variables outside of any function and qualified with the extern(n) keyword are visible to all probe programs i.e. they’re global. Function parameters and variables declared within functions default to the ’auto’ storage class, which means they’re visible only within the containing function, and don’t retain their values between invocations. Variables local to a function may be made to retain their values between function invocations by qualifying them with the static keyword. extern and static variables are automatically initialized to 0. Uninitialized automatic variables have undefined contents. Any variable declared within a block is local to that block. Blocks are delimited by curly braces - {}. Local variables and parameters are stored on the RPN stack. All other variables are stored either in the dprobes local or global variable arrays (lvar/gvar storage areas). All variables have a size that is a multiple of the machine word size (RPN stack width). From here on, ’word’ refers to this size, not the 16-bit size as used in the dprobes RPN language, unless otherwise noted. Global Variable Syntax Note that the notion of a global variable in dprobes refers to a variable visible between probe programs (as opposed to probepoints within the same program), and should only be used rarely; in most cases, static variables are what is really wanted i.e. variables visible to all functions in a program. Global variables, then, are specified using a slight modification of the extern keyword syntax. The programmer must explicitly select the dprobes gvar index that the variable maps onto, enclosed in parentheses immediately following the extern keyword: extern(gvar index) type variablename; If the variable refers to an array or stuct variable, the index refers to the index of the first array element of an array, or the first member of a struct. The reason this is necessary is that there really isn’t any linkage phase when compiling dprobes programs, so ’external linkage’ doesn’t really have any meaning. Although global variables are shared between probe programs, there isn’t currently any way to associate a global symbol accessible to all probe programs with a given global variable index. Operators The following binary operators are supported: +,-,*,/,%,<<,>>,&&,||,&,|,^,==,!=,<,>,<=,>=,=. The following assign-modify operators are supported: +=,-=,*=,/=,%=,<<=,>>=,&=,|=,^=. The following unary operators are supported: -,!,~,*,&,++,--,sizeof,+,CAST. Additionally, rol/ror and their assign-modify variants are supported as operators (<<<, >>>, <<<=, >>>=). Functions All executable instructions within a probe program must exist within some function. Global and static variables declared outside of any function of course exist outside of any function and are visible to all functions. Static variables declared within a function are visible only to the containing function, but persist across function calls. Parameters and non-static variables declared within a function exist only for the lifetime of the function call and are not visible outside the function. In addition to user-defined entry-point and support functions, there are a number of ’built-in’ functions available to all probe programs, which use the same parameter-passing convention. Functions can either return to their caller or can cause the probe to exit in one of 4 ways: exit and save logged data, exit and discard log data, exit and invoke an external debug facility, or exit removing the probe. If a probe-handler function exits normally i.e. via an explicit or implicit return from a probe-handler, logged data is saved. If a probe function calls the built-in function abort(), the function exits and discards the current log data. If a probe function calls the built-in function exit(n), the function exits and invokes one of the following external debug facilities corresponding to the value of n, which can take on the following pre-defined constant values: SAVE_LOG SGI_KDB SGI_CRASH_DUMP CORE_DUMP CORE_DUMP applies only to user probes. SGI_CRASH_DUMP and SGI_KDB apply only to kernel/kmod probes. If a probe function calls the built-in function remove(), the function exits and removes itself i.e. it will never be called again. Conditional Statements and Expressions There are 3 kinds of conditional constructs in dprobes C - if-else statements, switch statements and the ternary operator. The general form of an .B if-else statement is: if (expression) statement1 [else statement2] The general form of a switch statement is: switch (expresssion) { case constant1: statement1 case constant2: statement2 ... default: default_statement } The general form of the ternary operator is: expression ? expression1 : expression2 Looping Statements There are 3 types of looping constructs in dprobes C - for loops, while loops and do loops. The general form of a while loop is: while (expression) statement The general form of a for loop is: for(statement1; expression; statement3) statement2 The general form of a do loop is: do statement while (expression) When ’expression’ evaluates to false, the loops terminate. Loops are also terminated immediately after a break statement. The continue statement causes the current iteration to terminate immediately, after which the next iteration is started. ’statement’ can mean either a single statement or a block of statements. Exceptions and Exception Handling The dprobes C language supports exceptions and exception handling. Here’s the general form for try/catch exception handling (e1, e2, ... are constants): try { // try something } catch (e1) { // caught exception type e1 } catch (e2) { // caught exception type e2 } . . . catch (eN) { // caught exception type eN } // Code here is executed only if there was no exception // thrown by the code above, or an exception was thrown // and caught by one of the catch handlers above Exceptions can be thrown or re-thrown using the throw keyword in a statement: throw exception_code; Any block of code within a function may be wrapped in a try block. The code in a try block will execute normally until an exception occurs. If there is a catch block associated with the try block, which matches the particular exception that occurred, the code in the matching catch block will be executed. There can be multiple catch blocks, corresponding to multiple exception types, associated with a given try block. If there is no catch handler for a given exception in the current block, the exception is automatically propagated up to the containing block, or if there is no containing block, to the calling function. If the exception reaches the top-level function (i.e. a probe-handler) and remains uncaught in that function, the probe is terminated. If an exception is caught, it’s effectively ’canceled’ at that point; propagation of the exception is halted, and execution continues at the beginning of the matching catch block. A caught exception may be re-propagated via the throw statement. When an exception is re-thrown, execution in the current block stops at the point of the throw statement. User code can originate any exception type (though throwing built-in exception types is inadvisable), and may define and throw user-defined exception types. The built-in exception types are as follows: |
EX_INVALID_ADDR |
invalid flat address |
EX_SEG_FAULT |
invalid segmented address |
EX_MAX_JMPS |
the total number of jumps for this probe has exceeded the JMP_MAX pragma value |
EX_CALL_STACK_OVERFLOW |
the total number of function calls has exceeded the hard-coded dprobes limit (32) |
EX_DIV_BY_ZERO |
divide by 0 error |
EX_INVALID_OPERAND |
invalid value for dprobes instruction operand, could also be bad lvar/gvar index |
EX_INVALID_OPCODE |
opcode invalid for this interpreter |
EX_LOG_OVERFLOW |
the total number of bytes logged for this probe has exceeded the LOG_MAX pragma value |
EX_RPN_STACK_WRAP |
informational exception indicating that the dprobes RPN stack has wrapped |
EX_CATCH_ALL |
a catch-all value which can be used to match any exception. If used, typically used for the last catch statement in a try/catch block. |
Users can define their own exceptions by ORing their exception number with EX_USER as such: #define MY_EXCEPTION EX_USER | 0x00010000 For user exceptions, only the top half of the machine word width is available for defining exception values - the bottom half is reserved for built-in exception codes. Stack Traces If an exception occurs in a non-handler function, and remains unhandled after having been propagated to the top-level, probe-handler function, the probe is terminated and a stack trace will be logged to the system log. The data logged for a stack trace include the following: offset, probe major code, probe minor code, exception code and params, exception address, and an entry for each stack frame, local (lv) and global (gv) variable in use at the time of the exception. The exception code ’params’ detail additional information about certain exceptions: |
EX_INVALID_ADDR |
param 1: faulting address |
EX_SEG_FAULT |
param 1: faulting segment |
EX_MAX_JMPS |
param 1: JMP_MAX value |
EX_CALL_STACK_OVERFLOW |
param 1: number of nested calls |
EX_INVALID_OPERAND |
param 1: lv/gv/propagate bit out of range (1/2/3) param 2: invalid index or bit index |
EX_INVALID_OPCODE |
param 1: RPN program offset of invalid opcode |
EX_LOG_OVERFLOW |
param 1: LOG_MAX value |
EX_RPN_STACK_WRAP |
param 1: RPN stack size |
The ’tailprint’ script found in the dpcc distribution attempts to display the stack trace in a human-readable format, but doesn’t yet correlate exceptions with the source that produced them. Note that if an exception occurs in the probe-handler function and remains unhandled, nothing is logged, since at that point there is no call stack. Note also that there isn’t yet any way to specify param 1/param2 values for user-defined exceptions, which themselves should be able to take on user-defined values. Library Functions In addition to the functions built into the compiler (see below), there are a set of functions and definitions implemented as DProbes C code. These are contained in the dpcc distribution’s ./include directory, which should be specified in a -I compiler option when compiling code that accesses these functions and definitions. |
defs.dph |
needed for exception handling and the exit(N) builtin function. |
regs.dph |
needed for builtin register manipulation functions. |
string.dph |
contains definitions for the following utility functions: |
int strlen(char * str); |
Return the length of str. |
char * strchr(char * str, char c); |
Return pointer to first occurrenc of c in str. |
void print(char * str); |
Print str to tailprint ’console’. |
void printd(char * str, int j); |
Print a string to tailprint ’console’, replacing ’%d’ with the |
specified int value. Handles only single %d in string.
void printu(char * str, unsigned int u); |
Print a string to tailprint ’console’, replacing ’%u’ with the |
specified unsigned int value. Handles only single %u in string.
void printx(char * str, unsigned int xval); |
Print a string to tailprint ’console’, replacing ’%x’ with the |
specified unsigned int value. Handles only single %x in string.
The print() functions are designed to be used in conjunction with the ’tailprint’ Perl script included in the distribution. Built-in Functions There are a number of functions built into the compiler and which don’t require any additional includes, unless otherwise specified. Built-in functions also provide an avenue for developers to define and make available functions written directly in the DProbes RPN language rather than via DProbes C code. The current set of built-in functions is: |
unsigned probe_expr("probe expression") |
return the result of evaluating the probe expression contained in the string literal "probe expression". The result can be assigned to |
an HLL variable or pointer variable.
unsigned probe_expr_rel(probe pointer expression, "probe expression") |
return the result of evaluating the expression contained in the string literal "probe expression", relative to a probe pointer. The result can be assigned to an HLL variable or pointer variable. |
void log_probe_expr("probe expression") |
log the result of evaluating the expression contained in the string literal "probe expression". |
void log_probe_expr_rel(probe pointer expression, "probe expression") |
log the result of evaluating the expression contained in the string literal "probe expression", relative to a probe pointer. |
void log_expr(HLL expression) |
log the result of evaluating the HLL expression. |
void log_array(HLL array expression, HLL array length expression) |
log the specified number of elements of the HLL array pointed to by HLL array expression. |
void push(HLL expression) |
push the result of evaluating the HLL expression on the RPN stack. Used in conjunction logd/logw/logb. |
void log_probe_data(probe pointer expression, length expression) |
log the specified number of bytes at the location pointed to by the probe pointer expression. |
void log_probe_string(probe pointer expression) |
log bytes at the location pointed to by the probe pointer expression until the end of string is found. |
unsigned long get_reg(reg) |
get the value contained in specified register. REQUIRES include/regs.dph |
unsigned long get_user_reg(reg) |
get the value contained in specified user-context register. REQUIRES include/regs.dph |
void set_reg(reg, unsigned long value) |
set the value of the specified register to value. REQUIRES include/regs.dph |
void set_user_reg(reg, unsigned long value) |
set the value of the specified user-context register to value. REQUIRES include/regs.dph |
void logd(int n) |
log n dwords on TOS. |
void logw(int n) |
log n words on TOS. |
void logb(int n) |
log n bytes on TOS. |
void unlog() |
backout a log instruction failure by removing the failed log data. Should only be called from within an EX_LOG_OVERFLOW handler. |
void abort_probe() |
unconditionally abort the probe. |
void exit_probe(int n) |
exit invoking external debug utility specified by n. REQUIRES include/defs.dph. This function causes the probe to exit and invokes one of the following external debug facilities corresponding to the value of n, which can take on the following pre-defined constant values: SAVE_LOG SGI_KDB SGI_CRASH_DUMP CORE_DUMP CORE_DUMP applies only to user probes. SGI_CRASH_DUMP and SGI_KDB apply only to kernel/kmod probes. |
void remove_probe() |
exit removing probe forever. |
void set_minor(unsigned int min) |
set minor code. |
void set_major(unsigneed int maj) |
set major code. |
unsigned long get_pid() |
get pid of current process. |
unsigned long get_procid() |
get id of current processor. |
unsigned long get_task() |
get address of current task. |
unsigned char inb(int ioport) |
read byte at port ioport. |
void outb(unsigned char byte, int ioport) |
write byte to port ioport. |
unsigned short inw(int ioport) |
read word at port ioport. |
void outw(int ioport, unsigned short word) |
write word at port ioport. |
unsigned long inl(int ioport) |
read dword at port ioport. |
void outl(int ioport, unsigned long dword) |
write dword at port ioport. |
int is_valid_address(unsigned long address) |
return true if the address is valid. |
unsigned long seg2flat(unsigned long segment, unsigned long offset) |
return flat address corresponding to segmented addr. |
unsigned long combine_addresses(unsigned long lowdword, unsigned long highdword) |
return concat of lowword(lowdword) and lowword(highdword). |
Adding Your Own Built-in Functions In general, you can reuse your dprobes C code by including the functions you want to reuse in a file and #including it other source files. Sometimes, however, it’s not possible to write a certain function using only high-level-language C statements, and you’ll need to write a function using the dprobes RPN language directly, and make it available to be called from dprobes C language code. This means that the RPN code making up the function body has to understand how to access via RPN the arguments passed into the function and as well must understand how to return a value. The first step in creating a builtin function is to set up some data structures and register the new function with the compiler. This should be done by adding a registration call to the add_builtins() function in builtins.c. Here’s an example of a builtin function registration: ast_type * param_types[3]; param_types[0] = ast_builtin_type_int; param_types[1] = ast_builtin_type_signed_char; param_types[2] = ast_builtin_type_long; add_builtin_function(containing_scope, "my_builtin", ast_builtin_type_int, /* return type */ 3, /* n params */ param_types, /* param type list */ gen_my_fn); First, an array of ast_type * should be created, and filled with the types of each parameter the function will have, starting with the first. There are a set of global objects already allocated for primitive types, and some of these are used in the example above. To register the function, add_builtin_function() must be called with the appropriate values for the parameters of the add_builtin_function(), whose prototype is shown here: void add_builtin_function(ast_block * static_scope, char * fn_name, ast_type * retval_type, int n_params, ast_type * param_types[], void (*gen_fn_body_fn) (ast_node *)); |
static_scope |
The outermost scope. Passed in to the add_builtins() function, this should just be passed on to add_builtin_function(). |
fn_name |
The user-visible name of the function. |
retval_type |
The ast_type for the return value. Here are a list of the available built-in types: ast_builtin_type_void; ast_builtin_type_int; ast_builtin_type_unsigned_int; ast_builtin_type_short; ast_builtin_type_unsigned_short; ast_builtin_type_long; ast_builtin_type_unsigned_long; ast_builtin_type_signed_char; ast_builtin_type_unsigned_char; |
n_params |
The number of params the function has. |
param_types[] |
The array of ast_type * that describe the function params. |
gen_fn_body_fn |
A pointer to the function that will generate the code for the body of the function. Here’s an example gen_fn_body_fn function implementation for a built-in function with the following signature: int my_fn(int first_param, char second_param, long third_param); void gen_my_fn(ast_node * block_node) { dp_gen_line("push sbp, 1"); // get second_param // do something with second_param dp_gen_line("push sbp, 0"); // get first_param // do something with first_param dp_gen_line("push sbp, 2"); // get third_param // do something with third_param dp_gen_line("push 0x7777); // push value to return // retval position for 1 parameter fn returning unsigned char dp_gen_line("pop sbp, 3"); // pop return val into return val slot } |
Accessing function params in a function: When the function is entered, the SBP register points to the stack position when the function was entered, which is the position following the last parameter that was pushed. Since parameters are pushed in reverse order, the last parameter pushed is the first function parameter. This allows parameters to be accessed using their natural index relative to SBP e.g. the first param index is 0 relative to SBP, the second is 1 relative to SBP, etc. Returning a value from a function: If the function doesn’t return a value, there’s nothing to do. Otherwise, there’s a stack word reserved for the return value and it must be filled in with the actual return value. The reserved stack word is the stack word just before the first param that was pushed onto the stack before the call. For clarity, here’s a crude diagram of what the stack looks like just after a function call (stack grows downward): 3 space for return value |
1 second_param See builtins.c for actual examples. |
Preliminaries This section presents dpcc concepts using a hands-on approach, starting with the simplest probes and progressing through more advanced and useful features, explaining things along the way. If you’re more interested in ’just the facts, ma’am’, see the LANGUAGE REFERENCE section. The probes presented here are complete and tested probes. To try them out yourself, you can either cut and paste from this document, or see the tut*.dpc files in the examples directory of the dpcc distribution. Unless otherwise noted, these examples can be compiled using the following command-line (assuming your current directory is the dpcc distribution base directory): dpcc examples/tutN.dpc The compiled probe can be inserted using: dprobes -i tutN.rpn Make sure the dp and hook modules are actually loaded (using lsmod ) or you’ll get an error. If they aren’t, use the modprobe command to load them: modprobe dp Also, before compiling an example, make sure that the previous example is removed before it’s compiled: dprobes -r -a or you may get a ’probe not allowed’ error message from the dprobes insert command e.g. ’probe not allowed on opcode 0xcc’. For most of these examples, once the probe is inserted, running the triggerprobe progam: ./triggerprobe will cause the probe to fire, and you can look in the system log to see the results. It helps to have a separate xterm open and tailing the system log: tail -f /var/log/messages For the examples that use the print() functions with the tailprint Perl script, it helps to have a separate xterm open and running tailprint: ./tailprint You should make sure the tailprint script is actually tailing the correct file if different from /var/log/message by editing the tailprint script to reflect the correct log file. Examples This is pretty much the simplest probe you can write: #pragma MODNAME("/home/trz/dpcc-1.0.0/triggerprobe") #pragma PROBEPOINT_LOCATION("test_fn") #pragma MODTYPE(user) #pragma PROBEPOINT_HANDLER("test") /* View output in system log e.g. /var/log/messages. */ void test() { log_expr(7); } This probe specifies that whenever the test_fn() function in the user-space program "/home/trz/dpcc-1.0.0/triggerprobe" executes, the code in the probe-handler function, test(), defined in this probe program should be executed by the dprobes interpreter. In this case the result is simply to write the numeric constant, 7, to the system log. Here’s what the output entry in the system log would look like: Feb 7 10:13:48 positron kernel: dprobes(1,0) cpu=0 Feb 7 10:13:48 positron kernel: dprobes(1,0) 7 0 0 0 The MODNAME and MODTYPE pragmas are the two probe program file pragmas required by all probe programs and must appear only once per probe program file. The PROBEPOINT_LOCATION and PROBEPOINT_HANDLER pragmas are the two probe pragmas required by each probe-point defined in a probe program file. See below for an example of a probe program that contains multiple probe-points. The general format of a probe program is: File pragmas Global/static variables Probe pragmas (per probe/watch-point) Probe-handler function(s) (one per probe/watch-point) Support functions These can occur in any order, with the exception that global/static variables must be declared before use. Statements can exist only within functions and there must be one and only one probe-handler function per probe-point. Probe pragmas apply to the probe-handler function following them. Here’s a slighly more interesting simple probe: #pragma MODNAME("/home/trz/dpcc-1.0.0/triggerprobe") #pragma PROBEPOINT_LOCATION("test_fn") #pragma MODTYPE(user) #pragma PROBEPOINT_HANDLER("test") int i; /* View output in system log e.g. /var/log/messages. */ void test() { i++; log_expr(i); } This probe is similar to the above, except that rather than logging a constant, the probe-handler logs the current value of the ’global’ variable i, after incrementing it. The effect is to maintain a counter of the number of times that the probed function, test_fn(), has been executed. Here’s what the output entries in the system log would look like, after the probed function was executed 3 times: Feb 7 11:39:55 positron kernel: dprobes(1,0) cpu=0 Feb 7 11:39:55 positron kernel: dprobes(1,0) 1 0 0 0 Feb 7 11:39:58 positron kernel: dprobes(1,0) cpu=0 Feb 7 11:39:58 positron kernel: dprobes(1,0) 2 0 0 0 Feb 7 11:40:00 positron kernel: dprobes(1,0) cpu=0 Feb 7 11:40:00 positron kernel: dprobes(1,0) 3 0 0 0 This probe also illustrates a very important point which may be non-intuitive but illustrates a major difference of the dprobes C language from ANSI C. Notice that the ’counter’ global variable, i, was never initialized, and as a result automatically takes on the initial value 0, which is what would be expected for a static variable, but not a global variable, in the ANSI C definition. The short explanation is that ’global’ variables in dprobes C i.e. variables defined outside of any function, are more like static variables defined outside of any function in ANSI C, in that their scope is only valid within the program file they’re defined in, and they’re automatically initialized to 0. dprobes C also defines another type of ’global’ variable, which is visible between probe files, via the specially modified ’extern(n)’ keyword (see the examples in the examples subdirectory for an example.) In other words, the default storage class for variables (and functions) defined outside of any function is ’static’ rather than truly global. This ’staticness’ can be made explicit via the static keyword. Another important thing to understand at this point is that this example would not have worked correctly if the declaration of i had been int i=0; instead of int i; The reason for this is that user-initialized global/static variables are initialized at each execution of a probe handler. The technical reason is that there isn’t currently a hook available for when a probe is loaded that would allow initialization code to be run once per ’program load’, so user-defined static/global variable initializations are executed on every firing of the probe. Thus, to maintain a global variable visible between probes, make sure that the global variable isn’t user-initialized. The above examples demonstrate basic probe mechanics, but don’t really extract much useful information from the program being debugged. The real utility in using dpcc is the ability to log arbitrarily complex information about the internal state of the program being debugged, via symbolic expressions. First, some terminology. ’probe expressions’ refer to static strings in the dprobes C program that will be evaluated relative to the program being debugged when the probe fires. Another way to think about probe expressions is that the result of evaluating a probe expression is pretty much the same thing you’d get if you’d typed the expression at the command prompt of the gdb debugger after hitting a breakpoint. Here’s a simple example: #pragma MODNAME("/home/trz/dpcc-1.0.0/triggerprobe") #pragma PROBEPOINT_LOCATION("test_fn") #pragma MODTYPE(user) #pragma PROBEPOINT_HANDLER("test") /* Demonstrates logging value of probed program global variable. */ /* View output in system log e.g. /var/log/messages. */ void test() { log_probe_expr("global_var+2*3"); } In this probe, the probe expression "global_var+2*3" will be evaluated when the test() probe handler fires, which means that when the test_fn function in the triggerprobe program is executed, the current value of the triggerprobe program’s global variable ’global_var’ is fetched and added to the result of the sub-expression "2*3", then logged. Here’s the output from the system log: Feb 7 17:28:01 positron kernel: dprobes(1,0) cpu=0 Feb 7 17:28:01 positron kernel: dprobes(1,0) 28 2 0 0 When the probe was triggered the value of global_var in the triggerprobe program was 0x222, so the value displayed (0x222+6=0x228) is correct. Here’s an example that demonstrates the ability to log stack variables: #pragma MODNAME("/home/trz/dpcc-1.0.0/triggerprobe") #pragma PROBEPOINT_LOCATION("test_fn") #pragma MODTYPE(user) #pragma PROBEPOINT_HANDLER("test") /* View output in system log e.g. /var/log/messages. */ void test() { log_probe_expr("param1+param2"); } Feb 7 22:53:20 positron kernel: dprobes(1,0) cpu=0 Feb 7 22:53:20 positron kernel: dprobes(1,0) e 0 0 0 In this case, the parameters passed to test_fn(int param1, int param2) were both the value 7, so we see the sum here logged correctly. Local variables i.e. variables local to a function can also be used in probe expressions. Here’s a naive attempt to do so: #pragma MODNAME("/home/trz/dpcc-1.0.0/triggerprobe") #pragma PROBEPOINT_LOCATION("test_fn") #pragma MODTYPE(user) #pragma PROBEPOINT_HANDLER("test") /* View output in system log e.g. /var/log/messages. */ void test() { log_probe_expr("local1+2*3"); } We might expect to see the value, 0x111+2*3=0x117, in the output. Here’s what we actually get: Feb 8 17:56:14 positron kernel: dprobes(1,0) cpu=0 Feb 8 17:56:14 positron kernel: dprobes(1,0) ce fc ff bf Not exactly what we expected. The problem is that the probe program is fired when the probed program’s test_fn() is executed, and at that point none of the local variables have been initialized. In order to have the probe fire after the local is initialized, you have to use a different method to specify that the probe should fire at a location further within the probed function: #pragma MODNAME("/home/trz/dpcc-1.0.0/triggerprobe") #pragma PROBEPOINT_LOCATION("triggerprobe.c:118") #pragma MODTYPE(user) #pragma PROBEPOINT_HANDLER("test") /* View output in system log e.g. /var/log/messages. */ void test() { log_probe_expr("local1+2*3"); } By looking at the source code for triggerprobe.c, we see that the local variable, local1, is initialized on line 117. Therefore, in order to see a meaningful value for local1, we need to have the probe fire no sooner than line 118, which is what we specify in the PROBEPOINT_LOCATION pragma. Note that we specify a file:lineno rather than a function name. The logged data we now see is correct: Feb 8 17:59:44 positron kernel: dprobes(1,0) cpu=0 Feb 8 17:59:44 positron kernel: dprobes(1,0) 17 1 0 0 In general, you need to specify the probe location using the file:lineno method whenever you’re examining local variables and/or to log the effects of a particular line of code in the probed program. So far, the probe expressions we’ve looked at have been very simple. Here’s a much more involved expression (this example is from the Cygnus whitepaper, The Heisenberg Debugging Technology , see reference in the SEE ALSO section): #pragma MODNAME("/home/trz/dpcc-1.0.0/triggerprobe") #pragma PROBEPOINT_LOCATION("find") #pragma MODTYPE(user) #pragma PROBEPOINT_HANDLER("test") /* View output in system log e.g. /var/log/messages. */ void test() { log_probe_expr("tree->vector.p[tree->vector.n - 1]"); } Here are the struct definitions and function prototype used in the example program being debugged (triggerprobe.c, triggerprobe.h): struct point { int x, y; }; /* A vector is an array of points. N is the number of points, and p points to the first point in the array. */ struct vector { int n; struct point *p; }; /* A binary tree of vectors, ordered by KEY. */ struct tree { struct tree *left, *right; int key; struct vector *vector; }; struct tree * find (struct tree *tree, int key); Here’s the output: Feb 8 08:41:29 positron kernel: dprobes(1,0) cpu=0 Feb 8 08:41:29 positron kernel: dprobes(1,0) 0 8 0 4 0 0 0 4 0 0 0 tree is a pointer to the root of a linked data structure passed as a parameter to the probed function, find(). The expression goes through a couple levels of structure access and uses a complex expression as an array subscript. Notice that the object being logged is an instance of a struct point , which contains two integers. The output shows that both integers are automatically logged (the 3 initial bytes are a sentinel code and a count of how many of the bytes following it correspond to this log statement), illustrating that the expression analyzer will figure out how many bytes to log by examining the type of object being logged, where possible. The previous example illustrates the ability to log complex structures, but doesn’t hint at how one might log all the nodes of a linked structure given a pointer to one of them. In fact, it’s not possible to do so algorithmically using only the functionality presented so far, bringing us to our next example, which introduces log_probe_expr_rel() and probe_expr() as well as the ability to assign the result of evaluating a probe expression to a probe variable: #pragma MODNAME("/home/trz/dpcc-1.0.0/triggerprobe") #pragma PROBEPOINT_LOCATION("print_list") #pragma MODTYPE(user) #pragma PROBEPOINT_HANDLER("test") void * test_node; /* View output in system log e.g. /var/log/messages. */ void test() { test_node = probe_expr("list"); while(test_node) { log_probe_expr_rel(test_node, "val"); log_probe_expr_rel(test_node, "array[2]"); log_probe_expr_rel(test_node, "p.x"); test_node = probe_expr_rel(test_node, "next"); } } Here are the struct definitions and function prototype used in the example program being debugged (triggerprobe.c, triggerprobe.h): struct list_elt { int array[10]; int val; struct point p; char string[16]; char c; struct list_elt * next; }; void print_list(struct list_elt * list, char c, int testint); In the first line of the probe-handler, the result of evaluating list via the probe_expr() builtin function is assigned to the probe’s test_node static variable, which is a void *. The list variable within the context of the function being probed refers to a struct list_elt * parameter passed to the probed function, which is the head of a linked list. The probe_expr() function evaluates its argument and returns a single value from the program being probed, which can subsequently be stored in a probe variable. The value returned from probe_expr() or probe_expr_rel() can represent any value in the probed program and can be assigned to any probe program variable. In this case, the value returned refers to a pointer in the probed program, which is assigned to a void * probe variable because there’s no represention of a list_elt in the probe itself, and there generally won’t be unless it’s a pointer to a primitive type. Variables assigned the value of probe expression results are just like any other probe variable, with the exception of pointers, which have to keep track of the fact that they refer not to a location in the probe program but rather to a location in the probed program. Arithmetic operations applied to probe pointers take this into account, i.e. pointer arithmetic will work correctly according to what the pointer actually points to. Now that we have a pointer variable that contains a pointer to the linked list in the probed program, we can use the log_probe_expr_rel() and probe_expr_rel() builtin functions to traverse it and log each element’s variables along the way, until we reach the end of the list. The controlling while loop checks at each iteration whether or not the list_elt * pointer, test_node is NULL. If not, it uses that value to have log_probe_expr_rel() evaluate its probe expression relative to the passed-in test_node pointer and log the result. In this case, three such calls are made, one to log a primitive list_elt member, an embedded array member and an embedded struct member. The final line in the loop updates the test_node pointer variable to point to the next element in the linked list, via the probe_expr_rel() function, which is similar to the probe_expr() function except that the probe expression next is evaluated relative to the current value of test_node, which is subsequently updated with the new value. Here’s the probe output: Feb 8 10:11:35 positron kernel: dprobes(1,0) cpu=0 Feb 8 10:11:35 positron kernel: dprobes(1,0) 0 4 0 0 0 0 0 0 4 0 a 0 0 0 0 4 0 10 0 0 0 0 4 0 1 0 0 0 0 4 0 9 0 0 0 0 4 0 11 0 0 0 0 4 0 2 0 0 0 0 4 0 8 0 0 0 0 4 0 12 0 0 0 0 4 0 3 0 0 0 0 4 0 7 0 0 0 0 4 0 13 0 0 0 0 4 0 4 0 0 0 0 4 0 6 0 0 0 0 4 0 14 0 0 0 0 4 0 5 0 0 0 0 4 0 5 0 0 0 0 4 0 15 0 0 0 0 4 0 6 0 0 0 0 4 0 4 0 0 0 0 4 0 16 0 0 0 0 4 0 7 0 0 0 0 4 0 3 0 0 0 0 4 0 17 0 0 0 0 4 0 8 0 0 0 0 4 0 2 0 0 0 0 4 0 18 0 0 0 0 4 0 9 0 0 0 0 4 0 1 0 0 0 0 4 0 19 0 0 0 Note that in this case as well, each logged item is preceded with a sentinel value and the number of bytes per call i.e. each log call is preceded with 0 4 0. The next example is similar, but demonstrates logging an array of structs rather than a linked list: #pragma MODNAME("/home/trz/dpcc-1.0.0/triggerprobe") #pragma PROBEPOINT_LOCATION("print_alist") #pragma MODTYPE(user) #pragma PROBEPOINT_HANDLER("test") /* View output in system log e.g. /var/log/messages. */ void test() { void * test_node; void * alist; int i=1; alist = probe_expr("alist"); test_node = probe_expr("alist"); while(i<10) { log_probe_expr_rel(test_node, "val"); log_probe_expr_rel(test_node, "array[2]"); log_probe_expr_rel(test_node, "p.x"); test_node = alist+i; i++; } } This example also demonstrates that pointer arithmetic on probe pointers depends on the source of the pointer (probe or probed program). Pointer arithmetic is is used here rather than the probe_expr_rel() function to update the test_node pointer. Note also that for (non-string) arrays in the probed program, there isn’t a way for the expression analyzer to know how large the array is, so a value reflecting the array size must be explicitly supplied where needed in the probe program. The output: Feb 8 11:43:00 positron kernel: dprobes(1,0) cpu=0 Feb 8 11:43:00 positron kernel: dprobes(1,0) 0 4 0 0 0 0 0 0 4 0 a 0 0 0 0 4 0 10 0 0 0 0 4 0 1 0 0 0 0 4 0 9 0 0 0 0 4 0 11 0 0 0 0 4 0 2 0 0 0 0 4 0 8 0 0 0 0 4 0 12 0 0 0 0 4 0 3 0 0 0 0 4 0 7 0 0 0 0 4 0 13 0 0 0 0 4 0 4 0 0 0 0 4 0 6 0 0 0 0 4 0 14 0 0 0 0 4 0 5 0 0 0 0 4 0 5 0 0 0 0 4 0 15 0 0 0 0 4 0 6 0 0 0 0 4 0 4 0 0 0 0 4 0 16 0 0 0 0 4 0 7 0 0 0 0 4 0 3 0 0 0 0 4 0 17 0 0 0 0 4 0 8 0 0 0 0 4 0 2 0 0 0 0 4 0 18 0 0 0 If you’d rather just unconditionally dump the whole array, the next probe shows how: #pragma MODNAME("/home/trz/dpcc-1.0.0/triggerprobe") #pragma PROBEPOINT_LOCATION("print_alist") #pragma MODTYPE(user) #pragma PROBEPOINT_HANDLER("test") int size; void * alist_start; /* View output in system log e.g. /var/log/messages. */ void test() { alist_start = probe_expr("alist"); size = probe_expr("sizeof(struct list_elt)"); /* 0x13c */ size*=10; log_probe_data(alist_start, size); } Here, the log_probe_data() builtin function is used to log a specified number of bytes starting at the given probed program address. The number of bytes to log is here found by evaluating a sizeof expression within the context of the probed program, assigning that value to a probe variable and multiplying it by the number of items we know the array to have. Here’s the output: Feb 8 12:19:49 positron kernel: dprobes(1,0) cpu=0 Feb 8 12:19:49 positron kernel: dprobes(1,0) 0 f8 2 a 0 0 0 a 0 0 0 a 0 0 0 a 0 0 0 a 0 0 0 a 0 0 0 a 0 0 0 a 0 0 0 a 0 0 0 a 0 0 0 0 0 0 0 10 0 0 0 10 0 0 0 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 0 0 0 0 0 0 0 0 0 0 0 0 9 0 0 0 9 0 0 0 9 0 0 0 9 0 0 0 9 0 0 0 9 0 0 0 9 0 0 0 9 0 0 0 9 0 0 0 9 0 0 0 1 0 0 0 11 0 0 0 11 0 0 0 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 8 0 0 0 8 0 0 0 8 0 0 0 8 0 0 0 8 0 0 0 8 0 0 0 8 0 0 0 8 0 0 0 8 0 0 0 2 0 0 0 12 0 0 0 12 0 0 0 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 0 0 0 0 0 0 0 0 0 0 0 0 7 0 0 0 7 0 0 0 7 0 0 0 7 0 0 0 7 0 0 0 7 0 0 0 7 0 0 0 7 0 0 0 7 0 0 0 7 0 0 0 3 0 0 0 13 0 0 0 13 0 0 0 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 0 0 0 0 0 0 0 0 0 0 0 0 6 0 0 0 6 0 0 0 6 0 0 0 6 0 0 0 6 0 0 0 6 0 0 0 6 0 0 0 6 0 0 0 6 0 0 0 6 0 0 0 4 0 0 0 14 0 0 0 14 0 0 0 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 0 0 0 0 0 0 0 0 0 0 0 0 5 0 0 0 5 0 0 0 5 0 0 0 5 0 0 0 5 0 0 0 5 0 0 0 5 0 0 0 5 0 0 0 5 0 0 0 5 0 0 0 5 0 0 0 15 0 0 0 15 0 0 0 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 0 0 Feb 8 12:19:49 positron kernel: 0 0 0 0 0 0 0 0 0 0 4 0 0 0 4 0 0 0 4 0 0 0 4 0 0 0 4 0 0 0 4 0 0 0 4 0 0 0 4 0 0 0 4 0 0 0 4 0 0 0 6 0 0 0 16 0 0 0 16 0 0 0 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 3 0 0 0 3 0 0 0 3 0 0 0 3 0 0 0 3 0 0 0 3 0 0 0 3 0 0 0 3 0 0 0 3 0 0 0 7 0 0 0 17 0 0 0 17 0 0 0 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 2 0 0 0 2 0 0 0 2 0 0 0 2 0 0 0 2 0 0 0 2 0 0 0 2 0 0 0 2 0 0 0 2 0 0 0 8 0 0 0 18 0 0 0 18 0 0 0 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 9 0 0 0 19 0 0 0 19 0 0 0 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 0 0 0 0 0 0 0 0 0 0 0 0 There’s also a special logging function designed specifically to log NULL-terminated strings in the probed program, illustrated here: #pragma MODNAME("/home/trz/dpcc-1.0.0/triggerprobe") #pragma PROBEPOINT_LOCATION("print_list") #pragma MODTYPE(user) #pragma PROBEPOINT_HANDLER("test") /* View output in system log e.g. /var/log/messages. */ void test() { char * string_addr = probe_expr("list->next->string"); log_probe_string(string_addr); } Here, a pointer to a structure, list, passed as a parameter to the probed function, print_list, is passed to the log_probe_string() builtin function, which does exactly that for a string member of that struct: Feb 8 12:03:15 positron kernel: dprobes(1,0) cpu=0 Feb 8 12:03:15 positron kernel: dprobes(1,0) 1 c 0 68 65 6c 6c 6f 20 77 6f 72 6c 64 21 If you can read raw ASCII, you’ll see that this says "hello world!" (following the 3 sentinel and length bytes). The final logging function you need to be aware of is log_array(). Recall that log_expr() will log arbitrary hll expressions. It won’t however log hll arrays, because it doesn’t have any way to tell how many array elements to log, given a pointer to an array. Thus the need for log_array() : #pragma MODNAME("/home/trz/dpcc-1.0.0/triggerprobe") #pragma PROBEPOINT_LOCATION("test_fn") #pragma MODTYPE(user) #pragma PROBEPOINT_HANDLER("starthere") /* View output in system log e.g. /var/log/messages. */ int array[13]; void starthere() { int i; int sizeof_array = sizeof(array); for(i=0;i<sizeof_array;i++) { array[i] = i; } log_array(array, sizeof_array); } This probe simply initializes a static array and then logs it. Here’s the output: Feb 8 16:25:24 positron kernel: dprobes(1,0) cpu=0 Feb 8 16:25:24 positron kernel: dprobes(1,0) 5 d 0 0 0 0 0 1 0 0 0 2 0 0 0 3 0 0 0 4 0 0 0 5 0 0 0 6 0 0 0 7 0 0 0 8 0 0 0 9 0 0 0 a 0 0 0 b 0 0 0 c 0 0 0 To make probe development a little easier, bundled with the dpcc distribution is a Perl script named tailprint which when used in conjunction with several print() functions, allows the system log to be used as a sort of probe ’console’. Here’s an example probe program that demonstrates each of the print() functions: #pragma MODNAME("/home/trz/dpcc-1.0.0/triggerprobe") #pragma JMPMAX(65535) #pragma PROBEPOINT_LOCATION("test_fn") #pragma MODTYPE(user) #pragma PROBEPOINT_HANDLER("starthere") #include "string.dph" int i; int j = 0; unsigned int u=7; unsigned int xval=0x0caaffee; /* View print output using tailprint. */ int test_return(char * str, int a, int b, int c) { int d; int e; int f; d = a; e = b; f = c; printd(str, e); return 9; } void starthere() { i = test_return("hola %d", 6, 7, 8); printd("i: %d", i); printu("u: %u", u); printx("x: %x", xval); print("adios"); } The print() functions are actually implemented in the string.dph file in the ./include directory contained in the dpcc distribution. In order to use them, string.dph must be #included as above, and dpcc must be told to look in the include directory for include files e.g. if compiling in the dpcc distribution directory: dpcc -Iinclude examples/tut13.dpc Here’s the output you’ll see in the tailprint ’console’: hola 7 i: 9 u: 7 x: 0xcaaffee adios The print() functions simply replace the single format string parameter in each string param with the value of the following parameter, then effectively print a newline, so that the next print function starts on a new line (i.e. don’t use a newline yourself in the string). There can only be one %format char in each string for the printd() (integer version), printu() (unsigned int version), and printx() (hex version). The print() function itself doesn’t accept any %format and simply prints the single string param. So far, we’ve demonstrated user-space probes only. dpcc and dprobes can also be used to build kernel and kernel module probes. Here’s an example of a kernel probe: #pragma MODNAME("/usr/src/linux/vmlinux") #pragma PROBEPOINT_LOCATION("do_fork") #pragma MODTYPE(kernel) #pragma PROBEPOINT_HANDLER("test") /* View output in system log e.g. /var/log/messages. */ void test() { log_probe_expr("clone_flags"); } Notice that the MODTYPE pragma has the value of kernel and the MODNAME pragma lists vmlinux as the module name. This is the debugging version of the kernel image (not the stripped image that’s actually booted) which will be used to look up kernel symbols for probe expressions. The finished probe is inserted using a dprobes command something like this: dprobes -i tut14.rpn -s /usr/src/linux/System.map This probe is fired whenever a new process is forked, and logs the clone_flags parameter of the kernel’s do_fork function. Here’s what the output looks like: Feb 8 23:25:28 positron kernel: dprobes(1,0) cpu=0 Feb 8 23:25:28 positron kernel: dprobes(1,0) 0 4 0 11 0 0 0 dpcc can also be used to create kernel module probes: #pragma MODNAME("/lib/modules/2.4.6/kernel/drivers/net/3c59x.o") #pragma PROBEPOINT_LOCATION("update_stats") #pragma MODTYPE(kmod) #pragma PROBEPOINT_HANDLER("test") char * fn; /* View output in system log e.g. /var/log/messages. */ void test() { fn = probe_expr("teststring"); log_probe_string(fn); } This particular probe probes a version of the 3c59x driver hacked for demonstration purposes (a bogus string param added to the update_stats() function). Note that the MODNAME pragma names the installed module object file in the appropriate /lib/modules subdirectory, and that the MODTYPE pragma specifies ’kmod’. This probe can be tested by doing an ’ifconfig’ at the command-line. dpcc can also be used to create watchpoint probes. Here’s an example of a user-mode watchpoint probe: #pragma MODNAME("/home/trz/dpcc-1.0.0/triggerprobe") #pragma WATCHPOINT_LOCATION("test_watch_user_int:test_watch_user_int+3") #pragma WATCHPOINT_TYPE(RW) #pragma MODTYPE(kernel) #pragma PROBEPOINT_HANDLER("starthere") int val; /* View output in system log e.g. /var/log/messages. */ void starthere() { log_probe_expr("test_watch_user_int"); } This probe will be fired whenever the probed program variable test_watch_user_int, a 4-byte integer variable, changes value. The probe-handler simply logs the value of the variable after it’s changed. Note that the MODTYPE pragma specifies ’kernel’ even though the watchpoint is put on a user address. This is a dprobes requirement (see the dprobes man page for details). Also note that WATCHPOINT_LOCATION uses a symbolic address range and is specified in units of bytes. The other point to note is that the WATCHPOINT_TYPE pragma is an additional pragma required for all watchpoints. Here’s the output from this probe: Feb 9 00:00:41 positron kernel: dprobes(1,0) cpu=0 Feb 9 00:00:41 positron kernel: dprobes(1,0) 0 4 0 9 0 0 0 Finally, dprobes provides the ability to define multiple ’probepoints’ per probe program file, and likewise, so does dpcc: #pragma MODNAME("/home/trz/dpcc-1.0.0/triggerprobe") #pragma MODTYPE(user) #pragma JMPMAX(65535) #pragma PROBEPOINT_LOCATION("print_list") #pragma PROBEPOINT_HANDLER("test") /* View print output using tailprint. */ #include "string.dph" int j; void test() { j = probe_expr("list->next->val"); /* 1 */ printd("j: %d", j); } #pragma PROBEPOINT_LOCATION("test_fn") #pragma PROBEPOINT_HANDLER("starthere") int i; unsigned int u=7; unsigned int xval=0x0caaffee; int test_return(char * str, int a, int b, int c) { int d; int e; int f; d = a; e = b; f = c; printd(str, e); return 9; } void starthere() { i = test_return("hola %d", 6, 7, 8); printd("i: %d", i); printu("u: %u", u); printx("x: %x", xval); print("adios"); } #pragma PROBEPOINT_LOCATION("find") #pragma PROBEPOINT_HANDLER("test_find") int test_key; void test_find() { test_key = 13; test_key = probe_expr("key"); /* 3 */ printd("key: %d", test_key); } This probe program contains 3 probepoints, each preceded by pragmas that apply only to that probepoint. What we really mean, then, when we talk about a probepoint, is actually (at minimum) just a unique entry-point function as defined by unique PROBEPOINT_HANDLER and PROBEPOINT_LOCATION pragma values. Non-handler functions are available to any of the probe-point handlers, as are the global/static variables (these are only visible to code following their declarations however). |
- |
If you get a ’probe not allowed’ error message from the dprobes command e.g. ’probe not allowed on opcode 0xcc’, first remove the current probes: |
dprobes -r -a then recompile the probe and insert again. |
- |
If the Dprobes interpreter gets into a sort of state where output isn’t being printed, or only exceptions are being logged, try removing and re-inserting the dprobes module (assuming you compiled it as a module): |
rmmod dp |
- |
Executables (including the kernel and kernel modules) must be compiled without the -fomit-frame-pointers flag in order for parameters/local variables to be accessible within probe expressions. |
||
- |
Probes on inline functions not supported. |
||
- |
Referencing global or static data in modules not supported. |
||
- |
Floating point not supported. This is really a language non-feature rather than a bug, but as it will in the future be supported, is listed here. |
||
- |
long long not supported. This is really a language non-feature rather than a bug, but as it will in the future be supported, is listed here. |
||
- |
struct and union definitions containing embedded struct or union definitions aren’t supported. |
||
- |
The tailprint script can at times be fooled into producing rubbish as output. It and the print() utility functions should be considered unsupported; only the builtin logging functions are guaranteed to work as advertised. |
||
- |
Some of the the logging functions use instructions that prefix logged data with sentinel values, while others don’t, so it may be possible to generate ambiguous log data. This is a potential problem mainly when multiple, interleaved logging calls are made within the same probe handler. |
||
- |
When logging local arrays three extra words are prepended to the logged array. This is a result of having no good way to log the middle of the stack. |
||
- |
There’s currently no way to specify param1/param2 for user exceptions. |
||
- |
Don’t pass a pointer containing a probe expression result to a user-defined function (as opposed to builtin functions, which work fine). |
dprobes(8) dprobes.lang(8) The Heisenberg Debugging Technology, whitepaper by James Blandy and Michael Snyder (Cygnus/Redhat) |
IBM Corporation The DProbes C Compiler is based partly on the agent expressions code from gdb, the GNU Project Debugger. The DProbes C grammar is based on the ANSI C grammar maintained by Jutta Degener. |
Version 1.0.0 Last Modified February 2002 |
dpcc is licensed under GNU General Public License version 2 or later. Copyright (c) International Business Machines Corp., 2002 |