



Insure++ User's Guide - Insure++

Part I



Insure++

As shown in the Getting Started manual, using Insure++ is straightforward. You simply recompile your program using the special insure command instead of your normal compiler. Running the program normally will then generate a report whenever an error is detected, and the report usually contains enough detail to track down and correct the problem.

What does this give you?

Obviously, the most important advantage of Insure++ is the fact that it automatically detects errors that might otherwise go unnoticed in normal testing. Subtle memory corruption errors and dynamic memory problems often don't crash the program or cause it to give incorrect answers until the program is shipped to customers and they run it on their test cases. Then the problems start.

Even if Insure++ doesn't find any problems in your programs, running it gives you greater confidence that your program is free of the kinds of errors it checks for.

Of course, Insure++ can't possibly check everything that your program does. However, its checking is extensive and covers a wide range of programming errors. The following sections discuss the types of errors that Insure++ will detect.

Memory corruption

This is one of the most unpleasant errors that can occur, especially if it is well disguised. As an example of what can happen, consider the program shown in Figure 1, which concatenates the arguments given on the command line and prints the resulting string.


        1: /*
        2:  * File: hello.c
        3:  */
        4: int main (int argc, char *argv[])
        5: {
        6:      int i;
        7:      char str[16];
        8:
        9:      str[0] = '\0';
        10:     for(i=0; i<argc; i++) {
        11:             strcat(str, argv[i]);
        12:             if(i < (argc-1)) strcat(str, " ");
        13:     }
        14:     printf("You entered: %s\n", str);
        15:     return (0);
        16: }

Figure 1. "Hello world" with bug

If you compile and run this program with your normal compiler, you'll probably see nothing interesting, e.g.,

 

		$ cc -g -o hello hello.c
		$ hello
		You entered: hello
		$ hello world
		You entered: hello world
		$ hello cruel world
		You entered: hello cruel world 

If this were the extent of your test procedures, you would probably conclude that this program works correctly, despite the fact that it has a very serious memory corruption bug.

If you compile with Insure++, the command "hello cruel world" generates the errors shown in Figure 2, because the string that is being concatenated becomes longer than the 16 characters allocated in the declaration at line 7.


        [hello.c:11] **WRITE_OVERFLOW**
        >> strcat(str, argv[i]);

                Writing overflows memory: str

                        bbbbbbbbbbbbbbbbbbbbbbbbbbbb
                        |            16            |   2   |
                        wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww

                Writing (w)   : 0xf7fff8a8 thru 0xf7fff8b9
                                   (18 bytes)
                To block (b)  : 0xf7fff8a8 thru 0xf7fff8b7
                                   (16 bytes)
                          str, declared at hello.c, 7

                Stack trace where the error occurred:
                          strcat() (interface)
                            main() hello.c, 11

        **Memory corrupted. Program may crash!!**

        [hello.c:14] **READ_OVERFLOW**
        >> printf("You entered: %s\n", str);

                String is not null terminated within range: str

                Reading    : 0xf7fff8a8 thru 0xf7fff8b9 (18 bytes)
                From block : 0xf7fff8a8 thru 0xf7fff8b7 (16 bytes)
                           str, declared at hello.c, 7

                Stack trace where the error occurred:
                        main() hello.c, 14

        You entered: hello cruel world

Figure 2. Insure++'s messages from the "Hello world" program

Insure++ finds all problems related to overwriting memory or reading past the legal bounds of an object, regardless of whether it is allocated statically (i.e., a global variable), locally on the stack, dynamically (with malloc or new), or even as a shared memory block.

It also detects the case in which a pointer crosses from one block of memory into another and starts to overwrite memory there, even if the memory blocks are adjacent.
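As a hedged sketch of one possible repair for Figure 1 (not a fix taken from this guide), the concatenation can track how much room is left and refuse to write past the end of the buffer. The helper name join_args and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Insure++ or the original hello.c):
 * joins count strings with single spaces into dst, which holds cap bytes.
 * Returns 0 on success, or -1 if the result would not fit; in that case
 * dst holds an empty string and nothing is written out of bounds.
 */
int join_args(char *dst, size_t cap, int count, char *const items[])
{
    size_t used = 0;    /* characters stored so far, excluding the '\0' */
    int i;

    assert(dst != NULL && cap > 0);
    dst[0] = '\0';
    for (i = 0; i < count; i++) {
        /* text of this item plus one separating space, if any */
        size_t need = strlen(items[i]) + (i < count - 1 ? 1 : 0);

        if (need >= cap - used) {   /* no room for text plus the '\0' */
            dst[0] = '\0';
            return -1;
        }
        strcat(dst, items[i]);
        if (i < count - 1)
            strcat(dst, " ");
        used += need;
    }
    return 0;
}
```

With a 16-byte buffer, "hello world" fits, while "hello cruel world" needs 18 bytes and is rejected instead of overflowing.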

Pointer abuse

Problems with pointers are among the most difficult encountered by C programmers. Insure++ detects pointer-related problems in the following categories:

  • Operations on NULL pointers.
  • Operations on uninitialized pointers.
  • Operations on pointers that don't actually point to valid data.
  • Operations which try to compare or otherwise relate pointers that don't point at the same data object.
  • Function calls through function pointers that don't actually point to functions.

Figure 3 shows the code for a second attempt at the "Hello world" program that uses dynamic memory allocation.


        1: /*
        2:  *
        3:  */
        4: #include <stdlib.h>
        5:
        6: int main (int argc, char *argv[])
        7: {
        8:      char *string, *string_so_far;
        9:      int i, length;
        10:
        11:     length = 1;     /* Include last NULL */
        12:
        13:     for(i=0; i<argc; i++) {
        14:             length += strlen(argv[i])+1;
        15:             string = malloc(length);
        16: /*
        17:  * Copy the string built so far.
        18:  */
        19:             if(string_so_far != (char *)0)
        20:                     strcpy(string, string_so_far);
        21:             else *string = '\0';
        22:
        23:             strcat(string, argv[i]);
        24:             if(i < argc-1) strcat(string, " ");
        25:             string_so_far = string;
        26:     }
        27:     printf("You entered: %s\n", string);
        28:     return (0);
        29: }

Figure 3. "Hello world" with dynamic memory allocation

The basic idea of this program is that we keep track of the current string size in the variable length. As each new argument is processed, we add its length to the length variable and allocate a block of memory of the new size. Notice that the code is careful to include the final NULL character when computing the string length (line 11) and also the space between strings (line 14). Both of these would be easy mistakes to make. It's an interesting exercise to see how quickly Insure++ would find such an error.

The code in lines 19-24 either copies the argument to the buffer or appends it depending on whether or not this is the first pass round the loop. Finally in line 25 we point at the new, longer string by assigning the pointer string to the variable string_so_far.

If you compile and run this program under Insure++, you'll see "uninitialized pointer" errors reported for lines 19 and 20. This is because the variable string_so_far hasn't been set to anything before the first trip through the argument loop!
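A minimal sketch of how the loop might be repaired, assuming the intent is simply to start with no string built so far: initialise the pointer to a null pointer so that the test on line 19 is meaningful. The function name concat_args is hypothetical, and releasing the previous block after copying is an addition that goes beyond the fix discussed here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical rewrite of the loop in Figure 3 as a function.
 * Returns a malloc'd string that the caller must free, or NULL
 * if an allocation fails.
 */
char *concat_args(int count, char *const items[])
{
    char *string_so_far = NULL;     /* initialised, unlike Figure 3 */
    size_t length = 1;              /* room for the final '\0' */
    int i;

    assert(count > 0);
    for (i = 0; i < count; i++) {
        char *string;

        length += strlen(items[i]) + 1;     /* argument plus a space */
        string = malloc(length);
        if (string == NULL) {
            free(string_so_far);
            return NULL;
        }
        if (string_so_far != NULL)
            strcpy(string, string_so_far);  /* copy what was built so far */
        else
            *string = '\0';
        strcat(string, items[i]);
        if (i < count - 1)
            strcat(string, " ");
        free(string_so_far);    /* addition: release the old block; free(NULL) is a no-op */
        string_so_far = string;
    }
    return string_so_far;
}
```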

Memory leaks

A "memory leak" occurs when a piece of dynamically allocated memory cannot be freed because the program no longer contains any pointers that point to the block. A simple example of this behavior can be seen by running the (corrected) "Hello world" program with the arguments

	hello3 this is a test 

If we examine the state of the program at line 27 on the second pass through the loop, just before the assignment on that line is executed, we observe:

  • The variable string_so_far points to the string "hello" which it was assigned as a result of the previous loop iteration.
  • The variable string points to the extended string "hello this" which was assigned on this loop iteration.

These assignments are shown schematically in Figure 4 - both variables point to blocks of dynamically allocated memory.

Figure 4

The next statement

 
    string_so_far = string; 

will make both variables point to the longer memory block as shown in Figure 5.

Figure 5

Once this has happened, however, there is no remaining pointer that points to the shorter block. Even if you wanted to, there is no way that the memory that was previously pointed to by string_so_far can be reclaimed - it is permanently allocated. This is known as a "memory leak", and is diagnosed by Insure++ as shown in Figure 6.


        [hello3.c:27] **LEAK_ASSIGN**
        >>        string_so_far = string;

                Memory leaked due to reassignment: string

                In block: 0x0001fbb0 thru 0x0001fbb6 (7 bytes)
                          block allocated at:
                                malloc() (interface)
                                  main() hello3.c, 17

                Stack trace where the error occurred:
                                  main() hello3.c, 27

Figure 6. Insure++ error report for the memory leak

This example is called LEAK_ASSIGN by Insure++ since it is caused when a pointer is re-assigned. Other types that Insure++ detects include:

LEAK_FREE
Occurs when you free a block of memory that contains pointers to other memory blocks. If there are no other pointers that point to these secondary blocks then they are permanently lost and will be reported by Insure++.

LEAK_RETURN
Occurs when a function returns a pointer to an allocated block of memory, but the returned value is ignored in the calling routine.

LEAK_SCOPE
Occurs when a function contains a local variable that points to a block of memory, but the function returns without saving the pointer in a global variable or passing it back to its caller.

Notice that Insure++ indicates the exact source line on which the problem occurs, which is a key issue in finding and fixing memory leaks. This is an extremely important feature, because it's easy to introduce subtle memory leaks into your applications, but very hard to find them all. Using Insure++, you can instantly pinpoint the line of source code which caused the leak.
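To make the LEAK_FREE case concrete, here is a small sketch with a hypothetical struct record that owns a second heap block. Freeing only the outer block would strand the inner one; releasing the inner block first avoids that. The type and function names are assumptions for illustration and do not come from Insure++.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record that owns a separately allocated name. */
struct record {
    char *name;
};

/* Allocates a record and a private copy of name; NULL on failure. */
struct record *record_create(const char *name)
{
    struct record *r;

    assert(name != NULL);
    r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    r->name = malloc(strlen(name) + 1);
    if (r->name == NULL) {
        free(r);
        return NULL;
    }
    strcpy(r->name, name);
    return r;
}

/*
 * Releases the inner block before the outer one. Calling free(r)
 * alone would discard the only pointer to r->name.
 */
void record_destroy(struct record *r)
{
    if (r != NULL) {
        free(r->name);
        free(r);
    }
}
```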

Should memory leaks be fixed?

Whether this is a serious problem depends on your application. To get more information on the seriousness of the problem, make a file called .psrc in your current directory and add to it the line(1)

    insure++.summarize leaks

Now when you run the program again, you will see the same output as before, followed by a summary of all the memory leaks in your code.

    MEMORY LEAK SUMMARY
    ===================

    4 outstanding memory references for 55 bytes.


    Leaks detected during execution
    -------------------------------
    	55 bytes 4 chunks allocated at hello3.c, 17 

This shows that even this short program lost four different chunks of memory. The total of 55 bytes isn't very large and you might well ignore it in a program this size. If, however, this was a routine in a larger program, it would be a serious problem, because every time the routine is called it allocates blocks of memory and loses some. As a result the program gradually consumes more and more memory and will finally crash when the memory space on the host machine is exhausted.
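The figure of 55 bytes can be checked by hand. Assuming the corrected program counts one byte for the terminator and then adds strlen(argv[i]) + 1 on each pass, the five allocations for "hello3 this is a test" are 8, 13, 16, 18, and 23 bytes. Every block except the last is orphaned by the reassignment, so 8 + 13 + 16 + 18 = 55 bytes are lost in 4 chunks. The sketch below repeats that arithmetic; the function name leaked_bytes is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: total size of the blocks orphaned by the
 * reassignment, assuming each pass allocates a block of the running
 * length and only the final block stays reachable.
 */
size_t leaked_bytes(int count, char *const items[])
{
    size_t length = 1;      /* room for the final '\0' */
    size_t leaked = 0;
    int i;

    assert(count > 0);
    for (i = 0; i < count; i++) {
        length += strlen(items[i]) + 1;
        if (i < count - 1)
            leaked += length;   /* this block is lost on the next pass */
    }
    return leaked;
}
```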

This type of bug can be extremely hard to detect, because it might take literally days to show up. It is exactly the type of bug that survives all your in-house testing and only shows up when you ship a product to a customer who needs to use it for some enormous processing task!


You may be wondering why Insure++ only prints one error message although the summary indicates that 4 memory leaks occurred. This is because Insure++ normally shows only the first error of any given type at each particular source line. If you wish, you can change this behavior as described in "Insure++ Reports".


You can obtain additional information about each individual memory leak with the .psrc option "insure++.summarize leaks".

Finding all memory leaks

For an even higher level of checking, we suggest the following algorithm for removing all memory leaks from your code. This process is unique - no other tool can do this. If you complete the following steps, there will not be any memory leaks left in your code.

  1. Compile your program normally, but link with insure -Zuse and run the program with Inuse (see "Using Inuse" in the Inuse manual). If you see an increase in the heap size as you run the program, you are leaking memory.
  2. Compile all source code, but not libraries, with Insure++. Clean all leaks that are detected by Insure++.
  3. Compile everything that makes up your application with Insure++ - source code and libraries. Clean any leaks detected by Insure++. If you do not have source for any of the libraries, skip this step and proceed to Step 4.
  4. Repeat Step 1. If memory is increasing, add insure++.summarize leaks outstanding to your .psrc file and run your Insure++ checked program again. Any outstanding memory reference shown is a potential leak.
  5. You must now examine each outstanding memory reference to determine whether or not it is a leak. If the pointer is passed into a library function, it may be saved. If this is the case, it is not a leak. Once every outstanding memory reference is understood, and those that are leaks are cleared, the program is free of memory leaks.

Dynamic memory manipulation

Using dynamically allocated memory properly is another tricky issue. In many cases programs continue running well after a programming error causes serious memory corruption - sometimes they don't crash at all.

One common mistake is to try to reuse a pointer after it has already been freed.

As an example we could modify the "Hello world" program to de-allocate memory blocks before allocating the larger ones. Consider the following piece of code which does just that:

    21: if(string_so_far != (char *)0) {
    22: 	free(string_so_far);
    23: 	strcpy(string, string_so_far);
    24: }
    25: else *string = '\0'; 

If you run this code (hello4.c) through Insure++, you'll get another error message about a "dangling pointer" at line 23. The term "dangling pointer" is used to mean a pointer that doesn't point at a valid memory block anymore. In this case the block is freed at line 22 and then used in the following line!

This is another common problem that often goes unnoticed, because many machines and compilers allow this particular behavior.
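A sketch of one way to order these statements safely, assuming the goal is to copy the old contents into the new block and then release the old one. The helper name grow_string is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: allocates new_size bytes, copies the old string
 * into the new block, and only then frees the old block. Returns the new
 * block, or NULL (leaving old untouched) if the allocation fails.
 * The caller must ensure new_size exceeds strlen(old).
 */
char *grow_string(char *old, size_t new_size)
{
    char *fresh;

    assert(new_size > 0);
    fresh = malloc(new_size);
    if (fresh == NULL)
        return NULL;
    if (old != NULL) {
        assert(strlen(old) < new_size);
        strcpy(fresh, old);     /* read the old block while it is still valid */
        free(old);              /* release it only after the copy */
    } else {
        *fresh = '\0';
    }
    return fresh;
}
```

After the call, the caller should use only the returned pointer; the old pointer is dangling as soon as free has run.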

In addition to this error, Insure++ also detects the following:

  • Reading from or writing to "dangling pointers".
  • Passing "dangling pointers" as arguments to functions or returning them from functions.
  • Freeing the same memory block multiple times.
  • Attempting to free statically allocated memory.
  • Freeing stack memory (local variables).
  • Passing a pointer to free that doesn't point to the beginning of a memory block.
  • Calls to free with NULL or uninitialized pointers.
  • Passing nonsensical arguments or arguments of the wrong data type to malloc, calloc, realloc or free.

Another way that Insure++ can help you track down dynamic memory problems is through the RETURN_FAILURE error code. Normally, Insure++ will not issue an error if malloc, for example, returns a NULL pointer because it is out of memory. This behavior is the default, because it is assumed that the user program is already checking for, and handling, this case.

If your program appears to be failing due to an unchecked return code, you can enable the RETURN_FAILURE error message class. Insure++ will then print a message whenever any system call fails.

Strings

The standard C library string handling functions are a rich source of potential errors, since they do very little checking on the bounds of the objects being manipulated.

Insure++ detects problems such as overwriting the end of a buffer as described in "Memory corruption". Another common problem is caused by trying to work with strings that are not null-terminated, as in the following example.

	1:	/*
	2:	 * File: readovr2.c
	3:	 */
	4:	main()
	5:	{
	6:		char junk;
	7:		char b[8];
	8:		strncpy(b, "This is a test",
	9:				  sizeof(b));
	10:		printf("%s\n", b);
	11:		return (0);
	12:	}

This program attempts to copy the string "This is a test" into a buffer which is only 8 characters long. Although it uses strncpy to avoid overwriting its buffer, the resulting copy doesn't have a terminating null character on the end. Insure++ detects this problem in line 10 when the call to printf tries to print the string.
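One common remedy, sketched here under the assumption that silent truncation is acceptable, is to copy at most one byte fewer than the buffer holds and store the terminator explicitly. The helper name copy_truncated is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copies src into dst (cap bytes), truncating if
 * necessary, and always leaves dst null-terminated.
 */
void copy_truncated(char *dst, size_t cap, const char *src)
{
    assert(dst != NULL && src != NULL && cap > 0);
    strncpy(dst, src, cap - 1);     /* leave room for the terminator */
    dst[cap - 1] = '\0';            /* strncpy does not add one on truncation */
}
```

With an 8-byte buffer, the source string is cut down to the seven characters "This is" followed by the terminator.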

Uninitialized memory

A particularly unpleasant problem to track down occurs when your program makes use of an uninitialized variable. These problems are often intermittent and can be particularly difficult to find using conventional means, since any alteration in the operation of the program may result in different behavior. It is not unusual for this type of bug to show up and then immediately disappear whenever you do something to try to trace it.

Insure++ performs checking for uninitialized data in two sub-categories:

copy
Normally, Insure++ doesn't complain when you assign a variable using an uninitialized value, since many applications do this without error. In many cases the value is changed to something correct before being used, or may never be used at all.

read
Insure++ generates an error report whenever you use an uninitialized variable in a context which cannot be correct, such as an expression evaluation.

To clarify the difference between these categories, consider the following code:

 
	1:	/*
	2:	 * File: readuni1.c
	3:	 */
	4:	#include <stdio.h>
	5:
	6:	int main()
	7:	{
	8:		struct rectangle {
	9:			int width;
	10:			int height;
	11:		};
	12:
	13:		struct rectangle box;
	14:		int area;
	15:
	16:		box.width = 5;
	17:		area = box.width*box.height;
	18:		printf("area = %d\n", area);
	19:		return (0);
	20:	}

In line 17 the value of box.height is used to calculate a value which is most definitely invalid, since box.height was never assigned. Insure++ detects this error in the READ_UNINIT_MEM(read) category. This category is enabled by default, so a message will be displayed.

If you changed line 17 to

	17:		area = box.height;

Insure++ would report errors of type READ_UNINIT_MEM(copy) for both lines 17 and 18, but only if you had unsuppressed this error category.
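A simple way to avoid this class of error, shown as a sketch rather than as anything Insure++ requires, is to give every member a value when the object is created. The helper names make_rectangle and rectangle_area are hypothetical.

```c
#include <assert.h>

struct rectangle {
    int width;
    int height;
};

/* Hypothetical helper: returns a rectangle with both members set. */
struct rectangle make_rectangle(int width, int height)
{
    struct rectangle box;

    assert(width >= 0 && height >= 0);
    box.width = width;
    box.height = height;
    return box;
}

/* Reads only members that make_rectangle has initialised. */
int rectangle_area(struct rectangle box)
{
    return box.width * box.height;
}
```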

Uninitialized memory detection options

In a significant change from earlier versions, Insure++ now detects uninitialized memory references using a full flow-analysis of your application's source code (and can often detect problems at compile time) by default. In addition to the performance enhancements made to enable this change, there are several new .psrc options, which allow greater control over this portion of Insure++'s checking abilities (see runtime options).

The default setting is the most comprehensive form of error detection, but obviously involves some overhead during compilation. If you wish to track only uninitialized pointers, you can set the following .psrc option.

 
	insure++.checking_uninit off 

Turning off this option does not, however, completely disable uninitialized variable checking. No errors will be reported in the READ_UNINIT_MEM class, but Insure++ will still check for uninitialized pointer variables and report these errors in the READ_UNINIT_PTR error category.

Warning

If checking_uninit is disabled, uninitialized pointer errors will be reported in the READ_UNINIT_PTR category, not READ_UNINIT_MEM.

Unused variables

Insure++ can also detect variables that have no effect on the behavior of your application, either because they are never used, or because they are assigned values which are never used. In most cases these are not serious errors, since the offending statements can simply be removed, and so they are suppressed by default.

Occasionally, however, an unused variable may be a symptom of a logical program error, so you may wish to enable this checking periodically. See "Unused variables" for more details.

Data representation problems

A lot of programs make either explicit or implicit assumptions about the various data types on which they operate. A common assumption made on workstations is that pointers and integers have the same number of bytes. While some of these problems can be detected during compilation, some code goes to great lengths to hide operations with typecasts such as

    char *p;
    int ip;
    ip = (int)p;

On many systems this type of operation would be valid and would cause no problems. When such code is ported to alternative architectures, however, problems can arise. The code shown above would fail, for example, when executed on a PC (16-bit integer, 32-bit pointer) or a 64-bit architecture such as the DEC Alpha (32-bit integer, 64-bit pointer).

In cases where such an operation loses information, Insure++ will report an error. On machines for which the data types have the same number of bits (or more), no error is reported.

Incompatible variable declarations

Insure++ detects inconsistent declarations of variables between source files.

A common problem is caused when an object is declared as an array in one file, e.g.,

	 int myblock[128]; 

but as a pointer in another

	extern int *myblock; 

See the files baddecl1.c and baddecl2.c for an example. Insure++ also reports differences in size, so that an array declared as one size in one file and another in a second will be detected.

I/O statements

The printf and scanf family of functions are easy places to make mistakes which show up either as bugs or portability problems.

Consider, for example, the code

 
    foo()
    {
    double f;

    scanf("%f", &f);
    } 

This code will not crash, but the value read into the variable f will not be correct, since its data type (double) doesn't match the format specified in the call to scanf (float). As a result, incorrect data will be transferred to the program.

In a similar way, the example badform2.c

 
	foo()
	{
		float f;

		scanf("%lf", &f);
	} 

corrupts memory, since too much data will be written over the supplied variable. This error can be very difficult to detect.

Insure++ detects both of these bugs.

A more subtle issue arises when data types used in I/O statements match "accidentally". The code

 
	foo()
	{
		long l = 123;
		printf("l = %d\n", l);
	} 

functions correctly on machines where types int and long have the same number of bits, but fails otherwise. Insure++ detects this error, but classifies it differently from the previous cases. You can choose to ignore this type of problem while still seeing the previous bugs. (See BAD_FORMAT for details.)
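The matching conversions can be summarised in a short sketch: scanf-style functions need %f for a float and %lf for a double, and printf-style functions need %ld for a long. The helper names are hypothetical, and sscanf and sprintf are used in place of scanf and printf only so that the example needs no terminal input.

```c
#include <assert.h>
#include <stdio.h>

/* Parses text as a double; returns 1 on success, 0 otherwise. */
int parse_double(const char *text, double *out)
{
    assert(text != NULL && out != NULL);
    return sscanf(text, "%lf", out) == 1;   /* %lf matches double * */
}

/* Parses text as a float; returns 1 on success, 0 otherwise. */
int parse_float(const char *text, float *out)
{
    assert(text != NULL && out != NULL);
    return sscanf(text, "%f", out) == 1;    /* %f matches float * */
}

/* Formats a long into buf, which must hold at least 32 bytes. */
int format_long(char *buf, long value)
{
    assert(buf != NULL);
    return sprintf(buf, "l = %ld", value);  /* %ld matches long */
}
```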

In addition to checking printf and scanf arguments, Insure++ also detects errors in other I/O statements. The code

 
    foo(line)
    	  char line[80];
    {
    	  gets(line);
    } 

works as long as the input supplied by the user is shorter than 80 characters, but fails on longer input. Insure++ checks for this case and reports an error if necessary.

This case is somewhat tricky, since Insure++ can only check for an overflow after the data has been read. In extreme cases the act of reading the data will crash the program before Insure++ gets the chance to report it.
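A bounded alternative is fgets, which takes the buffer size and never writes past it. The sketch below is an assumption about how the routine could be rewritten, not something taken from this guide; the name read_line and the choice to strip the trailing newline are illustrative.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical replacement for the gets call: reads at most size - 1
 * characters of one line from in and strips a trailing newline.
 * Returns 1 if something was read, 0 at end of file or on error.
 */
int read_line(FILE *in, char *line, size_t size)
{
    assert(in != NULL && line != NULL && size > 0);
    if (fgets(line, (int)size, in) == NULL)
        return 0;
    line[strcspn(line, "\n")] = '\0';   /* drop the newline, if present */
    return 1;
}
```

Input longer than the buffer is left in the stream for the next read instead of being written past the end of line.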

Mismatched arguments

Calling functions with incorrect arguments is a common problem in many programs, and can often go unnoticed.

Insure++ detects the error in the following program

    double foo(dd)
    	  double dd;
    {
   	  return dd + 1.0;
    }

    main()
    {
          printf("Result = %f\n", foo(1));
    } 

in which the argument passed to the function foo in main is an integer rather than a floating point number.

Warning

Converting this program to ANSI style (e.g., with a function prototype for foo) makes it correct since the argument passed in main will be automatically converted to double. Insure++ doesn't report an error in this case.
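For reference, a prototyped version of the same function might look like the sketch below. With the prototype in scope, the integer argument in foo(1) is converted to double before the call. The wrapper call_with_int is a hypothetical name added so the call can be shown without a main function.

```c
#include <assert.h>

double foo(double dd);      /* prototype: arguments are converted to double */

double foo(double dd)
{
    return dd + 1.0;
}

/* Calls foo with an int argument; the prototype makes this well defined. */
double call_with_int(void)
{
    double result = foo(1);     /* 1 is converted to 1.0 */

    assert(result == 2.0);
    return result;
}
```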

Insure++ detects several different categories of errors, which you can enable or suppress separately depending on which types of bugs you consider important.

Sign errors
Arguments agree in type but one is signed and the other unsigned, e.g., int vs. unsigned int.

Compatible
The arguments are different data types which happen to occupy the same amount of memory on the current machine, e.g. int vs. long if both are thirty-two bits. While this error may not cause problems on your current machine, it is a portability problem.

Incompatible types
Similar to the example above - data types are fundamentally different or require different amounts of memory. int vs. long would appear in this category on machines where they require different numbers of bits.

C++ compile time warnings

During compilation, Insure++'s parser detects a number of C++-specific problems and prints warning messages. These messages are coded by the chapter, section, and paragraphs pertaining to that warning in the draft ANSI standard. Therefore, if you are uncertain what a particular warning message means or would like additional information, you can consult the standard for an explanation.

As an example, when processed by Insure++, the code

 
	void foo(char *str) { }
	void func()
	{
		void *iptr = (char *) 0;
		foo(iptr);
	} 

will produce the warning

	insure -c foo.C
	[foo.C:5] Warning:13-2: wrong arguments passed to function 'foo'
	|   declared at: [foo.C:1]
	| expected args: (char *)
	|   passed args: (void *)
	>> foo(iptr);

Invalid parameters in system calls

Interfacing to library software is often tricky, because passing an incorrect argument to a routine may cause it to fail in an unpredictable manner. Debugging such problems is much harder than correcting your own code, since you typically have much less information about how the library routine should work.

Insure++ has built-in knowledge of a large number of system calls and checks the arguments you pass to ensure correct data type and, if appropriate, correct range.

For example, the code

	void myrewind(FILE *fp)
	{
		fseek(fp, (long)0, 3);
	}

would generate an error since the last argument passed to the fseek function is outside the legal range.
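A version that would not trigger this error, assuming the intent is to return to the start of the file, uses the standard SEEK_SET constant for the final argument. Returning the result of fseek is an addition made here so the caller can check for failure.

```c
#include <assert.h>
#include <stdio.h>

/* Moves fp back to the beginning; returns fseek's result (0 on success). */
int myrewind(FILE *fp)
{
    assert(fp != NULL);
    return fseek(fp, 0L, SEEK_SET);     /* SEEK_SET, SEEK_CUR, or SEEK_END */
}
```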

Unexpected errors in system calls

Checking the return codes from system calls and dealing correctly with all the error cases that can arise is a very difficult task. It is a very rare program that deals with all possible cases correctly.

An unfortunate consequence of this is that programs can fail unexpectedly after they have been shipped to customers because some system call fails in a way that had not been anticipated. The consequences of this can range from a nasty "core dump" to a system that performs erratically at the customer location.

Insure++ has a special error class, RETURN_FAILURE, that can be used to detect these problems. All the system calls known to Insure++ contain special error checking code that detects failures. Normally these errors are suppressed, since it is assumed that the application is handling them itself, but they can be enabled at runtime by adding the line

	insure++.unsuppress RETURN_FAILURE 

to a .psrc file. Any system call that returns an error code will then print a message indicating the name of the routine, the arguments supplied, and the reason for the error.
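For comparison, the kind of explicit checking that this option stands in for can be sketched as follows. The helper name open_for_reading and the wording of its message are hypothetical, and the use of errno assumes a system (such as POSIX) where fopen sets it on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: opens path for reading and, on failure, reports
 * the routine, its arguments, and the reason. Returns NULL on failure.
 */
FILE *open_for_reading(const char *path)
{
    FILE *fp;

    assert(path != NULL);
    errno = 0;
    fp = fopen(path, "r");
    if (fp == NULL)     /* errno is set by fopen on POSIX systems */
        fprintf(stderr, "fopen(\"%s\", \"r\") failed: %s\n",
                path, strerror(errno));
    return fp;
}
```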

This capability detects any error in any system call known to Insure++, so failures in many different situations are reported automatically.

Threads

In order for Insure++ to be able to correctly track memory in threaded programs, all calls to pthread_create() or thr_create() must have been "seen" by Insure++. In Insure++ 3.1 and earlier versions this meant that all files that ever call thread creation routines must be instrumented with Insure++. This is still true in version 4.0 if you are using backward compatibility mode (interface_preference tqi tqs). Starting with version 4.0, Insure++ defaults to using "Library Interpositioning" (referred to as TQL interfaces from now on). This mode guarantees that the above requirement is met, even if Insure++ was only used to link the executable and none of the source files have been instrumented.

Achieving Total Quality Software

The previous sections described the various types of problems detected by Insure++. As you can see, a very large number of problems can be detected as simply as recompiling your program and running it under Insure++. Hopefully, this will eliminate many bugs that you might otherwise ship to your customers.

It would be naive, however, to expect that Insure++ will remove all of the bugs in your code. Some will still make it through all the testing steps. Luckily, Insure++ can still help even after you've shipped your product.

An important way that Insure++ can help you reach the Total Quality Software goal is to ship two versions of your product to your customers:

  • The normal version, compiled without Insure++
  • A version built with Insure++

This second version can be used at the customer site to help track down problems. This will dramatically improve the efficiency of your support staff at finding bugs in the released software.


Footnotes

(1)
If you already have a file called .psrc in your directory, simply add this line to it.

For more information, call (888) 305-0041 or send email to: insure@parasoft.com

(888) 305-0041 info@parasoft.com Copyright © 1996-2000 ParaSoft