Data Center Works Inc


Drilling Down Through Flint

No, not the kind of rock, the “flexible lint” program from Gimpel Software. Like lint, flint can find all sorts of problems with C and C++ programs.

However, lint programs try to find everything, and much of the time that's not what you need. When porting code or looking for a particular kind of bug, you don't want a general overview of everything that might be wrong. Instead, you want an answer to a narrow and specific question, like “where else is there a null pointer?”

Fortunately, flint in particular and lints in general can be convinced to pay attention to just a few things, and let you drill down to just what's important.

Using “select error” Options

Many lints have an option to turn off particular messages: flint has the option of turning on only particular ones.

The program comes with a file of all the error messages it can produce, so you can select just the ones you need. For example, the following are some errors involving null (zero) values:

32    Field size (member 'Symbol') should not be zero  -- The
54    Division by 0  -- The constant 0 was used on the right
84    sizeof object is zero or object is undefined  -- A sizeof
85    Array 'Symbol' has dimension 0  -- An array (named Symbol)

To select just these, we specify the -w0 option to turn off all warnings and errors, then add +e32 to turn error 32 back on. To turn on the others, we add +e54 +e84 +e85. That's most of the coding required to create a flint that looks only for these null-related errors:

#!/bin/sh
exec flint -w0 +e32 +e54 +e84 +e85 "$@"
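
When the list of messages grows, typing each +e option by hand gets error-prone. The following is a minimal sketch of one way to build the option string from bare message numbers. The select_opts helper is a hypothetical name of ours, not part of flint; the only flint options it relies on are the -w0 and +eNNN forms described above.

```shell
#!/bin/sh
# select_opts -- hypothetical helper (not part of flint): turn a list of
#       message numbers into "-w0 +eN +eN ..." so that only those
#       messages are reported.
select_opts() {
        opts='-w0'
        for n in "$@"; do
                opts="$opts +e$n"
        done
        echo "$opts"
}

# Print the options for the four messages listed above.
select_opts 32 54 84 85
```

The output can then be handed to flint unquoted, for example flint $(select_opts 32 54 84 85) file.c, so that the shell splits it back into separate options.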

An Example: Diagnosing Endianness Errors

On a recent engagement, we were porting code from an old Sequent to a SPARC. After the specific pointer issues we discussed in the Story of Thud and Blunder, we needed to look for other null pointer issues and also for endianness errors.

The Sequent was a little-endian machine, like an Intel, and we were porting to a big-endian machine, a SPARC, so certain errors were much more serious.

For example, dereferencing an int pointer that was accidentally pointed at a long returns the bottom 32 bits on a little-endian machine. If the long happens to contain a value less than 2^32, the result will be correct, even though the code is erroneous.

If you do the same thing on a big-endian machine, you'll get the high 32 bits, which is almost guaranteed to be wrong.
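
The arithmetic can be modelled in a few lines of shell. This is only an illustrative sketch, not code from the port: the two function names are our own, and it assumes a shell whose $(( )) arithmetic is at least 64 bits wide, which is typical on current systems but not guaranteed.

```shell
#!/bin/sh
# Model of reading a 64-bit long through a misdirected 32-bit int pointer.
# Assumption: this shell evaluates $(( )) with at least 64-bit integers.

# little_endian_read -- the first four bytes hold the LOW 32 bits
little_endian_read() {
        echo $(( $1 & 0xFFFFFFFF ))
}

# big_endian_read -- the first four bytes hold the HIGH 32 bits
big_endian_read() {
        echo $(( ($1 >> 32) & 0xFFFFFFFF ))
}

value=42        # comfortably less than 2^32
echo "little-endian: $(little_endian_read $value)"
echo "big-endian:    $(big_endian_read $value)"
```

For the value 42 the little-endian read gives 42 and the big-endian read gives 0, which is why the bug can stay hidden on the original machine and surface only after the port.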

To make flint find these errors and nothing else, we wrote the following script:

#!/bin/sh
#
# flintEndian -- test for a group of endian-related errors,
#       including passing a pointer to the first two bytes of
#       a 4-byte int and expecting it to work on a SPARC.
#
#set -x
ProgName=`basename "$0"`
# No trailing backslashes here: inside single quotes they would be kept
# literally and passed to flint as stray arguments.
ADD='+e66 +e67 +e68 +e69 +e70 +e71 +e110 +e124 +e413 +e418 +e433 +e501 +e502
+e503 +e507 +e511 +e514 +e515 +e516 +e519 +e520 +e521 +e522 +e524 +e532
+e545 +e546 +e549 +e569 +e570 +e571 +e572 +e605 +e609 +e610 +e611 +e619
+e632 +e633 +e634 +e635 +e636 +e637 +e638 +e639 +e640 +e641 +e643 +e647
+e650 +e655 +e656 +e674 +e679 +e680 +e684 +e701 +e702 +e703 +e704 +e712
+e713 +e730 +e731 +e732 +e734 +e735 +e736 +e737 +e740 +e741 +e747 +e776
+e790 +e792 +e812 +e815 +e826 +e909 +e911 +e912 +e913 +e914 +e915 +e916
+e917 +e918 +e919 +e920 +e921 +e922 +e923 +e924 +e925 +e926 +e927 +e928
+e929 +e930 +e958 +e959 +e1056 +e1059'

main() {
        if [ $# -lt 1 ]; then
                say "$ProgName error: you must supply something to lint"
                say "Usage: $0 [cc-opts] file"
                exit 1
        fi
        flint '-strong(AJX)' -w0 ${ADD} "$@"
}
say() {
        echo "$@" 1>&2
}

main "$@"

The program runs flint with just the options in ${ADD}, and therefore picks out a collection of errors that are common in “endian” conversions. They're not all about endianness, but they are all messages that indicated serious errors in data references that we found during the port.

When we ran this on fifty thousand lines of Sequent code, we got about 20 warnings, and found about eight things that were clearly wrong for a big-endian machine.

Twenty lines was a reasonable number to check, and we were very happy to find as many as eight real errors among them.

Using Grep or Awk

Remember we said that lints in general could be convinced to look at only one thing? It isn't nearly as fast, but you can run lint and then select just a specific set of messages to pay attention to.

For example, every diagnostic message about include files from the Solaris lint has a tag that begins with E_INCL_, followed by either M or N, and then by one of N, O or U. This can be matched with the awk expression if ($0 ~ /E_INCL_[MN][NOU]/). That allows us to write a script that selects only those errors:

#!/bin/sh
#
# lint_includes -- look for unneeded includes via lint
#
#
ProgName=`basename $0`
Verbose=0

# Turn on almost everything, but use one-line errors for selection in awk.
LINTOPTS='-errchk -errhdr=%all -errfmt=simple -errsecurity -Nlevel -Ncheck=macro -XCC -Xtransition -errtags'

main() {

        if [ $# -lt 1 ]; then
                say "$ProgName error: you must supply at least one C file"
                say "Usage: $0 [cc-opts] file"
                exit 1
        fi

        lint $LINTOPTS "$@" 2>&1 | postprocess
}

#
# postprocess -- make multi-line messages single-line ones and filter
#	out just the include messages for attention.
#
postprocess() {

        nawk '
        BEGIN { prev = ""; }
        /.*/ {
                # print ">>> " $0;
                if ( $0 ~ /^lint/) {
                        # Lint error, print it
                        print $0;
                        prev = "";
                }
                else if ($0 ~ /E_INCL_[MN][NOU]/) {
                        # Matched, see if it was multi-line
                        if ($0 !~ /^"/ && prev != "") {
                                # print "<<< multi-line";
                                printf("%s ", prev);
                                prev = "";
                        }
                        print $0;
                }
                else if ($0 ~ /^"/) {
                        # beginning of a possible multi-line message
                        prev = $0;
                }
        }
'
}

say() {
        echo "$@" 1>&2
}

main "$@"



This gives us a vendor lint that will select only a few messages of interest.
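
When the messages are already one per line, the selection step on its own can be sketched with grep instead of awk. This is a simplified illustration rather than a replacement for the script above: select_tagged is a hypothetical helper name, and the sample lines are made up for the demonstration, not captured from a real lint run.

```shell
#!/bin/sh
# select_tagged -- hypothetical helper: keep only the lines on standard
#       input that match the extended regular expression given in $1.
select_tagged() {
        grep -E -- "$1"
}

# Made-up sample lines, NOT real lint output; only the tag shapes matter.
sample='"a.c", line 3: sample text (E_INCL_NUSD)
"a.c", line 9: sample text (E_SOMETHING_ELSE)
"b.c", line 1: sample text (E_INCL_MNUSD)'

# Prints the first and third sample lines, whose tags fit the pattern.
printf '%s\n' "$sample" | select_tagged 'E_INCL_[MN][NOU]'
```

Unlike the awk version, this sketch does not rejoin messages that lint has split across two lines, so it is only suitable when the output format is strictly one message per line.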

What to Drill For

The next question is what to use this ability for. The answer is to look for things that are otherwise hard to find.

One of the hardest things to find is dead code or dead includes. That's code which, though it looks like it will be used in some cases, is actually unreachable.

This is code you have to maintain, but which has no value. Instead it wastes space and your time. There's an antipattern called “lava flow” [Brown 1998] that looks at how dead code builds up to mislead, confuse and slow the engineer, and discusses how to find and remove it.

Dead #includes are, if anything, worse. They silently slow compilation speed, and fill the preprocessor full of #defines that will never be used. And every time you compile, you pay the cost. In tests on several large programs, we and colleagues found that the compilation speed was cut in half by dead includes alone. So one of the things we can look for with flint and lint is dead code and unused includes, using the flintDead and lintDead scripts.

With modern 64-bit operating systems, it is often advantageous to upgrade programs and drivers from 32-bit to 64-bit. Both flint and lint have tests for mis-assigned data types, which catch a significant number of the problems in a ported program. The lint32bit script selects these tests.

Null pointers can show up anywhere, but are endemic when porting programs from the Sequent or the older VAX. Finding them with lintNullPointers or flintNullPointers can save you core dumps.

Finally, switching from a little-endian machine like an Intel to a big-endian one like a SPARC, Motorola or IBM PowerPC can cause some code to misbehave bizarrely. We found such problems with the flintEndian script above.

For your convenience, we provide all these scripts and a sorted file of the Solaris lint messages on our tools page.

References:

[Brown 1998] Brown, William J.; Malveau, Raphael C.; McCormick, Hays W. "Skip"; Mowbray, Thomas J.; Hudson, Theresa (ed.) (1998). AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis. John Wiley & Sons, Ltd. ISBN 0-471-19713-0.