Data Center Works Inc
No, not the kind of rock, the “flexible lint” program from Gimpel Software. Like lint, flint can find all sorts of problems with C and C++ programs.
However, lint programs try to find everything, when much of the time that's not what you need to do. When doing porting or looking for a particular kind of bug, you don't want a general overview of everything that might be wrong. Instead, you want an answer to a narrow and specific question, like “where else is there a null pointer?”
Fortunately, flint in particular and lints in general can be convinced to pay attention to just a few things, and let you drill down to just what's important.
Using “select error” Options
Many lints have an option to turn off particular messages: flint has the option of turning on only particular ones.
The program comes with a file of all the error messages it can produce, so you can select just the ones you need. For example, the following are some errors involving null values:
32 Field size (member 'Symbol') should not be zero -- The
54 Division by 0 -- The constant 0 was used on the right
84 sizeof object is zero or object is undefined -- A sizeof
85 Array 'Symbol' has dimension 0 -- An array (named Symbol)
To select just these, we specify the -w0 option to turn off all warnings and errors, and then add +e32 to turn error 32 back on. To turn on the others, we add +e54 +e84 +e85. That's most of the coding required to create a flint that looks only for these null-related errors:
#!/bin/sh
exec flint -w0 +e32 +e54 +e84 +e85 "$@"
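If you build several of these wrappers, the option string can be generated from a list of message numbers. The following is a minimal sketch, not part of the flint distribution: `flint_select_opts` is a hypothetical helper name, and the only flint options it assumes are the `-w0` and `+eN` forms described above.

```shell
#!/bin/sh
# flint_select_opts -- hypothetical helper: print "-w0" followed by
# "+eN" for every message number given as an argument.
flint_select_opts() {
    opts='-w0'
    for n in "$@"; do
        opts="$opts +e$n"
    done
    echo "$opts"
}

# Print the command line instead of running it, so the sketch can be
# tried on a machine where flint is not installed.
echo flint $(flint_select_opts 32 54 84 85) '"$@"'
```

Run as is, this prints the same command line as the wrapper above, using the four message numbers from the list (32, 54, 84 and 85).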
An Example: Diagnosing Endianness Errors
On a recent engagement, we were porting code from an old Sequent to a SPARC, and after the specific pointer issues we discussed in the Story of Thud and Blunder, we needed to look for other null pointer issues and also endian-ness errors.
The Sequent was a little-endian machine, like an Intel, and we were porting to a big-endian machine, a SPARC, so certain errors were much more serious.
For example, dereferencing an int pointer that was accidentally pointed to a long returns the bottom 32 bits on a little-endian machine. If the long happens to contain a value less than 2^32, the results will be correct, even though the code is erroneous.
If you do the same thing on a big-endian machine, you'll get the high 32 bits, which is almost guaranteed to be wrong.
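The arithmetic behind this can be sketched in the shell. This is only a model, with no pointers involved: it assumes a 64-bit long, as the description above does, and a shell whose arithmetic is at least 64 bits wide. The function names `low32` and `high32` are made up for the illustration.

```shell
#!/bin/sh
# Model of which half of a 64-bit value a misdirected 32-bit read sees.
# Assumes the shell's $(( )) arithmetic is at least 64 bits wide.

# The half a little-endian machine would return: the bottom 32 bits.
low32() { echo $(( $1 & 0xFFFFFFFF )); }

# The half a big-endian machine would return: the top 32 bits.
high32() { echo $(( $1 >> 32 )); }

value=1000
echo "little-endian read: $(low32 $value)"
echo "big-endian read:    $(high32 $value)"
```

For a small value such as 1000, the little-endian read gives back 1000 and hides the bug, while the big-endian read gives 0.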
To make flint find these errors and nothing else, we wrote the following script:
#!/bin/sh
#
# flintEndian -- test for a group of endian-related errors,
#	including passing a pointer to the first two bytes of
#	a 4-byte int and expecting it to work on a SPARC.
#
#set -x
ProgName=`basename "$0"`
# Double quotes, so that each backslash-newline is a line continuation
# and no stray backslash is passed to flint as an argument.
ADD="+e66 +e67 +e68 +e69 +e70 +e71 +e110 +e124 +e413 +e418 +e433 +e501 +e502 \
+e503 +e507 +e511 +e514 +e515 +e516 +e519 +e520 +e521 +e522 +e524 +e532 \
+e545 +e546 +e549 +e569 +e570 +e571 +e572 +e605 +e609 +e610 +e611 +e619 \
+e632 +e633 +e634 +e635 +e636 +e637 +e638 +e639 +e640 +e641 +e643 +e647 \
+e650 +e655 +e656 +e674 +e679 +e680 +e684 +e701 +e702 +e703 +e704 +e712 \
+e713 +e730 +e731 +e732 +e734 +e735 +e736 +e737 +e740 +e741 +e747 +e776 \
+e790 +e792 +e812 +e815 +e826 +e909 +e911 +e912 +e913 +e914 +e915 +e916 \
+e917 +e918 +e919 +e920 +e921 +e922 +e923 +e924 +e925 +e926 +e927 +e928 \
+e929 +e930 +e958 +e959 +e1056 +e1059"
main() {
	if [ $# -lt 1 ]; then
		say "$ProgName error: you must supply something to lint"
		say "Usage: $0 [cc-opts] file"
		exit 1
	fi
	# ${ADD} is deliberately unquoted so it splits into separate options.
	flint '-strong(AJX)' -w0 ${ADD} "$@"
}
say() {
	echo "$@" 1>&2
}
main "$@"
The program runs flint with just the options in ${ADD}, and therefore picks out a collection of errors that are common in “endian” conversions. They're not all about endian-ness, but they are all messages that indicate serious errors in data references that we found during the port.
When we ran this on fifty thousand lines of Sequent code, we got about 20 warnings, and found about eight things that were clearly wrong for a big-endian machine.
Twenty lines was a reasonable number to check, and we were very happy to find as many as eight real errors among them.
Using Grep or Awk
Remember we said that lints in general could be convinced to look at only one thing? It isn't nearly as fast, but you can run lint and then select just a specific set of messages to pay attention to.
For example, the diagnostic messages about include files from the Solaris lint all begin with E_INCL_, followed by either M or N, and then by one of N, O or U. This can be matched with the awk expression if ($0 ~ /E_INCL_[MN][NOU]/). That allows us to write a script that selects only those errors:
#!/bin/sh
#
# lint_includes -- look for unneeded includes via lint
#
#
ProgName=`basename "$0"`
Verbose=0
# Turn on almost everything, but use one-line errors for selection in awk.
LINTOPTS='-errchk -errhdr=%all -errfmt=simple -errsecurity -Nlevel -Ncheck=macro -XCC -Xtransition -errtags'
main() {
	if [ $# -lt 1 ]; then
		say "$ProgName error: you must supply at least one C file"
		say "Usage: $0 [cc-opts] file"
		exit 1
	fi
	lint $LINTOPTS "$@" 2>&1 | postprocess
}
#
# postprocess -- make multi-line messages single-line ones and filter
#	out just the include messages for attention.
#
postprocess() {
	nawk '
	BEGIN { prev = ""; }
	/.*/ {
		# print ">>> " $0;
		if ( $0 ~ /^lint/) {
			# Lint error, print it
			print $0;
			prev = "";
		}
		else if ($0 ~ /E_INCL_[MN][NOU]/) {
			# Matched, see if it was multi-line
			if ($0 !~ /^"/ && prev != "") {
				# print "<<< multi-line";
				printf("%s ", prev);
				prev = "";
			}
			print $0;
		}
		else if ($0 ~ /^"/) {
			# beginning of a possible multi-line message
			prev = $0;
		}
	}
	'
}
say() {
	echo "$@" 1>&2
}
main "$@"
This gives us a vendor lint that will select only a few messages of interest.
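Before running the full script, it can help to try the tag pattern on its own. The sketch below feeds three invented lines through the same regular expression. The sample messages, including the tag E_SOMETHING_ELSE, are made up for the illustration and are not real lint output, and plain awk stands in for nawk so that it runs on systems without Solaris tools.

```shell
#!/bin/sh
# filter_includes -- hypothetical helper: keep only the lines that
# carry an include-file tag matching E_INCL_[MN][NOU].
filter_includes() {
    awk '/E_INCL_[MN][NOU]/ { print }'
}

# Invented sample input: two include-tagged lines and one other line.
printf '%s\n' \
    '"a.c", line 3: sample message (E_INCL_NUSD)' \
    '"a.c", line 9: sample message (E_SOMETHING_ELSE)' \
    '"b.c", line 1: sample message (E_INCL_MNUSD)' |
    filter_includes
```

Only the first and third sample lines pass the filter, which is the same selection the postprocess function makes before it rejoins multi-line messages.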
What to Drill For
The next question is what to use this ability for. The answer is to look for things that are otherwise hard to find.
One of the hardest things to find is dead code or dead includes. That's code which, though it looks like it will be used in some cases, is actually unreachable.
This is code you have to maintain, but which has no value. Instead it wastes space and your time. There's an antipattern called “lava flow” [Brown 1998] that looks at how dead code builds up to mislead, confuse and slow the engineer, and discusses how to find and remove it.
Dead #includes are, if anything, worse. They silently slow compilation speed, and fill the preprocessor full of #defines that will never be used. And every time you compile, you pay the cost. In tests on several large programs, we and colleagues found that the compilation speed was cut in half by dead includes alone. So one of the things we can look for with flint and lint is dead code and unused includes, using the flintDead and lintDead scripts.
With modern 64-bit operating systems, it is often advantageous to upgrade programs and drivers to 64-bit from 32. Both flint and lint have tests for mis-assigned data types, which catch a significant number of the problems in a ported program, via the lint32bit script.
Null pointers can show up anywhere, but are endemic when porting programs from the Sequent or the older VAX. Finding them with lintNullPointers or flintNullPointers can save you core dumps.
Finally, switching from a little-endian machine like the Intel to a big-endian one like a SPARC, Motorola or IBM PowerPC can cause some code to misbehave bizarrely, which we found with the flintEndian script above.
For your convenience, we provide all these scripts and a sorted file of the Solaris lint messages on our tools page.
References:
[Brown 1998] Brown, William J.; Malveau, Raphael C.; McCormick, Hays W. "Skip"; Mowbray, Thomas J. (1998). AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis. Theresa Hudson (ed.). John Wiley & Sons. ISBN 0-471-19713-0.