Checking Python Code with Pylint

This is a collection of notes from learning to use pylint to check the quality of Python code. I’ve been experimenting with Python recently; it’s become widely popular for data science applications (especially deep learning), and it’s also handy for the utility scripts that I often write (like trawling through logs, looking for interesting events and trends). I want to use pylint to check my code, find unsuspected coding errors, and in general get better ideas about how to create more “pythonic” code.

First…to install pylint on Mac OS X (similar on Linux):

$ sudo pip install pylint

On my machine, pylint ended up in /usr/local/bin:

$ which pylint
/usr/local/bin/pylint
$ file `which pylint`
/usr/local/bin/pylint: a /usr/bin/python script text executable

I see that pylint is a python script…hmmm, will have to browse through it later 8^).

It’s simple enough to run pylint on a python script; just pass the path to the script on the command line:

$ pylint myscript.py
************* Module myscript
C: 71, 0: Line too long (128/100) (line-too-long)
C: 76, 0: Line too long (103/100) (line-too-long)
C:228, 0: Trailing whitespace (trailing-whitespace)
C:283, 0: Trailing whitespace (trailing-whitespace)
C:  1, 0: Missing module docstring (missing-docstring)
C: 28, 0: Missing function docstring (missing-docstring)
R: 37, 0: Too many instance attributes (13/7) (too-many-instance-attributes)
C: 37, 0: Old-style class defined. (old-style-class)
C: 63, 4: Missing method docstring (missing-docstring)
R: 63, 4: Too many local variables (22/15) (too-many-locals)
R: 63, 4: Too many branches (13/12) (too-many-branches)
C:112, 4: Missing method docstring (missing-docstring)
C:116, 4: Missing method docstring (missing-docstring)
C:230,17: Invalid constant name "filelist" (invalid-name)
C:255, 4: Invalid constant name "filepath" (invalid-name)
C:260, 8: Invalid constant name "clf_file_obj" (invalid-name)
C:262, 8: Invalid constant name "clf_file_obj" (invalid-name)
 
-----------------------------------
Your code has been rated at 5.43/10

It’s not surprising that pylint has a lot of complaints about the script; I wrote it pretty quickly to get a job done, and I’m far from an expert Pythonista! There were no “error” (“E”) or “fatal” (“F”) messages in the pylint report, but it would still be good to get the code’s structure into better shape.

The first letter of each output line is the message type; here is what the message types mean (from the pylint documentation):

  • [R]efactor for a “good practice” metric violation
  • [C]onvention for coding standard violation
  • [W]arning for stylistic problems, or minor programming issues
  • [E]rror for important programming issues (i.e. most probably bug)
  • [F]atal for errors which prevented further processing
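Since the type letter leads every report line, it is easy to summarize a report by type. The following is a minimal sketch, assuming the line layout shown in the report above; the regular expression, the `tally_by_type` helper, and the trimmed sample are my own illustration and not part of pylint.

```python
import re
from collections import Counter

# Matches lines such as "C: 71, 0: Line too long (128/100) (line-too-long)".
# The pattern is an assumption based on the report layout shown above.
MESSAGE_RE = re.compile(r"^([RCWEF]):\s*(\d+),\s*(\d+): (.*) \(([a-z0-9-]+)\)$")

# A few lines copied from the report above.
SAMPLE_REPORT = """\
C: 71, 0: Line too long (128/100) (line-too-long)
C:228, 0: Trailing whitespace (trailing-whitespace)
R: 37, 0: Too many instance attributes (13/7) (too-many-instance-attributes)
R: 63, 4: Too many branches (13/12) (too-many-branches)
C:230,17: Invalid constant name "filelist" (invalid-name)
"""


def tally_by_type(report_text):
    """Count report lines per message-type letter (R, C, W, E, F)."""
    counts = Counter()
    for line in report_text.splitlines():
        match = MESSAGE_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


print(dict(tally_by_type(SAMPLE_REPORT)))
```

For the five sample lines this counts three convention messages and two refactor messages; lines that do not look like messages (the module banner, the score) are simply skipped.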

The goal will be to reduce the number of complaints from pylint, ideally by fixing the problem in the code. But I have found in the past that lint checkers for various programming languages tend to be somewhat opinionated and can complain verbosely about issues that aren’t really very pressing, so in some cases I’ll choose to suppress the message using any of several ways that are available in pylint. I’ve done this sometimes when working with Java (the “findbugs” code checker) and Javascript (jshint/jslint running as part of a grunt script).

Suppressing Unneeded Messages

One “don’t care” issue is the complaints about missing docstrings. Good code documentation is important, but the first measure should be to choose good, self-documenting names for classes, methods, and functions. In many cases, a good choice of name could be enough. As I believe I’ve done that in this script, I want to suppress these messages about docstrings from the pylint output.

Just to experiment, the “--disable” (or “-d”) option on the pylint command line will disable selected messages:

$ pylint --disable missing-docstring myscript.py
...

Another way to suppress pylint messages is to add a formatted comment in the body of the script. A handy pylint command line option helps with this by giving details about a message, including the message code that goes in the comment. Using the “missing-docstring” name of the “Missing method docstring” complaint:

$ pylint --help-msg missing-docstring
:missing-docstring (C0111): *Missing %s docstring*
  Used when a module, function, class or method has no docstring.Some special
  methods like __init__ doesn't necessary require a docstring. This message
  belongs to the basic checker.

In this case, the code of the message is “C0111”. With this information, the message can be suppressed by adding a formatted comment in the python script itself:

# pylint: disable=C0111
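As a sketch of where the comment might go: placed on its own line at the top of the module, it should apply to the rest of the file. The example below uses the symbolic name instead of the code, which I understand pylint accepts in these comments as well; `parse_status` and the sample log line are hypothetical and only there to give the file some content.

```python
# pylint: disable=missing-docstring
# The comment above sits at module level, so it covers the module itself and
# the function below. "disable=C0111" should behave the same way.


def parse_status(line):
    # Hypothetical helper: return the HTTP status code from a log line in
    # common log format (the status is the second-to-last field).
    return int(line.split()[-2])


print(parse_status('127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326'))
```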

A third way to disable defect messages is to use a configuration file. pylint will look for a file named either “pylintrc” or “.pylintrc” in the current directory or the user home directory; a “disable=” option accepts a comma-separated list of message names that are to be suppressed. The pylint command line also offers an option to create a boilerplate pylintrc file for you (the contents of the file are written to stdout):

$ pylint --generate-rcfile > .pylintrc

This generated pylintrc file should be used when starting to customize pylint, because the generated file captures a bunch of initial, standard defaults. For example, the generated .pylintrc file has dozens of message types already disabled by default. Adding “missing-docstring” (with a comma separator) to the end of this list of message types removed the message from the pylint output.
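To make the shape of that “disable=” entry concrete, here is a rough sketch that reads it with the standard library and prints the extended list. It is written for Python 3 (`configparser`); the sample text is a trimmed, hypothetical excerpt standing in for the much longer generated file, and `disabled_messages` is my own helper name.

```python
import configparser

# A trimmed stand-in for the generated rc file; the real disable= entry
# lists many more message names, one per continuation line.
SAMPLE_RC = """\
[MESSAGES CONTROL]
disable=print-statement,
        parameter-unpacking,
        unpacking-in-except
"""


def disabled_messages(rc_text):
    """Return the message names listed in the disable= entry."""
    # RawConfigParser avoids '%' interpolation, which rc files may not expect.
    parser = configparser.RawConfigParser()
    parser.read_string(rc_text)
    raw = parser.get("MESSAGES CONTROL", "disable")
    return [name.strip() for name in raw.split(",") if name.strip()]


names = disabled_messages(SAMPLE_RC)
names.append("missing-docstring")
print("disable=" + ",".join(names))
```

In practice I edited the file by hand; rewriting it through `configparser` would drop the explanatory comments that the generated file contains.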

Fixing Defects in the Code

Having cleared the output of some distracting messages, I’ll turn to fixing some of the other issues that pylint raises, and in general, try to make the code more “pythonic” (fitting with preferred styles and idioms for writing python code).

The complaint about “Trailing whitespace (trailing-whitespace)” is simple to fix: a quick edit removes the white space left at the end of the lines. Whitespace is significant in python, and while it isn’t a deal-breaker to have non-functional whitespace around, I might as well fix it and eliminate the messages.

The complaint about ‘Invalid constant name “filelist” (invalid-name)’ can be fixed by taking all code that runs under the “if __name__ == ‘__main__’:” sentinel statement and wrapping it in a main() function. So instead of this:

if __name__ == '__main__':
    filepath = None
    if filelist:
        filepath = filelist[0]
 
    if filepath:
        clf_file_obj = open(filepath)
    else:
        clf_file_obj = sys.stdin
 
    do_something(clf_file_obj)

…do this:

def main():
    filepath = None
    if filelist:
        filepath = filelist[0]
 
    if filepath:
        clf_file_obj = open(filepath)
    else:
        clf_file_obj = sys.stdin
 
    do_something(clf_file_obj)
 
if __name__ == '__main__':
    main()

The effect of this change has nothing to do with declaring constants (as the pylint message suggests); instead, variables that were formerly globals in the main code become locals of the main() function. The message complains about “Invalid constant name” because it’s not considered pythonic to have global variables in a script. The pylint checker assumes that the intention was to use these variables as constants, and it is (again) considered pythonic to use all capital letters when naming constants in python; that’s what pylint complains about.
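Here is a self-contained sketch of the same pattern that can be run as-is. It is not my original script: `count_lines` is a hypothetical stand-in for do_something(), the file list is assumed to come from the command line, and the stdin fallback is left out to keep the example short. It also shows the one place where a module-level name is appropriate, a real constant in capital letters.

```python
import sys

REPORT_FORMAT = "{}: {} lines"  # a genuine module-level constant, hence UPPER_CASE


def count_lines(path):
    """Hypothetical stand-in for do_something(): count the lines in a file."""
    with open(path) as file_obj:
        return sum(1 for _ in file_obj)


def main(argv=None):
    """Report the line count of each file named on the command line."""
    # filelist and reports are locals now, so the constant-naming check
    # no longer applies to them.
    filelist = sys.argv[1:] if argv is None else list(argv)
    reports = [REPORT_FORMAT.format(path, count_lines(path)) for path in filelist]
    for report in reports:
        print(report)
    return reports


if __name__ == '__main__':
    main()
```

Accepting an optional argv argument is a design choice rather than something pylint asks for; it makes main() callable from a test or an interactive session without touching sys.argv.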

The complaint about “Old-style class defined” can be fixed by extending classes in the script from ‘object’ instead of not specifying a base class.

Instead of this:

class MyClass:
    def __init__(self):
        pass
...

Do this:

class MyClass(object):
    def __init__(self):
        pass
...

Why extend the class from object? The Stack Overflow posting listed in the references tells the story: in Python 2, a class that doesn’t extend “object” is an “old-style” class, and type() on one of its instances reports the generic “instance” type rather than the class itself (there are also some issues with multiple inheritance). Multiple inheritance isn’t an issue in the script I’m working on, but it would be worthwhile to get the typing correct for debugging purposes, hence the fix above. In Python 3, classes implicitly extend object, so this issue goes away; this fix is only required for Python 2 compatibility.
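A small sketch makes the typing difference visible. The class names are made up for illustration, and the comments describe what I would expect on each Python version; the code itself runs on both.

```python
class LegacyStyle:  # old-style on Python 2, new-style on Python 3
    """Declared without a base class."""


class NewStyle(object):  # new-style on both Python 2 and Python 3
    """Explicitly extends object."""


# True on both Python 2 and Python 3.
print(type(NewStyle()) is NewStyle)

# False on Python 2, where the type is the generic 'instance';
# True on Python 3, where every class is new-style.
print(type(LegacyStyle()) is LegacyStyle)
```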

There are a number of complaints about too many instance variables, local variables, branches, etc. These required some minor code refactoring to correct: breaking up the code into smaller functions or methods, and creating separate locations for the variables. It’s a good practice in general to keep function and method size small, giving each method a limited and focused responsibility; this makes the code more understandable (and hence maintainable).
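The kind of refactoring involved can be sketched as follows. This is not the code from my script; every name here is hypothetical. Related values are grouped into small records so the owning class holds two attributes instead of seven, and parsing is split across short helpers so that no single function accumulates many locals or branches.

```python
from collections import namedtuple

# Small records group related values; all names are hypothetical.
ClientInfo = namedtuple("ClientInfo", "host user timestamp")
RequestInfo = namedtuple("RequestInfo", "method path status size")


class LogEntry(object):
    """One parsed log line, holding two attributes instead of seven."""

    def __init__(self, client, request):
        self.client = client
        self.request = request

    def is_error(self):
        """Return True for 4xx and 5xx responses."""
        return self.request.status >= 400

    def summary(self):
        """Return a short one-line description of the entry."""
        return "{} {} -> {}".format(
            self.request.method, self.request.path, self.request.status)


def parse_client(fields):
    """Build the client record from the leading fields of a split line."""
    return ClientInfo(host=fields[0], user=fields[2], timestamp=fields[3])


def parse_request(fields):
    """Build the request record from the trailing fields of a split line."""
    return RequestInfo(method=fields[4], path=fields[5],
                       status=int(fields[6]), size=int(fields[7]))


def parse_entry(fields):
    """Delegate to the small helpers so each function stays short."""
    return LogEntry(parse_client(fields), parse_request(fields))


entry = parse_entry(["10.0.0.1", "-", "alice", "2017-11-02T10:00:00",
                     "GET", "/index.html", "404", "512"])
print(entry.summary())
print(entry.is_error())
```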

Keeping the number of features of code within reasonable bounds is a good thing, but the level of complexity of the code may require extending these limits a little. In extreme cases it’s possible to change the configured limits in a “.pylintrc” file (perhaps a config file kept local to the project).

A quick ‘grep’ of the generated ‘.pylintrc’ file shows the configurable parameters for max limits on features of the code:

$ grep "max" .pylintrc
...
max-nested-blocks=5
max-line-length=100
max-module-lines=1000
missing-member-max-choices=1
max-args=5
max-attributes=7
max-bool-expr=5
max-branches=12
max-locals=15
max-parents=7
max-public-methods=20
max-returns=6
max-statements=50

Editing the .pylintrc file and raising some of these limits can selectively eliminate some complaints.

With the changes completed, another run of pylint gave the script a much better score!

$ pylint myscript.py
************* Module myscript
...
------------------------------------------------------------------
Your code has been rated at 8.94/10 (previous run: 5.43/10, +3.51)

I found it really educational to use the pylint utility to analyze my code; it inspired me to question my coding style and explore alternatives. pylint also provided some very good command line support for use and configuration (these are things I’d have to dig around documentation on one or more websites to find for other code checking tools). The support for generating a default configuration was particularly useful, combined with the option to get details about defect messages.

TL;DR

Handy command options in pylint:

$ pylint --help-msg message-name[,message-name]
$ pylint --generate-rcfile > .pylintrc

Suppressing selected messages in pylint: three options:

  • On the command line: ‘pylint --disable message-name[,message-name] scriptname.py’
  • In the script itself: ‘# pylint: disable=message-code’ (use ‘pylint --help-msg’ to find the code)
  • In a ‘.pylintrc’ file (current directory or $HOME): append to the “disable=” entry there

Creating more pythonic code:

  • wrap code under the “if __name__ == ‘__main__’:” sentinel in a main() function
  • extend classes from ‘object’ if they don’t have to be extended explicitly from another class
  • keep the number of locals in functions and the number of attributes in classes small; refactor if functions or classes get too big!

Use ‘grep max .pylintrc’ to find all configurations for max limits on the size of various code features and tailor to special needs of the project:

$ grep max .pylintrc
max-nested-blocks=5
max-line-length=100
max-module-lines=1000
...

Versions

$ pylint --version
pylint 1.7.4, 
astroid 1.5.3
Python 2.7.10 (default, Oct 23 2015, 19:19:21) 

References

Pylint User Manual — Pylint 1.8.0 documentation
inheritance – How to avoid Pylint warnings for constructor of inherited class in Python 3?
inheritance – Should all Python classes extend object?
