Tuesday, July 30, 2013

Parsing Arguments in Python with argparse

Parsing arguments from the command line is something that most of us have to do at some point. Unfortunately for us, when it comes to Python, most of the examples are for the old, deprecated library for parsing arguments (optparse). Since Python 2.7, the new library, argparse, has become the standard. Although argument parsing is well covered in the official Python documentation, the most basic examples aren't near the top, so it seems much more complicated than it actually is. In any case, here are some quick, basic examples of how to use argparse.

The Simplest

First, let's start with just getting arguments from the command line.
# argparse1.py
import argparse

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='An awesome program')
    parser.add_argument(
        'first_name', help='First Name')
    parser.add_argument(
        'last_name', help='Last Name')
    args = vars(parser.parse_args())
    print("{} {}".format(args['first_name'], args['last_name']))
If you run this as python argparse1.py, without any parameters, you should see an error with a helpful usage message along the lines of:
# argparse1.py
usage: argparse1.py [-h] first_name last_name
argparse1.py: error: too few arguments
In other words, just by using the argparse parser and parsing the arguments, we get a built-in usage generator. Neat!
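
To poke at that usage generator without going through the shell, here is a small sketch that runs under both Python 2.7 and 3. The helper names build_parser and try_parse are my own, not part of argparse, and I pass prog explicitly so the usage line does not depend on how the script is launched. It relies on two things argparse does provide: format_usage() returns the usage line as a string, and a failed parse exits by raising SystemExit.

```python
import argparse


def build_parser():
    # Same two positional arguments as argparse1.py.
    parser = argparse.ArgumentParser(
        prog='argparse1.py', description='An awesome program')
    parser.add_argument('first_name', help='First Name')
    parser.add_argument('last_name', help='Last Name')
    return parser


def try_parse(argv):
    """Return the parsed arguments as a dict, or None if argv is rejected."""
    try:
        return vars(build_parser().parse_args(argv))
    except SystemExit:
        # argparse has already printed the usage and error to stderr.
        return None


# The usage line argparse generates from the add_argument() calls.
usage_line = build_parser().format_usage()
print(usage_line)

# An empty argument list is rejected; two values are accepted.
print(try_parse([]))
print(try_parse(['John', 'Smith']))
```

Catching SystemExit like this is mostly useful in tests; in a real script you would normally let argparse exit on its own.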

Now, if you run this with actual values, such as python argparse1.py John Smith, then, as you'd expect, this will work and print out "John Smith". One thing to note is that I used vars() to get the variables out of the Namespace that is created by the parser. If you want to, you can get the values directly out of the Namespace as attributes without using vars(), but I prefer dictionary-style access for my arguments. For more details on this, I'll refer you to the Python documentation.
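
For comparison, here is a quick sketch of the two access styles side by side. Passing a list to parse_args() stands in for the real command line, so the snippet can be run as-is; the variable names are just for illustration.

```python
import argparse

parser = argparse.ArgumentParser(description='An awesome program')
parser.add_argument('first_name', help='First Name')
parser.add_argument('last_name', help='Last Name')

# An explicit list replaces sys.argv[1:] for this demonstration.
args = parser.parse_args(['John', 'Smith'])

# Attribute access directly on the Namespace...
attr_style = "{} {}".format(args.first_name, args.last_name)

# ...and dictionary access through vars() read the same values.
as_dict = vars(args)
dict_style = "{} {}".format(as_dict['first_name'], as_dict['last_name'])

print(attr_style)
print(dict_style)
```

Both lines print "John Smith", so which style you use is purely a matter of taste.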

Named Parameters

While the above example works well for simple situations where all arguments are required and positional arguments make sense, it is often nice to allow the use of named (optional) parameters. For example, we can rewrite the above using named parameters as follows:
# argparse2.py
import argparse

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='An awesome program')
    parser.add_argument(
        '--first_name', required=True, help='First Name')
    parser.add_argument(
        '--last_name', required=True, help='Last Name')
    parser.add_argument(
        '--middle_name', required=False, help='Middle Name')

    args = vars(parser.parse_args())
    if args['middle_name']:
        print("{} {} {}".format(
            args['first_name'], args['middle_name'], args['last_name']))
    else:
        print("{} {}".format(args['first_name'], args['last_name']))
Unlike before, we now have to give each argument's name before its value, but we can get the same result as before by running:
# argparse2.py
python argparse2.py --first_name John --last_name Smith
Just as before, if you don't provide first_name or last_name, we get a helpful error message. However, we also added a new optional argument for the middle name, which we can provide if we feel like it. For example:
# argparse2.py
python argparse2.py --first_name John --middle_name Bob --last_name Smith
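
The if args['middle_name'] check in argparse2.py works because an optional argument that is left off the command line is stored as None (unless you give it a default). Here is a minimal sketch of that, again feeding parse_args() explicit lists instead of the real command line; the second call also shows that named arguments can come in any order.

```python
import argparse

parser = argparse.ArgumentParser(description='An awesome program')
parser.add_argument('--first_name', required=True, help='First Name')
parser.add_argument('--last_name', required=True, help='Last Name')
parser.add_argument('--middle_name', required=False, help='Middle Name')

# --middle_name omitted: the key is still present, with the value None.
without_middle = vars(parser.parse_args(
    ['--first_name', 'John', '--last_name', 'Smith']))

# Named arguments are matched by name, so their order does not matter.
with_middle = vars(parser.parse_args(
    ['--last_name', 'Smith', '--middle_name', 'Bob', '--first_name', 'John']))

print(without_middle['middle_name'])
print(with_middle['middle_name'])
```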

Sub Commands (Sub-Parsers)

Ok, now that we know the basics, let's look at the case where we have a program that has two different sub commands.
# argparse3.py
import argparse

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='An awesome program')
    subparsers = parser.add_subparsers(
        title='subcommands', description='valid subcommands',
        help='additional help')
    parser_create = subparsers.add_parser('create')
    parser_create.set_defaults(which='create')
    parser_create.add_argument(
        '--first_name', required=True, help='First Name')
    parser_create.add_argument(
        '--last_name', required=True, help='Last Name')

    parser_delete = subparsers.add_parser('delete')
    parser_delete.set_defaults(which='delete')
    parser_delete.add_argument(
        'id', help='Database ID')
    args = vars(parser.parse_args())

    if args['which'] == 'create':
        print("Creating {} {}".format(args['first_name'], args['last_name']))
    else:
        print("Deleting {}".format(args['id']))
Whoa! What do we have here? If you run this without any arguments, you will see a help message along the lines of:
# argparse3.py
usage: argparse3.py [-h] {create,delete} ...
argparse3.py: error: too few arguments
This is telling us that we have to provide one of the available subcommands "create" or "delete". So, let's try that by running python argparse3.py create:
# argparse3.py
usage: argparse3.py create [-h] --first_name FIRST_NAME --last_name LAST_NAME
argparse3.py create: error: argument --first_name is required
Now we get the helpful message saying exactly what the arguments are for the subcommand "create". Neat! If you actually provide it with valid inputs, you will see that the parser only returns the arguments for the sub-parser that was selected. In other words, continuing with the previous example, if we run the program as follows:
# argparse3.py
python argparse3.py create --first_name John --last_name Smith
Then we only get the arguments first_name and last_name (plus the "which" default explained below); id will not be there since it belongs to the "delete" sub-parser, not the "create" sub-parser that was selected. As you may also have noticed, you can mix and match positional arguments and named arguments at will.
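
To see exactly which keys come back for each subcommand, here is a sketch that rebuilds the argparse3.py parser inside a function (build_parser is my own helper name) and parses two hand-written argument lists. The ID value '42' is made up for the demonstration; note that it comes back as a string, since no type was given.

```python
import argparse


def build_parser():
    # Same layout as argparse3.py: a 'create' and a 'delete' subcommand.
    parser = argparse.ArgumentParser(description='An awesome program')
    subparsers = parser.add_subparsers(
        title='subcommands', description='valid subcommands',
        help='additional help')

    parser_create = subparsers.add_parser('create')
    parser_create.set_defaults(which='create')
    parser_create.add_argument('--first_name', required=True)
    parser_create.add_argument('--last_name', required=True)

    parser_delete = subparsers.add_parser('delete')
    parser_delete.set_defaults(which='delete')
    parser_delete.add_argument('id', help='Database ID')
    return parser


parser = build_parser()
create_args = vars(parser.parse_args(
    ['create', '--first_name', 'John', '--last_name', 'Smith']))
delete_args = vars(parser.parse_args(['delete', '42']))

# Each result only holds the keys of the sub-parser that was selected.
print(sorted(create_args))
print(sorted(delete_args))
```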

Unfortunately, the one thing that is lacking by default in the argument parsing when using subcommands is a way to get which subcommand was run. In the example above we could figure it out, since only the "create" subcommand has first_name and last_name, but what would we do if both subcommands had overlapping arguments? The solution to this (originally found here) is to provide a default argument that tells us which subcommand was chosen. This is why I added, for example, the line parser_create.set_defaults(which='create') to the first subparser. It gives us an extra argument, "which", that tells us which subcommand was chosen.
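
As an alternative to set_defaults(), add_subparsers() also accepts a dest argument, which stores the name of the chosen subcommand under that key. Below is a minimal sketch of that variant; the key name 'command' and the shared --name option are made up for illustration, and both subcommands deliberately take the same argument so that it cannot be used to tell them apart.

```python
import argparse

parser = argparse.ArgumentParser(description='An awesome program')
# dest='command' records which subcommand was used under the key 'command'.
subparsers = parser.add_subparsers(title='subcommands', dest='command')

parser_create = subparsers.add_parser('create')
parser_create.add_argument('--name', required=True, help='Name')

parser_delete = subparsers.add_parser('delete')
parser_delete.add_argument('--name', required=True, help='Name')

chosen = vars(parser.parse_args(['delete', '--name', 'John']))
print(chosen['command'])
```

The set_defaults() approach is still handy when you want to attach something other than the subcommand's name, such as a function to call for that subcommand.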

Going Further

Well, that's it for the basics. If you want to do more than this, then I highly suggest you read the docs as they contain other examples. Hopefully, this little introduction has made it a bit easier to digest what is going on in that documentation page!