Creating CLIs with Python

Courses

Neuroscience Data Analysis Pipelines with Python, Git, and Snakemake

Authors

Dr. Mohammad Bashiri | Dr. Nicholas Del Grosso

Scripts are valuable tools for automating tasks. Command-Line Interfaces (CLIs) build on them: they give a script named arguments, input validation, and built-in help, which makes it easier for other people to use.

Here are the key differences between scripts and CLIs:

Aspect	Scripts	CLIs
User Interaction	Limited, often run with predefined operations or simple input parameters.	Enhanced, with interactive prompts, menus, and more complex input handling.
Parameter Handling	Basic, typically through command-line arguments or file input.	Dynamic, with support for optional and mandatory arguments, complex command structures, and detailed feedback on incorrect inputs.
Usability & Documentation	Requires manual documentation, which may not be as accessible to new users.	Often includes built-in help options (`--help`), automatically generating usage documentation and making the tool more user-friendly.
Scalability & Integration	Designed for specific, standalone tasks; integration into larger workflows can be cumbersome.	Easily integrated into automation pipelines, compatible with other tools, and designed for composability in larger systems.

In this session, we will use Python’s argparse library to create user-friendly CLIs. Argparse is a simple yet powerful tool that allows creating CLIs with different levels of complexity using minimal code.

As you go through the exercises, you might find some of the functions and commands in the tables below handy.

Section 1: Argparse CLI Functions

Argparse

Function	Code Example	Description
Create Parser	`parser = argparse.ArgumentParser(description='CLI Tool')`	Initialize a new argument parser for CLI.
Add Argument	`parser.add_argument('name', type=str, help='user name')`	Define a positional argument for the CLI.
Parse Args	`args = parser.parse_args()`	Parse the arguments given at the command line.
Numeric Arg	`parser.add_argument('number', type=int, help='a number')`	Accept a numeric input and ensure it’s of type `int`.
Optional Arg	`parser.add_argument('--sum', dest='accumulate', action='store_const', const=sum, default=max)`	Add an optional flag that stores `sum` in `args.accumulate` when `--sum` is given, and `max` otherwise.
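
The `--sum` row is a flag-style option: no value follows the flag, and argparse stores a constant instead. Here is a minimal sketch of how such a flag behaves. The positional `numbers` argument is a hypothetical addition that is not part of the table, and the argument lists passed to `parse_args` stand in for what a user would type in the terminal.

```python
import argparse

parser = argparse.ArgumentParser(description='Combine a list of integers')

# Hypothetical positional argument: one or more integers
parser.add_argument('numbers', type=int, nargs='+', help='integers to combine')

# If --sum is given, args.accumulate is the built-in sum; otherwise it is max
parser.add_argument('--sum', dest='accumulate', action='store_const',
                    const=sum, default=max,
                    help='sum the integers (default: take the max)')

# Passing a list to parse_args replaces reading from the command line
args_default = parser.parse_args(['1', '5', '3'])
args_sum = parser.parse_args(['1', '5', '3', '--sum'])

print(args_default.accumulate(args_default.numbers))  # max of 1, 5, 3
print(args_sum.accumulate(args_sum.numbers))          # sum of 1, 5, 3
```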

Terminal Commands for CLIs

Command	Example	Description
Help Flag	`python cli_name.py -h`	Display the help message for the CLI.
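
The help message is generated from the parser definition itself. As a small sketch, assuming a parser like the one in the Argparse table (the `prog` name is only set here so the usage line shows `cli_name.py`), `format_help()` returns the text that `-h` would print:

```python
import argparse

parser = argparse.ArgumentParser(prog='cli_name.py', description='CLI Tool')
parser.add_argument('name', type=str, help='user name')

# format_help() returns the same text that -h / --help prints
help_text = parser.format_help()
print(help_text)
```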

Numpy Functions

Function	Code	Description
Load array	`array = np.load('file.npy')`	Load a numpy array from a .npy file.
Save array	`np.save('file.npy', array)`	Save a numpy array to a .npy file.
Array mean	`np.mean(array)`	Compute the mean of an array.
Array std dev	`np.std(array)`	Compute the standard deviation of an array.
Array minimum	`np.min(array)`	Find the minimum value in an array.
Array maximum	`np.max(array)`	Find the maximum value in an array.

Pandas Functions

Function	Code	Description
Read CSV	`df = pd.read_csv('file.csv')`	Read data from a CSV file into a DataFrame.
Write CSV	`df.to_csv('file.csv', index=False)`	Write a DataFrame to a CSV file.
Column mean	`df.mean()`	Compute the mean of each column in a DataFrame.
Column std dev	`df.std()`	Compute the standard deviation of each column in a DataFrame.
Column minimum	`df.min()`	Compute the minimum of each column in a DataFrame.
Column maximum	`df.max()`	Compute the maximum of each column in a DataFrame.
Conditional select	`df.loc[df['column'] > value]`	Access rows and columns by labels or a boolean array based on a condition.

Section 2: Creating CLIs with Argparse

Exercises

Example: Create a CLI that takes a name and prints a greeting message.

To help notice the difference between a script and a CLI, here is a side-by-side comparison of the code that would do the same thing:
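
A minimal sketch of what such a comparison could look like, assuming both versions take a single name and print a greeting. The function names `greet_script` and `greet_cli` and the simulated argument list are illustrative choices; in a real file each version would read `sys.argv` directly.

```python
import argparse


def greet_script(argv):
    # Script style: the meaning of argv[1] is only known from reading the code
    return f"Hello, {argv[1]}!"


def greet_cli(argv):
    # CLI style: the argument has a name, a type, and a help text
    parser = argparse.ArgumentParser(description='Print a greeting message')
    parser.add_argument('name', type=str, help='Name of the person to greet')
    args = parser.parse_args(argv[1:])
    return f"Hello, {args.name}!"


# Simulated command line; a real file would use sys.argv instead
example_argv = ['cli_name.py', 'InputValue']
print(greet_script(example_argv))
print(greet_cli(example_argv))
```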

Note that, while the script’s code is shorter, it is much less informative: if the script were longer and more complex, a user who is not familiar with the code would find it difficult to interact with. With argparse, on the other hand, we can give the input arguments names and even provide help text, so the user gets a better understanding of how to run the command and how to interact with the code.

What is the command to run this CLI? Same as the script:

python clis/cli_name.py InputValue

How to get the help text? Using the -h or --help flag:

python clis/cli_name.py -h

Now it’s your turn! :)

Exercise: Create a CLI that takes a number as an argument and prints its square. Note that the type should be set to a numerical type (i.e. int or float).

Solution:

import argparse

# Create the parser
parser = argparse.ArgumentParser(description='Calculate the square of a number')

# Add argument
parser.add_argument('number', type=float, help='The number to square')

# Parse the arguments
args = parser.parse_args()

# Calculate and print the square
print(args.number ** 2)

Run with:

python clis/square_number.py 5
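
Setting `type=float` also means argparse validates the input for us. The sketch below uses the same parser as the solution, but passes explicit argument lists to `parse_args` so the behavior can be tried without a terminal. An invalid value makes argparse print a usage message and stop the program, which shows up in Python as a `SystemExit`.

```python
import argparse

parser = argparse.ArgumentParser(description='Calculate the square of a number')
parser.add_argument('number', type=float, help='The number to square')

# A valid value is converted to float before our code sees it
args = parser.parse_args(['5'])
print(args.number ** 2)

# An invalid value makes argparse print a usage message and exit
try:
    parser.parse_args(['five'])
except SystemExit as error:
    exit_code = error.code
    print(f"argparse exited with status {exit_code}")
```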

Exercise: Create a CLI called add_two_numbers.py that takes two numbers as two separate arguments and prints their sum.

Hint: since we have two inputs, we need to define two arguments in the add_two_numbers.py:

# Defining the first argument
parser.add_argument('number1', type=float, help='First input')

# Defining the second argument
parser.add_argument('number2', type=float, help='Second input')

# Parse the arguments
args = parser.parse_args()

And we can refer to these in the code as args.number1 and args.number2.

Does it work?

Solution:

import argparse

# Create the parser
parser = argparse.ArgumentParser(description='Add two numbers')

# Defining the first argument
parser.add_argument('number1', type=float, help='First input')

# Defining the second argument
parser.add_argument('number2', type=float, help='Second input')

# Parse the arguments
args = parser.parse_args()

# Calculate and print the sum
print(args.number1 + args.number2)

Run with:

python clis/add_two_numbers.py 3 7

Exercise: There is a script in the scripts folder called extract_valid_trials.py. This script takes the path to a CSV file that contains trial information from a recording session and saves a new CSV file that contains only the “valid” trials. Can you create the CLI version of it? Save it in the clis folder, and make sure it works by applying it to session.csv (which is in the data/raw folder).

Solution:

import argparse
import pandas as pd

# Create the parser
parser = argparse.ArgumentParser(description='Extract valid trials from session data')

# Add arguments
parser.add_argument('input_path', type=str, help='Path to input CSV file')
parser.add_argument('output_path', type=str, help='Path to output CSV file')

# Parse the arguments
args = parser.parse_args()

# Read the data
df = pd.read_csv(args.input_path)

# Extract valid trials
valid_trials = df.loc[df['valid'] == 1]

# Save to output file
valid_trials.to_csv(args.output_path, index=False)

print(f"Valid trials extracted and saved to {args.output_path}")

Run with:

python clis/extract_valid_trials.py data/raw/session.csv data/processed/valid_trials.csv

Exercise: There are two scripts in the scripts folder called normalize_array.py and standardize_array.py. Can you combine their functionality into a single CLI called transform_array.py? With this CLI, the user passes the desired transformation as an additional input argument. Please make sure the transformation is reflected in the name of the output file. Apply it to the array.npy file in the data/raw folder. Does it work?

Solution:

import argparse
import numpy as np

# Create the parser
parser = argparse.ArgumentParser(description='Transform array by normalizing or standardizing')

# Add arguments
parser.add_argument('operation', type=str, help='Transformation operation: normalize or standardize')
parser.add_argument('input_path', type=str, help='Path to input .npy file')
parser.add_argument('output_path', type=str, help='Path to output .npy file')

# Parse the arguments
args = parser.parse_args()

# Load the array
array = np.load(args.input_path)

# Apply transformation
if args.operation == 'normalize':
    transformed = (array - np.min(array)) / (np.max(array) - np.min(array))
elif args.operation == 'standardize':
    transformed = (array - np.mean(array)) / np.std(array)
else:
    # parser.error prints the usage message and exits with a non-zero status
    parser.error(f"Unknown operation: {args.operation}")

# Save the transformed array
np.save(args.output_path, transformed)

print(f"Array transformed using {args.operation} and saved to {args.output_path}")

Run with:

python clis/transform_array.py normalize data/raw/array.npy data/processed/array_normalized.npy
python clis/transform_array.py standardize data/raw/array.npy data/processed/array_standardized.npy
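
As an alternative to checking the operation by hand, argparse can restrict an argument to a fixed set of values with `choices`. This is a sketch of that variant rather than the course solution; the file names in the argument lists are placeholders and no file is actually loaded.

```python
import argparse

parser = argparse.ArgumentParser(description='Transform array by normalizing or standardizing')

# choices restricts the accepted values and lists them in the help text
parser.add_argument('operation', choices=['normalize', 'standardize'],
                    help='Transformation to apply')
parser.add_argument('input_path', type=str, help='Path to input .npy file')
parser.add_argument('output_path', type=str, help='Path to output .npy file')

# A valid operation is accepted as usual
args = parser.parse_args(['normalize', 'in.npy', 'out.npy'])
print(args.operation)

# A misspelled operation is rejected by argparse before any file is loaded
try:
    parser.parse_args(['normalise', 'in.npy', 'out.npy'])
    rejected = False
except SystemExit:
    rejected = True
print(rejected)
```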

Keep track of the development: We have created some new files, so let’s commit the changes (with a short message) and push to GitHub.

Section 3: CLIs with Optional Arguments

So far we have created CLIs where there is a specific number of arguments, and the CLI only runs if we specify a value for all the arguments. What if we want our CLI to run even if some arguments were not assigned a value by the user? These are “optional” arguments.

Creating optional arguments for your CLI is very similar to what we did before, but we add a -- before the name of the argument and specify a default value (used when the user does not provide one). Here is an example:

# Defining an optional argument
parser.add_argument('--group', type=str, default="iBehave", help='Research group the scholar is a member of.')
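
To see the default in action, here is a small sketch that places the `--group` argument in a complete parser. The positional `name` argument and the greeting text are assumptions made for illustration, and explicit argument lists stand in for the command line.

```python
import argparse

parser = argparse.ArgumentParser(description='Greet a scholar')
parser.add_argument('name', type=str, help='Name of the scholar')

# Defining an optional argument
parser.add_argument('--group', type=str, default="iBehave",
                    help='Research group the scholar is a member of.')

# Without --group, the default value is used
args = parser.parse_args(['Mo'])
print(f"Hello, {args.name}! {args.name} is from {args.group}.")

# With --group, the user's value replaces the default
args_custom = parser.parse_args(['Mo', '--group', 'iBOTS'])
print(f"Hello, {args_custom.name}! {args_custom.name} is from {args_custom.group}.")
```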

Let’s practice this a bit.

Exercise: Change the add_two_numbers.py CLI such that the user can also optionally scale the output (default scale is 1).

Solution:

import argparse

# Create the parser
parser = argparse.ArgumentParser(description='Add two numbers with optional scaling')

# Defining the first argument
parser.add_argument('number1', type=float, help='First input')

# Defining the second argument
parser.add_argument('number2', type=float, help='Second input')

# Defining an optional scale argument
parser.add_argument('--scale', type=float, default=1, help='Scale factor for the result (default: 1)')

# Parse the arguments
args = parser.parse_args()

# Calculate, scale, and print the sum
result = (args.number1 + args.number2) * args.scale
print(result)

Run with:

python clis/add_two_numbers.py 3 7
python clis/add_two_numbers.py 3 7 --scale=2
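
Optional arguments are flexible in how they are written: `--scale=2` and `--scale 2` mean the same thing, and the option may come before or after the positional arguments. The sketch below wraps the solution in a hypothetical `scaled_sum` function so that several command lines can be compared side by side.

```python
import argparse


def scaled_sum(argv):
    # Same parser as the solution, wrapped in a function for easy comparison
    parser = argparse.ArgumentParser(description='Add two numbers with optional scaling')
    parser.add_argument('number1', type=float, help='First input')
    parser.add_argument('number2', type=float, help='Second input')
    parser.add_argument('--scale', type=float, default=1,
                        help='Scale factor for the result (default: 1)')
    args = parser.parse_args(argv)
    return (args.number1 + args.number2) * args.scale


# Three equivalent ways to pass the optional argument
print(scaled_sum(['3', '7', '--scale=2']))
print(scaled_sum(['3', '7', '--scale', '2']))
print(scaled_sum(['--scale', '2', '3', '7']))

# Leaving it out falls back to the default scale of 1
print(scaled_sum(['3', '7']))
```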

Exercise: Change the transform_array.py CLI such that the operation argument is optional (the default operation is to standardize).

Solution:

import argparse
import numpy as np

# Create the parser
parser = argparse.ArgumentParser(description='Transform array by normalizing or standardizing')

# Add arguments
parser.add_argument('input_path', type=str, help='Path to input .npy file')
parser.add_argument('output_path', type=str, help='Path to output .npy file')
parser.add_argument('--operation', type=str, default='standardize', help='Transformation operation: normalize or standardize (default: standardize)')

# Parse the arguments
args = parser.parse_args()

# Load the array
array = np.load(args.input_path)

# Apply transformation
if args.operation == 'normalize':
    transformed = (array - np.min(array)) / (np.max(array) - np.min(array))
elif args.operation == 'standardize':
    transformed = (array - np.mean(array)) / np.std(array)
else:
    # parser.error prints the usage message and exits with a non-zero status
    parser.error(f"Unknown operation: {args.operation}")

# Save the transformed array
np.save(args.output_path, transformed)

print(f"Array transformed using {args.operation} and saved to {args.output_path}")

Run with:

python clis/transform_array.py data/raw/array.npy data/processed/array_standardized.npy
python clis/transform_array.py data/raw/array.npy data/processed/array_normalized.npy --operation=normalize
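
In the commands above, it is up to the user to type an output name that matches the operation. One possible way to guarantee this is to build the output path inside the CLI. The helper `build_output_path` and its `output_dir` parameter are hypothetical and not part of the course solution; this is only a sketch of the idea.

```python
from pathlib import Path


def build_output_path(input_path, operation, output_dir):
    # Hypothetical helper: combines the input file name with the operation name
    input_path = Path(input_path)
    file_name = f"{input_path.stem}_{operation}{input_path.suffix}"
    return Path(output_dir) / file_name


out = build_output_path('data/raw/array.npy', 'normalize', 'data/processed')
print(out.as_posix())
```

With this approach, the CLI would take an output folder instead of a full output path, and `np.save` would receive the generated path.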

Exercise (optional): Change the extract_valid_trials.py CLI such that we can extract valid trials for a specific subject (by specifying the subject_id). By default (i.e. when subject_id is not specified), all subjects are included. Please make sure the file name reflects the subject_id too.

Solution:

import argparse
import pandas as pd

# Create the parser
parser = argparse.ArgumentParser(description='Extract valid trials from session data')

# Add arguments
parser.add_argument('input_path', type=str, help='Path to input CSV file')
parser.add_argument('output_path', type=str, help='Path to output CSV file')
parser.add_argument('--subject_id', type=str, default=None, help='Subject ID to filter (default: all subjects)')

# Parse the arguments
args = parser.parse_args()

# Read the data
df = pd.read_csv(args.input_path)

# Extract valid trials
valid_trials = df.loc[df['valid'] == 1]

# Filter by subject_id if specified
if args.subject_id is not None:
    valid_trials = valid_trials.loc[valid_trials['subject_id'] == args.subject_id]

# Save to output file
valid_trials.to_csv(args.output_path, index=False)

print(f"Valid trials extracted and saved to {args.output_path}")

Run with:

python clis/extract_valid_trials.py data/raw/session.csv data/processed/valid_trials.csv
python clis/extract_valid_trials.py data/raw/session.csv data/processed/valid_trials_sub01.csv --subject_id=sub01
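
One thing worth checking when filtering: `--subject_id` arrives as a string, while `pd.read_csv` infers the column type from the file. If the IDs look like `sub01`, the comparison works as written. If they were plain integers, a string would match nothing. The sketch below uses made-up data (not the contents of session.csv) to show the difference.

```python
import io

import pandas as pd

# Made-up data for illustration only; the real session.csv may look different
csv_text = "subject_id,valid,response\n1,1,1\n2,1,0\n1,0,1\n"
df = pd.read_csv(io.StringIO(csv_text))

subject_id = '1'  # what argparse hands over with type=str

# Integer column compared with a string: no row matches
n_naive = len(df.loc[df['subject_id'] == subject_id])

# Comparing as strings on both sides avoids the mismatch
n_safe = len(df.loc[df['subject_id'].astype(str) == subject_id])

print(n_naive, n_safe)
```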

Keep track of the development: We have created some new files, so let’s commit the changes (with a short message) and push to GitHub.

Section 4: Running CLIs in Jupyter Notebooks

And of course, we can also run our CLIs in Jupyter Notebooks. To do this we can simply use the ! operator to run the CLI in a code cell:

!python greet_user.py Mo --group=iBOTS

Example: For an example please see (and run) the greet_user_cli_demo.ipynb in the notebooks directory.
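
The `!` operator only exists in Jupyter/IPython. In plain Python code, `subprocess.run` from the standard library does a comparable job and also gives access to the printed output. The sketch below writes a small stand-in for greet_user.py to a temporary folder (its contents are an assumption, since the real file is not shown here) and then runs it.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for greet_user.py, written to a temporary folder
cli_source = '''
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('name', type=str)
parser.add_argument('--group', type=str, default='iBehave')
args = parser.parse_args()
print(f"Hello, {args.name}! {args.name} is from {args.group}.")
'''

with tempfile.TemporaryDirectory() as folder:
    cli_path = Path(folder) / 'greet_user.py'
    cli_path.write_text(cli_source)

    # sys.executable is the Python interpreter running this code
    result = subprocess.run(
        [sys.executable, str(cli_path), 'Mo', '--group=iBOTS'],
        capture_output=True, text=True, check=True,
    )

print(result.stdout)
```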

Exercise: Using a Jupyter notebook, show that the transform_array.py CLI works properly:

Create a Jupyter notebook (a file with the .ipynb extension) in the notebooks folder.
Within this notebook, we want to use the CLI and make sure it applies the correct transformation:
- When using the CLI to normalize the data, is the resulting output file correctly transformed, i.e. is the minimum of the data 0 and the maximum 1?
- When using the CLI to standardize the data, is the resulting output file correctly transformed, i.e. is the mean of the data 0 and the standard deviation 1?

Solution:

Create a notebook called test_transform_array.ipynb in the notebooks folder:

# Cell 1: Test transform_array.py with normalization
import numpy as np

# Run the CLI to normalize
!python clis/transform_array.py data/raw/array.npy data/processed/array_normalized.npy --operation=normalize

# Load and verify the normalized array
normalized_array = np.load('data/processed/array_normalized.npy')
print(f"Normalized array min: {np.min(normalized_array)}")
print(f"Normalized array max: {np.max(normalized_array)}")

# Cell 2: Test transform_array.py with standardization
# Run the CLI to standardize
!python clis/transform_array.py data/raw/array.npy data/processed/array_standardized.npy --operation=standardize

# Load and verify the standardized array
standardized_array = np.load('data/processed/array_standardized.npy')
print(f"Standardized array mean: {np.mean(standardized_array)}")
print(f"Standardized array std: {np.std(standardized_array)}")
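
When reading the printed values, keep in mind that floating-point arithmetic rarely gives exactly 0 for the mean; a tiny number such as 1e-17 is expected. A tolerance-based check with `np.isclose` makes the verification explicit. The sketch below uses randomly generated data as a stand-in for array.npy and applies the same formulas as the CLI.

```python
import numpy as np

# Made-up data standing in for array.npy
rng = np.random.default_rng(seed=0)
array = rng.normal(loc=5.0, scale=2.0, size=1000)

# Same formulas as in transform_array.py
normalized = (array - np.min(array)) / (np.max(array) - np.min(array))
standardized = (array - np.mean(array)) / np.std(array)

# Compare with a tolerance instead of testing for exact equality
checks = {
    'normalized min is 0': bool(np.isclose(np.min(normalized), 0.0)),
    'normalized max is 1': bool(np.isclose(np.max(normalized), 1.0)),
    'standardized mean is 0': bool(np.isclose(np.mean(standardized), 0.0)),
    'standardized std is 1': bool(np.isclose(np.std(standardized), 1.0)),
}

for name, passed in checks.items():
    print(f"{name}: {passed}")
```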

Exercise: Within the same notebook, let’s also test the extract_valid_trials.py CLI.

Does it correctly extract the valid trials?
Which subject had the highest number of correct responses (i.e. response=1) for valid trials?

Solution:

Add to the same notebook (test_transform_array.ipynb):

# Cell 3: Test extract_valid_trials.py
import pandas as pd

# Run the CLI
!python clis/extract_valid_trials.py data/raw/session.csv data/processed/valid_trials.csv

# Load and verify the results
valid_trials = pd.read_csv('data/processed/valid_trials.csv')

# Check if all trials are valid
print(f"All trials valid: {(valid_trials['valid'] == 1).all()}")

# Find subject with highest number of correct responses
correct_by_subject = valid_trials.loc[valid_trials['response'] == 1].groupby('subject_id').size()
best_subject = correct_by_subject.idxmax()
print(f"Subject with most correct responses: {best_subject} with {correct_by_subject.max()} correct")

Keep track of the development: We have created some new files, so let’s commit the changes (with a short message) and push to GitHub.

[BONUS] One CLI for different tasks: Creating sub-commands

Depending on the CLI, we may want to package multiple commands into a single CLI so that people can choose the command that fits their use case. This requires “sub-commands”, which argparse supports. So far, we simply created a parser object and added arguments to it. To have multiple commands, we first create “sub-parsers” (one sub-parser per sub-command) and then add arguments to each of them. Here is an example where greet_user.py is extended so that we have one sub-command for greeting and another for farewell:

import argparse

# Create the parser
parser = argparse.ArgumentParser()

# Create a subparser object (we will be adding parsers to this object, basically creating sub-parsers)
subparsers = parser.add_subparsers(dest='command', help='Commands', required=True)

# Sub-command for greeting
greet_parser = subparsers.add_parser('greet', help='Greet the user')
greet_parser.add_argument('name', type=str, help='The name of the user')
greet_parser.add_argument('--group', type=str, default="iBehave", help='Research group the scholar is a member of.')

# Sub-command for farewell
farewell_parser = subparsers.add_parser('farewell', help='Say farewell to the user')
farewell_parser.add_argument('name', type=str, help='The name of the user')

# Parse the arguments
args = parser.parse_args()

# Use the arguments and run the corresponding code based on the command
if args.command == 'greet':
    print(f"Hello, {args.name}! {args.name} is from {args.group}.")
elif args.command == 'farewell':
    print(f"Goodbye, {args.name}!")

And we can now run this either to greet someone:

python clis/greet_or_farewell.py greet Mo --group=iBOTS

or to say goodbye:

python clis/greet_or_farewell.py farewell Mo
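
As the number of sub-commands grows, the if/elif chain grows with it. A common alternative, sketched here under the assumption that each sub-command is handled by its own function (`greet` and `farewell` are illustrative names), is to attach the function to the sub-parser with `set_defaults` and call `args.func(args)`.

```python
import argparse


def greet(args):
    return f"Hello, {args.name}! {args.name} is from {args.group}."


def farewell(args):
    return f"Goodbye, {args.name}!"


parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers(dest='command', help='Commands', required=True)

# Sub-command for greeting, linked to the greet function
greet_parser = subparsers.add_parser('greet', help='Greet the user')
greet_parser.add_argument('name', type=str, help='The name of the user')
greet_parser.add_argument('--group', type=str, default="iBehave",
                          help='Research group the scholar is a member of.')
greet_parser.set_defaults(func=greet)

# Sub-command for farewell, linked to the farewell function
farewell_parser = subparsers.add_parser('farewell', help='Say farewell to the user')
farewell_parser.add_argument('name', type=str, help='The name of the user')
farewell_parser.set_defaults(func=farewell)

# args.func is whichever function belongs to the chosen sub-command
args = parser.parse_args(['greet', 'Mo', '--group=iBOTS'])
greeting = args.func(args)
print(greeting)

args = parser.parse_args(['farewell', 'Mo'])
goodbye = args.func(args)
print(goodbye)
```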

Exercise: Change the transform_array.py CLI such that the transformation is instead specified via subcommand (as opposed to an input argument). In other words, we would like to run the command as follows:

python clis/transform_array.py standardize path/to/input_array path/to/transformed_array

python clis/transform_array.py normalize path/to/input_array path/to/transformed_array

Solution:

import argparse
import numpy as np

# Create the parser
parser = argparse.ArgumentParser(description='Transform array by normalizing or standardizing')

# Create a subparser object
subparsers = parser.add_subparsers(dest='command', help='Transformation commands', required=True)

# Sub-command for normalize
normalize_parser = subparsers.add_parser('normalize', help='Normalize the array')
normalize_parser.add_argument('input_path', type=str, help='Path to input .npy file')
normalize_parser.add_argument('output_path', type=str, help='Path to output .npy file')

# Sub-command for standardize
standardize_parser = subparsers.add_parser('standardize', help='Standardize the array')
standardize_parser.add_argument('input_path', type=str, help='Path to input .npy file')
standardize_parser.add_argument('output_path', type=str, help='Path to output .npy file')

# Parse the arguments
args = parser.parse_args()

# Load the array
array = np.load(args.input_path)

# Apply transformation based on command
if args.command == 'normalize':
    transformed = (array - np.min(array)) / (np.max(array) - np.min(array))
elif args.command == 'standardize':
    transformed = (array - np.mean(array)) / np.std(array)

# Save the transformed array
np.save(args.output_path, transformed)

print(f"Array transformed using {args.command} and saved to {args.output_path}")

Run with:

python clis/transform_array.py standardize data/raw/array.npy data/processed/array_standardized.npy
python clis/transform_array.py normalize data/raw/array.npy data/processed/array_normalized.npy
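
Both sub-commands in the solution define the same two arguments. One way to avoid the repetition, shown here as a sketch rather than as part of the course solution, is a shared parent parser passed through the `parents` option. The name `io_parser` and the file names in the argument list are placeholders.

```python
import argparse

# Parent parser holding the arguments that both sub-commands share;
# add_help=False avoids defining -h twice
io_parser = argparse.ArgumentParser(add_help=False)
io_parser.add_argument('input_path', type=str, help='Path to input .npy file')
io_parser.add_argument('output_path', type=str, help='Path to output .npy file')

parser = argparse.ArgumentParser(description='Transform array by normalizing or standardizing')
subparsers = parser.add_subparsers(dest='command', help='Transformation commands', required=True)

# Each sub-command inherits input_path and output_path from the parent
subparsers.add_parser('normalize', parents=[io_parser], help='Normalize the array')
subparsers.add_parser('standardize', parents=[io_parser], help='Standardize the array')

args = parser.parse_args(['normalize', 'in.npy', 'out.npy'])
print(args.command, args.input_path, args.output_path)
```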