Do you really need a main() function?

MP 54: Some resources recommend that every script have a main() function. Is it really necessary?

Sep 26, 2023

Someone recently asked if all Python programs should be encapsulated in a main() function, which is then called at the end of the file:

“Some Python courses recommend using a main() function defined at the beginning of the program, but only called at the end. I don’t see anything about that in your book.
I don’t see the main() function as being necessary. Can you tell me if I should ignore it? Thanks.”

This is a great question, and it’s one I’ve asked myself at times after reading other people’s code that includes a main() function. Let’s look at why people sometimes structure their programs this way, and draw some conclusions about when to use a main() function and when to leave it out.

“Hello world”, without `main()`

Let’s look at the simplest Python program, without a main() function:

print("Hello Python world!")

This is the classic “Hello world” program. Many people don’t understand the significance of writing a one-line program like this. The main reason for running this program is to prove that the language you’re planning to use has been properly installed, and can run programs on your system.

Nearly every language has a version of “Hello world”, and many experienced programmers have run this program throughout their lives to test whether a new system has been set up correctly. It’s not just for beginners writing their first program.

“Hello world”, with `main()`

Now let’s look at the same program, with a main() function and the typical conditional block that runs it:

def main():
    print("Hello Python world!")

if __name__ == "__main__":
    main()

This program is only four lines long. If you already understand the structure of the program it can seem quite simple. But if you’re new to programming, there’s a lot going on here.

The one real line of code that matters, the call to print(), is wrapped in a function.
The function name seems arbitrary; it’s unrelated to the code inside the function.
There’s a conditional block, which includes two things unfamiliar to people who are new to programming: __name__ and "__main__". [1]

Overall, this program uses three lines of boilerplate to run a single line of code. Sometimes there are good reasons to use this kind of structure. But I don’t think it should be used in early examples of Python scripts, and I don’t think people should adopt it as a convention for all their scripts.

`main()` and imports

One of the most common reasons for using this structure in a program relates to how code is imported in Python. When a file is imported, Python runs all the code in the file. That means if you import the file hello_world.py, you’ll see the output of the print() call:

>>> import hello_world
Hello Python world!

But if you import the version that uses an if __name__ == "__main__" block, saved here as main.py, the print() call won’t be executed:

>>> import main
>>>

Both hello_world.py and main.py are executed when they’re imported. But hello_world.py is structured in a way that causes the print() call to execute no matter how the program is run. The main.py file has an if block that keeps the print() call from being executed automatically, unless the file is run directly.

You can still make the print() call run after importing the second version, by explicitly calling main():

>>> import main
>>> main.main()
Hello Python world!

The if block prevents the code inside main() from being run automatically when the file is imported, but lets you call that function whenever you need to.
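If you'd like to check this behavior without creating the files by hand, here's a rough sketch that writes both versions to a temporary directory, imports each one, and records what was printed during the import. The module names demo_unguarded and demo_guarded, and the import_and_capture() helper, are made up for this example.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

# Hypothetical module names, chosen to avoid clashing with real modules.
MODULES = {
    "demo_unguarded": 'print("Hello Python world!")\n',
    "demo_guarded": (
        "def main():\n"
        '    print("Hello Python world!")\n'
        "\n"
        'if __name__ == "__main__":\n'
        "    main()\n"
    ),
}


def import_and_capture(module_name, source):
    """Write source to a temporary .py file, import it, and return
    the module along with anything it printed during the import."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        Path(tmp_dir, f"{module_name}.py").write_text(source)
        sys.path.insert(0, tmp_dir)
        importlib.invalidate_caches()
        printed = io.StringIO()
        try:
            with contextlib.redirect_stdout(printed):
                module = importlib.import_module(module_name)
        finally:
            sys.path.remove(tmp_dir)
    return module, printed.getvalue()


for name, source in MODULES.items():
    module, printed = import_and_capture(name, source)
    print(f"{name}: printed {printed!r} on import")
```

The unguarded version should report the greeting, and the guarded version should report an empty string. The guarded module's main() function is still available to call after the import.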

Other reasons to use `main()`

There are some other reasons to use main() as well:

It encapsulates your code in a function, which makes it easier to test.
It can act as an entry point into a larger project.
If you add a CLI, your CLI can process arguments first and then call main().
It provides a clear path through the code. If you’re looking at a program for the first time and it has a main() function, you can start reading that function and follow the execution path from that point forward.
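Here's a minimal sketch of how a few of these reasons can fit together. The --name option, parse_args(), and build_greeting() are all made up for this example; they're one possible arrangement, not a required structure.

```python
import argparse


def parse_args(argv=None):
    """Process command-line arguments (hypothetical --name option)."""
    parser = argparse.ArgumentParser(description="Print a greeting.")
    parser.add_argument("--name", default="Python world", help="who to greet")
    return parser.parse_args(argv)


def build_greeting(name):
    """Return the greeting as a string, so it can be tested without printing."""
    return f"Hello {name}!"


def main(name):
    """Entry point: a reader can start here and follow the calls."""
    print(build_greeting(name))


if __name__ == "__main__":
    # The CLI processes arguments first, then hands off to main().
    args = parse_args()
    main(args.name)
```

Because the logic lives in functions, a test can import this file and call build_greeting() or parse_args() directly, without triggering the greeting.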

When not to use `main()`

Although main() has a number of important uses, it’s not required. One of the distinguishing characteristics of Python is its simplicity and lack of boilerplate. People write all kinds of scripts where the benefits of main() just don’t apply.

If you’re writing a one-off script with a specific purpose, and you have no plans to import it anywhere else, you don’t need the boilerplate of main(). You can use functions, you can break your code up into modules, and you can use imports, all without needing a main() function. Even if you understand the structure perfectly well, it adds unnecessary clutter when you don’t have a specific reason to use it.
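As a small illustration, here's a sketch of a script that uses a function and an import, with no main() anywhere. The summarize() function and the readings data are invented for this example.

```python
import statistics


def summarize(values):
    """Return the smallest, largest, and mean value."""
    return min(values), max(values), statistics.mean(values)


# Top-level code runs as soon as the script starts; no main() is involved.
readings = [12.5, 9.0, 14.25, 11.0]
low, high, mean = summarize(readings)
print(f"Low: {low}, high: {high}, mean: {mean}")
```

If a script like this never gets imported anywhere, wrapping the last three lines in main() wouldn't change what it does.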

A real-world example

As a brief example, Substack has a weekly event called Office Hours. It’s an online discussion where authors can ask each other questions, and sometimes get feedback from staff as well. Staff join the conversation at 10:00 am Pacific time, but the discussion often begins well before that time. It can open as early as 8:30, and it’s advantageous to participate early because the discussion quickly gets quite busy.

I wrote a short script to poll the Office Hours page, and open it in a browser when the discussion begins. Here’s the script:

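import os
import sys
import time

import requests

url = "https://on.substack.com/s/office-hours"

# Check the page once a minute, for up to 90 attempts.
for attempt_num in range(90):
    r = requests.get(url)
    print(f"Attempt #{attempt_num}. Status: {r.status_code}")

    # A new post shows up in the page source with a relative timestamp.
    if " min ago<" in r.text:
        # Open the page in Chrome. The open command is specific to macOS.
        cmd = f'open -a "Google Chrome" {url}'
        os.system(cmd)
        print("  Found a new post!")
        sys.exit()
    else:
        print("  No new post found. Waiting one minute...")
        time.sleep(60)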

Once a minute, this script makes a request to the Office Hours URL. If there’s a new post, the HTML source of the page will include the snippet " min ago<". If this snippet is found, the Office Hours page is opened in a browser. If it’s not found, a message is printed in the terminal where the program is running. The script times out after 90 minutes, because Office Hours doesn’t happen every week.

Waiting for Office Hours to start, without having to manually reload the browser every few minutes.

This script does exactly what I want it to, and is likely to keep doing so for as long as Substack holds Office Hours. No one else is going to use this code, and I’m never going to import it into another program. There’s no need to break this code into functions, and certainly no need to wrap it in a main() function.

When discussing Python we often consider large libraries and projects, but people write short single-purpose scripts like this all the time.

Conclusions

There are many conventions in programming that have specific purposes, but become widely used even in situations where those purposes aren’t relevant. When you see a convention that doesn’t seem to have any specific rationale, it’s worth digging in and finding out exactly when it’s needed.

As a Python programmer, you should certainly understand the purpose of the main() function and the if __name__ == "__main__" block. But I don’t suggest you use it in every script you write. And if you’re writing resources for beginners, please don’t give them that kind of boilerplate until it’s actually needed.

[1] Briefly, Python sets a value for __name__ for every file as it’s being executed. Normally the value of __name__ is the same as the file name, without the .py ending. If the file is being executed directly, the value of __name__ will be "__main__".

The if __name__ == "__main__" block means that the code inside that block will only be executed when the file is run directly. That code won’t be run when the file is imported into another Python program.
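A quick way to see these values for yourself is sketched below. It uses the standard library's json module only because it's always available; any imported module would do.

```python
import json

# An imported module reports its own name.
print(json.__name__)

# The file being run directly reports "__main__" instead.
print(__name__)
```

If you save this as a file and run it directly, the second line of output should be __main__. If you import the file instead, the second value should be the file's name without the .py ending.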

Mostly Python
