Executing Python Files from Jupyter Notebook: From %run to Modular Design

Keywords: Jupyter Notebook | Python Modules | %run Command

Abstract: This article provides an in-depth exploration of various methods to execute external Python files within Jupyter Notebook, focusing on the %run command's -i parameter and its limitations. By comparing direct execution with modular import approaches, it details proper namespace sharing and introduces the autoreload extension for live reloading. Complete code examples and best practices are included to help build cleaner, maintainable code structures.

Problem Background and Core Challenge

Jupyter Notebook is widely favored in data science and machine learning workflows for its interactivity. However, as codebases grow, moving logic to external .py files becomes necessary. A common problem is that a script executed with %run script.py cannot see variables or libraries defined in the Notebook, because %run executes the file in a fresh namespace by default. Names the script defines are loaded into the Notebook after it finishes, but nothing flows in the other direction.
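
The isolation can be reproduced outside IPython with the standard library. The sketch below is only an approximation of what %run does by default: runpy.run_path executes a file in a fresh namespace, and the file name uses_x.py is a hypothetical stand-in for script.py.

```python
import runpy
import tempfile
from pathlib import Path

x = 3.0  # stands in for a variable defined in a Notebook cell

# Hypothetical script that expects the caller's x to exist.
script = Path(tempfile.mkdtemp()) / "uses_x.py"
script.write_text("result = x * 2\n")

try:
    # Fresh namespace: the script cannot see the x defined above.
    runpy.run_path(str(script))
    outcome = "ran"
except NameError as err:
    outcome = f"NameError: {err}"

print(outcome)
```

The script fails on its first line because x exists only in the caller's namespace, which is the same symptom users report with a plain %run.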

Basic Solution: Using the -i Parameter with %run

IPython's %run magic command offers the -i option, which runs the external file in the current interactive namespace. This allows the file to access all defined variables, functions, and imported modules in the Notebook.

%run -i script.py

This method is straightforward and suitable for rapid testing and prototyping. For instance, after defining variable x and function f in the Notebook, script.py can directly use these objects for plotting.
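
Sharing a namespace works in both directions, which is worth seeing once. The following sketch approximates %run -i with exec and an ordinary dictionary standing in for the Notebook's namespace; the file name and the variable y are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def f(t):
    return t * t


x = 4

# Hypothetical script that reads f and x from the caller and also rebinds x.
script = Path(tempfile.mkdtemp()) / "uses_notebook_names.py"
script.write_text("y = f(x)\nx = 0  # overwrites the caller's x\n")

# The dictionary plays the role of the Notebook's interactive namespace.
namespace = {"f": f, "x": x}
exec(compile(script.read_text(), str(script), "exec"), namespace)

print(namespace["y"], namespace["x"])
```

The script finds f and x without importing or receiving them, and its own assignment to x replaces the caller's value. That convenience is also the reason the dependencies are hard to trace later.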

Modular Design: A More Pythonic Alternative

While %run -i resolves namespace issues, it can obscure dependencies, reducing code readability and maintainability. A more elegant approach is to design the external file as a module, explicitly passing dependencies via function parameters.

Refactoring into a Reusable Module

Refactor script.py so that it exposes a clear interface:

import matplotlib.pyplot as plt

def plot_function(func, x_values):
    """
    Plot the given function over specified x values.
    
    Parameters:
        func: Callable that accepts an array of x values and returns the y values
        x_values: Array of x-axis values
    """
    plt.plot(x_values, func(x_values))
    plt.xlabel("$x$ axis", fontsize=16)
    plt.ylabel("$f(x)$", fontsize=16)
    plt.title("Function $f(x)$")
    plt.show()

Importing and Using in Notebook

Import the module and call the function in Jupyter Notebook:

import numpy as np
import script

def f(x):
    return np.exp(-x ** 2)

x = np.linspace(-1, 3, 100)
script.plot_function(f, x)

Dynamic Module Reloading with Autoreload

When frequently modifying modules during development, the autoreload extension automatically reloads changes without restarting the kernel.

%load_ext autoreload
%autoreload 1
%aimport script

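With this configuration, IPython reloads script (the module marked with %aimport) before running each cell, so edits to script.py are picked up the next time a cell calls into it. Reloading has documented caveats, so restarting the kernel remains the safest reset after large structural changes.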
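
For intuition, the manual equivalent of one reload can be sketched with importlib.reload. This is not how the autoreload extension is implemented; it only shows the effect that the extension automates. The module name demo_module and its greeting function are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep the demo free of cached bytecode

# Create a throwaway module on disk (hypothetical name: demo_module).
module_dir = Path(tempfile.mkdtemp())
module_file = module_dir / "demo_module.py"
module_file.write_text("def greeting():\n    return 'version 1'\n")

sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
import demo_module

first = demo_module.greeting()

# Edit the file on disk; the already-imported module is unchanged.
module_file.write_text("def greeting():\n    return 'version 2 (edited)'\n")
still_old = demo_module.greeting()

# Reloading re-executes the source and updates the module object.
importlib.reload(demo_module)
after_reload = demo_module.greeting()

print(first, "|", still_old, "|", after_reload)
```

Without a reload, the edit on disk has no effect on the imported module. Autoreload removes the need to call importlib.reload by hand after every change.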

Comparison and Best Practices

Direct Execution (%run -i): Ideal for quick experiments but not recommended for long-term projects due to hidden dependencies.

Modular Approach: Enhances testability, reusability, and clarity, making it the preferred choice for production environments.
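
The testability claim can be made concrete with a small sketch. It assumes a hypothetical helper, evaluate, that takes its inputs as parameters in the same way plot_function does; because nothing is read from a shared namespace, a unit test can call it directly.

```python
import math
import unittest


def evaluate(func, x_values):
    """Return func applied to each x value (hypothetical helper)."""
    return [func(x) for x in x_values]


class EvaluateTest(unittest.TestCase):
    def test_gaussian_is_symmetric_with_peak_at_zero(self):
        ys = evaluate(lambda x: math.exp(-x ** 2), [-1.0, 0.0, 1.0])
        self.assertEqual(ys[1], 1.0)
        self.assertAlmostEqual(ys[0], ys[2])


# Run the test case programmatically so the sketch also works in a Notebook cell.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(EvaluateTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

A script that depends on names injected by %run -i cannot be tested this way without first recreating the Notebook state it expects.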

Choose according to project phase and team standards: %run -i for prototyping, modular design for complex projects.

Additional Execution Methods

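System commands: the script can also be run through the shell with !python script.py. This launches a separate Python process, so the script shares no namespace with the Notebook and has no access to IPython's interactive features.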
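
The process boundary can be illustrated with subprocess, used here as a rough stand-in for the ! shell escape. The file name child_script.py and its contents are assumptions made for the example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

x = 42  # exists only in the parent interpreter

# Hypothetical script that tries to use the parent's x.
script = Path(tempfile.mkdtemp()) / "child_script.py"
script.write_text(
    "try:\n"
    "    print(x)\n"
    "except NameError:\n"
    "    print('x is not defined in the child process')\n"
)

# A separate interpreter runs the file; only its text output comes back.
result = subprocess.run(
    [sys.executable, str(script)],
    capture_output=True,
    text=True,
    check=True,
)
child_output = result.stdout.strip()
print(child_output)
```

Anything the script computes has to come back as text or through files, since the child process shares no Python objects with the Notebook.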

Conclusion

By effectively using %run -i or adopting a modular architecture, large codebases can be efficiently managed in Jupyter Notebook. The modular method, though requiring more upfront design, offers significant benefits for long-term maintenance and collaboration.

Copyright Notice: All rights in this article are reserved by the operators of DevGex. Reasonable sharing and citation are welcome; any reproduction, excerpting, or re-publication without prior permission is prohibited.