Advertisement
Advertisement


Can someone explain __all__ in Python?


Question

I have been using Python more and more, and I keep seeing the variable __all__ set in different __init__.py files. Can someone explain what this does?

2016/03/03
1
1029
3/3/2016 5:52:24 PM

Accepted Answer

It's a list of public objects of that module, as interpreted by import *. It overrides the default of hiding everything that begins with an underscore.

2017/03/08
607
3/8/2017 3:50:12 PM


Explain __all__ in Python?

I keep seeing the variable __all__ set in different __init__.py files.

What does this do?

What does __all__ do?

It declares the semantically "public" names from a module. If there is a name in __all__, users are expected to use it, and they can have the expectation that it will not change.

It also will have programmatic affects:

import *

__all__ in a module, e.g. module.py:

__all__ = ['foo', 'Bar']

means that when you import * from the module, only those names in the __all__ are imported:

from module import *               # imports foo and Bar

Documentation tools

Documentation and code autocompletion tools may (in fact, should) also inspect the __all__ to determine what names to show as available from a module.

__init__.py makes a directory a Python package

From the docs:

The __init__.py files are required to make Python treat the directories as containing packages; this is done to prevent directories with a common name, such as string, from unintentionally hiding valid modules that occur later on the module search path.

In the simplest case, __init__.py can just be an empty file, but it can also execute initialization code for the package or set the __all__ variable.

So the __init__.py can declare the __all__ for a package.

Managing an API:

A package is typically made up of modules that may import one another, but that are necessarily tied together with an __init__.py file. That file is what makes the directory an actual Python package. For example, say you have the following files in a package:

package
├── __init__.py
├── module_1.py
└── module_2.py

Let's create these files with Python so you can follow along - you could paste the following into a Python 3 shell:

from pathlib import Path

package = Path('package')
package.mkdir()

(package / '__init__.py').write_text("""
from .module_1 import *
from .module_2 import *
""")

package_module_1 = package / 'module_1.py'
package_module_1.write_text("""
__all__ = ['foo']
imp_detail1 = imp_detail2 = imp_detail3 = None
def foo(): pass
""")

package_module_2 = package / 'module_2.py'
package_module_2.write_text("""
__all__ = ['Bar']
imp_detail1 = imp_detail2 = imp_detail3 = None
class Bar: pass
""")

And now you have presented a complete api that someone else can use when they import your package, like so:

import package
package.foo()
package.Bar()

And the package won't have all the other implementation details you used when creating your modules cluttering up the package namespace.

__all__ in __init__.py

After more work, maybe you've decided that the modules are too big (like many thousands of lines?) and need to be split up. So you do the following:

package
├── __init__.py
├── module_1
│   ├── foo_implementation.py
│   └── __init__.py
└── module_2
    ├── Bar_implementation.py
    └── __init__.py

First make the subpackage directories with the same names as the modules:

subpackage_1 = package / 'module_1'
subpackage_1.mkdir()
subpackage_2 = package / 'module_2'
subpackage_2.mkdir()

Move the implementations:

package_module_1.rename(subpackage_1 / 'foo_implementation.py')
package_module_2.rename(subpackage_2 / 'Bar_implementation.py')

create __init__.pys for the subpackages that declare the __all__ for each:

(subpackage_1 / '__init__.py').write_text("""
from .foo_implementation import *
__all__ = ['foo']
""")
(subpackage_2 / '__init__.py').write_text("""
from .Bar_implementation import *
__all__ = ['Bar']
""")

And now you still have the api provisioned at the package level:

>>> import package
>>> package.foo()
>>> package.Bar()
<package.module_2.Bar_implementation.Bar object at 0x7f0c2349d210>

And you can easily add things to your API that you can manage at the subpackage level instead of the subpackage's module level. If you want to add a new name to the API, you simply update the __init__.py, e.g. in module_2:

from .Bar_implementation import *
from .Baz_implementation import *
__all__ = ['Bar', 'Baz']

And if you're not ready to publish Baz in the top level API, in your top level __init__.py you could have:

from .module_1 import *       # also constrained by __all__'s
from .module_2 import *       # in the __init__.py's
__all__ = ['foo', 'Bar']     # further constraining the names advertised

and if your users are aware of the availability of Baz, they can use it:

import package
package.Baz()

but if they don't know about it, other tools (like pydoc) won't inform them.

You can later change that when Baz is ready for prime time:

from .module_1 import *
from .module_2 import *
__all__ = ['foo', 'Bar', 'Baz']

Prefixing _ versus __all__:

By default, Python will export all names that do not start with an _. You certainly could rely on this mechanism. Some packages in the Python standard library, in fact, do rely on this, but to do so, they alias their imports, for example, in ctypes/__init__.py:

import os as _os, sys as _sys

Using the _ convention can be more elegant because it removes the redundancy of naming the names again. But it adds the redundancy for imports (if you have a lot of them) and it is easy to forget to do this consistently - and the last thing you want is to have to indefinitely support something you intended to only be an implementation detail, just because you forgot to prefix an _ when naming a function.

I personally write an __all__ early in my development lifecycle for modules so that others who might use my code know what they should use and not use.

Most packages in the standard library also use __all__.

When avoiding __all__ makes sense

It makes sense to stick to the _ prefix convention in lieu of __all__ when:

  • You're still in early development mode and have no users, and are constantly tweaking your API.
  • Maybe you do have users, but you have unittests that cover the API, and you're still actively adding to the API and tweaking in development.

An export decorator

The downside of using __all__ is that you have to write the names of functions and classes being exported twice - and the information is kept separate from the definitions. We could use a decorator to solve this problem.

I got the idea for such an export decorator from David Beazley's talk on packaging. This implementation seems to work well in CPython's traditional importer. If you have a special import hook or system, I do not guarantee it, but if you adopt it, it is fairly trivial to back out - you'll just need to manually add the names back into the __all__

So in, for example, a utility library, you would define the decorator:

import sys

def export(fn):
    mod = sys.modules[fn.__module__]
    if hasattr(mod, '__all__'):
        mod.__all__.append(fn.__name__)
    else:
        mod.__all__ = [fn.__name__]
    return fn

and then, where you would define an __all__, you do this:

$ cat > main.py
from lib import export
__all__ = [] # optional - we create a list if __all__ is not there.

@export
def foo(): pass

@export
def bar():
    'bar'

def main():
    print('main')

if __name__ == '__main__':
    main()

And this works fine whether run as main or imported by another function.

$ cat > run.py
import main
main.main()

$ python run.py
main

And API provisioning with import * will work too:

$ cat > run.py
from main import *
foo()
bar()
main() # expected to error here, not exported

$ python run.py
Traceback (most recent call last):
  File "run.py", line 4, in <module>
    main() # expected to error here, not exported
NameError: name 'main' is not defined
2019/12/30

I'm just adding this to be precise:

All other answers refer to modules. The original question explicitely mentioned __all__ in __init__.py files, so this is about python packages.

Generally, __all__ only comes into play when the from xxx import * variant of the import statement is used. This applies to packages as well as to modules.

The behaviour for modules is explained in the other answers. The exact behaviour for packages is described here in detail.

In short, __all__ on package level does approximately the same thing as for modules, except it deals with modules within the package (in contrast to specifying names within the module). So __all__ specifies all modules that shall be loaded and imported into the current namespace when us use from package import *.

The big difference is, that when you omit the declaration of __all__ in a package's __init__.py, the statement from package import * will not import anything at all (with exceptions explained in the documentation, see link above).

On the other hand, if you omit __all__ in a module, the "starred import" will import all names (not starting with an underscore) defined in the module.

2013/05/16

It also changes what pydoc will show:

module1.py

a = "A"
b = "B"
c = "C"

module2.py

__all__ = ['a', 'b']

a = "A"
b = "B"
c = "C"

$ pydoc module1

Help on module module1:

NAME
    module1

FILE
    module1.py

DATA
    a = 'A'
    b = 'B'
    c = 'C'

$ pydoc module2

Help on module module2:

NAME
    module2

FILE
    module2.py

DATA
    __all__ = ['a', 'b']
    a = 'A'
    b = 'B'

I declare __all__ in all my modules, as well as underscore internal details, these really help when using things you've never used before in live interpreter sessions.


__all__ customizes * in from <module> import *

__all__ customizes * in from <package> import *


A module is a .py file meant to be imported.

A package is a directory with a __init__.py file. A package usually contains modules.


MODULES

""" cheese.py - an example module """

__all__ = ['swiss', 'cheddar']

swiss = 4.99
cheddar = 3.99
gouda = 10.99

__all__ lets humans know the "public" features of a module.[@AaronHall] Also, pydoc recognizes them.[@Longpoke]

from module import *

See how swiss and cheddar are brought into the local namespace, but not gouda:

>>> from cheese import *
>>> swiss, cheddar
(4.99, 3.99)
>>> gouda
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'gouda' is not defined

Without __all__, any symbol (that doesn't start with an underscore) would have been available.


Imports without * are not affected by __all__


import module

>>> import cheese
>>> cheese.swiss, cheese.cheddar, cheese.gouda
(4.99, 3.99, 10.99)

from module import names

>>> from cheese import swiss, cheddar, gouda
>>> swiss, cheddar, gouda
(4.99, 3.99, 10.99)

import module as localname

>>> import cheese as ch
>>> ch.swiss, ch.cheddar, ch.gouda
(4.99, 3.99, 10.99)

PACKAGES

In the __init__.py file of a package __all__ is a list of strings with the names of public modules or other objects. Those features are available to wildcard imports. As with modules, __all__ customizes the * when wildcard-importing from the package.[@MartinStettner]

Here's an excerpt from the Python MySQL Connector __init__.py:

__all__ = [
    'MySQLConnection', 'Connect', 'custom_error_exception',

    # Some useful constants
    'FieldType', 'FieldFlag', 'ClientFlag', 'CharacterSet', 'RefreshOption',
    'HAVE_CEXT',

    # Error handling
    'Error', 'Warning',

    ...etc...

    ]

The default case, asterisk with no __all__ for a package, is complicated, because the obvious behavior would be expensive: to use the file system to search for all modules in the package. Instead, in my reading of the docs, only the objects defined in __init__.py are imported:

If __all__ is not defined, the statement from sound.effects import * does not import all submodules from the package sound.effects into the current namespace; it only ensures that the package sound.effects has been imported (possibly running any initialization code in __init__.py) and then imports whatever names are defined in the package. This includes any names defined (and submodules explicitly loaded) by __init__.py. It also includes any submodules of the package that were explicitly loaded by previous import statements.


Wildcard imports ... should be avoided as they [confuse] readers and many automated tools.

[PEP 8, @ToolmakerSteve]

2019/11/18

From (An Unofficial) Python Reference Wiki:

The public names defined by a module are determined by checking the module's namespace for a variable named __all__; if defined, it must be a sequence of strings which are names defined or imported by that module. The names given in __all__ are all considered public and are required to exist. If __all__ is not defined, the set of public names includes all names found in the module's namespace which do not begin with an underscore character ("_"). __all__ should contain the entire public API. It is intended to avoid accidentally exporting items that are not part of the API (such as library modules which were imported and used within the module).

2012/10/11

Source: https://stackoverflow.com/questions/44834
Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Email: [email protected]