Advertisement
Advertisement


Extract file name from path, no matter what the os/path format


Question

Which Python library can I use to extract filenames from paths, no matter what the operating system or path format could be?

For example, I'd like all of these paths to return me c:

a/b/c/
a/b/c
\a\b\c
\a\b\c\
a\b\c
a/b/../../a/b/c/
a/b/../../a/b/c
2016/11/28
1
838
11/28/2016 1:46:41 PM

Accepted Answer

Using os.path.split or os.path.basename as others suggest won't work in all cases: if you're running the script on Linux and attempt to process a classic windows-style path, it will fail.

Windows paths can use either backslash or forward slash as path separator. Therefore, the ntpath module (which is equivalent to os.path when running on windows) will work for all(1) paths on all platforms.

import ntpath
ntpath.basename("a/b/c")

Of course, if the file ends with a slash, the basename will be empty, so make your own function to deal with it:

def path_leaf(path):
    head, tail = ntpath.split(path)
    return tail or ntpath.basename(head)

Verification:

>>> paths = ['a/b/c/', 'a/b/c', '\\a\\b\\c', '\\a\\b\\c\\', 'a\\b\\c', 
...     'a/b/../../a/b/c/', 'a/b/../../a/b/c']
>>> [path_leaf(path) for path in paths]
['c', 'c', 'c', 'c', 'c', 'c', 'c']


(1) There's one caveat: Linux filenames may contain backslashes. So on linux, r'a/b\c' always refers to the file b\c in the a folder, while on Windows, it always refers to the c file in the b subfolder of the a folder. So when both forward and backward slashes are used in a path, you need to know the associated platform to be able to interpret it correctly. In practice it's usually safe to assume it's a windows path since backslashes are seldom used in Linux filenames, but keep this in mind when you code so you don't create accidental security holes.

2013/04/28
824
4/28/2013 9:12:59 PM


os.path.split is the function you are looking for

head, tail = os.path.split("/tmp/d/a.dat")

>>> print(tail)
a.dat
>>> print(head)
/tmp/d
2018/01/17

In python 3

>>> from pathlib import Path    
>>> Path("/tmp/d/a.dat").name
'a.dat'
2018/02/03

import os
head, tail = os.path.split('path/to/file.exe')

tail is what you want, the filename.

See python os module docs for detail

2020/02/24

import os
file_location = '/srv/volume1/data/eds/eds_report.csv'
file_name = os.path.basename(file_location )  #eds_report.csv
location = os.path.dirname(file_location )    #/srv/volume1/data/eds

In your example you will also need to strip slash from right the right side to return c:

>>> import os
>>> path = 'a/b/c/'
>>> path = path.rstrip(os.sep) # strip the slash from the right side
>>> os.path.basename(path)
'c'

Second level:

>>> os.path.filename(os.path.dirname(path))
'b'

update: I think lazyr has provided the right answer. My code will not work with windows-like paths on unix systems and vice versus with unix-like paths on windows system.

2011/12/05

Source: https://stackoverflow.com/questions/8384737
Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Email: [email protected]