How do I trim whitespace?
How do I trim whitespace?
Question
Is there a Python function that will trim whitespace (spaces and tabs) from a string?
Example: \t example string\t
→ example string
Accepted Answer
For whitespace on both sides use str.strip
:
s = " \t a string example\t "
s = s.strip()
For whitespace on the right side use rstrip
:
s = s.rstrip()
For whitespace on the left side lstrip
:
s = s.lstrip()
As thedz points out, you can provide an argument to strip arbitrary characters to any of these functions like this:
s = s.strip(' \t\n\r')
This will strip any space, \t
, \n
, or \r
characters from the left-hand side, right-hand side, or both sides of the string.
The examples above only remove strings from the left-hand and right-hand sides of strings. If you want to also remove characters from the middle of a string, try re.sub
:
import re
print(re.sub('[\s+]', '', s))
That should print out:
astringexample
Read more… Read less…
Python trim
method is called strip
:
str.strip() #trim
str.lstrip() #ltrim
str.rstrip() #rtrim
For leading and trailing whitespace:
s = ' foo \t '
print s.strip() # prints "foo"
Otherwise, a regular expression works:
import re
pat = re.compile(r'\s+')
s = ' \t foo \t bar \t '
print pat.sub('', s) # prints "foobar"
You can also use very simple, and basic function: str.replace(), works with the whitespaces and tabs:
>>> whitespaces = " abcd ef gh ijkl "
>>> tabs = " abcde fgh ijkl"
>>> print whitespaces.replace(" ", "")
abcdefghijkl
>>> print tabs.replace(" ", "")
abcdefghijkl
Simple and easy.
#how to trim a multi line string or a file
s=""" line one
\tline two\t
line three """
#line1 starts with a space, #2 starts and ends with a tab, #3 ends with a space.
s1=s.splitlines()
print s1
[' line one', '\tline two\t', 'line three ']
print [i.strip() for i in s1]
['line one', 'line two', 'line three']
#more details:
#we could also have used a forloop from the begining:
for line in s.splitlines():
line=line.strip()
process(line)
#we could also be reading a file line by line.. e.g. my_file=open(filename), or with open(filename) as myfile:
for line in my_file:
line=line.strip()
process(line)
#moot point: note splitlines() removed the newline characters, we can keep them by passing True:
#although split() will then remove them anyway..
s2=s.splitlines(True)
print s2
[' line one\n', '\tline two\t\n', 'line three ']
No one has posted these regex solutions yet.
Matching:
>>> import re
>>> p=re.compile('\\s*(.*\\S)?\\s*')
>>> m=p.match(' \t blah ')
>>> m.group(1)
'blah'
>>> m=p.match(' \tbl ah \t ')
>>> m.group(1)
'bl ah'
>>> m=p.match(' \t ')
>>> print m.group(1)
None
Searching (you have to handle the "only spaces" input case differently):
>>> p1=re.compile('\\S.*\\S')
>>> m=p1.search(' \tblah \t ')
>>> m.group()
'blah'
>>> m=p1.search(' \tbl ah \t ')
>>> m.group()
'bl ah'
>>> m=p1.search(' \t ')
>>> m.group()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
If you use re.sub
, you may remove inner whitespace, which could be undesirable.