Python string split and splitlines: Split by spaces or newlines

The split() method splits a Python string into a list of strings.

s = 'book-car-desk'

a = s.split('-')

print(a)  # ['book', 'car', 'desk']

split() returns a Python list. Here s contains two delimiters (-), so it is split into a list of three words. If the original string doesn't contain the delimiter, the list contains only one string, which is the original string itself.

s = 'book-car-desk'

a = s.split('***')

print(a)  # ['book-car-desk']

'book-car-desk' doesn't contain ***, so split() returns a list that has a single element.

Split a string by one space

Splitting sentences into words is a common task in Python programming. A Python string can be split by a space as follows.

s = 'book car desk'

a = s.split(' ')

print(a)  # ['book', 'car', 'desk']

The argument of split() is a single space, so the string is split into three strings.

No argument

s = 'book car desk'

a = s.split()

print(a)  # ['book', 'car', 'desk']

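When split() is called with no argument, it splits the string on whitespace. Consecutive whitespace characters count as a single separator, and leading or trailing whitespace is ignored, so even if there are many spaces between the words, the outcome is the same.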

s = ' book  car   desk'

a = s.split()

print(a)
# ['book', 'car', 'desk']
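As a small sketch of the same behavior, the no-argument form also treats tabs and trailing spaces as whitespace. The sample string below is made up for illustration.

```python
# A made-up string with a leading space, a double space, a tab, and a trailing space
s = ' book  car\tdesk '

# No argument: any run of whitespace is one separator,
# and whitespace at both ends is dropped
a = s.split()

print(a)  # ['book', 'car', 'desk']
```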

No argument vs one space delimiter

split() with no argument splits a string on any whitespace, which includes newlines as well as spaces. But if the argument is a one-space delimiter and the original string contains newlines, the newlines stay in the elements of the resulting list.

s = """
red
 blue
  green
"""

a = s.split()
b = s.split(' ')

print(a)  # ['red', 'blue', 'green']
print(b)  # ['\nred\n', 'blue\n', '', 'green\n']

The text between the triple quotes is a multi-line string, and it begins immediately after the opening triple quotes. In the above example, there is one newline between the opening triple quotes and red, so there is a newline before red in the resulting list. The two consecutive spaces before green also produce an empty string in b.
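To see exactly which characters the triple-quoted string contains, one option is to print it with repr(). This is only an inspection aid, reusing the string s from the example above.

```python
s = """
red
 blue
  green
"""

# repr() shows newlines as \n, which makes the hidden characters visible
r = repr(s)

print(r)  # '\nred\n blue\n  green\n'
```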

Split string to letters

We can split a string into its individual characters using a Python list comprehension.

a = 'Apple'

b = [c for c in a]

print(b)  # ['A', 'p', 'p', 'l', 'e']

Here c is the loop variable that takes each character of a in turn, and each value of c becomes an element of the newly generated list b.
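As an alternative sketch, the built-in list() gives the same result, because a string is an iterable of its characters.

```python
a = 'Apple'

# list() iterates over the string and collects each character
b = list(a)

print(b)  # ['A', 'p', 'p', 'l', 'e']
```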

Split by newline

The splitlines() method splits a string at line boundaries, such as a newline (line feed), in Python. It is often used to split an article into lines or paragraphs.

a = """Microsoft
Facebook
Netflix
"""

b = a.splitlines()

print(b)  # ['Microsoft', 'Facebook', 'Netflix']
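For comparison, here is a minimal sketch of how splitlines() differs from split('\n'). The sample string, which mixes \n and \r\n line endings, is made up for illustration.

```python
# A made-up string that mixes Unix (\n) and Windows (\r\n) line endings
a = 'Microsoft\nFacebook\r\nNetflix\n'

# splitlines() recognizes both line endings and adds no empty string at the end
b = a.splitlines()

# split('\n') leaves the \r in place and yields a trailing empty string
c = a.split('\n')

print(b)  # ['Microsoft', 'Facebook', 'Netflix']
print(c)  # ['Microsoft', 'Facebook\r', 'Netflix', '']
```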

This method keeps empty lines, so if there are consecutive newlines in the original string, empty strings will be in the resulting list.

a = """

Microsoft


Facebook

Netflix

"""

b = a.splitlines()

print(b)  # ['', '', 'Microsoft', '', '', 'Facebook', '', 'Netflix', '']

If you want to remove empty strings, use filter() and set the first argument to None.

a = """

Microsoft


Facebook

Netflix

"""

b = a.splitlines()
c = filter(None, b)
d = list(c)

print(d)  # ['Microsoft', 'Facebook', 'Netflix']

With None as the first argument, filter() keeps only the truthy elements, so the empty strings are removed.
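Note that a line containing only spaces is not an empty string, so filter(None, ...) keeps it. If such lines should be dropped too, one possible approach is a list comprehension with strip(), sketched here on a made-up string.

```python
# A made-up string with one whitespace-only line and one empty line
a = 'Microsoft\n   \nFacebook\n\nNetflix\n'

lines = a.splitlines()

# filter(None, ...) removes '' but keeps the whitespace-only line
b = list(filter(None, lines))

# strip() turns a whitespace-only line into '', which is falsy
c = [line for line in lines if line.strip()]

print(b)  # ['Microsoft', '   ', 'Facebook', 'Netflix']
print(c)  # ['Microsoft', 'Facebook', 'Netflix']
```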
