Python split() string: by character, multiple separators, to list, regex
One of the most common tasks in programming, especially when dealing with text-based data, is string splitting. Python is one of the easiest, most powerful, and most intuitive programming languages when it comes to working with strings.
The same is true for string splitting. Python provides a number of built-in methods and functions for splitting strings.
In this article, I'll summarize how to split strings in Python by going through the various techniques, along with examples of their use.
1. Create a list of individual characters with the list() function
In Python, a string is a sequence of characters enclosed in single or double quotes. Strings are immutable, meaning that their contents cannot be changed once they are created. Python provides basic built-in methods for working with strings, such as slicing, joining, and formatting them. Of these, we'll focus on how to split a string.
First, we'll use the built-in list() function to split a string into a list of its individual characters, including spaces.
text = "Python is great language!"
text_list = list(text)
print(text_list)
# Output: ['P', 'y', 't', 'h', 'o', 'n', ' ', 'i', 's', ' ', 'g', 'r', 'e', 'a', 't', ' ', 'l', 'a', 'n', 'g', 'u', 'a', 'g', 'e', '!']
# To build the list without spaces, remove them with replace() before calling list().
text_list = list(text.replace(' ', ''))
print(text_list)
# Output: ['P', 'y', 't', 'h', 'o', 'n', 'i', 's', 'g', 'r', 'e', 'a', 't', 'l', 'a', 'n', 'g', 'u', 'a', 'g', 'e', '!']
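Note that replace(' ', '') only removes the space character. If the text may also contain tabs or newlines, one possible alternative is a list comprehension that skips every whitespace character. This is a minimal sketch; the sample string is made up for illustration.

```python
# Hypothetical sample text containing a tab and a newline as well as a space.
text = "Python\tis great\nlanguage!"

# Keep every character that is not whitespace (space, tab, newline, ...).
chars = [ch for ch in text if not ch.isspace()]

print(chars)
```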
2. Splitting a string with the split() method
The split() method is the most common and simplest way to split a string in Python.
Called without arguments, it splits the string on runs of whitespace and returns a list of substrings.
Example using split():
text = "Python is great language!"
words = text.split()
print(words)
Here's the result of running the example.
['Python', 'is', 'great', 'language!']
You can also pass a delimiter as an argument to the split() method, so the string is split wherever that delimiter occurs.
text = "Python-is-great-language!"
words = text.split("-")
print(words)
Here's the result of running the example.
['Python', 'is', 'great', 'language!']
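One detail worth knowing is that split() with no argument and split(" ") behave differently when the text has leading, trailing, or repeated spaces. The short sketch below, using a made-up sample string, shows the difference.

```python
# Hypothetical sample with leading, trailing, and repeated spaces.
text = "  Python   is great  "

# No argument: runs of whitespace count as one separator,
# and leading/trailing whitespace is ignored.
default_words = text.split()
print(default_words)  # ['Python', 'is', 'great']

# Explicit " ": every single space is a separator,
# so empty strings appear between adjacent spaces.
space_words = text.split(" ")
print(space_words)  # ['', '', 'Python', '', '', 'is', 'great', '', '']
```

If you only want the words, the no-argument form is usually the safer choice.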
The split() method also has a maxsplit parameter, which specifies the maximum number of splits to perform.
Once that many splits have been made, starting from the left, the rest of the string is returned unsplit as the last element.
text = "Python is great language! It's easy-to-use."
words = text.split(" ", maxsplit=2)
print(words)
Here's the result of running the example.
['Python', 'is', "great language! It's easy-to-use."]
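A common use of maxsplit is to separate a string into exactly two parts. The sketch below splits a key from its value at the first "=", and uses the related rsplit() method, which counts splits from the right, to separate a file extension. The sample strings are invented for illustration.

```python
# Split only at the first "=" so the value may itself contain "=".
line = "path=/usr/bin=local"
key, value = line.split("=", maxsplit=1)
print(key)    # path
print(value)  # /usr/bin=local

# rsplit() works from the right, so only the last "." is used.
filename = "archive.tar.gz"
stem, ext = filename.rsplit(".", maxsplit=1)
print(stem)  # archive.tar
print(ext)   # gz
```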
3. Splitting lines at line breaks with the splitlines() method
When dealing with multi-line strings, use the splitlines() method.
The splitlines() method splits a multi-line string at line boundaries and returns a list of the lines.
It recognizes \n as well as other line endings such as \r\n and \r, and by default the line breaks are not included in the result.
An example of using the splitlines() method:
multiline_text = "Python is great language!\nIt's easy-to-use.\nJust try it today!"
lines = multiline_text.splitlines()
print(lines)
Here's the result of running the example.
['Python is great language!', "It's easy-to-use.", 'Just try it today!']
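To see why splitlines() is usually preferable to split("\n") for this job, here is a small comparison on a made-up string that mixes Windows-style (\r\n) and Unix-style (\n) line endings and ends with a newline.

```python
# Hypothetical text with mixed line endings and a trailing newline.
text = "first\r\nsecond\nthird\n"

# splitlines() handles both endings and adds no empty trailing item.
by_splitlines = text.splitlines()
print(by_splitlines)  # ['first', 'second', 'third']

# split("\n") leaves the "\r" behind and produces a trailing empty string.
by_split = text.split("\n")
print(by_split)  # ['first\r', 'second', 'third', '']
```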
The splitlines() method can also optionally take a keepends parameter.
If this parameter is set to True, the line ending (here \n) is kept at the end of each line in the returned list:
multiline_text = "Python is great language!\nIt's easy-to-use.\nJust try it today!"
lines = multiline_text.splitlines(keepends=True)
print(lines)
Here's the result of running the example.
['Python is great language!\n', "It's easy-to-use.\n", 'Just try it today!']
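One situation where keepends=True can be handy is when you need to put the text back together exactly as it was, including its original line endings. The following sketch, with an invented sample string, shows that joining the pieces reproduces the input.

```python
# Hypothetical text with two different line endings.
text = "alpha\nbeta\r\ngamma"

# Each line keeps whatever line ending it originally had.
lines = text.splitlines(keepends=True)
print(lines)  # ['alpha\n', 'beta\r\n', 'gamma']

# Because nothing was dropped, joining restores the original string.
rebuilt = "".join(lines)
print(rebuilt == text)  # True
```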
4. Split a string with multiple delimiters using re.split() and regular expressions
In some cases, you may need more advanced split functions to split a string based on multiple delimiters or patterns.
Python's re module provides a powerful split() function that allows you to split strings using regular expressions.
Here's an example of splitting a string using multiple delimiters:
import re
text = "Python is;great:language! It's,easy-to-use."
words = re.split(r"[;:,\s]\s*", text)
print(words)
Here's the result of running the example.
['Python', 'is', 'great', 'language!', "It's", 'easy-to-use.']
The regular expression pattern used in the example above is r"[;:,\s]\s*".
It matches a semicolon (;), colon (:), comma (,), or whitespace character (\s), followed by zero or more whitespace characters (\s*).
The re.split() function splits the string wherever this pattern matches.
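Keep in mind that when a delimiter appears at the very start or end of the string, re.split() returns empty strings in those positions. The sketch below reuses the same pattern on a made-up input and filters the empty items out afterwards; the filtering step is just one possible way to handle this.

```python
import re

# Hypothetical input that begins and ends with a delimiter.
text = ";Python is;great, "

parts = re.split(r"[;:,\s]\s*", text)
print(parts)  # ['', 'Python', 'is', 'great', '']

# Drop the empty strings left by the leading and trailing delimiters.
words = [part for part in parts if part]
print(words)  # ['Python', 'is', 'great']
```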
Another way to use re.split() is to split a string based on a pattern rather than a specific delimiter:
import re
text = "Python is;great:language!1234It's,easy-to-use."
words = re.split(r"\d+", text)
print(words)
Here's the result of running the example.
['Python is;great:language!', "It's,easy-to-use."]
In the example above, the regular expression pattern r"\d+" matches one or more consecutive digits.
The re.split() function splits the string each time this pattern occurs.
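Like str.split(), re.split() accepts a maxsplit argument. In addition, if the pattern is wrapped in a capturing group, the matched delimiters are kept in the result. Both behaviors are sketched below with made-up input strings.

```python
import re

# Limit the number of splits: only the first run of digits is used.
text = "id42name7rest99tail"
first_only = re.split(r"\d+", text, maxsplit=1)
print(first_only)  # ['id', 'name7rest99tail']

# A capturing group (the parentheses) keeps the matched digits in the list.
with_delimiters = re.split(r"(\d+)", "a1b22c")
print(with_delimiters)  # ['a', '1', 'b', '22', 'c']
```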
5. Best Practices for Splitting Strings in Python
- For simple string splitting, use Python's built-in split() method. The split() method is efficient and easy to use for most string splitting tasks. It is the best solution, especially if you are splitting a string based on a single delimiter.
- To split multi-line strings, use the splitlines() method. The splitlines() method is the most efficient and convenient way to split a multiline string line by line and return a list with the lines as elements.
- For more advanced splitting, use regular expressions with re.split(). If you need to split a string based on multiple delimiters, patterns, or complex rules, the re module's re.split() function provides powerful and flexible splitting capabilities.
- However, while regular expressions are powerful, they can also be slower than the built-in methods. We recommend using the built-in methods whenever possible, and using regular expressions only when necessary.
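If you want to check the speed difference on your own machine, a rough measurement with the standard timeit module might look like the sketch below. The sample data and the repetition count are arbitrary assumptions, and the actual timings will vary by machine and Python version, so no expected numbers are shown.

```python
import re
import timeit

# Hypothetical sample: 400 comma-separated words.
text = ",".join(["alpha", "beta", "gamma", "delta"] * 100)

# Both approaches produce the same list for a single, literal delimiter.
builtin_result = text.split(",")
regex_result = re.split(",", text)
print(builtin_result == regex_result)

# Time each approach over the same number of repetitions.
builtin_time = timeit.timeit(lambda: text.split(","), number=2000)
regex_time = timeit.timeit(lambda: re.split(",", text), number=2000)

print(f"str.split: {builtin_time:.4f} s")
print(f"re.split:  {regex_time:.4f} s")
```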
6. Conclusion
In this article, we've covered several techniques for splitting strings in Python: the built-in split() and splitlines() methods and the re module's re.split() function.
Understanding and mastering these techniques will allow you to efficiently manipulate and process text-based data in Python.
From simple text processing to complex data extraction and transformation, these powerful string manipulation tools are at your disposal. We hope you find this post helpful in handling a variety of tasks.
