How to Check if a Python String Contains a Substring
If you’re new to programming or come from a programming language other than Python, you may be looking for the best way to check whether a string contains another string in Python. Identifying such substrings comes in handy when you’re working with text content from a file or after you’ve received user input. You may want to perform different actions in your program depending on whether a substring is present or not.
In this tutorial, you’ll focus on the most Pythonic way to tackle this task, using the `in` membership operator. Additionally, you’ll learn how to identify the right string methods for related, but different, use cases. Finally, you’ll also learn how to find substrings in pandas columns. This is helpful if you need to search through data from a CSV file. You could use the approach that you’ll learn in the next section, but if you’re working with tabular data, it’s best to load the data into a pandas DataFrame and search for substrings in pandas.
How to Confirm That a Python String Contains Another String
If you need to check whether a string contains a substring, use Python’s membership operator `in`. In Python, this is the recommended way to confirm the existence of a substring in a string:
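A minimal sketch of this check follows. The sample text assigned to `raw_file_content` is made up for illustration; only the variable name comes from this tutorial.

```python
# Hypothetical sample text, invented for this illustration.
raw_file_content = """Hi there and welcome.
This file holds a SECRET secret."""

# The membership operator evaluates to True when the substring
# occurs anywhere inside the string.
has_secret = "secret" in raw_file_content
print(has_secret)  # True
```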
The `in` membership operator gives you a quick and readable way to check whether a substring is present in a string. You may notice that the line of code almost reads like English.
Note: If you want to check whether the substring is not in the string, then you can use `not in`:
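As a small sketch of the negated check, again with a made-up sample string standing in for `raw_file_content`:

```python
# Hypothetical sample text, invented for this illustration.
raw_file_content = "This file holds a SECRET secret."

# not in is the logical opposite of in.
is_missing = "secret" not in raw_file_content
print(is_missing)  # False
```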
Because the substring `"secret"` is present in `raw_file_content`, the `not in` operator returns `False`.
When you use `in`, the expression returns a Boolean value:
- `True` if Python found the substring
- `False` if Python didn’t find the substring
You can use this intuitive syntax in conditional statements to make decisions in your code:
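One way such a conditional might look, assuming a made-up sample string for `raw_file_content` and an arbitrary message:

```python
# Hypothetical sample text, invented for this illustration.
raw_file_content = "This file holds a SECRET secret."

# The Boolean result of the membership test drives the if statement.
if "secret" in raw_file_content:
    print("Found!")
```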
In this code snippet, you use the membership operator to check whether `"secret"` is a substring of `raw_file_content`. If it is, then you’ll print a message to the terminal.
Generalize Your Check by Removing Case Sensitivity
Sometimes you want to perform a substring check regardless of the letter case in the string. For example, you might want to search for `"secret"` in a string, regardless of whether the substring is capitalized, all lowercase, or a mix of both. To achieve this, you can convert both the string and the substring to the same letter case before performing the check.
In Python, you have the methods `.lower()` and `.upper()` available for strings to convert all characters to lowercase or uppercase, respectively. You can use these methods together with the membership operator to perform a case-insensitive substring check.
Here’s an example:
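The following sketch assumes a made-up sample string in which the word appears only in uppercase and title case, so the effect of lowercasing is visible:

```python
# Hypothetical sample text, invented for this illustration.
raw_file_content = "This file holds a SECRET, and The Secret stays hidden."

# Normalize the text to lowercase before searching.
file_content = raw_file_content.lower()

print("secret" in raw_file_content)  # False: only "SECRET" and "Secret" occur
print("secret" in file_content)      # True
```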
In this example, you convert `raw_file_content` to lowercase using the `.lower()` method, and then assign the result to the variable `file_content`. Now, regardless of whether the substring “secret” is written in uppercase or lowercase in the original string, the `in` operator will check for its existence in the lowercase `file_content` string.
This also works with the `.upper()` method if you want to perform a case-insensitive check but prefer to write the substring in uppercase:
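A sketch of the uppercase variant, using a made-up sample string:

```python
# Hypothetical sample text, invented for this illustration.
raw_file_content = "This file holds a secret, and The Secret stays hidden."

# Normalize the text to uppercase, then search for an uppercase substring.
file_content = raw_file_content.upper()

print("SECRET" in raw_file_content)  # False: no all-caps occurrence
print("SECRET" in file_content)      # True
```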
In this case, you convert `raw_file_content` to uppercase using the `.upper()` method, and then assign the result to the variable `file_content`. The `in` operator now performs a case-insensitive check for the uppercase substring “SECRET” in the uppercase `file_content` string.
Learn More About the Substring
In some cases, you may want to extract or manipulate the substring that you’re looking for. Python provides several string methods that can help you with this task. Here are a few commonly used methods:
- `.index()`: Returns the index of the first occurrence of a substring in a string, and raises a `ValueError` if the substring isn’t found.
- `.find()`: Returns the index of the first occurrence of a substring in a string, or -1 if the substring is not found.
- `.count()`: Returns the number of occurrences of a substring in a string.
- `.replace()`: Returns a new string in which all occurrences of a substring are replaced with a new string.
- `.split()`: Splits a string into a list at each occurrence of a substring.
Let’s see some examples of how these methods work:
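The sketch below exercises each method on a short sample string. The string and the variable name `text` are made up for illustration.

```python
# Hypothetical sample text, invented for this illustration.
text = "a secret is a secret"

print(text.index("secret"))              # 2
print(text.find("secret"))               # 2
print(text.find("hidden"))               # -1
print(text.count("secret"))              # 2
print(text.replace("secret", "puzzle"))  # a puzzle is a puzzle
print(text.split(" is "))                # ['a secret', 'a secret']

# Unlike .find(), .index() raises an exception for a missing substring.
try:
    text.index("hidden")
except ValueError:
    print("'hidden' is not in the text")
```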
These methods provide you with powerful tools to manipulate strings and extract relevant information from them.
Find a Substring With Conditions Using Regex
Python’s membership operator `in` and the string methods covered so far are very useful for basic substring checks. However, if you need to search for substrings that meet certain conditions or have specific patterns, regular expressions (regex) provide a more flexible and powerful solution.
The `re` module in Python provides functions for working with regular expressions. To search for substrings using regex, you can use the `re.search()` function, which returns a `Match` object if a match is found, or `None` if no match is found.
Here’s an example of how to use `re.search()` to find a substring that starts with “secret” and ends with a digit:
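A minimal sketch of this search, assuming a made-up sample string for `file_content`:

```python
import re

# Hypothetical sample text, invented for this illustration.
file_content = "the secret code is 42, keep it safe"

# re.search() scans the whole string and returns a Match object or None.
match = re.search(r"secret.*\d", file_content)

if match:
    print("Found!")
    # .* is greedy, so the match extends to the last digit on the line.
    print(match.group())  # secret code is 42
```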
In this code, the regular expression `r"secret.*\d"` defines the search pattern. The pattern starts with “secret”, followed by any number of characters other than newlines (`.*`), and ends with a digit (`\d`). If this pattern is found in the `file_content` string, the `if` statement will execute and print “Found!”.
Regular expressions provide a powerful way to search for substrings with complex patterns. You can define patterns for specific characters, digits, whitespace, and much more. However, regular expressions are beyond the scope of this tutorial, so if you’re interested in learning more, refer to the official Python documentation or additional resources on regular expressions.
Find a Substring in a pandas DataFrame Column
If you’re working with tabular data and need to search for substrings in specific columns, pandas provides easy-to-use methods that can save you time and effort. Let’s say you have a pandas DataFrame with a column called “text” containing text data, and you want to find all rows that contain a certain substring.
Here’s an example:
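The sketch below builds a small DataFrame by hand. The rows are made up for illustration, while the “text” column, the substring “example”, and the name `sub_df` follow the description in this section.

```python
import pandas as pd

# Hypothetical rows, invented for this illustration.
df = pd.DataFrame(
    {"text": ["an example sentence", "nothing to see here", "another example"]}
)

# .str.contains() returns a Boolean Series with one entry per row.
# Note that it treats the pattern as a regular expression by default;
# pass regex=False to match the text literally.
mask = df["text"].str.contains("example")

# Index the DataFrame with the mask to keep only the matching rows.
sub_df = df[mask]
print(sub_df)
```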
In this code, the `.str.contains()` method is used to check whether each value in the “text” column contains the substring “example”. This method returns a Boolean mask, which can be used to filter the DataFrame. In this case, all rows that contain the substring “example” are selected and stored in the `sub_df` DataFrame.
You can modify the substring and column name according to your specific use case. This method is particularly useful when working with large datasets or when you need to perform complex filtering operations based on substrings.
Key Takeaways
- You can use the membership operator `in` to check whether a string contains a substring. This provides a quick and readable way to confirm the existence of a substring in a string.
- Python also provides useful string methods like `.index()`, `.find()`, `.count()`, `.replace()`, and `.split()` to work with substrings and perform various operations.
- If you need to search for substrings with specific conditions or patterns, regular expressions provide a more flexible solution. The `re` module in Python allows you to work with regular expressions.
- When working with tabular data in pandas, you can use the `.str.contains()` method to filter rows based on substrings in specific columns. This can save you time and effort when searching for substrings in large datasets.
Now that you know the different ways to check if a Python string contains a substring, you can confidently handle tasks that require substring identification and manipulation. Whether you’re analyzing text data, processing user input, or searching through large datasets, these techniques will help you efficiently work with substrings in Python.