Effortlessly Explore String Containment in Python

How to Check if a Python String Contains a Substring

by Martin Breuss

If you’re new to programming or come from a programming language other than Python, you may be looking for the best way to check whether a string contains another string in Python. Identifying such substrings comes in handy when you’re working with text content from a file or after you’ve received user input. You may want to perform different actions in your program depending on whether a substring is present or not.

In this tutorial, you’ll focus on the most Pythonic way to tackle this task, using the membership operator in. Additionally, you’ll learn how to identify the right string methods for related, but different, use cases. Finally, you’ll also learn how to find substrings in pandas columns. This is helpful if you need to search through data from a CSV file. You could use the approach that you’ll learn in the next section, but if you’re working with tabular data, it’s best to load the data into a pandas DataFrame and search for substrings in pandas.

How to Confirm That a Python String Contains Another String

If you need to check whether a string contains a substring, use Python’s membership operator in. In Python, this is the recommended way to confirm the existence of a substring in a string:

raw_file_content = """Hi there and welcome.
This is a special hidden file with a SECRET secret.
I don't want to tell you The Secret,
but I do want to secretly tell you that I have one."""
"secret" in raw_file_content

The in membership operator gives you a quick and readable way to check whether a substring is present in a string.

Note: If you want to check whether the substring is not in the string, then you can use not in:

"secret" not in raw_file_content

When you use in, the expression returns a Boolean value: True if Python found the substring, and False if Python didn’t find the substring.

You can use this intuitive syntax in conditional statements to make decisions in your code:

if "secret" in raw_file_content:
    print("Found!")

In this code snippet, you use the membership operator to check whether “secret” is a substring of raw_file_content. If it is, then you’ll print the message “Found!”.
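The same check can drive both branches of a decision or filter a collection. The following is a minimal sketch; describe_secret() is a hypothetical helper introduced only for illustration and isn’t part of the original example:

```python
raw_file_content = """Hi there and welcome.
This is a special hidden file with a SECRET secret.
I don't want to tell you The Secret,
but I do want to secretly tell you that I have one."""


def describe_secret(text):
    """Return a message depending on whether "secret" occurs in text."""
    # Hypothetical helper, used only to show both outcomes of the check.
    if "secret" in text:
        return "Found!"
    return "Nothing to see here."


print(describe_secret(raw_file_content))    # Found!
print(describe_secret("Just a greeting."))  # Nothing to see here.

# The same membership test works as a filter, here on individual lines.
lines_with_secret = [
    line for line in raw_file_content.splitlines() if "secret" in line
]
print(len(lines_with_secret))  # 2
```

Only two lines match because the check is case-sensitive: the line containing “The Secret” has no lowercase “secret” in it.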

Generalize Your Check by Removing Case Sensitivity

By default, the membership operator in performs a case-sensitive search. This means that it only matches when the substring appears in the searched string with exactly the same casing.

However, sometimes you may want to perform a case-insensitive search, for example when the text comes from user input and you can’t predict its capitalization. In this case, you can convert both strings to the same case, either lowercase or uppercase, before performing the membership check.

raw_file_content = """Hi there and welcome.
This is a special hidden file with a SECRET secret.
I don't want to tell you The Secret,
but I do want to secretly tell you that I have one."""
"SECRETLY" in raw_file_content # False
"SECRETLY".lower() in raw_file_content.lower() # True

In the code snippet above, the first membership check returns False because the text contains “secretly” only in lowercase, so the uppercase “SECRETLY” doesn’t match. After converting both strings to lowercase with the .lower() method, the second membership check returns True. The in operator itself is still case-sensitive; the check succeeds because both operands now use the same casing.
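If you need this check in several places, you could wrap it in a small function. The sketch below assumes a hypothetical helper named contains_ignore_case() and uses str.casefold(), a built-in string method that works like .lower() but is more aggressive and is intended for caseless comparisons:

```python
def contains_ignore_case(text, substring):
    """Return True if substring occurs in text, ignoring case."""
    # Hypothetical helper, not a built-in. Both operands are normalized
    # with str.casefold() before the case-sensitive `in` check runs.
    return substring.casefold() in text.casefold()


print(contains_ignore_case("I have The Secret.", "SECRET"))  # True

# casefold() also handles characters that lower() leaves untouched,
# such as the German "ß", which casefolds to "ss".
print(contains_ignore_case("Hauptstraße 5", "STRASSE"))  # True
print("STRASSE".lower() in "Hauptstraße 5".lower())      # False
```

For plain ASCII text, .lower() and .casefold() give the same result, so either one is a reasonable choice there.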

Learn More About the Substring

In addition to checking if a Python string contains a substring, you may also want to find its index or count how many times it appears. Python provides several string methods that can help you achieve this.

Finding the Index of a Substring

If you want to find the starting index of a substring in a string, you can use the .index() method:

sentence = "The quick brown fox jumps over the lazy dog."
substring = "jumps"
index = sentence.index(substring)

In the code snippet above, .index() returns the index at which the substring “jumps” starts in the sentence. In this case, the value of index will be 20.
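One detail worth knowing: .index() raises a ValueError when the substring isn’t present, while the related .find() method returns -1 instead. The short sketch below shows both behaviors; the word “cat” is just an illustrative substring that doesn’t occur in the sentence:

```python
sentence = "The quick brown fox jumps over the lazy dog."

print(sentence.index("jumps"))  # 20
print(sentence.find("jumps"))   # 20

# .find() signals a missing substring with -1 ...
print(sentence.find("cat"))     # -1

# ... while .index() raises an exception that you need to handle.
try:
    sentence.index("cat")
except ValueError:
    print("'cat' is not in the sentence")
```

Which one fits better depends on whether a missing substring is an expected outcome or an error in your program.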

Counting the Occurrences of a Substring

To count how many times a substring appears in a string, Python provides the .count() method:

sentence = "The quick brown fox jumps over the lazy dog."
substring = "the"
count = sentence.lower().count(substring)

In the code snippet above, .count() returns the number of times the substring “the” appears in the sentence. Note that the search is effectively case-insensitive because the sentence is converted to lowercase with the .lower() method and the substring is already lowercase. In this case, the value of count will be 2.
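Two properties of .count() are easy to overlook: it is case-sensitive unless you normalize the text first, and it counts non-overlapping occurrences anywhere in the string, including inside longer words. The strings below other than the original sentence are illustrative examples:

```python
sentence = "The quick brown fox jumps over the lazy dog."

# Without .lower(), the capitalized "The" is not counted.
print(sentence.count("the"))          # 1
print(sentence.lower().count("the"))  # 2

# Occurrences are counted without overlap.
print("aaaa".count("aa"))  # 2

# Matches inside longer words count too: "the", "other", "theme".
print("the other theme".count("the"))  # 3
```

If you only want to count whole words, a regular expression with word boundaries is usually a better fit than .count().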

By leveraging these additional string methods, you can gain more control and flexibility in working with substrings in Python.

Find a Substring With Conditions Using Regex

Another powerful tool for finding substrings in Python is regular expressions, or regex. Regex allows you to define complex search patterns using special characters and syntax. It provides a more advanced and flexible way to find substrings based on various conditions.

To utilize regex for substring matching, you can use the re module in Python. Here’s an example of how to use regex to search for substrings that start with a specific prefix:

import re
string = "The sun is shining and the birds are singing."
matches = re.findall(r"\bs\w+", string)

In the code snippet above, re.findall() returns a list of all words in string that start with a lowercase “s”. The pattern \bs\w+ matches a word boundary, the letter “s”, and then one or more word characters. In this case, the value of matches will be ['sun', 'shining', 'singing'].

Regex provides a powerful and flexible way to search for substrings based on various conditions, such as prefixes, suffixes, patterns, or even complex rules. It’s a valuable tool to have in your Python toolkit when you need to perform more advanced substring matching.
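As a rough sketch of what such conditions can look like, the example below reuses the sentence from above with three variations: a prefix combined with a suffix, a search for only the first match, and a case-insensitive whole-word search. The specific patterns are illustrative choices, not the only way to express these conditions:

```python
import re

string = "The sun is shining and the birds are singing."

# Words that start with "s" and end with "ing".
print(re.findall(r"\bs\w+ing\b", string))  # ['shining', 'singing']

# re.search() returns a match object for the first match, or None.
match = re.search(r"\bs\w+", string)
if match:
    print(match.group(), match.start())  # sun 4

# re.IGNORECASE makes the pattern match regardless of casing.
print(re.findall(r"\bthe\b", string, flags=re.IGNORECASE))  # ['The', 'the']
```

Because re.search() returns None when nothing matches, you can use its result directly in a conditional, much like the in operator.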

Find a Substring in a pandas DataFrame Column

If you need to search for substrings in a pandas DataFrame column, the process is slightly different. First, you’ll need to load your data into a DataFrame using pandas. Once you have your DataFrame, you can use the .str.contains() method to check if each element in the column contains a specific substring.

Here’s an example using a pandas DataFrame with a column named “text”:

import pandas as pd
data = {
    "text": [
        "The quick brown fox",
        "jumps over the lazy dog.",
        "This sentence does not contain the word example.",
        "Here's an example sentence."
    ]
}
df = pd.DataFrame(data)
substring = "example"
# Use `.str.contains()` to check if each element in the "text" column contains the substring
matches = df["text"].str.contains(substring, case=False)

In the code snippet above, df["text"].str.contains() returns a Boolean Series that indicates whether each element in the “text” column contains the substring “example”. The case=False argument ensures that the search is case-insensitive. In this case, matches will hold the values False, False, True, True, because both of the last two rows contain the substring “example”. The third row matches even though its sentence claims otherwise, since the check looks only at the characters and not at the meaning.

By using the .str.contains() method in pandas, you can easily search for substrings within specific columns of your DataFrame.
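The Boolean Series is most useful as a mask for selecting rows. The sketch below assumes pandas is installed and uses a small made-up DataFrame. It also passes regex=False, since .str.contains() treats the search term as a regular expression by default, which matters when the term contains characters such as “.”:

```python
import pandas as pd

# Small illustrative DataFrame; the rows are made up for this example.
df = pd.DataFrame(
    {
        "text": [
            "The quick brown fox",
            "jumps over the lazy dog.",
            "Here's an example sentence.",
        ]
    }
)

# regex=False treats the search term as a literal string, not a pattern.
mask = df["text"].str.contains("example", case=False, regex=False)
print(mask.tolist())  # [False, False, True]

# Use the Boolean Series to keep only the matching rows.
matching_rows = df[mask]
print(matching_rows)
```

If the column can contain missing values, you may also want to look at the na parameter of .str.contains(), which controls what the mask holds for those rows.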

Key Takeaways

  • Python’s membership operator in is the recommended way to check whether a string contains a substring.
  • You can remove case sensitivity by converting both strings to lowercase or uppercase before performing the membership check.
  • Python provides additional string methods, such as .index() and .count(), to find the index or count the occurrences of a substring.
  • Regular expressions (regex) provide a more advanced and flexible way to find substrings based on various conditions.
  • In pandas, you can use the .str.contains() method to search for substrings within specific columns of a DataFrame.

With these techniques, you can confidently check whether a Python string contains a substring and perform various operations based on the result.