Python string compare function: __eq__(), not equal, algorithm, time complexity, ignore case, list comparison
This post summarizes the most frequently asked questions about comparing Python strings.
1. The __eq__() method
The __eq__() method is one of Python's special methods. It compares two objects to determine whether they are equal.
Python calls this method automatically when you compare two objects with the == operator.
String comparisons with == go through the same method.
This means that when you run "Hello" == "hello", Python internally calls "Hello".__eq__("hello").
Running the comparison shows that the two strings are different:
str1 = 'Hello'
str2 = 'hello'
print(str1 == str2) # output: False
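The equivalence between the operator and the method can be checked by calling __eq__() directly. This is a minimal sketch for illustration only; the variable names are hypothetical, and in everyday code you would simply use ==.

```python
# Calling the special method directly gives the same answer as ==
via_operator = "Hello" == "hello"
via_method = "Hello".__eq__("hello")
print(via_operator, via_method)  # False False

# When str does not know how to compare with the other type,
# its __eq__() returns the NotImplemented sentinel, and == ends up False
unsupported = "Hello".__eq__(5)
print(unsupported)   # NotImplemented
print("Hello" == 5)  # False
```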
This is because the __eq__() method defined in str, the built-in class that represents strings in Python, is case-sensitive.
More generally, every Python built-in class that supports the == operator defines the __eq__() method.
We can customize the behavior of the == operator by overriding the __eq__() method of any class.
If a class does not define this method, it inherits the default from object, which treats two objects as equal only if they are the same object in memory.
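The default behavior is easy to observe with a class that does not define __eq__(). The Point class below is a hypothetical example that does not come from the original post; it only illustrates the identity-based fallback.

```python
class Point:
    """A plain class that does not define __eq__ (hypothetical example)."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


p1 = Point(1, 2)
p2 = Point(1, 2)  # same values, but a separate object
p3 = p1           # a second name for the same object

print(p1 == p2)  # False: the default comparison checks identity, not values
print(p1 == p3)  # True: both names refer to the same object
```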
The following example defines a new string class, CaseInsensitiveStr, that performs case-insensitive comparisons.
Note how the __eq__() method is defined.
class CaseInsensitiveStr:
    def __init__(self, str_):
        # Store a lowercase copy so every comparison ignores case
        self.str_ = str_.lower()

    def __eq__(self, other):
        if isinstance(other, CaseInsensitiveStr):
            return self.str_ == other.str_
        elif isinstance(other, str):
            return self.str_ == other.lower()
        # Unsupported type: let Python fall back to its default handling,
        # which makes == evaluate to False here
        return NotImplemented

str1 = CaseInsensitiveStr('Hello')
str2 = CaseInsensitiveStr('hello')
print(str1 == str2) # output: True
2. Using inequality operators (>, <)
When comparing strings in Python, inequality operators can be used for two purposes:
- Indicate lexicographic order
- Indicate an inclusion relationship
2.1. Inequality operators for indicating lexicographic order
Let's start with the first one. Python compares strings by the Unicode code points of their characters. For ASCII characters such as English letters and digits, the code point is the same as the ASCII value. Non-ASCII characters such as Korean are compared by code point in exactly the same way.
When you compare the letters a and b as in the code below, you're actually comparing the code points of these two characters.
The ord() function is a built-in function that returns the Unicode code point of a given character (for ASCII characters, this equals the ASCII code), so you can get the numeric value of any character.
print("a" == "b") # Output: False. This comparison is effectively
print(ord("a") == ord("b")) # 97 == 98, which is also False
By the same principle, when you use inequality operators like > and < to compare strings, you are comparing Unicode code point values.
For ASCII letters and digits, these values increase in alphabetical and numerical order. z is greater than a, and lowercase letters are greater than uppercase letters.
print("a" < "z") # Output: True
print("A" < "a") # Output: True
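One practical consequence of this ordering shows up when sorting. The sketch below uses a made-up word list; the key=str.lower variant is a common way to get a case-insensitive order and is shown here as an option, not as part of the original post.

```python
words = ["banana", "Zebra", "apple"]

# Every uppercase letter has a smaller code point than every lowercase letter
print(ord("Z"), ord("a"))  # 90 97

# Default sort uses the same code point comparison as < and >
default_order = sorted(words)
print(default_order)  # ['Zebra', 'apple', 'banana']

# Sorting by the lowercase form gives a case-insensitive order
ignore_case_order = sorted(words, key=str.lower)
print(ignore_case_order)  # ['apple', 'banana', 'Zebra']
```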
2.2. Inequality operators for representing inclusion relations
The second use is for inclusion relations between strings.
When Python compares strings of two or more characters, it compares them one character at a time, starting with the first. If one string is a prefix of the other, the shorter string eventually runs out of characters while the longer one still has some left. At that point, the longer string is determined to be larger.
Let's see this in code:
print("trees" > "tree") # Output: True
Since "tree" is a prefix of "trees", it always compares as the smaller value.
This only holds for prefixes. A substring that appears elsewhere in the string can compare either way.
I mention this not to suggest that you should use inequality operators to test string containment,
but to illustrate how Python's string comparison works.
To determine inclusion, we have the powerful in operator.
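The difference between ordering and containment can be seen side by side. This is a small illustrative sketch with made-up strings and variable names.

```python
# The in operator tests containment anywhere in the string
prefix_found = "tree" in "trees"
middle_found = "ree" in "trees"
print(prefix_found, middle_found)  # True True

# Ordering says nothing reliable about containment:
# "s" is contained in "as", yet it compares as the larger string
contained = "s" in "as"
smaller = "s" < "as"
print(contained)  # True
print(smaller)    # False, because ord("s") is 115 and ord("a") is 97
```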
3. Python's string compare algorithm and time complexity
Python's string comparison algorithm is intuitive. To compare two strings, Python compares the characters at the same index, one pair at a time from the start, and stops at the first pair that differs.
This results in a time complexity of O(n), where n is the length of the shorter string, when the maximum number of operations is required, such as "hello" == "hellu", where only the last character differs.
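The idea can be sketched in pure Python. This is not how the interpreter is actually implemented (the real comparison runs in optimized C code); strings_equal is a hypothetical helper, and the up-front length check is an assumption of this sketch. It returns the result together with the number of character pairs it inspected.

```python
def strings_equal(a, b):
    """Return (is_equal, pairs_compared) for two strings (illustrative only)."""
    pairs_compared = 0
    # Assumption of this sketch: strings of different length cannot be equal,
    # so no characters need to be inspected
    if len(a) != len(b):
        return False, pairs_compared
    for ch_a, ch_b in zip(a, b):
        pairs_compared += 1
        if ch_a != ch_b:
            # Stop at the first mismatch
            return False, pairs_compared
    return True, pairs_compared


print(strings_equal("hello", "jello"))  # (False, 1): best case, first pair differs
print(strings_equal("hello", "hellu"))  # (False, 5): worst case, last pair differs
print(strings_equal("hello", "hello"))  # (True, 5): equal strings need every pair
```

The worst case inspects every character once, which is where the O(n) bound comes from.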
4. How to ignore case when comparing two strings
As mentioned in Section 1, Python's str class implements a case-sensitive __eq__() method.
If you want to compare two strings case-insensitively, you can use the upper() or lower() methods of the str class.
As the names imply, the upper() method returns a new string with all characters of the given string converted to uppercase,
and the lower() method returns a new string with the characters converted to lowercase.
Converting both strings to the same case before comparing makes it easy to ignore case.
Let's see this in a code example:
lower_case = "happy"
mixed_case = "HaPPy"
print(lower_case == mixed_case) # output: False
print(lower_case.upper() == mixed_case.upper()) # output: True
This is how you get the case-insensitive Python string comparison result.
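For text outside plain English, str also provides the casefold() method, which is designed for caseless matching and is more aggressive than lower(). It is not covered in the post above, so treat this as a side note; the German example strings are chosen only for illustration.

```python
a = "Straße"
b = "STRASSE"

# lower() keeps the German sharp s, so the strings still differ
lower_match = a.lower() == b.lower()
print(lower_match)  # False: 'straße' vs 'strasse'

# casefold() expands 'ß' to 'ss', so the strings match
casefold_match = a.casefold() == b.casefold()
print(casefold_match)  # True: both become 'strasse'
```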
5. String-List Comparison
Python strings and lists are different data types, so comparing them directly with == will always return False.
print(["a", "b", "c"] == "abc") # Output: False
To compare their contents, we first need to convert them to the same data type. We can do this in either of the following two ways.
5.1. list() function to convert a string into a list
The first method converts the string into a list. Python's list() function builds a list from any iterable, and for a string it produces a list of its individual characters.
Use it like this:
print(list("abc")) # Output: ['a', 'b', 'c']
print(["a", "b", "c"] == list("abc")) # Output: True
Since the data types are now the same, Python compares the two lists element by element and returns True.
5.2. join() method to convert lists to strings
Alternatively, we can convert the list to a string and then compare.
To do this, use the join() method. Called on a separator string (here the empty string ""), it combines all the elements of the list into a single string. The elements must themselves be strings.
print("".join(["a", "b", "c"])) # Output: abc
print("".join(["a", "b", "c"]) == "abc") # Output: True
Now that both values are strings, we can compare their contents.
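The two approaches do not always agree, which may matter when choosing between them. The sketch below uses a made-up list whose elements are not all single characters, plus a list of integers to show one way of handling non-string elements.

```python
chunks = ["ab", "c"]

# Joining ignores how the text was split into elements
joined_match = "".join(chunks) == "abc"
print(joined_match)  # True

# Converting the string to a list compares element by element
list_match = chunks == list("abc")
print(list_match)  # False: ['ab', 'c'] vs ['a', 'b', 'c']

# join() only accepts strings, so convert other elements first
digits = [1, 2, 3]
digits_match = "".join(str(d) for d in digits) == "123"
print(digits_match)  # True
```

Use join() when only the combined text matters, and list() when the split into elements matters too.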
6. Conclusion
We've covered 5 of the most frequently asked questions about comparing strings in Python, and I hope this post cleared up some of your questions.
