Python string compare function: __eq__(), not equal, algorithm, time complexity, ignore case, list comparison

This post summarizes the most frequently asked questions about comparing Python strings.

1. The __eq__() method

The __eq__() method is one of Python's special methods; it compares two objects for equality. Python calls it automatically when you compare two objects with the == operator, so string comparisons with == go through this method as well.

This means that when you run "Hello" == "hello", Python is actually internally calling "Hello".__eq__("hello").

Running this comparison shows that the two strings are not equal:

str1 = 'Hello'
str2 = 'hello'
 
print(str1 == str2) # output: False

This is because the __eq__() method defined inside str, the built-in class that represents strings in Python, is case-sensitive.

Any Python built-in class that supports the == operator defines the __eq__() method. You can customize the behavior of == by overriding __eq__() in your own class. If a class does not define __eq__(), it inherits the default from object, which compares identity: two operands are equal only if they are the same object in memory.
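
The default identity comparison can be seen in a minimal sketch. Plain is a hypothetical class introduced only for this illustration; it deliberately defines no __eq__().

```python
class Plain:
    """Hypothetical class that does not define __eq__()."""

    def __init__(self, value):
        self.value = value


a = Plain("Hello")
b = Plain("Hello")
c = a  # a second name for the same object

print(a == b)  # False: different objects, even though the values match
print(a == c)  # True: both names refer to the same object
```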

The following example defines a new string class CaseInsensitiveStr that performs case-insensitive comparisons. Note how the method __eq__() is defined.

class CaseInsensitiveStr:
 
    def __init__(self, str_):
        self.str_ = str_.lower()
 
    def __eq__(self, other):
        if isinstance(other, CaseInsensitiveStr):
            return self.str_ == other.str_
        elif isinstance(other, str):
            return self.str_ == other.lower()
        return False
 
str1 = CaseInsensitiveStr('Hello')
str2 = CaseInsensitiveStr('hello')
 
print(str1 == str2) # output: True
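
As a usage sketch, the same class can also be compared against plain str objects. The class is repeated here only so the snippet runs on its own; the variable name wrapped is an assumption made for illustration.

```python
class CaseInsensitiveStr:
    """Same class as above, repeated so this snippet is self-contained."""

    def __init__(self, str_):
        self.str_ = str_.lower()

    def __eq__(self, other):
        if isinstance(other, CaseInsensitiveStr):
            return self.str_ == other.str_
        elif isinstance(other, str):
            return self.str_ == other.lower()
        return False


wrapped = CaseInsensitiveStr('Hello')

print(wrapped == 'HELLO')  # True: our __eq__() handles a plain str
print('HELLO' == wrapped)  # True: str.__eq__() gives up, so Python tries ours
print(wrapped == 42)       # False: unsupported type
```

Note that a class that defines __eq__() without also defining __hash__() becomes unhashable, so these objects cannot be used as dict keys or set members as written.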

2. Using inequality operators (>, <)

When comparing strings in Python, inequality operators can be used for two purposes:

  • Indicate lexicographic order
  • Indicate an inclusion relationship

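2.1. Inequality operators for indicating lexicographic order

Let's start with the first one. Python compares strings by the Unicode code points of their characters. For ASCII characters such as English letters and digits, the code point is the same as the ASCII value; non-ASCII characters such as Korean Hangul are compared by code point in exactly the same way. The encoding used to store text (for example UTF-8) plays no role in the comparison.

When you compare the letters a and b like this, you're actually comparing the code points of these two characters. The ord() function is a built-in function that returns the Unicode code point (which equals the ASCII code for ASCII characters) of a given character, so you can get the numeric value of any character.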

print("a" == "b") # this comparison is effectively
print(ord("a") == ord("b")) # this comparison; both print False

By the same principle, when you use inequality operators like > and < to compare strings, you are comparing Unicode code points. Code points increase in alphabetical order within each case and in numeric order for digits, so z is greater than a. Every lowercase ASCII letter is also greater than every uppercase one.

print("a" < "z") # Output: True
print("A" < "a") # Output: True
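
A short sketch makes the ordering visible by printing the code points directly. The word list is made up for illustration; note how the capitalized word sorts first because uppercase code points are smaller.

```python
# Code points of a few ASCII characters
for ch in ["0", "9", "A", "Z", "a", "z"]:
    print(ch, ord(ch))

# Sorting uses the same code point comparison
words = ["banana", "Zebra", "apple"]
print(sorted(words))  # ['Zebra', 'apple', 'banana']

# Non-ASCII characters are compared by code point too
print("가" < "나")  # True
```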

2.2. Inequality operators for representing inclusion relations

The second use is for inclusion relations between strings.

When Python compares strings of two or more characters, it compares them character by character, starting at the first position. If one string is a prefix of the other, the shorter string eventually runs out of characters while the longer one still has some left. At that point, the longer string is considered greater.

Let's see this in code:

print("trees" > "tree") # Output: True

Since "tree" is a prefix of "trees", it is always the smaller value. This only holds for prefixes: a substring that starts elsewhere in the longer string is ordered by its own first differing character.

I mention this not to suggest that you should use inequality operators to test string containment, but to illustrate how Python's string comparison works. For containment, Python provides the in operator.
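
The difference between containment and ordering can be sketched with a small example. The strings are chosen only for illustration: "pple" is contained in "apple", yet it is not smaller.

```python
# A prefix is both contained in and smaller than the longer string
print("tree" in "trees")  # True
print("tree" < "trees")   # True

# A substring that is not a prefix is still contained, but not smaller
print("pple" in "apple")  # True
print("pple" < "apple")   # False: 'p' has a larger code point than 'a'
```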

3. Python's string compare algorithm and time complexity

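Python's string comparison algorithm is intuitive. To compare two strings, Python compares the characters at the same index in both strings, starting from index 0, and stops at the first position where they differ.

In the worst case, when the strings are equal or differ only in the last character, as in "hello" == "hellu", every character must be examined. The time complexity is therefore O(n), where n is the length of the shorter string.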
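
The procedure can be sketched in pure Python. This is only an illustration of the logic, not the interpreter's actual implementation; the function name compare_strings and its -1/0/1 return convention are assumptions made for this example.

```python
def compare_strings(a: str, b: str) -> int:
    """Return -1 if a < b, 0 if a == b, and 1 if a > b."""
    # Walk both strings in step and stop at the first difference.
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            return -1 if ord(ch_a) < ord(ch_b) else 1
    # No difference found: one string is a prefix of the other, or they are equal.
    if len(a) == len(b):
        return 0
    return -1 if len(a) < len(b) else 1


print(compare_strings("hello", "hellu"))  # -1 (found at the last character)
print(compare_strings("trees", "tree"))   # 1 (the longer string is greater)
print(compare_strings("abc", "abc"))      # 0
```

The loop runs at most as many times as the shorter string has characters, which is where the O(n) bound comes from.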

4. How to ignore case when comparing two strings

As mentioned in Section 1, Python's str class implements the case-sensitive method __eq__(). If you want to compare two strings case-insensitively, you can use the upper() or lower() methods of the str class.

As the names imply, the upper() method returns a new string with all characters converted to uppercase, and the lower() method returns a new string with all characters converted to lowercase. Converting both strings the same way before comparing makes it easy to ignore case.

Let's see this in a code example:

lower_case = "happy"
mixed_case = "HaPPy"
 
print(lower_case == mixed_case) # output: False
print(lower_case.upper() == mixed_case.upper()) # output: True

This is how you get the case-insensitive Python string comparison result.
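
For text beyond plain English letters, the built-in str.casefold() method is a more aggressive alternative to lower(). The following sketch uses the German word "Straße" as an assumed example input to show a case where the two differ.

```python
a = "Straße"
b = "STRASSE"

print(a.lower() == b.lower())        # False: "straße" vs "strasse"
print(a.casefold() == b.casefold())  # True: both become "strasse"
```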

5. String-List Comparison

Python strings and lists are different data types, so comparing them directly with == always returns False.

print(["a", "b", "c"] == "abc") # Output: False

To compare the contents of two data types, we need to match the data types. To do this, we can use the following two methods.

5.1. list() function to convert a string into a list

The first method converts a string into a list. Python's list() constructor builds a list from any iterable; for a string, each character becomes one element.

Use it like this:

print(list("abc")) # Output: ['a', 'b', 'c']
print(["a", "b", "c"] == list("abc")) # output: True

Since the data types are now the same, Python compares the two lists element by element and returns True.

5.2. join() method to convert lists to strings

Alternatively, we can convert the lists to strings and then compare them.

To do this, use the join() method. It is called on the separator string and combines all the elements of the list into a single string.

print("".join(["a", "b", "c"])) # output: abc
print("".join(["a", "b", "c"]) == "abc") # Output: True

Now that we have the same string type, we can compare the contents.
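
One caveat worth sketching: join() only accepts string elements. The list digits below is a hypothetical example; its items are converted with str() first.

```python
digits = [1, 2, 3]

# "".join(digits) would raise TypeError because the elements are ints,
# so convert each element to str first.
joined = "".join(map(str, digits))
print(joined == "123")  # True

# The string that join() is called on is used as the separator.
print("-".join(["a", "b", "c"]))  # a-b-c
```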

6. Conclusion

We've covered 5 of the most frequently asked questions about comparing strings in Python, and I hope this post cleared up some of your questions.


© 2023 All rights reserved.