Advertisement
Advertisement


Why does comparing strings using either '==' or 'is' sometimes produce a different result?


Question

I've got a Python program where two variables are set to the value 'public'. In a conditional expression I have the comparison var1 is var2 which fails, but if I change it to var1 == var2 it returns True.

Now if I open my Python interpreter and do the same "is" comparison, it succeeds.

>>> s1 = 'public'
>>> s2 = 'public'
>>> s2 is s1
True

What am I missing here?

2019/02/23
1
1167
2/23/2019 10:29:49 PM

Accepted Answer

is is identity testing, == is equality testing. what happens in your code would be emulated in the interpreter like this:

>>> a = 'pub'
>>> b = ''.join(['p', 'u', 'b'])
>>> a == b
True
>>> a is b
False

so, no wonder they're not the same, right?

In other words: is is the id(a) == id(b)

2009/10/01
1549
10/1/2009 3:51:34 PM

Other answers here are correct: is is used for identity comparison, while == is used for equality comparison. Since what you care about is equality (the two strings should contain the same characters), in this case the is operator is simply wrong and you should be using == instead.

The reason is works interactively is that (most) string literals are interned by default. From Wikipedia:

Interned strings speed up string comparisons, which are sometimes a performance bottleneck in applications (such as compilers and dynamic programming language runtimes) that rely heavily on hash tables with string keys. Without interning, checking that two different strings are equal involves examining every character of both strings. This is slow for several reasons: it is inherently O(n) in the length of the strings; it typically requires reads from several regions of memory, which take time; and the reads fills up the processor cache, meaning there is less cache available for other needs. With interned strings, a simple object identity test suffices after the original intern operation; this is typically implemented as a pointer equality test, normally just a single machine instruction with no memory reference at all.

So, when you have two string literals (words that are literally typed into your program source code, surrounded by quotation marks) in your program that have the same value, the Python compiler will automatically intern the strings, making them both stored at the same memory location. (Note that this doesn't always happen, and the rules for when this happens are quite convoluted, so please don't rely on this behavior in production code!)

Since in your interactive session both strings are actually stored in the same memory location, they have the same identity, so the is operator works as expected. But if you construct a string by some other method (even if that string contains exactly the same characters), then the string may be equal, but it is not the same string -- that is, it has a different identity, because it is stored in a different place in memory.

2018/02/02

The is keyword is a test for object identity while == is a value comparison.

If you use is, the result will be true if and only if the object is the same object. However, == will be true any time the values of the object are the same.

2009/10/01

One last thing to note, you may use the sys.intern function to ensure that you're getting a reference to the same string:

>>> from sys import intern
>>> a = intern('a')
>>> a2 = intern('a')
>>> a is a2
True

As pointed out above, you should not be using is to determine equality of strings. But this may be helpful to know if you have some kind of weird requirement to use is.

Note that the intern function used to be a builtin on Python 2 but was moved to the sys module in Python 3.

2020/04/16

is is identity testing, == is equality testing. What this means is that is is a way to check whether two things are the same things, or just equivalent.

Say you've got a simple person object. If it is named 'Jack' and is '23' years old, it's equivalent to another 23yr old Jack, but its not the same person.

class Person(object):
   def __init__(self, name, age):
       self.name = name
       self.age = age

   def __eq__(self, other):
       return self.name == other.name and self.age == other.age

jack1 = Person('Jack', 23)
jack2 = Person('Jack', 23)

jack1 == jack2 #True
jack1 is jack2 #False

They're the same age, but they're not the same instance of person. A string might be equivalent to another, but it's not the same object.

2016/03/04

This is a side note, but in idiomatic python, you will often see things like:

if x is None: 
    # some clauses

This is safe, because there is guaranteed to be one instance of the Null Object (i.e., None).

2009/10/01

Source: https://stackoverflow.com/questions/1504717
Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Email: [email protected]