How can I check if a Python object is a string (either regular or Unicode)?
It's simple: use the following code (assuming the object is named obj):
if type(obj) == str:
    print('It is a string')
else:
    print('It is not a string.')
In Python 2, to check if an object o is a string type or a subclass of a string type:
isinstance(o, basestring)
because both str and unicode are subclasses of basestring.
To check if the type of o is exactly str:
type(o) is str
To check if o is an instance of str or any subclass of str:
isinstance(o, str)
The checks above also work for Unicode strings if you replace str with unicode.
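To make the difference between the exact-type check and the isinstance check concrete, here is a minimal Python 3 sketch. The Name class is a hypothetical str subclass invented only for this illustration.

```python
class Name(str):
    """Hypothetical str subclass, used only for illustration."""


plain = "alice"
fancy = Name("alice")

# Exact-type check: only objects whose type is str itself pass.
exact_plain = type(plain) is str       # True
exact_fancy = type(fancy) is str       # False, the type is Name

# isinstance check: str and all of its subclasses pass.
inst_plain = isinstance(plain, str)    # True
inst_fancy = isinstance(fancy, str)    # True
```

If subclasses should be accepted, which is usually what callers expect, isinstance is the check to reach for.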
However, you may not need to do explicit type checking at all. "Duck typing" may fit your needs. See http://docs.python.org/glossary.html#term-duck-typing.
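As a rough illustration of the duck-typing approach, the sketch below skips the type check and simply uses the object as a string. The function name shout and the choice of .upper() as the required behavior are assumptions made for this example.

```python
def shout(text):
    """Return text upper-cased, relying on behavior rather than type."""
    try:
        return text.upper()
    except AttributeError:
        # The object does not behave like a string for our purposes.
        raise TypeError("expected an object with an upper() method")


result = shout("abc")  # 'ABC'
```

Anything that provides upper() is accepted, so this is looser than isinstance; whether that is acceptable depends on what the function does with the value afterwards.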
See also What’s the canonical way to check for type in python?
In Python 3.x basestring is no longer available, as str is the sole string type (with the semantics of Python 2.x's unicode).
So the check in Python 3.x is just:
isinstance(obj_to_test, str)
This matches the fixer in the official 2to3 conversion tool, which converts basestring to str.
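For code that has to run on both Python 2 and Python 3, one common pattern is to pick the string base type once and reuse it. This is only a sketch; the names string_types and is_string are chosen here for illustration.

```python
try:
    # Python 2: basestring is the common base of str and unicode.
    string_types = (basestring,)
except NameError:
    # Python 3: basestring is gone and str is the only text type.
    string_types = (str,)


def is_string(obj):
    """Return True if obj is a string type on the running interpreter."""
    return isinstance(obj, string_types)
```

On Python 3, is_string(b"abc") is False because bytes is not a subclass of str; add bytes to the tuple if byte strings should count as well.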
I find this more Pythonic:
if type(aObject) is str:
    # do your stuff here
    pass
Since type objects are singletons, the is operator can be used to compare the object's type to the str type. Note that this check rejects instances of str subclasses.
If one wants to stay away from explicit type-checking (and there are good reasons to stay away from it), probably the safest part of the string protocol to check is:
str(maybe_string) == maybe_string
It won't iterate through an iterable or iterator, it won't call a list-of-strings a string and it correctly detects a stringlike as a string.
Of course there are drawbacks. For example, str(maybe_string) may be a heavy calculation. As so often, the answer is: it depends.
EDIT: As @Tcll points out in the comments, the question actually asks for a way to detect both unicode strings and bytestrings. On Python 2 this answer will fail with an exception for unicode strings that contain non-ASCII characters, and on Python 3 it will return False for all bytestrings.
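The behavior described above can be checked with a small Python 3 sketch. The helper name looks_like_string is made up for this example.

```python
def looks_like_string(maybe_string):
    """Round-trip check: does str(x) compare equal to x itself?"""
    return str(maybe_string) == maybe_string


text_result = looks_like_string("abc")         # True
list_result = looks_like_string(["a", "b"])    # False, str() gives "['a', 'b']"
bytes_result = looks_like_string(b"abc")       # False, str() gives "b'abc'"
number_result = looks_like_string(42)          # False, "42" != 42
```

The bytes case is the Python 3 limitation mentioned in the EDIT: the comparison quietly returns False rather than raising.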
I might deal with this in the duck-typing style, like others mention. How do I know a string is really a string? Well, obviously by converting it to a string!
def myfunc(word):
    word = unicode(word)
    ...
If the argument is already a str or unicode object, word will hold its value unmodified. If the object passed implements a __unicode__ method, that is used to get its unicode representation. If the object passed cannot be used as a string, the unicode builtin raises an exception.
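On Python 3 there is no unicode builtin, so a rough counterpart of the snippet above would call str() instead. The Greeting class and the upper-casing body are assumptions added only to make the sketch runnable.

```python
class Greeting:
    """Hypothetical class that defines its own string form."""

    def __str__(self):
        return "hello"


def myfunc(word):
    word = str(word)  # Python 3 counterpart of unicode(word)
    return word.upper()


from_text = myfunc("abc")          # 'ABC'
from_object = myfunc(Greeting())   # 'HELLO'
from_number = myfunc(42)           # '42'
```

One caveat: Python 3's str() falls back to repr() for objects without their own __str__, so it accepts almost anything and rarely raises. The conversion therefore normalizes the value but is a much weaker signal that the argument was string-like to begin with.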