Python 2.7 UnicodeDecodeError: 'ascii' codec can't decode byte
Solution 1:
ofile is a byte stream, and you are writing a character (unicode) string to it. Python 2 tries to handle the mismatch by implicitly encoding the string to bytes with the default ASCII codec, which is only safe for ASCII characters. Since word contains non-ASCII characters, it fails:
>>> open('/dev/null', 'wb').write(u'ä')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)
Make ofile a text stream by opening the file with io.open, using a mode like 'wt' and an explicit encoding:
>>> import io
>>> io.open('/dev/null', 'wt', encoding='utf-8').write(u'ä')
1L
Alternatively, you can use codecs.open, which has pretty much the same interface, or encode all strings manually with encode.
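As a rough sketch of the text-stream approach, the snippet below writes a unicode string through io.open and reads it back. The helper names (write_text, read_text), the temporary file, and the sample string are assumptions made for illustration, not part of the original question.

```python
import io
import os
import tempfile


def write_text(path, text):
    """Write a unicode string to `path` as UTF-8 through a text stream."""
    with io.open(path, 'wt', encoding='utf-8') as ofile:
        ofile.write(text)


def read_text(path):
    """Read `path` back as a unicode string, decoding it as UTF-8."""
    with io.open(path, 'rt', encoding='utf-8') as ifile:
        return ifile.read()


# Hypothetical demo: a temporary file stands in for the real output file.
fd, demo_path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
try:
    write_text(demo_path, u'val="\xe4"\n')  # u'\xe4' is the character a-umlaut
    content = read_text(demo_path)
finally:
    os.remove(demo_path)
```

Because the stream does the encoding, the calling code only ever handles unicode strings and never mixes them with byte strings.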
Solution 2:
Phihag's answer is correct. I just want to suggest converting the unicode string to a byte string manually, with an explicit encoding:
ofile.write((u'\t\t\t\t\t<feat att="writtenForm" val="' +
word + u'"/>\n').encode('utf-8'))
(You may want to know how it's done using basic mechanisms instead of advanced wizardry and black magic like io.open.)
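A minimal sketch of the manual approach, assuming a file opened in binary mode: the text is encoded once, explicitly, and only bytes reach the stream. The helper name write_encoded, the temporary file, and the sample string are hypothetical.

```python
import os
import tempfile


def write_encoded(path, text, encoding='utf-8'):
    """Encode a unicode string explicitly and write the bytes to `path`."""
    data = text.encode(encoding)
    with open(path, 'wb') as ofile:
        ofile.write(data)
    return data


# Hypothetical demo: a temporary file stands in for the real output file.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
try:
    write_encoded(demo_path, u'val="\xe4"\n')
    with open(demo_path, 'rb') as ifile:
        raw = ifile.read()
finally:
    os.remove(demo_path)
```

The byte stream never sees a unicode object this way, so no implicit ASCII conversion is attempted.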
Solution 3:
I've had a similar error when writing to Word documents (.docx), specifically with the Euro symbol (€).
x = "€".encode()
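In Python 2, "€" is a byte string, so calling encode on it first decodes it implicitly with the ASCII codec. That gave the error: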
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)
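I solved it by decoding the byte string with an explicit encoding instead. A bare decode() also falls back to ASCII and fails the same way; 'utf-8' here assumes the source file is saved as UTF-8:
x = "€".decode('utf-8')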
I hope this helps!
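To make the decode/encode direction concrete, here is a small sketch, assuming the Euro sign arrives as UTF-8 bytes. The variable names are made up for the example.

```python
# The Euro sign as UTF-8 bytes, written with escapes so the example does not
# depend on the encoding of the source file.
euro_bytes = b'\xe2\x82\xac'

# bytes -> text: decode with the encoding the bytes were written in.
euro_text = euro_bytes.decode('utf-8')

# text -> bytes: encode again when the data has to leave the program.
round_trip = euro_text.encode('utf-8')

# Decoding the same bytes as ASCII fails, which is the error discussed above.
try:
    euro_bytes.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
```

Decoding turns bytes into text and encoding turns text back into bytes; the error appears whenever Python has to guess the codec and falls back to ASCII.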
Solution 4:
The best solution I found on Stack Overflow is in this post: How to fix: "UnicodeDecodeError: 'ascii' codec can't decode byte". Put the following at the beginning of the code and the default encoding will be UTF-8:
# encoding=utf8
import sys
reload(sys)
sys.setdefaultencoding('utf8')