[ACCEPTED]-Weird characters in string-python

Accepted answer
Score: 11

You have UTF-8 encoded data. You could decode the data:

with open(filename) as f:
    for line in f:
        print line.decode('utf8')

or use io.open() to have Python decode the contents for you as you read:

import io

with io.open(filename, encoding='utf8') as f:
    for line in f:
        print line
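
For comparison, here is a minimal self-contained sketch of both approaches, written to run on Python 3 (the snippets above are Python 2). The helper names read_lines_decoded and read_lines_manual and the temporary sample file are assumptions made for illustration; they are not part of the original question.

```python
import io
import os
import tempfile


def read_lines_decoded(path):
    """Let io.open() decode the UTF-8 file contents while reading."""
    with io.open(path, encoding='utf8') as f:
        return [line.rstrip('\n') for line in f]


def read_lines_manual(path):
    """Read raw bytes and decode each line explicitly."""
    with io.open(path, 'rb') as f:
        return [line.decode('utf8').rstrip('\n') for line in f]


# Hypothetical sample file holding the UTF-8 bytes from the question.
fd, sample_path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b'tamb\xc3\xa9m\nf\xc3\xbcr\ncari\xc3\xb1o\n')

decoded = read_lines_decoded(sample_path)
manual = read_lines_manual(sample_path)
os.remove(sample_path)

for word in decoded:
    print(word)
```

Both helpers should return the same list of text strings; the only difference is whether the decoding happens inside io.open() or in your own code.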

Your data, decoded:

>>> print 'tamb\xc3\xa9m'.decode('utf8')
também
>>> print 'f\xc3\xbcr'.decode('utf8')
für
>>> print 'cari\xc3\xb1o'.decode('utf8')
cariño

You appear to have printed string representations (the output of the repr() function), which produce string literal syntax suitable for pasting back into your Python interpreter. \xhh hex codes are used for characters outside of the printable ASCII range. Python containers such as list or dict also use repr() to show their contents when printed.

You may want to read up on Unicode and how it interacts with Python.
