+ 1
Why do some symbols such as £ show up as different symbols in print?
Some code I wrote in Code Playground used print to display a string containing a £ (British pound) symbol, but it showed up as a different character: an accented u. For example, print("£3") displays ù3 instead of £3. I think it might have something to do with Unicode and a conversion error "under the hood", or perhaps something like HTML entities? (Edited because I went back to Code Playground and found the correct symbol.)
7 Answers
+ 4
http://lucumr.pocoo.org/2014/5/12/everything-about-unicode/
In case you would enjoy a rant with useful data. I don't know exactly what's up for you so would have to poke more for simpler...and I'm traveling :/
+ 4
SoloLearn is running (I believe; it's been a little while since I checked) Windows Server 2012. The default character set for Windows is from the 8859 family: ISO-8859-1 (Latin-1), or the closely related cp1252.
I tried detecting encoding...but they don't have the module:
>>> import chardet
>>> s = b'\xe2\x98\x83' # UTF-8 bytes for ☃ (chardet expects bytes)
>>> chardet.detect(s)
{'confidence': 0.505, 'encoding': 'utf-8'}
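Since chardet isn't available there, a rough stdlib-only stand-in is to try decoding the bytes with a few candidate encodings and see which ones succeed. This is only a sketch: the helper name decodes_cleanly and the candidate list are my own assumptions, and single-byte code pages accept almost any input, so a "clean" decode is a hint rather than proof.

```python
def decodes_cleanly(data, candidates=("utf-8", "cp1252", "cp437")):
    """Return {encoding: decoded text} for every candidate that decodes without error."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            pass  # these bytes are not valid in this encoding
    return results

# ascii() keeps the demo printable even on a console with a limited character set
for sample in (b"\xe2\x98\x83", b"\xa33"):
    print(sample, {name: ascii(text) for name, text in decodes_cleanly(sample).items()})
```

The snowman bytes decode under all three candidates (only UTF-8 gives a sensible result), while a lone 0xA3 byte is rejected by UTF-8 and comes out as £ under cp1252.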
Then I tried forcing it:
line 1:
# -*- coding: latin-1 -*-
s = "£3"
print(s.decode('iso-8859-1').encode('utf-8'))
Line 1:
Syntax error: encoding problem:
iso-8859-1 with BOM
Ironically (I got my answer while trying to coerce the encoding, before forcing anything), the crash itself seems to reveal the character set in use.
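For anyone who wants to reproduce that kind of clash locally, here is a minimal sketch using the standard library's tokenize.detect_encoding, which raises SyntaxError when a UTF-8 BOM and the coding declaration disagree. The helper name declared_encoding and the sample source bytes are made up for illustration; I'm assuming the playground prepends a BOM to the file, which I haven't verified.

```python
import codecs
import io
import tokenize

def declared_encoding(source_bytes):
    """Return the source encoding Python detects from a BOM and/or coding declaration."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding

plain = b"# -*- coding: latin-1 -*-\ns = 1\n"
with_bom = codecs.BOM_UTF8 + plain  # same source, but saved with a UTF-8 BOM

print(declared_encoding(plain))
try:
    declared_encoding(with_bom)
except SyntaxError as exc:
    # BOM says UTF-8, declaration says latin-1: Python refuses to guess
    print("SyntaxError:", exc)
```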
+ 4
Also, Windows uses code page 1252 in legacy/text mode:
import locale
print(locale.getpreferredencoding())
Output: cp1252
# causes encoding error (for 0x80) hinting at above check:
for i in range(127):
    print(str(i+128) + ":" + chr(i+128))
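Why 0x80 trips it up, as a small sketch (the helper name encodable is mine): chr(0x80) is the Unicode control character U+0080, which has no byte in cp1252, because cp1252 uses byte 0x80 for the euro sign instead. So printing it to a cp1252 console can't succeed.

```python
def encodable(ch, encoding):
    """Return True if the character can be represented in the given encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

print(encodable(chr(0x80), "cp1252"))   # U+0080: a control character, no cp1252 byte
print(encodable("\u20ac", "cp1252"))    # euro sign: this is what byte 0x80 means in cp1252
print(encodable("\u00a3", "cp1252"))    # pound sign: byte 0xA3 in cp1252
```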
This link shows encoding errors caused by mixing utf-8 and cp1252/8859...I feel like this is the right question area:
www.i18nqa.com/debug/utf8-debug.html#dbg
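To make that kind of mix-up concrete, here is a small sketch of what happens when the bytes for £ are written in one encoding and read back in another. The helper name misread and the encoding pairs are just examples I picked; I'm not claiming either pair is what Code Playground actually does.

```python
def misread(text, written_as, read_as):
    """Encode text with one encoding, then decode the same bytes with another."""
    return text.encode(written_as).decode(read_as)

pound = "\u00a3"  # the £ sign

# ascii() keeps the demo printable even on a console with a limited character set
print(ascii(misread(pound, "utf-8", "cp1252")))    # two characters: Â followed by £
print(ascii(misread(pound, "latin-1", "cp437")))   # one character: ú (u with acute)
```

The second case is the interesting one for this thread: a Latin-1 £ shown through an old DOS code page comes out as an accented u.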
+ 3
This works...not entirely sure on this yet but something to look at:
https://code.sololearn.com/c6NQFsmNHz82/?ref=app
+ 2
ooh a rant with useful data. I shall check that out
+ 2
So I've read the link Kirk sent now. Lots of it went over my head, but it
a) confirmed that my issue may be an error in Python's handling of Unicode
b) was a fairly interesting read, and it looks like the author knows their stuff (could be worth reading more of the blog in future)
+ 1
I'm now googling my problem without much success so far, and the suggestions I've tried in Code Playground don't fix it.
This link, http://stackoverflow.com/questions/12330184/printing-unicode, seems to indicate it may be because Code Playground uses a different character set, so maybe I just need to try different character sets until I find one that works.
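If it helps, the "try different character sets" idea could be sketched like this: encode the string with each candidate and look at the raw bytes that would be sent to the output. The function name preview_encodings and the candidate list are my own assumptions, not anything from Code Playground.

```python
def preview_encodings(text, candidates=("utf-8", "cp1252", "cp437")):
    """Return {encoding: bytes}, with None where the text cannot be encoded."""
    results = {}
    for name in candidates:
        try:
            results[name] = text.encode(name)
        except UnicodeEncodeError:
            results[name] = None
    return results

# The byte values are printed as ASCII escapes, so this is safe on any console
for name, raw in preview_encodings("£3").items():
    print(name, raw)
```

On a setup where you control stdout, you could then write the chosen bytes with sys.stdout.buffer.write(...), or on Python 3.7+ call sys.stdout.reconfigure(encoding="utf-8"); whether either is allowed in the playground is something I haven't checked.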