0

Python print(chr(128)) UnicodeEncodeError: 'ascii' codec can't encode character '\x80' in position 0: ordinal not in range(128)

I have made a script that reads a string from a file and converts each character to its number on the ASCII table. It then adds a number that the user inputs to each of those values. In the end, the script writes a .txt file with the changed characters. The problem is that if the resulting number corresponds to a character that the ascii codec can't encode, like the one above, it raises the error shown. Is there a way to make the script always return a character?
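For context, here is a minimal sketch of that kind of script; the file names, the prompt text, and the variable names are assumptions, not taken from the original post. The key point is that chr() happily produces code points above 127, and the UnicodeEncodeError only appears if the output file is opened with an ASCII encoding, so passing an explicit encoding such as utf-8 to open() keeps the write from failing.

shift = int(input("Enter a number: "))          # the number the user types in

# read the original text; "input.txt" is an assumed file name
with open("input.txt", encoding="utf-8") as src:
    text = src.read()

# chr() accepts any code point up to 0x10FFFF, so values past 127 are fine here
shifted = "".join(chr(ord(ch) + shift) for ch in text)

# the error happens at write time: an ASCII-encoded file cannot hold '\x80' and up,
# so open the output with an encoding that can, e.g. UTF-8
with open("output.txt", "w", encoding="utf-8") as dst:
    dst.write(shifted)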

26th Nov 2018, 6:40 PM
Dimitris K
6 Answers
+ 4
This may not be a solution, but the link might be relevant, although it mainly focuses on writing to a file rather than reading one. https://python-forum.io/Thread-Writing-chr-128-to-text-file-returns-errors Hth,
26th Nov 2018, 8:14 PM
Ipang
+ 4
Happy to hear that! Will you please share the code here? Save a copy in your profile and post the code URL here, so those who face similar issues later can use what you have solved :) Thanks!
27th Nov 2018, 9:27 AM
Ipang
+ 1
Thank you! This might help! I'll try to implement the solutions and see if there is a different result.
26th Nov 2018, 8:17 PM
Dimitris K
+ 1
Just an update, it works flawlessly!
27th Nov 2018, 9:18 AM
Dimitris K
+ 1
I will :)
27th Nov 2018, 9:35 AM
Dimitris K
0
On Windows, many editors assume the default ANSI encoding (CP1252 on US Windows) instead of UTF-8 if there is no byte order mark (BOM) at the start of the file. Files store bytes, which means all Unicode text has to be encoded into bytes before it can be stored in a file. Both read_csv and to_csv take an encoding option to deal with files in different formats, so you have to specify an encoding, such as utf-8:

df.to_csv('D:\panda.csv', sep='\t', encoding='utf-8')

If you don't specify an encoding, df.to_csv defaults to ascii in Python 2 and utf-8 in Python 3. Alternatively, you can escape-encode a problematic series first and then decode it back to utf-8:

df['column-name'] = df['column-name'].map(lambda x: x.encode('unicode-escape').decode('utf-8'))

This will also rectify the problem. http://net-informations.com/ds/pd/tocsv.htm
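A small, self-contained illustration of the two fixes above; the DataFrame contents, column name, and output file names are made up for the example.

import pandas as pd

# sample data containing non-ASCII characters (made up for the example)
df = pd.DataFrame({"name": ["café", "naïve", chr(128)]})

# fix 1: write the CSV with an explicit UTF-8 encoding
df.to_csv("panda.csv", sep="\t", encoding="utf-8", index=False)

# fix 2: escape the problematic column first, then write
df["name"] = df["name"].map(lambda x: x.encode("unicode-escape").decode("utf-8"))
df.to_csv("panda_escaped.csv", sep="\t", index=False)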
5th Apr 2021, 6:31 AM
carlhyde