+ 2

About File Encoding

Suppose I want to open and read a text file that is not UTF-8 encoded. Right now, this is my code:

    myfile = open("filename.txt", "r")
    line = myfile.readline()

This gives me an error:

    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc7 in position 40: invalid continuation byte

Reading it in binary mode ("rb") doesn't help either: it doesn't raise an error, but special characters show up as escape codes such as \xc7, \xd5, \n and so on. Does anyone have any idea how I can solve this?

20th Nov 2016, 12:35 AM
Vinicius Matté Gregory
2 Answers
+ 3
The open function has an 'encoding' parameter. Use it if your file is not UTF-8 encoded:

    with open('text.txt', 'r', encoding='cp1251') as f:
        text = f.read()

https://docs.python.org/3/library/functions.html?highlight=open#open
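
To make the difference concrete, here is a minimal, self-contained sketch. It assumes the file is Latin-1 encoded; the helper name `read_text`, the sample content, and the temporary file path are made up for illustration and are not from the question. It writes a small Latin-1 file, reads it back with the matching encoding, and shows that decoding the same file as UTF-8 fails in the way described in the question.

```python
import os
import tempfile


def read_text(path, encoding):
    """Read a whole text file, decoding it with the given encoding."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


# Hypothetical sample content, stored on disk as Latin-1 bytes.
sample = "AÇÃO\nÕ\n"
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "wb") as f:
    f.write(sample.encode("latin-1"))

# Reading with the matching encoding returns the original text.
text = read_text(path, "latin-1")

# Reading the same bytes as UTF-8 raises UnicodeDecodeError,
# because byte 0xc7 is not followed by a valid continuation byte.
try:
    read_text(path, "utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True
```

The right value for `encoding` depends on how the file was produced; 'cp1251' and 'latin-1' are only two possibilities.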
20th Nov 2016, 9:38 AM
donkeyhot
0
Worked fine, thanks. Had to use "latin-1". =D
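
A short sketch of why "latin-1" can be a convenient choice here, using the bytes quoted in the question. Latin-1 maps each of the 256 byte values to exactly one character, so decoding never raises; that also means it will not complain if the file is really in some other single-byte encoding, so the result is still worth checking by eye. The variable names are illustrative only.

```python
# The bytes the question reported seeing in "rb" mode.
raw = b"\xc7\xd5\n"

# Latin-1 decodes each byte to the code point with the same number.
decoded = raw.decode("latin-1")

# Every possible byte value decodes under Latin-1 without an error.
all_bytes = bytes(range(256)).decode("latin-1")

# If the encoding is unknown, errors="replace" avoids the exception
# by substituting U+FFFD for bytes that cannot be decoded.
lossy = raw.decode("utf-8", errors="replace")

print(decoded)
print(repr(lossy))
```

The `errors` argument is also accepted by `open`, so the same fallback can be applied while reading a file directly.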
21st Nov 2016, 2:29 AM
Vinicius Matté Gregory