UnicodeDecodeError ('utf-8' codec) when reading a CSV file

Problem description:

I am reading a csv into a dataframe, changing the values in one column, writing the changed dataframe back to the same csv (to_csv), and then reading that csv again into another dataframe. The second read fails with this error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 7: invalid continuation byte

My code is:

 import pandas as pd
 df = pd.read_csv(r"D:\ss.csv")    # raw string so the backslash is taken literally
 df.columns  # output: Index(['CUSTOMER_MAILID', 'False', 'True'], dtype='object')
 df['True'] = df['True'] + 2       # change one column of type float
 df.to_csv(r"D:\ss.csv")           # write the changes back to the same .csv
 df1 = pd.read_csv(r"D:\ss.csv")   # read that csv again; this is where the error is raised

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 7: invalid continuation byte

Please suggest how I can avoid the error and read that csv into a dataframe again.

I know that somewhere I am missing an "encode = some codec type" or "decode = some type" argument while reading and writing the csv.

But I don't know exactly what should be changed, so I need help.

Known encoding

If you know the encoding of the file you want to read, pass it to read_csv:

pd.read_csv('filename.txt', encoding='encoding')

These are the possible encodings: https://docs.python.org/3/library/codecs.html#standard-encodings
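To make the error message itself less mysterious, here is a minimal sketch using only the standard library. The sample text "François" is a hypothetical stand-in for whatever the real file contains; it is chosen because in Latin-1 (and cp1252) the letter ç is the single byte 0xe7, the same byte named in the error above. Whether the actual file is Latin-1 is an assumption, not something the question confirms.

```python
# Hypothetical sample: "François" stored as Latin-1, where ç is the byte 0xe7.
raw = "François".encode("latin-1")

try:
    raw.decode("utf-8")
    utf8_ok = True
    bad_offset = None
except UnicodeDecodeError as exc:
    # UTF-8 treats 0xe7 as the start of a multi-byte sequence, but the
    # next byte ("o") is not a continuation byte, so decoding fails.
    utf8_ok = False
    bad_offset = exc.start
    print(exc)

# Decoding with the encoding the bytes were written in succeeds.
text = raw.decode("latin-1")
print(text)
```

The same idea applies to read_csv: the encoding argument has to match the encoding the bytes on disk were written in.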

If you do not know the encoding, you can try to detect it with chardet. This is not guaranteed to work; the result is an educated guess.

import chardet
import pandas as pd

with open('filename.csv', 'rb') as f:
    result = chardet.detect(f.read())  # or readline if the file is large


pd.read_csv('filename.csv', encoding=result['encoding'])
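If chardet is not available, a different and cruder technique is to try a short list of candidate encodings in order and keep the first one that decodes the whole file. The sketch below uses only the standard library. The function name first_working_encoding and the candidate list are hypothetical choices for illustration. Note that "latin-1" never raises, because every byte maps to some character, so it acts as a catch-all and a successful decode does not prove the guess is right.

```python
from pathlib import Path


def first_working_encoding(path, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate that decodes the whole file, or None."""
    raw = Path(path).read_bytes()
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue  # this candidate cannot represent the bytes, try the next
        return name
    return None
```

The returned name can then be passed as encoding= to pd.read_csv, and the same encoding can be passed to to_csv so that the file is written and read consistently.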