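001:
2022.1.6 13:30
After getting used to Linux, switching to Windows brings all sorts of unexpected problems. For example, yesterday read_csv raised an error when reading a file, and the real cause turned out to be how the path was written.
The error message is long: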
---------------------------------------------------------------------------------------
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
E:\Temp/ipykernel_2916/1776550767.py in <module>
      1 import pandas as pd
      2 # Read the practice data; the file path is './工作/test_data.csv' and the encoding is 'utf-8'
----> 3 test_data = pd.read_csv('E:\Downloads\課程素材\工作\test_data.csv', encoding='utf-8')
      4 # View test_data
      5 test_data

d:\program files\python37\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
    309 stacklevel=stacklevel,
    310 )
--> 311 return func(*args, **kwargs)
    312
    313 return wrapper

d:\program files\python37\lib\site-packages\pandas\io\parsers\readers.py in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, error_bad_lines, warn_bad_lines, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)
    584 kwds.update(kwds_defaults)
    585
--> 586 return _read(filepath_or_buffer, kwds)
    587
    588

d:\program files\python37\lib\site-packages\pandas\io\parsers\readers.py in _read(filepath_or_buffer, kwds)
    480
    481 # Create the parser.
--> 482 parser = TextFileReader(filepath_or_buffer, **kwds)
    483
    484 if chunksize or iterator:

d:\program files\python37\lib\site-packages\pandas\io\parsers\readers.py in __init__(self, f, engine, **kwds)
    809 self.options["has_index_names"] = kwds["has_index_names"]
    810
--> 811 self._engine = self._make_engine(self.engine)
    812
    813 def close(self):

d:\program files\python37\lib\site-packages\pandas\io\parsers\readers.py in _make_engine(self, engine)
   1038 )
   1039 # error: Too many arguments for "ParserBase"
-> 1040 return mapping[engine](self.f, **self.options)  # type: ignore[call-arg]
   1041
   1042 def _failover_to_python(self):

d:\program files\python37\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py in __init__(self, src, **kwds)
     49
     50 # open handles
---> 51 self._open_handles(src, kwds)
     52 assert self.handles is not None
     53

d:\program files\python37\lib\site-packages\pandas\io\parsers\base_parser.py in _open_handles(self, src, kwds)
    227 memory_map=kwds.get("memory_map", False),
    228 storage_options=kwds.get("storage_options", None),
--> 229 errors=kwds.get("encoding_errors", "strict"),
    230 )
    231

d:\program files\python37\lib\site-packages\pandas\io\common.py in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
    705 encoding=ioargs.encoding,
    706 errors=errors,
--> 707 newline="",
    708 )
    709 else:

OSError: [Errno 22] Invalid argument: 'E:\\Downloads\\課程素材\\工作\test_data.csv'
-----------------------------------------------------------------------------------------------
The key part is the last line:
OSError: [Errno 22] Invalid argument: 'E:\\Downloads\\課程素材\\工作\test_data.csv'
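The folder part of the path, which I copied from Windows, is shown with \\, while the \ before the file name, which I typed myself, is shown as a single \. That hinted at a path-format problem. (The message prints the string's repr: every real backslash is displayed as \\, and the remaining \t is a tab character that replaced the backslash and the letter t.)
Following an article I found online (see below), I tried several ways of writing the path: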
1) r'E:\Downloads\課程素材\工作\test_data.csv' ----- OK
2) 'E:\\Downloads\\課程素材\\工作\\test_data.csv' ----- OK
3) r'E:\\Downloads\\課程素材\\工作\\test_data.csv' ----- OK
4) r'E:\\Downloads\\課程素材\\工作\test_data.csv' ----- OK
5) 'E:/Downloads/課程素材/工作/test_data.csv' ----- OK
6) r'E:/Downloads/課程素材/工作/test_data.csv' ----- OK
7) r'E:\Downloads\\課程素材\/工作/test_data.csv' ----- OK
8) 'E:\Downloads\\課程素材\/工作/test_data.csv' ----- OK
9) 'E:\Downloads\\課程素材\/工作\/test_data.csv' ----- OK
10) 'E:\Downloads\\課程素材\/工作/\test_data.csv' ----- fails
11) r'E:\Downloads\\課程素材\/工作/\test_data.csv' ----- OK
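The outcomes in the list above follow from how Python reads string literals, and that can be checked without touching any file. A minimal sketch, assuming a hypothetical path C:\data\test_data.csv (not one of the paths from these notes) and using pathlib.PureWindowsPath only to compare spellings of the same path:

```python
from pathlib import PureWindowsPath

# Hypothetical path, used only for illustration.
plain = 'C:\\data\test_data.csv'      # the \t before "est_data" becomes a tab
escaped = 'C:\\data\\test_data.csv'   # every backslash escaped
raw = r'C:\data\test_data.csv'        # raw string: backslashes kept literally
forward = 'C:/data/test_data.csv'     # forward slashes need no escaping
doubled = r'C:\data\\test_data.csv'   # raw string with a repeated separator

print(repr(plain))       # the real backslash is shown doubled, the tab as \t
print('\t' in plain)     # True: the file name now starts with a tab
print(raw == escaped)    # True: two spellings of the same string

# PureWindowsPath applies Windows path rules on any OS: "/" and "\" are both
# separators, and repeated separators collapse into one.
same = PureWindowsPath(raw) == PureWindowsPath(forward) == PureWindowsPath(doubled)
print(same)              # True
```

This matches the list: only variant 10, a non-raw string with \t directly before the file name, ends up containing a tab, while doubled or mixed separators are harmless on Windows.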
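Reference:
Root of the problem: Windows uses \ as the path separator, but inside a Python string \ is the escape character, so there are two ways to write such a path in Python:
'd:\\a.txt', escaping the backslash
r'd:\a.txt', a raw string, which declares that the string is not to be escaped
Either form lets Python use \ in paths on Windows, though both are a bit cumbersome. A few cases are explained below:
Issue 1: in Python, an absolute file path can often be copied straight from Windows,
for example:
C:\Users\Administrator\Desktop\python\source.txt  This path is fine (in Python 2; in Python 3 a non-raw string literal containing \U is a syntax error, because \U begins a Unicode escape, so a raw string or forward slashes are needed)
However, if the absolute path is correct and the code still fails, the file name is the problem, for example:
C:\Users\Administrator\Desktop\python\t1.txt  This path is certain to fail, because \t is treated as an escape sequence
Python reads it as C:\Users\Administrator\Desktop\python 1.txt, with a tab character in place of \t, which is bound to fail
Written as follows, it no longer fails (the "/" form is recommended, since it avoids many such errors):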
C:/Users/Administrator/Desktop/python/t1.txt
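When a path that looks right still fails, it can help to check whether an escape sequence silently turned part of it into a control character. A small sketch; find_control_chars is a hypothetical helper written for this note, not part of Python or pandas:

```python
def find_control_chars(path):
    """Return (index, repr) pairs for control characters (code < 32) in path."""
    return [(i, repr(ch)) for i, ch in enumerate(path) if ord(ch) < 32]

# The article's failing example; the other backslashes are escaped here so
# that only \t is interpreted as an escape sequence.
broken = 'C:\\Users\\Administrator\\Desktop\\python\t1.txt'
fixed = 'C:/Users/Administrator/Desktop/python/t1.txt'

broken_hits = find_control_chars(broken)
fixed_hits = find_control_chars(fixed)

print(broken_hits)  # one hit: the tab produced by \t
print(fixed_hits)   # empty list: nothing was turned into a control character
```

Any hit means the string no longer contains the characters that were typed, so the literal should be rewritten as a raw string or with forward slashes.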
----------------
Copyright notice: this is an original article by the CSDN blogger "講測試的古古奇老師", licensed under CC 4.0 BY-SA. When reposting, please include the link to the original source and this notice.
Original link: https://blog.csdn.net/jusulysunbeamy/article/details/51290080
002:
2022.1.6 15:11
With the path format fixed, the sample CSV downloaded from the web now opens, but the CSV I exported from an Excel sheet still raises an error.
Error message:
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
E:\Temp/ipykernel_2916/486158148.py in <module>
      1 import pandas as pd
      2 # Read the practice data; the file path is './工作/test_data.csv' and the encoding is 'utf-8'
----> 3 test_data = pd.read_csv(r'I:\test_data.csv')
      4 # View test_data
      5 test_data

d:\program files\python37\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
    309 stacklevel=stacklevel,
    310 )
--> 311 return func(*args, **kwargs)
    312
    313 return wrapper

d:\program files\python37\lib\site-packages\pandas\io\parsers\readers.py in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, error_bad_lines, warn_bad_lines, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)
    584 kwds.update(kwds_defaults)
    585
--> 586 return _read(filepath_or_buffer, kwds)
    587
    588

d:\program files\python37\lib\site-packages\pandas\io\parsers\readers.py in _read(filepath_or_buffer, kwds)
    480
    481 # Create the parser.
--> 482 parser = TextFileReader(filepath_or_buffer, **kwds)
    483
    484 if chunksize or iterator:

d:\program files\python37\lib\site-packages\pandas\io\parsers\readers.py in __init__(self, f, engine, **kwds)
    809 self.options["has_index_names"] = kwds["has_index_names"]
    810
--> 811 self._engine = self._make_engine(self.engine)
    812
    813 def close(self):

d:\program files\python37\lib\site-packages\pandas\io\parsers\readers.py in _make_engine(self, engine)
   1038 )
   1039 # error: Too many arguments for "ParserBase"
-> 1040 return mapping[engine](self.f, **self.options)  # type: ignore[call-arg]
   1041
   1042 def _failover_to_python(self):

d:\program files\python37\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py in __init__(self, src, **kwds)
     67 kwds["dtype"] = ensure_dtype_objs(kwds.get("dtype", None))
     68 try:
---> 69 self._reader = parsers.TextReader(self.handles.handle, **kwds)
     70 except Exception:
     71 self.handles.close()

d:\program files\python37\lib\site-packages\pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()

d:\program files\python37\lib\site-packages\pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._get_header()

d:\program files\python37\lib\site-packages\pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows()

d:\program files\python37\lib\site-packages\pandas\_libs\parsers.pyx in pandas._libs.parsers.raise_parser_error()

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbe in position 566: invalid start byte
The last line suggests an encoding problem.
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbe in position 566: invalid start byte
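Could it be that the default encoding of a CSV saved from Excel is not UTF-8? A quick Google search confirmed it: the default is ANSI, meaning the system's Windows code page. ( https://zhidao.baidu.com/question/2014606813258805588.html )
Adding encoding = 'ANSI' to the earlier statement let the CSV open correctly. Python accepts 'ANSI' as an alias of its mbcs codec, which exists only on Windows, so this spelling will not work on Linux; naming the actual code page is the portable choice.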
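The same diagnosis can be reproduced with the standard library alone. The sketch below writes a small CSV in a non-UTF-8 encoding and then tries a list of candidate encodings until one decodes the whole file. The helper read_rows_with_fallback, the sample file, and the choice of 'gbk' as the legacy code page are all assumptions for illustration; the code page of the Excel file in these notes is not known. pd.read_csv takes the same encoding argument, so the encoding found this way can be passed to it.

```python
import csv
import os
import tempfile

def read_rows_with_fallback(path, encodings=("utf-8", "gbk")):
    """Return (rows, encoding) for the first encoding that decodes the file."""
    for enc in encodings:
        try:
            with open(path, newline="", encoding=enc) as f:
                return list(csv.reader(f)), enc
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
    raise ValueError(f"none of {encodings} could decode {path}")

# Build a sample file the way a legacy "ANSI" export might look (GBK assumed).
sample_path = os.path.join(tempfile.mkdtemp(), "sample_gbk.csv")
with open(sample_path, "w", newline="", encoding="gbk") as f:
    csv.writer(f).writerows([["姓名", "分数"], ["张三", "90"]])

rows, used_encoding = read_rows_with_fallback(sample_path)
print(used_encoding)  # gbk: utf-8 was tried first and raised UnicodeDecodeError
print(rows)
```

Order matters: UTF-8 goes first because it rejects most non-UTF-8 input, whereas a single-byte code page would accept almost any bytes and could return garbled text without raising an error.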