I am trying to write a pandas DataFrame that contains Unicode characters to JSON, but the built-in .to_json function escapes the characters. How can I fix this?
Example:
import pandas as pd
df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])
df.to_json('df.json')
This produces:
{"0":{"0":"\u03c4","1":"\u03c0"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}
instead of the expected result:
{"0":{"0":"τ","1":"π"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}
I tried adding the force_ascii=False argument:
import pandas as pd
df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])
df.to_json('df.json', force_ascii=False)
But this raises the following error:
UnicodeEncodeError: 'charmap' codec can't encode character '\u03c4' in position 11: character maps to <undefined>
Best answer
Opening the file yourself with the encoding set to utf-8, and then passing that file object to .to_json, solves the problem:
# Open the file as UTF-8 explicitly instead of relying on the platform default encoding
with open('df.json', 'w', encoding='utf-8') as file:
    df.to_json(file, force_ascii=False)
This gives the correct output:
{"0":{"0":"τ","1":"π"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}
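The 'charmap' error in the question suggests that the file was opened with the platform's default encoding (often a legacy code page on Windows) rather than UTF-8. As a variation that is not part of the original answer, the sketch below relies on .to_json returning the JSON text as a string when no path is given, and then writes that string to a UTF-8 file. The temporary directory and the variable names (escaped, literal, path, on_disk) are only illustrative. It also checks that the escaped output and the literal output decode to the same data, so the \u03c4 form is still valid JSON.

```python
import json
import os
import tempfile

import pandas as pd

df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])

# With no path argument, to_json returns the JSON document as a str.
escaped = df.to_json()                   # non-ASCII written as \uXXXX escapes
literal = df.to_json(force_ascii=False)  # non-ASCII characters kept as-is

# Both forms describe the same data once parsed.
assert json.loads(escaped) == json.loads(literal)

# Write the unescaped text to a file opened explicitly as UTF-8.
# The temporary directory is only for illustration; any path works.
path = os.path.join(tempfile.mkdtemp(), 'df.json')
with open(path, 'w', encoding='utf-8') as fh:
    fh.write(literal)

# Read it back with the same encoding to confirm the characters survived.
with open(path, encoding='utf-8') as fh:
    on_disk = fh.read()
```

Whichever variant is used, the file should also be read back with encoding='utf-8', otherwise the same default-encoding problem can reappear on the reading side.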