While running the ESPnet HKUST recipe, after correcting the paths, running run.sh produced the following error:
stage 0: Data preparation
Usage: iconv [-c] [-s] [-f fromcode] [-t tocode] [file ...]
or: iconv -l
Try 'iconv --help' for more information.
xargs: cat: terminated by signal 13
Error: ./export/corpora/LDC/LDC2005T32/ is invalid
Inspecting ./local/hkust_data_prep.sh revealed the problem:
find -L $hkust_text_dir -iname "*.txt" | grep -i "trans/train" | xargs cat |\
  iconv -f GBK -t utf-8 - | perl -e '
    while (<STDIN>) {
      @A = split(" ", $_);
      if (@A <= 1) { next; }
      if ($A[0] eq "#") { $utt_id = $A[1]; }
      if (@A >= 3) {
        $A[2] =~ s:^([AB])\:$:$1:;
        printf "%s-%s-%06.0f-%06.0f", $utt_id, $A[2], 100*$A[0] + 0.5, 100*$A[1] + 0.5;
        for($n = 3; $n < @A; $n++) { print " $A[$n]" };
        print "\n";
      }
    }
  ' | sort -k1 > $train_dir/transcripts.txt || { echo "Error: $hkust_text_dir is invalid"; exit 1; }
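In this long pipeline, the iconv command is given a trailing "-" operand, while the preceding command has no matching "-" to denote stdout. As a result, iconv apparently concluded that an argument was missing and printed its usage message. That in turn broke the pipe (hence "xargs: cat: terminated by signal 13") and triggered the final "is invalid" error. iconv reads standard input by default when no file is given, so the "-" is unnecessary: removing it fixes the problem. The script contains two such occurrences.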
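The fix can be sketched as a small shell helper. This is only an illustration: the function name `fix_iconv_dash` is hypothetical, and it assumes both occurrences in the script have exactly the form `iconv -f GBK -t utf-8 - |` shown above. It prints the corrected script to stdout and does not modify the file.

```shell
#!/bin/sh
# Hypothetical helper (not part of ESPnet or Kaldi): print the given script
# with the trailing "-" operand removed from the iconv calls.
# Assumption: every occurrence looks like "iconv -f GBK -t utf-8 - |".
fix_iconv_dash() {
  # \( ... \) captures the iconv call; the " -" before the pipe is dropped.
  sed 's/\(iconv -f GBK -t utf-8\) - |/\1 |/' "$1"
}
```

For example, `fix_iconv_dash local/hkust_data_prep.sh > /tmp/hkust_data_prep.fixed.sh` writes a patched copy (the output path is just an example) that can be compared against the original with `diff` before replacing it.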