Symptom
When running a streaming application in spark-shell, the following error appeared frequently. The same code had previously run successfully without any task errors, and neither the cluster nor the Spark configuration had been changed.
15/05/13 17:41:53 INFO scheduler.TaskSetManager: Starting task 3.0 in stage 0.0 (TID 6, slave14.cluster03, ANY, 1522 bytes)
15/05/13 17:41:53 WARN scheduler.TaskSetManager: Lost task 1.1 in stage 0.0 (TID 3, slave14.cluster03): java.lang.NoClassDefFoundError: org/apache/kafka/common/message/KafkaLZ4BlockOutputStream
at kafka.message.ByteBufferMessageSet$.decompress(ByteBufferMessageSet.scala:65)
at kafka.message.ByteBufferMessageSet$$anon$1.makeNextOuter(ByteBufferMessageSet.scala:179)
at kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:192)
at kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:146)
at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
at scala.collection.Iterator$$anon$1.hasNext(Iterator.scala:847)
at scala.collection.Iterator$$anon$19.skip(Iterator.scala:612)
at scala.collection.Iterator$$anon$19.hasNext(Iterator.scala:615)
at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator.getNext(KafkaRDD.scala:164)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:202)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:56)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.ClassNotFoundException: org.apache.kafka.common.message.KafkaLZ4BlockOutputStream
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
... 23 more
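Cause:
Searching Google and the Spark mailing list turned up nothing helpful for this error. I then wondered whether insufficient cluster resources might be causing tasks to fail in various ways. Checking the cluster showed resource utilization at 100%. When the same streaming application was later run while the cluster was idle, no errors occurred.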
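Since the error is a NoClassDefFoundError, it may also be worth confirming that the class named in the stack trace is actually loadable, in addition to checking cluster load. The following is a minimal sketch, not something from the original investigation: isLoadable is a hypothetical helper name, and the commented-out executor check assumes it is pasted into spark-shell, where sc is the SparkContext the shell provides.

```scala
// Hypothetical helper (not from the original post): returns true if the
// named class can be found by the current thread's context class loader.
def isLoadable(className: String): Boolean = {
  val loader = Option(Thread.currentThread.getContextClassLoader)
    .getOrElse(ClassLoader.getSystemClassLoader)
  try {
    // initialize = false: only locate the class, do not run static initializers
    Class.forName(className, false, loader)
    true
  } catch {
    case _: ClassNotFoundException | _: LinkageError => false
  }
}

// Class name copied from the stack trace above.
val missing = "org.apache.kafka.common.message.KafkaLZ4BlockOutputStream"
println(s"driver can load $missing: ${isLoadable(missing)}")

// Sketch of the same check on the executors, assuming spark-shell
// (the partition counts are arbitrary):
//   sc.parallelize(1 to 100, 20)
//     .mapPartitions { _ =>
//       Iterator((java.net.InetAddress.getLocalHost.getHostName, isLoadable(missing)))
//     }
//     .collect().distinct.foreach(println)
```

If some hosts were to report false while others report true, that would point to a classpath difference between nodes rather than to load alone. If every host reports true even while the cluster is busy, the resource explanation above remains the more likely one.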