WARN ReliableDeliverySupervisor: Association with remote system has failed, address is now gated for [5000] ms. Reason: [Disassociated]
Question

I am running the following statements in Spark on AWS:

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.implicits._
    case class Wiki(project: String, title: String, count: Int, byte_size: String)
    val data = sc.textFile("s3n://+++/").map(_.split(" ")).filter(_.size == 4).map(p => Wiki(p(0), p(1), p(2).trim.toInt, p(3)))
    val df = data.toDF()
    df.printSchema()
    val en_agg_df = df.filter("project = 'en'").select("title","count").groupBy("title").sum().collect()

can after about 2