Flink入门基础安装配置

Apache Flink是一个框架和分布式处理引擎,用于对无界和有界数据流进行有状态计算。

wordcound代码示例


/**
 *
 *批处理WordCount
 */

object WordCount {
  def main(args: Array[String]): Unit = {
    val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment
    val fileDS: DataSet[String] = env.readTextFile("input/word.txt")
    val wordDS: DataSet[String] = fileDS.flatMap(_.split(" "))
    val wordToOneDS: DataSet[(String, Int)] = wordDS.map((_, 1))
    val resultDS: AggregateDataSet[(String, Int)] = wordToOneDS.groupBy(0).sum(1)
    resultDS.print()
  }
}

/**
 *
 *流处理wordcount
 */

object WordCountStream {
  def main(args: Array[String]): Unit = {
    val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
    val socketDS: DataStream[String] = env.socketTextStream("hadoop104", 9999)
    val wordDS: DataStream[String] = socketDS.flatMap(_.split(" "))
    val resultDS: DataStream[(String, Int)] = wordDS.map((_,1)).keyBy(0).sum(1)
    resultDS.print("wc")
    env.execute("app")
  }
}
Flink入门基础安装配置

发表评论

电子邮件地址不会被公开。 必填项已用*标注

滚动到顶部