Apache Beam

Developer	Apache Software Foundation
Initial release	June 15, 2016
Stable release	2.61.0 (November 14, 2024)^[1]
Repository	github.com/apache/beam
Written in	Java, Python, Go
Operating system	Cross-platform
License	Apache License 2.0
Website	beam.apache.org

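Apache Beam is an open-source unified programming model for defining and executing data processing pipelines, including ETL, batch, and stream (continuous) processing.^[2] A Beam pipeline is defined using one of the provided SDKs and executed on one of Beam's supported runners (distributed processing back-ends), which include Apache Apex, Apache Flink, Apache Gearpump (incubating), Apache Samza, Apache Spark, and Google Cloud Dataflow.^[3]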
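The split described above, where a pipeline is defined once and then handed to a runner for execution, can be illustrated with a toy model in plain Python. This is only a sketch and does not use the Apache Beam SDK: `ToyPipeline`, `BatchRunner`, and `StreamRunner` are hypothetical names invented for illustration, and the sketch omits everything a real runner adds, such as distributed execution.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical names for illustration only; this is NOT the Apache Beam API.
Step = Callable[[Iterable], Iterable]


class ToyPipeline:
    """An ordered list of transforms, described without running them."""

    def __init__(self) -> None:
        self.steps: List[Step] = []

    def apply(self, step: Step) -> "ToyPipeline":
        self.steps.append(step)
        return self  # returning self allows chained apply() calls


class BatchRunner:
    """Runs the pipeline over a complete, finite input in one pass."""

    def run(self, pipeline: ToyPipeline, source: Iterable) -> List:
        data: Iterable = source
        for step in pipeline.steps:
            data = step(data)
        return list(data)


class StreamRunner:
    """Runs the same pipeline one element at a time, as elements arrive."""

    def run(self, pipeline: ToyPipeline, source: Iterable) -> Iterator:
        for element in source:
            data: Iterable = [element]
            for step in pipeline.steps:
                data = step(data)
            yield from data


# Element-wise transforms, so both runners produce the same output.
def split_words(lines: Iterable[str]) -> Iterator[str]:
    for line in lines:
        yield from line.split()


def lowercase(words: Iterable[str]) -> Iterator[str]:
    return (word.lower() for word in words)


def drop_short(words: Iterable[str]) -> Iterator[str]:
    return (word for word in words if len(word) > 2)


# The pipeline is defined once, independently of how it will be executed.
pipeline = ToyPipeline().apply(split_words).apply(lowercase).apply(drop_short)

lines = ["Beam unifies batch", "and stream processing on one model"]
batch_result = BatchRunner().run(pipeline, lines)
stream_result = list(StreamRunner().run(pipeline, iter(lines)))
```

Both runners receive the same `pipeline` object; only the execution strategy differs, which is the idea behind choosing a runner separately from writing the pipeline.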

It has been called the "uber-API for big data".^[4]

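History

Apache Beam^[3] is one implementation of the Dataflow model paper.^[5] The Dataflow model builds on earlier work on distributed processing abstractions at Google, in particular FlumeJava^[6] and MillWheel.^[7]^[8]

In 2014, Google released an open SDK implementation of the Dataflow model, together with environments for executing dataflows locally (non-distributed) and in the Google Cloud Platform service.

In 2016, Google donated the core SDK, the implementation of a local runner, and a set of IOs (data connectors) for accessing Google Cloud Platform data services to the Apache Software Foundation. Other companies and community members have contributed runners for existing distributed execution platforms, as well as new IOs that integrate Beam runners with existing databases, key-value stores, and messaging systems. In addition, new DSLs have been proposed to support domain-specific needs on top of the Beam model.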

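Timeline

Version	Release date	Status
2.19.0	2020-02-04	Latest version listed in this table
2.18.0	2020-01-23	Old version, no longer supported
2.17.0	2020-01-06	Old version, no longer supported
2.16.0	2019-10-07	Old version, no longer supported
2.15.0	2019-08-22	Old version, no longer supported
2.14.0	2019-08-01	Old version, no longer supported
2.13.0	2019-05-22	Old version, no longer supported
2.12.0	2019-04-25	Old version, no longer supported
2.11.0	2019-02-26	Old version, no longer supported
2.10.0	2019-02-01	Old version, no longer supported
2.9.0	2018-12-13	Old version, no longer supported
2.8.0	2018-10-29	Old version, no longer supported
2.7.0	2018-10-03	Old version, no longer supported
2.6.0	2018-08-08	Old version, no longer supported
2.5.0	2018-06-26	Old version, no longer supported
2.4.0	2018-03-20	Old version, no longer supported
2.3.0	2018-01-30	Old version, no longer supported
2.2.0	2017-12-02	Old version, no longer supported
2.1.0	2017-08-23	Old version, no longer supported
2.0.0	2017-05-17	Old version, no longer supported
0.6.0	2017-03-11	Old version, no longer supported
0.5.0	2017-02-02	Old version, no longer supported
0.4.0	2016-12-29	Old version, no longer supported
0.3.0	2016-10-31	Old version, no longer supported
0.2.0	2016-08-08	Old version, no longer supported
0.1.0	2016-06-15	Old version, no longer supported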

See also

List of Apache Software Foundation projects

References

[1] Release 2.61.0. 2024-11-14 [retrieved 2024-12-21].
[2] Woodie, Alex. Apache Beam's Ambitious Goal: Unify Big Data Development. Datanami. 2016-04-22 [retrieved 2016-08-04]. (Archived from the original on 2016-08-13.)
[3] Cloud Dataflow - Batch & Stream Data Processing. [retrieved 2018-12-21]. (Archived from the original on 2018-12-23.)
[4] Pointer, Ian. Apache Beam wants to be uber-API for big data. InfoWorld. 2016-04-14 [retrieved 2018-12-21]. (Archived from the original on 2018-12-22.)
[5] Akidau, Tyler; Schmidt, Eric; Whittle, Sam; Bradshaw, Robert; Chambers, Craig; Chernyak, Slava; Fernández-Moctezuma, Rafael J.; Lax, Reuven; McVeety, Sam. The dataflow model (PDF). Proceedings of the VLDB Endowment. 2015-08-01, 8 (12): 1792–1803 [retrieved 2016-08-04]. doi:10.14778/2824032.2824076. (Archived (PDF) from the original on 2016-03-04.)
[6] Chambers, Craig; Raniwala, Ashish; Perry, Frances; Adams, Stephen; Henry, Robert R.; Bradshaw, Robert; Weizenbaum, Nathan. FlumeJava: Easy, Efficient Data-parallel Pipelines (PDF). Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation (ACM). 2010-01-01: 363–375 [retrieved 2016-08-04]. doi:10.1145/1806596.1806638. (Archived (PDF) from the original on 2016-09-23.)
[7] Akidau, Tyler; Whittle, Sam; Balikov, Alex; Bekiroğlu, Kaya; Chernyak, Slava; Haberman, Josh; Lax, Reuven; McVeety, Sam; Mills, Daniel. MillWheel (PDF). Proceedings of the VLDB Endowment. 2013-08-27, 6 (11): 1033–1044 [retrieved 2016-08-04]. doi:10.14778/2536222.2536229. (Archived (PDF) from the original on 2016-02-01.)
[8] Pointer, Ian. Apache Beam wants to be uber-API for big data. InfoWorld. [retrieved 2016-08-04]. (Archived from the original on 2016-08-03.)
