What is Apache Beam?
I was going through Apache posts and found a new term called Beam. Can anybody explain what Apache Beam is? I tried to Google it but was unable to get a clear answer.
Apache Beam is an open source, unified model for defining and executing both batch and streaming data-parallel processing pipelines, as well as a set of language-specific SDKs for constructing pipelines and runtime-specific Runners for executing them.
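To make the "define once, run on different runners" idea a bit more concrete, here is a toy sketch in plain Python. It is not the Beam SDK and does not use any Beam API: `ToyPipeline`, `run_batch`, and `run_streaming` are hypothetical names invented for this illustration, and a real runner does far more (distribution, windowing, fault tolerance). It only shows that the same pipeline definition can be handed to a batch-style executor or a streaming-style executor.

```python
from typing import Callable, Iterable, Iterator, List

# A step maps one input element to zero or more output elements.
Step = Callable[[str], Iterable[str]]


class ToyPipeline:
    """Hypothetical stand-in for a pipeline: an ordered list of steps."""

    def __init__(self) -> None:
        self.steps: List[Step] = []

    def apply(self, step: Step) -> "ToyPipeline":
        # Record the step; nothing runs until a runner executes the pipeline.
        self.steps.append(step)
        return self

    def process(self, element: str) -> List[str]:
        # Push a single element through every step, in order.
        current = [element]
        for step in self.steps:
            current = [out for item in current for out in step(item)]
        return current


def run_batch(pipeline: ToyPipeline, dataset: List[str]) -> List[str]:
    """Toy 'batch runner': the whole bounded input is available up front."""
    results: List[str] = []
    for element in dataset:
        results.extend(pipeline.process(element))
    return results


def run_streaming(pipeline: ToyPipeline, stream: Iterable[str]) -> Iterator[str]:
    """Toy 'streaming runner': elements are processed lazily as they arrive."""
    for element in stream:
        yield from pipeline.process(element)


# The pipeline is defined once, independently of how it will be executed.
pipeline = (
    ToyPipeline()
    .apply(lambda line: line.split())                              # split lines into words
    .apply(lambda word: [word.lower()] if len(word) > 3 else [])   # keep longer words
)

lines = ["Apache Beam unifies", "batch and STREAMING"]
batch_result = run_batch(pipeline, lines)
stream_result = list(run_streaming(pipeline, iter(lines)))
```

Both executors produce the same words from the same pipeline definition. In Beam itself, the SDKs play the role of the pipeline-building code and the Runners play the role of the two executor functions.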
History: The model behind Beam evolved from a number of internal Google data processing projects, including MapReduce, FlumeJava, and MillWheel. This model was known as the “Dataflow Model” and was first implemented as Google Cloud Dataflow, including a Java SDK on GitHub for writing pipelines and a managed service for executing them on Google Cloud Platform. Others in the community began writing extensions, including a Spark Runner, a Flink Runner, and a Scala SDK. In January 2016, Google and a number of partners submitted the Dataflow Programming Model and SDKs portion as an Apache Incubator Proposal, under the name Apache Beam (unified Batch + strEAM processing). Apache Beam graduated from incubation in December 2016.
Additional resources for learning the Beam model:
- The Apache Beam website
- The VLDB 2015 paper (using the original naming, Dataflow Model)
- The Streaming 101 and Streaming 102 posts on O’Reilly’s Radar site
- A Beam podcast on Software Engineering Radio