I need to design a pipeline for ingesting streaming data (video, audio, and telemetry) from remote video cameras into Cloud AI/ML services. Cameras can be wired or wireless, so the connection can be unstable. The video from each camera should be processed separately. Telemetry and audio may be added in the future; for now, it's only the video stream. I'm looking for a solution on GCP. Thanks!
Disclosure: I work on Beam and Dataflow.
I have seen Apache Beam and Cloud Dataflow used to develop pipelines processing data from IoT devices via PubSub. Beam also has connectors for Cloud AI services, like the Vision API[1]. If you can upload data to Cloud Storage, or stream it via PubSub, Beam has appropriate connectors for all of those.
I have no exposure to the services around Cloud IoT, but I believe they all work via PubSub, so they should integrate well with Dataflow.
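To make the idea a bit more concrete, here is a minimal sketch of the Cloud Storage plus PubSub route in the Beam Python SDK. It assumes (my assumption, not something from the question) that each camera uploads short video segments to a bucket under a hypothetical `<camera_id>/<segment>.mp4` layout, and that the bucket publishes Cloud Storage notifications to a PubSub topic. The helper `key_by_camera`, the `run` function, and the 60-second window are illustrative choices, not a recommended configuration.

```python
def key_by_camera(attributes):
    """Turn Cloud Storage notification attributes into (camera_id, gcs_uri).

    Assumes the hypothetical object layout <camera_id>/<segment>.mp4.
    """
    bucket = attributes["bucketId"]
    object_id = attributes["objectId"]
    # The first path component is treated as the camera ID (assumption).
    camera_id = object_id.split("/", 1)[0]
    return camera_id, f"gs://{bucket}/{object_id}"


def run(subscription, pipeline_args=None):
    """Stream upload notifications and group segment URIs per camera."""
    # Imported here so key_by_camera can be tried out without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions
    from apache_beam.transforms.window import FixedWindows

    options = PipelineOptions(pipeline_args, streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            # with_attributes=True yields messages exposing .attributes.
            | "ReadNotifications" >> beam.io.ReadFromPubSub(
                subscription=subscription, with_attributes=True)
            | "KeyByCamera" >> beam.Map(
                lambda msg: key_by_camera(msg.attributes))
            # Illustrative 60-second windows so grouping can emit results.
            | "Window" >> beam.WindowInto(FixedWindows(60))
            | "GroupPerCamera" >> beam.GroupByKey()
            # Placeholder sink; a real pipeline would process the segments.
            | "Log" >> beam.Map(print)
        )
```

Keying by camera ID before the `GroupByKey` is what keeps each camera's segments separate. Uploading segments as files also means a camera with a flaky connection can retry an upload without the pipeline seeing a half-finished stream. The `Log` step is where a call to an AI service, such as the connector in [1], would go.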
Check the video in [2]: it covers a use case that seems very similar to yours. They don't go into implementation details much, but it should give you an idea of the general architecture.
[1] https://beam.apache.org/releases/pydoc/2.25.0/apache_beam.ml.gcp.visionml.html
Pros of Apache Beam
- Open-source (5)
- Cross-platform (5)
- Portable (2)
- Unified batch and stream processing (2)
Pros of Google Cloud Dataflow
- Unified batch and stream processing (2)
- Autoscaling (2)
- Fully managed (2)
- Throughput Transparency (1)