AWS Data Pipeline is a web service that provides a simple management system for data-driven workflows. Using AWS Data Pipeline, you define a pipeline composed of the “data sources” that contain your data, the “activities” or business logic such as EMR jobs or SQL queries, and the “schedule” on which your business logic executes. For example, you could define a job that, every hour, runs an Amazon Elastic MapReduce (Amazon EMR)–based analysis on that hour’s Amazon Simple Storage Service (Amazon S3) log data, loads the results into a relational database for future lookup, and then automatically sends you a daily summary email.

Amazon Kinesis can collect and process hundreds of gigabytes of data per second from hundreds of thousands of sources, allowing you to easily write applications that process information in real time, from sources such as website click-streams, marketing and financial information, manufacturing instrumentation and social media, and operational logs and metering data.
You can find (and use) a variety of popular AWS Data Pipeline tasks in the AWS Management Console’s template section:
- Hourly analysis of Amazon S3-based log data
- Daily replication of Amazon DynamoDB data to Amazon S3
- Periodic replication of on-premises JDBC database tables into Amazon RDS

Amazon Kinesis:
- Real-time processing: Amazon Kinesis enables you to collect and analyze information in real time, allowing you to answer questions about the current state of your data, from inventory levels to stock trade frequencies, rather than having to wait for an out-of-date report.
- Easy to use: You can create a new stream, set the throughput requirements, and start streaming data quickly and easily. Amazon Kinesis automatically provisions and manages the storage required to reliably and durably collect your data stream.
- High throughput, elastic: Amazon Kinesis seamlessly scales to match the data throughput rate and volume of your data, from megabytes to terabytes per hour. Amazon Kinesis will scale up or down based on your needs.
- Integration with Amazon S3, Amazon Redshift, and Amazon DynamoDB: With Amazon Kinesis, you can reliably collect, process, and transform all of your data in real time before delivering it to data stores of your choice, where it can be used by existing or new applications. Connectors enable integration with Amazon S3, Amazon Redshift, and Amazon DynamoDB.
- Kinesis applications: Amazon Kinesis provides developers with client libraries that enable the design and operation of real-time data processing applications. Just add the Amazon Kinesis Client Library to your Java application and it will be notified when new data is available for processing.
- Low cost: Amazon Kinesis is cost-efficient for workloads of any scale. You pay as you go, and only for the resources you use. You can get started by provisioning low-throughput streams and pay a low hourly rate for the throughput you need.
Statistics | AWS Data Pipeline | Amazon Kinesis
Stacks | 94 | 797
Followers | 398 | 604
Votes | 1 | 9
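
The AWS Data Pipeline description above names three building blocks: data sources, activities, and a schedule. As a rough illustration, the hourly log-analysis example could be laid out as a pipeline definition like the one below. This is a minimal sketch, not an official template: the object ids, the bucket layout, and the JAR path in `step` are hypothetical, and the field names (`period`, `directoryPath`, `runsOn`, `step`) reflect my understanding of the pipeline definition file format and should be checked against the AWS documentation before use.

```python
import json


def build_hourly_pipeline(log_bucket: str, start: str) -> dict:
    """Build a pipeline definition with a schedule, a data source, and an activity."""
    schedule_ref = {"ref": "HourlySchedule"}

    # The "schedule": how often the business logic runs.
    schedule = {
        "id": "HourlySchedule",
        "type": "Schedule",
        "period": "1 hour",
        "startDateTime": start,
    }
    # The "data source": log files in S3 (hypothetical bucket layout).
    logs = {
        "id": "HourlyLogs",
        "type": "S3DataNode",
        "schedule": schedule_ref,
        "directoryPath": f"s3://{log_bucket}/logs/",
    }
    # The compute resource the activity runs on.
    cluster = {
        "id": "AnalysisCluster",
        "type": "EmrCluster",
        "schedule": schedule_ref,
    }
    # The "activity": an EMR job over the logs (the JAR path is a placeholder).
    analysis = {
        "id": "LogAnalysis",
        "type": "EmrActivity",
        "schedule": schedule_ref,
        "runsOn": {"ref": "AnalysisCluster"},
        "input": {"ref": "HourlyLogs"},
        "step": f"s3://{log_bucket}/jars/analysis.jar,--hourly",
    }
    return {"objects": [schedule, logs, cluster, analysis]}


def unresolved_refs(definition: dict) -> list:
    """Return every {"ref": ...} target that no object in the definition declares."""
    ids = {obj["id"] for obj in definition["objects"]}
    missing = []
    for obj in definition["objects"]:
        for value in obj.values():
            if isinstance(value, dict) and "ref" in value and value["ref"] not in ids:
                missing.append(value["ref"])
    return missing


if __name__ == "__main__":
    definition = build_hourly_pipeline("example-bucket", "2024-01-01T00:00:00")
    print(json.dumps(definition, indent=2))
    print("unresolved refs:", unresolved_refs(definition))
```

The `unresolved_refs` helper is only a local sanity check that every reference points at a declared object; it does not replace the validation the service itself performs.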
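
The Kinesis feature list above mentions setting a stream's throughput requirements. One way to reason about that is a back-of-the-envelope shard estimate, sketched below. The per-shard limits used here (1 MB/s or 1,000 records/s for writes, 2 MB/s for shared reads) are assumptions based on the commonly documented quotas for provisioned streams; the function name and parameters are hypothetical, and the current limits should be confirmed before sizing a real stream.

```python
import math

# Assumed per-shard limits for a provisioned stream (verify against current quotas).
WRITE_MB_PER_SEC = 1.0
WRITE_RECORDS_PER_SEC = 1000
READ_MB_PER_SEC = 2.0


def shards_needed(records_per_sec: float, avg_record_kb: float, consumers: int = 1) -> int:
    """Estimate the shard count from write volume, record rate, and read fan-out."""
    write_mb_per_sec = records_per_sec * avg_record_kb / 1024

    by_write_bytes = math.ceil(write_mb_per_sec / WRITE_MB_PER_SEC)
    by_write_records = math.ceil(records_per_sec / WRITE_RECORDS_PER_SEC)
    # Each consumer reads the full stream, so read volume scales with consumer count.
    by_read_bytes = math.ceil(write_mb_per_sec * consumers / READ_MB_PER_SEC)

    # The tightest constraint decides; a stream always has at least one shard.
    return max(1, by_write_bytes, by_write_records, by_read_bytes)


if __name__ == "__main__":
    print(shards_needed(500, 1))
    print(shards_needed(5000, 2))
    print(shards_needed(1000, 1, consumers=4))
```

The estimate takes the maximum of the three constraints because a stream is limited by whichever one it hits first, whether that is bytes written, records written, or bytes read.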

Google Cloud Dataflow is a unified programming model and a managed service for developing and executing a wide range of data processing patterns including ETL, batch computation, and continuous computation. Cloud Dataflow frees you from operational tasks like resource management and performance optimization.

Oneprofile syncs customer profiles and events across all the tools a company uses. Instead of each system having its own version of a customer, Oneprofile keeps everything in sync automatically — CRMs, analytics, support, marketing. When customer data changes anywhere, it’s reflected everywhere, instantly. No manual pipelines, no broken integrations — just the right data in the right place.

AWS Snowball Edge is a 100 TB data transfer device with on-board storage and compute capabilities. You can use Snowball Edge to move large amounts of data into and out of AWS, as a temporary storage tier for large local datasets, or to support local workloads in remote or offline locations.

REST API for real-time SEC filings data. Access 10-K, 10-Q, 8-K filings and Form 4 insider transactions as they hit EDGAR. Filter by ticker, form type, or date range. Build alerts, power dashboards, or integrate into trading systems. Free tier available.

Wiseek is a SaaS platform that processes real-time SEC filings into structured, queryable data for analysts, developers, and research teams.

Integrate existing data sources and take data-driven decisions about the natural and built environment. Nexus is an online platform that provides governments, NGOs, utilities and consultants with a digital twin using real-time connections with public and private data sources. Calculation models can easily be connected to the platform to enable continuous analysis of the integrated data. With Nexus, organizations can detect and monitor changes in the physical environment, perform operational forecasting, share data with partner organizations, evaluate spatial policies and schedule data-driven maintenance.

Offers live, customizable weather radar maps with real-time AI tornado detection and storm tracking powered by Level 2 Doppler data.

It is an elegant and simple HTTP library for Python, built for human beings. It allows you to send HTTP/1.1 requests extremely easily. There’s no need to manually add query strings to your URLs, or to form-encode your POST data.

Amazon Kinesis Firehose is the easiest way to load streaming data into AWS. It can capture and automatically load streaming data into Amazon S3 and Amazon Redshift, enabling near real-time analytics with existing business intelligence tools and dashboards you’re already using today.

It is a .NET library that can read/write Office formats without Microsoft Office installed. No COM+, no interop.