Big Data

Build Spark Structured Streaming applications with the open source connector for Amazon Kinesis Data Streams

Apache Spark is a powerful big data engine used for large-scale data analytics. Its in-memory computing makes it great for iterative algorithms and interactive queries. You can use Apache Spark to process streaming data from a variety of streaming sources, including Amazon Kinesis Data Streams for use cases like clickstream analysis, fraud detection, and more. Kinesis Data Streams is a

Read More »
Big Data

Run Kinesis Agent on Amazon ECS

Kinesis Agent is a standalone Java software application that offers a straightforward way to collect and send data to Amazon Kinesis Data Streams and Amazon Kinesis Data Firehose. The agent continuously monitors a set of files and sends new data to the desired destination. The agent handles file rotation, checkpointing, and retry upon failures. It delivers all of your data

Read More »
Big Data

Amazon Kinesis Data Streams: celebrating a decade of real-time data innovation

Data is a key strategic asset for every organization, and every company is a data business at its core. However, in many organizations, data is typically spread across a number of different systems such as software as a service (SaaS) applications, operational databases, and data warehouses. Such data silos make it difficult to get unified views of the data in

Read More »
Ai

Implement real-time personalized recommendations using Amazon Personalize

At a basic level, Machine Learning (ML) technology learns from data to make predictions. Businesses use their data with an ML-powered personalization service to elevate their customer experience. This approach allows businesses to use data to derive actionable insights and help grow their revenue and brand loyalty. Amazon Personalize accelerates your digital transformation with ML, making it easier to integrate

Read More »
Big Data

Processing large records with Amazon Kinesis Data Streams

In today’s digital era, data is abundant and constantly flowing. Businesses across industries are seeking ways to harness this wealth of information to gain valuable insights and make real-time decisions. To meet this need, AWS offers Amazon Kinesis Data Streams, a powerful and scalable real-time data streaming service. With Kinesis Data Streams, you can effortlessly collect, process, and analyze streaming

Read More »
Big Data

Non-JSON ingestion using Amazon Kinesis Data Streams, Amazon MSK, and Amazon Redshift Streaming Ingestion

Organizations are grappling with the ever-expanding spectrum of data formats in today’s data-driven landscape. From Avro’s binary serialization to the efficient and compact structure of Protobuf, the landscape of data formats has expanded far beyond the traditional realms of CSV and JSON. As organizations strive to derive insights from these diverse data streams, the challenge lies in seamlessly integrating them

Read More »
Big Data

How Chime Financial uses AWS to build a serverless stream analytics platform and defeat fraudsters

This is a guest post by Khandu Shinde, Staff Software Engineer and Edward Paget, Senior Software Engineering at Chime Financial. Chime is a financial technology company founded on the premise that basic banking services should be helpful, easy, and free. Chime partners with national banks to design member first financial products. This creates a more competitive market with better, lower-cost

Read More »
Big Data

Perform Amazon Kinesis load testing with Locust

Building a streaming data solution requires thorough testing at the scale it will operate in a production environment. Streaming applications operating at scale often handle large volumes of up to GBs per second, and it’s challenging for developers to simulate high-traffic Amazon Kinesis-based applications to generate such load easily. Amazon Kinesis Data Streams and Amazon Kinesis Data Firehose are capable

Read More »

Create an Apache Hudi-based near-real-time transactional data lake using AWS DMS, Amazon Kinesis, AWS Glue streaming ETL, and data visualization using Amazon QuickSight

With the rapid growth of technology, more and more data volume is coming in many different formats—structured, semi-structured, and unstructured. Data analytics on operational data at near-real time is becoming a common need. Due to the exponential growth of data volume, it has become common practice to replace read replicas with data lakes to have better scalability and performance. In

Read More »

Amazon Kinesis Data Streams on-demand capacity mode now scales up to 1 GB/second ingest capacity

Amazon Kinesis Data Streams is a serverless data streaming service that makes it easy to capture, process, and store streaming data at any scale. As customers collect and stream more types of data, they have asked for simpler, elastic data streams that can handle variable and unpredictable data traffic. In November 2021, Amazon Web Services launched the on-demand capacity mode

Read More »
Big Data

A side-by-side comparison of Apache Spark and Apache Flink for common streaming use cases

Apache Flink and Apache Spark are both open-source, distributed data processing frameworks used widely for big data processing and analytics. Spark is known for its ease of use, high-level APIs, and the ability to process large amounts of data. Flink shines in its ability to handle processing of data streams in real-time and low-latency stateful computations. Both support a variety

Read More »
Big Data

Near-real-time analytics using Amazon Redshift streaming ingestion with Amazon Kinesis Data Streams and Amazon DynamoDB

Amazon Redshift is a fully managed, scalable cloud data warehouse that accelerates your time to insights with fast, easy, and secure analytics at scale. Tens of thousands of customers rely on Amazon Redshift to analyze exabytes of data and run complex analytical queries, making it the widely used cloud data warehouse. You can run and scale analytics in seconds on

Read More »
Big Data

Migrate from Amazon Kinesis Data Analytics for SQL Applications to Amazon Kinesis Data Analytics Studio

Amazon Kinesis Data Analytics makes it easy to transform and analyze streaming data in real time. In this post, we discuss why AWS recommends moving from Kinesis Data Analytics for SQL Applications to Amazon Kinesis Data Analytics for Apache Flink to take advantage of Apache Flink’s advanced streaming capabilities. We also show how to use Kinesis Data Analytics Studio to

Read More »
Big Data

Join a streaming data source with CDC data for real-time serverless data analytics using AWS Glue, AWS DMS, and Amazon DynamoDB

Customers have been using data warehousing solutions to perform their traditional analytics tasks. Recently, data lakes have gained lot of traction to become the foundation for analytical solutions, because they come with benefits such as scalability, fault tolerance, and support for structured, semi-structured, and unstructured datasets. Data lakes are not transactional by default; however, there are multiple open-source frameworks that

Read More »
Big Data

Real-time anomaly detection via Random Cut Forest in Amazon Kinesis Data Analytics

Real-time anomaly detection describes a use case to detect and flag unexpected behavior in streaming data as it occurs. Online machine learning (ML) algorithms are popular for this use case because they don’t require any explicit rules and are able to adapt to a changing baseline, which is particularly useful for continuous streams of data where incoming data changes continuously

Read More »
Big Data

Build a real-time GDPR-aligned Apache Iceberg data lake

Data lakes are a popular choice for today’s organizations to store their data around their business activities. As a best practice of a data lake design, data should be immutable once stored. But regulations such as the General Data Protection Regulation (GDPR) have created obligations for data operators who must be able to erase or update personal data from their

Read More »
Big Data

How Vanguard made their technology platform resilient and efficient by building cross-Region replication for Amazon Kinesis Data Streams

This is a guest post co-written with Raghu Boppanna from Vanguard.  At Vanguard, the Enterprise Advice line of business improves investor outcomes through digital access to superior, personalized, and affordable financial advice. They made it possible, in part, by driving economies of scale across the globe for investors with a highly resilient and efficient technical platform. Vanguard opted for a

Read More »
Big Data

Build highly available streams with Amazon Kinesis Data Streams

Many use cases are moving towards a real-time data strategy due to demand for real-time insights, low-latency response times, and the ability to adapt to the changing needs of end-users. For this type of workload, you can use Amazon Kinesis Data Streams to seamlessly provision, store, write, and read data in a streaming fashion. With Kinesis Data Streams, there are

Read More »
Big Data

Build near real-time logistics dashboards using Amazon Redshift and Amazon Managed Grafana for better operational intelligence

Amazon Redshift is a fully managed data warehousing service that is currently helping tens of thousands of customers manage analytics at scale. It continues to lead price-performance benchmarks, and separates compute and storage so each can be scaled independently and you only pay for what you need. It also eliminates data silos by simplifying access to your operational databases, data

Read More »
Big Data

Near-real-time fraud detection using Amazon Redshift Streaming Ingestion with Amazon Kinesis Data Streams and Amazon Redshift ML

The importance of data warehouses and analytics performed on data warehouse platforms has been increasing steadily over the years, with many businesses coming to rely on these systems as mission-critical for both short-term operational decision-making and long-term strategic planning. Traditionally, data warehouses are refreshed in batch cycles, for example, monthly, weekly, or daily, so that businesses can derive various insights

Read More »
Big Data

LaunchDarkly’s journey from ingesting 1 TB to 100 TB per day with Amazon Kinesis Data Streams

This post was co-written with Mike Zorn, Software Architect at LaunchDarkly as the lead author. LaunchDarkly’s feature management platform enables customers to release features and measure their impact. As part of this platform, SDKs gather event data, and the event ingestion platform consumes and analyzes this data to measure impact. As the platform launched and customer adoption increased, we had

Read More »

Accessibility Dashboard

Accessibility settings have been reset

Help = available voice commands

Hide help = available voice commands

Scroll down = available voice commands

Scroll up = available voice commands

Go to top = available voice commands

Go to bottom = available voice commands

Tab = available voice commands

Tab back = available voice commands

Show numbers = available voice commands

Hide numbers = available voice commands

Clear input = available voice commands

Enter = available voice commands

Reload = available voice commands

Stop = available voice commands

Exit = available voice commands