RisingWave

RisingWave · 2026-04-29T15:07:28.426Z

Build a Streaming Lakehouse with Kafka + RisingWave + Apache Iceberg (via Lakekeeper) + DuckDB

Most streaming pipelines stop at ingestion. Most lakehouses stop at storage. The real value comes when both work together as one system.

Here is the streaming lakehouse pattern that makes it work:

Kafka → RisingWave → Iceberg → DuckDB

→ Kafka streams events into RisingWave
→ RisingWave writes continuously into Iceberg via the Lakekeeper catalog
→ DuckDB queries the same Iceberg table directly
→ No data copies. No proprietary lock-in.

What you get:
✅ Real-time ingestion from Kafka
✅ Open Iceberg storage on object storage
✅ Multi-engine access: DuckDB, Spark, Trino
✅ Continuous streaming writes with transactional commits
✅ Managed table lifecycle through RisingWave

This is what a streaming lakehouse should look like. Real-time ingest. Open storage. Query from any Iceberg-compatible engine.

Want to build one? Join our community: https://lnkd.in/eW_gjzqx

Software Development

San Francisco, California 14,416 followers

The live data company. Powering humans and agents with what's happening now.

About us

The live data company. Powering humans and agents with what's happening now. Talk to us: https://risingwave.com/slack.

Website: http://www.risingwave.com/
Industry: Software Development
Company size: 51-200 employees
Headquarters: San Francisco, California
Type: Privately Held
Founded: 2021

Products

RisingWave

Event Stream Processing (ESP) Software

RisingWave is an event stream processing and management platform. It offers a unified experience for real-time data ingestion, stream processing, data persistence, and low-latency serving.

Locations

Primary: 95 3rd St, 2nd Floor, San Francisco, California 94103, US

16 Collyer Quay, Downtown Core, Central Region 049318, SG

Updates

RisingWave

14h
Happening THIS THURSDAY! We’re hosting a webinar on real-time graph analytics for cybersecurity with our CEO Yingjun Wu and PuppyGraph CEO Weimo Liu. They’ll explore how streaming data, Apache Iceberg, and real-time graph queries come together to power modern security analysis through an end-to-end demo. If you work on cybersecurity, observability, or real-time infrastructure analytics, this session is for you. Looking forward to seeing you there! Register here: https://luma.com/kv0l3y2t

RisingWave

19h
An event stream is infinite. That is exactly what makes querying it hard.

So how do streaming systems like RisingWave and Flink run aggregations, joins, and rankings on data streams that never stop? The answer is windows.

A window divides an unbounded stream into finite time intervals so the system can continuously compute results. It is the bridge between infinite data and the finite questions you actually want to answer.

Both systems support three core window types: Tumble, Hop, and Session. Here is when to use each:

➡️ Tumble windows: fixed-size, non-overlapping. Each event belongs to exactly one window. Best for real-time dashboards, periodic reporting, and time-bucketed metrics.
➡️ Hop windows: fixed-size but overlapping. Events can belong to multiple windows. Best for moving averages, rolling KPIs, and sliding trend analysis.
➡️ Session windows: dynamic, separated by inactivity gaps. Best for user behavior analysis, IoT activity bursts, and clickstream sessions.

The mental model is simple:

Stream → Key → Time Window → Aggregate → Continuous Results

The key takeaway: windows transform infinite streams into structured time-based computations.

✅ Tumble for periodic analytics
✅ Hop for rolling analytics
✅ Session for activity-driven grouping

Master these three, and you have covered most real-world streaming use cases.

Want to see how window functions power stream processing with PostgreSQL-style SQL?
👉 Join the RisingWave community: https://lnkd.in/eW_gjzqx
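To make the three window types concrete, here is a minimal Python sketch of how an event timestamp could be assigned to windows. It is an illustration only, not how RisingWave or Flink implement windowing: the function names, the integer-second timestamps, and the half-open `[start, end)` convention are all assumptions made for this example.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # half-open interval [start, end), in seconds (assumed convention)


def tumble_window(ts: int, size: int) -> Window:
    """Return the single fixed-size, non-overlapping window that contains ts."""
    start = ts - ts % size
    return (start, start + size)


def hop_windows(ts: int, size: int, slide: int) -> List[Window]:
    """Return every window of length `size`, started every `slide` seconds, that contains ts."""
    windows = []
    start = ts - ts % slide  # latest window start at or before ts
    while start > ts - size:  # the window still covers ts
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


def session_windows(timestamps: List[int], gap: int) -> List[List[int]]:
    """Group timestamps into sessions; a new session starts when the pause exceeds `gap`."""
    sessions: List[List[int]] = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)  # still inside the current burst of activity
        else:
            sessions.append([ts])  # inactivity gap exceeded: open a new session
    return sessions
```

With a 60-second size and a 30-second slide, an event at second 125 lands in one tumble window but two hop windows, which is the overlap that makes hop windows suitable for rolling metrics.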

RisingWave

1d
S3-native is the future of data infra, but read/write fees and 100 to 200 ms latency can add costs and impact performance. Here's the Rust hybrid cache that cuts costs by 90%.

As more data systems go S3-native (and Apache Iceberg-native), the real cost shows up in AWS read/write fees and latency. Storage may be just $23 / TB / month, but those delays and the charges add up fast.

The fix? Hybrid caching.

Meet foyer! A fast, hybrid in-memory + disk cache written in Rust, supporting pluggable algorithms and high concurrency.

foyer, built by Yao Meng, one of the engineers who worked at RisingWave, is already in production at:
✅ RisingWave – real-time SQL stream processing
✅ Chroma – vector DB for LLM apps
✅ ZeroFS – filesystem that makes S3 your primary storage
✅ SlateDB – an embedded object-storage engine also used by OpenData from Responsive

Real-world results of foyer:
Throughput: 800 MB/s to 75 MB/s
Operation rate: 10× reduction
S3 costs: ↓ 90%
Freshness: memory-speed reads with 10 to 100 ms lag, even in mission-critical trading

This isn’t theory. foyer, written in Rust, is production-ready.

Want to see what smart caching can do in an S3-native stream-processing world?
👉 Join our RisingWave community: https://lnkd.in/dVD8ifPH

#Caching #StreamProcessing #S3 #CloudStorage
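The idea behind hybrid caching can be sketched in a few lines of Python. This is a toy model and not foyer's API or design: the `HybridCache` class, its `fetch` callback (standing in for an S3 read), the plain LRU eviction, and the in-process dictionary that plays the role of the disk tier are all assumptions for illustration.

```python
from collections import OrderedDict
from typing import Callable


class HybridCache:
    """Toy two-tier cache: a small memory tier that spills to a larger 'disk' tier."""

    def __init__(self, fetch: Callable[[str], bytes], memory_capacity: int, disk_capacity: int):
        self._fetch = fetch  # hypothetical stand-in for an object-store read
        self._memory: "OrderedDict[str, bytes]" = OrderedDict()
        self._disk: "OrderedDict[str, bytes]" = OrderedDict()  # simulates local disk
        self._memory_capacity = memory_capacity
        self._disk_capacity = disk_capacity
        self.backend_reads = 0  # how many reads reached the object store

    def get(self, key: str) -> bytes:
        if key in self._memory:  # memory hit: refresh recency and return
            self._memory.move_to_end(key)
            return self._memory[key]
        if key in self._disk:  # disk hit: promote back into memory
            value = self._disk.pop(key)
        else:  # miss in both tiers: pay for a backend read
            value = self._fetch(key)
            self.backend_reads += 1
        self._put_memory(key, value)
        return value

    def _put_memory(self, key: str, value: bytes) -> None:
        self._memory[key] = value
        if len(self._memory) > self._memory_capacity:
            # Demote the least recently used entry to the disk tier.
            old_key, old_value = self._memory.popitem(last=False)
            self._disk[old_key] = old_value
            if len(self._disk) > self._disk_capacity:
                self._disk.popitem(last=False)  # drop the oldest disk entry
```

Counting `backend_reads` shows where the savings come from: repeated reads of a working set that fits in memory plus disk never reach the object store again, so request fees and round-trip latency are paid once per object rather than once per read.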
RisingWave

4d
Last month we said the future is agentic 🤖 and we were betting on it. This month we shipped that bet! 🚀

What's in this edition:
- RisingWave Cloud V2 is now in Public Preview. A ground-up rebuild with a redesigned console, a new rwc CLI, and open-source Skill + MCP server so AI agents understand streaming SQL.
- v2.8.2 stabilizes the v2.8 line (v2.8.1 is deprecated, please upgrade).
- New ingest paths: WebSocket ingest, generic HTTP sink, TLS for webhooks, Protobuf on MQTT.
- SQL Server CDC now has full lag and offset monitoring, on par with Postgres.
- ALTER SOURCE CONNECTOR lets you rotate CDC credentials without recreating sources.

Plus four new blog posts (Cloud V2, AI developer tools, the rwc CLI, and a deep dive on Iceberg ingestion vs Flink), and a packed run of upcoming events on streaming + AI agents.

Read the full edition below 👇👇👇

#RisingWave #StreamProcessing #AgenticAI #DataEngineering #RealTimeData #ApacheIceberg

Streaming Data News - April 2026 RisingWave on LinkedIn

RisingWave

5d
Why is Apache Iceberg the future of data lakes?

In the past, data lakes didn’t fail because of storage. They failed because tables were never really "tables".

Hive-style lakes rely on file paths, partitions, and external coordination, which breaks when you have:
- multiple writers
- multiple engines
- changing schemas
- petabyte-scale metadata

Apache Iceberg fixes this by bringing real table semantics to object storage:
- ACID transactions (safe concurrent writes)
- Time travel and rollback (snapshots)
- Fast planning at scale (manifests and metadata indexing)
- Schema evolution (add or rename columns without rewrites)
- Hidden partitioning (no manual partition traps)
- Multi-engine interoperability (Spark, Flink, Trino, RisingWave, etc.)

Iceberg turns your lake from a pile of files and scripts into a transactional, warehouse-like platform.

If your lake needs:
- Strong consistency
- Streaming + batch
- Multiple engines
- Long-term evolution

Then build your data lake with Apache Iceberg.

Want to build a streaming lakehouse? RisingWave lets you build one with Postgres simplicity, with native support for the full Apache Iceberg table lifecycle, from creation to catalog management and maintenance.
👉 Join our RisingWave community: https://lnkd.in/eW_gjzqx
RisingWave reposted this
Bauplan

3,149 followers
6d Edited
Ciro Greco from Bauplan just wrapped a great conversation with Yingjun Wu from RisingWave on agentic data infrastructure and how #AIagents are transforming the modern data pipeline. They got into the architectural shift underway: branch-based isolation, atomic publishing, and streaming-first ingestion. They then discussed where the data stack is headed in the next 6 to 12 months as agents become first-class users of infrastructure. Thanks to everyone who joined live! See you at the next webinar :))
RisingWave

6d
Most streaming systems were designed for engineers who lived inside JVM frameworks, complex deployments, and operational overhead. That world is changing fast. The next generation of real-time apps, analytics, and AI agents needs something simpler.

That is what RisingWave was built for. The streaming database for real-time apps, analytics, and agents.

Here is what makes it different:
✅ Postgres-compatible: use the familiar Postgres interface and plug into your existing ecosystem
✅ Rust-powered performance and safety: built in Rust from scratch for speed, efficiency, and memory safety
✅ S3-first architecture: leverage S3 as primary storage for cost-effectiveness and infinite scale
✅ Iceberg native tables: native Apache Iceberg support for the full table lifecycle
✅ Real-time processing and low-latency serving: run concurrent ad-hoc SQL on streaming data with sub-second latency

The result is a stack that finally treats the streaming lakehouse like a real Postgres database. No JVM tuning. No proprietary storage. No engine lock-in. Just SQL, Rust, S3, Iceberg, and real-time results.
RisingWave reposted this
Ciro Greco
1w
Super excited to sit down with Yingjun Wu at 9 am PT to talk about how the agentic infrastructure proposed by Bauplan and streaming real-time systems like RisingWave will play a major role in the infrastructure of tomorrow.

AI agents are changing how data infrastructure gets built and operated. Data pipelines were originally designed around humans: stepwise workflows, custom glue code, brittle handoffs between systems. Autonomous agents now explore, build, validate, and operate on data continuously, which creates new requirements for the modern data stack.

Join us for a webinar on what it takes to build an agent-native data stack. We'll cover:
✅ How streaming databases like RisingWave power real-time agent workloads.
✅ Where traditional pipelines hit limits under agent-driven workflows.
✅ A new architecture for agents: branch-based isolation, atomic publishing, streaming-first systems.
✅ How agents safely interact with production data via isolated branches and instant rollbacks.

Working in data infra, AI agents, or modern data pipelines? Register below.
👉 https://lnkd.in/dzBUYRU8

#AIAgents #DataEngineering #DataInfrastructure #StreamingData

Agentic Data Infrastructure: How AI Agents Are Transforming the Modern Data Pipeline · Zoom · Luma luma.com

RisingWave

1w
Cybersecurity questions are rarely about a single event or a single asset. They are about relationships. Which identities can reach which systems? How do permissions connect across resources? Where does exposure exist? And if one system is compromised, what else becomes at risk?

Cyber attacks unfold in moments. So, answering these questions requires tracing paths across the latest state of your infrastructure in real time. But most security stacks are not built for that.

We are hosting a webinar with PuppyGraph to show what an architecture that actually solves this looks like. Here is what we will cover:
✅ How streaming data, Iceberg, and real-time graph queries fit together for security analysis
✅ An end-to-end demo: security data processed in RisingWave, managed in Iceberg, queried as a live graph in PuppyGraph
✅ Modeling infrastructure and access relationships as a graph to investigate excessive permissions, internet-exposed assets, and blast radius of compromised VMs
✅ openCypher graph queries alongside natural language interaction through an agent chatbot powered by ontology-enforced graph querying

Our speakers:
Yingjun Wu, Founder and CEO at RisingWave
Weimo Liu, Co-Founder and CEO at PuppyGraph

If you are working on security data, observability, or anything that requires tracing relationships across infrastructure in real time, this webinar is for you!
👉 Register here: https://luma.com/kv0l3y2t
Employees at RisingWave

Xiangyu (Sam) Hu

Zach Taapken

Patrick Huang

Pin Zhang
