Chris
13.6K posts
Chris
@criccomini
37% context left
Sunnyvale, CA
Joined April 2009
267 Following
12.2K Followers
  • Pinned
    Chris
    @criccomini
    Apr 27
    Spent the past week on SlateDB's DST harness. It was a bit of a slog. The more state I explored, the more false positives I encountered.
    Deterministic Simulation Testing Is Really Hard
    From rng.md
    15K
  • Chris
    @criccomini
    Mar 16, 2024
    I got a chance to sit in on some @ycombinator pitches this week. A few thoughts: 1⃣ I have AI fatigue--SO MUCH. Very little of it is deep tech; mostly applying OpenAI FM to stuff. Investors in this space: I have no idea how you do this. I feel like there's a lot of $ to be lost.
    488K
  • Chris
    @criccomini
    May 20, 2019
    Successful intern projects:
    1. High value if completed.
    2. Low risk if not completed.
    3. Able to finish in allotted time (2-3 months).
    4. Exciting to work on and talk about.
    Anything else I'm missing?
  • Chris
    @criccomini
    Feb 20, 2023
    Embedded DBs are having a renaissance.
    RDBMS: SQLite
    OLAP: DuckDB
    Graph: KuzuDB
    Search: Chroma
    The developer experience is so good on these. Things just work. Really cool to see.
    79K
  • Chris
    @criccomini
    Dec 5, 2019
    My @InfoQ talk 🎙️ on the "Future of Data Engineering" is up! I cover the six stages of data pipeline maturity:
    0. None
    1. Batch
    2. Realtime
    3. Integration
    4. Automation
    5. Decentralization
    Check it out! 👀 (I'm so sorry for the link picture)
    Future of Data Engineering
    From infoq.com
  • Chris
    @criccomini
    Aug 14, 2024
    It's out! I've been working with @paulgb, @vigneshc, the team @responsive_apps, and others to put together an LSM storage engine built on object storage. Contributors, users, and feedback would all be great!
    GitHub - slatedb/slatedb: A cloud native embedded storage engine built on object storage.
    From github.com
    23K
  • Chris
    @criccomini
    Oct 25, 2023
    Some interesting infra projects: WarpStream Turbopuffer LanceDB Neon AWS Neptune TigerBeetle Modal Materialize Tabular (Iceberg) DuckDB/Motherduck Arrow Data Fusion/Substrate gvisor KIP-932 (Kafka) VeniceDB Bauplan Buf schema registry Apicurio
    103K
  • Chris
    @criccomini
    Feb 6, 2024
    TIL about Apache DataFusion Comet. @Apple has replaced @ApacheSpark's guts with @ApacheArrow DataFusion. And they're donating it. 🤯 github.com/apache/arrow-d… This is an alternative to @MetaOpenSource's Velox Spark implementation. facebookincubator.github.io/velox/spark_fu… /ht @philippemnoel
    43K
  • Chris
    @criccomini
    Aug 20, 2021
    Replying to @sethrosen
    “Reddit’s database has two tables” “Instead, they keep a Thing Table and a Data Table. Everything in Reddit is a Thing: users, links, comments, subreddits, awards, etc. Things keep common attribute like up/down votes, a type, and creation date” 🥴 kevin.burke.dev/kevin/reddits-…
  • Chris
    @criccomini
    Nov 26, 2023
    This is the future. Kafka writing Parquet to S3 (via tiered storage). Instant data lake.
    Gunnar Morling 🌍
    @gunnarmorling
    Nov 26, 2023
    "KIP-1008: ParKa - the Marriage of Parquet and Kafka" That's an interesting proposal: writing #Kafka segments as #Parquet files. Can see the appeal for data lake ingest; wondering though how well the columnar file structure plays with Kafka semantics 🤔. cwiki.apache.org/confluence/dis…
    53K
  • Chris
    @criccomini
    Sep 18, 2024
    Uber's actually doing the thing. uber.com/blog/datamesh If they keep going, this could be a first-class reference architecture.
    16K
  • Chris
    @criccomini
    Jan 23, 2023
    DBs are getting totally ripped apart right now and I love it. Query engines (trino, duck), storage (s3, gcs), and indexing (iceberg, hudi) all separate.
    Gunnar Morling 🌍
    @gunnarmorling
    Jan 23, 2023
    "Querying SQLite databases with DuckDB" Enjoyed watching this fast-paced video by @markhneedham demoing how to use #DuckDB's query engine to run analytics queries against data in a #SQLite file. 5:50 well spent 🦆! youtube.com/watch?v=ogge3k…
    58K
  • Chris
    @criccomini
    Jan 5, 2023
    I'm open sourcing Recap, a dead simple data catalog for engineers! Unlike traditional catalogs, Recap is built to power infrastructure and tools that need metadata. Read the docs: docs.recap.cloud Or dive straight into the GitHub repo: github.com/recap-cloud/re…
    47K
  • Chris
    @criccomini
    Aug 28, 2024
    Big news: I'm helping @martinkl with a second edition of Designing Data-Intensive Applications! An early release of the first 3 chapters is now available (O'Reilly Learning subscribers only at this point) and we're hoping to finish it next year.
    Designing Data-Intensive Applications, 2nd Edition
    From oreilly.com
    8.2K
