<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://blog.hyperparam.app/feed.xml" rel="self" type="application/atom+xml" /><link href="https://blog.hyperparam.app/" rel="alternate" type="text/html" /><updated>2026-03-26T18:20:07+00:00</updated><id>https://blog.hyperparam.app/feed.xml</id><title type="html">Hyperparam Blog</title><subtitle>Notes on working with LLM datasets in the browser.</subtitle><entry><title type="html">What wasted tool calls revealed about my LLM’s behavior</title><link href="https://blog.hyperparam.app/wasted-tool-calls-in-llm-logs-are-worth-debugging/" rel="alternate" type="text/html" title="What wasted tool calls revealed about my LLM’s behavior" /><published>2026-03-25T20:00:00+00:00</published><updated>2026-03-25T20:00:00+00:00</updated><id>https://blog.hyperparam.app/wasted-tool-calls-in-llm-logs-are-worth-debugging</id><content type="html" xml:base="https://blog.hyperparam.app/wasted-tool-calls-in-llm-logs-are-worth-debugging/"><![CDATA[<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "@id": "https://hyperparam.app/blog/wasted-tool-calls-in-llm-logs-are-worth-debugging#faq",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is a wasted tool call?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "A wasted tool call is a tool call that sends the model down an unhelpful path. The model may still recover, but the failed or unnecessary call adds cost, latency or noise to the trace."
      }
    },
    {
      "@type": "Question",
      "name": "Why are wasted tool calls worth debugging?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "A wasted tool call can look harmless if the model self-corrects and the run succeeds on a later attempt. It still adds cost, latency and noise to the trace, and it can expose a prompt issue or failure pattern that keeps repeating across the dataset."
      }
    },
    {
      "@type": "Question",
      "name": "What's the difference between required and avoidable failed tool calls?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Some failed tool calls are required because the agent has to make the call to retrieve information it does not have. A file lookup may fail because the file does not exist, or a path check may fail because the path is wrong. Other failed tool calls are avoidable because the prompt is inaccurate or not specific enough, which leads the model down an unhelpful path."
      }
    },
    {
      "@type": "Question",
      "name": "Why can a successful run still contain wasted tool calls?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Because the model can self-correct after a failed or unhelpful tool call and still reach a usable result. The run may succeed, but the trace still includes the extra failure, extra calls and extra work that made the path slower, noisier and more expensive."
      }
    },
    {
      "@type": "Question",
      "name": "What do wasted tool calls usually indicate?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "They usually indicate that the prompt or call specification needs work. The model may recover, but the wasted call still weakens the signal on whether the prompt is well specified and whether the system is behaving efficiently."
      }
    }
  ]
}
</script>

<p>We’ve all seen it happen: the LLM starts going down the wrong path and makes dozens of failed or wasted tool calls that don’t actually get it closer to its goal.</p>

<p>Even though models can self-correct and find a new path on a subsequent retry, self-correction can hide repeated failures that make the agent slower, more expensive and more difficult to evaluate across the dataset. In this post, I look at what wasted tool calls do to the trace, when retries are required and when they’re avoidable, and the cost of avoidable failures in practice.</p>

<p>Need the workflow? See my step-by-step guide for <a href="https://hyperparam.app/docs/debugging/how-debug-wasted-tool-calls">debugging wasted tool calls in LLM logs</a>.</p>

<h2 id="key-takeaways">Key takeaways</h2>

<ul>
  <li>Failed tool calls add cost, latency, noise and weaker prompt signal.</li>
  <li>A wasted tool call makes the trace noisier even when the next run succeeds.</li>
  <li>Required retries help the agent retrieve information it didn’t have.</li>
  <li>Avoidable retries usually stem from prompts that are inaccurate or not specific enough.</li>
  <li>If the model keeps self-correcting and eventually finds a new path, the trace can look healthier than the prompt really is.</li>
</ul>

<h2 id="a-wasted-tool-call-produces-a-noisy-trace">A wasted tool call produces a noisy trace</h2>

<p>Even when the model eventually finds a new path, that doesn’t erase the earlier failures. The trace still shows the failed calls alongside the subsequent successful one. This makes it harder to review the run and determine whether the agent reached the result efficiently. In a production dataset, that pattern can repeat across many traces, even when the final outputs look fine.</p>

<h2 id="required-retries-vs-avoidable-retries">Required retries vs. avoidable retries</h2>

<p>Some failed tool calls are required because the agent has to make the call to retrieve information it doesn’t have. If a file lookup fails because the file doesn’t exist, that failure may still be useful because it tells the agent something it needed to know. A path check can work the same way. Other calls are genuine failures because the agent queried the wrong file, took a dead-end path or went down a rabbit hole. The line between these cases can be fuzzy.</p>

<p>Other failed tool calls are avoidable. These occur when the prompt is inaccurate or not specific enough. Think of incorrect parameters, wrong attribute names on objects and shell syntax issues such as unescaped pipe characters.</p>

<h2 id="the-hidden-cost-of-failed-tool-calls-in-agent-traces">The hidden cost of failed tool calls in agent traces</h2>

<p>Wasted tool calls that are avoidable hide costs that are easy to miss if you only look at whether the run succeeded:</p>

<ul>
  <li><strong>More cost:</strong> more calls mean more tokens, more compute and more spend</li>
  <li><strong>More latency:</strong> retries slow the agent down even when the run succeeds</li>
  <li><strong>More trace noise:</strong> the extra failure and retry make the trace harder to review</li>
  <li><strong>More unhelpful work:</strong> the model may recover on a subsequent attempt, but it still spends extra calls on a path that was never going to help</li>
  <li><strong>Weaker prompt signal:</strong> recovery masks prompt defects, so a successful run is a weaker indicator of whether the prompt is doing what you think it is</li>
</ul>
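<p>To make the “weaker prompt signal” point concrete, here is a minimal sketch that lists the successful runs that still contain failed calls. The trace shape (<code>succeeded</code>, <code>calls</code>, <code>ok</code>) and the sample data are assumptions made up for this illustration, not the schema of any particular logging tool.</p>

```typescript
// Hypothetical trace shape: these field names are assumptions for this sketch,
// not the schema of any particular logging tool.
interface ToolCall {
  name: string
  ok: boolean // false when the call failed or returned an error
}

interface Trace {
  id: string
  succeeded: boolean // whether the run reached a usable result
  calls: ToolCall[]
}

// For each successful run, count the failed calls hidden behind the success.
function hiddenFailures(traces: Trace[]): { id: string; failed: number; total: number }[] {
  return traces
    .filter((trace) => trace.succeeded)
    .map((trace) => ({
      id: trace.id,
      failed: trace.calls.filter((call) => !call.ok).length,
      total: trace.calls.length,
    }))
    .filter((summary) => summary.failed > 0)
}

// Made-up sample data, for illustration only
const traces: Trace[] = [
  { id: 'a', succeeded: true, calls: [{ name: 'read_file', ok: false }, { name: 'read_file', ok: true }] },
  { id: 'b', succeeded: true, calls: [{ name: 'read_file', ok: true }] },
  { id: 'c', succeeded: false, calls: [{ name: 'run_shell', ok: false }] },
]

console.log(hiddenFailures(traces)) // only trace 'a': 1 failed call out of 2
```

<p>Trace <code>c</code> is excluded on purpose: a failed run is already visible, while trace <code>a</code> is the kind that passes a success-only check.</p>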

<p>For my step-by-step workflow, see <a href="https://hyperparam.app/docs/debugging/how-debug-wasted-tool-calls">how to debug wasted tool calls in LLM logs</a>.</p>

<h2 id="faq">FAQ</h2>

<p><strong>Why are failed tool calls in LLM logs worth debugging?</strong></p>

<p>Failed tool calls in LLM logs are worth debugging because although the model may correct itself and find a new path on a subsequent attempt, the retry still adds cost, latency and noise to the trace. The failure can also point to a prompt issue that keeps repeating across the dataset.</p>

<p><strong>Why does a retried tool call produce a noisy trace?</strong></p>

<p>A retried tool call produces a noisy trace because the successful call doesn’t erase the failed one that came before it even though the run eventually succeeds. This makes reviewing the trace and diagnosing issues more difficult.</p>

<p><strong>What’s the difference between wasted tool calls that are required and ones that are avoidable?</strong></p>

<p>Some failed tool calls are required because the failure provides the model with new information, for example, that a file doesn’t exist or a path is wrong. Avoidable retries happen when the prompt is inaccurate or not specific enough, which leads to underspecified tool calls.</p>

<p><strong>When a wasted tool call is avoidable, what does that usually indicate?</strong></p>

<p>When a failed tool call is avoidable, the prompt usually needs to be updated. The unnecessary noise in the trace weakens the signal on whether the prompt is well specified.</p>]]></content><author><name>Brendan McMullen</name></author><category term="engineering" /><category term="llm" /><category term="wasted-tool-calls" /><category term="failed-tool-calls" /><category term="agent-traces" /><category term="LLM-logs" /><category term="prompt-debugging" /><category term="AI-agent-debugging" /><summary type="html"><![CDATA[Wasted tool calls usually look harmless because the LLM self-corrects and finds a new path. However, they still add cost, latency and noise to the trace.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.hyperparam.app/assets/images/banner-wasted-tool-calls.jpg" /><media:content medium="image" url="https://blog.hyperparam.app/assets/images/banner-wasted-tool-calls.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Virtual Scrolling for Billions of Rows — Techniques from HighTable</title><link href="https://blog.hyperparam.app/hightable-scrolling-billions-of-rows/" rel="alternate" type="text/html" title="Virtual Scrolling for Billions of Rows — Techniques from HighTable" /><published>2026-02-11T20:00:00+00:00</published><updated>2026-02-11T20:00:00+00:00</updated><id>https://blog.hyperparam.app/hightable-scrolling-billions-of-rows</id><content type="html" xml:base="https://blog.hyperparam.app/hightable-scrolling-billions-of-rows/"><![CDATA[<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Visual explainer: How HighTable scrolls billions of rows",
  "description": "A visual explainer by Sylvain Lesage on how HighTable works around browser scrollbar limits to display billions of rows, republished by Hyperparam.",
  "author": {
    "@type": "Person",
    "name": "Sylvain Lesage"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Hyperparam"
  },
  "datePublished": "2026-02-12",
  "dateModified": "2026-02-12",
  "isBasedOn": "https://rednegra.net/blog/20260212-virtual-scroll/",
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://blog.hyperparam.app/hightable-scrolling-billions-of-rows"
  },
  "url": "https://rednegra.net/blog/20260212-virtual-scroll/",
  "keywords": "HighTable, virtual scrolling, billions of rows, React, virtualization"
}
</script>

<blockquote>
  <p><strong>Editor’s note:</strong> This post was originally written by Sylvain Lesage, one of the primary contributors to HighTable, whose work we’ve been sponsoring as part of our broader effort to plan for the data-scale problems that are emerging with LLMs. We’re republishing it here with his permission because it captures a core technical challenge we’ve been working through—how to scroll through billions of rows in the browser.</p>

  <p>Stay tuned for a future post on what this innovation means for Hyperparam users.</p>
</blockquote>

<p>TL;DR: In this post, I present <strong>five techniques related to vertical scrolling</strong> used in <code class="language-plaintext highlighter-rouge">&lt;HighTable&gt;</code>, a React component that can display billions of rows in a table while keeping good performance and accessibility.</p>

<p>It’s a long post, which reflects the complexity of rendering billions of rows in a table, and the amount of work we put into building the React component.</p>

<p>Table of contents:</p>

<ul>
  <li><a href="#introduction">Introduction</a></li>
  <li><a href="#demo">Demo</a></li>
  <li><a href="#scrolling-basics">Scrolling basics</a></li>
  <li><a href="#technique-1-lazy-loading">Technique 1: lazy loading</a></li>
  <li><a href="#technique-2-table-slice">Technique 2: table slice</a></li>
  <li><a href="#technique-3-infinite-pixels">Technique 3: infinite pixels</a></li>
  <li><a href="#technique-4-pixel-precise-scroll">Technique 4: pixel-precise scroll</a></li>
  <li><a href="#technique-5-two-step-random-access">Technique 5: two-step random access</a></li>
  <li><a href="#conclusion">Conclusion</a></li>
</ul>

<h2 id="introduction">Introduction</h2>

<p>Showing data in a table is one of the first exercises you’ll find in HTML 101 courses.</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;table&gt;</span>
  <span class="nt">&lt;thead&gt;</span>
    <span class="nt">&lt;tr&gt;&lt;th&gt;</span>Name<span class="nt">&lt;/th&gt;&lt;th&gt;</span>Age<span class="nt">&lt;/th&gt;&lt;/tr&gt;</span>
  <span class="nt">&lt;/thead&gt;</span>
  <span class="nt">&lt;tbody&gt;</span>
    <span class="nt">&lt;tr&gt;&lt;td&gt;</span>Alice<span class="nt">&lt;/td&gt;&lt;td&gt;</span>64<span class="nt">&lt;/td&gt;&lt;/tr&gt;</span>
    <span class="nt">&lt;tr&gt;&lt;td&gt;</span>Bob<span class="nt">&lt;/td&gt;&lt;td&gt;</span>37<span class="nt">&lt;/td&gt;&lt;/tr&gt;</span>
  <span class="nt">&lt;/tbody&gt;</span>
<span class="nt">&lt;/table&gt;</span>
</code></pre></div></div>

<p>But, as is often the case in data science, what works for simple cases breaks as the data grows.</p>

<p>In this post, I’ll showcase five techniques we use to <strong>solve challenges related to vertical scrolling</strong> in the <code class="language-plaintext highlighter-rouge">&lt;HighTable&gt;</code> React component to handle billions of rows.</p>

<p>The component also provides features for columns (sort, hide, resize), rows (select), cells (keyboard navigation, pointer interactions, custom rendering). Feel free to ask and look at the code if you’re interested in knowing more.</p>

<p>The <code class="language-plaintext highlighter-rouge">&lt;HighTable&gt;</code> component is developed at <a href="https://github.com/hyparam/hightable/">hyparam/hightable</a>. It was created by <a href="https://github.com/platypii">Kenny Daniel</a> for <a href="https://hyperparam.app/">Hyperparam</a>, and I’ve had the chance to contribute to its development for one year now.</p>

<p>This blog post was sponsored by <a href="https://hyperparam.app/">Hyperparam</a>. Thanks for the support and for challenging me to solve the fascinating problem of rendering billions of rows in the browser!</p>

<h2 id="demo">Demo</h2>

<p>Try the <a href="https://hyparam.github.io/demos/hightable/#/large">hightable demo</a>:</p>

<iframe src="https://hyparam.github.io/demos/hightable/#/large" title="HighTable demo with a large dataset" width="100%" height="400px"></iframe>

<p>HighTable is also used in the <a href="https://hyparam.github.io/demos/hyparquet/">Parquet viewer</a>, on <a href="https://source.coop/jrc-lucas/jrc-lucas-ml/ml_data/classes_dataset.csv">source.coop</a> and in <a href="https://hyperparam.app">Hyperparam</a>:</p>

<p><a title="Try HighTable in Hyperparam, the workbench for LLM datasets" href="https://hyperparam.app"><img alt="HighTable embedded in hyperparam.app" src="/assets/images/hightable-hyperparam.webp" /></a></p>

<h2 id="scrolling-basics">Scrolling basics</h2>

<p>Before diving into the techniques, let’s describe how scrolling works using a standard HTML table.</p>

<p>The HTML structure is composed of a scrollable container, which we call the <span class="viewport"><em>viewport</em></span>, and a <span class="table">table</span> element inside it:</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;div</span> <span class="na">class=</span><span class="s">"viewport"</span> <span class="na">style=</span><span class="s">"overflow-y: auto;"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;table</span> <span class="na">class=</span><span class="s">"table"</span><span class="nt">&gt;</span>
    ...
  <span class="nt">&lt;/table&gt;</span>
<span class="nt">&lt;/div&gt;</span>
</code></pre></div></div>

<p>In this structure, the <span class="viewport">viewport</span> is a div with a fixed height and the CSS property <code class="language-plaintext highlighter-rouge">overflow-y: auto</code> enables a vertical scrollbar when the <span class="table">table</span> is taller than the <span class="viewport">viewport</span>.</p>

<p>In the following widget, scroll the left box up and down to see how the right box mimics the scrolling effect.</p>

<blockquote>
  <p>If you use a keyboard, you can focus the left box with <kbd>Tab</kbd>, and scroll with the arrow keys <kbd>⏶</kbd> and <kbd>⏷</kbd>. Otherwise, you can use the mouse wheel, drag the scrollbar, or swipe on a touch screen.</p>
</blockquote>

<iframe src="https://rednegra.net/embed/scroll-native" style="height: 730px; width:100%; border:none; overflow:hidden;"></iframe>

<p>The component is delimited by its fixed-size <span class="viewport">viewport</span> (blue border). The <span class="table"><em>table</em></span> (golden border) is rendered inside the container. As its <span class="table">height</span> is larger than the <span class="viewport">viewport height</span>, only part of the table is visible, and a vertical scrollbar lets you change which part is visible. <strong>The inner <span class="table">table</span> element moves up and down within the <span class="viewport">viewport</span></strong>, creating the scrolling effect.</p>

<p>On the right side, we mimic the scrolling effect, showing the position of the <span class="table">table</span> relative to the <span class="viewport">viewport</span>.</p>

<p>Let’s settle some definitions and formulas that will be useful later:</p>

<ol>
  <li>
    <p>In this post, we assume <code><span class="viewport">viewport</span>.clientHeight</code>, the height of the visible area, is constant. In HighTable, we measure it and react to resizing.</p>
  </li>
  <li>
    <p><code><span class="viewport">viewport</span>.scrollHeight</code>, the total height of the scrollable content, is equal to <code><span class="table">table</span>.clientHeight</code>. Both are equal to the number of rows in the table multiplied by the row height:</p>

    <div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="kd">const</span> <span class="nx">rowHeight</span> <span class="o">=</span> <span class="mi">33</span> <span class="c1">// in pixels</span>
 <span class="kd">const</span> <span class="nx">numRows</span> <span class="o">=</span> <span class="nx">data</span><span class="p">.</span><span class="nx">numRows</span> <span class="c1">// total number of rows in the table</span>
 <span class="kd">const</span> <span class="nx">height</span> <span class="o">=</span> <span class="nx">numRows</span> <span class="o">*</span> <span class="nx">rowHeight</span>
</code></pre></div>    </div>

    <p>In this post, we assume the row height and the number of rows are constant. In HighTable, we react to changes in <code>data.numRows</code> (the number of rows in the <em>data frame</em>, the data structure holding the table data), for example when filtering; but we assume the row height is fixed (see <a href="https://github.com/hyparam/hightable/issues/395">issue #395</a> to support variable row heights).</p>
  </li>
  <li>
    <p><code><span class="viewport">viewport</span>.scrollTop</code> is the number of pixels between the top of the scrolled <span class="table">table</span> and the top of the <span class="viewport">viewport</span>. The minimum value <code>0px</code> shows the top of the table, while the bottom of the table is reached at the maximum value <code><span class="viewport">viewport</span>.scrollHeight - <span class="viewport">viewport</span>.clientHeight</code>.</p>
  </li>
  <li>
    <p>The visible pixels can be computed from the <span class="viewport">viewport</span> scroll top position:</p>

    <div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="kd">const</span> <span class="nx">firstVisiblePixel</span> <span class="o">=</span> <span class="nx">viewport</span><span class="p">.</span><span class="nx">scrollTop</span>
 <span class="kd">const</span> <span class="nx">lastVisiblePixel</span> <span class="o">=</span> <span class="nx">viewport</span><span class="p">.</span><span class="nx">scrollTop</span> <span class="o">+</span> <span class="nx">viewport</span><span class="p">.</span><span class="nx">clientHeight</span>
 <span class="c1">// firstVisiblePixel is inclusive, lastVisiblePixel is exclusive</span>
</code></pre></div>    </div>
  </li>
</ol>
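<p>As a worked example of these definitions, here is a minimal sketch that computes the scroll range and the visible pixels from plain numbers. The function name <code>scrollGeometry</code> and the clamping of <code>scrollTop</code> are assumptions for illustration; in a real page the browser keeps <code>scrollTop</code> within its bounds for you.</p>

```typescript
// Hypothetical helper: only the formulas come from the definitions above.
interface ScrollGeometry {
  scrollHeight: number // total height of the scrollable content
  maxScrollTop: number // value of scrollTop when the bottom of the table is reached
  firstVisiblePixel: number // inclusive
  lastVisiblePixel: number // exclusive
}

function scrollGeometry(
  numRows: number,
  rowHeight: number,
  clientHeight: number,
  scrollTop: number
): ScrollGeometry {
  const scrollHeight = numRows * rowHeight
  // If the table is shorter than the viewport, there is nothing to scroll
  const maxScrollTop = Math.max(0, scrollHeight - clientHeight)
  // Keep scrollTop within [0, maxScrollTop], as the browser would
  const clamped = Math.min(Math.max(scrollTop, 0), maxScrollTop)
  return {
    scrollHeight,
    maxScrollTop,
    firstVisiblePixel: clamped,
    lastVisiblePixel: clamped + clientHeight,
  }
}

// 1000 rows of 33px in a 330px viewport, scrolled down by 660px
const middle = scrollGeometry(1000, 33, 330, 660)
console.log(middle) // scrollHeight 33000, maxScrollTop 32670, pixels 660 to 990

// Scrolling past the end stops at the bottom of the table
const bottom = scrollGeometry(1000, 33, 330, 40000)
console.log(bottom.firstVisiblePixel, bottom.lastVisiblePixel) // 32670 33000
```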

<p>Now that we have the basics, let’s see how to handle large datasets.</p>

<h2 id="technique-1-lazy-loading">Technique 1: lazy loading</h2>

<p>The first challenge when working on a large dataset is that it will not fit in your browser’s memory. The good news: you won’t want to look at every row either, and not all at the same time. So, instead of loading the whole data file at start, we <strong>only load the visible cells</strong>.</p>

<blockquote>
  <p>Note that lazy loading the data does not change the HTML structure of the table.</p>
</blockquote>

<p>The following widget shows how lazy loading works. Scroll the left box up and down to see how the cells are loaded on demand on the right side:</p>

<iframe src="https://rednegra.net/embed/scroll-lazy-load" style="height:680px; width:100%; border:none; overflow:hidden;"></iframe>

<p>In the <span class="table">table</span>, only the visible cells are loaded. When scrolling, newly visible cells are requested and loaded in the background, and rendered when available.</p>

<p>To do so, we compute the visible rows, and only load them:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">rowStart</span> <span class="o">=</span> <span class="nb">Math</span><span class="p">.</span><span class="nx">floor</span><span class="p">(</span><span class="nx">firstVisiblePixel</span> <span class="o">/</span> <span class="nx">rowHeight</span><span class="p">)</span>
<span class="kd">const</span> <span class="nx">rowEnd</span> <span class="o">=</span> <span class="nb">Math</span><span class="p">.</span><span class="nx">ceil</span><span class="p">(</span><span class="nx">lastVisiblePixel</span> <span class="o">/</span> <span class="nx">rowHeight</span><span class="p">)</span>
<span class="c1">// rowStart is inclusive, rowEnd is exclusive</span>
</code></pre></div></div>
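<p>To see the rounding at work, here is a small self-contained sketch of the same computation with concrete numbers. The function name <code>visibleRows</code> and the clamp to <code>numRows</code> are assumptions added for this example.</p>

```typescript
// Hypothetical helper wrapping the two formulas above
function visibleRows(
  scrollTop: number,
  clientHeight: number,
  rowHeight: number,
  numRows: number
): { rowStart: number; rowEnd: number } {
  const firstVisiblePixel = scrollTop // inclusive
  const lastVisiblePixel = scrollTop + clientHeight // exclusive
  const rowStart = Math.floor(firstVisiblePixel / rowHeight)
  // Extra guard (an assumption of this sketch): never go past the last row
  const rowEnd = Math.min(numRows, Math.ceil(lastVisiblePixel / rowHeight))
  return { rowStart, rowEnd } // rowStart is inclusive, rowEnd is exclusive
}

// A 330px viewport over 33px rows, scrolled down by 100px
const range = visibleRows(100, 330, 33, 1_000_000)
console.log(range) // { rowStart: 3, rowEnd: 14 }
```

<p>The viewport fits exactly 10 rows, but 11 rows are selected here because rows 3 and 13 are both partially visible: <code>Math.floor</code> and <code>Math.ceil</code> make sure partially visible rows are included.</p>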

<p>In HighTable, the data loading logic is handled in a <em>data frame</em>, passed to the React component as the <code class="language-plaintext highlighter-rouge">data</code> prop:</p>

<div class="language-jsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">&lt;</span><span class="nc">HighTable</span> <span class="na">data</span><span class="p">=</span><span class="si">{</span><span class="nx">data</span><span class="si">}</span> <span class="p">/&gt;</span>
</code></pre></div></div>

<p>The data frame is an object that defines how to load (i.e. fetch and cache) the data on demand, and how to get the loaded data for rendering. See the <code class="language-plaintext highlighter-rouge">DataFrame</code> TypeScript definition in <a href="https://github.com/hyparam/hightable/blob/b171cd35a61253cb2b090f60c83c9aa660bf27fb/src/helpers/dataframe/types.ts#L50">types.ts</a>.</p>

<p>Here is a simplified DataFrame implementation that generates random data for one column, applying some delay to simulate fetching data over the network, and persists the values in memory:</p>

<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">cache</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Map</span><span class="p">()</span>
<span class="kd">const</span> <span class="nx">eventTarget</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">EventTarget</span><span class="p">()</span>
<span class="kd">const</span> <span class="nx">numRows</span> <span class="o">=</span> <span class="mi">1</span><span class="nx">_000_000</span>

<span class="kd">const</span> <span class="nx">data</span> <span class="o">=</span> <span class="p">{</span>
  <span class="nx">numRows</span><span class="p">,</span>
  <span class="nx">eventTarget</span><span class="p">,</span>

  <span class="c1">// Synchronously return the cached value (if any)</span>
  <span class="nx">getCell</span><span class="p">({</span> <span class="nx">row</span> <span class="p">})</span> <span class="p">{</span>
    <span class="k">return</span> <span class="nx">cache</span><span class="p">.</span><span class="kd">get</span><span class="p">(</span><span class="nx">row</span><span class="p">);</span>
  <span class="p">},</span>

  <span class="c1">// Load missing values for the given rows, and cache them</span>
  <span class="k">async</span> <span class="nx">fetch</span><span class="p">({</span> <span class="nx">rowStart</span><span class="p">,</span> <span class="nx">rowEnd</span> <span class="p">})</span> <span class="p">{</span>
    <span class="c1">// Simulate network delay</span>
    <span class="k">await</span> <span class="k">new</span> <span class="nb">Promise</span><span class="p">((</span><span class="nx">resolve</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="nx">setTimeout</span><span class="p">(</span><span class="nx">resolve</span><span class="p">,</span> <span class="mi">100</span><span class="p">));</span>
    <span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">row</span> <span class="o">=</span> <span class="nx">rowStart</span><span class="p">;</span> <span class="nx">row</span> <span class="o">&lt;</span> <span class="nx">rowEnd</span><span class="p">;</span> <span class="nx">row</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
      <span class="c1">// Skip already cached rows</span>
      <span class="k">if</span> <span class="p">(</span><span class="nx">cache</span><span class="p">.</span><span class="nx">has</span><span class="p">(</span><span class="nx">row</span><span class="p">))</span> <span class="k">continue</span><span class="p">;</span>
      <span class="c1">// Generate a random value for the cell, and cache it</span>
      <span class="nx">cache</span><span class="p">.</span><span class="kd">set</span><span class="p">(</span><span class="nx">row</span><span class="p">,</span> <span class="p">{</span><span class="na">value</span><span class="p">:</span> <span class="nb">Math</span><span class="p">.</span><span class="nx">random</span><span class="p">()});</span>
    <span class="p">}</span>
    <span class="c1">// Emit an event to tell &lt;HighTable&gt; to re-render the visible cells</span>
    <span class="nx">eventTarget</span><span class="p">.</span><span class="nx">dispatchEvent</span><span class="p">(</span><span class="k">new</span> <span class="nx">Event</span><span class="p">(</span><span class="dl">'</span><span class="s1">resolve</span><span class="dl">'</span><span class="p">));</span>
  <span class="p">},</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The data frame loads the data from the source using the asynchronous <code class="language-plaintext highlighter-rouge">data.fetch()</code> method. It must cache the results, and dispatch a <code class="language-plaintext highlighter-rouge">resolve</code> event when new data is available. The source can be anything. In our example, the data was randomly generated. It can also be obtained from a <a href="https://developer.mozilla.org/en-US/docs/Web/API/File">local file</a>, an in-memory array, a remote file (using HTTP range requests), or a REST API, to name a few examples.</p>

<p>The data frame must also provide a synchronous <code class="language-plaintext highlighter-rouge">data.getCell()</code> method to get the cached data for a given cell, or <code class="language-plaintext highlighter-rouge">undefined</code> if the data is not loaded yet.</p>

<p>On every scroll event, the table is rendered: it calls <code class="language-plaintext highlighter-rouge">data.getCell()</code> for the visible rows, and <code class="language-plaintext highlighter-rouge">data.fetch()</code> to load them in the background if necessary (the data frame is responsible for returning quickly when the data is already cached). Every time new data is fetched and reported (on <code class="language-plaintext highlighter-rouge">resolve</code> events), the table is re-rendered.</p>
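<p>The render cycle can be sketched without React or the DOM. This is not how <code class="language-plaintext highlighter-rouge">&lt;HighTable&gt;</code> is implemented: <code>renderRows</code>, <code>onScroll</code> and <code>frames</code> are hypothetical names, rendering is reduced to building strings, and the stand-in data frame has no simulated delay so that the example completes immediately.</p>

```typescript
// Stand-in data frame with the same shape as the simplified example above
const cache = new Map()
const eventTarget = new EventTarget()

const data = {
  numRows: 1000,
  eventTarget,
  getCell({ row }: { row: number }) {
    return cache.get(row)
  },
  // No simulated delay here; a real source would await network or disk I/O
  async fetch({ rowStart, rowEnd }: { rowStart: number; rowEnd: number }) {
    for (let row = rowStart; row !== rowEnd; row++) {
      if (!cache.has(row)) cache.set(row, { value: row * 2 })
    }
    eventTarget.dispatchEvent(new Event('resolve'))
  },
}

// Each "frame" is one render of the visible rows
const frames: string[][] = []
let visible = { rowStart: 0, rowEnd: 0 }

function renderRows(): string[] {
  const lines: string[] = []
  for (let row = visible.rowStart; row !== visible.rowEnd; row++) {
    const cell = data.getCell({ row })
    // Cells that are not loaded yet get a placeholder
    lines.push(cell === undefined ? `${row}: loading` : `${row}: ${cell.value}`)
  }
  return lines
}

// Re-render whenever the data frame reports new data
eventTarget.addEventListener('resolve', () => {
  frames.push(renderRows())
})

function onScroll(rowStart: number, rowEnd: number): void {
  visible = { rowStart, rowEnd }
  frames.push(renderRows()) // render right away with whatever is cached
  void data.fetch(visible) // load missing rows without blocking the render
}

onScroll(3, 6)
console.log(frames)
```

<p>The first frame contains only placeholders, and the second one, triggered by the <code>resolve</code> event, contains the loaded values.</p>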

<blockquote>
  <p>You can find a more complete example of a data frame that loads a remote Parquet file (using HTTP range requests) in the <a href="https://github.com/hyparam/demos/blob/8cbaf815eb75af0699d44242be2cfb2756b02ce7/hyparquet/src/App.tsx#L23">hyparquet demo</a>.</p>
</blockquote>

<p>The data frame structure is not oriented towards rows or columns, and allows loading and accessing the data by cell. Currently, in HighTable, we load full rows, but we could improve by computing the visible columns and loading them lazily as well. Join the pending <a href="https://github.com/hyparam/hightable/issues/297">discussion</a> if you’re interested in this feature.</p>

<blockquote>
  <p><strong>Impact of lazy loading</strong></p>

  <p>If we assume 10 billion rows, and 100 bytes per row, the <strong>total data size is 1 TB</strong>. Loading it all in memory is not possible, but with lazy loading, <strong>we only load 3 KB</strong> for the visible part (about 30 rows at a time), and keep good performance.</p>
</blockquote>
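<p>The arithmetic behind these figures, as a short sketch. The row count, row size and number of visible rows are the assumptions stated above, and the units are decimal (1 KB = 1000 bytes).</p>

```typescript
const numRows = 10_000_000_000 // 10 billion rows
const bytesPerRow = 100
const visibleRowCount = 30 // rows on screen at a time

const totalBytes = numRows * bytesPerRow // whole dataset
const visibleBytes = visibleRowCount * bytesPerRow // what lazy loading fetches

console.log(totalBytes / 1e12) // 1 (TB)
console.log(visibleBytes / 1e3) // 3 (KB)
```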

<p>Lazy loading the data is the first step, required to handle large datasets in the browser. The next step is to avoid rendering too many HTML elements at once.</p>

<h2 id="technique-2-table-slice">Technique 2: table slice</h2>

<p>In software engineering, when you try to optimize, the first step is to remove computation that does nothing. In our case, if the table has one million rows and we can see only 30 at a time, why render one million <code class="language-plaintext highlighter-rouge">&lt;tr&gt;</code> HTML elements? For reference, Chrome <a href="https://developer.chrome.com/docs/performance/insights/dom-size">recommends</a> creating or updating fewer than 300 HTML elements for optimal responsiveness.</p>

<p>In the <code class="language-plaintext highlighter-rouge">&lt;HighTable&gt;</code> component, <strong>only the visible slice of the table is rendered</strong>. The other row elements simply don’t exist.</p>

<p>To achieve this, the HTML structure must be adapted by adding an intermediate div element, which we call the <span class="canvas">canvas</span>, between the <span class="viewport">viewport</span> and the <span class="table">table</span>:</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;div</span> <span class="na">class=</span><span class="s">"viewport"</span> <span class="na">style=</span><span class="s">"overflow-y: auto;"</span><span class="nt">&gt;</span>
  <span class="nt">&lt;div</span> <span class="na">class=</span><span class="s">"canvas"</span> <span class="na">style=</span><span class="s">"position: relative; height: 30000px;"</span><span class="nt">&gt;</span>
    <span class="nt">&lt;table</span> <span class="na">class=</span><span class="s">"table"</span> <span class="na">style=</span><span class="s">"position: absolute; top: 3000px;"</span><span class="nt">&gt;</span>
      <span class="c">&lt;!-- the table only renders the visible rows --&gt;</span>
      ...
    <span class="nt">&lt;/table&gt;</span>
  <span class="nt">&lt;/div&gt;</span>
<span class="nt">&lt;/div&gt;</span>
</code></pre></div></div>

<p>The HTML structure will remain the same for the rest of the blog post, including techniques 3, 4 and 5.</p>

<blockquote>
  <p>The <span class="canvas">canvas</span> div is unrelated to the <code class="language-plaintext highlighter-rouge">&lt;canvas&gt;</code> HTML element. I’m open to suggestions for better naming if it’s confusing.</p>
</blockquote>

<p>The <span class="canvas">canvas</span> is sized so that it could contain all the rows:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">canvas</span><span class="p">.</span><span class="nx">style</span><span class="p">.</span><span class="nx">height</span> <span class="o">=</span> <span class="s2">`</span><span class="p">${</span><span class="nx">data</span><span class="p">.</span><span class="nx">numRows</span> <span class="o">*</span> <span class="nx">rowHeight</span><span class="p">}</span><span class="s2">px`</span>
</code></pre></div></div>

<p>It sets the <span class="viewport">viewport</span> scrollbar to the expected size. As shown in the scrolling basics section, <code><span class="viewport">viewport</span>.scrollHeight</code> is equal to <code><span class="canvas">canvas</span>.clientHeight</code>.</p>

<p>The <span class="canvas">canvas</span> serves as a reference for absolutely positioning the <span class="table">table</span> slice.</p>

<p>The following widget shows how table slicing works. Scroll the left box up and down to see how the right box mimics the scrolling effect, while rendering only the visible rows. Toggle the <span class="full-table">full table</span> button to see how the rendered rows fit in the full table:</p>

<iframe src="https://rednegra.net/embed/scroll-slice" style="height:760px; width:100%; border:none; overflow:hidden;"></iframe>

<p>On the right side, you see that only the visible rows are rendered. The <span class="table">table</span> slice contains 6 rows (or 7, depending on the scroll position) instead of 10.</p>

<p>The HTML structure inside the <span class="table">table</span> slice is:</p>

<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;table&gt;</span>
  <span class="nt">&lt;tbody&gt;</span>
    <span class="c">&lt;!-- Rows 0 to 99 are not rendered --&gt;</span>

    <span class="c">&lt;!-- Visible rows --&gt;</span>
    <span class="nt">&lt;tr&gt;</span>...row 100...<span class="nt">&lt;/tr&gt;</span>
    <span class="nt">&lt;tr&gt;</span>...row 101...<span class="nt">&lt;/tr&gt;</span>
    ...
    <span class="nt">&lt;tr&gt;</span>...row 119...<span class="nt">&lt;/tr&gt;</span>

    <span class="c">&lt;!-- Rows 120 to 999 are not rendered --&gt;</span>
  <span class="nt">&lt;/tbody&gt;</span>
<span class="nt">&lt;/table&gt;</span>
</code></pre></div></div>

<p>Let’s assume the <span class="full-table">data</span> has 1,000 rows, each row in the table is 30px high, and the <span class="viewport">viewport</span> height is 600px (so that about 20 rows are visible at once). If the user has scrolled down 3,000px, <code class="language-plaintext highlighter-rouge">&lt;HighTable&gt;</code> only renders rows 100 to 119 in the actual <code class="language-plaintext highlighter-rouge">&lt;table&gt;</code> <span class="table">element</span>.</p>

<blockquote>
  <p>The HTML above is a simplification. In <a href="https://github.com/hyparam/hightable/blob/b171cd35a61253cb2b090f60c83c9aa660bf27fb/src/components/HighTable/Slice.tsx#L177">hightable</a>, we render a table header and add some padding rows before and after the visible rows to improve the scrolling experience.</p>
</blockquote>

<p>The <span class="table">table</span> top position is adjusted to fit in the <span class="full-table">full table</span> (toggle the <span class="full-table">Show</span> / <span class="full-table">Hide</span> button to render the full table). It equals the position of the first visible row inside the virtual <span class="full-table">full table</span>. This is nearly equal to <code><span class="viewport">viewport</span>.scrollTop</code>, but differs by the number of hidden pixels at the top of the first visible row. So:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">table</span><span class="p">.</span><span class="nx">style</span><span class="p">.</span><span class="nx">top</span> <span class="o">=</span> <span class="s2">`</span><span class="p">${</span>
  <span class="nx">viewport</span><span class="p">.</span><span class="nx">scrollTop</span> <span class="o">-</span> <span class="p">(</span><span class="nx">viewport</span><span class="p">.</span><span class="nx">scrollTop</span> <span class="o">%</span> <span class="nx">rowHeight</span><span class="p">)</span>
<span class="p">}</span><span class="s2">px`</span><span class="p">;</span>
</code></pre></div></div>
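<p>To make the arithmetic concrete, here is a minimal sketch of the slice computation as a pure function, with no DOM access. The names <code class="language-plaintext highlighter-rouge">computeSlice</code>, <code class="language-plaintext highlighter-rouge">SliceInput</code> and <code class="language-plaintext highlighter-rouge">Slice</code> are made up for this illustration and are not part of the hightable API. It also ignores the header and the padding rows that hightable adds.</p>

```typescript
// Hypothetical types, for illustration only (not the hightable API).
interface SliceInput {
  scrollTop: number      // viewport.scrollTop, in CSS pixels
  viewportHeight: number // viewport.clientHeight
  rowHeight: number      // fixed height of every row
  numRows: number        // total number of rows in the data
}

interface Slice {
  canvasHeight: number // height of the intermediate div
  firstRow: number     // index of the first rendered row
  lastRow: number      // index of the last rendered row (inclusive)
  tableTop: number     // value for table.style.top, in pixels
}

function computeSlice({ scrollTop, viewportHeight, rowHeight, numRows }: SliceInput): Slice {
  const canvasHeight = numRows * rowHeight
  // Row that contains the top edge of the viewport.
  const firstRow = Math.floor(scrollTop / rowHeight)
  // Row that contains the bottom edge of the viewport, clamped to the data.
  const lastRow = Math.min(
    numRows - 1,
    Math.ceil((scrollTop + viewportHeight) / rowHeight) - 1
  )
  // Align the slice with the top of the first visible row.
  const tableTop = scrollTop - (scrollTop % rowHeight)
  return { canvasHeight, firstRow, lastRow, tableTop }
}

const slice = computeSlice({ scrollTop: 3000, viewportHeight: 600, rowHeight: 30, numRows: 1000 })
console.log(slice)
```

<p>With the numbers of the example above (1,000 rows of 30px, a 600px viewport, scrolled down 3,000px), this gives a 30,000px canvas, rows 100 to 119, and a table top of 3,000px.</p>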

<p>These computations are done on every scroll event (and on every other change: when the <span class="viewport">viewport</span> height changes, or when the number of rows is updated). Once computed, the <span class="table">table slice</span> is re-rendered with the new visible rows, the <span class="table">table</span> position is updated with the new <code class="language-plaintext highlighter-rouge">top</code> value, and the data frame is queried to load the new visible cells if needed.</p>

<blockquote>
  <p>A detail worth mentioning is the sticky header. In <code class="language-plaintext highlighter-rouge">&lt;HighTable&gt;</code>, the header with column names is rendered as part of the <span class="table">table</span> element, in <code class="language-plaintext highlighter-rouge">&lt;thead&gt;</code>, not as a separate element. It helps with accessibility, as screen readers can easily identify the header cells associated with each data cell, and with column resizing, as the header and data cells are aligned automatically by the browser. Thanks to the CSS property <code class="language-plaintext highlighter-rouge">position: sticky</code> (see <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/Reference/Properties/position#sticky">sticky</a> on MDN), the header row remains visible at the top of the <span class="viewport">viewport</span> when scrolling. We take it into account to compute the first visible row.</p>
</blockquote>

<p>Note that the table slicing technique is not specific to vertical scrolling. The same approach can be used for horizontal scrolling (rendering only the visible columns). It’s less critical, as tables generally have fewer columns than rows. Join the pending <a href="https://github.com/hyparam/hightable/issues/297">discussion on virtual columns</a> if you’re interested in this feature.</p>

<blockquote>
  <p><strong>Impact of table slicing</strong></p>

  <p>Assuming 10 billion rows with 30 visible at a time, <strong>we only render 30 row elements instead of 10 billion</strong>. Performance stays good with any number of rows, as <strong>the number of rendered elements is constant</strong>.</p>
</blockquote>

<p>Until now, everything is pretty standard. The next techniques are more specific to hightable, and address challenges that arise when dealing with billions of rows.</p>

<h2 id="technique-3-infinite-pixels">Technique 3: infinite pixels</h2>

<p>Technique 2 works perfectly, until it breaks… As Eric Meyer explains in his blog post <a href="https://meyerweb.com/eric/thoughts/2025/08/07/infinite-pixels/">Infinite Pixels</a>, HTML elements have a maximum height, and the exact value depends on the browser. The worst case is Firefox: about 17 million pixels. As the <span class="canvas">canvas</span> height increases with the number of rows, if the row height is 33px (the default in HighTable), we cannot render more than 500K rows.</p>
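<p>As a quick sanity check of that limit, the arithmetic can be sketched as follows. The 17 million figure is the approximate value quoted above, and the function name is made up for illustration.</p>

```typescript
// Approximate maximum element height quoted above for Firefox, in pixels.
const approxMaxElementHeight = 17_000_000
// Default row height in HighTable, in pixels.
const defaultRowHeight = 33

// Number of rows that fit in an element of the given maximum height.
function maxRowsWithoutDownscaling(maxHeight: number, rowHeight: number): number {
  return Math.floor(maxHeight / rowHeight)
}

console.log(maxRowsWithoutDownscaling(approxMaxElementHeight, defaultRowHeight)) // 515151
```

<p>That is where the rough “500K rows” ceiling comes from.</p>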

<p>Our approach to this issue in HighTable is to <strong>set a maximum height for the <span class="canvas">canvas</span> and downscale the scrollbar resolution above this limit.</strong> In HighTable, the threshold is set to 8 million pixels.</p>

<p>Concretely, above the threshold, one scrolled pixel corresponds to multiple pixels in the <span class="full-table">full table</span>. The downscaling factor is the ratio between the theoretical height of the <span class="full-table">full table</span> and the maximum height of the <span class="canvas">canvas</span>. Thanks to that factor, if you scroll half the scrollbar, you reach the middle of the <span class="full-table">full table</span>, no matter how big it is.</p>

<p>Below the threshold, the downscaling factor is 1, so everything works as before: one scrolled pixel corresponds to one pixel in the <span class="full-table">full table</span>.</p>

<p>The downscale factor is computed as:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">fullTableHeight</span> <span class="o">=</span> <span class="nx">data</span><span class="p">.</span><span class="nx">numRows</span> <span class="o">*</span> <span class="nx">rowHeight</span>
<span class="kd">const</span> <span class="nx">maxCanvasHeight</span> <span class="o">=</span> <span class="mi">8</span><span class="nx">_000_000</span>
<span class="kd">let</span> <span class="nx">downscaleFactor</span><span class="p">:</span> <span class="kr">number</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">fullTableHeight</span> <span class="o">&lt;=</span> <span class="nx">maxCanvasHeight</span><span class="p">)</span> <span class="p">{</span>
  <span class="nx">downscaleFactor</span> <span class="o">=</span> <span class="mi">1</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
  <span class="nx">downscaleFactor</span> <span class="o">=</span>
    <span class="p">(</span><span class="nx">fullTableHeight</span> <span class="o">-</span> <span class="nx">viewport</span><span class="p">.</span><span class="nx">clientHeight</span><span class="p">)</span> <span class="o">/</span>
    <span class="p">(</span><span class="nx">maxCanvasHeight</span> <span class="o">-</span> <span class="nx">viewport</span><span class="p">.</span><span class="nx">clientHeight</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Now, the first visible row is computed with:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">firstVisibleRow</span> <span class="o">=</span> <span class="nb">Math</span><span class="p">.</span><span class="nx">floor</span><span class="p">(</span>
  <span class="p">(</span><span class="nx">viewport</span><span class="p">.</span><span class="nx">scrollTop</span> <span class="o">*</span> <span class="nx">downscaleFactor</span><span class="p">)</span> <span class="o">/</span> <span class="nx">rowHeight</span>
<span class="p">)</span>
</code></pre></div></div>

<p>and the <span class="table">table</span> top position is set to align the first visible row with the top of the <span class="viewport">viewport</span>:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">table</span><span class="p">.</span><span class="nx">style</span><span class="p">.</span><span class="nx">top</span> <span class="o">=</span> <span class="s2">`</span><span class="p">${</span><span class="nx">viewport</span><span class="p">.</span><span class="nx">scrollTop</span><span class="p">}</span><span class="s2">px`</span><span class="p">;</span>
</code></pre></div></div>
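<p>Put together, the downscaling logic can be sketched as a few pure functions. This is an illustration under assumptions, not the hightable source: the <code class="language-plaintext highlighter-rouge">Geometry</code> type and the function names are hypothetical, and only the 8 million pixel threshold and the formulas come from the text above.</p>

```typescript
// Threshold from the text above; the names below are hypothetical.
const MAX_CANVAS_HEIGHT = 8_000_000

interface Geometry {
  numRows: number
  rowHeight: number
  viewportHeight: number
}

// The canvas never grows beyond the threshold.
function getCanvasHeight({ numRows, rowHeight }: Geometry): number {
  return Math.min(numRows * rowHeight, MAX_CANVAS_HEIGHT)
}

function getDownscaleFactor({ numRows, rowHeight, viewportHeight }: Geometry): number {
  const fullTableHeight = numRows * rowHeight
  if (fullTableHeight <= MAX_CANVAS_HEIGHT) {
    return 1
  }
  // Map the scrollable range of the canvas onto the scrollable range of the full table.
  return (fullTableHeight - viewportHeight) / (MAX_CANVAS_HEIGHT - viewportHeight)
}

function getFirstVisibleRow(scrollTop: number, geometry: Geometry): number {
  return Math.floor((scrollTop * getDownscaleFactor(geometry)) / geometry.rowHeight)
}

const geometry: Geometry = { numRows: 10_000_000_000, rowHeight: 30, viewportHeight: 600 }
const maxScrollTop = getCanvasHeight(geometry) - geometry.viewportHeight
console.log(getFirstVisibleRow(0, geometry))            // first row of the table
console.log(getFirstVisibleRow(maxScrollTop, geometry)) // top row of the last screen
```

<p>Subtracting the viewport height on both sides of the ratio is what makes the end of the scrollbar land on the last screen of rows, rather than one viewport short of it.</p>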

<p>This lets the user navigate through the whole table, even with billions of rows.</p>

<p>The following widget shows how scrollbar downscaling works. Scroll the left box up and down to see how the right box mimics the scrolling effect, letting you navigate through ten billion rows.</p>

<iframe src="https://rednegra.net/embed/scroll-downscale" style="height:890px; width:100%; border:none; overflow:hidden;"></iframe>

<p>But there is a drawback. The precision of the native scrollbar is limited to 1 <em>physical</em> pixel. On “high-resolution” screens, the apparent precision is a fraction of a <em>CSS</em> pixel (1 / <a href="https://developer.mozilla.org/en-US/docs/Web/API/Window/devicePixelRatio">devicePixelRatio</a>). Let’s assume one pixel for simplicity.</p>

<blockquote>
  <p>As an anecdote, the result of setting the scroll value programmatically is hard to predict. It depends on the device pixel ratio, which itself depends on the zoom, and maybe other factors. For example, <code class="language-plaintext highlighter-rouge">element.scrollTo({top: 100})</code> might result in <code class="language-plaintext highlighter-rouge">scrollTop = 100</code>, <code class="language-plaintext highlighter-rouge">scrollTop = 100.23</code>, or <code class="language-plaintext highlighter-rouge">scrollTop = 99.89</code>. You cannot predict the exact value, only that it falls within a margin of one pixel.</p>

  <p>The scrollTop value can even be outside of the expected range, for example negative or larger than the maximum value <code class="language-plaintext highlighter-rouge">scrollHeight - clientHeight</code>. To prevent such browser-specific over-scroll effects, when reacting to a scroll event, hightable always clamps the <code class="language-plaintext highlighter-rouge">scrollTop</code> value within the expected range, and applies the CSS rule <code class="language-plaintext highlighter-rouge">overflow-y: clip</code>. Using <code class="language-plaintext highlighter-rouge">clip</code> instead of <code class="language-plaintext highlighter-rouge">hidden</code> keeps the sticky header visible, though I’m honestly not sure why.</p>
</blockquote>

<p>So, when the downscale factor is big, as in the example above (2,189,781,021), the minimal scroll move (1px) corresponds to 2,189,781,021 pixels in the full table. With a row height of 30px, the minimal scroll move corresponds to about 72,992,701 rows. It creates <em>gaps</em> in the reachable rows:</p>

<ul>
  <li>if <code><span class="viewport">viewport</span>.scrollTop = 0</code>, the visible rows are 0 to 5</li>
  <li>if <code><span class="viewport">viewport</span>.scrollTop = 1</code>, the visible rows are 72,992,700 to 72,992,705</li>
  <li>if <code><span class="viewport">viewport</span>.scrollTop = 2</code>, the visible rows are 145,985,401 to 145,985,406</li>
  <li>and so on…</li>
</ul>

<p>There is no way to navigate to rows 6 to 10, for example. Reaching them would require setting <code><span class="viewport">viewport</span>.scrollTop = 0.0000000822</code> (180 pixels in the full table divided by the downscale factor), which is impossible because the browser rounds the scroll position to the nearest integer pixel.</p>

<blockquote>
  <p><strong>Impact of infinite pixels</strong></p>

  <p>Assuming 10 billion rows, the infinite pixels technique lets the user navigate through the whole span of rows. <strong>There is no limit to the number of rows</strong>, as we can always increase the downscale factor to fit in the maximum canvas height.</p>

  <p>But due to the limited scrollbar precision, if the row height is 30px and the canvas is 8Mpx, each scrolled pixel moves the table by 1,250 rows. It means that <strong>only one row (and its neighbors) out of 1,250 is reachable</strong>.</p>
</blockquote>
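<p>The gaps can be reproduced with a few lines of arithmetic. The sketch below reuses the first visible row formula of this section. The function names are hypothetical, and the second example uses the plain height ratio as the downscale factor, ignoring the viewport height for simplicity.</p>

```typescript
// First visible row for an integer scrollTop, as in technique 3.
function firstRowAt(scrollTop: number, downscaleFactor: number, rowHeight: number): number {
  return Math.floor((scrollTop * downscaleFactor) / rowHeight)
}

// Number of rows skipped between two consecutive scrollbar pixels.
function rowsPerScrolledPixel(downscaleFactor: number, rowHeight: number): number {
  return downscaleFactor / rowHeight
}

// Widget example: downscale factor of 2,189,781,021 and 30px rows.
const widgetFactor = 2_189_781_021
console.log([0, 1, 2].map((scrollTop) => firstRowAt(scrollTop, widgetFactor, 30)))
// [ 0, 72992700, 145985401 ]

// 10 billion rows of 30px on an 8Mpx canvas (assumption: plain height ratio).
const canvasFactor = (10_000_000_000 * 30) / 8_000_000
console.log(rowsPerScrolledPixel(canvasFactor, 30)) // 1250
```

<p>Any row whose index falls between two consecutive values of <code class="language-plaintext highlighter-rouge">firstRowAt</code>, beyond the few rows visible on screen, cannot be reached with the scrollbar alone.</p>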

<p>The infinite pixels technique thus provides global navigation through billions of rows. But it does not allow fine scrolling, and some rows are unreachable. Technique 4 addresses this issue.</p>

<h2 id="technique-4-pixel-precise-scroll">Technique 4: pixel-precise scroll</h2>

<p>The previous technique lets users scroll globally through the file, but prevents them from scrolling locally, because any scroll gesture jumps over gaps of unreachable rows.</p>

<p>To fix that, we implement <strong>two scrolling modes: local and global scrolling</strong>. Local scrolling means scrolling the table slice pixel by pixel (i.e. even more precisely than row by row), while global scrolling means jumping to the position given by the scrollbar.</p>

<p>The logic requires a state with three values: <code class="language-plaintext highlighter-rouge">{ scrollTop, globalAnchor, localOffset }</code></p>
<ul>
  <li>the last <span class="viewport">viewport</span> scroll top value is stored in the state to compute the scroll move on every scroll event.</li>
  <li>the global anchor is the <span class="viewport">viewport</span> scroll top value corresponding to the last global scroll. It is updated on every global scroll, but not on local scrolls.</li>
  <li>the local offset is the offset applied to the global anchor to compute the current scroll position. It is updated on every local scroll, and reset to 0 on global scrolls.</li>
</ul>

<p>The first visible row is computed from the global anchor and the local offset:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">firstVisibleRow</span> <span class="o">=</span> <span class="nb">Math</span><span class="p">.</span><span class="nx">floor</span><span class="p">((</span>
    <span class="nx">state</span><span class="p">.</span><span class="nx">globalAnchor</span> <span class="o">*</span> <span class="nx">downscaleFactor</span> <span class="o">+</span> <span class="nx">state</span><span class="p">.</span><span class="nx">localOffset</span>
  <span class="p">)</span> <span class="o">/</span> <span class="nx">rowHeight</span><span class="p">)</span>

</code></pre></div></div>

<p>The absolute positioning of the <span class="table">table</span> is now:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">table</span><span class="p">.</span><span class="nx">style</span><span class="p">.</span><span class="nx">top</span> <span class="o">=</span> <span class="s2">`</span><span class="p">${</span><span class="nx">viewport</span><span class="p">.</span><span class="nx">scrollTop</span> <span class="o">+</span> <span class="nx">state</span><span class="p">.</span><span class="nx">localOffset</span><span class="p">}</span><span class="s2">px`</span><span class="p">;</span>
</code></pre></div></div>

<p>On every scroll event, we compute the magnitude of the scroll move (difference between the new <span class="viewport">viewport</span>’s scroll top and the previous one, stored in the state) and decide to apply:</p>

<ul>
  <li>a <b>global scroll</b> if the scroll move is big, typically on scrollbar drag and drop, and we jump to the new global position (technique 3),</li>
  <li>or a <b>local scroll</b> if the scroll move is small, for example when using the mouse wheel. In that case, we keep the state’s <code class="language-plaintext highlighter-rouge">globalAnchor</code> value unchanged (i.e. no longer synced with the real <code class="language-plaintext highlighter-rouge">scrollTop</code> value) and adjust the <code class="language-plaintext highlighter-rouge">localOffset</code> so that the move appears local (for example, 3 rows downwards).</li>
</ul>

<p>Represented as code, the logic looks like this (simplified, pseudo-code):</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">state</span> <span class="o">=</span> <span class="nx">getState</span><span class="p">()</span>
<span class="kd">const</span> <span class="nx">delta</span> <span class="o">=</span> <span class="nx">viewport</span><span class="p">.</span><span class="nx">scrollTop</span> <span class="o">-</span> <span class="nx">state</span><span class="p">.</span><span class="nx">scrollTop</span>
<span class="k">if</span> <span class="p">(</span><span class="nb">Math</span><span class="p">.</span><span class="nx">abs</span><span class="p">(</span><span class="nx">delta</span><span class="p">)</span> <span class="o">&gt;</span> <span class="nx">localThreshold</span><span class="p">)</span> <span class="p">{</span>
  <span class="c1">// global scroll</span>
  <span class="nx">state</span><span class="p">.</span><span class="nx">localOffset</span> <span class="o">=</span> <span class="mi">0</span>
  <span class="nx">state</span><span class="p">.</span><span class="nx">globalAnchor</span> <span class="o">=</span> <span class="nx">viewport</span><span class="p">.</span><span class="nx">scrollTop</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
  <span class="c1">// local scroll</span>
  <span class="nx">state</span><span class="p">.</span><span class="nx">localOffset</span> <span class="o">+=</span> <span class="nx">delta</span>
<span class="p">}</span>
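<span class="c1">// remember the position to compute the next delta</span>
<span class="nx">state</span><span class="p">.</span><span class="nx">scrollTop</span> <span class="o">=</span> <span class="nx">viewport</span><span class="p">.</span><span class="nx">scrollTop</span>
<span class="nx">setState</span><span class="p">(</span><span class="nx">state</span><span class="p">)</span>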
</code></pre></div></div>
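<p>The same logic can be written as a pure state transition, which makes it easy to test without a browser. This is a sketch under assumptions: the <code class="language-plaintext highlighter-rouge">ScrollState</code> type, the function names and the 500px threshold are hypothetical, and the downscale factor of 37,500 is the plain height ratio for 10 billion rows of 30px on an 8Mpx canvas.</p>

```typescript
interface ScrollState {
  scrollTop: number    // last observed viewport.scrollTop
  globalAnchor: number // scrollTop of the last global scroll
  localOffset: number  // pixels scrolled locally since then
}

// Hypothetical threshold: larger moves count as global scrolls.
const LOCAL_THRESHOLD = 500

function onScroll(state: ScrollState, newScrollTop: number): ScrollState {
  const delta = newScrollTop - state.scrollTop
  if (Math.abs(delta) > LOCAL_THRESHOLD) {
    // Global scroll: jump to the scrollbar position and reset the local offset.
    return { scrollTop: newScrollTop, globalAnchor: newScrollTop, localOffset: 0 }
  }
  // Local scroll: keep the anchor and accumulate the move, pixel by pixel.
  return { ...state, scrollTop: newScrollTop, localOffset: state.localOffset + delta }
}

function firstVisibleRow(state: ScrollState, downscaleFactor: number, rowHeight: number): number {
  return Math.floor((state.globalAnchor * downscaleFactor + state.localOffset) / rowHeight)
}

// A drag to 4,000,000px, followed by three mouse-wheel steps of 30px.
const factor = 37_500
let state: ScrollState = { scrollTop: 0, globalAnchor: 0, localOffset: 0 }
state = onScroll(state, 4_000_000)
const rowAfterDrag = firstVisibleRow(state, factor, 30)
for (const step of [30, 30, 30]) {
  state = onScroll(state, state.scrollTop + step)
}
console.log(firstVisibleRow(state, factor, 30) - rowAfterDrag) // 3
```

<p>The drag lands on row 5,000,000,000, and each 30px wheel step then moves the table by exactly one row instead of 1,250.</p>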

<p>Now, the user can navigate around the current row, but also jump to any part of the data.</p>

<p>The following widget shows the dual scrolling mode. Scroll the left box up and down to see how the right box mimics the scrolling effect, letting you navigate both locally and globally through ten billion rows.</p>

<iframe src="https://rednegra.net/embed/scroll-dual" style="height:785px; width:100%; border:none; overflow:hidden;"></iframe>

<p>With this approach, small scroll moves appear local, while large scroll moves jump to the expected global position. The user can navigate through the whole table, and reach every row. The user can scroll as expected in the browser, with their mouse wheel, touchpad, keyboard (when the table is focused) or scrollbar.</p>

<blockquote>
  <p><strong>Impact of pixel-precise scroll</strong></p>

  <p>Assuming 10 billion rows, the dual scrolling mode lets the user <strong>access any pixel of the <span class="full-table">full table</span> using the native scrollbar</strong>. The user can scroll locally with the mouse wheel, and scroll globally by dragging the scrollbar.</p>

  <p>This works if the <span class="full-table">full table</span> height is less than the maximum <span class="canvas">canvas</span> height (8Mpx in hightable) squared, which corresponds to about 64 trillion pixels. So, <strong>1px fidelity is guaranteed up to 2 trillion rows</strong> with a row height of 30px.</p>

  <p>Above that limit, the minimal step is greater than 1px, but <strong>every row is still reachable up to 64 trillion rows!</strong> Beyond that, some rows become unreachable.</p>
</blockquote>
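<p>These limits follow from simple arithmetic, sketched below. The <code class="language-plaintext highlighter-rouge">minimalStep</code> function is one reading that is consistent with the numbers above (the step stays at 1px until the full table height exceeds the squared canvas height, then grows proportionally). It is an assumption, not code taken from hightable.</p>

```typescript
const maxCanvasHeight = 8_000_000 // pixels, the hightable threshold
const rowHeight = 30              // pixels

// Tallest full table for which every pixel stays reachable.
const maxFullTableHeight = maxCanvasHeight * maxCanvasHeight
// Corresponding number of rows.
const maxPixelPreciseRows = Math.floor(maxFullTableHeight / rowHeight)

// Assumed smallest reachable step, in full-table pixels.
function minimalStep(numRows: number): number {
  return Math.max(1, (numRows * rowHeight) / maxFullTableHeight)
}

console.log(maxFullTableHeight)                  // 64000000000000
console.log(maxPixelPreciseRows)                 // 2133333333333
console.log(minimalStep(10_000_000_000))         // 1
console.log(minimalStep(64_000_000_000_000))     // 30
```

<p>At 64 trillion rows, the step reaches one full row height (30px), which matches the point where rows start to become unreachable.</p>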

<p>The last challenge is to move to any cell programmatically (i.e. random access to any part of the table), be it using the keyboard or through a “jump to row” input, without worrying about the local vs global scrolling mode. Random access requires decoupling vertical and horizontal scrolling. We explain it in the next section.</p>

<h2 id="technique-5-two-step-random-access">Technique 5: two-step random access</h2>

<p>One of the HighTable requirements is to allow keyboard navigation (e.g. <kbd>↓</kbd> to go to the next row). Fortunately, the Web Accessibility Initiative (WAI) provides guidance through the <a href="https://www.w3.org/WAI/ARIA/apg/patterns/grid/">Grid Pattern</a> and the <a href="https://www.w3.org/WAI/ARIA/apg/patterns/grid/examples/data-grids/">Data Grid Examples</a>. We use a <a href="https://www.w3.org/WAI/ARIA/apg/practices/keyboard-interface/#kbd_roving_tabindex">roving tabindex</a> to handle the focus, providing all the expected <a href="https://www.w3.org/WAI/ARIA/apg/patterns/grid/#datagridsforpresentingtabularinformation">keyboard interactions</a>.</p>

<blockquote>
  <p>The browser provides a useful default when calling <code class="language-plaintext highlighter-rouge">cell.focus()</code>: it automatically scrolls to the cell and focuses it. But in HighTable, we don’t use the default behavior, because it positions the cell at the <em>center</em> of the viewport, which does not feel natural.</p>

  <p>To get the expected behavior, we first scroll by the minimal amount to show the next row and column, by calling <code class="language-plaintext highlighter-rouge">cell.scrollIntoView({block: 'nearest', inline: 'nearest'})</code>. Then we set the focus with no scroll action using <code class="language-plaintext highlighter-rouge">cell.focus({preventScroll: true})</code>.</p>
</blockquote>

<p>Unfortunately, the keyboard navigation techniques explained in the WAI resources are designed for full tables. Because of techniques 2 (table slice), 3 (infinite pixels) and 4 (pixel-precise scroll), multiple steps are required. In particular, to let the user move the active cell with the keyboard, we <strong>separate the vertical scrolling logic from the horizontal one</strong>.</p>

<p>When the user moves the active cell, the final position can be anywhere in the table: <kbd>↓</kbd> moves to the next row, while <kbd>Ctrl</kbd>+<kbd>↓</kbd> moves to the last row. If the move is big, we might have to scroll vertically to have the required cell in the DOM.</p>

<blockquote>
  <p>The same issue arises whenever we access a random row in the table, for example if an app embedding <code class="language-plaintext highlighter-rouge">&lt;HighTable&gt;</code> provides a “jump to row” feature. The table should programmatically scroll to the expected row, and focus the cell in the expected column, without worrying about the local vs global scrolling mode, or the horizontal scroll position.</p>
</blockquote>

<p>The process is as follows:</p>

<ol>
  <li>compute the next state (global anchor and local offset) that will make the row of the required cell visible,</li>
  <li>programmatically scroll to the new scrollTop position, if the global anchor has changed,</li>
  <li>once scrolled, render the table slice to have the required cell in the DOM,</li>
  <li>scroll horizontally if needed with <code class="language-plaintext highlighter-rouge">cell.scrollIntoView({inline: 'nearest'})</code>,</li>
  <li>set the focus to the new cell with <code class="language-plaintext highlighter-rouge">cell.focus({preventScroll: true})</code>.</li>
</ol>

<p>Note that, for step 1 (computing the next state), we respect the <code class="language-plaintext highlighter-rouge">block: nearest</code> behavior by minimizing the scroll move. If the next row is below the current viewport, it will be the last visible row in the next viewport. If it is above, it will be the first visible row. If it is already visible, no vertical scroll is applied.</p>
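<p>This “nearest” rule can be sketched on row indexes alone. The <code class="language-plaintext highlighter-rouge">VisibleRange</code> type and the <code class="language-plaintext highlighter-rouge">scrollToNearest</code> function are hypothetical names for this illustration. The real computation in hightable works on the global anchor and local offset, and also accounts for the sticky header.</p>

```typescript
interface VisibleRange {
  firstRow: number // first fully visible row
  lastRow: number  // last fully visible row (inclusive)
}

// Returns the visible range after moving the active cell to targetRow,
// scrolling by the smallest possible amount, like `block: 'nearest'`.
function scrollToNearest(range: VisibleRange, targetRow: number): VisibleRange {
  const visibleCount = range.lastRow - range.firstRow + 1
  if (targetRow > range.lastRow) {
    // Target is below: it becomes the last visible row.
    return { firstRow: targetRow - visibleCount + 1, lastRow: targetRow }
  }
  if (range.firstRow > targetRow) {
    // Target is above: it becomes the first visible row.
    return { firstRow: targetRow, lastRow: targetRow + visibleCount - 1 }
  }
  // Already visible: no vertical scroll.
  return range
}

const current: VisibleRange = { firstRow: 100, lastRow: 119 }
console.log(scrollToNearest(current, 120)) // { firstRow: 101, lastRow: 120 }
console.log(scrollToNearest(current, 50))  // { firstRow: 50, lastRow: 69 }
console.log(scrollToNearest(current, 110)) // { firstRow: 100, lastRow: 119 }
```

<p>Pressing <kbd>↓</kbd> on the last visible row therefore scrolls by a single row, while a jump to a distant row places it at the nearest edge of the viewport.</p>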

<p>The pseudo-code for decoupling vertical and horizontal scrolling requires a flag to prevent horizontal scrolling and focus during the programmatic vertical scroll:</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/* in the cell navigation code */</span>
<span class="kd">const</span> <span class="nx">shouldScroll</span> <span class="o">=</span> <span class="nx">state</span><span class="p">.</span><span class="nx">update</span><span class="p">()</span>
<span class="nx">renderTableSlice</span><span class="p">()</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">shouldScroll</span><span class="p">)</span> <span class="p">{</span>
  <span class="c1">// set a flag to prevent horizontal scrolling + focus</span>
  <span class="c1">// during programmatic scroll</span>
  <span class="nx">setFlag</span><span class="p">(</span><span class="dl">'</span><span class="s1">programmaticScroll</span><span class="dl">'</span><span class="p">)</span>
  <span class="nx">viewport</span><span class="p">.</span><span class="nx">scrollTo</span><span class="p">({</span><span class="na">top</span><span class="p">:</span> <span class="nx">state</span><span class="p">.</span><span class="nx">globalAnchor</span><span class="p">,</span> <span class="na">behavior</span><span class="p">:</span> <span class="dl">'</span><span class="s1">instant</span><span class="dl">'</span><span class="p">})</span>
<span class="p">}</span>
</code></pre></div></div>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/* in the scroll event handler */</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">isFlagSet</span><span class="p">(</span><span class="dl">'</span><span class="s1">programmaticScroll</span><span class="dl">'</span><span class="p">))</span> <span class="p">{</span>
  <span class="c1">// allow horizontal scrolling + focus,</span>
  <span class="c1">// once the programmatic scroll is done</span>
  <span class="nx">clearFlag</span><span class="p">(</span><span class="dl">'</span><span class="s1">programmaticScroll</span><span class="dl">'</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/* in the cell rendering code */</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="nx">isFlagSet</span><span class="p">(</span><span class="dl">'</span><span class="s1">programmaticScroll</span><span class="dl">'</span><span class="p">))</span> <span class="p">{</span>
  <span class="c1">// horizontal scrolling + focus allowed</span>
  <span class="nx">cell</span><span class="p">.</span><span class="nx">scrollIntoView</span><span class="p">({</span><span class="na">inline</span><span class="p">:</span> <span class="dl">'</span><span class="s1">nearest</span><span class="dl">'</span><span class="p">})</span>
  <span class="nx">cell</span><span class="p">.</span><span class="nx">focus</span><span class="p">({</span><span class="na">preventScroll</span><span class="p">:</span> <span class="kc">true</span><span class="p">})</span>
<span class="p">}</span>
</code></pre></div></div>

<p>We set <code class="language-plaintext highlighter-rouge">behavior: 'instant'</code> when scrolling programmatically to ensure we only receive one <code class="language-plaintext highlighter-rouge">scroll</code> event. The alternative, <code class="language-plaintext highlighter-rouge">behavior: 'smooth'</code>, would trigger multiple <code class="language-plaintext highlighter-rouge">scroll</code> events, clearing the flag too early, and generating conflicts with the internal state due to intermediate unexpected <code class="language-plaintext highlighter-rouge">scrollTop</code> positions (see the <a href="https://github.com/hyparam/hightable/issues/393">open issue</a>).</p>

<blockquote>
  <p><strong>Impact of two-step random access</strong></p>

  <p>With this technique, <strong>the user can access any random cell in the table with the keyboard</strong>, and the table will scroll to the expected position, even with billions of rows. The vertical and horizontal scrolling are decoupled, so that the user can move to the next column with <kbd>→</kbd> without triggering a vertical scroll, and vice versa with <kbd>↓</kbd>.</p>
</blockquote>

<h2 id="conclusion">Conclusion</h2>

<p>No need for a <a href="https://everyuuid.com/">fake</a> <a href="https://dev.to/kohii/how-to-implement-virtual-scrolling-beyond-the-browsers-limit-16ol">scroll bar</a>. No need to render the table <a href="https://github.com/xwinstone/canvastable">in a canvas</a>. We use the <a href="https://en.wikipedia.org/wiki/Web_platform">Web platform</a>. Thanks to these five techniques that rely on native HTML elements, <a href="https://github.com/hyparam/hightable">hightable</a> lets you navigate seamlessly through billions of rows of a remote data file, in the browser.</p>

<p>Give a star ⭐ to the <a href="https://github.com/hyparam/hightable">GitHub repo</a> if you liked the article!</p>]]></content><author><name>Sylvain Lesage</name></author><category term="engineering" /><category term="open-source" /><category term="HighTable" /><category term="virtual-scrolling" /><category term="React" /><category term="virtualization" /><category term="performance" /><category term="accessibility" /><category term="JavaScript" /><summary type="html"><![CDATA[A visual explainer by Sylvain Lesage on how HighTable works around browser scrollbar limits to display billions of rows, republished by Hyperparam.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.hyperparam.app/assets/images/banner-hightable-scrolling.jpg" /><media:content medium="image" url="https://blog.hyperparam.app/assets/images/banner-hightable-scrolling.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">How to debug chatbot failures by inspecting LLM logs</title><link href="https://blog.hyperparam.app/debug-chatbot-failures-inspecting-llm-logs/" rel="alternate" type="text/html" title="How to debug chatbot failures by inspecting LLM logs" /><published>2026-01-29T08:00:00+00:00</published><updated>2026-01-29T08:00:00+00:00</updated><id>https://blog.hyperparam.app/debug-chatbot-failures-inspecting-llm-logs</id><content type="html" xml:base="https://blog.hyperparam.app/debug-chatbot-failures-inspecting-llm-logs/"><![CDATA[<h2 id="a-real-world-workflow-for-debugging-chatbot-failures-at-scale">A real-world workflow for debugging chatbot failures at scale</h2>

<p>Most dissatisfied users don’t complain. They churn.</p>

<p>“I asked your chatbot how I could talk with a live customer service agent and it gave me a nonsensical answer, so I never used it again.”</p>

<p>That complaint is unusually helpful precisely because it was sent at all. It describes a failure that other users may have hit without saying a word.</p>

<p><strong>This post walks through a real-world workflow for inspecting LLM logs to debug chatbot failures, identify systemic issues, and validate fixes using real production data.</strong></p>

<iframe width="660" height="315" src="https://www.youtube.com/embed/wwhvfl5Qnms" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="" style="margin-bottom: 1.5em;"></iframe>

<p>For Darryl, the engineer debugging the issue, the immediate question wasn’t why this one chat failed. It was:</p>

<p><strong>If one person noticed and complained, how many others hit the same failure and churned?</strong></p>

<p>The team had shipped the chatbot a month earlier. In that time, the LLM logs had already exploded to multiple gigabytes in Parquet and were still growing fast. Reading them manually wasn’t an option. Spot-checking wasn’t an option either: this was a trust failure in a support channel, and you can’t restore trust by guessing.</p>

<p>Darryl needed a workflow that could answer two things quickly:</p>

<ul>
  <li>Find the failing conversations (including this user’s chat) without knowing the exact phrasing.</li>
  <li>Reproduce and fix the failure, then validate the fix across real historical inputs—not just a few hand-picked examples.</li>
</ul>

<p>The first step was to stop thinking in terms of individual conversations and start reasoning over the <a href="https://blog.hyperparam.app/missing-issues-llm-chat-logs/">LLM logs as a dataset</a>.</p>

<h2 id="step-1-inspect-the-logs-like-a-dataset-not-a-transcript">Step 1: Inspect the logs like a dataset, not a transcript</h2>

<p>Darryl loaded the Parquet logs into Hyperparam and started by scanning raw rows to understand what was captured per turn (messages, tool calls, metadata).</p>

<p>The first goal was simple: locate conversations that matched the complaint, where users mentioned “trying to reach a human,” a “live agent,” “customer service,” and similar phrases. That’s awkward in SQL because the query is semantic: the same intent shows up across many different phrasings.</p>

<p>Instead of writing brittle keyword filters, he used an AI agent to filter the dataset down to conversations that likely matched the reported intent. Then he pulled up the specific user’s interaction to review the full conversational context.</p>
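
<p>To see why keyword filters are brittle here, consider the minimal sketch below. The messages, the keyword list, and the <code>filterByIntent</code> helper are all made up for illustration; <code>matchesIntent</code> stands in for whatever classifier you plug in (for example, a call to an LLM), and none of this is Hyperparam’s API.</p>

```javascript
// Hypothetical user messages; four of the five are asking for a human.
const messages = [
  'How do I talk to a live agent?',
  'can i speak with a real person',
  'I want customer service on the phone',
  'is there someone human I can chat with?',
  'What are your store hours?',
]

// Keyword filtering: only exact phrases match, so paraphrases slip through.
const keywords = ['live agent', 'customer service']
const keywordHits = messages.filter(m => keywords.some(k => m.toLowerCase().includes(k)))
console.log(keywordHits.length)

// Semantic filtering delegates the decision to a classifier supplied by the caller,
// for example a function that sends the message to an LLM and resolves to true or false.
async function filterByIntent(items, matchesIntent) {
  const kept = []
  for (const item of items) {
    if (await matchesIntent(item)) kept.push(item)
  }
  return kept
}
```

<p>The keyword filter catches only two of the four messages that ask for a human, which is the gap a semantic filter is meant to close.</p>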

<h2 id="step-2-identify-the-real-failure-mode-it-wasnt-random-hallucination">Step 2: Identify the real failure mode (it wasn’t “random hallucination”)</h2>

<p>Darryl soon discovered that the issue wasn’t just that the chatbot wrote something wrong. In the failing chat, the model attempted a tool call to answer a factual question about support availability—but it called the wrong tool.</p>

<p>That’s an important distinction, because it changes the fix:</p>

<ul>
  <li>If it’s pure model generation, you’re tuning prompts and refusal behavior.</li>
  <li>If it’s tool routing, you’re fixing selection, schema, constraints, and guardrails—and you can validate the fix deterministically across historical inputs.</li>
</ul>

<h2 id="step-3-turn-one-bug-into-a-measurable-pattern">Step 3: Turn “one bug” into a measurable pattern</h2>

<p>After fixing the single root cause, Darryl ran a broader review across the full dataset to look for other cases where users asked factual questions and received low-quality or nonsensical answers. At that scale, individual failures stopped being useful on their own. The questions he needed to ask were:</p>

<ul>
  <li>Which intents fail most?</li>
  <li>Which tool calls correlate with failures?</li>
  <li>Is the problem isolated to one path or systemic?</li>
</ul>

<p>The agent surfaced multiple similar issues. What initially appeared to be a single complaint turned out to be a recurring failure that could be costing customers.</p>
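
<p>Once each conversation carries labels, the questions above reduce to simple aggregations. The sketch below assumes a hypothetical row shape (<code>intent</code>, <code>tool</code>, <code>failed</code>) and invented tool names; it is not the schema of the logs in this story.</p>

```javascript
// Hypothetical labeled rows: one per conversation, after an AI agent (or a human)
// has tagged the user's intent, the tool the model called, and whether it failed.
const rows = [
  { intent: 'reach_human', tool: 'search_docs', failed: true },
  { intent: 'reach_human', tool: 'search_docs', failed: true },
  { intent: 'reach_human', tool: 'support_hours', failed: false },
  { intent: 'billing', tool: 'lookup_invoice', failed: false },
  { intent: 'billing', tool: 'search_docs', failed: true },
  { intent: 'billing', tool: 'lookup_invoice', failed: false },
]

// Group rows by a key and compute the failure rate per group, worst first.
function failureRateBy(items, key) {
  const groups = new Map()
  for (const item of items) {
    const group = groups.get(item[key]) ?? { total: 0, failed: 0 }
    group.total += 1
    if (item.failed) group.failed += 1
    groups.set(item[key], group)
  }
  return [...groups.entries()]
    .map(([name, g]) => ({ name, total: g.total, failed: g.failed, rate: g.failed / g.total }))
    .sort((a, b) => b.rate - a.rate)
}

const byIntent = failureRateBy(rows, 'intent') // which intents fail most?
const byTool = failureRateBy(rows, 'tool') // which tool calls correlate with failures?
console.log(byIntent, byTool)
```

<p>In this toy data, one tool accounts for every failure across two different intents, which is the kind of signal that separates a systemic routing problem from an isolated one.</p>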

<h2 id="step-4-replay-real-conversations-to-validate-changes">Step 4: Replay real conversations to validate changes</h2>

<p>At this stage, Darryl needed to validate behavior by asking:</p>

<p>Given the exact same user inputs from the original version (V1), does the updated version (V2) reliably choose the right tool and produce a correct answer?</p>

<p>With Hyperparam, Darryl replayed historical conversations under different configurations (prompts, tooling, model), then compared outputs across variants using <a href="https://arxiv.org/abs/2411.15594">LLM-as-a-judge</a> to score improvements at scale.</p>

<p>This made it possible to see whether fixes held up across the full replay dataset, not just a few handpicked samples.</p>
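
<p>A paired comparison like this can be summarized with very little code. The sketch below assumes each replayed conversation already has a judge score from 1 to 5 for both versions; the field names and the scores are hypothetical, and producing the scores (the LLM-as-a-judge step) is out of scope here.</p>

```javascript
// Hypothetical replay results: one entry per historical conversation, with a
// judge score (1 to 5) for the original (v1) and updated (v2) configuration.
const replay = [
  { id: 'c1', v1: 1, v2: 5 },
  { id: 'c2', v1: 4, v2: 4 },
  { id: 'c3', v1: 2, v2: 4 },
  { id: 'c4', v1: 5, v2: 3 },
  { id: 'c5', v1: 3, v2: 5 },
]

// Compare the two variants on the same inputs, conversation by conversation.
function compareVariants(results) {
  const summary = { wins: 0, losses: 0, ties: 0, meanDelta: 0, regressions: [] }
  let totalDelta = 0
  for (const { id, v1, v2 } of results) {
    const delta = v2 - v1
    totalDelta += delta
    if (delta > 0) summary.wins += 1
    else if (delta === 0) summary.ties += 1
    else {
      summary.losses += 1
      summary.regressions.push(id) // keep ids so regressions can be inspected by hand
    }
  }
  if (results.length > 0) summary.meanDelta = totalDelta / results.length
  return summary
}

const summary = compareVariants(replay)
console.log(summary)
```

<p>Keeping the list of regressed conversation ids matters as much as the aggregate: a variant that improves the average can still break specific conversations that are worth reading individually.</p>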

<p>After iterating, he exported a concrete set of changes for the next chatbot version: which tool call behavior to adjust, what prompt or tool constraints to add, and which configuration produced the best outcomes on the replay dataset.</p>

<p><em>“I found the right setup within a couple of hours without pulling in an entire team. Being able to compare V1 and V2 across real inputs made it obvious which changes actually worked.”</em></p>

<p>If you’re debugging chatbot failures using real LLM logs, this is the kind of workflow <a href="https://hyperparam.app/">Hyperparam</a> is designed to support.</p>]]></content><author><name>Kenny Daniel</name></author><category term="machine-learning" /><category term="product" /><category term="LLM" /><category term="chatbot" /><category term="debugging" /><category term="llm-logs" /><category term="production-debugging" /><category term="AI-agents" /><summary type="html"><![CDATA[LLM logs capture failures that cause customer complaints. See how inspecting chat logs & validating updates with Hyperparam helps you fix your chatbot.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.hyperparam.app/assets/images/banner-chatbot-logs.png" /><media:content medium="image" url="https://blog.hyperparam.app/assets/images/banner-chatbot-logs.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Squirreling: a new SQL engine for the web</title><link href="https://blog.hyperparam.app/squirreling-new-sql-engine-for-web/" rel="alternate" type="text/html" title="Squirreling: a new SQL engine for the web" /><published>2025-12-30T08:11:00+00:00</published><updated>2025-12-30T08:11:00+00:00</updated><id>https://blog.hyperparam.app/squirreling-new-sql-engine-for-web</id><content type="html" xml:base="https://blog.hyperparam.app/squirreling-new-sql-engine-for-web/"><![CDATA[<p>Hyperparam is built on three bets: first, there’s a goldmine of information in LLM data — e.g. LLM chat logs or LLM training data. Second, people need tools to understand this data. And third, the browser is the only place to build modern interactive data tools. Over the past year, we’ve built a browser-native tool without a “classical” backend server that helps users transform and analyze massive LLM datasets. One of the core functionalities is the ability to extract structured information at scale with the use of AI.</p>

<p>Now comes the challenge. Hyperparam’s users end up with massive datasets that mix large, blob-like text columns (such as chat logs) with structured columns holding labels, scores, and other more classical structured information. Users need to query this data in the browser in an AI-native manner. The only AI-friendly language for that is SQL, but no SQL engine built natively for the browser is fast enough, memory-efficient enough, or async enough to meet Hyperparam’s standards for interactivity.</p>

<p>So I did what all engineers do: I built Squirreling, a ~9 KB (minified and gzipped) SQL engine with zero external dependencies. It achieves instant startup and constant memory usage for streaming queries.</p>

<p>I made my first commit on November 15th, open-sourced it on November 22nd, and had it live in Hyperparam on November 26th. One person could never have done this in such a short timeframe without AI.</p>

<h2 id="drawbacks-of-webassembly-for-sql-engines">Drawbacks of WebAssembly for SQL engines</h2>

<p>To understand why existing browser-based SQL engines struggle with interactive data exploration, it helps to examine how they’re built. Tools like DuckDB-Wasm compile a full analytical SQL engine to WebAssembly so it can run inside the browser. But database engines relying on WebAssembly to run in the browser face inherent limitations:</p>

<ul>
  <li><strong>Large footprint:</strong> DuckDB-Wasm, for instance, exceeds 4 MB, which adds to download and startup time.</li>
  <li><strong>Synchronous execution model:</strong> This limits true streaming execution.</li>
  <li><strong>Differences in memory types:</strong> WebAssembly’s linear memory model is separate from JavaScript’s heap memory, which requires data copying at boundaries.</li>
</ul>

<p>In practice, this shows up as noticeable startup times before queries can run, delayed time-to-first-result, and execution behavior that prioritizes throughput over interactivity. Queries run to completion before yielding results. And if you want derived columns or user-defined functions (UDFs) that depend on async API calls, such as calls to an LLM, there’s no way to connect DuckDB-Wasm to them. This makes existing SQL engines less than optimal for exploratory workflows that depend on fast, incremental feedback.</p>

<h2 id="how-squirreling-fundamentally-differs-from-existing-sql-engines">How Squirreling fundamentally differs from existing SQL engines</h2>

<p>Squirreling emerged as a response to the simple question: what happens if you design a SQL engine for the browser first, instead of adapting a server-oriented database to run there?</p>

<p>Starting from that premise leads to a different set of design choices than those made by existing solutions:</p>

<ul>
  <li><strong>Async-native execution:</strong> Squirreling relies on JavaScript’s AsyncGenerator protocol throughout, resulting in streaming query results and a responsive UI.</li>
  <li><strong>Late materialization via lazy cells:</strong> Table cells are represented as async functions and are only evaluated when accessed, which minimizes expensive operations.</li>
  <li><strong>Pluggable data sources:</strong> The AsyncDataSource interface decouples query execution from data retrieval, allowing data sources to only return the specific rows and columns a query requires.</li>
</ul>

<p>These design choices are reflected throughout Squirreling’s architecture, shaping how queries execute, how data is retrieved, and how work is scheduled.</p>

<h2 id="squirrelings-architecture">Squirreling’s architecture</h2>

<p>Let’s examine the key parts of Squirreling’s architecture that make these design choices concrete.</p>

<h3 id="late-materialization">Late materialization</h3>

<p>Squirreling delays computing column values until the query needs them. Expensive operations only run on cells that survive earlier stages such as filtering, sorting, and limiting.</p>

<p><img src="/assets/images/late-materialization.png" alt="Late materialization diagram" /></p>

<p>By delaying materialization, Squirreling executes joins over minimal projections and effectively inherits the asymptotically worst-case-optimal behavior of modern join algorithms, only materializing payload columns for rows that survive the join. <a href="https://www.cs.umd.edu/~abadi/papers/abadiicde2007.pdf">[1]</a></p>
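
<p>The idea of lazy cells can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Squirreling’s actual internals: <code>makeRow</code>, <code>summarize</code>, and the query function are hypothetical, and <code>summarize</code> stands in for any expensive operation such as an LLM call.</p>

```javascript
// Counts how often the expensive cell is actually computed.
let expensiveCalls = 0

// Hypothetical expensive operation, standing in for an LLM call or a large fetch.
async function summarize(text) {
  expensiveCalls += 1
  return text.slice(0, 12)
}

// Each cell is an async function, so nothing is computed until it is called.
function makeRow(id, score, text) {
  return {
    id: async () => id,
    score: async () => score,
    summary: () => summarize(text),
  }
}

const table = [
  makeRow(1, 0.2, 'refund request for order'),
  makeRow(2, 0.9, 'asked for a live agent'),
  makeRow(3, 0.8, 'could not reach support'),
  makeRow(4, 0.95, 'wanted a human operator'),
]

// Roughly: SELECT id, summary FROM table WHERE score > 0.5 LIMIT 2
async function query(rows, limit) {
  const out = []
  for (const row of rows) {
    if (out.length >= limit) break // stop scanning once the limit is met
    const score = await row.score() // the filter touches only the cheap cell
    if (score > 0.5) {
      out.push({ id: await row.id(), summary: await row.summary() })
    }
  }
  return out
}

const result = query(table, 2)
result.then(rows => console.log(rows, 'expensive calls:', expensiveCalls))
```

<p>Only the two rows that survive the filter and the limit ever trigger <code>summarize</code>; the row that fails the filter and the row past the limit never pay for it.</p>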

<h3 id="execution-model">Execution model</h3>

<p>Squirreling distinguishes between streaming and buffered paths based on query characteristics:</p>

<ul>
  <li><strong>The streaming path</strong> is used for queries without ORDER BY, GROUP BY, or aggregates. It uses constant memory regardless of the size of the dataset: rows are processed one at a time, each input row producing at most one output row, and a query with a LIMIT reads only as much data as it needs to satisfy that limit.</li>
  <li><strong>The buffered path</strong> is used for queries with ORDER BY or GROUP BY. Late materialization still applies, and LIMIT is applied before projection: Squirreling buffers the rows first and evaluates only the columns required for the current stage of the query.</li>
</ul>
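
<p>The difference between the two paths can be sketched with plain async generators. The functions below (<code>scan</code>, <code>streamQuery</code>, <code>orderedQuery</code>) are hypothetical and only illustrate the general technique; they are not Squirreling’s API.</p>

```javascript
// Hypothetical async data source that yields rows one at a time and counts reads.
let rowsRead = 0
async function* scan(total) {
  let id = 0
  while (total > id) {
    id += 1
    rowsRead += 1
    yield { id, score: (id * 7) % 10 }
  }
}

// Streaming path: filter and limit row by row, holding one row at a time.
async function* streamQuery(source, predicate, limit) {
  if (limit === 0) return
  let emitted = 0
  for await (const row of source) {
    if (!predicate(row)) continue
    yield row
    emitted += 1
    if (emitted >= limit) return // leaving the loop also closes the source
  }
}

// Buffered path: ORDER BY needs every row before the first one can be emitted.
async function* orderedQuery(source, compare, limit) {
  const buffer = []
  for await (const row of source) buffer.push(row)
  buffer.sort(compare)
  yield* buffer.slice(0, limit)
}

async function collect(gen) {
  const out = []
  for await (const row of gen) out.push(row)
  return out
}

async function demo() {
  rowsRead = 0
  const streamed = await collect(streamQuery(scan(1000000), r => r.score > 5, 3))
  const streamReads = rowsRead
  rowsRead = 0
  const ordered = await collect(orderedQuery(scan(1000), (a, b) => b.score - a.score, 3))
  const bufferedReads = rowsRead
  return { streamed, streamReads, ordered, bufferedReads }
}

const demoResult = demo()
demoResult.then(r => console.log('reads:', r.streamReads, r.bufferedReads))
```

<p>In this sketch the streaming query stops after reading 7 rows of a million-row source, while the ordered query has to read all 1,000 rows of its smaller source before it can emit anything.</p>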

<h3 id="query-processing">Query processing</h3>

<p>Squirreling parses SQL into an AST and executes against the AST directly without a separate planning phase. This simple system avoids heavy planning overhead and allows execution to stay incremental and async. It fits the browser environment, where responsiveness matters more than deep cost-based optimization.</p>

<p>This AST-driven, async, late-materialization model applies even to complex queries. Joins are executed directly from the AST and yield an async, streaming result. They can stop early when a LIMIT is applied, and they don’t force evaluation of columns. Columns are treated independently and evaluated lazily, so expensive columns are deferred until they are required.</p>
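
<p>Executing against an AST directly can be as simple as a recursive, async walk over the nodes. The node shapes and the <code>evaluate</code> function below are invented for illustration and are not Squirreling’s real AST or executor.</p>

```javascript
// Hypothetical AST for: WHERE score > 3 AND status = 'open'
const where = {
  type: 'binary',
  op: 'AND',
  left: {
    type: 'binary',
    op: '>',
    left: { type: 'column', name: 'score' },
    right: { type: 'literal', value: 3 },
  },
  right: {
    type: 'binary',
    op: '=',
    left: { type: 'column', name: 'status' },
    right: { type: 'literal', value: 'open' },
  },
}

// Walk the AST directly for each row; cells are async, so evaluation is too.
async function evaluate(node, row) {
  switch (node.type) {
    case 'literal':
      return node.value
    case 'column':
      return row[node.name]() // a lazy cell is only computed at this point
    case 'binary': {
      const left = await evaluate(node.left, row)
      if (node.op === 'AND') {
        // Short-circuit: the right side is never evaluated when the left is false.
        return left ? Boolean(await evaluate(node.right, row)) : false
      }
      const right = await evaluate(node.right, row)
      if (node.op === '>') return left > right
      if (node.op === '=') return left === right
      throw new Error('unsupported operator: ' + node.op)
    }
    default:
      throw new Error('unknown node type: ' + node.type)
  }
}

// Rows with lazy cells; statusReads counts how often the status cell is computed.
let statusReads = 0
const readStatus = value => async () => {
  statusReads += 1
  return value
}
const astRows = [
  { score: async () => 5, status: readStatus('open') },
  { score: async () => 1, status: readStatus('open') },
  { score: async () => 4, status: readStatus('closed') },
]

async function filterRows(rows, ast) {
  const kept = []
  for (const row of rows) {
    if (await evaluate(ast, row)) kept.push(row)
  }
  return kept
}

const matches = filterRows(astRows, where)
matches.then(kept => console.log(kept.length, 'match;', statusReads, 'status reads'))
```

<p>Because there is no separate plan to build, the first row can be evaluated as soon as parsing finishes, and the short-circuit on <code>AND</code> means the <code>status</code> cell is skipped for the row whose <code>score</code> already fails.</p>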

<h3 id="module-footprint">Module footprint</h3>

<p>Squirreling is written in pure JavaScript with zero runtime dependencies. The complete library, consisting of the parser, executor, and all built-in functions, comes to ~9 KB (minified and gzipped). That’s roughly 500x smaller than DuckDB-Wasm’s 4.5 MB binary.</p>

<p>This small footprint offers the following benefits:</p>

<ul>
  <li><strong>Instant startup:</strong> Because there’s no WebAssembly compilation delay, Squirreling is ready to execute queries immediately after the JavaScript loads.</li>
  <li><strong>Embeddability:</strong> Squirreling can be bundled into applications with almost no impact on size.</li>
  <li><strong>Edge deployment:</strong> Squirreling’s small size enables deployment in serverless edge functions or service workers — environments that typically can’t host databases at all.</li>
</ul>

<h2 id="squirreling-a-browser-native-sql-engine">Squirreling: a browser-native SQL engine</h2>

<p>These architectural choices follow logically from treating the browser as the primary execution environment for interactive exploration. The result is a browser-native SQL engine with a set of properties that shape how queries run:</p>

<ul>
  <li><strong>Immediate, incremental query execution in the browser.</strong> Queries start producing results as soon as execution begins, with rows and cells streaming as they’re ready. Users can inspect partial output, refine queries, or stop execution without blocking the UI or waiting for full execution.</li>
  <li><strong>Explicit control over when expensive work runs.</strong> Columns are evaluated only when referenced, and LIMIT and ORDER BY act as cost controls as well as query clauses. Expensive operations run only when required, which allows execution to stop early to avoid unnecessary computation.</li>
  <li><strong>Lightweight, backend-free exploration over asynchronous data.</strong> Squirreling runs entirely client-side as an open-source library, with no backend, account setup, or loading flows. It supports interactive querying over asynchronous data sources, including cloud-native formats like Parquet, in a footprint of ~9 KB (minified and gzipped).</li>
</ul>

<p>Squirreling is available as an open-source library here: <a href="https://github.com/hyparam/squirreling">github.com/hyparam/squirreling</a></p>

<h2 id="references">References</h2>

<p>[1] Abadi, D. J., Myers, D. S., DeWitt, D. J., &amp; Madden, S. R. (2007). <a href="https://www.cs.umd.edu/~abadi/papers/abadiicde2007.pdf">Materialization Strategies in a Column-Oriented DBMS</a>. In <em>Proceedings of the 23rd International Conference on Data Engineering (ICDE)</em> (pp. 466–475). IEEE.</p>]]></content><author><name>Kenny Daniel</name></author><category term="engineering" /><category term="open-source" /><category term="SQL-engine" /><category term="browser-native" /><category term="interactive-data-exploration" /><category term="WebAssembly" /><category term="JavaScript" /><summary type="html"><![CDATA[Squirreling is a browser-native SQL engine built for interactive data exploration. Run async queries, stream results instantly, and avoid backend setup.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.hyperparam.app/assets/images/banner-squirreling.jpg" /><media:content medium="image" url="https://blog.hyperparam.app/assets/images/banner-squirreling.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Why you’re missing issues in your LLM chat logs</title><link href="https://blog.hyperparam.app/missing-issues-llm-chat-logs/" rel="alternate" type="text/html" title="Why you’re missing issues in your LLM chat logs" /><published>2025-12-16T09:30:00+00:00</published><updated>2025-12-16T09:30:00+00:00</updated><id>https://blog.hyperparam.app/missing-issues-llm-chat-logs</id><content type="html" xml:base="https://blog.hyperparam.app/missing-issues-llm-chat-logs/"><![CDATA[<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What makes LLM chat logs harder to analyze than other types of AI data?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "LLM chat logs combine long-form text, multi-turn conversations, and inconsistent structures that don't fit traditional table-based workflows. This makes it difficult to map issues and patterns across the full dataset using slice-based or metadata-first methods."
      }
    },
    {
      "@type": "Question",
      "name": "Why do subtle LLM failures often appear only at large scale?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Many LLM behaviors like sycophancy or inconsistent answers show up sparsely across thousands of conversations. They don't form clear patterns until you analyze the dataset broadly."
      }
    },
    {
      "@type": "Question",
      "name": "How can teams reduce blind spots when working with massive unstructured logs?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Teams can reduce blind spots by shifting from slice-based inspection to dataset-level reasoning, using methods that allow them to query, compare, and evaluate issues across the entire dataset instead of isolated rows."
      }
    }
  ]
}
</script>

<h1 id="youre-missing-critical-issues-in-your-llm-chat-logs">You’re missing critical issues in your LLM chat logs</h1>

<p>Debugging issues like sycophancy or tone shifts in large LLM chat logs usually starts the same way. Someone flags a problem, and suddenly you’re the engineer staring at hundreds of thousands of rows trying to figure out what went sideways. Your boss wants answers, and the dataset is huge. So you pull a small sample, send it through another LLM to score for sycophancy, and check to see whether the scoring prompt actually captures what you care about. That quick loop works for the iteration phase, but it never tells you how often the issue appears or what triggers it across the full dataset.</p>

<p>LLM chat logs become harder to reason about at scale because the issues you care about are distributed across tens or even hundreds of thousands of lines of text. Chatbot logs consist of multi-GB text files. In this <a href="https://blog.hyperparam.app/explore-massive-datasets-with-hyperparam/">deluge of unstructured data</a>, what matters is finding the important 1% of failures that are relevant to the challenge you’re working on. Most teams start by sampling because it’s the fastest way to inspect a few examples and test their scoring logic. But sampling only shows fragments of the behavior, and there’s no guarantee those fragments reflect the full picture.</p>

<h1 id="key-takeaways">Key takeaways</h1>

<ul>
  <li>LLM chat logs hide important behaviors because the issues you care about are distributed across thousands of conversations. Debugging becomes slower and less reliable when you can’t query across the full dataset.</li>
  <li>Sampling is an essential first step, but it only surfaces fragments of the behavior and can’t reveal frequency, triggers, or context across the entire dataset.</li>
  <li>Reasoning across the entire multi-GB dataset is becoming essential for accurate LLM behavior analysis.</li>
</ul>

<h1 id="table-of-contents">Table of contents</h1>

<ul>
  <li>Traditional debugging breaks down with the scale of LLM chat logs</li>
  <li>The issues you know exist but can’t quantify in sampled logs</li>
  <li>What happens when issues stay hidden in your LLM logs</li>
  <li>The future of LLM debugging depends on reasoning across multi-GB datasets</li>
</ul>

<h1 id="traditional-debugging-breaks-down-with-the-scale-of-llm-chat-logs">Traditional debugging breaks down with the scale of LLM chat logs</h1>

<p>Traditional debugging workflows were built for structured tables, not multi-GB datasets of AI-scale text. They’re designed for predictable schemas, uniform rows, and fields you can sort, filter, or compute against. They usually rely on sampling or slice-based queries because they’re the fastest ways to inspect a few examples.</p>

<p>But the new world we live in consists of huge piles of unstructured text data: LLM chat logs that don’t behave that way. A single row can contain a full conversation, a long reasoning chain, or text that spans hundreds of tokens with no consistent structure. Engineers still catch individual failures like an instance of sycophancy or a strange tone shift, but locating those issues in massive logs often requires digging through isolated rows manually. And because conventional SQL or Python workflows weren’t built to analyze unstructured, conversational text across large datasets, they don’t help you map how often an issue occurs, what triggers it, or whether it’s part of a pattern that repeats across thousands of conversations.</p>

<p>So the question becomes: am I seeing the full picture here, or is my view skewed because traditional methods aren’t built to query massive unstructured datasets?</p>

<h1 id="the-issues-you-know-exist-but-cant-quantify-in-sampled-logs">The issues you know exist but can’t quantify in sampled logs</h1>

<p>Understanding the issues in your AI systems almost always comes from actual use, whether through using the system yourself (dogfooding) or through listening to reports from your users. In many cases, you might have a general sense of the issues that exist, like sycophancy, unexpected tone shifts, or two conversations that answer the same question differently. But with massive logs like these, the underlying behavior often only shows up when issues are viewed across the entire dataset.</p>

<p>Some issues only make sense when you see how often they appear or what triggers them:</p>

<ul>
  <li>Sycophancy triggered by specific phrasing that appears sporadically in large LLM chat logs</li>
  <li>Logical inconsistencies that surface only in longer, multi-turn threads</li>
  <li>Conflicting answers that only become obvious when similar queries appear in different regions or contexts</li>
</ul>

<p>These issues aren’t rare, but they’re distributed thinly across the dataset, and that distribution makes them nearly impossible to see without dataset-level querying. A sample shows you symptoms, but only inspecting the entire dataset reveals the scope, frequency, or context that define the real pattern. And you can’t query for this kind of thing with SQL. You need something that understands natural language.</p>

<h1 id="what-happens-when-issues-stay-hidden-in-your-llm-logs">What happens when issues stay hidden in your LLM logs</h1>

<p>When you can’t query across the full dataset, you lose the ability to judge the scope or conditions of an issue. This can result in:</p>

<ul>
  <li>Issues surfacing late, often only after a user reports something unexpected.</li>
  <li>Teams working on fixes that don’t solve the real problem because the root cause wasn’t visible.</li>
  <li>The possibility of updates shipping with issues no one noticed.</li>
  <li>Debugging that focuses on the model when the issue actually lives in prompts, context windows, or specific conversational patterns.</li>
</ul>

<p>These blind spots might start out small, but they can expand quickly and slow down debugging, pushing teams into reactive rather than proactive work.</p>

<h1 id="the-future-of-llm-debugging-depends-on-reasoning-across-multi-gb-datasets">The future of LLM debugging depends on reasoning across multi-GB datasets</h1>

<p>As datasets grow, the limits of slice-based inspection become harder to ignore. Issues in LLM chat logs emerge from patterns that spread across thousands of conversations, across different regions and varying prompts. And with multi-GB datasets now being the norm, finding issues and understanding the patterns behind them requires reasoning across the full dataset, not just the fragments that appear inside a sample.</p>

<p>If you work with LLM chat logs, you can <a href="https://hyperparam.app/">try the Hyperparam app</a> for a faster way to explore and query large datasets. It’s free while in beta.</p>

<h1 id="faq">FAQ</h1>

<p><strong>What makes LLM chat logs harder to analyze than other types of AI data?</strong></p>

<p>LLM chat logs combine long-form text, multi-turn conversations, and inconsistent structures that don’t fit traditional structured data workflows. This makes it difficult to map issues and patterns across the full dataset using query-based or search-based methods.</p>

<p><strong>Why do subtle LLM failures often appear only at a large scale?</strong></p>

<p>No one knows how to make good evals. So issues tend to surface from actually deploying and using AI models. The issues that surface only become apparent over time, and are often subtle behavior issues like <a href="https://www.anthropic.com/research/towards-understanding-sycophancy-in-language-models">sycophancy</a>. They don’t form clear patterns until you analyze the dataset broadly.</p>

<p><strong>What happens when issues stay hidden in your LLM logs?</strong></p>

<p>In my experience, inability to look deeply at LLM chat log data has resulted in:</p>

<ul>
  <li>Issues surfacing late, often only after a user reports something unexpected.</li>
  <li>Teams working on fixes that don’t solve the real problem because the root cause wasn’t visible.</li>
  <li>Shipping updates with issues no one noticed.</li>
  <li>Debugging that focuses on the model when the issue actually lives in prompts, tools, and context windows.</li>
</ul>

<p>These blind spots might start out small, but they can expand quickly and slow down debugging, pushing teams into reactive rather than proactive work.</p>

<p><strong>How can teams reduce blind spots when working with massive unstructured logs?</strong></p>

<p>Teams can reduce blind spots by shifting from spot-checking to dataset-level reasoning, using AI assistance that allows them to query, compare, and evaluate issues across the entire dataset instead of isolated rows.</p>]]></content><author><name>Kenny Daniel</name></author><category term="AI" /><category term="debugging" /><category term="LLM-chat-log" /><category term="chatbot-logs" /><category term="LLM-debugging" /><category term="AI-scale-text" /><category term="reasoning-across-multi-GB-datasets" /><summary type="html"><![CDATA[Massive LLM chat logs hide issues and patterns that samples can't show. See why full-dataset visibility matters and explore Hyperparam for free.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.hyperparam.app/assets/images/banner-missing-issues-llm-logs.jpg" /><media:content medium="image" url="https://blog.hyperparam.app/assets/images/banner-missing-issues-llm-logs.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Explore massive datasets with the Hyperparam AI tool</title><link href="https://blog.hyperparam.app/explore-massive-datasets-with-hyperparam/" rel="alternate" type="text/html" title="Explore massive datasets with the Hyperparam AI tool" /><published>2025-11-19T14:00:00+00:00</published><updated>2025-11-19T14:00:00+00:00</updated><id>https://blog.hyperparam.app/explore-massive-datasets-with-hyperparam</id><content type="html" xml:base="https://blog.hyperparam.app/explore-massive-datasets-with-hyperparam/"><![CDATA[<h2 id="meet-the-hyperparam-ai-tool-for-massive-datasets">Meet the Hyperparam AI tool for massive datasets</h2>

<h3 id="the-first-of-its-kind-interactive-ui-for-navigating-and-improving-llm-scale-datasets">The first-of-its-kind interactive UI for navigating and improving LLM-scale datasets</h3>

<p>AI runs on data. Massive amounts of it. On one side you’re training models on large amounts of text, and once deployed, these models constantly produce mountains of AI text data. The entire lifecycle of AI is massive data in and even more data out. Between April 2024 and April 2025, the volume processed by Google’s AI products alone went from roughly 9.7 trillion tokens per month to more than 480 trillion. That’s almost a 50x increase in just one year, and it is rapidly approaching 1 quadrillion tokens per month.</p>

<p>However, none of the tools that currently exist are built to work with massive, planet-sized balls of unstructured text. Notebooks, SQL engines, and data visualizers all assume something smaller and more structured than what we actually deal with today.</p>

<p>If we want to keep advancing with AI, we need solutions that let us explore and understand AI data at the speed at which it’s produced. And that’s why Hyperparam exists. The Hyperparam AI tool, a browser-native application built specifically for this environment, lets you explore and transform massive datasets in real time so you can understand and improve your AI datasets.</p>

<h2 id="key-takeaways">Key takeaways</h2>

<ul>
  <li>AI-scale datasets grow faster than traditional tools can handle, leaving teams unable to understand their own data.</li>
  <li>The Hyperparam AI tool pairs a high-speed browser engine with an army of AI agents and natural language analysis to make AI-scale data workable.</li>
  <li>You can explore and refine massive unstructured datasets in real time without waiting.</li>
  <li>One person can triage issues like sycophancy or hallucinations across tens of thousands of rows inside a single browser tab.</li>
</ul>

<h2 id="table-of-contents">Table of contents</h2>

<ul>
  <li>Teams are drowning in overwhelming amounts of unstructured data</li>
  <li>AI is only useful when the UI can keep up</li>
  <li>How the Hyperparam AI agents accelerate real data work</li>
  <li>Hyperparam keeps the human in the loop</li>
</ul>

<h2 id="teams-are-drowning-in-overwhelming-amounts-of-unstructured-data">Teams are drowning in overwhelming amounts of unstructured data</h2>

<p>Every company building with AI now sits on more text than any team can realistically examine. Chat logs, model outputs, product interactions, and support conversations all contain valuable intelligence on how a company’s AI is performing.</p>

<p>But AI data accumulates faster than humans can review or understand. In Q3 2025 alone, Azure’s AI services processed over 100 trillion tokens. Even small teams wind up with tens of thousands of rows overnight, and the rate of growth only accelerates as AI proliferates across more companies and industries.</p>

<p>Traditional tools to help businesses understand their data often rely heavily on the data being structured and accessed via SQL or other structured query languages. But the “signals” in AI data — e.g. did the model hallucinate, did the model ask for clarification, did the user get frustrated — exist fuzzily in text, not in an easy-to-access column. The information to learn from is in the data, but there is no way with traditional tools to access it for any kind of dataset analysis or debugging.</p>

<p>With the pace of AI, that gap only compounds. The more data you produce, the less equipped you are to do anything meaningful with it. The result is a backlog of unknowns that keeps growing while your ability to understand it stays flat.</p>

<h2 id="ai-is-only-useful-when-the-ui-can-keep-up">AI is only useful when the UI can keep up</h2>

<p>Ironically, our hypothesis is that AI can help you understand your AI data, but only if the interface makes that possible. AI models can fuzzily extract, transform, label, and filter information for you, but none of that matters when the surrounding tools choke the moment you hit real-world dataset volumes. For example, ChatGPT can help you understand whether your AI is hallucinating, but you can’t load more than a few dozen chat logs at a time. Traditional data viewers, even augmented with AI, can’t display more than a few thousand rows instantly. Custom notebooks could be built to use AI, but they would require scalable infrastructure to run over the entire dataset.</p>

<p>The Hyperparam AI tool solves this problem. It’s the first tool that makes AI usable at dataset scale by pairing two things that have never existed side by side:</p>

<ul>
  <li><a href="https://blog.hyperparam.app/browser-based-tools-will-reshape-ai/">Browser-native speed</a> that streams and renders massive unstructured datasets instantly</li>
  <li>A host of AI agents that act like a Swiss Army knife for your data, enabling you to score, label, categorize, and filter rows using natural language</li>
</ul>

<p>Because the interface is fast enough to keep up, the AI insights become actionable. You can generate columns, score for sentiment, and filter results in real time, all without waiting or guessing. In short, everything clicks into place: the combination of a high-speed UI and Hyperparam’s AI agents gives us the first tool designed to explore and understand AI-scale data and support real LLM dataset debugging.</p>

<h2 id="how-the-hyperparam-ai-agents-accelerate-real-data-work">How the Hyperparam AI agents accelerate real data work</h2>

<p>Once the interface is fast enough to keep up with the data, the AI layer turns into a genuine workflow upgrade. The browser engine handles the scale, the model does the hard work of reading through the thousands of rows of text data, and you stay in charge of the decisions. The model scores every row, creates new columns, surfaces issues, and points out strange behavior you might not notice on your own. You explore and validate the results in real time because nothing stalls or blocks you.</p>

<p>Take something as simple as triaging chatbot sycophancy and releasing a new prompt to correct sycophantic behavior. In the Hyperparam chat, you can ask Hyperparam to score every conversation for sycophancy, sort the entire dataset, filter to the outliers, and transform sycophantic results into desired behaviors for evaluations. Then you can try out different prompts, check the responses, and iterate until you have a prompt performing well on your corrected evaluation. You can even export this evaluation to use it later. And you can do this all singlehandedly inside one browser tab.</p>

<h2 id="the-hyperparam-ai-tool-keeps-the-human-in-the-loop">The Hyperparam AI tool keeps the human in the loop</h2>

<p>Large language models can help score conversations or pinpoint odd behavior, but they can’t work through AI-scale datasets on their own. Hyperparam overcomes that limitation by pairing a high-speed browser engine with an army of AI agents that support the parts of the workflow where natural language actually adds value. You move through the data instantly, and the model helps you understand what you’re seeing without ever taking over the decisions.</p>

<p>This setup keeps the judgment where it belongs: with you, the human expert. We believe strongly that <a href="https://cloud.google.com/discover/human-in-the-loop">human-in-the-loop</a> is the only way to work responsibly with AI. You decide how far to trust a score or when a prompt needs refinement. The UI makes the dataset feel lightweight and the AI does the heavy lifting, but every decision runs through your expert eye.</p>

<p>If you work with AI data, <a href="https://hyperparam.app/">try the Hyperparam AI tool</a> for a faster way to inspect, debug, and refine massive datasets. It’s free while it’s in beta.</p>]]></content><author><name>Kenny Daniel</name></author><category term="product" /><category term="AI" /><category term="data-transformation" /><category term="Hyperparam" /><category term="AI-workflows" /><category term="AI-data" /><category term="browser-native" /><category term="startup" /><summary type="html"><![CDATA[The Hyperparam AI tool lets you inspect and refine massive datasets in real time. Move faster through your AI data and try it free during beta.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.hyperparam.app/assets/images/banner-big-llm-data.jpg" /><media:content medium="image" url="https://blog.hyperparam.app/assets/images/banner-big-llm-data.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Lessons from Hyperparam’s Year of Open-Source Data Transformation</title><link href="https://blog.hyperparam.app/lessons-from-open-source-data-transformation/" rel="alternate" type="text/html" title="Lessons from Hyperparam’s Year of Open-Source Data Transformation" /><published>2025-11-12T08:00:00+00:00</published><updated>2025-11-12T08:00:00+00:00</updated><id>https://blog.hyperparam.app/lessons-from-open-source-data-transformation</id><content type="html" xml:base="https://blog.hyperparam.app/lessons-from-open-source-data-transformation/"><![CDATA[<h4 id="i-sat-down-with-my-former-cofounder-kenny-daniel-to-talk-about-his-new-startup-hyperparam">I sat down with my former cofounder, Kenny Daniel, to talk about his new startup Hyperparam</h4>

<p>Hyperparam is an AI-powered data transformation tool that lets users and an army of AI agents look at, transform, score, and filter massive datasets instantly. As Kenny puts it, “It’s like a Swiss Army knife for your data.” It’s built on an ecosystem of open source data transformation libraries that power its paid app, which delivers the full Hyperparam experience.</p>

<p>Unlike most products that start with an enterprise focus and chase a single proof of concept, Kenny, as I’ve always known him to do, chose his own path. His thesis: Starting from open source is a better, faster way to build a product. In this interview, he shares his take on the open source community, product development in the new world of AI, and how Hyperparam took an intentional approach to open and closed source development.</p>

<h1 id="key-takeaways">Key Takeaways:</h1>

<ul>
  <li>Open source development provides faster, more authentic product feedback than traditional enterprise development.</li>
  <li>Hugging Face’s adoption of Hyperparam’s libraries (HyLlama and Hyparquet) validated the browser-native approach and its value for large-scale AI workflows.</li>
  <li>Minimalism in engineering, or building without dependencies, creates faster, more reliable software.</li>
  <li>Building tools you want to use yourself often leads to creating products others didn’t realize they needed.</li>
</ul>

<h1 id="table-of-contents">Table of Contents:</h1>

<ul>
  <li><a href="#why-hyperparam-went-open-source">Why Hyperparam Went Open Source</a></li>
  <li><a href="#hyperparam-vs-data-visualization-tools">Hyperparam vs. Data Visualization Tools</a></li>
  <li><a href="#how-hugging-face-validated-hyperparams-browser-native-approach">How Hugging Face Validated Hyperparam’s Browser-Native Approach</a></li>
  <li><a href="#whats-open-source-and-whats-product-at-hyperparam">What’s Open Source and What’s Product at Hyperparam</a></li>
  <li><a href="#ai-workflows-that-make-large-scale-data-transformation-faster">AI Workflows That Make Large-Scale Data Transformation Faster</a></li>
  <li><a href="#why-minimalism-drives-hyperparams-engineering-philosophy">Why Minimalism Drives Hyperparam’s Engineering Philosophy</a></li>
  <li><a href="#advice-for-developers-exploring-open-source-projects">Advice for Developers Exploring Open Source Projects</a></li>
</ul>

<h1 id="why-hyperparam-went-open-source">Why Hyperparam Went Open Source</h1>

<p><strong>At our previous company, Algorithmia, we didn’t go open source. What made you decide to do it differently this time?</strong></p>

<p>I built Hyperparam as the <a href="https://hyperparam.app/">data transformation tool</a> I wished I had because there wasn’t one that met my criteria. The first version of Hyperparam was a simple browser-based data viewer for Parquet files with some simple data transformation tools. The majority of large datasets for AI training and monitoring are Parquet files, and I just wanted to look at the data and play around with it. But I didn’t think people would pay for a Parquet viewer, so I doubted I’d be shooting myself in the foot by giving it away. If anything, I was going to get all the benefits of usage in the community. So I just put it out there without promoting it.</p>

<p>One of the most compelling arguments for open source is that I think it’s fundamentally a better way to build a product and get feedback. If you start building a product straight for the enterprise, it’s a recipe for disaster. You start asking the wrong questions, like, “How do we fit into their workflows?” rather than asking, “How do we build a product that would see organic adoption?” With open source, people use your software if it’s useful, and if it’s not, they don’t. That’s an incredibly valuable signal.</p>

<h1 id="hyperparam-vs-data-visualization-tools">Hyperparam vs. Data Visualization Tools</h1>

<p><strong>Though you can use Hyperparam to view large datasets, you describe it as a data transformation tool. What makes Hyperparam different from data visualizers?</strong></p>

<p>Hyperparam lets you instantly view, explore, and transform millions of rows of data, all through a user-friendly, chat-based UI built for scale and usability. So it’s much more than a data visualizer; it’s a data transformation tool.</p>

<p>When I went looking for a data tool, I just wanted to open one dataset. <a href="https://jupyter.org/">Jupyter</a> was frustratingly slow. ChatGPT, VS Code, Copilot, and other assistants weren’t designed for interacting with massive datasets. And I quickly realized there wasn’t a single tool out there that let me look at any scaled dataset.</p>

<p>That led me on this path, and the question became: What does the interface look like for using AI across data? The answer is Hyperparam. It delivers the power of instant data transformation with the ease and nuance of natural language querying.</p>

<h1 id="how-hugging-face-validated-hyperparams-browser-native-approach">How Hugging Face Validated Hyperparam’s Browser-Native Approach</h1>

<p><strong>Hugging Face is just one of the organizations that started using your libraries HyLlama and Hyparquet. What was the significance of that moment?</strong></p>

<p>Hugging Face’s adoption of my libraries was a huge validation of my idea of moving more AI workflows to the browser. It was the strongest market signal I’d had up to that point, and it made me start thinking about what else we could build with these components.</p>

<p>For context, Hugging Face is the world’s repository for open models and open data. They use multi-gigabyte files in llama.cpp format, and they wanted to let users simply get the metadata instead of downloading the entire file. HyLlama does exactly that: it pulls the metadata and provides the info the user needs, saving bandwidth, time, and disk space.</p>

<p>After Hugging Face integrated HyLlama into their website, they started looking into Hyparquet. When they realized it offered many of the same benefits for data, they started integrating it as well. And because they support OSS in general and had adopted our libraries, they gave us a substantial open-source grant, which was a great honor.</p>

<h1 id="whats-open-source-and-whats-product-at-hyperparam">What’s Open Source and What’s Product at Hyperparam</h1>

<p><strong>You’re launching a paid version of Hyperparam soon. What’s open source and what’s part of the product?</strong></p>

<p>I’ve already open sourced the parts of Hyperparam that have shared value to the community, and I’ve kept the full product experience (including the AI workflows) within the paid app.</p>

<p>Hyparquet, HighTable, and HyLlama are some of the libraries we’ve released. They’re building blocks that help others explore data in the browser, and they also power what we’re building internally. My belief is that connectors, frontend components, writers, readers, and other “glue” components should be open source. They’re globally useful beyond what I’m building and should be shared by the community. But on their own, they’re not the Hyperparam product.</p>

<p>Beyond just thinking about what is useful for the community, there are a few other upsides to open sourcing components. First, I get to control how the component is optimized and designed, and I can make sure it’s designed to work well with Hyperparam. Second, and probably most importantly, there’s the community of developers invested in these components. Approximately a dozen people contributed code to Hyparquet and HighTable, and even more filed bugs that I subsequently fixed. Giving away components doesn’t diminish value; it amplifies it through feedback, goodwill, and contributions.</p>

<p>Now, when thinking about the paid product, the AI components and the core user experience are proprietary. I care deeply how users flow through my product, and I need to own that experience because I don’t think anyone else can build it correctly. That’s a bit of a cocky statement, but my team is a select group of people obsessed with the overall data experience and how the AI should work.</p>

<h1 id="ai-workflows-that-make-large-scale-data-transformation-faster">AI Workflows That Make Large-Scale Data Transformation Faster</h1>

<p><strong>What kind of AI workflows can you do with Hyperparam?</strong></p>

<p>Hyperparam is a general purpose data transformation tool that enables you to do multiple things with your data, so it’s easiest to give an example.</p>

<p>Let’s say you’re a company that’s deploying a chatbot to your users, and a user files a ticket. Your support team needs to dive into the data to understand what happened, whether it’s sycophancy or some other issue. With Hyperparam, you can apply an LLM-generated score to every row in your dataset, look at the values, filter out the bad ones, transform them into something better, export the results, and just continue with your workflow.</p>

<p>That’s just one example of what Hyperparam can do. In addition to applying AI scores and sorting, filtering, and searching based on those scores, you can ask natural-language questions about your data, for example: “Rate every chatbot conversation for sycophancy,” “Did the user seem satisfied?” or “Was a conclusion reached?” You can categorize, tag, and explore your dataset in ways that were never possible before.</p>

<p>You can also run experiments: import historical data, tweak prompts, compare models, and see how the outputs change. It’s a deep research workflow that lets one person do what used to take an entire team.</p>

<h1 id="why-minimalism-drives-hyperparams-engineering-philosophy">Why Minimalism Drives Hyperparam’s Engineering Philosophy</h1>

<p><strong>What’s your core engineering philosophy and how did open source support that?</strong></p>

<p>One of my fundamental engineering principles is to take no dependencies. I feel very strongly about not building a huge stack of dependent software, which is something you see a lot of in JavaScript. Because Hyperparam started as a passion project and it’s open source, I could optimize purely for the function that I cared about. That’s not necessarily how it would have gone at a company.</p>

<p>That simplification reminds me of SpaceX’s Raptor engine. Each iteration of the Raptor keeps getting simpler… yet more powerful. I wanted to do the same for software. It’s an aesthetic choice, but it also influences the architecture and engineering. With an obsession over engineering and product, you can build minimal software, and that’s what Hyperparam is. I built it from the ground up depending on nothing else. That’s why it’s as small, light, and fast as it is.</p>

<h1 id="advice-for-developers-exploring-open-source-projects">Advice for Developers Exploring Open Source Projects</h1>

<p><strong>What’s your advice for developers starting out with open source?</strong></p>

<p><a href="https://blog.hyperparam.app/machine-learning/product/2025/01/21/browser-based-tools-will-reshape-ai/">Build the product you want to use yourself</a>. If you have to solely rely on other people to tell you if what you are building is useful, your iteration cycles will be slow and painful. This advice might run contrary to conventional startup wisdom, which says you should assume you know nothing, talk to a hundred customers, and then build to solve their problem. That’s viable, but it’s not the only way to create something meaningful.</p>

<p>To build a certain kind of product, you need more vision and aesthetic opinion. In open source, you’ll see these shining monuments to technology, and why? Because someone cared enough to make them both functional and beautiful. Hyperparam is the data transformation tool I wished existed. I’m building for an audience of one: myself. When you build something you want to use yourself, you often end up building something others didn’t realize they needed.</p>]]></content><author><name>Diego Oppenheimer</name></author><category term="open-source" /><category term="product" /><category term="AI" /><category term="open-source" /><category term="data-transformation" /><category term="Hyperparam" /><category term="Hyparquet" /><category term="HighTable" /><category term="HyLlama" /><category term="community" /><category term="product-development" /><category term="AI-workflows" /><category term="browser-native" /><category term="startup" /><category term="engineering-philosophy" /><category term="interview" /><summary type="html"><![CDATA[Hyperparam's founder explains what a year of open source data transformation taught him about balancing community and development in the AI era.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.hyperparam.app/assets/images/banner-interview.jpg" /><media:content medium="image" url="https://blog.hyperparam.app/assets/images/banner-interview.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Simulated Personas, Real Insights: Using Snowglobe and Hyperparam to Stress-Test Conversational AI</title><link href="https://blog.hyperparam.app/simulated-personas/" rel="alternate" type="text/html" title="Simulated Personas, Real Insights: Using Snowglobe and Hyperparam to Stress-Test Conversational AI" /><published>2025-10-15T07:10:00+00:00</published><updated>2025-10-15T07:10:00+00:00</updated><id>https://blog.hyperparam.app/simulated-personas</id><content type="html" 
xml:base="https://blog.hyperparam.app/simulated-personas/"><![CDATA[<h2 id="testing-our-conversations-before-we-go-live">Testing our Conversations Before we Go Live</h2>

<p>Hyperparam is building an AI-assisted tool for working with large text datasets. The product includes a viewer for Parquet (and CSV and JSONL) datasets, and a data assistant chat. Before launching the product, we wanted to anticipate problems that might crop up.</p>

<p>The fastest way there was to generate simulated data: realistic, diverse, edge-case conversations that exposed how our data agent behaved across user types and intents. Once we could generate this data, we had a quick way to interrogate the simulated dataset, slicing, flagging, and transforming the conversations into a usable dataset for follow-on fine-tuning (or continued analysis).</p>

<h3 id="our-setup-snowglobe-for-simulation--hyperparam-for-exploration">Our setup: <a href="https://snowglobe.so/">Snowglobe</a> for simulation + <a href="https://hyperparam.app">Hyperparam</a> for exploration</h3>

<p>Plan:</p>

<ol>
  <li>
    <p><strong>Simulate</strong> 10 realistic personas and conversations using Snowglobe.</p>
  </li>
  <li>
    <p><strong>Explore and transform</strong> that dataset interactively with Hyperparam.</p>
  </li>
  <li>
    <p><strong>Isolate</strong> conversations where the agent recommended using Python versus general analytical queries.</p>
  </li>
  <li>
    <p><strong>Prepare</strong> the subset for evaluation or fine-tuning.</p>
  </li>
</ol>

<h2 id="step-1-generate-synthetic-conversations-with-snowglobe"><strong>Step 1: Generate Synthetic Conversations with Snowglobe</strong></h2>

<p><a href="https://snowglobe.so/"><strong>Snowglobe</strong></a> is a simulation engine for conversational AI teams. You define who your users are, what they want, and how they behave. Snowglobe auto-generates thousands of realistic interactions with your model or API endpoint. Think of it as a load test for reasoning or dialogue, not just latency. Snowglobe uses the information from an application description to create data that’s useful for your specific app. For this blog post, we created an application with the system prompt from Hyperparam. It’s long, but the short version looks like: “This chatbot allows users to chat with their data, pulling out insights and statistics. The data looks like this: &lt;schema&gt;...&lt;/schema&gt;.”</p>

<h3 id="define-your-personas"><strong>Define Your Personas</strong></h3>

<p>In our example, we’re simulating users of <a href="https://hyperparam.app"><strong>Hyperparam</strong></a>, a data exploration tool. We want personas that mirror the real user base: data engineers, data analysts, and tinkerers with different levels of skill and temperament. To create personas like these, we can start with a “Simulation prompt”. For example, we can enter a simulation prompt like “Users are data engineers, scientists, and analysts ask questions about their data.”</p>

<p>This prompt produces personas like the following. They vary in objective, tone, and style.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">personas</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Hands-On</span><span class="nv"> </span><span class="s">Data</span><span class="nv"> </span><span class="s">Explorer"</span>
    <span class="na">description</span><span class="pi">:</span> <span class="s">Loves examples, learns by doing.</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Skeptical</span><span class="nv"> </span><span class="s">Analyst"</span>
    <span class="na">description</span><span class="pi">:</span> <span class="s">Double-checks every step, asks 'why'.</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Product</span><span class="nv"> </span><span class="s">Engineer"</span>
    <span class="na">description</span><span class="pi">:</span> <span class="s">Wants quick, applied answers.</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Aha</span><span class="nv"> </span><span class="s">Moment</span><span class="nv"> </span><span class="s">Seeker"</span>
    <span class="na">description</span><span class="pi">:</span> <span class="s">Prefers conceptual explanations.</span>
  <span class="s">...</span>
</code></pre></div></div>

<p><img src="/assets/images/personas/snowglobe-spatial.png" alt="Snowglobe Spatial View" /></p>

<h3 id="configure-conversation-scenarios"><strong>Configure Conversation Scenarios</strong></h3>

<p>This provides us with conversation templates tied to our product use cases:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">scenarios</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">topic</span><span class="pi">:</span> <span class="s2">"</span><span class="s">data</span><span class="nv"> </span><span class="s">analysis"</span>
    <span class="na">prompt</span><span class="pi">:</span> <span class="s2">"</span><span class="s">How</span><span class="nv"> </span><span class="s">can</span><span class="nv"> </span><span class="s">I</span><span class="nv"> </span><span class="s">explore</span><span class="nv"> </span><span class="s">this</span><span class="nv"> </span><span class="s">dataset</span><span class="nv"> </span><span class="s">for</span><span class="nv"> </span><span class="s">outliers?"</span>
  <span class="pi">-</span> <span class="na">topic</span><span class="pi">:</span> <span class="s2">"</span><span class="s">python</span><span class="nv"> </span><span class="s">vs</span><span class="nv"> </span><span class="s">sql"</span>
    <span class="na">prompt</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Should</span><span class="nv"> </span><span class="s">I</span><span class="nv"> </span><span class="s">use</span><span class="nv"> </span><span class="s">Python</span><span class="nv"> </span><span class="s">or</span><span class="nv"> </span><span class="s">write</span><span class="nv"> </span><span class="s">an</span><span class="nv"> </span><span class="s">analytic</span><span class="nv"> </span><span class="s">query?"</span>
</code></pre></div></div>

<p><img src="/assets/images/personas/snowglobe-table.png" alt="Snowglobe Table View" /></p>

<p>Snowglobe will orchestrate multi-turn dialogues between each persona and our model endpoint, generating text logs, metadata, and structured output (JSONL).</p>

<p>When the run completes, we have a dataset with 10 personas × 200 conversations each — 2000 total dialogues, complete with role labels, timestamps, and message-level metadata.</p>
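<p>As a rough sketch of what working with such an export might look like, the Python below parses a few JSONL lines and counts conversations per persona. The sample rows and the field names (<code class="language-plaintext highlighter-rouge">persona</code>, <code class="language-plaintext highlighter-rouge">topic</code>, <code class="language-plaintext highlighter-rouge">assistant_message</code>) are assumptions made for illustration, not Snowglobe’s documented output schema.</p>

```python
import json
from collections import Counter

# Hypothetical JSONL export: one conversation per line.
# Field names and values are assumed for illustration only.
SAMPLE_JSONL = """\
{"persona": "Skeptical Analyst", "topic": "data analysis", "assistant_message": "Sort by the score column."}
{"persona": "Skeptical Analyst", "topic": "python vs sql", "assistant_message": "You could run a Python script."}
{"persona": "Product Engineer", "topic": "data analysis", "assistant_message": "Filter to the top rows."}
"""


def load_conversations(text):
    """Parse JSONL text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def conversations_per_persona(rows):
    """Count how many conversations each persona produced."""
    return Counter(row["persona"] for row in rows)


rows = load_conversations(SAMPLE_JSONL)
counts = conversations_per_persona(rows)
print(counts)
```

<p>A quick count like this is a cheap sanity check that every persona produced roughly the number of conversations you expected before you start exploring the data.</p>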

<h2 id="step-2-explore-the-dataset-in-hyperparam"><strong>Step 2: Explore the Dataset in Hyperparam</strong></h2>

<p><a href="https://hyperparam.app"><strong>Hyperparam</strong></a> is an <strong>interactive, browser-based dataset explorer</strong> purpose-built for ML workflows. It opens local or remote Parquet, JSONL, and CSV files instantly and lets you filter, transform, and visualize data directly: no heavy Jupyter notebooks required.</p>

<p>Drop a file directly into the explorer:</p>

<video controls="">
  <source src="/assets/images/personas/drop.mp4" type="video/mp4" />
</video>

<p>Hyperparam renders the dataset in an interactive table. You can scroll through conversations, inspect columns like <code class="language-plaintext highlighter-rouge">persona</code>, <code class="language-plaintext highlighter-rouge">topic</code>, or <code class="language-plaintext highlighter-rouge">assistant_message</code>, and even preview message trees. Conversation view allows for easy visual exploration.</p>

<p><img src="/assets/images/personas/hyperparam-conversation-view.png" alt="Hyperparam Conversation View" /></p>

<p>One of the things we noticed right off the bat was that sometimes a user would ask a question and the model would suggest leaving the product and using a Python script instead. This is not what we want. Sometimes, though, the user asks a question that genuinely needs to be handled off-platform. What we’d really like to find are the conversations that <em>could</em> have been solved on-platform, but where the model recommended Python instead.</p>

<h2 id="step-3-flag-conversations-with-hyperparam"><strong>Step 3: Flag Conversations with Hyperparam</strong></h2>

<p>Now we want to detect when the assistant recommended Python code versus an analytic query in its replies.</p>

<p>In Hyperparam, you can do this with the Hyperparam chat: a prompt-based operation that adds a new computed column.</p>

<p><em>Prompt: “Add columns: flag when a model suggests the user run python themselves, or makes a general analytic-style query instead of a transformation or filtering like we expect.”</em></p>

<p>The data agent runs across all sample rows, and decides to create two new boolean columns:</p>

<ul>
  <li>suggested_user_python: true/false</li>
  <li>is_analytics_query: true/false</li>
</ul>

<p>This single operation converts raw chat logs into labeled data.</p>

<h2 id="step-4-filter-export-and-iterate"><strong>Step 4: Filter, Export, and Iterate</strong></h2>

<p>After inspecting the two new columns, we wanted to extract the samples where the model suggested using Python for something that was not a general analytics query. This is our proxy for things our data agent should be able to do but for some reason did not.</p>

<p>By visually inspecting the data we notice:</p>

<ul>
  <li>
    <p>“Hands-On Data Explorer” and “Skeptical Analyst” personas trigger Python examples more frequently.</p>
  </li>
  <li>
    <p>“Pragmatic Insight Seekers” get concise analytic answers.</p>
  </li>
  <li>
    <p>Conversations recommending Python also tend to have longer message chains (higher cognitive load).</p>
  </li>
</ul>

<p>The dataset of <code class="language-plaintext highlighter-rouge">[suggested_user_python=true &amp;&amp; is_analytics_query=false]</code> is good for further fine-tuning our data agent, as examples of where it should have offered an on-platform suggestion but did not.</p>
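<p>If you want to reproduce the same filter on an exported file outside the app, a minimal Python sketch might look like the following. Only the two boolean column names come from this post; the sample rows, the <code class="language-plaintext highlighter-rouge">id</code> field, and the helper names are hypothetical.</p>

```python
import json

# Hypothetical labeled rows. Only the two boolean column names
# (suggested_user_python, is_analytics_query) come from the post.
rows = [
    {"id": 1, "suggested_user_python": True, "is_analytics_query": False},
    {"id": 2, "suggested_user_python": True, "is_analytics_query": True},
    {"id": 3, "suggested_user_python": False, "is_analytics_query": False},
]


def missed_on_platform(rows):
    """Keep rows where Python was suggested for a non-analytics request."""
    return [
        row for row in rows
        if row["suggested_user_python"] and not row["is_analytics_query"]
    ]


def to_jsonl(rows):
    """Serialize rows as JSONL, one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows)


subset = missed_on_platform(rows)
print(to_jsonl(subset))
```

<p>Keeping the filter as a small pure function makes it easy to rerun on each new simulation batch and compare how the size of the subset changes between prompt versions.</p>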

<h2 id="step-5-why-this-workflow-matters"><strong>Step 5: Why This Workflow Matters</strong></h2>

<p>The combination of <strong>Snowglobe + Hyperparam</strong> closes a crucial loop for conversational AI teams:</p>

<table>
  <thead>
    <tr>
      <th>Stage</th>
      <th>Tool</th>
      <th>Outcome</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Simulation</strong></td>
      <td>Snowglobe</td>
      <td>Synthetic but realistic data across personas</td>
    </tr>
    <tr>
      <td><strong>Exploration</strong></td>
      <td>Hyperparam</td>
      <td>Fast, visual filtering and labeling</td>
    </tr>
    <tr>
      <td><strong>Transformation</strong></td>
      <td>Hyperparam</td>
      <td>LLM-assisted column creation</td>
    </tr>
    <tr>
      <td><strong>Iteration</strong></td>
      <td>Both</td>
      <td>Repeat, evaluate, fine-tune</td>
    </tr>
  </tbody>
</table>

<p>This pipeline lets teams:</p>

<ul>
  <li>
    <p>Build evaluation datasets <em>before</em> collecting real user data.</p>
  </li>
  <li>
    <p>Debug reasoning patterns in synthetic interactions.</p>
  </li>
  <li>
    <p>Scale up diverse conversational contexts without manual labeling.</p>
  </li>
  <li>
    <p>Quickly explore and interact with the data sets.</p>
  </li>
</ul>

<h2 id="closing-thoughts"><strong>Closing Thoughts</strong></h2>

<p>Simulation is the new data collection.</p>

<p>When you can generate, label, and filter conversation data quickly, interactively and with precision, you gain the power to test your agent’s reasoning loops and UX outcomes before they ever reach a customer.</p>

<p><strong>Snowglobe</strong> gives you the <em>synthetic user base</em>.
<strong>Hyperparam</strong> gives you the <em>interactive microscope</em>.</p>

<p><strong>Next steps:</strong></p>

<ul>
  <li>
    <p>Try <a href="https://snowglobe.so/">snowglobe.so</a> to generate your own synthetic conversations.</p>
  </li>
  <li>
    <p>Explore the data instantly with <a href="https://hyperparam.app/">hyperparam.app</a>.</p>
  </li>
</ul>]]></content><author><name>Diego Oppenheimer</name></author><category term="AI" /><category term="testing" /><category term="conversational-AI" /><category term="simulation" /><category term="synthetic-data" /><category term="testing" /><category term="Snowglobe" /><category term="Hyperparam" /><category term="personas" /><category term="data-exploration" /><category term="fine-tuning" /><category term="evaluation" /><category term="AI-stress-testing" /><category term="chatbot-testing" /><summary type="html"><![CDATA[Conversational AI stress-testing made simple. Generate synthetic conversations with Snowglobe and explore them interactively with Hyperparam.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.hyperparam.app/assets/images/banner-personas.jpg" /><media:content medium="image" url="https://blog.hyperparam.app/assets/images/banner-personas.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Hyparquet: The Quest for Instant Data</title><link href="https://blog.hyperparam.app/quest-for-instant-data/" rel="alternate" type="text/html" title="Hyparquet: The Quest for Instant Data" /><published>2025-07-24T14:00:00+00:00</published><updated>2025-07-24T14:00:00+00:00</updated><id>https://blog.hyperparam.app/quest-for-instant-data</id><content type="html" xml:base="https://blog.hyperparam.app/quest-for-instant-data/"><![CDATA[<p>I just wanted to build a javascript code model.</p>

<p>Following the common adage that “data quality determines model quality”, I did what every AI engineer does and tried to <em>look</em> at some training data hosted on Hugging Face. I did not care how; I just needed to see some data and interact with it: search around rows, sort, and otherwise get a feel for the data quality.</p>

<p>This is where my goal started to go off the rails… Most modern AI datasets are 10GB or more and are in parquet format – we’ll talk more about this later – which means you need to parse and open the file. No simple <code class="language-plaintext highlighter-rouge">less</code> would work. The most common tools to read parquet for easy viewing are pandas/polars and DuckDB. With some ChatGPT help, I was running the command to load the first 5 rows of data. As shown below, I sat there waiting… and waiting.</p>

<p><img src="/assets/images/hyparquet/notebook.gif" alt="Jupyter Notebook Slow" /></p>

<p>Modern data viewer tools take anywhere from 5 sec (DuckDB) to 57 sec (Pandas) to load just 10 rows of data. The HCI community largely agrees that the ideal time-to-first-interactivity is 500 ms <a href="https://idl.cs.washington.edu/files/2014-Latency-InfoVis.pdf">[Liu, Heer 2014]</a>. Why should that not hold for data? Why is it acceptable for data to take 20x longer to load than a webpage?</p>

<p>The rest of this blog describes my multi-month journey to hyper-optimize time-to-first-data for parquet files. I am still on this journey but, along the way, released <a href="https://github.com/hyparam/hyparquet">Hyparquet</a>, the most conformant browser-based parquet file reader in existence. It’s open source and, most importantly, can load my 10 rows of data in 150 ms.</p>

<h2 id="legacy-server-architecture">Legacy Server Architecture</h2>

<p>Let’s take a step back and first understand where the runtime is going in existing data viewer tools. Take an oversimplified version of a simple pandas-backed data viewer and pretend it’s hosted in AWS and reading a parquet file from S3. Before any data gets loaded, the request from the user’s browser first hits CloudFront, goes through an <a href="https://aws.amazon.com/elasticloadbalancing/">ELB</a>, reaches the Node.js frontend server of the data hosting service, goes through another ELB, and hits the backend server hosting the data, which finally pings S3 and downloads the data. In total this takes about 40 sec of latency just to route the request and download the data. The data then gets parsed in the backend server (taking about 1 sec), and finally makes its way back to the user’s browser.</p>

<p><img src="/assets/images/hyparquet/arch1.svg" alt="Legacy server architecture diagram" /></p>

<p>This diagram may feel complicated, but it’s drastically oversimplified compared to most real-world architectures, which also include auth, logging, message brokers, and other systems that each add latency.</p>

<p>When optimizing this pipeline, most engineers only have control of the backend and spend time optimizing parsing. This can speed up the time-to-first-data a lot, but it’s not enough for me. I wanted to completely remove the latency before parsing even started.</p>

<h2 id="browser-first-architecture">Browser-First Architecture</h2>

<p>Fundamentally, whenever you have a backend, you need layers of tooling on top. And backend servers are generally a good idea: they manage application state, can handle compute-heavy processes, and decouple the viewer from the data models. But I don’t care about any of that. A data viewer isn’t a feature in my application, it is the entire application. So what if I just remove everything backend related and point the browser directly at S3? (Well, you still need CloudFront to optimize the SSL handshake.)</p>

<p><img src="/assets/images/hyparquet/arch2.svg" alt="Browser-first architecture diagram" /></p>

<p>You would be left with just the browser talking straight to cloud storage:</p>

<p><img src="/assets/images/hyparquet/arch3.svg" alt="Simplified browser to cloud storage architecture" /></p>

<p>With this architecture, you immediately save latency because you skip the ELB and the backend. As an added bonus, it’s cheaper, because you no longer pay to host a backend server, and far simpler for developers to maintain.</p>

<p>This simplified architecture does leave two issues: (a) where does user state live, so that a user who refreshes the page doesn’t lose their location in the viewer, and (b) you still need to parse a parquet file.</p>

<p>It turns out, if you use browser cookies and local storage, you can manage user state all in the browser. Sure, if the user clears their browsing history, they’re in trouble, but I’m okay with that. The parsing…well, I was just going to have to use a javascript parser instead. Or, as it turns out, build my own.</p>
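<p>As a rough sketch of the state-management half, assume the viewer only needs to remember which file is open and which row the user has scrolled to. The key name and state fields below are made up for illustration and are not Hyperparam’s actual code:</p>

```javascript
// Hypothetical key name; any stable string works.
const STATE_KEY = 'viewer-state'

// Save the viewer position. `storage` is any object with the Web Storage
// interface (getItem/setItem); in a browser, pass window.localStorage.
function saveViewerState(storage, state) {
  storage.setItem(STATE_KEY, JSON.stringify(state))
}

// Restore the viewer position, falling back to defaults when nothing was
// saved (first visit, or the user cleared their browsing data).
function loadViewerState(storage) {
  const defaults = { url: null, firstRow: 0, sortColumn: null }
  const saved = storage.getItem(STATE_KEY)
  if (saved === null) return defaults
  try {
    return { ...defaults, ...JSON.parse(saved) }
  } catch {
    return defaults // corrupted entry: start fresh
  }
}
```

<p>Taking <code class="language-plaintext highlighter-rouge">storage</code> as a parameter, instead of reaching for <code class="language-plaintext highlighter-rouge">window.localStorage</code> directly, keeps both functions easy to exercise outside the browser.</p>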

<h2 id="parquet-in-the-browser">Parquet in the Browser</h2>

<p>At the beginning of 2024, when I started this quest, there were 3 libraries that could load parquet files from cloud storage directly into the browser: ParquetJS, ParquetWASM, and DuckDB WASM. And I had a goal to parse parquet files in under 500ms. As shown below, none of these were fast and ParquetJS wasn’t even supported anymore.</p>

<p><img src="/assets/images/hyparquet/browser-library-performance.png" alt="Parquet browser library performance comparison" /></p>

<p>Looking at this waterfall chart we can see that all libraries take at least 600 ms to make a request and parse a parquet file. But the chart also shows multiple opportunities for optimization. Let’s summarize some of the inefficiencies of the duckdb-wasm library; we will go into more detail below.</p>

<ol>
  <li>Loading the WASM engine (extra &gt; 1 sec)</li>
  <li>Multiple requests to get metadata when it could be done in one (extra 200 ms)</li>
  <li>Sequential read requests when it could be in parallel</li>
  <li>Limited optimizations based on metadata</li>
  <li>Synchronous fetches versus asynchronous</li>
  <li>Inefficient compression algorithms</li>
</ol>
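<p>Item 3 is the easiest to picture in code. Here is a minimal sketch of the two strategies, where <code class="language-plaintext highlighter-rouge">fetchRange(start, end)</code> is a hypothetical helper that resolves to the bytes in that range. It illustrates the idea only and is not taken from any of the libraries above:</p>

```javascript
// Fetch byte ranges one after another: total time is the SUM of the
// individual request times.
async function fetchSequential(fetchRange, ranges) {
  const results = []
  for (const [start, end] of ranges) {
    results.push(await fetchRange(start, end))
  }
  return results
}

// Start every request up front and wait for all of them: total time is
// roughly the time of the SLOWEST request.
function fetchParallel(fetchRange, ranges) {
  return Promise.all(ranges.map(([start, end]) => fetchRange(start, end)))
}

// One possible fetchRange building block: an HTTP range request.
// Assumes the server honors Range headers, as S3 does. The Range header
// is inclusive on both ends, hence `end - 1`.
async function httpRange(url, start, end) {
  const res = await fetch(url, { headers: { Range: `bytes=${start}-${end - 1}` } })
  if (!res.ok) throw new Error(`range request failed: ${res.status}`)
  return res.arrayBuffer()
}
```

<p>To use it against a real file you would pass something like <code class="language-plaintext highlighter-rouge">(start, end) =&gt; httpRange(url, start, end)</code> as <code class="language-plaintext highlighter-rouge">fetchRange</code>. Both functions return results in the same order as <code class="language-plaintext highlighter-rouge">ranges</code>; only the wall-clock time differs.</p>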

<p>If I could optimize these pieces, I could achieve my 500ms time-to-first-data. And I could do it in Kenny style: 100% javascript and no dependencies (because who doesn’t want to rebuild everything from scratch). Time to introduce Hyparquet.</p>

<h2 id="parquet-from-scratch">Parquet from Scratch</h2>

<p>Re-writing a parquet parser from scratch, how hard can it be?? It took about a week to be able to parse my first parquet file, which I thought was pretty good. The problem is that I kept finding more parquet files that I couldn’t open. Parquet is a sprawling format, with many features:</p>

<ul>
  <li>8 physical types (bool, int, float, etc)</li>
  <li>22 converted types</li>
  <li>17 logical types</li>
  <li>8 compression codecs (snappy, gzip, brotli, etc)</li>
  <li>2 major versions</li>
</ul>

<p>It took <strong>6 months</strong> to parse ALL the parquet files.<br />
<img src="/assets/images/hyparquet/hyparquet-github.png" alt="Parquet development timeline" /></p>

<h2 id="gotta-go-fast">Gotta Go Fast</h2>

<p>JavaScript is not exactly known as a high performance language. I think this reputation is undeserved. I’m not saying it’s going to beat Rust in a benchmark. But with careful engineering and tactical use of modern browser APIs, we can make decoding parquet in the browser surprisingly performant.</p>

<p>Let’s dive deeper into some of the mistakes made by other parquet libraries, and how we can make it better in the browser:</p>

<ol>
  <li>
    <p><strong>Engine Size</strong> – DuckDB-WASM requires downloading and compiling several megabytes of WebAssembly, incurring seconds of startup delay before queries can run. That’s seconds where your user sees… nothing.</p>

    <p>Could we get a performance advantage from starting with less? Every kilobyte of WASM adds startup latency. Hyparquet’s core engine is only 10KB (minified, gzipped), dramatically reducing startup latency, and is substantially easier to bundle. By narrowing the focus strictly to Parquet parsing with pushdown filters, we achieve near-instant initialization.</p>

    <p>We also save an entire round-trip loading the wasm blob:</p>

    <p><img src="/assets/images/hyparquet/parquet-wasm.png" alt="Engine size comparison showing reduced latency" /></p>
  </li>
  <li>
    <p><strong>Smart Metadata Fetching</strong> – In parquet, the metadata is stored in the footer of the file. So in order to fetch the metadata, you might naively make three requests, which is what parquet-wasm and parquetjs do:</p>

    <ol>
      <li>
        <p>HEAD request to get file size</p>
      </li>
      <li>
        <p>Fetch the last 8 bytes to get the metadata_length field</p>
      </li>
      <li>
        <p>Fetch the metadata</p>
      </li>
    </ol>

    <p>But with hyparquet we do a little better: we skip the second step. Rather than make a round-trip fetch for just 8 bytes, we optimistically fetch the last 512 KB of the file. 99% of the time that includes the entire metadata. In the rare cases where this initial request does not include all the metadata, we read the metadata length from the footer and make another request for just the missing remainder. Over HTTP on the internet, an 8 byte fetch takes <em>almost</em> the same amount of time as a 512 KB fetch.</p>
  </li>
  <li>
    <p><strong>Parallelization</strong> – Traditional databases fetch data sequentially. But browsers can handle 6+ concurrent HTTP connections. Hyparquet leverages parallel HTTP range requests, retrieving only needed portions of the Parquet file (specific columns or row groups) in parallel. This overlap of I/O helps reduce wall-clock latency for data access.</p>

    <p>DuckDB uses a different (and much worse) algorithm: it issues a sequence of requests of exponentially increasing size, all in series (not parallel). This is fine when you’re reading from local disk but is pathological when loading over the network:</p>

    <p><img src="/assets/images/hyparquet/duckdb-wasm.png" alt="DuckDB sequential vs Hyparquet parallel fetching" /></p>
  </li>
  <li>
    <p><strong>Use the Metadata</strong> – Hyparquet employs predicate pushdown by analyzing Parquet metadata (schema, column statistics). This allows it to identify and skip irrelevant row groups entirely, reducing network load and improving speed. This isn’t new—every modern columnar database does this. But when network latency is your enemy, skipping even one unnecessary 25MB column chunk can save seconds.</p>

    <p>It’s worth mentioning that by default parquetjs does NOT do this. In fact, neither does Python! The default pyarrow and pandas parquet readers WILL READ THE ENTIRE FILE. I had to tweak parquetjs to make it load partial data at all. <a href="https://github.com/hyparam/parquet-browser-eval">[2]</a></p>
  </li>
  <li>
    <p><strong>Async Everything</strong> – JavaScript might be the world’s most async-friendly language. We utilize this to return whatever data is ready first. Parquet is a column-oriented format, so if rows are being emitted from a cursor object, you’re making users wait for ALL the columns to load before returning any data to the user. Hyparquet can return data asynchronously whenever it’s ready (but provides helpers for row-oriented data if that’s needed).</p>
  </li>
  <li>
    <p><strong>Compression That Doesn’t Suck</strong> – Standard JavaScript Snappy decompression was too slow, so we implemented HySnappy, a WebAssembly-based decoder that’s 40% faster yet adds minimal size (&lt;4KB). This ensures decompression never becomes the performance bottleneck.</p>

    <p>The problem with WASM is that it normally adds an extra round-trip fetch request for the wasm file. We improved this using a little-known browser trick: you can <em>synchronously</em> load wasm if and only if it is less than 4kb! So we wrote our own snappy decompression library, with no dependencies, not even memcpy, and definitely no emscripten. This makes hysnappy super easy to bundle, deploy, and load.</p>
  </li>
</ol>
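<p>The optimistic footer fetch from item 2 can be sketched in a few lines. This is an illustration of the idea, not Hyparquet’s actual source: <code class="language-plaintext highlighter-rouge">fetchRange</code> is a hypothetical helper, and the only format facts it relies on are that a Parquet file ends with a 4-byte little-endian metadata length followed by the magic bytes <code class="language-plaintext highlighter-rouge">PAR1</code>.</p>

```javascript
const INITIAL_FETCH = 512 * 1024 // 512 KB, the size quoted above

// fetchRange(start, end) is a hypothetical helper that resolves to an
// ArrayBuffer holding bytes [start, end) of the file.
async function fetchMetadataBytes(fetchRange, fileSize) {
  // One optimistic request for the tail of the file.
  const tailStart = Math.max(0, fileSize - INITIAL_FETCH)
  const tail = new Uint8Array(await fetchRange(tailStart, fileSize))
  if (8 > tail.length) throw new Error('file too small to be parquet')

  // The last 8 bytes are: metadata length (uint32, little-endian) + "PAR1".
  const magic = String.fromCharCode(...tail.subarray(tail.length - 4))
  if (magic !== 'PAR1') throw new Error('not a parquet file')
  const view = new DataView(tail.buffer, tail.byteOffset, tail.byteLength)
  const metadataLength = view.getUint32(tail.length - 8, true)

  const metadataStart = fileSize - 8 - metadataLength
  if (0 > metadataStart) throw new Error('invalid metadata length')
  if (metadataStart >= tailStart) {
    // Common case: the metadata is already inside the bytes we fetched.
    return tail.subarray(metadataStart - tailStart, tail.length - 8)
  }
  // Rare case: fetch only the missing front part and stitch it together.
  const head = new Uint8Array(await fetchRange(metadataStart, tailStart))
  const out = new Uint8Array(metadataLength)
  out.set(head, 0)
  out.set(tail.subarray(0, tail.length - 8), head.length)
  return out
}
```

<p>The returned bytes are the still-encoded metadata; decoding them is a separate step. The caller still needs <code class="language-plaintext highlighter-rouge">fileSize</code> up front, which is what the HEAD request in step 1 provides.</p>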

<h2 id="the-result-sub-second-magic">The Result: Sub-Second Magic</h2>

<p>This obsession with latency has real-world implications. Where DuckDB-Wasm might take several seconds just to initialize its query engine, Hyparquet can produce visible results on multi-gigabyte datasets in under a second.</p>

<p><img src="/assets/images/hyparquet/final-performance-results.png" alt="Final performance results showing sub-second loading" /></p>

<h2 id="the-philosophy-bringing-compute-to-data-in-your-browser">The Philosophy: Bringing Compute to Data (In Your Browser)</h2>

<p>Hyparquet demonstrates a shift toward treating the browser as a fully capable query processor operating directly on data stored in cloud storage, suggesting a new paradigm for database research.</p>

<p>This inverts traditional assumptions:</p>

<ul>
  <li>Thin server, client doing the heavy lifting.</li>
  <li>Round trips matter more than total bandwidth.</li>
  <li>Time-to-first-byte is the new query optimization target.</li>
</ul>
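<p>A back-of-the-envelope model shows why round trips dominate. The network numbers here (100 ms round trip, 100 Mbps, a 100 KB metadata block) are assumptions chosen for illustration, not measurements, and the model ignores TCP slow start, TLS setup, and server time:</p>

```javascript
// Rough model: every request pays one round trip, plus transfer time.
function requestMs(bytes, rttMs, mbps) {
  const transferMs = (bytes * 8) / (mbps * 1e6) * 1000
  return rttMs + transferMs
}

// Assumed network, for illustration only.
const RTT_MS = 100
const MBPS = 100

const tiny = requestMs(8, RTT_MS, MBPS)          // 8-byte footer read
const big = requestMs(512 * 1024, RTT_MS, MBPS)  // 512 KB optimistic read
// Two-step approach: 8-byte read, then an assumed 100 KB metadata read.
const twoStep = tiny + requestMs(100 * 1024, RTT_MS, MBPS)

console.log(tiny.toFixed(1))    // 100.0
console.log(big.toFixed(1))     // 141.9
console.log(twoStep.toFixed(1)) // 208.2
```

<p>Under these assumptions, one request that moves 65,536 times more data than the 8-byte read costs only about 42 ms extra, while the extra round trip of the two-step approach costs a full 100 ms.</p>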

<p>Hyparquet’s extreme minimalism sets a new benchmark for browser-native analytics. But what are the broader implications?</p>

<p>Hyparquet enables ML researchers and data analysts to interactively explore large datasets directly in the browser, eliminating the need for traditional backend setup or data infrastructure management.</p>

<p>Hyparquet also allows data analysis over a much-simplified infrastructure: By removing backend databases, there’s less infrastructure to maintain, simpler developer experience, and faster user experience.</p>

<h2 id="where-did-this-lead-hyperparam">Where did this lead? Hyperparam.</h2>

<p>If you remember where this started, I wanted users to see the first few rows of a large AI parquet dataset in under a second. But I started looking at data because I wanted to train a model. I obsessed over the first step of the AI data curation pipeline because it was painful. But I didn’t stop there. I founded Hyperparam, a company built on this paradigm of hyper-optimizing browser-native applications for data curation. Hyperparam’s goal is for users to build their training, evaluation, or RAG datasets with the seamless interactivity of the browser they are accustomed to for non-data-intensive tasks. Our motto: “JavaScript can do it too.”</p>

<h2 id="try-it-yourself">Try It Yourself</h2>

<p>Want to see Hyparquet in action? Head to <a href="https://hyperparam.app">https://hyperparam.app</a>, drop any Parquet file or url, and watch your data appear instantly.</p>

<p><img src="/assets/images/hyparquet/hyperparam.gif" alt="Hyperparam Parquet Viewer" /></p>]]></content><author><name>Kenny Daniel</name></author><category term="engineering" /><category term="performance" /><category term="parquet" /><category term="hyparquet" /><category term="JavaScript" /><category term="performance" /><category term="browser-architecture" /><category term="WebAssembly" /><category term="data-visualization" /><category term="latency-optimization" /><category term="Apache-Parquet" /><category term="compression" /><category term="instant-data" /><category term="AI-engineering" /><summary type="html"><![CDATA[Instant data in the browser. Hyparquet delivers sub-second data loading for AI engineers with a JavaScript Parquet parser that loads in under 500ms.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.hyperparam.app/assets/images/banner-hyparquet.jpg" /><media:content medium="image" url="https://blog.hyperparam.app/assets/images/banner-hyparquet.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Hyperparam: How Browser-Based Tools Will Re-Shape AI</title><link href="https://blog.hyperparam.app/browser-based-tools-will-reshape-ai/" rel="alternate" type="text/html" title="Hyperparam: How Browser-Based Tools Will Re-Shape AI" /><published>2025-01-21T05:40:00+00:00</published><updated>2025-01-21T05:40:00+00:00</updated><id>https://blog.hyperparam.app/browser-based-tools-will-reshape-ai</id><content type="html" xml:base="https://blog.hyperparam.app/browser-based-tools-will-reshape-ai/"><![CDATA[<p>What is the key to building the most advanced AI models? Data quality.</p>

<p>Everyone wants better AI models: smarter, cheaper, and with style. How does one achieve that? Whether you’re a mega-scale AI company, or a small enterprise team, the only real lever for making better models is to construct a better training set.</p>

<p>How do you build a better training set? This is a question that has always been one of the most challenging, and labor-intensive parts of the data science process.</p>

<p><img src="/assets/images/reddit-comments.png" alt="Reddit r/datascience comments on &quot;What's your biggest time sink as a data scientist?&quot;" /></p>

<p>Why is data cleaning and data understanding so time-consuming? Because current tools often miss three key capabilities. They should 1) enable very fast free-form data exploration by the user, which is key to finding insights in your data, 2) use AI models to assist in looking at volumes of data too large for a person to review, and 3) be simple to run locally in the browser, without depending on complex services and data pipelines. Instead, most tools are built around Python, arguably the worst language for creating modern, compelling UIs and tools. This might seem controversial, but think about the most common interface for Python: Jupyter Notebooks. Notebooks are great for iteration and experimentation, but they are extremely weak when it comes to interactive data exploration. If you’ve ever tried to open a parquet file (the most common format for modern ML datasets) in a notebook, it looks like this:</p>

<p><img src="/assets/images/jupyter.png" alt="Screenshot of Jupyter Notebook loading data from starcoderjs.parquet file" /></p>

<p>This table is practically useless. You can’t paginate to the next set of rows. You can’t even see the entire contents of a cell (which in this case is an entire GitHub source file). So how are you supposed to get an intuitive sense of your data if you can’t even see it?</p>

<p>Can we do better? If you want to build a highly performant user interface, there is only one choice: JavaScript. The browser is the only place for building modern UIs.</p>

<p>The problem is that ML datasets are massive (often multiple gigabytes of compressed text data), so it’s not obvious if it’s even possible to work with large scale datasets in the browser. However, by using modern data formats like <a href="https://parquet.apache.org/">Apache Parquet</a>, and clever frontend engineering, it is in fact possible to work with massive datasets directly in the browser.</p>

<blockquote>
  <p>Aside: Apache Parquet files are a column-oriented data structure that contains a built-in index. This allows tools like Hadoop and DuckDB to efficiently query parquet datasets without having to retrieve all the data. Furthermore it allows doing these queries without a server, simply by putting the parquet files in a storage service like S3. What if you could do this same trick in the browser, and pull in just the data needed to render the current view? Hello Hyparquet.</p>
</blockquote>

<p>Hyparquet is a new JavaScript parquet parser which can efficiently query against parquet files stored in the cloud. This enables the creation of a new type of client-side only parquet data viewer which is significantly faster than anything that could be done with a server.</p>
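<p>To make “query without retrieving all the data” concrete, here is a toy sketch of how a reader can use per-row-group min/max statistics to decide which byte ranges to download. The object shape and field names are simplified stand-ins invented for this example; they are neither the real Parquet metadata schema nor Hyparquet’s API:</p>

```javascript
// Simplified stand-in for Parquet metadata: each row group records where
// its bytes live and the min/max values of one column.
const rowGroups = [
  { byteStart: 4, byteEnd: 1000, min: 0, max: 99 },
  { byteStart: 1000, byteEnd: 2000, min: 100, max: 199 },
  { byteStart: 2000, byteEnd: 3000, min: 200, max: 299 },
]

// Keep only the row groups whose [min, max] range overlaps the query
// range [lo, hi]; the others cannot contain a match and are never fetched.
function pruneRowGroups(groups, lo, hi) {
  return groups
    .filter(g => g.max >= lo)
    .filter(g => hi >= g.min)
}

// A query for values between 150 and 180 only needs the second row group.
const needed = pruneRowGroups(rowGroups, 150, 180)
console.log(needed.map(g => [g.byteStart, g.byteEnd]))
```

<p>The surviving byte ranges are exactly what the browser would then request from cloud storage, which is why the amount of data transferred tracks the query and not the file size.</p>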

<video width="710" height="420" autoplay="" muted="" loop="">
  <source src="/assets/images/hyperparam1.mp4" type="video/mp4" />
</video>

<p>The goal here is to get data engineers to look at their data 👀 Anyone who has worked with data for a model before knows that looking at your data is the key to understanding the domain you’re trying to model, and it is virtually impossible to do good data science without looking at your data. Looking at your data is the easiest way to find data and model issues, and is a constant source of ideas of how to improve them.</p>

<p><a href="https://x.com/karpathy/status/1311884485676294151"><img src="/assets/images/karpathy.png" alt="Andrej Karpathy: &quot;When you sort your dataset descending by loss you are guaranteed to find something unexpected, strange and helpful.&quot; Mechanical Dirk: &quot;In a large enough dataset, you can pick almost any property, sort by it, and find interesting stuff at both ends of the list.&quot;" /></a></p>

<p>This is one of the core workflows in data science: build a model, see what data was correctly or incorrectly modeled, fix the data and/or the model, and repeat. This is a repeatable, teachable process! And if it can be taught to a human data scientist, why can’t it be taught to a model to assist?</p>

<p>Can you use a model to assist with dataset curation? The challenges are two-fold: 1) How do you leverage human expertise to express what you want from the model? 2) These datasets are huge, so running a model across all the data is expensive.</p>

<p>You need the human in the loop to express their intent for the data. There is not just one definition of “good” versus “bad” data. What matters is the question “is this data useful for the model I’m trying to build?” This is where the UI comes in as a way to allow the user to look at the data, and use the data to express their intent.</p>

<p>As for the cost, we are entering a new era of LLMs where for the first time it is affordable to do dataset-scale inference, in which you run an entire dataset through a model to help filter and label data. In 2023 it cost $5,000,000 USD to process 1 trillion input tokens with a SOTA model (gpt-4-turbo). In 2024 it cost $75,000 USD to process 1 trillion input tokens with a similar model (gpt-4o-mini). This trend will continue to make dataset-scale inference accessible to model builders. Model-based quality filtering has already been used by Meta to filter the training set for Llama 3 using labels generated by Llama 2 <a href="https://arxiv.org/pdf/2407.21783">[1]</a>.</p>
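<p>Working the two quoted figures through makes the size of the drop easier to see. This snippet only rearranges the numbers above; it adds no new pricing data:</p>

```javascript
// Figures quoted above: dollars to process 1 trillion input tokens.
const TOKENS = 1e12
const cost2023 = 5_000_000 // gpt-4-turbo
const cost2024 = 75_000    // gpt-4o-mini

// Convert to the more familiar "dollars per million tokens".
const perMillion = cost => cost / (TOKENS / 1e6)

console.log(perMillion(cost2023)) // 5
console.log(perMillion(cost2024)) // 0.075
console.log((cost2023 / cost2024).toFixed(1)) // 66.7
```

<p>That is roughly a 67x reduction in one year.</p>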

<p>We’re entering a new era in which dataset-scale inference and interactive, browser-based data exploration will define how AI models are built and refined. By combining efficient data formats, high-performance JavaScript interfaces, and affordable AI-based annotations, teams can finally put data quality front and center without prohibitively high costs or clunky workflows.</p>

<p>The future belongs to those who seamlessly blend human expertise with AI-assisted insights—an approach that makes data cleaning faster, more intuitive, and ultimately, far more effective in powering the next generation of advanced AI models.</p>

<hr />

<p><strong>Ready to explore your machine learning data?</strong> Visit <a href="https://hyperparam.app">Hyperparam</a> to start viewing and analyzing your datasets in seconds.</p>]]></content><author><name>Kenny Daniel</name></author><category term="machine-learning" /><category term="product" /><category term="AI" /><category term="machine-learning" /><category term="browser-tools" /><category term="data-curation" /><category term="parquet" /><category term="hyparquet" /><category term="JavaScript" /><category term="dataset-quality" /><category term="LLM" /><category term="data-science" /><category term="browser-based-tools" /><category term="data-exploration" /><summary type="html"><![CDATA[Explore massive datasets in the browser. Hyperparam shows how browser-based tools are reshaping AI workflows through speed, visibility, and interactivity.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blog.hyperparam.app/assets/images/banner-browser.jpg" /><media:content medium="image" url="https://blog.hyperparam.app/assets/images/banner-browser.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>