Daniel Kang – Medium

Daniel Kang

·

May 27

Accelerating Analytical Joins on Unstructured Data

Semantic joins over unstructured data have become essential to modern data analytics. Today, e-commerce business analysts track…

Daniel Kang

·

May 19

SODIUM: From Open Web Data to Queryable Databases

In research workflows using public data, answering a single analytical question requires collecting and organizing data from many different…

Daniel Kang

·

Feb 24

Launching the CVE-Bench Leaderboard: A Public Arena of AI for Cybersecurity

Last year, we introduced CVE-Bench, a rigorous benchmark with real-world web vulnerabilities to evaluate the cyberoffensive capabilities of…

Daniel Kang

·

Dec 16, 2025

Claude 4.5 Opus Solves CORE-Bench — But Not REPRO-Bench

In our ACL 2025 paper, we introduced REPRO-Bench (GitHub), a benchmark designed to evaluate whether AI agents can accurately assess the…

Daniel Kang

·

Nov 10, 2025

SafeSearch: Teaching LLM Search Agents to Be Both Smart and Safe

LLMs are rapidly expanding their built-in knowledge from training. However, they still suffer from hallucinations and lack access to…

Daniel Kang

·

Nov 5, 2025

When Your Home Robot Turns Against You: BEATing Vision-Language Agents with Visual Backdoors

Household humanoid robots promise to assist everyone in daily life, with several exciting demos released recently (NEO, Figure 03, Tesla…

Daniel Kang

·

Nov 3, 2025

DRAMA: Enabling AI Agents to Collect Data to Support Data Science Workflows

Data science workflows generally include two major phases: data retrieval and data analysis. In practice, analysts (especially in the…

Daniel Kang

·

Oct 30, 2025

CVE-Bench v2.0: Making Evaluation More Rigorous with ABC

This is the third post in the Agentic Benchmark Checklist (ABC) blog series. Written by Yuxuan Zhu, Antony Kellermann, and Daniel Kang.

Daniel Kang

·

Oct 5, 2025

No, RL does not get “1 bit of information” per rollout

Dwarkesh is one of the biggest podcasters in the AI space. He’s recently (and repeatedly) made the claim that reinforcement learning gives…

Daniel Kang

·

Aug 11, 2025

Human Data is (Probably) More Expensive Than Compute for Training Frontier LLMs

This blog post was written by Yuxuan Zhu and Daniel Kang.
