We track OpenAI, DeepMind, Anthropic, and 17 other labs daily, with AI-powered summaries, trend charts, and a weekly digest.
We read everything so you don't have to. One email, zero noise.
Temporal dynamics in video can be learned and directly manipulated, enabling speed-conditioned video generation and temporal super-resolution.
Bridging the gap between image generation and detection, UniGenDet leverages a unified framework to enhance both fidelity and interpretability in generated images.
The trajectory of gradient descent is not random; it is systematically forced toward the critical threshold of $2/\eta$, revealing a hidden structure in neural network optimization.
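A toy illustration of where that threshold comes from (our own sketch, not the paper's experiment): on a quadratic with curvature $\lambda$, gradient descent with step size $\eta$ updates $w \leftarrow (1 - \eta\lambda)\,w$, which is stable exactly when $\lambda < 2/\eta$.

```python
# Toy sketch (not the paper's setup): gradient descent on f(w) = 0.5 * lam * w**2
# updates w <- (1 - eta * lam) * w, which is stable iff lam < 2 / eta.
eta = 0.1                           # step size; stability threshold is 2 / eta = 20
for lam in (15.0, 19.9, 20.1):      # curvatures below, just below, and above 2/eta
    w = 1.0
    for _ in range(200):
        w *= 1 - eta * lam          # exact gradient descent step
    print(f"lam={lam:>5}: |w| after 200 steps = {abs(w):.2e}")
# 15.0 converges quickly, 19.9 is barely stable, 20.1 diverges
```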
TPGO allows multi-agent systems to learn from their own optimization history, leading to unprecedented self-improvement in performance.
AI-driven summaries of public consultations can systematically exclude dissenting voices, raising concerns about biased policy recommendations even when individual outputs seem reasonable.
Automated identification of individual animals can only be effective if it aligns with ecological questions and data practices, not just algorithmic accuracy.
Current remote sensing change captioning datasets lack fine-grained, localized semantic reasoning; RSRCC fills this gap with 126k change-specific questions.
Stop penalizing your ANN search algorithm for missing ground-truth neighbors that were never semantically relevant in the first place – Semantic Recall offers a more nuanced and effective way to measure retrieval quality.
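One way the idea could be operationalized (a minimal sketch under our own assumptions; the paper's exact definition may differ): credit any retrieved item at least as close to the query as the k-th true neighbor, instead of only exact ground-truth IDs.

```python
import numpy as np

def semantic_recall(dists, retrieved_ids, k):
    """dists: distance from the query to every corpus item;
    retrieved_ids: item ids returned by the ANN index.
    Any hit no farther than the k-th true neighbor counts, so an
    equally close 'wrong' id is not scored as a miss."""
    threshold = np.sort(dists)[k - 1]   # distance of the k-th true neighbor
    hits = sum(dists[i] <= threshold for i in retrieved_ids[:k])
    return hits / k
```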
Users who actively participate in an AI agent's spreadsheet execution not only improve task outcomes, but also gain a deeper understanding and feel more ownership over the results.
Directly embedding quantile tokens into input sequences leads to sharper and more accurate distribution predictions, outperforming traditional methods by a substantial margin.
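The paper's mechanism is quantile tokens in the input sequence; the sketch below only illustrates the pinball loss that quantile predictions are conventionally trained with (our assumption about the training signal, not a detail from the paper).

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Quantile (pinball) loss: minimized when q is the tau-quantile of y."""
    diff = y - q
    return np.mean(np.maximum(tau * diff, (tau - 1) * diff))

# The minimizer over q recovers the empirical quantile:
y = np.random.default_rng(0).normal(size=10_000)
grid = np.linspace(-3, 3, 601)
best = grid[np.argmin([pinball_loss(y, q, 0.9) for q in grid])]
print(best, np.quantile(y, 0.9))    # both close to 1.28
```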
Pocket-sized VLA models can now achieve state-of-the-art robot manipulation performance by pre-training on a curated multimodal dataset and injecting manipulation-relevant representations into the action space.
LLMs are poised to flip the script on personalization, giving users unprecedented control over their data and how it's used across platforms.
Continuous benchmarking of protein function prediction models is now possible, enabling faster iteration and more robust performance tracking as annotations evolve.
A low-cost, compact sensor provides continuous vision-tactile feedback, enabling robots to "see" and "feel" their way through dexterous manipulation tasks.
LVLMs can self-detect and correct object hallucinations by focusing on specific image regions, offering a simple, training-free fix.
Deterministic decoding can outperform stochastic self-consistency in constrained domains by systematically exploring high-probability reasoning traces, leading to better performance with less computation.
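A hedged sketch of what systematic exploration can look like (the toy model and every name below are ours, not the paper's): best-first search over prefixes emits complete sequences in strictly decreasing probability order, because extending a prefix can only lower its probability.

```python
import heapq, math

def next_token_probs(prefix):
    # Toy stand-in for a language model: same 3 options after any prefix.
    return {"a": 0.6, "b": 0.3, "<eos>": 0.1}

def top_traces(k=5, max_len=4):
    """Enumerate the k most probable complete sequences, deterministically."""
    heap = [(0.0, ())]                 # (negative log-prob, prefix)
    done = []
    while heap and len(done) < k:
        neg_lp, seq = heapq.heappop(heap)
        if (seq and seq[-1] == "<eos>") or len(seq) == max_len:
            done.append((math.exp(-neg_lp), seq))   # complete trace
            continue
        for tok, p in next_token_probs(seq).items():
            heapq.heappush(heap, (neg_lp - math.log(p), seq + (tok,)))
    return done

for prob, seq in top_traces():
    print(f"{prob:.4f}  {' '.join(seq)}")
```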
Continual learning for LLM agents hits a wall: scaling models doesn't reliably improve skill generation, and self-feedback can lead to recursive drift.
LLMs can reason more effectively by directly tracking their own belief in the correct answer throughout the reasoning process, enabling more targeted policy updates.
Achieve superhuman dexterity: ALAS unlocks robust long-horizon task completion by decoupling environment understanding from motor control, enabling generalization across diverse human-scene interaction scenarios.
MLLMs still struggle to integrate diverse data for clinical reasoning, as evidenced by their poor performance on a new ophthalmology benchmark spanning image quality assessment to diagnosis.
Sampling plausible configurations of digital twins can reveal multiple valid parameterizations, enhancing model adaptation in complex natural systems.
Extracting temporal geometry from generative models can boost reinforcement learning performance by over 2x without changing the optimal policy.
Imagine slashing the human effort needed to go from hypothesis to submission-ready ML theory paper by orders of magnitude.
Forget complex fixed-point machinery: this work offers a dramatically simpler and more efficient route from external regret to $\Phi$-regret minimization.
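For context, a standard definition (not this paper's contribution): $\Phi$-regret compares the learner's losses to those of the best strategy modification $\phi \in \Phi$ applied to its own play,

$$\mathrm{Reg}_{\Phi}(T) \;=\; \max_{\phi \in \Phi} \sum_{t=1}^{T} \Big( \ell_t(x_t) - \ell_t\big(\phi(x_t)\big) \Big),$$

with external regret recovered when $\Phi$ contains only constant maps, and swap regret when it contains all functions on the action set.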
Entropy regularization makes planning provably easy: SmoothCruiser achieves polynomial sample complexity in MDPs where standard methods fail.
TurboQuant's claimed advantages over RaBitQ in quantization don't hold up under rigorous, reproducible comparison, raising questions about its practical utility.
Multilingual LLMs exhibit a surprising "American bias," even when prompted in other languages, and instruction tuning makes it worse.
Bridging the offline-streaming gap in ASR is now more achievable: a single RNN-Transducer model can deliver high accuracy in both settings, thanks to a novel consistency regularization technique.
Forget chasing the biggest LLM – this benchmark reveals that smaller models (<2B params) can deliver 3x better energy efficiency and faster ROI in real-world industry deployments.
Cyclic equalizability, a concept relevant to card-based cryptography, boils down to having identical Parikh vectors.
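Since the characterization reduces to symbol counts, checking it is a one-liner (a sketch with example words we made up; `parikh_vector` is a hypothetical helper, not from the paper):

```python
from collections import Counter

def parikh_vector(word: str) -> Counter:
    """Parikh vector of a word: how many times each symbol occurs,
    ignoring order entirely."""
    return Counter(word)

# Per the paper's characterization, two words are cyclically equalizable
# iff their Parikh vectors coincide:
print(parikh_vector("aabcc") == parikh_vector("cacba"))  # True: same counts
print(parikh_vector("aabcc") == parikh_vector("aabbc"))  # False
```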
Get the performance boost of expensive sampling-based RL policies for a fraction of the compute by learning to prune action candidates early in the diffusion denoising process.
LLMs can fix 26% more bugs when given access to intermediate runtime states during program repair, proving that even the best models struggle to infer root causes from just failure symptoms.
Contact-aware reconstruction transforms how we achieve realistic human-scene interactions in 3D environments, correcting artifacts that have plagued previous methods.
Training-free diffusion models can now harmonize satellite imagery across diverse domains, enabling scalable remote-sensing synthesis without retraining.
Uncover misleading half-truths by pitting a Politician agent against a Scientist agent in a debate moderated by a Judge, revealing what's left unsaid.
LLMs still struggle to reason in context when cultural and linguistic nuances are involved, achieving only 44% accuracy on a new grounded benchmark spanning 14 languages.
Achieve state-of-the-art person re-identification with only 20% of the data by explicitly teaching the model to "think" before matching identities.
Stop settling for fragmented land cover predictions: SSDM leverages global geospatial embeddings to guide local feature extraction, achieving state-of-the-art performance in high-resolution remote sensing mapping.
Multi-event video generation gets a 33% quality boost with TS-Attn, a training-free attention mechanism that dynamically aligns video content with complex temporal prompts.
Stop training MoEs from scratch: "expert upcycling" lets you expand existing models with duplicated experts and targeted fine-tuning, slashing training costs by 32% without sacrificing performance.
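A minimal PyTorch sketch of the upcycling recipe as commonly described (the class name, top-1 routing, and gating details are our choices; the paper's architecture may differ): copy the pre-trained FFN into each expert, add a fresh router, then fine-tune.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class UpcycledMoE(nn.Module):
    """Hypothetical sketch: expand one dense FFN into a top-1 MoE layer."""
    def __init__(self, pretrained_ffn: nn.Module, d_model: int, n_experts: int = 4):
        super().__init__()
        # Each expert starts as an exact copy of the pre-trained FFN.
        self.experts = nn.ModuleList(
            [copy.deepcopy(pretrained_ffn) for _ in range(n_experts)]
        )
        self.router = nn.Linear(d_model, n_experts)   # the only new weights

    def forward(self, x):                             # x: (n_tokens, d_model)
        gates = F.softmax(self.router(x), dim=-1)
        top_gate, top_idx = gates.max(dim=-1)         # top-1 routing
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e
            if mask.any():                            # tokens routed to expert e
                out[mask] = top_gate[mask, None] * expert(x[mask])
        return out
```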
LLMs don't see cities neutrally; their perception is skewed towards a culturally uneven baseline, favoring Western perspectives.
LLM agents suffer from a human-like cognitive bias, Actor-Observer Asymmetry, leading them to make inconsistent judgments about their own and others' failures.
GAAP offers a deterministic, trust-minimized approach to AI agent security, safeguarding user data even when models are compromised or prompts are injected.
End-to-end training of Vision-Language-Action models just got a whole lot easier: VLA Foundry unifies LLM, VLM, and VLA training in a single open-source framework.
Freezing a Stable Diffusion backbone and injecting CLIP and BLIP features lets you beat the state-of-the-art in zero-shot sketch-based 3D shape retrieval, without any costly retraining.
VLMs can be significantly boosted on embodied tasks by mid-training on a carefully curated subset of VLM data that is highly aligned with the VLA domain, rivaling the performance of much larger models.
DPP-based Monte Carlo integration can offer variance reduction, but choosing the right DPP—fixed vs. tailored to the integrand—determines whether you get a biased but faster converging estimator or an unbiased but standard-rate estimator.
Neural operators can achieve uniform convergence rates for approximating solution maps across diverse geometric domains, challenging traditional assumptions about shape-dependent PDE solutions.
FUSE achieves verification quality on par with semi-supervised methods, all without needing any labeled data.
LLMs waste compute on tokens that have already "figured it out" – DASH selectively skips these tokens during prefill, speeding things up without retraining or sacrificing accuracy.
WorldMark enables fair, apples-to-apples comparisons of interactive video models, leveling the playing field for researchers and practitioners alike.
Extracting actionable insights from noisy customer incidents at scale is now possible: TingIS achieves a 95% discovery rate for high-priority incidents with just minutes of latency.
Identity encoders can now maintain consistency across diverse artistic styles, achieving human-level performance in recognizing faces even in heavily stylized formats.
Automated expert-level evaluation across 10,000 cases surfaced clinical blind spots in AI systems that small-scale testing cannot see, and it should become standard practice for uncovering serious failures and putting safety guardrails in place before clinical deployment exposes patients to risk.
SpanDec achieves state-of-the-art NER accuracy with significantly improved throughput, proving that you don't need to exhaustively process every possible span to achieve top performance.
Ditch the fixed trade-offs: ParetoSlider lets you smoothly navigate competing generative goals in diffusion models at inference time, without retraining.
Generative training doesn't just make images prettier; it can actually boost a model's spatial reasoning skills.
A new global dataset reveals intricate deployment patterns and operational dynamics of offshore wind infrastructure, enabling unprecedented temporal analysis.
Vision-based tactile signals in the VTOUCH dataset significantly enhance bimanual manipulation capabilities, paving the way for more effective robotic interactions.
Exact attention over billion-token sequences is now possible on a single GPU, thanks to a novel streaming approach that avoids out-of-memory errors without approximation.
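A hedged sketch of the key trick such streaming methods rely on (online softmax rescaling; function names and chunking are ours): process keys and values one chunk at a time while keeping the result mathematically exact.

```python
import numpy as np

def streaming_attention(q, key_chunks, value_chunks):
    """Exact single-query attention, one chunk at a time. A running max m
    and running denominator l are rescaled as new chunks arrive (online
    softmax), so memory stays bounded by the chunk size."""
    d = q.shape[-1]
    m, l = -np.inf, 0.0
    acc = np.zeros(value_chunks[0].shape[-1])
    for K, V in zip(key_chunks, value_chunks):
        s = K @ q / np.sqrt(d)            # logits for this chunk
        m_new = max(m, s.max())
        scale = np.exp(m - m_new)         # rescale previous accumulators
        p = np.exp(s - m_new)
        l = l * scale + p.sum()
        acc = acc * scale + p @ V
        m = m_new
    return acc / l

# Sanity check against materializing all logits at once:
rng = np.random.default_rng(0)
q, K, V = rng.normal(size=8), rng.normal(size=(1000, 8)), rng.normal(size=(1000, 4))
w = np.exp(K @ q / np.sqrt(8)); full = (w / w.sum()) @ V
assert np.allclose(streaming_attention(q, np.split(K, 10), np.split(V, 10)), full)
```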
Ditch sparse contact cues: LEXIS-Flow uses a learned manifold of interaction signatures to capture dense, continuous proximity between humans and objects, leading to more realistic 3D HOI reconstructions.
Gauge-equivariant GNNs unlock the ability to learn intrinsically nonlocal observables in lattice gauge theories by directly embedding non-Abelian symmetries into message passing.
Fixed-width attention spans can give you better grammar and human-like reading patterns, especially when you're short on training data.
Ditch the GNN training: this label propagation method matches or beats GNN accuracy while being far more computationally efficient, even on tricky heterophilous graphs.
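A minimal sketch of the classical propagation scheme such methods build on (the symmetric normalization, clamping, and hyperparameters are our assumptions; the paper's variant may differ):

```python
import numpy as np

def label_propagation(A, y0, labeled, alpha=0.9, iters=50):
    """A: (n, n) symmetric adjacency matrix; y0: (n, c) float one-hot
    labels with zero rows for unlabeled nodes; labeled: (n,) bool mask."""
    d = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d)
    d_inv_sqrt[d > 0] = 1.0 / np.sqrt(d[d > 0])
    S = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]   # D^{-1/2} A D^{-1/2}
    y = y0.copy()
    for _ in range(iters):
        y = alpha * (S @ y) + (1 - alpha) * y0   # diffuse, then pull to seeds
        y[labeled] = y0[labeled]                 # clamp observed labels
    return y.argmax(axis=1)                      # predicted class per node
```

No gradients and no training loop: the whole method is a handful of matrix-vector products, which is where the efficiency gap over GNN training comes from.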
Short-term A/B test metrics can be misleading: this paper shows how to accurately estimate long-term value changes by modeling treatment effects as a decaying function learned from multiple cohorts.
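A toy sketch of the modeling idea (the exponential form, horizon, and every number below are invented for illustration; the paper learns the decay from multiple experiment cohorts):

```python
import numpy as np

# Hypothetical cohort measurements: treatment lift observed at different
# cohort ages t (weeks since exposure). All numbers are made up.
t = np.array([1.0, 2.0, 4.0, 8.0])
lift = np.array([0.050, 0.041, 0.027, 0.012])

# Fit lift(t) = delta0 * exp(-t / tau) by log-linear least squares.
slope, intercept = np.polyfit(t, np.log(lift), 1)
delta0, tau = np.exp(intercept), -1.0 / slope

# Extrapolate the cumulative long-term effect over one year (52 weeks):
# the integral of delta0 * exp(-t / tau) from 0 to 52.
cumulative = delta0 * tau * (1 - np.exp(-52.0 / tau))
print(f"delta0={delta0:.4f}, tau={tau:.1f} weeks, 1-year cumulative={cumulative:.3f}")
```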
Open-source MLLMs can now achieve state-of-the-art accuracy on complex tabular reasoning tasks, even outperforming models 18x their size, by explicitly penalizing visual hallucinations and shortcut guessing through process-supervised RL.
Get 82x faster Bayesian inference for equipment monitoring by replacing MCMC with neural nets trained on simulated data.
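A hedged sketch of the general recipe (amortized simulation-based inference; the damped-oscillation simulator and every name below are stand-ins we invented, and a point estimate stands in for the paper's full Bayesian posterior):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 64)

def simulate(theta):
    """Toy sensor model: damped oscillation plus measurement noise."""
    damping, freq = theta
    return np.exp(-damping * t) * np.sin(2 * np.pi * freq * t) \
        + 0.05 * rng.normal(size=t.size)

# Train once on prior draws; afterwards "inference" is a single forward pass.
thetas = rng.uniform([0.5, 1.0], [5.0, 8.0], size=(5000, 2))
signals = np.stack([simulate(th) for th in thetas])
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500).fit(signals, thetas)

estimate = net.predict(simulate((2.0, 4.0))[None, :])
print(estimate)   # should land near (2.0, 4.0), with no MCMC in the loop
```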
Forget fine-tuning behemoth LLMs for every new task – this paper shows how a tiny, nimble model generating smart supplements can unlock surprisingly strong agentic performance from frozen giants.
Unlock 10x faster simulation-based inference in hierarchical models by training on single-site simulations and assembling synthetic multi-site data.
Whitening neuroimaging features can transform linear models from black boxes into interpretable tools for understanding brain pathology.
Differentially private federated learning gets a boost: PINA achieves 2.9% higher accuracy than state-of-the-art methods by using a novel two-stage approach with privacy-preserving initialization and normality-driven aggregation.
GNNs can slash storm surge forecast errors by over 70%, offering a faster and more accurate alternative to traditional numerical models for coastal disaster prediction.
Geometry-aware optimization can dramatically improve LLM alignment by ensuring fairer trade-offs among conflicting human values.
Bayesian mixture-of-experts models can achieve robust density and parameter estimation with adaptive expert selection, fundamentally reshaping our approach to complex probabilistic modeling.
Current MLLMs fail to detect covert advertisements, revealing a critical gap in social media moderation that could mislead consumers and pose ethical risks.
Individual prosumers can now effectively coordinate in electricity markets, boosting overall market performance through a novel hierarchical MARL framework.
Reinforcement learning's Temporal Difference value estimation offers a surprisingly effective and theoretically grounded approach to calibrating uncertainty in vision-language-action models for robotics.
VDC achieves high-dimensional density estimation with remarkable speed and accuracy, transforming the landscape of copula modeling.
Explicit dropout achieves superior performance without the randomness of traditional methods, offering a clearer path to regularization control in Transformer models.
Decentralized learning can match centralized performance by sharing only Gibbs measures, not datasets, opening new avenues for privacy-preserving collaboration.
GNNs can predict network traffic flow with surprising accuracy, particularly in pinpointing connection endpoints.
LLMs can pinpoint mental states but falter at predicting dialogue trajectories, revealing a critical gap in their reasoning capabilities.
Ditch the expensive energy calculations: this new ML-DFT approach learns directly from ground-state densities, achieving state-of-the-art accuracy with improved runtime scaling.
A groundbreaking dataset suite reveals the intricate dynamics of decentralized prediction markets, offering unparalleled insights into collective forecasting behavior.
Conditional risk calibration reveals a unique perspective on uncertainty quantification that could transform how we approach decision-making in machine learning.
Counterintuitively, using only measured nodes to define the GNN topology slashes training time by 6x and boosts fault localization accuracy by 11% in power distribution grids.
AMM price prediction accuracy jumps 56% by explicitly modeling the uncertainty in block intervals, revealing the critical role of on-chain event timing.
R2IF improves function-calling accuracy by up to 34.62%, bridging the gap between reasoning and decision-making in LLMs.
Tabular anomaly detection gets a serious upgrade: uLEAD-TabPFN leverages frozen PFNs to model complex feature dependencies, outperforming existing methods by a significant margin, especially in high-dimensional spaces.
Achieve more reliable and interpretable virtual cell perturbation predictions by combining knowledge-driven multimodal modeling with evidence retrieval.
Energy-dissipation principles can revolutionize how we infer potential functions in noisy, incomplete data environments, achieving remarkable robustness in generalized diffusion processes.
Force-feeding physics to LSTMs slashes battery thermal runaway prediction errors by over 80%, making your next e-bike less likely to explode.
Transfer learning gets a boost: SMART sidesteps restrictive assumptions and data sharing limitations by transferring spectral information between tasks, leading to improved accuracy and robustness.
Forget retraining: LEVER lets you snap together pre-trained RL policies at inference time, matching or beating from-scratch performance in some cases.
World models can navigate blood vessels autonomously with higher success rates than standard RL, paving the way for safer robotic stroke treatments.
By cleverly hedging between Cover's and Robbins' betting strategies, you can achieve almost-sure $O(\ln \ln n)$ regret without sacrificing the $O(\ln n)$ worst-case guarantee.
Even the best large vision-language models struggle with multi-image reasoning, scoring only 50% on a new benchmark designed to challenge their capabilities.
All evaluated language models exhibit vulnerabilities to a novel adversarial attack, underscoring the urgent need for improved security measures in AI systems.
SiPeR reveals how integrating scene dynamics with Bayesian inference can dramatically enhance the relevance of conversational recommendations in real-world contexts.
Current audio-language models are surprisingly bad at controlling and interpreting subtle vocal cues, failing in nearly half of situational dialogue scenarios.