Category: Natural Language Tech


Researchers from PSU and Duke Introduce Automated Failure Attribution for Multi-Agent Systems

“Automated failure attribution” is a crucial component in the development lifecycle of multi-agent systems. It has the potential to transform the challenge of identifying “what went wrong and who is to blame” from a perplexing mystery into a quantifiable and analyzable problem.


Adobe Research Unlocks Long-Term Memory in Video World Models with State-Space Models

By combining State-Space Models (SSMs) for efficient long-range dependency modeling with dense local attention for coherence, and using training strategies like diffusion forcing and frame local attention, researchers from Adobe Research successfully overcome the long-standing challenge of long-term memory in video generation.
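The long-range memory in such models rests on the linear state-space recurrence, which folds history into a fixed-size state instead of attending over all past frames. Below is a minimal sketch of that generic recurrence (a textbook discrete SSM scan, not Adobe's actual architecture; all names and sizes are illustrative):

```python
import numpy as np

def ssm_scan(A, B, C, xs):
    """Discrete linear state-space recurrence: h_t = A h_{t-1} + B x_t,
    y_t = C h_t. The fixed-size state h summarizes the entire history,
    which gives SSM blocks O(1) memory per step over long sequences."""
    h = np.zeros(A.shape[0])
    ys = []
    for x in xs:
        h = A @ h + B @ x   # fold the new input into the running state
        ys.append(C @ h)    # read out from the state
    return np.array(ys)

# Tiny 1-D example: a leaky accumulator with decay 0.5.
A = np.array([[0.5]])
B = np.array([[1.0]])
C = np.array([[1.0]])
print(ssm_scan(A, B, C, [np.array([1.0])] * 3).ravel())  # [1.   1.5  1.75]
```

Because the scan is linear, it can also be computed in parallel over the sequence, which is what makes SSM layers efficient at training time.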


Beyond Next-Token Prediction? Meta’s Novel Architectures Spark Debate on the Future of Large Language Models

Meta AI’s recent research introduces the BLT architecture, eliminating tokenizers for improved multimodal processing, and the Large Concept Model (LCM), which operates on semantic “concepts” instead of tokens for more human-like reasoning and better cross-lingual generalization. These innovations challenge the traditional “next-token prediction” paradigm in LLMs.


Nomic Embed: The Inaugural Open-Source Long Text Embedding Model Outshining OpenAI’s Finest

In a new paper Nomic Embed: Training a Reproducible Long Context Text Embedder, a Nomic AI research team introduces nomic-embed-text-v1, the first fully reproducible, open-source, open-weights, open-data text embedding model, capable of handling a context length of 8192 tokens in English.


LangSplat: Turbocharging 3D Language Fields with a Mind-Blowing 199x Speed Boost

In a new paper LangSplat: 3D Language Gaussian Splatting, a research team from Tsinghua University and Harvard University introduces LangSplat, a groundbreaking 3D Gaussian Splatting-based method for 3D language fields that surpasses the state-of-the-art LERF method while boasting a remarkable 199x speed improvement.


A Robot Chemist Driven by GPT-4 Made Its Debut in Nature: Autonomously Designs Reactions and Performs Complex Experiments

In a new paper Autonomous chemical research with large language models, a research team from Carnegie Mellon University and Emerald Cloud Lab introduces an innovative LLM-powered system named Coscientist, which autonomously designs, plans, and executes complex scientific experiments, marking a significant leap forward in the integration of laboratory automation technologies with powerful language models.


The Reversal Curse: Uncovering the Intriguing Limits of Language Models

In a new paper titled “The Reversal Curse: LLMs trained on ‘A is B’ fail to learn ‘B is A’”, a collaborative research team from Vanderbilt University, the UK Frontier AI Taskforce, Apollo Research, New York University, the University of Sussex, and the University of Oxford unveils a remarkable shortcoming in auto-regressive large language models (LLMs).


Half a Day of Training for a Few Hundred Dollars Rivals Mainstream Large Models: An Open-Source, Commercially Usable Domain-Specific LLM Solution

At the forefront of cost reduction and efficiency enhancement for large models, the Colossal-AI team maximizes the core capabilities of LLaMA-2. Through innovative training techniques, Colossal-AI achieves remarkable results using only about 8.5 billion tokens of data, 15 hours of training, and costs in the range of a few hundred dollars.


Unveiling the Enigma: Meta AI & UPC Decode the Inner Workings of Large-Scale Language Models

In a new paper Neurons in Large Language Models: Dead, N-gram, Positional, a research team from Meta AI and Universitat Politècnica de Catalunya conducts a comprehensive analysis of the Open Pre-trained Transformer (OPT) family of language models, up to 66B parameters, to provide insights into how feed-forward network (FFN) layers act.
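One of the paper's headline findings is that many FFN neurons are "dead": they never activate over large corpora. The detection idea is simple enough to sketch on synthetic activations (this toy uses random data with two deliberately disabled units, not the OPT models themselves):

```python
import numpy as np

rng = np.random.default_rng(0)

def find_dead_neurons(activations):
    """Flag FFN neurons whose ReLU output stays at zero over an entire
    corpus of token activations. `activations` has shape (tokens, neurons)."""
    return np.where((activations > 0).mean(axis=0) == 0.0)[0]

# Toy FFN first layer with ReLU; force units 2 and 5 to never fire.
X = rng.normal(size=(1000, 16))
W = rng.normal(size=(16, 8))
pre = X @ W
pre[:, [2, 5]] = -np.abs(pre[:, [2, 5]])  # pre-activations clamped non-positive
acts = np.maximum(pre, 0.0)               # ReLU
print(find_dead_neurons(acts))            # [2 5]
```

In practice the same pass over activations can also tally how often each neuron fires, which is how frequency statistics like those in the paper are gathered.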


Microsoft’s New Pareto Optimal Self-Supervision Framework Automatically Corrects Language Models to Boost GPT SOTA Records

In a new paper Automatic Calibration and Error Correction for Large Language Models via Pareto Optimal Self-Supervision, a Microsoft research team presents Pareto optimal self-supervision, a flexible framework that leverages programmatic supervision to automatically calibrate and correct errors in large language models without extra manual effort.


‘May the Source Be With You!’ – BigCode’s Open-Access StarCoder Outperforms All Existing Open Code LLMs

In the new paper StarCoder: May the Source Be With You!, the BigCode community releases StarCoder and StarCoderBase, 15.5B parameter open-access large language models (LLMs) trained on 80+ programming languages. StarCoderBase outperforms all multi-programming-language code LLMs, and StarCoder surpasses all models fine-tuned on Python.


Google & TAU Explore How Transformer-Based LLMs Extract Knowledge From Their Parameters

In the new paper Dissecting Recall of Factual Associations in Auto-Regressive Language Models, a team from Google DeepMind, Tel Aviv University and Google Research investigates how factual associations are stored and extracted internally in transformer-based language models and provides insights on how such models’ factual predictions are formed.


Microsoft’s LLMA Accelerates LLM Generations via an ‘Inference-With-Reference’ Decoding Approach

In the new paper Inference with Reference: Lossless Acceleration of Large Language Models, a Microsoft research team proposes LLMA, an inference-with-reference decoding mechanism that achieves up to 2x lossless speed-ups with identical generation results by exploiting the overlaps between LLM outputs and references.
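The core idea can be sketched in a few lines. This toy version uses word-level tokens and a deterministic stand-in for the LLM, and writes the span verification as a loop (in the real system, verifying a whole copied span is a single batched model call, which is where the speed-up comes from); all names and the example texts are illustrative:

```python
def llma_decode(next_token, prompt, reference, span=8, match_len=2,
                max_len=40, eos="<eos>"):
    """Sketch of inference-with-reference greedy decoding: when the last
    `match_len` output tokens reappear in the reference, speculatively copy
    the following `span` tokens and verify them against the model, keeping
    only the agreeing prefix. The output is identical to plain greedy
    decoding (lossless)."""
    out = list(prompt)
    while len(out) < max_len and out[-1] != eos:
        tail, copied = out[-match_len:], []
        for i in range(len(reference) - match_len + 1):
            if reference[i:i + match_len] == tail:
                copied = reference[i + match_len:i + match_len + span]
                break
        if not copied:                  # no reference match: ordinary step
            out.append(next_token(out))
            continue
        for tok in copied:              # verify the speculated span
            true_tok = next_token(out)
            out.append(true_tok)
            if true_tok != tok or true_tok == eos:
                break                   # keep only the agreeing prefix
    return out

target = "the cat sat on the mat <eos>".split()
reference = "a cat sat on the mat <eos>".split()   # e.g. a retrieved passage
model = lambda prefix: target[len(prefix) - 1]     # deterministic toy "LLM"

out = llma_decode(model, ["<bos>"], reference)
print(" ".join(out))  # <bos> the cat sat on the mat <eos>
```

Here the overlap with the reference ("cat sat on the mat") lets the decoder accept four tokens from a single verification round instead of four sequential decoding steps.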


ColossalChat: An Open-source Solution for Cloning ChatGPT with A Complete RLHF Pipeline

Colossal-AI open-sources a complete RLHF pipeline that includes supervised data collection, supervised fine-tuning, reward model training, and reinforcement learning fine-tuning, based on the LLaMA pre-trained model, and shares ColossalChat, a practical open-source project that closely replicates the original ChatGPT technical approach.


Google’s CoLT5 Processes Extremely Long Inputs via Conditional Computation

A Google Research team addresses transformers’ input sequence limitations in the new paper CoLT5: Faster Long-Range Transformers with Conditional Computation, proposing CoLT5 (Conditional LongT5), a family of models that applies a novel conditional computation approach for higher quality and faster long-input processing of up to 64,000 tokens.
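The routing idea behind conditional computation can be illustrated with a toy feed-forward layer: every token takes a cheap branch, and only a router-selected subset additionally takes the expensive branch. (CoLT5 applies this to both FFN and attention layers inside LongT5 and uses a soft top-k during training; the sketch below, with made-up sizes and weights, only shows the hard-routing inference behaviour.)

```python
import numpy as np

rng = np.random.default_rng(0)

def conditional_ffn(x, router_w, heavy, light, k):
    """Toy conditional feed-forward: light path for all tokens, heavy path
    (scaled by router score) only for the top-k tokens by router score."""
    scores = x @ router_w                          # one routing score per token
    top = np.argsort(scores)[-k:]                  # the k "important" tokens
    out = light(x)                                 # cheap branch for everyone
    out[top] += scores[top, None] * heavy(x[top])  # expensive branch for top-k
    return out

n, d, k = 16, 4, 4
x = rng.normal(size=(n, d))
router_w = rng.normal(size=d)
W_h = rng.normal(size=(d, d))
light = lambda h: h @ (0.1 * np.eye(d))   # cheap projection
heavy = lambda h: np.tanh(h @ W_h)        # expensive transform

y = conditional_ffn(x, router_w, heavy, light, k)
print(y.shape)  # (16, 4)
```

With k fixed as the sequence grows, the heavy computation stays roughly constant, which is what lets the model scale to very long inputs.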


OpenAI, OpenResearch & UPenn Paper Considers How GPTs Will Impact the US Labour Market

In the new paper GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models, a research team from OpenAI, OpenResearch, and the University of Pennsylvania investigates the potential impact of LLMs like GPT on the US labour market, shedding light on the economic, social, and policy implications.


Microsoft’s MathPrompter Dramatically Improves LLM Performance on Mathematical Reasoning Tasks

In the new paper MathPrompter: Mathematical Reasoning Using Large Language Models, a Microsoft Research team presents MathPrompter, a novel approach that leverages chain-of-thought (CoT) prompting techniques to improve LLM performance on mathematical reasoning problems and increase confidence in their predictions.
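The confidence-building step in MathPrompter works by prompting for several independent solutions to a templated problem (e.g. an algebraic expression and a Python function) and cross-checking them on random variable assignments before trusting the final answer. In the sketch below, both "LLM-generated" solvers and the problem template are invented for illustration:

```python
import random

# Hypothetical templated problem: "A ticket costs A dollars; how much do
# B tickets cost after a C% discount?" with two "LLM-generated" solutions.
algebraic = lambda A, B, C: A * B * (1 - C / 100)   # algebraic expression

def python_solution(A, B, C):                       # generated Python code
    total = A * B
    return total - total * C / 100

def consensus(solvers, trials=5):
    """Evaluate all candidate solutions on random variable assignments and
    accept only if they agree on every trial."""
    for _ in range(trials):
        A = random.randint(1, 100)
        B = random.randint(1, 20)
        C = random.randint(0, 90)
        outs = [s(A, B, C) for s in solvers]
        if any(abs(o - outs[0]) > 1e-9 for o in outs):
            return False
    return True

assert consensus([algebraic, python_solution])
# Agreement established, so answer the original instance (A=15, B=4, C=10):
print(algebraic(15, 4, 10))  # 54.0
```

If the solvers disagree on any random assignment, the pipeline regenerates solutions instead of emitting an answer, which is what raises confidence in the predictions.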


Tackling Hallucinations: Microsoft’s LLM-Augmenter Boosts ChatGPT’s Factual Answer Score

In the new paper Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback, a Microsoft Research and Columbia University team presents LLM-Augmenter, a system that augments black-box large language models with a set of plug-and-play modules to significantly improve the factuality of their responses.


CMU & Inspired Cognition’s DocPrompting Improves Code Generation by Retrieving Relevant Documentation

In the new paper DocPrompting: Generating Code by Retrieving the Docs, a research team from Carnegie Mellon University and Inspired Cognition presents DocPrompting, a natural-language-to-code generation approach. Tasked with generating code that calls unseen functions or libraries from a natural language intent, DocPrompting retrieves the corresponding code documentation to enable the model to perform the task.
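The retrieve-then-generate structure is easy to sketch. This toy uses a word-overlap retriever in place of the paper's BM25 or trained dense retrievers, and the documentation pool and intent are invented; the retrieved snippets are simply prepended to the generation prompt:

```python
def retrieve_docs(query, docs, top_k=2):
    """Toy sparse retriever: rank documentation entries by word overlap
    with the natural-language intent."""
    q = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: -len(q & set(d.lower().split())))
    return ranked[:top_k]

docs = [
    "tar -x extracts files from an archive",
    "tar -c creates a new archive",
    "ls -l lists files in long format",
]
intent = "extract all files from the archive"

# Build the prompt the code generator would see: docs first, then the intent.
prompt = "\n".join(retrieve_docs(intent, docs)) + "\n# Intent: " + intent + "\n# Code:"
print(prompt)
```

Because the documentation is looked up at inference time, the generator can handle functions and libraries it never saw during training.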


DeepMind’s Speculative Sampling Achieves 2–2.5x Decoding Speedups in Large Language Models

In the new paper Accelerating Large Language Model Decoding with Speculative Sampling, a DeepMind research team presents SpS (Speculative Sampling), an algorithm that achieves 2–2.5x decoding speedups on a 70 billion parameter Chinchilla language model. The novel approach maintains sample quality and does not require any modifications to model parameters or architecture.
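The acceptance rule that keeps speculative sampling lossless is worth seeing concretely. The toy below uses fixed (context-free) next-token distributions as stand-ins for the two models; real draft and target models condition on the generated prefix, and the target model scores all k draft tokens in one parallel pass:

```python
import numpy as np

rng = np.random.default_rng(0)

def speculative_round(p_target, p_draft, k=4):
    """One round: the draft model proposes up to k tokens; each is accepted
    with probability min(1, p/q), and on rejection a replacement token is
    drawn from the residual distribution (p - q)+. This makes the emitted
    tokens exactly distributed as samples from the target model."""
    out = []
    for _ in range(k):
        x = int(rng.choice(len(p_draft), p=p_draft))        # draft proposal
        if rng.random() < min(1.0, p_target[x] / p_draft[x]):
            out.append(x)                                    # accepted
        else:
            resid = np.maximum(p_target - p_draft, 0.0)      # residual (p - q)+
            out.append(int(rng.choice(len(resid), p=resid / resid.sum())))
            break                                            # round ends here
    return out

p_target = np.array([0.6, 0.3, 0.1])   # "large model" next-token distribution
p_draft = np.array([0.3, 0.4, 0.3])    # "small draft model" distribution
print(speculative_round(p_target, p_draft))
```

The speed-up comes from accepting several draft tokens per expensive target-model call; when draft and target mostly agree, most proposals survive verification.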


Stanford U’s DetectGPT Takes a Curvature-Based Approach to LLM-Generated Text Detection

In the new paper DetectGPT: Zero-Shot Machine-Generated Text Detection Using Probability Curvature, a Stanford University research team presents DetectGPT, a zero-shot machine-generated text detection algorithm that uses probability curvature to predict whether a candidate passage was generated by a large language model.
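The curvature criterion can be demonstrated on a synthetic stand-in: here a "passage" is a single real number and the LLM's log-probability is replaced by a hand-picked log-density (the real method perturbs text with a mask-filling model such as T5 and scores it with the LLM under test). Machine text, sitting near a mode of the model's distribution, loses log-probability under perturbation; human text on a near-linear slope does not, on average:

```python
import numpy as np

rng = np.random.default_rng(1)

def detect_score(log_prob, x, n_perturb=2000, sigma=0.5):
    """DetectGPT-style score: log p(x) minus the mean log-prob of perturbed
    copies of x. A large positive score means x sits at a local maximum of
    log p, as machine-generated samples tend to."""
    perturbed = x + rng.normal(0.0, sigma, size=n_perturb)
    return log_prob(x) - np.mean([log_prob(z) for z in perturbed])

# Toy "model": log-density peaked at 2 with near-linear tails.
log_prob = lambda t: -np.log(np.cosh(t - 2.0))

machine_score = detect_score(log_prob, 2.0)  # at the mode: clearly positive
human_score = detect_score(log_prob, 8.0)    # on the slope: near zero
print(machine_score > human_score)  # True
```

Thresholding this score gives a zero-shot detector: no training data from the suspected generator is needed, only the ability to query its log-probabilities.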


Microsoft’s Structured Prompting Breaks In-Context Learning Length Limits, Scales to Thousands of Examples

In the new paper Structured Prompting: Scaling In-Context Learning to 1,000 Examples, a Microsoft Research team proposes structured prompting. The novel approach breaks through conventional in-context learning length limits, scaling to thousands of examples with reduced computation complexity and superior performance and stability.


DeepMind & UCL Fine-tune a 70B Parameter LM to Generate Statements Agreeable to Humans with Diverse Opinions

In the new paper Fine-tuning Language Models To Find Agreement Among Humans With Diverse Preferences, a research team from DeepMind and University College London fine-tunes a 70 billion parameter language model to generate statements that maximize agreement among a human group with diverse written opinions.


MIT, Northeastern & Technion Propose ROME for Efficient Locating and Editing of Factual Associations in GPT Models

In the new paper Locating and Editing Factual Associations in GPT, a research team from MIT CSAIL, Northeastern University and Technion IIT examines how information flows during knowledge recall in large autoregressive transformers and introduces Rank-One Model Editing (ROME), a simple, zero-shot principled model editor capable of locating and editing factual associations in such models.
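The "rank-one" part of ROME can be illustrated with plain linear algebra. This sketch applies a minimal rank-one update forcing a weight matrix to map a chosen key vector to a chosen value vector while leaving inputs orthogonal to the key untouched; the full ROME update additionally weights the update by a covariance estimate of the layer's inputs, and the key/value vectors are derived from the model rather than chosen freely, so treat this as a simplified illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

def rank_one_edit(W, k, v):
    """Return W' = W + (v - W k) k^T / (k^T k), so that W' @ k == v while
    inputs orthogonal to k are mapped exactly as before."""
    return W + np.outer(v - W @ k, k) / (k @ k)

d_in, d_out = 8, 6
W = rng.normal(size=(d_out, d_in))   # stand-in for an FFN projection matrix
k = rng.normal(size=d_in)            # "key": the subject's representation
v = rng.normal(size=d_out)           # "value": representation of the new fact

W_new = rank_one_edit(W, k, v)
print(np.allclose(W_new @ k, v))     # True: the association is rewritten

other = rng.normal(size=d_in)
other -= (other @ k) / (k @ k) * k   # project out the key direction
print(np.allclose(W_new @ other, W @ other))  # True: unrelated inputs unchanged
```

Because the change has rank one, the edit surgically rewrites a single key-to-value association, which is what makes ROME-style editing cheap compared with fine-tuning.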