Category: Natural Language Tech


Researchers from PSU and Duke Introduce Automated Failure Attribution for Multi-Agent Systems

“Automated failure attribution” is a crucial component in the development lifecycle of multi-agent systems. It has the potential to transform the challenge of identifying “what went wrong and who is to blame” from a perplexing mystery into a quantifiable and analyzable problem.


Adobe Research Unlocks Long-Term Memory in Video World Models with State-Space Models

By combining State-Space Models (SSMs) for efficient long-range dependency modeling with dense local attention for coherence, and using training strategies like diffusion forcing and frame local attention, researchers from Adobe Research successfully overcome the long-standing challenge of long-term memory in video generation.
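The long-range memory in such models rests on the linear state-space recurrence, which folds history into a fixed-size state instead of attending over all past frames. Below is a minimal sketch of that generic recurrence (a textbook discrete SSM scan, not Adobe's actual architecture; all names and sizes are illustrative):

```python
import numpy as np

def ssm_scan(A, B, C, xs):
    """Discrete linear state-space recurrence: h_t = A h_{t-1} + B x_t,
    y_t = C h_t. The fixed-size state h summarizes the entire history,
    which gives SSM blocks O(1) memory per step over long sequences."""
    h = np.zeros(A.shape[0])
    ys = []
    for x in xs:
        h = A @ h + B @ x   # fold the new input into the running state
        ys.append(C @ h)    # read out from the state
    return np.array(ys)

# Tiny 1-D example: a leaky accumulator with decay 0.5.
A = np.array([[0.5]])
B = np.array([[1.0]])
C = np.array([[1.0]])
print(ssm_scan(A, B, C, [np.array([1.0])] * 3).ravel())  # [1.   1.5  1.75]
```

Because the scan is linear, it can also be computed in parallel over the sequence, which is what makes SSM layers efficient at training time.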


Beyond Next-Token Prediction? Meta’s Novel Architectures Spark Debate on the Future of Large Language Models

Meta AI’s recent research introduces the BLT architecture, eliminating tokenizers for improved multimodal processing, and the Large Concept Model (LCM), which operates on semantic “concepts” instead of tokens for more human-like reasoning and better cross-lingual generalization. These innovations challenge the traditional “next-token prediction” paradigm in LLMs.


Nomic Embed: The Inaugural Open-Source Long Text Embedding Model Outshining OpenAI’s Finest

In a new paper Nomic Embed: Training a Reproducible Long Context Text Embedder, a Nomic AI research team introduces nomic-embed-text-v1, the first fully reproducible, open-source, open-weights, open-data text embedding model, capable of handling a context length of 8192 tokens in English.


LangSplat: Turbocharging 3D Language Fields with a Mind-Blowing 199x Speed Boost

In a new paper LangSplat: 3D Language Gaussian Splatting, a research team from Tsinghua University and Harvard University introduces LangSplat, a groundbreaking 3D Gaussian Splatting-based method for 3D language fields that surpasses the state-of-the-art LERF method while boasting a remarkable 199x speed improvement.


A Robot Chemist Driven by GPT-4 Made Its Debut in Nature: Autonomously Designs Reactions and Performs Complex Experiments

In a new paper Autonomous chemical research with large language models, a research team from Carnegie Mellon University and Emerald Cloud Lab introduces an innovative LLM-powered system named Coscientist, which autonomously designs, plans, and executes complex scientific experiments, marking a significant leap forward in the integration of laboratory automation technologies with powerful language models.


The Reversal Curse: Uncovering the Intriguing Limits of Language Models

In a new paper titled “The Reversal Curse: LLMs trained on ‘A is B’ fail to learn ‘B is A’”, a collaborative research team from Vanderbilt University, the UK Frontier AI Taskforce, Apollo Research, New York University, the University of Sussex, and the University of Oxford unveils a remarkable shortcoming in auto-regressive large language models (LLMs).


Half a Day of Training for a Few Hundred Dollars Rivals Mainstream Large Models: An Open-Source, Commercially Usable Domain-Specific LLM Solution

At the forefront of cost reduction and efficiency enhancement for large models, the Colossal-AI team maximizes the core capabilities of LLaMA-2. Through innovative training techniques, Colossal-AI achieves remarkable results using only about 8.5 billion tokens of data, 15 hours of training, and costs in the range of a few hundred dollars.


Unveiling the Enigma: Meta AI & UPC Decode the Inner Workings of Large-Scale Language Models

In a new paper Neurons in Large Language Models: Dead, N-gram, Positional, a research team from Meta AI and Universitat Politècnica de Catalunya conducts a comprehensive analysis of the Open Pre-trained Transformer (OPT) family of language models, up to 66B parameters, to provide insights into how feed-forward network (FFN) layers act.
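One of the paper's headline findings is that many FFN neurons are "dead": they never activate over large corpora. The detection idea is simple enough to sketch on synthetic activations (this toy uses random data with two deliberately disabled units, not the OPT models themselves):

```python
import numpy as np

rng = np.random.default_rng(0)

def find_dead_neurons(activations):
    """Flag FFN neurons whose ReLU output stays at zero over an entire
    corpus of token activations. `activations` has shape (tokens, neurons)."""
    return np.where((activations > 0).mean(axis=0) == 0.0)[0]

# Toy FFN first layer with ReLU; force units 2 and 5 to never fire.
X = rng.normal(size=(1000, 16))
W = rng.normal(size=(16, 8))
pre = X @ W
pre[:, [2, 5]] = -np.abs(pre[:, [2, 5]])  # pre-activations clamped non-positive
acts = np.maximum(pre, 0.0)               # ReLU
print(find_dead_neurons(acts))            # [2 5]
```

In practice the same pass over activations can also tally how often each neuron fires, which is how frequency statistics like those in the paper are gathered.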


Microsoft’s New Pareto Optimal Self-Supervision Framework Automatically Corrects Language Models to Boost GPT SOTA Records

In a new paper Automatic Calibration and Error Correction for Large Language Models via Pareto Optimal Self-Supervision, a Microsoft research team presents Pareto optimal self-supervision, a flexible framework that leverages programmatic supervision to automatically calibrate and correct errors in large language models without extra manual effort.


‘May the Source Be With You!’ – BigCode’s Open-Access StarCoder Outperforms All Existing Open Code LLMs

In the new paper StarCoder: May the Source Be With You!, the BigCode community releases StarCoder and StarCoderBase, 15.5B parameter open-access large language models (LLMs) trained on 80+ programming languages. StarCoderBase outperforms all multi-programming-language code LLMs, and StarCoder surpasses all models fine-tuned on Python.


Google & TAU Explore How Transformer-Based LLMs Extract Knowledge From Their Parameters

In the new paper Dissecting Recall of Factual Associations in Auto-Regressive Language Models, a team from Google DeepMind, Tel Aviv University and Google Research investigates how factual associations are stored and extracted internally in transformer-based language models and provides insights on how such models’ factual predictions are formed.


Microsoft’s LLMA Accelerates LLM Generations via an ‘Inference-With-Reference’ Decoding Approach

In the new paper Inference with Reference: Lossless Acceleration of Large Language Models, a Microsoft research team proposes LLMA, an inference-with-reference decoding mechanism that achieves up to 2x lossless speed-ups with identical generation results by exploiting the overlaps between LLM outputs and references.
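The core idea can be sketched in a few lines. This toy version uses word-level tokens and a deterministic stand-in for the LLM, and writes the span verification as a loop (in the real system, verifying a whole copied span is a single batched model call, which is where the speed-up comes from); all names and the example texts are illustrative:

```python
def llma_decode(next_token, prompt, reference, span=8, match_len=2,
                max_len=40, eos="<eos>"):
    """Sketch of inference-with-reference greedy decoding: when the last
    `match_len` output tokens reappear in the reference, speculatively copy
    the following `span` tokens and verify them against the model, keeping
    only the agreeing prefix. The output is identical to plain greedy
    decoding (lossless)."""
    out = list(prompt)
    while len(out) < max_len and out[-1] != eos:
        tail, copied = out[-match_len:], []
        for i in range(len(reference) - match_len + 1):
            if reference[i:i + match_len] == tail:
                copied = reference[i + match_len:i + match_len + span]
                break
        if not copied:                  # no reference match: ordinary step
            out.append(next_token(out))
            continue
        for tok in copied:              # verify the speculated span
            true_tok = next_token(out)
            out.append(true_tok)
            if true_tok != tok or true_tok == eos:
                break                   # keep only the agreeing prefix
    return out

target = "the cat sat on the mat <eos>".split()
reference = "a cat sat on the mat <eos>".split()   # e.g. a retrieved passage
model = lambda prefix: target[len(prefix) - 1]     # deterministic toy "LLM"

out = llma_decode(model, ["<bos>"], reference)
print(" ".join(out))  # <bos> the cat sat on the mat <eos>
```

Here the overlap with the reference ("cat sat on the mat") lets the decoder accept four tokens from a single verification round instead of four sequential decoding steps.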


ColossalChat: An Open-source Solution for Cloning ChatGPT with A Complete RLHF Pipeline

Colossal-AI open-sources a complete RLHF pipeline that includes supervised data collection, supervised fine-tuning, reward model training, and reinforcement learning fine-tuning, based on the LLaMA pre-trained model, and shares ColossalChat, a practical open-source project that closely replicates the original ChatGPT technical approach.


Google’s CoLT5 Processes Extremely Long Inputs via Conditional Computation

A Google Research team addresses transformers’ input sequence limitations in the new paper CoLT5: Faster Long-Range Transformers with Conditional Computation, proposing CoLT5 (Conditional LongT5), a family of models that applies a novel conditional computation approach for higher quality and faster long-input processing of up to 64,000 tokens.
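The routing idea behind conditional computation can be illustrated with a toy feed-forward layer: every token takes a cheap branch, and only a router-selected subset additionally takes the expensive branch. (CoLT5 applies this to both FFN and attention layers inside LongT5 and uses a soft top-k during training; the sketch below, with made-up sizes and weights, only shows the hard-routing inference behaviour.)

```python
import numpy as np

rng = np.random.default_rng(0)

def conditional_ffn(x, router_w, heavy, light, k):
    """Toy conditional feed-forward: light path for all tokens, heavy path
    (scaled by router score) only for the top-k tokens by router score."""
    scores = x @ router_w                          # one routing score per token
    top = np.argsort(scores)[-k:]                  # the k "important" tokens
    out = light(x)                                 # cheap branch for everyone
    out[top] += scores[top, None] * heavy(x[top])  # expensive branch for top-k
    return out

n, d, k = 16, 4, 4
x = rng.normal(size=(n, d))
router_w = rng.normal(size=d)
W_h = rng.normal(size=(d, d))
light = lambda h: h @ (0.1 * np.eye(d))   # cheap projection
heavy = lambda h: np.tanh(h @ W_h)        # expensive transform

y = conditional_ffn(x, router_w, heavy, light, k)
print(y.shape)  # (16, 4)
```

With k fixed as the sequence grows, the heavy computation stays roughly constant, which is what lets the model scale to very long inputs.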


OpenAI, OpenResearch & UPenn Paper Considers How GPTs Will Impact the US Labour Market

In the new paper GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models, a research team from OpenAI, OpenResearch, and the University of Pennsylvania investigates the potential impact of LLMs like GPT on the US labour market, shedding light on the economic, social, and policy implications.


Microsoft’s MathPrompter Dramatically Improves LLM Performance on Mathematical Reasoning Tasks

In the new paper MathPrompter: Mathematical Reasoning Using Large Language Models, a Microsoft Research team presents MathPrompter, a novel approach that leverages chain-of-thought (CoT) prompting techniques to improve LLM performance on mathematical reasoning problems and increase confidence in their predictions.
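The confidence-building step in MathPrompter works by prompting for several independent solutions to a templated problem (e.g. an algebraic expression and a Python function) and cross-checking them on random variable assignments before trusting the final answer. In the sketch below, both "LLM-generated" solvers and the problem template are invented for illustration:

```python
import random

# Hypothetical templated problem: "A ticket costs A dollars; how much do
# B tickets cost after a C% discount?" with two "LLM-generated" solutions.
algebraic = lambda A, B, C: A * B * (1 - C / 100)   # algebraic expression

def python_solution(A, B, C):                       # generated Python code
    total = A * B
    return total - total * C / 100

def consensus(solvers, trials=5):
    """Evaluate all candidate solutions on random variable assignments and
    accept only if they agree on every trial."""
    for _ in range(trials):
        A = random.randint(1, 100)
        B = random.randint(1, 20)
        C = random.randint(0, 90)
        outs = [s(A, B, C) for s in solvers]
        if any(abs(o - outs[0]) > 1e-9 for o in outs):
            return False
    return True

assert consensus([algebraic, python_solution])
# Agreement established, so answer the original instance (A=15, B=4, C=10):
print(algebraic(15, 4, 10))  # 54.0
```

If the solvers disagree on any random assignment, the pipeline regenerates solutions instead of emitting an answer, which is what raises confidence in the predictions.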


Tackling Hallucinations: Microsoft’s LLM-Augmenter Boosts ChatGPT’s Factual Answer Score

In the new paper Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback, a Microsoft Research and Columbia University team presents LLM-Augmenter, a system that augments black-box large language models with a set of plug-and-play modules to significantly improve the factuality of their responses.


CMU & Inspired Cognition’s DocPrompting Improves Code Generation by Retrieving Relevant Documentation

In the new paper DocPrompting: Generating Code by Retrieving the Docs, a research team from Carnegie Mellon University and Inspired Cognition presents DocPrompting, a natural-language-to-code generation approach. Tasked with generating code that calls unseen functions or libraries from a natural language intent, DocPrompting retrieves the corresponding code documentation to enable the model to perform the task.
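The retrieve-then-generate structure is easy to sketch. This toy uses a word-overlap retriever in place of the paper's BM25 or trained dense retrievers, and the documentation pool and intent are invented; the retrieved snippets are simply prepended to the generation prompt:

```python
def retrieve_docs(query, docs, top_k=2):
    """Toy sparse retriever: rank documentation entries by word overlap
    with the natural-language intent."""
    q = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: -len(q & set(d.lower().split())))
    return ranked[:top_k]

docs = [
    "tar -x extracts files from an archive",
    "tar -c creates a new archive",
    "ls -l lists files in long format",
]
intent = "extract all files from the archive"

# Build the prompt the code generator would see: docs first, then the intent.
prompt = "\n".join(retrieve_docs(intent, docs)) + "\n# Intent: " + intent + "\n# Code:"
print(prompt)
```

Because the documentation is looked up at inference time, the generator can handle functions and libraries it never saw during training.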


DeepMind’s Speculative Sampling Achieves 2–2.5x Decoding Speedups in Large Language Models

In the new paper Accelerating Large Language Model Decoding with Speculative Sampling, a DeepMind research team presents SpS (Speculative Sampling), an algorithm that achieves 2–2.5x decoding speedups on a 70 billion parameter Chinchilla language model. The novel approach maintains sample quality and does not require any modifications to model parameters or architecture.
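The acceptance rule that keeps speculative sampling lossless is worth seeing concretely. The toy below uses fixed (context-free) next-token distributions as stand-ins for the two models; real draft and target models condition on the generated prefix, and the target model scores all k draft tokens in one parallel pass:

```python
import numpy as np

rng = np.random.default_rng(0)

def speculative_round(p_target, p_draft, k=4):
    """One round: the draft model proposes up to k tokens; each is accepted
    with probability min(1, p/q), and on rejection a replacement token is
    drawn from the residual distribution (p - q)+. This makes the emitted
    tokens exactly distributed as samples from the target model."""
    out = []
    for _ in range(k):
        x = int(rng.choice(len(p_draft), p=p_draft))        # draft proposal
        if rng.random() < min(1.0, p_target[x] / p_draft[x]):
            out.append(x)                                    # accepted
        else:
            resid = np.maximum(p_target - p_draft, 0.0)      # residual (p - q)+
            out.append(int(rng.choice(len(resid), p=resid / resid.sum())))
            break                                            # round ends here
    return out

p_target = np.array([0.6, 0.3, 0.1])   # "large model" next-token distribution
p_draft = np.array([0.3, 0.4, 0.3])    # "small draft model" distribution
print(speculative_round(p_target, p_draft))
```

The speed-up comes from accepting several draft tokens per expensive target-model call; when draft and target mostly agree, most proposals survive verification.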


Stanford U’s DetectGPT Takes a Curvature-Based Approach to LLM-Generated Text Detection

In the new paper DetectGPT: Zero-Shot Machine-Generated Text Detection Using Probability Curvature, a Stanford University research team presents DetectGPT, a zero-shot machine-generated text detection algorithm that uses probability curvature to predict whether a candidate passage was generated by a large language model.
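The curvature criterion can be demonstrated on a synthetic stand-in: here a "passage" is a single real number and the LLM's log-probability is replaced by a hand-picked log-density (the real method perturbs text with a mask-filling model such as T5 and scores it with the LLM under test). Machine text, sitting near a mode of the model's distribution, loses log-probability under perturbation; human text on a near-linear slope does not, on average:

```python
import numpy as np

rng = np.random.default_rng(1)

def detect_score(log_prob, x, n_perturb=2000, sigma=0.5):
    """DetectGPT-style score: log p(x) minus the mean log-prob of perturbed
    copies of x. A large positive score means x sits at a local maximum of
    log p, as machine-generated samples tend to."""
    perturbed = x + rng.normal(0.0, sigma, size=n_perturb)
    return log_prob(x) - np.mean([log_prob(z) for z in perturbed])

# Toy "model": log-density peaked at 2 with near-linear tails.
log_prob = lambda t: -np.log(np.cosh(t - 2.0))

machine_score = detect_score(log_prob, 2.0)  # at the mode: clearly positive
human_score = detect_score(log_prob, 8.0)    # on the slope: near zero
print(machine_score > human_score)  # True
```

Thresholding this score gives a zero-shot detector: no training data from the suspected generator is needed, only the ability to query its log-probabilities.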


Microsoft’s Structured Prompting Breaks In-Context Learning Length Limits, Scales to Thousands of Examples

In the new paper Structured Prompting: Scaling In-Context Learning to 1,000 Examples, a Microsoft Research team proposes structured prompting. The novel approach breaks through conventional in-context learning length limits, scaling to thousands of examples with reduced computation complexity and superior performance and stability.


DeepMind & UCL Fine-tune a 70B Parameter LM to Generate Statements Agreeable to Humans with Diverse Opinions

In the new paper Fine-tuning Language Models To Find Agreement Among Humans With Diverse Preferences, a research team from DeepMind and University College London fine-tunes a 70 billion parameter language model to generate statements that maximize agreement among a human group with diverse written opinions.


MIT, Northeastern & Technion Propose ROME for Efficient Locating and Editing of Factual Associations in GPT Models

In the new paper Locating and Editing Factual Associations in GPT, a research team from MIT CSAIL, Northeastern University and Technion IIT examines how information flows during knowledge recall in large autoregressive transformers and introduces Rank-One Model Editing (ROME), a simple, zero-shot principled model editor capable of locating and editing factual associations in such models.
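The "rank-one" part of ROME can be illustrated with plain linear algebra. This sketch applies a minimal rank-one update forcing a weight matrix to map a chosen key vector to a chosen value vector while leaving inputs orthogonal to the key untouched; the full ROME update additionally weights the update by a covariance estimate of the layer's inputs, and the key/value vectors are derived from the model rather than chosen freely, so treat this as a simplified illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

def rank_one_edit(W, k, v):
    """Return W' = W + (v - W k) k^T / (k^T k), so that W' @ k == v while
    inputs orthogonal to k are mapped exactly as before."""
    return W + np.outer(v - W @ k, k) / (k @ k)

d_in, d_out = 8, 6
W = rng.normal(size=(d_out, d_in))   # stand-in for an FFN projection matrix
k = rng.normal(size=d_in)            # "key": the subject's representation
v = rng.normal(size=d_out)           # "value": representation of the new fact

W_new = rank_one_edit(W, k, v)
print(np.allclose(W_new @ k, v))     # True: the association is rewritten

other = rng.normal(size=d_in)
other -= (other @ k) / (k @ k) * k   # project out the key direction
print(np.allclose(W_new @ other, W @ other))  # True: unrelated inputs unchanged
```

Because the change has rank one, the edit surgically rewrites a single key-to-value association, which is what makes ROME-style editing cheap compared with fine-tuning.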