Reddit - The heart of the internet

[R] End-to-End Test-Time Training for Long Context

u/karansdalal

Research

We formulate long-context language modeling as a problem in continual learning rather than architecture design. Under this formulation, we only use a standard architecture – a Transformer with sliding-window attention. However, our model continues learning at test time via next-token prediction on the given context, compressing the context it reads into its weights. In addition, we improve the model’s initialization for learning at test time via meta-learning at training time. Overall, our method, a form of Test-Time Training (TTT), is End-to-End (E2E) both at test time (via next-token prediction) and training time (via meta-learning), in contrast to previous forms. We conduct extensive experiments with a focus on scaling properties. In particular, for 3B models trained with 164B tokens, our method (TTT-E2E) scales with context length in the same way as Transformer with full attention, while others, such as Mamba 2 and Gated DeltaNet, do not. However, similar to RNNs, TTT-E2E has constant inference latency regardless of context length, making it 2.7× faster than full attention for 128K context. Our code is publicly available.

Ilya Sutskever is puzzled by the gap between AI benchmarks and the economic impact [D]

u/we_are_mammals

Discussion

In a recent interview, Ilya Sutskever said:

This is one of the very confusing things about the models right now. How to reconcile the fact that they are doing so well on evals... And you look at the evals and you go "Those are pretty hard evals"... They are doing so well! But the economic impact seems to be dramatically behind.

I'm sure Ilya is familiar with the idea of "leakage", and he's still puzzled. So how do you explain it?

Edit: GPT-5.2 Thinking scored 70% on GDPval, meaning its output was rated as good as or better than that of industry professionals in about 70% of comparisons on economically valuable, well-specified knowledge work spanning 44 occupations.

Researching Manufacturing Workflows – Looking for Ideas on Where AI Can Actually Help [R]

u/Public-Air3181

Research

Hey everyone,

I’m currently doing research on how manufacturing units actually work on the ground, especially from a safety and operations point of view. My goal is to understand real workflows and then explore where AI can realistically be implemented, not just theoretically.

The areas I’m focusing on are:

1. Behaviour-Based Safety Management (tracking PPE usage, unsafe actions, safety compliance, observations, etc.)

2. Accident, Incident & Investigation Management (incident reporting, root cause analysis, near-miss detection, prevention)

3. Permit-to-Work Management (hot work permits, confined space permits, approvals, compliance checks)

4. Visitor & Vehicle Management (entry/exit logs, safety induction, vehicle movement, restricted zones)

5. Safety Training Management (training effectiveness, compliance tracking, refreshers, behavior change)

Most of the data in these environments is still manual (Excel sheets, registers, WhatsApp photos, CCTV footage). I’m trying to research:

•	How these processes actually run in real factories

•	Where AI/ML, computer vision, NLP, or automation could reduce manual work

•	What would be useful vs overkill in a real manufacturing setup

r/MachineLearning
