Dayi Lin
Toronto, Ontario, Canada
1K followers
500+ connections
Websites
- Personal Website: http://lindayi.me
- ResearchGate: https://researchgate.net/profile/dayi_lin
- GitHub: https://github.com/lindayi
About
Activity
-
Dayi Lin reposted this:
Beyond Theory: The Reality of Software Engineering in the Age of AI
The pace of evolution is mind-bending. Come join us this week at ICSE Rio, every day at 16:00 (4 PM) in Oceania III, to get a raw, firsthand glimpse of what is actually happening in the trenches and what the industry urgently needs. From the shift to Agentic SE 3.0, to the industrial rigor required to build Foundation Models, and the rise of Vibe Engineering, we are covering the full spectrum of how software and SE are being reimagined in the age of AI.
🔽 Wednesday | 1️⃣ Agentic Software Engineering: A Roadmap to Software Engineering 3.0
We are moving beyond AI-assisted development (SE 2.0) into a world where AI teammates act as true collaborators (SE 3.0). This session introduces the Agentic SE framework and provides an empirical look at over 450,000 agent-authored pull requests to address the "speed vs. trust" gap. (For a deeper dive into this shift, check out my book: https://lnkd.in/g2mwzax9)
Joint work with: Hao Li, Dayi Lin, Bram Adams, Tse-Hsun Peter Chen, Yutaro Kashiwa, Dong Qiu, Haoxiang Zhang, Miku Watanabe
🔽 Thursday | 2️⃣ Software Engineering for Foundation Models (SE4FM)
Foundation models are among the most complex systems humanity has ever built, yet their creation often relies on "tribal knowledge". Based on our experience building some of the leading foundation models, we distill a rigorous engineering approach organized around four pillars: DataOps, ExperimentOps, EvalOps, and FieldOps.
Joint work with: Boyuan Chen, Dayi Lin, Arthur Leung, zhilong chen, Gopi Krishnan Rajbahadur, Gustavo Ansaldi Oliva, Yihao Chen, Xiaoshuang Liu, Chun Yong Chong
🔽 Friday | 3️⃣ Vibe Engineering: Software Engineering for Software Makers
"Vibe coding" allows non-technical makers to build apps by describing desired outcomes. We introduce a structured framework that marries Intent Engineering (using Theory of Mind to capture what users actually need) with Realization Engineering (an AI teammate execution environment) to make this new way of building software robust and trustworthy.
Joint work with: Keheliya Gallaba, Ph.D., Zhiyu Fan, Jiahuei Lin, Ph.D., Filipe Roseiro Côgo, Ben Rombaut, Dayi Lin
See you at ICSE - International Conference on Software Engineering in Rio! 🌴 #ICSE2026 #SoftwareEngineering #AgenticSE #GenerativeAI #VibeCoding #FoundationModels #SE30
-
Dayi Lin shared this:
Can we empower non-technical people to vibe code software with the same quality guarantees as professional developers? In other words: how do we transform Vibe Coding into verifiable, high-quality Vibe Engineering?
At the #AAAI26 Next-Gen Code Development with Collaborative AI Agents workshop, I shared a high-level overview of our exploration in this direction: what we call Software Engineering for Software Makers (SE4SM).
Even with model capabilities improving drastically and paradigms like Spec-Driven Development gaining popularity by the day, there are still critical unanswered questions:
- The style and granularity you use when talking to an intern versus your CTO are certainly different. Shouldn't code agents also tailor their communication to different users?
- The readers of specs — and in some cases even the authors — are code agents. Should specs be optimized for agents rather than remaining human-readable?
- Current code agents still follow a linear design → develop → verify process. This waterfall workflow was historically constrained by the human cost of rework. But now that agents have dramatically reduced development costs, do we still need to finalize all design decisions before writing a single line of code?
- Do testing and verification really have to wait until after development is complete?
Would love to hear your thoughts — drop a comment below! 👇
Work done with my awesome collaborators Keheliya Gallaba, Ph.D., Zhiyu Fan, Jiahuei Lin, Ph.D., Filipe Roseiro Côgo, Ben Rombaut, Ahmed E. Hassan, and many more talented colleagues.
-
Dayi Lin shared this:
⚠️ Google just highlighted Codeforces Elo as *the* coding benchmark for Gemini 3 Deep Think, claiming a score of 3455. Here's why I STRONGLY advise against using Codeforces Elo scores to compare LLMs' coding capabilities.
Which contests? Which divisions? How were submissions ordered? How many runs? None of this is disclosed.
As someone who leads coding-capability evaluation for in-house LLM training, our new preprint "When Elo Lies" shows why, from our experience and experiments, Elo alone is meaningless. For the SAME model on the SAME contest:
→ Changing submission order shifts Elo by up to 394 points (11% of Gemini 3 Deep Think's Elo score)
→ Choosing different contest divisions swings it by up to 1,122 points (32% of Gemini 3 Deep Think's Elo score)
→ Run-to-run variability causes up to 349 points difference in mean Elo across repeated runs on identical problems (10% of Gemini 3 Deep Think's Elo score)
These variances are large enough to completely reorder model rankings.
Evaluation is not a leaderboard number. It's identifying regressions, diagnosing defects, and guiding improvement direction. Using Elo as the metric on Codeforces problems, rather than more granular measures like Pass@n, serves none of these purposes.
Next time you see a Codeforces Elo in a model card, ask: where are the evaluation details? We should be pushing for transparency and decomposable metrics, not headline numbers.
📄 https://lnkd.in/giPWfaBa
Kudos to my co-authors: Shenyu Zheng, Ximing Dong, Xiaoshuang Liu, Gustavo Ansaldi Oliva, Chun Yong Chong, Boyuan (Nemo) Chen, Shaowei Wang, and Ahmed E. Hassan.
#LLM #AI #Evaluation #Benchmarking #Codeforces #Gemini
When Elo Lies: Hidden Biases in Codeforces-Based Evaluation of Large Language Models
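The order-dependence claim is easy to see mechanically. Below is a minimal, self-contained sketch (not the paper's methodology; the K-factor, starting rating, and problem difficulties are made-up illustrative values) showing that sequential Elo updates over the same set of solved/failed problems yield different final ratings depending on submission order:

```python
# Illustrative only: shows that with sequential Elo updates, the *order* of
# the same wins and losses changes the final rating. All numbers are toy
# assumptions, not values from the "When Elo Lies" paper.
import itertools

def expected_score(rating, opponent):
    """Standard Elo expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))

def run_elo(outcomes, start=1500.0, k=32.0):
    """Apply sequential Elo updates; each outcome is (problem_rating, solved)."""
    r = start
    for difficulty, solved in outcomes:
        r += k * ((1.0 if solved else 0.0) - expected_score(r, difficulty))
    return r

# The same five submissions (problem difficulty, solved or not), every ordering.
submissions = [(1200, True), (1600, True), (2000, False), (2400, True), (2800, False)]
finals = [run_elo(order) for order in itertools.permutations(submissions)]
print(f"final Elo ranges from {min(finals):.0f} to {max(finals):.0f} "
      f"across {len(finals)} orderings of identical results")
```

With identical results, only the sequencing differs, yet the final rating does not: exactly the ambiguity the post says model cards leave undisclosed.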
-
Dayi Lin shared this:
Tired of waiting for your LLM to reason over a long time? Excited to share our latest work, SemanticSpec, which achieves a speedup of up to 2.7x beyond existing token- and sequence-level speculative decoding techniques, by estimating semantic equivalence using model internal states and accepting more semantically equivalent valid drafts, while maintaining output quality.
⚡ The problem: Traditional speculative decoding accelerates LLM outputs by drafting and verifying tokens in parallel, but still treats each token in isolation, ignoring the fact that different token sequences can carry the same meaning. This leads to unnecessary rejections and slower throughput.
💡 The innovation: SemanticSpec shifts speculative decoding from token-level to semantic-level reasoning. That means it:
- Drafts and verifies full semantic sequences rather than just tokens.
- Estimates the likelihood that the model would generate any sequence with the same meaning by probing its internal hidden states, not just surface token probabilities.
- Trains a lightweight predictor to estimate this "semantic probability", increasing acceptance of semantically correct drafts.
🔍 Why it matters: This approach aligns decoding with meaning rather than token patterns, marking a significant step toward more efficient, semantically intelligent LLM deployment, especially for models that reason deeply.
Check out our preprint for details: https://lnkd.in/gWuJFHbd
In collaboration with Ximing Dong, Shaowei Wang, Boyuan (Nemo) Chen, and Ahmed E. Hassan.
#LLMInference #SpeculativeDecoding #ModelServing
Beyond Tokens: Semantic-Aware Speculative Decoding for Efficient Inference by Probing Internal States
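For context, the token-level baseline that SemanticSpec improves on works roughly as sketched below. This is a toy illustration with made-up probability tables, not the paper's code: a draft token is kept with probability min(1, p_target/p_draft), and the first rejection discards the rest of the draft even when the rejected continuation would have carried the same meaning.

```python
# A minimal sketch of *standard token-level* speculative decoding acceptance
# (the baseline described in the post), using toy unconditional probability
# tables instead of real draft/target models. Illustrative assumptions only.
import random

def speculative_accept(draft_tokens, p_draft, p_target):
    """Accept draft token t with probability min(1, p_target[t] / p_draft[t]);
    stop at the first rejection, as in standard speculative decoding."""
    accepted = []
    for t in draft_tokens:
        if random.random() < min(1.0, p_target[t] / p_draft[t]):
            accepted.append(t)
        else:
            # Token-level rejection: the remainder of the draft is discarded,
            # even if, e.g., "fast" vs "quick" would be semantically equivalent.
            break
    return accepted

random.seed(0)
p_draft  = {"fast": 0.6, "quick": 0.3, "slow": 0.1}  # draft model's token probs
p_target = {"fast": 0.3, "quick": 0.6, "slow": 0.1}  # target model's token probs
print(speculative_accept(["fast", "quick"], p_draft, p_target))
```

The comparison against surface token probabilities is exactly what the post argues is too strict: semantically equivalent drafts get rejected purely because the token identities differ.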
-
Dayi Lin posted this:
I'll be in San Diego this week attending NeurIPS'25. Say hi if you see me and let's chat if you work in LLM4Code (coding model training, coding agent design, and their synergy)! Check out some of the exciting recent research from our team:
- Our vision of the roadmap ahead for *reliable, trustworthy* agentic software engineering: https://lnkd.in/gPvVpy3A
- Fully automated SWE-Bench-Verified-level issue labelling pipeline: https://lnkd.in/gD53b4a9
- Reliable workflow generation with DSL + program repair: https://lnkd.in/gW-AB3QV
- Evaluating agent performance under resource constraints: https://lnkd.in/gmJXfDyV
- Better human-agent intent alignment with Theory of Mind: https://lnkd.in/gUq-Q6E2
… and many more! If you're interested in any of the related topics and potential opportunities in Canada / China, come talk to me and my teammates Kirill Vasilevski, Arthur Leung, and Jiayuan Zhou, Ph.D. at the event.
#NeurIPS #LLM4Code #AgenticSE
-
Dayi Lin posted this:
I'm attending ASE 2025 / AIware 2025 in Seoul. Our team is presenting four papers this week across ASE and AIware — if you're attending, feel free to drop by our sessions!
📌 SPICE: An Automated SWE-Bench Labeling Pipeline for Issue Clarity, Test Coverage, and Effort Estimation
🗓 Mon 17 Nov 2025, 14:20–14:30 @ Grand Hall 5
➡️ A fully automated pipeline that generates high-quality clarity, coverage, and effort labels for SWE-bench tasks at dramatically lower cost and at scale.
with Aaditya Bhatia, Gustavo Ansaldi Oliva, Gopi Krishnan Rajbahadur, Haoxiang Zhang, Yihao Chen, zhilong chen, Arthur Leung, Boyuan (Nemo) Chen, Ahmed E. Hassan
📌 Watson: A Cognitive Observability Framework for the Reasoning of LLM-Powered Agents
🗓 Tue 18 Nov 2025, 12:00–12:10 @ Vista
➡️ A framework that reconstructs an LLM agent's internal reasoning to improve transparency and enable automatic reasoning-based corrections.
with Ben Rombaut, Sogol Masoumzadeh, Kirill Vasilevski, Ahmed E. Hassan
📌 Context-Aware CodeLLM Eviction for AI-assisted Coding
🗓 Tue 18 Nov 2025, 16:40–16:50 @ Grand Hall 6
➡️ A novel context-aware model eviction strategy designed specifically to optimize self-hosted CodeLLM serving under resource constraints.
with Kishanthan Thangarajah, Boyuan (Nemo) Chen, Shi (Charles) C., Ahmed E. Hassan
📌 PromptExp: Multi-granularity Prompt Explanation of Large Language Models
🗓 Thu 20 Nov 2025, 16:00–16:08 @ Grand Hall 1
➡️ A method for explaining prompt behavior at multiple granularities, enabling better debuggability and transparency of prompt and context engineering.
with Ximing Dong, Shaowei Wang, Gopi Krishnan Rajbahadur, Ahmed E. Hassan
We also have a super exciting program for AIware, with industry talks from Anthropic, Mistral AI, Microsoft, Google, ByteDance, and many more academic and industry leaders. You wouldn't want to miss it!
Say hi if you see me around :)
#ASE2025 #AIware2025 #AI4SE #CodeLLM
-
Dayi Lin shared this, commenting: "It was a pleasure discussing agentic software engineering with the fellow panelists. We are witnessing and experiencing a historical transformation of our industry."
The shared post:
🌐 Linux Foundation Panel – Why Agentic SE is a fundamental shift
A bit belated, but I wanted to share a highlight from late this summer: I had the privilege of chairing a wonderful panel of experts for The Linux Foundation to discuss the future of our field. We dove deep into the Agentic SE vision and what it truly means for all of us. This isn't just an incremental change; it's a fundamental shift.
We touched on the big, challenging questions:
👥 Talent: What skills define the "1,000x developer" when AI agents can write the code? (Hint: It's more about orchestration and mentorship than syntax.)
⚙️ Process: How do our entire software development life cycles, from CI/CD to code review, need to evolve to manage this new world?
💡 Creativity: How does our creative focus shift from implementation details to high-level architecture and intent, amplifying human ingenuity?
The energy in the discussion was incredible. If you're as excited as I am about how these ideas will reshape our world, this conversation is well worth your time. You can watch the full panel discussion here: https://lnkd.in/dfcFd7-v
Grateful to our outstanding panellists for their insights and time: Cor-Paul Bezemer, Dayi Lin, Ph.D., Keheliya Gallaba, Ph.D., Maliheh (Mali) Izadi, Mehdi Keshani
Panel: Agentic SE & Open Source in the FMware Era – Toward Trustworthy, Context-Aware FMware
-
Dayi Lin reposted this:
Effectively and efficiently testing fine-tuned deep learning models: this paper was just accepted for publication in IEEE Transactions on Software Engineering. "MetaSel: A Test Selection Approach for Fine-tuned DNN Models", Amin Abbasi, Mahboubeh Dadkhah, PhD, Lionel Briand, Dayi Lin, Ph.D.
Summary: Deep neural networks (DNNs) face challenges during deployment due to data distribution shifts. Fine-tuning adapts pre-trained models to new contexts that require smaller labeled sets. However, testing fine-tuned models under constrained labeling budgets remains a critical challenge. This paper introduces MetaSel, a new approach tailored to fine-tuned DNN models for selecting tests from unlabeled inputs.
This work is part of Amin Abbasi's PhD research at the University of Ottawa and a project led by Mahboubeh Dadkhah, PhD, as part of her postdoctoral fellowship responsibilities, focusing on the automatic testing of deep learning models. It was financially supported by Huawei Canada, with whom we carefully defined the problem and assessed the solution.
Thank you to the reviewers and associate editor, who provided excellent feedback, significantly improving our original submission.
Preprint: https://lnkd.in/eQ4hdTmP
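To make the problem setting concrete, here is a simple baseline sketch, not MetaSel itself (the paper defines its own scoring): given a labeling budget, rank unlabeled inputs for testing, here by disagreement between the pre-trained and fine-tuned models plus the fine-tuned model's own prediction uncertainty. All data below is synthetic.

```python
# A generic test-prioritization baseline for the fine-tuned-model setting the
# paper addresses. Illustrative assumptions throughout; not the MetaSel method.
import numpy as np

def select_tests(probs_pretrained, probs_finetuned, budget):
    """Rank unlabeled inputs; prefer those where the fine-tuned model disagrees
    with the pre-trained model, or is itself uncertain (small top-2 margin)."""
    disagree = (probs_pretrained.argmax(1) != probs_finetuned.argmax(1)).astype(float)
    top2 = np.sort(probs_finetuned, axis=1)[:, -2:]     # two largest class probs
    margin_uncertainty = 1.0 - (top2[:, 1] - top2[:, 0])  # small margin => uncertain
    score = disagree + margin_uncertainty
    return np.argsort(-score)[:budget]  # indices of inputs to label and test first

rng = np.random.default_rng(0)
pre = rng.dirichlet(np.ones(5), size=1000)   # toy softmax outputs, 5 classes
fine = rng.dirichlet(np.ones(5), size=1000)
print(select_tests(pre, fine, budget=50))
```

The point of approaches like MetaSel is to spend the constrained labeling budget on the inputs most likely to expose misbehavior introduced (or inherited) by fine-tuning, rather than sampling uniformly.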
-
Dayi Lin shared this, commenting: "Reproducibility of models (both training and inference) is such an important but often overlooked topic for trustworthy engineering of AI systems, and I'm glad it's getting traction in the AI community as well. If you're interested, check out our method addressing both software- and hardware-level randomness in our ICSE'22 paper: https://lnkd.in/gBBYiu3H"
The shared post:
Today Thinking Machines Lab is launching our research blog, Connectionism. Our first blog post is "Defeating Nondeterminism in LLM Inference". We believe that science is better when shared. Connectionism will cover topics as varied as our research is: from kernel numerics to prompt engineering. Here we share what we are working on and connect with the research community frequently and openly.
The name Connectionism is a throwback to an earlier era of AI; it was the name of the subfield in the 1980s that studied neural networks and their similarity to biological brains.
https://lnkd.in/gKHbbJ_y
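As a concrete starting point for the software-level randomness the comment mentions, the widely used determinism recipe for PyTorch runs looks like the sketch below. This is the standard community recipe, not the specific method from the linked ICSE'22 paper (which additionally addresses hardware-level nondeterminism):

```python
# A minimal sketch of the common software-level determinism recipe for PyTorch.
import os
import random

import numpy as np
import torch

def make_deterministic(seed: int = 42) -> None:
    random.seed(seed)                       # Python's RNG
    np.random.seed(seed)                    # NumPy's global RNG
    torch.manual_seed(seed)                 # CPU RNG (also seeds CUDA on recent PyTorch)
    torch.cuda.manual_seed_all(seed)        # all CUDA device RNGs
    torch.backends.cudnn.benchmark = False  # disable nondeterministic autotuning
    # Force deterministic kernels; PyTorch raises an error if an op has no
    # deterministic implementation, which surfaces hidden nondeterminism early.
    torch.use_deterministic_algorithms(True)
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"  # required by cuBLAS for determinism

make_deterministic(42)
```

Even with all of this, identical outputs are only guaranteed on the same software stack and hardware, which is precisely why the hardware-level side of the problem matters.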
-
Dayi Lin liked this:
#ICSE2026 has been a blast! Thank you for attending our Software Engineering for Foundation Models (SE4FM) technical briefing, brought to you by our team: Boyuan Chen, Dayi Lin, Gopi Krishnan Rajbahadur, zhilong chen, Gustavo Ansaldi Oliva, Yihao Chen, Xiaoshuang Liu, Chun Yong Chong, Ahmed E. Hassan
-
Dayi Lin liked this:
After five years at GitHub, I've decided to move on and am joining XBOW.
I'll always treasure my time at GitHub. More than anything, it's the people — Hubbers are just genuinely kind. I can't count how many times I dropped into a team's Slack channel asking for help and someone went out of their way to jump in.
The highlight of my time at GitHub was being part of the team that created GitHub Copilot. At the time, I don't think any of us fully realized what it would become or how much it would change things. Being one of its creators will likely be a career highlight for me.
I spent my five years on GitHub Next, a rare mix of research, prototyping, and product incubation. It was a place where we could take ambitious ideas, build them, and try to make them real — whether early Copilot prototypes or more recent work like Agentic Workflows. I'm excited to see what the team builds next.
At XBOW, I'll be Head of Platform Engineering. I've been watching the company grow for some time, and I'm especially excited to be reunited with several former teammates from the early Copilot days. They've built something special, and I'm excited to be part of it.
-
Dayi Lin liked this:
"Software is not made by one person in a vacuum. It's a team sport. Everyone building it needs to agree on what they're building and why." - Maggie Appleton (staff research engineer at GitHub Next) in https://lnkd.in/gnEXAXG9, which you should read... right now.
I joined GitHub 10+ years ago excited to help humans make better software - together. Whether at work or home. Over the past few years the industry has leaned in heavily to AI, focusing almost exclusively on accelerating the individual: one person commanding an army of agents. To quote Maggie again: "The main problem with this dream is it assumes software is made by one person."
The fine folks at GitHub Next (the R&D team that launched the original GitHub Copilot) have been hard at work on solving this problem, looking at how we shift alignment left and help humans work together to decide what to build. Curating the human context that helps ensure the agents build the right thing. Ultimately, saving time (collapsing feedback cycles and requiring less rework) and tokens (generating less "throw away" work).
I've watched the Next team build Ace (the agentic collaboration environment) for the past many months, and this might be the most excited I have been for something we're building at GitHub. We've kept tight lips about this effort for many months, but now it's time to start sharing.
Ace seeks to help humans do the important work of alignment and collaboration. It helps teams gather input from all those who should be involved (development, architecture, design, domain experts, etc.) and welcomes them into the development process. Not everyone has to be a developer (or vibe coder), but everyone can bring their expertise.
Maggie's post discusses the alignment challenge in depth and shows a number of demos of how Ace seeks to solve it.
Ace is still a work in progress. We're not quite ready to welcome you in, but we'll soon be rolling out a tech preview. If you want to sign up, the link is available at the end of her post. Which, again, you should read :)
-
Dayi Lin liked this:
Last day of ICSE 2026! 🇧🇷 We're closing out an incredible week in Rio with one final deep dive.
If you've heard the buzz about "vibe coding," you know it is the current state of the art for empowering non-technical "software makers" to build applications just by describing what they want. But as many of us have seen, these apps can be fragile when things get complex. How do we take this movement from "cool demo" to "production-ready"?
Join us today at 4:00 PM in Oceania III for our technical briefing: "Vibe Engineering: Software Engineering for Software Makers." We'll be discussing how to bridge the gap between human intent and agent execution.
It's the final session of the conference. But we promise it'll be worth staying for before we all head out for those post-ICSE celebrations! ☺️
Details:
📍 Room: Oceania III
🕓 Time: 16:00 (90 mins)
See you there! #ICSE2026 #VibeEngineering #VibeCoding #SoftwareEngineering #GenerativeAI #AgenticSE #Innovation
cc Filipe Roseiro Côgo, Jiahuei Lin, Ph.D., Dayi Lin, Ahmed E. Hassan, Zhiyu Fan, Ben Rombaut
-
Dayi Lin liked this:
Foundation models are among the most complex software systems humans have ever built. And right now, most of them are engineered through tribal knowledge and ad hoc processes. That needs to change!
Today at ICSE 2026 in Rio de Janeiro, Oceania @ 4 PM, our team is presenting a 90-minute technical briefing on Software Engineering for Foundation Models (SE4FM). We are giving the community an insider view of how foundation models are really built at scale, organized around four pillars:
1. DataOps: deciding, curating, optimizing, and governing the data that shapes model behavior
2. ExperimentOps: hypothesis-driven experimentation from small proxy runs to full-scale training
3. EvalOps: systematic testing, regression detection, and contamination-resistant evaluation
4. FieldOps: releases, monitoring, compliance, and incident response in production
This is not a survey talk. Every method, constraint, and lesson comes from our work improving the code generation performance of Pangu (Huawei's foundation model) to the frontier, placing it in the same performance range as the top 3 open-source FMs.
If you are at ICSE, join us: Thursday April 16, 4:00 PM, Oceania III, Windsor Convention Center
Boyuan Chen, Dayi Lin, Arthur Leung, zhilong chen, Gustavo Ansaldi Oliva, Yihao Chen, Xiaoshuang Liu, Chun Yong Chong, Ahmed E. Hassan
#ICSE2026 #SoftwareEngineering #FoundationModels #SE4FM #LLM #DataOps #MLOps #AIEngineering #HuaweiResearch #MachineLearning #LLMTraining #RioDeJaneiro
-
Dayi Lin liked this:
The AGENT'26 workshop kicked off this morning. Come join us in Oceania VIII. Ahmed E. Hassan, David Lo, Liming Zhu, Satish Chandra, Robert Feldt, Lingming Zhang, Chao Peng, Gustavo Soares
-
Dayi Lin liked this:
Excited to have just finished presenting an industry keynote for the FORGE conference at Rio de Janeiro this weekend, co-located with ICSE - International Conference on Software Engineering 2026. We got to cover a lot of topics on everyone's minds: how AI is reshaping software engineering, and how AI is impacting human developers, including "brain fry", "AI psychosis", and what I'm calling the "Zane moment" that's coming for companies. Also: giving agency back to developers through "Slow Code" and "Napkin Software". Thanks Chao Peng, Gustavo Pinto for the invitation!
Experience & Education
-
Huawei Canada
********* ******* **** ***** *** *** ****
-
****** ************ ****** **** ****
***** **********
-
******* ****
**** *********
-
******* **********
****** ** ********** ******* ********* 4.2 / 4.3
-
Publications
Languages
- English: Full professional proficiency
- Mandarin: Native or bilingual proficiency