Miloš Švaňa

ai, python, software engineering, decision-making and more

  • My Arduino RGB controller is now even better

    I had a lot of fun turning an Arduino board into an RGB controller. If I don’t count building a PC and a few guided projects from the Arduino project book, this was my first real hardware project in years.

    The controller does what it’s supposed to – it changes the LED colors in my PC based on the current hardware usage: higher RAM usage means more blue, higher CPU usage means more green, and higher GPU usage means more red. There is also an ambient white light when the PC is turned off. All of this is quite a success, if you ask me.

    But as a tinkerer, I am never fully satisfied. There are always opportunities for improvement. So, for version 2, I focused on the following changes:

    • Reduce the intensity of the ambient white light when the PC is turned off. It’s annoying at night.
    • Control the maximum intensity of the LEDs when the PC is turned on for the same reason.
    • Make the transition between colors smoother. It just looks nicer.
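
    The color mapping and the three changes above can be sketched in a few lines. This is only an illustration in Python, not the firmware running on the Arduino: the function names, the `max_intensity` cap, and the `max_step` smoothing parameter are all assumptions made up for this example.

```python
def usage_to_rgb(cpu, gpu, ram, max_intensity=255):
    """Map usage fractions (0.0 to 1.0) to an (R, G, B) tuple.

    GPU usage drives red, CPU usage drives green, RAM usage drives blue.
    max_intensity is a hypothetical brightness cap (0 to 255).
    """
    def scale(fraction):
        clamped = min(max(fraction, 0.0), 1.0)  # guard against bad readings
        return round(clamped * max_intensity)

    return (scale(gpu), scale(cpu), scale(ram))


def step_towards(current, target, max_step=5):
    """Move each channel of `current` at most `max_step` closer to `target`.

    Calling this once per frame yields a gradual fade instead of a jump.
    """
    return tuple(
        c + max(-max_step, min(max_step, t - c))
        for c, t in zip(current, target)
    )
```

    Calling `step_towards` repeatedly with the latest `usage_to_rgb` result as the target would produce the smoother transitions, and a lower `max_intensity` would dim the LEDs at night.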
    (more…)
  • I rewrote the Embeddings Playground backend in Rust

    Embeddings Playground is an environment where you can experiment with different text embedding models without the hassle of setting up API keys or writing code. It features a small backend that computes embeddings if you choose to use some of the original SentenceTransformers models. It was originally written in Python, the default choice for AI and ML applications.

    Being the default has its advantages: a large community, well-supported libraries, and ease of finding a job. But there is also a disadvantage: I use Python every day at my job, and it’s kinda boring. So, I’ve decided that when working on a side project, I’ll use something else. And I started rewriting the Embeddings Playground backend in Rust.

    (more…)
  • Reading Club: Build a Large Language Model (from Scratch)

    Language models developed by OpenAI, Anthropic, and other AI labs both excite and horrify me. At the same time, I like to tinker and am a big proponent of building things locally yourself. Add all these things together, and you get someone eager to build their own LLM. I thought reading Build a Large Language Model (from Scratch) by Sebastian Raschka was a good place to start. Did it deliver?

    (more…)
  • I built an RGB controller with Arduino

    Before Christmas, I bought a new computer case. It came with three built-in RGB fans. Unfortunately, the fans use a 3-pin aRGB connection, while my now quite old motherboard has a 4-pin RGB header. I am sure that I could find an adapter that would solve this issue right away. But there is no fun in that. Having experimented with Arduino for some time, I decided it was time to use my skills to build something useful.

    (more…)
  • Do LLMs hallucinate more in Czech than in English?

    Having explored hallucination benchmarks for LLMs, I’ve decided to use the TruthfulQA dataset to see if LLMs hallucinate more when I talk to them in Czech instead of English.

    This question is important for two reasons. First, if you are a user or a developer integrating LLMs into various apps, and you need to interact with LLMs in Czech (or other languages that are not English), the answer should influence your choice of model. Second, the answer might help ML researchers, governmental bodies, and investors decide if developing language or country-specific language models is worth the effort.

    (more…)
  • Benchmarking LLMs for hallucinations

    How often do different language models hallucinate? You might have noticed that when presenting new models, AI labs usually avoid discussing this question. Their marketing departments probably discourage them from talking about hallucinations out loud. But maybe a more important reason is that benchmarking hallucinations is not as straightforward as we think.

    (more…)
  • Reading Club: Against the Machine

    I recently finished Against the Machine by Paul Kingsnorth. The book critiques what Kingsnorth calls “the Machine”. He never defines the term directly. Instead, he describes its many faces: technology sneaking into every aspect of our lives, worship of money, centralization of power, dissolution of physical communities, homogenization of culture through globalization, or obsession with technological progress, no matter the consequences.

    You might have just rolled your eyes. People have always criticized the society they live in, and some of these topics are often associated with conspiracy theorists. So does this book offer anything new?

    (more…)
  • Hallucinations in LLMs: What are they and why we have them

    On February 24, 2025, a federal district court sanctioned three lawyers from the law firm Morgan & Morgan for citing fake cases. As it turned out, these cases were generated by an AI tool. Mistakes made by large language models, such as GPT, have real-world consequences.

    We often call such mistakes hallucinations. A hallucination occurs when a language model, the technology behind most modern AI tools, generates a factually incorrect, logically incoherent, or otherwise defective answer. When language models hallucinate, they often sound confident. This makes hallucinations dangerous.

    To deepen my knowledge, I read over 20 academic papers on hallucinations. Now, I want to share what I learned. In this article, I’ll focus on defining and categorizing hallucinations, their underlying causes, and answering a crucial question: Are hallucinations inevitable? In a follow-up, I’ll talk about hallucination detection and mitigation.

    (more…)
  • Moving local GPU workflows to the cloud with Rsync

    The GPU rental market is a bit of a mess.

    First, not even big cloud providers can guarantee GPU availability at all times and places. Sure, you could keep the VM running for months, but that would burn your budget quite quickly, especially if you are a solo developer or work in a small team.

    Second, permanent storage is severely limited. Some providers offer persistent drives whose lifecycles are independent of the VMs. But these drives are still usually bound to a specific region. What if that region is currently out of GPUs? That drive is now essentially useless.

    I’ve been thinking about how to organize my workflow around these limitations. Here is what I have come up with so far.

    (more…)
  • Implementing a fully local AI coding agent is hard

    For the last few weeks, I have been working on FileChat — a read-only AI coding agent. As a proponent of privacy and independence from third-party services, I wanted FileChat to be fully local. But making FileChat’s AI components run locally turned out to be more difficult than I initially thought.

    FileChat relies on AI models in two places: creating embeddings to quickly find relevant files, and chat. Let’s start with the first one.
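
    To make the first use more concrete, here is a rough sketch of how embeddings can be used to find relevant files: each file is represented by a vector, and files are ranked by cosine similarity to the vector of the query. This is not FileChat's actual code. The file paths and the three-dimensional vectors are made up for illustration; a real embedding model would produce much longer vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_files(query_vector, file_vectors, top_k=2):
    """Return the `top_k` file paths most similar to the query vector."""
    ranked = sorted(
        file_vectors,
        key=lambda path: cosine_similarity(query_vector, file_vectors[path]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical, hand-written "embeddings" for three files.
file_vectors = {
    "auth/login.py": [0.9, 0.1, 0.0],
    "db/models.py": [0.1, 0.9, 0.2],
    "README.md": [0.3, 0.3, 0.3],
}
query = [1.0, 0.0, 0.1]  # stands in for the embedded user question
print(rank_files(query, file_vectors))
```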

    (more…)
  • Introducing FileChat: A local read-only AI coding assistant

    Today I am introducing the first alpha version of FileChat — a local, read-only AI coding assistant.

    Coding assistants from big players write code for you. FileChat is read-only. It sees your files, and you can chat about them. But FileChat cannot, by design, directly modify them.

    Why this design choice? Because I don’t trust AI. Don’t get me wrong, I still think it’s useful. It can spot issues in my code, help me understand large projects, or brainstorm new ideas. But given my interest in AI safety, I simply don’t want an LLM to write code for me.

    And although the scientific evidence is inconclusive, I also believe, with enough certainty to act on it, that overreliance on AI tools can make us dumber in the long term.

    If you are more on the conservative side as well, FileChat might be the right tool for you.

    (more…)
  • How I Solved PyTorch’s Cross-Platform Nightmare

    Setting up a Python project that relies on PyTorch, so that it works across different accelerators and operating systems, is a nightmare.

    Most PyTorch projects I work on are internal or run on a server that my colleagues and I can control. In these situations, I am fine with combining optional dependencies with indices as recommended in the UV documentation.

    However, I recently started working on FileChat — my own opinionated AI coding assistant (stay tuned). I plan to distribute FileChat as a Python package. As it turns out, once you build a wheel for distribution, any custom indices go out of the window — they are not included in the metadata. It would be up to the user to configure them properly during installation.

    But this isn’t what I want. I want a single-command install, no matter what hardware or OS you are using. So, what’s the solution?

    (more…)
  • Reading Club: Lessons from AI Safety for Businesses

    Businesses and organizations often seem misaligned with our actual needs and wants. In the second installment of the Reading Club series, I want to explore this issue from the perspective of AI safety research.

    We want our businesses and organizations to be aligned with our values. The same can be said about powerful AI systems. AI safety researchers call it the alignment problem. One of the most important scientific papers addressing the alignment problem is Risks from Learned Optimization. It discusses how optimization processes used to train AI models can create models that are themselves optimizers, and how these optimizers can be misaligned with the original goal the AI system was trained to achieve. Let’s unpack this idea step-by-step.

    (more…)
  • I trained a GPT-1-like model from scratch

    As part of advancing my knowledge of ML and AI, I challenged myself to recreate a model similar to GPT-1 from scratch.

    GPT-1 is a predecessor of the model that’s currently powering ChatGPT, and many other AI companies use a similar architecture for their own language models. Its task is simple: given some input text, predict what comes next.
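
    To illustrate the task itself, here is a toy sketch of next-token prediction. It is not GPT-1 and not the code from this project: a simple table of bigram counts stands in for the neural network, and the tiny corpus is made up. What it shares with a real language model is the generation loop: look at the text so far, pick the next token, append it, and repeat.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count which token follows which in the training sequence."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, prompt, max_new_tokens=5):
    """Greedily append the most frequent follower of the last token."""
    output = list(prompt)
    for _ in range(max_new_tokens):
        followers = counts.get(output[-1])
        if not followers:
            break  # the last token was never seen with a follower
        output.append(followers.most_common(1)[0][0])
    return output


# Made-up training text, split into word-level tokens.
corpus = "the cat sat on the mat and the cat sat on the rug".split()
model = train_bigram(corpus)
print(" ".join(generate(model, ["the"], max_new_tokens=3)))
```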

    In this article, I walk you through the entire process of creating a model similar to GPT-1. I start with data preparation and end with generating text with a trained model. Then I discuss a few things I’d do differently knowing what I know now. I assume you have a basic knowledge of PyTorch and the transformer architecture. If you lack this knowledge, you can start with the PyTorch tutorials and the excellent videos on neural networks from 3Blue1Brown.

    I reference various parts of the project’s code throughout the article. But if you want first to explore the codebase as a whole, here is the GitHub repository.

    (more…)
  • Reading Club: Concrete Problems in AI Safety

    Concrete Problems in AI Safety is one of the most famous scientific papers on AI safety. It was published in 2016, six years before OpenAI introduced the first version of ChatGPT.

    The paper describes five intertwined categories of what the authors call accidents: cases where AI systems behave unexpectedly and cause damage. In the next few paragraphs, I’ll attempt to explain the gist of each category. Then I’ll share a few thoughts on whether the paper is still relevant today. When it was being written, researchers were thinking about AI differently than we do. They envisioned autonomous AI agents that would learn to perform specific tasks from their experience and mistakes. Modern language models don’t work like that.

    (more…)
  • Why we should take AI safety seriously

    People have varying opinions on AI safety and risks. I think this topic is extremely important for several intertwined reasons:

    • We may develop artificial general intelligence (AGI) or even artificial superintelligence (ASI) very soon. We can loosely define AGI as an AI system that can do most intellectual tasks at the same level as an average human, and ASI as an AI system that can do most intellectual tasks at a higher level than an expert in a given field.
    • AGI or ASI can endanger individuals or societies in many ways.
    • We are uncertain about many aspects of AI.
    • Compared to investments in better AI capabilities, AI safety research is severely underfunded.

    Let’s explore these arguments in depth.

    (more…)
  • Hope, Hype, and Hell: Mapping the AI Safety Spectrum

    For a while, I have been exploring the world of technical AI safety and AI policy. In future articles, I’d like to dive into arguments for why I think this field is important and why we should take AI-related risks seriously. But before I get to that, I’d like to talk a bit about the different views on AI safety I’ve encountered while interacting with people in various online communities.

    (more…)
  • Criteria in multiple criteria decision analysis

    If you haven’t done so already, please read Better decisions with Multiple Criteria Decision Analysis before continuing.

    During our recent discussion on Even Swaps, I mentioned that this method significantly simplifies how we work with criteria. We don’t have to analyze them in depth, we just make tradeoffs between their values.

    But I also said that the Even Swaps method has many limitations. If we want to solve more complex problems, we need something different. Most other multiple criteria decision methods require us to analyze criteria and their values in more depth.

    (more…)
  • Even Swaps: making decisions via trade-offs

    In a previous article, we started discussing multiple criteria decision analysis (MCDA). We went over the whole process of solving a decision problem with MCDA. Well, almost. We skipped two crucial parts: a detailed look at the concept of criteria and, more importantly, an introduction to any specific MCDA method.

    Today, we are going to address the second issue and talk about Even Swaps. Even Swaps is a very simple decision-making method that does not require any complex mathematical formulas. You can start using it quickly without much hassle. Unlike other methods, Even Swaps also significantly simplifies handling criteria, so for the time being, we can postpone a discussion on this topic.

    (more…)
  • Better decisions with Multiple Criteria Decision Analysis

    Some decisions are easy, some are very hard. Sometimes, decisions are hard because we must consider many viewpoints.

    Say you want to buy a house. To make a good decision, you must consider size, location, price, energy efficiency, and many other factors. And you need to make tradeoffs. Maybe you found a cheap but small house far from your workplace. A different house is bigger and closer, but much more expensive. Which one should you choose?

    Multiple Criteria Decision Analysis (MCDA) is a set of tools designed to solve problems like this. Let’s investigate what this toolbox looks like and how it helps you solve problems in your own life.
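
    To give a first taste of the house example, here is a minimal sketch of a weighted sum, one of the simplest ways to combine several criteria into a single score. It is my choice for illustration only, not necessarily a method discussed in the article, and the prices, sizes, commute times, and weights are all made up.

```python
def normalize(values, higher_is_better):
    """Rescale raw values to 0..1, where 1 is always the best value."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]  # all alternatives are equally good
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def weighted_scores(alternatives, criteria):
    """Return {alternative name: weighted sum of normalized values}."""
    names = list(alternatives)
    scores = dict.fromkeys(names, 0.0)
    for criterion, (weight, higher_is_better) in criteria.items():
        raw = [alternatives[name][criterion] for name in names]
        for name, value in zip(names, normalize(raw, higher_is_better)):
            scores[name] += weight * value
    return scores


# Hypothetical houses and weights; the weights sum to 1.
houses = {
    "cheap, small, far": {"price": 200_000, "size_m2": 80, "commute_min": 60},
    "big, close, expensive": {"price": 350_000, "size_m2": 140, "commute_min": 15},
}
criteria = {
    # criterion: (weight, higher_is_better)
    "price": (0.4, False),
    "size_m2": (0.35, True),
    "commute_min": (0.25, False),
}
scores = weighted_scores(houses, criteria)
print(max(scores, key=scores.get))
```

    The result depends entirely on the weights. With these made-up numbers, the bigger house wins, but shifting weight from size and commute to price would flip the outcome, which is exactly the kind of tradeoff the house example describes.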

    (more…)