Stories by Kevin Foong on Medium

Axios Supply Chain Compromise: Safeguarding Your Projects Against Dependency Poisoning

Kevin Foong — Mon, 13 Apr 2026 05:43:33 GMT

Recently, a supply chain compromise affected axios, a library used by over 80 million projects weekly. This briefly impacted one of the projects I work on, FormSG, where a 3rd-party dependency in our CI pipeline pulled a poisoned axios version. This led to a brief compromise of one of our staging environments.

After a post-mortem, we identified several simple yet effective strategies that can be applied to almost any project to drastically reduce exposure to future supply chain attacks.

Why should I be concerned about supply chain attacks? How common are they?

Supply chain attacks happen frequently and are becoming the norm. Attackers target a popular package, compromise the maintainer’s account (often through sophisticated social engineering; in the case of the xz utils backdoor, the attacker spent years gaining the maintainer’s trust and may have been state-sponsored), and publish malware-laden versions to run arbitrary code or steal credentials and secrets. Log4j, although it was a vulnerability rather than a deliberate compromise, showed how far the impact of a single widely used dependency can reach.

In the past 12 months alone, we’ve seen high-profile packages such as axios and nx being compromised. Even security tools like Trivy are not spared. As long as you install from registries like npm or PyPI, you are managing a web of third-party trust. Attackers don’t only target obscure packages; they usually target the giants to maximise the blast radius. Scarily, these are only the known or detected compromises. The xz backdoor almost went undiscovered.

Technical Recommendations

1. Restrict Post-install Scripts

Why: Malicious code often hides in postinstall hooks to execute immediately upon download. This was a primary vector in the axios compromise.

Actionable Tips:

npm: Use npm ci --ignore-scripts in CI and selectively allow only trusted scripts.
pnpm: Since pnpm denies scripts by default, audit your onlyBuiltDependencies list. For FormSG, we pruned this list from 16 dependencies to just 4.

JSON

{
  "pnpm": {
    "onlyBuiltDependencies": [
      "sharp",
      "sqlite3"
    ]
  }
}

This fragment goes in package.json (pnpm). JSON does not allow comments, so review each entry outside the file: for example, do we really need sqlite3 here?
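To decide which scripts to trust, it helps to first see which installed packages declare install-time hooks at all. The following is a minimal Python sketch, not part of FormSG's tooling: the function name find_install_scripts is hypothetical, and it assumes a flat node_modules layout where each dependency's package.json sits at the top level (or one level down for scoped packages).

```python
import json
from pathlib import Path

# Lifecycle hooks that run automatically while a package is being installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def find_install_scripts(node_modules: Path) -> dict[str, dict[str, str]]:
    """Map package name -> install-time scripts declared in its package.json."""
    found: dict[str, dict[str, str]] = {}
    # Cover both plain packages (pkg/) and scoped ones (@scope/pkg/).
    manifests = list(node_modules.glob("*/package.json"))
    manifests += list(node_modules.glob("@*/*/package.json"))
    for manifest in manifests:
        try:
            data = json.loads(manifest.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # unreadable manifest: skip it rather than abort the audit
        if not isinstance(data, dict):
            continue
        scripts = data.get("scripts")
        if not isinstance(scripts, dict):
            continue
        hooks = {name: scripts[name] for name in INSTALL_HOOKS if name in scripts}
        if hooks:
            found[data.get("name", manifest.parent.name)] = hooks
    return found
```

Calling find_install_scripts(Path("node_modules")) returns the packages that declare such hooks, which can then be compared against the onlyBuiltDependencies list. It only inspects declared scripts in directly visible packages, so treat it as a starting point rather than a complete audit.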

2. Strict Version Pinning (Zero “Floating” Tags)

Why: Using latest or floating tags (like :lts-alpine) allows compromised transitive dependencies to slip into your builds. In our case, an unpinned CLI tool pulled the poisoned axios version during a CI run.

Actionable Tips:

Docker: Avoid node:lts-alpine. Use specific versions or even SHA256 hashes.
GitHub Actions: Use commit hashes rather than tags (e.g., v4). This prevents an attacker from overwriting a tag with malicious code.

# Dockerfile: Pin to a specific version
FROM node:22.22.2-alpine3.22

# GitHub Actions: Pin to a specific commit hash
- uses: chromaui/action@c93e0bc3a63aa176e14a75b61a31847cbfdd341c # v1

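npx: Avoid raw npx commands. Use npm exec --no (older npm versions call this flag --no-install, which current versions treat as deprecated and convert to --no) so that the command fails instead of downloading a package that is not already installed from your lockfile. Consider adding the tools you previously ran through npx to your devDependencies so that all their transitive versions are pinned.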

For example, FormSG replaced usage of npx and pinned our dependencies and actions.
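Pinning is easy to let slip over time, so a small check in CI can help. Below is a minimal Python sketch that flags GitHub Actions references not pinned to a full 40-character commit hash. The function name unpinned_actions is hypothetical, and the line-based matching is an assumption that works for typical workflow files but is not a full YAML parser.

```python
import re

# Captures the value of a "uses:" key, stopping before whitespace or a trailing comment.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^\s'\"#]+)")
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return action references that are not pinned to a full commit SHA."""
    flagged = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if not match:
            continue
        ref = match.group(1)
        # Local actions (./path) live in the repository itself, and docker://
        # references are pinned by image digest instead, so both are skipped here.
        if ref.startswith("./") or ref.startswith("docker://"):
            continue
        _, _, version = ref.partition("@")
        if not FULL_SHA_RE.match(version):
            flagged.append(ref)
    return flagged
```

Running this over each file in .github/workflows and failing the build when the returned list is non-empty is one way to keep floating tags from creeping back in.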

3. Enforce Frozen Lockfiles

Why: Running a standard npm install in CI can update your lockfile and pull in "new" (and potentially poisoned) versions. Using --frozen-lockfile (pnpm) or npm ci ensures your build uses the exact versions verified during development.

Pro-tip: While pnpm detects CI environments automatically, it doesn’t always do so inside a Docker container. Explicitly define it:

# Ensure production builds match the lockfile exactly
# pnpm:
RUN pnpm install --frozen-lockfile
# or, if the project uses npm:
RUN npm ci

4. Use Non-Privileged Users in Docker

Why: If a dependency is compromised and executes a shell, a root user gives the attacker full control over the container.

# Create a system user for the app
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

# Run the application
CMD ["node", "dist/index.js"]

5. Use Multi-Stage Docker Builds for Production

Why: A single-stage Dockerfile often contains build tools (compilers, git), secrets such as SSH keys, and the full node_modules (including devDependencies). If an attacker gains shell access, these give them a ready-made toolkit to penetrate further into your infrastructure. Splitting your Dockerfile into build and production stages ensures your final image contains only the bare essentials.

Actionable Tips:

Use a build stage to install all dependencies and compile your code.
Use a production stage to copy only the compiled dist folder and the production-ready node_modules.
This significantly reduces the attack surface and results in much smaller images.

FormSG’s production Dockerfile applies the practices above and can serve as a reference (see the FormSG repository linked in the sources below).

6. Isolate Environments (The Blast Radius)

Why: Isolation prevents a compromise in Staging from reaching Production. At FormSG, our IaC (Infrastructure as Code) work to fully isolate staging, UAT, and production environments limited the axios compromise to a single staging environment.

Action: If using AWS, create separate AWS accounts within an organisation for each environment. Accounts are free security boundaries.

7. Use GitHub OIDC for Cloud Roles

Why: Instead of storing long-lived AWS keys in GitHub Secrets, use OIDC. This allows you to specify exactly which repository and branch can assume the role, limiting the damage if credentials are leaked.

Example: AWS IAM Trust Policy snippet for GitHub Actions:

JSON

{
  "Effect": "Allow",
  "Principal": { "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com" },
  "Action": "sts:AssumeRoleWithWebIdentity",
  "Condition": {
    "StringLike": {
      "token.actions.githubusercontent.com:sub": "repo:ORG/REPO:ref:refs/heads/main"
    }
  }
}

Replace ORG/REPO with your own GitHub organisation and repository.
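To see why the exact value of the sub condition matters, the sketch below approximates how a StringLike pattern is compared with the sub claim that GitHub puts in its OIDC token. This is an illustration in Python, not AWS code: fnmatchcase stands in for IAM's wildcard matching (it additionally treats square brackets specially), and the organisation and repository names are hypothetical.

```python
from fnmatch import fnmatchcase


def sub_matches(pattern: str, sub_claim: str) -> bool:
    """Approximate IAM StringLike: case-sensitive, with * and ? wildcards."""
    return fnmatchcase(sub_claim, pattern)


# Hypothetical names, for illustration only.
STRICT = "repo:example-org/example-repo:ref:refs/heads/main"
BROAD = "repo:example-org/*"

main_branch = "repo:example-org/example-repo:ref:refs/heads/main"
feature_branch = "repo:example-org/example-repo:ref:refs/heads/feature-x"
other_repo = "repo:example-org/other-repo:ref:refs/heads/main"

for claim in (main_branch, feature_branch, other_repo):
    print(claim, "strict:", sub_matches(STRICT, claim), "broad:", sub_matches(BROAD, claim))
```

The strict pattern only accepts workflows running on the main branch of one repository, while the broad pattern would let any branch of any repository in the organisation assume the role, which is a much larger blast radius if one of them is compromised.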

Sources and credits:

The Internet Was Weeks Away From Disaster and No One Knew — Veritasium https://www.youtube.com/watch?v=aoag03mSuXQ
Axios supply chain compromise post-mortem https://github.com/axios/axios/issues/10636
Trivy supply chain compromise post-mortem https://github.com/aquasecurity/trivy/discussions/10425
The FormSG open source project https://github.com/opengovsg/FormSG

The skill of writing easy to review PRs (with lesser known git tricks)

Kevin Foong — Mon, 13 Apr 2026 04:22:41 GMT

Writing better PRs is one of the best long-term investments you can make in your career as a software engineer. In this article, I will share some simple but effective tips that will make your PRs significantly easier to review, with live examples practiced by myself and the FormSG team. These tips will help your PRs get picked up for review much faster and build trust within your team, and your reviewers will thank you for it.

Why is this important? How will this help me?

Getting your PRs merged in much quicker

Have you ever had PRs stay “in review” for longer than you would like? Every software engineer can likely relate to the experience of asking fellow engineers to review their code changes. Often, there is an implicit resistance or lag before reviews come in, because reviewing requires context-switching and takes up the reviewer’s limited bandwidth.

Making your PRs easier to review helps reduce this lag, builds trust amongst fellow engineers and helps bring your changes to users faster.

Unlocking the true value of code reviews — building release confidence

There is a common software engineering saying that big PRs often receive an LGTM, whereas small PRs usually receive a more in-depth review in which issues surface more easily. This is because looking for issues in a big, poorly scoped PR is like looking for a needle in a haystack: it’s challenging for the reviewer to do.

If service reliability matters to you and you’d like to minimize the risk of production incidents, it is crucial that reviewers are able to easily spot and surface issues. For example, in FormSG, writing smaller, specifically scoped PRs across the engineering team has led to a significant drop in production incidents caused by code changes.

Firstly, what exactly makes a PR easy to review?

Fundamentally, it boils down to this simple idea:

A PR is easier to review if the intention of the PR is focused, clear and well-scoped (and code changes are limited to this intention).

This idea unlocks a chain of benefits:

smaller PRs
easier understanding and verification of the clear and limited scope by the reviewer
better ability for the reviewer to give quality feedback and spot issues
less resistance to picking up reviews in the team, leading to your code being merged in quicker
higher release confidence and more reliable services

Beyond the raw LoC count of a single PR, PRs are often hard to review because they mix in too much irrelevant scope. For example, a simple feature PR might include unrelated refactors found along the way. While the boy scout rule is appreciated, these refactors should go in a separate PR if possible.

The argument for larger scopes and PRs

Often, scope is combined into one PR because it saves the PR writer’s time. It is deemed much more convenient to package every desired code change into a single PR to avoid the hassle of making multiple feature branches and commits.

However, consider the total engineering time lost to reviewers’ resistance to a large PR, to potential issues missed in review, and to remediating the resulting incidents. Writing smaller, well-scoped PRs usually ends up being a net positive for the wider team. It also has the nice effect of letting PRs serve as a clean documentation trail for the team (using tools such as git blame).

Furthermore, writing smaller scoped PRs is a skill that can be developed. By practicing the following tips, you will be able to master this skill and achieve smaller and well-scoped PRs without compromising on velocity as a PR writer.

Learning technical recommendations with a deep-dive example

Let’s deep dive into a case study using my personal work as an example:

Previously, when implementing a save draft feature, I practiced the “boy scout rule” and tried to include as many changes as possible to “improve” the code. However, this led to a large and poorly scoped PR. This made it challenging to review and contributed to production incidents caused by a missed useEffect reference dependency issue.

Real life example

What issues are there in this PR?

Firstly, looking at the commit history — we can see some applications of this “boy scout rule”.

hoist state to provider is an example, where I tried to make the state reusable. However, this is a large change and could have been split into its own PR, leading to a much more focused scope and more effective review. Bundling it into the feature PR ultimately contributed to a production incident.

Secondly, the commit history might not really be useful to a reviewer. Everything is jumbled and stray commits like feat: remove console logs add noise for the reviewer. These commits could have been merged into the offending commits.

Ultimately, we can see this is a large PR due to the desire to combine multiple intentions (hoisting the provider and implementing a save draft feature). This led to firstly, resistance by reviewers to review this massive change and secondly, missed issues during review which led to production incidents.

What did I do to make my PRs easier to review and address my mistakes?

In a subsequent feature release, I applied the tips in this article to make my PRs easier to review. This led to a more effective review process: issues were flagged, PRs were picked up for review with less resistance, and release confidence was higher.

Splitting PRs into scoped and standalone parts using git tricks

Scoped PR with its intention clearly described

Immediately, you might be able to tell a few differences.

The number of lines of code changed is significantly smaller, leading to quicker reviews per PR.
Each PR is split out and focused, which makes it easy to understand and verify. The one above, for example, is a “boy scout” improvement that only does some abstraction.
Each change now has its own clearly defined context and “why” behind the change, leading to clear documentation for future engineering teams.

Concrete steps to achieve the above:

1. Breaking down a feature into multiple standalone parts

For example, if you’re working on a pdf generation feature, you could break it down into parts. This limits and keeps the scope of each PR focused, reducing the size and allowing the PRs to be more easily understood and verified.

This is easier said than done. The complexity is that you might only discover the changes that need to be made during the implementation phase, so you cannot break the feature down in advance. For example, you might only realise the need for an abstraction halfway through working on the feature. The git tricks in the review-by-commit section below help with this.

2. Practicing review-by-commit

In certain cases, PRs may not be able to be decomposed further and a minimal set of changes must be merged atomically.

In this case, you can apply the review-by-commit approach described below.

For example, in the above PR, the build scripts across multiple packages need to be updated together.

As you can see, each commit is scoped and can be reviewed sequentially by itself. This can also be applied together with other techniques in this article.

How can you neatly organize your commits like above? By using some simple but useful git tricks.

1. Be selective about what goes into each commit. Use tools such as VS Code’s selective staging.

Selectively stage changes for commit

Make sure to stage changes for commit selectively. Most modern IDEs have tooling to help with this. In the above screenshot, I staged only the release_hotfix changes and included them in their own scoped commit, feat: update release hotfix script.

By having granular commits, you can then re-order and combine these commits easily in the next step.

2. Make use of git interactive rebase

The reorder and fixup commands are extremely useful for merging related commits into a single commit and reordering your commits to achieve a clear narrative for your reviewers.

Tutorial:

Example of the current state of my commits

Suppose I discover some further changes I need to make to the release_hotfix script after I’ve made the above feat: update release hotfix script commit.

I have a new change to the hotfix script

I would like to re-order or merge this commit with the feat: update release hotfix script commit to paint a clear narrative to the reviewer.

Hence, I will run the following command: git rebase -i HEAD~5

This allows me to reorder or fixup (merge) into the hotfix script commit to keep it neat

In this case, I have fixed up my add conditional support commit into the above update release hotfix script commit

As you can see, the new add conditional support commit is now merged into the feat: update release hotfix script commit.

This maintains the commit narrative and organizes your commits. Whether to use fixup or simply reorder the commit can be decided on based on context.

Benefits

When you do review by commit, it potentially replaces the “Changes made” section of the PR body, since each commit scopes and lists the changes made. A reviewer can focus on reviewing each commit to understand if each change achieves the intended result.

Moving back to part 1, how do we split into multiple PRs if we only know of the changes needed while halfway working on the feature?

By applying the review-by-commit structure above, we can simply group commits logically into their own PRs! This unlocks the idea of “stacked” PRs even for unforeseen required code changes.

You can now split a single PR with the following commit chain into 4 well-scoped PRs, allowing you to further break down a feature into standalone parts!

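This can be done with commands such as the following, where A to D are the commits at the boundaries of each group and main is assumed to be the base branch:

git checkout -b PR1 main   # start PR1 from the base branch
git cherry-pick A^..B      # copy commits A through B (inclusive)
git checkout -b PR2 PR1    # stack PR2 on top of PR1
git cherry-pick C^..D      # copy commits C through D (inclusive)

Each group of commits is now its own well-scoped and tiny PR!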

3. Conventional commits

Conventional commits is a useful way to further document your commits. By specifying the type of commit, you can signal to reviewers what each change aims to do.

Furthermore, these commits are machine-parsable and allow you to generate, e.g., changelogs or semantic version bumps from them. There is an extensive list of tooling based on this style of commits here: https://www.conventionalcommits.org/en/about/#tooling-for-conventional-commits

For example, the FormSG team uses our conventional commit history to generate changelogs and semantic version bumps. Example here: https://github.com/opengovsg/FormSG/pull/9193
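To make "machine-parsable" concrete, here is a minimal Python sketch that parses a commit header and suggests a semantic version bump. It is not the tooling FormSG uses: parse_header and suggest_bump are hypothetical names, only the first line of each commit is inspected (BREAKING CHANGE footers are ignored), and the mapping of feat to minor and fix to patch follows the usual Conventional Commits convention.

```python
import re
from typing import Optional

HEADER_RE = re.compile(
    r"^(?P<type>[a-zA-Z]+)"       # e.g. feat, fix, chore
    r"(?:\((?P<scope>[^)]+)\))?"  # optional scope in parentheses
    r"(?P<breaking>!)?"           # optional breaking-change marker
    r": (?P<description>.+)$"
)


def parse_header(header: str) -> Optional[dict]:
    """Split a commit header into type, scope, breaking flag and description."""
    match = HEADER_RE.match(header.strip())
    return match.groupdict() if match else None


def suggest_bump(headers: list[str]) -> str:
    """Pick the largest version bump implied by a list of commit headers."""
    bump = "none"
    for header in headers:
        parsed = parse_header(header)
        if parsed is None:
            continue  # not a conventional commit: ignored by this sketch
        if parsed["breaking"]:
            return "major"
        if parsed["type"] == "feat":
            bump = "minor"
        elif parsed["type"] == "fix" and bump == "none":
            bump = "patch"
    return bump
```

A header such as feat: update release hotfix script parses cleanly, whereas a stray message like remove console logs does not, which is also a quick way to spot commits that should be fixed up before review.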

4. Writing a clear PR body and leaving comments on your code for gotchas considered

At a minimum, every PR body should very clearly describe the problem that the change aims to solve. This answers the first question reviewers usually have: “Why is this change even being introduced?” The problem should be as focused as possible, and splitting work into multiple “stacked” PRs as described above makes it even more focused.

For specific changes with special gotchas or tradeoffs, consider adding a comment in your PR on that specific LoC to explain the “why” behind the change.

For example, looking through your own PR and writing down “why” a change is made: https://github.com/opengovsg/FormSG/pull/9193#discussion_r2944167373

Results from making this change

After implementing the simple methodologies above, I found that:

reviews of my code have been much more effective, flagging more issues that would otherwise have been missed
release confidence has been elevated
my PRs are picked up for review much faster, leading to a higher release cadence and more impact for our users.

References and credits:

My fellow FormSG engineers for being the inspiration for some of these learnings. View our open source code base here: https://github.com/opengovsg/FormSG

Understanding parallel computing: Choosing the right algorithm based on topology

Kevin Foong — Sat, 11 Nov 2023 08:24:33 GMT

As mentioned in the title, this article aims to share some insights on why understanding the topology of interconnection networks in your parallel system is important in the field of parallel computing.

What are interconnection networks? What is topology?

In computer hardware, interconnection networks are used to transfer data between components such as cores, memories and caches.

What do we mean by topology? It refers to the shape and arrangement of hardware components. For example, topology describes how individual computers are arranged to form a parallel cluster in a parallel system.

Understanding different interconnection networks is important as they affect the types of parallel algorithms which can be used, which may result in very different runtime complexities!

The topology used greatly affects the properties of the parallel compute system, such as inter-node communication latency and bandwidth which in turn affect the performance of parallel algorithms.

Additionally, topology also affects the scalability of the system, energy consumption and more.

Understanding how topology affects parallel runtime through the sort problem

Let’s have a look at the classical sort problem, where we are given n elements and have to sort them in increasing order.

We also make the assumption that we have n nodes (though this is seldom the case in the real world) for the purpose of illustrating the importance of topology in parallel computing.

Let’s introduce 2 new parallel sorting algorithms:

Odd-even transposition sort

Odd-even transposition is a sort algorithm designed specially for parallel systems following the chain topology interconnection.

We can map each number to a node in the chain topology.

After which, we compare pairs of numbers in parallel (indicated by the red arrows), swapping only if not in increasing order.

After O(n) steps, we have managed to reach the sorted solution.

The runtime of the algorithm is O(n).
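The steps above can be simulated sequentially to see how the phases work. This is a minimal Python sketch: the inner loop visits the neighbour pairs one after another, whereas on a real chain of n nodes all pairs in a phase would be compared at the same time, which is what makes each phase a single time step.

```python
def odd_even_transposition_sort(values: list[int]) -> list[int]:
    """Simulate the sort on a chain of len(values) nodes, one value per node."""
    nodes = list(values)
    n = len(nodes)
    for phase in range(n):  # n phases are enough to guarantee a sorted chain
        start = phase % 2   # alternate between the two sets of neighbour pairs
        # On parallel hardware every pair below is compared simultaneously.
        for left in range(start, n - 1, 2):
            if nodes[left] > nodes[left + 1]:
                nodes[left], nodes[left + 1] = nodes[left + 1], nodes[left]
    return nodes


print(odd_even_transposition_sort([5, 1, 4, 2, 3]))
```

Because each node only ever talks to its direct neighbours, the algorithm fits the chain topology without any multi-hop communication.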

Shear sort

Shear sort is another parallel sorting algorithm which works best on mesh topology interconnection. Let’s have a look at how it works.

Once again, we map each number to each node in 2D mesh topology.

After the sort, we will get a sorted list of numbers following the ‘snake-like’ pattern shown in orange.

For shear sort, we perform the odd-even transposition sort above row and column wise until we eventually achieve the sorted outcome.

Each row/column sort takes O(sqrt(n)) time since the rows (or columns) are sorted in parallel, and at most log2(n) + 1 such steps are needed.

Hence, the runtime using shear sort is O(sqrt(n) * (log2(n) + 1)), which simplifies to O(sqrt(n) log n).
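The row and column phases can be sketched as a sequential Python simulation. The assumptions here are mine: the mesh is a non-empty rectangular list of lists, even-numbered rows are sorted ascending and odd-numbered rows descending to produce the snake pattern, and the loop runs ceil(log2(rows)) + 1 rounds, which stays within the step bound quoted above.

```python
import math


def _odd_even_sort(line: list[int], descending: bool = False) -> list[int]:
    """Odd-even transposition sort on one row or column of the mesh."""
    line = list(line)
    n = len(line)
    for phase in range(n):
        for i in range(phase % 2, n - 1, 2):
            out_of_order = line[i] < line[i + 1] if descending else line[i] > line[i + 1]
            if out_of_order:
                line[i], line[i + 1] = line[i + 1], line[i]
    return line


def shear_sort(grid: list[list[int]]) -> list[list[int]]:
    """Sort a rectangular mesh into snake order (rows alternate direction)."""
    rows, cols = len(grid), len(grid[0])
    grid = [list(row) for row in grid]
    for _ in range(math.ceil(math.log2(rows)) + 1):
        # Row phase: even rows ascending, odd rows descending (parallel on a mesh).
        grid = [_odd_even_sort(row, descending=(r % 2 == 1)) for r, row in enumerate(grid)]
        # Column phase: every column ascending from top to bottom.
        columns = [_odd_even_sort([grid[r][c] for r in range(rows)]) for c in range(cols)]
        grid = [[columns[c][r] for c in range(cols)] for r in range(rows)]
    return grid


def snake_order(grid: list[list[int]]) -> list[int]:
    """Read the mesh along the snake path."""
    out: list[int] = []
    for r, row in enumerate(grid):
        out.extend(row if r % 2 == 0 else reversed(row))
    return out
```

Reading the result with snake_order gives the values in increasing order. Each call to _odd_even_sort works on only sqrt(n) values when the mesh is square, which is where the O(sqrt(n)) cost per phase comes from.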

Significantly faster runtime enabled by 2D mesh topology over chain topology

Runtime comparison (Shear sort in blue, odd-even in red)

As we can see, the runtime of shear sort enabled by 2D mesh topology is significantly faster than the odd-even transposition.

Implementing shear sort on chain topology would not achieve such speedups, because the way the interconnection network is set up introduces communication overhead.

For example, to sort columns using odd-even transposition during the column sort step of shear sort, chain topology needs 4 hops instead of 1 hop. This leads to significant communication overhead!

4 hops for communicating

1 hop for communicating (Error correction: the orange arrow should be between nodes 7 and 9)
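The hop counts can be reproduced with a small calculation. This Python sketch assumes a mesh with 4 columns whose nodes are numbered from 0, row by row, and laid out in the same order when arranged as a chain; the numbering in the figures may differ, so the node numbers here are illustrative only.

```python
def mesh_hops(a: int, b: int, cols: int) -> int:
    """Hops between nodes a and b on a 2D mesh (Manhattan distance)."""
    (row_a, col_a), (row_b, col_b) = divmod(a, cols), divmod(b, cols)
    return abs(row_a - row_b) + abs(col_a - col_b)


def chain_hops(a: int, b: int) -> int:
    """Hops between nodes a and b when the same nodes form a single chain."""
    return abs(a - b)


# Nodes 1 and 5 sit directly above and below each other in a 4-column mesh.
print("mesh:", mesh_hops(1, 5, cols=4), "chain:", chain_hops(1, 5))
```

Vertical neighbours are one hop apart on the mesh, but on the chain every column comparison has to travel across a whole row of nodes, and that cost is paid in every column phase.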

Conclusion

Through these simple examples, we have illustrated how interconnection networks can greatly influence the types of parallel algorithms we can use.

Hence, when selecting parallel algorithms, it is crucial to have a good understanding of the interconnection network structure of your parallel system.

Any questions/feedback?

As an aspiring writer and learner, I aim to deliver beginner friendly and easy to understand content on all things tech and computer science.

I am always looking to improve my content delivery and technical knowledge. Feel free to share any feedback or questions in the comments below.

Practices for designing good APIs

Kevin Foong — Mon, 29 Aug 2022 14:21:56 GMT

Designing quality APIs

Some practices and lessons I’ve learned about designing good APIs

Throughout my previous experiences as a Software Engineer, I have spent the majority of my time designing APIs. In fact, I find that it is usually more challenging to design a quality API than to implement it, due to factors such as constantly changing customer requirements. Here are some of my learnings on designing good quality APIs.

Importance of designing good APIs

Here are some key reasons:

APIs are how any user interacts with the software that you develop.

Any piece of useful software tends to be reused. As such, after customers start to use the API, it is often very difficult to change it, so it is crucial that the API design is well thought out from the beginning.

API design also directly affects the reputation of the software and the company. A quality API can lead to customers investing in learning and building with it. A poorly designed API can become a liability and cause countless support calls.

Relevance to us as software developers/engineers

While it is common for us software engineers to focus primarily on the programming aspect of product development, it is important to remember that writing code often involves providing APIs to customers who will use the code we write. Hence, to be a software developer is to be an API designer as well.

What makes a ‘good’ API?

A good API should be:

Easy to learn and use (even without extensive documentation)
Sufficiently powerful to satisfy requirements
Appropriate to audience
Easy to evolve

Often, it seems intuitive to focus on providing the most ‘powerful’ API which provides as much functionality as possible. However, this often comes with severe tradeoffs, such as providing irrelevant functionality that doesn’t solve the audience’s problems and sacrificing ease of use. Instead, APIs should aim to solve a specific, well-identified problem for a defined audience.

Some practices for designing APIs

1. Gathering requirements for successful API design

Often, when we are tasked to implement some software, it is commonly given to us in the form of solutions by customers instead of problems. However, it is often the case that these solutions may not be the optimal approach to solving their problems and may not address the “root” problems they face.

For example, during my time interning as a software engineer, I was given a task in the form of a solution: to implement an experimental data export service which would run big data processing workloads and export the transformed data to a set destination based on a set of configurations provided by the client. However, there was little mention of the clients’ profiles and the problems they faced.

As such, it is important for us as API designers to proactively reach out and communicate with the potential clients of the APIs and discover their root problems. Here are some practices I’ve applied and found were useful to deal with requirements gathering when I was designing the API for the service:

Find out and be clear on who the customers are.

It is important to clearly define an audience that your API will be used by and segment them into different profiles. If there appears to be too many different profiles to address, it might be a good idea to narrow down your customer scope as a single solution might not be optimal to address problems for all profiles.

For example, while reaching out to potential users of the service I was designing, I realised that several clients did not require extensive processing of data and that using this service would add unnecessary overhead for them. Hence, I could recommend other solutions to them, whilst tailoring my API to customers with heavier processing workloads. Concretely, I narrowed the scope to a set of API methods that triggered Apache Spark jobs, which suit heavier processing.

Extract requirements in the form of user stories

For example, one of the user stories I came up with was “as a business dashboard administrator, I want to be able to process the data into the form I want without having to implement code, so I can focus on the data”.

Adopting a user-centric approach to identifying requirements helps define APIs which are purposeful and allow us to find out which quality attributes are the most important.

Defining this user story allowed me to focus more on ease of use and learning when I was designing the API.

Fail fast and establish a frequent feedback loop

When defining requirements, it is usually a good idea to create the initial requirements fast and validate them with the customer frequently in an iterative manner. Customer requirements change frequently and it is unlikely that we get them right on the first try. Constant feedback and validation are key to identifying the root problems customers face.

For example, I came up with requirements as early as possible and set up meetings with potential clients to clarify doubts. Doing so allowed me to realise that some of the requirements relayed to me may not have been fully accurate. For instance, I had believed that one potential customer, the live support team, needed to process stored records, but they only required the records as-is, so my API was not the best fit for their use case. Failing fast and communicating frequently thus helped me clear up misunderstandings and unclear requirements before they led to wasted effort.

2. When in doubt, leave it out

Software that is useful is likely to be reused and depended on by multiple clients. Hence, it is often difficult to change how the API is designed. As Joshua Bloch mentioned in his 2007 Google tech talk on API design, it is often easy to add to an API but expensive to remove from it. Hence, we should be absolutely certain before exposing new functionality to customers.

This also ties in with the concept of only providing APIs which are sufficiently powerful to solve the identified audience’s problems. Trying to solve too many problems can often lead to APIs being overly complex and hard to understand.

For example, the Slack engineering blog mentions that one of their most popular API methods ‘rtm.start’ had become rather expensive over time. However, despite this, it was challenging to remove due to dependencies by customers and they instead decided to add a new API method rtm.connect which only did one focused thing to alleviate the costs due to the other API method above. (https://slack.engineering/how-we-design-our-apis-at-slack/)

This highlights the importance of carefully considering API design and the potential cost of mistakes: fixing an API design mistake may not be as simple as removing the affected API method.
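The pattern of adding a focused method next to one that can no longer be removed can be sketched as follows. This is a hypothetical Python class for illustration only, not the real Slack API: the class name, method names, URL and payload fields are all placeholders.

```python
class ChatClient:
    """Hypothetical client; payloads are placeholders, not real responses."""

    def start(self) -> dict:
        # Original method: costly because it returns a lot of state, but kept
        # because existing callers depend on it and removal would break them.
        return {"url": "wss://example.invalid/socket", "users": [], "channels": []}

    def connect(self) -> dict:
        # Later, focused addition: returns only what is needed to open a connection.
        return {"url": "wss://example.invalid/socket"}


client = ChatClient()
print(sorted(client.start()), sorted(client.connect()))
```

Had the first method exposed only the connection URL from the start, the extra fields could still have been added later once a real need was confirmed, which is the point of leaving things out when in doubt.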

Credits:

Much of this article is my personal interpretation of the ideas in Joshua Bloch’s 2007 tech talk “How to Design a Good API and Why it Matters”, together with my experiences applying them.
Slack engineering blog for their insights on good API design practices