Wikidata is a free and open knowledge base that anyone can edit. It is a sister project of Wikipedia and serves as a central repository for structured data: rather than filling pages with text, it stores data in a structured format that can be queried and reused across different platforms.
One of the processes that keeps Wikidata maintainable is its handling of deletion requests, known as RFDs (Requests for Deletion). A similar process exists on Wikipedia. These requests allow users to propose the removal of items from the database that are deemed unnecessary, incorrect, or otherwise unsuitable for inclusion.
I was recently asked if there was currently any “tracking of the amount of deletion requests on WD over time”, with a specific focus on promotional editing, number of requests, and administrator burden. I was not aware of any such tracking, so I decided to investigate the data and see what insights could be gleaned from it, and possibly help out with whatever ends up happening as part of T429036 ([Analytics] [Request] Baseline data for Item deletions), which looks like it will happen soon.
Approach
All requests for deletion go via the RFD page on Wikidata. This page is treated as a talk page, with each section being a request for deletion. Each section has a title, which is the item or items being requested for deletion, and a body, which contains the reason and any discussion around the request. Bots often maintain the page, marking when deletions occur and when requests are closed, so the page is a good source of data for analysis. Like many other talk pages, it is also archived, with older requests being moved to archive pages. The main RFD page has been around for a while, and the archive pages go back to 2012.
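To make the "one section per request" idea concrete, here is a minimal sketch of how the wikitext of such a page could be split into requests. It is not the code used for the analysis: the function name `split_requests`, the sample text, and the assumption that each request is a level-2 heading containing item IDs written as `Q123`, `[[Q123]]` or `{{Q|123}}` are all illustrative guesses, and real pages will likely contain headings that do not fit this pattern.

```python
import re

# Assumption: each request is a level-2 wikitext heading ("== ... ==").
# Deeper headings ("=== ... ===") are treated as part of the request body.
HEADING_RE = re.compile(r"^==(?!=)[ \t]*(.*?)[ \t]*(?<!=)==[ \t]*$", re.MULTILINE)

# Assumption: item IDs appear as "Q123" or inside a template as "Q|123".
ITEM_ID_RE = re.compile(r"\bQ\|?([1-9]\d*)\b")


def split_requests(wikitext):
    """Return one dict per level-2 section: title, item IDs in the title, body."""
    matches = list(HEADING_RE.finditer(wikitext))
    requests = []
    for i, match in enumerate(matches):
        # A section runs until the next level-2 heading, or the end of the page.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(wikitext)
        title = match.group(1)
        requests.append(
            {
                "title": title,
                "items": ["Q" + digits for digits in ITEM_ID_RE.findall(title)],
                "body": wikitext[match.end():end].strip(),
            }
        )
    return requests


# Hypothetical sample text, not copied from the real RFD page.
SAMPLE = """Intro text before any request.

== {{Q|12345}} ==
Promotional, no sources. --ExampleUser

=== sub-discussion ===
Agreed.

== [[Q42]], [[Q43]] ==
Duplicates of another item.
"""

for request in split_requests(SAMPLE):
    print(request["items"], len(request["body"]), "characters of discussion")
```

Counting the resulting dicts per archive page would give a rough number of requests over time, while the body text is where closing notes and stated reasons could be looked for.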
Data Gathering
I’m trying out marimo for my data gathering this time, where I would normally use a standard IPython notebook. It’s self-described as “a next generation Python notebook”.