
[–]Ok_Leading4235 8 points9 points  (0 children)

aiofastnet - optimized (up to x2.2 faster) drop-in replacements for asyncio networking APIs

As part of an algorithmic trading project I had to look into the actual performance of uvloop and the asyncio network API. It turned out it wasn't so great; the TLS part is especially bad, also in uvloop. Lots of plumbing code and memory copying. I tried to push PRs to uvloop, but the project is almost unmaintained these days. It took more than a year to get some of the relatively small PRs reviewed and merged, and I'm not even talking about big changes.

Eventually I came up with a much cleaner and loop agnostic way to improve networking API performance.

https://github.com/tarasko/aiofastnet

What My Project Does

Provides drop-in optimized versions of asyncio networking APIs:

  • loop.create_connection()
  • loop.open_connection()
  • loop.create_server()
  • loop.start_server()
  • loop.start_tls()
  • loop.sendfile()
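To make "drop-in" concrete, here is a minimal stdlib echo round-trip using two of the listed APIs. In principle only the imports would change to aiofastnet's equivalents (a hypothetical swap; check the repo for the actual entry points):

```python
import asyncio

# Stdlib version of an echo round-trip over start_server/open_connection.
# aiofastnet advertises optimized drop-in replacements for these calls,
# so the same code shape would apply with its versions swapped in.

async def main() -> bytes:
    async def handle(reader, writer):
        data = await reader.read(100)      # read the client's message
        writer.write(data.upper())         # echo it back, uppercased
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    # Port 0 lets the OS pick a free port.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"hello")
    await writer.drain()
    reply = await reader.read(100)
    writer.close()
    await writer.wait_closed()
    server.close()
    await server.wait_closed()
    return reply

result = asyncio.run(main())
print(result)  # b'HELLO'
```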

Target Audience

This project is mainly for developers who already use asyncio transports/protocols and want better performance without redesigning their code.

It is probably most relevant for people building:

  • ASGI/HTTP/Websocket or RPC clients and servers
  • proxies
  • database clients/servers
  • custom binary protocols
  • other protocol-heavy network services

Comparison

Compared to uvloop/winloop, aiofastnet is not a separate event loop. It focuses specifically on the transport/TLS layer and works with the loop you already use.

Feedback is very welcome!

[–]Due_Anything4678 8 points9 points  (0 children)

ghostdep - finds phantom and unused deps in your Python project

What My Project Does

Scans your Python project and tells you what you import but didn't add to your manifest, and what you declared but never use.

```
$ ghostdep -p my-project
[phantom] pandas at app.py:7
[unused] numpy at requirements.txt
```

Handles requirements.txt, pyproject.toml (PEP 621, Poetry, uv/PEP 735). Knows about aliases like PIL→Pillow, cv2→opencv-python, sklearn→scikit-learn. Uses tree-sitter for AST parsing, not regex.

Single binary, no Python runtime needed. Also supports Go, JS/TS, Rust, Java if you work across languages.

cargo install ghostdep

https://github.com/ojuschugh1/ghostdep

Target Audience

Anyone maintaining Python projects who wants cleaner dependency manifests. Works in CI too - has JSON and SARIF output, exit code 1 when findings exist. v0.1.0, looking for feedback.

Comparison

Most Python dep checkers (pip-check, pip-audit, safety) focus on vulnerabilities or version conflicts. ghostdep focuses on a different problem: deps that are imported but not declared (phantom) or declared but never imported (unused). Closest tool is probably deptry - ghostdep differs by being cross-language (5 languages in one binary) and using AST parsing with confidence scoring for dynamic/conditional imports.

[–]AssociateEmotional11 3 points4 points  (0 children)

Project Name: PyNeat (Upcoming v2.0)

What it does: An AST-based auto-fixer specifically designed to clean up the exact "AI slop" mentioned in this thread's description.

Standard formatters like Black or Ruff are great for styling, but they don't fix bad structural logic. PyNeat uses Instagram's LibCST to safely rewrite the AST while preserving 100% of your original comments and whitespace.

Currently building v2.0 which targets AI-generated artifacts:

  • Debug/Comment Cleaners: Automatically purges orphaned print() statements, JS artifacts like console.log, and useless AI boilerplate comments (# Generated by AI, empty # TODO:).
  • Structural Cleanup: Flattens deeply nested if (arrow anti-patterns) into guard clauses and removes LLM tautologies (e.g., converting if var == True: -> if var:).
  • Safe Excepts: Replaces dangerous AI-injected except: pass or print(e) with safe raise NotImplementedError stubs.
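The tautology rewrite can be sketched with the stdlib `ast` module. (PyNeat itself uses LibCST precisely so comments and whitespace survive; this simplified stdlib sketch does not preserve formatting.)

```python
import ast

class TautologyFixer(ast.NodeTransformer):
    """Rewrite `x == True` comparisons to plain `x` (the LLM-tautology case)."""
    def visit_Compare(self, node):
        self.generic_visit(node)
        if (len(node.ops) == 1
                and isinstance(node.ops[0], ast.Eq)
                and isinstance(node.comparators[0], ast.Constant)
                and node.comparators[0].value is True):
            return node.left          # drop the redundant `== True`
        return node

src = "if flag == True:\n    do_thing()\n"
tree = TautologyFixer().visit(ast.parse(src))
cleaned = ast.unparse(ast.fix_missing_locations(tree))
print(cleaned)  # if flag:
                #     do_thing()
```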

Status: Just passed massive integration stress-tests against the Anthropic SDK and Pydantic core without breaking the AST. Currently finalizing batch processing (pyproject.toml support) before the official release.

Question for the thread: What is the most annoying "AI coding habit/artifact" you constantly find yourself fixing manually? I'd love to add a rule for it before launching!


[–]cwt114 2 points3 points  (0 children)

7 months ago, I shared NeoSQLite v1.0.0 here. It was a simple idea: Give SQLite a PyMongo API so we can have the NoSQL experience in Python without the "NoSQL Server" overhead.

The feedback was amazing (and admittedly, a bit brutal). You guys rightly pointed out the flaws and edge cases. So, I went back to the lab. 374 commits later, it's no longer just a "wrapper" falling back to Python loops—it's a full-blown database engine.

What My Project Does

NeoSQLite gives you the complete NoSQL/MongoDB experience in Python without the infrastructure overhead. It turns a standard SQLite database into a MongoDB-compatible engine.

For Python apps, it's a completely serverless, in-process library. But for this release, I also built the "Magic trick": NX-27017, an optional (and permanently experimental) tiny daemon that speaks the actual MongoDB wire protocol. You can point any existing project, GUI tool like MongoDB Compass, or non-Python app at a single SQLite file with zero code changes.

```
# Terminal 1:
nx-27017 --db myapp.db
```

```python
# Terminal 2:
from pymongo import MongoClient

# This is the real PyMongo client, but it's talking to SQLite!
client = MongoClient('mongodb://localhost:27017/')
db = client.my_app
db.users.insert_one({"name": "Alice", "tags": ["python", "sqlite"]})
```

Target Audience

This is meant for production use in specific contexts: desktop apps, CLI tools, local development environments, IoT devices, and small-to-medium backend services.

If you are building a massive, horizontally scaled enterprise cluster, use a real server. But if you want a drop-in PyMongo replacement that lives in a single file, this is for you.

I know replacing your database engine sounds terrifying, so to sleep at night, I've built a testing suite of 2,600+ unit tests and an automated "compatibility lab". It runs 377 different complex scenarios against both NeoSQLite and a real MongoDB instance to assert the results are strictly identical. We are sitting at 100% API parity for all comparable features.

Real-World Usage

It's actually being used out in the wild now! For example, Andy Felong recently wrote a full blog post about using NeoSQLite for his astronomy projects across a Raspberry Pi Zero, a headless Ubuntu server, and a Mac:

"The fact that I can write an app's database layer once and have it run identically on a Pi Zero, an Ubuntu server, and macOS — all without starting up a single server process — is exactly the kind of pragmatic elegance I love in open-source software."

Comparison

  • vs. MongoDB: You get the exact same PyMongo API, but without managing a Docker container, replica sets, or a heavy server process.
  • vs. Postgres with JSONB: Postgres is incredible for massive web apps. But if you're building a desktop app, a local CLI tool, or a small service, managing a Postgres server is overkill. NeoSQLite gives you similar JSON querying power with zero infrastructure setup.
  • vs. TinyDB / Simple Wrappers: NeoSQLite isn't just a basic dictionary store. I wanted it to be a drop-in replacement for real apps, so it fully supports ACID Transactions (with_transaction), Change Streams (watch()), GridFS, and complex Window Functions ($setWindowFields).

Making it "Production Fast"

In the early days, complex queries were slow because I was evaluating them in Python. I've spent the last few months pushing that logic down into raw SQL:

  • Hash Joins: $lookup (joins) used to be O(n*m). It's now O(n+m) using a custom hash-join algorithm implemented in the query engine. It's the difference between a 10-second query and a 10ms one.
  • Translation Caching: If you run the same query often, the engine now "learns" the SQL translation and caches the AST. It's about 30% faster for repeated operations.
  • JSONB Support: If you're on a modern version of SQLite (3.45+), NeoSQLite automatically detects it and switches to binary JSON (JSONB), which is 2-5x faster across the board.
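The hash-join idea can be sketched in a few lines. This is illustrative of the O(n+m) strategy (build an index over one side, probe it with the other), not NeoSQLite's actual engine code:

```python
# Illustrative hash join: O(m) to build the index, O(n) to probe it,
# versus the O(n*m) nested-loop alternative.
def hash_join(left, right, left_key, right_key):
    index = {}
    for row in right:                       # build phase
        index.setdefault(row[right_key], []).append(row)
    joined = []
    for row in left:                        # probe phase
        for match in index.get(row[left_key], []):
            joined.append({**row, **match})
    return joined

users = [{"_id": 1, "name": "Alice"}, {"_id": 2, "name": "Bob"}]
orders = [{"user_id": 1, "item": "book"}, {"user_id": 1, "item": "pen"}]
joined = hash_join(users, orders, "_id", "user_id")
print(joined)  # two rows, both joined to Alice
```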

Try it: pip install neosqlite

GitHub: https://github.com/cwt/neosqlite

I'd love to hear your thoughts. Roast me again, or tell me what feature is keeping you tied to a "real" database server for local dev!

The Boring Stats for those interested: 374 commits since v1.0.0, 460 files changed (+105k lines), 30+ releases.

[–]Powerful_Lock6120 1 point2 points  (0 children)

mpv-tracker is a Textual TUI for tracking local anime / series watched in mpv, with MyAnimeList integration built in.

Features:

  • browse and manage tracked series in a terminal UI
  • resume playback and track watched episode progress
  • authenticate with MyAnimeList from inside the TUI
  • sync watched episode count and score to MAL
  • view cached MAL metadata like score, rank, popularity, synopsis, genres, and studios
  • configure per-series playback preferences like chapter-based starts

Install:

  • uvx mpv-tracker
  • pipx install mpv-tracker
  • pip install mpv-tracker

Links:

Demo: https://github.com/GenessyX/mpv-tracker?tab=readme-ov-file#showcase

Built this for a local anime + mpv workflow where I wanted something lighter than a full media manager, but still with MAL sync and a usable TUI.

[–]ZyF69 1 point2 points  (0 children)

I've released a new version of Makrell, v0.10.0. Makrell was originally for the Python platform only, but has expanded into a family of programming languages and tools for metaprogramming, code generation, and language-oriented programming on multiple platforms. I still consider it alpha, so expect errors and missing bits and pieces, but there's a lot of ground covered now. This release includes:

  • the first release of the whole family as a coherent public system, with a specs-first approach and explicit parity work between the Python, TypeScript, and .NET tracks
  • the first version of Makrell#, the .NET/CLR implementation of the Makrell language
  • the first version of MakrellTS, the TypeScript implementation of the Makrell language
  • a browser playground for MakrellTS
  • MRTD, a typed tabular data format in the Makrell family
  • a new version of the VS Code extension, covering all three language tracks plus the data formats
  • a more consolidated docs and release story

The stuff is at https://makrell.dev . For an in-depth introduction, go straight to the article at https://makrell.dev/odds-and-ends/makrell-design-article.html

An AI usage declaration:

Done by me: All language design, MakrellPy, the MakrellPy bits in VS Code extension and the MakrellPy LSP, sample code, basic documentation.

Done by coding agents: Porting to Makrell# and MakrellTS, the MRDT format implementations, the VS Code extension bits for those tracks, the LSP work for those tracks, a lot of documentation, MakrellTS playground, a lot of testing and refinements, packaging. (It was awesome, by the way.)

The coding agent story is a bit special to me. Earlier this year I had to retire after 30 years as a software developer. Due to Parkinson's disease I suffer from fatigue and fine motor control issues that make it hard to do a lot of coding, or regular work at all. Luckily, my cognitive abilities are still good, though. This ironically coincided with the rise of AI coding assistants, which means I can still produce a lot of code while concentrating on design and high-level directions. The Makrell project had been dormant for two years, but now I was suddenly able to make a lot of progress again by using coding agents to do the actual coding work under my direction. I think it's great. I can concentrate on the interesting bits and not spend my limited energy on the more mechanical coding work. Which really isn't that interesting, I should say.

Now the question is if anyone is going to use or care about this. Probably not. And I believe the future of coding is agents compiling directly from specs to machine code and other low level targets, and that few will care about our beautiful programming languages. Maybe I'll just submit this somewhere as a piece of conceptual art.

Below is a blurb meant for language design people.

About Makrell

Makrell is a structural language family built around a shared core called MBF: a bracket-and-operator-based format meant to support code, data, markup, and embedded DSLs without treating them as completely separate worlds. The project currently includes three host-language tracks, MakrellPy, MakrellTS, and Makrell#, plus related formats: MRON for structured data, MRML for markup, and MRTD for typed tabular data.

What may be most interesting to PL people is that Makrell is not being treated as “one syntax, one implementation”. The same family ideas are being pushed through Python, TypeScript/browser, and .NET/CLR hosts, with a specs-first approach and explicit parity work between the tracks. The aim is not to force every host into identical behaviour everywhere, but to separate what belongs to the shared family core from what should remain host-shaped.

The language side has real macro and compile-time machinery rather than just surface syntax sugar. Makrell supports quoting/unquoting, structural rewrites, meta, and small embedded sublanguages. One of the nicer recurring examples is a shared macro showcase where the same family-level ideas are expressed across the implementations: pipeline reshaping, postfix-to-AST rewriting, and a Lisp-like nested notation living inside Makrell. That general “languages inside languages” direction is a big part of the project’s identity.

The formats are not side projects bolted on afterwards. MRON, MRML, and MRTD are meant to demonstrate that the same structural basis can also support data and document-like representations. So Makrell is partly a programming-language project, partly a language-workbench experiment, and partly an attempt to make code, markup, and structured data feel more closely related than they usually do.

v0.10.0 is the first release where the whole thing feels like a coherent public system rather than a pile of experiments. The packages are published, the .NET CLI ships as a real tool, the TypeScript track has a standalone browser playground, the VS Code extension covers the three language tracks plus the family formats, and the docs/release story are much more consolidated. The editor path is especially important now: run/check workflows and diagnostics exist across MakrellPy, MakrellTS, Makrell#, MRON, MRML, and MRTD, with a longer-term plan to converge tooling further around a TypeScript-based family language-server direction.

If you are interested in macro systems, multi-host language design, little languages, structural notations, or the boundary between programming language and data/markup language design, that is the niche Makrell is trying to explore. It is not “a better Python” or “a replacement for TypeScript”; it is much more a family-oriented design project that happens to have serious implementations in those ecosystems.

The practical entry points now are:

  • makrell.dev for the overall language-family/docs story
  • the MakrellTS playground for the browser-facing live environment
  • vscode-makrell for the current editor workflow
  • the published MakrellPy / MakrellTS / Makrell# packages if you want to run things locally

The repo still contains a lot of active design work, but v0.10.0 is meant to be the point where the project becomes legible as a real language-family effort instead of only an internal exploration.

[–]Big-Rent1128 1 point2 points  (0 children)

RPGNLP, a Python package that tokenizes raw user input for RPG games

Background:
I began working on this package earlier this year when I was making a text-based RPG game. I realized that tokenizing and extracting relevant information from raw text input was more of an undertaking than I thought. So I built an NLP engine on top of NLTK and spaCy to give developers a way to turn raw text into actionable tokens.

What the Project Does:
The engine will take text like "attack the goblin with the hammer" and output a dictionary with values like action: attack, subject: goblin, instrument: hammer. Or "go south east" will output action: travel, direction: south east.

The verbs the user types are converted into a canonical action to make the data easier for a game engine to use. For instance, "go south" and "head south" both tokenize as a "travel" action.
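The canonicalisation step might look roughly like this. The mini-lexicon is hypothetical; the real package backs this with NLTK and spaCy rather than a hard-coded table:

```python
# Hypothetical verb-canonicalisation sketch: map surface verbs to
# canonical actions so the game engine sees one stable vocabulary.
CANON = {
    "go": "travel", "head": "travel", "walk": "travel",
    "attack": "attack", "hit": "attack", "strike": "attack",
}

def tokenize(text):
    words = text.lower().split()
    action = CANON.get(words[0], words[0])
    return {"action": action, "args": words[1:]}

print(tokenize("head south"))   # {'action': 'travel', 'args': ['south']}
print(tokenize("hit goblin"))   # {'action': 'attack', 'args': ['goblin']}
```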

Comparison and Target Audience:
Unlike other NLP packages, this one is specifically designed for RPG games. Hopefully game developers can find this useful so they do not have to develop this sort of engine on their own.

[–]kesor 1 point2 points  (0 children)

tmux-player-ctl.py - a controller for MPRIS media players (spotifyd, mpv, mpd, vlc, chrome, ...)

Built tmux-player-ctl.py, a single-file, pure-Python TUI that pops up inside tmux and gives you full keyboard control over any MPRIS media player (spotifyd, mpv, mpd, VLC, Chrome, Firefox, etc.) using playerctl.

When starting to write it I considered various options like bash, rust, go, etc... but Python was the most suitable for what this needed to do and where it needed to go (most Linux distros have python already).

What worked well from the Python side:

  • Heavy but careful use of the subprocess module — both synchronous calls and asynchronous background processes (I run a metadata follower subprocess that pushes real-time updates without blocking the TUI).
  • 380+ tests covering metadata parsing round-trips, player state management, UI ANSI/Unicode width craziness, optimistic UI updates + rollback, signal handling, and full integration flows with real playerctl commands.
  • Clean architecture with dataclasses, clear separation between config, player abstraction, metadata tracking, and the display layer.
  • Signal handling (SIGINT/SIGTERM) so the subprocesses and tmux popup shut down cleanly.
  • Zero external Python library dependencies beyond the stdlib.
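The follower pattern described above (a long-running subprocess, a background thread, and a queue the TUI loop polls without blocking) can be sketched like this. Illustrative only; the real tool follows `playerctl --follow` output:

```python
import queue
import subprocess
import sys
import threading

def follow(cmd):
    """Spawn a subprocess and pump its stdout lines into a queue
    from a daemon thread, so the caller never blocks on reads."""
    q = queue.Queue()
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)

    def pump():
        for line in proc.stdout:
            q.put(line.rstrip("\n"))

    threading.Thread(target=pump, daemon=True).start()
    return proc, q

# Stand-in for `playerctl metadata --follow`: a child process that
# prints one metadata line.
proc, q = follow([sys.executable, "-c", "print('track: Song A')"])
line = q.get(timeout=5)   # TUI loop would poll with get_nowait()
proc.wait()
print(line)               # track: Song A
```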

It’s intentionally tiny and fast: launches in a compact tmux popup (-w72 -h12), shows live track info + progress bar + color-coded volume, supports seek, shuffle, loop modes, and Tab to switch between running players.

Typical one-liner:

```bash
tmux display-popup -B -w72 -h12 -E "tmux-player-ctl.py"
```

GitHub: https://github.com/kesor/tmux-player-ctl

I’d especially love feedback from people who regularly wrangle subprocess, build CLI/TUI tools, or obsess over testing: any patterns I missed, better ways to handle long-running playerctl followers, or testing gotchas you’ve run into? Especially if you have tips on how to deal with ambiguous-width emoji symbols that have different widths in different fonts.

[–]lewd_peaches 1 point2 points  (0 children)

For anyone working with larger datasets or computationally intensive tasks, I've found significant speedups by offloading parts of my Python code to GPUs. Not just for ML, but also for things like complex simulations.

I've primarily used PyTorch and CuPy. CuPy is a drop-in replacement for NumPy in many cases, and the performance gains can be substantial. For example, a recent Monte Carlo simulation I was running went from taking 3 hours on my CPU to about 20 minutes on a single RTX 3090. The code change was minimal.

I've also experimented with distributed GPU processing using OpenClaw. I used it to fine-tune a smaller LLM on a dataset that was too large to fit on a single GPU. Setting up the distributed environment took some time initially, but then I was able to run a fine-tuning job across 4 GPUs, finishing in around 6 hours. The cost for the compute was around $25, which was much cheaper than renting a large instance from AWS or GCP. Worth looking into if you're hitting memory limits or need to accelerate your workloads.

[–]Prestigious-Wrap2341 1 point2 points  (0 children)

Update: Added a second FastAPI service with 7 new API connectors on the same $4/mo ARM server

What My Project Does

Posted a couple days ago about a FastAPI backend that aggregates 40+ government APIs. Got great feedback. Here's what's new on the engineering side:

Target Audience

Python developers interested in multi-service architecture, API connector patterns, and running multiple FastAPI instances on minimal hardware.

How Python Relates

Added a second FastAPI service running on a separate port with its own systemd unit. Nginx reverse proxies both services on the same $4/mo ARM box. The second service handles deterministic text analysis: rule-based sentence segmentation, candidate detection via signal matching (numbers, dates, named entities, assertion verbs), SHA256 dedup with SequenceMatcher at 0.78 threshold, and BM25Okapi scoring against 29 external API sources. Zero LLM dependency. Same input, same output, every time.
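The dedup step (SHA256 for exact duplicates, SequenceMatcher at the 0.78 threshold for near-duplicates) can be sketched as follows. This is an illustration of the described step, not the project's code:

```python
import hashlib
from difflib import SequenceMatcher

def dedup(sentences, threshold=0.78):
    """Drop exact duplicates via SHA256, then near-duplicates whose
    SequenceMatcher ratio against any kept sentence meets the threshold."""
    seen_hashes, kept = set(), []
    for s in sentences:
        h = hashlib.sha256(s.encode()).hexdigest()
        if h in seen_hashes:
            continue                      # byte-identical repeat
        if any(SequenceMatcher(None, s, k).ratio() >= threshold for k in kept):
            continue                      # near-duplicate of a kept sentence
        seen_hashes.add(h)
        kept.append(s)
    return kept

kept = dedup([
    "Budget rose 5% in 2024.",
    "Budget rose 5% in 2024.",          # exact dupe: dropped by hash
    "The budget rose 5% in 2024.",      # near-dupe: dropped by ratio
    "Unrelated sentence entirely.",
])
print(kept)
```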

7 new API connectors following the same pattern as the original 36: FCC Consumer Complaints via Socrata SODA (SoQL query building with $where, $select, $group), Treasury Fiscal Data API (pagination via page[size] and filter params), College Scorecard (data.gov key auth with lazy loading to handle env var timing), Grants.gov (POST to /search2 with JSON body, response nested under data.oppHits), Urban Institute Education Data Portal (URL path-based pagination with 5-page safety limit), FCC ECFS (requires api_key=DEMO_KEY param despite being "free"), and FCC License View.

Built a 14-pattern detection engine that runs cross-table SQL joins to find anomalies: trades within 30 days of bill actions by the same member (JULIANDAY arithmetic), companies lobbying Agency X that also receive contracts from Agency X (mapping LDA government_entities strings to USASpending awarding_agency values), and enforcement records that drop to zero after lobbying spend increases. Each pattern generates a markdown report with data tables pre-built from SQL and narrative sections filled by an optional API call capped at 2/day.

The custom evidence source plugin connects the second service to the main database. It opens a read-only SQLite connection to the 4.3GB WAL-mode database, searches 11 entity tables with LIKE matching, then queries lobbying, contract, enforcement, trade, committee, and donation tables for each matched entity. Results get passed back to the second service's scoring pipeline.

All sync jobs now cover 11 sectors (added Telecom: 26 companies, Education: 31 companies). Same pattern: SEC EDGAR submissions API, USASpending POST search, Senate LDA paginated GET with page_size=25. Sequential execution only, SQLite locks are still unforgiving.

Two uvicorn processes, a scheduler, a Twitter bot cron, nginx, certbot. Still $3.99/month.

Comparison

Same as before. The new engineering is the dual-service architecture and the cross-database evidence source plugin pattern.

Source: https://github.com/Obelus-Labs-LLC/WeThePeople

Second service: https://github.com/Obelus-Labs-LLC/Veritas

[–]ghoztz 1 point2 points  (0 children)

I’ve been working on Milo, a Python framework for building CLIs that humans and AI agents can both use natively.

The core idea is simple:

Write one typed Python function once and get:

  • a normal CLI command
  • an MCP tool
  • an AI-readable llms.txt

from the same definition.

https://github.com/lbliii/milo-cli

[–]shashstormer 1 point2 points  (0 children)

CommIPC

Simpler IPC in Python: like FastAPI, but for IPC.

I wanted high-speed communication between multiple scripts of mine.

Long ago I started using FastAPI for that purpose, and then settled into a modular monolithic architecture for my web UIs.

As things went on, that stopped feeling satisfactory. Recently I got interested in building native applications and wanted IPC, so I googled around and didn't find anything simple enough to use out of the box.

gRPC, for example, felt too complex; I tried it once, but it was overkill for my use case and added unnecessary friction.

I also wanted a simpler path to a microservices-style architecture than running multiple FastAPI servers/workers for my web UIs, with the option to wrap the calls in FastAPI later for distributed setups. So I came up with a simpler library for that kind of architecture.

With regular setups, implementing IPC gets complex. This library abstracts a lot of that away: the provider side feels almost like FastAPI, and the consumer side feels a bit like requests.

I have added support for:

  • events (RPC-like)
  • streaming (streaming RPC calls)
  • pub/sub (1->many)
  • groups (load-balanced events)
  • full Pydantic integration

I did some benchmarking and got sub-millisecond latencies:

| Metric                  | Mean    | Median  | P95     | P99     |
|-------------------------|---------|---------|---------|---------|
| RPC Latency (RTT)       | 0.32ms  | 0.29ms  | 0.60ms  | 0.66ms  |
| Group Latency (LB RTT)  | 0.30ms  | 0.29ms  | 0.36ms  | 0.55ms  |
| PubSub Latency (Relay)  | 18.50ms | 19.36ms | 21.76ms | 21.91ms |

| Throughput Metric       | Result                 |
|-------------------------|------------------------|
| RPC Throughput          | 8551.8 calls/sec       |
| Group Throughput (LB)   | 8877.5 calls/sec       |
| Streaming Throughput    | 12278.6 chunks/sec     |

I wanted better performance while staying simpler than the regular web stack, and the benchmarks came out decently, I'd say.

The benchmark scripts are in the repo if you want to check them out.

I've also added example scripts showing how you might implement things (check out examples/decoupled_demo). It's almost like FastAPI, but just IPC.

https://github.com/shashstormer/comm_ipc/tree/master

[–]laserjoy 1 point2 points  (0 children)

Built a small library for DataFrame schema enforcement - dfguard

For any data engineer/SWE who works a lot with DataFrames: schema checks are boring but often necessary. I was looking at pandera for a small project but got annoyed that it has its own type system. If I'm writing PySpark, I already know pyspark.sql.types; why should I learn pandera's equivalent? (A few libs follow this approach.) And libs like great_expectations felt like overkill.

I wanted something light that enforces schema checks at function call time, using the types I already use. And I did NOT want to explicitly call schema validation functions repeatedly; the project would end up peppered with them everywhere. A project-level setting should enable schema checks wherever the appropriate type annotation is present.

So I built dfguard (PyPI: https://pypi.org/project/dfguard/). It checks that a DataFrame passed to a function matches the expected schema, using whatever types your library already uses.

PySpark, pandas, Polars are supported. It looks at dataframe schema metadata only (not data) and validates it when a function is called based on type annotations.

Some things I enjoyed while building or learnt:

- If you have a packaged data pipeline, dfg.arm() in your package __init__.py covers every dfguard schema-annotated DataFrame argument. No decorator on each function.

- pandas was annoying - dtype is 'object' for strings, lists, dicts, everything. Ended up recommending `pd.ArrowDtype` for users who need precise nested types in pandas.

- Docs have examples for Airflow and Kedro if you're using those.

pip install 'dfguard[pandas]' pyarrow 
pip install 'dfguard[polars]' 
pip install 'dfguard[pyspark]'

This quickstart should cover everything for anyone who's interested in trying it out.

Curious to hear any thoughts, or whether you'd like to see a new feature added. If you try it out, I'll be ecstatic.

Shameless plug: if you like it, consider starring the repo.

[–]Candid_Complaint_925 1 point2 points  (0 children)

BillingWatch — Self-Hosted Stripe Billing Anomaly Detector

Built this because Baremetrics/ProfitWell felt overkill for solo devs who just want to know when something's wrong with their Stripe payments.

FastAPI app that processes Stripe webhooks in real-time and flags anomalies — unexpected refunds, payment failure spikes, revenue drops. Dashboard shows per-tenant billing health. No cloud required, you own your data.
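A payment-failure spike check can be sketched with a simple z-score test against a recent baseline. This is an illustration of the idea, not BillingWatch's actual detector:

```python
import statistics

def is_spike(history, today, z=3.0):
    """Flag today's failure count if it sits more than `z` standard
    deviations above the recent baseline."""
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1.0   # avoid divide-by-zero
    return (today - mean) / stdev > z

baseline = [2, 3, 1, 2, 4, 3, 2]    # daily failed payments, last week
print(is_spike(baseline, 3))        # False: a normal day
print(is_spike(baseline, 25))       # True: failure spike
```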

Quick start:

git clone https://github.com/rmbell09-lang/BillingWatch.git
cd BillingWatch
cp .env.example .env  # add your Stripe webhook secret
docker compose up
# Dashboard at http://localhost:8000

One-click deploy configs included for Railway/Render/Fly. MIT licensed.

Repo: https://github.com/rmbell09-lang/BillingWatch

[–]nicholashairs 0 points1 point  (0 children)

<meta: was there an announcement about this monthly thread / changes to the rules? I had a quick look and can't see anything>

[–]macjaf 0 points1 point  (1 child)

tokencap - a Python library for token budget enforcement across AI agents.

The problem: provider spending caps are account-level and reactive. They tell you what happened after the fact. tokencap enforces limits in your code, before the call goes out.

Two ways to use it:

Direct SDK:

client = tokencap.wrap(anthropic.Anthropic(), limit=50_000)

Any agent framework (LangChain, CrewAI, AutoGen, LlamaIndex):

tokencap.patch(limit=50_000)

Four actions at configurable thresholds: WARN, DEGRADE (transparent model swap to a cheaper model), BLOCK, and WEBHOOK. SQLite out of the box, Redis for multi-agent setups. Tracks tokens not dollars - token counts come directly from the provider response and never drift with pricing changes.
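The enforce-before-the-call-goes-out pattern reduces to something like this generic sketch (not tokencap's API): check the budget first, make the call only if it fits, then record the provider-reported usage.

```python
class BudgetExceeded(RuntimeError):
    pass

class TokenBudget:
    """Generic pre-call budget gate: the BLOCK action from above."""
    def __init__(self, limit):
        self.limit, self.used = limit, 0

    def charge(self, estimated, do_call):
        if self.used + estimated > self.limit:
            raise BudgetExceeded(f"{self.used}/{self.limit} tokens used")
        actual = do_call()          # provider response reports real usage
        self.used += actual
        return actual

budget = TokenBudget(limit=100)
budget.charge(40, lambda: 40)       # call goes through, 40 tokens used
try:
    budget.charge(80, lambda: 80)   # would exceed the cap: blocked pre-call
except BudgetExceeded as e:
    print("blocked:", e)            # blocked: 40/100 tokens used
```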

pip install tokencap

https://github.com/pykul/tokencap

[–]DifficultDifficulty 0 points1 point  (0 children)

A Python SDK/CLI to make Ray clusters self-serve for Python devs.

What My Project Does

krayne (link) is a Python library and CLI that wraps the KubeRay operator for creating and managing Ray clusters on Kubernetes. Instead of hand-writing KubeRay YAML manifests, you import Python functions (create_cluster(), scale_cluster(), list_clusters(), etc.) or use the krayne / ikrayne (interactive TUI) CLI to spin up and manage clusters with sensible defaults.

The idea is that if you're already writing Ray workflows in Python, training jobs, serve deployments, distributed preprocessing, the cluster management layer should live in the same language. The SDK is the source of truth, the CLI is a thin Typer wrapper on top of it. Operations are stateless functions that return frozen dataclasses, configuration goes through Pydantic models with YAML override support when you need finer control.

GitHub: https://github.com/roulbac/krayne

Target Audience

ML engineers and researchers who write Ray workflows on Kubernetes. The kind of person who knows what ray.init() does but doesn't want to become a KubeRay manifest expert just to get their cluster running. Also useful for platform teams who want a programmable layer on top of KubeRay that their users can call from Python. It's early (v0.1.0) and opinionated, a composable starting point, not a production-hardened product.

Comparison

An alternative I'm familiar with is using kubectl apply with raw KubeRay manifests, or the KubeRay Python client directly. The main difference is that krayne is designed around progressive disclosure:

  • Zero-config defaults out of the box. krayne create my-cluster --gpus-per-worker 1 --workers 2 is a complete command.
  • When you need more control, you drop down to a YAML config or the Python SDK, no cliff between "simple" and "custom."
  • Protocol-based Kubernetes client, so you can unit test cluster management logic with mocks. No real cluster needed.

It's not that working with KubeRay directly can't do what krayne does, it absolutely can. But when you primarily write Ray code and just need a cluster up with the right resources, context-switching into YAML manifests and kubectl is friction you don't need. A typed Python API that validates your input before it hits the cluster and lives right next to your actual Ray code, that's ultimately why I built it.

[–]Illustrious_Road_495 0 points1 point  (0 children)

I would like to share my music downloader with you all! My first project published on PyPI.

It uses a list of providers (currently only Spotify and Deezer) to fetch metadata for a track (to beautify the file), then downloads it using yt-dlp. It can download all songs in a playlist (YouTube playlist, Deezer album/playlist), and accepts either search queries or direct URLs (Spotify/Deezer/YouTube).

It was originally created as a bridge between my YouTube and Spotify likes libraries, but since the changes to the Spotify API I had to add a second provider.

https://pypi.org/project/spots-cli/

[–]zanditamar 0 points1 point  (1 child)

CLI-Anything-WEB — Claude Code plugin that generates production Python CLIs for any website

What My Project Does

Point it at a URL and it records live HTTP traffic, then generates a complete pip-installable Python CLI with typed exceptions, REPL mode, --json output, unit + E2E tests, and anti-bot bypass. The entire pipeline runs inside Claude Code via a 4-phase skill system (capture → methodology → testing → standards).

17 CLIs generated so far: Amazon, Airbnb, TripAdvisor, Reddit, YouTube, Hacker News, GitHub Trending, Pexels, Unsplash, ProductHunt, NotebookLM, Booking.com, ChatGPT, and more.

```bash
pip install -e amazon/agent-harness
cli-web-amazon search "wireless headphones" --json
cli-web-amazon bestsellers electronics --json

pip install -e tripadvisor/agent-harness
cli-web-tripadvisor hotels search "Paris" --geo-id 187147 --json
```

Target Audience

Developers who want to script or automate interactions with websites that don't have a public API — or who want to pipe web data into other tools. Also useful for agents that need structured --json output from any site.

Comparison

Most scraping libraries (requests, playwright, scrapy) give you raw HTTP primitives. CLI-Anything-WEB generates a complete, opinionated CLI with REPL mode, error handling, pagination, and tests baked in — so you get a tool that works like a proper CLI from day one rather than writing boilerplate each time.

GitHub (MIT): https://github.com/ItamarZand88/CLI-Anything-WEB

[–]UnluckyOpposition 0 points1 point  (0 children)

LongTracer - A local validation and tracing library for RAG pipelines (No LLM APIs used)

What My Project Does

LongTracer is a pure Python, zero-API-cost library designed to detect contradictions in Retrieval-Augmented Generation (RAG) pipelines at inference time.

When an LLM generates a response based on retrieved documents, LongTracer intercepts the output, splits it into individual claims, and uses a local hybrid STS + NLI architecture (MiniLM and DeBERTa) to mathematically verify if the claims match the source text. It then automatically traces the entire pipeline run and logs the evaluation metrics to SQLite (default), MongoDB, Redis, or PostgreSQL.

It works entirely locally without sending data to external LLM-as-a-judge APIs.

Python

from longtracer import check

# Verifies claims against the context purely locally
result = check(
    answer="The Eiffel Tower is 330m tall and located in Berlin.",
    context=["The Eiffel Tower is in Paris, France. It is 330 metres tall."]
)

print(result.verdict)             # FAIL
print(result.hallucination_count) # 1

Target Audience

This is meant for production environments and MLOps engineers. If you are building data pipelines, LangChain apps, or LlamaIndex wrappers and need observability into factual consistency without doubling your API costs, this is built for you. It includes 1-line decorators for existing frameworks (instrument_langchain(your_chain)).

Comparison

  • Vs. Ragas / TruLens: Most existing evaluation frameworks are built for offline batch testing and rely heavily on calling OpenAI/GPT-4 to act as the "judge," which is slow and expensive. LongTracer is designed for fast, local, inference-time checking using smaller cross-encoder models.
  • Vs. Vector DB similarity: Standard similarity search checks if texts are related, but not if they contradict. LongTracer uses Natural Language Inference (NLI) to explicitly classify relationships as entailment, neutral, or contradiction.

Source Code: https://github.com/ENDEVSOLS/LongTracer

MIT Licensed. I'd love to hear feedback from the community on the Python architecture, the pluggable database backends, or any features you'd like to see added to the CLI!

[–]arzaan789 0 points1 point  (0 children)

Built a tool to find which of your GCP API keys now have Gemini access

Callback to https://news.ycombinator.com/item?id=47156925

After the recent incident where Google silently enabled Gemini on existing API keys, I built keyguard. keyguard audit connects to your GCP projects via the Cloud Resource Manager, Service Usage, and API Keys APIs, checks whether generativelanguage.googleapis.com is enabled on each project, then flags:

  • unrestricted keys (CRITICAL: the silent Maps→Gemini scenario)
  • keys explicitly allowing the Gemini API (HIGH: intentional but potentially embedded in client code)

It also scans source files and git history if you want to check what keys are actually in your codebase.

https://github.com/arzaan789/keyguard

[–]PotentialTomorrow111 0 points1 point  (0 children)

ASBFlow - Simplified Python API for Azure Service Bus

Hi r/Python,

I’ve been working on an integration with Azure Service Bus to handle high-volume messaging and I found the official Python SDK somewhat verbose and unintuitive for common workflows. To make things simpler, I developed asbflow, an open-source library that wraps the SDK providing a cleaner and more straightforward interface.

What My Project Does
ASBFlow simplifies the most common messaging workflows with Azure Service Bus. It lets you:

  • Publish messages to topics and queues, automatically handling the maximum batch size or letting you specify it
  • Consume messages from subscriptions and queues
  • Manage dead-letter queues: read, republish or purge messages
  • Optionally validate message payloads with Pydantic, preventing message confirmation if parsing fails

The library offers a more intuitive interface than the official SDK, while supporting high-volume messaging, multiple execution strategies (sequential or concurrent) and integration with Azure authentication methods.
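
The "prevent confirmation if parsing fails" idea can be sketched in a few lines. This is a simplified stand-in (a dataclass instead of Pydantic, hypothetical ack and dead-letter callbacks), not ASBFlow's actual code:

```python
import json
from dataclasses import dataclass

@dataclass
class Alert:
    id: str
    severity: str

def handle(raw, ack, dead_letter):
    # Validate the payload first; only confirm the message if parsing succeeds.
    try:
        msg = Alert(**json.loads(raw))
    except (TypeError, ValueError):
        dead_letter(raw)   # parsing failed: never acknowledge
        return None
    ack(msg)               # confirmed only after validation
    return msg

acked, dlq = [], []
handle('{"id": "a1", "severity": "high"}', acked.append, dlq.append)
handle('{"id": "a2"}', acked.append, dlq.append)  # missing field goes to DLQ
```

The key design point is that acknowledgement is gated on validation, so malformed messages are never silently consumed.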

Target Audience
ASBFlow is aimed at developers and teams working with Azure Service Bus. As a first version it is not yet production-ready; it is currently intended for prototyping and experimentation, with the goal of evolving into a production-grade solution.

Comparison
Compared to the official azure-servicebus Python SDK, ASBFlow:

  • Reduces boilerplate for publishing and consuming messages
  • Integrates optional Pydantic validation with ASB acknowledgement
  • Simplifies dead-letter queue (DLQ) management
  • Supports multiple execution strategies without changing business logic
  • Integrates with Azure authentication methods

Links & Installation

Quick Example

from asbflow import ASBConnectionConfig, ASBPublisher, ASBPublisherConfig, ASBConsumer, ASBConsumerConfig

conn = ASBConnectionConfig(connection_string="<connection-string>")

publisher = ASBPublisher(conn, ASBPublisherConfig(topic_name="<topic-name>"))
consumer = ASBConsumer(conn, ASBConsumerConfig(topic_name="<topic-name>", subscription_name="<subscription-name>"))

publisher.publish({"id": "a1", "severity": "high"}, parse=False)
result = consumer.consume(parse=False, raise_on_error=False)
print(result.succeeded, result.failed)

Project Status & Contributions

This is the first stable version of the project: many more features can certainly be developed and integrated. Contributions and feedback are welcome!

[–]Sad_Mud_4484 0 points1 point  (0 children)

Spent the last few weeks building a ServiceNow loader for LLM pipelines, finally shipped it.

So here's the backstory. I've been working on a project where we need to pull ServiceNow data into a RAG pipeline. Incidents, knowledge base articles, CMDB configs, change requests, the whole nine yards. The problem? There's literally nothing out there that does this properly. The one LlamaIndex reader that exists only handles KB articles and depends on pysnc. That's it. Nothing for LangChain at all.

I ended up writing the same boilerplate over and over. Pagination logic, handling those nested reference fields ServiceNow loves to return, stripping HTML from KB articles, figuring out the auth dance. After the third project where I copy-pasted the same code, I thought screw it, let me just make a proper package.

That's how snowloader happened. It covers six tables out of the box:

- Incidents (with work notes and comments if you want them)
- Knowledge Base articles (HTML gets cleaned automatically)
- CMDB configuration items (this one's fun, it can walk the relationship graph and pull parent/child/depends-on links)
- Change requests
- Problems (flags known errors properly as booleans, not strings)
- Service catalog items

The part I'm most proud of is the CMDB relationship traversal. You point it at a server and it fetches all the connected CIs in both directions. Super useful when you're building context for an AI that needs to understand infrastructure dependencies.

It plugs into LangChain and LlamaIndex natively. Not some hacky wrapper, it actually inherits from BaseLoader and BaseReader so it works with any chain or retriever you throw at it.

Here's what using it looks like:

    from snowloader import SnowConnection, IncidentLoader

    conn = SnowConnection(
        instance_url="https://yourinstance.service-now.com",
        username="admin",
        password="yourpassword",
    )

    loader = IncidentLoader(connection=conn, query="active=true")
    for doc in loader.lazy_load():
        print(doc.page_content[:200])

That's it. Three lines to get structured, LLM-ready documents from ServiceNow. It handles pagination internally, streams results so you don't blow up memory on large tables, and supports delta sync so you can just fetch what changed since yesterday.
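
The pagination-plus-streaming pattern described above looks roughly like this. It is a generic sketch with a stand-in fetch function, not snowloader's real internals:

```python
def lazy_load(fetch_page, page_size=100):
    # Yield records page by page so large tables never sit fully in memory.
    offset = 0
    while True:
        batch = fetch_page(offset, page_size)  # one API call per page
        if not batch:
            return
        yield from batch                       # stream, don't accumulate
        offset += page_size

records = list(range(250))
pages_fetched = []

def fetch(offset, limit):
    pages_fetched.append(offset)
    return records[offset:offset + limit]

loaded = list(lazy_load(fetch))
```

A generator like this is what lets callers iterate documents as they arrive instead of waiting for the whole table.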

Auth-wise it supports basic auth, OAuth with password grant, and bearer tokens. Tested all of them against a real ServiceNow developer instance, not just mocked HTTP.

I've been running this against our dev instance with about 3000 CMDB items, 67 incidents, 100 change requests, and 53 KB articles. 41 integration tests pass against the live instance. The whole thing is typed, linted, and has 124 unit tests on top of that.

pip install snowloader

Or if you want the LangChain adapter directly: pip install langchain-snowloader

Links if anyone wants to check it out:

https://github.com/ronidas39/snowloader

https://pypi.org/project/snowloader/

https://snowloader.readthedocs.io

Would love to hear what people think. If you work with ServiceNow and have been dealing with the same pain, give it a shot. And if something's broken or missing, open an issue, I'm actively working on this.

[–]TuriyaChips 0 points1 point  (0 children)

I built fully offline speech-to-text dictation for Linux (X11 + Wayland) — no cloud, no API keys, no data leaves your machine

I was frustrated with cloud dictation services sending my audio to remote servers. So I built faster-whisper-dictation — a local, privacy-first dictation tool.

How it works:

Microphone → Silero VAD → Whisper Server → Type into focused app
(sounddevice)  (local)      (REST API)      (platform-native)
  • Hold Alt+V, speak, release — text appears in whatever app has focus
  • Everything runs on your machine, zero network dependency
  • Background daemon — 0% CPU while idle

Features:

  • Batch mode (full utterance, highest accuracy) + streaming mode (real-time)
  • Configurable hotkey, hold-to-talk or toggle mode
  • Works with any OpenAI-compatible STT server, or built-in local engine
  • Cross-platform: Linux (X11 + Wayland), macOS, Windows

Install:

uv tool install faster-whisper-dictation

MIT licensed, open source: https://github.com/bhargavchippada/faster-whisper-dictation

Demo GIF and full docs in the README. Happy to answer questions!

[–]Super_Tree_4120 0 points1 point  (0 children)

I built a "Kid-Proof" Photo Studio because Word and Photoshop were too frustrating for my kids.

https://mnnbir.github.io/kids-photo-studio/

The Problem: I wanted to teach my kids how to make photo collages for school projects, but every piece of software we tried was a nightmare. Word keeps jumping images around because of text wrapping, and Photoshop/GIMP is way too complex for a child's brain.

The Solution: I built Kids Photo Printing Studio. It’s a dead-simple, open-source Windows app with a pure A4 canvas.

What makes it "Kid-Proof":

  1. Drag & Drop: Just pull images from a folder or paste from Google.

  2. 4K Smart-Shrink: It automatically resizes massive 4K/8K photos so they don't break the layout.

  3. Pro-Cloning: Hold Ctrl + Drag to clone images in perfectly straight lines (great for patterns!).

  4. Contextual Tools: Delete, Rotate, and Crop buttons only appear when you click a photo.

  5. Unlimited Undo: Because kids (and adults) make mistakes.

Technical Stuff: Built with Python and PyQt6. It's 100% offline and safe.

I've just published it as Open Source on GitHub. I’d love for other parents to try it out, or for developers to help me add more "fun" features like text tools or stickers!

GitHub Link: https://github.com/mnnbir/kids-photo-studio

Check it out and let me know what you think!

[–]Latter_Professor1351Pythonista 0 points1 point  (0 children)

How are you all handling hallucination risk in production LLM pipelines?

Been dealing with this problem for a while now at my end. I was building a pipeline where LLM outputs were driving some downstream processing, database writes, API calls, that sort of thing. And honestly it was frustrating because the model would return something that looked perfectly structured and confident but was just... wrong. Silently wrong. No errors, nothing to catch it.

I tried a few things (prompt engineering, stricter schemas, retry logic), but nothing felt clean enough. Eventually I just wrote a small utility for myself called hallx that does three basic heuristic checks before I trust the output: schema validity, consistency across runs, and grounding against the provided context. Nothing clever, just simple signal aggregation that gives a confidence score and a risk level so I know whether to proceed or retry.
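
For anyone curious what "simple signal aggregation" can mean in practice, here is a toy sketch of combining three boolean checks into a score and a risk level. The weights and thresholds are made up for illustration and are not hallx's actual logic:

```python
def assess(schema_ok, runs_agree, grounded, weights=(0.4, 0.3, 0.3)):
    # Each passing check contributes its weight to the confidence score.
    signals = (schema_ok, runs_agree, grounded)
    score = round(sum(w for ok, w in zip(signals, weights) if ok), 2)
    risk = "low" if score >= 0.7 else "medium" if score >= 0.4 else "high"
    return score, risk
```

The caller then decides whether to proceed, retry, or escalate based on the risk level rather than trusting the raw output.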

It's been working well enough for my use case, but I'm genuinely curious how others are approaching this. Are you doing any kind of pre-action validation on LLM outputs? Or just relying on retries and downstream error handling?

Would love to hear what's working for people and if anyone's interested the source is here: https://github.com/dhanushk-offl/hallx. Still early and happy to take feedback.

[–]bluepoison24 0 points1 point  (0 children)

I built the same algorithm visualizer in Python (AST rewriting) and Java (bytecode manipulation)

The Python engine rewrites your AST at parse time, so `arr[i] = arr[j]` becomes:

    _r.on_list_get(arr, j)      # highlight read
    arr[i] = arr[j]             # your code, untouched
    _r.on_list_set(arr, i, v)   # highlight write

Trees, linked lists, and graphs are auto-detected by attribute structure.
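
For readers unfamiliar with the technique, a minimal ast.NodeTransformer that injects tracing calls around subscript reads looks like this. It is a simplified illustration of the approach, not the actual AlgoFlow engine:

```python
import ast

class TraceReads(ast.NodeTransformer):
    # Wrap every subscript *read* (Load context) in a call to _trace(...).
    def visit_Subscript(self, node):
        self.generic_visit(node)
        if isinstance(node.ctx, ast.Load):
            return ast.Call(func=ast.Name(id="_trace", ctx=ast.Load()),
                            args=[node], keywords=[])
        return node

reads = []
def _trace(value):
    reads.append(value)   # a visualizer would emit a "highlight" command here
    return value

src = "total = xs[0] + xs[1]"
tree = ast.fix_missing_locations(TraceReads().visit(ast.parse(src)))
namespace = {"_trace": _trace, "xs": [3, 4]}
exec(compile(tree, "<demo>", "exec"), namespace)
```

The user's source is untouched on disk; only the parsed tree is rewritten before execution.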

Both engines produce the same command stream — the frontend doesn't know which language ran. The Java engine is mostly hand-written, while the Python engine is largely AI-written, based on the Java engine.

Try it: algopad.dev | Source: github.com/vish-chan/AlgoFlow

[–]ThatOtherBatman 0 points1 point  (0 children)

Sygaldry

This project was written with data/ETL pipelines in mind. But it should be useful for any situation where you're managing a bunch of production pipelines.

Motivation

Many years ago I used to work at a different job, where they had this framework for creating arbitrary Python objects from .ini configuration files. At first I hated it. Because I just could not see the point of writing out these stupid config files vs just writing out a Python script that did the same thing. Over the years that I was there though I really came to appreciate it.

Previously (and since) every time a new pipeline is needed, somebody sits down and writes a new script, a new CLI entrypoint, or a new glue class that just wires the same pieces together in a slightly different order. Then there's a code review, CI/CD, and a release.

Sygaldry lets you assemble arbitrary object graphs from YAML (or TOML) config files. No base classes. No decorators. No framework coupling. Your application code stays completely untouched.

An Example

Imagine that I've got something like this:

```python
class DatabaseConnection:
    def __init__(self, host, port, database, username, password): ...

class RestClient:
    """Authenticates against a service and makes API calls."""
    def __init__(self, username, password, auth_url): ...

class UrlIterator:
    """Reads identifiers from the database, then asks the API for a download URL for each one."""
    def __init__(self, db_connection, rest_client, base_url): ...

class FileDownloader:
    """Downloads a file from a URL to a local directory."""
    def __init__(self, directory, base_url, rest_client): ...
    def download(self, relative_url): ...

class DbUpdater:
    """Iterates download URLs, downloads each file, and updates the database with the contents."""
    def __init__(self, db_connection, url_iterator, file_downloader): ...
```

With Sygaldry I have a config file:

```yaml
# db_updater.yaml
db:
  _type: myapp.db.DatabaseConnection
  host: prod.db.com
  port: 5432
  database: prod
  username: etl_rw_user
  password: ${DB_PASSWORD}

api_client:
  _type: myapp.client.RestClient
  username: svc_account
  password: ${API_PASSWORD}
  auth_url: https://auth.vendor.com/token

url_iterator:
  _type: myapp.urls.UrlIterator
  db_connection: {_ref: db}
  rest_client: {_ref: api_client}
  base_url: ${base_reference_url}

file_downloader:
  _type: myapp.download.FileDownloader
  directory: /data/downloads
  base_url: ${base_download_url}
  rest_client: {_ref: api_client}

updater:
  _type: myapp.update.DbUpdater
  db_connection: {_ref: db}
  url_iterator: {_ref: url_iterator}
  file_downloader: {_ref: file_downloader}

base_reference_url: https://api.vendor.com/references
base_download_url: https://api.vendor.com/files
```

Then my entire pipeline can be run as:

    $ sygaldry run -c db_updater.yaml updater

Sygaldry resolves the whole graph depth-first — db and api_client get built first, then url_iterator and file_downloader (which reference them), then updater (which references those). The db and api_client instances are shared automatically — everyone who references db gets the same object.
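
A stripped-down version of that resolution step, using a local class registry instead of dotted import paths, might look like this (an illustrative sketch, not Sygaldry's actual implementation):

```python
class Db:
    def __init__(self, host):
        self.host = host

class Client:
    def __init__(self, db):
        self.db = db

class Updater:
    def __init__(self, db, client):
        self.db, self.client = db, client

REGISTRY = {"Db": Db, "Client": Client, "Updater": Updater}

def resolve(config, key, cache=None):
    # Depth-first: build dependencies first, and share instances via the cache.
    cache = {} if cache is None else cache
    if key in cache:
        return cache[key]
    node = config[key]
    kwargs = {}
    for name, value in node.items():
        if name == "_type":
            continue
        if isinstance(value, dict) and "_ref" in value:
            kwargs[name] = resolve(config, value["_ref"], cache)  # recurse
        else:
            kwargs[name] = value
    cache[key] = REGISTRY[node["_type"]](**kwargs)
    return cache[key]

config = {
    "db": {"_type": "Db", "host": "prod.db.com"},
    "client": {"_type": "Client", "db": {"_ref": "db"}},
    "updater": {"_type": "Updater", "db": {"_ref": "db"}, "client": {"_ref": "client"}},
}
updater = resolve(config, "updater")
```

The cache is what gives the "everyone who references db gets the same object" behavior.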

Why?

Composition Over Inheritance

References (_ref) let you point any component at any other component. Five services need the same database connection? Just reference it. Need to swap a component? Change one line.

New Pipelines Without Code Release

Got a second vendor with the same pattern but different URLs? That's a new YAML file.

```yaml
_include:
  - db_updater.yaml
base_reference_url: https://api.other-vendor.com/refs
base_download_url: https://api.other-vendor.com/dl
db:
  database: other_vendor_db
```

Need the UrlIterator and the DbUpdater to use different database connections? That's a config change.

Change Anything From The Command-Line

Need to point at a different database for a one-off backfill? --set db.host=backfill-replica. Need to re-download to a different directory? --set file_downloader.directory=/data/backfill. No config release, no environment variable gymnastics. Overrides are applied at load time before resolution, so they compose cleanly with everything else.

Debug With the Exact Config

Something broken in production?

    $ sygaldry interactive -c db_updater.yaml

will drop you into a Python terminal with the Artificery loaded and assigned to the variable artificery. You can look at the config (artificery.config), or resolve the config and get the objects (art = artificery.resolve()) for debugging.

Extras

Check the Config

Want to see the Python that corresponds to the config you've supplied?

    $ sygaldry check -c db_updater.yaml

Typing

I think this is close to pointless. But a bunch of the kids that I work with are obsessed with typing to the point that it's almost a fetish. So you can do:

    $ sygaldry check -c db_updater.yaml --type-checker mypy

And it will dump the Python into a file and run the specified type-checker over it.

Is It AI Slop?

I tend to suffer from a problem where I have new ideas when I'm writing tests and documentation. Which causes more development. Which then requires more tests and documentation. I have found Claude and Codex to be super useful for stopping me from thinking too much once I'm at a certain point. But the idea, and the code are all entirely human slop.

[–]nitish94 0 points1 point  (0 children)

I built a lightweight alternative to Databricks Auto Loader (no Spark, just Polars)

What My Project Does

I built OpenAutoLoader, a Python library for incremental ingestion into Delta Lake without Spark.

It runs on a single node and uses Polars as the engine. It keeps track of processed files using a local SQLite checkpoint, so it only ingests new data.

Features:

  • Incremental ingestion (no reprocessing)
  • SQLite-based checkpointing
  • “Rescue mode” for unexpected columns (_rescued_data)
  • Automatic audit columns (_batch_id, _processed_at, _file_path)
  • Schema evolution options (addNewColumns, fail, rescue, none)
  • Works with S3/GCS/Azure via fsspec
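
The SQLite checkpointing idea can be sketched as follows. This is a simplified illustration of the pattern, not OpenAutoLoader's actual schema or code:

```python
import sqlite3

def new_files(conn, candidates):
    # The checkpoint table remembers every path already ingested, so a rerun
    # only returns files it hasn't seen before.
    conn.execute("CREATE TABLE IF NOT EXISTS checkpoint (path TEXT PRIMARY KEY)")
    seen = {row[0] for row in conn.execute("SELECT path FROM checkpoint")}
    fresh = [p for p in candidates if p not in seen]
    conn.executemany("INSERT INTO checkpoint VALUES (?)", [(p,) for p in fresh])
    conn.commit()
    return fresh

conn = sqlite3.connect(":memory:")
first = new_files(conn, ["2024-01.parquet", "2024-02.parquet"])
second = new_files(conn, ["2024-01.parquet", "2024-03.parquet"])
```

Running the function twice shows the incremental behavior: the second call only returns the file that wasn't checkpointed.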

Target Audience

  • Data engineers experimenting with Polars + Delta Lake
  • People who want a local/dev-friendly ingestion tool
  • Anyone trying to understand how tools like Auto Loader work under the hood

⚠️ Not production-ready yet — more of a learning/project + early-stage utility.

Comparison

Compared to Databricks Auto Loader:

  • No Spark or cluster needed
  • Runs locally (much simpler setup)
  • Fully open and hackable

Trade-offs:

  • Not distributed
  • No enterprise-grade reliability guarantees
  • Still early-stage

Built this mainly to learn and scratch my own itch around lightweight ingestion without Spark.

Repo: https://github.com/nitish9413/open_auto_loader
Docs: https://nitish9413.github.io/open_auto_loader/

[–]Legitimate_Proof9171 0 points1 point  (0 children)

**Diogenesis** — runtime behavioral security for AI applications, zero dependencies

**What My Project Does:**

Monitors AI applications at runtime by intercepting imports, file writes, network connections, and subprocesses. Learns what "normal" looks like for your application, then flags deviations. Think of it like an immune system for your code.

**Target Audience:**

Python developers building or deploying AI agents who need runtime behavioral monitoring. Production-ready with 104 automated tests.

**Comparison:**

Unlike firewalls or antivirus (signature-based, only catch known threats), Diogenesis is behavioral — it catches novel threats with no existing signature. Unlike Falco or osquery, it's pure Python with zero dependencies, installs with one command, and sits inside your application rather than at the OS level.

**Why zero dependencies?**

It's a security tool. Every dependency is another package that could be compromised in a supply chain attack. The result is that `pip install diogenesis-sdk` installs exactly one package with zero transitive dependencies.

**Quick start:**

```python
from diogenesis_sdk import activate, status, field_state, threat_summary

activate()
print(status())
```

Five built-in threat patterns: data exfiltration, privilege escalation, shadow imports, resource abuse, and behavioral drift. Graduated response: LOG → WARN → ALERT.

Python 3.8+, MIT licensed.

GitHub: https://github.com/AI-World-CEO/diogenesis-sdk

PyPI: https://pypi.org/project/diogenesis-sdk/

[–]ApprehensiveTrust840 0 points1 point  (0 children)

Prospect Finder — CLI tool I built for my own outreach stack, now open source.

Takes a domain or company name → returns verified emails with confidence scores via Hunter.io API. Pure Python stdlib, zero external deps.

github.com/ImRicoAi/prospect-finder

Would love feedback on the filtering and output format.

[–]RiceTaco12 0 points1 point  (0 children)

timingtower - a modern (Polars + Pydantic) Python library for accessing the F1 livetiming API

What My Project Does

timingtower is a thin, unopinionated package for accessing the Formula 1 livetiming API (https://livetiming.formula1.com). It allows for direct access to every livetiming endpoint and returns validated Pydantic objects and Polars dataframes.

    from timingtower import DirectClient

    client = DirectClient()
    car_telemetry = client.get("CarData", year=2026, meeting="Shanghai", session="Race")

Target audience

Technical F1 fans that want to build their own analytical packages from the F1 livetiming API and perform their own analyses.

Comparison

FastF1 is the most widely used Python package for accessing the F1 livetiming API. However, it processes the raw data internally and returns higher-level views of the data as pandas dataframes. timingtower leaves the data alone as much as possible, provides validated structures through Pydantic, and allows for more efficient data analysis through its Polars dataframes.

[–]hasyb001 0 points1 point  (0 children)

🪁 I built “Kite” — A Next.js-style framework for Python (File-based routing + zero config)

Hey everyone 👋

I’ve been working on a project called Kite, and I’d love some feedback.

👉 GitHub: https://github.com/mhasyb1/kite

The idea is simple:
Bring file-based routing, zero config, and simplicity to Python backend development.

🚀 Why I built this

I noticed that:

  • Django is powerful but heavy
  • Flask is flexible but requires setup
  • FastAPI is great but not beginner-friendly in structure

So I thought:

👉 What if Python had something like Next.js?

⚡ What My Project Does

  • ✅ File-based routing (/pages → routes automatically)
  • ✅ Zero configuration
  • ✅ Dynamic routes ([id].py)
  • ✅ Built-in lightweight ORM (SQLite)
  • ✅ Middleware system
  • ✅ API + HTML responses
  • ✅ Async support

📁 Example Routing

Just drop files:

pages/index.py      → /
pages/about.py      → /about
pages/blog/[id].py  → /blog/123
pages/api/users.py  → /api/users

No router setup needed.
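
The mapping above can be sketched in a few lines. This is an illustrative re-implementation of the idea, not Kite's actual router, and the `{id}` placeholder format is an assumption of the sketch:

```python
import re
from pathlib import PurePosixPath

def route_for(path):
    rel = PurePosixPath(path).relative_to("pages")
    parts = list(rel.with_suffix("").parts)
    if parts and parts[-1] == "index":   # index.py maps to the directory root
        parts = parts[:-1]
    # [id].py becomes a dynamic segment, rendered here as {id}.
    parts = [re.sub(r"^\[(\w+)\]$", r"{\1}", part) for part in parts]
    return "/" + "/".join(parts)

routes = {p: route_for(p) for p in [
    "pages/index.py", "pages/about.py", "pages/blog/[id].py", "pages/api/users.py",
]}
```

The whole routing table falls out of the filesystem layout, which is what "no router setup" means in practice.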

✍️ Example Page

methods = ["GET"]

def handler(request):
    return "<h1>Hello from Kite!</h1>"

🔌 API Example

methods = ["GET", "POST"]

async def handler(request):
    if request.method == "GET":
        return {"data": [1,2,3]}

    body = await request.json()
    return {"received": body}

🧠 Built-in ORM

class User(Model):
    table = "users"
    fields = ["id", "name", "email"]

User.create(name="Haseeb", email="h@example.com")

🗺️ Roadmap

  • 🔄 Hot reload (coming)
  • 🧠 AI-native routes (planned)
  • 🔐 Auth system (JWT + sessions)
  • 🐘 PostgreSQL support
  • 🔌 Plugin ecosystem

🎯 Target Audience

Kite is currently aimed at:

  • Beginners learning backend development
  • Developers who want fast setup with minimal configuration
  • Developers who prefer convention over configuration (like Next.js)

Current status:

  • ⚠️ Not production-ready yet
  • ✅ Suitable for learning, prototyping, and small projects

Planned improvements include hot reload, authentication, and database expansion.

⚖️ Comparison

Feature Kite Django Flask FastAPI
File-based routing
Zero configuration ⚠️ ⚠️
Built-in ORM
Async support ⚠️
Learning curve Easy Medium Easy Medium
Flexibility Medium Low High High
Performance Medium Medium Medium High
Production-ready

🤔 Looking for Feedback

I’d love your thoughts on:

  • Does this actually solve a real problem?
  • What would make you try this?
  • What’s missing for production use?

🙌 Honest Goal

I’m trying to build something:

  • Beginner-friendly
  • Fast to start
  • Scalable over time

If this gets some interest, I’ll open-source it properly and keep improving it 🚀

Thanks for reading 🙏

[–]cshjoshi_tech 0 points1 point  (0 children)

Hi r/Python!

What My Project Does:

Phemeral is a hosting platform that makes it easy to host and scale Python backends.
To host your application, all you have to do is connect your repo to Phemeral and make a push.
If your application’s framework is FastAPI, Flask, or Django, all hosting config is determined from your code.
For other frameworks, the only needed config is to specify a start command.
For your dependencies: uv, poetry, requirements.txt, or any other pyproject.toml based package manager is supported.
Your application is hosted on Phemeral’s managed cloud, and you are only charged for the traffic your applications are receiving.
Deployments automatically scale to 0 when idle and rapidly (~30ms) scale up to accommodate traffic.
It would be awesome if y’all check it out and deploy something of yours! I’m happy to receive any feedback and answer any questions in comments or DMs!

Link to the docs: https://phemeral.dev/docs
Link to the main page: https://phemeral.dev/

The free tier is purposefully accommodating to help folks give it a try:

  • Unlimited projects and deployments
  • 1,000,000 requests/month
  • 10 Hours of active compute usage/month (ie, 10 hours of wall clock time that your requests are actually executing code)

Target Audience:

Developers or development teams that want to simplify their deployment process and/or save on cloud costs when their services are idle.

Comparison:

Since Phemeral has been made specifically for Python, there’s generally less configuration needed compared to existing platforms and broader support for versions of Python.
Phemeral’s compute also works a little differently from existing solutions. Rather than your code running as a long-lived process on a persistent VPS, applications on Phemeral run in shorter-lived compute environments that automatically scale up and down by use of VM snapshots.

[–]coldoven 0 points1 point  (0 children)

What My Project Does

Modular RAG pipeline where every stage (PII redaction, chunking, dedup, embedding, indexing) is a swappable plugin. 
The pipeline is a string: `"docs__pii_redacted__chunked__deduped__embedded"`. 
Drop or add stages by editing the string. Built-in eval (Recall@K, NDCG, MAP) against BEIR benchmarks.
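
The stage-string idea can be sketched like this. The plugin names follow the example string above, but the registry and functions here are illustrative, not the repository's actual API:

```python
# Each "__"-separated suffix names a plugin applied in order after the source.
PLUGINS = {
    "pii_redacted": lambda docs: [d.replace("alice@example.com", "[EMAIL]") for d in docs],
    "chunked": lambda docs: [c for d in docs for c in d.split(". ") if c],
    "deduped": lambda docs: list(dict.fromkeys(docs)),
}

def run_pipeline(spec, docs):
    source, *stages = spec.split("__")
    for stage in stages:
        docs = PLUGINS[stage](docs)   # swap a stage by editing the string
    return docs

out = run_pipeline("docs__pii_redacted__deduped",
                   ["contact alice@example.com", "contact alice@example.com"])
```

Because the pipeline is just a string over a plugin registry, adding or dropping a stage is a one-token edit.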

Target Audience:

Developers building RAG pipelines who need to swap components and measure the impact per stage.

Comparison:

LangChain/LlamaIndex provide RAG building blocks but don't enforce stage ordering or offer per-stage evaluation. This does both.

Not everything is working yet, but most of it is. Feedback welcome.

Apache 2.0: https://github.com/mloda-ai/rag_integration

[–]QuoteSad8944 0 points1 point  (0 children)

**What My Project Does**

agentlint statically lints AI coding assistant instruction files (`.cursorrules`, `.instructions.md`, `.windsurfrules`, `CLAUDE.md`, `GEMINI.md`, etc.). It runs 18 checks: dead file references, circular skill dependencies, missing role coverage, unsourced numeric/percentage claims, secret detection, vague instruction patterns, dispatch coverage gaps, .env key parity, cross-file value consistency, ground truth JSON/YAML validation, source constant extraction, dead anchor links, trigger overlap, and token budget analysis. Outputs plain text, JSON, SARIF, badge SVG, and HTML. Runs in pre-commit or GitHub Actions.
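
For a flavor of what one such check involves, here is a toy version of dead file reference detection. The backtick-reference convention and regex are assumptions for illustration, not agentlint's actual rule:

```python
import pathlib
import re
import tempfile

def dead_refs(text, root="."):
    # Find backtick-quoted file references and report those that don't exist.
    refs = re.findall(r"`([\w./-]+\.(?:py|md|json|ya?ml))`", text)
    return [ref for ref in refs if not (pathlib.Path(root) / ref).exists()]

with tempfile.TemporaryDirectory() as root:
    (pathlib.Path(root) / "skills.md").write_text("# skills")
    report = dead_refs("Load `skills.md`, then see `missing/setup.md`.", root)
```

A real linter layers many such static checks and maps each finding to a file/line for SARIF or CI output.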

v0.5.0 just shipped: circular refs detection, required role coverage check, --baseline for CI, --format html, and an --init wizard.

**Target Audience**

Python developers who maintain AI coding assistant instruction files in professional or team projects and want CI-enforced quality gates. Production use. 310 tests.

**Comparison**

Think Ruff/Flake8 but the files being linted are AI assistant configs, not Python source. No other tool models the dispatch table concept or detects circular skill references. Nothing else validates documented constants against source code. No LLM calls — purely static file analysis.

GitHub: https://github.com/Mr-afroverse/agentlint

Install: `pip install instruction-lint`

[–]AgeAltruistic6510 0 points1 point  (0 children)

fargv — argument parsing where your parameters are a data model, not code scattered around your script. Zero dependencies, auto CLI+config+env from one dict/dataclass/function definition.

Hi everybody, this is my first reddit post! I've been using fargv for about 6 years in research scripts, and recently added enough features that I think it's worth sharing properly.

The core frustration that led to it: every time I tried to follow someone's click-based CLI code I was jumping between decorated functions trying to understand what the program actually accepted. And with argparse, adding or renaming a parameter means editing multiple places for what should be a one-line change. That felt wrong.

Some of my positions on the topic this module implements:

Parameters are a data model. They should live in one place, defined once, as data — not scattered as decorators across your functions or as add_argument calls spread through the file.

A library should be easy to get rid of. If your parameters are defined as a plain dict, dataclass, or function signature, replacing fargv later costs you almost nothing. Your business logic never imports fargv — only the definition and the one parse call do.

The UNIX CLI tradition is predictable enough to automate. Given a parameter name and its default value, the right type, short flag, help text, and --key value / --key=value syntax can all be inferred. Integer parameters defaulting to 0 automatically get stacking switch behavior — -vvv and --verbosity 3 are the same thing, for any such parameter, with no extra configuration.

Config files, env vars, and CLI arguments are the same problem. fargv handles all three from the same definition, with each layer overriding the previous.

The thing I'm most happy with: exposing a hard-coded value on the CLI takes exactly two line changes. Add "batch_size": 16 to your params dict, change Dataloader(ds, batch_size=16) to Dataloader(ds, batch_size=p.batch_size). No registration, no decorator, no help string required unless you want one.

Subcommands are first-class too — each gets its own parameter namespace:

    import fargv

    p, _ = fargv.parse({
        "batch_size": 16,               # global parameter
        "mode": {
            "train": {"lr": 0.001},     # train-only parameter
            "eval":  {},
            "test":  {},
        },
    })

Invoked as python script.py train --lr 0.001 or python script.py eval. Passing --lr under eval raises an error. --batch_size works under any subcommand. --verbosity / -vvv is built-in and can be disabled.

pip install fargv has zero mandatory dependencies. Tk GUI, Qt GUI, and Jupyter widget support activate at runtime only if those packages are already present.

Pypi: https://pypi.org/project/fargv/

Repo: https://github.com/anguelos/fargv

Docs: https://fargv.readthedocs.io

Happy to hear what's missing or broken, and quite open to adding the feature you always wanted.

[–]AssociateSlight8590 0 points1 point  (0 children)

Hello, I'm working on an orchestrator/runtime for an RPA tool. Something small, for us getting started with RPA in a company. It's just the runtime (job intake, queries, logging and so on); you still need an RPA tool like UiPath Studio or Power Automate for screen clicks. But it feels like I'm reinventing the wheel. Or do you just do everything in the RPA tool? I searched a lot on GitHub and also asked AI... Found Robot Framework but it's not the same.

https://github.com/eliascccc/robot-runtime

[–]bctm0 0 points1 point  (0 children)

Built a small Python library called faultcore that helps with network fault injection for tests, with fine control and no external proxies or services.
https://github.com/albertobadia/faultcore

I know this is a very niche use case, but I hope it helps somebody with the same problems I ran into testing network clients. It's Linux-only but made to work well under Docker and CI/CD; it uses LD_PRELOAD to intercept network connections. That gives you strong control in Docker/CI: deterministic, reproducible fault scenarios without changing application code.

Useful for testing client side apps for retries/fallbacks under timeout, packet loss, jitter, DNS issues, etc.

import faultcore
import requests

@faultcore.downlink("1mbps")
@faultcore.latency("120ms")
def download_file(url) -> int:
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return len(response.content)

[–]Dismal_Beginning_486 0 points1 point  (0 children)

Tidbit – Capture anything into structured Markdown notes and training-ready JSONL. - https://pypi.org/project/tidbit/
https://github.com/phanii9/Tidbit

[–]heroicgorilla 0 points1 point  (0 children)

Built a bot for the trending web game dialed.gg that can consistently achieve near-perfect scores by capturing the target color from the screen and recreating it using automated slider controls

Here's how it works:
1. Setup: The user must calibrate the bot by selecting screen positions (color preview + sliders).
2. Capture: App reads pixel color during preview phase
3. Convert: RGB → HSV using colorsys
4. Detect transition: Monitors pixel change to detect when preview phase ends
5. Apply: Maps HSV values to slider positions and adjusts them automatically
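
Step 3 uses the stdlib directly. Here's a sketch of the conversion plus a hypothetical slider mapping — the 0–100 slider scale is my assumption, not taken from the repo:

```python
import colorsys

def rgb_to_slider_positions(r: int, g: int, b: int):
    # colorsys expects channels normalized to the 0..1 range
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    # Map each HSV component onto a hypothetical 0..100 slider scale
    return round(h * 100), round(s * 100), round(v * 100)

print(rgb_to_slider_positions(255, 0, 0))  # pure red: hue 0, full saturation/value
```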

Repo: https://github.com/hero0ic/dialed.gg-bot
Download: https://github.com/hero0ic/Dialed.gg-Bot/releases/tag/v1.0.0

[–]jcubic 0 points1 point  (0 children)

Created a new project for myself. A speaking clock. The code is Open Source:

https://github.com/jcubic/speaking-clock

It tells you the time. You can run it in the background and it will announce the time at a given interval. You can give it a range so it won't wake you up at night.

Right now it only supports English and Polish, but you can contribute your own language metadata.

[–]Acceptable_Candy881 0 points1 point  (0 children)

Session Feature Extractor

I have been working with Python to build computer vision solutions for some years, but recently I took a dive into the cybersecurity field and found an intersection for my research. I found that most intrusion detection systems (that are in research) use a flow-based approach, i.e. they collect N number of packets per session and find different statistical features. While this is simple, fast and easy to explain, it is also problematic because it often disregards packet-level information. Thus, my idea is to convert individual packets into a NumPy array of integers and combine them to form an image. Using this session format, I completed my Master's thesis, a couple of projects, and published one paper. As I was reusing the same components multiple times, I decided to build a project for it, and here it is.

Links:

What My Project Does

  • Can read PCAP files and their corresponding labels in CSV files. The CSV files are expected to be generated by the CICFlowMeter tool.
  • Using Scapy, each packet is broken into at least 4 TCP/IP layers where possible.
  • Reconstructing the Scapy packet back from an array is also possible, but may include padding, since arrays are padded to fit in a session.
  • Experimental live packet-to-image conversion (sniffing) is also implemented.
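
The packet-to-image idea boils down to treating raw packet bytes as uint8 pixels and padding each packet to a fixed row width so a session stacks into a 2-D array. A minimal sketch of the technique — the function name and fixed width are mine, not the project's API:

```python
import numpy as np

def packet_to_row(raw: bytes, width: int = 64) -> np.ndarray:
    # Interpret the raw packet bytes as unsigned 8-bit integers (pixel values)
    arr = np.frombuffer(raw, dtype=np.uint8)
    # Pad short packets (and truncate long ones) to a fixed row width
    if arr.size < width:
        arr = np.pad(arr, (0, width - arr.size))
    return arr[:width]

# Stack every packet row in a session into one image-like array
session = np.stack([packet_to_row(p) for p in [b"\x45\x00\x00\x28", b"\x45\x10"]])
print(session.shape)
```

Reconstruction going the other way is lossy only at the padding boundary, which is why the bullet above mentions padding.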

Target Audience

A researcher who is trying to bridge the gap between AI and cyber defence.

Comparison

CICFlowMeter is one of the most widely used tools for network session feature extraction, which only extracts Flow-level features. My project also involves extracting packet-level features and converting a session to enable the implementation of computer vision algorithms.

[–]morginalium8 0 points1 point  (0 children)

Today I tried to bring an idea to life that I've had on my mind for a long time: an app capable of quickly generating study notes.

You feed it audio, and out comes a set of notes.

It works - RAM usage stays under 1 GB, and the whole thing is built specifically for M1 MacBooks.

I'm currently working on improvements - specifically, implementing a robust fallback system and expanding the selection of available models.

Link: https://github.com/alexkolesnikov08/outloud

I'd be delighted if you stopped by to take a look. Feedback is welcome!

[–]Excellent-Can4839 0 points1 point  (0 children)

Project Name: SimpleRalph

Source Code: https://github.com/Seungwan98/SimpleRalph

What My Project Does

SimpleRalph is a Python CLI that runs a file-driven autonomous coding loop inside any repository. You start with one topic, it creates a local session under .simpleralph/, and it keeps PRD / Tasks / Status / Log explicit on disk while running compile and test gates between iterations.

Current commands:

  • simpleralph init
  • simpleralph run
  • simpleralph status
  • simpleralph export

Target Audience

Developers experimenting with autonomous coding loops who want something inspectable and resumable instead of relying only on hidden chat state. It is still alpha and is currently better suited for early adopters, side projects, and experimentation than production-critical workflows.

Comparison

The main difference is that SimpleRalph keeps loop state explicit in repository-local files instead of hiding it inside one long chat session. It is also AGENTS.md-aware by default and agent-agnostic at the config level through a configurable AGENT_COMMAND.

[–]TheDecipherist 0 points1 point  (0 children)

PeelX - Recursively extract nested archives and run every installer

Every time I build a new PC, driver downloads are ZIPs with RARs inside them. Sometimes three levels deep. 15 drivers = 30 manual extractions + cleanup. So I automated it.

https://github.com/TheDecipherist/peelX

What It Does

Point it at a folder and it extracts everything recursively (up to 50 levels), handles split RARs (.r00–.r99), cleans up all the .sfv/.par2 junk, then drops into a curses UI where you run each installer one by one. It tracks which ones you've already run.
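
The core loop is conceptually simple: keep scanning for archives and extracting until a pass finds none, bounded by the depth limit. A ZIP-only sketch of that idea, assuming the real tool's RAR/split-archive handling is layered on via other libraries:

```python
import zipfile
from pathlib import Path

def extract_recursive(folder: Path, max_depth: int = 50) -> None:
    # Each pass extracts every archive currently present; archives that were
    # nested inside other archives show up on the next pass
    for _ in range(max_depth):
        archives = list(folder.rglob("*.zip"))
        if not archives:
            break
        for archive in archives:
            dest = archive.with_suffix("")  # extract next to the archive
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(dest)
            archive.unlink()  # delete the archive once unpacked
```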

Target Audience

Anyone who installs drivers, firmware, game mods, or software distributed as nested archives. Works on Windows, Linux, macOS, and WSL. Standalone .exe available if you don't want Python involved.

pip install peelx[all]

[–]Aggravating-Gold613 0 points1 point  (0 children)

I built a free open source Qullamaggie breakout scanner in Python:

It is a pre-market stock scanner that analyzes ~2,300 stocks and outputs a self contained HTML dashboard with inline candlestick charts.

Tech stack: Python, pandas, yfinance, matplotlib (for chart generation), Jinja2 (HTML templating). No external APIs or paid services.

Some things that might be interesting from a Python perspective:

- Batch downloads 7,000+ tickers via yfinance in chunks of 500

- ThreadPoolExecutor for parallel profile fetching (8x speedup)

- Matplotlib charts rendered to base64 PNGs and embedded directly in the HTML

- Parquet caching with TTL-based invalidation

- Single self-contained HTML output (no JS dependencies, works offline)

Full pipeline runs in ~14 seconds on cached data. First run ~4 min to download a year of daily price data.
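
The batching and parallelism pieces are generic patterns. A sketch with a stand-in fetch function — the real code calls yfinance, which isn't reproduced here:

```python
from concurrent.futures import ThreadPoolExecutor

def chunked(items, size=500):
    # Yield successive fixed-size chunks so the batch API gets ~500 tickers per call
    for i in range(0, len(items), size):
        yield items[i:i + size]

def fetch_profile(ticker):
    # Stand-in for the real per-ticker profile fetch (network-bound work)
    return {"ticker": ticker}

tickers = [f"T{i:04d}" for i in range(1200)]
batches = list(chunked(tickers))

# I/O-bound fetches overlap well in threads, hence the reported ~8x speedup
with ThreadPoolExecutor(max_workers=8) as pool:
    profiles = list(pool.map(fetch_profile, tickers))

print(len(batches), len(profiles))
```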

GitHub: https://github.com/VladPetrariu/Qullamaggie-breakout-scanner

[–]General-Brilliant697 0 points1 point  (0 children)

[Project] CIPHER - A 9-agent AI security swarm orchestrating tools via message bus architecture (Open Source)

Repo: https://github.com/Daylyt-kb/cipher

I just open-sourced CIPHER. It is a graph-driven orchestration engine built in Python that deploys 9 specialized agents locally.

Built alone with zero budget to bring enterprise-grade security automation to the open source community.

Key technical aspects:

- Multi-agent message bus (Python-native).

- FORGE agent: Generates scripts, validates AST (ast module), and runs in sterile containers.

- Cryptographic scope enforcement (SHA-256 target locks).

- Local-first: No SaaS dependencies or bills.

Feedback on the agent orchestration logic is very welcome!

[–]PatientEither6390 0 points1 point  (0 children)

stv — play any streaming service on your TV from the terminal

What My Project Does

stv play netflix "Dark" s1e1 resolves the episode to a content ID and deep-links into the TV app. About 3 seconds to playback. Works on LG, Samsung, Roku, Android TV. Supports Netflix, Disney+, Prime, and 30+ other platforms via auto-detection — no API key needed.

Also works as a Claude Code tool — just say "play Frieren on the living room TV" mid-session.

Target Audience

Developers / home automation people who want CLI control of their TVs. Production-ready.

Comparison

castnow/ytcast are YouTube-only. lgtv-cli/samsungctl are single-vendor. stv unifies 4 TV platforms + 30 streaming services behind one interface.

GitHub: https://github.com/Hybirdss/smartest-tv PyPI: https://pypi.org/project/stv/

[–]Obvious_Special_6588 0 points1 point  (0 children)

c5tree — C5.0 Decision Tree for Python (sklearn-compatible)

C5.0 was missing from the Python ecosystem entirely — R has had it for years. I built a pure-Python sklearn-compatible implementation with native missing value handling, multi-way categorical splits and built-in pruning.

Benchmarked against sklearn's CART — C5.0 wins on Iris, Breast Cancer and Wine datasets across 5-fold CV.

pip install c5tree

PyPI: https://pypi.org/project/c5tree/ | GitHub: https://github.com/vinaykumarkv/c5tree

[–]ZeusAlight 0 points1 point  (0 children)

Hey 👋

After drowning in newsletters and junk for years, I finally built the tool I always wanted: MailShift — an open-source, privacy-first email cleaner that runs entirely on your machine.

How it works:

• Fast mode: heuristic keyword matching (blacklist/whitelist). Runs offline, zero AI calls.

• Pro mode: two-phase — Fast first, then a local LLM (Ollama/LM Studio) re-checks only the suspicious ones. Your emails never leave your machine.
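
The two-phase design is a cheap filter that only escalates ambiguous cases to the expensive checker. A toy illustration of the pattern — the keyword sets and verdict strings are invented, not MailShift's actual rules:

```python
BLACKLIST = {"unsubscribe", "sale", "newsletter"}
WHITELIST = {"otp", "verification", "invoice"}

def fast_phase(subject: str) -> str:
    words = set(subject.lower().split())
    if words & WHITELIST:
        return "keep"          # safety guard: never delete these
    if words & BLACKLIST:
        return "delete"
    return "suspicious"        # undecided, needs the slow checker

def classify(subject: str, slow_checker) -> str:
    verdict = fast_phase(subject)
    # Only ambiguous mail pays the cost of the local-LLM pass
    return slow_checker(subject) if verdict == "suspicious" else verdict

print(classify("Huge sale today", lambda s: "keep"))
print(classify("Quarterly report", lambda s: "keep"))
```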

Key things I'm proud of:

✓ Attachment protection: emails with attachments are NEVER deleted

✓ Dry-run by default — you preview before anything is touched

✓ Safety guards force-keep OTP/verification mails, premium expiry notices, Drive storage alerts

✓ Unsubscribe detection: auto-unsubscribe or export List-Unsubscribe links after scan

✓ Gmail, Proton Mail (via Bridge), and any custom IMAP server

✓ Credentials stored in OS Keyring (Windows Credential Manager) — never plain text

✓ SQLite cache so repeat scans don't re-fetch everything

The LLM decisions are in Turkish internally ("SIL"=delete, "TUT"=keep) because I'm Turkish and it was my first target audience, but everything else is bilingual.

Install:

pipx install mailshift

mailshift

GitHub: github.com/lynchest/MailShift

Would love feedback — especially on false positive rates and the UX flow. What would make you actually trust a tool like this with your inbox?

[–]False-Marketing-5663 0 points1 point  (0 children)

I've been using Prisma (both in Python and TypeScript) for a while. When the Prisma core team decided to rewrite its core in another language, and thus abandoned the project, I could not find any other ORM that satisfied my needs. That is why a few friends and I have been working on Nautilus.

What my project does

Nautilus is a schema-first ORM toolkit built around a Rust query engine, with generated clients for Python, TypeScript, and Rust.

You define your database schema in a .nautilus file, and Nautilus generates a typed client you can use directly in your application, similar to what Prisma used to do.

Key Features

  • Schema-first design – your database structure is the single source of truth
  • Rust-powered engine – fast query execution via JSON-RPC
  • Client generation – Python, TypeScript, and Rust
  • Multi-database support – PostgreSQL, MySQL, SQLite
  • Migrations & schema diffing built-in
  • CLI tooling – generate, db push, migrate, etc.
  • LSP + VSCode extension for the schema language
  • Studio – A modern and powerful local database editor written in Next.js

Example

type Address {
  street  String
  city    String
  zip     String
  country String
}

model User {
  id        Uuid           @id @default(uuid())
  email     String    
  username  VarChar(30)
  name      String
  balance   Decimal(10, 2) (balance > 0)
  bio       String?
  tags      String[]
  address   Address?
  createdAt DateTime       @default(now()) @map("created_at")
  updatedAt DateTime       @map("updated_at")

  @@index([email], type: Hash)
  @@index([createdAt], type: Brin, map: "idx_users_created")
  @@map("users")
}

nautilus generate

import asyncio
from db import Nautilus

async def main():
    async with Nautilus() as client:
        user = await client.user.create({
            "email": "alice@example.com",
            "username": "alice",
            "name": "Alice",
        })

        found = await client.user.find_unique(
            where={"email": "alice@example.com"}
        )

asyncio.run(main())

Target Audience

  • Python developers who want a higher-level ORM workflow
  • People who like Prisma-style schema-first design
  • Developers working on multi-language backends (Python + TS + Rust)

Comparison

Compared to traditional Python ORMs (like SQLAlchemy or Django ORM):

  • Nautilus is schema-first, not model-first
  • It generates clients instead of relying on runtime reflection
  • It separates:
    • schema
    • query engine
    • client
  • You don’t manually write queries or deal with SQL directly

Compared to Prisma:

  • Similar workflow and philosophy
  • But Nautilus is designed to be language-agnostic, not JS-first
  • Python is a first-class target, not an afterthought
  • In our benchmarks, the Nautilus Python client beats prisma-python

🙌 Contribute or Learn More

Nautilus is open-source and actively evolving.
If you're interested in:

  • contributing features
  • improving the Python client
  • or just exploring the idea

check out the repo:

https://github.com/y0gm4/nautilus

I’d really appreciate any feedback 🙏

[–]bdev06 0 points1 point  (0 children)

Hello everyone,

I'm on a basic subscription plan on different vendors and I kept hitting token limits mid-task, way more than I expected. It's frustrating, and it gets expensive fast. I started noticing a pattern (personal observation): the agent reads whole files (even when a snippet would suffice), the context window floods, it loses track of what it was doing, re-explores, reads more files. Round and round. Eventually I got annoyed enough to build something about it. I've been running CodeRay (see below) at work and on side projects for a while now and gotten decent results – decent enough to share.

The project (CodeRay) is a local code index that gives agents file paths + line ranges instead of whole files. The idea is simple: locate first, then read only the lines that matter.

GitHub: https://github.com/bogdan-copocean/coderay

It exposes three tools:

  • search – semantic search that returns file paths + line ranges
  • skeleton – signatures and docstrings only, each tagged with its line range
  • impact – callers, imports, and inheritors for a symbol before you change it

Works as a CLI or as an MCP stdio server so agents can call it directly. Fully local: no LLM, no network, no API key. Python, JS, and TypeScript for now.

I've seen a 2–3.4× average token reduction on my projects (up to 6× on huge files), but it depends a lot on your codebase and how you/the agent query it.

Still early and rough around the edges. Would love to hear your feedback!

[–]Mindless_Warning8731 0 points1 point  (0 children)

redis-queen: schema migration for redis databases using pydantic models

https://github.com/mahdilamb/redis-queen

I've borrowed some concepts from alembic, but modernised it with pydantic and decorators (and other features that come native to redis like TTL for deleted fields).

Would love some feedback.

[–]Mountain-Wafer2025[🍰] 0 points1 point  (0 children)

I built Spotify2Local: A dead-simple TUI for archiving playlists using uv and Textual

Hey everyone,

For a weekend project, I wanted to build something that actually solved a personal annoyance: getting my curated Spotify playlists onto my running headphones so I could leave my phone at home.

I decided to lean into the modern Python ecosystem for this. It’s powered by uv for lightning-fast environment management and Textual / Rich for the interface. Under the hood, it uses yt-dlp with some custom heuristics to make sure it grabs official studio tracks rather than messy music videos.

Key Tech:

  • uv: Frictionless dependency management.
  • Textual: For a responsive, keyboard-driven TUI.
  • yt-dlp: For the heavy lifting on the audio extraction side.

It’s zero-config (mostly) and handles all the ID3 tagging and high-res cover art automatically. I'd love to hear what you think of the stack!

Repo: https://github.com/mserra0/Spotify2Local

[–]ProblematicSyntax 0 points1 point  (0 children)

Project name: Tomebox

Repo/Website Link: https://github.com/Gravtas-J/TomeBox.git

Description: I got tired of cloud subscriptions and DRM, so I built TomeBox: a completely local, self-hosted Audible manager and streaming server in Python. It combines a powerful desktop application for downloading, converting, and playing your Audible library with a built-in companion web app for streaming to your mobile devices. Featuring on-the-fly DRM decryption, multi-user cross-device progress syncing, and native lock-screen controls, TomeBox gives you complete ownership of your audiobooks without relying on cloud subscriptions.

[–]spacedil 0 points1 point  (0 children)

AIDepShield V2 — scan Python dependencies AND CI/CD workflows for supply chain attacks

Built this after the LiteLLM compromise in March. Existing tools (pip-audit, Snyk, Socket) scan for known CVEs in your dependency tree, but the LiteLLM attack happened through an unpinned GitHub Action — the workflow layer, not the dependency layer.

AIDepShield covers both:

  • Dependency Scanner — checks packages against a verified trust registry. Compromised = FAIL with IOC details. Unknown = REVIEW, never SAFE.
  • CI/CD Sentinel — pattern-matches GitHub Actions workflows for unpinned action refs, write-all permissions, secrets on untrusted triggers, remote script execution, publish without provenance.
  • PyPI Monitor — watches 20+ AI-critical packages (openai, anthropic, langchain, transformers, torch, etc.) for suspicious new releases.

Quick scan:

curl -X POST https://api.aidepshield.dev/scan \
  -H "Content-Type: application/json" \
  -d '{"packages": [{"name": "litellm", "version": "1.65.3-post1"}]}'

Self-host: docker run -p 8080:8080 aidepshield/aidepshield:v2

IOC feed is free, no auth: GET https://api.aidepshield.dev/iocs

GitHub: https://github.com/dilipShaachi/aidepshield

Feedback welcome — especially on what CI/CD patterns we're missing.


[–]kaliman2go 0 points1 point  (0 children)

pimp-my-repo - Instantly modernize legacy Python repos

I built this because adding strict linting to an old project is usually a nightmare. Most developers give up because they don't have time to fix hundreds of inherited errors just to get a passing build. This tool fixes that friction.

What My Project Does

It’s a CLI tool that automates the migration of any Python repo to a modern stack (uv, ruff, and strict mypy).

The "magic trick" is that it doesn't just add a config file; it runs the linters and automatically injects # noqa or # type: ignore comments into every existing violation. This gives you a passing CI on "strict" mode immediately, allowing you to stop the bleeding for new code and fix the old debt incrementally as you touch the files.
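
Mechanically, the injection step amounts to grouping linter violations by file and line, then appending a targeted suppression comment to each offending line. A simplified sketch, assuming violations arrive as (path, line_number, rule_code) tuples — the real tool parses actual ruff/mypy output:

```python
from collections import defaultdict
from pathlib import Path

def inject_noqa(violations):
    # Group rule codes by file, then by line, so each line gets one comment
    by_file = defaultdict(lambda: defaultdict(set))
    for path, line_no, code in violations:
        by_file[path][line_no].add(code)
    for path, lines in by_file.items():
        src = Path(path).read_text().splitlines()
        for line_no, codes in lines.items():
            # Append a rule-specific suppression so the debt stays visible in diffs
            src[line_no - 1] += "  # noqa: " + ",".join(sorted(codes))
        Path(path).write_text("\n".join(src) + "\n")
```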

Target Audience

  • Developers maintaining legacy codebases who want to adopt modern tooling without a massive cleanup phase.
  • Teams moving from pip/poetry to uv.

Comparison

  • vs. manual setup: Instead of spending hours configuring pyproject.toml and resolving initial conflicts, this does it in one command.
  • vs. ruff's baseline file: I’ve found that putting suppressions directly in the code (with # noqa) makes the tech debt more visible and easier to clean up during regular PRs compared to a hidden baseline file.

Try it out: uvx pimp-my-repo

GitHub: https://github.com/asaf-kali/pimp-my-repo

[–]Fearless_Grass7325 0 points1 point  (0 children)

Project Showcase: EZMO AD Command Center

Target Audience

IT Support Specialists, System Administrators, and Managed Service Providers (MSPs) managing Windows Server environments.

The Problem

Traditional Active Directory management relies on heavy Microsoft RSAT tools or repetitive PowerShell scripting. This creates a bottleneck for Helpdesk technicians trying to quickly resolve high-volume, low-complexity Tier 1 tickets like account lockouts and password resets.

The Solution

The EZMO AD Command Center is a lightweight, self-contained Python desktop application that replaces clunky legacy tools with a fast, modern graphical interface. It automates repetitive AD tasks, accelerates ticket resolution times, and enforces strict domain security standards without requiring elevated local privileges.

Core Capabilities

  • Rapid User Management: Instantly locate accounts via Logon ID or Asset Tag to execute unlocks, profile edits, and secure, auto-generated password resets.
  • Streamlined Provisioning: A dedicated interface for creating fully formatted Active Directory accounts and assigning them to the correct Organizational Units with enforced baseline security policies.
  • 1-Click Security Auditing: Built-in domain sweeps that instantly identify critical vulnerabilities, including stale accounts (90+ days inactive), unauthorized privileged users, and accounts lacking password requirements.
  • Enterprise Architecture: Engineered with a silent background auto-updater and an HMAC-SHA256 cryptographic node-locked licensing system that binds application access to authorized hardware.

https://github.com/zikaos/AD-Command-centre

[–]Punk_Saint 0 points1 point  (0 children)

Hazel : Automatic Photo Album Organizer

GitHub →

What My Project Does

Hazel is a desktop application for photographers that takes a raw folder of unsorted images, an SD card dump, a phone backup, a camera download, and reorganizes them into a clean, dated archive grouped by shoot, without any manual dragging or renaming.

It reads the EXIF metadata embedded in each file to determine when the photo was taken, groups consecutive shots into sessions based on gaps in time, and produces a folder structure like this:

2024 /
  June /
    2024-06-14_session-001 /
      raw /   ← CR2, NEF, ARW, DNG
      image / ← JPG, JPEG, HEIC
    2024-06-14_session-002 /
      image /
  August /
    2024-08-03_session-001 /
      video / ← MP4, MOV
      image /

A "session" is a group of shots taken within a configurable time window of each other (default: 45 minutes). A gap longer than that starts a new session, so a morning street walk and an evening portrait shoot on the same day become two separate folders automatically.
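
The session-gap grouping is a single chronological sweep: sort the timestamps, then break whenever the gap to the previous shot exceeds the window. The 45-minute default comes from the description above; the function itself is my reconstruction of the algorithm, not Hazel's code:

```python
from datetime import datetime, timedelta

def group_sessions(timestamps, gap=timedelta(minutes=45)):
    sessions = []
    for ts in sorted(timestamps):
        # Continue the current session if this shot is within the gap window
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])   # gap exceeded: start a new session
    return sessions

shots = [
    datetime(2024, 6, 14, 9, 0),    # morning street walk
    datetime(2024, 6, 14, 9, 20),
    datetime(2024, 6, 14, 18, 30),  # evening portrait shoot
]
print(len(group_sessions(shots)))
```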

Everything is non-destructive by default. A Preview mode shows the exact folder layout Hazel will build before any file is touched. You can also Copy instead of Move, which leaves your originals exactly where they are and builds the archive alongside them. A Revert function undoes the last move operation if you change your mind.

Beyond the core organizer, Hazel includes a small toolkit for day-to-day photo work:

  • EXIF Viewer: shows camera, lens, shutter, aperture, ISO, focal length, GPS, and white balance for any photo you pick
  • Duplicate Finder: MD5-hashes every file in a folder and reports groups of identical files with the disk space each group wastes
  • Storage Stats: a size breakdown of your archive by year and file type, with bar charts
  • Unpaired RAW/JPG Finder: matches RAW files against JPGs by filename stem and reports anything missing a pair, useful after a cull
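
The duplicate finder's approach — hash every file, report hash collisions — is easy to sketch. This is the generic technique, not Hazel's exact implementation:

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def find_duplicates(folder):
    # Group file paths by the MD5 digest of their contents
    groups = defaultdict(list)
    for path in Path(folder).rglob("*"):
        if path.is_file():
            digest = hashlib.md5(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Any digest with more than one path is a set of identical files
    return {h: paths for h, paths in groups.items() if len(paths) > 1}
```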

The app runs entirely on your machine. No account, no cloud upload, no subscription. Your files go nowhere.

Target Audience

Hazel is aimed at hobbyist and semi-professional photographers who shoot regularly and accumulate large, disorganized import folders over time. It is particularly useful for people who:

  • Shoot with multiple cameras or devices whose files end up in the same dump folder
  • Want an archive organized by shoot rather than by calendar month
  • Shoot both RAW and JPG and want them separated into subfolders automatically
  • Prefer a tool they can run locally without granting any third-party service access to their photos

It is not a DAM (Digital Asset Manager), not a cataloging tool, and not a replacement for Lightroom's folder structure. It does one thing: take a flat import folder and produce a clean, dated archive. Users who need keywording, ratings, facial recognition, or cloud sync should look elsewhere.

Comparison

Most photographers currently solve this problem in one of four ways:

| Approach | How it works | The problem |
|---|---|---|
| Manual renaming/dragging | Rename and move files by hand in Finder or Explorer | Time-consuming and error-prone at scale |
| Lightroom / Capture One import | The DAM's importer copies files and builds a folder structure | Requires owning the software and importing into a catalog; overkill if you just want a folder archive |
| Photo Mechanic | Professional ingest tool with fast browsing and renaming | Paid, aimed at press photographers, more than most hobbyists need |
| digiKam / other open-source DAMs | Full-featured catalog with an importer | Steep learning curve, catalog lock-in, often Linux-first |

Hazel differs in that it is narrowly scoped. It has no catalog, no database, and no proprietary format. The output is plain folders and files organized the way you would have done it by hand, only faster. The session-gap algorithm is the core feature that makes it different from simple date-based importers: rather than dumping everything into 2024-06-14/, it splits that day into individual shoots based on the actual gaps between shots.

It is free and open-source, runs on Windows, macOS, and Linux, and requires no installation beyond downloading the executable.

GitHub →

[–]Alternative_Feed9546 0 points1 point  (0 children)

Showcase: contextweaver — stdlib-only Python library for deterministic context assembly

What My Project Does

I recently open-sourced contextweaver, a Python library for assembling bounded context/prompt packs from a larger set of items.

The motivating use case is tool-using AI agents, where prompts often grow by accumulating conversation turns, tool schemas, and tool outputs. Instead of concatenating everything, contextweaver builds a context pack under a fixed budget by selecting, filtering, deduplicating, and packing the most relevant items.

From a Python engineering point of view, the parts I focused on most were:

  • stdlib-only runtime
  • strict typing
  • deterministic behavior
  • protocol-based interfaces
  • clear separation between I/O-capable stages and pure computation

A few design choices I’m particularly interested in feedback on:

  • store interfaces are defined with typing.Protocol
  • the context pipeline is async-first, with sync wrappers for simpler use
  • the output is deterministic for the same input
  • large payloads can be kept out of band and replaced with compact summaries

Example:

from contextweaver.context.manager import ContextManager
from contextweaver.types import ContextItem, ItemKind, Phase

mgr = ContextManager()

mgr.ingest(ContextItem(
    id="u1",
    kind=ItemKind.user_turn,
    text="Hello"
))

pack = mgr.build_sync(phase=Phase.answer, query="greeting")

print(pack.prompt)
print(pack.stats)

Target Audience

This is intended mainly for:

  • developers building tool-using AI/LLM systems
  • people who care about library design in Python
  • engineers who want a small, typed, dependency-light library rather than a larger framework

It is meant to be usable in real applications, but I would still describe it as an early-stage library rather than something I’m claiming is already a mature standard.

Even if the AI use case is not interesting to you, I think some of the Python design tradeoffs may still be relevant if you enjoy thinking about protocols vs ABCs, async/sync API boundaries, deterministic pipelines, and composable library structure.

Comparison

contextweaver is not a full agent framework, and it is not tied to any model provider.

Compared with larger AI frameworks, the goal here is a much smaller and narrower library:

  • it focuses on context assembly
  • it does not do model inference
  • it does not try to own orchestration end to end
  • it keeps runtime dependencies at zero

Compared with a naive “just concatenate everything” approach, it tries to preserve relevance and dependency structure while staying inside a hard budget.

Compared with retrieval-only approaches, it is not trying to be a vector search system. It is more about deterministic assembly rules over known context items and their relationships.

I have not done a broad benchmark yet against other context-selection approaches, so I’m trying to be careful not to overclaim there.

A few implementation details:

  • Python 3.10+
  • zero runtime dependencies
  • 536 tests
  • mypy --strict

Repo: https://github.com/dgenio/contextweaver

I’d especially appreciate feedback from Python library authors on:

  • Protocol vs ABCs here
  • async-first internals with sync wrappers
  • whether the stage decomposition sounds clean or over-engineered

[–]dusktreader 0 points1 point  (0 children)

Background

A while ago, I built a toolkit called typerdrive for writing stateful CLI apps backed by Pydantic settings models. A pain point I kept running into was that collecting those settings from a new user is tedious: either the user dumps all the settings into a single CLI command with a wall of flags, or the developer hand-rolls an interactive collection tool.

I built an internal developer tool at work using typerdrive, and for the initial settings I built a custom wizard to populate them. It turned out really well, but I found myself a bit frustrated with the process. The settings model already had all the type hints and constraints, so it sucked to reproduce that in the wizard code. I felt like there should be a tool that reads those and drives the wizard automatically.

So I built it and called it wizdantic. It took waaay longer than I expected, but I ended up with something I really like. I also wrote a post about it on my blog that goes a little deeper. Give it a read and let me know what you think!

What My Project Does

wizdantic turns any Pydantic model into an interactive terminal wizard with a single function call:

```python
from pydantic import BaseModel, Field
from typing import Annotated
from wizdantic import run_wizard

class ServerConfig(BaseModel):
    host: Annotated[str, Field(description="Hostname")]
    port: Annotated[int, Field(description="Port")] = 8080
    debug: Annotated[bool, Field(description="Enable debug mode")] = False

config = run_wizard(ServerConfig)
```

It inspects your type annotations and picks the right prompt for each field:

  • text input for scalars
  • y/n confirm for booleans
  • numbered menus for enums and Literal types
  • masked input for SecretStr
  • JSON or CSV input for collections.

Every value is validated inline through Pydantic's TypeAdapter. Bad input gets an error and a re-prompt. After it's done, you get back a fully constructed and validated model instance.

Wizdantic can also handle nested models by recursing into sub-wizards automatically.

For customization, WizardLore lets you override format hints, plug in a custom parser, or group fields under named section headings.

Available on PyPI:

uv add wizdantic

If you want to kick the tires before installing, the demo covers the full range of supported field types and you can run it without installing anything:

uvx --from="wizdantic[demo]" wizdantic-demo

Target Audience

Python developers who build CLI tools and already use Pydantic to model their configuration or input data. Particularly useful if you're using typerdrive or any other framework where settings are Pydantic models.

Comparison

A few existing options exist for building terminal wizards in Python:

questionary and InquirerPy

These are solid libraries, but you have to define each prompt by hand. There's no connection to your data model, so you end up writing validation logic twice.

click

click.prompt() covers the scalar case fine, but you're still wiring each field manually and there's no Pydantic integration.

pydantic-cli

This drives a CLI from a Pydantic model, but through flags rather than an interactive wizard.

pydantic-wizard

This is the closest thing to a direct comparison. It drives prompts from a Pydantic model, supports nested models and a good range of types, and adds YAML serialization and a Streamlit web UI on top. The main differences:

  • uses questionary under the hood rather than Rich
  • targets a config-file workflow (write to YAML, load back, edit) rather than in-process use
  • has its own TypeHandler extension point where wizdantic uses WizardLore annotations

It's worth a look if YAML round-tripping is what you're after.


Wizdantic's angle is that the model is the wizard configuration. If you already have a Pydantic model, you don't have to write any additional prompt configuration.

Feedback and PRs welcome!

https://github.com/dusktreader/wizdantic

[–]kargarisaaac 0 points1 point  (0 children)

Lerim — background memory agent for coding workflows

What My Project Does: Lerim is a background memory agent for coding workflows. It watches coding-agent sessions and builds reusable project memory automatically.

Why it is different: It gives Claude-like auto-memory benefits, but without vendor lock-in. You can switch agents and keep memory continuity.

Target Audience: Developers using coding agents across multiple repos.

Comparison: Most tools are memory infrastructure and they do not actively extract memories from your sessions. Lerim is workflow-native: extract + consolidate + project stream status.

How to use: You can install it via pip and then use it as a skill.
```bash
pip install lerim
lerim up
lerim status
lerim status --live
```

Repo: https://github.com/lerim-dev/lerim-cli

blog post: https://medium.com/@kargarisaac/lerim-v0-1-72-a-simpler-agentic-memory-architecture-for-long-coding-sessions-f81a199c077a

[–]barnakun 0 points1 point  (0 children)

Tool that tests whether a Python dep upgrade breaks your code and cites the exact changelog entry

Python dependency upgrades are uniquely painful. Major version bumps (Pydantic v1→v2, requests 2→3, SQLAlchemy 1.4→2.0) often involve API surface changes that your tests don't catch until someone runs them.

I built Migratowl to automate this. You give it a repo URL, it:

  1. Scans your pyproject.toml / requirements.txt for outdated packages
  2. Bumps them all and runs pytest (or your configured test command) inside a sandboxed Kubernetes pod
  3. When tests fail, an AI agent reads the traceback, assigns a confidence score to each culprit package, fetches the relevant changelog section, and writes a plain-English fix suggestion

Example output for a requests 2.x → 3.x migration:

{
  "dependency_name": "requests",
  "is_breaking": true,
  "error_summary": "ImportError: cannot import name 'PreparedRequest'",
  "changelog_citation": "## 3.0.0 — Removed PreparedRequest from the public API.",
  "suggested_human_fix": "Replace `from requests import PreparedRequest` with `requests.models.PreparedRequest`.",
  "confidence": 0.95
}

It supports Python, but also Node.js, Go, Rust, and Java — useful if you have a polyglot repo.

I'm a Python dev myself and the langchain-anthropic + LangGraph stack was interesting to build this on. The agent graph has a confidence-scoring phase that decides whether to run packages in bulk (fast) or spawn isolated subagents (accurate) — happy to discuss that design if anyone's curious.

Repo: https://github.com/bitkaio/migratowl

[–]BidForeign1950 0 points1 point  (0 children)

The library that evaluates Python functions at points where they're undefined.

A few months ago I published a highly experimental and rough calculus library. This is the first proper library built on that concept.

It automatically handles cases where function execution would normally fail at a singularity: it checks whether the limit exists and, if so, substitutes the limit as the result.

It also lets you check and validate Python functions in a few different ways, to see whether limits exist, diverge, etc.

For example the usual case:

def sinc(x):                                                                                                                      
    if x == 0:                                                
        return 1.0  # special case, derived by hand
    return math.sin(x) / x 

Can now be:

@safe
def sinc(x):
    return math.sin(x) / x

sinc(0.5)  # → 0.9589 (normal computation)
sinc(0)    # → 1.0 (singularity resolved automatically)

Normal inputs run the original function directly, zero overhead. Only when it fails (ZeroDivisionError, NaN, etc.) does the resolver kick in and compute the mathematically correct value.
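
The fallback pattern is roughly this (a crude numeric sketch of the idea; the actual library does proper limit analysis, not a two-point average):

```python
import math

def safe_fallback(fn, eps=1e-7):
    """Toy version of the @safe idea: run fn normally, and only on
    failure estimate the two-sided limit numerically."""
    def wrapper(x):
        try:
            y = fn(x)
            if not math.isnan(y):
                return y
        except (ZeroDivisionError, ValueError):
            pass
        # crude limit estimate: average the approaches from both sides
        return (fn(x + eps) + fn(x - eps)) / 2
    return wrapper

@safe_fallback
def sinc(x):
    return math.sin(x) / x

print(sinc(0.5))  # normal path, ~0.95885
print(sinc(0))    # resolved via the fallback, ~1.0
```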

It works for any composable function:

resolve(lambda x: (x**2 - 1) / (x - 1), at=1)   # → 2.0
resolve(lambda x: (math.exp(x) - 1) / x, at=0)  # → 1.0
limit(lambda x: x**x, to=0, dir="+")            # → 1.0
limit(lambda x: (1 + 1/x)**x, to=math.inf)      # → e

It also classifies singularities, extracts Taylor coefficients, and detects when limits don't exist. Works with both math and numpy functions, no import changes needed.

Pure Python, zero dependencies.

I have tested it to the best of my abilities, but there are surely some hidden traps left, so I'd welcome community scrutiny. :)

pip install composite-resolve

GitHub: https://github.com/FWDhr/composite-resolve

PyPI: https://pypi.org/project/composite-resolve/

[–]id3ntifying 0 points1 point  (0 children)

secretsh — run shell commands with secrets without leaking them

What My Project Does
A small tool (with Python bindings) that lets LLM/agent workflows execute shell commands without exposing credentials to the model, logs, or stdout.

  • Secrets stored in an encrypted vault
  • Commands use placeholders like {{API_KEY}}
  • Resolved only at execution time (no sh -c)
  • Output is scanned and secrets are redacted if they appear

Example:

Agent:    curl -H "Authorization: Bearer {{API_KEY}}" https://api.example.com
Exec:     curl -H "Authorization: Bearer sk-abc123" https://api.example.com
Return:   curl -H "Authorization: Bearer [REDACTED_API_KEY]" https://api.example.com

Python usage:

with secretsh.Vault(master_key_env="SECRETSH_KEY") as vault:
    vault.set("API_KEY", bytearray(b"sk-abc123"))
    result = vault.run("curl -H 'Authorization: Bearer {{API_KEY}}' https://api.example.com")
    print(result.stdout)
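
The post-exec redaction step can be pictured as a simple scan-and-replace over the output (illustrative only, not secretsh's actual implementation):

```python
def redact(output: str, secrets: dict[str, bytes]) -> str:
    # replace any secret value that leaked into the output with a
    # named placeholder, so logs and model context stay clean
    for name, value in secrets.items():
        output = output.replace(value.decode(), f"[REDACTED_{name}]")
    return output

print(redact("Bearer sk-abc123", {"API_KEY": b"sk-abc123"}))
# Bearer [REDACTED_API_KEY]
```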

Comparison
Most approaches rely on env vars or string substitution, which still leak into logs, shell history, or model context.
This keeps secrets out of the command string entirely and adds post-exec redaction as a fallback.

Repo: https://github.com/lthoangg/secretsh

[–]Mountain_Economy_401from __future__ import 4.0 0 points1 point  (0 children)

Built an open-source Qt6 / PySide6 bridge for OsmAnd offline maps

Project: PySide6-OsmAnd-SDK
GitHub: https://github.com/OliverZhaohaibin/PySide6-OsmAnd-SDK

What My Project Does

PySide6-OsmAnd-SDK is an open-source integration project for embedding OsmAnd .obf offline map rendering into Qt6 / PySide6 desktop applications.

Current functionality includes:

  • offline map rendering using OsmAnd .obf data
  • a native widget integration path
  • a Python/helper-backed rendering path
  • support for Windows and Linux
  • a runnable preview application and build scripts for local testing

The project is focused on making offline desktop map integration easier for Qt/Python developers.

Target Audience

This project is intended for:

  • developers building desktop GIS tools
  • developers working on offline map viewers
  • teams using Qt6 / PySide6 in desktop applications
  • developers who need an offline-first desktop mapping stack

It is not currently aimed at non-technical end users, and it is not yet a lightweight pip install style package. It is better described as a developer-oriented SDK workspace that can serve as a foundation for real desktop applications.

Comparison

This project differs from existing alternatives in the following ways:

  • Compared with the upstream OsmAnd core repositories, this project is focused on desktop integration in Qt6 / PySide6, not just the native engine itself.
  • Compared with older Qt5-based OsmAnd integrations, it is designed around modern Qt6 workflows.
  • Compared with other Qt/Python mapping stacks such as MapLibre-based approaches, it is specifically built around OsmAnd .obf offline data and offline desktop rendering.
  • Compared with lightweight Python map packages, it is more focused on embedding a native offline map stack into desktop software than on providing a simple pure-Python mapping API.

[–]vavosmith 0 points1 point  (0 children)

Built a Python local-first operator framework focused on crash recovery, approval gates, and idempotent actions

What My Project Does
Zyrcon is a Python local-first operator framework designed for workflows where reliability matters more than flashy demos. The current focus is crash recovery, approval-gated actions, and avoiding duplicate side effects on retries.
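
For context, "avoiding duplicate side effects on retries" usually means an idempotency-key pattern along these lines (a generic sketch, not Zyrcon's actual API; a real framework would use durable storage, not a dict):

```python
import hashlib
import json

_results: dict[str, object] = {}  # stand-in for durable storage

def run_once(action_name: str, params: dict, fn):
    """Derive a stable key from the action and its params; if the
    action already ran (e.g. before a crash), return the recorded
    result instead of repeating the side effect."""
    raw = json.dumps([action_name, params], sort_keys=True).encode()
    key = hashlib.sha256(raw).hexdigest()
    if key in _results:
        return _results[key]
    result = fn(**params)
    _results[key] = result
    return result
```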

Target Audience
Developers building automation, tool-running assistants, or local workflow systems who want safer and more understandable execution behavior.

What I’d Like Feedback On

  • whether the architecture makes sense
  • whether the docs explain it clearly
  • what should be simplified or benchmarked next

Repo: https://github.com/zyrconlabs/cascadia-os

[–]Sad-Dig2112 0 points1 point  (0 children)

VoxelKit: A small CLI + Python library for inspecting and sanity-checking multidimensional imaging datasets.

As part of working with imaging data (NIfTI, HDF5, NumPy, etc.), I kept running into the same issue: just wanting to quickly check shape, preview a slice, or sanity-check data, and ending up writing small throwaway scripts every time.

Eventually I decided to just build something for it.

https://github.com/ArsalaanAhmad/VoxelKit

What My Project Does

Provides a simple CLI + Python interface for:

- inspecting dataset structure (shape, dtype, etc.)

- quick previews (2D / slice-based for 3D)

- dataset QA (min/max, NaNs, zero fraction, etc.)

- batch reporting across folders (WIP)

Example:

voxelkit report scan.nii.gz

voxelkit preview data.h5 --dataset image --output out.png
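
For a sense of what it automates, here's the kind of stdlib-only throwaway QA snippet I kept rewriting (not VoxelKit's actual code):

```python
import math

def quick_qa(values):
    # basic stats sanity checks on a flat list of voxel values
    finite = [v for v in values if not math.isnan(v)]
    n = len(values)
    return {
        "n": n,
        "min": min(finite),
        "max": max(finite),
        "nan_fraction": sum(math.isnan(v) for v in values) / n,
        "zero_fraction": sum(v == 0 for v in finite) / n,
    }

print(quick_qa([0.0, 1.0, float("nan"), 2.0]))
```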

Target Audience

People working with:

- medical imaging

- geospatial / satellite data

- bioimaging

- ML pipelines dealing with multidimensional arrays

Basically anyone who ends up writing quick scripts just to inspect data!

Comparison

Tools like nibabel / h5py are great, but they're low-level, so you still end up writing small scripts for common tasks.

This is meant to sit on top of those and make quick inspection + QA easier.

Still early, but would genuinely appreciate feedback from anyone working with this kind of data!

[–]efalk 0 points1 point  (0 children)

I know that the removal of the cgi module is old news, but it took until this last week to bite me.

Rather than re-write my apps, I just ginned-up my own implementation at https://github.com/efalk/fieldstorage.

Yes, I get that the entire module is deprecated, and I get that there are now better ways to do things, but I have a number of simple web apps that are not performance-critical. I decided it was easier to write a replacement for FieldStorage once than to track down and port all my little apps and re-write them.

This version uses urllib.parse.parse_qsl() and email.message to do the heavy lifting. This implementation merely assembles the results into a form compatible with the original API.
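
For instance, parse_qsl() already handles the query-string/form-body decoding that FieldStorage needs, including repeated keys:

```python
from urllib.parse import parse_qsl

# repeated keys come back as separate pairs, which maps naturally
# onto FieldStorage's getlist()-style behavior
pairs = parse_qsl("name=Ada&lang=python&lang=c")
print(pairs)  # [('name', 'Ada'), ('lang', 'python'), ('lang', 'c')]
```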

Posting this in case someone else finds it useful.

[–]FieldBus_AI 0 points1 point  (0 children)

PyTeslaCoil: A Python-Native Tesla coil design calculator (NiceGUI + Pydantic)

I've been building PyTeslaCoil, an open-source, Python-native alternative to JavaTC, the closed-source tool by Bart Anderson that the Tesla coil hobbyist community has relied on for years.

GitHub Repo, Live Demo, PyPI: pip install pyteslacoil

What My Project Does

PyTeslaCoil computes the physics needed to design a working Tesla coil before you wind any wire. Given inputs like wire gauge, coil dimensions, transformer ratings, and topload geometry, it returns the electrical parameters that determine whether the coil will resonate and produce sparks.

It handles:

  • Secondary: inductance, self-capacitance, resonant frequency, Q, impedance (Medhurst coefficients included)
  • Primary: flat spiral, helical, and conical geometries
  • Topload: toroid and sphere capacitance
  • Tuning: coupling coefficient (k) with auto-adjust and frequency matching
  • Misc: transformer sizing, spark length estimation, static/rotary gap math
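
As a taste of the physics involved, the simplest of those topload formulas is the isolated-sphere capacitance (my own sketch of the textbook formula, not code from the project):

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m

def sphere_capacitance_pf(radius_m: float) -> float:
    # C = 4 * pi * eps0 * r for an isolated conducting sphere,
    # returned in picofarads
    return 4 * math.pi * EPS0 * radius_m * 1e12

print(sphere_capacitance_pf(0.15))  # ~16.7 pF for a 30 cm sphere
```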

The UI is built with NiceGUI; data models use Pydantic. Run it locally via pip, or try the hosted demo.

Target Audience

Tesla coil hobbyists who currently design with JavaTC and want an open, scriptable alternative, plus Python developers who'd like the same calculations available as an importable library.

Comparison

  • vs. JavaTC: The de facto standard, but closed-source freeware with no repo, license, or contribution path. PyTeslaCoil is MIT-licensed, pip-installable, and usable as a library or standalone app.
  • Why NiceGUI over Streamlit: Streamlit's rerun model fights you when inputs are deeply interdependent (change the secondary, primary must retune; topload must re-solve). NiceGUI's event-driven model handles this cleanly.

Feedback wanted

  1. Python-side: architecture, packaging, anything that'd make it easier to contribute to or use as a library
  2. UX: Could the NiceGUI interface be improved?

Also happy to hear what you'd want added.

[–]eternal-127 0 points1 point  (0 children)

docwow — pure Python DOCX ↔ HTML conversion with lossless round-trip

pip install docwow | https://github.com/py-prit/docwow | https://docwow.readthedocs.io

---

What My Project Does

Converts Word documents to self-contained HTML and back again without losing anything: paragraph formatting, tables, lists, inline and floating images, footnotes, comments, track changes, bookmarks, headers/footers, TOC, field codes, the lot. Also converts arbitrary HTML from any source (CMS, rich text editor, email) to DOCX on a best-effort basis, and provides a full programmatic API for reading, editing, and building Word documents without touching XML.

import docwow
html = docwow.to_html("report.docx")          # DOCX → HTML
docwow.to_docx(html, "restored.docx")          # HTML → DOCX (lossless)
docwow.to_docx("<h1>Title</h1>", "out.docx", is_foreign_html=True)  # any HTML → DOCX
doc = docwow.open("report.docx")
doc.paragraphs[0].set_text("New title").set_style("Heading1")
doc.save("updated.docx")

Stress-tested against 176 real-world DOCX files from the Apache POI corpus — 159/176 round-trip with zero data loss (the other 17 are encrypted/password-protected). 2,552 tests, ≥90% coverage.

---

Target Audience

Developers building document pipelines, web apps that preview or edit Word files in the browser, or anything that needs to programmatically generate or transform DOCX files.

---

Comparison

Most libraries handle one direction: rendering DOCX to HTML, or writing DOCX programmatically. The gap docwow fills is the round-trip: converting to HTML and back with no data loss, which requires preserving Word metadata through the HTML representation. It does this by embedding Word-specific values in data-dw-* attributes alongside visual CSS, so the browser renders correctly and the round-trip reconstructs the original XML exactly.

[–]TheGreenGamer344 0 points1 point  (0 children)

3d renderer through python/pygame

I'm not a super good programmer, and I'm also new to this subreddit, so bear with me lol.
It's a simple 3d renderer that projects 3d meshes onto your screen. You can initialize objects and give them specific properties like position, size, rotation, and even create a custom polygon with custom points and faces. It also has full camera controls. It does not yet have lighting or the ability to (easily) import stls/blend files externally, though I plan to add it.
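
For anyone curious, the core of a renderer like this is a perspective projection mapping each 3D point onto the screen, something like (generic math, not necessarily this project's code):

```python
def project(point, fov=256.0, viewer_distance=5.0):
    # scale x and y by how far the point is from the viewer:
    # farther points shrink toward the center of the screen
    x, y, z = point
    factor = fov / (z + viewer_distance)
    return (x * factor, y * factor)

print(project((1.0, 2.0, 0.0)))  # (51.2, 102.4)
```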

github

I made this project just for fun, so I don't know who would use this or how it compares to alternatives.

[–]hatemhosny 0 points1 point  (0 children)

diagrams-js - Cloud architecture diagrams as code https://diagrams-js.hatemhosny.dev

diagrams-js is an open-source library that allows you to draw cloud architecture diagrams as code.

It is a TypeScript port of the popular Python diagrams library.

[–]Remarkable_Depth4933It works on my machine 0 points1 point  (0 children)

Hey everyone,

I wanted to share a tool I've been working on called FoxPipe. 🦊

It’s a minimalist CLI utility designed for end-to-end encrypted, optionally compressed data transfer between machines. I built it because I often needed to move data (like SQL dumps or log streams) between servers without the overhead of setting up full VPNs or accounts, but wanted more security than a raw nc (netcat) pipe.

🚀 Key Features:

  • Secure by Design: Uses AES-256-GCM for authenticated encryption and Scrypt for strong key derivation.
  • Streaming Compression: Built-in zlib compression to save bandwidth on large transfers.
  • Safety Guards: Includes session timeouts, handshake authentication (HMAC-SHA256), and safe decompression limits to prevent "zip bombs."
  • Dead Simple: No accounts, no configs. Just a shared password.
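
The KDF side of this is available directly in the stdlib; roughly (cost parameters here are illustrative, not FoxPipe's actual ones):

```python
import hashlib

def derive_key(password: bytes, salt: bytes) -> bytes:
    # scrypt stretches a shared password into 256-bit AES key material;
    # n/r/p trade memory and CPU cost against derivation speed
    return hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32)

key = derive_key(b"your-password", b"\x00" * 16)  # real code uses a random salt
print(len(key))  # 32 bytes -> AES-256 key material
```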

🛠️ Quick Start:

1. Install via PyPI: pip install foxpipe

2. On the Receiver: foxpipe receive 8080 -p "your-password" > backup.sql

3. On the Sender: cat backup.sql | foxpipe send <IP> 8080 -p "your-password"

📦 Source & Links:

I'm looking for feedback on the protocol design and any features you think a modern "secure pipe" should have.

Build. Break. Secure. 🦊

[–]Wide_Mail_1634 -1 points0 points  (0 children)

Showcase threads are usually where the weirdly useful stuff shows up, way more interesting than another benchmark post. Always fun seeing small Python tools that solve one annoying problem cleanly, especially when they stick to stdlib or keep deps minimal.

[–]Chunky_cold_mandala -3 points-2 points  (0 children)

GitGalaxy- A hyper-scale static analyzer & threat-hunting engine built on DNA sequencing principles

What my project does -

GitGalaxy is a two-part ecosystem. It is designed to extract the structural DNA of massive software repositories and render their non-visual architecture into measurable, explorable 3D galaxies.

1. The blAST Engine - The galaxyscope (Backend): A hyper-scale, language-agnostic static analysis CLI. Based on 50 years of bioinformatics and genetic sequencing algorithms, it parses code at ~100,000 LOC/second. It outputs rich JSON telemetry, SQLite databases, and low-token Markdown briefs optimized for AI-agent workflows.

2. The Observatory (Frontend): Drop your galaxy.json into the free viewer at GitGalaxy.io or use the repo's airgap_observatory, a standalone, zero-telemetry WebGPU visualizer. Both visualizers read the JSON contract and render the entire codebase as a procedural 3D galaxy where files are stars, allowing humans to visually map scale and risk exposure instantly.

Live Demo: View 3D galaxy examples of Apollo-11, Linux, TensorFlow and more at GitGalaxy.io. GitHub: https://github.com/squid-protocol/gitgalaxy

The blAST Paradigm: Sequencing the DNA of Software

Traditional computer science treats software like a rigid blueprint, using slow, language-specific Abstract Syntax Trees (ASTs) to analyze code. GitGalaxy treats code as a sequence to be scanned and then analyzed for patterns and occurrences using the blAST (Broad Lexical Abstract Syntax Tracker) engine.

By applying the principles of biological sequence alignment to software, blAST hunts for the universal structural markers of logic across ~40 languages and ~250 file extensions. We translate this genetic code into "phenotypes"—measurable risk exposures.

Sequencing at Hyper-Scale

By abandoning the compiler bottleneck, blAST achieves processing velocities that traditional AST pipelines cannot match. In live telemetry tracking across the largest open-source ecosystems, blAST demonstrated its scale:

  • Peak Velocity: Sequenced the 141,445 lines of the original Apollo-11 Guidance Computer assembly code in 0.28 seconds (an alignment rate of 513,298 LOC/s).
  • Massive Monoliths: Chewed through the 3.2 million lines of OpenCV in just 11.11 seconds (288,594 LOC/s).
  • Planetary Scale: Effortlessly mapped the architectural DNA of planetary-scale repositories like TensorFlow (7.8M LOC), Kubernetes (5.5M LOC), and FreeBSD (24.4M LOC) in a fraction of the time required to compile them.

Zero-Trust Architecture

Your code never leaves your machine. GitGalaxy performs 100% of its scanning and vectorization locally.

  • No Data Transmission: Source code is never transmitted to any API, cloud database, or third-party service.
  • Ephemeral Memory Processing: Repositories are unpacked into a volatile memory buffer (RAM) and are automatically purged when the browser tab is closed.
  • Privacy-by-Design: Even when using the web-based viewer, the data remains behind the user's firewall at all times.

The Viral Security Lens: Behavioral Threat Hunting

Traditional security scanners rely on rigid, outdated virus signatures. blAST acts like an immune system, hunting for the behavioral genetic markers of a threat. By analyzing the structural density of I/O hits, execution triggers, and security bypasses, blAST is perfectly engineered to stop modern attack vectors:

  • Supply-Chain Poisoning: Instantly flags seemingly innocent setup scripts that possess an anomalous density of network I/O and dynamic execution (eval/exec).
  • Logic Bombs & Sabotage: Identifies code designed to destroy infrastructure by catching dense concentrations of catastrophic OS commands and raw hardware aborts.
  • Steganography & Obfuscated Malware: Mathematically exposes evasion techniques, flagging Unicode Smuggling (homoglyph imports) and sub-atomic custom XOR decryption loops.
  • Credential Hemorrhaging: Acts as a ruthless data vault scanner, isolating hardcoded cryptographic assets (.pem, .pfx, .jks files) buried deep within massive repositories.