Caerii/vortexnet

VortexNet: Neural Computing through Fluid Dynamics

This repository contains toy implementations of the concepts introduced in VortexNet: Neural Computing through Fluid Dynamics. The code demonstrates how PDE-inspired "vortex" updates can be inserted into standard autoencoders.

Note: These are educational prototypes, not optimized or physically precise fluid solvers.

Contents

  • src/vortexnet/mnist.py: public MNIST autoencoder facade and CLI entry point.
  • src/vortexnet/mnist_model.py, src/vortexnet/mnist_data.py, src/vortexnet/mnist_viz.py, and src/vortexnet/mnist_training.py: MNIST model layers, data loading, reconstruction helpers, and the training runner.
  • src/vortexnet/image.py: public RGB image autoencoder facade and CLI entry point.
  • src/vortexnet/image_model.py, src/vortexnet/image_data.py, src/vortexnet/image_viz.py, and src/vortexnet/image_training.py: image model layers, data loading, visualization/evaluation helpers, and the training runner.
  • src/vortexnet/operators.py: reusable fixed PDE, learned stencil, residual conv, and compact spectral/FNO-style operator blocks.
  • src/vortexnet/experiments.py: shared experiment variant definitions, device setup, summary statistics, and CSV helpers.
  • src/vortexnet/dynamics.py: synthetic heat, wave, advection-diffusion, Burgers-style transport, reaction-diffusion, and vorticity next-state datasets for testing operator blocks on actual spatial dynamics.
  • src/vortexnet/pdebench.py: local PDEBench-style HDF5 sequence dataset adapter.
  • src/vortexnet/pdebench_catalog.py: PDEBench catalog filtering, guarded download, file-size lookup, and MD5 verification helpers.
  • vortexnet_mnist.py and vortexnet_image.py: local compatibility wrappers for direct script execution; installed commands are vortexnet-mnist and vortexnet-image.
  • config_image.yaml: longer custom-image training config.
  • config_image_smoke.yaml: tiny custom-image smoke config.

Experiment Reports

Dated experiment writeups and result tables live in docs/experiments/. The current real-dataset report is 2026-04-24-pdebench-shallow-water.md.

Setup With uv

Install dependencies into a local uv-managed environment:

uv sync

The pyproject.toml pins PyTorch and TorchVision to the PyTorch CUDA 13.0 wheel index on Windows/Linux. If you need CPU-only or a different CUDA wheel family, change the tool.uv.sources and tool.uv.index entries in pyproject.toml.

Check the environment:

uv run python -c "import torch; print(torch.__version__, torch.cuda.is_available(), torch.version.cuda)"
uv run pytest

On locked-down Windows workspaces, pytest cache writes can be flaky; this repo disables pytest's cache provider in pyproject.toml. If uv run ... reports an access error under AppData\Local\uv\cache, use uv --no-cache run ... or refresh the environment with uv sync --no-cache --link-mode=copy.

Useful uv commands:

uv lock      # update uv.lock after dependency changes
uv sync      # create/update .venv from pyproject.toml and uv.lock
uv run ...   # run commands inside the managed environment
uv tree      # inspect the resolved dependency tree

Reproduce A Small MNIST Run

MNIST downloads automatically into data/.

uv run vortexnet-mnist --epochs 1 --batch_size 64 --pde_padding_mode constant --max_train_batches 10 --max_test_batches 5 --no_show

For the original short demo:

uv run vortexnet-mnist --epochs 5 --batch_size 64 --no_show

Outputs are written to outputs/mnist/.

Reproduce A Custom Image Smoke Run

Place JPEG or PNG images in my_data/, then run:

uv run vortexnet-image --config config_image_smoke.yaml --skip_tsne --skip_onnx --no_show

For a longer run:

uv run vortexnet-image --config config_image.yaml --no_show

Outputs are written to the configured data.output_dir.

What To Improve Next

  • Add a real train/validation split instead of validating on the training loader.
  • Track PSNR/SSIM in addition to MSE reconstruction loss.
  • Run ablations against a same-size plain convolutional autoencoder.
  • Keep PDE padding set to constant unless periodic boundaries are semantically required; constant padding uses the convolution's native padding and avoids a separate explicit circular-padding step.
  • Implement true loss-gradient adaptive damping instead of the current activation-magnitude fallback used during ordinary forward passes.
  • Evaluate on a task where PDE-style recurrence has a plausible advantage, such as video, time series, or long spatial context, not just static reconstruction.
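
As a starting point for the PSNR item above, PSNR can be derived directly from the MSE that is already tracked. The sketch below is an illustration only, not code from this repository: `psnr_from_mse` and `mean_squared_error` are hypothetical helper names, and `max_value=1.0` assumes images scaled to [0, 1].

```python
import math


def mean_squared_error(target, prediction):
    """Mean squared error between two equal-length flat sequences of floats."""
    if len(target) != len(prediction):
        raise ValueError("target and prediction must have the same length")
    return sum((t - p) ** 2 for t, p in zip(target, prediction)) / len(target)


def psnr_from_mse(mse, max_value=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / mse)."""
    if mse < 0:
        raise ValueError("mse must be non-negative")
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy pixel values (assumed to be in [0, 1]); every pixel is off by 0.1.
target = [0.0, 0.5, 1.0, 0.25]
reconstruction = [0.1, 0.4, 0.9, 0.35]
print(psnr_from_mse(mean_squared_error(target, reconstruction)))  # about 20 dB
```

Because PSNR is a monotone function of MSE, it does not change model rankings on its own; SSIM would need a separate windowed implementation.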

Dynamics Benchmarks

Static reconstruction is a weak test for PDE-inspired blocks. Use the dynamics benchmark to train next-state predictors on generated fields:

uv run python scripts/compare_dynamics_variants.py --task heat --device cuda
uv run python scripts/compare_dynamics_variants.py --task wave --device cuda
uv run python scripts/compare_dynamics_variants.py --task advection_diffusion --device cuda
uv run python scripts/compare_dynamics_variants.py --task burgers2d --device cuda
uv run python scripts/compare_dynamics_variants.py --task reaction_diffusion --rollout-steps 32 --device cuda
uv run python scripts/compare_dynamics_variants.py --task vorticity --rollout-steps 16 --device cuda

For a more reliable comparison, run multiple seeds, autoregressive evaluation, and optional rollout loss during training:

uv run python scripts/compare_dynamics_variants.py --task vorticity --rollout-steps 16 --autoregressive-steps 3 --train-rollout-steps 2 --seeds 42 43 --device cuda

--train-rollout-steps 2 trains on two autoregressive model steps instead of a single target step. This is slower, but it trains directly against the error accumulation that shows up during repeated rollout. Use --detach-train-rollout to stop gradients from flowing back through earlier predicted states.
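
To make the idea of autoregressive evaluation concrete, here is a minimal NumPy sketch that feeds a predictor its own outputs and records the error at each step. It is not the repository's implementation: `heat_step`, `rollout_errors`, and the stand-in "model" (the true dynamics with a slightly wrong diffusion coefficient) are assumptions made purely for illustration.

```python
import numpy as np


def heat_step(u, alpha):
    """One explicit diffusion step with periodic boundaries (5-point Laplacian)."""
    lap = (
        np.roll(u, 1, axis=0) + np.roll(u, -1, axis=0)
        + np.roll(u, 1, axis=1) + np.roll(u, -1, axis=1)
        - 4.0 * u
    )
    return u + alpha * lap


def rollout_errors(model_step, true_step, u0, steps):
    """Apply the model to its own predictions and record MSE against the truth."""
    pred, truth, errors = u0.copy(), u0.copy(), []
    for _ in range(steps):
        pred = model_step(pred)    # the model only ever sees its own output
        truth = true_step(truth)   # reference trajectory
        errors.append(float(np.mean((pred - truth) ** 2)))
    return errors


def true_step(u):
    return heat_step(u, alpha=0.20)


def model_step(u):
    # Stand-in for a trained predictor: same update, slightly wrong coefficient.
    return heat_step(u, alpha=0.18)


rng = np.random.default_rng(0)
u0 = rng.standard_normal((32, 32))
errors = rollout_errors(model_step, true_step, u0, steps=3)
print(errors)  # one MSE value per rollout step
```

A one-step metric only looks at the first entry of `errors`; a rollout metric looks at the later ones, where the model's inputs are no longer ground truth.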

The benchmark compares:

  • plain: shared conv predictor with no operator block.
  • fixed_pde: fixed finite-difference update with learnable channel diffusion.
  • learned_stencil: depthwise learned local stencil plus channel mixing.
  • conv_residual: conventional residual convolution baseline.
  • spectral: compact FNO-style Fourier operator.
  • hybrid_spectral: local residual convolution plus spectral/FNO-style global mixing.
  • unet_small: small U-Net baseline.
  • fno_small: compact FNO-style full-model baseline.
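
For readers unfamiliar with the FNO-style variants in this list, the core operation is a per-mode multiplication in Fourier space on a truncated set of low frequencies. The sketch below shows that operation for a single channel in NumPy. It is an assumption-laden illustration, not the code in src/vortexnet/operators.py: `spectral_mix` is a hypothetical name, the random weights stand in for learned parameters, and channel mixing is omitted.

```python
import numpy as np


def spectral_mix(u, weights, modes):
    """Scale the lowest `modes` Fourier modes per axis by complex weights.

    u:       real array of shape (H, W)
    weights: complex array of shape (2, modes, modes)
    All other Fourier modes are set to zero.
    """
    h, w = u.shape
    if 2 * modes > h or modes > w // 2 + 1:
        raise ValueError("modes is too large for this grid")
    u_hat = np.fft.rfft2(u)                    # shape (H, W // 2 + 1)
    out_hat = np.zeros_like(u_hat)
    # Low non-negative frequencies along axis 0.
    out_hat[:modes, :modes] = u_hat[:modes, :modes] * weights[0]
    # Low negative frequencies along axis 0 (stored at the end of the axis).
    out_hat[-modes:, :modes] = u_hat[-modes:, :modes] * weights[1]
    return np.fft.irfft2(out_hat, s=(h, w))


rng = np.random.default_rng(0)
field = rng.standard_normal((32, 32))
modes = 4
weights = rng.standard_normal((2, modes, modes)) + 1j * rng.standard_normal((2, modes, modes))
mixed = spectral_mix(field, weights, modes)
print(mixed.shape)  # (32, 32)
```

Every output value depends on the whole input field, which is what makes this mixing global, while the truncation to a few modes discards fine local detail. That trade-off is the motivation for pairing it with a local convolution in a hybrid block.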

Bounded 32x32 CUDA runs with 512 train samples, 128 eval samples, and 80 train batches show the current pattern:

  • Heat/diffusion is too easy; most variants nearly saturate it.
  • Wave and Burgers-style transport favor the conventional residual conv block.
  • Pure spectral/FNO-style mixing is not robustly better by itself on these small local benchmarks.
  • Vorticity dynamics are where local-global models start to matter, but stronger baselines changed the conclusion: U-Net wins some one-step vorticity metrics, while residual conv has been more stable in one-step-trained 3-step rollout. Adding two-step rollout loss improved rollout stability and made U-Net the best tested vorticity rollout model in the bounded 32x32 two-seed run. The hybrid local-spectral block is competitive, but it does not yet beat those controls.
  • The fixed PDE block is cheap and interpretable, but so far it behaves more like a regularizer than a clear accuracy win.

For larger external benchmarks, the most relevant next targets are:

  • PDEBench: broad PDE surrogate benchmark with advection, Burgers, reaction-diffusion, shallow water, Navier-Stokes, FNO/U-Net/PINN baselines, and DaRUS downloads: https://github.com/pdebench/PDEBench
  • NeuralOperator Navier-Stokes: 2D incompressible Navier-Stokes tensors at 128 and 1024 resolution, but the smaller archive is still about 1.5 GB: https://zenodo.org/records/12825163
  • WeatherBench 2: public weather forecasting grids, including low-resolution 64x32 ERA5 files in gs://weatherbench2/datasets, useful once the local operator stack is credible: https://weatherbench2.readthedocs.io/

PDEBench HDF5 Files

PDEBench provides HDF5 datasets through DaRUS, with download scripts in its data_download folder. This repo can inspect and train against a local PDEBench-style HDF5 sequence file.

List catalog entries before downloading anything:

uv run python scripts/download_pdebench_file.py --pde SWE --check-size
uv run python scripts/download_pdebench_file.py --pde 2D_ReacDiff --check-size

Download one exact match with MD5 verification:

uv run python scripts/download_pdebench_file.py \
  --pde SWE \
  --filename 2D_rdb_NA_NA.h5 \
  --download \
  --output-dir data/pdebench
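
The MD5 check can also be reproduced by hand on any downloaded file. The following is a minimal standard-library sketch, not the helper in src/vortexnet/pdebench_catalog.py; `md5_of_file` and `verify_md5` are hypothetical names, and the expected digest would come from the PDEBench catalog entry.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Hex MD5 digest of a file, read in chunks so large files are not loaded into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected_hex):
    """True if the file's MD5 matches the expected hex digest (case-insensitive)."""
    return md5_of_file(path) == expected_hex.strip().lower()


# Self-contained demo on a temporary file instead of a real multi-gigabyte download.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    payload = b"not a real PDEBench file" * 100
    sample.write_bytes(payload)
    expected = hashlib.md5(payload).hexdigest()
    print(verify_md5(sample, expected))  # True
```

MD5 is used here only as an integrity check against accidental corruption or truncated downloads; it is not a defense against deliberate tampering.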

The downloader refuses files larger than 8 GB unless --allow-large or a larger --max-gb is provided.

Inspect the contents of a local file before training on it:

uv run python scripts/compare_pdebench_file.py --file path/to/file.hdf5 --inspect-only

Use --inspect-limit 0 to print every HDF5 dataset in large grouped files.

Run a bounded comparison on a local file:

uv run python scripts/compare_pdebench_file.py \
  --file path/to/file.hdf5 \
  --normalize \
  --variants conv_residual unet_small fno_small hybrid_spectral \
  --autoregressive-steps 3 \
  --train-rollout-steps 2 \
  --seeds 42 43 44 \
  --device cuda

The loader supports common time-sequence layouts:

  • ntxyc: samples, time, x, y, channels.
  • ntcxy: samples, time, channels, x, y.
  • ntxy: samples, time, x, y, with one implicit channel.
  • txyc, tcxy, and txy: grouped files where each sample is stored under keys such as 0000/data, 0001/data, and so on.

Use --data-key if the file has more than one large array and --layout if automatic layout inference is wrong.
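
The layout strings above can be read as axis orders. As an illustration of what a loader has to do with them, the sketch below converts any of the listed layouts to a single (samples, time, channels, x, y) order with NumPy. `to_ntcxy` is a hypothetical helper, not the function used in src/vortexnet/pdebench.py, and the canonical target order is an assumption.

```python
import numpy as np

SUPPORTED_LAYOUTS = {"ntxyc", "ntcxy", "ntxy", "txyc", "tcxy", "txy"}


def to_ntcxy(array, layout):
    """Return `array` reordered to (n, t, c, x, y), adding size-1 axes where needed."""
    array = np.asarray(array)
    if layout not in SUPPORTED_LAYOUTS:
        raise ValueError(f"unsupported layout: {layout!r}")
    if array.ndim != len(layout):
        raise ValueError(f"layout {layout!r} expects {len(layout)} dims, got {array.ndim}")
    if not layout.startswith("n"):
        # Grouped files store one sample per key: add a leading sample axis.
        array, layout = array[np.newaxis], "n" + layout
    if "c" not in layout:
        # One implicit channel: insert it right after the time axis.
        array, layout = array[:, :, np.newaxis], layout[:2] + "c" + layout[2:]
    order = [layout.index(axis) for axis in "ntcxy"]
    return np.transpose(array, order)


batch = np.zeros((8, 10, 32, 32, 3))        # ntxyc: 8 samples, 10 steps, 3 channels
print(to_ntcxy(batch, "ntxyc").shape)       # (8, 10, 3, 32, 32)

single = np.zeros((10, 32, 32))             # txy: one grouped sample, implicit channel
print(to_ntcxy(single, "txy").shape)        # (1, 10, 1, 32, 32)
```

Automatic inference is inherently ambiguous when, for example, the channel count equals a spatial size, which is the situation where an explicit --layout is the safer choice.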

Benchmarking

The PDE hot path has been fused into one grouped convolution that computes Laplacian, x-gradient, and y-gradient together. Run focused benchmarks with:

uv run python scripts/benchmark_pde.py --device cuda --padding-mode constant
uv run python scripts/benchmark_autoencoder.py --model mnist --device cuda --pde-steps 3 --pde-padding-mode constant
uv run python scripts/benchmark_autoencoder.py --model image --device cuda --image-size 64 --pde-channels 8 --pde-steps 1 --pde-padding-mode constant
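
The fusion described above amounts to stacking the three 3x3 stencils and evaluating them in one pass over each channel, rather than looping over channels and operators in Python. The NumPy sketch below is an analogue of that idea, not the repository's kernel: `fused_stencils` is a hypothetical name, and the central-difference coefficients and zero padding are assumptions. In PyTorch the same structure would typically be a single conv2d call with groups equal to the channel count and three output channels per group.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Assumed stencils: 5-point Laplacian and central differences (x = columns, y = rows).
LAPLACIAN = np.array([[0.0, 1.0, 0.0], [1.0, -4.0, 1.0], [0.0, 1.0, 0.0]])
GRAD_X = np.array([[0.0, 0.0, 0.0], [-0.5, 0.0, 0.5], [0.0, 0.0, 0.0]])
GRAD_Y = np.array([[0.0, -0.5, 0.0], [0.0, 0.0, 0.0], [0.0, 0.5, 0.0]])
STENCILS = np.stack([LAPLACIAN, GRAD_X, GRAD_Y])   # shape (3, 3, 3)


def fused_stencils(x):
    """Apply all three stencils to every channel of x in one contraction.

    x: array of shape (C, H, W). Returns shape (C, 3, H, W) holding
    Laplacian, x-gradient, and y-gradient per channel, with zero padding.
    """
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)), mode="constant")
    windows = sliding_window_view(padded, (3, 3), axis=(1, 2))   # (C, H, W, 3, 3)
    return np.einsum("chwij,kij->ckhw", windows, STENCILS)


# A field that increases by 1 per column: away from the border the
# x-gradient is 1 while the Laplacian and y-gradient are 0.
ramp = np.tile(np.arange(6.0), (2, 5, 1))           # shape (2, 5, 6)
out = fused_stencils(ramp)
print(out.shape)  # (2, 3, 5, 6)
```

The border values differ from the interior because of the zero padding, so comparisons against an analytic derivative should exclude the outermost ring of cells.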

Current findings on the tested RTX 3080:

  • Fused PDE forward is roughly 14x-29x faster than the original Python channel-loop reference on representative bottleneck tensors.
  • constant PDE padding is faster than circular and more natural for images.
  • Encoder-only PDE with one step is the fastest useful placement measured so far; decoder PDE and multi-step unrolling add cost quickly.
  • AMP, channels-last, and fused Adam were not consistent wins on these small synthetic benchmarks.
  • torch.compile is not currently usable in this Windows environment because PyTorch Inductor cannot find a working Triton install.
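
The remark that constant padding is more natural for images can be seen on a tiny example: with circular padding, a pixel on the left edge becomes a neighbor of the right edge. The sketch below illustrates only this semantic difference, not the speed difference, and it is not taken from the repository; the `laplacian` helper and the 8x8 test image are assumptions for illustration.

```python
import numpy as np


def laplacian(u, mode):
    """5-point Laplacian after padding by one cell with the given np.pad mode."""
    p = np.pad(u, 1, mode=mode)
    return (
        p[:-2, 1:-1] + p[2:, 1:-1]      # up and down neighbors
        + p[1:-1, :-2] + p[1:-1, 2:]    # left and right neighbors
        - 4.0 * p[1:-1, 1:-1]
    )


image = np.zeros((8, 8))
image[4, 0] = 1.0                        # single bright pixel on the left edge

lap_zero = laplacian(image, "constant")  # zero padding
lap_wrap = laplacian(image, "wrap")      # circular (periodic) padding

print(lap_zero[4, 7], lap_wrap[4, 7])    # 0.0 1.0
```

Under circular padding the right-edge cell responds to the bright pixel on the opposite side, which is correct for a periodic simulation domain but usually meaningless for a photograph.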

Run placement ablations with:

uv run python scripts/benchmark_ablation.py --model mnist --device cuda --pde-padding-mode constant
uv run python scripts/benchmark_ablation.py --model image --device cuda --image-size 64 --channels-list 4 8 --pde-padding-mode constant
