During my Ph.D. I worked on Computer Vision Aided Sewer Inspections and Marine Vision, advised by Professor Thomas B. Moeslund.
I visited the Human Pose Recovery and Behavior Analysis group in 2021, working with Professor Sergio Escalera. In 2023 I was a visiting researcher at the Vector Institute and the Machine Learning Research Group at the University of Guelph, working with Professor Graham Taylor, and in 2024 I was a visiting researcher at the University of Edinburgh, working with Reader Oisin Mac Aodha.
We investigate the feasibility of using novel stereo views from different 3D Gaussian Splatting methods combined with zero-shot depth estimates from FoundationStereo.
We explore whether pretrained models can provide a useful representation space for datasets they were not trained on for the purpose of grouping novel unlabelled data into meaningful clusters.
We introduce CLIBD, the first approach to use contrastive learning for aligning biological images with DNA barcodes and taxonomic labels to enhance taxonomic classification.
BIOSCAN-5M: A Multimodal Dataset for Insect Biodiversity
Zahra Gharaee*, Scott C. Lowe*, ZeMing Gong*, Pablo Millan Arias*, Nicholas Pellegrino, Austin T. Wang, Joakim Bruslund Haurum, Iuliia Zarubiieva, Lila Kari, Dirk Steinke**, Graham W. Taylor**, Paul Fieguth**, Angel X. Chang**
NeurIPS (D&B Track), 2024
arXiv / code
We construct a large-scale fine-grained dataset of the Insect class with 5M data samples, each containing a biological taxonomic annotation, DNA barcode sequence, geographical location, and RGB image.
We propose AssemblyNet, a novel dataset for predicting part directions in assembly models for exploded view visualizations. We also propose a novel two-path network for predicting part directions.
BarcodeBERT: Transformers for Biodiversity Analysis
Pablo Millan Arias*, Niousha Sadjadi*, Monireh Safari*, ZeMing Gong**, Austin T. Wang**, Scott C. Lowe, Joakim Bruslund Haurum, Iuliia Zarubiieva, Dirk Steinke, Lila Kari, Angel X. Chang, Graham W. Taylor
NeurIPS SSL Workshop, 2023
arXiv / code / bibtex
We propose BarcodeBERT, the first self-supervised method for general biodiversity analysis, and highlight the role of self-supervised pretraining in achieving high-accuracy DNA barcode-based identification at the species and genus levels.
We construct a large-scale fine-grained dataset of the Insect class with 1M data samples, each containing a biological taxonomic annotation, DNA barcode sequence, and RGB image.
We conduct the first systematic comparison and analysis of 10 state-of-the-art token reduction methods across four image classification datasets, trained using a single codebase and consistent training protocol.
We propose a novel method to model spatial semantics in images, where features are aggregated non-locally at different scales using a lightweight vision transformer and a novel Sinkhorn clustering-based tokenizer.
We propose a novel decoder-focused multi-task classification architecture called Cross-Task Graph Neural Network (CT-GNN), which refines the disjointed per-task predictions using cross-task information.
We propose a system for generating synthetic point clouds of sewer pipes using Structured Domain Randomization for the generation of the sewer systems and an approximated model of a Pico Flexx Time-of-Flight camera.
We reviewed 113 articles on image-based automated sewer inspection methodologies, finding that there is a severe lack of 1) publicly available datasets, 2) commonly agreed upon evaluation protocols, and 3) open-sourced code.
We present a new publicly available underwater dataset with annotated image sequences of fish, crabs, and starfish captured in brackish water with varying visibility.
We design a system for detecting rainfall using surveillance cameras, and compare it against the former state-of-the-art method for rain detection on our new AAU Visual Rain Dataset (VIRADA).
We apply pixel reprojection to nine 360-degree renderings to enable 3D motion and introduce motion parallax effects, without explicit knowledge of the 3D geometry.