I graduated from IIT Delhi with a B.Tech in Electrical Engineering (Power and Automation). At IITD, I was a member of the MISN group, where I completed my B.Tech thesis, "Robustifying GNN against Poisoning Adversarial Attacks using Weighted Laplacian", under the guidance of Prof. Sandeep Kumar.
My research interests span Computer Vision, Multimodal Learning, Compositionality, Efficient ML, and Continual Learning. However, I'm always open to exploring new research directions and developing interesting real-world projects.
Visiting Scholar, Jan 2022 - March 2023, CERC-AAI Lab, Mila - Quebec AI Institute
Collaborators: Diganta Misra, Irina Rish
Research Topic: Sparsity, Continual Learning
Worked on large-scale training optimizations and mid-training at IBM Research, as part of the core team behind the Granite 4.0 family of models.
Developing RL environments for cybersecurity agents to enhance adaptive reasoning in automated pentesting and threat hunting.
Research Engineer, March 2024 - August 2024, Simbian
Spearheaded the development of the Security Accelerator, improving threat hunting and detection in the cybersecurity domain.
AI Research Intern, June 2021 - August 2021, AlphaICs
Research Area: Quantization of Neural Networks and Graph Neural Networks (GNNs)
NLP Intern, May 2021 - June 2021, Zevi
Worked on building a vernacular search engine for e-commerce applications, with features like price-tag detection from the query, autocomplete, and spell check.
With the latest advances in deep learning, there has been a lot of focus on the online learning paradigm due to its relevance in practical settings. Although many methods have been investigated for optimal learning settings in scenarios where the data stream is continuous over time, sparse network training in such settings has often been overlooked. In this paper, we explore the problem of training a neural network with a target sparsity in a particular case of online learning: the anytime learning at macroscale paradigm (ALMA). We propose a novel way of progressive pruning, referred to as \textit{Anytime Progressive Pruning} (APP); the proposed approach significantly outperforms the baseline dense and Anytime OSP models across multiple architectures and datasets under short, moderate, and long-sequence training. Our method, for example, shows an improvement in accuracy of $\approx 7\%$ and a reduction in the generalization gap by $\approx 22\%$, while being $\approx 1/3$ the size of the dense baseline model in few-shot Restricted ImageNet training. We further observe interesting nonmonotonic transitions in the generalization gap in ALMA with a high number of megabatches. The code and experiment dashboards can be accessed at \url{https://github.com/landskape-ai/Progressive-Pruning} and \url{https://wandb.ai/landskape/APP}, respectively.
@misc{misra2022app,
title={APP: Anytime Progressive Pruning},
author={Diganta Misra and Bharat Runwal and Tianlong Chen and Zhangyang Wang and Irina Rish},
year={2022},
eprint={2204.01640},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
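The core idea of progressive pruning from the abstract above can be sketched as a magnitude-pruning mask whose target sparsity ramps up over megabatches. A minimal NumPy sketch, where the cubic schedule and function names are illustrative and not the paper's actual implementation:

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Keep only the largest-magnitude weights; zero out a `sparsity` fraction."""
    k = int(sparsity * weights.size)
    if k == 0:
        return np.ones_like(weights, dtype=bool)
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return np.abs(weights) > threshold

def progressive_sparsity(step, total_steps, target=0.67):
    """Illustrative cubic ramp from 0 to `target` sparsity over training."""
    t = min(step / total_steps, 1.0)
    return target * (1 - (1 - t) ** 3)

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
for step in range(1, 11):
    s = progressive_sparsity(step, 10)
    mask = magnitude_mask(w, s)
    w = w * mask  # prune the smallest weights at this step's sparsity level
print(f"final sparsity: {1 - mask.mean():.2f}")  # → final sparsity: 0.67
```

In real training, gradient updates on the surviving weights happen between pruning steps; here the weights are static to keep the sketch short.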
Enhanced the learned embeddings of the network nodes by adapting the loss function of the SiGAT model to the weighted signed graph.
The learned embeddings show better inter-class separability in the embedding space.
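The idea of a weight-aware signed-graph loss can be sketched generically: positive edges attract endpoint embeddings and negative edges repel them, each term scaled by the edge weight. This is a hypothetical stand-in to illustrate the weighting, not SiGAT's actual motif-based objective:

```python
import numpy as np

def weighted_signed_loss(emb, edges):
    """Illustrative weighted signed-graph embedding loss (not SiGAT's objective).

    Positive edges contribute weight * squared distance (attraction);
    negative edges contribute weight * margin hinge (repulsion).
    """
    loss = 0.0
    for u, v, sign, weight in edges:
        d = np.sum((emb[u] - emb[v]) ** 2)
        if sign > 0:
            loss += weight * d                  # pull positive pairs together
        else:
            loss += weight * max(0.0, 1.0 - d)  # push negative pairs past margin 1
    return loss / len(edges)

emb = np.array([[0.0, 0.0], [0.1, 0.0], [1.5, 0.0]])
edges = [(0, 1, +1, 0.8), (0, 2, -1, 0.5)]
print(weighted_signed_loss(emb, edges))  # → 0.004
```

Minimizing a loss of this shape is what produces the inter-class separability mentioned above: same-sign neighbors cluster, opposite-sign neighbors spread apart.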
This project generates summaries of AMI meeting transcripts. It provides an analysis of different methods proposed for abstractive summarization using SOTA language models, and also tackles the problem of summarizing longer documents in the case of the AMI meeting corpus.
This project detects anomalies in the closing prices of the S&P 500 (stock market index) time series using an LSTM autoencoder. Since LSTM networks are well suited to time-series data, I trained an LSTM autoencoder using the Keras API with TensorFlow 2 as the backend to detect anomalies (sudden price changes) in the S&P 500 index.
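The anomaly-flagging step boils down to thresholding the autoencoder's reconstruction error. A minimal sketch on synthetic data, with a smoothed series standing in for the LSTM autoencoder's reconstruction and an assumed 99th-percentile threshold:

```python
import numpy as np

def detect_anomalies(series, reconstructed, quantile=0.99):
    """Flag points whose reconstruction error exceeds a quantile threshold.

    In the real project `reconstructed` comes from a Keras LSTM autoencoder;
    here a smoothed series is a placeholder to keep the sketch self-contained.
    """
    errors = np.abs(series - reconstructed)    # per-step absolute error
    threshold = np.quantile(errors, quantile)  # flag only the top ~1% of errors
    return errors > threshold

rng = np.random.default_rng(42)
prices = np.cumsum(rng.normal(0, 1, 1000))  # synthetic random-walk "prices"
prices[500] += 25                           # injected sudden price spike
smoothed = np.convolve(prices, np.ones(5) / 5, mode="same")  # stand-in reconstruction
flags = detect_anomalies(prices, smoothed)
print(bool(flags[500]))  # → True: the injected spike is flagged
```

The same thresholding logic applies unchanged when the reconstruction comes from a trained LSTM autoencoder: normal price movements reconstruct well, while sudden changes produce large errors.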
Used two networks here: a Generator, which takes random noise for inspiration and tries to generate a face sample, and a Discriminator, which takes a face sample and tries to tell whether it is real or fake, i.e., it predicts the probability of the input image being a real face. A snippet of faces generated by the trained model after 15k iterations is attached.
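The adversarial training loop behind this setup can be sketched on toy 1-D data: a one-parameter generator shifts noise toward the real distribution while a logistic discriminator tries to tell real from fake. A minimal NumPy sketch; the models and learning rates are illustrative, nothing like the convolutional networks used for faces:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: real data ~ N(3, 0.5); the generator is a learned shift of noise.
def sample_real(n): return rng.normal(3.0, 0.5, n)
def generator(z, theta): return z + theta                          # fake sample
def discriminator(x, w, b): return 1 / (1 + np.exp(-(w * x + b)))  # P(real)

theta, w, b, lr = 0.0, 0.1, 0.0, 0.05
for _ in range(2000):
    z = rng.normal(0, 0.5, 64)
    real, fake = sample_real(64), generator(z, theta)
    # Discriminator step: ascend log D(real) + log(1 - D(fake))
    dr, df = discriminator(real, w, b), discriminator(fake, w, b)
    w += lr * np.mean((1 - dr) * real - df * fake)
    b += lr * np.mean((1 - dr) - df)
    # Generator step: ascend log D(fake), i.e. try to fool the discriminator
    df = discriminator(generator(z, theta), w, b)
    theta += lr * np.mean((1 - df) * w)
print(f"learned shift: {theta:.2f}")  # should drift toward the real mean (~3)
```

The two gradient steps are the standard minimax GAN updates; the face model replaces the scalar shift and logistic unit with deep convolutional networks, but the alternating loop is the same.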