QSAR Drug Discovery Project

This repository contains code and data for a QSAR (Quantitative Structure-Activity Relationship) drug discovery project. The goal of this project is to develop predictive models that can estimate the biological activity of chemical compounds based on their molecular structure.

Features

Data preprocessing and feature extraction
Machine learning model training and evaluation
Random Forest Regressor implementation
Support for various molecular descriptors
Visualization of results
Hyperparameter tuning
Cross-validation
Documentation and examples
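
The training, tuning, and cross-validation features listed above can be sketched roughly as follows. This is a minimal illustration using scikit-learn on synthetic data, not the code in QSAR-ML.py: the data from make_regression, the parameter grid, and the variable names are all assumptions chosen for the example.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

# Synthetic stand-in for a descriptor matrix X and an activity vector y
# (assumption: real data would come from the project's CSV file).
X, y = make_regression(
    n_samples=200, n_features=10, n_informative=5, noise=5.0, random_state=0
)

# Hold out a test set that is never seen during tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Small illustrative grid; a real search would usually cover more values.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}

# Hyperparameter tuning with 5-fold cross-validation on the training set.
search = GridSearchCV(
    RandomForestRegressor(random_state=0), param_grid, cv=5, scoring="r2"
)
search.fit(X_train, y_train)

# Re-check the selected model with cross-validation, then on the held-out set.
best_model = search.best_estimator_
cv_scores = cross_val_score(best_model, X_train, y_train, cv=5, scoring="r2")
test_r2 = best_model.score(X_test, y_test)

print("Best parameters:", search.best_params_)
print("Mean cross-validated R^2:", cv_scores.mean())
print("Held-out test R^2:", test_r2)
```

Keeping the test split out of the grid search means the final R^2 is not inflated by the tuning step.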

Requirements

Python 3.7+
pandas
scikit-learn
numpy
matplotlib
seaborn
RDKit (for cheminformatics tasks)
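
As an example of the kind of cheminformatics task RDKit handles, the sketch below computes a few molecular descriptors from a SMILES string. The helper name describe and the choice of descriptors are assumptions made for illustration. MolLogP is RDKit's Crippen-style logP estimate; whether it matches the SlogP column in the project's CSV is not confirmed here.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors


def describe(smiles):
    """Return a few RDKit descriptors for one SMILES string.

    Hypothetical helper for illustration. Returns None when RDKit
    cannot parse the input.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return {
        "MolWt": Descriptors.MolWt(mol),
        "MolLogP": Descriptors.MolLogP(mol),
        "TPSA": Descriptors.TPSA(mol),
        "NumHDonors": Descriptors.NumHDonors(mol),
        "NumHAcceptors": Descriptors.NumHAcceptors(mol),
    }


ethanol = describe("CCO")   # ethanol
broken = describe("C1CC")   # unclosed ring, so parsing fails

print(ethanol)
print(broken)
```

Returning None for unparseable input lets a caller drop bad rows before building the feature matrix.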

Dataset

The dataset used in this project is a collection of psychoactive compounds in CSV format, taken from https://www.kaggle.com/datasets/thedevastator/psychedelic-drug-database. The file was renamed to QSARDrugAnalysis.csv before import.

This example uses SlogP as the target variable, but in principle any other variable can be used. In general, biological activity data (e.g., IC50, EC50, or energies reported in kcal/mol) serves as the target variable. My research is not yet published, so I cannot share the actual dataset I am using. A dataset for this workflow should contain molecular structures (e.g., SMILES strings) and their corresponding biological activity values.
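
Swapping the target variable could look roughly like the sketch below. It uses a tiny in-memory table with made-up values rather than the real CSV. Every column name except SlogP, and the helper split_features_target, are assumptions for illustration.

```python
import io

import pandas as pd

# Made-up miniature table; the real QSARDrugAnalysis.csv may use other columns.
csv_text = """Name,SMILES,MolWt,TPSA,SlogP
compound_a,CCO,46.1,20.2,0.0
compound_b,CCN,45.1,26.0,-0.1
compound_c,CCC,44.1,0.0,1.4
compound_d,CCCC,58.1,,1.8
"""
df = pd.read_csv(io.StringIO(csv_text))


def split_features_target(frame, target="SlogP"):
    """Split a table into numeric features X and a target column y.

    Hypothetical helper: keeps numeric columns only and drops rows
    with missing values before splitting.
    """
    numeric = frame.select_dtypes(include="number").dropna()
    y = numeric[target]
    X = numeric.drop(columns=[target])
    return X, y


X, y = split_features_target(df)  # SlogP as target
X_alt, y_alt = split_features_target(df, target="MolWt")  # any other column

print(X.columns.tolist(), y.name)
print(X_alt.columns.tolist(), y_alt.name)
```

With real activity data, the same call would take the name of the activity column (for example an IC50 column) as the target.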
