Newest 'clustering' Questions

2 votes

2 answers

105 views

Handle outliers in clustering

I’m working on a cluster analysis of Italian provinces based on three fire-related indicators: total burned area (ha), burned area per fire, fire density. Because these variables are measured on ...

09Bruna

379

asked 23 hours ago

1 vote

1 answer

33 views

Does it make sense to represent monetary value with a discrete distribution?

I am implementing a Bayesian non-parametric clustering model based on the paper "Bayesian clustering of multiple zero-inflated outcomes" by Franzolini et al. (2023) for my household spending ...

lee en

11

asked Dec 2 at 6:34

0 votes

0 answers

25 views

Modeling recurring monthly transactions with weekend-shift effects: DBSCAN vs rule-based temporal detection?

I have 3 months of categorized bank transaction data and need to identify recurring cash inflows and outflows for lending risk modeling. Complications: 1. Income dates shift earlier when payday falls ...

Awande Ntombela

1

asked Nov 19 at 9:09

0 votes

0 answers

35 views

Role of Z-Tests in Kernel Density Estimation for Cluster Classification

In a recent bioinformatics paper, the authors describe a statistical/machine learning approach to classify clusters of cells using kernel density estimation (KDE) and Z-scores. While the details of ...

Michiel.Tawdarous

1

asked Nov 18 at 9:33

1 vote

1 answer

52 views

Vector direction of individual clusters after PCA

Suppose I have two multi-dimensional population samples - $A$ and $B$. I hypothesise that $\mathbb{E}[A]$ and $\mathbb{E}[B]$ are orthogonal in this high-dimensional space. To test this hypothesis, I ...

sunnydk

127

asked Nov 3 at 18:16

1 vote

0 answers

33 views

Supervised Clustering Algorithms / Full Graph Edge Prediction Algorithms

I have an interesting problem I am trying to solve, and I cannot find any non-deep methods available to solve it. Problem description (plain): the real-life problem this relates to is handwritten digits ...

Ryan Folks

149

asked Oct 28 at 21:38

2 votes

1 answer

46 views

Pattern analysis for time between events data

I am trying to subset data based on a pattern of "strings" or clusters of food deliveries to young that I see in my data (see plots labeled 2, 4, 5, 6, and 8 in the figure below for the most ...

thegrayson

23

asked Oct 23 at 17:57

0 votes

0 answers

27 views

How to identify and quantify main tendencies across participants from cluster membership heatmaps?

I'd appreciate your thoughts on the following problem. I've created a heatmap plot (attached) showing the cluster membership ratio for each participant (in separate subplots) and condition (η). Now, I'...

maria mystakidou

1

asked Oct 23 at 10:08

2 votes

1 answer

122 views

Examining country-level effects based on individual-level data combined with country-level data

I am new to working with country-level effects in comparative OLS regression with individual-level data. Are there any good resources for this? Suppose my dependent variable is social integration (an ...

Olestan

71

asked Sep 30 at 11:45

0 votes

0 answers

45 views

Are there clustering algorithms or preprocessing strategies tailored for zero-inflated and continuous data types?

I am currently working on a project where I need to assign customers across N recipes before A/B testing such that KPIs for each customer are balanced across recipes (to reduce pre-test bias). Dataset ...

Rishab

1

asked Sep 26 at 6:09

0 votes

0 answers

57 views

How to perform clustering on heavily right skewed data and zero inflated data

I am currently working on clustering continuous variables (such as AOV, RPV, and conversions (conversion/visits)). The variables are heavily right-skewed with long tails, and one variable is dominated ...

Rishab

1

asked Sep 24 at 12:43

3 votes

1 answer

130 views

Bayesian Clustering with a Finite Gaussian Mixture Model with Missing Data

I would like to perform clustering with a finite Gaussian Mixture model, however, I have missing data (some features are missing at random). I am using Variational Inference to fit my Bayesian GMM. Is ...

Tom

1,112

asked Sep 4 at 16:36

2 votes

0 answers

73 views

Estimating number of clusters using Scikit Bayesian GMM

I am generating clustering data using the Bayesian mixture of Gaussian models described in Bishop's Pattern Recognition and Machine Learning textbook, with model parameters drawn from the following ...

PJB

21

asked Aug 9 at 7:01

1 vote

1 answer

59 views

Mixture-Based Clustering for Ordered Stereotype Model - Distance Scores

I have a 5-variable/3 category-level ordinal survey data set. E.g. 5 health variables ranked 1-3 (good-moderate-poor). I want to row-cluster different responses. But also, I want to determine whether ...

EB3112

264

asked Aug 8 at 9:48

1 vote

0 answers

54 views

Are equal and diagonal variance matrices implicitly assumed in k-means clustering?

When applying k-means clustering, I understand that the goal is to partition the dataset by assigning each point to its nearest cluster center. However, I’ve come across statements that k-means can be ...

EngineerMathlover

153

asked Jul 7 at 17:30
