Inspiration
The Tailored Brands challenge mentioned that one of their subsidiaries, JoS. A Bank, has a lot of excess inventory that needs to be sold, as well as a good number of recurring customers. Fortunately, they also have a 90.27% online subscriber rate. We noticed that much of JoS. A Bank's brand strength is in more rural areas on the East Coast, where physical retail outlets are more spread out. Because of this excess inventory, high online engagement and recurring customer base, JoS. A Bank has an open market to create a bi-yearly shipping service for its products.
What it does
We built Banks Box -- a subscription box that uses predictive analytics to pick 4 new JoS. A Bank products that you would be likely to buy, given your previous purchase history. These boxes are hyper-personalized and complex (more info below). The products are also chosen to complement or add to your current wardrobe.
How we built it
Initially, we looked into building a deep learning model (a simple CNN) using PyTorch, TensorFlow or Keras. However, given the computation limits of our VPN-connected virtual machine and the size of the data set, the ML approach was an inefficient and unnecessary way to tackle the problem. Because the data set is so comprehensive, we didn't need any ML to make predictions -- instead, we could achieve reliable and accurate predictions using traditional statistical methods such as correlation coefficients and clustering. In more concrete terms, we analyzed the correlation of purchase between every pair of SKUs (a SKU is the number used to track items in stores). We did this by scouring the Tailored Brands data set and noting, for example, that (hypothetical) person A had bought both a blazer and a tie, and recording this combination as a popular one. When we come to person B, who has bought a blazer but not a tie, we suggest a tie to them.
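The blazer-and-tie idea above can be sketched in a few lines of plain Python. This is only an illustration, not our actual pipeline: the purchase data is made up, the function names sku_correlations and recommend are hypothetical, and we assume the phi coefficient (Pearson correlation on bought / not bought flags) as the correlation measure, since the write-up does not pin down a specific one.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical purchase history: customer id -> set of SKUs bought.
# Readable names stand in for real SKU numbers.
purchases = {
    "A": {"blazer", "tie"},
    "B": {"blazer"},
    "C": {"blazer", "tie", "shirt"},
    "D": {"shirt", "socks"},
    "E": {"blazer", "tie"},
}


def sku_correlations(purchases):
    """Return {(sku_x, sku_y): phi} for every SKU pair, keyed in both orders."""
    n = len(purchases)
    bought = defaultdict(int)    # sku -> number of customers who bought it
    together = defaultdict(int)  # sorted (sku_x, sku_y) -> customers who bought both
    for skus in purchases.values():
        for sku in skus:
            bought[sku] += 1
        for pair in combinations(sorted(skus), 2):
            together[pair] += 1

    corr = {}
    for x, y in combinations(sorted(bought), 2):
        nx, ny, nxy = bought[x], bought[y], together[(x, y)]
        # Phi coefficient of the 2x2 "bought x" vs "bought y" table.
        denom = sqrt(nx * (n - nx) * ny * (n - ny))
        phi = (n * nxy - nx * ny) / denom if denom else 0.0
        corr[(x, y)] = corr[(y, x)] = phi
    return corr


def recommend(customer, purchases, corr, top_n=4):
    """Suggest up to top_n SKUs the customer lacks, ranked by summed positive phi."""
    owned = purchases[customer]
    scores = defaultdict(float)
    for (x, y), phi in corr.items():
        if x in owned and y not in owned and phi > 0:
            scores[y] += phi
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda s: (-scores[s], s))[:top_n]


corr = sku_correlations(purchases)
print(recommend("B", purchases, corr))  # ['tie']
```

Here person B, who owns only a blazer, gets a tie suggested because blazers and ties are bought together more often than chance would predict. The default top_n=4 mirrors the four products in each box; on a real data set the pair counting would more likely be done with pandas or SQL rather than Python loops.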
Challenges we ran into
-Data set was huge
-We had to program everything on a VPN connected to a virtual machine which had traditional internet services disabled
-Our virtual machine was extremely slow and we only had access to one machine (which means that three programmers quickly overwhelmed the memory and processing power)
Accomplishments that we're proud of
-Ted using pandas and a dictionary off of GitHub to create geomaps even though mentors said it wasn't possible without an internet connection
-Using data to discover a distribution model (subscription service) that would actually pair with our recommendation algorithm to unlock its real value
What we learned
-A whole lot about pandas (it's not always black and white!)
-Basics of SQL
-How the flashiest model isn't always the best
-A whole lot about patience while cleaning datasets
-The true will of man
What's next for Bank Box
-A strong IPO, coming in 2025
-Actually implementing this project in real life