Inspiration

Bridgette, also known as BridgEY, is a data integration platform inspired by the EY Canada - Data Integration Challenge. The challenge encouraged students to develop a software application that can effectively map and merge data from disparate systems into a unified platform while ensuring data integrity and completeness.

Our Idea

Our solution centered on a data integration platform that could take data files of any format (.xlsx, .csv, .json, etc.), understand their structure and formatting through specific organizational schemas, and unify them into clean, standardized datasets.

How We Built It

We divided our team into three main departments: data processing, schema mapping, and frontend/backend integration.

Data Ingestion: Bridgette supported .json, .csv, and .xlsx files from different organizations. Our solution used pandas and openpyxl to automatically detect data types and headers.
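As a rough illustration of this ingestion step, here is a minimal sketch that picks a pandas reader based on the file extension. The `READERS` table and the `load_table` helper are hypothetical names chosen for this example, not Bridgette's actual code; the sketch assumes pandas is installed, along with openpyxl for .xlsx files.

```python
from pathlib import Path

import pandas as pd

# Hypothetical lookup from file extension to a pandas reader.
# pandas infers column headers and data types on load.
READERS = {
    ".csv": pd.read_csv,
    ".json": pd.read_json,
    ".xlsx": pd.read_excel,  # pandas uses openpyxl to read .xlsx files
}


def load_table(path):
    """Load one supported file into a DataFrame (illustrative helper)."""
    suffix = Path(path).suffix.lower()
    reader = READERS.get(suffix)
    if reader is None:
        raise ValueError(f"Unsupported file type: {suffix!r}")
    return reader(path)
```

Calling `load_table("transactions.csv")` on a file with a header row would return a DataFrame whose columns come from that header, with numeric columns inferred as integer or float types.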

Schema Mapping: Bridgette uses an AI model to handle the schema mapping. This gives the benefit of semantic context expansion: the model refines its interpretation of each schema based on the values it contains. The AI-backed approach also makes the platform more adaptable to changing schemas and to schemas used in different industries.
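To make the idea of a schema mapping concrete, here is a small sketch of what the output of this step could look like. It does not use an AI model: it swaps in plain string similarity from Python's standard `difflib` module as a stand-in, so it only illustrates the shape of the result (source column to canonical column). The `CANONICAL` field list, `normalize`, and `map_columns` are all hypothetical names for this example.

```python
from difflib import get_close_matches

# Hypothetical target schema; the real canonical fields are not given in this write-up.
CANONICAL = ["account_id", "transaction_date", "amount"]


def normalize(name):
    """Lowercase a column name and replace spaces and hyphens with underscores."""
    return name.strip().lower().replace(" ", "_").replace("-", "_")


def map_columns(columns, canonical=CANONICAL, cutoff=0.6):
    """Map each source column to its closest canonical name, or None.

    String similarity stands in here for the AI model described above.
    """
    mapping = {}
    for col in columns:
        match = get_close_matches(normalize(col), canonical, n=1, cutoff=cutoff)
        mapping[col] = match[0] if match else None
    return mapping
```

Columns that do not resemble any canonical field map to `None`, which leaves room for a human or a stronger model to resolve them later.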

Frontend/Backend Integration: Bridgette's frontend was built with React, allowing users to upload multiple files of up to 50 MB each through a simple user interface. The backend was built to handle file storage, data merging, and schema mapping, and to return a unified, processed data file for download.
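The merging step on the backend can be sketched roughly as follows, assuming each uploaded file has already been loaded into a pandas DataFrame and paired with a column mapping. The `unify` function and the example column names are hypothetical; this is one plausible way to combine the files, not necessarily how Bridgette does it.

```python
import pandas as pd


def unify(frames_with_mappings):
    """Rename each frame's columns to canonical names and stack the rows.

    frames_with_mappings: list of (DataFrame, mapping) pairs, where mapping
    is {source_column: canonical_name or None}. Columns mapped to None are dropped.
    """
    renamed = []
    for df, mapping in frames_with_mappings:
        keep = {src: dst for src, dst in mapping.items() if dst is not None}
        renamed.append(df[list(keep)].rename(columns=keep))
    # ignore_index gives the combined table a fresh 0..n-1 row index
    return pd.concat(renamed, ignore_index=True)


# Usage with two hypothetical bank exports that name their columns differently
bank_a = pd.DataFrame({"Acct No": [1, 2], "Amt": [10.0, 20.0]})
bank_b = pd.DataFrame({"value": [5.0], "account": [3], "notes": ["n/a"]})
unified = unify([
    (bank_a, {"Acct No": "account_id", "Amt": "amount"}),
    (bank_b, {"value": "amount", "account": "account_id", "notes": None}),
])
```

The resulting `unified` frame could then be written out with `to_csv` or `to_excel` to produce the downloadable file.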

Challenges We Faced

No Financial Background: Our group had no background in banking data, so we had to quickly learn what it looks like, from transaction logs to load sheets, purely from online research and our sample data. Seeing how diverse real-world financial data can be was eye-opening.

Schema Conflicts: Different banks and organizations used completely different naming conventions and had different quantities of data, making column alignment very tricky. We iterated multiple times on our [Karen Insert] to refine our accuracy.

What We Learned

Team Collaboration: We worked effectively across frontend and backend roles. Even though no one had experience in banking, we were all up for the challenge and pushed through.

Enterprise Data Engineering: Through this challenge, we learned how tough real-world data analysis can be and how to build resilient systems to handle these issues.
