Letsapatiiso07/README.md

👨‍💻 Tiiso Letsapa

Data Engineer | AI Engineer | ML Engineer | MLOps Engineer | Cloud Engineer
Building intelligent data systems that bridge cloud infrastructure, machine learning, and healthcare innovation
LinkedIn • GitHub • Email


🚀 About Me

I'm a data engineer and machine learning specialist based in Pretoria, South Africa, with a passion for architecting scalable data pipelines and deploying production-grade machine learning systems. With 3+ years of hands-on experience, I've delivered solutions that process millions of records daily while maintaining 99.5%+ reliability and reducing operational costs by up to 85%.

What drives me: Turning complex data challenges into elegant, automated solutions that create real business value.

🎯 Quick Stats

  • 6+ production pipelines processing 1.2M+ records/hour
  • AWS-certified cloud architect specializing in serverless architectures
  • ML models running in production at 90% accuracy
  • 30-85% cost reduction across multiple projects
  • 99.5% uptime through robust monitoring and error handling

🏆 Latest Certifications & Education

| Certification / Qualification | Provider / Institution | Year |
| --- | --- | --- |
| Diploma in Financial Management | Regent Business School | 2025 |
| Data Engineer Professional | DataCamp | 2025 |
| Associate Data Engineer | DataCamp | 2025 |
| SQL Associate | DataCamp | 2025 |
| Machine Learning Engineer | DataCamp | 2025 |
| AI Engineer For Data Scientists Associate | DataCamp | 2025 |
| IT Automation with Python | Google | 2025 |
| Azure Solutions Architect | Microsoft | 2025 |
| Azure AI Engineer Associate | Microsoft | 2025 |
| Azure Data Scientist Associate | Microsoft | 2025 |

🛠️ Technical Skills

skills = {
    "languages": ["Python", "SQL", "TypeScript", "JavaScript", "Bash"],
    "cloud": ["AWS Lambda", "S3", "DynamoDB", "API Gateway", "Step Functions", "Kinesis"],
    "data_engineering": ["Airflow", "Pandas", "NumPy", "ETL Pipelines", "Real-time Streaming"],
    "ml_ai": ["TensorFlow", "Scikit-Learn", "XGBoost", "CNNs", "Transfer Learning"],
    "devops": ["Docker", "CI/CD", "GitHub Actions", "Infrastructure as Code"],
    "frontend": ["React", "TailwindCSS", "Vite"]
}

🌟 Featured Projects


💳 Credit Risk Assessment Model

Fintech | Machine Learning | Risk Modeling | XGBoost
End-to-end machine learning system for predicting loan default risk using borrower financial and demographic data.

🎯 Key Features & Results:

  • AUC-ROC: 0.945 (excellent predictive performance)
  • Comprehensive EDA, outlier handling, and feature engineering (e.g., debt-to-income ratio)
  • Model comparison: XGBoost outperformed Logistic Regression and KNN
  • Interpretable insights via feature importance and visualizations
  • Production-ready pipeline with preprocessing and evaluation metrics

📊 Technical Highlights:

# Top predictive features identified:
- Loan grade & interest rate (highest risk drivers)
- Debt-to-income ratio (engineered feature)
- Home ownership status (renters higher risk)
- Loan amount and percent income
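As a rough illustration of two ideas mentioned above, the engineered debt-to-income ratio and the AUC-ROC metric, here is a minimal standard-library sketch. The function names and the toy numbers are hypothetical and are not taken from the project code, which relies on Scikit-learn and XGBoost rather than hand-rolled metrics.

```python
from typing import Sequence


def debt_to_income(monthly_debt: float, monthly_income: float) -> float:
    """Engineered feature: share of income already committed to debt."""
    if monthly_income <= 0:
        # Hypothetical convention for zero or missing income.
        return float("inf")
    return monthly_debt / monthly_income


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random defaulter (label 1)
    receives a higher score than a random non-defaulter (label 0).
    Ties count as half a win. O(n^2), fine for small illustrations."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented purely for illustration.
labels = [0, 0, 1, 1]
risk_scores = [0.1, 0.4, 0.35, 0.8]
print(debt_to_income(1500, 5000))   # 0.3
print(auc_roc(labels, risk_scores))  # 0.75
```

An AUC-ROC of 0.945 under this reading means that a randomly chosen defaulter outranks a randomly chosen non-defaulter about 94.5% of the time.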

💡 Why It Matters:

  • Demonstrates real-world fintech application of ML for credit scoring
  • Handles class imbalance and provides actionable risk insights
  • Strong addition to financial domain expertise (complements Diploma in Financial Management)
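The README does not say how class imbalance is handled. One common option with XGBoost is the `scale_pos_weight` parameter, usually set to the ratio of negative to positive samples. The sketch below only computes that ratio; treating it as the project's actual approach is an assumption, and the helper name is hypothetical.

```python
from collections import Counter
from typing import Sequence


def negative_to_positive_ratio(labels: Sequence[int]) -> float:
    """Ratio of non-default (0) to default (1) samples.

    A typical starting value for XGBoost's scale_pos_weight,
    which up-weights the rarer positive class during training."""
    counts = Counter(labels)
    if counts[1] == 0:
        raise ValueError("no positive samples; ratio is undefined")
    return counts[0] / counts[1]


# Toy label vector: 9 repaid loans for every default (invented numbers).
toy_labels = [0] * 9 + [1]
print(negative_to_positive_ratio(toy_labels))  # 9.0
```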

🔗 Tech Stack: Python • Pandas • Scikit-learn • XGBoost • Matplotlib • Seaborn
View Project →

🏥 Medical Image Analysis with Deep Learning

Healthcare AI | Computer Vision | Transfer Learning
A production-ready medical imaging classification system leveraging CNNs and transfer learning for diagnostic assistance.
🎯 Key Features:

  • Multiple architectures: Custom CNN, VGG16, ResNet50, InceptionV3
  • Synthetic medical image generation (X-Ray, Brain MRI)
  • Grad-CAM visualization for model interpretability
  • Complete training pipeline with data augmentation
  • Real-time prediction with confidence scores
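"Prediction with confidence scores" usually means turning a classifier's raw outputs (logits) into probabilities and reporting the top one. A minimal sketch of that step, assuming a softmax output layer; the class names and logit values are hypothetical, and in the real project this would be handled by the Keras model itself.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_confidence(
    logits: Sequence[float], class_names: Sequence[str]
) -> Tuple[str, float]:
    """Return the most likely class and its softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical classes and logits, for illustration only.
label, confidence = predict_with_confidence(
    [2.0, 0.5, 0.1], ["normal", "abnormal", "inconclusive"]
)
print(f"{label}: {confidence:.1%}")
```

A softmax probability is not a calibrated clinical confidence, so a score like this is best read as a ranking signal rather than a diagnosis.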

🔗 Tech Stack: TensorFlow • Keras • OpenCV • Scikit-learn • Matplotlib • Seaborn
View Project →

🏎️ F1 Race Winner Prediction System

Machine Learning | Sports Analytics | Feature Engineering
End-to-end ML system predicting Formula 1 race outcomes with 90% accuracy.
🎯 Results:

  • 90% accuracy on race winner predictions
  • 95%+ accuracy on podium predictions
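The two metrics above can be defined in more than one way, and the README does not spell out which definition is used. The sketch below assumes winner accuracy is the share of races where the predicted winner matches, and podium accuracy is the share of actual top-3 finishers that also appear in the predicted top 3. Driver labels are placeholders.

```python
from typing import Sequence


def winner_accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of races where the predicted winner was the actual winner."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need equal-length, non-empty sequences")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


def podium_accuracy(
    predicted: Sequence[Sequence[str]], actual: Sequence[Sequence[str]]
) -> float:
    """Fraction of actual podium finishers found in the predicted podium,
    ignoring the order within the top 3 (an assumed definition)."""
    correct = 0
    total = 0
    for pred, act in zip(predicted, actual):
        correct += len(set(pred) & set(act))
        total += len(act)
    if total == 0:
        raise ValueError("no podium results supplied")
    return correct / total


# Placeholder drivers A-D over three invented races.
print(winner_accuracy(["A", "B", "C"], ["A", "B", "D"]))
print(podium_accuracy([["A", "B", "C"]], [["A", "C", "D"]]))
```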

🔗 Tech Stack: Python • XGBoost • FastF1 API • Pandas • Scikit-learn • Matplotlib

📊 DataOps Studio

Interactive Dashboard | Modern UI | Real-time Monitoring
A modern, interactive dashboard showcasing Data Engineering & MLOps capabilities.
✨ Features:

  • 📊 Interactive data visualizations
  • 🌙 Dark mode with responsive design
  • ⚡ Automated CI/CD

🔗 Tech Stack: React • TypeScript • TailwindCSS • Recharts • Vite • GitHub Actions

⚡ Other Production Systems

| Project | Tech Stack | Impact |
| --- | --- | --- |
| Weather Analytics Pipeline | AWS Lambda, S3, DynamoDB | 99.5% uptime, <$0.10/month |
| Inventory Optimization Engine | Python, XGBoost, Scikit-Learn | 92% accuracy, 30% cost reduction |
| Cryptocurrency ETL | Python, Airflow, REST APIs | 85% time savings |
| IoT Processing System | Python, SQLite, JavaScript | 1.2M+ records/hour |
| Financial Automation | Airflow, Pandas, PostgreSQL | 3 days → 4 hours |
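To put two of the impact figures in more familiar units, here is a small worked calculation. It assumes "3 days" means 72 elapsed hours (it could also mean working days, which would change the result); the helper names are illustrative.

```python
def records_per_second(records_per_hour: float) -> float:
    """Convert an hourly throughput figure to a per-second rate."""
    return records_per_hour / 3600


def time_reduction(before_hours: float, after_hours: float) -> float:
    """Fractional reduction in run time, e.g. 0.5 means half the time."""
    if before_hours <= 0:
        raise ValueError("before_hours must be positive")
    return 1 - after_hours / before_hours


# 1.2M records/hour is roughly 333 records per second.
print(round(records_per_second(1_200_000), 1))

# 72 hours down to 4 hours is roughly a 94% reduction (an 18x speed-up).
print(round(time_reduction(72, 4), 3))
```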

🏗️ System Architecture Philosophy

I design cloud-native, serverless-first, and event-driven architectures for maximum scalability and cost efficiency:

┌─────────────────────┐   ┌──────────────────────┐   ┌─────────────────────┐
│ 📊 Data Sources     │──▶│ 🔄 Ingestion Layer   │──▶│ 🏗️ Processing       │
│                     │   │                      │   │                     │
│ • REST/GraphQL APIs │   │ • AWS API Gateway    │   │ • AWS Lambda        │
│ • IoT Sensors/MQTT  │   │ • Amazon Kinesis     │   │ • Step Functions    │
│ • Database CDC      │   │ • AWS EventBridge    │   │ • Glue ETL Jobs     │
│ • File Uploads      │   │ • SQS/SNS Queues     │   │ • Error Handling    │
│ • Streaming Data    │   │ • Scheduled Triggers │   │ • Dead Letter Queue │
└─────────────────────┘   └──────────────────────┘   └─────────────────────┘
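One way the queue, Lambda, and dead-letter pieces of this diagram can fit together is an SQS-triggered Lambda that reports which messages failed, so that only those are retried and eventually moved to the dead letter queue. The sketch below uses the Lambda partial batch response format (`batchItemFailures`). It assumes `ReportBatchItemFailures` is enabled on the event source mapping and that a redrive policy is configured on the queue; the payload fields and `process_record` helper are hypothetical.

```python
import json
from typing import Any, Dict, List


def process_record(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical transform: validate and normalise one sensor reading."""
    if "sensor_id" not in payload or "value" not in payload:
        raise ValueError("missing required field")
    return {"sensor_id": str(payload["sensor_id"]), "value": float(payload["value"])}


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, List[Dict[str, str]]]:
    """Process an SQS batch; return the IDs of messages that failed.

    Messages listed in batchItemFailures become visible on the queue
    again and, after the queue's maxReceiveCount, go to the DLQ."""
    failures: List[Dict[str, str]] = []
    for record in event.get("Records", []):
        try:
            process_record(json.loads(record["body"]))
        except (ValueError, KeyError, TypeError):
            # Bad JSON, missing fields, or non-numeric values end up here.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


# Local smoke test with a hand-built event (no AWS account needed).
sample_event = {
    "Records": [
        {"messageId": "m-1", "body": json.dumps({"sensor_id": 7, "value": "21.5"})},
        {"messageId": "m-2", "body": "not json"},
    ]
}
print(handler(sample_event))  # {'batchItemFailures': [{'itemIdentifier': 'm-2'}]}
```

Returning failures per message, instead of raising for the whole batch, avoids reprocessing records that already succeeded.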

📊 Performance Metrics

| Metric | Target | Achieved |
| --- | --- | --- |
| Pipeline Uptime | 99.9% | 99.5% ✅ |
| Processing Latency | <100ms | <200ms ⚡ |
| Cost per TB | <$10 | $15 💰 |
| Error Rate | <0.1% | <0.5% 🎯 |
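Uptime percentages are easier to compare as a yearly downtime budget. A small worked calculation, assuming a 365-day year and continuous (24/7) operation; the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def annual_downtime_hours(uptime_percent: float) -> float:
    """Hours per year a system may be down at a given uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


# 99.9% allows about 8.8 hours of downtime a year; 99.5% allows about 43.8.
for pct in (99.9, 99.5):
    print(f"{pct}% uptime -> {annual_downtime_hours(pct):.1f} h/year")
```

On these assumptions, the gap between the 99.9% target and the 99.5% achieved figure is roughly 35 hours of downtime per year.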

💡 What Makes Me Tick

  • Local meets global: Combining South African insights with cloud-scale infrastructure
  • Passion-driven projects: Built an F1 predictor merging hobbies with tech
  • Serverless advocate: If it can run without a server, I'm interested
  • Healthcare AI: Applying ML to medical imaging
  • Fintech & Finance: Leveraging financial management knowledge with ML for credit risk modeling
  • Continuous learner: Exploring real-time streaming and advanced MLOps
My goal: Build data systems so reliable, they become invisible


🤝 Let's Connect

I'm always excited to collaborate on data engineering, ML, fintech, or cloud projects. Let's talk!
📧 Email: [email protected]
🔗 LinkedIn: linkedin.com/in/tiiso-letsapa-664990209
💻 GitHub: github.com/Letsapatiiso07


⭐️ "Building intelligent systems that turn data into decisions" 🚀
Based in Pretoria, South Africa 🇿🇦 | Open to remote opportunities worldwide 🌍

Pinned

  1. Letsapatiiso07 (Public)

  2. aws-weather-pipeline (Public)

    Real-time weather data pipeline using AWS services: automated ETL for South African cities

    Python 1

  3. sales-analytics-pipeline (Public)

    Python

  4. inventory-management (Public)

    Python

  5. crypto-chatbot (Public)

    Python

  6. Data-Pipeline-with-ApacheAirflow (Public)

    Python