Data Architect · KPMG UK

Praveen Kumar Thumati

Certified Data Architect with over 6 years of experience building enterprise-scale cloud data platforms, architecting data solutions, and leading high-performing teams. Specialised in Azure, Microsoft Fabric, and Databricks — delivering transformative data strategies that drive real business value.

6+ Years Experience
12 Certifications
20+ Projects
Available for opportunities
Building Data Solutions That Scale

I architect end-to-end data platforms that transform raw data into strategic business assets. With deep expertise spanning Microsoft Azure, AWS, and Microsoft Fabric, I design and deliver solutions that empower organisations to make data-driven decisions at scale.

My approach combines technical excellence with strategic thinking, ensuring every solution aligns with business objectives while meeting the highest standards of performance, security, and governance.

Professional Experience

September 2024 — Present
Data Architect
KPMG UK · London

Leading architectural design and implementation of enterprise-scale data platforms, driving strategic initiatives across cloud infrastructure, data governance, and digital transformation.

  • Architecting multi-million-pound cloud data transformation programmes for Fortune 500 clients
  • Defining enterprise data strategy, governance frameworks, and technical roadmaps
  • Leading cross-functional teams of data engineers, analysts, and business stakeholders
  • Designing scalable solutions using Microsoft Fabric, Azure Synapse, and Databricks
  • Establishing best practices for data modelling, security (RBAC), and compliance (GDPR)

January 2022 — September 2024
Senior Data Engineer
KPMG UK · London

Designed and implemented metadata-driven ETL frameworks and led complex data migration initiatives from legacy systems to modern cloud platforms, delivering high-impact solutions for enterprise clients.

  • Built medallion architecture solutions using Microsoft Fabric and Azure Synapse Analytics
  • Developed a Master Data Management (MDM) solution using probabilistic record linkage (Fellegi-Sunter model)
  • Created reusable Python libraries and frameworks that reduced project delivery time by 70%
  • Led migrations from Dynamics 365, Dataverse, and Dynamics AX 2012 to Azure cloud platforms
  • Implemented RBAC, GDPR compliance, and data quality frameworks
  • Mentored a team of three data engineers and ran knowledge-transfer sessions

June 2020 — December 2020
Associate Consultant
Capgemini India · Bangalore

Promoted to Associate Consultant, leading technical initiatives and mentoring junior team members while delivering complex data engineering solutions.

  • Led AWS Glue ETL job development using Python and PySpark for large-scale data transformations
  • Designed and implemented semantic data models using AWS Athena, EC2, and Autosys
  • Mentored junior team members through knowledge-sharing sessions and code reviews
  • Received the XtraMile Award for outstanding performance (2020)
  • Won the Most Inspiring Idea Award in the Cloud Data Platform Contest (2020)

June 2019 — May 2020
Senior Analyst
Capgemini India · Bangalore

Promoted to Senior Analyst, taking on more complex data engineering challenges and leading implementation of cloud-based data solutions.

  • Built and optimised Azure Data Factory pipelines for cloud migrations with incremental load patterns
  • Created automated data quality alert systems using Azure Logic Apps and SQL Server
  • Developed end-to-end ETL workflows for multiple client projects
  • Collaborated with cross-functional teams to define data architecture strategies

Skills & Technologies

A comprehensive toolkit built over years of hands-on experience designing, implementing, and optimising enterprise data platforms.

Cloud Platforms

Azure Synapse Analytics · Microsoft Fabric · Azure Databricks · Azure Data Factory · Azure SQL Server · AWS Glue · AWS Athena · AWS S3 · AWS Lambda

Programming & Processing

Python · PySpark · SQL & PL/SQL · Spark SQL · Pandas · Azure DevOps · Git · CI/CD Pipelines

Data Architecture

Data Warehousing · Lakehouse Architecture · Medallion Architecture · Data Modelling · Star Schema · Kimball Methodology · Delta Lake · Data Governance

Analytics & AI/ML

Power BI · Azure Analysis Services · Machine Learning · Deep Learning · Azure Machine Learning · Azure AI Services

Certifications

Industry-recognised certifications demonstrating expertise across Microsoft Azure, Databricks, and cloud platforms.

Microsoft Certified: Fabric Data Engineer Associate

Valid: January 2025 — March 2026
Active

Microsoft 365 Certified: Copilot and Agent Administration Fundamentals

Earned: April 14, 2026
Active

Microsoft Certified: Fabric Analytics Engineer Associate

Valid: March 2024 — March 2026
Active

Microsoft Certified: Power BI Data Analyst Associate

Valid: March 2024 — March 2026
Active

Databricks Certified Data Engineer Associate

Active

Databricks Certified Developer for Apache Spark 3.0

Valid: December 2022 — Does Not Expire
Active

Microsoft Certified: Azure AI Fundamentals

Active

Microsoft Certified: Azure Data Fundamentals (DP-900)

Active

Microsoft Certified: Azure Fundamentals

Active

Microsoft Certified: Azure Data Engineer Associate

Valid: March 2024 — March 2026
Expired

AWS Certified Developer Associate

Expired

Oracle Certified Associate: Cloud Infrastructure

Expired

Awards & Achievements

Recognition for outstanding contributions, innovation, and technical excellence throughout my career.

XtraMile Award

Capgemini
2020

Recognised for outstanding performance, exceptional dedication, and consistently exceeding expectations across complex data engineering deliverables.


Most Inspiring Idea Award

Capgemini — Cloud Contest
2020

Awarded for proposing an innovative approach to modernising data infrastructure and accelerating cloud adoption that influenced project strategy.

Education

January 2021 — January 2022
MSc Data Science and Analytics
Royal Holloway, University of London

Graduated with Distinction. Advanced study in machine learning, deep learning, big data analytics, and NLP — bridging theoretical rigour with applied engineering excellence.

August 2014 — May 2018
BTech Electronics and Communication Engineering
Sri Venkateswara College of Engineering · India

Graduated with Distinction. Strong foundation in engineering principles, systems design, and programming — the launchpad for a career in data engineering.

The Spark Academy

Simplifying complex distributed computing concepts through my Medium series, "Data Engineering for Everyone."

Latest Article

The Cheat Sheet Strategy: Mastering Broadcast Joins

How to skip the hallway chaos of Shuffles and join data without leaving your seat — giving every executor a local reference sheet.

Spark Optimisation · Performance

Featured

The Great Spark Shuffle: Why Data Keeps Changing Seats

Visualising the most expensive part of Spark — manage classroom chaos using partitions and map-side combines.

Big Data · Architecture

The Orientation

The Secret Life of a Spark Job: Lazy Plans to Lightning Execution

Forget dry diagrams — understand the Teacher (Driver) and Students (Executors) through a simple classroom analogy.

PySpark · Beginners

Let's Connect

Location

Coventry, United Kingdom

LinkedIn

praveen-kumar-t