CV: Alok Chaturvedi

Contact

  • Name: Alok Chaturvedi
  • Email: alok92kattayayn@gmail.com
  • LinkedIn: Alok Chaturvedi
  • Work Experience: 8 years, 11 months

Technical Skills

  • Programming Languages: Proficient in Python; familiar with JavaScript and C++
  • Data Engineering and Analytics Tools: Proficient in PySpark, DBT, Pandas, and Scikit-Learn; familiar with Alteryx, Seaborn, and TensorFlow
  • Cloud Technologies: Proficient in AWS Lambda, S3, IAM, Secrets Manager, and SQS; familiar with EC2, DynamoDB, Glue, and Step Functions
  • Linux / Unix Scripting: Proficient in Bash shell scripting
  • Back-end Tool: Flask
  • Databases: Proficient in Exasol, Snowflake, and Vertica DB; familiar with MySQL, Oracle, PostgreSQL, MongoDB, and DynamoDB
  • CI / CD and Orchestration: Proficient in GitLab, GitHub, and Airflow; familiar with AWS Step Functions

Summary

  • Software Developer with 8+ years of experience in Data Engineering using Python, PySpark, SQL, DBT, and Bash.
  • Experienced in creating data pipelines using Python or PySpark together with shell scripting to bring in data from various sources such as the Salesforce API, Salesforce Marketing Cloud, SAP HANA, other databases (MySQL, Exasol, Snowflake, Vertica), microservice providers (web APIs), and flat files from SFTP.
  • Experienced in Data Vault 2.0 and data modeling for BI and Machine Learning.
  • Experienced in handling structured, semi-structured (JSON and XML), and unstructured data.
  • Experienced in cleansing data through text mining using regular expressions.
  • Experienced in Machine Learning assignments using TensorFlow, Scikit-Learn, NumPy, Pandas, etc.
  • Experienced in working both on on-premise servers and with AWS services.
  • Experienced in automating routine tasks such as data reconciliation, file generation for downstream systems, and running jobs using Python and Airflow.
  • Knowledge of core computer science topics such as HTML, networking, data structures, and algorithms.
  • Quick learner and good performer in both team and independent work environments.
  • Trained in Public Speaking and Communication by Toastmasters International.
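
As an illustration of the regex-based cleansing mentioned above, a minimal sketch using only the standard library; the field content and patterns are hypothetical, not taken from any specific project:

```python
import re

def clean_record(raw: str) -> str:
    """Normalize a raw text field: strip markup remnants,
    mask email addresses, and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", raw)                          # drop tag-like markup
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "<email>", text)   # mask email addresses
    text = re.sub(r"\s+", " ", text).strip()                     # collapse whitespace
    return text

print(clean_record("<b>Contact:</b>  john.doe@example.com  today"))
# Contact: <email> today
```

In a real pipeline a function like this would run over each text column before the cleansed rows are written onward.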

Work Experience

Employer: IBM
Current Employer: Yes
From: 27-Feb-2024
Till: Present

Employer: TTEC Digital
Current Employer: No
From: 25-April-2022
Till: 23-Feb-2024

Employer: Mindtree
Current Employer: No
From: 12-April-2021
Till: 22-April-2022

Employer: Infosys
Current Employer: No
From: 13-March-2017
Till: 12-April-2021

Professional Experience

Project # 7: Navify Marketplace and Portal (Roche)
Role: Lead Data Engineer
Duration: April 2025 – Present

Overview: Roche has a product named navify Analytics for Molecular Lab
(https://marketplace.roche.com/products/navify-analytics-for-molecular-lab), which provides
analytics capabilities for molecular tests (HIV, COVID, etc.). My role is to build the data
pipelines that support this product.

Responsibilities

  • Building pipelines to load test result data from S3 into Vertica DB.
  • Parsing the JSON and XML result files.
  • Creating a data mart for the results data to be used in Tableau dashboards.
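
A minimal sketch of the kind of result-file parsing described above, using only the standard library; the field names and document layout are illustrative assumptions, not the actual navify schema:

```python
import json
import xml.etree.ElementTree as ET

def parse_json_result(payload: str) -> dict:
    """Flatten one JSON test-result document into a row for loading."""
    doc = json.loads(payload)
    # Field names here are hypothetical stand-ins for the real schema.
    return {"test_id": doc["testId"], "assay": doc["assay"], "result": doc["result"]}

def parse_xml_result(payload: str) -> dict:
    """Flatten one XML test-result document into the same row shape."""
    root = ET.fromstring(payload)
    return {
        "test_id": root.findtext("testId"),
        "assay": root.findtext("assay"),
        "result": root.findtext("result"),
    }

json_row = parse_json_result('{"testId": "T1", "assay": "HIV-1", "result": "negative"}')
xml_row = parse_xml_result(
    "<res><testId>T2</testId><assay>SARS-CoV-2</assay><result>positive</result></res>"
)
```

Rows flattened this way would then be bulk-loaded into Vertica (for example via its COPY interface) rather than inserted one at a time.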

Project # 6: Navify Analytics for Molecular Lab (Roche)
Role: Lead Data Engineer
Duration: March 2024 – May 2025

Overview: Roche has a product named navify Analytics for Molecular Lab which provides
analytics capabilities for molecular tests (HIV, COVID, etc.). My role was to build the data pipelines that support this product.

Responsibilities

  • Building Glue pipelines to load test result data from S3 into Vertica DB.
  • Parsing the JSON and XML result files.
  • Creating a data mart for the results data to be used in Tableau dashboards.

Project # 5: Vehicle Safety Owners Engagement (GM)
Role: Data Engineer
Duration: May 2022 – February 2024

Overview: GM wanted to analyze the effectiveness of campaigns motivating vehicle owners to
complete critical repairs on all recalled models. Data users have to create frequent reports
in order to maximize repairs and minimize campaign cost. As a data engineer, I built the
pipeline that reads the repair data, engagement data, and vehicle master data and calculates
the repair rate as well as the effectiveness of the campaign, which is then used to generate
the reports.

Responsibilities

  • Loading data to and from the S3 bucket.
  • Writing PySpark code to join, aggregate, and run analytics.
  • Managing the repository.
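
The repair-rate calculation described in the overview can be sketched in plain Python; the real pipeline used PySpark over much larger datasets, and the record shapes below are illustrative assumptions:

```python
def repair_rate(vehicles: list, repairs: list) -> float:
    """Fraction of recalled vehicles (by VIN) that have a completed repair."""
    recalled = {v["vin"] for v in vehicles if v["recalled"]}
    repaired = {r["vin"] for r in repairs if r["vin"] in recalled}
    return len(repaired) / len(recalled) if recalled else 0.0

# Hypothetical sample data: two recalled vehicles, one of which was repaired.
vehicles = [
    {"vin": "V1", "recalled": True},
    {"vin": "V2", "recalled": True},
    {"vin": "V3", "recalled": False},
]
repairs = [{"vin": "V1"}, {"vin": "V3"}]
print(repair_rate(vehicles, repairs))  # 0.5
```

In PySpark the same logic is a join of the repair data against the recalled subset of the vehicle master data, followed by an aggregate.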

Project # 4: SAS Code Migration for Vehicle Safety Owners Engagement and Equity Mining (GM)
Role: Data Engineer
Duration: May 2022 – December 2023

Overview: GM had its analytics system built using SAS on an in-house server, which was used both to store the data and to run the SAS procedures. For cost reasons, GM planned to decommission the SAS server and adopted the open-source data processing engine PySpark. I worked with my team to convert legacy SAS code into PySpark code.

Responsibilities

  • Writing PySpark code to replicate the functionality of the legacy SAS code.
  • Testing and performing UAT against the SAS output.
  • Managing the code repository.
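
The UAT step above amounts to comparing the migrated pipeline's output against the legacy SAS output row by row. A simplified sketch of that comparison; in practice it ran over CSV extracts, and the column names here are hypothetical:

```python
def diff_outputs(sas_rows: list, pyspark_rows: list, key: str = "id") -> set:
    """Return the set of key values whose rows differ between the two outputs."""
    sas = {row[key]: row for row in sas_rows}
    new = {row[key]: row for row in pyspark_rows}
    # A key mismatches if it is missing on either side or its row contents differ.
    return {k for k in sas.keys() | new.keys() if sas.get(k) != new.get(k)}

sas_rows = [{"id": 1, "total": 10.0}, {"id": 2, "total": 7.5}]
pyspark_rows = [{"id": 1, "total": 10.0}, {"id": 2, "total": 7.0}]
print(diff_outputs(sas_rows, pyspark_rows))  # {2}
```

An empty result set is the UAT pass condition; any remaining keys point at rows where the converted PySpark logic diverges from SAS.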

Project # 3: B2B Digital Wholesale Analytics (Adidas)
Role: Data Engineer
Duration: September 2018 – April 2022

Overview: The goal of the DWS Analytics team is to provide a state-of-the-art, accurate reporting
framework for the B2B business team. As part of the reporting we track purchase
information, cart and wishlist information, dimension data (product, customer, etc.), digital
shelf data (from customer websites), survey data, and more.
We create pipelines to connect various data sources to our data warehouse, an Exasol
database. Our pipelines track incremental data as well. We do cleansing and
accumulation inside Exasol, then create view objects that bring together relevant data
held in a star schema.
The MicroStrategy resources on the team create reports and interactive dashboards using the
views created in Exasol.

Responsibilities

  • Designing the functional paradigm for the whole data ingestion.
  • Writing Python code to bring in data from SAP HANA, Exasol, Salesforce, and other data sources.
  • Building replacement pipelines for the older Alteryx data pipelines.
  • Cleansing data using PySpark and Pandas before saving it to the S3 bucket.
  • Leading the team and providing debugging assistance.
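
Incremental tracking of the kind mentioned in the overview is commonly done with a high-watermark: remember the newest timestamp already loaded and fetch only rows past it. A minimal sketch under that assumption; the row shape and timestamp field are illustrative:

```python
def extract_incremental(rows: list, last_watermark: str):
    """Return rows newer than the stored watermark, plus the updated watermark."""
    fresh = [r for r in rows if r["updated_at"] > last_watermark]
    new_watermark = max((r["updated_at"] for r in fresh), default=last_watermark)
    return fresh, new_watermark

# Hypothetical source rows; ISO dates compare correctly as strings.
rows = [
    {"id": 1, "updated_at": "2022-01-01"},
    {"id": 2, "updated_at": "2022-02-01"},
    {"id": 3, "updated_at": "2022-03-01"},
]
fresh, wm = extract_incremental(rows, "2022-01-15")
print(len(fresh), wm)  # 2 2022-03-01
```

The new watermark is persisted after a successful load so the next run picks up only what changed since.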

Project # 2: ETL Brazil (Avon)
Role: ETL Engineer
Duration: August 2017 – August 2018

Overview: Avon sells cosmetic products. The Avon Brazil ETL team is interested in data generated inside Brazil for BI purposes. Data related to purchases, materials, campaigns, surveys, etc. is loaded
onto a common SFTP server. We use Bash shell scripting to fetch data from the SFTP server to a local server, from which it is loaded into an Oracle database via PL/SQL stored procedures that use the external table capability to read flat files.

Responsibilities

  • Writing stored procedures for loading flat files into the database.
  • Contributing to the design of the ETL solution.
  • Writing Bash scripts for data file movement.
  • Maintaining the code repository on the server.
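
The file-movement scripts described above follow a common staging pattern: pick up new flat files from an inbound directory and move them to a landing area for the external-table load. A Python rendering of that pattern; the real scripts were in Bash, and the directory and file names are hypothetical:

```python
import shutil
import tempfile
from pathlib import Path

def stage_files(inbound: Path, landing: Path, pattern: str = "*.dat") -> list:
    """Move flat files matching the pattern from inbound to landing."""
    landing.mkdir(parents=True, exist_ok=True)
    moved = []
    for f in sorted(inbound.glob(pattern)):
        shutil.move(str(f), str(landing / f.name))
        moved.append(f.name)
    return moved

# Demo on a throwaway directory tree.
root = Path(tempfile.mkdtemp())
(root / "inbound").mkdir()
(root / "inbound" / "sales_20180801.dat").write_text("1|foo\n")
(root / "inbound" / "readme.txt").write_text("ignore me\n")
moved = stage_files(root / "inbound", root / "landing")
print(moved)  # ['sales_20180801.dat']
```

The pattern filter mirrors what the Bash scripts did with shell globs, so only data files reach the landing directory that Oracle's external tables read from.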

Project # 1: Infosys Technical Training
Role: Trainee
Duration: March 2017 – July 2017

Overview: Fresh out of college, the Infosys training provided practical, business-centric knowledge for tackling problems and providing solutions in a live project. As part of this
training I revised my technical skills and built larger projects
requiring four team members and four days of time. In the same training I also learned about
business communication and creative thinking to tackle the challenges of professional life.