Mfon Ekpo

Junior Data Engineer
Uyo, NG

About

Junior Data Engineer skilled in Python, SQL, and ETL pipeline development. Reduced data processing time by 40% and improved data acquisition speed by 50% through web scraping and data orchestration work. Experienced in maintaining data integrity across PostgreSQL and MongoDB to deliver clean, reliable data for downstream analysis.

Work

Stoicbiot

Data Engineering Intern

Remote

Summary

Led data integrity initiatives and built web scraping solutions to acquire and process high-quality data from diverse web sources.

Highlights

Developed and deployed four web scraping bots using Selenium and Beautiful Soup, improving data acquisition speed by 50% for structured data extraction from diverse websites.

Optimized data extraction efficiency by 30% through the refinement of XPath queries and strategic request handling, ensuring high-quality data acquisition.

Wrote 20 Python scripts for data parsing, cleaning, and validation, reducing data processing errors by 50%.

Ensured comprehensive data integrity throughout the transfer process from web sources to storage by implementing robust validation checks and error-handling mechanisms.

Documented detailed scraping workflows and troubleshooting procedures, fostering seamless team collaboration and knowledge transfer.

OpenTeams

Project Engineering Intern

Austin, TX, US

Summary

Contributed to the Narwhals open-source data processing library, enhancing feature development and participating in community events.

Highlights

Contributed Python code and documentation to Narwhals, an open-source data processing library.

Optimized data handling functions within the library, significantly improving performance for multi-format datasets.

Actively participated in open-source community events, fostering collaboration and knowledge exchange within the data engineering ecosystem.

Successfully completed the program, earning a badge that confirmed proficiency in open-source project contributions.

Datathink.io

Project Engineering Intern

Rexburg, ID, US

Summary

Collaborated on Quarto-based data training course development, designing reusable templates and streamlining review processes for data workflows.

Highlights

Collaborated on a Quarto-based data training course, designing reusable templates and authoring GitHub/VS Code tutorials for data workflows.

Streamlined the team review process, cutting feedback cycles by 20%.

Developed comprehensive course materials focused on Visual Studio Code and GitHub, improving learning outcomes for data professionals.

Designed and implemented a reusable lesson template for the course, standardizing content delivery and improving scalability.

Dowees Corp

Data Entry Clerk

Uyo, Akwa Ibom State, Nigeria

Summary

Managed and organized datasets in CRM systems, ensuring high data accuracy and facilitating efficient data retrieval for Dowees Corp.

Highlights

Validated and organized datasets in CRM systems using SQL queries and manual cross-checks, ensuring 99% data accuracy for downstream analysis.

Implemented a meticulous data validation process, including thorough data cleansing and cross-referencing, to maintain high data quality and reliability.

Established a data categorization and classification scheme that supported efficient data retrieval and streamlined analysis.

Education

University of Uyo
Uyo, Akwa Ibom State, Nigeria

Bachelor of Engineering

Computer Engineering

Languages

English

Skills

Data Analysis & Engineering

Beautiful Soup, Data Mining, Pandas, Polars, ETL Pipelines, Web Scraping, Data Orchestration, Prefect, Apache Airflow.

Programming Languages

Python, SQL.

Databases

NoSQL, MongoDB, PostgreSQL.

Tools & Platforms

Git, Selenium, Visual Studio Code, Google Cloud Platform (GCP).

Projects

Web Scraper Bot

Summary

Built a web scraper during the Stoicbiot internship, using rotating IP addresses to avoid bot detection.

ETL Pipeline

Summary

Implemented data orchestration for a web scraping project, deploying the pipeline on Prefect Cloud.

NER Project Using spaCy

Summary

Performed Named Entity Recognition (NER) with spaCy on a 47k-row dataset of news headlines.