José Molano
Verified Expert in Engineering
Data Engineer and Developer
José is a data engineer with more than six years of experience in extract, transform, and load (ETL) pipeline development, data warehouse and data lake design, query performance tuning, and cloud database infrastructure management. With a background spanning multiple domains, José has designed and built scalable data platforms in contexts like ticket exchange and resale, urban traffic, customer service, and tax evasion analysis.
Preferred Environment
Apache Airflow, Apache Spark, Snowflake, BigQuery, Amazon RDS, MySQL, Python, Redshift, Terraform, AWS Glue
The most amazing product I've built is an automation pipeline for migrating terabytes of data from a MongoDB database to a data warehouse using Apache Airflow and AWS.
Work Experience
Senior Data Engineer
Globant
- Executed job automation and scheduling with Airflow to support monthly accounting reviews.
- Developed ETL pipelines for ingesting external data sources, such as MongoDB, into Snowflake using Apache Airflow and AWS.
- Planned and executed transactional database migrations using AWS Database Migration Service (DMS).
- Implemented Amazon CloudWatch and Datadog monitors with OpsGenie integration on critical database metrics such as CPU usage, memory consumption, and replication lag.
- Managed the Amazon Relational Database Service (Amazon RDS) infrastructure and related resources, such as Amazon Virtual Private Cloud security groups and parameter groups, in source control with Terraform.
- Improved SQL query performance in Amazon Aurora MySQL, reducing execution times and cloud infrastructure costs by introducing indexes and partitions on key tables.
Data Engineer
SKG Tecnologia
- Implemented streaming ingestion pipelines for urban traffic mobility data using Apache Kafka and Python.
- Designed traffic analytics applications based on BigQuery.
- Provisioned the SQL database cloud infrastructure with PostgreSQL and managed it using Google Cloud Platform.
- Designed dashboards and processed data related to urban mobility traffic speed analysis.
Data Technical Lead
Alianza CAOBA
- Designed and developed a big data lab environment using VirtualBox, Apache Hadoop, Apache Ambari, and Cloudera, reducing feature development and deployment time.
- Developed a tool for anonymizing sensitive customer information on big data sets using Apache Spark and Apache Hive.
- Devised and developed health analytics applications using Amazon Elastic Compute Cloud (Amazon EC2), Amazon S3, and Amazon Athena.
- Provisioned the SQL database virtual infrastructure with PostgreSQL and managed the database administration. Designed an entity-relationship model oriented to customer service and retail use cases and provisioned database access and user grants.
- Designed and developed Microsoft Power BI dashboards for analyzing data produced by natural language processing (NLP) machine learning models.
Big Data Developer
Alianza CAOBA
- Created an automation pipeline using pandas to calculate the expected tax amount for construction projects in Bogotá, Colombia.
- Developed an automation pipeline using Apache Spark to clean and process urban traffic mobility data so that it could be used by an interactive dashboard.
- Managed the big data infrastructure, providing new services such as MongoDB, Apache Spark, Apache Hive, and Hadoop Distributed File System (HDFS).
- Designed and developed Microsoft Power BI dashboards for analyzing urban transportation and mobility data.
Experience
Adaptable Daily Living Activity Identification from Sensor Data Streams
https://www.sciencedirect.com/science/article/pii/S1877050918304551
The proposed system is tested and validated under a dataset from a real user. The results show that it can operate adequately in a real scenario with the respective constraints.
The main contribution of this project is a system for ADL detection that can adapt to user behavior changes without retraining the model, considering sensor failures, and preserving user privacy.
ADACOP: A Big Data Platform for Open Government Data
Low-cost and low-precision 2D tracking system for virtual reality and augmented reality applications
https://ceur-ws.org/Vol-1957/CoSeCiVi17_paper_9.pdf
Skills
Languages
Snowflake, Python, SQL, Java, JavaScript
Libraries/APIs
Pandas, NumPy, OpenCV, Node.js
Tools
Apache Airflow, GitHub, Microsoft Power BI, Tableau, BigQuery, Terraform, AWS Glue, Cloudera, Amazon Elastic Container Service (Amazon ECS), Amazon CloudWatch, Weka, Boto, Amazon Athena
Paradigms
ETL, Business Intelligence (BI), REST
Platforms
Amazon Web Services (AWS), AWS Lambda, Google Cloud Platform (GCP), Apache Kafka, Docker, Azure
Storage
MySQL, Databases, Data Pipelines, Relational Databases, MongoDB, JSON, MariaDB, Database Migration, Redshift, Apache Hive, HDFS, PostgreSQL, Amazon S3 (AWS S3), Data Lakes, Elasticsearch
Other
Amazon RDS, Data Engineering, Data, CSV File Processing, Data Analysis, CSV, Python Boolean, Boolean Search, ETL Tools, Data Migration, Data Management, Database Optimization, AWS Database Migration Service (DMS), Data Visualization, Reporting, Data Transformation, Data Analytics, Office 365, Reports, Visualization, Streaming Data, APIs, Big Data, OCR, High Availability Disaster Recovery (HADR), CRM APIs, Excel 365, Amazon Neptune, AWS Certified Solution Architect
Frameworks
Apache Spark, Spark, Hadoop, Flask, AWS HA
Education
Master's Degree in Computer Engineering
University of The Andes - Bogotá, Colombia
Bachelor's Degree in Computer Engineering
University of The Andes - Bogotá, Colombia
Certifications
AWS Certified Cloud Practitioner
Amazon Web Services