Andrew Collier
Verified Expert in Engineering
Data Scientist and Software Developer
Andrew picked up programming and data analysis skills while working as an experimental physicist. He now works as a data scientist. His tools of choice are R and Python, with a lot of SQL thrown in for good measure. Andrew also uses Docker extensively and has worked with both AWS and Azure. He has a particular passion for web scraping and is also an accomplished speaker and trainer.
Preferred Environment
Bash, Linux, Git, Jupyter, Docker, Python, Amazon Web Services (AWS), SQL, R
The most amazing...
...system I've developed has been running autonomously in Antarctica for over a decade.
Work Experience
Web Crawling Specialist
Unrival Limited
- Developed a web scraper for extracting data from a large social media platform for a B2B marketing product.
- Generated automated reports in HTML and PDF using scraped data.
- Used the Watson APIs to parse and analyze scraped data.
- Used the Bing Maps API to geolocate locations in scraped data.
- Developed a flexible web scraping framework to gather data from over 100 different companies' C-suite pages.
Founder | Data Scientist
Fathom Data
- Cleaned, prepared, and analyzed data in both R and Python.
- Built machine learning and deep learning models in both R and Python. Many of the models were subsequently deployed behind APIs.
- Managed a team of data scientists and coordinated and interfaced with clients.
- Automated documentation, using R Markdown to generate reports and presentations.
- Developed and maintained a number of packages for R and Python.
- Prepared and gave lectures and presentations, including training and speaking at conferences and workshops.
Freelance Data Scientist
Toptal
- Built a robust web scraper for extracting data on persons and organizations from LinkedIn and Sales Navigator.
- Constructed PostgreSQL database for storing medical and drug data. Implemented ETL pipeline.
- Used Python and spaCy to extract salient information from LinkedIn profiles and blog posts.
Founder/Data Scientist
Exegetic Analytics
- Conducted data analyses for clinical trials.
- Developed a conformance analysis system for use in the printing industry.
- Implemented a Kagi Charts indicator in MQL4.
- Analyzed the effects of news events on FOREX trading using data scraped from myfxbook.
- Initiated Durban R User Group and Durban Data Science Meetup.
Python Engineer
HumanOS
- Designed and implemented a database and deployed it on Amazon RDS.
- Created a Flask API to interface the database to desktop and mobile apps.
- Integrated the API with a third-party API (WeFitter) to gather wearable data.
Python Data Analyst and Tech Writer | Loom Tutorial Screencasts
Domino Data Lab
- Created videos and tutorial content for existing and new features.
- Updated and maintained documentation. Added automation to the website build.
- Provided feedback and bug reporting on new features.
R Engineer - Shiny App
BluePath Solutions LLC.
- Developed multiple Shiny apps for interacting with data.
- Developed a web crawler to extract pharmaceutical pricing data.
- Designed and built a database using PostgreSQL; deployed on Amazon RDS.
Content Creator
DataCamp
- Designed the content of an online course about machine learning with Spark.
- Developed the course content, script, and associated material.
- Created slides, recorded video and audio, and edited content.
- Continued maintenance of the course and responded to issues raised by students.
Senior Data Scientist
Derivco
- Coded a game recommendation engine.
- Developed a game/player anomaly detection system.
- Automated routine analyses.
- Automated report generation.
- Initiated a Data Science Working Group.
Honorary Senior Lecturer
University of KwaZulu-Natal
- Developed an autonomous observation system for experiments in Antarctica.
- Applied machine learning techniques to lightning distributions.
- Mentored students in R and data analysis.
- Presented analytical results at numerous international conferences.
- Published research results in international journals.
Experience
{emayili}
https://github.com/datawookie/emayili
The package has minimal dependencies and exposes a tidy API for writing and sending emails. It has detailed documentation and an extensive test suite.
The package has also been the subject of a number of blog posts and conference/meetup talks.
Trundler R Package
https://github.com/datawookie/trundler
Trundler is a service that aggregates retail price data acquired via web scraping. The data are available via an API. This package provides a consistent set of functions for accessing the API from R.
Trundler Python Package
https://github.com/datawookie/trundlerpy
Trundler is a service that aggregates retail price data acquired via web scraping. The data are available via an API. This package provides a consistent set of functions for accessing the API from Python.
Scientific Advisor
Skills
Languages
Python, SQL, Bash, R, Octave, C++, CSS, HTML, Sed, JavaScript
Libraries/APIs
REST APIs, Beautiful Soup, Bing API, ArcGIS, Pandas
Platforms
Linux, RStudio, Docker, Amazon Web Services (AWS), Amazon EC2
Other
Machine Learning, Web Scraping, Task Automation, Regular Expressions, Visualization, Statistics, Data Analysis, Artificial Intelligence (AI), Technology Consulting, Data Visualization, Technical Writing, Algorithms, Bayesian Statistics, Unstructured Data Analysis, Web Crawlers, Large-scale Web Crawlers, APIs, Geospatial Data, WebSockets, Amazon RDS
Frameworks
Selenium, Scrapy, Flask, Django, RStudio Shiny, Spark
Tools
Microsoft Excel, Jupyter, Git, MATLAB
Paradigms
Automation, Data Science
Storage
Amazon S3 (AWS S3), MongoDB, Neo4j, MySQL, PostgreSQL
Education
Ph.D. Degree in Space Physics
Royal Institute of Technology - Stockholm, Sweden
M.Sc. Degree in Nuclear Physics
University of Potchefstroom - Potchefstroom, South Africa
B.Sc. (Hons) Degree in Physics & Mathematics
University of Natal - Durban, South Africa
Certifications
PhD
Royal Institute of Technology