Main

Lucas Moraes

I am a data scientist with experience in very different settings (academia, government and industry). This experience has given me strong communication and soft skills. I have delivered robust data products in all three settings and am known for my organization and reproducible code.


Relevant professional experience

Data Scientist

Carrefour


2023 - Present

  • Sales forecasting for promotional events in stores across the country using supervised learning.
  • Scalable experiment design and modeling using PySpark and Google Cloud Platform.
  • Reduced forecasting MAPE by 30% compared to the baseline.

Senior Data Analyst

PicPay


2021 - 2022

  • Data analysis and statistical support for the behavioral segmentation model of customers (Gaussian Mixture Models).
  • Data analysis, experimental design, data wrangling and experiment monitoring for A/B testing of features in the suggestions section of the app home page.
  • Data analysis for integrity checks of models in production. Machine learning modeling for client propensity studies.
  • Analytical pipelines using GitHub + Databricks + Airflow for the creation of custom on-demand tables in the data lake.

Data Scientist

Melhor Envio


2013 - 2018

  • End-to-end development of a machine learning client segmentation model for the company (K-prototypes).
  • Statistical analyses of client data for real-time monitoring of user activity to aid decision making.
  • Conversion of arbitrary business metrics to robust indicators using statistical techniques (e.g. bootstrapping and hypothesis testing).
  • Dashboard modeling and development using Looker/LookML for technical and non-technical audiences.









Technical stack

R

Python

SQL

Spark

Statistics

Machine Learning

Data Viz

Fluent English

Soft Skills

Education

MSc, Genetics

Rio de Janeiro Federal University


2016 - 2018

  • Hierarchical clustering (unsupervised machine learning) and dendrogram analyses for evolutionarily distinct lineages through the integration of biological, geographical and molecular unstructured data.
  • Advisor: Carlos Guerra Schrago.

BSc, Genetics

Rio de Janeiro Federal University


2007 - 2012

  • Phylogenetic and topological estimation of cetaceans using Bayesian and maximum likelihood methods for hierarchical clustering.
  • Publication: Phylogenetic Status and Timescale for the Diversification of Steno and Sotalia Dolphins. PLOS ONE. https://doi.org/10.1371/journal.pone.0028297
  • Advisor: Carlos Guerra Schrago.

About me

Skills & Stack


  • My expertise leans toward using statistics (machine learning included) to describe and understand client behavior, developing predictive models (e.g. churn or propensity) and segmenting clients with unsupervised ML. I have experience developing both ad hoc analyses and production-ready models.
  • Statistical modeling, data analysis and hypothesis testing are considered staple knowledge in the field I graduated in. This has allowed me to move seamlessly between different technical areas (such as machine learning, A/B testing and data visualization).
  • R is my language of choice, but I am also proficient in Python, SQL and Spark (PySpark/sparklyr). I work comfortably with Git, GitHub and Linux, and I also have experience with the Databricks environment (and with notebooks in general) and AWS (e.g. Redshift and S3).
  • I have worked on several research projects with people from a wide array of backgrounds and seniority levels, some of them end to end. Since my undergraduate years, I have been trained to explain technical subjects to non-technical audiences.