Nicholas Sean Escanilla
"The world is one big data problem." -Andrew McAfee

About
I am a graduate of the University of Wisconsin-Madison, Department of Computer Science, where I obtained my M.S. in Computer Science in May 2018.
My previous engagements include serving as a Machine Learning Consultant at the University of Wisconsin-Madison, a Subject Matter Expert in machine learning at Accenture, and a Data Scientist at Slalom. I am currently a Sr. Data Scientist working on deep learning applications at Verizon.
My academic, research, and industry work has given me a solid grounding in core artificial intelligence and machine learning methods and best practices. I have had the opportunity to expand my skill set in the following areas: data science, computer vision, and health sciences. My overarching goal is to show business leaders and clients how to leverage their data for optimal growth.
Interests:
- Deep Learning
- Data Science
- Computer Vision
- Sports Statistics
Skills
Technical: AI, Data Science, Machine Learning, Deep Learning, Computer Vision
Languages: Python, R, MATLAB, Java
Personal: Learner, Analytical, Public Speaking
Education
Work Experience
Sr. Data Scientist
Verizon - Denver, Colorado
- Deep Learning, Machine Learning, Computer Vision, and Optimization
- Technologies:
- Python
- AWS
Data Scientist
Slalom - Chicago, Illinois
- Engaged with clients to brainstorm, gather data, and determine feasibility for potential data science projects.
- Transformed ~16,000 survey data points to extract feature importance and perform NLP to identify key phrases/drivers for sentiment.
- Designed a rule-based and optimization algorithm to build a portfolio recommendation system for a financial investment firm.
- Developed a convolutional neural network architecture and data science pipeline to perform custom object detection and readiness based on location.
- Technologies:
- Python
- Amazon SageMaker
Data Scientist
Accenture - Chicago, Illinois
- Acted as the Subject Matter Expert in machine learning and data science.
- Trained in ML as a Service and the related tools offered as part of cloud computing services.
- Web-scraped 3,500 help articles from disparate data sources using Python.
- Built a content-based recommendation system for use in a contact flow proof of concept.
- Implemented convolutional neural networks and recurrent neural networks on several datasets (e.g., MNIST, CIFAR-10) using PyTorch and Keras.
- Technologies:
- Python
- AWS Machine Learning
- Amazon SageMaker
- AWS Lambda
- Certifications:
- ICAgile
- Machine Learning on AWS - Technical (Digital) (Certificate of Completion)
Machine Learning Consultant
Department of Computer Sciences, UW-Madison
- Collaborated with the Wisconsin Institutes for Medical Research on better understanding breast cancer risk by using novel machine learning methodologies.
- Extracted, cleaned, transformed, and integrated genomic data and environmental data for a final dataset that consisted of 2000 patient records and 300 features/attributes.
- Implemented recursive feature elimination by sensitivity testing (RFEST).
- Showed that 30 of the 300 features were relevant for the task of predicting breast cancer risk, with an increase in predictive performance (ROC AUC).
- Technologies:
- Python
Graduate Research Assistant
Department of Computer Sciences, UW-Madison
- Generated synthetic data based on correlation immune functions of orders two, four, five, and six.
- Empirically demonstrated the effectiveness of a novel feature selection algorithm with correlation immune functions.
- Completed Master's thesis (Summer 2017).
- Authored a paper on a novel feature selection algorithm, accepted at the 17th IEEE International Conference on Machine Learning and Applications (ICMLA 2018).
- Featured on the UW-Madison Graduate School website for work on predicting breast cancer risk.
- Awards:
- Advanced Opportunity Fellowship (2016-2017 academic year).
- Computation and Informatics in Biology and Medicine (CIBM) Fellowship (2017-2018 academic year).
- Technologies:
- R
- Python
- MATLAB
- OSX Terminal
- Unix Shell
UW-Madison Summer Researcher
Department of Biostatistics and Medical Informatics, UW-Madison
- Conducted rigorous independent research on general machine learning algorithms.
- Designed a novel feature selection algorithm for use in bioinformatics.
- Applied the novel algorithm to germline genomic data to improve breast cancer diagnoses.
- Technologies:
- R
Harvard University Summer Researcher
Department of Biostatistics, Harvard T.H. Chan School of Public Health
- Completed comprehensive introductory courses in biostatistics and epidemiology.
- Implemented summary statistics and logistic regression models.
- Completed a data analysis project entitled "Evaluation of Gene-Environment Interaction for Ovarian Cancer".
- Technologies:
- R
Student Athletic Trainer
Lake Forest College
- Assisted the head athletic trainer in maintaining the athletic training room.
- Supervised the care of student athletes who compete at the college.
Projects
Publications
Recursive Feature Elimination by Sensitivity Testing
Authors: Nicholas Sean Escanilla, Lisa Hellerstein, Ross Kleiman, Zhaobin Kuang, James Shull, David Page
Abstract - There is great interest in methods to improve human insight into trained non-linear models. Leading approaches include producing a ranking of the most relevant features, a non-trivial task for non-linear models. We show theoretically and empirically the benefit of a novel version of recursive feature elimination (RFE) as often used with SVMs; the key idea is a simple twist on the kinds of sensitivity testing employed in computational learning theory with membership queries. With membership queries, one can check whether changing the value of a feature in an example changes the label. In the real world, we usually cannot get answers to such queries, so our approach instead makes these queries to a trained (imperfect) non-linear model. Because SVMs are widely used in bioinformatics, our empirical results use a real-world cancer genomics problem; because ground truth is not known for this task, we discuss the potential insights provided. We also evaluate on synthetic data where ground truth is known.
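The sensitivity-testing idea in the abstract above can be illustrated with a small sketch. This is not the published RFEST implementation; it is a minimal illustration under stated assumptions: features are binary, "changing the value of a feature" means flipping it, a simple 1-nearest-neighbour model stands in for the SVM to keep the example dependency-free, and the names `sensitivity_scores`, `rfe_by_sensitivity`, and `fit_1nn` are hypothetical.

```python
from itertools import product


def sensitivity_scores(predict, X, features):
    """Fraction of examples whose predicted label changes when a feature is flipped.

    The queries go to the trained model `predict`, not to the true labels.
    Assumes binary (0/1) features.
    """
    scores = {}
    for j in features:
        changed = 0
        for x in X:
            flipped = list(x)
            flipped[j] = 1 - flipped[j]
            if predict(flipped) != predict(x):
                changed += 1
        scores[j] = changed / len(X)
    return scores


def rfe_by_sensitivity(fit, X, y, n_keep):
    """Repeatedly retrain, then drop the feature the model is least sensitive to.

    `fit(X, y, features)` is a hypothetical training hook that returns a
    predictor using only the listed feature indices.
    """
    features = list(range(len(X[0])))
    while len(features) > n_keep:
        predict = fit(X, y, features)
        scores = sensitivity_scores(predict, X, features)
        least_sensitive = min(features, key=lambda j: scores[j])
        features.remove(least_sensitive)
    return features


def fit_1nn(X, y, features):
    """Stand-in model: 1-nearest neighbour by Hamming distance on `features`."""
    def predict(x):
        nearest = min(
            range(len(X)),
            key=lambda i: sum(X[i][j] != x[j] for j in features),
        )
        return y[nearest]
    return predict


# Synthetic data where ground truth is known: the label is the XOR of
# features 0 and 1; features 2, 3, and 4 are irrelevant.
X = [list(bits) for bits in product([0, 1], repeat=5)]
y = [x[0] ^ x[1] for x in X]

selected = rfe_by_sensitivity(fit_1nn, X, y, n_keep=2)
print(selected)  # [0, 1]
```

XOR is a useful test case because neither relevant feature is correlated with the label on its own, so a per-feature correlation ranking would not find them, while flipping either one always changes the model's prediction.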
A Comparative Analysis of Feature Selection Techniques for a Family of Nonlinear Target Functions and Breast Cancer Diagnoses
Thesis Committee: David Page, Charles Dyer, James Shull
Abstract - Due to advances in high-throughput technologies, data mining techniques for decision-making processes have grown increasingly popular in the past decades. As more domains adopt these methods, we are seeing a surge in data (so-called big data) that requires preprocessing techniques to combat the curse of dimensionality. To handle such problems, dimensionality reduction techniques have been well studied. One such example is principal component analysis (PCA), a procedure that combines features to create new ones, thus reducing the dimensions in the dataset. This paper studies a different technique called feature selection. As opposed to using the original set of features in a dataset to build a predictive model, feature selection aims to find a subset of those features whose combination offers higher predictive power, higher quality, and better interpretability of the data. In addition to an overview of feature selection techniques previously published in the literature, we present a novel feature selection algorithm and evaluate it using synthetic data labeled by a challenging family of nonlinear target functions. The method is also assessed on germline genomic data from breast cancer patients and controls, comprising single-nucleotide polymorphisms (SNPs) from a region of the human genome associated with breast cancer. Through these experiments, we show the advantages that our algorithm has over alternative methods when the task is to determine the subset of relevant features for a given input dataset.
Contact Me
Chicago, IL
Phone: (847) 668-5578
Email: escanillans@gmail.com