
Nicholas Sean Escanilla

"The world is one big data problem." -Andrew McAfee


About


I graduated from the Department of Computer Sciences at the University of Wisconsin-Madison, where I obtained my M.S. in Computer Science in May 2018.

My previous engagements include serving as a Machine Learning Consultant at the University of Wisconsin-Madison, a Subject Matter Expert in machine learning at Accenture, and a Data Scientist at Slalom. I am currently a Sr. Data Scientist working on deep learning applications at Verizon.

My academic, research, and industry work has given me a solid grounding in core artificial intelligence and machine learning methods and best practices. I have had the opportunity to expand my skill set in data science, computer vision, and health sciences. My overarching goal is to show business leaders and clients how to leverage their data for optimal growth.


Skills

Technical: AI, Data Science, Machine Learning, Deep Learning, Computer Vision

Languages: Python, R, MATLAB, Java

Personal: Learner, Analytical, Public Speaking


Education

M.S. Computer Science
UW-Madison
B.A. Mathematics
Lake Forest College

Work Experience


Sr. Data Scientist

Verizon - Denver, Colorado

Data Scientist

Slalom - Chicago, Illinois

Data Scientist

Accenture - Chicago, Illinois

Machine Learning Consultant

Department of Computer Sciences, UW-Madison

Graduate Research Assistant

Department of Computer Sciences, UW-Madison

UW-Madison Summer Researcher

Department of Biostatistics and Medical Informatics, UW-Madison

Harvard University Summer Researcher

Department of Biostatistics, Harvard T.H. Chan School of Public Health

Student Athletic Trainer

Lake Forest College

Projects


Publications


Recursive Feature Elimination by Sensitivity Testing

Authors: Nicholas Sean Escanilla, Lisa Hellerstein, Ross Kleiman, Zhaobin Kuang, James Shull, David Page

Abstract - There is great interest in methods to improve human insight into trained non-linear models. Leading approaches include producing a ranking of the most relevant features, a non-trivial task for non-linear models. We show theoretically and empirically the benefit of a novel version of recursive feature elimination (RFE) as often used with SVMs; the key idea is a simple twist on the kinds of sensitivity testing employed in computational learning theory with membership queries. With membership queries, one can check whether changing the value of a feature in an example changes the label. In the real world, we usually cannot get answers to such queries, so our approach instead makes these queries to a trained (imperfect) non-linear model. Because SVMs are widely used in bioinformatics, our empirical results use a real-world cancer genomics problem; because ground truth is not known for this task, we discuss the potential insights provided. We also evaluate on synthetic data where ground truth is known.
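
As a rough illustration of the sensitivity-testing idea described in the abstract (not the algorithm from the paper), the sketch below flips one binary feature at a time and asks a model whether its predicted label changes. Everything here is an assumption made for illustration: the function names `sensitivity` and `rank_by_sensitivity` are hypothetical, features are taken to be binary, and `toy_model` is a hand-written XOR rule standing in for a trained non-linear model such as an SVM.

```python
import itertools


def sensitivity(model, examples, j):
    """Fraction of examples whose predicted label changes when
    binary feature j is flipped (a query to the model, not to nature)."""
    changed = 0
    for x in examples:
        flipped = list(x)
        flipped[j] = 1 - flipped[j]  # change only feature j
        if model(flipped) != model(list(x)):
            changed += 1
    return changed / len(examples)


def rank_by_sensitivity(model, examples, n_features):
    """Score every feature and return (ranking, scores), most sensitive first."""
    scores = {j: sensitivity(model, examples, j) for j in range(n_features)}
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


def toy_model(x):
    """Hypothetical stand-in for a trained non-linear model:
    the label is the XOR of features 0 and 1; features 2 and 3 are irrelevant."""
    return x[0] ^ x[1]


# All 16 binary examples over 4 features.
examples = list(itertools.product([0, 1], repeat=4))
ranking, scores = rank_by_sensitivity(toy_model, examples, 4)
print(ranking, scores)
```

In this toy setting the two XOR inputs change the label on every flip while the irrelevant features never do, even though neither relevant feature is informative on its own. A recursive elimination procedure would presumably drop the lowest-scoring feature, retrain, and repeat; this sketch covers only the scoring step.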

A Comparative Analysis of Feature Selection Techniques for a Family of Nonlinear Target Functions and Breast Cancer Diagnoses

Thesis Committee: David Page, Charles Dyer, James Shull

Abstract - Due to advances in high-throughput technologies, data mining techniques for decision-making processes have grown increasingly popular over the past decades. As more domains adopt these methods, the resulting surge in data (so-called big data) requires preprocessing techniques to combat the curse of dimensionality. To handle such problems, dimensionality reduction techniques have been well studied. One example is principal component analysis (PCA), a procedure that combines features to create new ones, thus reducing the number of dimensions in the dataset. This paper studies a different technique called feature selection. Rather than using the full original set of features in a dataset to build a predictive model, feature selection aims to find a subset of those features whose combination offers higher predictive power, higher quality, and better interpretability of the data. In addition to an overview of feature selection techniques previously published in the literature, we present a novel feature selection algorithm and evaluate it using synthetic data labeled by a challenging family of nonlinear target functions. The method is also assessed on germline genomic data from breast cancer patients and controls, consisting of single-nucleotide polymorphisms (SNPs) from a region of the human genome associated with breast cancer. Through these experiments, we show the advantages that our algorithm has over alternative methods when the task is to determine the subset of relevant features for a given input dataset.

Contact Me


Chicago, IL

Phone: (847) 668-5578

Email: escanillans@gmail.com