The Skunkworks Challenge

October 4, 2019

Prabhu Subramanian (left) and Ziwei Wing Fan (right). Both students have managed the Skunkworks program with Professor Brown.

Professor Nik Brown receives hundreds of resumes from students each semester. Many students are applying to be his TA, asking for revisions, or hoping he will forward their resume to an employer. But most of these resumes fail to stand out. They all look the same.

“I get emails every day asking if I have a TA or RA position,” said Professor Brown. “And my response is always, prove to me that you can do [what is on your resume].”

In order to change this trend, Professor Brown and a group of students started Skunkworks, a program to give students opportunities to work in their fields of interest and build a portfolio.

“So before when I was a PhD student at UCLA, I taught at a lot of art schools,” said Professor Brown. “In art schools, they had a first-year portfolio, a second-year portfolio, a third-year portfolio, and a big fourth-year portfolio where they usually rent out an art gallery or a restaurant or something. Here we just have two years with students and I’ve noticed it can be harder for them to build a portfolio.”

Professor Brown went on to explain that because of the time crunch, students often have to wait 14 months and then rely solely on their co-op to build a portfolio. Skunkworks changes that: students can start building a portfolio at the very beginning of the program through projects and challenges, which often lead to co-ops.

“The basic idea is to start working with real people,” said Professor Brown. “So I reach out to people at Harvard or The Broad Institute or people at Northeastern, a lot of different places, for projects they might be interested in students working with them. Some are volunteers, some are paid. This is outside the classroom. So these are projects people would actually hire them for.”

He went on to explain how the program also helps students get a better idea of where they want to focus their studies. As a college student, Professor Brown initially wanted to become a doctor, until he went to work at a doctor’s office, experienced the day-to-day work firsthand, and realized it was not what he had imagined. He has carried that experience with him throughout his career at Northeastern, and it influenced the creation of Skunkworks.

“Everyone wants to be a data scientist because it’s a very sexy field, but oftentimes students don’t really know what’s involved,” said Professor Brown. “So the idea is, rather than students taking a year and a half getting a data science degree by taking a bunch of courses only to eventually figuring out that they like web development better, they work on these challenges and projects to get a good idea of what they prefer.”

The team started Skunkworks in January 2019, and to date 500 people are involved. Students often wind up with more than just a portfolio: many build connections, and some receive co-op and job offers. In the Broad Challenge in particular, students left a tremendous impression on the organization.

“It turns out that the students did so well on the project that they’re looking for more money to hire more students than the one they originally intended,” Professor Brown explained.

Professor Brown also explained that the projects and challenges have helped students understand the course material better and have left them better prepared for job interview questions, because they come away with a clearer idea of the nature of the field. Alongside a degree in engineering, that understanding is what makes a student’s resume stand out to employers.

“There are students who send out over 100 resumes and will get maybe three or four responses,” said Professor Brown. “It’s hard because often when you send resumes out they can’t tell you from anyone else. It’s just a piece of paper. But if you show a project, particularly a project that an employer is interested in, it will be the biggest predictor of whether or not you can do something and whether or not you can do it well.”
The Challenges

Project: Hyperparameter DataBase (RISE Challenge)

Author: Prabhu Subramanian

Abstract

Hyperparameters are parameters specified before a machine learning algorithm is run, and they have a large effect on the predictive power of statistical models. Knowing the relative importance of a hyperparameter to an algorithm, and its range of values, is crucial to hyperparameter tuning and to creating effective models.
For experts and non-experts alike, determining the hyperparameters that optimize model performance can be a tedious and difficult task.
Therefore, we develop a hyperparameter database that allows users to visualize and understand how to choose hyperparameters that maximize the predictive power of their models.

Approach

The database is created by running millions of hyperparameter values over thousands of public datasets and calculating the individual conditional expectation of every hyperparameter on the quality of a model.
We analyze the effect of hyperparameters on algorithms such as:
Distributed Random Forest (DRF)
Generalized Linear Model (GLM)
Gradient Boosting Machine (GBM)
and several more
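
To give a feel for what an individual conditional expectation (ICE) calculation might look like, here is a minimal Python sketch. It is an illustration under stated assumptions, not the project's code: `toy_quality` is a made-up stand-in for training and scoring a real model, and the configurations and grid values are invented. For each base configuration, one hyperparameter is varied over a grid while the others stay fixed, giving one curve per configuration; averaging the curves gives an overall view of that hyperparameter's effect.

```python
from statistics import mean


def toy_quality(config):
    """Hypothetical stand-in for 'train a model with this config and score it'."""
    # Assumed shape: quality peaks at learningrate = 0.1; more layers help a little.
    return 1.0 - abs(config["learningrate"] - 0.1) + 0.01 * config["n_layers"]


def ice_curves(base_configs, name, grid, quality):
    """Return one curve per base config: vary `name` over `grid`, keep the rest fixed."""
    curves = []
    for base in base_configs:
        curves.append([quality({**base, name: value}) for value in grid])
    return curves


def average_curve(curves):
    """Average the ICE curves point by point (a partial-dependence style summary)."""
    return [mean(points) for points in zip(*curves)]


# Invented example data: two base configurations and a three-point grid.
base_configs = [
    {"learningrate": 0.3, "n_layers": 2},
    {"learningrate": 0.3, "n_layers": 6},
]
grid = [0.01, 0.1, 0.5]

curves = ice_curves(base_configs, "learningrate", grid, toy_quality)
average = average_curve(curves)
best_value = grid[max(range(len(grid)), key=average.__getitem__)]
print(best_value)
```

In a real database, each call to the quality function would be an actual training run on a public dataset, which is why the full calculation involves millions of runs.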

Hyperparameters are specified for tuning purposes, for example:

learningrate – Learning Rate
n_layers – Number of layers
n_neurons – Number of neurons
Hidden Layers – Number of layers and size of each layer
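
As a rough sketch of how such hyperparameters could be specified and expanded into individual runs, the Python below builds a small search space and enumerates every combination. The value ranges and the helper names `expand_grid` and `hidden_layers` are assumptions made for illustration; only the hyperparameter names come from the list above.

```python
from itertools import product

# Assumed candidate values for each hyperparameter (illustrative only).
search_space = {
    "learningrate": [0.001, 0.01, 0.1],
    "n_layers": [1, 2, 3],
    "n_neurons": [16, 64],
}


def expand_grid(space):
    """List every combination of values as one configuration dictionary."""
    names = list(space)
    combos = product(*(space[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


def hidden_layers(config):
    """Derive the size of each hidden layer, assuming all layers are the same size."""
    return [config["n_neurons"]] * config["n_layers"]


configs = expand_grid(search_space)
print(len(configs))               # number of configurations to run
print(hidden_layers(configs[-1]))  # layer sizes for the last configuration
```

Even this tiny space yields 18 configurations, which hints at how quickly the number of runs grows as more hyperparameters and values are added.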

This addresses the problem of statistical model tuning.

Project: AI-Addin: AIaddin (Automated platform for Model Interpret-Ability) (RISE Challenge)

Author: Ziwei Wing Fan

The opportunity/background:

It is crucial to understand how an algorithm makes a certain decision. Trust in the model is enhanced when the logic is exposed.
Model interpretability can reveal the logic and bias of these models, exposing the reasons behind their predictions.

Why AI-addin?

How could we allow non-experts to understand complicated algorithms and the reasoning inside the black box?
How could we make logical explanations more human-friendly?
How could we reveal the logic and bias of the models, and expose the reasons behind their predictions or failures?

Goal of AI-addin:

AIaddin (AI-addin) is artificial intelligence software that decides when and how to automatically apply model interpretability algorithms to any data set that a user uploads for analysis.

Approach of AI-addin:

We evaluate the system qualitatively: after the data is processed by the interpretability algorithms, we ask users whether the human-friendly explanations are understandable and make logical sense given their domain expertise.

Result:

Non-experts in data science could understand the black box model in depth, and we could identify which features significantly affect the prediction.
We could see how some of the features affect the prediction through visualizations and interpretable language.
It becomes clearer what is in the data and why a model might fail.
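
The article does not say which interpretability algorithms AIaddin applies. As one illustrative possibility, the Python sketch below uses permutation importance, a common technique for finding which features a black box model relies on: shuffle one feature's values and measure how much accuracy drops. The model `black_box`, the feature names, and the data are all invented for this example.

```python
import random


def black_box(row):
    """Hypothetical opaque model: it depends on income and ignores shoe_size."""
    return 1 if row["income"] > 50 else 0


def accuracy(model, rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, feature, repeats=20, seed=0):
    """Average drop in accuracy when one feature's values are shuffled across rows."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        permuted = [{**r, feature: v} for r, v in zip(rows, shuffled)]
        drops.append(baseline - accuracy(model, permuted, labels))
    return sum(drops) / repeats


# Invented data: labels come from the model itself, so baseline accuracy is 1.0.
rows = [{"income": i, "shoe_size": 36 + i % 10} for i in range(0, 100, 5)]
labels = [black_box(r) for r in rows]

importances = {
    name: permutation_importance(black_box, rows, labels, name)
    for name in ("income", "shoe_size")
}
for name, score in importances.items():
    verdict = "matters" if score > 0 else "does not matter"
    print(f"{name} {verdict} to this model's predictions")
```

Turning the score into a plain sentence, as the last loop does, is a small nod to the goal of human-friendly explanations; a real system would need far richer wording and visuals.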

by Jess DeWitt

Related Faculty: Nicholas Brown

Related Departments: Software Engineering and Information Systems