Currently developing the algorithms that drive Math Academy's fully automated and personalized online learning system.
Previously applied math in a variety of contexts including working as a data scientist, modeling biological neural networks, and improving particle detectors.
Also taught, tutored, and developed math content in parallel for a decade, culminating in developing one of the most advanced high school applied math/CS sequences in the USA.
First engineering hire; developed all quantitative aspects of the product
Math Academy, May 2019 - present
Involved in all aspects of production of an intelligent tutoring system that seeks to become the ultimate online math learning platform. As first engineering hire, my primary role involves developing/productionizing our algorithms and knowledge graph so that the system can behave like an experienced tutor.
Most notably, I created the Fractional Implicit Repetition (FIRe) algorithm, which generalizes discrete spaced repetition on independent tasks to fractional implicit spaced repetition on highly connected knowledge graphs. Math Academy uses FIRe to estimate student knowledge profiles and select personalized tasks that optimize knowledge persistence over time, striking an optimal balance between learning new topics (maximizing knowledge gain) and reviewing already-seen topics (minimizing knowledge loss).
The FIRe algorithm has been a massive success, speeding up learning by a factor of 4x while improving mastery: students learning via FIRe on Math Academy's personalized learning platform can complete AP Calculus BC in just 40 minutes per school day with improved AP exam scores as compared to an instructor-led course consisting of 12 hours per week (1 hour class and 1 hour homework per school day, plus an amortized 2 hours per week of studying for quizzes/tests and the AP exam itself).
In addition to developing/productionizing FIRe and many other algorithms, I also wear many other hats including developing command-line tools for managing content and the knowledge graph, encoding massive amounts of domain-expert knowledge into the knowledge graph, and reviewing all lessons written by the content team for clarity and accuracy.
Data Scientist - Analytics, R&D
Worked full time while simultaneously an undergraduate
Aunalytics, Jan 2016 - Jan 2018
Worked full time while simultaneously completing my undergraduate degree (transitioned to full time after interning during sophomore year).
- Built predictive churn models & segmented user data to discover sales insights for clients.
- Developed prototypes of in-house data tools & migrated existing workflows onto Aunsight (custom data processing platform).
- Evaluated the potential of topological data analysis for practical use in the data science pipeline.
Instructor - Computation & Modeling, Machine Learning, Intelligent Systems
Developed one of the most advanced high school applied math/CS sequences in the USA
Math Academy (Non-Profit Division), May 2020 - June 2023
Math Academy supports a highly accelerated math program (by the same name) within the Pasadena Unified School District. The program has been recognized by the Washington Post as "America's most accelerated math program"; students take AP Calculus BC in 8th grade and study a full undergraduate math curriculum in high school.
In collaboration with Jason Roberts (founder of Math Academy), I developed and taught a special three-course math/CS sequence within the Math Academy program called Eurisko (eurisko.us). Eurisko is one of the most advanced high school math/CS sequences in the USA.
- The first course in the sequence, Computation & Modeling, is inspired by MIT's Introduction to Computer Science and goes far beyond it. In addition to implementing canonical data structures and algorithms (sorting, searching, graph traversals), students write their own machine learning algorithms from scratch (polynomial and logistic regression, k-nearest neighbors & k-means, parameter fitting via gradient descent).
- The second course, Machine Learning, covers more advanced machine learning algorithms such as decision trees and neural networks, as well as the development of strategic game-playing agents using game trees. Students also work together to implement Space Empires, an extremely complex board game that pushes their large-scale project skills (object-oriented design, version control, etc) to the limit. Again, students implement algorithms from scratch before using external libraries.
- The third course, Intelligent Systems, involves developing agents that behave intelligently in complex environments. Students reproduce academic research papers in artificial intelligence leading up to Blondie24, a neuroevolution-based game-playing agent that learned to play checkers without having any access to information regarding human-expert strategies, and continue implementing Space Empires with the goal of designing artificially intelligent agents to play it.
Math Educator / Content Developer
Many organizations, Jan 2018 - May 2021
Tutored, taught, and developed math content both independently and for a variety of organizations.
- May 2019 - May 2021 | Independent tutoring
- Jan 2018 - May 2021 | Math Academy - developed many hundreds of tutorial videos and fully scaffolded lessons; taught substitute/summer classes & TA sessions.
- Aug 2019 - May 2020 | Pilgrim School - taught AP Calculus AB, physics, first-semester engineering
- Aug-Sept 2019 | IXL Learning - item writing
- Jun-July 2019 | Math Academy - taught Research and Presentation in Mathematics, Mathematical Problem Solving, Drawing Mathematics with Desmos
- Apr 2019 | Math Academy - substitute taught Linear Algebra / Multivariable Calculus
- Mar 2018 - Mar 2019 | FLEX College Prep - weekend SAT/ACT math classes
- 2018-19 | Wrote 3 textbooks (for fun)
- Feb 2018 - May 2019 | LA Tutors 123 - tutored ~10 students
- Jan 2018 - Mar 2020 | HBar Tutoring - tutored ~20 students
- Jan-Jun 2018 | Study.com - item writing
Mathnasium, Mar 2013 - Jan 2018
Tutored ~250 students in grades K-12 over the course of 5 years, working primarily with middle/high school students taking algebra through calculus. (Evenings & weekends, 20h/week)
Vural Lab, iCeNSA, University of Notre Dame, Aug 2014 - May 2016
Simulated cyclic Hodgkin-Huxley neural networks with spike-timing dependent plasticity and later proved results under simplifying assumptions in the special case of tree networks. Project was a self-directed investigation advised by Prof. Dervis Can Vural of the Many-Body Physics group.
(2016) Shaping STDP Neural Networks with Periodic Stimulation: a Theoretical Analysis for the Case of Tree Networks
(2015) Network Motif-Inspired Evolution of Hodgkin-Huxley Neuronal Networks with Spike-Timing Dependent Plasticity
Computational Neuroscience / Deep Learning
Synthetic Cognition Group (affiliated with LANL), New Mexico Consortium, May - Aug 2015
Implemented spiking neurons in a deep neural network in an attempt to create an emergent phenomenon of brain oscillations during an image classification task. Worked in the synthetic cognition group, affiliated with Los Alamos National Lab.
Experimental Particle Physics
Finalist, Intel International Science/Engineering Fair, 2013
QuarkNet (affiliated with CERN), University of Notre Dame, June 2013 - May 2014
Levine Lab (affiliated with Fermilab), IU South Bend, Sept 2012 - May 2013
Helped improve data transmission in Fermilab and CERN particle detectors by reducing signal loss at the detector surface. Finalist in Intel's International Science/Engineering Fair, 2013. (Two separate projects, the first under Prof. Ilan Levine of IUSB and the second in collaboration with Notre Dame's QuarkNet lab.)
(2014) Optimizing Scintillation and Light Transmission for Use in a High Energy Particle Detector
(2013) Making a Matching Layer for Acoustic Sensors for a COUPP Dark Matter Detector
MS, Computer Science, Georgia Institute of Technology, 2020
Machine Learning track; simultaneously worked full time
BS, Mathematics, University of Notre Dame, 2018
Full-ride Lilly Scholarship (top 0.5% high school grads in Indiana; must attend college in state)
Glynn Honors Program (top 5% admits); simultaneously worked full time
Below are high-level descriptions of particularly noteworthy algorithms that I've developed. They are novel and proprietary.
Fractional Implicit Repetition (FIRe)
Speeds up learning by a factor of 4x.
- Generalizes discrete spaced repetition on independent flashcards to fractional implicit spaced repetition on knowledge graphs of interconnected skills and concepts.
- Estimates student knowledge profiles and selects personalized learning tasks that optimize knowledge persistence over time, striking an optimal balance between learning new topics (maximizing knowledge gain) and reviewing already-seen topics (minimizing knowledge loss).
- Algorithm structure has analogies to biology: topics = cells, spaced repetition latent state = chemical concentrations within cells, knowledge graph = brain, correct answers = stimulants of cell growth, incorrect answers = inhibitors of cell growth, learning tasks = stimuli to brain.
- Speeds up learning by a factor of 4x and improves mastery: students learning via FIRe on Math Academy's personalized learning platform can complete AP Calculus BC in just 40 minutes per school day with improved AP exam scores as compared to an instructor-led course consisting of 12 hours per week (1 hour class and 1 hour homework per school day, plus an amortized 2 hours per week of studying for quizzes/tests and the AP exam itself).
- Has enabled sufficiently motivated 6th grade students to progress from prealgebra to AP Calculus BC over the span of just 2 semesters.
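The core idea of fractional implicit repetition can be illustrated with a toy sketch: when a student succeeds on an advanced topic, its prerequisites implicitly receive fractional repetition credit that attenuates as it flows down the knowledge graph. The topic names, edge weights, and propagation rule below are invented for illustration; the actual FIRe algorithm is proprietary and far more sophisticated.

```python
# Toy knowledge graph: topic -> {prerequisite: attenuation weight}.
# All topics and weights here are hypothetical.
prereqs = {
    "chain_rule": {"derivatives": 0.8, "function_composition": 0.5},
    "derivatives": {"limits": 0.6},
    "limits": {},
    "function_composition": {},
}

def implicit_credit(topic, amount=1.0, floor=0.05):
    """Propagate fractional repetition credit from a practiced topic
    down through its prerequisites, attenuating by edge weight and
    dropping contributions below a small floor."""
    credit = {}
    stack = [(topic, amount)]
    while stack:
        t, a = stack.pop()
        if a < floor:
            continue
        credit[t] = max(credit.get(t, 0.0), a)
        for p, w in prereqs[t].items():
            stack.append((p, a * w))
    return credit

# Practicing the chain rule implicitly reviews derivatives, limits,
# and function composition at fractional strength.
print(implicit_credit("chain_rule"))
```

This is why explicit review of foundational topics can be largely avoided: credit for them accrues implicitly whenever more advanced topics that encompass them are practiced.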
XP Calibration System
Ensures that 1 XP is equivalent to 1 minute of work for the average focused and capable student.
- Context: Students are routinely served a handful of learning tasks to choose from, and they earn XP for completing tasks with satisfactory performance. The intention is that 1 XP should equal 1 minute of work for the average focused and capable student. Initially, we did not have any student data, so XP was computed based on hard-coded question times -- but as we gained student data, we noticed that it was taking users longer to earn the same amount of XP as they progressed to higher courses.
- Computes a "complexity" value for each topic, with "complexity" defined as the ratio between (numerator) the actual time it takes a student to complete a task while learning during the first encounter, and (denominator) the estimated task time based on hard-coded question times.
- Incorporates complexity into the XP computation algorithm so that for tasks with sufficient data, time estimates are now exactly equal to the actual time that has been observed empirically. (For topics with little data, we place bounds on the complexity estimate.)
- Automatically becomes more accurate as more data is gained, and automatically accounts for changes to the structure of topics that could impact the amount of time it takes for a student to complete them.
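The calibration idea reduces to a simple ratio with bounds while data is sparse. The function names, clamp bounds, and sample threshold below are invented for illustration, not the production values:

```python
def topic_complexity(observed_minutes, estimated_minutes, n_samples,
                     lo=0.5, hi=2.0, min_samples=30):
    """Complexity = ratio of observed first-encounter task time to the
    hard-coded estimate, clamped until enough data has accumulated."""
    c = observed_minutes / estimated_minutes
    if n_samples < min_samples:
        c = min(max(c, lo), hi)  # bound the estimate while data is sparse
    return c

def task_xp(estimated_minutes, complexity):
    # 1 XP ~ 1 minute of focused work, so scale the hard-coded
    # estimate by the empirically observed complexity.
    return estimated_minutes * complexity
```

With sufficient data the clamp becomes inactive and the XP awarded for a task equals the empirically observed completion time, which is what keeps "1 XP = 1 minute" calibrated across courses of varying difficulty.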
XP Penalty System
Shuts down adversarial student behavior.
- Context: Students are routinely served a handful of learning tasks to choose from, and they earn XP for completing tasks with satisfactory performance. In the absence of a penalty system, adversarial students will complete tasks that they feel are easy and then submit a bunch of random guesses to intentionally fail out of tasks that require more effort.
- Applies a penalty (negative XP) when it detects that a student is failing tasks as a result of being unwilling to put in effort. Tracks the amount of "anger" that would build up in a tutor or guardian sitting next to the student, and then translates that anger into an XP penalty.
- Effectively shuts down adversarial behavior while simultaneously not impacting cooperative students. Many adversarial students' pass rates jumped from under 50% to over 90%.
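The "anger" framing can be sketched as a simple accumulator: low-effort failures (e.g. a burst of fast wrong answers) raise the anger level and incur a penalty proportional to it, while cooperative work lets it decay. The detection heuristic, rise/decay rates, and penalty scale below are all invented stand-ins for the proprietary logic:

```python
class AngerMeter:
    """Hypothetical sketch: accumulate 'anger' on apparent non-effort
    failures and convert it into a negative-XP penalty."""

    def __init__(self, rise=1.0, decay=0.5, penalty_per_anger=2.0):
        self.anger = 0.0
        self.rise, self.decay, self.per = rise, decay, penalty_per_anger

    def record_task(self, passed, avg_seconds_per_question):
        # Crude non-effort signal: failing while answering implausibly fast.
        low_effort_fail = (not passed) and avg_seconds_per_question < 5
        if low_effort_fail:
            self.anger += self.rise
        else:
            self.anger = max(0.0, self.anger - self.decay)
        return -self.per * self.anger if low_effort_fail else 0.0
```

Because the penalty grows with repeated offenses but decays under normal behavior, cooperative students never feel it, while repeat guessers quickly find that failing out of tasks costs more XP than doing them honestly.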
Quickly estimates a student's knowledge profile.
- Constructs efficient diagnostic exams and performs inference to massively reduce the number of questions that must be asked to characterize a student's knowledge profile at a given level of precision.
- Automatically selects a minimal subset of topics that fully "cover" a course and its foundational knowledge at a desired level of granularity. These are the topics on which questions could potentially be asked during the diagnostic exam.
- Further reduces the number of diagnostic questions by adapting to the student's performance. On each question, the most informative topic is chosen for assessment, and the diagnostic stops early when the student's knowledge state has been inferred to a desired level of confidence.
- Guards against guessing by detecting and re-assessing questions on which the student may have guessed.
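The adaptive question-selection loop can be sketched under a simplifying independence assumption: maintain a probability that the student knows each candidate topic, always ask about the most uncertain (most informative) topic, and stop once every topic is confidently classified. The hard belief update below is a crude stand-in for the real inference, which propagates information through the knowledge graph:

```python
import math

def entropy(p):
    """Binary entropy: highest when we know least about a topic."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def run_diagnostic(beliefs, answer, confident=0.9):
    """beliefs: topic -> P(student knows it). answer(topic) -> bool.
    Returns the questions asked and the final beliefs."""
    asked = []
    while True:
        # Stop early once every topic is confidently classified.
        uncertain = {t: p for t, p in beliefs.items()
                     if confident > p > 1 - confident}
        if not uncertain:
            return asked, beliefs
        # Ask about the most informative (highest-entropy) topic.
        topic = max(uncertain, key=lambda t: entropy(beliefs[t]))
        asked.append(topic)
        # Crude stand-in for inference: collapse the belief.
        beliefs[topic] = 0.95 if answer(topic) else 0.05
```

Topics already inferred with high confidence (here, prior 0.95 or 0.05) are never asked about at all, which is where the bulk of the question savings comes from.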
Grades a free response mathematical expression.
- Determines whether a free response mathematical expression matches the answer key expression.
- Constructs a sample of numerical substitutions such that the free response answer is almost certain to be correct if it matches the answer key on the sample.
- Intelligently handles not only numerical overflow but also details like mathematical ambiguity and context-dependence of mathematical rigor just like a human grader would.
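The numerical-substitution idea can be sketched as follows: sample random points, skip ones where either expression is undefined or overflows, and declare a match if the expressions agree within tolerance at every sampled point. For simplicity this sketch represents expressions as plain Python callables of one variable; the sampling strategy, tolerances, and edge-case handling are invented for illustration:

```python
import random

def expressions_match(student, key, trials=20, tol=1e-9, seed=0):
    """Return True if student(x) agrees with key(x) at `trials` random
    points, resampling points where either expression is undefined."""
    rng = random.Random(seed)
    checked = attempts = 0
    while checked < trials:
        attempts += 1
        if attempts > trials * 10:
            raise ValueError("expressions undefined almost everywhere")
        x = rng.uniform(-10, 10)
        try:
            s, k = student(x), key(x)
        except (ZeroDivisionError, OverflowError, ValueError):
            continue  # undefined at this point; resample
        # Relative tolerance guards against benign floating-point error.
        if abs(s - k) > tol * max(1.0, abs(k)):
            return False
        checked += 1
    return True
```

Agreement at many random points makes non-equivalent expressions astronomically unlikely to pass, since two distinct elementary expressions coincide only on a measure-zero set.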
- Introduction to Algorithms and Machine Learning (expected 2023). digital
- Linear Algebra (2019). digital, pdf, print
- Calculus (2019). digital, pdf, print
- Algebra (2018). digital, pdf, print
- Tips for Developing Valuable Models (2022). post
- But WHERE do the Taylor Series and Lagrange Error Bound even come from?! (2019). post