All Posts

2022

Reimplementing Blondie24

Fogel and Chellapilla’s Blondie24 was published over the course of two papers. Here we shall address the first paper, Evolving Neural Networks to Play Checkers without Relying on Expert Knowledge, published in 1999. Read more...

Reimplementing Fogel’s Tic-Tac-Toe Paper

The goal of this chapter is to reimplement the paper Using Evolutionary Programming to Create Neural Networks that are Capable of Playing Tic-Tac-Toe by David Fogel in 1993. This paper preceded Blondie24, and many of the principles introduced in this paper were extended in Blondie24. As such, reimplementing the paper provides good scaffolding as we work our way up to reimplement Blondie24. Read more...

Introduction to Blondie24 and Neuroevolution

Previously, we built strategic connect four players by constructing a pruned game tree, using heuristics to rate terminal states, and then applying the minimax algorithm. This was a combination of game-specific human intelligence (heuristics) and generalizable artificial intelligence (minimax on a game tree). Read more...

Reduced Search Depth and Heuristic Evaluation for Connect Four

In theory, we could solve any game by building a big game tree, labeling the terminal states as wins, losses, or ties, and then working backwards from that information to identify the minimax strategy. But in practice, game trees get so big so quickly that for all but the most simple games, game trees are too expensive to store in memory and take too long to traverse. Read more...

Minimax Strategy

The minimax strategy is a powerful game-playing strategy that operates on game trees. It works by maximizing the worst-case scenario that could potentially arise from every move. Read more...

Canonical and Reduced Game Trees for Tic-Tac-Toe

A game tree is a data structure that represents all the possible outcomes of a game. It is a graph where the nodes correspond to the states of the game, and the edges correspond to actions that cause the game to transition from one state to another. Game trees are commonly used when coding up strategies for autonomous game-playing agents. Read more...

Backpropagation

The most common method used to fit neural networks to data is gradient descent, just like we did have done previously for simpler models. The computations are significantly more involved for neural networks, but an algorithm called backpropagation provides a convenient framework for computing gradients. Read more...

Decision Trees

A decision tree is a graphical flowchart that represents a sequence of nested “if-then” decision rules. To illustrate, first recall the following cookie data set that was introduced during the discussion of k-nearest neighbors: Read more...

Distance and Shortest Paths in Unweighted Graphs

The distance between two nodes in a graph is the fewest number of edges that must be crossed to travel from one node to the other. A path between two nodes is a sequence of nodes that can be traversed to get from one node to the other, traveling along edges. The shortest path is the path with the shortest distance. Read more...

Naive Bayes

Naive Bayes is a classification algorithm that is grounded in Bayesian probability. It involves choosing the class with the highest conditional probability given the corresponding data point, assuming that all the variables in the data set are independent from each other. Read more...

K-Nearest Neighbors

Until now, we have been focused on regression problems, in which we predict an output quantity (that is often continuous). However, in the real world, it’s even more common to encounter classification problems, in which we predict an output class (i.e. category). For example, predicting how much money a person will spend at a store is a regression problem, whereas predicting which items the person will buy is a classification problem. Read more...

Multiple Regression and Interaction Terms

In many real-life situations, there is more than one factor that controls the quantity we’re trying to predict. That is to say, there is more than one input variable that controls the output variable. Read more...

Regression via Gradient Descent

Gradient descent can be applied to fit regression models. In particular, it can help us avoid the pitfalls that we’ve experienced when we attempt to fit nonlinear models using the pseudoinverse. Read more...

Linear, Polynomial, and Multiple Linear Regression via Pseudoinverse

Supervised learning consists of fitting functions to data sets. The idea is that we have a sample of inputs and outputs from some process, and we want to come up with a mathematical model that predicts what the outputs would be for other inputs. Supervised machine learning involves programming computers to compute such models. Read more...

Back to Top ↑

2021

Simplex Method

The simplex method is a technique for maximizing linear expressions subject to linear constraints. Many applied problems can be framed like this – for example, a business may have a variety of raw materials and need to determine the most profitable inventory of products that they can produce from those materials. This would ultimately amount to maximizing a linear expression for profit, subject to a linear inequality for each type of raw material (the total amount of raw material used in the products produced must be less than or equal to the amount of raw material available). Read more...

Hodgkin-Huxley Model of Action Potentials in Neurons

In 1952, Alan Lloyd Hodgkin and Andrew Fielding Huxley published a differential equation model of “spikes” (i.e. “action potentials”) in the voltage of neurons. For this work, they received the 1963 Nobel Prize in Physiology or Medicine (shared with Sir John Carew Eccles). Read more...

SIR Model For the Spread of Disease

One of the simplest ways to model the spread of disease using differential equations is the SIR model. Let’s construct a SIR model and use our multivariable Euler estimator to plot the solution curves. Read more...

Euler Estimation

Arrays can be used to implement more than just matrices. We can also implement other mathematical procedures like Euler estimation. We will jump straight to exercises – it is assumed that you’re already familiar with Euler estimation from calculus. Read more...

Tic-Tac-Toe and Connect Four

One of the best ways to get practice with object-oriented programming is implementing games. Tic-tac-toe and connect four are great for learning the ropes. Later on, we will also use these implementations for developing AI players. Read more...

K-Means Clustering

Clustering is the act of finding groups of similar records within data. Some examples of clusters in real-life data include users who buy similar items, songs in a similar genre, and patients with similar health conditions. Read more...

Basic Matrix Arithmetic

Let’s use arrays to implement matrices and their associated mathematical operations. We will jump straight to exercises – it is assumed that you are already familiar with matrix arithmetic and reduced row echelon form. Read more...

Merge Sort and Quicksort

Merge sort is a sorting algorithm that works by breaking an array in half, recursively calling merge sort on each half separately, and then merging the two sorted halves. The base case is that empty arrays and arrays of 1 item are already sorted (so if you call merge sort on such an array, it leaves the array as-is). Read more...

Single-Variable Gradient Descent

Gradient descent is a technique for minimizing differentiable functions. The idea is that we take an initial guess as to what the minimum is, and then repeatedly use the gradient to nudge that guess further and further “downhill” into an actual minimum. Read more...

Cartesian Product

Implementing the Cartesian product provides good practice working with arrays. Recall that the Cartesian product constructs all the points whose elements occupy the given ranges. To illustrate: Read more...

Simulating Coin Flips

It’s often possible to estimate probabilities of events by simulating a large number of random experiments and measuring the proportion of trials in which the desired outcome occurred. This technique is known as Monte Carlo simulation, named after a famous casino town. Read more...

Some Short Introductory Coding Exercises

Let’s get started on some introductory coding exercises. It’s assumed that you’ve had some basic exposure to programming and that you know about variables, functions, if statements, for loops, while statements, arrays, strings, and dictionaries. Ideally, you’ve learned some object-oriented programming (classes, attributes, methods) as well. We will speak in terms of Python, but the general ideas are applicable to any programming language. Read more...

Back to Top ↑

2019

Inverse Matrices

In this chapter, we introduce the idea of the inverse of a matrix, which undoes the transformation of that matrix. Read more...

Rescaling, Shearing, and the Determinant

The key insight in this chapter is that every square matrix can be decomposed into a product of rescalings and shears. Before we elaborate on that, though, let’s discuss what rescalings and shears are, in terms of matrices. Read more...

N-Dimensional Volume Formula

N-dimensional volume generalizes the idea of the space occupied by an object:

  • • 1-dimensional volume refers to the space occupied by a 1-dimensional object, such as the length of a line segment.
  • • 2-dimensional volume refers to the space occupied by a 2-dimensional object, such as the area of a square.
  • • 3-dimensional volume is what we normally mean by the word “volume” -- the amount of space occupied by a 3-dimensional object, such as the volume of a cube.
Read more...

Elimination as Vector Reduction

Recall that systems of linear equations can be solved through elimination, multiplying equations by constants and adding equations to each other to cancel variables. Read more...

Lines and Planes

A line starts at an initial point and proceeds straight in a constant direction. Thus, we can write the equation of a line as Read more...

N-Dimensional Space

N-dimensional space consists of points that have N components. For example, $(0,0)$ is the origin in 2-dimensional space, $(0,0,0)$ is the origin in 3-dimensional space, and $(0,0,0,0)$ is the origin in 4-dimensional space. Read more...

Manipulating Taylor Series

To find the Taylor series of complicated functions, it’s often easiest to manipulate the Taylor series of simpler functions, such as those given below. Read more...

Taylor Series

The sum formula for a geometric series is an example of representing a non-polynomial function as an infinite polynomial within a particular range of inputs. Read more...

Variation of Parameters

When we know the zero solutions $y_0$ of a differential equation $y’^\prime+a_1(x)y’+a_0(x)y=f(x)$, we can use a method called variation of parameters to find the particular solution. This method is especially useful in cases where we are unable to guess the particular solution through undetermined coefficients. Read more...

Slope Fields and Euler Approximation

When faced with a differential equation that we don’t know how to solve, we can sometimes still approximate the solution by simpler methods. If we just want to get an idea of what the solutions of the differential equation look like on a graph, we can construct a slope field. Read more...

Separation of Variables

In differential equations, we are given an equation in terms of the derivative(s) of some function, and we need to solve for the function that makes the equation true. Read more...

Integration by Parts

Integration by parts is another technique for simplifying integrals. We can apply integration by parts whenever an integral would be made simpler by differentiating some expression within the integral, at the cost of anti-differentiating another expression within the integral. The formula for integration by parts is given below: Read more...

Integration by Substitution

Complicated integrals can sometimes be made simpler through the method of substitution. Substitution involves condensing an expression of $x$ into a single variable, say $u$, and then expressing the integral in terms of $u$ instead of $x$. Read more...

Antiderivatives

An antiderivative of a function $f(x)$ is a function $F(x)$ whose derivative is $f(x)$, i.e. $F’(x)=f(x)$. Read more...

L’Hôpital’s Rule

L’Hôpital’s rule provides a way to evaluate limits that take the indeterminate forms of $\frac{0}{0}$ or $\frac{\infty}{\infty}$. It says that, for such limits, we can differentiate the numerator and denominator separately, without changing the actual value of the limit. Read more...

Differentials and Approximation

The chain rule tells us that we can treat the derivative $\frac{df}{dx}$ like a fraction when multiplying by other derivatives. In this chapter, we continue the idea of interpreting the derivative as a fraction, extending it to an even more literal sense. Read more...

Finding Extrema

Derivatives can be used to find a function’s local extreme values, its peaks and valleys. At its peaks and valleys, a function’s derivative is either $0$ (a smooth, rounded peak/valley) or undefined (a sharp, pointy peak/valley). Read more...

Derivatives of Non-Polynomial Functions

In this chapter, we introduce rules for the derivatives of exponential, logarithmic, trigonometric, and inverse trigonometric functions. Although it’s possible to compute each derivative using the difference quotient, it will take a long time to compute derivatives during calculus problems if we have to start from scratch with the difference quotient process every time – so it’s advantageous to remember the derivative rules. The derivative rules are to calculus, what the multiplication table is to arithmetic. Read more...

Properties of Derivatives

We know that when differentiating polynomials, we can differentiate each term individually. But why are we able to do this? Does multiplication work the same way? What about division? We answer these questions in this chapter. Read more...

Chain Rule

The chain rule tells us how to take derivatives of compositions of functions. Informally, it says that we can “forget” about the inside of a function when we take the derivative, as long as we multiply by the derivative of the inside afterwards. Read more...

Power Rule for Derivatives

It can be a pain to evaluate the difference quotient every time we want to take the derivative of a function. Luckily, there are some patterns in derivatives that allow us to compute derivatives without having to go through all the steps of computing the limit of the difference quotient. Read more...

Derivatives and the Difference Quotient

The derivative of a function is the function’s slope at a particular point. We can approximate the derivative at a point $(x,f(x))$ by choosing another nearby point on the function, and computing the slope. If we increase the input $x$ by a small amount $\Delta x$, then we reach an x-coordinate of $x+\Delta x$, and the corresponding point on the function is $(x+\Delta x, f(x+\Delta x))$. Read more...

Evaluating Limits

The limit of a function $f(x)$, as $x$ approaches some value $a$, is the value we would expect for $f(a)$ if we saw only the portion of the graph around (but not including) $x=a$. If the resulting value is $L$, then we denote the limit as follows: Read more...

Back to Top ↑

2018

Compositions of Functions

Compositions of functions consist of multiple functions linked together, where the output of one function becomes the input of another function. Read more...

Rescalings of Functions

When a function is rescaled, it is stretched or compressed along one of the axes, like a slinky. The function’s general shape is preserved, but it might look a bit thinner or fatter afterwards. Read more...

Shifts of Functions

When a function is shifted, all of its points move vertically and/or horizontally by the same amount. The function’s size and shape are preserved – it is just slid in some direction, like sliding a book across a table. Read more...

Radical Functions

A radical function is a function that involves roots: square roots, cube roots, or any kind of fractional exponent in general. We can often infer what their graphs look like by sandwiching them between polynomial functions. Read more...

Vertical Asymptotes of Rational Functions

Unlike polynomials, rational functions can “blow up” to positive or negative infinity even for relatively small input values. Such input values are called vertical asymptotes, because they represent vertical lines that the function approaches but never quite reaches. Read more...

Horizontal Asymptotes of Rational Functions

Like polynomials, rational functions can have end behavior that goes to positive or negative infinity. However, rational functions can also have another form of end behavior in which they become flat, approaching (but never quite reaching) a horizontal line known as a horizontal asymptote. Read more...

Polynomial Long Division

A rational function is a fraction whose numerator and denominator are both polynomials. Rational functions are usually written in proper form, where the numerator is of a smaller degree than the denominator. (The degree of a polynomial is its highest exponent.) Read more...

Standard Form and End Behavior of Polynomials

Polynomials include linear expressions and quadratic expressions, as well as expressions adding multiples of higher exponents of the variable. Rational functions are usually written in proper form, where the numerator is of a smaller degree than the denominator. (The degree of a polynomial is its highest exponent.) Read more...

Systems of Inequalities

To solve a system of inequalities, we need to solve each individual inequality and find where all their solutions overlap. For example, to solve the system Read more...

Quadratic Inequalities

Quadratic inequalities are best visualized in the plane. For example, to solve a quadratic inequality $-x^2+x+2>0$, we can find the values of $x$ where the parabola $y=-x^2+x+2$ is positive. Read more...

Linear Inequalities in the Plane

When a linear equation has one variable, the solution covers a section of the number line: if our solution is $x > \text{some number}$, then the solution covers the section of the number line that lies right of that number; if our solution is $x<\text{some number}$, then the solution covers the section of the number line that lies left of that number. Read more...

Quadratic Systems

Systems of quadratic equations can be solved via substitution. After substituting, the resulting equation can itself be reduced down to a quadratic equation and solved by techniques covered in this chapter. Read more...

Completing the Square

Completing the square is another method for solving quadratic equations. Although we can use the quadratic formula to solve any quadratic equation, completing the square helps us gain a better intuition for quadratic equations and understand where the quadratic formula comes from. Read more...

Quadratic Formula

Some quadratic equations cannot be factored easily. For example, in the equation $x^2+3x+1=0$, we need to find two factors of $1$ that add to $3$. But the only integer factors of $1$ are $1$ and $1$, and they definitely don’t add to $3$! Read more...

Factoring Quadratic Equations

Factoring is a method for solving quadratic equations. It involves converting the quadratic equation to standard form, then factoring it into a product of two linear terms (called factors), and finally solving for the variable values that make either factor equal to $0$. Read more...

Linear Systems

A linear system consists of multiple linear equations, and the solution of a linear system consists of the pairs that satisfy all of the equations. Read more...

Point-Slope Form

Suppose we want to write the equation of a line with a given slope $m=2$, through a particular point $(3,5)$. In the previous post, we substituted the given information into a slope-intercept equation form $y=mx+b$, solved for $b$, and rewrote the slope-intercept form with $m$ and $b$ substituted so that $x$ and $y$ were the only variables. Read more...

Solving Linear Equations

Loosely speaking, a linear equation is an equality statement containing only addition, subtraction, multiplication, and division. It does not need to include all of these operations, but it cannot include operations beyond them, such as exponentiation. Read more...

Back to Top ↑

2000

Back to Top ↑