Q&A: Sources of Bivariate Data from Various Function Families

by Justin Skycak

Cross-posted from here.


I’m looking for sample data sets to give Algebra 2 students when teaching them how to run regressions in Desmos.

Some example data sets that folks at my school have already compiled are:

  • Number of Lego pieces vs Price of set (linear)
  • Arm span vs Height of person (linear)
  • Number of years since 1948 vs Average movie ticket price (linear or exponential)
  • Dollars spent on advertisements vs Dollars earned as revenue (square root)
  • NBA draft pick number vs Salary in dollars (exponential decay)

Those are just a few. We would like to add to our collection for each of the following function families: linear, absolute value, quadratic, cubic (and higher-degree polynomials), square root, reciprocal (or other rational functions), exponential (growth and decay), logarithmic, and sine or cosine.

An ideal answer would provide sources of bivariate data from the various function families. It is not critically important that the data be real.


Here are two resources off the top of my head that exercise a variety of function families:

Infections at the beginning of the COVID-19 pandemic

linear, exponential growth/decay, sine/cosine

  • Spring 2020 - Summer 2020 | the number of daily new cases increased linearly, and the total cumulative number of cases grew exponentially.
  • Fall 2020 - Fall 2021 | the number of daily new cases oscillated sinusoidally, and the total cumulative number of cases increased roughly linearly.
  • Winter 2022 - Summer 2023 | the number of daily new cases decayed roughly exponentially.
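
One quick way to check a claim like "grew exponentially" before handing the data to students is to take the logarithm of the counts and see whether a straight line fits. Here is a minimal sketch of that check in Python with NumPy. The numbers are synthetic stand-ins that I generate in the code, not real case counts, and the names `days`, `cases`, `a_hat`, and `b_hat` are just illustrative.

```python
import numpy as np

# Synthetic stand-in for cumulative counts (NOT real COVID-19 data):
# 100 cases on day 0, growing 8% per day, with some multiplicative noise.
rng = np.random.default_rng(1)
days = np.arange(30)
cases = 100.0 * 1.08**days * rng.lognormal(mean=0.0, sigma=0.05, size=days.size)

# If cases ~ a * b**day, then log(cases) ~ log(a) + day * log(b),
# which is a straight line in `days`. Fit that line by least squares.
slope, intercept = np.polyfit(days, np.log(cases), 1)

# Undo the logarithm to recover the parameters of the exponential model.
a_hat = float(np.exp(intercept))
b_hat = float(np.exp(slope))

print(f"estimated model: cases ~ {a_hat:.1f} * {b_hat:.3f}**day")
```

If the points of log(cases) vs. days look curved instead of straight, that suggests an exponential model is not a good description of that stretch of data.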

A made-up rocket launch data set (height of rocket vs time elapsed since launch)

linear, quadratic, higher-degree polynomial

  • This is a problem that I made for a modeling / machine learning class to illustrate the idea that "closer to the data points" is not always better when it comes to fitting models. You don't want to underfit, but you don't want to overfit either.
  • In this problem, we fit linear, quadratic, and high-degree polynomial models to the data. The linear model slightly underfits, the high-degree polynomial way overfits, and the quadratic provides a good fit.
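
To give a feel for the comparison, here is a rough sketch in Python with NumPy. It does not use the data set from the original problem: the heights below are hypothetical, generated from an assumed quadratic trajectory plus noise, and the names `true_height`, `train_rmse`, and `test_rmse` are made up for this illustration. The linear model here may also underfit more strongly than in the original problem.

```python
import numpy as np
from numpy.polynomial import Polynomial

def true_height(t):
    # Assumed underlying trajectory (hypothetical): a downward-opening parabola.
    return 50.0 * t - 4.9 * t**2

# Hypothetical measurements: 12 noisy height readings over 10 seconds.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 12)
h = true_height(t) + rng.normal(0.0, 5.0, size=t.size)

# Dense grid of times between the measurements, used to see how each
# fitted model behaves away from the points it was fitted to.
t_dense = np.linspace(0.0, 10.0, 200)

def rmse(pred, actual):
    # Root-mean-square error between predictions and target values.
    return float(np.sqrt(np.mean((pred - actual) ** 2)))

train_rmse, test_rmse = {}, {}
for degree in (1, 2, 9):
    # Least-squares polynomial fit of the given degree.
    model = Polynomial.fit(t, h, degree)
    # Error on the noisy measurements the model was fitted to.
    train_rmse[degree] = rmse(model(t), h)
    # Error against the noise-free curve on the dense grid.
    test_rmse[degree] = rmse(model(t_dense), true_height(t_dense))
    print(f"degree {degree}: fit error {train_rmse[degree]:.2f}, "
          f"error vs. true curve {test_rmse[degree]:.2f}")
```

The error on the fitted points can only shrink as the degree goes up, which is exactly why "closer to the data points" is a misleading target. The error against the underlying curve is the number that tells you whether the model is actually any good.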