Tips

Here are some tips for developing valuable models.


Focus on solving a problem that fits you.
  • Success comes from creating value, and value is created by solving problems. To solve a problem, you need 1) the ability to solve it and 2) the persistence to stick with it throughout all the difficulties that will surely arise (otherwise the problem would already have been solved).
  • The ability to solve a problem is a combination of domain knowledge and technical expertise -- domain knowledge to envision a feasible solution, and technical expertise to make that vision a reality.
  • The willingness to persist has a more emotional root -- doing something you love, or wanting to fix something that angers you (ideally both).

Consider the full problem from the beginning.
  • A model is worthless if it does not solve the desired problem, no matter how elegant or theoretically interesting it is.
  • It is easier to simplify a convoluted model that solves the problem than to extend an elegant model that does not.
  • Models that are intentionally designed to solve specific real problems usually turn out to be theoretically interesting and fun to build, but a theoretically interesting and fun-to-develop model designed in a vacuum will rarely happen to solve any sort of real problem. (Looking at you, topological data analysis!)

Both you and your model need to understand the full context surrounding the problem.
  • The first step to developing a model is to gather domain knowledge and fully grasp the context in which the model is meant to exist. If you skip this step, then your model might work in theory but probably not in real life.
  • To gather domain knowledge, you need hands-on experience. So, avoid domains where you're averse to doing things manually and getting your hands dirty.
  • A model can only be as good as the underlying data. If you want your model to do what an expert does, it needs to have all the information that an expert uses during their decision-making process. (Heard this one from Jason Roberts, who heard it from Peter Stone.)
  • Better data (or better data preparation) usually adds a lot more alpha than more sophisticated models.

Choose the right level for your first principles.
  • It is often more efficient to manually encode expert knowledge in a structured data set and build a model on top of that than to attempt to build a model that does everything from scratch.
  • It's easy to rationalize that manually encoding expert knowledge takes too long. But if spending several weeks (or even months) creating a structured data set by hand will allow your model to accomplish important goals that it couldn't otherwise, then it's totally worth doing.
  • Plus, when you have to manually encode expert knowledge, it means that you're creating highly relevant data that isn't publicly accessible. This gives you a major edge over any competitor who is not a domain expert or is unwilling to endure tedium for the sake of the model.

Leverage your intuition and emotions.
  • Routinely step back from the theory and implementation and observe your model's behavior. It needs to make sense intuitively and "feel" right emotionally. (If you've spent enough time building domain knowledge by doing things manually and getting your hands dirty, then you should have emotional reactions to the decisions the model makes.)
  • The best machine learning model you have is your brain, and your brain only interfaces with interpretable computer models.
  • The more linear and low-dimensional a model is, the easier it is to find good parameters using your intuition alone.
  • Emotion is an essential part of the feedback loop for improving a model: 1) inspect the model's output, 2) produce a negative emotional reaction, 3) introspect your emotions to identify the root cause of the negativity, 4) describe what the output needs to look like in order to produce a positive emotional reaction, 5) tweak the model to give the desired output, 6) return to step 1.

Make your model robust and reliable.
  • Make your model robust to data issues (but make sure it logs a warning whenever it comes across a data issue). Data issues will happen from time to time, especially if the model is being developed in parallel with the underlying data infrastructure. The model can't just fall over and refuse to work whenever data issues happen.
  • The more complex your model is, the more internal validation it needs. Depending on the severity and veracity of a failed sanity check, the model should either log a warning or throw an error, halt, and alert you.
  • Unit tests are ideal, but if your model exists within a very complex system, then your unit tests won't cover all the possible edge cases no matter how hard you try. So internal validation becomes very important.
  • To gain confidence in your model and speed up the debugging process, it helps to generate human-readable justifications for why your model makes the decisions it does.
  • It's often worth investing some time to make your logs highly informative yet easy to skim. (Indents and empty line dividers are your friends.) Tuning and debugging go much faster if you can see the forest for the trees.
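
Here is a minimal Python sketch of the ideas above, assuming a toy "model" that just averages numeric values. Every name in it (clean_records, check, predict_mean, ValidationError, the "value" field) is hypothetical and chosen only for illustration. It shows three things: bad records are skipped with a logged warning instead of crashing the model, a failed sanity check either warns or halts depending on how serious it is, and the prediction comes back with a short human-readable justification.

```python
import logging

logger = logging.getLogger("model")


class ValidationError(Exception):
    """Raised when a sanity check fails badly enough that the model should halt."""


def clean_records(records):
    """Drop records whose 'value' is missing or non-numeric, warning about each."""
    cleaned = []
    for i, record in enumerate(records):
        value = record.get("value")
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            logger.warning("record %d skipped: bad value %r", i, value)
            continue
        cleaned.append(record)
    return cleaned


def check(condition, message, fatal=False):
    """Internal sanity check: warn on minor failures, halt on fatal ones."""
    if condition:
        return True
    if fatal:
        logger.error("sanity check failed: %s", message)
        raise ValidationError(message)
    logger.warning("sanity check failed: %s", message)
    return False


def predict_mean(records):
    """Toy 'model': return the mean of the usable values plus a justification."""
    cleaned = clean_records(records)
    # Nothing usable at all: the output would be meaningless, so halt.
    check(len(cleaned) > 0, "no usable records", fatal=True)
    # Some records dropped: worth a warning, but the model can still run.
    check(len(cleaned) == len(records), "some records were dropped")
    mean = sum(r["value"] for r in cleaned) / len(cleaned)
    justification = f"mean of {len(cleaned)} usable records out of {len(records)}"
    return mean, justification
```

In a real system, the fatal branch is also where you would hook in whatever alerting you use. The justification string is deliberately short so that it stays easy to skim in the logs.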

Never stall out. (Corollary: Control the data-generating process.)
  • Keep forward momentum. If a model is not producing a desired behavior and you're out of ideas, then temporarily hard-code the desired behavior as an "intervention", move on, and periodically revisit the intervention to try out more elegant ideas.
  • If you don't control the data-generating process, then it becomes vastly more difficult (and sometimes impossible) to resolve data issues. You either need to own the data-generating process yourself or have trust, a good relationship, and an open line of communication with the person who does.
  • Don't embark on a project unless you have some solid ideas on how to approach it. If your desired outcome feels magical, then you probably don't (yet) have enough technical knowledge to achieve it.
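
One way to implement the "intervention" idea is a small table of hard-coded overrides that the model consults before running. The Python sketch below is only an illustration: base_model is a stand-in for a real model, and INTERVENTIONS, predict, and redundant_interventions are hypothetical names. Each override stores the reason it exists, and a helper lists the overrides that the model now reproduces without help, so you know which ones to retire when you revisit them.

```python
import logging

logger = logging.getLogger("model.interventions")

# Hypothetical registry of hard-coded overrides, keyed by input.
# Each entry records why it exists so it can be revisited later.
INTERVENTIONS = {
    "edge_case_input": {
        "output": "desired_output",
        "reason": "model mishandles this case; revisit with a better idea",
    },
}


def base_model(x):
    """Stand-in for the real model (here it just upper-cases a string)."""
    return x.upper()


def predict(x):
    """Return the hard-coded output if an intervention exists, else run the model."""
    override = INTERVENTIONS.get(x)
    if override is not None:
        logger.info("intervention applied for %r: %s", x, override["reason"])
        return override["output"]
    return base_model(x)


def redundant_interventions():
    """List interventions the model already reproduces, so they can be removed."""
    return [
        x for x, override in INTERVENTIONS.items()
        if base_model(x) == override["output"]
    ]
```

Keeping the overrides in one place, and logging each time one fires, makes it harder to forget that they exist.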

Focus on high-ROI tasks.
  • While it's good to keep developing technical expertise, returns are diminishing. However, you can always become orders of magnitude more productive by working on high-ROI tasks as opposed to low-ROI tasks.
  • Even when you're working on a high-ROI task, you should periodically re-evaluate to make sure it's still high-ROI. Sometimes the act of working on a task reveals new information that indicates a more efficient way to accomplish the same goal.
  • Maintain a priority queue, not a to-do list. You'll never complete everything in your priority queue since the rate at which items are added will exceed the rate at which items are completed. Therefore, it's essential to work on the highest-priority items first.
  • If you don't have any high-ROI tasks, then you need to spend some time thinking creatively about the future. It's essential to strike a balance between shutting out distractions versus allowing yourself to think creatively about the future, so that you can think of new high-ROI tasks as often as you complete existing high-ROI tasks. Work hard AND smart.
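
If you want to take the priority queue literally, here is a small Python sketch built on the standard heapq module. The TaskQueue class, the task names, and the ROI numbers are all made up for illustration; in practice the ROI is a rough estimate that you revise as you learn more.

```python
import heapq
import itertools


class TaskQueue:
    """Priority queue of tasks, ordered by estimated ROI (highest first)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: earlier tasks pop first

    def add(self, name, roi):
        # heapq is a min-heap, so store the negated ROI to pop the highest first.
        heapq.heappush(self._heap, (-roi, next(self._counter), name))

    def pop(self):
        """Remove and return the name of the highest-ROI task."""
        _, _, name = heapq.heappop(self._heap)
        return name

    def reprioritize(self, name, roi):
        """Re-estimate a task's ROI after new information (rebuilds the heap)."""
        self._heap = [entry for entry in self._heap if entry[2] != name]
        heapq.heapify(self._heap)
        self.add(name, roi)

    def __len__(self):
        return len(self._heap)


queue = TaskQueue()
queue.add("refactor old module", 1.0)
queue.add("fix data pipeline", 10.0)
queue.add("write docs", 3.0)

first = queue.pop()                             # highest-ROI task comes out first
queue.reprioritize("refactor old module", 5.0)  # new information raised its ROI
second = queue.pop()
```

Unlike a to-do list, nothing here assumes the queue will ever be empty. The only guarantee is that whatever you pick up next is the best use of your time given your current estimates.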

If the goal is to "wow" users, then the model must clearly demonstrate its sophistication.
  • Valuable models often work so elegantly and efficiently that they make extraordinarily difficult tasks seem easy. If the model does not demonstrate its sophistication, then users may not experience a "wow" moment, and they may even think that the model is wrong (since the complexity of the task is far beyond their perceived complexity of the model).
  • To appreciate the sophistication of a model, users need to understand and be able to personally verify what the model is doing at a high level. Consequently, it's necessary for the model to be very interpretable at a high level.
  • Low-level implementation details should remain hidden since they do not matter to the user (and are often the "secret sauce" behind the model).
  • Visualizations and animations are ideal because they give the user a clear picture of what is going on at a high level and can demonstrate complexity in an aesthetically pleasing way without overwhelming the user.