Analysis and Visualization Tools for Anki

13 Apr 2019 » James Diao » Boston, MA


I built a suite of tools that extracts scheduling data from the Anki flashcard app to generate summaries, visualizations, and workload projections.

Anki as a study tool

Just as a quick intro: Anki is a flashcard app, much like Quizlet. You learn by testing yourself on flashcards; you can either make them yourself or download shared decks online (see my post on Anki for Medical School for more info).

To better understand my own data and my workload for the next year (but also just for fun), I’ve developed a suite of tools for playing around with Anki’s scheduling data. Here are three of the cooler visualizations you can get from the app:

1. Card Distributions

This shows the hierarchical classification of the 23,000+ cards in my collection. As you can see, I’ve only reviewed a small fraction (the orange “Review” deck in the bottom-right corner), and am working on a larger chunk (the blue “Current” deck on the left).

Distribution Treemap: All

2. Learn/Review History

This is a plot of how much time I spend on learning new cards (green) vs. reviewing previous cards (blue). As you can see, I have a tendency to cram new cards in bursts, and this leads to some pretty large spikes in my workload.

Learn and Review History
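
For anyone curious how a history like this could be tabulated, here is a minimal sketch in Python. It is not the app's code. It assumes the review log sits in a SQLite table named `revlog`, where `id` is the review timestamp in epoch milliseconds, `time` is the answer time in milliseconds, and `type` is 0 for learning and 1 for review, which is how Anki's collection schema is commonly documented. The names `daily_minutes` and `build_demo_db` are hypothetical, and days are cut at UTC midnight for simplicity rather than at Anki's own day rollover.

```python
import sqlite3
from collections import defaultdict
from datetime import datetime, timezone

# revlog.type codes as commonly documented for Anki's schema
# (assumption: your Anki version uses the same codes).
KIND = {0: "learn", 1: "review", 2: "relearn", 3: "cram"}


def daily_minutes(conn):
    """Return {ISO date: {kind: minutes studied}} from a revlog table."""
    totals = defaultdict(lambda: defaultdict(float))
    for ts_ms, kind, time_ms in conn.execute("SELECT id, type, time FROM revlog"):
        # Simplification: bucket by UTC calendar day.
        day = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc).date().isoformat()
        totals[day][KIND.get(kind, "other")] += time_ms / 60000
    return {day: dict(kinds) for day, kinds in totals.items()}


def build_demo_db():
    """Hypothetical stand-in for a real collection: a tiny in-memory revlog."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE revlog (id INTEGER, type INTEGER, time INTEGER)")
    conn.executemany(
        "INSERT INTO revlog VALUES (?, ?, ?)",
        [
            (1555113600000, 0, 30000),   # 2019-04-13 00:00 UTC, learn, 30 s
            (1555117200000, 1, 60000),   # 2019-04-13 01:00 UTC, review, 60 s
            (1555120800000, 1, 30000),   # 2019-04-13 02:00 UTC, review, 30 s
            (1555200000000, 0, 120000),  # 2019-04-14 00:00 UTC, learn, 120 s
        ],
    )
    return conn
```

Calling `daily_minutes(build_demo_db())` gives one entry per day with separate learn and review totals, which is the shape a stacked time-series plot needs. With a real collection you would pass a connection to a copy of your own collection file instead.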

3. Workload Projection

Predicting the future is always tricky, but with enough data, it can actually be quite doable. Using your deck information and review history (how many new cards you learn per day, how many cards you need to get through, what your error rate is, etc.), we can simulate a workload curve all the way up to (and beyond) when the deck is complete. Full methodology and validation details are described in another post.

Workload Projection
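
To give a feel for what "simulating a workload curve" can mean, here is a toy sketch in Python. It is not the methodology from the other post. Everything in it is an assumption made for illustration: the function name `simulate_workload`, a fixed number of new cards per day, a first review one day after learning, intervals that grow by a constant multiplier on success (2.5 mirrors Anki's default starting ease), and lapsed cards restarting at the first interval.

```python
import random


def simulate_workload(deck_size, new_per_day, error_rate, days,
                      first_interval=1, multiplier=2.5, seed=0):
    """Toy model: return the number of reviews due on each simulated day.

    first_interval must be at least 1 day.
    """
    rng = random.Random(seed)          # seeded so runs are repeatable
    due = [[] for _ in range(days)]    # due[d]: current interval of each card due on day d
    reviews = [0] * days
    remaining = deck_size

    for day in range(days):
        # Introduce today's new cards; their first review comes first_interval days later.
        n_new = min(new_per_day, remaining)
        remaining -= n_new
        if day + first_interval < days:
            due[day + first_interval].extend([first_interval] * n_new)

        # Review everything due today and schedule each card's next review.
        reviews[day] = len(due[day])
        for interval in due[day]:
            if rng.random() < error_rate:
                next_interval = first_interval  # lapse: start over
            else:
                next_interval = max(interval + 1, int(interval * multiplier))
            if day + next_interval < days:
                due[day + next_interval].append(next_interval)

    return reviews
```

For example, `simulate_workload(deck_size=10, new_per_day=5, error_rate=0.0, days=10)` yields a short burst of reviews that thins out as intervals grow, while raising `error_rate` keeps more cards cycling back at short intervals and lifts the curve.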

Because there are too many settings and parameters to account for, the projected curve is subject to systematic bias. Empirically, this bias is almost purely multiplicative and does not change the shape of the curve (one exception is an error rate above 15%). Luckily, we can account for this by leveraging the observed data: a least-squares fit is used to rescale the projected curve and (hopefully) deliver better predictions.
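
Here is a minimal sketch of that rescaling step in Python, under one stated assumption: the fit is a single scale factor k with no intercept, chosen to minimize the squared error between the observed workload and k times the projection over the days where both exist. That has the closed form k = sum(observed * projected) / sum(projected^2). The function names `fit_scale` and `rescale_projection` are hypothetical, and the app's actual fit may differ.

```python
def fit_scale(projected, observed):
    """Least-squares k minimizing sum((observed - k * projected) ** 2)."""
    numerator = sum(p * o for p, o in zip(projected, observed))
    denominator = sum(p * p for p in projected)
    if denominator == 0:
        raise ValueError("projected curve is all zeros; the scale is undefined")
    return numerator / denominator


def rescale_projection(projected, observed):
    """Fit k on the days covered by `observed`, then scale the whole projection.

    Assumes observed[i] and projected[i] refer to the same day.
    """
    overlap = projected[:len(observed)]
    k = fit_scale(overlap, observed)
    return k, [k * p for p in projected]
```

As a usage example, a projection of `[100, 200, 300, 400]` reviews per day with observed counts of `[120, 240, 360]` for the first three days gives k = 1.2, so the fourth day's forecast moves from 400 to 480. Because only one multiplier is fitted, the shape of the curve is left untouched, which matches the multiplicative-bias observation above.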

4. Report Generation

All of these figures (of varying flavors) can also be downloaded as an R Markdown report within the app. Here’s an example of my own report:

AnkiReport_JamesDiao.pdf

Source Code

All code is available on GitHub:

Improvement Areas

  1. Loading feedback for report generation
  2. Productivity over the day
  3. Set up Shiny Server
