How to Improve your Flashcard Knowledge Base

Even the best flashcard developers among us create bad cards on a regular basis (e.g. too long, ambiguous, useless information).

Given the reality that we all are highly imperfect at developing our flashcards, what should we do to improve these crummy cards as they age so you can spend less time reviewing and remember the concepts better?

I call this process flashcard “refactoring” (a term borrowed from software development).

Why refactor flashcards?

Reviewing old flashcards requires time and effort. Here are a few reasons why it’s worth the price:

  • It improves your understanding of the material. The process of breaking learning material down into the smallest “chunks” possible that fit onto flashcards is an extremely valuable exercise. Reviewing troublesome cards clarifies what you don’t understand and forces you to restructure your knowledge in a way that makes sense.
  • Your worst flashcards take up a disproportionate amount of time and effort, while yielding the worst results in terms of retention and usefulness. Following the 80-20 rule, 20% of your cards leads to 80% of the effort in review. So it’s a high value activity to hunt down this subset of your cards.
  • It provides knowledge construction training. Creating good flashcards is a nontrivial skill built over time. You can read Poitr Wozniak 20 rules for formulating knowledge, but actually observing your own performance on your cards and troubleshooting improvements takes your skills to the next level.

The process I use has two broad steps: selection and revision.

Selecting Problem Cards

I use two main methods to find cards needing review.

The first and most important method is finding cards I keep failing (“lapse” is the term in Anki). In the Anki browser, I can use the command prop:lapses>n to find the cards that have lapsed over n times. For me, cards are never lapse more than 8 times because Anki then marks it as a leech and automatically suspends it. Cards that have lapsed 5 or more times are great candidates for refactoring.

The other method is “marking” cards during review when I notice a card is poorly formed. I also try to make notes on marked cards describing what’s causing the problems (coming back to the cards at a later time, you can easily forget the specific issue that tripped you up).

Reviewing and Revising Problem Cards

The first step examining a difficult card is to ask whether I need this knowledge at all. If not, that’s the end of the process – I just delete the card and I’m done with it. I may also revisit the source material or do some Googling on the topic, which will sometimes reveal that the card is pointless or inaccurate.

If I decide that it’s important and relevant knowledge I want to keep, then I’ll examine the card for issues, using the principles from Poitr Wozniak Twenty Rules of Formulating Knowledge.

Example

Consider this data engineering card from my deck which was recently giving me problems:

  • Side 1: Tail latency amplification
  • Side 2: Even if small % backend calls slow, chance of getting a slow call increases if user request requires multiple backend calls, and so a higher proportion of end-user requests end up being slow.

First off, is this card relevant and worthwhile? For me, the answer is definitely yes: it’s both relevant to my job as a data scientist and my software engineering side projects.

Next, diagnose the problem. On closer examination, there are a few things wrong with the card:

In cases where I add material I don’t fully understand, I find the best approach is to go back to the source (which in this case is the book Designing Data-Intensive Applications by Martin Kleppmann). I then refactored the card like this:

  • Side 1: Tail latency amplification (Kleppmann)
  • Side 2: Multiple back-end calls for a single user request increases chance of encountering a tail latency. (Kleppmann)

As you can see, I added a source to clarify where the information came from (Rule 18: Provide source).

I was curious what other cards I have about tail latency, and it turns out there are none! Seems ridiculous to have a card about tail latency amplification, but not have a single one about tail latency which is a more common term. Not having this in my deck probably contributed to interference since I never tested myself on the distinction between the two concepts. So I added:

  • Side 1: Tail latency (Kleppmann)
  • Side 2: High percentile response time. (Kleppmann)

Note that the tail latency amplification card uses tail latency in its response. I’m hoping this will limit confusion between the two and emphasize the distinction (Rule 13: Refer to other memories). I also italicized amplification, to hopefully further avoid interference.

Since I’ve made these changes, I haven’t had any problems with these cards and feel like I have a better grasp on the material. Consider doing the same for the important knowledge in your decks causing you trouble.