TL;DR

Continual learning updates models with new data while retaining old knowledge. Strategies: regularization (EWC), replay (store old examples), or architecture methods (progressive networks).

The problem: catastrophic forgetting

Fine-tuning on new data often overwrites previously learned weights: the model "forgets" how to perform its original tasks.

Solutions

Regularization: Penalize changes to weights important for previous tasks (EWC, SI)
Replay: Mix stored (or generated) old examples with new data during training
Architectural: Allocate new parameters for each new task while keeping old ones intact (progressive networks)
Meta-learning: Train the model to be updatable with minimal forgetting
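The regularization idea can be made concrete. A minimal NumPy sketch of an EWC-style penalty, assuming a precomputed diagonal Fisher importance estimate (function and variable names here are illustrative, not from any library):

```python
import numpy as np

def ewc_loss_and_grad(theta, grad_task, theta_old, fisher, lam=1.0):
    """EWC-style update: quadratic penalty anchoring important weights.

    theta     : current parameters (flat vector)
    grad_task : gradient of the new-task loss at theta
    theta_old : parameters saved after the previous task
    fisher    : per-parameter importance (diagonal Fisher estimate)
    lam       : penalty strength (illustrative default)
    """
    # Penalty grows with movement away from theta_old, weighted by importance.
    penalty = 0.5 * lam * np.sum(fisher * (theta - theta_old) ** 2)
    # Gradient of the penalty pulls important weights back toward theta_old.
    grad = grad_task + lam * fisher * (theta - theta_old)
    return penalty, grad
```

Weights with high Fisher values are strongly anchored, while unimportant weights remain free to adapt to the new task.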

Use cases

  • Personalization (adapt to user over time)
  • Domain adaptation (new industries, languages)
  • Evolving knowledge (update facts)

Challenges

  • Balancing stability (old knowledge) against plasticity (new learning)
  • Storage and privacy constraints on replay buffers
  • Computational cost of extra penalties or replayed batches
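The storage challenge for replay is often handled by capping the buffer. A small sketch, assuming reservoir sampling to keep an approximately uniform sample of all examples seen in bounded memory (the class and its interface are illustrative):

```python
import random

class ReplayBuffer:
    """Fixed-capacity replay buffer using reservoir sampling, so
    storage stays bounded no matter how many examples stream in."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []
        self.seen = 0  # total examples observed so far

    def add(self, example):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            # Replace a random slot with probability capacity / seen,
            # which keeps the buffer a uniform sample of the stream.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = example

    def sample(self, k):
        """Draw a mini-batch of old examples to mix with new data."""
        return random.sample(self.data, min(k, len(self.data)))
```

During training, each new batch would be combined with `buffer.sample(k)` so gradients reflect both old and new tasks.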