Additional Readings


This is a growing list of papers and implementations I find interesting.

Long Tailed Recognition

  • Large-Scale Long-Tailed Recognition in an Open World
    • In real-world scenarios, new unseen classes or new samples within the tail classes frequently appear
    • This tackles the problem with a dynamic embedding that brings in associative memory to aid prediction of long-tailed classes
    • The model essentially combines direct image features with embeddings from other classes
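
To make the idea of combining direct features with embeddings from other classes concrete, here is a minimal sketch in plain Python. It is an illustration under simplifying assumptions, not the paper's exact formulation: the attention over class centroids uses plain dot-product similarity, and `gate` is a fixed scalar rather than something learned. The names `memory_augmented_feature`, `centroids`, and `gate` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memory_augmented_feature(direct, centroids, gate=0.5):
    """Combine a direct feature with a memory feature built from class centroids.

    direct:    feature vector of one image (list of floats)
    centroids: one mean feature vector per class (the "memory")
    gate:      hypothetical fixed weight on the memory feature
    """
    # Similarity of the direct feature to every class centroid.
    scores = [sum(d * c for d, c in zip(direct, centroid)) for centroid in centroids]
    attention = softmax(scores)

    # Memory feature: attention-weighted sum of the class centroids.
    dim = len(direct)
    memory = [
        sum(a * centroid[k] for a, centroid in zip(attention, centroids))
        for k in range(dim)
    ]

    # Final embedding: direct feature plus gated memory feature.
    combined = [d + gate * m for d, m in zip(direct, memory)]
    return combined, attention


if __name__ == "__main__":
    centroids = [[1.0, 0.0], [0.0, 1.0]]  # two toy class centroids
    combined, attention = memory_augmented_feature([2.0, 0.0], centroids)
    print(combined, attention)
```

A tail-class sample with a weak direct feature can borrow structure from well-populated classes through the memory term, which is the intuition behind the approach.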

Deep Reinforcement Learning


  • Online Learning Rate Adaptation with Hypergradient Descent
    • Reduces the need for learning rate scheduling for SGD, SGD with Nesterov momentum, and Adam
    • Uses the concept of hypergradients (gradients w.r.t. the learning rate) obtained via reverse-mode automatic differentiation to dynamically update the learning rate in real time alongside weight updates
    • Little additional computation because it needs just one additional copy of the original gradients stored in memory
    • Severely under-appreciated paper
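
The update described above can be sketched for plain SGD as follows. This is a minimal illustration, assuming a user-supplied gradient function instead of automatic differentiation; `sgd_hd`, `grad_fn`, and the hyper-learning-rate `beta` are names chosen here for illustration. For SGD the hypergradient reduces to the negative dot product of the current and previous gradients, which is why only one extra copy of the gradient has to be kept.

```python
def sgd_hd(grad_fn, theta, alpha=0.01, beta=0.001, steps=100):
    """SGD with a hypergradient-adapted learning rate (illustrative sketch).

    grad_fn: function returning the gradient (list of floats) at theta
    theta:   initial parameters (list of floats)
    alpha:   initial learning rate
    beta:    step size used to update the learning rate itself
    """
    prev_grad = [0.0] * len(theta)  # the one extra gradient copy kept in memory
    for _ in range(steps):
        grad = grad_fn(theta)
        # Hypergradient of the loss w.r.t. alpha: -(current grad . previous grad).
        hypergrad = -sum(g * pg for g, pg in zip(grad, prev_grad))
        # Gradient step on the learning rate, then on the weights.
        alpha = alpha - beta * hypergrad
        theta = [t - alpha * g for t, g in zip(theta, grad)]
        prev_grad = grad
    return theta, alpha


if __name__ == "__main__":
    # Toy objective f(theta) = 0.5 * sum(theta_i ** 2), whose gradient is theta.
    theta, alpha = sgd_hd(lambda th: list(th), [5.0, -3.0])
    print(theta, alpha)
```

When consecutive gradients point the same way, the dot product is positive and the learning rate grows; when they oppose each other (overshooting), it shrinks. On the toy quadratic, the learning rate rises well above its initial 0.01 and the parameters approach zero far faster than with a fixed rate.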

Network Pruning


  • A Unified Approach to Interpreting Model Predictions
    • Introduces SHAP (SHapley Additive exPlanations)
    • "SHAP assigns each feature an importance value for a particular prediction"
      • Higher positive SHAP values (red) = increase the probability of the class
      • Higher negative SHAP values (blue) = decrease the probability of the class
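
To show what "an importance value for a particular prediction" means, the sketch below computes exact Shapley values by brute force for a tiny model. It does not use the SHAP library, and it makes a simplifying assumption: a "missing" feature is replaced by a single `baseline` value rather than averaged over background data. The names `shapley_values`, `model`, and `baseline` are illustrative. The cost grows exponentially with the number of features, so this only suits toy inputs.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley value of each feature of x for one prediction.

    model:    function mapping a list of floats to a float
    x:        the input being explained
    baseline: reference values used when a feature is treated as absent
    """
    n = len(x)

    def value(subset):
        # Model output when only the features in `subset` take their real values.
        masked = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(masked)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


if __name__ == "__main__":
    weights = [2.0, -1.0, 0.5]
    linear = lambda v: sum(w * vi for w, vi in zip(weights, v)) + 3.0
    print(shapley_values(linear, [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]))
```

For a linear model with a zero baseline, each value equals weight times feature value (up to floating-point rounding), so positive values push the prediction up and negative values push it down. The values always sum to the difference between the prediction for `x` and the prediction for `baseline`.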



  • Netron
    • Easily visualize your saved deep learning models (PyTorch .pth, TensorFlow .pb, MXNet .model, ONNX, and more)
    • You can even check out each node's documentation quickly in the interface