I share my progress implementing a research idea from scratch: building an ensemble model out of the students of label-free self-distillation, without any additional data or augmentation. It turns out this actually works, and interestingly, the more students I employ, the better the accuracy. This leads to the hypothesis that the ensemble effect is not a process of extracting more information from the labels.
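A minimal sketch of the two ingredients, assuming softmax classifiers (the function names and toy shapes are mine, not from the video's codebase): students train against the teacher's softened output distribution rather than ground-truth labels, and the ensemble prediction averages the students' probability distributions.

```python
import numpy as np

def softmax(z, axis=-1):
    # numerically stable softmax
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distill_targets(teacher_logits, temperature=2.0):
    # Label-free self-distillation: students fit the teacher's
    # softened distribution, never the ground-truth labels.
    return softmax(teacher_logits / temperature)

def ensemble_predict(student_probs):
    # student_probs: (num_students, num_samples, num_classes).
    # Average the students' distributions, then take the argmax class.
    return np.mean(student_probs, axis=0).argmax(axis=-1)

# toy example: 3 students, 4 samples, 5 classes
rng = np.random.default_rng(0)
probs = softmax(rng.normal(size=(3, 4, 5)))
preds = ensemble_predict(probs)
print(preds.shape)  # one predicted class per sample: (4,)
```

Adding more students only changes the first axis of `student_probs`; the averaging step is what the video's experiments vary.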
OUTLINE:
0:00 – Introduction
2:10 – Research Idea
4:15 – Adjusting the Codebase
25:00 – Teacher and Student Models
52:30 – Shipping to the Server
1:03:40 – Results
1:14:50 – Conclusion
Code:
References:
My Video on SimCLRv2:
Born-Again Neural Networks:
Deep Ensembles: A Loss Landscape Perspective:
Links:
YouTube:
Twitter:
Discord:
BitChute:
Minds:
Parler: