Understanding Self-supervised Learning with Dual Deep Networks
We propose a novel theoretical framework to understand self-supervised
learning methods that employ dual pairs of deep ReLU networks (e.g., SimCLR,
BYOL). First, we prove that in each SGD update of SimCLR, the weights at each
layer are updated by a...
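
As a concrete point of reference for the setup the abstract describes, the sketch below illustrates one SGD update of a SimCLR-style objective: two augmented views of each sample are passed through a shared deep ReLU network and a contrastive loss is minimized. This is a minimal illustration, not the paper's formal model; the `ReLUEncoder`, `nt_xent`, the toy Gaussian "augmentations", and all hyperparameters are assumptions made here, with the standard NT-Xent (InfoNCE) loss standing in for the loss being analyzed.

```python
# Minimal sketch of one SimCLR-style SGD step (illustrative assumptions only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReLUEncoder(nn.Module):
    """Deep ReLU network; the 'dual pair' processes the two augmented views with shared weights."""
    def __init__(self, dim_in=32, dim_hidden=64, dim_out=16, depth=3):
        super().__init__()
        layers, d = [], dim_in
        for _ in range(depth - 1):
            layers += [nn.Linear(d, dim_hidden), nn.ReLU()]
            d = dim_hidden
        layers += [nn.Linear(d, dim_out)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent / InfoNCE loss over a batch of paired embeddings."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # 2N x d, unit-norm embeddings
    sim = z @ z.t() / tau                                # pairwise cosine similarities
    n = z1.shape[0]
    mask = torch.eye(2 * n, dtype=torch.bool)
    sim = sim.masked_fill(mask, float('-inf'))           # exclude self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])  # index of each positive pair
    return F.cross_entropy(sim, targets)

# One SGD update of the shared-weight encoder on the two views.
encoder = ReLUEncoder()
opt = torch.optim.SGD(encoder.parameters(), lr=0.1)

x = torch.randn(8, 32)                  # toy batch
view1 = x + 0.1 * torch.randn_like(x)   # stand-in for data augmentations
view2 = x + 0.1 * torch.randn_like(x)

loss = nt_xent(encoder(view1), encoder(view2))
opt.zero_grad()
loss.backward()
opt.step()
```

In this toy form, each layer's gradient is accumulated over the batch of view pairs, which is the per-layer SGD update whose structure the abstract's first result characterizes.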