The premise of self-supervised learning is that we can partially turn an unsupervised learning problem into a supervised one. This is done with proxy (pretext) supervised tasks whose labels are generated automatically, for free, and whose solution requires the model to understand the context of the data.

i.e., the tasks are designed so that solving them forces the model to learn robust representations

A simple example of this is to rotate images by a random angle and train the model to predict that angle. The rotation itself provides the label for free, as sketched below.
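
Here is a minimal sketch of the rotation-prediction pretext task, assuming PyTorch is available. The model `RotationNet`, the four-way rotation scheme (0°/90°/180°/270°), and the random tensors standing in for unlabeled images are illustrative choices, not from the text above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_rotation_batch(images: torch.Tensor):
    """Rotate each image by 0, 90, 180, or 270 degrees.

    The rotation index (0-3) is the free label: no human annotation needed.
    """
    labels = torch.randint(0, 4, (images.size(0),))
    rotated = torch.stack(
        [torch.rot90(img, k=int(k), dims=(1, 2)) for img, k in zip(images, labels)]
    )
    return rotated, labels

class RotationNet(nn.Module):
    """Small CNN: the backbone learns representations, the head predicts rotation."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, 4)  # 4 rotation classes

    def forward(self, x):
        return self.head(self.backbone(x))

# One self-supervised training step on random tensors (stand-in for real unlabeled images).
model = RotationNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

images = torch.rand(16, 3, 32, 32)
rotated, labels = make_rotation_batch(images)
optimizer.zero_grad()
loss = F.cross_entropy(model(rotated), labels)
loss.backward()
optimizer.step()
```

After pretext training, the classification head is typically discarded and the backbone's representations are reused (or fine-tuned) for the downstream task.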