Training machine-learning models to perform image classification tasks, such as recognizing damage in satellite images after a natural disaster, takes a large amount of data. These data can be difficult to find, and even when usable data exist, assembling them into a dataset can be expensive. Furthermore, even the most carefully built datasets can contain biases that negatively affect a model’s performance.
Researchers at MIT have devised a way to train a machine-learning model without using a dataset. Instead, they use a special type of machine-learning model to generate highly realistic synthetic data, which are then used to train another model for downstream vision tasks.
Their results show that a contrastive representation learning model trained using only these synthetic data can learn visual representations that are comparable to, or even better than, those learned from real data.
This special machine-learning model, known as a generative model, requires less memory to store or share than a dataset. Synthetic data also have the potential to sidestep the privacy and usage-rights concerns that limit how real data can be distributed. In addition, a generative model can be edited to remove certain attributes, such as race or gender, which could help address biases found in traditional datasets.
“We knew this method would work. We just had to wait for the generative models to improve. We were particularly pleased to show that this method sometimes does better than the real thing,” says Ali Jahanian, a researcher in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and lead author of the paper.
Jahanian wrote the paper with CSAIL graduate students Xavier Puig and Yonglong Tian. The senior author is Phillip Isola, an assistant professor in Electrical Engineering and Computer Science. The research will be presented at the International Conference on Learning Representations.
Generating synthetic data
Once a generative model has been trained on real data, it can create synthetic data that look almost identical to the real thing. Training involves showing the generative model millions of images that contain objects of a specific class (such as cars or cats). From these, it learns what a car or cat looks like so that it can generate similar objects.
Jahanian explains that, essentially by flipping a switch, researchers can use a pre-trained generative model to produce realistic images based on those in the model’s training dataset.
He says that generative models are even more useful because they learn how to transform the underlying data they were trained on. A model trained on images of cars can “imagine how a car might look in different situations,” including situations it did not see during training, and then output images showing the car in unique poses and colors.
Having multiple views of the same image matters for a technique called contrastive learning, in which a machine-learning model is shown many unlabeled images and learns which pairs are similar and which are different.
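To make the idea of learning from similar and dissimilar pairs concrete, here is a minimal sketch of one widely used contrastive objective, the InfoNCE loss. The article does not say which loss the researchers used, so treat this as an illustration rather than their method. The function names and the toy embedding values are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce_loss(anchors, positives, temperature=0.1):
    """Average InfoNCE loss over a batch of embedding pairs.

    anchors[i] and positives[i] are embeddings of two views of the same image.
    positives[j] with j != i act as the "different" examples for anchors[i].
    """
    anchors = [normalize(a) for a in anchors]
    positives = [normalize(p) for p in positives]
    total = 0.0
    for i, a in enumerate(anchors):
        # Similarity of this anchor to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a, p)) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching view treated as the correct choice.
        total += log_denominator - logits[i]
    return total / len(anchors)

# Toy embeddings (hypothetical values): two images, two views of each.
view_a = [[1.0, 0.1], [0.1, 1.0]]
view_b = [[0.9, 0.2], [0.2, 0.9]]

aligned_loss = info_nce_loss(view_a, view_b)         # views paired correctly
shuffled_loss = info_nce_loss(view_a, view_b[::-1])  # views paired with the wrong image
print(aligned_loss, shuffled_loss)
```

The loss is low when each embedding is closest to the other view of its own image and high when the pairing is wrong, which is the signal a contrastive learner trains on.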
The researchers connected a pre-trained generative model to a contrastive learning model so that the two could work together automatically. Jahanian says the contrastive learner can tell the generative model to produce different views of an object, and then learn to identify that object from multiple angles.
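The article does not spell out how the different views are produced. One plausible approach, assumed here purely for illustration, is to feed the generator two nearby input (latent) vectors and treat the two outputs as views of the same thing. In the sketch below, `toy_generator` is a hypothetical stand-in (a fixed random linear map followed by tanh), not a real pre-trained model, and `LATENT_DIM`, `IMAGE_SIZE`, and `noise_scale` are made-up parameters.

```python
import math
import random

LATENT_DIM = 8   # hypothetical size of the generator's input vector
IMAGE_SIZE = 16  # hypothetical number of output "pixels"

rng = random.Random(0)

# Stand-in for a pre-trained generator: a fixed random linear map plus tanh.
# A real generative model would be loaded from a repository instead.
WEIGHTS = [[rng.gauss(0, 1) for _ in range(LATENT_DIM)] for _ in range(IMAGE_SIZE)]

def toy_generator(z):
    """Map a latent vector to a fake 'image' (a list of values in [-1, 1])."""
    return [math.tanh(sum(w * x for w, x in zip(row, z))) for row in WEIGHTS]

def sample_views(rng, noise_scale=0.1):
    """Return two views of one synthetic image, made from nearby latent vectors."""
    z = [rng.gauss(0, 1) for _ in range(LATENT_DIM)]
    z_nearby = [x + noise_scale * rng.gauss(0, 1) for x in z]
    return toy_generator(z), toy_generator(z_nearby)

def distance(a, b):
    """Euclidean distance between two fake images."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

view_1, view_2 = sample_views(rng)  # a "similar" pair
unrelated, _ = sample_views(rng)    # drawn from a different latent vector
print(distance(view_1, view_2), distance(view_1, unrelated))
```

Pairs made this way could serve as the "similar" examples for a contrastive learner, while outputs from unrelated latent vectors serve as the "different" ones. No labels or stored dataset are involved at any point.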
It’s even better than the real deal
The researchers compared their method with several other image classification models that were trained on real data. They found that it performed as well as, and sometimes better than, those models.
One advantage of a generative model is that it can, in theory, create an unlimited number of samples. The researchers therefore also examined how the number and quality of the samples affected the model’s performance. In some cases, generating a larger number of unique samples led to additional improvements.
These generative models have already been trained by someone else and are available in online repositories, so anyone can use them. Jahanian adds that you don’t have to modify the model to obtain good representations.

