MichaelDeng03/Generative-Vision-Models

Summary

Style Transfer

style_transfer.ipynb

This project implements Neural Style Transfer, a technique for rendering a new image that combines the content of one source image with the artistic style of another. The method leverages a pre-trained deep network, SqueezeNet, as a fixed feature extractor to represent the perceptual qualities of the images. Rather than updating the network's parameters, gradient descent is applied directly to the pixels of the output image itself. This process iteratively adjusts the image to minimize a composite loss function that quantifies how far the generated image is from the desired content and style characteristics.
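To make the pixel-space optimization concrete, here is a minimal sketch (not the notebook's exact code): SqueezeNet is frozen, the output image is the only trainable tensor, and random tensors stand in for the preprocessed photographs. The layer indices, learning rate, and step count are illustrative assumptions.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Frozen SqueezeNet feature extractor: the network's weights are never trained.
cnn = models.squeezenet1_1(weights=models.SqueezeNet1_1_Weights.DEFAULT).features.eval()
for p in cnn.parameters():
    p.requires_grad_(False)

def extract_features(x, layers=(3, 5, 8)):
    """Collect activations at the given (illustrative) layer indices."""
    feats = []
    for i, module in enumerate(cnn):
        x = module(x)
        if i in layers:
            feats.append(x)
    return feats

# Placeholder image; in the notebook this would be a preprocessed photograph.
content_img = torch.rand(1, 3, 224, 224)
target_feats = [f.detach() for f in extract_features(content_img)]

# The generated image itself is the only "parameter" being optimized.
img = content_img.clone().requires_grad_(True)
optimizer = torch.optim.Adam([img], lr=0.05)

for step in range(100):
    optimizer.zero_grad()
    # Simple feature-matching objective; the full method adds style and TV terms.
    loss = sum(F.mse_loss(f, t)
               for f, t in zip(extract_features(img), target_feats))
    loss.backward()
    optimizer.step()
```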

The composite loss function is a weighted sum of three distinct terms. First, the content loss measures how much the high-level feature representations of the generated image differ from those of the content image at a specific layer in the network. Second, the style loss captures texture, color, and patterns by comparing the correlations between filter activations using Gram matrices. This loss is typically calculated across multiple layers to capture stylistic features at different scales. Finally, a total variation loss is added as a regularization term to encourage spatial smoothness and reduce high-frequency noise in the resulting image.
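The three terms might look roughly like the following sketch, assuming feature maps of shape (N, C, H, W); the weights and normalization conventions shown are one common choice rather than the notebook's verbatim code.

```python
import torch

def content_loss(gen_feat, content_feat, weight=1.0):
    # Squared error between feature maps at one chosen layer.
    return weight * ((gen_feat - content_feat) ** 2).sum()

def gram_matrix(feat, normalize=True):
    # Correlations between filter activations: a (N, C, H, W) feature map
    # becomes a (N, C, C) matrix of channel-wise inner products.
    N, C, H, W = feat.shape
    f = feat.view(N, C, H * W)
    gram = torch.bmm(f, f.transpose(1, 2))
    if normalize:
        gram = gram / (C * H * W)
    return gram

def style_loss(gen_feats, style_grams, weights):
    # Summed over several layers to capture texture at multiple scales.
    return sum(w * ((gram_matrix(f) - g) ** 2).sum()
               for f, g, w in zip(gen_feats, style_grams, weights))

def tv_loss(img, weight=1e-3):
    # Penalize differences between neighboring pixels to encourage smoothness.
    dh = ((img[:, :, 1:, :] - img[:, :, :-1, :]) ** 2).sum()
    dw = ((img[:, :, :, 1:] - img[:, :, :, :-1]) ** 2).sum()
    return weight * (dh + dw)
```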

GAN

generative_adversarial_network.ipynb

This script trains a Generative Adversarial Network (GAN) in PyTorch on the MNIST dataset to generate handwritten digits. The implementation centers on two competing neural networks: a Generator and a Discriminator. The Generator learns to produce realistic images from a random noise vector, while the Discriminator learns to differentiate these synthetically generated images from real images in the training data. The script defines the adversarial loss functions for both networks using PyTorch's numerically stable binary_cross_entropy_with_logits.
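A minimal sketch of these adversarial objectives using the logits-based BCE; logits_real and logits_fake are placeholder names for the discriminator's raw scores on real and generated batches.

```python
import torch
import torch.nn.functional as F

def discriminator_loss(logits_real, logits_fake):
    # The discriminator should score real images as 1 and fakes as 0.
    loss_real = F.binary_cross_entropy_with_logits(
        logits_real, torch.ones_like(logits_real))
    loss_fake = F.binary_cross_entropy_with_logits(
        logits_fake, torch.zeros_like(logits_fake))
    return loss_real + loss_fake

def generator_loss(logits_fake):
    # The generator wants its fakes to be scored as real (target = 1).
    return F.binary_cross_entropy_with_logits(
        logits_fake, torch.ones_like(logits_fake))
```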

The script constructs and trains two distinct types of GANs. The first is a vanilla GAN that uses simple fully-connected (dense) layers for both the generator and discriminator. The second, more advanced model is a Deep Convolutional GAN (DCGAN). This architecture uses Conv2d and MaxPool layers in the discriminator for spatial feature extraction and ConvTranspose2d layers with BatchNorm in the generator to progressively build an image from the noise input. A generalized training function orchestrates the optimization process for both GAN types. Finally, the script demonstrates that the model has learned something non-trivial about the underlying spatial structure by interpolating between random vectors in its latent space.
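The interpolation idea can be sketched as follows; G and noise_dim are placeholder names for a trained generator and its latent dimensionality, not identifiers from the script.

```python
import torch

def interpolate_latent(G, z0, z1, steps=10):
    # Linearly blend between two noise vectors and decode each blend;
    # smooth image transitions suggest the generator has learned a
    # meaningful latent structure rather than memorizing samples.
    alphas = torch.linspace(0.0, 1.0, steps).view(-1, 1)
    z = (1 - alphas) * z0 + alphas * z1   # shape (steps, noise_dim)
    with torch.no_grad():
        return G(z)                       # (steps, ...) generated images

# Usage with placeholder noise vectors of dimension 96:
# z0, z1 = torch.randn(1, 96), torch.randn(1, 96)
# images = interpolate_latent(G, z0, z1, steps=12)
```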

About

Implements: 1) Style Transfer, 2) GAN
