import os
import pickle
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader
from torch.utils.data import sampler
from collections import OrderedDict
import torchvision.datasets as dset
import torchvision.transforms as T

import random
import numpy as np
# scipy.ndimage.filters is deprecated; gaussian_filter1d lives in scipy.ndimage.
from scipy.ndimage import gaussian_filter1d

SQUEEZENET_MEAN = torch.tensor([0.485, 0.456, 0.406], dtype=torch.float)
SQUEEZENET_STD = torch.tensor([0.229, 0.224, 0.225], dtype=torch.float)
| 19 | + |
| 20 | +### Helper Functions |
| 21 | +''' |
| 22 | +Our pretrained model was trained on images that had been preprocessed by subtracting |
| 23 | +the per-color mean and dividing by the per-color standard deviation. We define a few helper |
| 24 | +functions for performing and undoing this preprocessing. |
| 25 | +''' |
| 26 | +def preprocess(img, size=224): |
| 27 | + transform = T.Compose([ |
| 28 | + T.Resize(size), |
| 29 | + T.ToTensor(), |
| 30 | + T.Normalize(mean=SQUEEZENET_MEAN.tolist(), |
| 31 | + std=SQUEEZENET_STD.tolist()), |
| 32 | + T.Lambda(lambda x: x[None]), |
| 33 | + ]) |
| 34 | + return transform(img) |

def deprocess(img, should_rescale=True):
    # Pass should_rescale=True when deprocessing style transfer outputs.
    transform = T.Compose([
        T.Lambda(lambda x: x[0]),
        T.Normalize(mean=[0, 0, 0], std=(1.0 / SQUEEZENET_STD).tolist()),
        T.Normalize(mean=(-SQUEEZENET_MEAN).tolist(), std=[1, 1, 1]),
        T.Lambda(rescale) if should_rescale else T.Lambda(lambda x: x),
        T.ToPILImage(),
    ])
    return transform(img)

def rescale(x):
    # Linearly rescale the values of x into the range [0, 1].
    low, high = x.min(), x.max()
    x_rescaled = (x - low) / (high - low)
    return x_rescaled
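

# The round trip below is a minimal sketch of how preprocess and deprocess
# compose; the input image is synthetic rather than a real photo, and the
# helper name is just for illustration.
def _example_preprocess_roundtrip():
    from PIL import Image
    img = Image.fromarray(np.uint8(np.random.rand(224, 224, 3) * 255))
    x = preprocess(img)   # normalized float tensor of shape (1, 3, 224, 224)
    out = deprocess(x)    # back to a PIL image, rescaled into [0, 1]
    return out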

def blur_image(X, sigma=1):
    # Blur each image in a batch (N, C, H, W) in place by applying a 1D
    # Gaussian filter separably along the H and W axes.
    X_np = X.cpu().clone().numpy()
    X_np = gaussian_filter1d(X_np, sigma, axis=2)
    X_np = gaussian_filter1d(X_np, sigma, axis=3)
    X.copy_(torch.Tensor(X_np).type_as(X))
    return X
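

# A minimal sketch of using blur_image as a smoothing step during
# image-optimization visualizations; the tensor here is synthetic and the
# function name is just for illustration.
def _example_blur_usage():
    X = torch.randn(1, 3, 224, 224)
    blur_image(X, sigma=0.5)  # X is modified in place and also returned
    return X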


# Older versions of scipy.misc.imresize yield different results
# from newer versions, so we check to make sure scipy is up to date.
def check_scipy():
    import scipy
    version = scipy.__version__.split('.')
    major_vnum = int(version[0])
    minor_vnum = int(version[1])

    assert major_vnum >= 1 or minor_vnum >= 16, "You must install SciPy >= 0.16.0 to complete this notebook."

def jitter(X, ox, oy):
    """
    Helper function to randomly jitter an image.

    Inputs
    - X: PyTorch Tensor of shape (N, C, H, W)
    - ox, oy: Integers giving number of pixels to jitter along W and H axes

    Returns: A new PyTorch Tensor of shape (N, C, H, W), circularly shifted
    by ox pixels along W and oy pixels along H (like torch.roll).
    """
    if ox != 0:
        left = X[:, :, :, :-ox]
        right = X[:, :, :, -ox:]
        X = torch.cat([right, left], dim=3)
    if oy != 0:
        top = X[:, :, :-oy]
        bottom = X[:, :, -oy:]
        X = torch.cat([bottom, top], dim=2)
    return X
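

# A minimal sketch of the jitter/unjitter pattern used when optimizing an
# image: jitter is a circular shift, so applying it again with negated
# offsets restores the original exactly. The demo tensor is synthetic.
def _example_jitter_usage():
    X = torch.randn(2, 3, 224, 224)
    ox, oy = random.randint(1, 16), random.randint(1, 16)
    X_jit = jitter(X, ox, oy)         # shift right by ox, down by oy
    # ... e.g. take a gradient step on X_jit here ...
    X_back = jitter(X_jit, -ox, -oy)  # undo the shift
    assert torch.allclose(X, X_back)
    return X_back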


def load_CIFAR(path='./datasets/'):
    NUM_TRAIN = 49000
    # The torchvision.transforms package provides tools for preprocessing data
    # and for performing data augmentation; here we set up a transform to
    # preprocess the data by subtracting the mean RGB value and dividing by the
    # standard deviation of each RGB value; we've hardcoded the mean and std.
    transform = T.Compose([
        T.ToTensor(),
        T.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))
    ])

    # We set up a Dataset object for each split (train / val / test); Datasets load
    # training examples one at a time, so we wrap each Dataset in a DataLoader which
    # iterates through the Dataset and forms minibatches. We divide the CIFAR-10
    # training set into train and val sets by passing a Sampler object to the
    # DataLoader telling it how to sample from the underlying Dataset.
    cifar10_train = dset.CIFAR10(path, train=True, download=True,
                                 transform=transform)
    loader_train = DataLoader(cifar10_train, batch_size=64,
                              sampler=sampler.SubsetRandomSampler(range(NUM_TRAIN)))

    cifar10_val = dset.CIFAR10(path, train=True, download=True,
                               transform=transform)
    loader_val = DataLoader(cifar10_val, batch_size=64,
                            sampler=sampler.SubsetRandomSampler(range(NUM_TRAIN, 50000)))

    cifar10_test = dset.CIFAR10(path, train=False, download=True,
                                transform=transform)
    loader_test = DataLoader(cifar10_test, batch_size=64)
    return loader_train, loader_val, loader_test

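
# A minimal sketch of consuming the CIFAR-10 loaders; calling load_CIFAR
# downloads the dataset into ./datasets/ on first use. The function name
# below is just for illustration.
def _example_cifar_usage():
    loader_train, loader_val, loader_test = load_CIFAR()
    for X, y in loader_train:
        print(X.shape, y.shape)  # torch.Size([64, 3, 32, 32]) torch.Size([64])
        break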

def load_imagenet_val(num=None, path='./datasets/imagenet_val_25.npz'):
    """Load a handful of validation images from ImageNet.

    Inputs:
    - num: Number of images to load (max of 25)
    - path: Path to the .npz file containing the images

    Returns:
    - X: numpy array with shape [num, 224, 224, 3]
    - y: numpy array of integer image labels, shape [num]
    - class_names: dict mapping integer label to class name
    """
    imagenet_fn = path
    if not os.path.isfile(imagenet_fn):
        print('file %s not found' % imagenet_fn)
        print('Run the above cell to download the data')
        assert False, 'Need to download imagenet_val_25.npz'
    f = np.load(imagenet_fn, allow_pickle=True)
    X = f['X']
    y = f['y']
    class_names = f['label_map'].item()
    if num is not None:
        X = X[:num]
        y = y[:num]
    return X, y, class_names

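
# A minimal sketch of loading a few ImageNet validation images; it assumes
# imagenet_val_25.npz has already been downloaded to the default path.
def _example_imagenet_usage():
    X, y, class_names = load_imagenet_val(num=5)
    print(X.shape)  # (5, 224, 224, 3)
    print([class_names[label] for label in y])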

def load_COCO(path='./datasets/coco.pt'):
    '''
    Download and load serialized COCO data from coco.pt.
    It contains a dictionary of
    "train_images" - resized training images (112x112)
    "val_images" - resized validation images (112x112)
    "train_captions" - tokenized and numericalized training captions
    "val_captions" - tokenized and numericalized validation captions
    "vocab" - caption vocabulary, including "idx_to_token" and "token_to_idx"

    Returns: a data dictionary
    '''
    data_dict = torch.load(path)
    # print out all the keys and values from the data dictionary
    for k, v in data_dict.items():
        if isinstance(v, torch.Tensor):
            print(k, type(v), v.shape, v.dtype)
        else:
            print(k, type(v), v.keys())

    num_train = data_dict['train_images'].size(0)
    num_val = data_dict['val_images'].size(0)
    assert num_train == data_dict['train_captions'].size(0) and \
           num_val == data_dict['val_captions'].size(0), \
        'shapes of data mismatch!'

    print('\nTrain images shape: ', data_dict['train_images'].shape)
    print('Train caption tokens shape: ', data_dict['train_captions'].shape)
    print('Validation images shape: ', data_dict['val_images'].shape)
    print('Validation caption tokens shape: ', data_dict['val_captions'].shape)
    print('total number of caption tokens: ', len(data_dict['vocab']['idx_to_token']))
    print('mappings (list) from index to caption token: ', data_dict['vocab']['idx_to_token'])
    print('mappings (dict) from caption token to index: ', data_dict['vocab']['token_to_idx'])

    return data_dict

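
# A minimal sketch of pulling tensors out of the dictionary returned by
# load_COCO; the keys come from the docstring above, and the demo function
# name is just for illustration.
def _example_coco_usage():
    data_dict = load_COCO()
    train_images = data_dict['train_images']      # resized 112x112 training images
    train_captions = data_dict['train_captions']  # tokenized caption tensor
    vocab = data_dict['vocab']
    print(vocab['idx_to_token'][:10])  # first few tokens of the vocabulary
    return train_images, train_captions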

## Dump files for submission
def dump_results(submission, path):
    '''
    Dumps a dictionary as a .pkl file for the autograder.
    submission: a dictionary of results
    path: path for saving the dict object
    '''
    # del submission['rnn_model']
    # del submission['lstm_model']
    # del submission['attn_model']
    with open(path, "wb") as f:
        pickle.dump(submission, f)
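

# A minimal sketch of saving results for the autograder; the dictionary keys
# here are placeholders, not the real submission format.
def _example_dump_usage():
    submission = {'accuracy': 0.95, 'predictions': [0, 1, 2]}
    dump_results(submission, './submission.pkl')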