#Exercises


#Exercise: Create a Custom Dataset with Transformations

Objective: Create a custom dataset class for a set of images stored in a local directory, apply transformations to the images, and visualize a few samples with the transformations applied.

Steps:

  1. Create a Custom Dataset Class:

    • Write a PyTorch Dataset class to load images from a directory.
    • Each image should have a corresponding label from a CSV file (format: filename,label).
  2. Apply Transformations:

    • Resize the images to 128x128 pixels.
    • Convert the images to PyTorch tensors.
    • Normalize the images with a mean of 0.5 and standard deviation of 0.5.
  3. Load the Dataset with DataLoader:

    • Use DataLoader to load the dataset and prepare it for training.
  4. Visualize Transformed Images:

    • Display a few images from the dataset with the transformations applied.
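
The `filename,label` layout from step 1 can be sketched with the standard library alone. The file names and labels below are hypothetical, and `read_annotations` is an illustrative helper. A header row is assumed, because `pandas.read_csv` treats the first line as column names by default.

```python
import csv
import io

# Hypothetical contents of a labels file; real file names and labels will differ.
SAMPLE_CSV = """filename,label
cat_001.jpg,0
dog_001.jpg,1
cat_002.jpg,0
"""


def read_annotations(text):
    """Parse 'filename,label' rows into a list of (filename, int_label) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["filename"], int(row["label"])) for row in reader]


annotations = read_annotations(SAMPLE_CSV)
print(len(annotations))  # number of samples, i.e. what __len__ would report
print(annotations[1])    # the row that sample index 1 would be built from
```

If the CSV has no header row, the first image would be consumed as column names, so the file should either include one or be read with the header disabled.
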

#Solution

    import os
    import numpy as np
    import pandas as pd
    from PIL import Image
    import torch
    from torch.utils.data import Dataset, DataLoader
    from torchvision import transforms
    import matplotlib.pyplot as plt

    # Step 1: Create a custom dataset class
    class CustomImageDataset(Dataset):
        def __init__(self, csv_file, img_dir, transform=None):
            self.annotations = pd.read_csv(csv_file)  # Read CSV file with image filenames and labels
            self.img_dir = img_dir                    # Directory with images
            self.transform = transform                # Transformations to apply

        def __len__(self):
            return len(self.annotations)  # Return the number of samples in the dataset

        def __getitem__(self, idx):
            img_path = os.path.join(self.img_dir, self.annotations.iloc[idx, 0])  # Get image path
            image = Image.open(img_path).convert('RGB')  # Load image; force 3 channels to match Normalize
            label = torch.tensor(int(self.annotations.iloc[idx, 1]))  # Get label
            if self.transform:
                image = self.transform(image)  # Apply transformations
            return image, label

    # Step 2: Define transformations
    transform = transforms.Compose([
        transforms.Resize((128, 128)),  # Resize images to 128x128 pixels
        transforms.ToTensor(),          # Convert images to PyTorch tensors in [0, 1]
        transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])  # Normalize to [-1, 1]
    ])

    # Step 3: Create an instance of the custom dataset with transformations
    dataset = CustomImageDataset(csv_file='data/labels.csv', img_dir='data/images', transform=transform)

    # Step 4: Load the dataset with DataLoader
    dataloader = DataLoader(dataset, batch_size=8, shuffle=True)

    # Function to display the first 8 images of a batch
    def show_images(images, labels):
        images = images / 2 + 0.5  # Unnormalize back to [0, 1]
        np_images = images.numpy()
        fig, axes = plt.subplots(1, 8, figsize=(15, 5))
        for idx, ax in enumerate(axes):
            ax.imshow(np.transpose(np_images[idx], (1, 2, 0)))  # CHW -> HWC for matplotlib
            ax.axis('off')
            ax.set_title(f'Label: {labels[idx].item()}')

    # Step 5: Visualize a few transformed images
    data_iter = iter(dataloader)
    images, labels = next(data_iter)  # Iterators no longer expose a .next() method
    show_images(images, labels)
    plt.show()
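
One detail worth checking before plotting: with `batch_size=8`, the final batch holds fewer than 8 images whenever the dataset size is not a multiple of 8, which matters for any code that assumes a fixed batch size. The arithmetic can be sketched in plain Python; `batch_sizes` is a hypothetical helper that only mimics the counting and is not part of PyTorch, although `drop_last` mirrors the `DataLoader` argument of the same name.

```python
def batch_sizes(num_samples, batch_size, drop_last=False):
    """Return the size of each batch a loader would yield, in order."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder and not drop_last:
        sizes.append(remainder)  # partial final batch
    return sizes


print(batch_sizes(20, 8))                  # [8, 8, 4]
print(batch_sizes(20, 8, drop_last=True))  # [8, 8]
```

Shuffling changes which images land in each batch but not the batch sizes, so the first batch is full as long as the dataset has at least 8 images.
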

#Explanation of the Solution

  1. Custom Dataset Class:

    • A custom dataset class CustomImageDataset is created to load images and labels from a specified directory and CSV file.
    • The __getitem__ method loads an image and its label and applies transformations if provided.
  2. Define Transformations:

    • Images are resized to 128x128 pixels.
    • Images are converted to PyTorch tensors.
    • Images are normalized using a mean of 0.5 and a standard deviation of 0.5 for all channels.
  3. DataLoader:

    • DataLoader is used to load the dataset in batches and shuffle the data for training.
  4. Visualization:

    • A helper function show_images is used to display the first 8 images from a batch with the transformations applied, showing the effect of resizing, normalization, and tensor conversion.
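
The normalization step and the `images / 2 + 0.5` line in `show_images` are inverses of each other. `ToTensor` scales pixel values to the range [0, 1], and `Normalize` with mean 0.5 and standard deviation 0.5 computes `(x - 0.5) / 0.5`, which maps that range to [-1, 1]. A small sketch of the round trip on single values, in plain Python; `normalize` and `unnormalize` are illustrative helpers, not torchvision functions.

```python
def normalize(x, mean=0.5, std=0.5):
    """Per-value formula applied by a mean/std normalization."""
    return (x - mean) / std


def unnormalize(y, mean=0.5, std=0.5):
    """Inverse of normalize; with mean = std = 0.5 this equals y / 2 + 0.5."""
    return y * std + mean


pixels = [0.0, 0.25, 0.5, 1.0]  # sample values in the [0, 1] range
normalized = [normalize(p) for p in pixels]
restored = [unnormalize(v) for v in normalized]

print(normalized)  # [-1.0, -0.5, 0.0, 1.0]
print(restored)    # [0.0, 0.25, 0.5, 1.0]
```

Skipping the unnormalize step before plotting would hand matplotlib values below 0, which it clips, so images would appear darker than the originals.
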