# Exercises
# Exercise: Create a Custom Dataset with Transformations
Objective: Create a custom dataset class for a set of images stored in a local directory, apply transformations to the images, and visualize a few samples with the transformations applied.
Steps:
Create a Custom Dataset Class:
- Write a PyTorch `Dataset` class to load images from a directory.
- Each image should have a corresponding label from a CSV file (format: `filename,label`).
Apply Transformations:
- Resize the images to 128x128 pixels.
- Convert the images to PyTorch tensors.
- Normalize the images with a mean of 0.5 and standard deviation of 0.5.
Load the Dataset with DataLoader:
- Use `DataLoader` to load the dataset and prepare it for training.
Visualize Transformed Images:
- Display a few images from the dataset with the transformations applied.
import os
import pandas as pd
from PIL import Image
import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
import matplotlib.pyplot as plt
# Step 1: Create a custom dataset class
class CustomImageDataset(Dataset):
    def __init__(self, csv_file, img_dir, transform=None):
        self.annotations = pd.read_csv(csv_file)  # Read CSV file with image filenames and labels
        self.img_dir = img_dir  # Directory with images
        self.transform = transform  # Transformations to apply

    def __len__(self):
        return len(self.annotations)  # Return the number of samples in the dataset

    def __getitem__(self, idx):
        img_path = os.path.join(self.img_dir, self.annotations.iloc[idx, 0])  # Get image path
        image = Image.open(img_path).convert('RGB')  # Load image as 3-channel RGB to match Normalize below
        label = torch.tensor(int(self.annotations.iloc[idx, 1]))  # Get label
        if self.transform:
            image = self.transform(image)  # Apply transformations
        return image, label
# Step 2: Define transformations
transform = transforms.Compose([
    transforms.Resize((128, 128)),  # Resize images to 128x128 pixels
    transforms.ToTensor(),  # Convert images to PyTorch tensors
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])  # Normalize images
])
# Step 3: Create an instance of the custom dataset with transformations
dataset = CustomImageDataset(csv_file='data/labels.csv', img_dir='data/images', transform=transform)
# Step 4: Load the dataset with DataLoader
dataloader = DataLoader(dataset, batch_size=8, shuffle=True)
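# Function to display images
def show_images(images, labels):
    images = (images / 2 + 0.5).clamp(0, 1)  # Unnormalize back to the [0, 1] range
    np_images = images.permute(0, 2, 3, 1).numpy()  # Reorder N x C x H x W to N x H x W x C for imshow
    n = min(len(np_images), 8)  # The last batch may hold fewer than 8 images
    fig, axes = plt.subplots(1, n, figsize=(15, 5), squeeze=False)
    for idx, ax in enumerate(axes[0]):
        ax.imshow(np_images[idx])
        ax.axis('off')
        ax.set_title(f'Label: {labels[idx].item()}')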
# Step 5: Visualize a few transformed images
data_iter = iter(dataloader)
images, labels = next(data_iter)  # Iterators returned by DataLoader are advanced with the built-in next()
show_images(images, labels)
plt.show()
# Explanation of the Solution
Custom Dataset Class:
- A custom dataset class `CustomImageDataset` is created to load images and labels from a specified directory and CSV file.
- The `__getitem__` method loads an image and its label and applies transformations if provided.
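One detail worth checking is how the annotation file is read. `pd.read_csv` treats the first row as a header by default, so a file that starts directly with data would lose its first sample unless `header=None` is passed. The sketch below illustrates this with in-memory CSV text; the file names in it are invented for the example.

```python
import io

import pandas as pd

# Hypothetical annotation contents in the filename,label format, with a header row.
csv_text = "filename,label\ncat_001.jpg,0\ndog_001.jpg,1\n"
annotations = pd.read_csv(io.StringIO(csv_text))

num_samples = len(annotations)              # what __len__ would return
first_filename = annotations.iloc[0, 0]     # column 0: image filename
first_label = int(annotations.iloc[0, 1])   # column 1: label

# The same data without a header row needs header=None,
# otherwise the first sample is consumed as column names.
headerless_text = "cat_001.jpg,0\ndog_001.jpg,1\n"
headerless = pd.read_csv(io.StringIO(headerless_text), header=None)
wrongly_parsed = pd.read_csv(io.StringIO(headerless_text))
```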
Define Transformations:
- Images are resized to 128x128 pixels.
- Images are converted to PyTorch tensors.
- Images are normalized using a mean of 0.5 and a standard deviation of 0.5 for all channels.
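To make the normalization step concrete: `ToTensor` scales pixel values to the range [0, 1], and `Normalize` then computes `(x - mean) / std` per channel. With a mean and standard deviation of 0.5, this maps [0, 1] onto [-1, 1], and `x * 0.5 + 0.5` undoes it. The following sketch reproduces that arithmetic on a few sample values; the function names are illustrative only.

```python
import torch


def normalize(x, mean=0.5, std=0.5):
    # Same formula Normalize applies to each channel.
    return (x - mean) / std


def unnormalize(x, mean=0.5, std=0.5):
    # Inverse of the formula above; equivalent to x / 2 + 0.5 for these values.
    return x * std + mean


pixels = torch.tensor([0.0, 0.25, 0.5, 1.0])  # values as produced by ToTensor
normalized = normalize(pixels)                # -1.0, -0.5, 0.0, 1.0
restored = unnormalize(normalized)            # back to the original values
```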
DataLoader:
- `DataLoader` is used to load the dataset in batches and shuffle the data for training.
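As a rough illustration of the batching behavior, the sketch below feeds a stand-in `TensorDataset` of 20 blank images through a `DataLoader` with the same `batch_size=8`. The tensor sizes and the sample count are assumptions chosen for the example, not part of the exercise data.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: 20 blank RGB images of 128x128 pixels with alternating labels.
images = torch.zeros(20, 3, 128, 128)
labels = torch.arange(20) % 2
toy_dataset = TensorDataset(images, labels)

loader = DataLoader(toy_dataset, batch_size=8, shuffle=True)

# 20 samples at batch_size=8 give batches of 8, 8, and 4.
batch_sizes = [batch_images.shape[0] for batch_images, _ in loader]

first_images, first_labels = next(iter(loader))
```

Because the final batch can be smaller than `batch_size`, code that assumes exactly 8 images per batch may fail on it; passing `drop_last=True` to `DataLoader` is one way to skip incomplete batches.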
Visualization:
- A helper function `show_images` is used to display the first 8 images from a batch with the transformations applied, showing the effect of resizing, normalization, and tensor conversion.
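The conversion that happens before plotting can be sketched on its own. PyTorch batches are laid out as N x C x H x W, while `imshow` expects H x W x C with float values in [0, 1], so the batch has to be unnormalized and its axes reordered. The helper name `batch_to_display` and the random input batch are illustrative assumptions.

```python
import torch


def batch_to_display(images, mean=0.5, std=0.5):
    """Convert a normalized N x C x H x W batch to N x H x W x C arrays in [0, 1]."""
    images = images * std + mean      # undo the normalization
    images = images.clamp(0.0, 1.0)   # keep values inside the range imshow accepts
    return images.permute(0, 2, 3, 1).numpy()


# Random stand-in batch with values in [-1, 1], like the output of the transform pipeline.
batch = torch.rand(8, 3, 128, 128) * 2 - 1
display = batch_to_display(batch)
```

Each `display[i]` can then be passed directly to `ax.imshow`.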