Image Classification
Introduction
Image classification is a fundamental computer vision task: assigning an image to one of a set of predefined categories. This article walks step by step through building an image classification model with PyTorch, one of the most popular deep learning libraries.
Prerequisites
Before we begin, make sure PyTorch and torchvision are installed (the code below uses both). If they are not, follow the official PyTorch installation guide.
Step 1: Importing Libraries
First, let's import the necessary libraries:
import torch
from torchvision import datasets, models, transforms
import torch.nn as nn
import torch.nn.functional as F  # provides F.relu, used in the network's forward pass
import torch.optim as optim
Step 2: Preparing the Dataset
We will use the CIFAR-10 dataset, a popular image classification dataset of colored 32x32 pixel images across 10 classes.
# Convert images to tensors in [0, 1], then normalize each channel to [-1, 1]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Training split (50,000 images), shuffled every epoch
trainset = datasets.CIFAR10(root='./data', train=True,
                            download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

# Test split (10,000 images), kept in a fixed order
testset = datasets.CIFAR10(root='./data', train=False,
                           download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
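To make the effect of the Normalize step concrete, here is a minimal sketch in plain Python (no PyTorch required). It assumes the per-channel formula (value - mean) / std that torchvision documents for Normalize. The helper name normalize_pixel is hypothetical and exists only for illustration.

```python
def normalize_pixel(value, mean=0.5, std=0.5):
    """Apply the per-channel normalization (value - mean) / std."""
    return (value - mean) / std


# ToTensor produces values in [0, 1]; check where the ends and middle land.
for v in (0.0, 0.5, 1.0):
    print(v, "->", normalize_pixel(v))
```

With a mean and standard deviation of 0.5, the [0, 1] range produced by ToTensor is mapped to [-1, 1], which centers the inputs around zero.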
Step 3: Define the Network
Next, let's define a simple Convolutional Neural Network (CNN):
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Input: batches of 3 x 32 x 32 images
        self.conv1 = nn.Conv2d(3, 6, 5)    # 3 -> 6 channels, 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2)     # halves height and width
        self.conv2 = nn.Conv2d(6, 16, 5)   # 6 -> 16 channels, 5x5 kernel
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)       # one output per class

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)         # flatten to (batch, 400)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)                    # raw class scores (logits)
        return x
This network has two convolutional layers, each followed by max pooling, and then three fully connected layers. The last layer outputs one score per class.
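The 16 * 5 * 5 input size of fc1 follows from how each layer shrinks the 32x32 image. The sketch below traces that arithmetic in plain Python, assuming the standard output-size formula for convolution and pooling without padding or dilation. The helper names conv_out and pool_out are hypothetical and not part of PyTorch.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial output size of max pooling along one dimension."""
    return (size - kernel) // stride + 1


size = 32                            # CIFAR-10 images are 32x32
size = pool_out(conv_out(size, 5))   # conv1: 32 -> 28, pool: 28 -> 14
size = pool_out(conv_out(size, 5))   # conv2: 14 -> 10, pool: 10 -> 5
flat_features = 16 * size * size     # 16 channels from conv2

print(size, flat_features)
```

If you change the kernel sizes or the input resolution, rerunning this kind of calculation tells you what the first fully connected layer should expect.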
Step 4: Define a Loss Function and Optimizer
We'll use the Cross-Entropy loss and SGD with momentum.
net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
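As a rough illustration of what the loss measures, the following plain-Python sketch computes cross-entropy for a single sample from raw scores (logits), as log-sum-exp of the scores minus the score of the true class. The function name cross_entropy and the example scores are made up for illustration. nn.CrossEntropyLoss additionally averages over the batch by default.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# Ten equal scores: the model has no preference among the 10 classes.
uniform = cross_entropy([0.0] * 10, target=3)

# A much higher score for the true class gives a lower loss.
confident = cross_entropy([0.0, 0.0, 0.0, 8.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                          target=3)

print(uniform, confident)
```

The uniform case equals log(10), about 2.30, so a 10-class network that has not yet learned anything typically starts with a loss near that value.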
Step 5: Train the Network
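The training process repeats four steps for every mini-batch: pass the inputs through the network (forward pass), calculate the loss, compute the gradients of the loss with respect to the weights (backward pass), and update the weights in the direction that reduces the loss (optimizer step). Gradients are reset with zero_grad() before each batch because PyTorch accumulates them by default.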
for epoch in range(2):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        if i % 2000 == 1999:  # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0
print('Finished Training')
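To show what an optimizer step with momentum does to a weight, here is a minimal sketch on a one-parameter toy problem. It assumes the update rule that PyTorch documents for SGD with momentum (no dampening, no Nesterov): the velocity is the momentum times the previous velocity plus the gradient, and the weight moves by the learning rate times the velocity. The function sgd_momentum_step, the toy loss, and the larger learning rate are illustrative choices, not part of the tutorial's model.

```python
def sgd_momentum_step(w, grad, velocity, lr, momentum):
    """One SGD-with-momentum update for a single scalar weight."""
    velocity = momentum * velocity + grad
    w = w - lr * velocity
    return w, velocity


# Toy loss (w - 3)^2 with gradient 2 * (w - 3); its minimum is at w = 3.
w, velocity = 0.0, 0.0
for _ in range(300):
    grad = 2.0 * (w - 3.0)
    w, velocity = sgd_momentum_step(w, grad, velocity, lr=0.1, momentum=0.9)

print(w)  # ends up very close to 3
```

In the real training loop, loss.backward() supplies the gradients and optimizer.step() applies this kind of update to every parameter of the network.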
Step 6: Test the Network
Finally, let's test our model with the test data.
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))
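The evaluation logic boils down to picking the highest-scoring class for each image and counting matches with the labels. The plain-Python sketch below mirrors that for one small batch. The scores and labels are toy values invented for illustration, and the argmax helper is hypothetical.

```python
def argmax(scores):
    """Index of the largest score, i.e. the predicted class."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Toy batch: 4 samples, 3 classes (made-up numbers, not model output).
batch_logits = [
    [0.1, 2.0, -1.0],
    [1.5, 0.2, 0.3],
    [0.0, 0.1, 0.9],
    [0.4, 0.3, 0.2],
]
batch_labels = [1, 0, 1, 0]

predicted = [argmax(row) for row in batch_logits]
n_correct = sum(p == t for p, t in zip(predicted, batch_labels))
accuracy = 100 * n_correct / len(batch_labels)

print(predicted, accuracy)
```

Since CIFAR-10 has 10 balanced classes, random guessing gives about 10% accuracy, which is a useful baseline when reading the number printed by the test loop.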
This is a simple yet complete guide to training an image classification model using PyTorch. Happy learning!