Neural Collaborative Filtering

Last Updated : 4 Aug, 2025

Neural Collaborative Filtering (NCF) is an advanced form of collaborative filtering that uses deep learning to make better recommendations. Traditional methods often rely on simple techniques that can miss complex patterns between users and items; NCF instead uses neural networks to learn these patterns more effectively. Rather than basic matrix factorization methods such as Singular Value Decomposition, NCF uses multi-layer perceptrons (MLPs) to capture detailed, non-linear relationships between users and items, yielding more accurate and personalized recommendations.

Neural Collaborative Filtering (NCF) Architecture

Working of Neural Collaborative Filtering

Let's understand how Neural Collaborative Filtering works by examining its layers:

  • Embedding Layer: Every user and item is represented as a dense vector (a list of numbers), which captures their characteristics in a lower-dimensional space. These vectors are often referred to as embeddings.
  • Interaction Layer: The user and item embeddings are combined. This can be done through simple concatenation, element-wise multiplication or other operations that mix the information from the user and item.
  • Neural Network (MLP): The combined user-item data is passed through a neural network (MLP). The neural network consists of multiple layers that allow the model to learn complex, non-linear relationships between the user and item features.
  • Output Layer: Finally, the network outputs a prediction, usually a score indicating how likely the user is to engage with the item. This could represent a rating, a click or any other form of interaction.
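The embedding and interaction steps above can be sketched in a few lines of PyTorch. The sizes and indices below are illustrative, not taken from any particular dataset:

```python
import torch
import torch.nn as nn

# Illustrative sizes: 100 users, 50 items, 8-dimensional embeddings
user_embedding = nn.Embedding(100, 8)
item_embedding = nn.Embedding(50, 8)

# Embedding layer: look up dense vectors for one user and one item
user_vec = user_embedding(torch.tensor([3]))  # shape (1, 8)
item_vec = item_embedding(torch.tensor([7]))  # shape (1, 8)

# Interaction layer: two common ways to combine the embeddings
concatenated = torch.cat([user_vec, item_vec], dim=-1)  # shape (1, 16)
elementwise = user_vec * item_vec                       # shape (1, 8)
```

Either combined vector can then be fed into the MLP; the implementation later in this article uses concatenation.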

Types of Neural Collaborative Filtering

  • NeuMF (Neural Matrix Factorization): This approach combines the strengths of traditional matrix factorization and neural networks. It allows the model to learn both linear and non-linear interactions.
  • NCF with Implicit Feedback: Some recommendation systems don’t have explicit ratings (like a 1–5 star rating) but instead rely on implicit feedback, like clicks, views or purchases. NCF can be adapted to work with this type of data.
  • Federated NCF: This version of NCF allows for model training across decentralized devices (like smartphones) while keeping the user data private. This is particularly useful in situations where data privacy is a concern.
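As a minimal sketch of the NeuMF idea, the model below runs a GMF branch (element-wise product, capturing linear interactions) alongside an MLP branch (capturing non-linear ones) and fuses them in a final layer. The class name, layer sizes and embedding dimension are illustrative assumptions, not a reference implementation:

```python
import torch
import torch.nn as nn

class NeuMF(nn.Module):
    """Illustrative NeuMF sketch: GMF branch + MLP branch, fused at the output."""
    def __init__(self, num_users, num_items, dim=8):
        super().__init__()
        # Separate embeddings per branch, as in the NeuMF design
        self.gmf_user = nn.Embedding(num_users, dim)
        self.gmf_item = nn.Embedding(num_items, dim)
        self.mlp_user = nn.Embedding(num_users, dim)
        self.mlp_item = nn.Embedding(num_items, dim)
        self.mlp = nn.Sequential(nn.Linear(dim * 2, 16), nn.ReLU())
        # Final layer fuses both branches into one score
        self.out = nn.Linear(dim + 16, 1)

    def forward(self, user, item):
        gmf = self.gmf_user(user) * self.gmf_item(item)           # linear part
        mlp = self.mlp(torch.cat([self.mlp_user(user),
                                  self.mlp_item(item)], dim=-1))  # non-linear part
        return self.out(torch.cat([gmf, mlp], dim=-1))

model = NeuMF(num_users=10, num_items=20)
score = model(torch.tensor([0]), torch.tensor([5]))  # shape (1, 1)
```

For implicit feedback, the same output is typically trained against 0/1 labels with nn.BCEWithLogitsLoss rather than a regression loss.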

Implementation of Neural Collaborative Filtering

We will now implement neural collaborative filtering using PyTorch.

Step 1: Importing Libraries

We import the libraries necessary for the program:

  • torch: The main PyTorch library for deep learning operations.
  • torch.nn: Contains modules for building neural networks, including layers and loss functions.
  • torch.optim: Contains optimization algorithms, such as Adam, for training neural networks.
  • pandas: Library for data manipulation and analysis.
Python
import torch
import torch.nn as nn
import torch.optim as optim
import pandas as pd

Step 2: Load Dataset, Mapping Users and Movies

The CSV file is read and users and movies are mapped to integer indices, which lets us pass integers instead of strings into the model.

The sample file can be downloaded from here.

  • pd.read_csv: Loads data from a CSV file into a pandas DataFrame.
  • df.head(): Shows the first five rows of the DataFrame to preview data.
  • list(df['user'].unique()): Extracts unique users as a list.
  • list(df['movie'].unique()): Extracts unique movies as a list.
  • user2idx / movie2idx: Maps users and movies to unique integer indices.
  • len(users) / len(movies): Counts number of unique users and movies.
Python
df = pd.read_csv('CSV_FILE')
print(df.head())

users = list(df['user'].unique())
movies = list(df['movie'].unique())
user2idx = {user: idx for idx, user in enumerate(users)}
movie2idx = {movie: idx for idx, movie in enumerate(movies)}
num_users, num_movies = len(users), len(movies)

Step 3: Convert Data to Tensors

After mapping the users and movies to indices, we convert the user, movie and rating columns into PyTorch tensors. Tensors are the data structure that PyTorch uses for all computations in neural networks.

  • torch.tensor: Converts lists or NumPy arrays into PyTorch tensors for computation.
Python
user_indices = torch.tensor([user2idx[u]
                            for u in df['user']], dtype=torch.long)
movie_indices = torch.tensor([movie2idx[m]
                             for m in df['movie']], dtype=torch.long)
ratings = torch.tensor(df['rating'].values, dtype=torch.float32)

Step 4: Neural Collaborative Filtering (NCF) Model

The model will take user and movie indices as inputs, look up their embeddings, combine them and then pass them through the layers to predict the rating.

  • nn.Embedding: Used to learn an embedding for users and movies (dense representations of the categorical variables).
  • nn.Linear: Fully connected layers.
  • ReLU: Activation function that introduces non-linearity.
  • Dropout: Regularization technique that prevents overfitting by randomly setting a fraction of input units to zero during training.
Python
class NCF(nn.Module):
    def __init__(self, num_users, num_items, embedding_dim=16):
        super(NCF, self).__init__()
        self.user_embedding = nn.Embedding(num_users, embedding_dim)
        self.item_embedding = nn.Embedding(num_items, embedding_dim)
        self.fc1 = nn.Linear(embedding_dim * 2, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 1)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.2)

    def forward(self, user, item):
        user_emb = self.user_embedding(user)
        item_emb = self.item_embedding(item)
        x = torch.cat([user_emb, item_emb], dim=-1)
        x = self.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.relu(self.fc2(x))
        x = self.fc3(x)
        return x

Step 5: Initialize Model, Loss Function and Optimizer

We initialize the model, specify the loss function (Mean Squared Error) and set up the optimizer.

  • model = NCF(num_users, num_movies): Instantiates the collaborative filtering neural network model.
  • nn.MSELoss: Computes the mean squared error between predictions and targets.
  • optim.Adam: Optimizer that adapts learning rates for model parameter updates.
Python
model = NCF(num_users, num_movies)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.005, weight_decay=1e-5)

Step 6: Train the Model

In each epoch (training cycle):

  • model.train(): Sets the model to training mode (enables dropout, etc.).
  • optimizer.zero_grad(): Clears gradients from the previous step.
  • model(user_indices, movie_indices): Makes predictions with the current model parameters.
  • criterion(predictions, ratings): Calculates the loss between predictions and true ratings.
  • loss.backward(): Computes gradients for backpropagation.
  • optimizer.step(): Updates model parameters using computed gradients.
Python
epochs = 50
for epoch in range(epochs):
    model.train()
    optimizer.zero_grad()
    predictions = model(user_indices, movie_indices).squeeze()
    loss = criterion(predictions, ratings)
    loss.backward()
    optimizer.step()
    if (epoch + 1) % 10 == 0:
        print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}")
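The loop above updates the model on the whole dataset at once. For larger datasets, mini-batch training with torch.utils.data.DataLoader is common. The sketch below uses randomly generated stand-in data and a simplified stand-in model (TinyNCF and all sizes are illustrative, not from the article's dataset):

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

# Dummy interaction data standing in for the CSV columns
user_indices = torch.randint(0, 10, (100,))
movie_indices = torch.randint(0, 20, (100,))
ratings = torch.rand(100) * 4 + 1  # ratings in [1, 5)

loader = DataLoader(TensorDataset(user_indices, movie_indices, ratings),
                    batch_size=32, shuffle=True)

class TinyNCF(nn.Module):
    """Minimal stand-in for the NCF model above."""
    def __init__(self, num_users, num_items, dim=8):
        super().__init__()
        self.u = nn.Embedding(num_users, dim)
        self.i = nn.Embedding(num_items, dim)
        self.fc = nn.Linear(dim * 2, 1)
    def forward(self, user, item):
        return self.fc(torch.cat([self.u(user), self.i(item)], dim=-1))

model = TinyNCF(10, 20)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.005)

for epoch in range(2):
    model.train()
    for u, m, r in loader:          # one optimizer step per mini-batch
        optimizer.zero_grad()
        pred = model(u, m).squeeze()
        loss = criterion(pred, r)
        loss.backward()
        optimizer.step()
```

Mini-batching trades a little per-step accuracy of the gradient for much lower memory use and usually faster convergence in wall-clock time.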

Step 7: Evaluate the Model

  • model.eval(): Switches the model to evaluation mode.
  • torch.no_grad(): Disables gradient computation, which is not needed during evaluation.
  • model(user_idx, movie_idx): Predicts ratings for given user-movie pairs during evaluation.
Python
model.eval()
test_cases = [("Alice", "Titanic"), ("Bob", "Inception"),
              ("Charlie", "The Matrix")]
idx2user = {idx: user for user, idx in user2idx.items()}
idx2movie = {idx: movie for movie, idx in movie2idx.items()}

with torch.no_grad():
    for user, movie in test_cases:
        user_idx = torch.tensor([user2idx[user]], dtype=torch.long)
        movie_idx = torch.tensor([movie2idx[movie]], dtype=torch.long)
        pred = model(user_idx, movie_idx).item()
        print(f"Predicted rating for {user} on {movie}: {pred:.2f} stars")
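Beyond scoring single user-movie pairs, a trained model can score every movie for a user and return the highest-ranked ones. The sketch below uses untrained stand-in embeddings with a dot-product score in place of the trained NCF model; the function name and sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Stand-in for a trained model: embeddings whose dot product is the score
num_users, num_movies = 10, 20
user_emb = nn.Embedding(num_users, 8)
movie_emb = nn.Embedding(num_movies, 8)

def recommend_top_n(user_idx, n=3):
    """Return the indices of the n highest-scoring movies for one user."""
    with torch.no_grad():
        u = user_emb(torch.tensor([user_idx]))       # shape (1, 8)
        scores = (u * movie_emb.weight).sum(dim=-1)  # one score per movie
        return torch.topk(scores, n).indices.tolist()

top = recommend_top_n(0)  # three movie indices for user 0
```

With the article's NCF model, the same idea applies: batch the user index against all movie indices, call the model once and take torch.topk of the predictions.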

Applications of Neural Collaborative Filtering

NCF is applied across a wide range of industries. Here are some examples:

  • E-commerce: Platforms like Amazon use NCF to recommend products based on user behavior and preferences, leading to increased sales and customer satisfaction.
  • Streaming Services: Netflix and Spotify use NCF to suggest movies, shows or songs based on what users have watched or listened to in the past.
  • Social Media: Social media platforms like Instagram or Facebook recommend content (posts, friends or groups) that they think users will find interesting, using models like NCF.
  • Online Education: Websites like Coursera can recommend courses or learning materials tailored to individual users, helping them discover content they may not have found otherwise.

Advantages of Neural Collaborative Filtering

  • Captures Complex Interactions: Learns non-linear relationships between users and items.
  • Flexible and Scalable: Easily adjustable architecture for different data sizes and types.
  • Embeddings: Learns rich user and item embeddings for better representation.
  • Improved Accuracy: Performs better than traditional methods, especially with large datasets.

Limitations of Neural Collaborative Filtering

  • Requires Large Datasets: Performance drops with small datasets.
  • Computationally Expensive: Needs significant resources (e.g., GPUs) for training.
  • Overfitting Risk: Prone to overfitting without proper regularization.
  • Long Training Times: Requires more time to converge compared to simpler models.