Teaching AI to Spot Harmful Content: A Journey into Unsupervised Learning
Building in Public: Part 1 - Setting Up Our Training Pipeline for Content Detection
Hey there, curious minds! 👋
Welcome to the first installment of my "Building in Public" series. While everyone's chatting about ChatGPT writing poems and DALL-E creating art, I'm taking you behind the scenes of building something different - an AI system that helps identify harmful content. I'm documenting every step, every challenge, and every "aha!" moment as part of the Bias Bounty 2 challenge by Humane Intelligence x Revontulet.
Why Understanding AI Bias Matters for Everyone 🌍
Let's have a real talk about AI bias and explainability. It's not just a technical problem - it's a societal challenge that affects all of us. Think about it:
Your loan application might be processed by AI
Job applications often go through AI screening
Social media content is moderated by AI
Healthcare diagnoses are increasingly aided by AI
Every single one of these systems can have biases, and these biases affect real people's lives. But here's the thing: you don't need to be a programmer to be part of the solution.
Finding Your Place in the AI Revolution 🔍
There are many ways to contribute to better, more ethical AI:
Business Leaders: Understanding AI bias helps make better decisions about implementing AI systems
Product Managers: Knowing about AI explainability helps design better user experiences
Content Creators: Understanding how AI moderates content helps create better, inclusive content
Policy Makers: Grasping AI bias helps create better regulations
Users: Being aware of AI bias helps us better navigate and question the systems we interact with daily
What Does "Building in Public" Mean? 🤔
Imagine if a chef not only served you the final dish but also:
Showed you their grocery shopping list
Let you watch them chop every vegetable
Explained why they chose certain ingredients
Shared when they burned the first batch
That's what we're doing here with AI! I'm sharing:
Every line of code (even the messy first drafts)
The thinking behind each decision
The dead ends and failed attempts
The successful breakthroughs
The Technical Journey Begins 🚀
Now, let's roll up our sleeves and look at how we're actually building this. We're using Python and focusing on Apple Silicon Macs - but don't worry if you have different hardware; I'll cover alternatives in future posts.
Our Toolkit 🛠️
First, let's look at our training script (train.py):
```python
"""
Training script for hate content detection model
"""
import pandas as pd  # used later when we save predictions
import tensorflow as tf

from utils import ImageProcessor, SimpleMetalKMeans


def train():
    """Main training function"""
    print("Starting hate content detection model training...")

    # Print TensorFlow device info
    physical_devices = tf.config.list_physical_devices('GPU')
    print("Available devices:", physical_devices)
```
Think of this as setting up our kitchen before cooking. We're:
Importing our tools (TensorFlow, pandas, and our custom utilities)
Checking if we have our high-speed processor (GPU) available
The Image Processor: Our Digital Photo Assistant 📸
```python
# Initialize processor
processor = ImageProcessor()

# Load and process training data
print("\nLoading training data...")
train_folder = './Training_data'
train_images, train_ids = processor.load_images(train_folder)
```
What's happening here? Imagine you're preparing photos for a digital album:
The ImageProcessor is like your helpful assistant
It takes each image and:
Resizes it to 64x64 pixels (like making sure all your photos are the same size)
Converts it to the right color format
Uses multiple workers (like having several assistants helping at once) - see the sketch after this list for what that might look like under the hood
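Since ImageProcessor lives in our utils module and isn't shown here, here's a minimal sketch of what a class like it might look like. This is my illustration, not the actual utils.py code - the file-listing logic, worker count, and method names are assumptions:

```python
# Hypothetical sketch of an ImageProcessor-style class; the real utils.py
# implementation may differ. It illustrates resize + parallel loading.
import os
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from PIL import Image


class ImageProcessor:
    def __init__(self, size=(64, 64), workers=4):
        self.size = size
        self.workers = workers

    def _load_one(self, path):
        # Force RGB and resize so every image ends up with the same shape
        with Image.open(path) as img:
            return np.asarray(img.convert('RGB').resize(self.size), dtype=np.uint8)

    def load_images(self, folder):
        paths = sorted(
            os.path.join(folder, f) for f in os.listdir(folder)
            if f.lower().endswith(('.png', '.jpg', '.jpeg'))
        )
        # Multiple workers load images in parallel - our "several assistants"
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            images = list(pool.map(self._load_one, paths))
        ids = [os.path.splitext(os.path.basename(p))[0] for p in paths]
        return np.stack(images), ids
```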
The Brain of the Operation: K-means Clustering 🧠
```python
# Initialize and train model
print("\nInitializing model...")
model = SimpleMetalKMeans(
    n_clusters=2,      # Binary classification: harmful vs non-harmful
    random_state=42,   # For reproducibility
    batch_size=1024    # Efficient processing
)
```
This is where bias considerations become crucial. We're making important decisions:
n_clusters=2: We're creating a binary classification system. Is this oversimplifying complex content?
random_state=42: Ensures our results are reproducible and auditable
batch_size=1024: Balances processing speed with memory usage (the sketch below shows how mini-batches fit into k-means)
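SimpleMetalKMeans is our own class, so here's a hedged sketch of the mini-batch k-means idea it's built around. The real implementation presumably runs its tensor math on the GPU via Metal; this plain-NumPy version exists only to show the algorithm, and details like the iteration count and initialization are my assumptions:

```python
import numpy as np


class MiniBatchKMeansSketch:
    """Illustrative mini-batch k-means; SimpleMetalKMeans presumably does
    the same math with GPU-backed tensors."""

    def __init__(self, n_clusters=2, random_state=42, batch_size=1024, n_iter=100):
        self.n_clusters = n_clusters
        self.random_state = random_state
        self.batch_size = batch_size
        self.n_iter = n_iter
        self.centroids = None

    def fit(self, data):
        rng = np.random.default_rng(self.random_state)
        # Initialize centroids from k randomly chosen samples
        self.centroids = data[rng.choice(len(data), self.n_clusters, replace=False)].astype(float)
        counts = np.zeros(self.n_clusters)
        for _ in range(self.n_iter):
            # Work on a small random batch instead of the full dataset
            batch = data[rng.choice(len(data), min(self.batch_size, len(data)), replace=False)]
            labels = self.predict(batch)
            for k in range(self.n_clusters):
                members = batch[labels == k]
                if len(members):
                    counts[k] += len(members)
                    # Shrinking step size: early batches move a centroid a lot,
                    # later ones only nudge it
                    step = len(members) / counts[k]
                    self.centroids[k] += step * (members.mean(axis=0) - self.centroids[k])
        return self

    def predict(self, data):
        # Assign each point to its nearest centroid (squared Euclidean distance)
        dists = ((data[:, None, :] - self.centroids[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)
```

Notice how random_state seeds every random choice: rerunning fit on the same data reproduces the same centroids, which is exactly what makes the model auditable.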
The Training Process 📚
```python
# Preprocess data
print("\nPreprocessing training data...")
train_data = train_images.reshape(train_images.shape[0], -1) / 255.0
print(f"Training data shape: {train_data.shape}")

# Train the model
print("\nTraining model...")
model.fit(train_data)

# Save the trained model
print("\nSaving model...")
model.save()
```
Let's break this down:
We flatten and normalize our images, making them easier for the AI to process (a concrete example follows this list)
We let the model learn patterns
We save our trained model for future use
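To make the flattening concrete, here's a tiny self-contained example (the 500 dummy images are made up purely for illustration):

```python
import numpy as np

# 500 dummy 64x64 RGB images with pixel values 0-255
images = np.zeros((500, 64, 64, 3), dtype=np.uint8)

# Flatten each image into one row and scale pixels into the 0-1 range
flat = images.reshape(images.shape[0], -1) / 255.0
print(flat.shape)  # (500, 12288), since 64 * 64 * 3 = 12288
```

Each 64x64 RGB image becomes a single row of 12,288 numbers between 0 and 1 - the flat vector format k-means expects.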
Processing Test Data and Saving Results 📊
```python
# Process test data
print("\nProcessing test data...")
test_folder = './Test_data'
test_images, test_ids = processor.load_images(test_folder)
test_data = test_images.reshape(test_images.shape[0], -1) / 255.0

# Make predictions
print("\nMaking predictions...")
predictions = model.predict(test_data)

# Save results
results_df = pd.DataFrame({
    'image_id': test_ids,
    'prediction': predictions
})
results_df.to_csv('predictions.csv', index=False)
```
This final stage shows us how our model performs. Every prediction it makes could affect content moderation decisions, so transparency is crucial.
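One low-effort transparency habit - my suggestion, not something in train.py - is to check how the predictions are distributed before trusting them:

```python
import pandas as pd

# Load the predictions we just saved and look at the cluster balance.
# A heavy skew doesn't prove bias, but it's a signal worth investigating.
results_df = pd.read_csv('predictions.csv')
print(results_df['prediction'].value_counts(normalize=True))
```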
The Bigger Picture: Why This Matters for Bias 🎯
Each technical decision we've made has implications for bias and fairness:
Image Processing: Are we losing important cultural context by resizing images?
Binary Classification: Does categorizing content as simply "harmful" or "not harmful" miss important nuances?
Unsupervised Learning: How do we ensure our model isn't learning problematic patterns?
Join the Conversation! 💭
Whether you're a:
Developer eager to write better AI systems
Product manager wanting to understand AI capabilities
Business leader looking to implement AI responsibly
Concerned citizen interested in AI ethics
Your perspective matters in making AI better for everyone.
Share your thoughts:
How does AI bias affect your field?
What aspects of AI explainability interest you most?
What questions do you have about building ethical AI systems?
Download the code for this article
P.S. If you found this helpful, share it with someone who might be interested in understanding AI's impact on society! 🙌
Read the rest of the articles