From Training to Action: Making Our AI Work in the Real World
Building in Public: Part 3 - Teaching Our AI to Make Decisions 🤖
Hey there, fellow Alchemists! 👋
Last time we made sure our AI kitchen was squeaky clean and ready to go with our verify.py script. Today? We're going to do something SUPER exciting - we're going to make our AI actually work in the real world!
What Do We Mean by "Inference"? 🤔
Okay, let's chat about this word "inference" for a second. I remember the first time I saw it, I thought, "Wow, that sounds complicated!" But here's the thing - it's actually a pretty simple idea. You know how after you learn something, you use that knowledge to make decisions? Like how after you learn to ride a bike, you can look at any bike and think, "Yeah, I can ride that!"
That's exactly what inference is! Our AI has learned some patterns (that's the training part), and now it's going to use that knowledge to make decisions about new images it sees. Cool, right?
Opening Up inference.py 📝
Let's look at our script together:
"""
Inference script for hate content detection model
"""
import pandas as pd
from utils import ImageProcessor, SimpleMetalKMeans
See those imports? We're bringing in two really important tools:
pandas (that's what 'pd' stands for) is like our data organization superhero. Think of it as a super-powered Excel for Python
Our old friends ImageProcessor and SimpleMetalKMeans from utils.py that we talked about before
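Never used pandas before? Here's a tiny, self-contained taste of what it does - the data in it is made up just for this demo, but the column names mirror what we'll build later:
import pandas as pd

# A DataFrame is basically a table: named columns, one row per item
df = pd.DataFrame({
    "image_id": ["cat_01.jpg", "dog_02.jpg"],
    "prediction": [0, 1]
})
print(df)                           # shows the table
df.to_csv("demo.csv", index=False)  # saves it, much like exporting from Excel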
The Main Star: run_inference() 🌟
def run_inference(image_folder, model_path="model/kmeans_model.pkl"):
    """
    Run inference on a folder of images using a trained model.
    """
    print("Starting inference process...")
Let's break this down:
image_folder is where we keep the images we want to analyze
model_path points to our trained model (that .pkl file - think of it as our AI's learned experience)
The "pkl" extension? That stands for "pickle" (yes, really! 😄). It's Python's way of saving complex stuff to a file
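Curious what "loading a pickle" actually looks like? The real SimpleMetalKMeans.load lives in utils.py, so this is just a guess at how a thin wrapper around Python's pickle module might look - a sketch, not the actual code:
import pickle

def load_model(model_path):
    # Read the saved bytes and rebuild the original Python object
    with open(model_path, "rb") as f:
        return pickle.load(f)

# model = load_model("model/kmeans_model.pkl")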
Three Big Steps (Like a Recipe!) 🥘
Step 1: Getting Our Expert Ready 🧑🍳
    try:
        # Load the trained model
        print("\nLoading trained model...")
        model = SimpleMetalKMeans.load(model_path)
This is like getting our expert chef into the kitchen. The try: part? That's like having a safety net - if something goes wrong, we'll catch it and handle it gracefully. We'll talk more about that in a minute!
Step 2: Preparing Our Images 🖼️
        # Initialize image processor
        processor = ImageProcessor()
        # Load and process images
        print("\nProcessing images...")
        images, image_ids = processor.load_images(image_folder)
        # Preprocess data
        data = images.reshape(images.shape[0], -1) / 255.0
Whoa, what's that reshape and 255.0 business? Let me explain:
reshape is like reorganizing your photos in an album. Instead of having a complex 3D structure (width, height, colors), we flatten each image into a simple list of numbers
Dividing by 255.0 is like converting prices from cents to dollars - it scales our pixel values from 0-255 down to 0-1, which our AI prefers
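Here's a quick, standalone NumPy demo of both tricks, using a fake batch of ten 32x32 RGB images (the sizes are made up for illustration - your real images may be different):
import numpy as np

# Pretend we loaded 10 RGB images, each 32x32 pixels, values 0-255
images = np.random.randint(0, 256, size=(10, 32, 32, 3), dtype=np.uint8)
print(images.shape)             # (10, 32, 32, 3)

# Flatten each image into one long row, then scale pixels to 0-1
data = images.reshape(images.shape[0], -1) / 255.0
print(data.shape)               # (10, 3072) because 32 * 32 * 3 = 3072
print(data.min(), data.max())   # everything now sits between 0.0 and 1.0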
Step 3: Decision Time! 🎯
        # Make predictions
        print("\nGenerating predictions...")
        predictions = model.predict(data)
This is the magical moment where our AI looks at each image and makes a decision. But what exactly is it deciding? Remember, we're using this for content detection, so for each image, it's basically asking "Does this look concerning or normal?"
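One nuance worth knowing: K-means doesn't actually know the words "concerning" or "normal" - it just hands back cluster labels, and our training setup decides which label means what. If you want to peek at the raw output, here's a tiny sketch (it uses a dummy predictions array so the snippet runs on its own):
import numpy as np

# predictions is whatever model.predict(data) gave us;
# this dummy stand-in lets the snippet run by itself
predictions = np.array([0, 0, 1, 0, 1, 1, 0])

# Count how many images landed in each cluster
labels, counts = np.unique(predictions, return_counts=True)
for label, count in zip(labels, counts):
    print(f"Cluster {label}: {count} images")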
Making It Bulletproof 🛡️
Here's something I learned the hard way - ALWAYS plan for things to go wrong! Let's add some safety nets:
def run_inference(image_folder, model_path="model/kmeans_model.pkl"):
    try:
        # All our code from before...
        return results_df
    except FileNotFoundError as e:
        print(f"\nOops! Couldn't find a file: {e}")
        print("Check if your model and image folder paths are correct!")
        raise
    except Exception as e:
        print(f"\nSomething unexpected happened: {e}")
        raise
Why all this error handling? Well, let me tell you a story... I once ran an inference job on 10,000 images, and it crashed at image 9,999 with no error handling. 😭 Never again!
Saving Our Results 📊
        # Create results DataFrame
        results_df = pd.DataFrame({
            "image_id": image_ids,
            "prediction": predictions
        })
        # Save to CSV
        output_file = "inference_predictions.csv"
        results_df.to_csv(output_file, index=False)
A DataFrame is like a super-powered spreadsheet in Python. Here we're creating one with two columns:
image_id: to know which image we're talking about
prediction: what our AI thought about it (0 or 1)
We save it as a CSV file (like an Excel file) so we can easily look at it later.
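Once that CSV exists, you (or anyone on the team) can pull the results back in with a couple of lines - handy for a quick sanity check. This assumes you've already run the inference so the file is actually there:
import pandas as pd

# Load the saved predictions back into a DataFrame
results = pd.read_csv("inference_predictions.csv")
print(results.head())                        # first few rows
print(results["prediction"].value_counts())  # how many 0s vs 1s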
Let's Try It Out! 🎮
Want to get your hands dirty? Here's a fun experiment:
Grab some random images (maybe some pet photos?)
Put them in a folder called "test_images"
Run this code:
if __name__ == "__main__":
    inference_folder = "./test_images"
    results = run_inference(inference_folder)
    print("\nResults Preview:")
    print(results.head())  # Show first few results
The Ethics Corner 🎯
Now here's something really important to think about - our model is making decisions that could affect content moderation. That's a big responsibility! We need to consider:
False positives: What happens if we flag normal content as concerning?
False negatives: What if we miss actually concerning content?
Bias: Are we treating all types of content fairly?
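If you ever get a batch of hand-labeled images (a human saying what each one really is), you can measure the first two directly. Here's a rough sketch - the assumption that 1 means "concerning" is just for illustration and depends on how your clusters were mapped:
import numpy as np

def confusion_counts(predictions, true_labels):
    """Count false positives and false negatives, assuming 1 = 'concerning'."""
    predictions = np.asarray(predictions)
    true_labels = np.asarray(true_labels)
    false_positives = np.sum((predictions == 1) & (true_labels == 0))
    false_negatives = np.sum((predictions == 0) & (true_labels == 1))
    return false_positives, false_negatives

# Tiny made-up example: one wrong flag, one miss
fp, fn = confusion_counts([1, 0, 1, 0], [0, 0, 1, 1])
print(f"False positives: {fp}, false negatives: {fn}")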
What Could Go Wrong? 🔧
Let's talk about some common hiccups you might hit:
"Model not found" Error
FileNotFoundError: [Errno 2] No such file or directory: 'model/kmeans_model.pkl'
This usually means you're not running the code from the right folder. Try printing your current directory:
import os
print("I'm looking for files in:", os.getcwd())
2. Memory Issues
If you're processing lots of images, you might run out of memory. Try processing in batches:
# Process 100 images at a time
for i in range(0, len(image_files), 100):
    batch_files = image_files[i:i+100]
    # Process batch...
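If the images themselves fit in memory but prediction is the heavy part, another option is to feed the model smaller slices of the already-preprocessed data and stitch the results back together. A sketch, assuming model.predict is happy with any number of rows at a time:
import numpy as np

def predict_in_batches(model, data, batch_size=100):
    """Run model.predict on slices of rows and glue the results back together."""
    all_predictions = []
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        all_predictions.append(model.predict(batch))
    return np.concatenate(all_predictions)

# predictions = predict_in_batches(model, data, batch_size=100)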
What's Next? 🚀
Next time, we're going to get into something really interesting - adversarial testing! Think of it like trying to fool our AI with optical illusions. We'll see just how robust our model really is!
Share Your Journey! 💭
I'd love to hear about:
What kind of images did you test with?
Any surprising predictions?
Did you modify the code in any interesting ways?
What questions came up as you were working with it?
Remember, we're all learning together! No question is too basic - if you're wondering about something, others probably are too!
Drop a comment below with your experiences, questions, or just to say hi! 👋
#BuildInPublic #MachineLearning #AIInference #PracticalAI
P.S. Running into issues? Share your error messages in the comments - debugging is always more fun together! 🤝