MelMage combines cutting-edge deep learning with explainable AI techniques, making neural network behavior accessible to everyone.
A ResNet-style architecture achieving 83.75% accuracy on the ESC-50 dataset, with mel-spectrogram preprocessing.
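The mel-spectrogram preprocessing step can be sketched from first principles. The snippet below is a minimal, dependency-light version; MelMage's exact parameters (sample rate, FFT size, number of mel bands) aren't stated here, so the values used are typical assumptions, not the project's actual configuration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def log_mel_spectrogram(wave, sr=22050, n_fft=1024, hop=512, n_mels=64):
    # Frame the signal, window, FFT, then project power onto mel filters
    frames = np.lib.stride_tricks.sliding_window_view(wave, n_fft)[::hop]
    spec = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1)) ** 2
    mel = spec @ mel_filterbank(n_mels, n_fft, sr).T
    return np.log(mel + 1e-6).T  # shape: (n_mels, n_frames)

# One second of a 440 Hz tone as a stand-in input
t = np.arange(22050) / 22050.0
S = log_mel_spectrogram(np.sin(2 * np.pi * 440 * t))
print(S.shape)  # → (64, 42)
```

In practice a library transform such as `torchaudio.transforms.MelSpectrogram` would replace this hand-rolled version, but the computation is the same: windowed FFT power projected onto triangular mel filters, then a log compression.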
See feature maps from every convolutional layer and understand how your model learns audio patterns.
Upload a WAV file and get top-3 predictions within 100ms using a serverless GPU-powered inference pipeline.
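The "top-3 predictions" step amounts to a softmax over the model's class logits followed by selecting the three highest-probability labels. A minimal sketch, assuming hypothetical logits and a truncated sample of ESC-50's 50 class names (not the project's actual label list):

```python
import numpy as np

# Illustrative subset of ESC-50's 50 labels, not the full list
ESC50_CLASSES = ["dog", "rain", "sea_waves", "crying_baby", "clock_tick"]

def top3(logits):
    # Numerically stable softmax over class logits
    e = np.exp(logits - logits.max())
    probs = e / e.sum()
    # Indices of the three highest-probability classes
    idx = np.argsort(probs)[::-1][:3]
    return [(ESC50_CLASSES[i], float(probs[i])) for i in idx]

logits = np.array([2.1, 0.3, 1.7, -0.5, 0.9])  # pretend model output
print(top3(logits))
```

The serverless endpoint would run the CNN forward pass on GPU and return exactly this kind of `(label, probability)` triple to the dashboard.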
Built on Modal Labs for scale-to-zero serverless deployment, with GPU acceleration when needed.
From raw waveform → mel-spectrogram → CNN feature maps → final prediction, visualized in a custom React dashboard.
Built with Next.js + Tailwind CSS. Displays waveform, spectrogram, and per-layer feature maps to explore how audio signals transform across layers.
Three simple steps to understand your audio CNN completely
Drag and drop a WAV file or record directly in your browser. Uploads at a range of sample rates are supported.
Built-in audio preprocessing and validation
The CNN analyzes mel-spectrograms through multiple ResNet layers on serverless GPU infrastructure.
Real-time feature extraction and classification
See predictions alongside internal feature maps from every layer of the neural network.
Complete transparency from input to output
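Capturing internal feature maps from every layer is typically done with PyTorch forward hooks. The sketch below uses a tiny stand-in model; MelMage's real network is larger, but the hook mechanism is the same.

```python
import torch
import torch.nn as nn

# Tiny stand-in model; the actual MelMage architecture is deeper
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
)

feature_maps = {}

def capture(name):
    # Hook that stashes each layer's output under its module name
    def hook(module, inputs, output):
        feature_maps[name] = output.detach()
    return hook

# Register a forward hook on every convolutional layer
for name, module in model.named_modules():
    if isinstance(module, nn.Conv2d):
        module.register_forward_hook(capture(name))

# (batch, channel, mel bands, frames) — shapes are illustrative
x = torch.randn(1, 1, 64, 44)
_ = model(x)
for name, fmap in feature_maps.items():
    print(name, tuple(fmap.shape))
```

After one forward pass, `feature_maps` holds every convolutional layer's activations, which the dashboard can render as per-layer images.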
See the complete workflow from audio upload to neural network visualization
Custom ResNet-inspired architecture built from scratch in PyTorch. Every layer designed for optimal audio feature extraction.
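A ResNet-inspired architecture is built from residual blocks like the one below. This is a generic basic block (two 3×3 convolutions plus a skip connection with a 1×1 projection when shapes change), shown as a plausible building block; MelMage's actual layer sizes and block count may differ.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic two-conv residual block with an identity/projection shortcut."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride,
                               padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        # 1x1 projection when input and output shapes differ
        self.shortcut = nn.Identity()
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Skip connection: add the (possibly projected) input back in
        return torch.relu(out + self.shortcut(x))

block = ResidualBlock(16, 32, stride=2)
y = block(torch.randn(1, 16, 64, 44))
print(tuple(y.shape))  # → (1, 32, 32, 22)
```

The skip connection is what lets gradients flow through many stacked layers, which is why residual designs work well even for relatively deep audio CNNs.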
MelMage leverages the latest in machine learning, serverless infrastructure, and modern web development to deliver a world-class experience.