Wise Songs — YouTube Video ROI Analysis
Context
We generate educational songs (Aesop's fables, mental models, STEM nursery rhymes, GRE vocabulary, cerebral/philosophical songs) via Suno and have a pipeline from lyrics → Suno generation → video → YouTube publish. This document evaluates the ROI of different video production approaches and identifies where to focus effort.
1. Song Categories and Their YouTube Opportunity
| Category | Songs | YouTube Competition | Audience | Best Video Format |
|---|---|---|---|---|
| Aesop's Fables | ~30 | Medium (animated channels exist, song format unclaimed) | Kids 4–10, teachers | Lyric video / illustrated storybook |
| Mental Models | ~20 | Near zero — no music competitors | Parents, educators, kids 8–12 | Lyric video with concept visuals |
| STEM Nursery Rhymes | ~10 | Low-medium (SciShow Kids exists but not in song format) | Kids 6–12, science teachers | Lyric video with diagrams |
| GRE Word Wizards | ~5 | Near zero — white space | High school / college students | Clean kinetic typography |
| Cerebral Songs | ~5 | Very low | Teens, adults | Lyric video, philosophical aesthetic |
2. Video Format ROI
Static Image + Audio (lowest effort)
- Build time: 30 minutes (ffmpeg automation, one-time setup)
- YouTube retention: ~15–25%
- Algorithm signal: Weak — will not get recommended
- Value: Permanent indexed asset for exact-title searches. Placeholder only.
- ROI: D — don't build as end product
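A minimal sketch of the static-image step, assuming ffmpeg is installed with libx264 and AAC support. The function name and file names are hypothetical. It only builds the argument list, so it can be inspected before anything is run.

```python
from pathlib import Path


def static_video_cmd(cover: Path, audio: Path, out: Path) -> list[str]:
    """Build an ffmpeg argument list that shows one image for the length of a song."""
    return [
        "ffmpeg", "-y",                    # overwrite the output file if it exists
        "-loop", "1", "-i", str(cover),    # repeat the still image as the video stream
        "-i", str(audio),                  # the Suno export
        "-c:v", "libx264", "-tune", "stillimage",
        "-c:a", "aac", "-b:a", "192k",
        "-pix_fmt", "yuv420p",             # pixel format most players accept
        "-shortest",                       # stop encoding when the audio ends
        str(out),
    ]


# Hypothetical file names, for illustration only.
cmd = static_video_cmd(Path("cover.png"), Path("song.mp3"), Path("song.mp4"))
print(" ".join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would perform the encode.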
Lyric Video — verse-timed text overlay (recommended v1)
- Build time: 1 day to build pipeline; ~5 min per song after
- YouTube retention: ~55–70% (people follow along with words)
- Algorithm signal: Strong — watch time signals quality to YouTube
- Value: Works as classroom tool (teachers embed it), parents use with kids to teach reading alongside listening
- ROI: B+ — viable and buildable now with ffmpeg + SRT
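The verse-timed SRT file for this format can be generated from a list of verse timings. The sketch below assumes the timings are already known (for example, noted by hand while listening). The function names and the sample verses are illustrative and are not part of the existing pipeline.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms_total = round(seconds * 1000)
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def verses_to_srt(verses: list[tuple[float, float, str]]) -> str:
    """Turn (start_sec, end_sec, text) verses into the body of an SRT file."""
    blocks = []
    for index, (start, end, text) in enumerate(verses, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)  # one blank line between cues


# Illustrative timings, not taken from a real song.
sample = [
    (0.0, 8.5, "The tortoise set off slow and steady"),
    (8.5, 17.0, "The hare lay down beneath a tree"),
]
print(verses_to_srt(sample))
```

The resulting file can then be burned into the video with ffmpeg's `subtitles` filter, which requires an ffmpeg build that includes libass.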
Animated Lyric Video — word-level highlighting (v2 target)
- Build time: 2–3 days to build pipeline (requires Whisper transcription → word timestamps → highlight animation)
- YouTube retention: ~65–80%
- Value: Kids follow along word-by-word, high educational signal, strong differentiation from static competitors
- ROI: A — best format achievable without animation budget
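One way to do the highlight step is with ASS subtitle karaoke tags (`\k`, with durations in centiseconds), which ffmpeg can render through libass. This is a sketch of that approach, not a committed design. It assumes word timings have already been extracted from the transcription as `(word, start_sec, end_sec)` tuples; the function names and sample words are hypothetical.

```python
def ass_time(seconds: float) -> str:
    """Format seconds as an ASS timestamp: H:MM:SS.cc (centiseconds)."""
    cs_total = round(seconds * 100)
    hours, rem = divmod(cs_total, 360_000)
    minutes, rem = divmod(rem, 6_000)
    secs, cs = divmod(rem, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{cs:02d}"


def karaoke_dialogue(words: list[tuple[str, float, float]]) -> str:
    """Build one ASS Dialogue line that highlights a lyric line word by word."""
    line_start = words[0][1]
    line_end = words[-1][2]
    parts = []
    for i, (text, start, end) in enumerate(words):
        # Hold each word until the next one starts, so pauses between
        # words do not make the highlight run ahead of the audio.
        next_start = words[i + 1][1] if i + 1 < len(words) else end
        centis = round((next_start - start) * 100)
        parts.append(f"{{\\k{centis}}}{text}")
    body = " ".join(parts)
    return (
        f"Dialogue: 0,{ass_time(line_start)},{ass_time(line_end)},"
        f"Default,,0,0,0,,{body}"
    )


# Illustrative word timings, not from a real transcription.
line = [("Slow", 1.0, 1.4), ("and", 1.5, 1.7), ("steady", 1.8, 2.6)]
print(karaoke_dialogue(line))
```

A complete `.ass` file also needs `[Script Info]`, `[V4+ Styles]`, and `[Events]` sections, with a style named `Default` whose primary and secondary colours define the highlighted and unhighlighted text.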
Full Animation (eventual)
- Build time: Weeks per song, significant budget
- ROI: A+ for children's content specifically — Cocomelon-tier engagement — but not achievable without investment
- When: After channel validates traction. Prioritize the highest-performing songs first.
3. The Critical CPM Insight
Content tagged "Made for Kids" on YouTube is subject to COPPA restrictions: personalized ads are disabled, which drops revenue to roughly $0.25–$1.00 RPM.
Content NOT tagged "Made for Kids" that targets teens and adults (GRE vocab, mental models, cerebral songs) earns $5–$25 RPM, roughly 10–25x more per view.
Implication: Run two tracks:
| Track | Songs | Target age | YT setting | Expected RPM |
|---|---|---|---|---|
| Educational Kids | Aesop, STEM, some mental models | 4–12 | Made for Kids | $0.50–$1.00 |
| Educational Adults | GRE vocab, mental models (abstract), cerebral | 13+ | Standard | $9–$25 |
GRE vocabulary songs, with near-zero competition and roughly 10–25x the RPM of kids' content, are the highest-ROI content we can produce right now for direct YouTube revenue. At $10–$25 RPM, a GRE channel with 100K views/month earns ~$1,000–$2,500/month. At $1.00 RPM, a children's channel needs 1–2.5M views/month to earn the same.
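The comparison above is simple RPM arithmetic: RPM is revenue per 1,000 views. The sketch below reproduces it using the RPM values the figures imply ($10–$25 for the adult track, $1.00 for the kids track). The function names are illustrative, and the RPM values are this document's estimates, not measured channel data.

```python
def monthly_revenue(views: int, rpm: float) -> float:
    """Revenue in dollars for a month of views at a given RPM."""
    return views / 1000 * rpm


def views_needed(target_revenue: float, rpm: float) -> float:
    """Monthly views required to reach a revenue target at a given RPM."""
    return target_revenue / rpm * 1000


# Adult track: 100K views/month at the assumed $10 and $25 RPM.
low = monthly_revenue(100_000, 10.0)
high = monthly_revenue(100_000, 25.0)
print(f"Adult track: ${low:,.0f} to ${high:,.0f} per month")

# Kids track: views needed to match that range at an assumed $1.00 RPM.
print(f"Kids track: {views_needed(low, 1.0):,.0f} to {views_needed(high, 1.0):,.0f} views per month")
```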
4. What Makes a Video Valuable to Others
Teachers and educators
- Need an embeddable video they can put in a lesson plan
- A lyric video with visible words is a teaching tool; a static image is just audio
- Aesop's fables are literal K-8 curriculum content — teachers are actively searching for song versions
- STEM songs (atomic physics, DNA) align with middle school science curriculum
Parents
- Search YouTube with children looking for specific topics
- Lyric video keeps children engaged longer (reading alongside listening)
- Song compilation playlists auto-play during car trips, meals, quiet time
Students (GRE/SAT)
- Songs are mnemonics — more memorable than flashcard drilling
- Word + definition + contextual example visible on screen = the product
- No competing music channel exists in this space
The lead generation mechanism
YouTube video description → wisesongs.supernal.ai
→ "Download free lyrics PDF" (email capture)
→ Newsletter / Substack subscription
→ Notified when new songs published
The video is discovery. The website is conversion. The email list is retention.
5. Suno Technical Limits
| Version | Max single generation |
|---|---|
| V3 | ~2 minutes |
| V4 | ~4 minutes |
| V4.5 / V5 (current) | ~8 minutes |
- Extensions add ~1 minute each; can chain 2–3 extensions cleanly before quality degrades (shimmer artifacts, tonal drift)
- Practical maximum with extensions: 10–11 minutes of usable audio
- No "movie mode" exists — Suno generates songs, not long-form audio narratives
- For compilations (20–45 min YouTube videos): generate songs individually, assemble in CapCut / DaVinci Resolve
Implication: We do not need Revid.ai or other tools to work around a Suno length limit. Our songs are 1:30–4:00, well within the 8-minute cap. For compilation videos, we build them ourselves from individual songs.
6. Recommended Production Stack
| Tool | Role | Cost |
|---|---|---|
| Suno v4.5 | Audio generation | Existing plan |
| ffmpeg | MP3 + cover art + SRT → MP4 | Free |
| Whisper (OpenAI) | MP3 → word timestamps (for v2 lyric sync) | ~$0.006/min of audio |
| CapCut | Lyric videos, Shorts, compilations | Free |
| Canva | Thumbnails, chapter cards between songs in compilations | Free / $15/mo |
| Revid.ai | Optional: Suno-specific music video generation with Ghibli/Pixar visual styles | $39/mo (evaluate after v1 ships) |
Not needed yet: YouTube Data API auto-upload. Manual upload first batch — metadata (title, description, tags) matters more than automation. Automate after validating what gets traction.
7. Recommended Priority Order
- Build lyric video pipeline (suno_video.py): ffmpeg + verse-timed SRT → MP4. Ship the 10 songs currently in the video_gen stage.
- Publish GRE vocabulary songs first: highest CPM, zero competition, lowest production bar (kinetic typography is sufficient)
- Publish STEM nursery rhymes — curriculum-aligned, teachers actively search these
- Publish Aesop's Fables — high search volume, "song" format is unclaimed
- Build Whisper-aligned word highlighting (v2) — after validating v1 traction
- Evaluate Revid.ai — if budget allows, try their Suno-to-video pipeline on 3–5 songs
- Build monthly compilation — once 8+ songs are live, bundle into a 20–30 min compilation
8. Success Metrics
| Metric | Target (Month 3) | Target (Month 6) |
|---|---|---|
| Videos published | 10 | 30 |
| YouTube subscribers | 100 | 500 |
| Average view retention | >50% | >60% |
| GRE channel views/mo | 5,000 | 25,000 |
| Email list from YouTube CTA | 50 | 250 |
| Teacher embeds / lesson plan downloads | 10 | 50 |