
How I Built GM-Genie: A Cinematic AI Game Master with Gemini Live API

Ali Nemati · 6 hours ago · 47 sec read · 10 views

GM-Genie combines server-side and client-side processing to create an immersive audio experience for text-based games. Key components include:

  • A custom model serving API that handles concurrent requests from multiple clients.
  • Real-time speech-to-text using the Gemini Live API, with continuous capture and no client-side noise gate.
  • Dynamic sound effects fetched in real time from the Freesound API based on game context, then cached locally for reuse.
  • An audio pipeline that captures raw PCM data at 16 kHz and batches it before sending it to the server.
  • A scene detector on the server that triggers events like sound changes or text updates based on transcript analysis.
  • A dynamic story arc system that evolves through phases, generating encounter seeds tailored to the current phase of the larger narrative.
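
The batching step in the audio pipeline can be sketched as follows. This is a minimal illustration, not GM-Genie's actual code: the summary only states that raw PCM is captured at 16 kHz and batched before sending, so the 16-bit mono sample format, the 100 ms batch duration, and the `PcmBatcher` name are all assumptions made for the example.

```python
from typing import List

SAMPLE_RATE_HZ = 16_000  # from the article: raw PCM captured at 16 kHz
BYTES_PER_SAMPLE = 2     # assumption: 16-bit mono samples
BATCH_MS = 100           # assumption: hypothetical batch duration


class PcmBatcher:
    """Accumulates raw PCM chunks and emits fixed-duration batches."""

    def __init__(self, batch_ms: int = BATCH_MS) -> None:
        # Bytes per batch = samples per second * bytes per sample * seconds.
        self.batch_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * batch_ms // 1000
        self._buffer = bytearray()

    def push(self, chunk: bytes) -> List[bytes]:
        """Add a captured chunk; return any complete batches ready to send."""
        self._buffer.extend(chunk)
        batches: List[bytes] = []
        while len(self._buffer) >= self.batch_bytes:
            batches.append(bytes(self._buffer[: self.batch_bytes]))
            del self._buffer[: self.batch_bytes]
        return batches

    def flush(self) -> bytes:
        """Return leftover audio shorter than one batch and clear the buffer."""
        rest = bytes(self._buffer)
        self._buffer.clear()
        return rest
```

A capture callback would call `push()` with each chunk it receives and forward every returned batch to the server, then call `flush()` once when capture stops. Fixed-size batches trade a little latency for fewer, more predictable network messages.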

Read the full article at DEV Community



