Starting from the topic "How do chameleons change color?", the Skill automatically researched the content, wrote the voiceover script, and generated TTS narration. It then planned 7 animated info cards, overlaid and rendered them, and burned in subtitles to produce a 67-second landscape science video. Along the way, the user only needed to approve the script and the card plan; the Skill handled everything else.


The user only needs to provide a keyword. The Skill then retrieves core scientific knowledge from multiple sources at once and distills it into visualizable concepts, so the user does not need to prepare any source material or copy.
The voiceover script is automatically organized into a three-part structure: "counterintuitive hook → mechanism explained → closing insight," with a natural spoken tone that adapts to formats from 60 seconds to 5 minutes.
Based on the content style, the Skill automatically picks either a documentary tone or a science-creator tone and sets the voice and speaking speed to match. Chinese and English each have matching voices, so no manual selection is needed.
Alongside the TTS audio, the Skill outputs a timestamped subtitle file. It can also correct typos and clean up punctuation, so the subtitles stay aligned with the narration.
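
To make the idea of a timestamped subtitle file concrete, here is a minimal sketch that turns timed narration segments into SRT text. It is not the Skill's actual code: the `Segment` class and the `to_srt` helper are hypothetical names chosen for illustration, and the sketch assumes the TTS step already reports a start and end time in seconds for each sentence.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One narrated sentence with its timing (hypothetical structure)."""
    start: float  # seconds from the beginning of the narration
    end: float    # seconds from the beginning of the narration
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Build SRT text: numbered cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)
```

Because each cue takes its times directly from the narration segment it belongs to, cleaning up the text of a cue (typos, punctuation) does not shift its timing.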
For key concepts and data points in the script, the Skill plans dynamic cards such as Keyword, Compare, and LowerThird. After frame-based preview confirmation, it overlays them onto the video with ffmpeg and burns in subtitles, so the on-screen information density stays in sync with the narration pace.
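
The overlay-and-burn-in step can be sketched as a single ffmpeg invocation. The example below only builds the argument list and does not run anything. It is an illustration under stated assumptions, not the Skill's real pipeline: `Card` and `build_ffmpeg_cmd` are hypothetical names, each card is assumed to be a pre-rendered still PNG with transparency (an animated card clip would also need its timestamps shifted to its start time), the subtitle path is assumed to contain no characters that need escaping in a filter graph, and the ffmpeg build is assumed to include the `subtitles` filter (libass).

```python
from dataclasses import dataclass


@dataclass
class Card:
    """A pre-rendered info card and the time window it is visible (hypothetical)."""
    path: str     # still image with alpha, e.g. a PNG
    start: float  # seconds into the video when the card appears
    end: float    # seconds into the video when the card disappears


def build_ffmpeg_cmd(video: str, cards: list[Card], subtitles: str, output: str) -> list[str]:
    """Return an ffmpeg argument list that overlays cards, then burns in subtitles."""
    cmd = ["ffmpeg", "-y", "-i", video]
    for card in cards:
        cmd += ["-i", card.path]

    chains = []
    previous = "0:v"  # label of the video stream produced so far
    for index, card in enumerate(cards, start=1):
        label = f"v{index}"
        # Timeline editing: the overlay is only active between start and end.
        chains.append(
            f"[{previous}][{index}:v]"
            f"overlay=enable='between(t,{card.start},{card.end})'[{label}]"
        )
        previous = label
    # Burn subtitles last so they are drawn on top of the cards.
    chains.append(f"[{previous}]subtitles={subtitles}[out]")

    cmd += [
        "-filter_complex", ";".join(chains),
        "-map", "[out]",
        "-map", "0:a?",   # keep the narration audio if the input has any
        "-c:a", "copy",
        output,
    ]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Chaining the overlays in order and applying `subtitles` at the end keeps each card's visibility tied to the narration timeline while the subtitles stay on top.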