Now Available: Audio Transcription & Video Generation

May 22, 2026 by Tyler Rockwood

Your team’s best context does not always live in text.

Sometimes it is buried in a podcast, a customer call, a meeting recording, or a voice note. And sometimes your team needs to produce more than text, such as a visual clip or video.

Tasklet can now help with both.

We just rolled out two new capabilities: audio transcription and video generation.

Audio Transcription

You can now drop an audio file into Tasklet, generate a transcript, and ask your agent to pull out the parts that matter: summaries, actionable insights, quotes, and more.

In the demo, we used a 90-minute episode of The Cognitive Revolution featuring Tasklet founder Andrew Lee. Tasklet transcribed it and pulled out insights on positioning and differentiation, along with campaign ideas and strong one-liners.

The transcript is not the final output. It becomes the raw material and context your agent can use to get work done.

Video Generation

With native video generation, you describe the video you want in plain text and Tasklet turns it into a finished clip.

Regardless of which model you choose in Tasklet, we may use specialized capabilities offered by other models to more effectively complete specific tasks. For example, image generation uses GPT Image 2.0, audio transcription uses Gemini 3.5 Flash, and video generation uses Veo 3.1. Tasklet can route different pieces of work to the model best suited for the job.
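
As a rough illustration of that routing idea, here is a minimal Python sketch. It is not Tasklet's actual implementation: the `SPECIALIZED_MODELS` table, the task-type keys, and the `pick_model` function are hypothetical names made up for this example. Only the model names come from the paragraph above.

```python
# Hypothetical sketch of task-based model routing.
# The task-type keys and function name are assumptions for illustration only.
SPECIALIZED_MODELS = {
    "image_generation": "GPT Image 2.0",
    "audio_transcription": "Gemini 3.5 Flash",
    "video_generation": "Veo 3.1",
}


def pick_model(task_type: str, chosen_model: str) -> str:
    """Return the specialized model for a task type, if one is listed.

    Any task type without a specialized entry falls back to the model
    the user chose.
    """
    return SPECIALIZED_MODELS.get(task_type, chosen_model)
```

In this sketch, `pick_model("audio_transcription", "my-model")` would pick the transcription model regardless of the chosen one, while an unlisted task type such as `"summarize"` would stay on `"my-model"`.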

These features help your agents do more with the context your team already has. Audio files, transcripts, videos, images, documents, apps, tools, workflows — more of your team’s work can now happen in one place, no matter what format it’s in.

Have feedback? Drop me a note at tyler@tasklet.ai