Updated on 2025-04-11

Using ffmpeg for video and audio processing in Go

ffmpeg is a powerful multimedia processing tool that supports video and audio encoding, decoding, and transcoding, as well as frame extraction and stream processing. It is a common choice for developers who need to handle multimedia content. In this article, we show how to drive ffmpeg from Go through a small package, ffmpegutil, that simplifies video and audio processing.

We cover several common scenarios (video format conversion, audio extraction, thumbnail creation, and frame extraction) and look at how to run them concurrently in Go.

ffmpegutil package overview

The ffmpegutil package wraps common ffmpeg operations behind a simpler Go interface. It provides the following functions:

  • Video format conversion
  • Extract audio from video
  • Get video information and metadata
  • Create video thumbnails
  • Extract frames at random timestamps

The package relies on ffmpeg-go, a Go wrapper library for ffmpeg, which makes it easier to integrate ffmpeg functionality into Go projects.

Introduction to main functions

1. Video format conversion

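Video format conversion is one of the most common applications of ffmpeg. In ffmpegutil, the ConvertVideo function converts the input video file with a single call. ffmpeg infers the target container format from the extension of the output file name, and one extra option can be passed as a key/value pair.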

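// Assumes: import ffmpeg "github.com/u2takey/ffmpeg-go", plus "fmt" and "log".
// The import path is an assumption; the article only names the library "ffmpeg-go".

// ConvertVideo converts a video from one format to another.
// key and value are passed to ffmpeg as a single output option.
func ConvertVideo(inputFile, outputFile string, key, value string) error {
    err := ffmpeg.Input(inputFile).
        Output(outputFile, ffmpeg.KwArgs{key: value}).
        OverWriteOutput().ErrorToStdOut().Run()
    if err != nil {
        return fmt.Errorf("error converting video: %w", err)
    }
    log.Printf("Video conversion complete: %s -> %s", inputFile, outputFile)
    return nil
}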

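With ffmpeg.Input(inputFile).Output(outputFile, ffmpeg.KwArgs{key: value}), you set the input path, the output path, and the conversion option. ffmpeg-go then assembles the corresponding command line and runs the ffmpeg binary, which must be installed and available on the PATH.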

2. Extract audio

Extracting the audio track from a video is a common requirement, for example to produce a podcast version or to feed a speech-recognition step. The ExtractAudio function uses ffmpeg to do this.

// ExtractAudio extracts the audio track from a video file.
func ExtractAudio(inputFile, outputFile string) error {
    err := ffmpeg.Input(inputFile).Output(outputFile, ffmpeg.KwArgs{"vn": ""}).Run()
    if err != nil {
        return fmt.Errorf("error extracting audio: %w", err)
    }
    log.Printf("Audio extraction complete: %s -> %s", inputFile, outputFile)
    return nil
}

In ffmpeg.KwArgs{"vn": ""}, the vn option (ffmpeg's -vn flag) disables video in the output, so only the audio stream is written. The empty value means the flag takes no argument.

3. Obtain video information

Getting basic information about a video is another common operation. In ffmpegutil, the GetVideoInfo function uses ffmpeg.Probe, which runs ffprobe and returns its JSON output as a string.

// GetVideoInfo returns basic information about a video file as a JSON string.
func GetVideoInfo(inputFile string) (string, error) {
    probeData, err := ffmpeg.Probe(inputFile)
    if err != nil {
        return "", fmt.Errorf("error getting video info: %w", err)
    }
    log.Printf("Video Info: %v", probeData)
    return probeData, nil
}

The returned metadata contains information such as the container format, duration, and bit rate, and can be used for subsequent processing.
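Because the probe result is a JSON string, later steps usually need to decode it first. The following is a minimal sketch of that step using only the standard library. The struct covers just a few fields of the "format" object in ffprobe's JSON output, parseDuration is a hypothetical helper that is not part of ffmpegutil, and the sample string is hand-written in the shape ffprobe produces.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// probeFormat mirrors a few fields of the "format" object in ffprobe's JSON
// output. ffprobe reports numeric values such as the duration as strings.
type probeFormat struct {
	FormatName string `json:"format_name"`
	Duration   string `json:"duration"`
	BitRate    string `json:"bit_rate"`
}

type probeResult struct {
	Format probeFormat `json:"format"`
}

// parseDuration (hypothetical helper) extracts the duration in seconds
// from a probe JSON string.
func parseDuration(probeJSON string) (float64, error) {
	var res probeResult
	if err := json.Unmarshal([]byte(probeJSON), &res); err != nil {
		return 0, fmt.Errorf("error decoding probe data: %w", err)
	}
	d, err := strconv.ParseFloat(res.Format.Duration, 64)
	if err != nil {
		return 0, fmt.Errorf("error parsing duration: %w", err)
	}
	return d, nil
}

func main() {
	// Hand-written sample; real ffprobe output has many more fields.
	sample := `{"format": {"format_name": "mov,mp4,m4a,3gp,3g2,mj2", "duration": "12.500000", "bit_rate": "1205959"}}`
	d, err := parseDuration(sample)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("duration: %.2f s\n", d)
}
```

Decoding into a typed struct rather than a generic map keeps the field names in one place and makes a missing duration surface as an explicit parse error.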

4. Create a video thumbnail

Generating thumbnails is a common requirement in video processing, especially when videos are listed on a media platform. The CreateThumbnail function extracts one frame from the video and saves it as the thumbnail.

// CreateThumbnail creates a thumbnail image from a video.
func CreateThumbnail(inputFile, outputFile string) error {
    err := ffmpeg.Input(inputFile).Output(outputFile, ffmpeg.KwArgs{"vframes": "1", "vf": "scale=800:600"}).Run()
    if err != nil {
        return fmt.Errorf("error creating thumbnail: %w", err)
    }
    log.Printf("Thumbnail created: %s -> %s", inputFile, outputFile)
    return nil
}

This function extracts the first frame of the video by setting vframes=1 and resizes it with scale=800:600. Note that a fixed 800:600 ignores the source aspect ratio, so frames from a 16:9 video will look stretched.
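As a variation on the options above, the sketch below builds an option set that seeks to a chosen timestamp and scales to a fixed width while preserving the aspect ratio. thumbnailOptions is a hypothetical helper, not part of ffmpegutil. It relies on two standard ffmpeg behaviors: the ss option seeks to a position in seconds, and a height of -1 in the scale filter tells ffmpeg to derive the height from the width.

```go
package main

import (
	"fmt"
	"strconv"
)

// thumbnailOptions (hypothetical helper) builds output options for a
// single-frame thumbnail taken at seekSeconds and scaled to the given width.
func thumbnailOptions(seekSeconds float64, width int) map[string]string {
	return map[string]string{
		"ss":      strconv.FormatFloat(seekSeconds, 'f', 3, 64), // seek position in seconds
		"vframes": "1",                                          // write exactly one frame
		"vf":      fmt.Sprintf("scale=%d:-1", width),            // -1 keeps the aspect ratio
	}
}

func main() {
	opts := thumbnailOptions(5, 800)
	fmt.Println(opts["ss"], opts["vframes"], opts["vf"])
}
```

A map like this could be converted to ffmpeg.KwArgs and passed to Output in place of the fixed options; that wiring is left out here to keep the sketch free of external dependencies.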

5. Extract random frames

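Extracting random frames from a video is a more advanced operation, often used for video analysis or to generate previews. ffmpegutil provides two versions: ExtractRandomFramesNoThread, which extracts frames one at a time, and ExtractRandomFrames, which extracts them concurrently. Both rely on helper functions (GetVideoFormat, generateRandomTimestamps, and extractFrameAtTimestamp) that are not listed in this article.

Single-threaded version: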

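// ExtractRandomFramesNoThread extracts random frames from a video, one at a time.
func ExtractRandomFramesNoThread(inputFile, outputDir, filePrefix string, numFrames int) error {
    // Make sure the output directory exists.
    // os.ModePerm is assumed; the permission argument is missing in the original listing.
    err := os.MkdirAll(outputDir, os.ModePerm)
    if err != nil {
        return fmt.Errorf("failed to create output directory: %w", err)
    }

    format, err := GetVideoFormat(inputFile)
    if err != nil {
        return fmt.Errorf("error getting video format: %w", err)
    }

    // format.Duration is assumed; the first argument is missing in the original listing.
    duration, err := strconv.ParseFloat(format.Duration, 64)
    if err != nil {
        return fmt.Errorf("error parsing duration: %w", err)
    }

    randSource := rand.NewSource(time.Now().UnixNano())
    randGen := rand.New(randSource)
    timestamps := generateRandomTimestamps(duration, numFrames, randGen)

    for i, timestamp := range timestamps {
        // The "%d.jpg" suffix is assumed; the original format string is cut off after "%s_%".
        outputFile := filepath.Join(outputDir, fmt.Sprintf("%s_%d.jpg", filePrefix, i+1))
        err := extractFrameAtTimestamp(inputFile, outputFile, timestamp)
        if err != nil {
            // A failed frame is logged and skipped; the remaining frames are still extracted.
            log.Printf("Error extracting frame: %v", err)
        } else {
            log.Printf("Frame extracted: %s -> %s", inputFile, outputFile)
        }
    }

    return nil
}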

Multithreaded version:

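// ExtractRandomFrames extracts random frames from a video concurrently,
// running at most numThreads extractions at the same time.
func ExtractRandomFrames(inputFile, outputDir, filePrefix string, numFrames, numThreads int) error {
    // Make sure the output directory exists.
    // os.ModePerm is assumed; the permission argument is missing in the original listing.
    err := os.MkdirAll(outputDir, os.ModePerm)
    if err != nil {
        return fmt.Errorf("failed to create output directory: %w", err)
    }

    format, err := GetVideoFormat(inputFile)
    if err != nil {
        return fmt.Errorf("error getting video format: %w", err)
    }

    // format.Duration is assumed; the first argument is missing in the original listing.
    duration, err := strconv.ParseFloat(format.Duration, 64)
    if err != nil {
        return fmt.Errorf("error parsing duration: %w", err)
    }

    randSource := rand.NewSource(time.Now().UnixNano())
    randGen := rand.New(randSource)
    timestamps := generateRandomTimestamps(duration, numFrames, randGen)

    var wg sync.WaitGroup
    // A buffered channel used as a counting semaphore with numThreads slots.
    sem := make(chan struct{}, numThreads)

    for i, timestamp := range timestamps {
        wg.Add(1)

        go func(index int, ts float64) {
            defer wg.Done()

            sem <- struct{}{}        // acquire semaphore
            defer func() { <-sem }() // release semaphore

            // The "%d.jpg" suffix is assumed; the original format string is cut off after "%s_%".
            outputFile := filepath.Join(outputDir, fmt.Sprintf("%s_%d.jpg", filePrefix, index+1))
            err := extractFrameAtTimestamp(inputFile, outputFile, ts)
            if err != nil {
                log.Printf("Error extracting frame: %v", err)
            } else {
                log.Printf("Frame extracted: %s -> %s", inputFile, outputFile)
            }
        }(i, timestamp)
    }

    // Wait for all extractions to finish before returning.
    wg.Wait()

    return nil
}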

Summary

With the ffmpegutil package, Go developers can implement common video and audio tasks such as format conversion, audio extraction, thumbnail generation, and random frame extraction with a few function calls. Because each ffmpeg-go call runs a separate ffmpeg process, these calls combine naturally with Go's goroutines to process many files or frames in parallel, limited mainly by CPU and disk throughput.

Whether the goal is video analysis, audio processing, or thumbnails for a video platform, wrapping ffmpeg in a small Go package makes it easier to integrate into your own projects.
