Introduction to FFmpeg: Common API usage and C language development
For a project, I previously called ffmpeg from C# and transcoded video with command-line instructions. The instructions are easy to use, but once secondary audio/video development gets even a little complex, it is hard to understand the meaning and logic of the code without some grasp of the underlying audio/video concepts. So, out of interest, I recently started learning how to use the FFmpeg API.
Understanding Related Concepts
1. Basic concepts of multimedia files
- A multimedia file is a container
- There are many streams in the container (Stream/Track)
- Each stream is encoded by a different encoder
- Data read from a stream is called a packet
- A packet contains one or more frames (see the sketch after this list for how these concepts map onto FFmpeg structures)
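To make these concepts concrete, here is a minimal sketch (the file name input.mp4 is just a placeholder) of how they map onto FFmpeg structures: the container is an AVFormatContext, each stream is an AVStream, and the data read from a stream arrives as AVPackets.

```cpp
#include <stdio.h>

extern "C" {
#include "libavformat/avformat.h"
}

int demo_read_packets(void)
{
    AVFormatContext *fmt_ctx = NULL;

    // The container: open it and read its stream information
    if (avformat_open_input(&fmt_ctx, "input.mp4", NULL, NULL) < 0)
        return -1;
    if (avformat_find_stream_info(fmt_ctx, NULL) < 0) {
        avformat_close_input(&fmt_ctx);
        return -1;
    }

    AVPacket pkt;
    av_init_packet(&pkt);

    // Each av_read_frame() call returns one packet belonging to one stream
    while (av_read_frame(fmt_ctx, &pkt) >= 0) {
        printf("stream %d, pts %lld, size %d\n",
               pkt.stream_index, (long long)pkt.pts, pkt.size);
        av_packet_unref(&pkt);
    }

    avformat_close_input(&fmt_ctx);
    return 0;
}
```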
2. Audio quantization and encoding
- Quantization is the process of converting an analog signal into a digital signal (continuous -> discrete); computers can only process discrete (digital) data
- Analog signal -> sampling -> quantization -> encoding -> digital signal
- Sample size: how many bits are used to store one sample, commonly 16 bits
- Sampling rate: the sampling frequency (number of samples taken per second). Common sampling rates are 8 kHz, 16 kHz, 32 kHz, 44.1 kHz, 48 kHz, etc. The higher the sampling rate, the more faithful and natural the reconstructed sound, and of course the larger the amount of data.
- Channel count: to reproduce a realistic sound field during playback, sound is captured from several different positions at the same time during recording, one channel per position. The channel count is the number of sound sources used when recording, or the number of corresponding speakers during playback: mono, stereo (two channels), or multichannel.
- Bit rate: also called code rate, the number of bits transmitted per second, in bps (bits per second). The higher the bit rate, the more data is transmitted per second and the better the sound quality.
Bit rate formula: Bit rate = sampling rate * sample size * channel count. For example, a PCM-encoded WAV file with a 44.1 kHz sampling rate, 16-bit samples, and two channels has a bit rate of 44.1 kHz * 16 bit * 2 = 1411.2 kbit/s, so one minute of recorded music takes (1411.2 * 1000 * 60) / 8 / 1024 / 1024 = 10.09 MB (a quick sketch verifying this arithmetic follows).
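As a quick sanity check of this arithmetic, a tiny stand-alone program (values hard-coded for the 44.1 kHz / 16 bit / stereo example above) might look like:

```cpp
#include <stdio.h>

int main(void)
{
    const double sample_rate = 44100.0;  // 44.1 kHz
    const int    sample_size = 16;       // bits per sample
    const int    channels    = 2;        // stereo

    // Bit rate = sampling rate * sample size * channel count (bits per second)
    double bit_rate = sample_rate * sample_size * channels;

    // One minute of PCM data, converted from bits to MB
    double size_mb = bit_rate * 60 / 8 / 1024 / 1024;

    printf("bit rate: %.1f kbit/s\n", bit_rate / 1000);   // 1411.2 kbit/s
    printf("one minute of audio: %.2f MB\n", size_mb);    // ~10.09 MB
    return 0;
}
```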
3. Time Base
- time_base is a unit of measure for time. For example, time_base = {1, 40} means one second is divided into 40 segments, so each segment is 1/40 of a second. In FFmpeg, the function av_q2d(time_base) converts the time base into seconds; here the result is 1/40 second. If a video frame has a pts of 800, i.e. 800 segments, the time it represents is pts * av_q2d(time_base) = 800 * (1/40) = 20 s, meaning the frame, converted through its time base, is presented at the 20th second. Different formats have different time bases.
- PTS is the presentation timestamp, used for rendering; DTS is the decoding timestamp.
pts for audio: take AAC audio as an example. One raw AAC frame contains 1024 samples plus the related data for that period, i.e. 1024 samples per frame. If the sampling rate is 44.1 kHz (44100 samples collected per second), the AAC audio has 44100/1024 frames per second and each frame lasts 1024/44100 seconds, so the pts of each frame can be calculated from this.
- Conversion formulas (st is the AVStream pointer; "timestamp" here means PTS/DTS; a sketch applying them follows):
  - timestamp (seconds) = PTS * av_q2d(st->time_base)  // position of the frame in the audio/video
  - duration (seconds) = st->duration * av_q2d(st->time_base)  // length of the audio/video
  - timestamp (FFmpeg internal timestamp) = AV_TIME_BASE * time (seconds)
  - time (seconds) = AV_TIME_BASE_Q * timestamp (FFmpeg internal timestamp)
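A small sketch (the pts value 800 and the {1, 40} time base are the example numbers from above) applying these formulas:

```cpp
#include <stdio.h>
#include <stdint.h>

extern "C" {
#include "libavutil/avutil.h"    // AV_TIME_BASE
#include "libavutil/rational.h"  // AVRational, av_q2d()
}

int main(void)
{
    AVRational time_base = { 1, 40 };   // one second split into 40 segments
    int64_t pts = 800;                  // example pts of a frame

    // timestamp (seconds) = PTS * av_q2d(time_base)
    double seconds = pts * av_q2d(time_base);            // 800 * (1/40) = 20 s

    // seconds <-> FFmpeg internal timestamp (time base {1, AV_TIME_BASE})
    AVRational internal_tb = { 1, AV_TIME_BASE };
    int64_t internal_ts = (int64_t)(seconds * AV_TIME_BASE);
    double back_to_sec = internal_ts * av_q2d(internal_tb);

    // AAC example: 1024 samples per frame at 44.1 kHz
    double aac_frame_duration = 1024.0 / 44100.0;        // ~0.0232 s per frame

    printf("%.1f s, internal %lld, back to %.1f s, aac frame %.4f s\n",
           seconds, (long long)internal_ts, back_to_sec, aac_frame_duration);
    return 0;
}
```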
Environment Configuration
Related Downloads
Go to the official website and download the Dev and Shared packages. When downloading, note that the platform you choose must match your project.
Unzip the include and lib directories from the Dev package into the corresponding project directories, and copy the dll files from the Shared package into the project's Debug directory, otherwise an error will occur at runtime.
Environment Configuration
Create a C/C++ project in VS, right-click the project and open its Properties.
Add the following library files to the linker dependencies:
avcodec.lib; avformat.lib; avutil.lib; avdevice.lib; avfilter.lib; postproc.lib; swresample.lib; swscale.lib
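As an alternative to the project-properties dialog, the same import libraries can be linked from source with MSVC's #pragma comment directive, for example:

```cpp
// MSVC only: link the FFmpeg import libraries directly from source
#pragma comment(lib, "avcodec.lib")
#pragma comment(lib, "avformat.lib")
#pragma comment(lib, "avutil.lib")
#pragma comment(lib, "avdevice.lib")
#pragma comment(lib, "avfilter.lib")
#pragma comment(lib, "postproc.lib")
#pragma comment(lib, "swresample.lib")
#pragma comment(lib, "swscale.lib")
```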
- libavcodec: provides implementations of a wide range of encoders and decoders
- libavformat: implements streaming protocols, container formats, and their I/O access
- libavutil: includes hashers, decompressors, and miscellaneous utility functions
- libavfilter: provides various audio and video filters
- libavdevice: provides access to capture and playback devices
- libswresample: implements audio mixing and resampling
- libswscale: implements color conversion and scaling
Test
I used VS2017 as the IDE for development.
```cpp
#include <stdio.h>
#include <stdlib.h>
#include <iostream>

extern "C" {
#include "libavcodec/avcodec.h"
#include "libavformat/avformat.h"
}

int main(int argc, char* argv[])
{
    // Print the configure options FFmpeg was built with
    printf("%s\n", avcodec_configuration());
    system("pause");
    return 0;
}
```
Development Cases
The goal is to mix the audio of one video with the picture of another, a feature similar to dubbing apps such as Xiaokaxiu.
Processing logic and APIs used
- API Registration
- Create input and output contexts
- Get Input Audio Stream, Input Video Stream
- Create Output Audio Stream, Output Video Stream
- Copy input stream parameters to output stream parameters
- Compare the two input durations to determine the output file length
- Write header information
- Initialize the packet, read the audio and video data separately, and write them to the output file
Related Code
```cpp
#include <stdio.h>
#include <iostream>

extern "C" {
#include "libavcodec/avcodec.h"
#include "libavformat/avformat.h"
#include "libavformat/avio.h"
#include "libavutil/log.h"
#include "libavutil/timestamp.h"
}

#define ERROR_STR_SIZE 1024

int main(int argc, char const *argv[])
{
    int ret = -1;
    int err_code;
    char errors[ERROR_STR_SIZE];

    AVFormatContext *ifmt_ctx1 = NULL;
    AVFormatContext *ifmt_ctx2 = NULL;
    AVFormatContext *ofmt_ctx = NULL;
    AVOutputFormat *ofmt = NULL;
    AVStream *in_stream1 = NULL;
    AVStream *in_stream2 = NULL;
    AVStream *out_stream1 = NULL;
    AVStream *out_stream2 = NULL;
    int audio_stream_index = 0;
    int video_stream_index = 0;

    // Maximum output length, used to keep the audio and video data the same length
    double max_duration = 0;
    AVPacket pkt;
    int stream1 = 0, stream2 = 0;

    av_log_set_level(AV_LOG_DEBUG);

    // Open the two input files
    if ((err_code = avformat_open_input(&ifmt_ctx1, "C:\\Users\\haizhengzheng\\Desktop\\meta.mp4", 0, 0)) < 0) {
        av_strerror(err_code, errors, ERROR_STR_SIZE);
        av_log(NULL, AV_LOG_ERROR, "Could not open src file, %s, %d(%s)\n",
               "C:\\Users\\haizhengzheng\\Desktop\\meta.mp4", err_code, errors);
        goto END;
    }
    if ((err_code = avformat_open_input(&ifmt_ctx2, "C:\\Users\\haizhengzheng\\Desktop\\mercury.mp4", 0, 0)) < 0) {
        av_strerror(err_code, errors, ERROR_STR_SIZE);
        av_log(NULL, AV_LOG_ERROR, "Could not open the second src file, %s, %d(%s)\n",
               "C:\\Users\\haizhengzheng\\Desktop\\mercury.mp4", err_code, errors);
        goto END;
    }

    // Create the output context
    if ((err_code = avformat_alloc_output_context2(&ofmt_ctx, NULL, NULL, "C:\\Users\\haizhengzheng\\Desktop\\amv.mp4")) < 0) {
        av_strerror(err_code, errors, ERROR_STR_SIZE);
        av_log(NULL, AV_LOG_ERROR, "Failed to create an context of outfile, %d(%s)\n",
               err_code, errors);
        goto END;
    }
    ofmt = ofmt_ctx->oformat;  // Format information of the output file

    // Find the best audio stream in the first file and the best video stream in the second file
    audio_stream_index = av_find_best_stream(ifmt_ctx1, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0);
    video_stream_index = av_find_best_stream(ifmt_ctx2, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0);

    // Get the audio stream from the first file
    in_stream1 = ifmt_ctx1->streams[audio_stream_index];
    stream1 = 0;

    // Create the audio output stream
    out_stream1 = avformat_new_stream(ofmt_ctx, NULL);
    if (!out_stream1) {
        av_log(NULL, AV_LOG_ERROR, "Failed to alloc out stream!\n");
        goto END;
    }
    // Copy the stream parameters
    if ((err_code = avcodec_parameters_copy(out_stream1->codecpar, in_stream1->codecpar)) < 0) {
        av_strerror(err_code, errors, ERROR_STR_SIZE);
        av_log(NULL, AV_LOG_ERROR, "Failed to copy codec parameter, %d(%s)\n", err_code, errors);
        goto END;
    }
    out_stream1->codecpar->codec_tag = 0;

    // Get the video stream from the second file
    in_stream2 = ifmt_ctx2->streams[video_stream_index];
    stream2 = 1;

    // Create the video output stream
    out_stream2 = avformat_new_stream(ofmt_ctx, NULL);
    if (!out_stream2) {
        av_log(NULL, AV_LOG_ERROR, "Failed to alloc out stream!\n");
        goto END;
    }
    // Copy the stream parameters
    if ((err_code = avcodec_parameters_copy(out_stream2->codecpar, in_stream2->codecpar)) < 0) {
        av_strerror(err_code, errors, ERROR_STR_SIZE);
        av_log(NULL, AV_LOG_ERROR, "Failed to copy codec parameter, %d(%s)\n", err_code, errors);
        goto END;
    }
    out_stream2->codecpar->codec_tag = 0;

    // Dump the output stream information
    av_dump_format(ofmt_ctx, 0, "C:\\Users\\haizhengzheng\\Desktop\\amv.mp4", 1);

    // Compare the two stream durations to determine the final file length:
    // time (seconds) = st->duration * av_q2d(st->time_base)
    if (in_stream1->duration * av_q2d(in_stream1->time_base) >
        in_stream2->duration * av_q2d(in_stream2->time_base)) {
        max_duration = in_stream2->duration * av_q2d(in_stream2->time_base);
    } else {
        max_duration = in_stream1->duration * av_q2d(in_stream1->time_base);
    }

    // Open the output file
    if (!(ofmt->flags & AVFMT_NOFILE)) {
        if ((err_code = avio_open(&ofmt_ctx->pb, "C:\\Users\\haizhengzheng\\Desktop\\amv.mp4", AVIO_FLAG_WRITE)) < 0) {
            av_strerror(err_code, errors, ERROR_STR_SIZE);
            av_log(NULL, AV_LOG_ERROR, "Could not open output file, %s, %d(%s)\n",
                   "C:\\Users\\haizhengzheng\\Desktop\\amv.mp4", err_code, errors);
            goto END;
        }
    }

    // Write the header
    avformat_write_header(ofmt_ctx, NULL);

    av_init_packet(&pkt);

    // Read audio data and write it to the output file
    while (av_read_frame(ifmt_ctx1, &pkt) >= 0) {
        // Skip frames beyond the maximum duration; they are not needed
        if (pkt.pts * av_q2d(in_stream1->time_base) > max_duration) {
            av_packet_unref(&pkt);
            continue;
        }
        // If this is the audio stream we need, convert the pts/dts/duration to the
        // output time base with av_rescale_q_rnd()/av_rescale_q() and write the packet
        if (pkt.stream_index == audio_stream_index) {
            pkt.pts = av_rescale_q_rnd(pkt.pts, in_stream1->time_base, out_stream1->time_base,
                                       (AVRounding)(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
            pkt.dts = av_rescale_q_rnd(pkt.dts, in_stream1->time_base, out_stream1->time_base,
                                       (AVRounding)(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
            pkt.duration = av_rescale_q(pkt.duration, in_stream1->time_base, out_stream1->time_base);
            pkt.pos = -1;
            pkt.stream_index = stream1;
            av_interleaved_write_frame(ofmt_ctx, &pkt);
            av_packet_unref(&pkt);
        }
    }

    // Read video data and write it to the output file
    while (av_read_frame(ifmt_ctx2, &pkt) >= 0) {
        // Skip frames beyond the maximum duration; they are not needed
        if (pkt.pts * av_q2d(in_stream2->time_base) > max_duration) {
            av_packet_unref(&pkt);
            continue;
        }
        // If this is the video stream we need, convert the time base and write the packet
        if (pkt.stream_index == video_stream_index) {
            pkt.pts = av_rescale_q_rnd(pkt.pts, in_stream2->time_base, out_stream2->time_base,
                                       (AVRounding)(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
            pkt.dts = av_rescale_q_rnd(pkt.dts, in_stream2->time_base, out_stream2->time_base,
                                       (AVRounding)(AV_ROUND_NEAR_INF | AV_ROUND_PASS_MINMAX));
            pkt.duration = av_rescale_q(pkt.duration, in_stream2->time_base, out_stream2->time_base);
            pkt.pos = -1;
            pkt.stream_index = stream2;
            av_interleaved_write_frame(ofmt_ctx, &pkt);
            av_packet_unref(&pkt);
        }
    }

    // Write the trailer
    av_write_trailer(ofmt_ctx);
    ret = 0;

END:
    // Release resources
    if (ifmt_ctx1) {
        avformat_close_input(&ifmt_ctx1);
    }
    if (ifmt_ctx2) {
        avformat_close_input(&ifmt_ctx2);
    }
    if (ofmt_ctx) {
        if (!(ofmt->flags & AVFMT_NOFILE)) {
            avio_closep(&ofmt_ctx->pb);
        }
        avformat_free_context(ofmt_ctx);
    }
    return ret;
}
```