Some structures of FFmpeg

Keywords: data structure

1. AVCodec structure
2. AVCodecContext structure
3. AVInputFormat structure
4. AVFormatContext structure
5. MOVContext structure
6. URLProtocol structure
7. URLContext structure
8. AVIOContext structure (old version: ByteIOContext)
9. AVStream structure
10. MOVStreamContext structure
11. AVPacket structure
12. AVPacketList structure
13. AVFrame structure

1 AVCodec structure

typedef struct AVCodec {
// Name of the codec, such as "H264" or "H263".
const char *name;
// Type of the codec: video, audio, etc.
enum CodecType type;
// ID of the codec, such as CODEC_ID_H264.
enum CodecID id;
// Size of the context for this specific codec, such as H264Context.
int priv_data_size;
// Operations provided by the codec. Each codec implements these operations.
int (*encode)(AVCodecContext *, uint8_t *buf, int buf_size, void *data);
int (*decode)(AVCodecContext *, void *outdata, int *outdata_size, uint8_t *buf, int buf_size);
struct AVCodec *next;
} AVCodec;

The main H264 structure is initialized as follows:
AVCodec ff_h264_decoder = {

AVCodec is a data structure similar to a COM interface. It represents an audio or video codec and consists mainly of function pointers. Each media type corresponds to one AVCodec structure, so there are multiple instances while the program is running. The next field links all supported codecs into a linked list for easy traversal and lookup; id uniquely identifies the codec; priv_data_size gives the size of the context structure for a specific codec, such as MsrleContext or TSContext. These specific structure definitions are scattered across various .c files. To avoid a long chain of if/else statements that test the type and then compute the size, the size is stated directly in this field. Because the value is fixed at compile time, it is placed in AVCodec rather than AVCodecContext.
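The linked-list traversal and the compile-time priv_data_size described above can be sketched as follows. This is a minimal, self-contained illustration and not FFmpeg's own code: MiniCodec, MiniH263Context, MiniH264Context, mini_register and mini_find_by_id are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins; these are not the real FFmpeg types. */
enum MiniCodecID { MINI_ID_NONE, MINI_ID_H263, MINI_ID_H264 };

typedef struct MiniH263Context { int gob_number; } MiniH263Context;
typedef struct MiniH264Context { int nal_unit_type; int frame_num; } MiniH264Context;

typedef struct MiniCodec {
    const char *name;
    enum MiniCodecID id;
    int priv_data_size;        /* fixed at compile time through sizeof */
    struct MiniCodec *next;    /* links all registered codecs */
} MiniCodec;

static MiniCodec mini_h263 = { "h263", MINI_ID_H263, sizeof(MiniH263Context), NULL };
static MiniCodec mini_h264 = { "h264", MINI_ID_H264, sizeof(MiniH264Context), NULL };

static MiniCodec *first_codec = NULL;

/* Append a codec to the end of the global list. */
static void mini_register(MiniCodec *codec)
{
    MiniCodec **p = &first_codec;
    assert(codec != NULL);
    while (*p)
        p = &(*p)->next;
    *p = codec;
    codec->next = NULL;
}

/* Walk the list and return the codec with the given id, or NULL if absent. */
static MiniCodec *mini_find_by_id(enum MiniCodecID id)
{
    for (MiniCodec *c = first_codec; c; c = c->next)
        if (c->id == id)
            return c;
    return NULL;
}
```

A caller would register each codec once at startup and then look one up by id; the size of its private context is available from the returned entry without any type-specific branching.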

2 AVCodecContext structure

typedef struct AVCodecContext {
int bit_rate;
int frame_number;
// Extra data, such as the esds decoding information of an AAC audio trak in a mov file.
unsigned char *extradata;
// Size of the extra data
int extradata_size;
// Original width and height of the video
int width, height; // Video only
// Pixel format of one video frame, such as YUV420
enum PixelFormat pix_fmt;
// Audio sampling rate
int sample_rate;
// Number of audio channels
int channels;
int bits_per_sample;
int block_align;
// Points to the corresponding decoder, such as ff_h264_decoder
struct AVCodec *codec;
// Points to the context of the specific decoder, such as H264Context
void *priv_data;
// Common operation functions
int (*get_buffer)(struct AVCodecContext *c, AVFrame *pic);
void (*release_buffer)(struct AVCodecContext *c, AVFrame *pic);
int (*reget_buffer)(struct AVCodecContext *c, AVFrame *pic);
} AVCodecContext;

The AVCodecContext structure represents the codec context currently used by the program. It holds the attributes common to all codecs (whose values can only be determined at run time) and the fields that link to other structures. The extradata and extradata_size fields describe the private data used by the corresponding codec; the codec field links to the corresponding codec; the priv_data field links to the codec-specific context and is used together with priv_data_size in the AVCodec structure.
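The pairing of priv_data with priv_data_size can be illustrated with a small sketch. It assumes simplified, hypothetical types (DemoCodec, DemoCodecContext, DemoH264Context) and hypothetical helpers (demo_codec_open, demo_codec_close); it only shows the idea of allocating the private context by the size the codec declares and is not how FFmpeg itself is written.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct DemoCodecContext DemoCodecContext;

/* Hypothetical static description of a codec. */
typedef struct DemoCodec {
    const char *name;
    int priv_data_size;                   /* size of the codec-specific context */
    int (*init)(DemoCodecContext *ctx);
} DemoCodec;

/* Hypothetical run-time context shared by all codecs. */
struct DemoCodecContext {
    int width, height;
    const DemoCodec *codec;
    void *priv_data;                      /* codec-specific context, opaque here */
};

typedef struct DemoH264Context { int frame_num; int initialized; } DemoH264Context;

static int demo_h264_init(DemoCodecContext *ctx)
{
    DemoH264Context *h = ctx->priv_data;  /* the codec knows the real type */
    h->initialized = 1;
    return 0;
}

static const DemoCodec demo_h264 = { "h264", sizeof(DemoH264Context), demo_h264_init };

/* Bind a codec to a context: allocate zeroed private data of the declared size.
 * Returns 0 on success and -1 on failure. */
static int demo_codec_open(DemoCodecContext *ctx, const DemoCodec *codec)
{
    assert(ctx && codec);
    ctx->priv_data = calloc(1, (size_t)codec->priv_data_size);
    if (!ctx->priv_data)
        return -1;
    ctx->codec = codec;
    if (codec->init && codec->init(ctx) < 0) {
        free(ctx->priv_data);
        ctx->priv_data = NULL;
        ctx->codec = NULL;
        return -1;
    }
    return 0;
}

/* Release the private data and detach the codec. */
static void demo_codec_close(DemoCodecContext *ctx)
{
    free(ctx->priv_data);
    ctx->priv_data = NULL;
    ctx->codec = NULL;
}
```

The generic open function never needs to know what H264Context looks like; it only needs the size, which is why the size lives in the static codec description.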

3 AVInputFormat structure

typedef struct AVInputFormat {
// Name of the format, such as "mov" or "mp4".
const char *name;
// Size of the context for this specific format, such as MOVContext.
int priv_data_size;
// Format-specific operation functions
int (*read_header)(struct AVFormatContext *, AVFormatParameters *ap);
int (*read_packet)(struct AVFormatContext *, AVPacket *pkt);
int (*read_close)(struct AVFormatContext *);
struct AVInputFormat *next;
} AVInputFormat;

The main mov/mp4 structure is initialized as follows:
AVInputFormat ff_mov_demuxer = {
NULL_IF_CONFIG_SMALL("QuickTime/MPEG-4/Motion JPEG 2000 format"),

AVInputFormat is a data structure similar to a COM interface. It represents an input file container format and consists mainly of function pointers. Each file container format corresponds to one AVInputFormat structure, so there are multiple instances while the program is running. The next field links all supported input container formats into a linked list for easy traversal and lookup. priv_data_size gives the size of the context for the specific container format, which in this example is MOVContext. These structure definitions are scattered across various .c files.

4 AVFormatContext structure

typedef struct AVFormatContext {
// Points to the AVInputFormat, such as ff_mov_demuxer for mp4 or mov
struct AVInputFormat *iformat;
// Points to the context of the specific format, such as MOVContext.
void *priv_data;
// Context of the unified data-reading interface
ByteIOContext pb;
// Number of streams
int nb_streams;
// At least 2 pointer elements, pointing to the video stream and the audio stream respectively
AVStream *streams[MAX_STREAMS];
} AVFormatContext;

The AVFormatContext structure represents the context of the file container format currently used by the program. It holds the attributes common to all file containers (whose values can only be determined at run time) and the fields that link to other structures. The iformat field links to the corresponding file container format; pb links to the generalized input file; streams links to the audio and video streams; the priv_data field links to the container-specific context and is used together with priv_data_size.

5 MOVContext structure

typedef struct MOVContext {
// Temporarily holds a pointer to the AVFormatContext
AVFormatContext *fc;
// Time scale
int time_scale;
// Duration of the video
int64_t duration;
// Whether the "moov" header was found during demuxing
int found_moov;
// Whether the "mdat" header was found during demuxing
int found_mdat;
int isom;
MOVFragment fragment;
MOVTrackExt *trex_data;
unsigned trex_count;
int itunes_metadata; ///< metadata are itunes style
int chapter_track;
} MOVContext;

MOVContext defines some properties of the stream in mp4.

6 URLProtocol structure

typedef struct URLProtocol {
const char *name;
// Common set of template functions
int (*url_open)(URLContext *h, const char *filename, int flags);
int (*url_read)(URLContext *h, unsigned char *buf, int size);
int (*url_write)(URLContext *h, unsigned char *buf, int size);
offset_t (*url_seek)(URLContext *h, offset_t pos, int whence);
int (*url_close)(URLContext *h);
struct URLProtocol *next;
} URLProtocol;

The main file protocol structure is initialized as follows:
URLProtocol ff_file_protocol = {
.name = "file",
.url_open = file_open,
.url_read = file_read,
.url_write = file_write,
.url_seek = file_seek,
.url_close = file_close,
.url_get_file_handle = file_get_handle,
.url_check = file_check,
};

URLProtocol is a data structure similar to a COM interface. It represents a generalized input file and consists mainly of function pointers. Each kind of generalized input file, such as file, pipe, or tcp, corresponds to one URLProtocol structure, which defines a common set of template functions for that kind. The next field links all supported generalized input files into a linked list for easy traversal and lookup.
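To make the "table of function pointers plus linked list" idea concrete, here is a small sketch that selects a protocol by the scheme in front of the colon and then reads through its function pointers. Everything in it is hypothetical: the DemoURLProtocol and DemoURLContext types, the toy "mem" protocol (whose "file" is simply the text after "mem:"), and demo_url_open are invented for illustration and are not FFmpeg functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct DemoURLContext DemoURLContext;

/* Hypothetical protocol description: a name and a set of operations. */
typedef struct DemoURLProtocol {
    const char *name;
    int (*url_open)(DemoURLContext *h, const char *filename);
    int (*url_read)(DemoURLContext *h, unsigned char *buf, int size);
    int (*url_close)(DemoURLContext *h);
    struct DemoURLProtocol *next;
} DemoURLProtocol;

struct DemoURLContext {
    const DemoURLProtocol *prot;   /* which protocol serves this URL */
    void *priv_data;               /* protocol-specific handle */
};

/* Toy "mem" protocol: the data is the text that follows "mem:". */
typedef struct MemState { const char *data; size_t len, pos; } MemState;
static MemState mem_state;         /* one static handle keeps the sketch short */

static int mem_open(DemoURLContext *h, const char *filename)
{
    mem_state.data = filename;
    mem_state.len = strlen(filename);
    mem_state.pos = 0;
    h->priv_data = &mem_state;
    return 0;
}

static int mem_read(DemoURLContext *h, unsigned char *buf, int size)
{
    MemState *s = h->priv_data;
    size_t left = s->len - s->pos;
    size_t n;
    if (size <= 0)
        return 0;
    n = (size_t)size < left ? (size_t)size : left;
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return (int)n;                 /* 0 means end of data */
}

static int mem_close(DemoURLContext *h) { h->priv_data = NULL; return 0; }

static DemoURLProtocol demo_mem_protocol = { "mem", mem_open, mem_read, mem_close, NULL };
static DemoURLProtocol *first_protocol = &demo_mem_protocol;

/* Pick the protocol whose name matches the scheme before ':' and open it.
 * Returns -1 when no registered protocol matches. */
static int demo_url_open(DemoURLContext *h, const char *url)
{
    const char *colon = strchr(url, ':');
    size_t scheme_len;
    assert(h != NULL);
    if (!colon)
        return -1;
    scheme_len = (size_t)(colon - url);
    for (DemoURLProtocol *p = first_protocol; p; p = p->next) {
        if (strlen(p->name) == scheme_len && !strncmp(p->name, url, scheme_len)) {
            h->prot = p;
            return p->url_open(h, colon + 1);
        }
    }
    return -1;
}
```

After a successful open, the caller only ever goes through h->prot, so adding another protocol means adding another entry to the list rather than touching the callers.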

7 URLContext structure

typedef struct URLContext {
// Points to the corresponding protocol (registered in the linked list at initialization), such as ff_file_protocol
struct URLProtocol *prot;
int flags;
int max_packet_size;
// Handle for the corresponding transport: an fd for a file, a socket for the network, etc.
void *priv_data;
// Name of the file; no distinction is made between local and network files
char *filename;
} URLContext;

The URLContext structure represents the context of the generalized input file currently used by the program. It holds the attributes common to all generalized input files (whose values can only be determined at run time) and the fields that link to other structures. The prot field links to the corresponding generalized input file; the priv_data field links to the handle of each specific input file.

8 AVIOContext structure (old version: ByteIOContext)

typedef struct ByteIOContext {
// Data buffer
unsigned char *buffer;
// Size of the data buffer
int buffer_size;
// Current read position and end of valid data in the buffer
unsigned char *buf_ptr, *buf_end;
// Points to the corresponding URLContext, linking this structure to it
void *opaque;
int (*read_packet)(void *opaque, uint8_t *buf, int buf_size);
int (*write_packet)(void *opaque, uint8_t *buf, int buf_size);
offset_t (*seek)(void *opaque, offset_t offset, int whence);
// Position of the current buffer in the file
offset_t pos;
// Set when a seek requires the buffered data to be flushed
int must_flush;
// Whether the end of the file has been reached
int eof_reached; // true if eof reached
int write_flag;
int max_packet_size;
int error; // contains the error code or 0 if no error happened
} ByteIOContext;

The ByteIOContext structure extends URLProtocol into a generalized file with an internal buffering mechanism, which improves the IO performance of generalized input files. Its fields fall into three groups: buffer-related fields, flag fields, and the association field opaque, which together carry out reads and writes on generalized files. The opaque field links to the URLContext structure and, through it, indirectly links to and extends the URLProtocol structure.
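The buffering idea can be sketched in a few lines: bytes are served from buffer through buf_ptr, and the read_packet callback is invoked with opaque only when the buffer runs dry. The DemoIOContext type, the helper functions, and the memory-backed DemoSource below are hypothetical and only mirror the field names of the structure above; this is an illustrative sketch, not the FFmpeg implementation (for instance, returning -1 at end of data is a choice made for this example).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical reduced buffered reader modelled on the fields above. */
typedef struct DemoIOContext {
    unsigned char *buffer;
    int buffer_size;
    unsigned char *buf_ptr, *buf_end;   /* unread data is [buf_ptr, buf_end) */
    void *opaque;                       /* passed back to read_packet */
    int (*read_packet)(void *opaque, unsigned char *buf, int buf_size);
    int eof_reached;
} DemoIOContext;

static void demo_io_init(DemoIOContext *s, unsigned char *buffer, int buffer_size,
                         void *opaque,
                         int (*read_packet)(void *, unsigned char *, int))
{
    assert(s && buffer && buffer_size > 0 && read_packet);
    s->buffer = buffer;
    s->buffer_size = buffer_size;
    s->buf_ptr = s->buf_end = buffer;   /* buffer starts empty */
    s->opaque = opaque;
    s->read_packet = read_packet;
    s->eof_reached = 0;
}

/* Refill the whole buffer with one call to the underlying source. */
static void demo_fill_buffer(DemoIOContext *s)
{
    int n = s->read_packet(s->opaque, s->buffer, s->buffer_size);
    if (n <= 0) {
        s->eof_reached = 1;
        n = 0;
    }
    s->buf_ptr = s->buffer;
    s->buf_end = s->buffer + n;
}

/* Return the next byte, or -1 once the source is exhausted. */
static int demo_get_byte(DemoIOContext *s)
{
    if (s->buf_ptr >= s->buf_end && !s->eof_reached)
        demo_fill_buffer(s);
    if (s->buf_ptr < s->buf_end)
        return *s->buf_ptr++;
    return -1;
}

/* Hypothetical source: a memory block that counts how often it is read. */
typedef struct DemoSource { const unsigned char *data; int len, pos, calls; } DemoSource;

static int demo_source_read(void *opaque, unsigned char *buf, int buf_size)
{
    DemoSource *src = opaque;
    int n = src->len - src->pos;
    if (n > buf_size)
        n = buf_size;
    memcpy(buf, src->data + src->pos, (size_t)n);
    src->pos += n;
    src->calls++;
    return n;
}
```

With a 4-byte buffer over a 7-byte source, reading byte by byte touches the underlying source only once per refill instead of once per byte, which is the performance benefit the paragraph above refers to.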

9 AVStream structure

typedef struct AVStream {
// Points to the decoder context, which links the stream to its decoder
AVCodecContext *actx;
// Codec parser: during compression each encoder wraps the actual payload and adds header information;
// for H264, for example, NAL units need to be parsed. Associated by av_find_stream_info()
struct AVCodecParserContext *parser;
// Points to the context of the demultiplexed stream, such as MOVStreamContext for mp4
void *priv_data;
AVRational time_base;
// Used during seek to locate key frames quickly; for example, the keyframes index table of flv
// and the I-frame index table of mp4 are both stored here
AVIndexEntry *index_entries;
// Number of elements in index_entries
int nb_index_entries;
int index_entries_allocated_size;
double frame_last_delay;
} AVStream;

The AVStream structure represents the context of the current media stream. It holds the attributes common to all media streams (whose values can only be determined at run time) and the fields that link to other structures. The actx field links to the context of the codec used by the current audio or video stream; the priv_data field links to the context used to demultiplex each specific media stream; the key frame index table is also stored here.

10 MOVStreamContext structure

typedef struct MOVStreamContext {
// Index of the stream, 0 or 1
int ffindex;
// Temporary variable holding the number of the next chunk
int next_chunk;
// Number of chunks (in the mp4 file format, the value from stco must be the total number of chunks)
unsigned int chunk_count;
// Array of chunk offsets in the file (the samples within each chunk are stored contiguously in the file); holds the stco table
int64_t *chunk_offsets;
// Number of stts elements
unsigned int stts_count;
// stts time table
MOVStts *stts_data;
// Number of ctts elements (used to correct timestamps when B-frames are present)
unsigned int ctts_count;
// ctts data table
MOVStts *ctts_data;
// Number of stsc elements (sample-to-chunk table)
unsigned int stsc_count;
// stsc data table
MOVStsc *stsc_data;
// Temporary variable recording the index of the ctts entry currently in use
int ctts_index;
// Index of the sample within the current ctts element
int ctts_sample;
// In the stsz table all samples may have the same size; if so, this value is used
unsigned int sample_size;
// Number of elements in stsz
unsigned int sample_count; // Number of samples
// stsz data table recording the size of each sample; not empty when sample_size is 0
int *sample_sizes;
// Number of elements in stss (key frame index table)
unsigned int keyframe_count;
// Key frame data table
int *keyframes;
// Number of dref elements, generally 1
unsigned drefs_count;
// dref data table
MOVDref *drefs;
//tkhd width
int width;
//tkhd height
int height;
} MOVStreamContext;

The MOVStreamContext structure stores the information read from the mov or mp4 header that is needed for demultiplexing.

11 AVPacket structure

typedef struct AVPacket {
// Presentation timestamp
int64_t pts;
// Decoding timestamp
int64_t dts;
// Byte position of the packet in the file or network stream
int64_t pos;
// Pointer to the actual data
uint8_t *data;
// Size of the actual data
int size;
// Index of the stream to which the packet belongs, usually 0 or 1
int stream_index;
int flags;
void(*destruct)(struct AVPacket*);
} AVPacket;

AVPacket represents a frame of compressed audio or video data. Its fields hold flags, timestamp information, the address of the compressed data, its size, and other information.

12 AVPacketList structure

typedef struct AVPacketList {
AVPacket pkt;
struct AVPacketList *next;
} AVPacketList;

Note: AVPacketList forms a small linked list of audio and video AVPackets.
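A typical use of such a list node is a first-in, first-out packet queue between the demuxing and decoding stages. The sketch below assumes hypothetical, reduced types (DemoPacket, DemoPacketList, DemoPacketQueue) and helpers (demo_queue_put, demo_queue_get); it is single-threaded and copies packets by value, so it illustrates only the list handling.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reduced packet and list node, mirroring the layout above. */
typedef struct DemoPacket { int stream_index; int size; } DemoPacket;

typedef struct DemoPacketList {
    DemoPacket pkt;
    struct DemoPacketList *next;
} DemoPacketList;

typedef struct DemoPacketQueue {
    DemoPacketList *first, *last;
    int nb_packets;
} DemoPacketQueue;

/* Append a copy of pkt at the tail. Returns 0 on success, -1 on failure. */
static int demo_queue_put(DemoPacketQueue *q, const DemoPacket *pkt)
{
    DemoPacketList *node;
    assert(q && pkt);
    node = malloc(sizeof(*node));
    if (!node)
        return -1;
    node->pkt = *pkt;
    node->next = NULL;
    if (q->last)
        q->last->next = node;
    else
        q->first = node;       /* queue was empty */
    q->last = node;
    q->nb_packets++;
    return 0;
}

/* Remove the head packet into *pkt. Returns 1 if a packet was returned,
 * 0 if the queue was empty. */
static int demo_queue_get(DemoPacketQueue *q, DemoPacket *pkt)
{
    DemoPacketList *node;
    assert(q && pkt);
    node = q->first;
    if (!node)
        return 0;
    q->first = node->next;
    if (!q->first)
        q->last = NULL;        /* queue became empty */
    q->nb_packets--;
    *pkt = node->pkt;
    free(node);
    return 1;
}
```

Keeping both first and last pointers makes appending and removing constant-time operations, which matters when packets arrive continuously.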

13 AVFrame structure

typedef struct AVFrame {
uint8_t *data[AV_NUM_DATA_POINTERS];
int linesize[AV_NUM_DATA_POINTERS];
uint8_t **extended_data;
/**Width and height */
int width, height;
int nb_samples;
int format;
/**Is it a keyframe*/
int key_frame;
/**Frame type (I,B,P)*/
enum AVPictureType pict_type;
uint8_t *base[AV_NUM_DATA_POINTERS];
AVRational sample_aspect_ratio;
int64_t pts;
int64_t pkt_pts;
int64_t pkt_dts;
int coded_picture_number;
int display_picture_number;
int quality;
int reference;
/** QP table */
int8_t *qscale_table;
int qstride;
int qscale_type;
/**Skip macroblock table */
uint8_t *mbskip_table;
/**Motion vector table*/
int16_t (*motion_val[2])[2];
/**Macroblock type table */
uint32_t *mb_type;
/**DCT coefficient */
short *dct_coeff;
/**Reference frame list */
int8_t *ref_index[2];
void *opaque;
uint64_t error[AV_NUM_DATA_POINTERS];
int type;
int repeat_pict;
int interlaced_frame;
int top_field_first;
int palette_has_changed;
int buffer_hints;
AVPanScan *pan_scan;
int64_t reordered_opaque;
void *hwaccel_picture_private;
struct AVCodecContext *owner;
void *thread_opaque;
/**
* log2 of the size of the block which a single vector in motion_val represents:
* (4->16x16, 3->8x8, 2-> 4x4, 1-> 2x2)
* - encoding: unused
* - decoding: Set by libavcodec.
*/
uint8_t motion_subsample_log2;
/** Audio sampling rate */
int sample_rate;
uint64_t channel_layout;
int64_t best_effort_timestamp;
int64_t pkt_pos;
int64_t pkt_duration;
AVDictionary *metadata;
int decode_error_flags;
int64_t channels;
} AVFrame;

Note: the AVFrame structure is generally used to store raw data (that is, uncompressed data such as YUV or RGB for video and PCM for audio), along with some related information. For example, the macroblock type table, QP table, motion vector table, and other data are stored during decoding, and the corresponding data is also stored during encoding. AVFrame is therefore a very important structure when using FFmpeg for bitstream analysis.
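When reading raw video out of a frame, data and linesize are used together: each row of a plane starts linesize bytes after the previous one, and linesize may be larger than the visible width because of padding. The sketch below sums the luma samples of the first plane under that assumption. DemoFrame and demo_luma_sum are hypothetical, reduced stand-ins that keep only the fields needed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_DATA_POINTERS 4

/* Hypothetical reduced frame: plane pointers, row strides, and dimensions. */
typedef struct DemoFrame {
    uint8_t *data[DEMO_NUM_DATA_POINTERS];
    int linesize[DEMO_NUM_DATA_POINTERS];
    int width, height;
} DemoFrame;

/* Sum the samples of plane 0 (luma in a planar YUV frame),
 * stepping by linesize so that row padding is skipped. */
static unsigned long demo_luma_sum(const DemoFrame *f)
{
    unsigned long sum = 0;
    assert(f && f->data[0]);
    for (int y = 0; y < f->height; y++) {
        const uint8_t *row = f->data[0] + (ptrdiff_t)y * f->linesize[0];
        for (int x = 0; x < f->width; x++)
            sum += row[x];
    }
    return sum;
}
```

Indexing with y * width instead of y * linesize[0] is a common mistake that produces skewed images whenever the rows are padded.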

Posted by TomT64 on Sat, 18 Sep 2021 09:22:28 -0700