Linux system programming - file IO

Keywords: C C++ Linux Operation & Maintenance Back-end

  The man pages are divided into nine sections. System programming (the system calls) is covered in section 2, and section 5 covers file formats and conventions

open function

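Function prototype

int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);

The pathname parameter is the file name

The flags parameter takes macros. The access mode is one of O_RDONLY (read only), O_WRONLY (write only) or O_RDWR (read and write); one of these three must be given

Further flags can be ORed in: O_APPEND (append), O_CREAT (create), O_EXCL (used with O_CREAT: fail if the file already exists), O_TRUNC (truncate: a regular file opened for writing is cut to length 0), O_NONBLOCK (non-blocking: after it is set, subsequent IO operations will not block)

The mode_t mode parameter is an octal permission value such as 0777. It is only used when a file is created

The return value is a file descriptor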

Examples

open opens a file

When the file already exists:

open with O_CREAT: the file is opened, not created

  open with O_TRUNC: the file is opened and truncated (emptied), not created

  The permission a file created by open ends up with is mode & ~umask (the requested permission ANDed with the inverse of the umask)

The leading 0 marks the number as octal and the remaining digits are the permissions. Take a umask of 0002: the first two permission digits are 0, so their inverse is all 1 bits, and any value ANDed with all 1 bits is itself

The last umask digit is 2, whose inverse is r-x. If the last digit of mode is 4, that is r--

        r-x & r-- = r--, that is, the created file gives other users read permission only
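The permission arithmetic above can be checked with a few lines of C++. This is only a minimal sketch: effective_mode is a hypothetical helper name introduced for illustration, and the umask values passed to it are examples, not values read from the running process.

```cpp
#include <sys/types.h>

// Hypothetical helper: the permission bits a newly created file ends up
// with when open() is called with `requested` under the given umask.
mode_t effective_mode(mode_t requested, mode_t mask)
{
    // Every bit set in the umask is cleared from the requested mode.
    return requested & ~mask & 0777;
}
```

For example, effective_mode(0664, 0002) leaves 0664 unchanged because the only bit the umask removes (write for others) was not requested, while effective_mode(0666, 0022) gives 0644.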

Errors from the open function

  If fd is -1, the call failed

  According to the documentation, on error open returns -1 and sets errno

We can pass errno to strerror(errno) to display the details
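A small sketch of turning errno into readable text after a failed open. The helper name describe_open is hypothetical and only serves the illustration; open, close and strerror are the standard calls.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper: try to open `path` read-only and describe the result.
// On failure the errno value is converted to text with strerror().
std::string describe_open(const char* path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
    {
        int saved = errno;  // copy errno before another call can overwrite it
        return std::string("open failed: ") + strerror(saved);
    }
    close(fd);
    return "open succeeded";
}
```

Calling describe_open on a path that does not exist returns a string that starts with "open failed: " followed by the system's message for that errno value.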

 

read and write functions

Function prototype

ssize_t read(int fd, void *buf, size_t count);
ssize_t write(int fd, const void *buf, size_t count);

read

  ssize_t is a signed integer type, fd is the file descriptor, buf is the buffer to read into, and count (a size_t, an unsigned integer) is the size of that buffer in bytes

On success the number of bytes read is returned. A return value of 0 means the end of the file has been reached

On failure -1 is returned and errno is set

write

The parameters are the same as for read. The buffer is const, meaning it is not changed during writing, and count is the number of bytes to write

On success the number of bytes written is returned (this may be less than count)

On failure -1 is returned and errno is set

Example

Implementing cp (copy)

cp aaa bbb (copy aaa to bbb)

#include <iostream>
#include <cstdio>   // perror
#include <cstdlib>  // exit
#include <unistd.h>
#include <fcntl.h>
#include <string>

using namespace std;

int main(int argc,char** argv)
{
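    // argv[0] is the program name; argv[1] is the source file and argv[2] the destination
    if(argc < 3)
    {
        cerr << "usage: " << argv[0] << " source destination" << endl;
        exit(1);
    }
    int fd = open(argv[1], O_RDONLY);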
    if(fd == -1)
    {
        // perror prints its argument followed by the message for the current errno
        // (cerr is just an ostream and does not add the errno text by itself)
        perror("int fd = open(argv[1], O_RDONLY);");
        exit(1);
    }
    // Create the file if it does not exist; truncate it if it does
    int fd2 = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0664);
    if(fd2 == -1)
    {
        perror("int fd2 = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0664);");
        exit(1);
    }

    ssize_t read_count;  // read returns ssize_t
    char buf[1024];

    // After reading successfully, the number of bytes read will be returned. After reading the end of the file, 0 will be returned. If - 1 is returned, it means failure
    while((read_count = read(fd,buf,sizeof(buf))) != 0)
    {
        if(read_count == -1)
        {
            perror("while((read_count = read(fd,buf,sizeof(buf))) != 0)");
            exit(1);
        }

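        // Write exactly as many bytes as were read. write may accept fewer
        // bytes than requested, so loop until all of them are written
        ssize_t written = 0;
        while(written < read_count)
        {
            ssize_t n = write(fd2, buf + written, read_count - written);
            if(n == -1)
            {
                perror("write(fd2, buf + written, read_count - written)");
                exit(1);
            }
            written += n;
        }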
    }

    // Write the close calls at the same time as the open calls, or you really will forget them!!!
    close(fd);
    close(fd2);

    return 0;
}

Review: reading and writing files in C++

#include <iostream>
#include <cstdlib>  // exit
#include <string>
#include <fstream>

using namespace std;

int main(int argc,char** argv)
{
	fstream fin(argv[1], ios::in);
	if (!fin)
	{
		cerr << "fstream fin(argv[1], ios::in)";
		exit(1);
	}

	// Created if the file does not exist, emptied if it does
	fstream fout(argv[2], ios::out);
	if (!fout)
	{
		cerr << "fstream fout(argv[2], ios::out)";
		exit(1);
	}

	string str;

	// getline keeps spaces but discards the line break, so endl puts it back
	while (getline(fin,str))
	{
		fout << str << endl;
	}

	fin.close();
	fout.close();

	return 0;
}

I won't actually demonstrate the difference here, mainly because my keyboard is uncomfortable and I don't feel like typing it all out

To state the conclusion directly: the C++ functions will be faster than the system functions here

Either way, the data is written to the disk by the kernel. The system function hands over 1024 bytes at a time, while the C++ function hands over 4096 bytes at a time. The system function relies only on the system-level buffer: every 1024 bytes goes straight to the kernel, whose default optimal IO size is 4k, so it writes to the disk once 4k has accumulated (this is not accurate, because I don't know exactly how the kernel writes to the disk, but that is the idea).

The C++ functions have their own cache, a user-level buffer: they collect a full 4k = 4096 bytes in that buffer, then give it to the kernel in one go, and the kernel writes it to the disk

The time saved on trips from user space into the kernel is why the C++ functions are faster than the system functions (of course, this is not unique to C++; the libraries of mainstream languages generally buffer in the same way)

System functions and library functions have different use scenarios; it is not simply a question of which is more efficient
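The user-level buffering idea described above can be sketched as a tiny class. This is an illustration under stated assumptions, not how any particular C++ library implements its streams: BufferedWriter, put, flush and the syscalls counter are hypothetical names, and the 4096-byte capacity is simply the figure used in the text (real library buffer sizes are implementation-dependent).

```cpp
#include <unistd.h>
#include <cstddef>
#include <cstring>

// Hypothetical user-level buffer: collects small writes and hands them
// to the kernel in chunks of up to kCapacity bytes.
class BufferedWriter
{
public:
    explicit BufferedWriter(int fd) : fd_(fd), used_(0), syscalls_(0) {}

    // Copy data into the buffer, flushing each time the buffer fills up.
    bool put(const char* data, size_t len)
    {
        while (len > 0)
        {
            size_t room = kCapacity - used_;
            size_t n = len < room ? len : room;
            memcpy(buf_ + used_, data, n);
            used_ += n;
            data += n;
            len -= n;
            if (used_ == kCapacity && !flush())
                return false;
        }
        return true;
    }

    // Hand everything buffered so far to the kernel.
    bool flush()
    {
        size_t done = 0;
        while (done < used_)
        {
            ssize_t n = write(fd_, buf_ + done, used_ - done);
            if (n == -1)
                return false;
            ++syscalls_;  // count each successful write() call
            done += static_cast<size_t>(n);
        }
        used_ = 0;
        return true;
    }

    int syscalls() const { return syscalls_; }

private:
    static const size_t kCapacity = 4096;  // figure taken from the text
    int fd_;
    size_t used_;
    int syscalls_;
    char buf_[kCapacity];
};
```

Feeding such a writer eight 1024-byte chunks needs far fewer write calls than issuing one write per chunk, which is the saving the text attributes to the library functions.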

File descriptor

  By default a process can typically have up to 1024 files open (descriptors 0-1023), and the system automatically hands out the smallest available file descriptor
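The "smallest available descriptor" rule can be observed with a short sketch. The helper name reopened_fd_matches is hypothetical, and the check assumes a single-threaded program in which nothing else opens or closes files between the two calls.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: open a file, close it, and open it again.
// Returns true when both calls handed out the same descriptor number,
// which is what the lowest-free-number rule predicts.
bool reopened_fd_matches(const char* path)
{
    int first = open(path, O_RDONLY);
    if (first == -1)
        return false;
    close(first);  // the number becomes free again

    int second = open(path, O_RDONLY);
    if (second == -1)
        return false;
    close(second);

    return first == second;
}
```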

Blocking, non blocking

Reads on regular files do not block; only network files and device files can block. /dev/tty is the terminal file and /dev is the device directory
Blocking is an attribute of the file, not of the function. File attributes can be modified

#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <iostream>
#include <cerrno>  
#include <cstring> 

using namespace std;

int main(int argc, char** argv)
{
	char buf[9];
	int read_count;

	// Read the standard input. If there is no input, it will cause blocking
    read_count = read(STDIN_FILENO, buf, 9);
	if (read_count == -1)
	{
		perror("if (read_count = read(STDIN_FILENO, buf, 9) == -1)");
		exit(1);
	}

	// Write standard output
	write(STDOUT_FILENO, buf, read_count);

	return 0;
}

  With no input, the program blocks at the read and stays there waiting

Non-blocking handling (an infinite loop, to demonstrate the effect)

When a device or network file is set to non-blocking, read returns -1 with errno set to EAGAIN or EWOULDBLOCK (the two values are the same on Linux), indicating that the file has no data yet, not that the call failed

#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <iostream>
#include <cerrno>  
#include <string.h>

using namespace std;

int main(int argc, char** argv)
{
	// /dev/tty is the terminal 
	int fd = open("/dev/tty", O_RDONLY | O_NONBLOCK);
	if (fd == -1)
	{
		perror("int fd = open(/dev/tty, O_RDONLY | O_NONBLOCK);");
		exit(1);
	}

	int read_count;
	char buf[9];

	while (true)
	{
		read_count = read(fd, buf, 9);
		if (read_count == -1 && errno == EAGAIN)
		{
			cout << "File has no data" << endl;
			sleep(5);
		}
		else if (read_count == -1 && errno != EAGAIN)
		{
			perror("read_count = read(fd, buf, 9);");
			exit(1);
		}
		else
		{
			write(STDOUT_FILENO, buf, read_count);
			close(fd);

			break;
		}
	}

	return 0;
}

Oh, by the way, the header list is carried over from one program to the next because I'm lazy. As I said before, the extra headers don't affect anything

  If read returns -1 and errno is EAGAIN, "File has no data" is displayed and the program sleeps for five seconds waiting for input. If there is still no input, the message is displayed again

When there is input, the data is displayed and the loop exits

If read returns -1 and errno is not EAGAIN, an error message is displayed and the program ends

Non-blocking handling (this only handles the situation, it does not solve the problem. The way to solve it is response mode, which is not discussed for now)

With non-blocking set as above, polling indefinitely is obviously not advisable. (With blocking, waiting indefinitely is even less advisable)

At this point we should consider why regular files have no need for a non-blocking option

A regular file, no matter how large, is eventually read to the end, whereas a read on a device (or network) file may wait forever (i.e. block)

The common way to deal with this is a timeout: wait for a certain time, and if there is still no data, stop waiting and carry on with subsequent operations

#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <iostream>
#include <cerrno>  
#include <string.h>

using namespace std;

int main(int argc, char** argv)
{
	// /dev/tty is the terminal 
	int fd = open("/dev/tty", O_RDONLY | O_NONBLOCK);
	if (fd == -1)
	{
		perror("int fd = open(/dev/tty, O_RDONLY | O_NONBLOCK);");
		exit(1);
	}

	int read_count;
	char buf[9];
	int count = 3;

	while (count)
	{
		read_count = read(fd, buf, 9);
		if (read_count == -1 && errno == EAGAIN)
		{
			cout << "File has no data" << endl;
			sleep(2);
			count--;  // only an empty read uses up an attempt, so a successful last try is not reported as a timeout
		}
		else if (read_count == -1 && errno != EAGAIN)
		{
			perror("read_count = read(fd, buf, 9);");
			exit(1);
		}
		else
		{
			write(STDOUT_FILENO, buf, read_count);
			close(fd);

			break;
		}
	}

	if (count == 0)
	{
		cout << "overtime" << endl;
	}

	return 0;
}

  fcntl (change the access control attributes of an open file)

int fcntl(int fd, int cmd, ... /* arg */ );

  This function is very powerful and covers a lot of ground. Here we only introduce two commands, F_GETFL and F_SETFL

The int cmd parameter is one of these two commands

On failure -1 is returned

On success F_GETFL returns a bitmap (that is, a table of binary bits in which each bit has its own meaning) holding the file status flags

#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <iostream>
#include <cerrno>  
#include <string.h>

using namespace std;

int main(int argc, char** argv)
{
	// Get file attributes (status)
	int flag = fcntl(STDIN_FILENO, F_GETFL);
	if (flag == -1)
	{
		perror("int flag = fcntl(STDIN_FILENO, F_GETFL);");
		exit(1);
	}

	// Set file properties
	flag |= O_NONBLOCK;  // Add non blocking state
	if (fcntl(STDIN_FILENO, F_SETFL, flag) == -1)  // test the return value of fcntl, not flag
	{
		perror("fcntl(STDIN_FILENO, F_SETFL, flag);");
		exit(1);
	}

	int read_count;
	char buf[9];
	int count = 3;

	while (count)
	{
		read_count = read(STDIN_FILENO, buf, 9);
		if (read_count == -1 && errno == EAGAIN)
		{
			cout << "File has no data" << endl;
			sleep(2);
			count--;  // only an empty read uses up an attempt, so a successful last try is not reported as a timeout
		}
		else if (read_count == -1 && errno != EAGAIN)
		{
			perror("read_count = read(STDIN_FILENO, buf, 9);");
			exit(1);
		}
		else
		{
			write(STDOUT_FILENO, buf, read_count);

			break;
		}
	}

	if (count == 0)
	{
		cout << "overtime" << endl;
	}

	return 0;
}

  The effect is the same

Bitwise OR

010 |= 001 gives 011: in a bitwise OR, a result bit is 1 whenever either operand has a 1 in that position
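The same bit operations can both set and clear O_NONBLOCK. The sketch below wraps the F_GETFL / F_SETFL pair in a hypothetical helper named set_nonblocking; the name and the bool interface are assumptions made for illustration.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: turn O_NONBLOCK on or off for fd while leaving
// every other status flag untouched. Returns false if fcntl() fails.
bool set_nonblocking(int fd, bool enable)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return false;

    if (enable)
        flags |= O_NONBLOCK;   // OR sets the bit
    else
        flags &= ~O_NONBLOCK;  // AND with the inverted mask clears the bit

    return fcntl(fd, F_SETFL, flags) != -1;
}
```

Reading the flags first and writing back the modified value matters: passing only O_NONBLOCK to F_SETFL would drop any other status flag, such as O_APPEND, that was already set.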

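lseek (modify the file offset)

off_t lseek(int fd, off_t offset, int whence);

  off_t is a signed type, so an offset has a direction (it can be negative)

offset: the offset

whence: the position the offset starts from

         SEEK_SET start of the file
         SEEK_CUR current position
         SEEK_END end of the file

On success returns the resulting offset measured from the start of the file

On failure returns -1 and sets errno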

#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <iostream>
#include <cerrno>  
#include <string.h>
#include <string>

using namespace std;

int main(int argc, char** argv)
{
	int fd = open(argv[1], O_RDWR | O_CREAT, 0664);
	if (fd == -1)
	{
		perror("int fd = open(argv[1], O_RDWR | O_CREAT);");
		exit(1);
	}

	char write_buf[1024] = "asfjljfkldjfkljf";

	//Write content to file
	write(fd, write_buf, strlen(write_buf));

	// Without this call nothing can be read back: reads and writes share the same file offset
	lseek(fd, 0, SEEK_SET);

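	ssize_t read_count;
	char read_buf[1024];

	// read file, leaving room for a terminating '\0'
	while ((read_count = read(fd, read_buf, sizeof(read_buf) - 1)) != 0)
	{
		if (read_count == -1)
		{
			perror("while ((read_count = read(fd, read_buf, sizeof(read_buf) - 1)) != 0)");
			exit(1);
		}
		read_buf[read_count] = '\0';  // read does not null-terminate the buffer
		cout << read_buf << endl;
	}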

	close(fd);

	return 0;
}

  After writing, the file offset is at the end of the data just written

A read then starts from that same position, the end of the file. Therefore, if the file offset is not reset, no data will be read

The reason is that carriage return (line feed) is not read

lseek can get the file size

#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <iostream>
#include <cerrno>  
#include <string.h>
#include <string>

using namespace std;

int main(int argc, char** argv)
{
	int fd = open("test.txt", O_RDWR | O_CREAT | O_TRUNC, 0664);
	if (fd == -1)
	{
		perror("int fd = open(test.txt, O_RDWR | O_CREAT);");
		exit(1);
	}

	char write_buf[1024] = "asfjljfkldjfkljf";

	//Write content to file
	write(fd, write_buf, strlen(write_buf));

	// lseek returns the new offset, so seeking to the end of the file returns the file size
	off_t file_size = lseek(fd, 0, SEEK_END);

	cout << "file size :" << file_size << endl;

	close(fd);

	return 0;
}

lseek can also expand files

#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <iostream>
#include <cerrno>  
#include <string.h>
#include <string>

using namespace std;

int main(int argc, char** argv)
{
	int fd = open("test.txt", O_RDWR);
	if (fd == -1)
	{
		perror("int fd = open(test.txt, O_RDWR);");
		exit(1);
	}

	// Seek relative to the end of the file; the offset is the number of bytes to expand by
	int file_size = lseek(fd, 24, SEEK_END);

	cout << "file size :" << file_size << endl;

	close(fd);

	return 0;
}

  However, this expansion is fake: lseek only moves the offset. To really expand the file, an IO operation (a write at the new offset) is needed
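A sketch of the real expansion: seek past the end and then write a single byte so the kernel actually grows the file. The helper names size_of and extend_by are hypothetical, introduced only for this illustration.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: size of the file at `path`, or -1 on error.
off_t size_of(const char* path)
{
    struct stat st;
    if (stat(path, &st) == -1)
        return -1;
    return st.st_size;
}

// Hypothetical helper: grow the file at `path` by `extra` bytes.
// Seeking past the end only moves the offset; the one byte written at
// the new position is what makes the file really grow.
bool extend_by(const char* path, off_t extra)
{
    if (extra <= 0)
        return false;

    int fd = open(path, O_WRONLY);
    if (fd == -1)
        return false;

    bool ok = lseek(fd, extra - 1, SEEK_END) != -1
              && write(fd, "", 1) == 1;  // writes a single '\0' byte
    close(fd);
    return ok;
}
```

The skipped region reads back as zero bytes. Seeking to extra - 1 and writing one byte makes the final size exactly the old size plus extra.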

truncate (extend a file to a given length)

#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <iostream>
#include <cerrno>  
#include <string.h>
#include <string>

using namespace std;

int main(int argc, char** argv)
{
	int flag = truncate("test.txt", 500);
	if (flag == -1)
	{
		perror("int flag = truncate(test.txt, 500);");
		exit(1);
	}

	return 0;
}

ftruncate (extend an already open file to a given length)

#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <iostream>
#include <cerrno>  
#include <string.h>
#include <string>

using namespace std;

int main(int argc, char** argv)
{
	int fd = open("test.txt", O_RDWR);
	if (fd == -1)
	{
		perror("int fd = open(test.txt, O_RDWR);");
		exit(1);
	}

	int flag = ftruncate(fd, 1000);
	if (flag == -1)
	{
		perror("int flag = ftruncate(fd, 1000);");
		exit(1);
	}

	close(fd);

	return 0;
}

 

  stat (get file attributes)

int stat(const char *pathname, struct stat *statbuf);

  struct stat* statbuf is an output parameter (as in C: to change a variable, pass a pointer to it; to change a pointer, pass a pointer to the pointer)

The structure holds the file's attributes, much of it the same information that the terminal command ll prints

  st_mode is used to determine what type of file it is

#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <iostream>
#include <cerrno>  
#include <string.h>
#include <string>

using namespace std;

int main(int argc, char** argv)
{
	struct stat stat_buf;

	// stat returns 0 or -1, not a file descriptor
	int ret = stat("test.txt", &stat_buf);
	if (ret == -1)
	{
		perror("int ret = stat(test.txt, &stat_buf);");
		exit(1);
	}

	cout << "file size:" << stat_buf.st_size << endl;

	// S_ISREG is a function-like macro whose result is a boolean (the system / library test macros basically all yield boolean values)
	if (S_ISREG(stat_buf.st_mode))
	{
		cout << "regular file" << endl;
	}

	// The same test written out by hand: a bitwise AND with the S_IFMT mask keeps only the file type bits (a bit ANDed with 1 is itself)
	if ((stat_buf.st_mode & S_IFMT) == S_IFREG)
	{
		cout << "regular file" << endl;
	}

	return 0;
}

S_IFMT is the file type mask. st_mode can be understood as a 16-bit bitmap: the top four bits represent the file type, the low nine bits represent the file permissions, and the three bits in between are the special permission bits (set-user-ID, set-group-ID and sticky). (This is a rough picture rather than an exact one, but it is the general idea!!!)
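To make the bitmap concrete, here is a sketch that picks st_mode apart with the standard mask macros. The helper names permission_string and is_directory are hypothetical; the S_I* constants are the standard ones from sys/stat.h.

```cpp
#include <sys/stat.h>
#include <string>

// Hypothetical helper: render the nine permission bits of a mode in the
// "rwxr-xr--" style that ls -l prints.
std::string permission_string(mode_t mode)
{
    const mode_t bits[9] = { S_IRUSR, S_IWUSR, S_IXUSR,
                             S_IRGRP, S_IWGRP, S_IXGRP,
                             S_IROTH, S_IWOTH, S_IXOTH };
    const char letters[] = "rwxrwxrwx";

    std::string out(9, '-');
    for (int i = 0; i < 9; ++i)
    {
        if (mode & bits[i])  // bit set: show the matching letter
            out[i] = letters[i];
    }
    return out;
}

// Hypothetical helper: true when the type field (mode & S_IFMT)
// says "directory".
bool is_directory(mode_t mode)
{
    return (mode & S_IFMT) == S_IFDIR;
}
```

For instance, permission_string(0754) gives "rwxr-xr--", and the type bits are ignored by it, so passing a full st_mode value works as well.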

  However, when the file is a symbolic link, stat directly reports the attributes of the file the link points to

stat follows (passes through) symbolic links by default, while lstat does not and reports on the link itself

The parameters of the lstat function are exactly the same as those of stat, so they are not described again
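The difference between the two calls shows up when both are applied to the same symbolic link. In this sketch the helper names is_symlink and resolves_to_directory are hypothetical, chosen only for the illustration.

```cpp
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: true when `path` itself is a symbolic link.
// lstat() is needed here; stat() would follow the link and describe
// the target instead.
bool is_symlink(const char* path)
{
    struct stat st;
    if (lstat(path, &st) == -1)
        return false;
    return S_ISLNK(st.st_mode);
}

// Hypothetical helper: true when `path`, after following any links,
// refers to a directory.
bool resolves_to_directory(const char* path)
{
    struct stat st;
    if (stat(path, &st) == -1)
        return false;
    return S_ISDIR(st.st_mode);
}
```

For a symbolic link that points at a directory, both helpers return true: lstat sees the link, stat sees the directory behind it.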

access (check whether a file exists or has certain permissions)

int access(const char *pathname, int mode);

  Returns 0 on success; returns -1 on failure and sets errno

mode
         R_OK readable
         W_OK writable
         X_OK executable
         F_OK file exists

#include <unistd.h>
#include <iostream>

using namespace std;

int main(int argc,char** argv)
{
    int flag = access("test.txt",F_OK);
    if(!flag)
    {
        cout << "test.txt exists" << endl;
    }

    return 0;
}

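  chmod (modify file access permissions)

int chmod(const char *pathname, mode_t mode);

  The mode parameter takes octal permission values, the same as the mode of open (for example 0664)

  Returns 0 on success; returns -1 on failure and sets errno

link (create a hard link)

int link(const char *oldpath, const char *newpath);

The parameters are two file names: the existing file and the new link

   Returns 0 on success; returns -1 on failure and sets errno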

unlink (delete the directory entry of a file)

Header file #include <unistd.h>

Prototype int unlink(const char *pathname);

Returns 0 on success
Returns -1 on failure and sets errno

A directory entry is a structure that records an inode number and a file name

Deleting a file under Linux means decrementing its link count (st_nlink) by 1 each time until it reaches 0. When no directory entry refers to the file any more and every process that had it open has closed it, the system releases it (exactly when is uncertain; that is the system's own mechanism). In fact, if a file is open and has not been closed, then even after the file is deleted the process can still access its contents through the open descriptor
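The "deleted but still open" behaviour can be seen in a short sketch. The helper name read_after_unlink is hypothetical, and the 256-byte buffer is an arbitrary choice that is large enough for a short demonstration string.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cstring>
#include <string>

// Hypothetical demo: create `path`, write `text`, unlink the name while
// the descriptor is still open, then read the data back through it.
// Returns the bytes read, or an empty string on any failure.
std::string read_after_unlink(const char* path, const char* text)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return "";

    size_t len = strlen(text);
    std::string result;
    char buf[256];

    if (write(fd, text, len) == static_cast<ssize_t>(len)
        && unlink(path) == 0              // the directory entry is gone now
        && lseek(fd, 0, SEEK_SET) == 0)   // rewind the shared file offset
    {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0)
            result.assign(buf, static_cast<size_t>(n));
    }

    close(fd);  // last reference dropped; the system can now release the file
    return result;
}
```

After the call the name no longer exists in the directory, yet the function still returns the text it wrote, because the data stayed reachable through the open descriptor until close.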

symlink (create a symbolic link)

Header file #include <unistd.h>

int symlink(const char *target, const char *linkpath);

Returns 0 on success
Returns -1 on failure and sets errno

The size of a symbolic link is the number of characters in the target path it stores

The readlink command can show where a symbolic link points

readlink (get the file name pointed to by the symbolic link)

#include <unistd.h>

ssize_t readlink(const char *pathname, char *buf, size_t bufsiz);

On success returns the number of bytes of the target file name placed in buf (no terminating '\0' is added)
Returns -1 on failure and sets errno
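Because readlink does not null-terminate the buffer, the returned byte count has to be used when building a string. A minimal sketch follows; link_target is a hypothetical helper name and the 4096-byte buffer is an assumed size, so longer targets would be silently truncated.

```cpp
#include <unistd.h>
#include <string>

// Hypothetical helper: return the target stored in a symbolic link,
// or an empty string on failure.
std::string link_target(const char* linkpath)
{
    char buf[4096];  // assumed to be large enough for this demonstration
    ssize_t n = readlink(linkpath, buf, sizeof(buf));
    if (n == -1)
        return "";

    // Build the string from exactly n bytes: buf has no terminating '\0'.
    return std::string(buf, static_cast<size_t>(n));
}
```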

rename (rename a file)

#include <stdio.h>

int rename(const char *oldpath, const char *newpath);

Returns 0 on success
Returns -1 on failure and sets errno

getcwd (get the current working directory)

#include <unistd.h>

char *getcwd(char *buf, size_t size);

On success returns a string pointer whose value is the same as buf
On failure returns NULL and sets errno

chdir (change the working directory of the current process)

#include <unistd.h>

int chdir(const char *path);

Returns 0 on success
Returns -1 on failure and sets errno
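getcwd and chdir are easiest to understand together. In this sketch current_dir is a hypothetical helper name and the 4096-byte buffer is an assumed size.

```cpp
#include <unistd.h>
#include <string>

// Hypothetical helper: the current working directory as a string,
// or an empty string if getcwd() fails (for example, buffer too small).
std::string current_dir()
{
    char buf[4096];  // assumed size for this demonstration
    if (getcwd(buf, sizeof(buf)) == nullptr)
        return "";
    return buf;      // getcwd null-terminates the path
}
```

After a successful chdir("/"), current_dir() returns "/". The change applies only to the calling process; the shell that started the program stays where it was.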

opendir (open a directory)

#include <sys/types.h>
#include <dirent.h>

DIR *opendir(const char *name);

On success returns a pointer to the directory structure
On failure returns NULL and sets errno

DIR is like FILE in C: just use the pointer, without needing to know the details inside

closedir (close a directory)

#include <sys/types.h>
#include <dirent.h>

int closedir(DIR *dirp);

Returns 0 on success
Returns -1 on failure and sets errno

readdir (read a directory)

#include <dirent.h>

struct dirent *readdir(DIR *dirp);

On success returns a pointer to a directory entry structure (once the loop has read every entry, NULL is returned and errno is not changed)
On failure returns NULL and sets errno

struct dirent
{
    ino_t d_ino;  // inode number
    off_t d_off;  // Offset
    unsigned short d_reclen;  // Length of this record
    unsigned char d_type;  // File type (opening the directory in vim shows it as a suffix such as @ * / etc.)
    char d_name[256];  // File name
};
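A sketch of the usual opendir / readdir / closedir loop, including the errno check that tells "end of directory" apart from an error. list_directory is a hypothetical helper name; the calls themselves are the ones documented above.

```cpp
#include <dirent.h>
#include <cerrno>
#include <string>
#include <vector>

// Hypothetical helper: collect the entry names of a directory.
// Returns false if the directory cannot be opened or readdir() fails.
bool list_directory(const char* path, std::vector<std::string>& names)
{
    DIR* dir = opendir(path);
    if (dir == nullptr)
        return false;

    struct dirent* entry;
    errno = 0;  // so that a NULL return can be classified afterwards
    while ((entry = readdir(dir)) != nullptr)
    {
        names.push_back(entry->d_name);
        errno = 0;
    }
    bool ok = (errno == 0);  // NULL with errno still 0 means end of directory

    closedir(dir);
    return ok;
}
```

The result normally includes the entries "." and "..", so callers that walk a directory tree recursively need to skip those two names.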

mkdir (create a directory)

#include <sys/stat.h>
#include <sys/types.h>

int mkdir(const char *pathname, mode_t mode);

Returns 0 on success
Returns -1 on failure and sets errno

mode is the permission of the new directory (octal, e.g. 0777)

rewinddir (reset the directory read position to the start)

#include <sys/types.h>
#include <dirent.h>

void rewinddir(DIR *dirp);

telldir (get the read position of a directory stream) / seekdir (set the position of the next directory read)

#include <dirent.h>

long telldir(DIR *dirp);

On success returns the current read position of the directory stream dirp (the offset of the next read position from the start)
Returns -1 on failure and sets errno

void seekdir(DIR *dirp, long loc);

loc is generally a value previously returned by telldir

Posted by OriginalBoy on Thu, 25 Nov 2021 17:04:16 -0800