Description
This program reads text from standard input, modifies it, and writes the result to standard output.
The program first reads a list of column numbers. The numbers come in pairs, each pair specifying a range of columns in the input line, and the list is terminated by a negative number. The program then reads each remaining input line, prints it, and prints the characters extracted from the selected column ranges.
Note: columns are numbered from zero, so the first column of each line is column 0.
Source Program
/*
** This program reads input lines from the standard input and prints
** each input line, followed by just some portions of the lines, to
** the standard output.
**
** The first input is a list of column numbers, which ends with a
** negative number. The column numbers are paired and specify
** ranges of columns from the input line that are to be printed.
** For example, 0 3 10 12 -1 indicates that only columns 0 through 3
** and columns 10 through 12 will be printed.
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_COLS 20     /* max # of columns to process */
#define MAX_INPUT 1000  /* max len of input & output lines */

int read_column_numbers(int columns[], int max);
void rearrange(char* output, char const* input, int n_columns, int const columns[]);

int main(void)
{
    int n_columns;           /* # of columns to process */
    int columns[MAX_COLS];   /* the columns to process */
    char input[MAX_INPUT];   /* array for input line */
    char output[MAX_INPUT];  /* array for output line */

    /*
    ** Read the list of column numbers
    */
    n_columns = read_column_numbers(columns, MAX_COLS);

    /*
    ** Read, process and print the remaining lines of input.
    */
    while (gets(input) != NULL) {
        printf("Original input : %s\n", input);
        rearrange(output, input, n_columns, columns);
        printf("Rearranged line: %s\n", output);
    }

    return EXIT_SUCCESS;
}

/*
** Read the list of column numbers, ignoring any beyond the specified
** maximum.
*/
int read_column_numbers(int columns[], int max)
{
    int num = 0;
    int ch;

    /*
    ** Get the numbers, stopping at eof or when a number is < 0.
    */
    while (num < max && scanf("%d", &columns[num]) == 1 && columns[num] >= 0)
        num += 1;

    /*
    ** Make sure we have an even number of inputs, as they are
    ** supposed to be paired.
    */
    if (num % 2 != 0) {
        puts("Last column number is not paired.");
        exit(EXIT_FAILURE);
    }

    /*
    ** Discard the rest of the line that contained the final
    ** number.
    */
    while ((ch = getchar()) != EOF && ch != '\n')
        ;

    return num;
}

/*
** Process a line of input by concatenating the characters from
** the indicated columns. The output line is then NUL terminated.
*/
void rearrange(char* output, char const* input, int n_columns, int const columns[])
{
    int col;         /* subscript for columns array */
    int output_col;  /* output column counter */
    int len;         /* length of input line */

    len = strlen(input);
    output_col = 0;

    /*
    ** Process each pair of column numbers.
    */
    for (col = 0; col < n_columns; col += 2) {
        int nchars = columns[col + 1] - columns[col] + 1;

        /*
        ** If the input line isn't this long or the output
        ** array is full, we're done.
        */
        if (columns[col] >= len || output_col == MAX_INPUT - 1)
            break;

        /*
        ** If there isn't room in the output array, only copy
        ** what will fit.
        */
        if (output_col + nchars > MAX_INPUT - 1)
            nchars = MAX_INPUT - output_col - 1;

        /*
        ** Copy the relevant data.
        */
        strncpy(output + output_col, input + columns[col], nchars);
        output_col += nchars;
    }

    output[output_col] = '\0';
}
Interpretation
- EXIT_SUCCESS and EXIT_FAILURE
stdlib.h defines the symbols EXIT_SUCCESS and EXIT_FAILURE.
- Keyword const
int read_column_numbers(int columns[], int max);
void rearrange(char* output, char const* input, int n_columns, int const columns[]);
The second and fourth parameters of the rearrange function are declared const, which means that the function will not modify the data the caller passes through these two parameters.
In C, array arguments behave as if they were passed by reference (call by address), while scalars and constants are passed by value (comparable to var parameters and value parameters, respectively, in Pascal and Modula).
Any change a function makes to a scalar parameter is lost when the function returns, so the called function cannot modify an argument that was passed to it by value. When the called function modifies an element of an array parameter, however, it actually modifies the array that the caller passed.
Therefore, when rearrange is called in this program, the array passed as output is actually modified, while the array passed as input remains unchanged because of the const keyword.
- gets function
/*
** Read, process and print the remaining lines of input.
*/
while (gets(input) != NULL) {
    printf("Original input : %s\n", input);
    rearrange(output, input, n_columns, columns);
    printf("Rearranged line: %s\n", output);
}

return EXIT_SUCCESS;
The gets function reads a line of text from standard input and stores it in the array passed to it as an argument. A line of input is a sequence of characters terminated by a newline. gets discards the newline character and stores a NUL byte at the end of the line [1] (a NUL byte is a byte whose bits are all 0, written as the character constant '\0'). gets then returns a non-NULL value to indicate that a line was read successfully [2]. When gets is called but there is no more input, it returns NULL, indicating that the end of the input (end of file) has been reached.
Although C does not have a "string" data type, one convention is followed throughout the language:
A string is a sequence of characters terminated by a NUL byte.
The NUL is the string's terminator and is not considered part of the string itself.
For example, the string constant
"Hello"
occupies 6 bytes of memory: H, e, l, l, o and NUL.
- scanf function
/*
** Get the numbers, stopping at eof or when a number is < 0.
*/
while (num < max && scanf("%d", &columns[num]) == 1 && columns[num] >= 0)
    num += 1;
The scanf function takes several arguments. The first is a format string that describes the expected input. The remaining arguments are the addresses of the variables in which the function stores the data it reads.
The return value of scanf is the number of values that were successfully converted and stored through its arguments.
Because of the way scanf works, every scalar argument must be preceded by an "&".
Array arguments do not need a leading "&" [3].
However, if the array argument has a subscript, that is, if the actual argument is a specific element of the array, it must be preceded by an "&".
- getchar function
/*
** Discard the rest of the line that contained the final
** number.
*/
while ((ch = getchar()) != EOF && ch != '\n')
    ;
When scanf converts an input value, it reads only the characters it needs. The rest of the input line that contained the last value therefore remains unread. It may contain just the terminating newline, or it may contain other characters as well. Either way, the while loop reads and discards these leftover characters so that they are not interpreted as the first line of data.
This expression is discussed below:
(ch = getchar()) != EOF && ch != '\n'
First, getchar reads one character from standard input and returns its value. If there are no more characters in the input, it returns the constant EOF (defined in stdio.h) to signal the end of the file.
The value returned by getchar is assigned to the variable ch, which is then compared with EOF. If ch equals EOF, the whole expression is false and the loop terminates. Otherwise, ch is compared with the newline character, and if they are equal the loop also terminates.
The expression is therefore true (and the loop keeps running) only when the end of the file has not been reached and the character read is not a newline. In this way the loop discards whatever remains of the current input line.
Why is ch declared as an integer when it is used to read characters?
The answer is that EOF is an integer value that needs more bits than a character type provides, which prevents a character read from the input from being accidentally interpreted as EOF. It also means that ch, which receives each character, must be large enough to hold EOF, and that is why ch is an integer.
Characters are just small integers, so storing character values in an integer variable causes no problems.
- Array pointer parameter
/*
** Process a line of input by concatenating the characters from
** the indicated columns. The output line is then NUL terminated.
*/
void rearrange(char* output, char const* input, int n_columns, int const columns[])
{
    int col;         /* subscript for columns array */
    int output_col;  /* output column counter */
    int len;         /* length of input line */
When an array name is used as an argument, what is passed to the function is actually a pointer to the start of the array, that is, the address of the array in memory. Because a pointer is passed rather than a copy of the array, an array argument has call-by-address semantics. The function can manipulate the argument as a pointer, or it can use subscripts to refer to the array's elements just as it would with an array name.
Because of these call-by-address semantics, however, a function that modifies elements of the parameter array actually modifies the corresponding elements of the caller's array. Declaring columns as const in the example program therefore serves two purposes. First, it states the author's intention that the function will not modify this parameter. Second, it makes the compiler verify that this intention is not violated. The caller therefore need not worry that the elements of the array passed as the fourth argument will be modified.
- strncpy function
/*
** Copy the relevant data.
*/
strncpy(output + output_col, input + columns[col], nchars);
output_col += nchars;
The strncpy function copies the selected characters from the input line to the next available position in the output line. Its first two arguments are the addresses of the destination and the source, respectively. In this call, the destination is the address output_col columns past the start of the output array, and the source is the address columns[col] characters past the start of the input array. The third argument specifies the number of characters to copy [4]. The output column counter is then advanced by nchars.
}
output[output_col] = '\0';
When the loop ends, the output string is terminated with a NUL character.
[1] NUL is the name of the '\0' character in the ASCII character set; its bits are all 0. NULL is a pointer whose value is 0. Both have the value zero, so they could be used interchangeably. However, you should still use the appropriate constant, because it tells the reader not only that the value 0 is being used but also what it is being used for.
[2] The symbol NULL is defined in the header file stdio.h. On the other hand, there is no predefined symbol NUL, so if you want to use it instead of the character constant '\0', you have to define it yourself.
[3] Adding an "&" in front of an array name generally does no harm, however, so you may include it if you like.
[4] If the source string contains fewer characters than the count given by the third argument, the remaining bytes of the destination are filled with NUL bytes.