This note is excerpted from https://www.cnblogs.com/liqingwen/p/5816051.html and records my learning process for future reference.
Many file system operations are essentially queries, so LINQ is a good fit for them.
I. Query files with specified attributes or names
This example shows how to find all files in a specified directory tree that have a specified file extension (for example, .txt), and how to return the newest or oldest file in the tree based on its creation time.
using System;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: query for files with a specified attribute or name

        // Root folder to search
        const string path = @"C:\Program Files (x86)\Microsoft Visual Studio\2017\";

        // Take a snapshot of the file system
        var dir = new DirectoryInfo(path);
        // This method assumes that the application has search permission for all folders under the specified path
        var files = dir.GetFiles("*.*", SearchOption.AllDirectories);

        // Create the query
        var query = from file in files
                    where file.Extension == ".txt"
                    orderby file.Name
                    select file;

        // Execute the query
        foreach (var file in query)
        {
            Console.WriteLine(file.FullName);
        }

        // Create and execute a new query, using the previous query as a starting point.
        // Last() picks the final element: the sequence is sorted by creation time
        // in ascending order, so the newest file comes last.
        var newestFile = (from file in query
                          orderby file.CreationTime
                          select new { file.FullName, file.CreationTime }).Last();

        Console.WriteLine($"\r\nThe newest .txt file is {newestFile.FullName}. Creation time: {newestFile.CreationTime}");

        Console.Read();
        #endregion
    }
}
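The listing above assumes that the application may read every folder under the root. Where that does not hold, one possible workaround is to walk the tree manually and skip folders that cannot be read. The sketch below is only an illustration: `SafeFileWalker` and `EnumerateFilesSafe` are hypothetical names that do not appear in the original article, and the demo builds a throwaway folder under the temp directory so that it can run anywhere.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class SafeFileWalker
{
    // Hypothetical helper: lazily yields every file under root and silently
    // skips folders that cannot be read or that vanished during the walk.
    public static IEnumerable<FileInfo> EnumerateFilesSafe(DirectoryInfo root)
    {
        var pending = new Stack<DirectoryInfo>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            var current = pending.Pop();
            FileInfo[] files;
            DirectoryInfo[] subDirs;
            try
            {
                files = current.GetFiles();
                subDirs = current.GetDirectories();
            }
            catch (UnauthorizedAccessException) { continue; }
            catch (DirectoryNotFoundException) { continue; }

            foreach (var file in files) yield return file;
            foreach (var subDir in subDirs) pending.Push(subDir);
        }
    }

    static void Main()
    {
        // Throwaway sample tree (assumption: the temp directory is writable).
        var root = Directory.CreateDirectory(
            Path.Combine(Path.GetTempPath(), Path.GetRandomFileName()));
        try
        {
            File.WriteAllText(Path.Combine(root.FullName, "b.txt"), "b");
            File.WriteAllText(Path.Combine(root.FullName, "notes.log"), "log");
            var sub = root.CreateSubdirectory("sub");
            File.WriteAllText(Path.Combine(sub.FullName, "a.txt"), "a");

            // Same query shape as in the article, fed by the tolerant walker.
            var query = from file in EnumerateFilesSafe(root)
                        where file.Extension == ".txt"
                        orderby file.Name
                        select file.Name;

            foreach (var name in query) Console.WriteLine(name);
        }
        finally
        {
            root.Delete(true);
        }
    }
}
```

Because the walker is an iterator, files are produced as the query consumes them instead of being collected into one large array first.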
II. Group files by extension
This example demonstrates how to use LINQ to perform advanced grouping and sorting operations on a list of files or folders. It also demonstrates how to page the output in the console window by using the Skip<TSource> and Take<TSource> methods.
The following query shows how to group the contents of a specified directory tree by file extension.
using System;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: group files by extension

        const string path = @"C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\Common7\";

        // Length of "path"; used later to strip the "path" prefix from the output.
        var trimLength = path.Length;

        // Take a snapshot of the file system
        var dir = new DirectoryInfo(path);
        // This method assumes that the application has search permission for all folders under the specified path.
        var files = dir.GetFiles("*.*", SearchOption.AllDirectories);

        // Create the query
        var query = from file in files
                    group file by file.Extension.ToLower() into fileGroup
                    orderby fileGroup.Key
                    select fileGroup;

        // Show one group at a time. If a group has more rows than the console window, the output is paged.
        PageOutput(trimLength, query);
        #endregion
    }

    /// <summary>
    /// Paging output
    /// </summary>
    /// <param name="rootLength">Number of leading characters to strip from each full path</param>
    /// <param name="query">The grouped files to display</param>
    private static void PageOutput(int rootLength, IOrderedEnumerable<IGrouping<string, FileInfo>> query)
    {
        // Flag used to break out of the paging loop
        var isAgain = true;
        // Number of rows that fit in the console window
        var numLines = Console.WindowHeight - 3;

        // Iterate over the groups
        foreach (var g in query)
        {
            var currentLine = 0;

            do
            {
                Console.Clear();
                Console.WriteLine(string.IsNullOrEmpty(g.Key) ? "[None]" : g.Key);

                // Starting at "currentLine", display "numLines" rows
                var resultPage = g.Skip(currentLine).Take(numLines);

                // Execute the query
                foreach (var info in resultPage)
                {
                    Console.WriteLine("\t{0}", info.FullName.Substring(rootLength));
                }

                // Record how many rows have been shown
                currentLine += numLines;

                Console.WriteLine("Press any key to continue, or press End to exit");

                // Let the user choose whether to stop
                var key = Console.ReadKey().Key;
                if (key != ConsoleKey.End) continue;

                isAgain = false;
                break;
            } while (currentLine < g.Count());

            if (!isAgain)
            {
                break;
            }
        }
    }
}
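The paging in PageOutput relies on combining Skip and Take. Isolated from the console handling, the idea can be sketched as below. `PagingSketch` and `Pages` are hypothetical names introduced only for this illustration, and the data is an in-memory list rather than real files.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PagingSketch
{
    // Hypothetical helper: splits a sequence into pages of at most pageSize items.
    public static IEnumerable<List<T>> Pages<T>(IEnumerable<T> source, int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));

        var items = source.ToList(); // materialize once so Skip does not re-run the source
        for (var start = 0; start < items.Count; start += pageSize)
        {
            // Skip the rows already shown, then take one page worth of rows.
            yield return items.Skip(start).Take(pageSize).ToList();
        }
    }

    static void Main()
    {
        var names = Enumerable.Range(1, 7).Select(i => $"file{i}.txt");

        var pageNumber = 1;
        foreach (var page in Pages(names, 3))
        {
            Console.WriteLine($"Page {pageNumber++}: {string.Join(", ", page)}");
        }
    }
}
```

With seven items and a page size of three, the loop produces three pages holding three, three and one item.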
III. Query the total number of bytes in a group of folders
This example shows how to retrieve the total number of bytes used by all files in a specified folder and all its subfolders.
The Sum method adds up the values of all items selected in the select clause. You can easily modify this query to retrieve the largest or smallest file in the specified directory tree by calling the Min<TSource> or Max<TSource> method instead of Sum.
using System;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: query the total number of bytes in a set of folders

        const string path = @"C:\Program Files (x86)\Microsoft Visual Studio\2017\";

        var dir = new DirectoryInfo(path);
        var files = dir.GetFiles("*.*", SearchOption.AllDirectories);

        var query = from file in files
                    select file.Length;

        // Cache the results to avoid accessing the file system multiple times
        var fileLengths = query as long[] ?? query.ToArray();
        // Size of the largest file
        var largestLength = fileLengths.Max();
        // Total number of bytes in all files under the specified folder
        var totalBytes = fileLengths.Sum();

        Console.WriteLine();
        Console.WriteLine("There are {0} bytes in {1} files under {2}", totalBytes, files.Length, path);
        Console.WriteLine("The largest file is {0} bytes.", largestLength);

        Console.Read();
        #endregion
    }
}
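As a small illustration of swapping Sum for Min or Max, the sketch below computes all three over the same cached array. `SizeStats` and `Summarize` are hypothetical names, the lengths are made-up sample values rather than real file sizes, and the tuple return type assumes C# 7 or later. One detail worth knowing: Min and Max throw InvalidOperationException on an empty sequence of non-nullable values, while Sum returns 0, so the empty case is handled explicitly here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SizeStats
{
    // Hypothetical helper: total, smallest and largest value of a list of file lengths.
    public static (long Total, long Smallest, long Largest) Summarize(IEnumerable<long> lengths)
    {
        var cached = lengths.ToArray(); // enumerate the source only once
        if (cached.Length == 0) return (0, 0, 0); // Min/Max would throw on an empty array

        return (cached.Sum(), cached.Min(), cached.Max());
    }

    static void Main()
    {
        // Sample values standing in for "from file in files select file.Length".
        var stats = Summarize(new long[] { 120, 4096, 15 });

        Console.WriteLine("Total: {0} bytes", stats.Total);       // 4231
        Console.WriteLine("Smallest: {0} bytes", stats.Smallest); // 15
        Console.WriteLine("Largest: {0} bytes", stats.Largest);   // 4096
    }
}
```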
IV. Compare the contents of two folders
This example demonstrates three ways to compare two file lists:
1. A query that returns a Boolean value indicating whether the two file lists are identical.
2. A query that retrieves the intersection, that is, the files that exist in both folders.
3. A query that retrieves the set difference, that is, the files that exist in one folder but not in the other.
using System;
using System.Collections.Generic;
using System.IO;

/// <summary>
/// Compares files by name and by length in bytes
/// </summary>
public class FileComparer : IEqualityComparer<FileInfo>
{
    public bool Equals(FileInfo x, FileInfo y)
    {
        return string.Equals(x.Name, y.Name, StringComparison.OrdinalIgnoreCase) && x.Length == y.Length;
    }

    // Returns a hash value. According to the IEqualityComparer rules, objects that are equal must also have equal hash values.
    // Because equality here is defined by value rather than by reference identity, two or more distinct objects may produce the same hash value.
    // The name is upper-cased first so that the hash agrees with the case-insensitive comparison in Equals.
    public int GetHashCode(FileInfo obj)
    {
        var s = string.Format("{0}{1}", obj.Name.ToUpperInvariant(), obj.Length);
        return s.GetHashCode();
    }
}
The FileComparer class shown here demonstrates how to use a custom comparer class with the standard query operators. This class is not designed for real-world scenarios: it only uses the name and the length in bytes of each file to decide whether the contents of the two folders are the same. In practice, the comparer should be modified to perform a stricter equality check.
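The three comparisons listed at the start of this section could be written roughly as follows, using SequenceEqual, Intersect and Except with a custom comparer. This is a sketch under stated assumptions rather than the article's original listing: `NameAndLengthComparer`, `FolderCompareSketch` and `MakeFolder` are hypothetical names, and the two folders are throwaway directories created under the temp directory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical comparer: two files are equal when name (ignoring case) and length match.
sealed class NameAndLengthComparer : IEqualityComparer<FileInfo>
{
    public bool Equals(FileInfo x, FileInfo y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return string.Equals(x.Name, y.Name, StringComparison.OrdinalIgnoreCase) && x.Length == y.Length;
    }

    public int GetHashCode(FileInfo obj)
    {
        return StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Name) ^ obj.Length.GetHashCode();
    }
}

static class FolderCompareSketch
{
    // Hypothetical helper: creates a temp folder with one small file per name.
    public static DirectoryInfo MakeFolder(params string[] fileNames)
    {
        var dir = Directory.CreateDirectory(
            Path.Combine(Path.GetTempPath(), Path.GetRandomFileName()));
        foreach (var name in fileNames)
        {
            File.WriteAllText(Path.Combine(dir.FullName, name), name);
        }
        return dir;
    }

    static void Main()
    {
        var dirA = MakeFolder("a.txt", "b.txt", "c.txt");
        var dirB = MakeFolder("b.txt", "c.txt", "d.txt");
        try
        {
            var listA = dirA.GetFiles("*.*", SearchOption.AllDirectories);
            var listB = dirB.GetFiles("*.*", SearchOption.AllDirectories);
            var comparer = new NameAndLengthComparer();

            // 1. Are the two lists identical? Sort first, because SequenceEqual is order-sensitive.
            var same = listA.OrderBy(f => f.Name).SequenceEqual(listB.OrderBy(f => f.Name), comparer);
            Console.WriteLine("Identical: {0}", same);

            // 2. Files present in both folders.
            var common = listA.Intersect(listB, comparer).Select(f => f.Name).OrderBy(n => n);
            Console.WriteLine("In both: {0}", string.Join(", ", common));

            // 3. Files present in A but not in B.
            var onlyInA = listA.Except(listB, comparer).Select(f => f.Name).OrderBy(n => n);
            Console.WriteLine("Only in A: {0}", string.Join(", ", onlyInA));
        }
        finally
        {
            dirA.Delete(true);
            dirB.Delete(true);
        }
    }
}
```

With the sample folders above, the lists are not identical, b.txt and c.txt are common to both, and a.txt exists only in the first folder.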
V. Query the largest file in the directory tree
This example demonstrates four queries related to file size in bytes:
1. How to retrieve the maximum file size (in bytes).
2. How to retrieve the minimum file size (in bytes).
3. How to retrieve the FileInfo object of the largest or smallest file from one or more folders under a specified root folder.
4. How to retrieve a sequence, such as 10 largest files.
The following example contains several queries that demonstrate how to query files based on their size in bytes. You can easily modify these examples to base the query on some other property of the FileInfo object.
using System;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: query the largest file in the directory tree

        const string path = @"C:\Program Files (x86)\Microsoft Visual Studio\2017\";

        var dir = new DirectoryInfo(path);
        var files = dir.GetFiles("*.*", SearchOption.AllDirectories);

        var query1 = from file in files
                     select file.Length;

        // Size of the largest file
        var maxSize = query1.Max();
        Console.WriteLine("The length of the largest file under {0} is {1}", path, maxSize);
        Console.WriteLine();

        // Sort by length in descending order, ignoring empty files
        var query2 = from file in files
                     let len = file.Length
                     where len > 0
                     orderby len descending
                     select file;

        var fileInfos = query2 as FileInfo[] ?? query2.ToArray();
        // The first element in descending order is the largest file
        var longestFile = fileInfos.First();
        // The last element in descending order is the smallest file
        var smallestFile = fileInfos.Last();

        Console.WriteLine("The largest file under {0} is {1} with a length of {2} bytes", path, longestFile.FullName, longestFile.Length);
        Console.WriteLine();
        Console.WriteLine("The smallest file under {0} is {1} with a length of {2} bytes", path, smallestFile.FullName, smallestFile.Length);
        Console.WriteLine();
        Console.WriteLine("===== The 10 largest files under {0} are: =====", path);

        // Take the 10 largest files
        var queryTenLargest = fileInfos.Take(10);
        foreach (var file in queryTenLargest)
        {
            Console.WriteLine("{0}: {1} bytes", file.FullName, file.Length);
        }

        Console.Read();
        #endregion
    }
}
To return one or more complete FileInfo objects, the query must first examine each object in the data source and then sort them by the value of their Length property, so that it can return the single object or the sequence with the greatest length. Use First<TSource> to return the first element in the list; use Take<TSource> to return the first n elements.
VI. Query duplicate files in the directory tree
Sometimes, files with the same name may exist in multiple folders. For example, in the Visual Studio installation folder, there are multiple folders that contain the readme.htm file.
This example shows how to query for duplicate file names in the specified root folder.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: example 1 of querying duplicate files in the directory tree

        const string path = @"C:\Program Files (x86)\Microsoft Visual Studio\2017\";

        var dir = new DirectoryInfo(path);
        var files = dir.GetFiles("*.*", SearchOption.AllDirectories);

        // Number of leading characters to strip from each full path
        var charsToSkip = path.Length;

        var queryDupNames = (from file in files
                             group file.FullName.Substring(charsToSkip) by file.Name into fileGroup
                             where fileGroup.Count() > 1
                             select fileGroup).Distinct();

        PageOutput(queryDupNames);
        #endregion
    }

    /// <summary>
    /// Paging output
    /// </summary>
    /// <typeparam name="TK">Type of the group key</typeparam>
    /// <typeparam name="TV">Type of the grouped values</typeparam>
    /// <param name="queryDupNames">The groups to display</param>
    private static void PageOutput<TK, TV>(IEnumerable<IGrouping<TK, TV>> queryDupNames)
    {
        // Number of rows that fit in the console window
        var numLines = Console.WindowHeight - 3;

        var dupNames = queryDupNames as IGrouping<TK, TV>[] ?? queryDupNames.ToArray();
        foreach (var queryDupName in dupNames)
        {
            // Start paging
            var currentLine = 0;

            do
            {
                Console.Clear();
                Console.WriteLine("Filename = {0}", queryDupName.Key.ToString() == string.Empty ? "[none]" : queryDupName.Key.ToString());

                // Skip currentLine rows, then take numLines rows
                var resultPage = queryDupName.Skip(currentLine).Take(numLines);

                foreach (var fileName in resultPage)
                {
                    Console.WriteLine("\t{0}", fileName);
                }

                // Record the number of rows displayed so far
                currentLine += numLines;

                // Pressing a key for every page gets tedious, so advance to the next page automatically.
                Thread.Sleep(100);
            } while (currentLine < queryDupName.Count());
        }
    }
}
The following example shows how to query for files whose name, size and creation time all match.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;

/// <summary>
/// Composite grouping key. Equals and GetHashCode are overridden so that two keys
/// with the same values are treated as the same group; with the default reference
/// equality, every file would end up in a group of its own.
/// </summary>
public class PortableKey
{
    public string Name { get; set; }
    public DateTime CreationTime { get; set; }
    public long Length { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as PortableKey;
        return other != null && Name == other.Name && CreationTime == other.CreationTime && Length == other.Length;
    }

    public override int GetHashCode()
    {
        return string.Format("{0}{1}{2}", Name, CreationTime, Length).GetHashCode();
    }

    public override string ToString()
    {
        return string.Format("{0} {1} {2}", Name, Length, CreationTime);
    }
}

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: example 2 of querying duplicate files in the directory tree

        const string path = @"C:\Program Files (x86)\Microsoft Visual Studio\2017\";

        var dir = new DirectoryInfo(path);
        var files = dir.GetFiles("*.*", SearchOption.AllDirectories);
        var charsToSkip = path.Length;

        // Note the use of a composite key. Files whose three attributes all match belong to the same group.
        // Anonymous types can also be used for composite keys, but they cannot cross method boundaries.
        var queryDupFiles = from file in files
                            group file.FullName.Substring(charsToSkip) by
                                new PortableKey() { Name = file.Name, CreationTime = file.CreationTime, Length = file.Length }
                            into fileGroup
                            where fileGroup.Count() > 1
                            select fileGroup;

        var queryDupNames = queryDupFiles as IGrouping<PortableKey, string>[] ?? queryDupFiles.ToArray();

        // Paging output
        PageOutput(queryDupNames);

        Console.Read();
        #endregion
    }

    /// <summary>
    /// Paging output
    /// </summary>
    /// <typeparam name="TK">Type of the group key</typeparam>
    /// <typeparam name="TV">Type of the grouped values</typeparam>
    /// <param name="queryDupNames">The groups to display</param>
    private static void PageOutput<TK, TV>(IEnumerable<IGrouping<TK, TV>> queryDupNames)
    {
        // Number of rows that fit in the console window
        var numLines = Console.WindowHeight - 3;

        var dupNames = queryDupNames as IGrouping<TK, TV>[] ?? queryDupNames.ToArray();
        foreach (var queryDupName in dupNames)
        {
            // Start paging
            var currentLine = 0;

            do
            {
                Console.Clear();
                Console.WriteLine("Filename = {0}", queryDupName.Key.ToString() == string.Empty ? "[none]" : queryDupName.Key.ToString());

                // Skip currentLine rows, then take numLines rows
                var resultPage = queryDupName.Skip(currentLine).Take(numLines);

                foreach (var fileName in resultPage)
                {
                    Console.WriteLine("\t{0}", fileName);
                }

                // Record the number of rows displayed so far
                currentLine += numLines;

                // Pressing a key for every page gets tedious, so advance to the next page automatically.
                Thread.Sleep(100);
            } while (currentLine < queryDupName.Count());
        }
    }
}
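The comment about composite keys can be illustrated without a dedicated key class. In the sketch below, an anonymous type serves as the key inside a single method, and a value tuple (assuming C# 7 or later) serves as the key when the result has to be returned from a method. Both compare by value, so no Equals override is needed. `CompositeKeySketch`, `Entry` and `FindDuplicates` are hypothetical names, and the entries are in-memory stand-ins for FileInfo objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CompositeKeySketch
{
    // Hypothetical stand-in for FileInfo so the sketch needs no real files.
    public sealed class Entry
    {
        public string Path { get; set; } = "";
        public string Name { get; set; } = "";
        public long Length { get; set; }
    }

    // A value-tuple key compares by value and, unlike an anonymous type,
    // can appear in a method signature.
    public static Dictionary<(string Name, long Length), List<string>> FindDuplicates(IEnumerable<Entry> entries)
    {
        return entries
            .GroupBy(e => (Name: e.Name, Length: e.Length), e => e.Path)
            .Where(g => g.Count() > 1)
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    static void Main()
    {
        var entries = new List<Entry>
        {
            new Entry { Path = "docs/readme.htm", Name = "readme.htm", Length = 100 },
            new Entry { Path = "src/readme.htm",  Name = "readme.htm", Length = 100 },
            new Entry { Path = "old/readme.htm",  Name = "readme.htm", Length = 250 },
            new Entry { Path = "src/app.cs",      Name = "app.cs",     Length = 42 }
        };

        // Anonymous-type key: fine as long as the query stays inside this method.
        var byAnonymousKey = from e in entries
                             group e.Path by new { e.Name, e.Length } into g
                             where g.Count() > 1
                             select g;

        foreach (var g in byAnonymousKey)
        {
            Console.WriteLine("{0} ({1} bytes): {2}", g.Key.Name, g.Key.Length, string.Join(", ", g));
        }

        // Value-tuple key: the grouped result crosses the method boundary.
        Console.WriteLine("Duplicate groups: {0}", FindDuplicates(entries).Count);
    }
}
```

Only the two 100-byte readme.htm entries share both name and length, so each approach reports a single duplicate group.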
VII. Query the contents of files in folders
This example shows how to query all files in a specified directory tree, open each file, and check its contents. This type of technique can be used to index or reverse-index the contents of a directory tree. This example performs a simple string search, but more complex pattern matching can be performed using regular expressions.
using System;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        #region LINQ: query the contents of files in a folder

        const string path = @"C:\Program Files (x86)\Microsoft Visual Studio\2017\";

        var dir = new DirectoryInfo(path);
        var files = dir.GetFiles("*.*", SearchOption.AllDirectories);

        // String to match
        const string searchTerm = @"Visual Studio";

        // Search the contents of each file.
        // The Contains method can also be replaced with a regular expression.
        var queryMatchingFiles = from file in files
                                 where file.Extension == ".html"
                                 let content = GetFileContent(file.FullName)
                                 where content.Contains(searchTerm)
                                 select file.FullName;

        // Execute the query
        Console.WriteLine("The term \"{0}\" was found in:", searchTerm);
        foreach (var filename in queryMatchingFiles)
        {
            Console.WriteLine(filename);
        }

        Console.Read();
        #endregion
    }

    /// <summary>
    /// Reads the entire contents of a file
    /// </summary>
    /// <param name="fileName">Full path of the file</param>
    /// <returns>The file contents, or an empty string if the file no longer exists</returns>
    static string GetFileContent(string fileName)
    {
        // If the file was deleted after the snapshot was taken, ignore it and return an empty string.
        return File.Exists(fileName) ? File.ReadAllText(fileName) : "";
    }
}
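Replacing Contains with a regular expression might look like the sketch below. `RegexContentSearch` and `FindMatches` are hypothetical names, the pattern is only an example, and the demo writes a few small files to a throwaway folder under the temp directory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

static class RegexContentSearch
{
    // Hypothetical helper: names of files with the given extension whose text matches the pattern.
    public static IEnumerable<string> FindMatches(DirectoryInfo root, string extension, Regex pattern)
    {
        return from file in root.GetFiles("*.*", SearchOption.AllDirectories)
               where string.Equals(file.Extension, extension, StringComparison.OrdinalIgnoreCase)
               // The file may have been deleted after the snapshot was taken.
               let content = File.Exists(file.FullName) ? File.ReadAllText(file.FullName) : ""
               where pattern.IsMatch(content)
               orderby file.Name
               select file.Name;
    }

    static void Main()
    {
        var root = Directory.CreateDirectory(
            Path.Combine(Path.GetTempPath(), Path.GetRandomFileName()));
        try
        {
            File.WriteAllText(Path.Combine(root.FullName, "a.html"), "Welcome to Visual Studio 2017");
            File.WriteAllText(Path.Combine(root.FullName, "b.html"), "visual   studio tips");
            File.WriteAllText(Path.Combine(root.FullName, "c.html"), "nothing relevant here");
            File.WriteAllText(Path.Combine(root.FullName, "d.txt"), "Visual Studio");

            // Case-insensitive, and tolerant of any amount of whitespace between the two words.
            var pattern = new Regex(@"visual\s+studio", RegexOptions.IgnoreCase);

            foreach (var name in FindMatches(root, ".html", pattern))
            {
                Console.WriteLine(name);
            }
        }
        finally
        {
            root.Delete(true);
        }
    }
}
```

Here a.html and b.html match, c.html does not contain the phrase, and d.txt is filtered out by its extension before its contents are read.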