Monitor file changes in a directory through Java

Keywords: Java ftp Database Attribute

Recently I handled a requirement that went roughly as follows:

  • Set up an FTP server.
  • The other party uploads files to a specified directory on the server (call it directory A).
  • We need to parse the uploaded files and write their contents to the database (a file that is still being uploaded must not be processed).
  • We only have read permission on directory A; that is, we cannot delete, rename, or move the files in it.

For this requirement, the first solution I came up with was:

  • Start a thread that periodically lists all files in directory A.
  • Compare two consecutive listings; any file name that appears only in the later one is a file newly uploaded by the other party.
  • For each new file, record its size and last modification time, then read both attributes again every two seconds. If neither value has changed, treat the upload as complete; if either has changed, check again two seconds later.
  • Once the upload is confirmed complete, parse the file and update the database.
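
The first two steps above (list the directory, then compare two consecutive listings) can be sketched roughly as follows. This is only an illustration: the class name DirectoryDiff and its method names are made up for this example, and it assumes file names are enough to tell files apart.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for the polling approach; not part of any library.
class DirectoryDiff {

    // Lists the names of the entries currently in the directory.
    // Only read access to the directory is needed.
    static Set<String> listNames(Path dir) throws IOException {
        Set<String> names = new HashSet<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                names.add(entry.getFileName().toString());
            }
        }
        return names;
    }

    // Returns the names present in 'current' but absent from 'previous',
    // i.e. the files that appeared between the two listings.
    static Set<String> newFiles(Set<String> previous, Set<String> current) {
        Set<String> added = new HashSet<>(current);
        added.removeAll(previous);
        return added;
    }
}
```

A polling thread would call listNames() on every tick, pass the previous and current results to newFiles(), and then keep the current listing as the next "previous".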

This scheme mostly works, but it hides the following two minor problems:

  • The polling interval for directory A is hard to choose: if it is too small, the directory is read needlessly often; if it is too large, a large backlog of files may build up.
  • The interval between two reads of the size and last modification time is equally uncertain. The 2 seconds mentioned above is only my own guess, and a large file may well not finish uploading within 2 seconds. If the FTP server runs on Windows, the following problems also appear:
    The file size is already fixed when the transfer starts. During the transfer, the value returned by length() of Java's File class does not change.
    The last modification time changes only when the file is created and when the transfer completes. During the transfer, the value returned by lastModified() of Java's File class does not change.
    If the FTP server runs on Unix, there is no such problem: both the size and the last modification time keep changing throughout the transfer. (I verified this on CentOS 7.)
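
To watch how these two attributes behave on a given server, both values can be read in one call and compared between readings. The sketch below is a minimal illustration; FileSnapshot is a hypothetical class name, and it reads the attributes through java.nio.file.Files rather than the File methods mentioned above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical value holder for one reading of a file's size and
// last modification time.
final class FileSnapshot {
    final long size;
    final long lastModifiedMillis;

    FileSnapshot(long size, long lastModifiedMillis) {
        this.size = size;
        this.lastModifiedMillis = lastModifiedMillis;
    }

    // Reads both attributes of the file in a single call.
    static FileSnapshot of(Path file) throws IOException {
        BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
        return new FileSnapshot(attrs.size(), attrs.lastModifiedTime().toMillis());
    }

    // True when neither attribute differs between the two readings.
    boolean sameAs(FileSnapshot other) {
        return size == other.size && lastModifiedMillis == other.lastModifiedMillis;
    }
}
```

Taking FileSnapshot.of(file) at intervals during an upload and comparing the readings with sameAs() shows whether the values move during the transfer, which, as described above, depends on the operating system.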

Since the scheme above is flawed, I looked for other solutions.
Later, at a colleague's suggestion, we found an API added in JDK 7: WatchService (java.nio.file.WatchService).

The idea behind this API is essentially the observer pattern: you register a watcher for a specified directory, and when a file in that directory changes, Java notifies the watcher of the change so that you can handle it.
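
Reduced to its smallest form, the register-then-get-notified idea might look like the sketch below. The class name CreateListener and its method names are made up for illustration, and only ENTRY_CREATE is registered to keep it short.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

import static java.nio.file.StandardWatchEventKinds.ENTRY_CREATE;
import static java.nio.file.StandardWatchEventKinds.OVERFLOW;

// Hypothetical helper showing registration and one round of notification.
class CreateListener {

    // Creates a watcher on the directory's file system and registers the
    // directory for "entry created" notifications.
    static WatchService watchCreates(Path dir) throws IOException {
        WatchService watcher = dir.getFileSystem().newWatchService();
        dir.register(watcher, ENTRY_CREATE);
        return watcher;
    }

    // Waits up to timeoutSeconds for one batch of events and returns the
    // names of the entries that were created; empty if nothing arrived.
    static List<String> nextCreatedNames(WatchService watcher, long timeoutSeconds)
            throws InterruptedException {
        List<String> names = new ArrayList<>();
        WatchKey key = watcher.poll(timeoutSeconds, TimeUnit.SECONDS);
        if (key == null) {
            return names; // timed out, no events
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == OVERFLOW) {
                continue; // events were lost; an OVERFLOW event carries no file name
            }
            names.add(event.context().toString());
        }
        key.reset(); // re-arm the key so later events are delivered
        return names;
    }
}
```

How quickly an event arrives after a file is created depends on the platform's WatchService implementation, so the timeout should be chosen generously.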

Let's go directly to the code below:

import java.io.File;
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import static java.nio.file.StandardWatchEventKinds.*;

public class Sample {

    private WatchService watcher;

    private Path path;

    public Sample(Path path) throws IOException {
        this.path = path;
        watcher = FileSystems.getDefault().newWatchService();
        this.path.register(watcher, OVERFLOW, ENTRY_CREATE, ENTRY_DELETE, ENTRY_MODIFY);
    }

    public void handleEvents() throws InterruptedException {
        // start to process the data files
        while (true) {
            // start to handle the file change event
            final WatchKey key = watcher.take();

            for (WatchEvent<?> event : key.pollEvents()) {
                // get event type
                final WatchEvent.Kind<?> kind = event.kind();

                // get file name
                @SuppressWarnings("unchecked")
                final WatchEvent<Path> pathWatchEvent = (WatchEvent<Path>) event;
                final Path fileName = pathWatchEvent.context();

                if (kind == ENTRY_CREATE) {

                    // Explanation point 1
                    // create a new thread to monitor the new file
                    new Thread(new Runnable() {
                        public void run() {
                            // resolve the reported name against the watched directory
                            File file = path.resolve(fileName).toFile();
                            boolean exist;
                            long size = 0;
                            long lastModified = 0;
                            int sameCount = 0;
                            while (exist = file.exists()) {
                                // if the 'size' and 'lastModified' attribute keep same for 3 times,
                                // then we think the file was transferred successfully
                                if (size == file.length() && lastModified == file.lastModified()) {
                                    if (++sameCount >= 3) {
                                        break;
                                    }
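                                } else {
                                    // an attribute changed: restart the count so that only
                                    // consecutive identical readings are counted
                                    sameCount = 0;
                                    size = file.length();
                                    lastModified = file.lastModified();
                                }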
                                try {
                                    Thread.sleep(500);
                                } catch (InterruptedException e) {
                                    return;
                                }
                            }
                            // if the new file was cancelled or deleted
                            if (!exist) {
                                return;
                            } else {
                                // update database ...
                            } 
                        }
                    }).start();
                } else if (kind == ENTRY_DELETE) {
                    // todo
                } else if (kind == ENTRY_MODIFY) {
                    // todo
                } else if (kind == OVERFLOW) {
                    // todo
                }
            }

            // IMPORTANT: the key must be reset after processed
            if (!key.reset()) {
                return;
            }
        }
    }

    public static void main(String args[]) throws IOException, InterruptedException {
        new Sample(Paths.get(args[0])).handleEvents();
    }
}

A few additional notes on "Explanation point 1" in the code above:

  • Detecting completion by checking the file's size and last modification time only works on Unix, for the reasons given above.
  • On Windows, after the ENTRY_CREATE event is generated, keep monitoring until an ENTRY_MODIFY or ENTRY_DELETE event is produced for the file, which indicates that the transfer has completed or been cancelled, respectively.
  • The anonymous Thread would be better extracted into a separate class, which makes the code easier to read.
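
The Windows rule above can be kept in a small separate class that records which files have been created but not yet finished. The sketch below is one possible shape, not a tested solution: UploadTracker and its method names are hypothetical, and it assumes, per the observation earlier in this post, that the first ENTRY_MODIFY after ENTRY_CREATE marks the end of the transfer.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical bookkeeping class for the event-driven (Windows) approach.
class UploadTracker {

    // Names for which ENTRY_CREATE was seen but no follow-up event yet.
    private final Set<String> pending = new HashSet<>();

    // Call on ENTRY_CREATE: the upload has started.
    void onCreate(String name) {
        pending.add(name);
    }

    // Call on ENTRY_MODIFY. Returns true if this event ends a pending
    // upload, i.e. the file is assumed complete and can be parsed.
    boolean onModify(String name) {
        return pending.remove(name);
    }

    // Call on ENTRY_DELETE. Returns true if a pending upload was cancelled.
    boolean onDelete(String name) {
        return pending.remove(name);
    }

    // True while the file is still considered to be uploading.
    boolean isPending(String name) {
        return pending.contains(name);
    }
}
```

In handleEvents(), the ENTRY_CREATE, ENTRY_MODIFY, and ENTRY_DELETE branches would each pass fileName.toString() to the matching method, and the database update would run only when onModify() returns true.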

Reference Documents

Oracle official example

Posted by chrisranjana on Wed, 17 Jul 2019 15:58:32 -0700