Defect – /var is full


Log is Deleted When Application Is Running

It was reported that the /var file system on our production machine was full.
This was odd, because the application has a mechanism to trim its log once it exceeds a configured maximum size.

On checking, the application logs were missing; presumably somebody had deleted them for some reason. 'lsof -p $app_pid' showed the application log files marked as deleted, and their size was huge.

So I wrote a small test to see what happens when the log file is deleted while the application is running.

It turned out that the OutputStream is unaware that the underlying file has been removed and keeps writing to the already-deleted file. No exception is thrown (when using streams other than PrintStream), and no error flag is set (PrintStream.checkError() stays false). None of the output ends up in a visible log file!

After the log file is removed, file.length() always returns 0, so the trim never happens and the deleted file keeps growing. Because the Java process still holds an open file handle to the deleted file, Linux cannot free the disk space it occupies.
Eventually /var fills up!
$lsof -p $app_pid | grep $log_dir
java    22533 $USER    6w   REG    8,5      819 1153292 $log_name (deleted)
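
The behavior described above can be reproduced in isolation. The following is a minimal sketch, assuming a Linux (or other POSIX) file system where an open file can be unlinked; the class name DeletedLogDemo and the method writeAfterDelete are made up for this illustration. On Windows the delete would normally fail while the stream is open, so the result would differ.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class DeletedLogDemo {
    /**
     * Writes to the file, deletes it while the stream is still open, then
     * writes again. Returns {1 if the path still exists else 0, file.length()}.
     */
    static long[] writeAfterDelete(File file) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file, true)) {
            out.write("before delete\n".getBytes(StandardCharsets.UTF_8));
            // On Linux this only unlinks the name; the open stream keeps the data alive.
            file.delete();
            // This write goes to the unlinked file; on Linux no IOException is thrown.
            out.write("after delete\n".getBytes(StandardCharsets.UTF_8));
            out.flush();
            return new long[] { file.exists() ? 1 : 0, file.length() };
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("deleted-log-demo", ".log");
        long[] result = writeAfterDelete(file);
        // On Linux this prints: exists=false length=0
        System.out.println("exists=" + (result[0] == 1) + " length=" + result[1]);
    }
}
```

The second write succeeds silently, yet file.length() reports 0 because the path no longer exists. That is why a size check based on file.length() can never trigger the trim once the file is deleted.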

The fix is simple: before writing to the log file, check whether it still exists; if it does not, close the old stream and recreate both the file and the stream.
package logger;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;

public class LoggerTest {
    String fileName;
    File file;
    PrintStream ps;
    private static final long maxFileSize = 1024 * 1024;
    private long fileSize;

    public LoggerTest(String fileName) throws IOException {
       this.fileName = fileName;
       configure();
    }

    public void writeForEver(String str) throws IOException,
           InterruptedException {
       while (true) {
           // Fix: if the log file was deleted, close the old stream (this releases
           // the handle to the deleted file) and recreate the file and the stream
           // before writing, so the current line is not lost.
           if (!file.exists()) {
              ps.close();
              configure();
              ps.println(file.getName() + " is deleted, recreate");
           }

           // Without the check above, file.length() would always be 0 after the
           // file is removed, so no trim would happen and the deleted file would
           // keep growing.
           fileSize = file.length();
           if (fileSize > maxFileSize) {
              trim(true);
           }

           ps.println(str);
           ps.flush();
           // ps.checkError() does not report an error, even when the file is deleted
           if (ps.checkError()) {
              throw new IOException("ps.checkError() " + ps.checkError());
           }

           System.out.println("fileSize " + fileSize + " " + str);
           Thread.sleep(10000);
       }
    }

    private void trim(boolean b) {
       // trim the log to specified size
    }

    private void configure() throws IOException {
       file = new File(fileName);
       FileOutputStream fos = new FileOutputStream(file, true);
       ps = new PrintStream(fos);
    }

    public static void main(String[] args) throws IOException,
           InterruptedException {
       new LoggerTest("log.txt").writeForEver("Hello World!");
    }
}
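
For a quicker check than the endless loop above, the same idea can be isolated in a small wrapper. This is only a sketch under the same Linux assumption; the class name ReopeningLog and its println method are hypothetical names chosen for this example, not part of the application. Note that there is still a small window between the exists() check and the write in which the file could be deleted again.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;

class ReopeningLog implements AutoCloseable {
    private final File file;
    private PrintStream ps;

    ReopeningLog(File file) throws IOException {
        this.file = file;
        open();
    }

    private void open() throws IOException {
        // Append mode; FileOutputStream creates the file if it is missing.
        ps = new PrintStream(new FileOutputStream(file, true));
    }

    /** Writes one line; returns true if the file had been removed and was recreated. */
    boolean println(String line) throws IOException {
        boolean reopened = false;
        if (!file.exists()) {
            ps.close(); // drop the handle so the deleted file's space can be freed
            open();     // recreate the file and the stream
            reopened = true;
        }
        ps.println(line);
        ps.flush();
        return reopened;
    }

    @Override
    public void close() {
        ps.close();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("reopening-log", ".log");
        try (ReopeningLog log = new ReopeningLog(file)) {
            log.println("first");
            file.delete(); // simulate somebody removing the log
            boolean reopened = log.println("second");
            System.out.println("reopened=" + reopened + " length=" + file.length());
        }
    }
}
```

After the recreate, file.length() reflects the new file again, so a size-based trim can work as intended.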
