Friday, October 1, 2010

du and df -h confusing stats?

I just explained this to someone with an example, so I thought I would put it on my blog as well -

Background:

If someone is running an application with a file open in a directory and the open file is removed, the du output reflects a reduced size for this directory. However, df does not show a reduced size.

...and the confusion begins because 'df' and 'du' are showing contradictory stats.

The key point is that the two tools measure different things. du walks the directory tree and reports the space used by the files and folders it can see, which can be more than the sum of the file sizes because space is allocated in whole blocks. df reports the space used by the file system as a whole, which includes overhead such as journals and inode tables.

So whenever an application still has a file open but the file has already been deleted, that file is counted in the df output (because the space is certainly not free) but not in du (because no directory entry points to it any more). All of the file's blocks remain allocated until the application that has the file open closes it. Only after the file is closed will df show the reduced usage for the file system.
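The same effect can be seen from a single shell, without a second terminal. The following is only a minimal sketch, assuming a POSIX sh, a writable temp directory, and a file system without transparent compression; the file name held.dat and the descriptor number 3 are made up for the illustration.

```shell
#!/bin/sh
# Sketch: a deleted file keeps its blocks while a descriptor is still open.
# held.dat and fd 3 are hypothetical names chosen for this example.
workdir=$(mktemp -d)
dd if=/dev/zero of="$workdir/held.dat" bs=1024 count=2048 2>/dev/null

# Directory usage in KiB while the file still has a name.
du_before=$(du -sk "$workdir" | awk '{print $1}')

exec 3<"$workdir/held.dat"    # hold the file open on descriptor 3
rm -f "$workdir/held.dat"     # remove the name; the blocks stay allocated

# du no longer sees the file because no directory entry points to it.
du_after=$(du -sk "$workdir" | awk '{print $1}')

# The data is still fully readable through the open descriptor.
bytes_via_fd=$(wc -c <&3 | tr -d ' ')

exec 3<&-                     # closing the descriptor releases the blocks
rmdir "$workdir"

echo "du before rm: ${du_before}K, du after rm: ${du_after}K"
echo "bytes still readable through fd 3: $bytes_via_fd"
```

du drops as soon as the name is gone, yet every byte can still be read through the descriptor, which is why df keeps counting the space until the descriptor is closed.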

Below is the exercise to reproduce and understand it better -

[root@DebuTestBox ~]# dd if=/dev/zero of=/tmp/duTest.txt bs=1024 count=500000

While dd is still running, remove the file duTest.txt from another terminal -

[root@DebuTestBox ~]# rm /tmp/duTest.txt
rm: remove regular file `/tmp/duTest.txt'? y

Now check -

[root@DebuTestBox ~]# lsof | grep "deleted"
java 4579 root 285u REG 253,0 0 3670130 /tmp/org.hibernate.cache.StandardQueryCache.data (deleted)
dd 16516 root 1w REG 253,0 106242048 3670125 /tmp/duTest.txt (deleted)
--truncated---
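To get a rough idea of how much space such files are holding, the SIZE/OFF column of the lsof lines can be added up. This is just a sketch: deleted_bytes is a hypothetical helper name, it assumes the nine-column lsof layout shown above (SIZE/OFF as field 7), and for some descriptors that column is an offset rather than a size, so treat the total as an estimate. The sample lines are modeled on the output above.

```shell
#!/bin/sh
# Hypothetical helper: sum field 7 (SIZE/OFF) of lsof lines ending in "(deleted)".
# Assumes the column layout COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME.
deleted_bytes() {
    awk '/\(deleted\)$/ { total += $7 } END { printf "%d\n", total }'
}

# Sample lines modeled on the lsof output in this post.
sample='java 4579 root 285u REG 253,0 0 3670130 /tmp/org.hibernate.cache.StandardQueryCache.data (deleted)
dd 16516 root 1w REG 253,0 106242048 3670125 /tmp/duTest.txt (deleted)'

held=$(printf '%s\n' "$sample" | deleted_bytes)
echo "bytes held by deleted files: $held"

# On a live system the same helper would be fed from lsof:
#   lsof | deleted_bytes
```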

Hope it helps.


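Now to resolve this issue - first figure out what those files are and how important they, and the processes holding them, are in your workspace. If it is fine to let them go, fire the following command. Be aware that it sends SIGKILL to every process that lsof lists with a deleted file open, so review the list before running it -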

#lsof | grep "deleted" | awk '{print $2}' | xargs kill -9
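A more cautious variation is to list the PIDs first, each only once, and decide per process. The sketch below assumes the same lsof column layout as above (PID as field 2); deleted_pids is a hypothetical helper name, and the kill lines are left commented out on purpose.

```shell
#!/bin/sh
# Hypothetical helper: print each PID once from lsof lines ending in "(deleted)".
deleted_pids() {
    awk '/\(deleted\)$/ { print $2 }' | sort -un
}

# Sample lines modeled on the lsof output in this post.
sample='java 4579 root 285u REG 253,0 0 3670130 /tmp/org.hibernate.cache.StandardQueryCache.data (deleted)
dd 16516 root 1w REG 253,0 106242048 3670125 /tmp/duTest.txt (deleted)'

pids=$(printf '%s\n' "$sample" | deleted_pids)
echo "$pids"

# On a live system (review the list before signalling anything):
#   lsof | deleted_pids
#   kill <pid>       # SIGTERM first, so the process can close its files cleanly
#   kill -9 <pid>    # only if the process ignores SIGTERM
```

Restarting the owning service, where that is possible, also closes the files and frees the space.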


Thanks/-
DEBAJIT

