Non DFS used is any data on the data node filesystem(s) that isn't HDFS block data, i.e. anything outside the directories listed in dfs.data.dir (dfs.datanode.data.dir in newer Hadoop releases). This includes log files, MapReduce shuffle output, and local copies of data files (if you put them on a data node). Use du or a similar tool to see what's taking up the space in your filesystem.
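
If you'd rather script the check than run du by hand, here is a rough Python sketch of the same idea: it totals the size of each immediate child of a directory and sorts the result largest first. The function names (tree_size, usage_by_child, print_report) are made up for this example and are not part of Hadoop. It also sums apparent file sizes rather than allocated disk blocks, so the numbers can differ a bit from what du reports.

```python
import os


def tree_size(path):
    """Sum the apparent size in bytes of every file under path.

    os.walk does not follow symlinked directories by default, and lstat
    reports the size of a link itself rather than its target.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path, onerror=lambda err: None):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
    return total


def usage_by_child(root):
    """Return [(name, bytes)] for each immediate entry of root, largest first."""
    sizes = []
    for entry in os.scandir(root):
        try:
            if entry.is_dir(follow_symlinks=False):
                size = tree_size(entry.path)
            else:
                size = entry.stat(follow_symlinks=False).st_size
        except OSError:
            continue
        sizes.append((entry.name, size))
    return sorted(sizes, key=lambda item: item[1], reverse=True)


def print_report(root):
    """Print one line per entry of root, sizes shown in MiB."""
    for name, size in usage_by_child(root):
        print(f"{size / 2**20:10.1f} MiB  {name}")
```

Point print_report at the mount point that holds your data directories (or at a log or temp directory you suspect) and compare the biggest entries against the paths configured for HDFS; whatever is large and not under those paths is what counts toward Non DFS used.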