Adminbuntu

Everything for the Ubuntu Server Administrator

Copyright 2013 Applied Conscious Technologies, LLC


Files and Directories

Finding Files

Find Files by Filename

Find files in the fast index

Use locate to list files in the fast find index whose names contain "file" followed by ".txt". Append $ to the pattern to match only paths that end in ".txt":

locate -r 'file[^/]*\.txt'
option description
-r The pattern specified on the command line is understood to be a regular expression, as opposed to a glob pattern. Regular expressions work in the same way as in Emacs and find, except that "." will match a newline. Filenames whose full paths match the specified regular expression are printed (or, in the case of the -c option, counted). If you wish to anchor your regular expression at the ends of the full path name, then, as is usual with regular expressions, you should use the characters ^ and $ to signify this.

Files in the current directory and below by extension

This lists files in the current directory and below (recursively) that have a ".jpg" file extension.

find . -type f -name "*.jpg" -print
option description
-type f File is a regular file type.
-name Used to specify a filename pattern.
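
As a small extension of the example above, the following sketch matches more than one extension and ignores case. It assumes GNU find (for -iname and -printf) and works in a throwaway directory created with mktemp; the file names are made up for the demonstration.

```shell
# Scratch directory with a few sample files (names are hypothetical).
demo=$(mktemp -d)
mkdir "$demo/album"
touch "$demo/a.jpg" "$demo/album/B.JPG" "$demo/album/c.jpeg" "$demo/notes.txt"

# -iname is the case-insensitive form of -name.
# \( ... -o ... \) groups the two alternatives.
# %P prints each path relative to the starting directory.
images=$(find "$demo" -type f \( -iname '*.jpg' -o -iname '*.jpeg' \) -printf '%P\n' | LC_ALL=C sort)
printf '%s\n' "$images"

# Clean up the scratch directory.
rm -rf "$demo"
```

The parentheses are escaped so the shell passes them to find unchanged.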

Find Large Files

List Files > 10M from the Current Directory on Down

The find command is shown with sudo because it is commonly used to find large files owned by any user, including files in directories the current user cannot read.

sudo find . -type f -size +10M -exec ls -lh {} \; | awk '{ print $9 " " $5 }'
sub-command purpose
find . -type f -size +10M -exec ls -lh {} \; From the current directory, find regular files (-type f) larger than 10 MiB (-size +10M). For each file found, execute ls -lh.
awk '{ print $9 " " $5 }' Parse the output of ls and display fields 9 (file name) and 5 (size). File names that contain spaces are cut off at the first space.
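
A possible alternative, sketched here under the assumption that GNU find is available, lets find print the size and path itself with -printf. This avoids parsing ls output, so file names with spaces survive intact. The demonstration directory and file names are hypothetical.

```shell
# Scratch directory with one large and one small file.
demo=$(mktemp -d)
big="$demo/big file.bin"
head -c 11M /dev/zero > "$big"
head -c 1M /dev/zero > "$demo/small.bin"

# %s is the size in bytes, %p the path; sort -rn puts the largest file first.
large=$(find "$demo" -type f -size +10M -printf '%s\t%p\n' | sort -rn)
printf '%s\n' "$large"

# Clean up the scratch directory.
rm -rf "$demo"
```

On a real system the starting directory would be "." (prefixed with sudo where needed) instead of the scratch directory.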

Show size of Files in Directory, in Descending Order by Size

du -cks * | sort -rn | while read size fname; do for unit in k M G T P E Z Y; do if [ $size -lt 1024 ]; then echo -e "${size}${unit}\t${fname}"; break; fi; size=$((size/1024)); done; done
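
Where GNU coreutils is available, a shorter form of the same idea is to let du print human-readable sizes and let sort -h order them. This is a sketch, not a drop-in replacement: the --apparent-size flag is used here only so the demonstration gives the same result on every filesystem, and the file names are hypothetical.

```shell
# Scratch directory with entries of different sizes.
demo=$(mktemp -d)
head -c 3M /dev/zero > "$demo/three.bin"
head -c 1M /dev/zero > "$demo/one.bin"
mkdir "$demo/sub"
head -c 2M /dev/zero > "$demo/sub/two.bin"

# du -sh prints one human-readable total per argument;
# sort -rh sorts human-readable sizes (K, M, G) in descending order.
listing=$(cd "$demo" && du --apparent-size -sh -- * | sort -rh)
printf '%s\n' "$listing"

# Clean up the scratch directory.
rm -rf "$demo"
```

Drop --apparent-size to report the disk space actually used, as the original one-liner does.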

Find Files by Attributes

Find Files not Readable by All

find . -type f ! -perm -444

Find Directories not Searchable (Executable) by All

find . -type d ! -perm -111
option description
-type f Find files
-type d Find directories
-perm -mode Find by file permissions: true when all of the permission bits in mode are set. A leading ! negates the test.
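
The difference between "all of these bits" and "any of these bits" is easy to get wrong, so here is a minimal sketch. It assumes GNU find, where -perm -mode requires every listed bit and -perm /mode requires at least one. The file names are hypothetical.

```shell
# Scratch directory with one world-readable and one private file.
demo=$(mktemp -d)
touch "$demo/public.txt" "$demo/private.txt"
chmod 644 "$demo/public.txt"
chmod 600 "$demo/private.txt"

# Not readable by all: at least one of the three read bits is missing.
not_all=$(find "$demo" -type f ! -perm -444 -printf '%f\n')

# Readable by anyone at all: at least one read bit is set.
any_read=$(find "$demo" -type f -perm /444 -printf '%f\n' | LC_ALL=C sort)

printf 'not readable by all: %s\n' "$not_all"
printf 'readable by someone:\n%s\n' "$any_read"

# Clean up the scratch directory.
rm -rf "$demo"
```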

Find Files by Content

Find Files by Content

This searches from the current directory (".") on down using the find command.

find . -type f -name '*.php' | xargs grep 'something'

Unlike locate, find walks the filesystem itself, so it is slower. However, it searches the filesystem as it is now, not an index that may have been built last night. find is also much more flexible than locate, with many options for exactly how to search. So find is slower, but more accurate and powerful.

The first parameter of find, "." (a period or dot), means the current directory.
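
One caveat with piping find into xargs is that xargs splits its input on whitespace, so a file name containing a space is passed to grep as two broken names. A common workaround, sketched here with hypothetical file names, is to separate names with NUL bytes (-print0 with xargs -0) or to let find run grep itself with -exec ... {} +.

```shell
# Scratch directory; one file name contains a space.
demo=$(mktemp -d)
page="$demo/my page.php"
printf '<?php echo "something"; ?>\n' > "$page"
printf '<?php echo "other"; ?>\n' > "$demo/index.php"

# NUL-separated names survive spaces; grep -l prints only matching file names.
matches=$(find "$demo" -type f -name '*.php' -print0 | xargs -0 grep -l 'something')

# Equivalent without xargs: find passes batches of names to grep directly.
matches_exec=$(find "$demo" -type f -name '*.php' -exec grep -l 'something' {} +)

printf '%s\n' "$matches" "$matches_exec"

# Clean up the scratch directory.
rm -rf "$demo"
```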

Find Options

option description
-type f Only return files.
-name Base of file name (the path with the leading directories removed) matches the given shell pattern. The metacharacters (`*', `?', and `[]') match a `.' at the start of the base name (a change introduced in findutils-4.2.2). To ignore a directory and the files under it, use -prune. Braces are not recognised as being special, despite the fact that some shells, including Bash, imbue braces with a special meaning in shell patterns. The filename matching is performed with the fnmatch(3) library function. Enclose the pattern in quotes to protect it from expansion by the shell.

find man page: http://manpages.ubuntu.com/manpages/precise/man1/find.1.html

xargs man page: http://manpages.ubuntu.com/manpages/precise/man1/xargs.1.html

grep man page: http://manpages.ubuntu.com/manpages/precise/man1/grep.1.html

Find Files by Content in Indexed Files

If your system has locate installed you can search for files in the index. This is faster than using find, but only searches files contained in the index. When installed, this index is typically updated once a day, so this is not useful for recently added files.

Quickly search the entire system for files with a certain extension that contain specified text. In this example, ".php" files are searched for "video".

locate -i -b '.php' | xargs grep -H 'video'

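Substitute the file extension and content of your choice for "php" and "video". Because the pattern '.php' contains no wildcard, it matches any base name that contains ".php", not only names that end with it.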

The locate command uses a fast-find index which is updated with updatedb. Systems are typically configured to update this index daily. Because locate searches an index rather than the filesystem, it is very fast. Files not in that index will not be searched. Files that are not world-readable are not included in the index.

locate options

option description
-i Ignore case distinctions in both the pattern and the file names.
-b Results are considered to match if the pattern specified matches the final component of the name of a file as listed in the database. This final component is usually referred to as the base name.

grep options

option description
-H Print the file name for each match. This is the default when there is more than one file to search.

locate man page: http://manpages.ubuntu.com/manpages/precise/man1/locate.findutils.1.html

xargs man page: http://manpages.ubuntu.com/manpages/precise/man1/xargs.1.html

grep man page: http://manpages.ubuntu.com/manpages/precise/man1/grep.1.html

see also: Locate

Find Files with a Specified Extension Matching a Regular Expression

This finds files with a specified extension containing a pattern matched with a regular expression.

Substitute “ext” with the extension of your choice. Substitute “something” with the regular expression to match. See Regular Expressions for information about regular expressions.

find . -name '*.ext' | xargs grep -E 'something'

See: Regular Expressions

Variation - in a certain directory and below (recursive)

find /dirname -name '*.ext' | xargs grep 'something'

Variation - in a certain directory only

find /dirname -maxdepth 1 -name '*.ext' | xargs grep 'something'

Variation - all normal files, recursive

find /dirname -type f | xargs grep 'something'
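
For comparison, GNU grep can do the directory walk on its own with -r and filter file names with --include. This is a swapped-in technique rather than the find and xargs pipeline shown above, and the directory layout below is hypothetical.

```shell
# Scratch directory with a matching file in a subdirectory.
demo=$(mktemp -d)
mkdir "$demo/sub"
printf 'error 42\n' > "$demo/sub/app.log"
printf 'error 7\n' > "$demo/notes.txt"

# -r recurses, -l lists matching file names, -E enables extended regular expressions,
# --include limits the search to file names matching the glob.
hits=$(grep -rlE --include='*.log' 'error [0-9]+' "$demo")
printf '%s\n' "$hits"

# Clean up the scratch directory.
rm -rf "$demo"
```

find remains the better choice when the selection depends on more than the file name, such as size, age, or permissions.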

Housekeeping

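How about a really dangerous command? Find and delete all files and directories last modified more than 30 days ago in "/directory/name" and below. The starting directory itself is also tested, so if "/directory/name" was last modified more than 30 days ago, rm -rf removes it along with everything in it.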

find /directory/name -mtime +30 -exec rm -rf {} \;

Wait a minute...you probably want to list those files first!

Before executing a somewhat complex remove command, test the find results first with ls.

find /directory/name -mtime +30 -exec ls -ldh {} \;

find options

option description
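-mtime n File modification datetime. File’s data was last modified n*24 hours ago. +n means more than n and -n means less than n. Fractional days are discarded, so -mtime +30 matches files last modified at least 31 days ago.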
-exec Execute command; true if 0 status is returned. All following arguments to find are taken to be arguments to the command until an argument consisting of ‘;’ is encountered. The string ‘{}’ is replaced by the current file name being processed everywhere it occurs in the arguments to the command, not just in arguments where it is alone, as in some versions of find. Both of these constructions might need to be escaped (with a ‘\’) or quoted to protect them from expansion by the shell. See the EXAMPLES section of the find man page for examples of the use of the -exec option. The specified command is run once for each matched file. The command is executed in the starting directory. There are unavoidable security problems surrounding use of the -exec action; you should use the -execdir option instead.
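
A more cautious cleanup can be sketched as follows, assuming GNU find and GNU touch. It restricts the match to regular files, uses -mindepth 1 so the starting directory is never a candidate, does a dry run first, and then uses find's own -delete action instead of rm -rf. The file names are hypothetical.

```shell
# Scratch directory with one old and one new file.
demo=$(mktemp -d)
touch -d '40 days ago' "$demo/old.log"
touch "$demo/new.log"

# Dry run: list what would be deleted.
would_delete=$(find "$demo" -mindepth 1 -type f -mtime +30 -printf '%f\n')
printf 'would delete: %s\n' "$would_delete"

# Same tests, now with -delete in place of the print action.
find "$demo" -mindepth 1 -type f -mtime +30 -delete

remaining=$(ls "$demo")
printf 'remaining: %s\n' "$remaining"

# Clean up the scratch directory.
rm -rf "$demo"
```

Keeping the dry run and the delete identical except for the final action makes it less likely that the two commands select different files.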

Find Duplicate Files

Compare 2 directories, list duplicate files, based on content:

perl -MFile::Find::Duplicates -e'
    @dupes = find_duplicate_files("dir1", "dir2");
    printf "Files %s (of size %d) hash to %s\n", 
      (join "," , @{$_->files}), $_->size, $_->md5
        for @dupes'
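
If the Perl module is not installed, a rough equivalent can be put together from standard tools. This sketch is a different technique from the one above: it assumes GNU coreutils (md5sum, and uniq with -w and -D) and groups files by checksum only. The directory and file names are hypothetical.

```shell
# Scratch directories: a.txt and copy.txt have identical content.
demo=$(mktemp -d)
mkdir "$demo/dir1" "$demo/dir2"
echo same > "$demo/dir1/a.txt"
echo same > "$demo/dir2/copy.txt"
echo different > "$demo/dir2/b.txt"

# md5sum prints a 32-character hash, two separator characters, then the path.
# uniq -w32 compares only the hash; -D prints every line of each duplicated group.
# cut -c35- strips the hash and separator, leaving the path.
dupes=$(find "$demo/dir1" "$demo/dir2" -type f -exec md5sum {} + \
  | LC_ALL=C sort | uniq -w32 -D | cut -c35-)
printf '%s\n' "$dupes"

# Clean up the scratch directories.
rm -rf "$demo"
```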

Find Missing Files

Compare 2 directories, looking for missing files

diff -rq dir1 dir2

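Now copy the missing files from dir2 to dir1. The trailing slashes make rsync copy the contents of dir2 rather than creating dir1/dir2, and --ignore-existing skips files that are already present in dir1.

rsync -a --ignore-existing dir2/ dir1/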
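
To see what the comparison reports before copying anything, the diff output can be filtered for entries that exist on one side only. A minimal sketch, assuming GNU diff and hypothetical directory contents:

```shell
# Scratch directories: b.txt exists only in dir2.
demo=$(mktemp -d)
mkdir "$demo/dir1" "$demo/dir2"
echo a > "$demo/dir1/a.txt"
echo a > "$demo/dir2/a.txt"
echo b > "$demo/dir2/b.txt"

# diff -rq reports "Only in DIR: NAME" for entries missing on the other side.
# LC_ALL=C keeps the messages in English so the filter matches.
only_in_dir2=$(cd "$demo" && LC_ALL=C diff -rq dir1 dir2 | grep '^Only in dir2')
printf '%s\n' "$only_in_dir2"

# Clean up the scratch directories.
rm -rf "$demo"
```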

Using Find to Deal With a Large Number of Files

While Bash can expand a wildcard into a large number of file names, the number is not limitless.

When a wildcard character is used to specify the files to process, the shell (i.e. bash) expands that specification into a list of file names that is passed to the receiving command. The kernel limits the total size of the arguments passed to a command, and when the expanded list exceeds that limit the command fails with an "Argument list too long" error.

When you need to process a large number of files with a command, it is wise to let find do the filename matching and to have find or xargs feed the file names to the desired command, one at a time or in batches that fit within the limit.

When using find for this purpose, use the -name option to enter a filename specification containing wildcard character(s). Put the filename spec in quotes so the shell does not expand the filename.

Then use the -exec option to execute the desired command. Use {} where the individual filename will appear for the command. End the -exec clause with \;.

This example will feed all files matching "mary-*" to s3cmd, with options and a target bucket to upload to.

find . -name 'mary-*' -exec s3cmd put --acl-public --guess-mime-type {} s3://mybucketnamehere/ \;
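
The cost of ending -exec with \; is one process per file. When the command accepts many file names at once, ending the clause with + instead makes find pass names in batches. The sketch below counts invocations to show the difference; the 200 file names are hypothetical. Note that {} must come immediately before the +, so this form does not fit commands such as the s3cmd example, where the target follows the file name.

```shell
# Scratch directory with 200 matching files.
demo=$(mktemp -d)
for i in $(seq 1 200); do : > "$demo/mary-$i"; done

# \; runs the command once per file, so "run" is printed 200 times.
per_file=$(find "$demo" -name 'mary-*' -exec sh -c 'echo run' sh {} \; | wc -l)

# + passes as many names as fit to one invocation; $# is the number of names received.
batched=$(find "$demo" -name 'mary-*' -exec sh -c 'echo "$#"' sh {} +)

printf 'invocations with \\; : %s\n' "$per_file"
printf 'names in one + batch: %s\n' "$batched"

# Clean up the scratch directory.
rm -rf "$demo"
```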

finding_files.txt · Last modified: 2015/05/31 21:20 (external edit)