Unix find is a tricky but very useful utility that can fool even experienced UNIX professionals with ten or more years of sysadmin work under their belts. It can extend the functionality of Unix utilities that do not include tree traversal (GNU grep, by the way, has the -r option for this purpose and can traverse a tree on its own: grep -r "search string" /tmp). There are several versions of find; the two main ones are POSIX find, used in Solaris, AIX, etc., and GNU find, used in Linux. GNU find can be installed on Solaris and AIX, and doing so is strongly recommended because the two versions differ; moreover, GNU find has additional capabilities that are often useful.
But find can do more than the simple tree traversal available with the -r (or -R) option in many Unix utilities. A traversal performed by find can exclude branches of the directory tree, can select files or directories using regular expressions, can be limited to specific types of filesystem, etc. This capability goes far beyond the regular tree traversal of Unix utilities, so find is a real Unix utility: a useful enhancer of the functionality of other utilities, both those that cannot traverse the directory tree and those that have simple built-in recursive traversal.
The idea behind find is extremely simple: it is a utility for searching for files using directory information, and in this sense it is similar to ls. But it is more powerful than ls because it can provide "a ride" for other utilities and has an idiosyncratic mini-language for specifying queries, a language that has probably outlived its usefulness but that nobody has the courage to replace with a standard scripting language.
For obscure historical reasons the find mini-language is completely different from that of other UNIX commands: it has full-word options rather than single-letter options. For example, instead of a typical Unix-style single-letter option (like -f in tar -xvf mytar.tar), find uses the option -name to match filenames. Also, the path to search can consist of multiple starting points, for example
find /usr /bin /sbin /opt -name sar # here we exclude non-relevant directories
In general, you first specify the set of starting points for the search through the file system. The first argument starting with "-" is considered the start of the "find expression". The expression can have side effects if you specify actions in it.
It is very important to understand that you can specify more than one directory as a starting point for the search. To look across the /usr and /var/html directory trees for filenames that match the pattern *.htm*, you can use the following command:
find /usr /var/html -name "*.htm*" -print
Please note that you need to quote any pattern (shell-style simple regex). Otherwise the shell will expand it immediately in the current context, before find ever sees it.
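The two points above (several starting points, and a quoted pattern) can be tried safely in a scratch directory. This is a minimal sketch: the directory layout and file names are invented for the demonstration and are not part of the original examples.

```shell
# Build a throwaway tree with two starting points (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/usr/doc" "$demo/var/html"
touch "$demo/usr/doc/index.html" "$demo/var/html/page.htm" "$demo/var/html/notes.txt"

# Both trees are searched in one pass. The quotes keep the shell from
# expanding *.htm* before find sees it.
found=$(find "$demo/usr" "$demo/var/html" -name "*.htm*" -print | sort)
printf '%s\n' "$found"

# Clean up the scratch tree.
rm -rf "$demo"
```

Only index.html and page.htm should be listed; notes.txt does not match the pattern.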
It is simply impossible to remember all the details of this language unless you construct complex queries every day, and that's why this page was created. Along with this page it makes sense to consult the list of typical (and not so typical) examples, which can be found on the Examples page on this site as well as in the links listed in the Webliography. An excellent paper, Advanced techniques for using the UNIX find command, was written by Bill Zimmerly. I highly recommend reading it and then printing it to keep as a reference. Several examples in this tutorial are borrowed from that article.
The full find language is pretty complex and consists of several dozen different predicates and options. There are two versions of this language: one implemented in POSIX find and the second implemented in GNU find, which is a superset of POSIX find. That can make a big difference in complex scripts. But for interactive use the differences are minor: only a small subset of options is typically used on a day-to-day basis by system administrators. Among them:
- -name True if the pattern matches the current file name. A simple regex (shell pattern) may be used. A backslash (\) is used as an escape character within the pattern. The pattern should be escaped or quoted. If you need to include parts of the path in the pattern, in GNU find you should use the predicate -wholename.
Use the -iname predicate (GNU find supports it) to run a case-insensitive search, rather than just -name. For example:
$ find . -follow -iname '*.htm' -print0 | xargs -0 -I '{}' mv '{}' ~/webhome
Using -print0 (together with xargs -0) is simple insurance that files with spaces in their names are processed correctly.
- -fstype type True if the filesystem to which the file belongs is of type type. For example, on Solaris mounted local filesystems have type ufs (Solaris 10 added zfs). On AIX a local filesystem is jfs or jfs2 (journaled file system). If you want to traverse NFS filesystems you can use nfs (network file system). If you want to avoid traversing network and special filesystems you should use the predicate -local and, in certain circumstances, -mount.
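The case-insensitive match and the NUL-separated hand-off to xargs can be checked in a scratch directory. The sketch below assumes a find that supports -iname and -print0 and an xargs that supports -0 and -I (GNU findutils does); the file names and the webhome directory are made up for the demonstration.

```shell
# Scratch tree with a file name that contains a space (hypothetical names).
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/webhome"
touch "$demo/src/My Page.HTM" "$demo/src/index.htm" "$demo/src/readme.txt"

# -iname matches .htm and .HTM alike; NUL-separated names survive the
# space in "My Page.HTM" on their way through xargs.
find "$demo/src" -type f -iname '*.htm' -print0 |
    xargs -0 -I '{}' mv '{}' "$demo/webhome"

moved=$(ls "$demo/webhome")   # files that were moved
left=$(ls "$demo/src")        # files that stayed behind
printf 'moved:\n%s\nleft:\n%s\n' "$moved" "$left"

rm -rf "$demo"
```

With plain -print and no -0, xargs would split "My Page.HTM" into two arguments and mv would fail on both.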
- "-atime/-ctime/-mtime" [+|-]n
Select files based on one of the three Unix timestamps: the file's "access time", "status change time" and "modification time".
n is a time interval: an integer with an optional sign. It is measured in 24-hour periods (days) counted from the current moment (the GNU find predicates -amin, -cmin and -mmin take minutes instead).
- n: if the integer n has no sign, it means exactly n 24-hour periods (days) ago; 0 means today (within the last 24 hours).
- +n: if it has a plus sign, it means "more than n 24-hour periods (days) ago", i.e. older than n.
- -n: if it has a minus sign, it means less than n 24-hour periods (days) ago, i.e. younger than n. It follows that -1 and 0 are the same and both mean "today".
Note: If you use find in scripts, be careful when the -mtime parameter is zero. GNU find discards the fractional part of a file's age before comparing, so -mtime +0 matches only files modified at least 24 hours ago. As a result the expression
find -mtime +0 -mtime -1
which might be expected to be equivalent to
find -mtime -1
does not produce any files.
- Examples:
- Find everything in your home directory modified in the last 24 hours:
find $HOME -mtime -1
- Find everything in your home directory modified in the last seven 24-hour periods (days):
find $HOME -mtime -7
- Find everything in your home directory that has NOT been modified in the last year:
find $HOME -mtime +365
- To find html files that have been modified in the last seven 24-hour periods (days), I can use -mtime with the argument -7 (include the hyphen):
find . -mtime -7 -name "*.html" -print
If you use the number 7 (without a hyphen), find will match only html files that were modified exactly seven 24-hour periods (days) ago:
find . -mtime 7 -name "*.html" -print
- To find those html files that I haven't touched for more than seven 24-hour periods (days), I use +7:
find . -mtime +7 -name "*.html" -print
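The three forms of the argument (-7, 7 and +7) can be compared side by side on files with known ages. This is a sketch that assumes GNU touch, whose -d option accepts relative dates such as '80 hours ago', and GNU find's rule of dropping the fractional part of the age; the file names are invented.

```shell
# Scratch files with controlled modification times (GNU touch syntax).
demo=$(mktemp -d)
touch -d '12 hours ago'  "$demo/fresh.html"
touch -d '80 hours ago'  "$demo/recent.html"   # 3 days + 8 hours
touch -d '172 hours ago' "$demo/week.html"     # 7 days + 4 hours
touch -d '800 hours ago' "$demo/old.html"      # about 33 days

# Ages in whole 24-hour periods are 0, 3, 7 and 33 respectively.
within_week=$(find "$demo" -name '*.html' -mtime -7 | sort)   # age < 7
exactly_week=$(find "$demo" -name '*.html' -mtime 7)          # age = 7
over_week=$(find "$demo" -name '*.html' -mtime +7)            # age > 7

printf 'less than 7:\n%s\n' "$within_week"
printf 'exactly 7:\n%s\n' "$exactly_week"
printf 'more than 7:\n%s\n' "$over_week"

rm -rf "$demo"
```

Hour offsets are used instead of '7 days ago' so that each file sits well inside its 24-hour period rather than on a boundary.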
- -newer/-anewer/-cnewer baseline_file The modification time, access time or status change time of the current file is compared with the modification time of the baseline file (this is how GNU find defines these predicates). If the baseline file is a symbolic link and the -H option or the -L option is in effect, the modification time of the file it points to is always used.
- -newer The modification time is compared with the modification time of baseline_file. True if the file was modified more recently than the baseline file.
- -anewer The access time is compared with the modification time of baseline_file. True if the file was last accessed more recently than the baseline file was modified.
- -cnewer The status change time (ctime, often mistaken for creation time) is compared with the modification time of baseline_file. True if the file's status changed more recently than the baseline file was modified.
For example, to find everything in your home directory that has been modified more recently than "~joeuser/lastbatch.txt":
find $HOME -newer ~joeuser/lastbatch.txt
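A quick way to see -newer at work is to create a baseline file between an older and a newer file. The sketch below assumes GNU touch for the relative dates; lastbatch.txt here is a stand-in created inside a scratch directory, not the file from the example above.

```shell
# Three files with staggered modification times (GNU touch syntax).
demo=$(mktemp -d)
touch -d '2 hours ago' "$demo/older.log"
touch -d '1 hour ago'  "$demo/lastbatch.txt"   # hypothetical baseline file
touch "$demo/newer.log"                         # modified just now

# -type f keeps the scratch directory itself out of the result: its own
# modification time is also newer than the baseline.
newer=$(find "$demo" -type f -newer "$demo/lastbatch.txt")
printf '%s\n' "$newer"

rm -rf "$demo"
```

Only newer.log is reported; the baseline file is never newer than itself.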
- -local True if the file system type is not a remote file system type. In Solaris those types are defined in the /etc/dfs/fstypes file; nfs is used as the default remote filesystem type if the /etc/dfs/fstypes file is not present. The -local option skips the hierarchy of non-local directories. You can also search without descending more than a certain number of levels, as explained later, or exclude some directories from the search.
- -mount Always true. Restricts the search to the file system containing the directory specified. Does not list mount points to other file systems.
- -xdev Same as the -mount primary. Always evaluates to the value True. Prevents the find command from traversing a file system different from the one specified by the Path parameter.
- -xattr True if the file has extended attributes.
- -wholename simple-regex [GNU find only]. The whole path name matches a simple regular expression (often called a shell pattern). In simple regular expressions the characters '/' and '.' are not treated specially, so a wildcard can match across directory separators. The pattern is compared with the path as find prints it, beginning with the starting point; so, for example, you can specify:
find / -wholename '/lib*'
which will print entries from directories /lib64 and /lib. To ignore the directories specified, use the option -prune. For example, to skip the directory /proc and all files and directories under it (which is important on Linux, as otherwise errors are produced), you can use something like this:
find / -wholename '/proc' -prune -o -name file_to_be_found -print
The explicit -print keeps the pruned directory itself out of the output.
If you administer a lot of Linux boxes it is better to create an alias ff:
if [[ $(uname) == "Linux" ]] ; then
    # skip /proc; add -print after the name to keep /proc itself out of the output
    alias ff="find / -wholename '/proc' -prune -o -name "
else
    alias ff='find / -name ' # non-GNU find does not support -wholename
fi
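The -prune idiom is easy to get subtly wrong, so it is worth rehearsing on a scratch tree before pointing it at a real /proc. This sketch assumes GNU find (the GNU manual describes -wholename as a less portable spelling of -path); the proc and etc directories below are ordinary directories created only for the demonstration.

```shell
# A miniature tree with a branch we want to skip (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/proc/1" "$demo/etc"
touch "$demo/proc/1/target" "$demo/etc/target"

# The pattern is matched against the whole path as find prints it, so it
# must begin with the same prefix as the starting point.
hits=$(find "$demo" -wholename "$demo/proc" -prune -o -name target -print)
printf '%s\n' "$hits"

rm -rf "$demo"
```

The copy of target under proc is never visited, so only the one under etc is printed.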
Other useful options of the find command include:
- -regex regex [GNU find only] The file name matches a regular expression. This is a match on the whole path name, not just the file name. Stupidly enough, the default regular expressions understood by find are Emacs regular expressions, not Perl regular expressions. It is important to note that the -iregex option provides the same match while ignoring case.
- -perm permissions Locates files with certain permission settings. Often used for finding world-writable files or SUID files. See below
- -user Locates files that have the specified ownership. The option -nouser locates files without an owner: for such files no user in /etc/passwd corresponds to the file's numeric user ID (UID). Such files are often created when a tar or zip archive is transferred from another server on which the account probably exists under a different UID.
- -group Locates files that are owned by the specified group. The option -nogroup means that no group corresponds to the file's numeric group ID (GID).
- -size Locates files with the specified size. The -size attribute lets you specify how big the files should be to match. You can specify the size in kilobytes (suffix k) and optionally use + or - to specify a size greater than or less than the specified argument. For example:
find /home -name "*.txt" -size 100k
find /home -name "*.txt" -size +100k
find /home -name "*.txt" -size -100k
The first brings up files of exactly 100KB, the second only files greater than 100KB, and the last only files less than 100KB. Bear in mind that GNU find rounds sizes up to whole units before comparing.
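The three size tests can be verified against files of known length. This is a sketch assuming a find that understands the k suffix (GNU find does); the file names are invented and dd is used only to produce files of an exact size.

```shell
# Create files of 10 KiB, 100 KiB and 200 KiB.
demo=$(mktemp -d)
dd if=/dev/zero of="$demo/small.txt" bs=1024 count=10  2>/dev/null
dd if=/dev/zero of="$demo/exact.txt" bs=1024 count=100 2>/dev/null
dd if=/dev/zero of="$demo/big.txt"   bs=1024 count=200 2>/dev/null

exact=$(find "$demo" -name '*.txt' -size 100k)     # exactly 100 units
bigger=$(find "$demo" -name '*.txt' -size +100k)   # more than 100 units
smaller=$(find "$demo" -name '*.txt' -size -100k)  # fewer than 100 units

printf 'exact:   %s\nbigger:  %s\nsmaller: %s\n' "$exact" "$bigger" "$smaller"

rm -rf "$demo"
```

Each query should report exactly one of the three files.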
- -ls Lists the current file in `ls -dils' format on standard output.
- -type Locates a certain type of file. The most typical options for -type are as follows:
- d - Directory
- f - Regular file
- l - Symbolic link
For example, to list the directories you can use the -type specifier. Here's one example:
find . -type d -print
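The three type letters listed above can be tried on a scratch tree that contains one of each kind. A minimal sketch with invented names:

```shell
# One subdirectory, one regular file and one symbolic link.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/sub/data.txt"
ln -s "$demo/sub/data.txt" "$demo/link.txt"

# Count the entries of each type; tr strips the padding some wc
# implementations add.
dirs=$(find "$demo" -type d | wc -l | tr -d ' ')    # scratch dir + sub
files=$(find "$demo" -type f | wc -l | tr -d ' ')   # data.txt
links=$(find "$demo" -type l | wc -l | tr -d ' ')   # link.txt

printf 'directories=%s files=%s links=%s\n' "$dirs" "$files" "$links"

rm -rf "$demo"
```

By default find does not follow symbolic links, which is why link.txt is counted as a link and not as a second regular file.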