Auto-commit: 2025-10-31 08:58:35
Tools/UniExtractRC3/UniExtract/bin/file/man/cat1/file.1.txt
FILE(1)                BSD General Commands Manual               FILE(1)

NAME
     file -- determine file type

SYNOPSIS
     file [-bchikLnNprsvz] [--mime-type] [--mime-encoding]
          [-f namefile] [-F separator] [-m magicfiles] file
     file -C [-m magicfile]
     file [--help]

DESCRIPTION
     This manual page documents version 5.03 of the file command.

     file tests each argument in an attempt to classify it. There are
     three sets of tests, performed in this order: filesystem tests,
     magic tests, and language tests. The first test that succeeds
     causes the file type to be printed.

     The type printed will usually contain one of the words text (the
     file contains only printing characters and a few common control
     characters and is probably safe to read on an ASCII terminal),
     executable (the file contains the result of compiling a program
     in a form understandable to some UNIX kernel or another), or data
     meaning anything else (data is usually `binary' or non-printable).
     Exceptions are well-known file formats (core files, tar archives)
     that are known to contain binary data. When modifying magic files
     or the program itself, make sure to preserve these keywords. Users
     depend on knowing that all the readable files in a directory have
     the word `text' printed. Don't do as Berkeley did and change
     `shell commands text' to `shell script'.

     The filesystem tests are based on examining the return from a
     stat(2) system call. The program checks to see if the file is
     empty, or if it's some sort of special file. Any known file types
     appropriate to the system you are running on (sockets, symbolic
     links, or named pipes (FIFOs) on those systems that implement
     them) are intuited if they are defined in the system header file
     <sys/stat.h>.
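
     As an illustration only, the filesystem tests can be sketched in
     Python. This is not the code file itself uses; the function name
     filesystem_test and the strings it returns are assumptions made
     for the example.

```python
import os
import stat


def filesystem_test(path):
    """Classify path from lstat() alone.

    Returns a description for empty or special files, or None when the
    file is a non-empty regular file and later tests should run.
    The returned strings are illustrative, not file's real output.
    """
    info = os.lstat(path)  # lstat: examine a symlink itself, not its target
    mode = info.st_mode
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "fifo (named pipe)"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISBLK(mode):
        return "block special"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISREG(mode) and info.st_size == 0:
        return "empty"
    return None
```

     A None result here corresponds to falling through to the magic
     tests.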

     The magic tests are used to check for files with data in
     particular fixed formats. The canonical example of this is a
     binary executable (compiled program) a.out file, whose format is
     defined in <elf.h>, <a.out.h> and possibly <exec.h> in the
     standard include directory. These files have a `magic number'
     stored in a particular place near the beginning of the file that
     tells the UNIX operating system that the file is a binary
     executable, and which of several types thereof. The concept of a
     `magic' has been applied by extension to data files. Any file
     with some invariant identifier at a small fixed offset into the
     file can usually be described in this way. The information
     identifying these files is read from the compiled magic file
     c:/progra~1/file/share/misc/magic.mgc, or the files in the
     directory c:/progra~1/file/share/misc/magic if the compiled file
     does not exist. In addition, if $HOME/.magic.mgc or $HOME/.magic
     exists, it will be used in preference to the system magic files.
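
     The idea of an invariant identifier at a fixed offset can be
     sketched as follows. The table MAGIC_TABLE and the function
     magic_test are hypothetical and far simpler than a real magic
     file; only the three byte signatures are well-known facts.

```python
# Hypothetical in-memory table: (offset, expected bytes, description).
# A real magic file carries many more entries and richer test types.
MAGIC_TABLE = [
    (0, b"\x7fELF", "ELF"),
    (0, b"\x1f\x8b", "gzip compressed data"),
    (0, b"\x89PNG\r\n\x1a\n", "PNG image data"),
]


def magic_test(data):
    """Return the description of the first matching entry, or None."""
    for offset, expected, description in MAGIC_TABLE:
        # Compare the bytes at the fixed offset with the expected identifier.
        if data[offset:offset + len(expected)] == expected:
            return description
    return None
```

     Because the first matching entry wins, the order of the table
     matters in this sketch, just as the order of entries matters in
     a magic file.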

     If a file does not match any of the entries in the magic file, it
     is examined to see if it seems to be a text file. ASCII,
     ISO-8859-x, non-ISO 8-bit extended-ASCII character sets (such as
     those used on Macintosh and IBM PC systems), UTF-8-encoded
     Unicode, UTF-16-encoded Unicode, and EBCDIC character sets can be
     distinguished by the different ranges and sequences of bytes that
     constitute printable text in each set. If a file passes any of
     these tests, its character set is reported. ASCII, ISO-8859-x,
     UTF-8, and extended-ASCII files are identified as `text' because
     they will be mostly readable on nearly any terminal; UTF-16 and
     EBCDIC are only `character data' because, while they contain
     text, it is text that will require translation before it can be
     read. In addition, file will attempt to determine other
     characteristics of text-type files. If the lines of a file are
     terminated by CR, CRLF, or NEL, instead of the Unix-standard LF,
     this will be reported. Files that contain embedded escape
     sequences or overstriking will also be identified.
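
     A rough sketch of two of these checks, restricted to ASCII, is
     given below. The function names and the exact set of control
     characters accepted are assumptions for the example; NEL and the
     other character sets are deliberately left out.

```python
# Control characters tolerated in "text" for this sketch (an assumption):
# BEL, BS, TAB, LF, FF, CR, ESC.
ALLOWED_CONTROLS = {0x07, 0x08, 0x09, 0x0A, 0x0C, 0x0D, 0x1B}


def looks_like_ascii_text(data):
    """True if every byte is printable ASCII or a tolerated control."""
    return all(0x20 <= b < 0x7F or b in ALLOWED_CONTROLS for b in data)


def line_terminators(data):
    """Return the set of line terminators ("LF", "CR", "CRLF") in data."""
    found = set()
    i = 0
    while i < len(data):
        if data[i] == 0x0D:  # CR, possibly the first half of CRLF
            if i + 1 < len(data) and data[i + 1] == 0x0A:
                found.add("CRLF")
                i += 1  # skip the LF that belongs to this CRLF
            else:
                found.add("CR")
        elif data[i] == 0x0A:
            found.add("LF")
        i += 1
    return found
```

     A file whose terminator set is anything other than {"LF"} would
     be the kind of file for which the terminators are reported.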

     Once file has determined the character set used in a text-type
     file, it will attempt to determine in what language the file is
     written. The language tests look for particular strings (cf.
     <names.h>) that can appear anywhere in the first few blocks of a
     file. For example, the keyword .br indicates that the file is
     most likely a troff(1) input file, just as the keyword struct
     indicates a C program. These tests are less reliable than the
     previous two groups, so they are performed last. The language
     test routines also test for some miscellany (such as tar(1)
     archives).
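
     The keyword search can be pictured with the sketch below. The
     table KEYWORDS, the function language_test and the 8192-byte
     window standing in for "the first few blocks" are all assumptions
     made for the example; the real list lives in <names.h>.

```python
# Hypothetical keyword table built from the two examples in the text.
KEYWORDS = [
    (b".br", "troff input"),
    (b"struct", "C program"),
]


def language_test(data, window=8192):
    """Guess a language from keywords found in the first `window` bytes."""
    head = data[:window]
    for keyword, description in KEYWORDS:
        # A bare substring match: cheap, and easy to fool.
        if keyword in head:
            return description
    return None
```

     A plain substring match such as this one shows why these tests
     are the least reliable of the three groups: any text that
     happens to contain the word struct would be called a C program.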

     Any file that cannot be identified as having been written in any
     of the character sets listed above is simply said to be `data'.

OPTIONS
     -b, --brief
             Do not prepend filenames to output lines (brief mode).

     -c, --checking-printout
             Cause a checking printout of the parsed form of the magic
             file. This is usually used in conjunction with the -m
             flag to debug a new magic file before installing it.

     -C, --compile
             Write a magic.mgc output file that contains a pre-parsed
             version of the magic file or directory.

     -e, --exclude testname
             Exclude the test named in testname from the list of tests
             made to determine the file type. Valid test names are:

             apptype
                EMX application type (only on EMX).

             text
                Various types of text files (this test will try to
                guess the text encoding, irrespective of the setting
                of the `encoding' option).

             encoding
                Different text encodings for soft magic tests.

             tokens
                Looks for known tokens inside text files.

             cdf
                Prints details of Compound Document Files.

             compress
                Checks for, and looks inside, compressed files.

             elf
                Prints ELF file details.

             soft
                Consults magic files.

             tar
                Examines tar files.

     -f, --files-from namefile
             Read the names of the files to be examined from namefile
             (one per line) before the argument list. Either namefile
             or at least one filename argument must be present; to
             test the standard input, use `-' as a filename argument.

     -F, --separator separator
             Use the specified string as the separator between the
             filename and the file result returned. Defaults to `:'.

     -h, --no-dereference
             This option causes symlinks not to be followed (on
             systems that support symbolic links). This is the default
             if the environment variable POSIXLY_CORRECT is not
             defined.

     -i, --mime
             Causes the file command to output mime type strings
             rather than the more traditional human readable ones.
             Thus it may say `text/plain; charset=us-ascii' rather
             than `ASCII text'. In order for this option to work, file
             changes the way it handles files recognized by the
             command itself (such as many of the text file types,
             directories, etc.), and makes use of an alternative
             `magic' file. (See the FILES section, below.)

     --mime-type, --mime-encoding
             Like -i, but print only the specified element(s).

     -k, --keep-going
             Don't stop at the first match, keep going. Subsequent
             matches will have the string `\012- ' prepended. (If you
             want a newline, see the `-r' option.)

     -L, --dereference
             This option causes symlinks to be followed, as the
             like-named option in ls(1) (on systems that support
             symbolic links). This is the default if the environment
             variable POSIXLY_CORRECT is defined.

     -m, --magic-file list
             Specify an alternate list of files and directories
             containing magic. This can be a single item, or a
             colon-separated list. If a compiled magic file is found
             alongside a file or directory, it will be used instead.

     -n, --no-buffer
             Force stdout to be flushed after checking each file. This
             is only useful if checking a list of files. It is
             intended to be used by programs that want filetype output
             from a pipe.

     -N, --no-pad
             Don't pad filenames so that they align in the output.

     -p, --preserve-date
             On systems that support utime(2) or utimes(2), attempt to
             preserve the access time of files analyzed, to pretend
             that file never read them.

     -r, --raw
             Don't translate unprintable characters to \ooo. Normally
             file translates unprintable characters to their octal
             representation.
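
     The \ooo translation that -r switches off can be sketched as
     below. The function name escape_unprintable and the choice of
     which bytes count as printable (plain ASCII 0x20 to 0x7E) are
     assumptions for the example, not a description of file's own
     code.

```python
def escape_unprintable(data):
    """Render bytes as text, replacing unprintable bytes with \\ooo."""
    out = []
    for byte in data:
        if 0x20 <= byte < 0x7F:
            out.append(chr(byte))           # printable ASCII: keep as-is
        else:
            out.append("\\%03o" % byte)     # three-digit octal escape
    return "".join(out)
```

     Under this sketch a newline (octal 012) becomes the four
     characters \012, which is the form that appears in the string
     shown for the -k option.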

     -s, --special-files
             Normally, file only attempts to read and determine the
             type of argument files which stat(2) reports are ordinary
             files. This prevents problems, because reading special
             files may have peculiar consequences. Specifying the -s
             option causes file to also read argument files which are
             block or character special files. This is useful for
             determining the filesystem types of the data in raw disk
             partitions, which are block special files. This option
             also causes file to disregard the file size as reported
             by stat(2) since on some systems it reports a zero size
             for raw disk partitions.

     -v, --version
             Print the version of the program and exit.

     -z, --uncompress
             Try to look inside compressed files.

     -0, --print0
             Output a null character `\0' after the end of the
             filename. This makes it convenient to cut(1) the output.
             This does not affect the separator, which is still
             printed.
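
     One way a program might consume -0 output is sketched below. The
     record layout assumed here (filename, null character, separator,
     optional padding, description, newline) is inferred from the
     description above and has not been checked against real output;
     parse_print0 is a hypothetical helper.

```python
def parse_print0(output, separator=b":"):
    """Split assumed `file -0` output into (filename, description) pairs.

    Assumed record layout: name + NUL + separator + padding +
    description + newline. Filenames containing a newline are not
    handled by this sketch.
    """
    results = []
    for record in output.split(b"\n"):
        if not record:
            continue  # skip the empty piece after the final newline
        name, _, rest = record.partition(b"\0")
        if rest.startswith(separator):
            rest = rest[len(separator):]
        results.append((name, rest.strip()))
    return results
```

     The point of splitting on the null character is that a filename
     which itself contains the separator, such as a:b.txt, can still
     be told apart from its description.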

     --help  Print a help message and exit.

FILES
     c:/progra~1/file/share/misc/magic.mgc  Default compiled list of
                                            magic.
     c:/progra~1/file/share/misc/magic      Directory containing
                                            default magic files.

ENVIRONMENT
     The environment variable MAGIC can be used to set the default
     magic file name. If that variable is set, then file will not
     attempt to open $HOME/.magic. file adds `.mgc' to the value of
     this variable as appropriate. The environment variable
     POSIXLY_CORRECT controls (on systems that support symbolic links)
     whether file will attempt to follow symlinks or not. If set, then
     file follows symlinks; otherwise it does not. This is also
     controlled by the -L and -h options.

SEE ALSO
     magic(5), strings(1), od(1), hexdump(1), file(1posix)

STANDARDS CONFORMANCE
     This program is believed to exceed the System V Interface
     Definition of FILE(CMD), as near as one can determine from the
     vague language contained therein. Its behavior is mostly
     compatible with the System V program of the same name. This
     version knows more magic, however, so it will produce different
     (albeit more accurate) output in many cases.

     The one significant difference between this version and System V
     is that this version treats any white space as a delimiter, so
     that spaces in pattern strings must be escaped. For example,

           >10     string  language impress        (imPRESS data)

     in an existing magic file would have to be changed to

           >10     string  language\ impress       (imPRESS data)

     In addition, in this version, if a pattern string contains a
     backslash, it must be escaped. For example

           0       string          \begindata      Andrew Toolkit document

     in an existing magic file would have to be changed to

           0       string          \\begindata     Andrew Toolkit document
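
     The two escaping rules above can be applied mechanically when
     converting old pattern strings. The helper escape_magic_pattern
     below is a hypothetical sketch that handles only backslashes and
     space characters; other white space is not covered.

```python
def escape_magic_pattern(pattern):
    """Escape a System V style pattern string for this version of file.

    Backslashes are doubled first, so that the backslashes added in
    front of spaces are not themselves doubled afterwards.
    """
    escaped = pattern.replace("\\", "\\\\")
    return escaped.replace(" ", "\\ ")
```

     Applied to the two examples, this turns `language impress' into
     `language\ impress' and `\begindata' into `\\begindata'.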

     SunOS releases 3.2 and later from Sun Microsystems include a file
     command derived from the System V one, but with some extensions.
     My version differs from Sun's only in minor ways. It includes the
     extension of the `&' operator, used as, for example,

           >16     long&0x7fffffff >0              not stripped
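
     The effect of the `&' operator in the entry above can be sketched
     as a mask applied before the comparison. The function name
     masked_long_greater is hypothetical, and the sketch assumes a
     little-endian 4-byte value, whereas the byte order of a magic
     `long' depends on the machine.

```python
import struct


def masked_long_greater(data, offset, mask, threshold):
    """Read a 4-byte value at offset, AND it with mask, test > threshold.

    Little-endian byte order is an assumption made for this sketch.
    """
    (value,) = struct.unpack_from("<I", data, offset)
    return (value & mask) > threshold
```

     With offset 16, mask 0x7fffffff and threshold 0, this mirrors the
     test in the example entry: the top bit of the value is ignored
     and the remaining bits must be non-zero for `not stripped' to be
     printed.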

MAGIC DIRECTORY
     The magic file entries have been collected from various sources,
     mainly USENET, and contributed by various authors. Christos
     Zoulas (address below) will collect additional or corrected magic
     file entries. A consolidation of magic file entries will be
     distributed periodically.

     The order of entries in the magic file is significant. Depending
     on what system you are using, the order that they are put
     together may be incorrect. If your old file command uses a magic
     file, keep the old magic file around for comparison purposes
     (rename it to c:/progra~1/file/share/misc/magic.orig).

EXAMPLES
           $ file file.c file /dev/{wd0a,hda}
           file.c:    C program text
           file:      ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
                      dynamically linked (uses shared libs), stripped
           /dev/wd0a: block special (0/0)
           /dev/hda:  block special (3/0)

           $ file -s /dev/wd0{b,d}
           /dev/wd0b: data
           /dev/wd0d: x86 boot sector

           $ file -s /dev/hda{,1,2,3,4,5,6,7,8,9,10}
           /dev/hda:   x86 boot sector
           /dev/hda1:  Linux/i386 ext2 filesystem
           /dev/hda2:  x86 boot sector
           /dev/hda3:  x86 boot sector, extended partition table
           /dev/hda4:  Linux/i386 ext2 filesystem
           /dev/hda5:  Linux/i386 swap file
           /dev/hda6:  Linux/i386 swap file
           /dev/hda7:  Linux/i386 swap file
           /dev/hda8:  Linux/i386 swap file
           /dev/hda9:  empty
           /dev/hda10: empty

           $ file -i file.c file /dev/{wd0a,hda}
           file.c:    text/x-c
           file:      application/x-executable
           /dev/hda:  application/x-not-regular-file
           /dev/wd0a: application/x-not-regular-file

HISTORY
     There has been a file command in every UNIX since at least
     Research Version 4 (man page dated November, 1973). The System V
     version introduced one significant change: the external list of
     magic types. This slowed the program down slightly but made it a
     lot more flexible.

     This program, based on the System V version, was written by Ian
     Darwin <ian@darwinsys.com> without looking at anybody else's
     source code.

     John Gilmore revised the code extensively, making it better than
     the first version. Geoff Collyer found several inadequacies and
     provided some magic file entries. Contribution of the `&'
     operator by Rob McMahon, cudcv@warwick.ac.uk, 1989.

     Guy Harris, guy@netapp.com, made many changes from 1993 to the
     present.

     Primary development and maintenance from 1990 to the present by
     Christos Zoulas (christos@astron.com).

     Altered by Chris Lowth, chris@lowth.com, 2000: Handle the -i
     option to output mime type strings, using an alternative magic
     file and internal logic.

     Altered by Eric Fischer (enf@pobox.com), July, 2000, to identify
     character codes and attempt to identify the languages of
     non-ASCII files.

     Altered by Reuben Thomas (rrt@sc3d.org), 2007 to 2008, to improve
     MIME support and merge MIME and non-MIME magic, support
     directories as well as files of magic, apply many bug fixes and
     improve the build system.

     The list of contributors to the `magic' directory (magic files)
     is too long to include here. You know who you are; thank you.
     Many contributors are listed in the source files.

LEGAL NOTICE
     Copyright (c) Ian F. Darwin, Toronto, Canada, 1986-1999. Covered
     by the standard Berkeley Software Distribution copyright; see the
     file LEGAL.NOTICE in the source distribution.

     The files tar.h and is_tar.c were written by John Gilmore from
     his public-domain tar(1) program, and are not covered by the
     above license.

BUGS
     There must be a better way to automate the construction of the
     Magic file from all the glop in Magdir. What is it?

     file uses several algorithms that favor speed over accuracy, thus
     it can be misled about the contents of text files.

     The support for text files (primarily for programming languages)
     is simplistic, inefficient and requires recompilation to update.

     The list of keywords in ascmagic probably belongs in the Magic
     file. This could be done by using some keyword like `*' for the
     offset value.

     Complain about conflicts in the magic file entries. Make a rule
     that the magic entries sort based on file offset rather than
     position within the magic file?

     The program should provide a way to give an estimate of `how
     good' a guess is. We end up removing guesses (e.g. `From ' as
     first 5 chars of file) because they are not as good as other
     guesses (e.g. `Newsgroups:' versus `Return-Path:'). Still, if the
     others don't pan out, it should be possible to use the first
     guess.

     This manual page, and particularly this section, is too long.

RETURN CODE
     file returns 0 on success, and non-zero on error.

AVAILABILITY
     You can obtain the original author's latest version by anonymous
     FTP on ftp.astron.com in the directory /pub/file/file-X.YZ.tar.gz

BSD                            October 9, 2008                           BSD