
Wednesday 11 September 2013

HDFS Commands



hadoop fs -cmd <args>

where cmd is the specific file command and <args> is a variable number of arguments.

Parameters inside brackets ([]) are optional, and an ellipsis (...) means the optional parameter can be repeated.
FILE is for filenames whereas PATH can be either filenames or directory names. 
SRC and DST are path names but they function specifically as source and destination, respectively.
LOCALSRC and LOCALDST are further required to be on the local file system.


COMMANDS
USAGE & DESCRIPTION
cat
hadoop fs -cat FILE [FILE ...]
Displays the files’ content. To read compressed files, use the text command instead.
chgrp
hadoop fs -chgrp [-R] GROUP PATH [PATH ...]
Changes the group association for files and directories. The -R option applies the change recursively. The user must be the files’ owner or a superuser. See section 8.3 for more background information on the HDFS file permission system.
chmod
hadoop fs -chmod [-R] MODE[,MODE ...] PATH [PATH ...]
Changes the permissions of files and directories. Similar to its Unix equivalent, MODE can be a 3-digit octal mode, or {augo}+/-{rwxX}. The -R option applies the change recursively. The user must be the files’ owner or a superuser. See section 8.3 for more background information on the HDFS file permission system.
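Because MODE follows the Unix conventions, the two notations can be tried on a local file first. The sketch below uses the local Unix chmod on a temporary file, not HDFS; the HDFS path in the final comment is hypothetical.

```shell
# Local illustration only: Unix chmod on a temporary file, not an HDFS call.
f=$(mktemp)
chmod 640 "$f"                      # octal mode: rw- r-- ---
chmod g+w,o+r "$f"                  # symbolic modes, comma-separated: rw- rw- r--
perms=$(ls -l "$f" | cut -c1-10)    # keep only the type and permission columns
echo "$perms"
rm -f "$f"
# HDFS analogue (hypothetical path):
#   hadoop fs -chmod g+w,o+r /user/alice/data.txt
```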
chown
hadoop fs -chown [-R] [OWNER][:[GROUP]] PATH [PATH ...]
Changes the ownership of files and directories. The -R option applies the change recursively. The user must be a superuser. See section 8.3 for more background information on the HDFS file permission system.
copyFromLocal
hadoop fs -copyFromLocal LOCALSRC [LOCALSRC ...] DST
Identical to put (copy files from the local file system).
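copyToLocal
hadoop fs -copyToLocal [-ignorecrc] [-crc] SRC [SRC ...] LOCALDST
Identical to get (copy files to the local file system).
count
hadoop fs -count [-q] PATH [PATH ...]
Displays the number of subdirectories, number of files, number of bytes used, and name for all files/directories identified by PATH. The -q option displays quota information.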
cp
hadoop fs -cp SRC [SRC ...] DST
Copies files from source to destination. If multiple source files are specified, destination has to be a directory.
du
hadoop fs -du PATH [PATH ...]
Displays file sizes. If PATH is a directory, the size of each file in the directory is reported. Filenames are stated with the full URI protocol prefix. Note that although du stands for disk usage, it should not be taken literally, as disk usage depends on block size and replication factor.
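To see why the reported size differs from real disk usage, a rough back-of-the-envelope calculation helps. The numbers below (a 200 MB file, 64 MB blocks, replication factor 3) are assumptions chosen for illustration, and the estimate ignores checksum and metadata overhead.

```shell
# All three values are illustrative assumptions, not read from a cluster.
size=$((200 * 1024 * 1024))     # logical file size in bytes, as du would report it
block=$((64 * 1024 * 1024))     # block size in bytes
rep=3                           # replication factor

blocks=$(( (size + block - 1) / block ))   # number of blocks, rounded up
raw=$(( size * rep ))                      # approximate bytes stored across the cluster

echo "blocks=$blocks raw_bytes=$raw"
```

The last block of a file is not padded to the full block size, so the raw footprint is roughly the logical size multiplied by the replication factor.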
dus
hadoop fs -dus PATH [PATH ...]
Similar to du, but for a directory, dus reports the sum of file sizes in aggregate rather than individually.
expunge
hadoop fs -expunge
Empties the trash. If the trash feature is enabled, when a file is deleted, it is first moved into the temporary .Trash/ folder. The file will be permanently deleted from the .Trash/ folder only after a user-configurable delay. The expunge command forcefully deletes all files from the .Trash/ folder. Note that as long as a file is in the .Trash/ folder, it can be restored by moving it back to its original location.
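Restoring a file from the trash is an ordinary mv. The helper below is a hypothetical sketch: the name restore_from_trash is made up, and it assumes the common trash layout /user/<name>/.Trash/Current/<original path>, which may differ on a given cluster.

```shell
# Hypothetical helper, not part of Hadoop.
# Assumes deleted files live under /user/<name>/.Trash/Current/<original path>.
restore_from_trash() {
  orig=$1                      # absolute HDFS path the file had before deletion
  user=${2:-$(id -un)}         # trash owner; defaults to the current local user
  hadoop fs -mv "/user/${user}/.Trash/Current${orig}" "${orig}"
}
```

A call such as restore_from_trash /data/report.txt alice would move the file back to /data/report.txt, provided it has not yet been expunged.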
get
hadoop fs -get [-ignorecrc] [-crc] SRC [SRC ...] LOCALDST
Copies files to the local filesystem. If multiple source files are specified, local destination has to be a directory. If LOCALDST is -, the files are copied to stdout.
HDFS computes a checksum for each block of each file. The checksums for a file are stored separately in a hidden file. When a file is read from HDFS, the checksums in that hidden file are used to verify the file’s integrity. For the get command, the -crc option will copy that hidden checksum file. The -ignorecrc option will skip the checksum checking when copying.
getmerge
hadoop fs -getmerge SRC [SRC ...] LOCALDST [addnl]
Retrieves all files identified by SRC, merges them, and writes the single merged file to LOCALDST in the local filesystem. The option addnl will add a newline character to the end of each file.
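The effect of addnl is easiest to see on part files that lack a trailing newline. The sketch below is a local imitation using cat on temporary files, not a call to getmerge; the file names and contents are made up.

```shell
# Local imitation of a merge with addnl; no HDFS involved.
dir=$(mktemp -d)
printf 'alpha' > "$dir/part-00000"    # no trailing newline
printf 'beta'  > "$dir/part-00001"    # no trailing newline

merged="$dir/merged.txt"
for f in "$dir"/part-*; do
  cat "$f"
  echo                                # the newline that addnl would append
done > "$merged"

lines=$(wc -l < "$merged" | tr -d ' ')
cat "$merged"
rm -rf "$dir"
```

Without the extra newline, the two records would run together on a single line in the merged file.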
help
hadoop fs -help [CMD]
Displays usage information for the command CMD. If CMD is omitted, it displays usage information for all commands.
ls
hadoop fs -ls PATH [PATH ...]
Lists files and directories. Each entry shows name, permissions, owner, group, size, and modification date. File entries also show their replication factor.
lsr
hadoop fs -lsr PATH [PATH ...]
Recursive version of ls.
mkdir
hadoop fs -mkdir PATH [PATH ...]
Creates directories. Any missing parent directories are also created (like Unix mkdir -p).
moveFromLocal
hadoop fs -moveFromLocal LOCALSRC [LOCALSRC ...] DST
Similar to put, except the local source is deleted after it’s been successfully copied to HDFS.
moveToLocal
hadoop fs -moveToLocal [-crc] SRC [SRC ...] LOCALDST
Displays a “not implemented yet” message.
mv
hadoop fs -mv SRC [SRC ...] DST
Moves files from source(s) to destination. If multiple source files are specified, destination has to be a directory. Moving across filesystems is not permitted.
put
hadoop fs -put LOCALSRC [LOCALSRC ...] DST
Copies files or directories from the local system to the destination filesystem. If LOCALSRC is set to -, input is read from stdin and DST must be a file.
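Reading from stdin makes it possible to write generated text to HDFS without a temporary local file. The helper name put_text below is hypothetical; it simply pipes a string into put with - as the source.

```shell
# Hypothetical helper, not part of Hadoop: write one line of text to an HDFS file.
put_text() {
  text=$1      # content to write
  dst=$2       # destination HDFS file (must be a file, not a directory)
  printf '%s\n' "$text" | hadoop fs -put - "$dst"
}
```

For example, put_text "hello" /user/alice/note.txt would create note.txt containing a single line.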
rm
hadoop fs -rm PATH [PATH ...]
Deletes files and empty directories.
rmr
hadoop fs -rmr PATH [PATH ...]
Recursive version of rm.
setrep
hadoop fs -setrep [-R] [-w] REP PATH [PATH ...]
Sets the target replication factor to REP for the given files. The -R option recursively applies the target replication factor to files in directories identified by PATH. The actual replication takes some time to reach the target. The -w option waits until the replication factor matches the target.
stat
hadoop fs -stat [FORMAT] PATH [PATH ...]
Displays “statistical” information on files. The FORMAT string is printed exactly, but with the following format specifiers replaced:
%b  File size in bytes
%F  The string “directory” or “regular file”, depending on file type
%n  Filename
%o  Block size
%r  Replication
%y  UTC date in yyyy-MM-dd HH:mm:ss format
%Y  Milliseconds since January 1, 1970 UTC
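The substitution works much like a template: literal text is kept and each specifier is replaced. The sketch below imitates that locally with sed and made-up values for the filename, block size, and replication; it does not query HDFS.

```shell
# Local imitation of how a FORMAT string is expanded; all values are made up.
format='%n: block size %o, replication %r'
name='data.txt'
blocksize=67108864
rep=3

out=$(printf '%s\n' "$format" |
  sed -e "s/%n/$name/g" -e "s/%o/$blocksize/g" -e "s/%r/$rep/g")
echo "$out"
# Against a real cluster the same format would be passed as (hypothetical path):
#   hadoop fs -stat '%n: block size %o, replication %r' /user/alice/data.txt
```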
tail
hadoop fs -tail [-f] FILE
Displays the last one kilobyte of FILE. The -f option keeps printing data as it is appended to the file.
test
hadoop fs -test -[ezd] PATH
Performs one of the following type checks on PATH:
-e  Returns 0 if PATH exists.
-z  Returns 0 if PATH is a file of length 0.
-d  Returns 0 if PATH is a directory.
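Since the result is reported through the exit status, test fits naturally into shell conditionals. The helper below is a hypothetical sketch (the name ensure_hdfs_dir is made up) that creates a directory only when the -d check fails.

```shell
# Hypothetical helper, not part of Hadoop: create DIR on HDFS unless it already exists.
ensure_hdfs_dir() {
  dir=$1
  if hadoop fs -test -d "$dir"; then      # exit status 0 means DIR is a directory
    echo "exists: $dir"
  else
    hadoop fs -mkdir "$dir" && echo "created: $dir"
  fi
}
```

A call such as ensure_hdfs_dir /user/alice/input can then be repeated safely at the start of a job script.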
text
hadoop fs -text FILE [FILE ...]
Displays the textual content of files. Identical to cat if the files are text files. Files in a known compressed format (gzip and Hadoop’s binary sequence file format) are uncompressed first.
touchz
hadoop fs -touchz FILE [FILE ...]
Creates files of length 0. Fails if files already exist and have nonzero length.




