How to split a file in Linux: split, merge
Linux offers quick, practical commands for everyday file tasks, including splitting a file into pieces and merging the pieces back together, each with a single command. This article shows how to do both.
How to split a file in Linux?
To split a large file into smaller ones in Linux, use the split command. Its synopsis and options are:
split [OPTION]... [FILE [PREFIX]]
- -a, --suffix-length=N: generate suffixes of length N (default 2)
- --additional-suffix=SUFFIX: append an additional SUFFIX to file names
- -b, --bytes=SIZE: put SIZE bytes per output file
- -C, --line-bytes=SIZE: put at most SIZE bytes of records per output file
- -d: use numeric suffixes starting at 0, not alphabetic
- --numeric-suffixes[=FROM]: same as -d, but allow setting the start value
- -x: use hex suffixes starting at 0, not alphabetic
- --hex-suffixes[=FROM]: same as -x, but allow setting the start value
- -e, --elide-empty-files: do not generate empty output files with '-n'
- --filter=COMMAND: write to shell COMMAND; filename is $FILE
- -l, --lines=NUMBER: put NUMBER lines/records per output file
- -n, --number=CHUNKS: generate CHUNKS output files; see explanation below
- -t, --separator=SEP: use SEP instead of a newline as the record separator; '\0' (zero) specifies the NUL character
- -u, --unbuffered: immediately copy input to output with '-n r/...'
- --verbose: print a diagnostic just before each output file is opened
- --help: display this help and exit
- --version: output version information and exit
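Several of these options combine naturally when you want predictable file names. The following is a minimal sketch, assuming GNU coreutils split; the file name log.txt and the prefix log_part_ are hypothetical and chosen only for illustration.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Build a 25-line sample file (hypothetical name).
seq 1 25 > log.txt

# 10 lines per piece, numeric suffixes (-d), suffixes 3 characters wide (-a 3),
# and ".txt" appended to every piece name.
split -l 10 -d -a 3 --additional-suffix=.txt log.txt log_part_

# Creates log_part_000.txt, log_part_001.txt and log_part_002.txt.
ls log_part_*
```

The last piece holds the remaining 5 lines, since 25 is not a multiple of 10.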
The SIZE argument is an integer and optional unit (example: 10K is 10*1024).
Units are K,M,G,T,P,E,Z,Y (powers of 1024) or KB,MB,... (powers of 1000).
Binary prefixes can be used, too: KiB=K, MiB=M, and so on.
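The difference between the two unit families is easiest to see on a small file. This sketch assumes GNU coreutils (split and head -c); data.bin and the prefixes bin_ and dec_ are hypothetical names.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# 3000 bytes of zeros as sample input (hypothetical file name).
head -c 3000 /dev/zero > data.bin

# 1K means 1024 bytes: pieces of 1024, 1024 and 952 bytes.
split -b 1K data.bin bin_

# 1KB means 1000 bytes: three pieces of exactly 1000 bytes.
split -b 1KB data.bin dec_

# Print the size in bytes of every piece.
wc -c bin_* dec_*
```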
CHUNKS may be:
- N: split into N files based on the size of the input
- K/N: output Kth of N to stdout
- l/N: split into N files without splitting lines/records
- l/K/N: output Kth of N to stdout without splitting lines/records
- r/N: like 'l' but use round-robin distribution
- r/K/N: likewise but only output Kth of N to stdout
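The CHUNKS forms above can be sketched on a ten-line file. This example assumes GNU coreutils split, since -n is not part of every split implementation; numbers.txt and the prefixes part_ and rr_ are hypothetical.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Ten lines containing the numbers 1 to 10.
seq 1 10 > numbers.txt

# l/3: three pieces, never cutting a line in half.
split -n l/3 numbers.txt part_

# r/2: round-robin, so consecutive lines alternate between the two pieces.
split -n r/2 numbers.txt rr_
cat rr_aa    # lines 1, 3, 5, 7, 9

# r/2/2: send only the second round-robin piece to stdout; no files are written.
second=$(split -n r/2/2 numbers.txt)
echo "$second"    # lines 2, 4, 6, 8, 10
```

The K/N forms are handy when a single piece is needed in a pipeline and writing every piece to disk would be wasteful.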
Examples
The OPTION parameters set the rules for splitting the file, and PREFIX sets the start of each output file name (the default prefix is x).
- split -l 500 myFile: It will split your myFile into smaller files with a maximum of 500 lines each (named xaa, xab, xac, etc.).
- split -l 500 myFile mySplittedFile: It will split your myFile into smaller files with a maximum of 500 lines each (named mySplittedFileaa, mySplittedFileab, mySplittedFileac, etc.). The suffix is appended directly to the prefix, with no dot in between.
- split -b 40k myFile mySplittedFile: It will split your myFile into smaller files with a maximum size of 40 KiB (40*1024 bytes) each (named mySplittedFileaa, mySplittedFileab, mySplittedFileac, etc.).
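To check the naming and the line counts described in these examples, here is a small sketch you can run in a scratch directory. The 1200-line input and the prefix chunk_ are assumptions made for illustration.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# A 1200-line sample file.
seq 1 1200 > myFile

# No prefix given: pieces are named xaa, xab, xac.
split -l 500 myFile

# With a prefix, the suffix is appended directly: chunk_aa, chunk_ab, chunk_ac.
split -l 500 myFile chunk_

# The first two pieces have 500 lines each, the last one the remaining 200.
wc -l xaa xab xac
```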
How to merge or recover files in Linux?
You can recover the original file by concatenating the pieces, in order, with the cat command. A word of warning: the >> operator appends to its target, so if myMergedOriginalFile already exists and contains data, the pieces are added after that data and the result will not match the original. Use > instead, which overwrites any existing content:
- cat mySplittedFileaa mySplittedFileab > myMergedOriginalFile
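After merging, it is worth confirming that the result matches the original byte for byte. The following sketch shows a full round trip under stated assumptions: original.txt, merged.txt and the prefix piece_ are hypothetical names, and the glob piece_* is assumed to expand in the same order in which split wrote the pieces, which holds for the default lowercase suffixes.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Create a sample file and split it into 500-line pieces.
seq 1 1200 > original.txt
split -l 500 original.txt piece_

# The shell expands piece_* in sorted order (piece_aa, piece_ab, piece_ac).
# ">" overwrites merged.txt if it already exists.
cat piece_* > merged.txt

# cmp -s prints nothing and exits 0 when both files are identical.
if cmp -s original.txt merged.txt; then
  result="identical"
else
  result="different"
fi
echo "$result"
```

For large files, comparing checksums (for example with sha256sum) is a common alternative to cmp when the original is on another machine.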