SPLIT/UNSPLIT



Author: Dan Mares, dmares @ maresware . com
Portions Copyright © 1998-2021 by Dan Mares and Mares and Company, LLC
Phone: 678-427-3275
Last updated: 8/10/2008

One liner: Splits large fixed length record files into manageable pieces.



PURPOSE

As of 8/2008 the operation of split has been modified. Consult the split command help screen for up-to-date operation.

Splits a file into equal parts. This is useful when you have a large data file and need to split it into pieces of 64000 lines so that each piece imports cleanly into older versions of Excel (which are limited to 65,536 rows per worksheet).

Splits a file into equal parts based on a fixed record length file format.

UNSPLIT (no longer supported) will take the pieces off the floppies and place them back into a single file on the hard drive. Unsplit reverses the action of split.



OPERATION

SPLIT will split a file into smaller pieces for easier manipulation and testing of its pieces.

It is similar to the basic FILSPLIT program, but the splitting is automatic, so you do not have to retype the command line for each section you create.

One of the options lets you split a large text (or tab-delimited, TSV) file into smaller pieces (generally 60000 lines each), which can then be easily loaded into Excel.
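The line-based split described above can be sketched in Python. This is only an illustration of the idea, not the SPLIT program's own code: the function name split_lines is hypothetical, and the output naming (input name with the extension changed to 001, 002, ...) follows the description in the COMMAND LINE section.

```python
from pathlib import Path


def split_lines(input_path, lines_per_file):
    """Copy input_path into pieces of at most lines_per_file lines each.

    Pieces are named like the input with the extension replaced by
    001, 002, ... Returns the list of piece paths in order.
    """
    if lines_per_file <= 0:
        raise ValueError("lines_per_file must be positive")
    src = Path(input_path)
    pieces = []
    out = None
    # Binary mode keeps the original line endings untouched.
    with src.open("rb") as f:
        for index, line in enumerate(f):
            if index % lines_per_file == 0:
                # Start a new piece every lines_per_file lines.
                if out is not None:
                    out.close()
                piece = src.with_suffix(f".{len(pieces) + 1:03d}")
                pieces.append(piece)
                out = piece.open("wb")
            out.write(line)
    if out is not None:
        out.close()
    return pieces
```

For example, split_lines("inputfile.txt", 64000) would write inputfile.001, inputfile.002 and so on, with the last piece holding whatever lines remain.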

If you have a fixed length record file, you can supply the record length to ensure that each split ends on a record boundary. This allows the program to split the file cleanly into pieces that hold only whole records (no record is split across two pieces).

The record length is used only to ensure that if you want to use any one piece (for example, one disk) on its own, it contains only complete records.
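A minimal Python sketch of splitting on record boundaries follows. It is an illustration under stated assumptions, not the program's actual implementation: split_fixed is a hypothetical name, and how the real SPLIT treats a trailing partial record is not documented here, so the sketch simply writes any leftover bytes into the last piece.

```python
from pathlib import Path


def split_fixed(input_path, record_length, records_per_file):
    """Split a fixed length record file so every piece ends on a record boundary.

    Each piece holds records_per_file records of record_length bytes,
    except possibly the last, which holds the remainder.
    """
    if record_length <= 0 or records_per_file <= 0:
        raise ValueError("record_length and records_per_file must be positive")
    src = Path(input_path)
    # Reading a whole multiple of the record length keeps records intact.
    chunk_size = record_length * records_per_file
    pieces = []
    with src.open("rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            piece = src.with_suffix(f".{len(pieces) + 1:03d}")
            piece.write_bytes(chunk)
            pieces.append(piece)
    return pieces
```

With a record length of 80 and 10000 records per piece, every full piece is exactly 800,000 bytes, so each one can be processed as a standalone fixed length record file.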


UNSPLIT OPERATION

NO LONGER SUPPORTED:


Following are the UNSPLIT routines:

Unsplit can be used as a simple pack program to pack files together one after the other.

It is designed to be used with the outputs from the split program, but will do other things also.

Since split sequentially numbers the output file extensions from 001 to xxx, the unsplit routine looks for files with the same filename and extensions from 001 to xxx.

If your output files do not have sequential extensions, you will have to rename them so that they do before running unsplit.
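Because UNSPLIT is no longer supported, the same packing can be approximated with a short script. The Python sketch below is an assumption-laden illustration, not the original routine: the function name unsplit and its arguments are hypothetical, and it stops at the first missing number in the 001, 002, ... sequence.

```python
import shutil
from pathlib import Path


def unsplit(original_name, output_path):
    """Append original_name's pieces (.001, .002, ...) into output_path.

    Stops at the first missing extension in the sequence and
    returns the number of pieces that were joined.
    """
    base = Path(original_name)
    count = 0
    with open(output_path, "wb") as out:
        while True:
            piece = base.with_suffix(f".{count + 1:03d}")
            if not piece.exists():
                break
            # Copy the piece onto the end of the output file.
            with piece.open("rb") as f:
                shutil.copyfileobj(f, out)
            count += 1
    return count
```

A gap in the numbering (for example 001, 002, 004) ends the join after 002, which is why the pieces need sequential extensions first.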



COMMAND LINE

C:>split_1 input [-r recl] [-l number of records/lines per file]

C:>split_1    input    -r 80  -l 10000 (use -r only for fixed length records)

C:>split  inputfile  -l 64000

The output files are sequentially numbered from 001 to XXX to designate which piece each one is, and each contains the number of lines given with -l.

The filename of the output is the filename of the input with the extension changed.
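The naming rule can be illustrated with a short Python sketch. The helper piece_name is hypothetical and only mirrors the rule as described here (input filename, extension replaced by a three-digit sequence number).

```python
from pathlib import Path


def piece_name(input_name, number):
    """Return the output name for piece `number` of `input_name`."""
    # Replace the extension with a zero-padded three-digit counter.
    return str(Path(input_name).with_suffix(f".{number:03d}"))


print(piece_name("data.txt", 1))     # data.001
print(piece_name("report.tsv", 12))  # report.012
```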




OPTIONS

-r  recl:  Record length of the input records. This should be the actual record length. DO NOT USE this option if you are splitting text files.

-l + number of records:  If the file has fixed length records, this many records are written to each output file. If the file is a text file, this many lines are written to each output file.
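To estimate how many output files a given -r and -l combination will produce, the arithmetic can be sketched in Python. The function piece_count is hypothetical, and it assumes the last piece simply holds the remaining records; the actual program may treat odd-sized files differently.

```python
def piece_count(file_size, record_length, records_per_file):
    """Return (total_records, pieces) for a fixed length record file."""
    total_records, leftover = divmod(file_size, record_length)
    if leftover:
        raise ValueError("file size is not a multiple of the record length")
    # Ceiling division: a partly filled last piece still counts as a piece.
    pieces = -(-total_records // records_per_file)
    return total_records, pieces


# A 2,000,000 byte file with -r 80 -l 10000:
print(piece_count(2_000_000, 80, 10000))  # (25000, 3)
```

Here the 25000 records go into two pieces of 10000 records and a third piece of 5000.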



RELATED PROGRAMS

FILSPLIT
