SS

PURPOSE   OPERATION   OPTIONS   COMMAND LINES   RELATED PROGRAMS


Author: Dan Mares, dmares @ maresware . com
Portions Copyright © 1998-2021 by Dan Mares and Mares and Company, LLC
Phone: 678-427-3275

PURPOSE

SS is designed to search a disk at the physical level to determine if specific strings are found anywhere on the disk.

It can also search floppy disks in the A: or B: drive. Linux (ext2) and Macintosh floppy formats are supported.

SS searches the entire physical disk. As it finds the strings, it writes information to an output file identifying the physical cylinder (track), head, and sector (CHS), the logical LBA sector number, and the byte offset of the string within the sector.

These physical locators (CHS) are provided for compatibility with other programs. Clusters are not identified because NT and Linux do not use cluster notation.
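The relationship between the locators in a hit record can be sketched with the conventional LBA-to-CHS arithmetic. This is only an illustration, not the SS source: the function names are hypothetical, and the geometry (255 heads, 63 sectors per track, 512-byte sectors) is an assumed default that a real drive may not share.

```python
# Illustrative sketch only; names and default geometry are assumptions.

def lba_to_chs(lba, heads_per_cylinder=255, sectors_per_track=63):
    """Convert a logical sector number to (cylinder, head, sector).

    Sectors are numbered from 1 within a track, as is conventional for CHS.
    """
    cylinder, rest = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(rest, sectors_per_track)
    return cylinder, head, sector_index + 1


def locate(byte_offset, sector_size=512):
    """Return (cylinder, head, sector, lba, byte_in_sector) for a disk offset."""
    lba, byte_in_sector = divmod(byte_offset, sector_size)
    return (*lba_to_chs(lba), lba, byte_in_sector)
```

With the assumed geometry, `locate(512 * 63 + 5)` describes a hit 5 bytes into LBA sector 63, which is the first sector under head 1 of cylinder 0.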


OPERATION

SS searches large blocks of sectors (entire tracks) of the disk for the strings in the list provided (-s option). When a hit is found, the cylinder, head, sector (CHS) and logical LBA sector where the data was found are written to the output file. In addition, a default of 80 characters of surrounding text is placed in the output record. This allows the reviewer to determine whether a further examination of the data is necessary. The amount of surrounding text can be adjusted with the -m option, which is identical to the one found in the strsrch program. In fact, the output line looks nearly identical in both programs, which makes merging the output very easy.
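A rough picture of how a hit and its surrounding text could be turned into an output record is sketched below. It is not the SS implementation: the function names are hypothetical, the block is assumed to be already in memory, and the alignment letters and dot substitution follow the descriptions of the -m and -b options under OPTIONS.

```python
# Illustrative sketch only; function names are assumptions, not SS internals.

def find_hits(block: bytes, strings):
    """Return sorted (offset, string) pairs, matching without regard to case."""
    haystack = block.lower()
    hits = []
    for s in strings:
        needle = s.lower()
        pos = haystack.find(needle)
        while pos != -1:
            hits.append((pos, s))
            pos = haystack.find(needle, pos + 1)
    return sorted(hits)


def context_window(block: bytes, start: int, length: int,
                   width: int = 80, align: str = "C") -> bytes:
    """Cut `width` bytes around a hit, placing it Left, Right or Center."""
    if align == "L":
        begin = start
    elif align == "R":
        begin = start + length - width
    else:
        begin = start - (width - length) // 2
    # Keep the window inside the block when the hit is near either end.
    begin = max(0, min(begin, len(block) - width))
    return block[begin:begin + width]


def printable(record: bytes) -> str:
    """Replace unprintable bytes with a dot, as in the default output."""
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in record)
```

Skipping the `printable` step corresponds to keeping the binary data intact, which is what the -b option is described as doing.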

The advantage of SS over other string searching software is that you can provide a file containing any number of strings to search for. Other programs either limit the number of strings or, if there is no limit, slow down exponentially as strings are added. SS doesn't have these limitations. Also, this program runs through to completion without any operator intervention, whereas some other software packages stop after each hit is made, which can be very time consuming.

The idea behind this program is that you start it running, and come back later to see the listing of the results. This is especially useful when dealing with large hard disks, which may take hours to search.

Another slight advantage of these programs is that they read an entire "block" at a time. Because a group of contiguous sectors is read together, there is less chance of missing a string that is split across two sectors on the disk.

If the argument you provide on the command line with the -s (string file) option is not a file, the program assumes that this one word is the string you are looking for. Again, this is similar to the strsrch program -s option.
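The file-or-literal behavior of -s can be sketched as follows. This is a minimal illustration under assumptions: the function name is hypothetical, and the handling of blank lines and line endings is a guess, not documented SS behavior.

```python
# Illustrative sketch only; blank-line and line-ending handling are assumptions.
import os


def load_search_strings(arg: str):
    """Return the list of byte strings to search for.

    If `arg` names an existing file, read one string per line;
    otherwise treat `arg` itself as the single search string.
    """
    if os.path.isfile(arg):
        with open(arg, "rb") as fh:
            lines = (line.strip(b"\r\n") for line in fh)
            return [line for line in lines if line]
    return [arg.encode()]
```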

The SS program can also (-g option) check only the first bytes of each sector for the signatures. This check is useful when the string file contains the headers associated with file types. Since file headers are generally presumed to be in the first X bytes of a file, and files begin at a sector boundary, the output should list all the starting sectors of files whose headers match the search strings. This is very dependent on an accurate string file being provided to the program. The program can also identify directories by searching for a string of "._______", that is, a period followed by 7 spaces, since the first sector of a FAT32 subdirectory starts with this sequence.
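The idea of checking only the start of each sector can be sketched as below. It is an illustration, not the SS code: the function name is hypothetical, and since the number of leading bytes examined ("X") is not stated above, `prefix_len=16` is an arbitrary assumption, as is the 512-byte sector size.

```python
# Illustrative sketch only; prefix_len and sector_size are assumed values.

def sectors_with_header(data: bytes, signatures, sector_size=512, prefix_len=16):
    """Return (sector_number, signature) for sectors whose first bytes match."""
    hits = []
    for start in range(0, len(data), sector_size):
        prefix = data[start:start + prefix_len]
        for sig in signatures:
            if sig in prefix:
                hits.append((start // sector_size, sig))
    return hits


# Example signatures: the JPEG start-of-image marker, and the
# period-plus-7-spaces directory string described above.
JPEG = b"\xff\xd8\xff"
DOT_ENTRY = b"." + b" " * 7
```

A signature that appears in the middle of a sector is ignored, which is what makes this search both faster and more selective than a full scan.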


EXCLUDE WHITE SPACE & UNICODE

Another enhancement is the -c option (Compress spaces). With it you can search for a string of words (a phrase) and be certain that spaces and carriage returns within the string are ignored. As an example, suppose we are looking for the words:

purchases  of  personal  computers

This text string has 4 words in it. The chance that an extra space, tab, or carriage ¶
return is contained within the words is possible but not likely. However, CANYOU   BE ¶
CERTAIN?

It is quite possible that a carriage return will separate the string. If that were true, then a search for the text as typed would not yield any results.

SS handles these possible anomalies easily with the -c option. Technically, it works like this: when the track of data is read into the computer's memory, all text characters are compressed and the spaces (including hex 00's from unicode) are squeezed out. Now the string looks like

purchasesofpersonalcomputers

It becomes one very long string, which the program can find. The drawback to this search is that a character string that was located in sector 9 could conceivably end up looking like it is in sector 1, because all the spaces are squeezed out. The disk is not altered, but the data in memory is reorganized, so the printed hit list is now slightly in error. The track and head indicated are still accurate, but the sector can be in error by ± 63 sectors, depending on how many spaces were squeezed out.
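The squeeze and its side effect on the reported location can be sketched as follows. This is an assumption-laden illustration, not the SS source: the function names are hypothetical, and the set of bytes removed (space, tab, carriage return, newline, hex 00) is inferred from the description above.

```python
# Illustrative sketch only; the exact set of squeezed bytes is an assumption.
import re

_FILLER = re.compile(rb"[\x00 \t\r\n]+")


def squeeze(block: bytes) -> bytes:
    """Remove spaces, tabs, line breaks and hex 00 bytes from a buffer."""
    return _FILLER.sub(b"", block)


def find_squeezed(block: bytes, phrase: bytes, sector_size: int = 512):
    """Search the squeezed buffer for the squeezed phrase, ignoring case.

    Returns (offset_in_squeezed_buffer, apparent_sector) or None.
    The apparent sector can be lower than the true one.
    """
    pos = squeeze(block).lower().find(squeeze(phrase).lower())
    return None if pos == -1 else (pos, pos // sector_size)
```

For example, text stored as two bytes per character with hex 00 padding, such as `"computers".encode("utf-16-le")`, squeezes down to plain `computers`. A phrase that sits after 600 spaces, and therefore in the second sector of the raw block, is reported at offset 0 of the squeezed buffer, which is the kind of sector error described above.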

If the -c option is chosen, a message is written to the output file stating that the sector listing is not accurate. You will know what track and head to look at, but not exactly what sector. You can then manually examine the area with another program to locate the actual sector. What you have gained is the knowledge that the string is there.

The only mandatory requirement when using the -c option is that the file containing the strings must list the strings WITHOUT spaces, just as the program will search for them. There is STILL ONLY ONE STRING PER LINE, but no spaces within each string.

Many current programs write their output in UNICODE format. This is a technical format, but suffice it to say that if a file was written using this format, there are few if any string search programs that will find strings in it. (The registry and security files are also written in UNICODE format.) The -c option is one way to search these files.

In the hits file, there is also a record kept of the date and time that the program is being run. This cannot be turned off. But you can later use a word processor to delete these entries.

It is suggested that the user test the program and become completely familiar with its operation before putting it into investigative use. Also, set up some test disks, so you know exactly what to expect as output.

As always, if you find the program needs enhancements, or doesn’t operate according to these instructions, let me know. If there is a problem in the SS version, please try to send me the disk containing the information that it is not properly reporting.

In effect, if you eliminate the white space, you are generally searching for UNICODE characters as a side benefit.


OPTIONS

-a:  Append any output to the already existing file identified in the -o option. [APPEND=[ON|OFF]]

-b:  So that it can be viewed and printed easily, the output record (surrounding characters) normally substitutes the traditional dot (.) for unprintable ASCII characters. However, in some cases (i.e., if you are looking for binary data and need it maintained) the user may want to preserve this binary (unprintable) data in the output record for future processing. This -b (binary) option preserves all the surrounding text in the output record exactly as it was in the original location. The user should be cautioned that if the print command is given on the output file, and there is binary data in the records, the printer may have a heart attack when certain codes are sent to it.

-c:  ‘C’ompress or squeeze out all spaces and newlines, in effect compressing all ASCII characters together in the work buffer. In plain English, it allows you to search for multiple-word strings without worrying whether extra spaces were placed between words, or whether the string is divided by a carriage return. The drawback is that the SECTOR identified may not be correct, because the characters being searched in the computer's memory are not in the same location as they were when read off the disk. The TRACK and HEAD are correct, and the string is there. The requirement of this option is that the word strings entered into the search string file CANNOT themselves contain any spaces. Enter the strings (words) without any spaces in the line. A sample: purchaseofcomputerequipmentbydanmares. [COMPRESS=[ON|OFF]]

-d + drive  to search: This option is required. Must be A or B for floppy drives, and physical drive 0 through 9 for hard disks. Sample: -d a or -d 0.

-g:  Sometimes the user wishes to search for the occurrence of file headers. Most traditionally, this means searching for graphic type files using a string file containing the anticipated headers of graphic files. This -g (graphic) option tells the program to search only the beginning bytes of each sector for the strings. Assuming that a file header is located at the front of the file, and assuming that files begin on sector boundaries, this option only searches for the strings in the first X bytes of each sector. Besides being a fast search, it can identify the sectors which contain files containing the "header" signatures the user provided in the strings file. If this option is used, it is also recommended that the -m 80L option be used. This will butt the hit to the Left of the data record for easier viewing and analysis. You should also consider using the -b (binary) option to keep all data intact.

-iI:  If a hit is made, exit program immediately. Error level for this exit is 1. Error level for no hits is 0. [IMMEDIATE=[ON|OFF]]

-m #[CLR]:  Replace # with the new width, that is, how many characters are to be contained in the output line. There is no maximum for this value, but if you use a value greater than 1024 you may experience some problems when strings are in the first sector. The # value can be followed by one of the upper case letters [CLR]. The "string" that is hit will then be placed in the 'C'enter, 'L'eft or 'R'ight side of the output record. This helps in viewing and further analysis of the output data. Center is the default. (Sample: -m 80L, -m 80, -m 80R). If using the -g option, it is suggested the 'L' option be used for clarity in viewing the output. [WIDTH=XXX]

-oO + outputfilename:  This option is required. The output is placed into the file named by outputfilename. The uppercase O automatically initiates the append option. [OUTPUT=FILENAME]

-r:  'R'everse search criteria. Only one string (of a single character in length) can be in the string list. The program stops when it finds anything "EXCEPT" what is in the string list. (Possibly use this to confirm that a disk wiping program in fact put all XX's on the drive). Use this with a string file containing the single hex value 0, etc. to see if a drive is fully wiped. The 0 in the string file will be converted to a hex 0, and the program will stop when it finds anything except a hex 00.

-s + filename:  The file containing the strings to search for. This option is required. Place the -s and follow it with the name of the file containing the strings you wish to search for. The file containing the strings cannot be larger than 32000 bytes. Each string cannot be longer than the number of characters located on one track of the disk, but be reasonable. Try to keep the strings no longer than 40 characters each. They should be one to a line, and the file should be created using an ASCII (text) editor, not a word processor, because word processors add extra unreadable characters to the file. The strings can be upper or lower case. The search is done independent of case. If no file by filename exists, the program assumes the word after the -s is the only string to search for, and proceeds under that assumption.

-t + #:  the # is replaced by a track number to begin processing on, if you want to begin at some track other than the 1st. If the drive is an LBA drive, the numbers are interpreted to be the sector number to start and end with. [TRACK=XXX]

-Ww:  Make the output a single ‘W’ide line instead of the traditional two-line output. Use this if you want to import the output into a database. -W also eliminates the header from the output [QUIET=[ON|OFF]]; -w [SINGLE=[ON|OFF]] produces single-line output with headers.

-1+logfile:  The name of the logfile to which some accounting information is printed.
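Of the options above, the -r (reverse) search lends itself to a short sketch. The following is only an illustration of the idea of stopping at the first byte that differs from the expected fill value. The function name is hypothetical, the data is assumed to be already in memory, and a 512-byte sector is assumed.

```python
# Illustrative sketch only; not the SS implementation.

def first_non_fill(data: bytes, fill: int = 0x00, sector_size: int = 512):
    """Return (sector, byte_in_sector) of the first byte that is not `fill`.

    Returns None when every byte equals `fill`, i.e. the data looks wiped.
    """
    remainder = data.lstrip(bytes([fill]))
    if not remainder:
        return None
    offset = len(data) - len(remainder)
    return divmod(offset, sector_size)
```

A result of `None` over the whole drive would correspond to the case where the program never finds anything other than the wipe character.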


COMMAND LINES

Command line format:

C:>ss -d 0 -s filename  -o outputfile [options,  -cmw]

C:>ss -d 0 -s filename -o output -m 124 -w

Remember, the -d -s and -o options are REQUIRED.


RELATED PROGRAMS

DD

STRSRCH
