fpcat command-line options
The following comes from fpcat --help:
usage: fpcat [-h] [--in FORMAT] [--merge] [-o FILENAME] [--out FORMAT]
             [--level LEVEL] [--reorder] [--preserve-order] [--alignment N]
             [--show-progress] [--max-spool-size SIZE] [--tmpdir DIRNAME]
             [--version] [--license-check]
             [filename ...]

Combine multiple fingerprint files into a single file

positional arguments:
  filename              input fingerprint filenames (default: use stdin)

options:
  -h, --help            show this help message and exit
  --in FORMAT           input fingerprint format. One of fps or fpb (with
                        optional gz or zst compression), or flush. (default
                        guesses from filename or is fps)
  --merge               assume the input fingerprint files are in popcount
                        order and do a merge sort
  -o FILENAME, --output FILENAME
                        save the fingerprints to FILENAME (default=stdout)
  --out FORMAT          output fingerprint format. One of fps, fps.gz,
                        fps.zst, fpb, or flush. (default guesses from output
                        filename, or is 'fps')
  --level LEVEL         compression level. Must be a positive integer or one
                        of 'min', 'default', or 'max'.
  --reorder             reorder the output fingerprints by popcount (default
                        for FPB output)
  --preserve-order      save the output fingerprints in the same order as the
                        input (default for FPS output)
  --alignment N         alignment size when saving a FPB file (default=8)
  --show-progress       show progress
  --max-spool-size SIZE
                        use temporary files for extra storage space for huge
                        FPB files (default uses RAM)
  --tmpdir DIRNAME      directory for the temporary files (default uses the
                        system temp directory)
  --version             show program's version number and exit
  --license-check       Check the license and report results to stdout.

Examples:

fpcat can be used to convert between FPS and FPB formats. This is
handy if you want to see what's inside of an FPB file:

  fpcat fingerprints.fpb

You can use also use fpcat to make an FPB file from an FPS file:

  fpcat fingerprints.fps -o fingerprints.fpb

You might have generated a set of FPS file which you want to merge
into a single FPB. (For example, you might have used GNU parallel to
generate FPS files for each of the PubChem files, which you want to
merge into a single file.):

  fpcat Compound_*.fps -o pubchem.fpb

By default the FPB format sorts the fingerprints by popcount. (Use
--preserve-order if you really want to preserve the input order.) The
sort overhead for PubChem uses about 10 GB of RAM. If you don't have
that much memory then ask fpcat to use less memory:

  fpcat --max-spool-size 1GB Compound_*.fps -o pubchem.fpb

This will use about 2 GB of RAM and the --tmpdir for the rest. (Yes,
it would be nice if I could get those two memory size numbers to
match.)

The --merge option is experimental. Use it if the input fingerprints
are in popcount order, because sorted output is a simple merge sort of
the individual sorted inputs. However, this option opens all input
files at the same time, which may exceed your resource limit on file
descriptors. The current implementation also requires a lot of disk
seeks so is slow for many files.

The flush format is only available if the chemfp_converter package was
installed.
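
The idea behind --merge can be illustrated with a minimal Python sketch of a k-way merge over inputs that are already in popcount order. This is an illustration only and not chemfp's implementation: the popcount and merge_by_popcount helpers and the sample records are hypothetical, and the sketch works on in-memory (hex fingerprint, id) pairs rather than parsing real FPS files.

```python
import heapq


def popcount(hex_fp):
    """Return the number of set bits in a hex-encoded fingerprint."""
    return bin(int(hex_fp, 16)).count("1")


def merge_by_popcount(*sorted_inputs):
    """Lazily merge sequences of (hex_fp, id) records by popcount.

    Every input must already be sorted by popcount. heapq.merge keeps
    one pending record per input and repeatedly yields the smallest,
    so the output is in popcount order without a full re-sort.
    """
    return heapq.merge(*sorted_inputs, key=lambda record: popcount(record[0]))


# Two hypothetical inputs, each already in popcount order.
first = [("01", "A1"), ("07", "A2"), ("ff", "A3")]
second = [("03", "B1"), ("0f", "B2")]

merged = list(merge_by_popcount(first, second))
for hex_fp, record_id in merged:
    print(popcount(hex_fp), hex_fp, record_id)
```

Because a merge of this kind holds one pending record from every input, a file-based version would need all of the input files open at the same time, which is consistent with the file descriptor caveat in the help text.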