fpcat command-line options

The following comes from fpcat --help:

usage: fpcat [-h] [--in FORMAT] [--merge] [-o FILENAME] [--out FORMAT]
             [--level LEVEL] [--reorder] [--preserve-order] [--alignment N]
             [--show-progress] [--max-spool-size SIZE] [--tmpdir DIRNAME]
             [--version] [--license-check]
             [filename ...]

Combine multiple fingerprint files into a single file

positional arguments:
  filename              input fingerprint filenames (default: use stdin)

options:
  -h, --help            show this help message and exit
  --in FORMAT           input fingerprint format. One of fps or fpb (with
                        optional gz or zst compression), or flush. (default
                        guesses from filename or is fps)
  --merge               assume the input fingerprint files are in popcount
                        order and do a merge sort
  -o FILENAME, --output FILENAME
                        save the fingerprints to FILENAME (default=stdout)
  --out FORMAT          output fingerprint format. One of fps, fps.gz,
                        fps.zst, fpb, or flush. (default guesses from output
                        filename, or is 'fps')
  --level LEVEL         compression level. Must be a positive integer or one
                        of 'min', 'default', or 'max'.
  --reorder             reorder the output fingerprints by popcount (default
                        for FPB output)
  --preserve-order      save the output fingerprints in the same order as the
                        input (default for FPS output)
  --alignment N         alignment size when saving a FPB file (default=8)
  --show-progress       show progress
  --max-spool-size SIZE
                        use temporary files for extra storage space for huge
                        FPB files (default uses RAM)
  --tmpdir DIRNAME      directory for the temporary files (default uses the
                        system temp directory)
  --version             show program's version number and exit
  --license-check       Check the license and report results to stdout.

Examples:

fpcat can be used to convert between the FPS and FPB formats. Because
the default output is FPS text on stdout, this is a handy way to see
what's inside an FPB file:

    fpcat fingerprints.fpb

You can also use fpcat to make an FPB file from an FPS file:

    fpcat fingerprints.fps -o fingerprints.fpb

You might have generated a set of FPS files that you want to merge
into a single FPB file. (For example, you might have used GNU parallel
to generate an FPS file for each of the PubChem files.) To merge
them:

    fpcat Compound_*.fps -o pubchem.fpb

By default the FPB format sorts the fingerprints by popcount. (Use
--preserve-order if you really want to preserve the input order.) The
sort overhead for PubChem is about 10 GB of RAM. If you don't have
that much memory then use --max-spool-size to ask fpcat to use less:

    fpcat --max-spool-size 1GB Compound_*.fps -o pubchem.fpb

This will use about 2 GB of RAM and temporary files in the --tmpdir
directory for the rest. (Yes, it would be nice if I could get those
two memory size numbers to match.)
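
To make "sorted by popcount" concrete, here is a minimal Python sketch
of the ordering. It is not chemfp's implementation; the popcount()
helper and the records are made up for illustration, and it only
assumes that a fingerprint is written as a hex string, as in an FPS
file.

```python
def popcount(hex_fp: str) -> int:
    """Return the number of 1 bits in a hex-encoded fingerprint."""
    return bin(int(hex_fp, 16)).count("1")


# Made-up (hex fingerprint, id) records, for illustration only.
records = [
    ("ff00", "id1"),  # 8 bits set
    ("0001", "id2"),  # 1 bit set
    ("0f0f", "id3"),  # 8 bits set
    ("0300", "id4"),  # 2 bits set
]

# Order the records from fewest to most bits set. Python's sorted() is
# stable, so records with the same popcount keep their input order here;
# that tie-breaking is a property of this sketch, not a claim about fpcat.
by_popcount = sorted(records, key=lambda rec: popcount(rec[0]))

for hex_fp, fp_id in by_popcount:
    print(popcount(hex_fp), hex_fp, fp_id)
```

Sorting this way needs every record in hand before the first one can
be written, which is why a large data set needs either a lot of RAM or
spool files.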

The --merge option is experimental. Use it if every input file is
already in popcount order, because the sorted output is then a simple
merge of the individually sorted inputs. However, this option opens
all input files at the same time, which may exceed your resource limit
on file descriptors. The current implementation also requires a lot of
disk seeks, so it is slow when there are many files.
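
The idea behind merging already-sorted inputs can be sketched in
Python with the standard library's heapq.merge. This is only an
illustration under assumptions, not how fpcat is implemented: the
popcount() and merge_by_popcount() helpers are hypothetical, and the
two lists stand in for two input files that are each already in
popcount order.

```python
import heapq


def popcount(hex_fp: str) -> int:
    """Return the number of 1 bits in a hex-encoded fingerprint."""
    return bin(int(hex_fp, 16)).count("1")


def merge_by_popcount(*sorted_inputs):
    """Lazily merge inputs that are each already sorted by popcount.

    heapq.merge looks only at the next record of each input, so every
    input must stay open until the merge finishes.
    """
    return heapq.merge(*sorted_inputs, key=lambda rec: popcount(rec[0]))


# Made-up stand-ins for two popcount-sorted input files.
file_a = [("0001", "a1"), ("0300", "a2"), ("ff00", "a3")]
file_b = [("0002", "b1"), ("0f0f", "b2")]

merged = list(merge_by_popcount(file_a, file_b))
for hex_fp, fp_id in merged:
    print(popcount(hex_fp), hex_fp, fp_id)
```

A merge like this never has to hold all of the fingerprints in memory
at once, but it does read from every input in turn, which matches the
file descriptor caveat above.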

The flush format is only available if the chemfp_converter package is
installed.