oe2fps command-line options

The following comes from oe2fps --help:

usage: oe2fps [-h] [--path] [--circular] [--tree] [--numbits INT]
              [--minbonds INT] [--maxbonds INT] [--minradius INT]
              [--maxradius INT] [--atype ATYPE] [--btype BTYPE] [--maccs166]
              [--substruct] [--rdmaccs] [--rdmaccs/1] [--aromaticity NAME]
              [--id-tag NAME] [--type TYPE_STRING] [--using FILENAME]
              [--in FORMAT] [-o FILENAME] [--out FORMAT]
              [--errors {strict,report,ignore}] [--progress] [--help-formats]
              [-R NAME=VALUE] [--delimiter {tab,whitespace,to-eol,space}]
              [--has-header] [--version] [--license-check]
              [filenames [filenames ...]]

Generate FPS or FPB fingerprints from a structure file using OEChem

positional arguments:
  filenames             input structure files (default is stdin)

optional arguments:
  -h, --help            show this help message and exit
  --aromaticity NAME    use the named aromaticity model (same as '-R
                        aromaticity=NAME')
  --id-tag NAME         tag name containing the record id (SD files only)
  --type TYPE_STRING    Specify a chemfp type string
  --using FILENAME      Get the fingerprint type from the metadata of a
                        fingerprint file
  --in FORMAT           input structure format (default guesses from filename)
  -o FILENAME, --output FILENAME
                        save the fingerprints to FILENAME (default=stdout)
  --out FORMAT          output fingerprint format (default guesses from output
                        filename, or is 'fps')
  --errors {strict,report,ignore}
                        how should structure parse errors be handled?
                        (default=ignore)
  --progress, --no-progress
                        Show a progress bar (default: show unless the output
                        is a terminal)
  --help-formats        list the available formats and reader arguments
  -R NAME=VALUE         specify a reader argument
  --delimiter {tab,whitespace,to-eol,space}
                        delimiter style for SMILES and InChI files. Alias for
                        '-R delimiter=VALUE'.
  --has-header          Skip the first line of a SMILES or InChI file. Alias
                        for '-R has_header=1'.
  --version             show program's version number and exit
  --license-check       Check the license and report results to stdout.

path, circular, and tree fingerprints:
  --path                generate path fingerprints (default)
  --circular            generate circular fingerprints
  --tree                generate tree fingerprints
  --numbits INT         number of bits in the fingerprint (default=4096)
  --minbonds INT        minimum number of bonds in the path or tree
                        fingerprint (default=0)
  --maxbonds INT        maximum number of bonds in the path or tree
                        fingerprint (path default=5, tree default=4)
  --minradius INT       minimum radius for the circular fingerprint
                        (default=0)
  --maxradius INT       maximum radius for the circular fingerprint
                        (default=5)
  --atype ATYPE         atom type flags, described below (default=Default)
  --btype BTYPE         bond type flags, described below (default=Default)

166 bit MACCS substructure keys:
  --maccs166            generate MACCS fingerprints

881 bit ChemFP substructure keys:
  --substruct           generate ChemFP substructure fingerprints

ChemFP version of the 166 bit RDKit/MACCS keys:
  --rdmaccs, --rdmaccs/2
                        generate 166 bit RDKit/MACCS fingerprints (version 2)
  --rdmaccs/1           use the version 1 definition for --rdmaccs

ATYPE is one or more of the following, separated by the '|' character

  Arom AtmNum Chiral EqArom EqHBAcc EqHBDon EqHalo FCharge HCount HvyDeg
  Hyb InRing

The following shorthand terms and expansions are also available:
 DefaultPathAtom = AtmNum|Arom|Chiral|FCharge|HvyDeg|Hyb|EqHalo
 DefaultCircularAtom = AtmNum|Arom|Chiral|FCharge|HCount|EqHalo
 DefaultTreeAtom = AtmNum|Arom|Chiral|FCharge|HvyDeg|Hyb
and 'Default' selects the correct value for the specified fingerprint.

Examples:
  --atype Default
  --atype "Arom|AtmNum|FCharge|HCount"
  --atype Arom,AtmNum,FCharge,HCount

BTYPE is one or more of the following, separated by the '|' character

  Chiral InRing Order

The following shorthand terms and expansions are also available:
 DefaultPathBond = Order|Chiral
 DefaultCircularBond = Order
 DefaultTreeBond = Order
and 'Default' selects the correct value for the specified fingerprint.

Examples:
  --btype Default
  --btype "Order|InRing"

To simplify command-line use, a comma may be used instead of a '|' to
separate different fields. Example:
  --atype AtmNum,HvyDeg
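
The separator and shorthand rules above can be illustrated with a small
Python sketch. It is not chemfp's implementation: the function name
expand_type_flags and its parameters are hypothetical, and the
expansions are simply copied from the help text above.

```python
import re

# Shorthand expansions copied from the help text above.
SHORTHAND = {
    "DefaultPathAtom": "AtmNum|Arom|Chiral|FCharge|HvyDeg|Hyb|EqHalo",
    "DefaultCircularAtom": "AtmNum|Arom|Chiral|FCharge|HCount|EqHalo",
    "DefaultTreeAtom": "AtmNum|Arom|Chiral|FCharge|HvyDeg|Hyb",
    "DefaultPathBond": "Order|Chiral",
    "DefaultCircularBond": "Order",
    "DefaultTreeBond": "Order",
}


def expand_type_flags(text, fp_kind="Path", field="Atom"):
    """Split an ATYPE/BTYPE string on '|' or ',' and expand shorthand terms.

    fp_kind is 'Path', 'Circular', or 'Tree'; field is 'Atom' or 'Bond'.
    Both names are illustrative assumptions, not chemfp parameters.
    """
    flags = []
    for term in re.split(r"[|,]", text):
        term = term.strip()
        if not term:
            continue
        if term == "Default":
            # 'Default' depends on the fingerprint family being generated.
            term = "Default" + fp_kind + field
        for flag in SHORTHAND.get(term, term).split("|"):
            if flag not in flags:  # keep first occurrence, drop repeats
                flags.append(flag)
    return flags
```

For example, expand_type_flags("Default", "Path", "Bond") gives the two
flags listed for DefaultPathBond, and "Arom,AtmNum" and "Arom|AtmNum"
give the same result.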

By default, chemfp will use the filename extension to determine the
structure file format type and possible compression. Most of the file
readers support configuration parameters. Use the '-R' option to
specify those parameters.

Use '--help-formats' to list available formats and reader parameters.

Supported oe2fps formats

The following comes from oe2fps --help-formats:

These are the structure file formats that chemfp can read when using
the OEChem toolkit.

By default, chemfp uses the filename extension to determine the format
type. If the filename ends with ".gz" then it is interpreted as a gzip
compressed file, and the second-to-last extension is used to determine
the format type. Unknown or unsupported extensions are interpreted as
a SMILES file.

(The OEChem structure file readers do not support Zstandard
compression.)

You may instead specify the file format by name (see below), which is
especially important when reading from stdin, which has no associated
filename extension.

The supported filename extensions are:

   File Type    Extension(s)
   ==========   =============
     SMILES     can, ism, isosmi, smi, usm
      SDF       mdl, rxn, sd, sdf
     InChI      inchi
  Tripos Mol2   mol2, mol2h
      PDB       ent, pdb
      XYZ       xyz
      SKC       skc
   Macromodel   mmd, mmod
  ChemDraw CDX  cdx
   OE binary    oeb
 OEB compressed oez
      CIF       cif
     mmCIF      mmcif
     FASTA      fasta
      CSV       csv

Append a '.gz' to the filename to indicate that the contents are
gzip-compressed.
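
The extension rules described above can be sketched in Python as
follows. This is only an illustration of the documented behavior, not
chemfp's code: the function name guess_file_type is hypothetical, the
returned labels are the "File Type" names from the table, and whether
chemfp compares extensions case-sensitively is not stated here, so the
sketch compares them exactly as written.

```python
import os

# Extension table copied from the list above.
EXTENSIONS = {
    "SMILES": ["can", "ism", "isosmi", "smi", "usm"],
    "SDF": ["mdl", "rxn", "sd", "sdf"],
    "InChI": ["inchi"],
    "Tripos Mol2": ["mol2", "mol2h"],
    "PDB": ["ent", "pdb"],
    "XYZ": ["xyz"],
    "SKC": ["skc"],
    "Macromodel": ["mmd", "mmod"],
    "ChemDraw CDX": ["cdx"],
    "OE binary": ["oeb"],
    "OEB compressed": ["oez"],
    "CIF": ["cif"],
    "mmCIF": ["mmcif"],
    "FASTA": ["fasta"],
    "CSV": ["csv"],
}
EXT_TO_TYPE = {ext: ftype for ftype, exts in EXTENSIONS.items() for ext in exts}


def guess_file_type(filename):
    """Return (file type label, compression) for a filename (hypothetical helper)."""
    exts = os.path.basename(filename).split(".")[1:]
    compression = None
    if exts and exts[-1] == "gz":
        # A trailing ".gz" means gzip; the format comes from the extension before it.
        compression = "gzip"
        exts = exts[:-1]
    ext = exts[-1] if exts else ""
    # Unknown or unsupported extensions fall back to SMILES.
    return EXT_TO_TYPE.get(ext, "SMILES"), compression
```

With this sketch, "drugs.sdf.gz" is treated as a gzip-compressed SDF
file, while a name such as "notes.txt" falls back to SMILES.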

The format can also be specified by name using the '--in' option:

   File Type    Format name
   ==========   =============
     SMILES     smi, can, usm
      SDF       sdf
     InChI      inchi
  Tripos Mol2   mol2, mol2h
      PDB       pdb
      XYZ       xyz
      SKC       skc
   Macromodel   mmod
  ChemDraw CDX  cdx
   OE binary    oeb
 OEB compressed oez
      CIF       cif
     mmCIF      mmcif
     FASTA      fasta
      CSV       csv

Append a '.gz' to the format name to indicate that the contents are
gzip-compressed.

The input format parsers can be configured with the "-R" option. For
example, the following reader arguments tell the SMILES readers that
the fields are whitespace delimited and the first line is a header.

   -R delimiter=whitespace -R has_header=true

All formats handle the following two reader arguments:

  aromaticity - one of 'openeye', 'daylight', 'tripos', 'mdl', or 'mmff'
      (this can also be set via the older '--aromaticity' command-line option)

  flavor - a '|' or ',' separated list of flavor names, or a numeric value.
       A leading '-' means to remove the given flavor. Examples include:

       o  Canon,Strict  -- the bitwise merger of the format's Canon and Strict values
       o  Default,-Kekule -- the format's Default flavor but without the Kekule bits
                      (every flavor has a Default)
       o  42  -- the specific OEChem flavor value 42
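
The way a flavor string combines into a single value can be sketched in
Python. This is a minimal illustration under stated assumptions, not
the chemfp or OEChem implementation: the function name resolve_flavor
is hypothetical, and the bit values in the usage note are made up for
the example rather than real OEChem constants.

```python
import re


def resolve_flavor(text, flag_bits, default_bits):
    """Combine a flavor string into one integer bitmask (hypothetical helper).

    flag_bits maps flavor names to bit values and default_bits is the
    value of the format's 'Default' flavor; both are supplied by the caller.
    """
    text = text.strip()
    if text.isdigit():
        # A plain number is used directly as the flavor value.
        return int(text)
    value = 0
    for term in re.split(r"[|,]", text):
        term = term.strip()
        if not term:
            continue
        remove = term.startswith("-")
        name = term[1:].strip() if remove else term
        bits = default_bits if name == "Default" else flag_bits[name]
        if remove:
            value &= ~bits  # a leading '-' clears the given bits
        else:
            value |= bits   # otherwise the bits are merged in
    return value
```

Using the made-up values {"Canon": 1, "Strict": 2, "Kekule": 4} and a
Default of 5 (Canon and Kekule), "Canon,Strict" resolves to 3 and
"Default,-Kekule" resolves to 1.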

The SMILES and InChI formats also handle reader arguments for the
delimiter style and the presence of an initial header line using the
following:

   delimiter - one of 'to-eol' (Daylight/OEChem style), 'tab',
        'whitespace', 'space', or 'native' (for the native toolkit style)

   has_header - '1' if the first line contains a header, else '0'.
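
When oe2fps is driven from a script, the reader arguments can be
assembled programmatically. The Python sketch below only builds the
argument list; the function name build_oe2fps_command and the filenames
in the usage note are hypothetical, and it uses only the '--in', '-R',
and '-o' options listed in the help text above.

```python
def build_oe2fps_command(filenames, reader_args=None, in_format=None, output=None):
    """Build an argument list for oe2fps (hypothetical helper, does not run it)."""
    cmd = ["oe2fps"]
    if in_format is not None:
        cmd += ["--in", in_format]
    for name, value in (reader_args or {}).items():
        # Each reader argument becomes its own '-R NAME=VALUE' pair.
        cmd += ["-R", f"{name}={value}"]
    if output is not None:
        cmd += ["-o", output]
    cmd += list(filenames)
    return cmd
```

For example, build_oe2fps_command(["input.smi"], {"delimiter":
"whitespace", "has_header": "1"}, output="out.fps") produces a list that
could be passed to subprocess.run() on a machine where oe2fps and a
valid OEChem license are available.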

The supported formats, default reader arguments, and input flavors are:

Format: can
    aromaticity: openeye
    delimiter: to-eol
    flavor: Default
        default flags: <none>
        available flags: Canon, Strict
    has_header: 0

Format: cdx
    aromaticity: openeye
    flavor: Default
        default flags: SuperAtom
        available flags: SuperAtom

Format: cif
    aromaticity: openeye
    flavor: Default
        default flags: BondHydToClosest, BondOrder, FormalCrg, ImplicitH,
            NormalizeHydPos, OccFilterOneHalf, RemovePBCImages,
            RemoveQuestionMarkInLabel, Rings
        available flags: BondHydToClosest, BondOrder, FormalCrg, ImplicitH,
            NormalizeHydPos, OccFilterOneHalf, RemovePBCImages,
            RemoveQuestionMarkInLabel, Rings

Format: csv
    aromaticity: openeye
    flavor: Default
        default flags: Header
        available flags: Header

Format: fasta
    aromaticity: openeye
    flavor: Default
        default flags: <none>
        available flags: CustomResidues, EmbeddedSMILES

Format: inchi
    aromaticity: <N/A>
    delimiter: to-eol
    flavor: Default
      no flavor flags available
    has_header: 0

Format: mmcif
    aromaticity: openeye
    flavor: Default
        default flags: <none>
        available flags: NoAltLoc

Format: mmod
    aromaticity: openeye
    flavor: Default
        default flags: <none>
        available flags: FormalCrg

Format: mol2
    aromaticity: openeye
    flavor: Default
        default flags: <none>
        available flags: Forcefield, M2H

Format: mol2h
    aromaticity: openeye
    flavor: Default
        default flags: M2H
        available flags: M2H

Format: oeb
    aromaticity: <N/A>
    flavor: Default
      no flavor flags available

Format: oez
    aromaticity: <N/A>
    flavor: Default
      no flavor flags available

Format: pdb
    aromaticity: openeye
    flavor: Default
        default flags: BondOrder, Connect, END, ENDM, FormalCrg, ImplicitH,
            Rings, SecStruct
        available flags: ALL, ALTLOC, BondOrder, CHARGE, Connect, DATA, END,
            ENDM, FORMALCHARGE, FormalCrg, ImplicitH, RADIUS, Rings,
            SecStruct, TER

Format: sdf
    aromaticity: openeye
    flavor: Default
        default flags: <none>
        available flags: FixBondMarks, SuppressEmptyMolSkip,
            SuppressImp2ExpENHSTE

Format: sdf3k
    aromaticity: openeye
    flavor: Default
        default flags: <none>
        available flags: FixBondMarks, SuppressEmptyMolSkip,
            SuppressImp2ExpENHSTE

Format: skc
    aromaticity: openeye
    flavor: Default
      no flavor flags available

Format: smi
    aromaticity: openeye
    delimiter: to-eol
    flavor: Default
        default flags: <none>
        available flags: Canon, Strict
    has_header: 0

Format: usm
    aromaticity: openeye
    delimiter: to-eol
    flavor: Default
        default flags: <none>
        available flags: Canon, Strict
    has_header: 0

Format: xyz
    aromaticity: openeye
    flavor: Default
        default flags: BondOrder, Connect, FormalCrg, ImplicitH, Rings
        available flags: BondOrder, Connect, FormalCrg, ImplicitH, Rings


See https://docs.eyesopen.com/toolkits/cpp/oechemtk/molreadwrite.html#flavored-input-and-output
for documentation about the flavors for each format.