.. _oe2fps:

oe2fps command-line options
====================================

The following comes from ``oe2fps --help``:

.. code-block:: none

    Usage: oe2fps [OPTIONS] [FILENAMES]...

      Generate fingerprints from a structure file using OEChem and OEGraphSim.

      If specified, process the filenames, otherwise read from stdin.

    Fingerprint types:
      --path                  Generate path fingerprints (default).
      --circular              Generate circular fingerprints.
      --tree                  Generate tree fingerprints
      --maccs166              Generate 166-bit MACCS fingerprints
      --substruct             Generate chemfp's PubChem-like substructure
                              fingerprints.
      --rdmaccs, --rdmaccs/2  Generate chemfp's MACCS fingerprints, version 2.
      --rdmaccs/1             Generate chemfp's MACCS fingerprints, version 1.
      --type TYPE_STR         Specify a chemfp type string
      --using FILENAME        Get the fingerprint type from the metadata of a
                              fingerprint file

    Fingerprint options:
      --numbits INT    Number of bits in the fingerprint (default=4096) [path,
                       tree, circular]
      --minbonds INT   Minimum number of bonds in the fingerprint (default=0)
                       [path, tree]
      --maxbonds INT   Maximum number of bonds in the fingerprint (default=4 for
                       tree, 5 for path) [path, tree]
      --atype OPT      Atom type flags, described below (default=Default) [path,
                       tree, circular]
      --btype OPT      Bond type flags, described below (default=Default) [path,
                       tree, circular]
      --minradius INT  Minimum radius for the circular fingerprint (default=0)
                       [circular]
      --maxradius INT  Maximum radius for the circular fingerprint (default=5)
                       [circular]

    Options:
      --aromaticity NAME              Use the named aromaticity model (same as '-R
                                      aromaticity=NAME')
      --id-tag TAG                    Get the record id from the tag TAG instead
                                      of the first line of the record.
      --in FORMAT                     Input structure format (default guesses from
                                      filename)
      -o, --output FILENAME           Save the fingerprints to FILENAME
                                      (default=stdout)
      --out FORMAT                    Output structure format (default guesses
                                      from output filename, or is 'fps')
      --include-metadata / --no-metadata
                                      With --no-metadata, do not include the
                                      header metadata for FPS output.
      --no-date                       Do not include the 'date' metadata in the
                                      output header
      --date STR                      An ISO 8601 date (like
                                      '2022-02-07T11:10:15') to use for the 'date'
                                      metadata in the output header
      --delimiter VALUE               Delimiter style for SMILES and InChI files.
                                      Forces '-R delimiter=VALUE'.
      --has-header                    Skip the first line of a SMILES or InChI
                                      file. Forces '-R has_header=1'.
      -R NAME=VALUE                   Specify a reader argument
      --cxsmiles / --no-cxsmiles      Use --no-cxsmiles to disable the default
                                      support for CXSMILES extensions. Forces '-R
                                      cxsmiles=1' or '-R cxsmiles=0'.
      --errors [strict|report|ignore]
                                      How should structure parse errors be
                                      handled? (default=ignore)
      --progress / --no-progress      Show a progress bar (default: show unless
                                      the output is a terminal)
      --help-formats                  List the available formats and reader
                                      arguments
      --version                       Show the version and exit.
      --license-check                 Check the license and report results to
                                      stdout.
      --help                          Show this message and exit.

      ATYPE is one or more of the following, separated by the '|' character

        Arom AtmNum Chiral EqArom EqHBAcc EqHBDon EqHalo FCharge HCount HvyDeg
        Hyb InRing

      The following shorthand terms and expansions are also available:

        DefaultPathAtom = AtmNum|Arom|Chiral|FCharge|HvyDeg|Hyb|EqHalo
        DefaultCircularAtom = AtmNum|Arom|Chiral|FCharge|HCount|EqHalo
        DefaultTreeAtom = AtmNum|Arom|Chiral|FCharge|HvyDeg|Hyb

      and 'Default' selects the correct value for the specified fingerprint.

      Examples:

        --atype Default
        --atype "Arom|AtmNum|FCharge|HCount"
        --atype Arom,AtmNum,FCharge,HCount

      BTYPE is one or more of the following, separated by the '|' character

        Chiral InRing Order

      The following shorthand terms and expansions are also available:

        DefaultPathBond = Order|Chiral
        DefaultCircularBond = Order
        DefaultTreeBond = Order

      and 'Default' selects the correct value for the specified fingerprint.

      Examples:

        --btype Default
        --btype Order|InRing

      To simplify command-line use, a comma may be used instead of a '|' to
      separate different fields.
      Example:

        --atype AtmNum,HvyDegree

      By default, chemfp will use the filename extension to determine the
      structure file format type and possible compression.

      Most of the file readers support configuration parameters. Use the '-R'
      option to specify those parameters. Use '--help-formats' to list available
      formats and reader parameters.

Supported oe2fps formats
----------------------------------------------------

The following comes from ``oe2fps --help-formats``:

.. code-block:: none

    These are the structure file formats that chemfp can read when using
    the OEChem toolkit.

    By default, chemfp uses the filename extension to determine the format
    type. If the filename ends with ".gz" then it is interpreted as a gzip
    compressed file, and the second-to-last extension is used to determine
    the format type. Unknown or unsupported extensions are interpreted as a
    SMILES file. (The OEChem structure file readers do not support
    Zstandard compression.)

    You may instead specify the file format by name (see below), which is
    especially important when reading from stdin, which has no associated
    filename extension.

    The supported filename extensions are:

      File Type      Extension(s)
      ==========     =============
      SMILES         can, ism, isosmi, smi, usm
      SDF            mdl, rxn, sd, sdf
      InChI          inchi
      Tripos Mol2    mol2, mol2h
      PDB            ent, pdb
      XYZ            xyz
      SKC            skc
      Macromodel     mmd, mmod
      ChemDraw CDX   cdx
      OE binary      oeb
      OEB compressed oez
      CIF            cif
      mmCIF          mmcif
      FASTA          fasta
      CSV            csv

    Append a '.gz' to the filename to indicate that the contents are
    gzip-compressed.

    The format can also be specified by name using the '--in' option:

      File Type      Format name
      ==========     =============
      SMILES         smi, can, usm
      SDF            sdf
      InChI          inchi
      Tripos Mol2    mol2, mol2h
      PDB            pdb
      XYZ            xyz
      SKC            skc
      Macromodel     mmod
      ChemDraw CDX   cdx
      OE binary      oeb
      OEB compressed oez
      CIF            cif
      mmCIF          mmcif
      FASTA          fasta
      CSV            csv

    Append a '.gz' to the format name to indicate that the contents are
    gzip-compressed.

    The input format parsers can be configured with the "-R" option.
    For example, the following reader arguments tell the SMILES readers
    that the fields are whitespace delimited and the first line is a
    header.

      -R delimiter=whitespace -R has_header=true

    All formats handle the following two reader arguments:

      aromaticity - one of 'openeye', 'daylight', 'tripos', 'mdl', or 'mmff'
         (this can also be set via the older '--aromaticity' command-line
         option)
      flavor - a '|' or ',' separated list of flavor names, or a numeric
         value. A leading '-' means to remove the given flavor. Examples
         include:
           o  Canon,Strict -- the bitwise merger of the format's Canon and
                Strict values
           o  Default,-Kekule -- the format's Default flavor but without the
                Kekule bits (every flavor has a Default)
           o  42 -- the specific OEChem flavor value 42

    The SMILES and InChI formats also handle reader arguments for the
    delimiter style and the presence of an initial header line using the
    following:

      delimiter - one of 'to-eol' (Daylight/OEChem style), 'tab',
         'whitespace', 'space', or 'native' (for the native toolkit style)
      has_header - '1' if the first line contains a header, else '0'.

    The SMILES formats also support the `cxsmiles` option, which describes
    how to handle CXSMILES extensions. The default (true) will have OEChem
    process the extension as OEFormat_CXSMILES. If false the record will be
    parsed as OEFormat_SMI and any extension will be treated as part of the
    identifier.

    The supported formats, default reader arguments, and input flavors are:

    Format: can
       aromaticity: openeye
       delimiter: to-eol
       flavor: Default
          default flags:
          available flags: Canon, Strict
       has_header: 0

    Format: cdx
       aromaticity: openeye
       flavor: Default
          default flags: SuperAtom
          available flags: SuperAtom

    Format: cif
       aromaticity: openeye
       flavor: Default
          default flags: BondHydToClosest, BondOrder, FormalCrg, ImplicitH,
             NormalizeHydPos, OccFilterOneHalf, RemovePBCImages,
             RemoveQuestionMarkInLabel, Rings
          available flags: BondHydToClosest, BondOrder, FormalCrg, ImplicitH,
             NormalizeHydPos, OccFilterOneHalf, RemovePBCImages,
             RemoveQuestionMarkInLabel, Rings

    Format: csv
       aromaticity: openeye
       flavor: Default
          default flags: Header
          available flags: Header

    Format: cxsmi
       aromaticity: openeye
       delimiter: to-eol
       flavor: Default
          default flags:
          available flags: Canon, Strict
       has_header: 0

    Format: fasta
       aromaticity: openeye
       flavor: Default
          default flags:
          available flags: CustomResidues, EmbeddedSMILES

    Format: inchi
       aromaticity:
       delimiter: to-eol
       flavor: Default
          no flavor flags available
       has_header: 0

    Format: mmcif
       aromaticity: openeye
       flavor: Default
          default flags:
          available flags: NoAltLoc

    Format: mmod
       aromaticity: openeye
       flavor: Default
          default flags:
          available flags: FormalCrg

    Format: mol2
       aromaticity: openeye
       flavor: Default
          default flags:
          available flags: Forcefield, M2H

    Format: mol2h
       aromaticity: openeye
       flavor: Default
          default flags: M2H
          available flags: M2H

    Format: oeb
       aromaticity:
       flavor: Default
          no flavor flags available

    Format: oez
       aromaticity:
       flavor: Default
          no flavor flags available

    Format: pdb
       aromaticity: openeye
       flavor: Default
          default flags: BondOrder, Connect, END, ENDM, FormalCrg, ImplicitH,
             Rings, SecStruct
          available flags: ALL, ALTLOC, BondOrder, CHARGE, Connect, DATA, END,
             ENDM, FORMALCHARGE, FormalCrg, ImplicitH, RADIUS, Rings,
             SecStruct, TER

    Format: sdf
       aromaticity: openeye
       flavor: Default
          default flags:
          available flags: FixBondMarks, SuppressEmptyMolSkip,
             SuppressImp2ExpENHSTE

    Format: sdf3k
       aromaticity: openeye
       flavor: Default
          default flags:
          available flags: FixBondMarks, SuppressEmptyMolSkip,
             SuppressImp2ExpENHSTE

    Format: skc
       aromaticity: openeye
       flavor: Default
          no flavor flags available

    Format: smi
       aromaticity: openeye
       delimiter: to-eol
       flavor: Default
          default flags:
          available flags: Canon, Strict
       has_header: 0

    Format: usm
       aromaticity: openeye
       delimiter: to-eol
       flavor: Default
          default flags:
          available flags: Canon, Strict
       has_header: 0

    Format: xyz
       aromaticity: openeye
       flavor: Default
          default flags: BondOrder, Connect, FormalCrg, ImplicitH, Rings
          available flags: BondOrder, Connect, FormalCrg, ImplicitH, Rings

    See
    https://docs.eyesopen.com/toolkits/cpp/oechemtk/molreadwrite.html#flavored-input-and-output
    for documentation about the flavors for each format.
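
The filename-based format detection described in the ``--help-formats`` text can be illustrated with a small standalone sketch. This is not chemfp's implementation. The function name ``guess_format`` and the dictionary ``EXTENSION_TO_FORMAT`` are hypothetical, and the mapping of each extension to a single format name (for example ``ism`` to ``smi``, or ``ent`` to ``pdb``) is an assumption read off the two tables. The sketch also compares extensions case-sensitively, which may differ from what chemfp does.

```python
import os

# Extension -> format name, transcribed from the tables in the help text.
# ASSUMPTION: extensions listed under one "File Type" map to that type's
# primary format name (e.g. "ism" -> "smi", "sd" -> "sdf", "ent" -> "pdb").
EXTENSION_TO_FORMAT = {
    "can": "can", "ism": "smi", "isosmi": "smi", "smi": "smi", "usm": "usm",
    "mdl": "sdf", "rxn": "sdf", "sd": "sdf", "sdf": "sdf",
    "inchi": "inchi",
    "mol2": "mol2", "mol2h": "mol2h",
    "ent": "pdb", "pdb": "pdb",
    "xyz": "xyz",
    "skc": "skc",
    "mmd": "mmod", "mmod": "mmod",
    "cdx": "cdx",
    "oeb": "oeb", "oez": "oez",
    "cif": "cif", "mmcif": "mmcif",
    "fasta": "fasta",
    "csv": "csv",
}


def guess_format(filename):
    """Return (format_name, is_gzipped) for a filename.

    Follows the rules stated in the help text: a trailing ".gz" marks
    gzip compression and the second-to-last extension selects the format;
    unknown or unsupported extensions are treated as SMILES.
    """
    base = os.path.basename(filename)
    is_gzipped = base.endswith(".gz")
    if is_gzipped:
        base = base[: -len(".gz")]
    # splitext returns the extension with its leading ".", or "" if none.
    _, ext = os.path.splitext(base)
    format_name = EXTENSION_TO_FORMAT.get(ext[1:], "smi")
    return format_name, is_gzipped
```

Under these assumptions, ``guess_format("ligands.sdf.gz")`` returns ``("sdf", True)`` and a file with an unrecognized extension falls back to ``("smi", False)``. When reading from stdin there is no filename to inspect, which is why the help text recommends passing the format explicitly with ``--in``.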