.. _ob2fps:

ob2fps command-line options
====================================

The following comes from ``ob2fps --help``:

.. code-block:: none

    Usage: ob2fps [OPTIONS] [FILENAMES]...

      Generate fingerprints from a structure file using Open Babel.

      If specified, process the filenames, otherwise read from stdin.

    Fingerprint types:
      --FP2                           Linear fragments up to 7 atoms (default)
      --FP3                           SMARTS patterns specified in the file
                                      patterns.txt
      --FP4                           SMARTS patterns specified in the file
                                      SMARTS_InteLigand.txt
      --MACCS, --maccs, --maccs166    Open Babel's implementation of the MACCS
                                      166 keys
      --ECFP0                         ECFP (circular) fingerprints with diameter 0
      --ECFP2                         ECFP (circular) fingerprints with diameter 2
      --ECFP4                         ECFP (circular) fingerprints with diameter 4
      --ECFP6                         ECFP (circular) fingerprints with diameter 6
      --ECFP8                         ECFP (circular) fingerprints with diameter 8
      --ECFP10                        ECFP (circular) fingerprints with diameter
                                      10
      --substruct                     chemfp's PubChem-like substructure
                                      fingerprints
      --rdmaccs, --rdmaccs/2          chemfp's MACCS fingerprints, version 2.
      --rdmaccs/1                     chemfp's MACCS fingerprints, version 1
      --type TYPE_STR                 Specify a chemfp type string
      --using FILENAME                Get the fingerprint type from the metadata
                                      of a fingerprint file

    Fingerprint options:
      --nBits INT                     number of bits in the fingerprint
                                      (default=4096) [ECFP]

    Options:
      --id-tag TAG                    Tag name containing the record id (SD files
                                      only)
      --in FORMAT                     Input structure format (default guesses from
                                      filename)
      -o, --output FILENAME           Save the fingerprints to FILENAME
                                      (default=stdout)
      --out FORMAT                    Output structure format (default guesses
                                      from output filename, or is 'fps')
      --include-metadata / --no-metadata
                                      With --no-metadata, do not include the
                                      header metadata for FPS output.
      --no-date                       Do not include the 'date' metadata in the
                                      output header
      --date STR                      An ISO 8601 date (like
                                      '2022-02-07T11:10:15') to use for the 'date'
                                      metadata in the output header
      --delimiter VALUE               Delimiter style for SMILES and InChI files.
                                      Forces '-R delimiter=VALUE'.
      --has-header                    Skip the first line of a SMILES or InChI
                                      file. Forces '-R has_header=1'.
      -R NAME=VALUE                   Specify a reader argument
      --cxsmiles / --no-cxsmiles      Use --no-cxsmiles to disable the default
                                      support for CXSMILES extensions. Forces
                                      '-R cxsmiles=1' or '-R cxsmiles=0'.
      --errors [strict|report|ignore]
                                      How should structure parse errors be
                                      handled? (default=ignore)
      --progress / --no-progress      Show a progress bar (default: show unless
                                      the output is a terminal)
      --help-formats                  List the available formats and reader
                                      arguments
      --version                       Show the version and exit.
      --license-check                 Check the license and report results to
                                      stdout.
      --help                          Show this message and exit.

      By default the Open Babel structure reader determines the file format and
      compression type based on the filename extension. Unknown filename
      extensions are treated as a uncompressed SMILES files.

      If the data comes from stdin, or the guess based on extension name is
      wrong, then use "--in FORMAT" option to change the default input format.
      For examples:

        --in smi
        --in sdf.gz

      Use `-R` to specify format-specific reader arguments. Use `--help-formats`
      for a list of available formats and reader arguments.

Supported ob2fps formats
----------------------------------------------------

The following comes from ``ob2fps --help-formats``:

.. code-block:: none

    These are the structure file formats that chemfp can read when using
    the Open Babel toolkit.

    chemfp has special support for the SMILES, InChI, and SDF formats when
    using the Open Babel toolkit. For these formats, by default, chemfp
    uses the filename extension to determine the format type. If the
    filename ends with ".gz" or ".zst" then it is intepreted as a gzip or
    Zstandard compressed file, and the second-to-last extension is used to
    determine the format type. Unknown or unsupported extensions are then
    tested against Open Babel format names (see below), and if still
    unknown, interpreted as a SMILES file.
    You will need to use "-R implementation=chemfp" to enable zst support
    for the SDF format.

    You may instead specify the file format by name (see below), which is
    especially important when reading from stdin, which has no associated
    filename extension.

    These specially supported filename extensions are:

      File Type      Extension(s)
      ==========     =============
      SMILES         can, ism, isosmi, smi, usm
      SDF            sdf
      InChI          inchi

    The format can also be specified by name using the '--in' option:

      File Type      Format name (append .gz or .zst if compressed)
      ==========     ===========
      SMILES         smi, can, usm
      SDF            sdf
      InChI          inchi

    The input format parsers can be configured with the "-R" option. For
    examples, the following reader arguments tell the SMILES readers that
    the fields are whitespace delimited and the first line is a header.

        -R delimiter=whitespace -R has_header=true

    All of the readers support the 'options' reader argument, which is a
    string passed directly to OBConversion(). This is a compact way to
    encode all of the Open Babel parameters used in the conversion. For
    example, 'ab"text"', would set option 'a' to True, and option 'b' to
    the string "text".

    The SMILES format parsers use three additional reader arguments:

      * 'delimiter' specifies the delimiter type. The default is 'to-eol'.
        The other values are 'tab', 'whitespace', 'space' and 'native'.
        Use "-R delimiter=native" to match Open Babel's native delimiter
        style, which is 'to-eol'.

      * 'has_header', if true will skip the first line of the SMILES file
        (because it is a header line).

      * 'cxsmiles' describes how to handle CXSMILES extensions. Open Babel
        does not handle CXSMILES. The default (true) will remove the
        extension before processing. If false any extension will be
        treated as part of the identifier.

    The SDF format parser supports one additional reader argument:

      * 'implementation': if "openbabel" or "native", use Open Babel's
        native SDF parser.
        If "chemfp" use chemfp's own implementation to find SDF records,
        which are then passed to Open Babel for parsing. This gives more
        fine-grained error reporting, and supports zst compression, and
        with similar performance. (Note: Open Babel supports additional
        options.)

    The InChI format parser supports one additional reader argument:

      * 'delimiter' works the same as it does for the SMILES formats

    In addition, you may specify an Open Babel formats, either by one of
    the following format names, or by reading a filename ending with one
    of the format names, optionally with a .gz suffix. Zstandard
    compression is not supported by the native Open Babel reader.

     Format    Description and options
    =========  ==========================
    CONFIG     DL-POLY CONFIG
    CONTCAR    VASP format
                   s  Output single bonds only
                   b  Disable bonding entirely
    CONTFF     MDFF format
    HISTORY    DL-POLY HISTORY
    MDFF       MDFF format
    POSCAR     VASP format
                   s  Output single bonds only
                   b  Disable bonding entirely
    POSFF      MDFF format
    VASP       VASP format
                   s  Output single bonds only
                   b  Disable bonding entirely
    abinit     ABINIT Output Format
                   s  Output single bonds only
                   b  Disable bonding entirely
    acesout    ACES output format
                   s  Output single bonds only
                   b  Disable bonding entirely
    acr        ACR format
    adfband    ADF Band output format
    adfdftb    ADF DFTB output format
    adfout     ADF output format
                   s  Output single bonds only
                   b  Disable bonding entirely
    alc        Alchemy format
    aoforce    Turbomole AOFORCE output format
    arc        Accelrys/MSI Biosym/Insight II CAR format
                   s  Output single bonds only
                   b  Disable bonding entirely
    axsf       XCrySDen Structure Format
                   s  Output single bonds only
                   b  Disable bonding entirely
    bgf        MSI BGF format
    box        Dock 3.5 Box format
    bs         Ball and Stick format
    c09out     Crystal 09 output format
                   s  Consider single bonds only
    c3d1       Chem3D Cartesian 1 format
    c3d2       Chem3D Cartesian 2 format
    caccrt     Cacao Cartesian format
                   s  Output single bonds only
                   b  Disable bonding entirely
    car        Accelrys/MSI Biosym/Insight II CAR format
                   s  Output single bonds only
                   b  Disable bonding entirely
    castep     CASTEP format
    ccc        CCC format
    cdjson     ChemDoodle JSON
                   c  coordinate multiplier (default: 20)
    cdx        ChemDraw binary format
                   m  read molecules only; no reactions
                   d  output CDX tree to OBText object
    cdxml      ChemDraw CDXML format
    cif        Crystallographic Information File
                   s  Output single bonds only
                   b  Disable bonding entirely
                   B  Use bonds listed in CIF file from _geom_bond_etc
                      records (overrides option b)
    ck         ChemKin format
                   f  File with standard thermo data: default therm.dat
                   z  Use standard thermo only
                   L  Reactions have labels (Usually optional)
    cml        Chemical Markup Language
                   2  read 2D rather than 3D coordinates if both provided
    cmlr       CML Reaction format
    cof        Culgi object file format
    crk2d      Chemical Resource Kit diagram(2D)
    crk3d      Chemical Resource Kit 3D format
    ct         ChemDraw Connection Table format
    cub        Gaussian cube format
                   b  no bonds
                   s  no multiple bonds
    cube       Gaussian cube format
                   b  no bonds
                   s  no multiple bonds
    dallog     DALTON output format
                   s  Output single bonds only
    dalmol     DALTON input format
                   s  Output single bonds only
                   b  Disable bonding entirely
    dat        Generic Output file format
                   s  Output single bonds only
                   b  Disable bonding entirely
    dmol       DMol3 coordinates format
                   s  Output single bonds only
                   b  Disable bonding entirely
    dx         OpenDX cube format for APBS
    ent        Protein Data Bank format
                   s  Output single bonds only
                   b  Disable bonding entirely
                   c  Ignore CONECT records
    exyz       Extended XYZ cartesian coordinates format
                   s  Output single bonds only
                   b  Disable bonding entirely
    fa         FASTA format
                   1  Output single-stranded DNA
                   t  Use the specified number of base pairs per turn
                      (e.g., 10)
                   s  Output single bonds only
                   b  Disable bonding entirely
    fasta      FASTA format
                   1  Output single-stranded DNA
                   t  Use the specified number of base pairs per turn
                      (e.g., 10)
                   s  Output single bonds only
                   b  Disable bonding entirely
    fch        Gaussian formatted checkpoint file format
    fchk       Gaussian formatted checkpoint file format
    fck        Gaussian formatted checkpoint file format
    feat       Feature format
                   s  Output single bonds only
                   b  Disable bonding entirely
    fhiaims    FHIaims XYZ format
                   s  Output single bonds only
                   b  Disable bonding entirely
    fract      Free Form Fractional format
                   s  Output single bonds only
                   b  Disable bonding entirely
    fs         Fastsearch format
                   t  # Do similarity search:#mols or # as min Tanimoto
                   a  Add Tanimoto coeff to title in similarity search
                   l  # Maximum number of candidates. Default<4000>
                   e  Exact match
                      Alternative to using exact in ``-s`` parameter, see
                      above
                   n  No further SMARTS filtering after fingerprint phase
    fsa        FASTA format
                   1  Output single-stranded DNA
                   t  Use the specified number of base pairs per turn
                      (e.g., 10)
                   s  Output single bonds only
                   b  Disable bonding entirely
    g03        Gaussian Output
                   s  Output single bonds only
                   b  Disable bonding entirely
    g09        Gaussian Output
                   s  Output single bonds only
                   b  Disable bonding entirely
    g16        Gaussian Output
                   s  Output single bonds only
                   b  Disable bonding entirely
    g92        Gaussian Output
                   s  Output single bonds only
                   b  Disable bonding entirely
    g94        Gaussian Output
                   s  Output single bonds only
                   b  Disable bonding entirely
    g98        Gaussian Output
                   s  Output single bonds only
                   b  Disable bonding entirely
    gal        Gaussian Output
                   s  Output single bonds only
                   b  Disable bonding entirely
    gam        GAMESS Output
                   s  Output single bonds only
                   b  Disable bonding entirely
                   c  Read multiple conformers
    gamess     GAMESS Output
                   s  Output single bonds only
                   b  Disable bonding entirely
                   c  Read multiple conformers
    gamin      GAMESS Input
    gamout     GAMESS Output
                   s  Output single bonds only
                   b  Disable bonding entirely
                   c  Read multiple conformers
    got        GULP format
    gpr        Ghemical format
    gro        GRO format
                   s  Consider single bonds only
    gukin      GAMESS-UK Input
    gukout     GAMESS-UK Output
    gzmat      Gaussian Z-Matrix Input
                   s  Output single bonds only
                   b  Disable bonding entirely
    hin        HyperChem HIN format
    inp        GAMESS Input
    ins        ShelX format
                   s  Output single bonds only
                   b  Disable bonding entirely
    jin        Jaguar input format
                   s  Output single bonds only
                   b  Disable bonding entirely
    jout       Jaguar output format
                   s  Output single bonds only
                   b  Disable bonding entirely
    log        Generic Output file format
                   s  Output single bonds only
                   b  Disable bonding entirely
    lpmd       LPMD format
                   s  Output single bonds only
                   b  Disable bonding entirely
    mae        Maestro format
    maegz      Maestro format
    mcdl       MCDL format
    mcif       Macromolecular Crystallographic Info
    mdl        MDL MOL format
                   s  determine chirality from atom parity flags
                      The default setting for 2D and 3D is to ignore atom
                      parity and work out the chirality based on the bond
                      stereochemistry (2D) or coordinates (3D).
                      For 0D the default is already to determine the
                      chirality from the atom parity.
                   S  do not read stereochemistry from 0D MOL files
                      Open Babel supports reading and writing cis/trans
                      and tetrahedral stereochemistry to 0D MOL files.
                      This is an extension to the standard which you can
                      turn off using this option.
                   T  read title only
                   P  read title and properties only
                      When filtering an sdf file on title or properties
                      only, avoid lengthy chemical interpretation by
                      using the ``T`` or ``P`` option together with the
                      :ref:`copy format `.
    ml2        Sybyl Mol2 format
                   c  Read UCSF Dock scores saved in comments preceding
                      molecules
    mmcif      Macromolecular Crystallographic Info
    mmd        MacroModel format
    mmod       MacroModel format
    mol        MDL MOL format
                   s  determine chirality from atom parity flags
                      The default setting for 2D and 3D is to ignore atom
                      parity and work out the chirality based on the bond
                      stereochemistry (2D) or coordinates (3D).
                      For 0D the default is already to determine the
                      chirality from the atom parity.
                   S  do not read stereochemistry from 0D MOL files
                      Open Babel supports reading and writing cis/trans
                      and tetrahedral stereochemistry to 0D MOL files.
                      This is an extension to the standard which you can
                      turn off using this option.
                   T  read title only
                   P  read title and properties only
                      When filtering an sdf file on title or properties
                      only, avoid lengthy chemical interpretation by
                      using the ``T`` or ``P`` option together with the
                      :ref:`copy format `.
    mol2       Sybyl Mol2 format
                   c  Read UCSF Dock scores saved in comments preceding
                      molecules
    mold       Molden format
                   b  no bonds
                   s  no multiple bonds
    molden     Molden format
                   b  no bonds
                   s  no multiple bonds
    molf       Molden format
                   b  no bonds
                   s  no multiple bonds
    moo        MOPAC Output format
                   s  Output single bonds only
                   b  Disable bonding entirely
    mop        MOPAC Cartesian format
                   s  Output single bonds only
                   b  Disable bonding entirely
    mopcrt     MOPAC Cartesian format
                   s  Output single bonds only
                   b  Disable bonding entirely
    mopin      MOPAC Internal
    mopout     MOPAC Output format
                   s  Output single bonds only
                   b  Disable bonding entirely
    mpc        MOPAC Cartesian format
                   s  Output single bonds only
                   b  Disable bonding entirely
    mpo        Molpro output format
                   s  Output single bonds only
                   b  Disable bonding entirely
    mpqc       MPQC output format
                   s  Output single bonds only
                   b  Disable bonding entirely
    mrv        Chemical Markup Language
                   2  read 2D rather than 3D coordinates if both provided
    msi        Accelrys/MSI Cerius II MSI format
    nwo        NWChem output format
                   s  Output single bonds only
                   f  Overwrite molecule if more than one calculation with
                      different molecules is present in the output file
                      (last calculation will be prefered)
                   b  Disable bonding entirely
    orca       ORCA output format
                   s  Output single bonds only
                   b  Disable bonding entirely
    out        Generic Output file format
                   s  Output single bonds only
                   b  Disable bonding entirely
    outmol     DMol3 coordinates format
                   s  Output single bonds only
                   b  Disable bonding entirely
    output     Generic Output file format
                   s  Output single bonds only
                   b  Disable bonding entirely
    pc         PubChem format
    pcjson     PubChem JSON
                   s  disable stereo perception and just read stereo
                      information from input
    pcm        PCModel Format
    pdb        Protein Data Bank format
                   s  Output single bonds only
                   b  Disable bonding entirely
                   c  Ignore CONECT records
    pdbqt      AutoDock PDBQT format
                   b  Disable automatic bonding
                   d  Input file is in dlg (AutoDock docking log) format
    png        PNG 2D depiction
                   y  Look also in chunks with specified ID
    pos        POS cartesian coordinates format
                   s  Output single bonds only
                   b  Disable bonding entirely
    pqr        PQR format
                   s  Output single bonds only
                   b  Disable bonding entirely
    pqs        Parallel Quantum Solutions format
    prep       Amber Prep format
    pwscf      PWscf format
    qcout      Q-Chem output format
                   s  Output single bonds only
                   b  Disable bonding entirely
    res        ShelX format
                   s  Output single bonds only
                   b  Disable bonding entirely
    rsmi       Reaction SMILES format
    rxn        MDL RXN format
    sd         MDL MOL format
                   s  determine chirality from atom parity flags
                      The default setting for 2D and 3D is to ignore atom
                      parity and work out the chirality based on the bond
                      stereochemistry (2D) or coordinates (3D).
                      For 0D the default is already to determine the
                      chirality from the atom parity.
                   S  do not read stereochemistry from 0D MOL files
                      Open Babel supports reading and writing cis/trans
                      and tetrahedral stereochemistry to 0D MOL files.
                      This is an extension to the standard which you can
                      turn off using this option.
                   T  read title only
                   P  read title and properties only
                      When filtering an sdf file on title or properties
                      only, avoid lengthy chemical interpretation by
                      using the ``T`` or ``P`` option together with the
                      :ref:`copy format `.
    siesta     SIESTA format
    smiles     SMILES format
                   a  Preserve aromaticity present in the SMILES
                      This option should only be used if reading aromatic
                      SMILES generated by the same version of Open Babel.
                      Any other use will lead to undefined behavior. The
                      advantage of this option is that it avoids
                      aromaticity perception, thus speeding up reading
                      SMILES.
                   S  Clean stereochemistry
                      By default, stereochemistry is accepted as given.
                      If you wish to clean up stereochemistry (e.g. by
                      removing tetrahedral stereochemistry where two of
                      the substituents are identical) then specifying
                      this option will reperceive stereocenters.
    smy        SMILES format using Smiley parser
    sy2        Sybyl Mol2 format
                   c  Read UCSF Dock scores saved in comments preceding
                      molecules
    t41        ADF TAPE41 format
                   s  Output single bonds only
                   b  Disable bonding entirely
    tdd        Thermo format
                   e  Terminate on "END"
    text       Read and write raw text
    therm      Thermo format
                   e  Terminate on "END"
    tmol       TurboMole Coordinate format
                   s  Output single bonds only
                   b  Disable bonding entirely
                   a  Input in Angstroms
    txt        Title format
    txyz       Tinker XYZ format
                   s  Generate single bonds only
    unixyz     UniChem XYZ format
                   s  Output single bonds only
                   b  Disable bonding entirely
    vmol       ViewMol format
                   s  Output single bonds only
                   b  Disable bonding entirely
    wln        Wiswesser Line Notation
    xml        General XML format
                   n  Read objects of first namespace only
    xsf        XCrySDen Structure Format
                   s  Output single bonds only
                   b  Disable bonding entirely
    xyz        XYZ cartesian coordinates format
                   s  Output single bonds only
                   b  Disable bonding entirely
    yob        YASARA.org YOB format

    You will need to consult the Open Babel documentation (see
    https://openbabel.org/wiki/List_of_extensions ) and implementation
    for full details about how these options work.
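
The filename rules described in the ``--help-formats`` output can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not chemfp's implementation: ``guess_format``, ``SPECIAL_EXTENSIONS`` and ``COMPRESSION_SUFFIXES`` are hypothetical names, the set of Open Babel format names is supplied by the caller, and the extension comparison is assumed to be case-sensitive.

.. code-block:: python

    import os

    # Hypothetical tables, copied from the "File Type / Extension(s)" listing
    # in the help text. They are not part of the chemfp API.
    SPECIAL_EXTENSIONS = {
        "can": "SMILES", "ism": "SMILES", "isosmi": "SMILES",
        "smi": "SMILES", "usm": "SMILES",
        "sdf": "SDF",
        "inchi": "InChI",
    }
    COMPRESSION_SUFFIXES = {"gz": "gzip", "zst": "zstd"}


    def guess_format(filename, openbabel_formats=frozenset()):
        """Return (file_type, compression) following the documented rules.

        Assumption: ``openbabel_formats`` holds the Open Babel format names
        (for example "pdb" or "mol2") that the caller wants to recognize.
        """
        # Everything after the first dot in the base name is an extension.
        extensions = os.path.basename(filename).split(".")[1:]

        # A trailing ".gz" or ".zst" marks compression; the format is then
        # taken from the second-to-last extension.
        compression = None
        if extensions and extensions[-1] in COMPRESSION_SUFFIXES:
            compression = COMPRESSION_SUFFIXES[extensions.pop()]

        extension = extensions[-1] if extensions else ""
        if extension in SPECIAL_EXTENSIONS:
            return SPECIAL_EXTENSIONS[extension], compression
        # Next, test against the Open Babel format names.
        if extension in openbabel_formats:
            return extension, compression
        # Anything still unknown is interpreted as SMILES.
        return "SMILES", compression

For instance, ``guess_format("chembl.sdf.gz")`` gives ``("SDF", "gzip")`` under these rules. When such a guess would be wrong, or the data comes from stdin, the help text says to name the format explicitly, for example with ``--in sdf.gz``.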