chemfp.openbabel_toolkit module

The chemfp toolkit API wrapper for the Open Babel toolkit.

This module is also available as chemfp.openbabel.

chemfp.openbabel_toolkit.name

The string “openbabel”.

chemfp.openbabel_toolkit.software

The string used in output file metadata to describe this version of Open Babel. For example, “OpenBabel/3.1.0”.

chemfp.openbabel_toolkit.ecfp0

The available version of the ‘OpenBabel-ECFP0’ fingerprint type, for example, an instance of chemfp.openbabel_types.OpenBabelECFP0FingerprintType_v1 with the full type:

OpenBabel-ECFP0/1 nBits=4096
chemfp.openbabel_toolkit.ecfp2

The available version of the ‘OpenBabel-ECFP2’ fingerprint type, for example, an instance of chemfp.openbabel_types.OpenBabelECFP2FingerprintType_v1 with the full type:

OpenBabel-ECFP2/1 nBits=4096
chemfp.openbabel_toolkit.ecfp4

The available version of the ‘OpenBabel-ECFP4’ fingerprint type, for example, an instance of chemfp.openbabel_types.OpenBabelECFP4FingerprintType_v1 with the full type:

OpenBabel-ECFP4/1 nBits=4096
chemfp.openbabel_toolkit.ecfp6

The available version of the ‘OpenBabel-ECFP6’ fingerprint type, for example, an instance of chemfp.openbabel_types.OpenBabelECFP6FingerprintType_v1 with the full type:

OpenBabel-ECFP6/1 nBits=4096
chemfp.openbabel_toolkit.ecfp8

The available version of the ‘OpenBabel-ECFP8’ fingerprint type, for example, an instance of chemfp.openbabel_types.OpenBabelECFP8FingerprintType_v1 with the full type:

OpenBabel-ECFP8/1 nBits=4096
chemfp.openbabel_toolkit.ecfp10

The available version of the ‘OpenBabel-ECFP10’ fingerprint type, for example, an instance of chemfp.openbabel_types.OpenBabelECFP10FingerprintType_v1 with the full type:

OpenBabel-ECFP10/1 nBits=4096
chemfp.openbabel_toolkit.fp2

The available version of the ‘OpenBabel-FP2’ fingerprint type, for example, an instance of chemfp.openbabel_types.OpenBabelFP2FingerprintType_v1 with the full type:

OpenBabel-FP2/1
chemfp.openbabel_toolkit.fp3

The available version of the ‘OpenBabel-FP3’ fingerprint type, for example, an instance of chemfp.openbabel_types.OpenBabelFP3FingerprintType_v1 with the full type:

OpenBabel-FP3/1
chemfp.openbabel_toolkit.fp4

The available version of the ‘OpenBabel-FP4’ fingerprint type, for example, an instance of chemfp.openbabel_types.OpenBabelFP4FingerprintType_v1 with the full type:

OpenBabel-FP4/1
chemfp.openbabel_toolkit.maccs

The available version of the ‘OpenBabel-MACCS’ fingerprint type, for example, an instance of chemfp.openbabel_types.OpenBabelMACCSFingerprintType_v2 with the full type:

OpenBabel-MACCS/2
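The full type strings listed above appear to follow a "name/version key=value" layout. As a rough illustration only, the sketch below splits such a string into its parts. The helper name split_type_string is hypothetical and not part of chemfp, which has its own type-string handling.

```python
def split_type_string(type_string):
    """Split e.g. 'OpenBabel-ECFP4/1 nBits=4096' into (name, version, params).

    Hypothetical helper for illustration; this is not chemfp's parser.
    """
    head, *terms = type_string.split()
    name, _, version = head.partition("/")
    params = {}
    for term in terms:
        key, _, value = term.partition("=")
        params[key] = value  # values are kept as strings
    return name, version, params


print(split_type_string("OpenBabel-ECFP4/1 nBits=4096"))
# ('OpenBabel-ECFP4', '1', {'nBits': '4096'})
```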
chemfp.openbabel_toolkit.is_licensed()

Return True - Open Babel is always licensed

Returns:True
chemfp.openbabel_toolkit.get_formats(include_unavailable=False)

Get the list of structure formats that Open Babel supports

If include_unavailable is True then also include Open Babel formats which aren’t available to this specific version of Open Babel.

Parameters:include_unavailable (True or False) – include unavailable formats?
Returns:a list of chemfp.base_toolkit.Format objects
chemfp.openbabel_toolkit.get_input_formats()

Get the list of supported Open Babel input formats

Returns:a list of chemfp.base_toolkit.Format objects
chemfp.openbabel_toolkit.get_output_formats()

Get the list of supported Open Babel output formats

Returns:a list of chemfp.base_toolkit.Format objects
chemfp.openbabel_toolkit.get_format(format_name)

Get the named format, or raise a ValueError

This will raise a ValueError if Open Babel does not implement the format format_name or that format is not available.

Parameters:format_name (a string) – the format name
Returns:a chemfp.base_toolkit.Format object
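Because get_format() signals an unknown or unavailable format with a ValueError, a support check can be written as a small wrapper. This is a minimal sketch: the helper name is_supported_format is hypothetical, and the toolkit argument is assumed to be a module such as chemfp.openbabel_toolkit.

```python
def is_supported_format(toolkit, format_name):
    """Return True if toolkit.get_format() accepts format_name."""
    try:
        toolkit.get_format(format_name)
    except ValueError:
        # Raised when the format is not implemented or not available.
        return False
    return True
```

With chemfp and Open Babel installed, this would be called as is_supported_format(chemfp.openbabel_toolkit, "sdf").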
chemfp.openbabel_toolkit.get_input_format(format_name)

Get the named input format, or raise a ValueError

This will raise a ValueError if Open Babel does not implement the format format_name or that format is not an input format.

Parameters:format_name (a string) – the format name
Returns:a chemfp.base_toolkit.Format object
chemfp.openbabel_toolkit.get_output_format(format_name)

Get the named output format, or raise a ValueError

This will raise a ValueError if Open Babel does not implement the format format_name or that format is not an output format.

Parameters:format_name (a string) – the format name
Returns:a chemfp.base_toolkit.Format object
chemfp.openbabel_toolkit.get_input_format_from_source(source=None, format=None)

Get the most appropriate format given the available source and format information

If format is a chemfp.base_toolkit.Format then return it. If it’s a Format-like object with “name” and “compression” attributes use it to make a real Format object with the same attributes. If it’s a string then use it to create a Format object.

If format is None, use the source to auto-detect the format. If auto-detection is not possible, assume it’s an uncompressed SMILES file.

Parameters:
  • source (a filename (as a string), a file object, or None to read from stdin) – the structure data source.
  • format (a Format(-like) object, string, or None) – format information, if known.
Returns:

a chemfp.base_toolkit.Format object
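The fallback rule described above (use the filename when possible, otherwise assume uncompressed SMILES) can be pictured with a small stand-alone sketch. This is not chemfp's implementation: the extension tables and the guess_format name are assumptions made for illustration, and the real function returns a Format object rather than a tuple.

```python
import os

# Hypothetical extension tables, for illustration only.
_COMPRESSIONS = {".gz": "gz"}
_FORMATS = {".smi": "smi", ".sdf": "sdf", ".inchi": "inchi"}


def guess_format(filename):
    """Return (name, compression), falling back to uncompressed SMILES."""
    base, ext = os.path.splitext(filename.lower())
    compression = _COMPRESSIONS.get(ext, "")
    if compression:
        # Strip the compression suffix and look at the next extension.
        base, ext = os.path.splitext(base)
    name = _FORMATS.get(ext)
    if name is None:
        # Auto-detection failed: assume an uncompressed SMILES file.
        return ("smi", "")
    return (name, compression)
```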

chemfp.openbabel_toolkit.get_output_format_from_destination(destination=None, format=None)

Get the most appropriate format given the available destination and format information

If format is a chemfp.base_toolkit.Format then return it. If it’s a Format-like object with “name” and “compression” attributes use it to make a real Format object with the same attributes. If it’s a string then use it to create a Format object.

If format is None, use the destination to auto-detect the format. If auto-detection is not possible, assume it’s an uncompressed SMILES file.

Parameters:
  • destination (a filename (as a string), a file object, or None to write to stdout) – the structure data destination.
  • format (a Format(-like) object, string, or None) – format information, if known.
Returns:

a chemfp.base_toolkit.Format object

chemfp.openbabel_toolkit.read_molecules(source=None, format=None, id_tag=None, reader_args=None, errors='strict', location=None, encoding='utf8', encoding_errors='strict')

Return an iterator that reads OBMol molecules from a structure file

Iterate through the format structure records in source. If format is None then auto-detect the format based on the source. For SD files, use id_tag to get the record id from the given SD tag instead of the title line. (read_molecules() will ignore the id_tag. It exists to make it easier to switch between reader functions.)

Note: the reader will clear and reuse the OBMol instance. Make a copy if you want to keep the molecule around.

The reader_args dictionary parameters depend on the format. Every Open Babel format supports an “options” entry, which is passed to Open Babel’s SetOptions(). See the Open Babel documentation for details. Some formats support additional parameters:

  • SMILES and InChI
    • delimiter - one of “tab”, “space”, “to-eol”, the space or tab characters, or None
    • has_header - True or False
  • SDF
    • implementation - if “openbabel” or None, use the Open Babel record parser; if “chemfp”, use chemfp’s own record parser, which has better location tracking

The errors parameter specifies how to handle errors. “strict” raises an exception, “report” sends a message to stderr and goes to the next record, and “ignore” goes to the next record.

The location parameter takes a chemfp.io.Location instance. If None then a default Location will be created.

See chemfp.openbabel_toolkit.read_ids_and_molecules() if you want (id, OBMol) pairs instead of just the molecules.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the structure source
  • format (a format name string, or Format object, or None to auto-detect) – the input structure format
  • id_tag (string, or None to use the record title) – SD tag containing the record id
  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • location (a chemfp.io.Location object, or None) – object used to track parser state information
Returns:

a chemfp.base_toolkit.MoleculeReader iterating OBMol molecules
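Since the reader clears and reuses a single OBMol, storing the iterated molecules directly would leave a list of references to the same object. The sketch below copies each molecule before keeping it. The helper name collect_copies is hypothetical, and toolkit is assumed to be a module such as chemfp.openbabel_toolkit.

```python
def collect_copies(toolkit, source, format=None):
    """Read every molecule from source and return independent copies."""
    kept = []
    for mol in toolkit.read_molecules(source, format=format):
        # The reader reuses one molecule object, so copy before storing.
        kept.append(toolkit.copy_molecule(mol))
    return kept
```

With chemfp installed, a call might look like collect_copies(chemfp.openbabel_toolkit, "example.sdf"), where the filename is a placeholder.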

chemfp.openbabel_toolkit.read_molecules_from_string(content, format, id_tag=None, reader_args=None, errors='strict', location=None)

Return an iterator that reads OBMol molecules from a string containing structure records

content is a string containing 0 or more records in the format format. See chemfp.openbabel_toolkit.read_molecules() for details about the other parameters. See chemfp.openbabel_toolkit.read_ids_and_molecules_from_string() if you want to read (id, OBMol) pairs instead of just molecules.

Note: the reader will clear and reuse the OBMol instance. Make a copy if you want to keep the molecule around.

Parameters:
  • content (a string) – the string containing structure records
  • format (a format name string, or Format object) – the input structure format
  • id_tag (string, or None to use the record title) – SD tag containing the record id
  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • location (a chemfp.io.Location object, or None) – object used to track parser state information
Returns:

a chemfp.base_toolkit.MoleculeReader iterating OBMol molecules

chemfp.openbabel_toolkit.read_ids_and_molecules(source=None, format=None, id_tag=None, reader_args=None, errors='strict', location=None, encoding='utf8', encoding_errors='strict')

Return an iterator that reads (id, OBMol molecule) pairs from a structure file

See chemfp.openbabel_toolkit.read_molecules() for full parameter details. The major difference is that this returns an iterator of (id, OBMol) pairs instead of just the molecules.

Note: the reader will clear and reuse the OBMol instance. Make a copy if you want to keep the molecule around.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the structure source
  • format (a format name string, or Format object, or None to auto-detect) – the input structure format
  • id_tag (string, or None to use the record title) – SD tag containing the record id
  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • location (a chemfp.io.Location object, or None) – object used to track parser state information
Returns:

a chemfp.base_toolkit.IdAndMoleculeReader iterating (id, OBMol) pairs

chemfp.openbabel_toolkit.read_ids_and_molecules_from_string(content, format, id_tag=None, reader_args=None, errors='strict', location=None)

Return an iterator that reads (id, OBMol) pairs from a string containing structure records

content is a string containing 0 or more records in the format format. See chemfp.openbabel_toolkit.read_molecules() for details about the other parameters. See chemfp.openbabel_toolkit.read_molecules_from_string() if you just want to read the OBMol molecules instead of (id, OBMol) pairs.

Note: the reader will clear and reuse the OBMol instance. Make a copy if you want to keep the molecule around.

Parameters:
  • content (a string) – the string containing structure records
  • format (a format name string, or Format object) – the input structure format
  • id_tag (string, or None to use the record title) – SD tag containing the record id
  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • location (a chemfp.io.Location object, or None) – object used to track parser state information
Returns:

a chemfp.base_toolkit.IdAndMoleculeReader iterating (id, OBMol) pairs
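When only the identifiers are needed, the reused OBMol does not have to be copied. A minimal sketch, assuming toolkit is a module such as chemfp.openbabel_toolkit; the helper name list_record_ids is hypothetical.

```python
def list_record_ids(toolkit, content, format):
    """Return the ids of all parsable records in the content string."""
    reader = toolkit.read_ids_and_molecules_from_string(
        content, format, errors="ignore"  # skip records that fail to parse
    )
    # Only the id is kept, so the reused molecule object is never stored.
    return [record_id for record_id, mol in reader]
```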

chemfp.openbabel_toolkit.make_id_and_molecule_parser(format, id_tag=None, reader_args=None, errors='strict')

Create a specialized function which takes a record and returns an (id, OBMol) pair

The returned function is optimized for reading many records from individual strings because it only does parameter validation once. The function will reuse the OBMol for successive calls, so make a copy if you want to keep it around. However, I haven’t really noticed much of a performance difference between this and chemfp.openbabel_toolkit.parse_id_and_molecule() so I suggest you use that function directly instead of making a specialized function. (Let me know if making a specialized function is useful.)

See chemfp.openbabel_toolkit.read_molecules() for details about the other parameters.

Parameters:
  • format (a format name string, or Format object) – the input structure format
  • id_tag (string, or None to use the record title) – SD tag containing the record id
  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
Returns:

a function of the form parser(record string) -> (id, OBMol)

chemfp.openbabel_toolkit.parse_molecule(content, format, id_tag=None, reader_args=None, errors='strict')

Parse the first structure record from the content string and return an OBMol molecule.

content is a string containing a single structure record in format format. (Additional records are ignored). See chemfp.openbabel_toolkit.read_molecules() for details about the other parameters. See chemfp.openbabel_toolkit.parse_id_and_molecule() if you want the (id, OBMol) pair instead of just the molecule.

Parameters:
  • content (a string) – the string containing a structure record
  • format (a format name string, or Format object) – the input structure format
  • id_tag (string, or None to use the record title) – SD tag containing the record id
  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
Returns:

an OBMol molecule

chemfp.openbabel_toolkit.parse_id_and_molecule(content, format, id_tag=None, reader_args=None, errors='strict')

Parse the first structure record from content and return the (id, OBMol) pair.

content is a string containing a single structure record in format format. (Additional records are ignored). See chemfp.openbabel_toolkit.read_molecules() for details about the other parameters.

See chemfp.openbabel_toolkit.parse_molecule() if you just want the OBMol molecule and not the (id, OBMol) pair.

Parameters:
  • content (a string) – the string containing a structure record
  • format (a format name string, or Format object) – the input structure format
  • id_tag (string, or None to use the record title) – SD tag containing the record id
  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
Returns:

an (id, OBMol molecule) pair

chemfp.openbabel_toolkit.create_string(mol, format, id=None, writer_args=None, errors='strict')

Convert an OBMol into a structure record in the given format as a Unicode string

If id is not None then use it instead of the molecule’s own title. Warning: this may briefly modify the molecule, so may not be thread-safe.

Parameters:
  • mol (an Open Babel molecule) – the molecule to use for the output
  • format (a format name string, or Format object) – the output structure format
  • id (a string, or None to use the molecule's own id) – an alternate record id
  • writer_args (a dictionary) – writer arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
Returns:

a Unicode string
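Combining parse_molecule() with create_string() gives a simple round trip from an input SMILES to the toolkit's output SMILES. The sketch below assumes the "smistring" format name used elsewhere in this module's documentation. The helper name canonical_smiles is hypothetical, and toolkit is assumed to be a module such as chemfp.openbabel_toolkit.

```python
def canonical_smiles(toolkit, smiles):
    """Parse a bare SMILES string and write it back out with default settings."""
    mol = toolkit.parse_molecule(smiles, "smistring")
    return toolkit.create_string(mol, "smistring")
```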

chemfp.openbabel_toolkit.create_bytes(mol, format, id=None, writer_args=None, errors='strict', level=None)

Convert an OBMol into a structure record in the given format as a byte string

If id is not None then use it instead of the molecule’s own title. Warning: this may briefly modify the molecule, so may not be thread-safe.

Parameters:
  • mol (an Open Babel molecule) – the molecule to use for the output
  • format (a format name string, or Format object) – the output structure format
  • id (a string, or None to use the molecule's own id) – an alternate record id
  • writer_args (a dictionary) – writer arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • level (None, a positive integer, or one of the strings 'min', 'default', or 'max') – compression level to use for compressed formats
Returns:

a byte string

chemfp.openbabel_toolkit.translate_record(content, in_format='smi', out_format='smi', *, id_tag=None, reader_args=None, writer_args=None, id=None, errors='strict')

Translate a molecule record from one format to another

Use the Open Babel toolkit to parse the content as format in_format (default: “smi”) and translate it into out_format (default: “smi”). For an SDF record, use id_tag to get the record id from the given SD tag instead of the title line. Use reader_args and writer_args to configure format-specific parameters. Use id to set the id of the output record.

The errors parameter specifies how to handle errors. “strict” raises an exception, “report” sends a message to stderr and goes to the next record, and “ignore” goes to the next record.

Parameters:
  • content (a string) – the string containing a structure record
  • in_format (a format name string, or Format object) – the input structure format
  • out_format (a format name string, or Format object) – the output structure format
  • id_tag (string, or None to use the record title) – SD tag containing the record id
  • reader_args (a dictionary, or None) – reader arguments for the specified in_format
  • writer_args (a dictionary, or None) – writer arguments for the specified out_format
  • id (a string, or None to use the default) – the record id to use for the output record
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
Returns:

a string

chemfp.openbabel_toolkit.open_molecule_writer(destination=None, format=None, writer_args=None, errors='strict', location=None, encoding='utf8', encoding_errors='strict', level=None)

Return a MoleculeWriter which can write Open Babel molecules to a destination.

A chemfp.base_toolkit.MoleculeWriter has the methods write_molecule, write_molecules, and write_ids_and_molecules, which are ways to write an OBMol molecule, an OBMol molecule iterator, or an (id, OBMol molecule) pair iterator to a file.

Molecules are written to destination. The output format can be a string like “sdf.gz” or “smi”, a chemfp.base_toolkit.Format, or Format-like object with “name” and “compression” attributes, or None to auto-detect based on the destination. If auto-detection is not possible, the output will be written as uncompressed SMILES.

The writer_args dictionary parameters depend on the format. Every format supports an options entry, which is passed to Open Babel’s SetOptions(). See the Open Babel documentation for details. Some formats support additional parameters:

  • SMILES
    • delimiter - one of “tab”, “space”, “to-eol”, the space or tab characters, or None
    • isomeric - True to write isomeric SMILES, False or default is non-isomeric
    • canonicalization - True, “default”, or None uses Open Babel’s own canonicalization algorithm; False or “none” to use no canonicalization; “universal” generates a universal SMILES; “anticanonical” generates a SMILES with randomly assigned atom classes; “inchified” uses InChI-fied SMILES
  • InChI and InChIKey
    • delimiter - one of “tab”, “space”, “to-eol”, the space or tab characters, or None
    • include_id - True or default to include the id as the second column; False has no id column
  • SDF
    • always_v3000 - True to always write V3000 files; False or default to write V3000 files only if needed.
    • include_atom_class - True to include atom class; False or default does not
    • include_hcount - True to include hcount; False or default does not

The errors parameter specifies how to handle errors. “strict” raises an exception, “report” sends a message to stderr and goes to the next record, and “ignore” goes to the next record.

The location parameter takes a chemfp.io.Location instance. If None then a default Location will be created.

Parameters:
  • destination (a filename, file object, or None to write to stdout) – the structure destination
  • format (a format name string, or Format(-like) object, or None to auto-detect) – the output structure format
  • writer_args (a dictionary) – writer arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • location (a chemfp.io.Location object, or None) – object used to track writer state information
  • level (None, a positive integer, or one of the strings 'min', 'default', or 'max') – compression level to use for compressed formats (does not affect Open Babel)
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting Open Babel molecules

chemfp.openbabel_toolkit.open_molecule_writer_to_string(format, writer_args=None, errors='strict', location=None)

Return a MoleculeStringWriter which can write Open Babel molecule records to a string.

See chemfp.openbabel_toolkit.open_molecule_writer() for full parameter details.

Use the writer’s chemfp.base_toolkit.MoleculeStringWriter.getvalue() to get the output as a Unicode string.

Parameters:
  • format (a format name string, or Format(-like) object, or None to auto-detect) – the output structure format
  • writer_args (a dictionary) – writer arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • location (a chemfp.io.Location object, or None) – object used to track writer state information
Returns:

a chemfp.base_toolkit.MoleculeStringWriter expecting Open Babel molecules
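A typical pattern is to open the string writer, write the records, and then fetch the text with getvalue(). The sketch below uses write_ids_and_molecules() and getvalue() as documented. The helper name molecules_to_text is hypothetical, and calling close() before getvalue() is an assumption about the writer interface.

```python
def molecules_to_text(toolkit, id_mol_pairs, format="smi", writer_args=None):
    """Write (id, molecule) pairs to a string in the given format."""
    writer = toolkit.open_molecule_writer_to_string(format, writer_args=writer_args)
    try:
        writer.write_ids_and_molecules(id_mol_pairs)
    finally:
        # Assumption: the writer has a close() method which flushes the output.
        writer.close()
    return writer.getvalue()
```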

chemfp.openbabel_toolkit.open_molecule_writer_to_bytes(format, writer_args=None, errors='strict', location=None, level=None)

Return a MoleculeStringWriter which can write Open Babel molecule records to a byte string

See chemfp.openbabel_toolkit.open_molecule_writer() for full parameter details.

Use the writer’s chemfp.base_toolkit.MoleculeStringWriter.getvalue() to get the output as a byte string.

Parameters:
  • format (a format name string, or Format(-like) object, or None to auto-detect) – the output structure format
  • writer_args (a dictionary) – writer arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • location (a chemfp.io.Location object, or None) – object used to track writer state information
  • level (None, a positive integer, or one of the strings 'min', 'default', or 'max') – compression level to use for compressed formats (does not affect Open Babel)
Returns:

a chemfp.base_toolkit.MoleculeStringWriter expecting Open Babel molecules

chemfp.openbabel_toolkit.copy_molecule(mol)

Return a new OBMol molecule which is a copy of the given Open Babel molecule

Parameters:mol (an Open Babel molecule) – the molecule to copy
Returns:a new OBMol instance
chemfp.openbabel_toolkit.add_tag(mol, tag, value)

Add an SD tag value to the Open Babel molecule

Raises a KeyError if the tag is a special internal Open Babel name.

Parameters:
  • mol (an Open Babel molecule) – the molecule
  • tag (string) – the SD tag name
  • value (string) – the text for the tag
Returns:

None

chemfp.openbabel_toolkit.get_tag(mol, tag)

Get the named SD tag value, or None if it doesn’t exist

Parameters:
  • mol (an Open Babel molecule) – the molecule
  • tag (string) – the SD tag name
Returns:

a string, or None

chemfp.openbabel_toolkit.get_tag_pairs(mol)

Get a list of all SD tag (name, value) pairs for the molecule

Parameters:mol (an Open Babel molecule) – the molecule
Returns:a list of (string name, string value) pairs
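The tag functions can be combined to annotate a molecule and read the result back as a dictionary. A minimal sketch, assuming toolkit is a module such as chemfp.openbabel_toolkit; the helper name set_tags is hypothetical.

```python
def set_tags(toolkit, mol, tags):
    """Add each (name, value) in tags to mol and return all tags as a dict."""
    for name, value in tags.items():
        # add_tag() expects text, so convert non-string values first.
        toolkit.add_tag(mol, name, str(value))
    return dict(toolkit.get_tag_pairs(mol))
```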
chemfp.openbabel_toolkit.get_id(mol)

Get the molecule’s id using Open Babel’s GetTitle()

Parameters:mol (an Open Babel molecule) – the molecule
Returns:a string
chemfp.openbabel_toolkit.set_id(mol, id)

Set the molecule’s id using Open Babel’s SetTitle()

Parameters:
  • mol (an Open Babel molecule) – the molecule
  • id (string) – the new id
Returns:

None

chemfp.openbabel_toolkit.parse_smistring(content: Union[str, bytes], *, options: Optional[str, None] = None, cxsmiles: bool = True, errors: str = 'strict')

Parse a SMILES string using the Open Babel toolkit

This is equivalent to calling:

parse_molecule(content, "smistring", reader_args={...}, errors=errors)
Parameters:
  • options (string or None) – options string passed to Open Babel
  • cxsmiles (Boolean (default: True)) – If true, strip CXSMILES extension before processing. Open Babel does not support CXSMILES.
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an Open Babel molecule object

chemfp.openbabel_toolkit.create_smistring(mol: Any, *, id: Optional[str, None] = None, options: Optional[str, None] = None, isomeric: bool = True, canonicalization: Literal[True, '1', 'True', 'true', 'default', '0', False, 'False', 'false', 'none', 'universal', 'inchified', 'anticanonical'] = True, explicit_hydrogens: bool = False, cxsmiles: bool = False, errors: str = 'strict') → Optional[str, None]

Generate a SMILES string from an Open Babel molecule

This is equivalent to calling:

create_string(mol, "smistring", id=id, writer_args={...}, errors=errors)

The ‘canonicalization’ value can be one of:

  • True, ‘1’, ‘True’, ‘true’, or ‘default’ for standard canonicalization
  • False, ‘0’, ‘False’, ‘false’, or ‘none’ for no canonicalization
  • ‘universal’ for Universal SMILES (based on the InChI numbering)
  • ‘inchified’ for the InChI numbering
  • ‘anticanonical’ for a randomly selected initial numbering

Parameters:
  • mol (an Open Babel molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • options (string or None) – options string passed to Open Babel
  • isomeric (Boolean (default: True)) – if true, generate isomeric SMILES
  • canonicalization (Boolean or string (default: True)) – canonicalization method
  • explicit_hydrogens (Boolean (default: False)) – if true, use explicit hydrogens as [H]
  • cxsmiles (Boolean (default: False)) – Compatibility flag. Open Babel does not support CXSMILES.
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a string, or None if errors are ignored
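The accepted ‘canonicalization’ aliases can be reduced to five distinct methods. The sketch below shows one way to do that mapping. It is a hypothetical helper for illustration, not chemfp's own validation code, and the returned names are simply the ones used in the description above.

```python
def normalize_canonicalization(value):
    """Map the accepted aliases onto one name per canonicalization method."""
    if value in (True, "1", "True", "true", "default"):
        return "default"
    if value in (False, "0", "False", "false", "none"):
        return "none"
    if value in ("universal", "inchified", "anticanonical"):
        return value
    raise ValueError(f"unsupported canonicalization: {value!r}")
```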

chemfp.openbabel_toolkit.parse_smi(content: Union[str, bytes], *, options: Optional[str, None] = None, cxsmiles: bool = True, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, has_header: bool = False, errors: str = 'strict')

Parse a SMILES string and its id using the Open Babel toolkit

This is equivalent to calling:

parse_molecule(content, "smi", reader_args={...}, errors=errors)
Parameters:
  • options (string or None) – options string passed to Open Babel
  • cxsmiles (Boolean (default: True)) – If true, strip CXSMILES extension before processing. Open Babel does not support CXSMILES.
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id
  • has_header (Boolean (default: False)) – If true, treat the first line of the SMILES file as a header
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an Open Babel molecule object
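The delimiter setting controls how a SMILES line is divided into the SMILES and the id. The sketch below illustrates one plausible reading of four of the delimiter names. The splitting rules here are assumptions made for illustration rather than chemfp's implementation, the helper name split_smiles_line is hypothetical, and the 'comma' and 'native' styles are left out.

```python
def split_smiles_line(line, delimiter="to-eol"):
    """Return (smiles, id) for one line; id is None if the line has no id.

    Assumed meanings: "to-eol" takes the rest of the line as the id, while
    "tab", "space", and "whitespace" take only the second field.
    """
    line = line.rstrip("\r\n")
    if delimiter in ("to-eol", "to_eol"):
        parts = line.split(None, 1)
    elif delimiter in ("tab", "\t"):
        parts = line.split("\t")
    elif delimiter in ("space", " "):
        parts = line.split(" ")
    elif delimiter == "whitespace":
        parts = line.split()
    else:
        raise ValueError(f"unsupported delimiter: {delimiter!r}")
    smiles = parts[0] if parts else ""
    record_id = parts[1] if len(parts) > 1 else None
    return smiles, record_id
```

Under these assumptions, the line "CCO ethyl alcohol" gives the id "ethyl alcohol" with "to-eol" but only "ethyl" with "space".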

chemfp.openbabel_toolkit.create_smi(mol: Any, *, id: Optional[str, None] = None, options: Optional[str, None] = None, isomeric: bool = True, canonicalization: Literal[True, '1', 'True', 'true', 'default', '0', False, 'False', 'false', 'none', 'universal', 'inchified', 'anticanonical'] = 'default', explicit_hydrogens: bool = False, cxsmiles: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, errors: str = 'strict') → Optional[str, None]

Generate a SMILES string and its id from an Open Babel molecule

This is equivalent to calling:

create_string(mol, "smi", id=id, writer_args={...}, errors=errors)
Parameters:
  • mol (an Open Babel molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • options (string or None) – options string passed to Open Babel
  • isomeric (Boolean (default: True)) – if true, generate isomeric SMILES
  • canonicalization (Boolean or string (default: "default")) – canonicalization method
  • explicit_hydrogens (Boolean (default: False)) – if true, use explicit hydrogens as [H]
  • cxsmiles (Boolean (default: False)) – Compatibility flag. Open Babel does not support CXSMILES.
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a string, or None if errors are ignored

chemfp.openbabel_toolkit.read_smi_molecules(source: Union[None, str, BinaryIO], *, options: Optional[str, None] = None, cxsmiles: bool = True, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, has_header: bool = False, errors: str = 'strict')

Read molecules from a SMILES file using the Open Babel toolkit

This is mostly equivalent to calling:

read_molecules(source, "smi", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • cxsmiles (Boolean (default: True)) – If true, strip CXSMILES extension before processing. Open Babel does not support CXSMILES.
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id
  • has_header (Boolean (default: False)) – If true, treat the first line of the SMILES file as a header
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.read_smi_ids_and_molecules(source: Union[None, str, BinaryIO], *, options: Optional[str, None] = None, cxsmiles: bool = True, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, has_header: bool = False, errors: str = 'strict')

Read ids and molecules from a SMILES file using the Open Babel toolkit

This is mostly equivalent to calling:

read_ids_and_molecules(source, "smi", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • cxsmiles (Boolean (default: True)) – If true, strip CXSMILES extension before processing. Open Babel does not support CXSMILES.
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id
  • has_header (Boolean (default: False)) – If true, treat the first line of the SMILES file as a header
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.IdAndMoleculeReader iterating (id, Open Babel molecule) pairs

chemfp.openbabel_toolkit.read_smi_molecules_from_string(content: Union[str, bytes], *, options: Optional[str, None] = None, cxsmiles: bool = True, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, has_header: bool = False, errors: str = 'strict')

Read molecules from a string containing a SMILES file using the Open Babel toolkit

This is equivalent to calling:

read_molecules_from_string(content, "smi", reader_args={...}, errors=errors)

Use read_molecules_from_string() if the content is compressed.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • cxsmiles (Boolean (default: True)) – If true, strip CXSMILES extension before processing. Open Babel does not support CXSMILES.
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id
  • has_header (Boolean (default: False)) – If true, treat the first line of the SMILES file as a header
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.read_smi_ids_and_molecules_from_string(content: Union[str, bytes], *, options: Optional[str, None] = None, cxsmiles: bool = True, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, has_header: bool = False, errors: str = 'strict')

Read ids and molecules from a string containing a SMILES file using the Open Babel toolkit

This is equivalent to calling:

read_ids_and_molecules_from_string(content, "smi", reader_args={...}, errors=errors)

Use read_ids_and_molecules_from_string() if the content is compressed.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • cxsmiles (Boolean (default: True)) – If true, strip CXSMILES extension before processing. Open Babel does not support CXSMILES.
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id
  • has_header (Boolean (default: False)) – If true, treat the first line of the SMILES file as a header
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.IdAndMoleculeReader iterating Open Babel molecules
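
A minimal sketch of the delimiter and has_header parameters, assuming tab-separated content with one header line. The content is made up, and NumAtoms() is a method of Open Babel's OBMol rather than part of chemfp.

```python
try:
    from chemfp import openbabel_toolkit as T  # needs chemfp and Open Babel
except ImportError:
    T = None  # lets the sketch load where they are not installed

# Made-up tab-separated SMILES content with a header line.
SMILES_CONTENT = (
    "SMILES\tname\n"
    "c1ccccc1O\tphenol\n"
    "CCO\tethanol\n"
)

def read_ids_and_atom_counts(toolkit, content):
    """Return (id, atom count) pairs for the records in a SMILES string."""
    reader = toolkit.read_smi_ids_and_molecules_from_string(
        content,
        delimiter="tab",   # the id is in the second tab-separated column
        has_header=True,   # skip the "SMILES<tab>name" line
        errors="strict",   # raise an exception on an unparsable record
    )
    # NumAtoms() belongs to Open Babel's OBMol and counts explicit atoms.
    return [(record_id, mol.NumAtoms()) for record_id, mol in reader]

records = None
if T is not None:
    records = read_ids_and_atom_counts(T, SMILES_CONTENT)
    print(records)
```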

chemfp.openbabel_toolkit.open_smi_writer(destination: Union[None, str, BinaryIO], *, options: Optional[str, None] = None, isomeric: bool = True, canonicalization: Literal[True, 1, 'True', 'true', 'default', 0, False, 'False', 'false', 'none', 'universal', 'inchified', 'anticanonical'] = 'default', explicit_hydrogens: bool = False, cxsmiles: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, errors: str = 'strict')

Open a SMILES file to write Open Babel molecules

This is mostly equivalent to calling:

open_molecule_writer(destination, "smi", writer_args={...}, errors=errors)

along with compression based on the destination filename’s extension.

Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules
  • options (string or None) – options string passed to Open Babel
  • isomeric (Boolean (default: True)) – if true, generate isomeric SMILES
  • canonicalization (Boolean or string (default: "default")) – canonicalization method
  • explicit_hydrogens (Boolean (default: False)) – if true, use explicit hydrogens as [H]
  • cxsmiles (Boolean (default: False)) – Compatibility flag. Open Babel does not support CXSMILES.
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting Open Babel molecules

chemfp.openbabel_toolkit.open_smi_writer_to_string(*, options: Optional[str, None] = None, isomeric: bool = True, canonicalization: Literal[True, 1, 'True', 'true', 'default', 0, False, 'False', 'false', 'none', 'universal', 'inchified', 'anticanonical'] = 'default', explicit_hydrogens: bool = False, cxsmiles: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, errors: str = 'strict')

Open a SMILES file to write Open Babel molecules to an in-memory string

This is equivalent to calling:

open_molecule_writer_to_string("smi", writer_args={...}, errors=errors)

Use write_molecules_to_string() to write compressed output.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • isomeric (Boolean (default: True)) – if true, generate isomeric SMILES
  • canonicalization (Boolean or string (default: "default")) – canonicalization method
  • explicit_hydrogens (Boolean (default: False)) – if true, use explicit hydrogens as [H]
  • cxsmiles (Boolean (default: False)) – Compatibility flag. Open Babel does not support CXSMILES.
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting Open Babel molecules
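
The in-memory writer can be sketched as below. The writer methods write_ids_and_molecules(), close() and getvalue() are not listed in this section; they are assumed from chemfp's base toolkit writer API, and the input records are made up.

```python
try:
    from chemfp import openbabel_toolkit as T  # needs chemfp and Open Babel
except ImportError:
    T = None  # lets the sketch load where they are not installed

def rewrite_smiles(toolkit, content):
    """Read SMILES records and return them re-written as SMILES text."""
    reader = toolkit.read_smi_ids_and_molecules_from_string(content)
    writer = toolkit.open_smi_writer_to_string(
        isomeric=True,
        canonicalization="default",
        delimiter="tab",   # tab between the SMILES and the id
    )
    # Assumed writer methods (not documented in this section).
    writer.write_ids_and_molecules(reader)
    writer.close()
    return writer.getvalue()

smiles_text = None
if T is not None:
    smiles_text = rewrite_smiles(T, "OCC ethanol\nOc1ccccc1 phenol\n")
    print(smiles_text)
```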

chemfp.openbabel_toolkit.parse_sdf(content: Union[str, bytes], *, implementation: Literal[None, 'openbabel', 'chemfp'] = 'openbabel', perceive_stereo: bool = False, perceive_0d_stereo: bool = False, options: Optional[str, None] = None, errors: str = 'strict')

Parse an SDF record using the Open Babel toolkit

This is equivalent to calling:

parse_molecule(content, "sdf", reader_args={...}, errors=errors)
Parameters:
  • implementation (None or 'openbabel' to use Open Babel, else 'chemfp' (default: "openbabel")) – SDF record tokenizer implementation
  • perceive_stereo (Boolean (default: False)) – not implemented
  • perceive_0d_stereo (Boolean (default: False)) – not implemented
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an Open Babel molecule object

chemfp.openbabel_toolkit.create_sdf(mol: Any, *, id: Optional[str, None] = None, always_v3000: bool = False, include_atom_class: bool = False, include_hcount: bool = False, options: Optional[str, None] = None, errors: str = 'strict') → Optional[str, None]

Generate an SDF record from an Open Babel molecule

This is equivalent to calling:

create_string(mol, "sdf", id=id, writer_args={...}, errors=errors)
Parameters:
  • mol (an Open Babel molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • always_v3000 (Boolean (default: False)) – if true, always save records in V3000 format
  • include_atom_class (Boolean (default: False)) – if true, include atom classes in the output
  • include_hcount (Boolean (default: False)) – if true, always include hydrogen counts in the output
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a string, or None if errors are ignored
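
A minimal round trip through parse_sdf() and create_sdf(), assuming a hand-written V2000 record for ethanol (heavy atoms only, made-up coordinates). The id argument is expected to replace the record title in the output.

```python
try:
    from chemfp import openbabel_toolkit as T  # needs chemfp and Open Babel
except ImportError:
    T = None  # lets the sketch load where they are not installed

# Hand-written V2000 record; the coordinates are illustrative only.
SDF_RECORD = "\n".join([
    "ethanol",
    "  sketch",
    "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.5000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    2.2000    1.2000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0  0  0  0",
    "  2  3  1  0  0  0  0",
    "M  END",
    "$$$$",
    "",
])

def roundtrip_sdf(toolkit, record, new_id):
    """Parse one SDF record and write it back under a different id."""
    mol = toolkit.parse_sdf(record, errors="strict")
    return toolkit.create_sdf(mol, id=new_id, errors="strict")

sdf_text = None
if T is not None:
    sdf_text = roundtrip_sdf(T, SDF_RECORD, "renamed")
    print(sdf_text)
```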

chemfp.openbabel_toolkit.read_sdf_molecules(source: Union[None, str, BinaryIO], *, implementation: Literal[None, 'openbabel', 'chemfp'] = 'openbabel', perceive_stereo: bool = False, perceive_0d_stereo: bool = False, options: Optional[str, None] = None, errors: str = 'strict')

Read molecules from an SDF file using the Open Babel toolkit

This is mostly equivalent to calling:

read_molecules(source, "sdf", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

Parameters:
  • implementation (None or 'openbabel' to use Open Babel, else 'chemfp' (default: "openbabel")) – SDF record tokenizer implementation
  • perceive_stereo (Boolean (default: False)) – not implemented
  • perceive_0d_stereo (Boolean (default: False)) – not implemented
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.read_sdf_ids_and_molecules(source: Union[None, str, BinaryIO], *, id_tag: Optional[None, str] = None, implementation: Literal[None, 'openbabel', 'chemfp'] = 'openbabel', perceive_stereo: bool = False, perceive_0d_stereo: bool = False, options: Optional[str, None] = None, errors: str = 'strict')

Read ids and molecules from an SDF file using the Open Babel toolkit

This is mostly equivalent to calling:

read_ids_and_molecules(source, "sdf", id_tag=id_tag, reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

Parameters:
  • id_tag (a string, or None to use the title) – get the id from the named data item instead of using the record title
  • implementation (None or 'openbabel' to use Open Babel, else 'chemfp' (default: "openbabel")) – SDF record tokenizer implementation
  • perceive_stereo (Boolean (default: False)) – not implemented
  • perceive_0d_stereo (Boolean (default: False)) – not implemented
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.IdAndMoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.read_sdf_molecules_from_string(content: Union[str, bytes], *, implementation: Literal[None, 'openbabel', 'chemfp'] = 'openbabel', perceive_stereo: bool = False, perceive_0d_stereo: bool = False, options: Optional[str, None] = None, errors: str = 'strict')

Read molecules from a string containing an SDF file using the Open Babel toolkit

This is equivalent to calling:

read_molecules_from_string(content, "sdf", reader_args={...}, errors=errors)

Use read_molecules_from_string() if the content is compressed.

Parameters:
  • implementation (None or 'openbabel' to use Open Babel, else 'chemfp' (default: "openbabel")) – SDF record tokenizer implementation
  • perceive_stereo (Boolean (default: False)) – not implemented
  • perceive_0d_stereo (Boolean (default: False)) – not implemented
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.read_sdf_ids_and_molecules_from_string(content: Union[str, bytes], *, id_tag: Optional[None, str] = None, implementation: Literal[None, 'openbabel', 'chemfp'] = 'openbabel', perceive_stereo: bool = False, perceive_0d_stereo: bool = False, options: Optional[str, None] = None, errors: str = 'strict')

Read ids and molecules from a string containing an SDF file using the Open Babel toolkit

This is equivalent to calling:

read_ids_and_molecules_from_string(content, "sdf", id_tag=id_tag, reader_args={...}, errors=errors)

Use read_ids_and_molecules_from_string() if the content is compressed.

Parameters:
  • id_tag (a string, or None to use the title) – get the id from the named data item instead of using the record title
  • implementation (None or 'openbabel' to use Open Babel, else 'chemfp' (default: "openbabel")) – SDF record tokenizer implementation
  • perceive_stereo (Boolean (default: False)) – not implemented
  • perceive_0d_stereo (Boolean (default: False)) – not implemented
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.IdAndMoleculeReader iterating Open Babel molecules
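
The effect of id_tag can be sketched with two one-atom records that each carry a "CAS" data item. The helper make_record() and the tag name are made up for illustration; only the reader call comes from this section.

```python
try:
    from chemfp import openbabel_toolkit as T  # needs chemfp and Open Babel
except ImportError:
    T = None  # lets the sketch load where they are not installed

def make_record(title, symbol, cas):
    """Build a one-atom V2000 SDF record with a CAS data item (illustrative)."""
    atom_line = "    0.0000    0.0000    0.0000 %-3s 0" % symbol + "  0" * 11
    return "\n".join([
        title, "  sketch", "",
        "  1  0  0  0  0  0  0  0  0  0999 V2000",
        atom_line,
        "M  END",
        "> <CAS>", cas, "",
        "$$$$", "",
    ])

SDF_CONTENT = (make_record("water", "O", "7732-18-5")
               + make_record("methane", "C", "74-82-8"))

def read_ids(toolkit, content, id_tag=None):
    """Return the record ids, taken from the title or from the named tag."""
    reader = toolkit.read_sdf_ids_and_molecules_from_string(
        content, id_tag=id_tag)
    return [record_id for record_id, mol in reader]

ids_by_title = ids_by_tag = None
if T is not None:
    ids_by_title = read_ids(T, SDF_CONTENT)               # record titles
    ids_by_tag = read_ids(T, SDF_CONTENT, id_tag="CAS")   # "CAS" data items
    print(ids_by_title, ids_by_tag)
```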

chemfp.openbabel_toolkit.open_sdf_writer(destination: Union[None, str, BinaryIO], *, always_v3000: bool = False, include_atom_class: bool = False, include_hcount: bool = False, options: Optional[str, None] = None, errors: str = 'strict')

Open an SDF file to write Open Babel molecules

This is mostly equivalent to calling:

open_molecule_writer(destination, "sdf", writer_args={...}, errors=errors)

along with compression based on the destination filename’s extension.

Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules
  • always_v3000 (Boolean (default: False)) – if true, always save records in V3000 format
  • include_atom_class (Boolean (default: False)) – if true, include atom classes in the output
  • include_hcount (Boolean (default: False)) – if true, always include hydrogen counts in the output
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting Open Babel molecules
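
A minimal sketch of compression chosen from the destination filename. It assumes that the writer can be used as a context manager and has a write_molecules() method, neither of which is listed in this section. The input SMILES are made up.

```python
import gzip
import os
import tempfile

try:
    from chemfp import openbabel_toolkit as T  # needs chemfp and Open Babel
except ImportError:
    T = None  # lets the sketch load where they are not installed

def write_gzipped_sdf(toolkit, smiles_content, path):
    """Convert SMILES records into a gzip-compressed SDF file."""
    reader = toolkit.read_smi_molecules_from_string(smiles_content)
    # The ".gz" suffix of the destination selects gzip compression.
    with toolkit.open_sdf_writer(path) as writer:
        writer.write_molecules(reader)

def count_sdf_records(path):
    """Count the "$$$$" record terminators in a gzipped SDF file."""
    with gzip.open(path, "rt") as infile:
        return infile.read().count("$$$$")

sdf_path = os.path.join(tempfile.mkdtemp(), "example.sdf.gz")
n_records = None
if T is not None:
    write_gzipped_sdf(T, "CCO ethanol\nc1ccccc1O phenol\n", sdf_path)
    n_records = count_sdf_records(sdf_path)
    print(n_records)
```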

chemfp.openbabel_toolkit.open_sdf_writer_to_string(*, always_v3000: bool = False, include_atom_class: bool = False, include_hcount: bool = False, options: Optional[str, None] = None, errors: str = 'strict')

Open an SDF file to write Open Babel molecules to an in-memory string

This is equivalent to calling:

open_molecule_writer_to_string("sdf", writer_args={...}, errors=errors)

Use write_molecules_to_string() to write compressed output.

Parameters:
  • always_v3000 (Boolean (default: False)) – if true, always save records in V3000 format
  • include_atom_class (Boolean (default: False)) – if true, include atom classes in the output
  • include_hcount (Boolean (default: False)) – if true, always include hydrogen counts in the output
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting Open Babel molecules

chemfp.openbabel_toolkit.create_sdf3k(mol: Any, *, id: Optional[str, None] = None, always_v3000: bool = True, include_atom_class: bool = False, include_hcount: bool = False, options: Optional[str, None] = None, errors: str = 'strict') → Optional[str, None]

Generate an SDF record in V3000 format from an Open Babel molecule

This is equivalent to calling:

create_string(mol, "sdf3k", id=id, writer_args={...}, errors=errors)
Parameters:
  • mol (an Open Babel molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • always_v3000 (Boolean (default: True)) – if true, always save records in V3000 format
  • include_atom_class (Boolean (default: False)) – if true, include atom classes in the output
  • include_hcount (Boolean (default: False)) – if true, always include hydrogen counts in the output
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a string, or None if errors are ignored
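
Judging from the signatures, create_sdf3k() differs from create_sdf() only in the default for always_v3000. The sketch below compares the two, assuming a molecule taken from a made-up SMILES record.

```python
try:
    from chemfp import openbabel_toolkit as T  # needs chemfp and Open Babel
except ImportError:
    T = None  # lets the sketch load where they are not installed

def first_molecule(toolkit, smiles_content):
    """Return the first molecule read from a SMILES string."""
    for mol in toolkit.read_smi_molecules_from_string(smiles_content):
        return mol
    return None

def sdf_versions(toolkit, mol):
    """Return the same molecule as default SDF, forced V3000, and sdf3k text."""
    default_text = toolkit.create_sdf(mol)
    forced_text = toolkit.create_sdf(mol, always_v3000=True)
    sdf3k_text = toolkit.create_sdf3k(mol)
    return default_text, forced_text, sdf3k_text

texts = None
if T is not None:
    mol = first_molecule(T, "CCO ethanol\n")
    texts = sdf_versions(T, mol)
    # Show which connection table version each record declares.
    print(["V3000" in text for text in texts])
```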

chemfp.openbabel_toolkit.open_sdf3k_writer(destination: Union[None, str, BinaryIO], *, always_v3000: bool = True, include_atom_class: bool = False, include_hcount: bool = False, options: Optional[str, None] = None, errors: str = 'strict')

Open an SDF file in V3000 format to write Open Babel molecules

This is mostly equivalent to calling:

open_molecule_writer(destination, "sdf3k", writer_args={...}, errors=errors)

along with compression based on the destination filename’s extension.

Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules
  • always_v3000 (Boolean (default: True)) – if true, always save records in V3000 format
  • include_atom_class (Boolean (default: False)) – if true, include atom classes in the output
  • include_hcount (Boolean (default: False)) – if true, always include hydrogen counts in the output
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting Open Babel molecules

chemfp.openbabel_toolkit.open_sdf3k_writer_to_string(*, always_v3000: bool = True, include_atom_class: bool = False, include_hcount: bool = False, options: Optional[str, None] = None, errors: str = 'strict')

Open an SDF file in V3000 format to write Open Babel molecules to an in-memory string

This is equivalent to calling:

open_molecule_writer_to_string("sdf3k", writer_args={...}, errors=errors)

Use write_molecules_to_string() to write compressed output.

Parameters:
  • always_v3000 (Boolean (default: True)) – if true, always save records in V3000 format
  • include_atom_class (Boolean (default: False)) – if true, include atom classes in the output
  • include_hcount (Boolean (default: False)) – if true, always include hydrogen counts in the output
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting Open Babel molecules

chemfp.openbabel_toolkit.parse_fasta(content: Union[str, bytes], *, options: Optional[str, None] = None, errors: str = 'strict')

Parse a FASTA record using the Open Babel toolkit

This is equivalent to calling:

parse_molecule(content, "fasta", reader_args={...}, errors=errors)
Parameters:
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an Open Babel molecule object

chemfp.openbabel_toolkit.create_fasta(mol: Any, *, id: Optional[str, None] = None, options: Optional[str, None] = None, errors: str = 'strict') → Optional[str, None]

Generate a FASTA record from an Open Babel molecule

This is equivalent to calling:

create_string(mol, "fasta", id=id, writer_args={...}, errors=errors)
Parameters:
  • mol (an Open Babel molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a string, or None if errors are ignored

chemfp.openbabel_toolkit.read_fasta_molecules(source: Union[None, str, BinaryIO], *, options: Optional[str, None] = None, errors: str = 'strict')

Read molecules from a FASTA file using the Open Babel toolkit

This is mostly equivalent to calling:

read_molecules(source, "fasta", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.read_fasta_ids_and_molecules(source: Union[None, str, BinaryIO], *, options: Optional[str, None] = None, errors: str = 'strict')

Read ids and molecules from a FASTA file using the Open Babel toolkit

This is mostly equivalent to calling:

read_ids_and_molecules(source, "fasta", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.IdAndMoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.read_fasta_molecules_from_string(content: Union[str, bytes], *, options: Optional[str, None] = None, errors: str = 'strict')

Read molecules from a string containing a FASTA file using the Open Babel toolkit

This is equivalent to calling:

read_molecules_from_string(content, "fasta", reader_args={...}, errors=errors)

Use read_molecules_from_string() if the content is compressed.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.read_fasta_ids_and_molecules_from_string(content: Union[str, bytes], *, options: Optional[str, None] = None, errors: str = 'strict')

Read ids and molecules from a string containing a FASTA file using the Open Babel toolkit

This is equivalent to calling:

read_ids_and_molecules_from_string(content, "fasta", reader_args={...}, errors=errors)

Use read_ids_and_molecules_from_string() if the content is compressed.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.IdAndMoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.open_fasta_writer(destination: Union[None, str, BinaryIO], *, options: Optional[str, None] = None, errors: str = 'strict')

Open a FASTA file to write Open Babel molecules

This is mostly equivalent to calling:

open_molecule_writer(destination, "fasta", writer_args={...}, errors=errors)

along with compression based on the destination filename’s extension.

Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting Open Babel molecules

chemfp.openbabel_toolkit.open_fasta_writer_to_string(*, options: Optional[str, None] = None, errors: str = 'strict')

Open a FASTA file to write Open Babel molecules to an in-memory string

This is equivalent to calling:

open_molecule_writer_to_string("fasta", writer_args={...}, errors=errors)

Use write_molecules_to_string() to write compressed output.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting Open Babel molecules

chemfp.openbabel_toolkit.parse_pdb(content: Union[str, bytes], *, options: Optional[str, None] = None, errors: str = 'strict')

Parse a PDB record using the Open Babel toolkit

This is equivalent to calling:

parse_molecule(content, "pdb", reader_args={...}, errors=errors)
Parameters:
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an Open Babel molecule object

chemfp.openbabel_toolkit.create_pdb(mol: Any, *, id: Optional[str, None] = None, options: Optional[str, None] = None, errors: str = 'strict') → Optional[str, None]

Generate a PDB record from an Open Babel molecule

This is equivalent to calling:

create_string(mol, "pdb", id=id, writer_args={...}, errors=errors)
Parameters:
  • mol (an Open Babel molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a string, or None if errors are ignored
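
A minimal round trip through parse_pdb() and create_pdb(), assuming a hand-built water record. The helper hetatm_line() is made up for illustration and follows the fixed-column PDB HETATM layout; the coordinates are approximate.

```python
try:
    from chemfp import openbabel_toolkit as T  # needs chemfp and Open Babel
except ImportError:
    T = None  # lets the sketch load where they are not installed

def hetatm_line(serial, name, element, x, y, z):
    """Format one 78-column HETATM line for residue HOH, chain A (illustrative)."""
    return ("HETATM%5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f"
            "          %2s") % (serial, name, "HOH", "A", 1, x, y, z,
                                1.0, 0.0, element)

PDB_RECORD = "\n".join([
    hetatm_line(1, " O", "O", 0.000, 0.000, 0.000),
    hetatm_line(2, " H1", "H", 0.957, 0.000, 0.000),
    hetatm_line(3, " H2", "H", -0.240, 0.927, 0.000),
    "END",
    "",
])

def roundtrip_pdb(toolkit, record):
    """Parse a PDB record, then return its atom count and re-written text."""
    mol = toolkit.parse_pdb(record, errors="strict")
    # NumAtoms() belongs to Open Babel's OBMol, not to chemfp.
    return mol.NumAtoms(), toolkit.create_pdb(mol)

pdb_result = None
if T is not None:
    pdb_result = roundtrip_pdb(T, PDB_RECORD)
    print(pdb_result[1])
```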

chemfp.openbabel_toolkit.read_pdb_molecules(source: Union[None, str, BinaryIO], *, options: Optional[str, None] = None, errors: str = 'strict')

Read molecules from a PDB file using the Open Babel toolkit

This is mostly equivalent to calling:

read_molecules(source, "pdb", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.read_pdb_ids_and_molecules(source: Union[None, str, BinaryIO], *, options: Optional[str, None] = None, errors: str = 'strict')

Read ids and molecules from a PDB file using the Open Babel toolkit

This is mostly equivalent to calling:

read_ids_and_molecules(source, "pdb", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.IdAndMoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.read_pdb_molecules_from_string(content: Union[str, bytes], *, options: Optional[str, None] = None, errors: str = 'strict')

Read molecules from a string containing a PDB file using the Open Babel toolkit

This is equivalent to calling:

read_molecules_from_string(content, "pdb", reader_args={...}, errors=errors)

Use read_molecules_from_string() if the content is compressed.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.read_pdb_ids_and_molecules_from_string(content: Union[str, bytes], *, options: Optional[str, None] = None, errors: str = 'strict')

Read ids and molecules from a string containing a PDB file using the Open Babel toolkit

This is equivalent to calling:

read_ids_and_molecules_from_string(content, "pdb", reader_args={...}, errors=errors)

Use read_ids_and_molecules_from_string() if the content is compressed.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.IdAndMoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.open_pdb_writer(destination: Union[None, str, BinaryIO], *, options: Optional[str, None] = None, errors: str = 'strict')

Open a PDB file to write Open Babel molecules

This is mostly equivalent to calling:

open_molecule_writer(destination, "pdb", writer_args={...}, errors=errors)

along with compression based on the destination filename’s extension.

Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting Open Babel molecules

chemfp.openbabel_toolkit.open_pdb_writer_to_string(*, options: Optional[str, None] = None, errors: str = 'strict')

Open a PDB file to write Open Babel molecules to an in-memory string

This is equivalent to calling:

open_molecule_writer_to_string("pdb", writer_args={...}, errors=errors)

Use write_molecules_to_string() to write compressed output.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting Open Babel molecules

chemfp.openbabel_toolkit.parse_inchi(content: Union[str, bytes], *, options: Optional[str, None] = None, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, errors: str = 'strict')

Parse an InChI string and its id using the Open Babel toolkit

This is equivalent to calling:

parse_molecule(content, "inchi", reader_args={...}, errors=errors)
Parameters:
  • options (string or None) – options string passed to Open Babel
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an Open Babel molecule object

chemfp.openbabel_toolkit.create_inchi(mol: Any, *, id: Optional[str, None] = None, FixedHLayer: bool = False, ReconnectedMetals: bool = False, truncspec: bool = None, options: Optional[str, None] = None, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, include_id: bool = True, errors: str = 'strict') → Optional[str, None]

Generate an InChI string and its id from an Open Babel molecule

This is equivalent to calling:

create_string(mol, "inchi", id=id, writer_args={...}, errors=errors)
Parameters:
  • mol (an Open Babel molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • FixedHLayer (Boolean (default: False)) – if true, include the fixed hydrogen layer
  • ReconnectedMetals (Boolean (default: False)) – if true, reconnect metals
  • truncspec (Boolean (default: None)) – if true, truncate the InChI
  • options (string or None) – options string passed to Open Babel
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id
  • include_id (Boolean (default: True)) – if true, include the molecule id in the output
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a string, or None if errors are ignored
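
The id and include_id parameters can be sketched as follows, assuming an Open Babel build with InChI support. The input line is made up, and the exact layout of the output (delimiter, trailing newline) is not asserted here.

```python
try:
    from chemfp import openbabel_toolkit as T  # needs chemfp and Open Babel
except ImportError:
    T = None  # lets the sketch load where they are not installed

INCHI_LINE = "InChI=1S/CH4/h1H4\tmethane"  # an InChI, a tab, then the id

def inchi_with_and_without_id(toolkit, line):
    """Parse an InChI line, then write it with and without an id."""
    mol = toolkit.parse_inchi(line, delimiter="tab")
    with_id = toolkit.create_inchi(mol, id="methane", delimiter="tab")
    without_id = toolkit.create_inchi(mol, include_id=False)
    return with_id, without_id

inchi_result = None
if T is not None:
    inchi_result = inchi_with_and_without_id(T, INCHI_LINE)
    print(inchi_result)
```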

chemfp.openbabel_toolkit.read_inchi_molecules(source: Union[None, str, BinaryIO], *, options: Optional[str, None] = None, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, errors: str = 'strict')

Read molecules from an InChI file (with InChI and optional id) using the Open Babel toolkit

This is mostly equivalent to calling:

read_molecules(source, "inchi", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.read_inchi_ids_and_molecules(source: Union[None, str, BinaryIO], *, options: Optional[str, None] = None, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, errors: str = 'strict')

Read ids and molecules from an InChI file (with InChI and optional id) using the Open Babel toolkit

This is mostly equivalent to calling:

read_ids_and_molecules(source, "inchi", reader_args={...}, errors=errors)

along with decompression based on the source filename’s extension.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.IdAndMoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.read_inchi_molecules_from_string(content: Union[str, bytes], *, options: Optional[str, None] = None, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, errors: str = 'strict')

Read molecules from a string containing an InChI file (with InChI and optional id) using the Open Babel toolkit

This is equivalent to calling:

read_molecules_from_string(content, "inchi", reader_args={...}, errors=errors)

Use read_molecules_from_string() if the content is compressed.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeReader iterating Open Babel molecules

chemfp.openbabel_toolkit.read_inchi_ids_and_molecules_from_string(content: Union[str, bytes], *, options: Optional[str, None] = None, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, errors: str = 'strict')

Read ids and molecules from a string containing an InChI file (with InChI and optional id) using the Open Babel toolkit

This is equivalent to calling:

read_ids_and_molecules_from_string(content, "inchi", reader_args={...}, errors=errors)

Use read_ids_and_molecules_from_string() if the content is compressed.

Parameters:
  • options (string or None) – options string passed to Open Babel
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.IdAndMoleculeReader iterating Open Babel molecules
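
A minimal sketch of reading ids from in-memory InChI records, assuming chemfp and Open Babel are installed. The names CONTENT and read_inchi_ids are hypothetical, and the sketch assumes the returned reader can be iterated as (id, molecule) pairs, as its class name suggests. When the import fails, the helper returns None instead of raising.

```python
try:
    from chemfp import openbabel_toolkit as T
except ImportError:
    T = None  # chemfp or Open Babel is not available

# Two tab-separated records: InChI, then id.
CONTENT = "InChI=1S/CH4/h1H4\tmethane\nInChI=1S/H2O/h1H2\twater\n"

def read_inchi_ids(content, delimiter="tab"):
    """Return the ids found in `content`, or None if the toolkit is missing."""
    if T is None:
        return None
    reader = T.read_inchi_ids_and_molecules_from_string(
        content, delimiter=delimiter, errors="ignore")
    # Assumed: iterating the reader yields (id, Open Babel molecule) pairs.
    return [record_id for record_id, mol in reader]
```

With errors="ignore", records that cannot be parsed are expected to be skipped rather than stop the iteration; use "strict" when a bad record should be treated as a failure.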

chemfp.openbabel_toolkit.open_inchi_writer(destination: Union[None, str, BinaryIO], *, FixedHLayer: bool = False, ReconnectedMetals: bool = False, truncspec: bool = None, options: Optional[str, None] = None, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, include_id: bool = True, errors: str = 'strict')

Open an InChI file (with InChI and optional id) to write Open Babel molecules

This is mostly equivalent to calling:

open_molecule_writer(destination, "inchi", writer_args={...}, errors=errors)

along with compression based on the destination filename’s extension.

Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules
  • FixedHLayer (Boolean (default: False)) – if true, include the fixed hydrogen layer
  • ReconnectedMetals (Boolean (default: False)) – if true, reconnect metals
  • truncspec (Boolean (default: None)) – if true, truncate the InChI
  • options (string or None) – options string passed to Open Babel
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id
  • include_id (Boolean (default: True)) – if true, include the molecule id in the output
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting Open Babel molecules

chemfp.openbabel_toolkit.open_inchi_writer_to_string(*, FixedHLayer: bool = False, ReconnectedMetals: bool = False, truncspec: bool = None, options: Optional[str, None] = None, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, include_id: bool = True, errors: str = 'strict')

Open an InChI file (with InChI and optional id) to write Open Babel molecules to an in-memory string

This is equivalent to calling:

open_molecule_writer_to_string("inchi", writer_args={...}, errors=errors)

Use write_molecules_to_string() to write compressed output.

Parameters:
  • FixedHLayer (Boolean (default: False)) – if true, include the fixed hydrogen layer
  • ReconnectedMetals (Boolean (default: False)) – if true, reconnect metals
  • truncspec (Boolean (default: None)) – if true, truncate the InChI
  • options (string or None) – options string passed to Open Babel
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id
  • include_id (Boolean (default: True)) – if true, include the molecule id in the output
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting Open Babel molecules
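
A minimal sketch of collecting InChI records in memory, assuming chemfp and Open Babel are installed. The helper inchi_records is hypothetical. The sketch also assumes the returned writer provides write_id_and_molecule(), close(), and getvalue(), as chemfp's string writers are documented to do elsewhere; those methods are not described in this section.

```python
try:
    from chemfp import openbabel_toolkit as T
except ImportError:
    T = None  # chemfp or Open Babel is not available

def inchi_records(pairs):
    """Write (inchi, id) pairs as tab-delimited InChI records; None if no toolkit."""
    if T is None:
        return None
    writer = T.open_inchi_writer_to_string(delimiter="tab", include_id=True)
    for inchi, record_id in pairs:
        mol = T.parse_inchistring(inchi)
        # Assumed writer method: store the molecule under the given id.
        writer.write_id_and_molecule(record_id, mol)
    writer.close()
    # Assumed writer method: return everything written so far as one string.
    return writer.getvalue()
```

For example, inchi_records([("InChI=1S/CH4/h1H4", "methane")]) should produce a single record holding the InChI and the id, separated by a tab.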

chemfp.openbabel_toolkit.parse_inchistring(content: Union[str, bytes], *, options: Optional[str, None] = None, errors: str = 'strict')

Parse an InChI string using the Open Babel toolkit

This is equivalent to calling:

parse_molecule(content, "inchistring", reader_args={...}, errors=errors)
Parameters:
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an Open Babel molecule object

chemfp.openbabel_toolkit.create_inchistring(mol: Any, *, id: Optional[str, None] = None, FixedHLayer: bool = False, ReconnectedMetals: bool = False, truncspec: bool = None, options: Optional[str, None] = None, errors: str = 'strict') → Optional[str, None]

Generate an InChI string from an Open Babel molecule

This is equivalent to calling:

create_string(mol, "inchistring", id=id, writer_args={...}, errors=errors)
Parameters:
  • mol (an Open Babel molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • FixedHLayer (Boolean (default: False)) – if true, include the fixed hydrogen layer
  • ReconnectedMetals (Boolean (default: False)) – if true, reconnect metals
  • truncspec (Boolean (default: None)) – if true, truncate the InChI
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a string, or None if errors are ignored
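
A minimal round-trip sketch using parse_inchistring() and create_inchistring(), assuming chemfp and Open Babel are installed. The helper normalize_inchi is hypothetical; it returns None when the toolkit cannot be imported.

```python
try:
    from chemfp import openbabel_toolkit as T
except ImportError:
    T = None  # chemfp or Open Babel is not available

def normalize_inchi(inchi, fixed_h=False):
    """Parse an InChI string and regenerate it through Open Babel."""
    if T is None:
        return None
    mol = T.parse_inchistring(inchi)  # errors="strict" by default
    # FixedHLayer=True asks for the fixed hydrogen layer in the output.
    return T.create_inchistring(mol, FixedHLayer=fixed_h)
```

Because errors defaults to "strict", an unparseable input is reported as an error; passing errors="ignore" to create_inchistring() makes it return None on failure instead, as noted in the Returns description.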

chemfp.openbabel_toolkit.create_inchikey(mol: Any, *, id: Optional[str, None] = None, FixedHLayer: bool = False, ReconnectedMetals: bool = False, truncspec: bool = None, options: Optional[str, None] = None, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, include_id: bool = True, errors: str = 'strict') → Optional[str, None]

Generate an InChIKey string and its id from an Open Babel molecule

This is equivalent to calling:

create_string(mol, "inchikey", id=id, writer_args={...}, errors=errors)
Parameters:
  • mol (an Open Babel molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • FixedHLayer (Boolean (default: False)) – if true, include the fixed hydrogen layer
  • ReconnectedMetals (Boolean (default: False)) – if true, reconnect metals
  • truncspec (Boolean (default: None)) – if true, truncate the InChI
  • options (string or None) – options string passed to Open Babel
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChIKey and the id
  • include_id (Boolean (default: True)) – if true, include the molecule id in the output
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a string, or None if errors are ignored

chemfp.openbabel_toolkit.open_inchikey_writer(destination: Union[None, str, BinaryIO], *, FixedHLayer: bool = False, ReconnectedMetals: bool = False, truncspec: bool = None, options: Optional[str, None] = None, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, include_id: bool = True, errors: str = 'strict')

Open an InChIKey file (with InChIKey and optional id) to write Open Babel molecules

This is mostly equivalent to calling:

open_molecule_writer(destination, "inchikey", writer_args={...}, errors=errors)

along with compression based on the destination filename’s extension.

Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules
  • FixedHLayer (Boolean (default: False)) – if true, include the fixed hydrogen layer
  • ReconnectedMetals (Boolean (default: False)) – if true, reconnect metals
  • truncspec (Boolean (default: None)) – if true, truncate the InChI
  • options (string or None) – options string passed to Open Babel
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChIKey and the id
  • include_id (Boolean (default: True)) – if true, include the molecule id in the output
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting Open Babel molecules

chemfp.openbabel_toolkit.open_inchikey_writer_to_string(*, FixedHLayer: bool = False, ReconnectedMetals: bool = False, truncspec: bool = None, options: Optional[str, None] = None, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, include_id: bool = True, errors: str = 'strict')

Open an InChIKey file (with InChIKey and optional id) to write Open Babel molecules to an in-memory string

This is equivalent to calling:

open_molecule_writer_to_string("inchikey", writer_args={...}, errors=errors)

Use write_molecules_to_string() to write compressed output.

Parameters:
  • FixedHLayer (Boolean (default: False)) – if true, include the fixed hydrogen layer
  • ReconnectedMetals (Boolean (default: False)) – if true, reconnect metals
  • truncspec (Boolean (default: None)) – if true, truncate the InChI
  • options (string or None) – options string passed to Open Babel
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChIKey and the id
  • include_id (Boolean (default: True)) – if true, include the molecule id in the output
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting Open Babel molecules

chemfp.openbabel_toolkit.create_inchikeystring(mol: Any, *, id: Optional[str, None] = None, FixedHLayer: bool = False, ReconnectedMetals: bool = False, truncspec: bool = None, options: Optional[str, None] = None, errors: str = 'strict') → Optional[str, None]

Generate an InChIKey string from an Open Babel molecule

This is equivalent to calling:

create_string(mol, "inchikeystring", id=id, writer_args={...}, errors=errors)
Parameters:
  • mol (an Open Babel molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • FixedHLayer (Boolean (default: False)) – if true, include the fixed hydrogen layer
  • ReconnectedMetals (Boolean (default: False)) – if true, reconnect metals
  • truncspec (Boolean (default: None)) – if true, truncate the InChI
  • options (string or None) – options string passed to Open Babel
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a string, or None if errors are ignored
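
A minimal sketch of computing a bare InChIKey with create_inchikeystring(), assuming chemfp and Open Babel are installed. The helper inchikey_for is hypothetical; it returns None when the toolkit cannot be imported.

```python
try:
    from chemfp import openbabel_toolkit as T
except ImportError:
    T = None  # chemfp or Open Babel is not available

def inchikey_for(inchi):
    """Return the InChIKey for an InChI string, without an id field."""
    if T is None:
        return None
    mol = T.parse_inchistring(inchi)
    # The "inchikeystring" form is the key alone; create_inchikey() is the
    # record form that also carries the id.
    return T.create_inchikeystring(mol)
```

Use create_inchikeystring() when only the key is wanted, for example as a dictionary key, and create_inchikey() when the output should be a record with the key and the id.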

chemfp.openbabel_toolkit.parse_smiles(content: Union[str, bytes], *, options: Optional[str, None] = None, cxsmiles: bool = True, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, has_header: bool = False, errors: str = 'strict')

Parse a SMILES string and its id using the Open Babel toolkit

This is equivalent to calling:

parse_molecule(content, "smi", reader_args={...}, errors=errors)
Parameters:
  • options (string or None) – options string passed to Open Babel
  • cxsmiles (Boolean (default: True)) – If true, strip any CXSMILES extension before processing, because Open Babel does not support CXSMILES.
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id
  • has_header (Boolean (default: False)) – If true, treat the first line of the SMILES file as a header
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an Open Babel molecule object

chemfp.openbabel_toolkit.create_smiles(mol: Any, *, id: Optional[str, None] = None, options: Optional[str, None] = None, isomeric: bool = True, canonicalization: Literal[True, 1, 'True', 'true', 'default', 0, False, 'False', 'false', 'none', 'universal', 'inchified', 'anticanonical'] = 'default', explicit_hydrogens: bool = False, cxsmiles: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, errors: str = 'strict') → Optional[str, None]

Generate a SMILES string and its id from an Open Babel molecule

This is equivalent to calling:

create_string(mol, "smi", id=id, writer_args={...}, errors=errors)
Parameters:
  • mol (an Open Babel molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • options (string or None) – options string passed to Open Babel
  • isomeric (Boolean (default: True)) – if true, generate isomeric SMILES
  • canonicalization (Boolean or string (default: "default")) – canonicalization method
  • explicit_hydrogens (Boolean (default: False)) – if true, use explicit hydrogens as [H]
  • cxsmiles (Boolean (default: False)) – Compatibility flag. Open Babel does not support CXSMILES.
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a string, or None if errors are ignored
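
A minimal sketch that parses a SMILES record and writes it back with parse_smiles() and create_smiles(), assuming chemfp and Open Babel are installed. The helper canonical_smiles_record is hypothetical; it returns None when the toolkit cannot be imported.

```python
try:
    from chemfp import openbabel_toolkit as T
except ImportError:
    T = None  # chemfp or Open Babel is not available

def canonical_smiles_record(smiles_record):
    """Parse 'SMILES id' text and return a tab-delimited SMILES record."""
    if T is None:
        return None
    mol = T.parse_smiles(smiles_record)
    # These are the "smi" record functions, so the output carries the id too.
    return T.create_smiles(
        mol, isomeric=True, canonicalization="default", delimiter="tab")
```

For example, canonical_smiles_record("OCC ethanol") should return a record holding a SMILES for ethanol followed by the id "ethanol". Pass id= to create_smiles() to override the id stored on the molecule.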