chemfp.rdkit_toolkit module

The chemfp toolkit API wrapper for the RDKit toolkit.

This module is also available as chemfp.rdkit.

chemfp.rdkit_toolkit.is_licensed()

Return True - RDKit is always licensed

Returns:True
chemfp.rdkit_toolkit.get_formats(include_unavailable=False)

Get the list of structure formats that RDKit supports

If include_unavailable is True then also include RDKit formats which aren’t available to this specific version of RDKit, such as the InChI formats if your RDKit installation wasn’t compiled with InChI support.

Parameters:include_unavailable (True or False) – include unavailable formats?
Returns:a list of Format objects
chemfp.rdkit_toolkit.get_input_formats()

Get the list of supported RDKit input formats

Returns:a list of chemfp.base_toolkit.Format objects
chemfp.rdkit_toolkit.get_output_formats()

Get the list of supported RDKit output formats

Returns:a list of chemfp.base_toolkit.Format objects
chemfp.rdkit_toolkit.get_format(format)

Get the named format, or raise a ValueError

This will raise a ValueError if RDKit does not implement the format format_name or that format is not available.

Parameters:format_name (a string) – the format name
Returns:a chemfp.base_toolkit.Format object
chemfp.rdkit_toolkit.get_input_format(format)

Get the named input format, or raise a ValueError

This will raise a ValueError if RDKit does not implement the format format_name or that format is not an input format.

Parameters:format_name (a string) – the format name
Returns:a chemfp.base_toolkit.Format object
chemfp.rdkit_toolkit.get_output_format(format)

Get the named output format, or raise a ValueError

This will raise a ValueError if RDKit does not implement the format format_name or that format is not an output format.

Parameters:format_name (a string) – the format name
Returns:a chemfp.base_toolkit.Format object
chemfp.rdkit_toolkit.get_input_format_from_source(source=None, format=None)

Get the most appropriate format given the available source and format information

If format is a chemfp.base_toolkit.Format then return it. If it’s a Format-like object with “name” and “compression” attributes use it to make a real Format object with the same attributes. If it’s a string then use it to create a Format object.

If format is None, use the source to auto-detect the format. If auto-detection is not possible, assume it’s an uncompressed SMILES file.

Parameters:
  • source (a filename (as a string), a file object, or None to read from stdin) – the structure data source.
  • format (a Format(-like) object, string, or None) – format information, if known.
Returns:

a chemfp.base_toolkit.Format object
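
The resolution order described above can be sketched in plain Python. This is only an illustration of the fallback rules, not chemfp's implementation: the SimpleFormat tuple, the resolve_format() helper, the list of known names and compressions, and the use of the filename extension for auto-detection are all assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical stand-in for a Format-like object ("name" and "compression").
SimpleFormat = namedtuple("SimpleFormat", ["name", "compression"])

# Assumed subsets, for illustration only; the real lists may differ.
KNOWN_NAMES = ("smi", "sdf", "inchi")
COMPRESSIONS = ("gz", "zst")

def split_compression(text):
    """Return (text without a compression suffix, compression name)."""
    for comp in COMPRESSIONS:
        if text.endswith("." + comp):
            return text[: -len(comp) - 1], comp
    return text, ""

def resolve_format(source=None, format=None):
    # 1. A Format or Format-like object: reuse its name and compression.
    if hasattr(format, "name") and hasattr(format, "compression"):
        return SimpleFormat(format.name, format.compression)
    # 2. A string such as "sdf.gz": build the format from the string.
    if isinstance(format, str):
        name, comp = split_compression(format)
        return SimpleFormat(name, comp)
    # 3. No format given: try to detect it from a filename extension.
    if isinstance(source, str):
        stem, comp = split_compression(source)
        base = stem.rsplit("/", 1)[-1]
        if "." in base:
            ext = base.rsplit(".", 1)[1].lower()
            if ext in KNOWN_NAMES:
                return SimpleFormat(ext, comp)
    # 4. Nothing to go on (stdin, file object, unknown extension):
    #    assume uncompressed SMILES.
    return SimpleFormat("smi", "")

print(resolve_format("mols.sdf.gz"))
print(resolve_format(None))
```

An explicit format always wins over the source in this sketch, which matches the description: the source is only consulted when format is None.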

chemfp.rdkit_toolkit.get_output_format_from_destination(destination=None, format=None)

Get the most appropriate format given the available destination and format information

If format is a chemfp.base_toolkit.Format then return it. If it’s a Format-like object with “name” and “compression” attributes use it to make a real Format object with the same attributes. If it’s a string then use it to create a Format object.

If format is None, use the destination to auto-detect the format. If auto-detection is not possible, assume it’s an uncompressed SMILES file.

Parameters:
  • destination (a filename (as a string), a file object, or None to write to stdout) – the structure data destination.
  • format (a Format(-like) object, string, or None) – format information, if known.
Returns:

a chemfp.base_toolkit.Format object

chemfp.rdkit_toolkit.read_molecules(source=None, format=None, id_tag=None, reader_args=None, errors='strict', location=None, encoding='utf8', encoding_errors='strict')

Return an iterator that reads RDKit molecules from a structure file

Iterate through the format structure records in source. If format is None then auto-detect the format based on the source. For SD files, use id_tag to get the record id from the given SD tag instead of the title line. (read_molecules() will ignore the id_tag. It exists to make it easier to switch between reader functions.)

Note: the reader returns a new RDKit molecule each time.

The reader_args dictionary parameters depend on the format. These include:

  • SMILES
    • delimiter - one of “tab”, “space”, “to-eol”, the space or tab characters, or None
    • has_header - True or False
    • sanitize - True or default sanitizes; False for unsanitized processing
  • InChI
    • delimiter - one of “tab”, “space”, “to-eol”, the space or tab characters, or None
    • sanitize - True or default sanitizes; False for unsanitized processing
    • removeHs - True or default removes explicit hydrogens; False leaves them in the structure
    • logLevel - an integer log level
    • treatWarningAsError - True raises an exception when the InChI library reports a warning; False or default keeps processing
  • SDF
    • sanitize - True or default sanitizes; False for unsanitized processing
    • removeHs - True or default removes explicit hydrogens; False leaves them in the structure
    • strictParsing - True or default for strict parsing; False for lenient parsing

The errors parameter specifies how to handle errors. “strict” raises an exception, “report” sends a message to stderr and goes to the next record, and “ignore” goes to the next record.
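
The three error modes can be illustrated with a small stand-alone generator. This is a sketch of the described behavior, not chemfp code: the process_records() helper and its use of ValueError as the failure signal are assumptions made for the example.

```python
import sys

def process_records(records, parse, errors="strict"):
    """Yield parse(record) for each record, handling failures per `errors`."""
    if errors not in ("strict", "report", "ignore"):
        raise ValueError("unsupported errors value: %r" % (errors,))
    for recno, record in enumerate(records, 1):
        try:
            value = parse(record)
        except ValueError as err:
            if errors == "strict":
                raise  # stop at the first bad record
            if errors == "report":
                print("record %d: %s" % (recno, err), file=sys.stderr)
            continue  # "report" and "ignore" both go to the next record
        yield value

# int() stands in for a structure parser; "x" is the bad record.
print(list(process_records(["1", "x", "3"], int, errors="ignore")))  # [1, 3]
```

With errors="report" the result is the same list, plus one message on stderr for the second record. With errors="strict" the ValueError propagates to the caller.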

The location parameter takes a chemfp.io.Location instance. If None then a default Location will be created.

See chemfp.rdkit_toolkit.read_ids_and_molecules() if you want (id, molecule) pairs instead of just the molecules.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the structure source
  • format (a format name string, or Format object, or None to auto-detect) – the input structure format
  • id_tag (string, or None to use the record title) – SD tag containing the record id
  • reader_args (a dictionary) – reader parameters passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • location (a chemfp.io.Location object, or None) – object used to track parser state information
Returns:

a chemfp.base_toolkit.MoleculeReader iterating RDKit molecules

chemfp.rdkit_toolkit.read_molecules_from_string(content, format, id_tag=None, reader_args=None, errors='strict', location=None)

Return an iterator that reads RDKit molecules from a string containing structure records

content is a string containing 0 or more records in the format format. See chemfp.rdkit_toolkit.read_molecules() for details about the other parameters. See chemfp.rdkit_toolkit.read_ids_and_molecules_from_string() if you want to read (id, RDKit molecule) pairs instead of just molecules.

Parameters:
  • content (a string) – the string containing structure records
  • format (a format name string, or Format object) – the input structure format
  • id_tag (string, or None to use the record title) – SD tag containing the record id
  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • location (a chemfp.io.Location object, or None) – object used to track parser state information
Returns:

a chemfp.base_toolkit.MoleculeReader iterating RDKit molecules

chemfp.rdkit_toolkit.read_ids_and_molecules(source=None, format=None, id_tag=None, reader_args=None, errors='strict', location=None, encoding='utf8', encoding_errors='strict')

Return an iterator that reads (id, RDKit molecule) pairs from a structure file

See chemfp.rdkit_toolkit.read_molecules() for full parameter details. The major difference is that this returns an iterator of (id, RDKit molecule) pairs instead of just the molecules.

Parameters:
  • source (a filename, file object, or None to read from stdin) – the structure source
  • format (a format name string, or Format object, or None to auto-detect) – the input structure format
  • id_tag (string, or None to use the record title) – SD tag containing the record id
  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • location (a chemfp.io.Location object, or None) – object used to track parser state information
Returns:

a chemfp.base_toolkit.IdAndMoleculeReader iterating (id, RDKit molecule) pairs

chemfp.rdkit_toolkit.read_ids_and_molecules_from_string(content, format, id_tag=None, reader_args=None, errors='strict', location=None)

Return an iterator that reads (id, RDKit molecule) pairs from a string containing structure records

content is a string containing 0 or more records in the format format. See chemfp.rdkit_toolkit.read_molecules() for details about the other parameters. See chemfp.rdkit_toolkit.read_molecules_from_string() if you just want to read the RDKit molecules instead of (id, molecule) pairs.

Parameters:
  • content (a string) – the string containing structure records
  • format (a format name string, or Format object) – the input structure format
  • id_tag (string, or None to use the record title) – SD tag containing the record id
  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • location (a chemfp.io.Location object, or None) – object used to track parser state information
Returns:

a chemfp.base_toolkit.IdAndMoleculeReader iterating (id, RDKit molecule) pairs
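
A minimal usage sketch, assuming chemfp and RDKit are both installed. The import guard only keeps the snippet harmless where they are not; the read_ids() helper and the sample records are made up for the example.

```python
try:
    from chemfp import rdkit_toolkit as T
except ImportError:  # chemfp or RDKit is not installed
    T = None

# Two tab-delimited SMILES records; the second id contains a space.
content = "C\tmethane\nCC\tethane gas\n"

def read_ids(content):
    """Parse every record and return the ids in input order."""
    reader = T.read_ids_and_molecules_from_string(
        content, "smi", reader_args={"delimiter": "tab"}, errors="strict")
    return [record_id for (record_id, mol) in reader]

if T is not None:
    print(read_ids(content))
```

The "tab" delimiter is used here so that an id may contain spaces.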

chemfp.rdkit_toolkit.make_id_and_molecule_parser(format, id_tag=None, reader_args=None, errors='strict')

Create a specialized function which takes a record and returns an (id, RDKit molecule) pair

The returned function is optimized for reading many records from individual strings because it only does parameter validation once. However, I haven’t really noticed much of a performance difference between this and chemfp.rdkit_toolkit.parse_id_and_molecule(), so I suggest you use that function directly instead of making a specialized function. (Let me know if making a specialized function is useful.)

See chemfp.rdkit_toolkit.read_molecules() for details about the other parameters.

Parameters:
  • format (a format name string, or Format object) – the input structure format
  • id_tag (string, or None to use the record title) – SD tag containing the record id
  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
Returns:

a function of the form parser(record string) -> (id, RDKit molecule)
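
The "validate once, parse many times" pattern can be sketched as a closure. This is an illustration of the idea only, not chemfp's implementation: make_parser() is a hypothetical helper that handles a simplified SMILES record and returns the SMILES text where the real parser would return an RDKit molecule.

```python
def make_parser(format, errors="strict"):
    """Check the arguments once, then return a per-record parser."""
    if format != "smi":
        raise ValueError("this sketch only handles 'smi' records")
    if errors not in ("strict", "ignore"):
        raise ValueError("unsupported errors value: %r" % (errors,))

    def parser(record):
        # No argument checking here; only the per-record work remains.
        fields = record.split(None, 1)
        if not fields:
            if errors == "strict":
                raise ValueError("empty record")
            return None, None
        smiles = fields[0]
        record_id = fields[1].rstrip("\n") if len(fields) > 1 else None
        return record_id, smiles

    return parser

parser = make_parser("smi")
print(parser("CCO ethanol\n"))  # ('ethanol', 'CCO')
```

The saving is only the repeated argument checks, which is consistent with the small performance difference reported above.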

chemfp.rdkit_toolkit.parse_molecule(content, format, id_tag=None, reader_args=None, errors='strict')

Parse the first structure record from the content string and return an RDKit molecule.

content is a string containing a single structure record in format format. (Additional records are ignored). See chemfp.rdkit_toolkit.read_molecules() for details about the other parameters. See chemfp.rdkit_toolkit.parse_id_and_molecule() if you want the (id, RDKit molecule) pair instead of just the molecule.

Parameters:
  • content (a string) – the string containing a structure record
  • format (a format name string, or Format object) – the input structure format
  • id_tag (string, or None to use the record title) – SD tag containing the record id
  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
Returns:

an RDKit molecule

chemfp.rdkit_toolkit.parse_id_and_molecule(content, format, id_tag=None, reader_args=None, errors='strict')

Parse the first structure record from content and return the (id, RDKit molecule) pair.

content is a string containing a single structure record in format format. (Additional records are ignored.) See chemfp.rdkit_toolkit.read_molecules() for details about the other parameters. See chemfp.rdkit_toolkit.parse_molecule() if you just want the RDKit molecule and not the (id, RDKit molecule) pair.

Parameters:
  • content (a string) – the string containing a structure record
  • format (a format name string, or Format object) – the input structure format
  • id_tag (string, or None to use the record title) – SD tag containing the record id
  • reader_args (a dictionary) – reader arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
Returns:

an (id, RDKit molecule) pair

chemfp.rdkit_toolkit.create_string(mol, format, id=None, writer_args=None, errors='strict')

Convert an RDKit molecule into a structure record in the given format as a Unicode string

If id is not None then use it instead of the molecule’s own title. Warning: this may briefly modify the molecule, so it may not be thread-safe.

Parameters:
  • mol (an RDKit molecule) – the molecule to use for the output
  • format (a format name string, or Format object) – the output structure format
  • id (a string, or None to use the molecule's own id) – an alternate record id
  • writer_args (a dictionary) – writer arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
Returns:

a Unicode string
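
A minimal sketch of a parse-and-write round trip, assuming chemfp and RDKit are installed. The import guard only keeps the snippet harmless where they are not; the roundtrip() helper and the sample record are made up for the example.

```python
try:
    from chemfp import rdkit_toolkit as T
except ImportError:  # chemfp or RDKit is not installed
    T = None

def roundtrip(record, new_id):
    """Parse one SMILES record, then write it back with a replacement id."""
    mol = T.parse_molecule(record, "smi")
    # `id` overrides the molecule's own title in the output record.
    return T.create_string(mol, "smi", id=new_id)

if T is not None:
    print(repr(roundtrip("OCC ethanol", "EtOH")))
```

The output SMILES is generated by RDKit, so it may differ from the input SMILES even though it describes the same structure.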

chemfp.rdkit_toolkit.create_bytes(mol, format, id=None, writer_args=None, errors='strict', level=None)

Convert an RDKit molecule into a structure record in the given format as a byte string

If id is not None then use it instead of the molecule’s own title. Warning: this may briefly modify the molecule, so it may not be thread-safe.

Parameters:
  • mol (an RDKit molecule) – the molecule to use for the output
  • format (a format name string, or Format object) – the output structure format
  • id (a string, or None to use the molecule's own id) – an alternate record id
  • writer_args (a dictionary) – writer arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • level (None, a positive integer, or one of the strings 'min', 'default', or 'max') – compression level to use for compressed formats
Returns:

a byte string

chemfp.rdkit_toolkit.open_molecule_writer(destination=None, format=None, writer_args=None, errors='strict', location=None, encoding='utf8', encoding_errors='strict', level=None)

Return a MoleculeWriter which can write RDKit molecules to a destination.

A chemfp.base_toolkit.MoleculeWriter has the methods write_molecule, write_molecules, and write_ids_and_molecules, which are ways to write an RDKit molecule, an RDKit molecule iterator, or an (id, RDKit molecule) pair iterator to a file.

Molecules are written to destination. The output format can be a string like “sdf.gz” or “smi”, a chemfp.base_toolkit.Format, or Format-like object with “name” and “compression” attributes, or None to auto-detect based on the destination. If auto-detection is not possible, the output will be written as uncompressed SMILES.

The writer_args dictionary parameters depend on the format. These include:

  • SMILES
    • delimiter - one of “tab”, “space”, “to-eol”, the space or tab characters, or None
    • isomericSmiles - True to generate isomeric SMILES
    • kekuleSmiles - True to generate SMILES in Kekule form
    • canonical - True to generate a canonical SMILES
    • allBondsExplicit - True to write explicit ‘-’ and ‘:’ bonds, even if they can be inferred; default is False
    • allHsExplicit - True to write explicit hydrogen counts; default is False
    • cxsmiles - True to include CXSMILES annotations; default is False
  • InChI and InChIKey
    • delimiter - one of “tab”, “space”, “to-eol”, the space or tab characters, or None
    • include_id - True or default to include the id as the second column; False has no id column
    • options - an options string passed to the underlying InChI library
    • logLevel - an integer log level
    • treatWarningAsError - True raises an exception when the InChI library reports a warning; False or default keeps processing
  • SDF
    • includeStereo - True includes stereo information; False or default does not
    • kekulize - True or default creates the connection table with bonds in Kekule form
    • v3k - True to always export in V3000 format

The errors parameter specifies how to handle errors. “strict” raises an exception, “report” sends a message to stderr and goes to the next record, and “ignore” goes to the next record.

The location parameter takes a chemfp.io.Location instance. If None then a default Location will be created.

Parameters:
  • destination (a filename, file object, or None to write to stdout) – the structure destination
  • format (a format name string, or Format(-like) object, or None to auto-detect) – the output structure format
  • writer_args (a dictionary) – writer parameters passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • location (a chemfp.io.Location object, or None) – object used to track writer state information
  • level (None, a positive integer, or one of the strings 'min', 'default', or 'max') – compression level to use for compressed formats
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting RDKit molecules

chemfp.rdkit_toolkit.open_molecule_writer_to_string(format, writer_args=None, errors='strict', location=None)

Return a MoleculeStringWriter which can write molecule records in the given format to a string.

See chemfp.rdkit_toolkit.open_molecule_writer() for full parameter details.

Use the writer’s chemfp.base_toolkit.MoleculeStringWriter.getvalue() to get the output as a Unicode string.

Parameters:
  • format (a format name string, or Format(-like) object, or None to auto-detect) – the output structure format
  • writer_args (a dictionary) – writer arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • location (a chemfp.io.Location object, or None) – object used to track writer state information
Returns:

a chemfp.base_toolkit.MoleculeStringWriter expecting RDKit molecules
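
A minimal sketch of collecting records in memory, assuming chemfp and RDKit are installed and that getvalue() may be called after the writer is closed. The import guard only keeps the snippet harmless where they are not; the to_smi_text() helper is made up for the example.

```python
try:
    from chemfp import rdkit_toolkit as T
except ImportError:  # chemfp or RDKit is not installed
    T = None

def to_smi_text(id_smiles_pairs):
    """Write (id, SMILES) pairs as 'smi' records and return the text."""
    pairs = ((record_id, T.parse_molecule(smiles, "smistring"))
             for (record_id, smiles) in id_smiles_pairs)
    writer = T.open_molecule_writer_to_string("smi")
    with writer:  # closes the writer on exit
        writer.write_ids_and_molecules(pairs)
    return writer.getvalue()

if T is not None:
    print(to_smi_text([("methane", "C"), ("ethane", "CC")]))
```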

chemfp.rdkit_toolkit.open_molecule_writer_to_bytes(format, writer_args=None, errors='strict', location=None, level=None)

Return a MoleculeStringWriter which can write molecule records in the given format to a byte string.

See chemfp.rdkit_toolkit.open_molecule_writer() for full parameter details.

Use the writer’s chemfp.base_toolkit.MoleculeStringWriter.getvalue() to get the output as a byte string.

Parameters:
  • format (a format name string, or Format(-like) object, or None to auto-detect) – the output structure format
  • writer_args (a dictionary) – writer arguments passed to the underlying toolkit
  • errors (one of "strict", "report", or "ignore") – specify how to handle errors
  • location (a chemfp.io.Location object, or None) – object used to track writer state information
  • level (None, a positive integer, or one of the strings 'min', 'default', or 'max') – compression level to use for compressed formats
Returns:

a chemfp.base_toolkit.MoleculeStringWriter expecting RDKit molecules

chemfp.rdkit_toolkit.copy_molecule(mol)

Return a new RDKit molecule which is a copy of the given molecule

Parameters:mol (an RDKit molecule) – the molecule to copy
Returns:a new RDKit Mol instance
chemfp.rdkit_toolkit.add_tag(mol, tag, value)

Add an SD tag value to the RDKit molecule

Parameters:
  • mol (an RDKit molecule) – the molecule
  • tag (string) – the SD tag name
  • value (string) – the text for the tag
Returns:

None

chemfp.rdkit_toolkit.get_tag(mol, tag)

Get the named SD tag value, or None if it doesn’t exist

Parameters:
  • mol (an RDKit molecule) – the molecule
  • tag (string) – the SD tag name
Returns:

a string, or None

chemfp.rdkit_toolkit.get_tag_pairs(mol)

Get a list of all SD tag (name, value) pairs for the molecule

Parameters:mol (an RDKit molecule) – the molecule
Returns:a list of (string name, string value) pairs
chemfp.rdkit_toolkit.get_id(mol)

Get the molecule’s id from RDKit’s _Name property

Parameters:mol (an RDKit molecule) – the molecule
Returns:a string
chemfp.rdkit_toolkit.set_id(mol, id)

Set the molecule’s id as RDKit’s _Name property

Parameters:
  • mol (an RDKit molecule) – the molecule
  • id (string) – the new id
Returns:

None
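
The id and tag helpers can be combined as in the following sketch, which assumes chemfp and RDKit are installed. The import guard only keeps the snippet harmless where they are not; the annotate() helper, the tag name, and the values are made up for the example.

```python
try:
    from chemfp import rdkit_toolkit as T
except ImportError:  # chemfp or RDKit is not installed
    T = None

def annotate(smiles_record):
    """Rename a molecule, attach one SD tag, and read the values back."""
    mol = T.parse_molecule(smiles_record, "smi")
    T.set_id(mol, "renamed")            # stored in RDKit's _Name property
    T.add_tag(mol, "source", "example")
    return (T.get_id(mol),
            T.get_tag(mol, "source"),
            T.get_tag(mol, "missing"))  # None when the tag does not exist

if T is not None:
    print(annotate("CCO ethanol"))
```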

chemfp.rdkit_toolkit.from_smistring(content: str, *, sanitize: bool = True, errors: str = 'strict')

Parse a SMILES string using the RDKit toolkit

This is equivalent to calling:

parse_molecule(content, “smistring”, reader_args={…}, errors=errors)
Parameters:
  • sanitize (Boolean (default: True)) – If true, sanitize the molecule after parsing
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an RDKit molecule object

chemfp.rdkit_toolkit.to_smistring(mol: Any, *, id: Optional[str, None] = None, isomericSmiles: bool = True, kekuleSmiles: bool = False, canonical: bool = True, allBondsExplicit: bool = False, allHsExplicit: bool = False, cxsmiles: bool = False, errors: str = 'strict')

Generate a SMILES string from an RDKit molecule

This is equivalent to calling:
create_string(mol, “smistring”, id=id, writer_args={…}, errors=errors)
Parameters:
  • mol (an RDKit molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • isomericSmiles (Boolean (default: True)) – If true, generate an isomeric SMILES
  • kekuleSmiles (Boolean (default: False)) – If true, generate Kekule SMILES
  • canonical (Boolean (default: True)) – If true, generate a canonical SMILES
  • allBondsExplicit (Boolean (default: False)) – If true, include bond symbols even for single and aromatic bonds
  • allHsExplicit (Boolean (default: False)) – If true, include hydrogen counts for every atom
  • cxsmiles (Boolean (default: False)) – If true, generate CXSmiles
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a Unicode string containing the SMILES

chemfp.rdkit_toolkit.from_smi(content: str, *, sanitize: bool = True, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, errors: str = 'strict')

Parse a SMILES string and id using the RDKit toolkit

This is equivalent to calling:

parse_molecule(content, “smi”, reader_args={…}, errors=errors)
Parameters:
  • sanitize (Boolean (default: True)) – If true, sanitize the molecule after parsing
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an RDKit molecule object

chemfp.rdkit_toolkit.to_smi(mol: Any, *, id: Optional[str, None] = None, isomericSmiles: bool = True, kekuleSmiles: bool = False, canonical: bool = True, allBondsExplicit: bool = False, allHsExplicit: bool = False, cxsmiles: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, errors: str = 'strict')

Generate a SMILES string and id from an RDKit molecule

This is equivalent to calling:
create_string(mol, “smi”, id=id, writer_args={…}, errors=errors)
Parameters:
  • mol (an RDKit molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • isomericSmiles (Boolean (default: True)) – If true, generate an isomeric SMILES
  • kekuleSmiles (Boolean (default: False)) – If true, generate Kekule SMILES
  • canonical (Boolean (default: True)) – If true, generate a canonical SMILES
  • allBondsExplicit (Boolean (default: False)) – If true, include bond symbols even for single and aromatic bonds
  • allHsExplicit (Boolean (default: False)) – If true, include hydrogen counts for every atom
  • cxsmiles (Boolean (default: False)) – If true, generate CXSmiles
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a Unicode string containing the SMILES and id record

chemfp.rdkit_toolkit.from_smi_file(source: Union[None, str, BinaryIO], *, sanitize: bool = True, has_header: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, errors: str = 'strict')

Read a file of SMILES string and id records using the RDKit toolkit

This is mostly equivalent to calling:
read_molecules(source, “smi”, reader_args={…}, errors=errors)
Parameters:
  • sanitize (Boolean (default: True)) – If true, sanitize the molecule after parsing
  • has_header (Boolean (default: False)) – If true, treat the first line of the SMILES file as a header
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeReader iterating RDKit molecules

chemfp.rdkit_toolkit.to_smi_file(destination: Union[None, str, BinaryIO], *, isomericSmiles: bool = True, kekuleSmiles: bool = False, canonical: bool = True, allBondsExplicit: bool = False, allHsExplicit: bool = False, cxsmiles: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, errors: str = 'strict')

Open a writer that saves RDKit molecules as SMILES string and id records

This is mostly equivalent to calling:
open_molecule_writer(destination, “smi”, writer_args={…}, errors=errors)
Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules
  • isomericSmiles (Boolean (default: True)) – If true, generate an isomeric SMILES
  • kekuleSmiles (Boolean (default: False)) – If true, generate Kekule SMILES
  • canonical (Boolean (default: True)) – If true, generate a canonical SMILES
  • allBondsExplicit (Boolean (default: False)) – If true, include bond symbols even for single and aromatic bonds
  • allHsExplicit (Boolean (default: False)) – If true, include hydrogen counts for every atom
  • cxsmiles (Boolean (default: False)) – If true, generate CXSmiles
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the SMILES and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting RDKit molecules

chemfp.rdkit_toolkit.from_sdf(content: str, *, sanitize: bool = True, removeHs: bool = True, strictParsing: bool = True, includeTags: bool = True, errors: str = 'strict')

Parse an SDF record using the RDKit toolkit

This is equivalent to calling:

parse_molecule(content, “sdf”, reader_args={…}, errors=errors)
Parameters:
  • sanitize (Boolean (default: True)) – If true, sanitize the molecule after parsing
  • removeHs (Boolean (default: True)) – If true, remove simple hydrogens from the molecular graph
  • strictParsing (Boolean (default: True)) – If true, require stricter adherence to the SDF specification
  • includeTags (Boolean (default: True)) – if true, extract the structure data tag fields
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an RDKit molecule object

chemfp.rdkit_toolkit.to_sdf(mol: Any, *, id: Optional[str, None] = None, includeStereo: bool = False, kekulize: bool = True, v3k: bool = False, errors: str = 'strict')

Generate an SDF record from an RDKit molecule

This is equivalent to calling:
create_string(mol, “sdf”, id=id, writer_args={…}, errors=errors)
Parameters:
  • mol (an RDKit molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • includeStereo (Boolean (default: False)) – if true, include stereochemistry information in the record
  • kekulize (Boolean (default: True)) – if true, Kekulize the molecule before creating the record
  • v3k (Boolean (default: False)) – if true, always write in V3000 format
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a Unicode string containing the SDF record

chemfp.rdkit_toolkit.from_sdf_file(source: Union[None, str, BinaryIO], *, sanitize: bool = True, removeHs: bool = True, strictParsing: bool = True, includeTags: bool = True, errors: str = 'strict')

Parse an SDF record file using the RDKit toolkit

This is mostly equivalent to calling:
read_molecules(source, “sdf”, reader_args={…}, errors=errors)
Parameters:
  • sanitize (Boolean (default: True)) – If true, sanitize the molecule after parsing
  • removeHs (Boolean (default: True)) – If true, remove simple hydrogens from the molecular graph
  • strictParsing (Boolean (default: True)) – If true, require stricter adherence to the SDF specification
  • includeTags (Boolean (default: True)) – if true, extract the structure data tag fields
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeReader iterating RDKit molecules

chemfp.rdkit_toolkit.to_sdf_file(destination: Union[None, str, BinaryIO], *, includeStereo: bool = False, kekulize: bool = True, v3k: bool = False, errors: str = 'strict')

Open a writer that saves RDKit molecules as SDF records

This is mostly equivalent to calling:
open_molecule_writer(destination, “sdf”, writer_args={…}, errors=errors)
Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules
  • includeStereo (Boolean (default: False)) – if true, include stereochemistry information in the record
  • kekulize (Boolean (default: True)) – if true, Kekulize the molecule before creating the record
  • v3k (Boolean (default: False)) – if true, always write in V3000 format
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting RDKit molecules

chemfp.rdkit_toolkit.to_sdf3k(mol: Any, *, id: Optional[str, None] = None, includeStereo: bool = False, kekulize: bool = True, v3k: bool = True, errors: str = 'strict')

Generate an SDF record in V3000 format from an RDKit molecule

This is equivalent to calling:
create_string(mol, “sdf3k”, id=id, writer_args={…}, errors=errors)
Parameters:
  • mol (an RDKit molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • includeStereo (Boolean (default: False)) – if true, include stereochemistry information in the record
  • kekulize (Boolean (default: True)) – if true, Kekulize the molecule before creating the record
  • v3k (Boolean (default: True)) – if true, always write in V3000 format
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a Unicode string containing the SDF record in V3000 format

chemfp.rdkit_toolkit.to_sdf3k_file(destination: Union[None, str, BinaryIO], *, includeStereo: bool = False, kekulize: bool = True, v3k: bool = True, errors: str = 'strict')

Open a writer that saves RDKit molecules as SDF records in V3000 format

This is mostly equivalent to calling:
open_molecule_writer(destination, “sdf3k”, writer_args={…}, errors=errors)
Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules
  • includeStereo (Boolean (default: False)) – if true, include stereochemistry information in the record
  • kekulize (Boolean (default: True)) – if true, Kekulize the molecule before creating the record
  • v3k (Boolean (default: True)) – if true, always write in V3000 format
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting RDKit molecules

chemfp.rdkit_toolkit.from_molfile(content: str, *, sanitize: bool = True, removeHs: bool = True, strictParsing: bool = True, errors: str = 'strict')

Parse a molfile using the RDKit toolkit

This is equivalent to calling:

parse_molecule(content, "molfile", reader_args={…}, errors=errors)
Parameters:
  • sanitize (Boolean (default: True)) – If true, sanitize the molecule after parsing
  • removeHs (Boolean (default: True)) – If true, remove simple hydrogens from the molecular graph
  • strictParsing (Boolean (default: True)) – If true, require stricter adherence to the SDF specification
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an RDKit molecule object
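
The errors parameter appears on every function in this module. The following is a minimal sketch of one common reading of the three modes, not chemfp's implementation. It assumes, without confirmation from this section, that "strict" raises an exception, "ignore" silently returns None, and "log" reports the problem and then returns None. The function name parse_with_errors and the use of ValueError are hypothetical.

```python
import sys


def parse_with_errors(parse, content, errors="strict"):
    """Call parse(content) and handle failures according to `errors`.

    Hypothetical illustration only: `parse` is any callable that raises
    ValueError when it cannot handle the content.
    """
    if errors not in ("strict", "ignore", "log"):
        raise ValueError(f"unsupported errors value: {errors!r}")
    try:
        return parse(content)
    except ValueError as err:
        if errors == "strict":
            raise  # assumed: stop at the first problem
        if errors == "log":
            # assumed: report the problem, then carry on
            print(f"ERROR: {err}", file=sys.stderr)
        return None  # assumed: "ignore" and "log" both yield no molecule
```

With this sketch, parse_with_errors(int, "x", errors="ignore") returns None, while the same call with errors="strict" re-raises the ValueError.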

chemfp.rdkit_toolkit.to_molfile(mol: Any, *, id: Optional[str, None] = None, includeStereo: bool = False, kekulize: bool = True, v3k: bool = False, errors: str = 'strict')

Generate a molfile from an RDKit molecule

This is equivalent to calling:
create_string(mol, "molfile", id=id, writer_args={…}, errors=errors)
Parameters:
  • mol (an RDKit molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • includeStereo (Boolean (default: False)) – if true, include stereochemistry information in the record
  • kekulize (Boolean (default: True)) – if true, Kekulize the molecule before creating the record
  • v3k (Boolean (default: False)) – if true, always write in V3000 format
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

the molfile record as a string

chemfp.rdkit_toolkit.from_rdbinmol(content: str, *, errors: str = 'strict')

Parse an RDKit binary molecule byte string using the RDKit toolkit

This is equivalent to calling:

parse_molecule(content, "rdbinmol", reader_args={…}, errors=errors)
Parameters:errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:an RDKit molecule object
chemfp.rdkit_toolkit.to_rdbinmol(mol: Any, *, id: Optional[str, None] = None, errors: str = 'strict')

Generate an RDKit binary molecule byte string from an RDKit molecule

This is equivalent to calling:
create_string(mol, "rdbinmol", id=id, writer_args={…}, errors=errors)
Parameters:
  • mol (an RDKit molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

the RDKit binary molecule as a byte string

chemfp.rdkit_toolkit.from_fasta(content: str, *, sanitize: bool = True, flavor: Literal[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] = 0, errors: str = 'strict')

Parse a FASTA record using the RDKit toolkit

This is equivalent to calling:

parse_molecule(content, "fasta", reader_args={…}, errors=errors)

Possible flavor values are:

  • 0 = L-amino acids
  • 1 = D-amino acids
  • 2 = RNA, no cap
  • 3 = RNA, 5’ cap
  • 4 = RNA, 3’ cap
  • 5 = RNA, both caps
  • 6 = DNA, no cap
  • 7 = DNA, 5’ cap
  • 8 = DNA, 3’ cap
  • 9 = DNA, both caps
Parameters:
  • sanitize (Boolean (default: True)) – If true, sanitize the molecule after parsing
  • flavor (integer from 0-9, inclusive (default: 0)) – The sequence type (amino acid, RNA, or DNA), and how to handle caps
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an RDKit molecule object
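
The flavor table above follows a regular pattern, so the integer can be computed rather than hard-coded. The helper below is a hypothetical convenience function, not part of chemfp; its name and arguments are assumptions made for illustration.

```python
def fasta_flavor(kind, cap5=False, cap3=False):
    """Return the flavor integer for a sequence type and cap choice.

    Hypothetical helper. `kind` is "L" or "D" for amino acids, or
    "RNA" or "DNA" for nucleic acids. Caps only apply to RNA and DNA.
    """
    if kind in ("L", "D"):
        if cap5 or cap3:
            raise ValueError("caps only apply to RNA and DNA sequences")
        return 0 if kind == "L" else 1
    bases = {"RNA": 2, "DNA": 6}
    if kind not in bases:
        raise ValueError(f"unknown sequence type: {kind!r}")
    # Within each nucleic acid block: +1 for a 5' cap, +2 for a 3' cap.
    return bases[kind] + (1 if cap5 else 0) + (2 if cap3 else 0)
```

The result would then be passed along as, for example, from_fasta(content, flavor=fasta_flavor("RNA", cap5=True)).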

chemfp.rdkit_toolkit.to_fasta(mol: Any, *, id: Optional[str, None] = None, errors: str = 'strict')

Generate a FASTA record from an RDKit molecule

This is equivalent to calling:
create_string(mol, "fasta", id=id, writer_args={…}, errors=errors)
Parameters:
  • mol (an RDKit molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

the FASTA record as a string

chemfp.rdkit_toolkit.from_fasta_file(source: Union[None, str, BinaryIO], *, sanitize: bool = True, flavor: Literal[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] = 0, errors: str = 'strict')

Parse a FASTA record file using the RDKit toolkit

This is mostly equivalent to calling:
read_molecules(source, "fasta", reader_args={…}, errors=errors)

Possible flavor values are:

  • 0 = L-amino acids
  • 1 = D-amino acids
  • 2 = RNA, no cap
  • 3 = RNA, 5’ cap
  • 4 = RNA, 3’ cap
  • 5 = RNA, both caps
  • 6 = DNA, no cap
  • 7 = DNA, 5’ cap
  • 8 = DNA, 3’ cap
  • 9 = DNA, both caps
Parameters:
  • sanitize (Boolean (default: True)) – If true, sanitize the molecule after parsing
  • flavor (integer from 0-9, inclusive (default: 0)) – The sequence type (amino acid, RNA, or DNA), and how to handle caps
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeReader iterating RDKit molecules

chemfp.rdkit_toolkit.to_fasta_file(destination: Union[None, str, BinaryIO], *, errors: str = 'strict')

Open a writer that saves RDKit molecules as FASTA records

This is mostly equivalent to calling:
open_molecule_writer(destination, "fasta", writer_args={…}, errors=errors)
Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting RDKit molecules

chemfp.rdkit_toolkit.from_sequence(content: str, *, sanitize: bool = True, flavor: Literal[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] = 0, errors: str = 'strict')

Parse an IUPAC sequence using the RDKit toolkit

This is equivalent to calling:

parse_molecule(content, "sequence", reader_args={…}, errors=errors)

Possible flavor values are:

  • 0 = L-amino acids
  • 1 = D-amino acids
  • 2 = RNA, no cap
  • 3 = RNA, 5’ cap
  • 4 = RNA, 3’ cap
  • 5 = RNA, both caps
  • 6 = DNA, no cap
  • 7 = DNA, 5’ cap
  • 8 = DNA, 3’ cap
  • 9 = DNA, both caps
Parameters:
  • sanitize (Boolean (default: True)) – If true, sanitize the molecule after parsing
  • flavor (integer from 0-9, inclusive (default: 0)) – The sequence type (amino acid, RNA, or DNA), and how to handle caps
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an RDKit molecule object

chemfp.rdkit_toolkit.to_sequence(mol: Any, *, id: Optional[str, None] = None, errors: str = 'strict')

Generate an IUPAC sequence from an RDKit molecule

This is equivalent to calling:
create_string(mol, "sequence", id=id, writer_args={…}, errors=errors)
Parameters:
  • mol (an RDKit molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

the IUPAC sequence as a string

chemfp.rdkit_toolkit.from_helm(content: str, *, sanitize: bool = True, errors: str = 'strict')

Parse a HELM string using the RDKit toolkit

This is equivalent to calling:

parse_molecule(content, "helm", reader_args={…}, errors=errors)
Parameters:
  • sanitize (Boolean (default: True)) – If true, sanitize the molecule after parsing
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an RDKit molecule object

chemfp.rdkit_toolkit.to_helm(mol: Any, *, id: Optional[str, None] = None, errors: str = 'strict')

Generate a HELM string from an RDKit molecule

This is equivalent to calling:
create_string(mol, "helm", id=id, writer_args={…}, errors=errors)
Parameters:
  • mol (an RDKit molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

the HELM string
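
The sequence and HELM helpers can be chained to convert between the two notations. This is a minimal usage sketch that assumes chemfp and RDKit are both installed and that the installed chemfp release provides the functions documented in this section. The wrapper name sequence_to_helm is hypothetical, and the import is guarded so the snippet loads cleanly when the packages are missing.

```python
try:
    from chemfp import rdkit_toolkit as T
except ImportError:
    # chemfp or RDKit is not installed; the demo below is skipped.
    T = None


def sequence_to_helm(sequence):
    """Convert a one-letter peptide sequence into a HELM string.

    Hypothetical wrapper around the documented from_sequence() and
    to_helm() functions.
    """
    mol = T.from_sequence(sequence, flavor=0)  # flavor 0 = L-amino acids
    return T.to_helm(mol)


if __name__ == "__main__" and T is not None:
    print(sequence_to_helm("GAC"))
```

Both calls use the default errors="strict", so an unparseable sequence is expected to stop the conversion rather than pass silently.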

chemfp.rdkit_toolkit.from_pdb(content: str, *, sanitize: bool = True, removeHs: bool = True, flavor: Literal[0] = 0, proximityBonding: bool = True, errors: str = 'strict')

Parse a PDB record using the RDKit toolkit

This is equivalent to calling:

parse_molecule(content, "pdb", reader_args={…}, errors=errors)
Parameters:
  • sanitize (Boolean (default: True)) – If true, sanitize the molecule after parsing
  • removeHs (Boolean (default: True)) – If true, remove simple hydrogens from the molecular graph
  • flavor (0) – The value 0 (may change in the future)
  • proximityBonding (Boolean (default: True)) – If true, connect atoms based on a proximity search
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an RDKit molecule object

chemfp.rdkit_toolkit.to_pdb(mol: Any, *, id: Optional[str, None] = None, flavor: Union[int, str] = 0, errors: str = 'strict')

Generate a PDB record from an RDKit molecule

This is equivalent to calling:
create_string(mol, "pdb", id=id, writer_args={…}, errors=errors)

Available bit flag flavors are:

  • 1 = ‘MODEL’ = write MODEL/ENDMDL
  • 2 = ‘NO_CONECT’ = no CONECT records
  • 4 = ‘BOTH_CONECT’ = CONECT records in both directions
  • 8 = ‘NO_BOND_ORDER’ = use only one CONECT even for higher bond orders
  • 16 = ‘MASTER’ = write MASTER record
  • 32 = ‘TER’ = write TER record
Parameters:
  • mol (an RDKit molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • flavor (An integer or string of '|'- or ','-separated terms) – Output flavor bit flags
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

the PDB record as a string
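
The flavor values above are bit flags, so several can be combined into one integer with a bitwise OR. The sketch below shows how a '|'- or ','-separated string of flag names could map to that integer. It illustrates the arithmetic only and is not chemfp's parser: the names PDB_FLAVORS and combine_pdb_flavor are hypothetical, and tolerating whitespace around terms is an assumption.

```python
import re

# Flag names and values copied from the list above.
PDB_FLAVORS = {
    "MODEL": 1,
    "NO_CONECT": 2,
    "BOTH_CONECT": 4,
    "NO_BOND_ORDER": 8,
    "MASTER": 16,
    "TER": 32,
}


def combine_pdb_flavor(flavor):
    """Return the integer bit mask for an int or a string of flag names."""
    if isinstance(flavor, int):
        return flavor
    value = 0
    for term in re.split(r"[|,]", flavor):
        term = term.strip()  # assumption: surrounding spaces are ignored
        if term:
            value |= PDB_FLAVORS[term]  # KeyError for an unknown name
    return value
```

For example, "MODEL|TER" combines 1 and 32 into 33, and "NO_CONECT,MASTER" combines 2 and 16 into 18.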

chemfp.rdkit_toolkit.from_pdb_file(source: Union[None, str, BinaryIO], *, sanitize: bool = True, removeHs: bool = True, flavor: Literal[0] = 0, proximityBonding: bool = True, errors: str = 'strict')

Parse a PDB record file using the RDKit toolkit

This is mostly equivalent to calling:
read_molecules(source, "pdb", reader_args={…}, errors=errors)
Parameters:
  • sanitize (Boolean (default: True)) – If true, sanitize the molecule after parsing
  • removeHs (Boolean (default: True)) – If true, remove simple hydrogens from the molecular graph
  • flavor (0) – The value 0 (may change in the future)
  • proximityBonding (Boolean (default: True)) – If true, connect atoms based on a proximity search
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeReader iterating RDKit molecules

chemfp.rdkit_toolkit.to_pdb_file(destination: Union[None, str, BinaryIO], *, flavor: Union[int, str] = 0, errors: str = 'strict')

Open a writer that saves RDKit molecules as PDB records

This is mostly equivalent to calling:
open_molecule_writer(destination, "pdb", writer_args={…}, errors=errors)

Available bit flag flavors are:

  • 1 = ‘MODEL’ = write MODEL/ENDMDL
  • 2 = ‘NO_CONECT’ = no CONECT records
  • 4 = ‘BOTH_CONECT’ = CONECT records in both directions
  • 8 = ‘NO_BOND_ORDER’ = use only one CONECT even for higher bond orders
  • 16 = ‘MASTER’ = write MASTER record
  • 32 = ‘TER’ = write TER record
Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules
  • flavor (An integer or string of '|'- or ','-separated terms) – Output flavor bit flags
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting RDKit molecules

chemfp.rdkit_toolkit.from_inchi(content: str, *, sanitize: bool = True, removeHs: bool = True, logLevel: Optional[int, None] = None, treatWarningAsError: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, errors: str = 'strict')

Parse an InChI string and id using the RDKit toolkit

This is equivalent to calling:

parse_molecule(content, "inchi", reader_args={…}, errors=errors)
Parameters:
  • sanitize (Boolean (default: True)) – If true, sanitize the molecule after parsing
  • removeHs (Boolean (default: True)) – If true, remove simple hydrogens from the molecular graph
  • logLevel (an integer, or None to disable logging completely (default: None)) – the log level for the InChI API
  • treatWarningAsError (Boolean (default: False)) – treat any InChI warnings as an error
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an RDKit molecule object
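
The delimiter controls how a record is split into the InChI and its id. The sketch below shows one plausible reading of several delimiter names. These semantics are assumptions inferred from the names, not taken from this section: 'to_eol' is treated as "the id is everything after the first run of whitespace", and 'whitespace' as "the id is the second whitespace-separated field". The function name split_inchi_line is hypothetical. 'native' and 'comma' are left out; note that InChI strings themselves contain commas.

```python
def split_inchi_line(line, delimiter):
    """Split one record into (inchi, id); id is None when it is absent.

    Hypothetical illustration of the delimiter names; not chemfp code.
    """
    line = line.rstrip("\r\n")
    if delimiter == "to_eol":
        # assumed: the id runs to the end of the line and may contain spaces
        parts = line.split(None, 1)
    elif delimiter == "whitespace":
        # assumed: fields are separated by any run of whitespace
        parts = line.split()[:2]
    elif delimiter in ("tab", "\t"):
        parts = line.split("\t")[:2]
    elif delimiter in ("space", " "):
        parts = line.split(" ")[:2]
    else:
        raise ValueError(f"delimiter not covered by this sketch: {delimiter!r}")
    inchi = parts[0] if parts else ""
    record_id = parts[1] if len(parts) > 1 else None
    return inchi, record_id
```

Under these assumptions the record "InChI=1S/CH4/h1H4 methane gas" has the id "methane gas" with 'to_eol' but only "methane" with 'whitespace'.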

chemfp.rdkit_toolkit.to_inchi(mol: Any, *, id: Optional[str, None] = None, options: str = '', logLevel: Optional[int, None] = None, treatWarningAsError: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, include_id: bool = True, errors: str = 'strict')

Generate an InChI string and id from an RDKit molecule

This is equivalent to calling:
create_string(mol, "inchi", id=id, writer_args={…}, errors=errors)
Parameters:
  • mol (an RDKit molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • options (a string (default: "")) – a configuration string to pass to the InChI API
  • logLevel (an integer, or None to disable logging completely (default: None)) – the log level for the InChI API
  • treatWarningAsError (Boolean (default: False)) – treat any InChI warnings as an error
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id
  • include_id (Boolean (default: True)) – if true, include the molecule id in the output
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

the InChI record as a string

chemfp.rdkit_toolkit.from_inchi_file(source: Union[None, str, BinaryIO], *, sanitize: bool = True, removeHs: bool = True, logLevel: Optional[int, None] = None, treatWarningAsError: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, errors: str = 'strict')

Parse an InChI string and id file using the RDKit toolkit

This is mostly equivalent to calling:
read_molecules(source, "inchi", reader_args={…}, errors=errors)
Parameters:
  • sanitize (Boolean (default: True)) – If true, sanitize the molecule after parsing
  • removeHs (Boolean (default: True)) – If true, remove simple hydrogens from the molecular graph
  • logLevel (an integer, or None to disable logging completely (default: None)) – the log level for the InChI API
  • treatWarningAsError (Boolean (default: False)) – treat any InChI warnings as an error
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeReader iterating RDKit molecules

chemfp.rdkit_toolkit.to_inchi_file(destination: Union[None, str, BinaryIO], *, options: str = '', logLevel: Optional[int, None] = None, treatWarningAsError: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, include_id: bool = True, errors: str = 'strict')

Open a writer that saves RDKit molecules as InChI string and id records

This is mostly equivalent to calling:
open_molecule_writer(destination, "inchi", writer_args={…}, errors=errors)
Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules
  • options (a string (default: "")) – a configuration string to pass to the InChI API
  • logLevel (an integer, or None to disable logging completely (default: None)) – the log level for the InChI API
  • treatWarningAsError (Boolean (default: False)) – treat any InChI warnings as an error
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChI and the id
  • include_id (Boolean (default: True)) – if true, include the molecule id in the output
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting RDKit molecules

chemfp.rdkit_toolkit.from_inchistring(content: str, *, sanitize: bool = True, removeHs: bool = True, logLevel: Optional[int, None] = None, treatWarningAsError: bool = False, errors: str = 'strict')

Parse an InChI string using the RDKit toolkit

This is equivalent to calling:

parse_molecule(content, "inchistring", reader_args={…}, errors=errors)
Parameters:
  • sanitize (Boolean (default: True)) – If true, sanitize the molecule after parsing
  • removeHs (Boolean (default: True)) – If true, remove simple hydrogens from the molecular graph
  • logLevel (an integer, or None to disable logging completely (default: None)) – the log level for the InChI API
  • treatWarningAsError (Boolean (default: False)) – treat any InChI warnings as an error
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

an RDKit molecule object

chemfp.rdkit_toolkit.to_inchistring(mol: Any, *, id: Optional[str, None] = None, options: str = '', logLevel: Optional[int, None] = None, treatWarningAsError: bool = False, errors: str = 'strict')

Generate an InChI string from an RDKit molecule

This is equivalent to calling:
create_string(mol, "inchistring", id=id, writer_args={…}, errors=errors)
Parameters:
  • mol (an RDKit molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • options (a string (default: "")) – a configuration string to pass to the InChI API
  • logLevel (an integer, or None to disable logging completely (default: None)) – the log level for the InChI API
  • treatWarningAsError (Boolean (default: False)) – treat any InChI warnings as an error
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

the InChI string

chemfp.rdkit_toolkit.to_inchikey(mol: Any, *, id: Optional[str, None] = None, options: str = '', logLevel: Optional[int, None] = None, treatWarningAsError: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, include_id: bool = True, errors: str = 'strict')

Generate an InChIKey string and id from an RDKit molecule

This is equivalent to calling:
create_string(mol, "inchikey", id=id, writer_args={…}, errors=errors)
Parameters:
  • mol (an RDKit molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • options (a string (default: "")) – a configuration string to pass to the InChI API
  • logLevel (an integer, or None to disable logging completely (default: None)) – the log level for the InChI API
  • treatWarningAsError (Boolean (default: False)) – treat any InChI warnings as an error
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChIKey and the id
  • include_id (Boolean (default: True)) – if true, include the molecule id in the output
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

the InChIKey record as a string

chemfp.rdkit_toolkit.to_inchikey_file(destination: Union[None, str, BinaryIO], *, options: str = '', logLevel: Optional[int, None] = None, treatWarningAsError: bool = False, delimiter: Optional[Literal['to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t'], None] = None, include_id: bool = True, errors: str = 'strict')

Open a writer that saves RDKit molecules as InChIKey string and id records

This is mostly equivalent to calling:
open_molecule_writer(destination, "inchikey", writer_args={…}, errors=errors)
Parameters:
  • destination (None, a filename string, or a file-like object) – where to write the molecules
  • options (a string (default: "")) – a configuration string to pass to the InChI API
  • logLevel (an integer, or None to disable logging completely (default: None)) – the log level for the InChI API
  • treatWarningAsError (Boolean (default: False)) – treat any InChI warnings as an error
  • delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: None)) – The separator between the InChIKey and the id
  • include_id (Boolean (default: True)) – if true, include the molecule id in the output
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a chemfp.base_toolkit.MoleculeWriter expecting RDKit molecules

chemfp.rdkit_toolkit.to_inchikeystring(mol: Any, *, id: Optional[str, None] = None, options: str = '', logLevel: Optional[int, None] = None, treatWarningAsError: bool = False, errors: str = 'strict')

Generate an InChIKey string from an RDKit molecule

This is equivalent to calling:
create_string(mol, "inchikeystring", id=id, writer_args={…}, errors=errors)
Parameters:
  • mol (an RDKit molecule) – a molecule object
  • id (None or a string (default: None)) – an alternate identifier for the output record, if relevant
  • options (a string (default: "")) – a configuration string to pass to the InChI API
  • logLevel (an integer, or None to disable logging completely (default: None)) – the log level for the InChI API
  • treatWarningAsError (Boolean (default: False)) – treat any InChI warnings as an error
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

the InChIKey string