chemfp.cdk_types module

class chemfp.cdk_types.CDKBaseFingerprintType(fingerprint_kwargs)

Bases: chemfp.types.ThreadsafeFingerprinterMixin, chemfp.types.FingerprintType

fingerprinter_can_fail = True

a CDK fingerprinter can raise an exception if the molecule isn’t correctly prepared

from_inchi(content: str, *, delimiter: Optional[Literal['to-eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = 'to-eol', errors: str = 'strict')

Generate a fingerprint from an InChI string and id

This is equivalent to calling:

mol = fptype.toolkit.from_inchi(content, ..., errors=errors)
fp = fptype.from_mol(mol) if (mol is not None) else None

Parameters:
  • delimiter (One of None, 'to-eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: "to-eol")) – The separator between the InChI and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a fingerprint byte string
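
The two-step pattern above (parse, then fingerprint only if a molecule came back) can be sketched without CDK installed. This is a minimal illustration, not chemfp's implementation: parse_or_none, fingerprint_or_none, toy_parse and toy_from_mol are hypothetical stand-ins, and the behaviour of the three errors values is an assumption based on the parameter description.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("fingerprint-sketch")


def parse_or_none(parse: Callable[[str], object], content: str,
                  errors: str = "strict") -> Optional[object]:
    """Parse `content` under an assumed strict/ignore/log error policy."""
    if errors not in ("strict", "ignore", "log"):
        raise ValueError(f"unsupported errors value: {errors!r}")
    try:
        return parse(content)
    except ValueError as err:
        if errors == "strict":
            raise  # propagate the parse failure to the caller
        if errors == "log":
            logger.warning("skipping record: %s", err)
        return None  # "ignore" and "log" both yield no molecule


def fingerprint_or_none(parse, from_mol, content, errors="strict"):
    """Parse first, then fingerprint only when a molecule was produced."""
    mol = parse_or_none(parse, content, errors=errors)
    return from_mol(mol) if (mol is not None) else None


# Hypothetical stand-ins so the sketch runs without a chemistry toolkit.
def toy_parse(content: str) -> str:
    if not content.startswith("InChI="):
        raise ValueError(f"not an InChI string: {content!r}")
    return content


def toy_from_mol(mol: str) -> bytes:
    # A fake one-byte "fingerprint" derived from the string length.
    return bytes([len(mol) % 256])
```

With errors="ignore" or errors="log" a failed parse yields None instead of a fingerprint, so callers should test the result before using it.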

from_inchistring(content: str, *, delimiter: Optional[Literal['to-eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = 'to-eol', errors: str = 'strict')

Generate a fingerprint from an InChI string

This is equivalent to calling:

mol = fptype.toolkit.from_inchistring(content, ..., errors=errors)
fp = fptype.from_mol(mol) if (mol is not None) else None

Parameters:
  • delimiter (One of None, 'to-eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: "to-eol")) – The separator between the InChI and the id
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a fingerprint byte string

from_molfile(content: str, *, ForceReadAs3DCoordinates: bool = False, mode: Literal['RELAXED', 'STRICT'] = 'RELAXED', AddStereoElements: bool = True, InterpretHydrogenIsotopes: bool = True, implementation: Optional[Literal['cdk', 'chemfp']] = 'cdk', errors: str = 'strict')

Generate a fingerprint from a molfile

This is equivalent to calling:

mol = fptype.toolkit.from_molfile(content, ..., errors=errors)
fp = fptype.from_mol(mol) if (mol is not None) else None

Parameters:
  • ForceReadAs3DCoordinates (Boolean (default: False)) – if true, always interpret coordinates as 3D
  • mode ('RELAXED' will attempt to recover, 'STRICT' will not) – strictness mode when parsing a record
  • AddStereoElements (Boolean (default: True)) – if true, detect and create IStereoElements
  • InterpretHydrogenIsotopes (Boolean (default: True)) – if true, interpret D and T as hydrogen isotopes
  • implementation (either 'cdk' or 'chemfp') – use CDK or chemfp to identify records
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a fingerprint byte string

from_sdf(content: str, *, ForceReadAs3DCoordinates: bool = False, mode: Literal['RELAXED', 'STRICT'] = 'RELAXED', AddStereoElements: bool = True, InterpretHydrogenIsotopes: bool = True, implementation: Optional[Literal['cdk', 'chemfp']] = 'cdk', errors: str = 'strict')

Generate a fingerprint from an SDF record

This is equivalent to calling:

mol = fptype.toolkit.from_sdf(content, ..., errors=errors)
fp = fptype.from_mol(mol) if (mol is not None) else None

Parameters:
  • ForceReadAs3DCoordinates (Boolean (default: False)) – if true, always interpret coordinates as 3D
  • mode ('RELAXED' will attempt to recover, 'STRICT' will not) – strictness mode when parsing a record
  • AddStereoElements (Boolean (default: True)) – if true, detect and create IStereoElements
  • InterpretHydrogenIsotopes (Boolean (default: True)) – if true, interpret D and T as hydrogen isotopes
  • implementation (either 'cdk' or 'chemfp') – use CDK or chemfp to identify records
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a fingerprint byte string

from_smi(content: str, *, has_header: bool = False, delimiter: Optional[Literal['to-eol', 'space', 'tab', 'comma', 'whitespace', 'native', ' ', '\t']] = 'to-eol', implementation: Optional[Literal['cdk', 'chemfp']] = 'cdk', kekulise: bool = True, errors: str = 'strict')

Generate a fingerprint from a SMILES string and id

This is equivalent to calling:

mol = fptype.toolkit.from_smi(content, ..., errors=errors)
fp = fptype.from_mol(mol) if (mol is not None) else None

Parameters:
  • has_header (Boolean (default: False)) – If true, treat the first line of the SMILES file as a header
  • delimiter (One of None, 'to-eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: "to-eol")) – The separator between the SMILES and the id
  • implementation (either 'cdk' or 'chemfp') – use CDK or chemfp to identify records
  • kekulise (Boolean (default: True)) – if true, ensure a valid Kekule interpretation exists
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a fingerprint byte string
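
To make the delimiter parameter concrete, here is a small sketch of how a SMILES-file line might be split into a SMILES string and an id. It is not chemfp's parser: split_smi_record is a hypothetical helper, the meaning given to each delimiter name is an assumption, treating None like the default is a choice made for this sketch, and 'native' (toolkit-specific) is deliberately left out.

```python
from typing import Optional, Tuple

# Single-character delimiters, keyed by the names listed in the parameter docs.
_CHAR_DELIMITERS = {"space": " ", "tab": "\t", "comma": ",", " ": " ", "\t": "\t"}


def split_smi_record(line: str,
                     delimiter: Optional[str] = "to-eol") -> Tuple[str, Optional[str]]:
    """Split one SMILES-file line into (smiles, id); id is None when absent."""
    line = line.rstrip("\r\n")
    if delimiter is None or delimiter == "to-eol":
        # Assumed: the SMILES ends at the first whitespace and the id is
        # everything after it, up to the end of the line.
        fields = line.split(None, 1)
    elif delimiter == "whitespace":
        # Assumed: the id is only the second whitespace-separated field.
        fields = line.split()[:2]
    elif delimiter in _CHAR_DELIMITERS:
        fields = line.split(_CHAR_DELIMITERS[delimiter])[:2]
    else:
        raise ValueError(f"unsupported delimiter in this sketch: {delimiter!r}")
    if not fields:
        raise ValueError("empty record")
    smiles = fields[0]
    record_id = fields[1] if len(fields) > 1 else None
    return smiles, record_id
```

Under these assumptions the choice matters for ids that contain spaces: "to-eol" keeps "phenol (carbolic acid)" whole, while "whitespace" keeps only "phenol".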

from_smiles(content: str, *, kekulise: bool = True, errors: str = 'strict')

Generate a fingerprint from a SMILES string

This is equivalent to calling:

mol = fptype.toolkit.from_smistring(content, ..., errors=errors)
fp = fptype.from_mol(mol) if (mol is not None) else None

Parameters:
  • kekulise (Boolean (default: True)) – if true, ensure a valid Kekule interpretation exists
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a fingerprint byte string

from_smistring(content: str, *, kekulise: bool = True, errors: str = 'strict')

Generate a fingerprint from a SMILES string

This is equivalent to calling:

mol = fptype.toolkit.from_smistring(content, ..., errors=errors)
fp = fptype.from_mol(mol) if (mol is not None) else None

Parameters:
  • kekulise (Boolean (default: True)) – if true, ensure a valid Kekule interpretation exists
  • errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns:

a fingerprint byte string

software = ...

a description of the CDK and chemfp versions

toolkit = <module 'chemfp.cdk_toolkit'>

a reference to the cdk_toolkit module

class chemfp.cdk_types.CDKDaylightFingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.cdk_types.VariableSizeFingerprint

CDK’s Daylight-like fingerprint, version 2.0

The CDK-Daylight/2.0 FingerprintType parameters are:

  • size - the number of bits in the fingerprint (default: 1024)
  • searchDepth - maximum path length (default: 7)
  • pathLimit - maximum number of paths (default: 42000)
  • hashPseudoAtoms - include pseudo-atoms in the hash calculation (default: 0)

name = 'CDK-Daylight/2.0'
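
The name and parameters listed above are commonly combined into a single type string of the form "name key=value ...". The following sketch formats and parses such a string. format_type_string and parse_type_string are hypothetical helpers, and the parameter order in the output is an assumption rather than chemfp's documented canonical form.

```python
from typing import Dict, Tuple

# Defaults copied from the CDK-Daylight/2.0 parameter list.
DAYLIGHT_DEFAULTS = {
    "size": 1024,
    "searchDepth": 7,
    "pathLimit": 42000,
    "hashPseudoAtoms": 0,
}


def format_type_string(name: str, defaults: Dict[str, int], **overrides: int) -> str:
    """Build 'name key=value ...' from defaults plus any overrides."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    params = {**defaults, **overrides}  # keeps the order of `defaults`
    return " ".join([name] + [f"{key}={value}" for key, value in params.items()])


def parse_type_string(type_string: str) -> Tuple[str, Dict[str, int]]:
    """Split a type string back into its name and integer parameters."""
    name, *terms = type_string.split()
    params = {}
    for term in terms:
        key, _, value = term.partition("=")
        params[key] = int(value)
    return name, params
```

For example, format_type_string("CDK-Daylight/2.0", DAYLIGHT_DEFAULTS, size=2048) overrides only the fingerprint size and leaves the other three parameters at their defaults.
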
class chemfp.cdk_types.CDKGraphOnlyFingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.cdk_types.VariableSizeFingerprint

CDK’s GraphOnly fingerprint, version 2.0

The CDK-GraphOnly/2.0 FingerprintType parameters are:

  • size - the number of bits in the fingerprint (default: 1024)
  • searchDepth - maximum path length (default: 7)
  • pathLimit - maximum number of paths (default: 42000)
  • hashPseudoAtoms - include pseudo-atoms in the hash calculation (default: 0)

name = 'CDK-GraphOnly/2.0'

class chemfp.cdk_types.CDKMACCSFingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.types.NoFingerprintParametersMixin, chemfp.cdk_types.FixedSizeFingerprint

CDK’s implementation of the 166 MACCS keys, version 2.0

The CDK-MACCS/2.0 fingerprints have no parameters.

name = 'CDK-MACCS/2.0'

num_bits = 166

class chemfp.cdk_types.CDKEStateFingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.types.NoFingerprintParametersMixin, chemfp.cdk_types.FixedSizeFingerprint

CDK’s implementation of the EState fingerprint, version 2.0

The CDK-EState/2.0 fingerprints have no parameters.

name = 'CDK-EState/2.0'

num_bits = 79

class chemfp.cdk_types.CDKExtendedFingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.cdk_types.VariableSizeFingerprint

CDK’s Extended fingerprint, version 2.0

The CDK-Extended/2.0 FingerprintType parameters are:

  • size - the number of bits in the fingerprint (default: 1024)
  • searchDepth - maximum path length (default: 7)
  • pathLimit - maximum number of paths (default: 42000)
  • hashPseudoAtoms - include pseudo-atoms in the hash calculation (default: 0)

name = 'CDK-Extended/2.0'

class chemfp.cdk_types.CDKHybridizationFingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.cdk_types.VariableSizeFingerprint

CDK’s Hybridization fingerprint, version 2.0

The CDK-Hybridization/2.0 FingerprintType parameters are:

  • size - the number of bits in the fingerprint (default: 1024)
  • searchDepth - maximum path length (default: 7)
  • pathLimit - maximum number of paths (default: 42000)
  • hashPseudoAtoms - include pseudo-atoms in the hash calculation (default: 0)

name = 'CDK-Hybridization/2.0'

class chemfp.cdk_types.CDKKlekotaRothFingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.types.NoFingerprintParametersMixin, chemfp.cdk_types.FixedSizeFingerprint

CDK’s implementation of the Klekota-Roth fingerprints

The CDK-KlekotaRoth/2.0 fingerprints have no parameters.

name = 'CDK-KlekotaRoth/2.0'

num_bits = 4860

class chemfp.cdk_types.CDKPubchemFingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.types.NoFingerprintParametersMixin, chemfp.cdk_types.FixedSizeFingerprint

CDK’s implementation of the PubChem fingerprints

The CDK-Pubchem/2.0 fingerprints have no parameters.

name = 'CDK-Pubchem/2.0'

num_bits = 881

class chemfp.cdk_types.CDKSubstructureFingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.types.NoFingerprintParametersMixin, chemfp.cdk_types.FixedSizeFingerprint

CDK’s Substructure fingerprints

The CDK-Substructure/2.0 fingerprints have no parameters.

name = 'CDK-Substructure/2.0'

num_bits = 307

class chemfp.cdk_types.CDKShortestPathFingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.cdk_types.VariableSizeFingerprint

CDK’s ShortestPath fingerprint, version 2.0

The CDK-ShortestPath/2.0 FingerprintType parameter is:

  • size - the number of bits in the fingerprint (default: 1024)

This is generated by CDK versions older than 2.7.

fingerprinter_can_fail = True

name = 'CDK-ShortestPath/2.0'

class chemfp.cdk_types.CDKShortestPathFingerprintType_v27(fingerprint_kwargs)

Bases: chemfp.cdk_types.VariableSizeFingerprint

CDK’s ShortestPath fingerprint, version 2.7

The CDK-ShortestPath/2.7 FingerprintType parameter is:

  • size - the number of bits in the fingerprint (default: 1024)

This version is new in CDK 2.7, where the internal PRNG was changed from a Mersenne Twister to XorShift, resulting in a different bit pattern.

fingerprinter_can_fail = True

name = 'CDK-ShortestPath/2.7'
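
To see why a PRNG change alone produces a different bit pattern, consider a generic scheme in which each feature hash seeds a generator and the first value drawn selects the bit to set. The sketch below is not CDK's code: bit_from_mersenne, bit_from_xorshift and the seeding scheme are illustrative assumptions. It uses Python's random module (a Mersenne Twister) and a textbook 32-bit xorshift step.

```python
import random

MASK32 = 0xFFFFFFFF


def xorshift32(state: int) -> int:
    """One step of the classic 32-bit xorshift generator (shifts 13, 17, 5)."""
    state &= MASK32
    state ^= (state << 13) & MASK32
    state ^= state >> 17
    state ^= (state << 5) & MASK32
    return state


def bit_from_mersenne(feature_hash: int, size: int = 1024) -> int:
    """Pick a bit index by seeding a Mersenne Twister with the feature hash."""
    return random.Random(feature_hash).randrange(size)


def bit_from_xorshift(feature_hash: int, size: int = 1024) -> int:
    """Pick a bit index by seeding xorshift32 with the same feature hash."""
    seed = (feature_hash & MASK32) or 1  # xorshift state must be non-zero
    return xorshift32(seed) % size
```

Both functions are deterministic for a given hash, but they map the same hash through different number sequences, so fingerprints generated under the two schemes are not comparable. That is the reason the two versions carry distinct type names.
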
class chemfp.cdk_types.CDKECFP0FingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.cdk_types.VariableSizeFingerprint

CDK’s implementation of the ECFP0 fingerprint, version 2.0

The CDK-ECFP0/2.0 FingerprintType parameters are:

  • size - the number of bits in the fingerprint (default: 1024)
  • perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-ECFP0/2.0'

class chemfp.cdk_types.CDKECFP2FingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.cdk_types.VariableSizeFingerprint

CDK’s implementation of the ECFP2 fingerprint, version 2.0

The CDK-ECFP2/2.0 FingerprintType parameters are:

  • size - the number of bits in the fingerprint (default: 1024)
  • perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-ECFP2/2.0'

class chemfp.cdk_types.CDKECFP4FingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.cdk_types.VariableSizeFingerprint

CDK’s implementation of the ECFP4 fingerprint, version 2.0

The CDK-ECFP4/2.0 FingerprintType parameters are:

  • size - the number of bits in the fingerprint (default: 1024)
  • perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-ECFP4/2.0'

class chemfp.cdk_types.CDKECFP6FingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.cdk_types.VariableSizeFingerprint

CDK’s implementation of the ECFP6 fingerprint, version 2.0

The CDK-ECFP6/2.0 FingerprintType parameters are:

  • size - the number of bits in the fingerprint (default: 1024)
  • perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-ECFP6/2.0'

class chemfp.cdk_types.CDKFCFP0FingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.cdk_types.VariableSizeFingerprint

CDK’s implementation of the FCFP0 fingerprint, version 2.0

The CDK-FCFP0/2.0 FingerprintType parameters are:

  • size - the number of bits in the fingerprint (default: 1024)
  • perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-FCFP0/2.0'

class chemfp.cdk_types.CDKFCFP2FingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.cdk_types.VariableSizeFingerprint

CDK’s implementation of the FCFP2 fingerprint, version 2.0

The CDK-FCFP2/2.0 FingerprintType parameters are:

  • size - the number of bits in the fingerprint (default: 1024)
  • perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-FCFP2/2.0'

class chemfp.cdk_types.CDKFCFP4FingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.cdk_types.VariableSizeFingerprint

CDK’s implementation of the FCFP4 fingerprint, version 2.0

The CDK-FCFP4/2.0 FingerprintType parameters are:

  • size - the number of bits in the fingerprint (default: 1024)
  • perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-FCFP4/2.0'

class chemfp.cdk_types.CDKFCFP6FingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.cdk_types.VariableSizeFingerprint

CDK’s implementation of the FCFP6 fingerprint, version 2.0

The CDK-FCFP6/2.0 FingerprintType parameters are:

  • size - the number of bits in the fingerprint (default: 1024)
  • perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-FCFP6/2.0'
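
The eight circular fingerprint names follow one pattern: a family (ECFP or FCFP), a digit, and the version. By the usual ECFP/FCFP naming convention the digit is the neighbourhood diameter in bonds, so the radius is half of it. The sketch below decodes a name on that assumption; describe_circular_type and CIRCULAR_TYPE_NAMES are hypothetical and not part of chemfp.

```python
import re

_CIRCULAR_NAME = re.compile(r"^CDK-(ECFP|FCFP)(\d+)/(\d+\.\d+)$")


def describe_circular_type(name: str) -> dict:
    """Decode a circular fingerprint type name into its components."""
    match = _CIRCULAR_NAME.match(name)
    if match is None:
        raise ValueError(f"not a CDK circular fingerprint name: {name!r}")
    family, diameter, version = match.group(1), int(match.group(2)), match.group(3)
    return {
        "family": family,
        "diameter": diameter,
        "radius": diameter // 2,  # assumed convention: diameter = 2 * radius
        "version": version,
    }


# The eight type names documented in this module.
CIRCULAR_TYPE_NAMES = [
    f"CDK-{family}{diameter}/2.0"
    for family in ("ECFP", "FCFP")
    for diameter in (0, 2, 4, 6)
]
```

For instance, "CDK-ECFP4/2.0" decodes to family ECFP, diameter 4 and radius 2.
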
class chemfp.cdk_types.CDKAtomPairs2DFingerprintType_v20(fingerprint_kwargs)

Bases: chemfp.types.NoFingerprintParametersMixin, chemfp.cdk_types.FixedSizeFingerprint

CDK’s implementation of the atom pairs fingerprint from Yap Chun Wei’s PaDEL.

The CDK-AtomPairs2D/2.0 FingerprintType takes no parameters.

name = 'CDK-AtomPairs2D/2.0'

num_bits = 780
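
Because fingerprints are returned as byte strings, a num_bits value that is not a multiple of 8 is rounded up to whole bytes. The sketch below computes the byte length for each fixed-size type in this module and lists the set bits of a byte string. num_bytes and on_bits are hypothetical helpers, and the bit order (bit 0 is the least-significant bit of the first byte) is an assumption to check against the chemfp documentation.

```python
# num_bits values copied from the fixed-size fingerprint types above.
FIXED_SIZE_NUM_BITS = {
    "CDK-MACCS/2.0": 166,
    "CDK-EState/2.0": 79,
    "CDK-KlekotaRoth/2.0": 4860,
    "CDK-Pubchem/2.0": 881,
    "CDK-Substructure/2.0": 307,
    "CDK-AtomPairs2D/2.0": 780,
}


def num_bytes(num_bits: int) -> int:
    """Smallest number of whole bytes that can hold num_bits bits."""
    return (num_bits + 7) // 8


def on_bits(fp: bytes) -> list:
    """Indices of set bits, assuming bit i lives in byte i // 8 at position i % 8."""
    return [i for i in range(len(fp) * 8) if (fp[i // 8] >> (i % 8)) & 1]


if __name__ == "__main__":
    for type_name, bits in FIXED_SIZE_NUM_BITS.items():
        print(f"{type_name}: {bits} bits -> {num_bytes(bits)} bytes")
```

The 166-bit MACCS fingerprint, for example, occupies 21 bytes, leaving the top two bits of the last byte unused.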