chemfp.cdk_types module¶
This module should not be imported directly.
It contains internal implementation details of CDK fingerprint generation.
It is documented here because some of its objects are returned to the user and are therefore part of the public API.
-
class
chemfp.cdk_types.
CDKBaseFingerprintType
(fingerprint_kwargs)¶ Bases:
chemfp.types.ThreadsafeFingerprinterMixin
,chemfp.types.FingerprintType
-
fingerprinter_can_fail
= True¶ a CDK fingerprinter can raise an exception if the molecule isn’t correctly prepared
-
from_inchi
(content: Union[str, bytes], *, delimiter: Optional[Literal[to_eol, space, tab, comma, whitespace, native, , ], None] = 'to-eol', errors: str = 'strict')¶ Generate a fingerprint from an InChI string and its id
This is equivalent to calling:
mol = fptype.toolkit.from_inchi(content, ..., errors=errors)
fp = fptype.from_mol(mol) if (mol is not None) else None
Parameters: - delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: "to-eol")) – The separator between the InChI and the id
- errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns: a fingerprint byte string
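The "equivalent to calling" pattern above can be pictured without a CDK installation. The following is a minimal sketch using hypothetical stand-in classes (`StubToolkit` and `StubFingerprintType` are not part of chemfp). It only illustrates how a failed parse becomes a `None` fingerprint when errors is not "strict", which is assumed here from the `mol is not None` check in the pattern.

```python
class StubToolkit:
    """Hypothetical stand-in for fptype.toolkit; not part of chemfp."""

    def from_inchi(self, content, errors="strict"):
        # Pretend that any record starting with "InChI=" parses successfully.
        if content.startswith("InChI="):
            return {"record": content}  # placeholder "molecule"
        if errors == "strict":
            raise ValueError(f"cannot parse {content!r}")
        return None  # assumed behavior for "ignore" and "log"


class StubFingerprintType:
    """Hypothetical stand-in for a fingerprint type; not part of chemfp."""

    toolkit = StubToolkit()

    def from_mol(self, mol):
        # A real fingerprinter would return a fingerprint byte string.
        return b"\x00" * 4

    def from_inchi(self, content, errors="strict"):
        # Same two steps as the documented equivalence.
        mol = self.toolkit.from_inchi(content, errors=errors)
        return self.from_mol(mol) if (mol is not None) else None


fptype = StubFingerprintType()
good = fptype.from_inchi("InChI=1S/CH4/h1H4 methane")
bad = fptype.from_inchi("not an inchi", errors="ignore")
```

Callers that pass errors="ignore" or errors="log" should therefore be prepared to check the result for `None` before using it.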
-
from_inchistring
(content: Union[str, bytes], *, delimiter: Optional[Literal[to_eol, space, tab, comma, whitespace, native, , ], None] = 'to-eol', errors: str = 'strict')¶ Generate a fingerprint from an InChI string
This is equivalent to calling:
mol = fptype.toolkit.from_inchistring(content, ..., errors=errors)
fp = fptype.from_mol(mol) if (mol is not None) else None
Parameters: - delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: "to-eol")) – The separator between the InChI and the id
- errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns: a fingerprint byte string
-
from_molfile
(content: Union[str, bytes], *, AddStereo0d: bool = True, AddStereoElements: bool = True, InterpretHydrogenIsotopes: bool = True, ForceReadAs3DCoordinates: bool = False, mode: Literal[RELAXED, STRICT] = 'RELAXED', implementation: Optional[Literal[cdk, chemfp], None] = 'cdk', errors: str = 'strict')¶ Generate a fingerprint from a molfile record
This is equivalent to calling:
mol = fptype.toolkit.from_molfile(content, ..., errors=errors)
fp = fptype.from_mol(mol) if (mol is not None) else None
Parameters: - AddStereo0d (Boolean (default: True)) – if true, create stereo from parity value when no coordinates
- AddStereoElements (Boolean (default: True)) – if true, detect and create IStereoElements
- InterpretHydrogenIsotopes (Boolean (default: True)) – if true, interpret D and T as hydrogen isotopes
- ForceReadAs3DCoordinates (Boolean (default: False)) – if true, always interpret coordinates as 3D
- mode ('RELAXED' will attempt to recover, 'STRICT' will not) – strictness mode when parsing a record
- implementation (either 'cdk' or 'chemfp') – use CDK or chemfp to identify records
- errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns: a fingerprint byte string
-
from_sdf
(content: Union[str, bytes], *, AddStereo0d: bool = True, AddStereoElements: bool = True, InterpretHydrogenIsotopes: bool = True, ForceReadAs3DCoordinates: bool = False, mode: Literal[RELAXED, STRICT] = 'RELAXED', implementation: Optional[Literal[cdk, chemfp], None] = 'cdk', errors: str = 'strict')¶ Generate a fingerprint from an SDF record
This is equivalent to calling:
mol = fptype.toolkit.from_sdf(content, ..., errors=errors)
fp = fptype.from_mol(mol) if (mol is not None) else None
Parameters: - AddStereo0d (Boolean (default: True)) – if true, create stereo from parity value when no coordinates
- AddStereoElements (Boolean (default: True)) – if true, detect and create IStereoElements
- InterpretHydrogenIsotopes (Boolean (default: True)) – if true, interpret D and T as hydrogen isotopes
- ForceReadAs3DCoordinates (Boolean (default: False)) – if true, always interpret coordinates as 3D
- mode ('RELAXED' will attempt to recover, 'STRICT' will not) – strictness mode when parsing a record
- implementation (either 'cdk' or 'chemfp') – use CDK or chemfp to identify records
- errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns: a fingerprint byte string
-
from_smi
(content: Union[str, bytes], *, has_header: bool = False, delimiter: Optional[Literal[to_eol, space, tab, comma, whitespace, native, , ], None] = 'to-eol', implementation: Optional[Literal[cdk, chemfp], None] = 'cdk', kekulise: bool = True, cxsmiles: bool = True, errors: str = 'strict')¶ Generate a fingerprint from a SMILES string and its id
This is equivalent to calling:
mol = fptype.toolkit.from_smi(content, ..., errors=errors)
fp = fptype.from_mol(mol) if (mol is not None) else None
Parameters: - has_header (Boolean (default: False)) – If true, treat the first line of the SMILES file as a header
- delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: "to-eol")) – The separator between the SMILES and the id
- implementation (either 'cdk' or 'chemfp') – use CDK or chemfp to identify records
- kekulise (Boolean (default: True)) – if true, ensure a valid Kekule interpretation exists
- cxsmiles (Boolean (default: True)) – If true, look for ChemAxon CXSMILES extensions after the SMILES string (only works with ‘chemfp’ implementation)
- errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns: a fingerprint byte string
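To make the delimiter options more concrete, here is a rough sketch of how a SMILES record might be split into a SMILES string and an id for each option. The function `split_smiles_record` is hypothetical and only approximates the meaning of the option names. It assumes that None behaves like the default, and it leaves out 'native', which depends on the toolkit. chemfp's own parser may differ in its details.

```python
# Options that name a single separator character.
SEPARATORS = {"space": " ", "tab": "\t", "comma": ",", " ": " ", "\t": "\t"}


def split_smiles_record(line, delimiter="to-eol"):
    """Split one SMILES-file line into (smiles, id); id is None if missing."""
    line = line.rstrip("\r\n")
    if delimiter in (None, "to-eol", "to_eol"):
        # First whitespace-delimited token is the SMILES,
        # the id is the rest of the line.
        parts = line.split(None, 1)
    elif delimiter == "whitespace":
        # Only the second whitespace-delimited token is the id.
        parts = line.split()[:2]
    elif delimiter in SEPARATORS:
        # The first two fields around the separator character.
        parts = line.split(SEPARATORS[delimiter])[:2]
    else:
        raise ValueError(f"unsupported delimiter: {delimiter!r}")
    smiles = parts[0] if parts else ""
    id_ = parts[1] if len(parts) > 1 else None
    return smiles, id_


to_eol = split_smiles_record("CCO ethyl alcohol\n")
by_whitespace = split_smiles_record("CCO ethyl alcohol\n", "whitespace")
by_tab = split_smiles_record("CCO\tethyl alcohol\textra\n", "tab")
```

The main practical difference is that the default keeps spaces inside the id, while "whitespace" stops at the first one.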
-
from_smiles
(content: Union[str, bytes], *, kekulise: bool = True, cxsmiles: bool = True, errors: str = 'strict')¶ Generate a fingerprint from a SMILES string
This is equivalent to calling:
mol = fptype.toolkit.from_smistring(content, ..., errors=errors)
fp = fptype.from_mol(mol) if (mol is not None) else None
Parameters: - kekulise (Boolean (default: True)) – if true, ensure a valid Kekule interpretation exists
- cxsmiles (Boolean (default: True)) – If true, look for ChemAxon CXSMILES extensions after the SMILES string
- errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns: a fingerprint byte string
-
from_smistring
(content: Union[str, bytes], *, kekulise: bool = True, cxsmiles: bool = True, errors: str = 'strict')¶ Generate a fingerprint from a SMILES string
This is equivalent to calling:
mol = fptype.toolkit.from_smistring(content, ..., errors=errors)
fp = fptype.from_mol(mol) if (mol is not None) else None
Parameters: - kekulise (Boolean (default: True)) – if true, ensure a valid Kekule interpretation exists
- cxsmiles (Boolean (default: True)) – If true, look for ChemAxon CXSMILES extensions after the SMILES string
- errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns: a fingerprint byte string
-
module
= <module 'chemfp.cdk_toolkit'>¶
-
software
= ...¶ a description of the CDK and chemfp versions
-
-
class
chemfp.cdk_types.
CDKDaylightFingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.cdk_types.VariableSizeFingerprint
CDK’s Daylight-like fingerprint, version 2.0
The CDK-Daylight/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- searchDepth - maximum path length (default: 7)
- pathLimit - maximum number of paths (default: 42000)
- hashPseudoAtoms - include pseudo-atoms in the hash calculation (default: 0)
-
name
= 'CDK-Daylight/2.0'¶
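The parameter defaults listed above can be combined with the type name into a single type string. The sketch below assumes the common chemfp convention of a name followed by space-separated key=value pairs. The helper `make_type_string` is hypothetical, and the exact string chemfp produces for this type is an assumption, not something stated on this page.

```python
# Defaults copied from the CDK-Daylight/2.0 parameter list.
DAYLIGHT_DEFAULTS = {
    "size": 1024,
    "searchDepth": 7,
    "pathLimit": 42000,
    "hashPseudoAtoms": 0,
}


def make_type_string(name, defaults, **overrides):
    """Hypothetical helper: join a type name and its parameters."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    # Overrides replace defaults in place, so the parameter order is stable.
    params = {**defaults, **overrides}
    return " ".join([name] + [f"{key}={value}" for key, value in params.items()])


default_type = make_type_string("CDK-Daylight/2.0", DAYLIGHT_DEFAULTS)
wide_type = make_type_string("CDK-Daylight/2.0", DAYLIGHT_DEFAULTS, size=2048)
```

The same defaults are listed for the GraphOnly, Extended, and Hybridization types, so the helper would apply to them with only the name changed.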
-
class
chemfp.cdk_types.
CDKGraphOnlyFingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.cdk_types.VariableSizeFingerprint
CDK’s GraphOnly fingerprint, version 2.0
The CDK-GraphOnly/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- searchDepth - maximum path length (default: 7)
- pathLimit - maximum number of paths (default: 42000)
- hashPseudoAtoms - include pseudo-atoms in the hash calculation (default: 0)
-
name
= 'CDK-GraphOnly/2.0'¶
-
class
chemfp.cdk_types.
CDKMACCSFingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.types.NoFingerprintParametersMixin
,chemfp.cdk_types.FixedSizeFingerprint
CDK’s implementation of the 166 MACCS keys, version 2.0
The CDK-MACCS/2.0 fingerprints have no parameters.
-
name
= 'CDK-MACCS/2.0'¶
-
num_bits
= 166¶
-
-
class
chemfp.cdk_types.
CDKEStateFingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.types.NoFingerprintParametersMixin
,chemfp.cdk_types.FixedSizeFingerprint
CDK’s implementation of the EState fingerprint, version 2.0
The CDK-EState/2.0 fingerprints have no parameters.
-
name
= 'CDK-EState/2.0'¶
-
num_bits
= 79¶
-
-
class
chemfp.cdk_types.
CDKExtendedFingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.cdk_types.VariableSizeFingerprint
CDK’s Extended fingerprint, version 2.0
The CDK-Extended/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- searchDepth - maximum path length (default: 7)
- pathLimit - maximum number of paths (default: 42000)
- hashPseudoAtoms - include pseudo-atoms in the hash calculation (default: 0)
-
name
= 'CDK-Extended/2.0'¶
-
class
chemfp.cdk_types.
CDKHybridizationFingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.cdk_types.VariableSizeFingerprint
CDK’s Hybridization fingerprint, version 2.0
The CDK-Hybridization/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- searchDepth - maximum path length (default: 7)
- pathLimit - maximum number of paths (default: 42000)
- hashPseudoAtoms - include pseudo-atoms in the hash calculation (default: 0)
-
name
= 'CDK-Hybridization/2.0'¶
-
class
chemfp.cdk_types.
CDKKlekotaRothFingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.types.NoFingerprintParametersMixin
,chemfp.cdk_types.FixedSizeFingerprint
CDK’s implementation of the Klekota-Roth fingerprints
The CDK-KlekotaRoth/2.0 fingerprints have no parameters.
-
name
= 'CDK-KlekotaRoth/2.0'¶
-
num_bits
= 4860¶
-
-
class
chemfp.cdk_types.
CDKPubchemFingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.types.NoFingerprintParametersMixin
,chemfp.cdk_types.FixedSizeFingerprint
CDK’s implementation of the PubChem fingerprints
The CDK-Pubchem/2.0 fingerprints have no parameters.
-
name
= 'CDK-Pubchem/2.0'¶
-
num_bits
= 881¶
-
-
class
chemfp.cdk_types.
CDKSubstructureFingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.types.NoFingerprintParametersMixin
,chemfp.cdk_types.FixedSizeFingerprint
CDK’s Substructure fingerprints
The CDK-Substructure/2.0 fingerprints have no parameters.
-
name
= 'CDK-Substructure/2.0'¶
-
num_bits
= 307¶
-
-
class
chemfp.cdk_types.
CDKShortestPathFingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.cdk_types.VariableSizeFingerprint
CDK’s ShortestPath fingerprint, version 2.0
The CDK-ShortestPath/2.0 FingerprintType parameter is:
- size - the number of bits in the fingerprint (default: 1024)
This is generated by CDK versions older than 2.7.
-
fingerprinter_can_fail
= True¶
-
name
= 'CDK-ShortestPath/2.0'¶
-
class
chemfp.cdk_types.
CDKShortestPathFingerprintType_v27
(fingerprint_kwargs)¶ Bases:
chemfp.cdk_types.VariableSizeFingerprint
CDK’s ShortestPath fingerprint, version 2.7
The CDK-ShortestPath/2.7 FingerprintType parameter is:
- size - the number of bits in the fingerprint (default: 1024)
This version is new in CDK 2.7, where the internal PRNG was changed from a Mersenne Twister to XorShift, resulting in a different bit pattern.
-
fingerprinter_can_fail
= True¶
-
name
= 'CDK-ShortestPath/2.7'¶
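Because the two ShortestPath versions produce different bit patterns, fingerprints should only be compared when they share a type name. As a minimal sketch, the hypothetical helper below maps a CDK version string to the type name described above. It assumes a plain dotted numeric version such as "2.7.1" and is not how chemfp itself selects the type.

```python
def shortest_path_type_name(cdk_version):
    """Hypothetical helper: pick the ShortestPath type name for a CDK version."""
    # Compare only the major and minor components, as integers.
    major_minor = tuple(int(part) for part in cdk_version.split(".")[:2])
    if major_minor >= (2, 7):
        return "CDK-ShortestPath/2.7"
    return "CDK-ShortestPath/2.0"


old_name = shortest_path_type_name("2.3")
new_name = shortest_path_type_name("2.7.1")
```

Comparing integer tuples rather than strings keeps a version such as "2.10" on the newer side of the 2.7 boundary.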
-
class
chemfp.cdk_types.
CDKECFP0FingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the ECFP0 fingerprint, version 2.0
The CDK-ECFP0/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates
-
name
= 'CDK-ECFP0/2.0'¶
-
class
chemfp.cdk_types.
CDKECFP2FingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the ECFP2 fingerprint, version 2.0
The CDK-ECFP2/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates
-
name
= 'CDK-ECFP2/2.0'¶
-
class
chemfp.cdk_types.
CDKECFP4FingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the ECFP4 fingerprint, version 2.0
The CDK-ECFP4/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates
-
name
= 'CDK-ECFP4/2.0'¶
-
class
chemfp.cdk_types.
CDKECFP6FingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the ECFP6 fingerprint, version 2.0
The CDK-ECFP6/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates
-
name
= 'CDK-ECFP6/2.0'¶
-
class
chemfp.cdk_types.
CDKFCFP0FingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the FCFP0 fingerprint, version 2.0
The CDK-FCFP0/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates
-
name
= 'CDK-FCFP0/2.0'¶
-
class
chemfp.cdk_types.
CDKFCFP2FingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the FCFP2 fingerprint, version 2.0
The CDK-FCFP2/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates
-
name
= 'CDK-FCFP2/2.0'¶
-
class
chemfp.cdk_types.
CDKFCFP4FingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the FCFP4 fingerprint, version 2.0
The CDK-FCFP4/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates
-
name
= 'CDK-FCFP4/2.0'¶
-
class
chemfp.cdk_types.
CDKFCFP6FingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the FCFP6 fingerprint, version 2.0
The CDK-FCFP6/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates
-
name
= 'CDK-FCFP6/2.0'¶
-
class
chemfp.cdk_types.
CDKAtomPairs2DFingerprintType_v20
(fingerprint_kwargs)¶ Bases:
chemfp.types.NoFingerprintParametersMixin
,chemfp.cdk_types.FixedSizeFingerprint
CDK’s implementation of the atom pairs fingerprint from Yap Chun Wei’s PaDEL.
The CDK-AtomPairs2D/2.0 FingerprintType takes no parameters.
-
name
= 'CDK-AtomPairs2D/2.0'¶
-
num_bits
= 780¶
-
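Since the methods on this page return fingerprints as byte strings, the num_bits values of the fixed-size types translate into byte lengths by rounding up to a whole byte. The sketch below works this out for the values listed on this page. The name `num_bytes` is illustrative, and the round-up rule is the usual packing assumption rather than something stated here.

```python
# num_bits values copied from the fixed-size types on this page.
FIXED_SIZE_NUM_BITS = {
    "CDK-MACCS/2.0": 166,
    "CDK-EState/2.0": 79,
    "CDK-KlekotaRoth/2.0": 4860,
    "CDK-Pubchem/2.0": 881,
    "CDK-Substructure/2.0": 307,
    "CDK-AtomPairs2D/2.0": 780,
}


def num_bytes(num_bits):
    """Smallest number of whole bytes that can hold num_bits bits."""
    return (num_bits + 7) // 8


byte_sizes = {name: num_bytes(bits) for name, bits in FIXED_SIZE_NUM_BITS.items()}
```

For example, the 166 MACCS keys need 21 bytes, which leaves two unused padding bits in the last byte.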