chemfp.cdk_types module

class chemfp.cdk_types.CDKBaseFingerprintType(fingerprint_kwargs)
Bases: chemfp.types.ThreadsafeFingerprinterMixin, chemfp.types.FingerprintType

fingerprinter_can_fail = True
A CDK fingerprinter can raise an exception if the molecule isn’t correctly prepared.

from_inchi(content: str, *, delimiter: Optional[Literal[to_eol, space, tab, comma, whitespace, native, ' ', '\t'], None] = 'to-eol', errors: str = 'strict')
Generate a fingerprint from an InChI string and id
This is equivalent to calling:
    mol = fptype.toolkit.from_inchi(content, ..., errors=errors)
    fp = fptype.from_mol(mol) if (mol is not None) else None
Parameters:
- delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: "to-eol")) – The separator between the InChI and the id
- errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns: a fingerprint byte string
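
The "equivalent to calling" pattern above recurs for every from_* method on this page. The following is a minimal sketch of that two-step flow using stand-in objects, so that it runs without chemfp or the CDK installed. StubToolkit and StubFingerprintType are hypothetical names, and the behaviour shown for errors="ignore" and errors="log" (return None instead of raising) is an assumption based on the `if (mol is not None) else None` expression, not something this page states outright.

```python
import logging


class StubToolkit:
    """Hypothetical stand-in for fptype.toolkit; this is not chemfp code."""

    def from_inchi(self, content, errors="strict"):
        if content.startswith("InChI="):
            return {"inchi": content}  # pretend this dict is a molecule
        if errors == "strict":
            raise ValueError(f"cannot parse {content!r}")
        if errors == "log":
            logging.getLogger("sketch").warning("skipping %r", content)
        return None  # assumed: "ignore" and "log" both yield no molecule


class StubFingerprintType:
    """Hypothetical stand-in for a fingerprint type object."""

    toolkit = StubToolkit()

    def from_mol(self, mol):
        # Fake two-byte "fingerprint" derived from the input length.
        return len(mol["inchi"]).to_bytes(2, "little")

    def from_inchi(self, content, *, errors="strict"):
        # The same two steps the reference text describes.
        mol = self.toolkit.from_inchi(content, errors=errors)
        return self.from_mol(mol) if (mol is not None) else None


fptype = StubFingerprintType()
good_fp = fptype.from_inchi("InChI=1S/CH4/h1H4")
bad_fp = fptype.from_inchi("not an inchi", errors="ignore")
```

With this shape, callers that pass a non-strict errors value need to check for None before using the result.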

from_inchistring(content: str, *, delimiter: Optional[Literal[to_eol, space, tab, comma, whitespace, native, ' ', '\t'], None] = 'to-eol', errors: str = 'strict')
Generate a fingerprint from an InChI string
This is equivalent to calling:
    mol = fptype.toolkit.from_inchistring(content, ..., errors=errors)
    fp = fptype.from_mol(mol) if (mol is not None) else None
Parameters:
- delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: "to-eol")) – The separator between the InChI and the id
- errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns: a fingerprint byte string

from_molfile(content: str, *, ForceReadAs3DCoordinates: bool = False, mode: Literal[RELAXED, STRICT] = 'RELAXED', AddStereoElements: bool = True, InterpretHydrogenIsotopes: bool = True, implementation: Optional[Literal[cdk, chemfp], None] = 'cdk', errors: str = 'strict')
Generate a fingerprint from a molfile
This is equivalent to calling:
    mol = fptype.toolkit.from_molfile(content, ..., errors=errors)
    fp = fptype.from_mol(mol) if (mol is not None) else None
Parameters:
- ForceReadAs3DCoordinates (Boolean (default: False)) – if true, always interpret coordinates as 3D
- mode ('RELAXED' will attempt to recover, 'STRICT' will not) – strictness mode when parsing a record
- AddStereoElements (Boolean (default: True)) – if true, detect and create IStereoElements
- InterpretHydrogenIsotopes (Boolean (default: True)) – if true, interpret D and T as hydrogen isotopes
- implementation (either 'cdk' or 'chemfp') – use CDK or chemfp to identify records
- errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns: a fingerprint byte string

from_sdf(content: str, *, ForceReadAs3DCoordinates: bool = False, mode: Literal[RELAXED, STRICT] = 'RELAXED', AddStereoElements: bool = True, InterpretHydrogenIsotopes: bool = True, implementation: Optional[Literal[cdk, chemfp], None] = 'cdk', errors: str = 'strict')
Generate a fingerprint from an SDF record
This is equivalent to calling:
    mol = fptype.toolkit.from_sdf(content, ..., errors=errors)
    fp = fptype.from_mol(mol) if (mol is not None) else None
Parameters:
- ForceReadAs3DCoordinates (Boolean (default: False)) – if true, always interpret coordinates as 3D
- mode ('RELAXED' will attempt to recover, 'STRICT' will not) – strictness mode when parsing a record
- AddStereoElements (Boolean (default: True)) – if true, detect and create IStereoElements
- InterpretHydrogenIsotopes (Boolean (default: True)) – if true, interpret D and T as hydrogen isotopes
- implementation (either 'cdk' or 'chemfp') – use CDK or chemfp to identify records
- errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns: a fingerprint byte string

from_smi(content: str, *, has_header: bool = False, delimiter: Optional[Literal[to_eol, space, tab, comma, whitespace, native, ' ', '\t'], None] = 'to-eol', implementation: Optional[Literal[cdk, chemfp], None] = 'cdk', kekulise: bool = True, errors: str = 'strict')
Generate a fingerprint from a SMILES string and id
This is equivalent to calling:
    mol = fptype.toolkit.from_smi(content, ..., errors=errors)
    fp = fptype.from_mol(mol) if (mol is not None) else None
Parameters:
- has_header (Boolean (default: False)) – If true, treat the first line of the SMILES file as a header
- delimiter (One of None, 'to_eol', 'space', 'tab', 'comma', 'whitespace', 'native', or the space or tab characters (default: "to-eol")) – The separator between the SMILES and the id
- implementation (either 'cdk' or 'chemfp') – use CDK or chemfp to identify records
- kekulise (Boolean (default: True)) – if true, ensure a valid Kekule interpretation exists
- errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns: a fingerprint byte string
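
The delimiter option controls where the SMILES ends and the id begins. The sketch below illustrates one plausible reading of the option names; it is not chemfp's parser. split_smiles_record is a hypothetical helper, the meaning given to each name is an assumption, None is treated like the default here, and the toolkit-specific 'native' style is deliberately left out.

```python
def split_smiles_record(line, delimiter="to-eol"):
    """Split one SMILES-file line into (smiles, id or None).

    Hypothetical helper; the per-option semantics are assumptions.
    """
    line = line.rstrip("\r\n")
    if delimiter in (None, "to-eol"):
        # Assumed: the id is everything after the first run of whitespace.
        parts = line.split(None, 1)
    elif delimiter == "whitespace":
        # Assumed: the id is the second whitespace-separated field.
        parts = line.split()[:2]
    elif delimiter in ("tab", "\t"):
        parts = line.split("\t")[:2]
    elif delimiter in ("space", " "):
        parts = line.split(" ")[:2]
    elif delimiter == "comma":
        parts = line.split(",")[:2]
    else:
        raise ValueError(f"delimiter {delimiter!r} is not covered by this sketch")
    smiles = parts[0] if parts else ""
    ident = parts[1] if len(parts) > 1 else None
    return smiles, ident


record = split_smiles_record("CCO ethyl alcohol\n")
```

The practical difference shows up with ids that contain spaces: under the "to-eol" reading the whole remainder is kept as the id, while under the "whitespace" reading only its first word is.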

from_smiles(content: str, *, kekulise: bool = True, errors: str = 'strict')
Generate a fingerprint from a SMILES string
This is equivalent to calling:
    mol = fptype.toolkit.from_smistring(content, ..., errors=errors)
    fp = fptype.from_mol(mol) if (mol is not None) else None
Parameters:
- kekulise (Boolean (default: True)) – if true, ensure a valid Kekule interpretation exists
- errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns: a fingerprint byte string

from_smistring(content: str, *, kekulise: bool = True, errors: str = 'strict')
Generate a fingerprint from a SMILES string
This is equivalent to calling:
    mol = fptype.toolkit.from_smistring(content, ..., errors=errors)
    fp = fptype.from_mol(mol) if (mol is not None) else None
Parameters:
- kekulise (Boolean (default: True)) – if true, ensure a valid Kekule interpretation exists
- errors (one of "strict", "ignore", or "log") – specify how to handle errors
Returns: a fingerprint byte string

software = ...
A description of the CDK and chemfp versions.

toolkit = <module 'chemfp.cdk_toolkit'>
A reference to the cdk_toolkit module.

class chemfp.cdk_types.CDKDaylightFingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.cdk_types.VariableSizeFingerprint
CDK’s Daylight-like fingerprint, version 2.0
The CDK-Daylight/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- searchDepth - maximum path length (default: 7)
- pathLimit - maximum number of paths (default: 42000)
- hashPseudoAtoms - include pseudo-atoms in the hash calculation (default: 0)

name = 'CDK-Daylight/2.0'
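
The name and parameters listed above together identify how a fingerprint was generated. As a sketch, assuming a "name key=value ..." type-string layout (the convention chemfp uses for type strings as far as I know; this page does not show one), the defaults can be combined with overrides and parsed back as follows. format_type_string and parse_type_string are hypothetical helpers, and the parameter order is an assumption.

```python
# Defaults copied from the CDK-Daylight/2.0 parameter list above.
DAYLIGHT_DEFAULTS = {
    "size": 1024,
    "searchDepth": 7,
    "pathLimit": 42000,
    "hashPseudoAtoms": 0,
}


def format_type_string(name, defaults, **overrides):
    """Build 'name key=value ...' from defaults plus overrides (hypothetical)."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    params = {**defaults, **overrides}  # keeps the order of `defaults`
    return " ".join([name] + [f"{key}={value}" for key, value in params.items()])


def parse_type_string(type_string):
    """Split 'name key=value ...' into (name, dict of integer parameters)."""
    name, *terms = type_string.split()
    params = {}
    for term in terms:
        key, _, value = term.partition("=")
        params[key] = int(value)
    return name, params


type_string = format_type_string("CDK-Daylight/2.0", DAYLIGHT_DEFAULTS, size=2048)
name, params = parse_type_string(type_string)
```

Rejecting unknown keys early mirrors the idea that each fingerprint type accepts only its own documented parameters.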

class chemfp.cdk_types.CDKGraphOnlyFingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.cdk_types.VariableSizeFingerprint
CDK’s GraphOnly fingerprint, version 2.0
The CDK-GraphOnly/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- searchDepth - maximum path length (default: 7)
- pathLimit - maximum number of paths (default: 42000)
- hashPseudoAtoms - include pseudo-atoms in the hash calculation (default: 0)

name = 'CDK-GraphOnly/2.0'

class chemfp.cdk_types.CDKMACCSFingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.types.NoFingerprintParametersMixin, chemfp.cdk_types.FixedSizeFingerprint
CDK’s implementation of the 166 MACCS keys, version 2.0
The CDK-MACCS/2.0 fingerprints have no parameters.

name = 'CDK-MACCS/2.0'

num_bits = 166
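
Because the from_* methods return a fingerprint byte string, a fixed num_bits implies a fixed byte length. A small sketch of that arithmetic, assuming the usual round-up-to-whole-bytes packing (this page does not state the packing rule), applied to the num_bits values of the fixed-size types on this page:

```python
def num_bytes_for(num_bits):
    """Smallest number of whole bytes that can hold num_bits bits."""
    return (num_bits + 7) // 8


# num_bits values as listed for the fixed-size CDK fingerprint types.
FIXED_SIZES = {
    "CDK-MACCS/2.0": 166,
    "CDK-EState/2.0": 79,
    "CDK-KlekotaRoth/2.0": 4860,
    "CDK-Pubchem/2.0": 881,
    "CDK-Substructure/2.0": 307,
    "CDK-AtomPairs2D/2.0": 780,
}

byte_lengths = {name: num_bytes_for(bits) for name, bits in FIXED_SIZES.items()}
```

Under that assumption a 166-bit MACCS fingerprint occupies 21 bytes, with the unused high bits of the last byte left as padding.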

class chemfp.cdk_types.CDKEStateFingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.types.NoFingerprintParametersMixin, chemfp.cdk_types.FixedSizeFingerprint
CDK’s implementation of the EState fingerprint, version 2.0
The CDK-EState/2.0 fingerprints have no parameters.

name = 'CDK-EState/2.0'

num_bits = 79

class chemfp.cdk_types.CDKExtendedFingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.cdk_types.VariableSizeFingerprint
CDK’s Extended fingerprint, version 2.0
The CDK-Extended/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- searchDepth - maximum path length (default: 7)
- pathLimit - maximum number of paths (default: 42000)
- hashPseudoAtoms - include pseudo-atoms in the hash calculation (default: 0)

name = 'CDK-Extended/2.0'

class chemfp.cdk_types.CDKHybridizationFingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.cdk_types.VariableSizeFingerprint
CDK’s Hybridization fingerprint, version 2.0
The CDK-Hybridization/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- searchDepth - maximum path length (default: 7)
- pathLimit - maximum number of paths (default: 42000)
- hashPseudoAtoms - include pseudo-atoms in the hash calculation (default: 0)

name = 'CDK-Hybridization/2.0'

class chemfp.cdk_types.CDKKlekotaRothFingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.types.NoFingerprintParametersMixin, chemfp.cdk_types.FixedSizeFingerprint
CDK’s implementation of the Klekota-Roth fingerprints
The CDK-KlekotaRoth/2.0 fingerprints have no parameters.

name = 'CDK-KlekotaRoth/2.0'

num_bits = 4860

class chemfp.cdk_types.CDKPubchemFingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.types.NoFingerprintParametersMixin, chemfp.cdk_types.FixedSizeFingerprint
CDK’s implementation of the PubChem fingerprints
The CDK-Pubchem/2.0 fingerprints have no parameters.

name = 'CDK-Pubchem/2.0'

num_bits = 881

class chemfp.cdk_types.CDKSubstructureFingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.types.NoFingerprintParametersMixin, chemfp.cdk_types.FixedSizeFingerprint
CDK’s Substructure fingerprints
The CDK-Substructure/2.0 fingerprints have no parameters.

name = 'CDK-Substructure/2.0'

num_bits = 307

class chemfp.cdk_types.CDKShortestPathFingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.cdk_types.VariableSizeFingerprint
CDK’s ShortestPath fingerprint, version 2.0
The CDK-ShortestPath/2.0 FingerprintType parameter is:
- size - the number of bits in the fingerprint (default: 1024)
This is generated by CDK versions older than 2.7.

fingerprinter_can_fail = True

name = 'CDK-ShortestPath/2.0'

class chemfp.cdk_types.CDKShortestPathFingerprintType_v27(fingerprint_kwargs)
Bases: chemfp.cdk_types.VariableSizeFingerprint
CDK’s ShortestPath fingerprint, version 2.7
The CDK-ShortestPath/2.7 FingerprintType parameter is:
- size - the number of bits in the fingerprint (default: 1024)
This version is new in CDK 2.7, where the internal PRNG was changed from a Mersenne Twister to XorShift, resulting in a different bit pattern.

fingerprinter_can_fail = True

name = 'CDK-ShortestPath/2.7'
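
To see why a PRNG change alone yields a different bit pattern, consider a toy scheme that seeds a generator with a feature hash and uses its first output to pick a bit. The sketch below is not CDK's hashing code: the seeding scheme, the function names, and the xorshift shift constants (13, 7, 17) are all assumptions chosen for illustration. Python's random.Random is a Mersenne Twister, which makes it a convenient stand-in for the older behaviour.

```python
import random

MASK64 = (1 << 64) - 1


def xorshift64(state):
    """One step of a 64-bit xorshift generator (illustrative constants)."""
    state ^= (state << 13) & MASK64
    state ^= state >> 7
    state ^= (state << 17) & MASK64
    return state & MASK64


def bit_from_mersenne(feature_hash, size=1024):
    # Seed a Mersenne Twister with the feature hash and draw one bit index.
    return random.Random(feature_hash).randrange(size)


def bit_from_xorshift(feature_hash, size=1024):
    # Seed xorshift with the same hash; zero is not a valid xorshift state.
    state = (feature_hash & MASK64) or 1
    return xorshift64(state) % size


# The same feature hashes land on different bits under the two generators.
mismatches = [
    h for h in range(1, 200)
    if bit_from_mersenne(h) != bit_from_xorshift(h)
]
```

Since the same structural feature ends up setting a different bit, fingerprints generated as CDK-ShortestPath/2.0 and CDK-ShortestPath/2.7 are best treated as distinct types rather than compared directly.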

class chemfp.cdk_types.CDKECFP0FingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the ECFP0 fingerprint, version 2.0
The CDK-ECFP0/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-ECFP0/2.0'

class chemfp.cdk_types.CDKECFP2FingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the ECFP2 fingerprint, version 2.0
The CDK-ECFP2/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-ECFP2/2.0'

class chemfp.cdk_types.CDKECFP4FingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the ECFP4 fingerprint, version 2.0
The CDK-ECFP4/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-ECFP4/2.0'

class chemfp.cdk_types.CDKECFP6FingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the ECFP6 fingerprint, version 2.0
The CDK-ECFP6/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-ECFP6/2.0'

class chemfp.cdk_types.CDKFCFP0FingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the FCFP0 fingerprint, version 2.0
The CDK-FCFP0/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-FCFP0/2.0'

class chemfp.cdk_types.CDKFCFP2FingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the FCFP2 fingerprint, version 2.0
The CDK-FCFP2/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-FCFP2/2.0'

class chemfp.cdk_types.CDKFCFP4FingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the FCFP4 fingerprint, version 2.0
The CDK-FCFP4/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-FCFP4/2.0'

class chemfp.cdk_types.CDKFCFP6FingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.cdk_types.VariableSizeFingerprint
CDK’s implementation of the FCFP6 fingerprint, version 2.0
The CDK-FCFP6/2.0 FingerprintType parameters are:
- size - the number of bits in the fingerprint (default: 1024)
- perceiveStereochemistry - if True, re-perceive stereochemistry from 2D/3D coordinates

name = 'CDK-FCFP6/2.0'
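
The eight circular fingerprint types above differ only in the family (ECFP or FCFP) and the digit in the name. A short sketch of pulling those pieces out of a name string follows. parse_circular_name is a hypothetical helper, and reading the digit as a diameter (so that radius = diameter // 2) follows the common ECFP naming convention rather than anything stated on this page.

```python
import re

# Matches names such as 'CDK-ECFP4/2.0' or 'CDK-FCFP6/2.0'.
_CIRCULAR_NAME = re.compile(r"CDK-(ECFP|FCFP)(\d)/(\d+\.\d+)")


def parse_circular_name(name):
    """Split a circular fingerprint name into its parts (hypothetical helper)."""
    match = _CIRCULAR_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a CDK circular fingerprint name: {name!r}")
    diameter = int(match.group(2))
    return {
        "family": match.group(1),
        "diameter": diameter,
        "radius": diameter // 2,  # assumed convention: digit is the diameter
        "version": match.group(3),
    }


info = parse_circular_name("CDK-FCFP6/2.0")
```

Under that convention ECFP4 corresponds to what radius-based toolkits describe as radius 2.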

class chemfp.cdk_types.CDKAtomPairs2DFingerprintType_v20(fingerprint_kwargs)
Bases: chemfp.types.NoFingerprintParametersMixin, chemfp.cdk_types.FixedSizeFingerprint
CDK’s implementation of the atom pairs fingerprint from Yap Chun Wei’s PaDEL.
The CDK-AtomPairs2D/2.0 FingerprintType takes no parameters.

name = 'CDK-AtomPairs2D/2.0'

num_bits = 780