chemfp.text_records module

Record types used by the text toolkit.

This is an internal chemfp module. It should not be imported by programs which use the public API. (Let me know if anything else should be part of the public API.)

This module contains class definitions for the record objects returned by the text_toolkit. The record objects themselves are part of the public API.

class chemfp.text_records.TextRecord

Bases: object

Base class for the text_toolkit ‘molecules’, which work with the records as text.

The chemfp.text_toolkit implements the toolkit API, but it doesn’t know chemistry. Instead of returning real molecule objects, with atoms and bonds, it returns TextRecord subclass instances that hold the record as a text string.

As an implementation detail (which means it’s subject to change) there is a subclass for each of the supported formats.

  • SDFRecord - holds “sdf” records
  • SmiRecord - holds “smi” records (the full line from a “smi” SMILES file)
  • CanRecord - holds “can” records (the full line from a “can” SMILES file)
  • UsmRecord - holds “usm” records (the full line from a “usm” SMILES file)
  • SmiStringRecord - holds “smistring” records (only the “smistring” SMILES string; no id)
  • CanStringRecord - holds “canstring” records (only the “canstring” SMILES string; no id)
  • UsmStringRecord - holds “usmstring” records (only the “usmstring” SMILES string; no id)

All of the classes have the following attributes:

id

The record identifier as a Unicode string, or None if there is no identifier

id_bytes

The record identifier as a byte string, or None if there is no identifier

record

The record, as a string. For the smistring, canstring, and usmstring formats, this is only the SMILES string.

record_format

One of “sdf”, “smi”, “can”, “usm”, “smistring”, “canstring”, or “usmstring”.

The SMILES classes have an attribute:

smiles

The SMILES string component of the record.
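
As a rough illustration of how these attributes relate for a “smi” record, the sketch below uses a hypothetical stand-in class. ToySmiRecord and parse_smi_line are invented for this example and are not part of chemfp. It assumes the SMILES is the first whitespace-delimited field and the rest of the line is the identifier; chemfp’s own delimiter handling may differ.

```python
from typing import NamedTuple, Optional


class ToySmiRecord(NamedTuple):
    """Hypothetical stand-in for illustration; not chemfp's SmiRecord."""
    id: Optional[str]          # identifier, or None if the line has none
    record: str                # the full line, kept as text
    smiles: str                # only the SMILES component
    record_format: str = "smi"


def parse_smi_line(line: str) -> ToySmiRecord:
    # Assumption: first whitespace-delimited field is the SMILES,
    # and whatever follows on the line is the identifier.
    fields = line.rstrip("\n").split(None, 1)
    smiles = fields[0] if fields else ""
    record_id = fields[1] if len(fields) > 1 else None
    return ToySmiRecord(id=record_id, record=line, smiles=smiles)


rec = parse_smi_line("c1ccccc1O phenol\n")
no_id = parse_smi_line("CCO\n")
```

Here rec.record keeps the whole line, rec.smiles holds only the SMILES field, and no_id.id is None because the line has no identifier.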

add_tag(tag, value)

Add an SD tag value to the TextRecord

This method does nothing if the record is not an “sdf” record.

Parameters:
  • tag (string) – the SD tag name
  • value (string) – the text for the tag
Returns:

None
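
To give a sense of what adding a tag involves at the text level, here is a minimal sketch that appends one data item to an SD record string. The helper add_sd_tag is hypothetical and is not chemfp’s implementation; unlike TextRecord.add_tag, which returns None, it returns a new string. It assumes the record ends with the standard “$$$$” terminator line, and it returns the text unchanged when no terminator is found, to mirror the documented no-op for non-“sdf” records.

```python
def add_sd_tag(record: str, tag: str, value: str) -> str:
    """Return a copy of an SD record with one data item added before "$$$$".

    Hypothetical helper for illustration only.
    """
    head, terminator, tail = record.rpartition("$$$$")
    if not terminator:
        # No "$$$$" line: treat as "not an sdf record" and do nothing.
        return record
    # An SD data item is a "> <NAME>" header line, the value, then a blank line.
    return f"{head}> <{tag}>\n{value}\n\n{terminator}{tail}"


sd_record = (
    "methane\n"
    "  toy\n"
    "\n"
    "  1  0  0  0  0  0  0  0  0  0999 V2000\n"
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
    "M  END\n"
    "$$$$\n"
)

tagged = add_sd_tag(sd_record, "MW", "16.04")
untouched = add_sd_tag("CCO ethanol\n", "MW", "46.07")
```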

copy()

Return a new record which is a copy of the given record

get_tag(tag)

Get the named SD tag value, or None if the tag doesn’t exist or the record is not an “sdf” record.

Parameters: tag (byte or Unicode string) – the SD tag name
Returns: a Unicode string, or None
get_tag_as_bytes(tag)

Get the named SD tag value, or None if the tag doesn’t exist or the record is not an “sdf” record.

Parameters: tag (byte string) – the SD tag name
Returns: a byte string, or None
get_tag_pairs()

Get a list of all SD tag (name, value) pairs for the TextRecord using Unicode strings

This function returns an empty list if the record is not an “sdf” record.

Returns: a list of (Unicode string name, Unicode string value) pairs
get_tag_pairs_as_bytes()

Get a list of all SD tag (name, value) pairs for the TextRecord using byte strings

This function returns an empty list if the record is not an “sdf” record.

Returns: a list of (byte string name, byte string value) pairs
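
The tag accessors above all read the SD data items that follow the connection table. As a rough illustration of that step, the sketch below pulls (name, value) pairs out of an SD record held as text. The functions sd_tag_pairs and sd_tag are hypothetical and are not chemfp’s implementation. The sketch assumes a V2000-style record where the data items start after the “M  END” line, and it returns the first match when a tag name is repeated, which is an assumption rather than documented chemfp behavior.

```python
import re
from typing import List, Optional, Tuple

# A data item header looks like "> <NAME>", possibly with extra fields.
_TAG_HEADER = re.compile(r"^>.*?<([^>]*)>")


def sd_tag_pairs(record: str) -> List[Tuple[str, str]]:
    """Return the (name, value) pairs found after "M  END" (sketch only)."""
    pairs: List[Tuple[str, str]] = []
    lines = iter(record.splitlines())
    # Skip the connection table; a non-SD record has no "M  END" line,
    # so the iterator is exhausted and the result stays empty.
    for line in lines:
        if line.startswith("M  END"):
            break
    name: Optional[str] = None
    value_lines: List[str] = []
    for line in lines:
        if line.startswith("$$$$"):
            break
        if name is None:
            match = _TAG_HEADER.match(line)
            if match:
                name = match.group(1)
                value_lines = []
        elif line == "":
            # A blank line closes the current data item.
            pairs.append((name, "\n".join(value_lines)))
            name = None
        else:
            value_lines.append(line)
    if name is not None:
        # Data item that was not closed by a blank line.
        pairs.append((name, "\n".join(value_lines)))
    return pairs


def sd_tag(record: str, tag: str) -> Optional[str]:
    """Return the first value stored under tag, or None if it is absent."""
    for name, value in sd_tag_pairs(record):
        if name == tag:
            return value
    return None


example = "methane\n  toy\n\nM  END\n> <MW>\n16.04\n\n> <NAME>\nmethane\n\n$$$$\n"
pairs = sd_tag_pairs(example)
```

With this input, sd_tag(example, "MW") gives the stored text value, and a SMILES line such as "CCO ethanol" yields an empty list, in line with the behavior documented for records that are not “sdf” records.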
class chemfp.text_records.SDFRecord(id_bytes, record, encoding, encoding_errors)

Bases: chemfp.text_records.TextRecord

Holds an SDF record. See TextRecord for API details

add_tag(tag, value)

Add an SD tag value to the TextRecord

This method does nothing if the record is not an “sdf” record.

Parameters:
  • tag (string) – the SD tag name
  • value (string) – the text for the tag
Returns:

None

copy()

Return a new record which is a copy of the given record

encoding
encoding_errors
get_tag(tag)

Get the named SD tag value, or None if the tag doesn’t exist or the record is not an “sdf” record.

Parameters: tag (byte or Unicode string) – the SD tag name
Returns: a Unicode string, or None
get_tag_as_bytes(tag)

Get the named SD tag value, or None if the tag doesn’t exist or the record is not an “sdf” record.

Parameters: tag (byte string) – the SD tag name
Returns: a byte string, or None
get_tag_pairs(encoding=None, encoding_errors=None)

Get a list of all SD tag (name, value) pairs for the TextRecord using Unicode strings

This function returns an empty list if the record is not an “sdf” record.

Returns: a list of (Unicode string name, Unicode string value) pairs
get_tag_pairs_as_bytes()

Get a list of all SD tag (name, value) pairs for the TextRecord using byte strings

This function returns an empty list if the record is not an “sdf” record.

Returns: a list of (byte string name, byte string value) pairs
id
id_bytes
record
record_format = 'sdf'
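
The encoding and encoding_errors attributes suggest that the Unicode accessors are decoded views of the byte-string ones. The sketch below models only that decoding step with a hypothetical stand-in class; ToySDFTags is not part of chemfp. It assumes that passing None to get_tag_pairs falls back to the values stored on the record, and the "utf8" and "strict" defaults are likewise assumptions made for this example.

```python
from typing import List, Optional, Tuple


class ToySDFTags:
    """Hypothetical stand-in that models only the tag-decoding step."""

    def __init__(self, pairs_as_bytes: List[Tuple[bytes, bytes]],
                 encoding: str = "utf8", encoding_errors: str = "strict"):
        self._pairs = list(pairs_as_bytes)
        self.encoding = encoding
        self.encoding_errors = encoding_errors

    def get_tag_pairs_as_bytes(self) -> List[Tuple[bytes, bytes]]:
        return list(self._pairs)

    def get_tag_pairs(self, encoding: Optional[str] = None,
                      encoding_errors: Optional[str] = None) -> List[Tuple[str, str]]:
        # Assumption: None means "use the value stored on the record".
        if encoding is None:
            encoding = self.encoding
        if encoding_errors is None:
            encoding_errors = self.encoding_errors
        return [
            (name.decode(encoding, encoding_errors),
             value.decode(encoding, encoding_errors))
            for name, value in self._pairs
        ]


# A tag value stored as Latin-1 bytes ("caf" followed by e-acute).
tags = ToySDFTags([(b"NAME", b"caf\xe9")], encoding="latin1")

as_latin1 = tags.get_tag_pairs()
as_utf8_replaced = tags.get_tag_pairs(encoding="utf8", encoding_errors="replace")
```

Decoding the same bytes as UTF-8 with "strict" error handling raises UnicodeDecodeError, which is why the error-handling setting matters for records whose encoding is uncertain.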
class chemfp.text_records.BaseSmiRecord(id, record, smiles, encoding, encoding_errors)

Bases: chemfp.text_records.TextRecord

The base record type for SMILES files

copy()

Return a new record which is a copy of the given record

encoding
encoding_errors
id
record
smiles
class chemfp.text_records.SmiRecord(id, record, smiles, encoding, encoding_errors)

Bases: chemfp.text_records.BaseSmiRecord

Holds a “smi” record. See TextRecord for API details

record_format = 'smi'
class chemfp.text_records.CanRecord(id, record, smiles, encoding, encoding_errors)

Bases: chemfp.text_records.BaseSmiRecord

Holds a “can” record. See TextRecord for API details

record_format = 'can'
class chemfp.text_records.UsmRecord(id, record, smiles, encoding, encoding_errors)

Bases: chemfp.text_records.BaseSmiRecord

Holds a “usm” record. See TextRecord for API details

record_format = 'usm'
class chemfp.text_records.SmiStringRecord(id, record, smiles)

Bases: chemfp.text_records._SmiStringRecord

Holds a “smistring” record. See TextRecord for API details

record_format = 'smistring'
class chemfp.text_records.CanStringRecord(id, record, smiles)

Bases: chemfp.text_records._SmiStringRecord

Holds a “canstring” record. See TextRecord for API details

record_format = 'canstring'
class chemfp.text_records.UsmStringRecord(id, record, smiles)

Bases: chemfp.text_records._SmiStringRecord

Holds a “usmstring” record. See TextRecord for API details

record_format = 'usmstring'