chemfp maxmin command-line options
The following comes from chemfp maxmin --help:
usage: chemfp maxmin [-h] [--num-picks N] [-t FLOAT] [--all-equal]
[--pick-id PICK_ID] [--pick-index PICK_INDEX]
[--in CANDIDATES_FORMAT] [--references FILENAME]
[--references-format FORMAT]
[--randomize | --no-randomize] [--seed N]
[--neighbors FILENAME] [--neighbors-format FILENAME]
[--mmap | --no-mmap] [--output FILENAME]
[--out OUTPUT_FORMAT] [--precision N]
[--save-picks FILENAME] [--save-picks-format FILENAME]
[--save-candidates FILENAME]
[--save-candidates-format FILENAME]
[--pick-time | --no-pick-time] [--no-date] [--date STR]
[--times] [--progress | --no-progress]
candidates
Select diverse fingerprints using the MaxMin algorithm
positional arguments:
candidates fingerprint file containing candidates (fingerprints
to pick from)
options:
-h, --help show this help message and exit
--num-picks N, -n N Number of picks (default: 'all')
-t FLOAT, --threshold FLOAT
Maximum similarity (default: 1.0)
--all-equal Continue picking past --num-picks if the pick score is
unchanged
--pick-id PICK_ID Candidate id to use for the initial pick (default: use
heapsweep)
--pick-index PICK_INDEX
Candidate index to use for the initial pick (default:
use heapsweep)
--in CANDIDATES_FORMAT, --candidates-format CANDIDATES_FORMAT
Format of the candidates file (default uses filename
extension, or 'fps')
--references FILENAME
Fingerprint file containing reference fingerprints to
avoid (the fingerprints you have)
--references-format FORMAT
Format of the references file (default uses filename
extension, or 'fps')
--randomize, --no-randomize
Use --randomize (the default) to shuffle the
candidates before starting MaxMin (default: True)
--seed N Specify the random number generator seed between 0 and
2**64-1, inclusive, or use -1 to have one picked at
random (default: -1)
--neighbors FILENAME For each pick, includes the nearest neighbor and score
from FILENAME
--neighbors-format FILENAME
Format of the neighbors file (default uses filename
extension, or 'fps')
--mmap, --no-mmap Don't use mmap to read uncompressed FPB files. May
give better performance on networked file systems, at
the expense of higher memory use. (default: True)
--output FILENAME, -o FILENAME
Write output to the named file instead of stdout.
--out OUTPUT_FORMAT Output format. Must be one of 'chemfp' (the default),
'csv', 'tsv', or 'excel-tab', with optional
compression
--precision N Number of digits in Tanimoto score (default: based on
the fingerprint size)
--save-picks FILENAME
Write picked fingerprints to the named file.
--save-picks-format FILENAME
Specify the format for the picked fingerprints.
--save-candidates FILENAME
Write remaining candidate fingerprints to the named
file.
--save-candidates-format FILENAME
Specify the format for the remaining candidate
fingerprints.
--pick-time, --no-pick-time
include the elapsed time for each pick (default:
False)
--no-date Do not include the 'date' metadata in the output
header
--date STR An ISO 8601 date (like '2022-02-07T11:10:15') to use
for the 'date' metadata in the output header
--times Write timing information to stderr
--progress, --no-progress
Show a progress bar (default: show unless the output
is a terminal)
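
To make the options above more concrete, here is a minimal pure-Python sketch of MaxMin picking. It is not chemfp's implementation and does not use the chemfp API. The function names (tanimoto, maxmin_picks) are hypothetical, fingerprints are modeled as plain Python integers used as bit sets, and the initial pick is passed in explicitly (as with --pick-index) rather than found by heapsweep. The stopping rule, which ends picking once the best remaining candidate's score exceeds the threshold, is an assumption about what "Maximum similarity" means and may differ from chemfp at the boundary.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints stored as integer bit sets."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0
    return bin(a & b).count("1") / union


def maxmin_picks(candidates, num_picks=None, threshold=1.0, pick_index=0):
    """Return a list of (candidate index, score) pairs in pick order.

    The score of a pick is its highest similarity to any earlier pick.
    num_picks=None stands in for the 'all' default of --num-picks.
    """
    if not candidates:
        return []
    remaining = set(range(len(candidates)))
    remaining.discard(pick_index)
    # The first pick has no earlier pick to compare against; report 0.0.
    picks = [(pick_index, 0.0)]
    # nearest[i] is the highest similarity from candidate i to any pick so far.
    nearest = {
        i: tanimoto(candidates[i], candidates[pick_index]) for i in remaining
    }
    while remaining and (num_picks is None or len(picks) < num_picks):
        # Pick the candidate least similar to everything picked so far
        # (ties are broken by the lower index).
        best = min(remaining, key=lambda i: (nearest[i], i))
        score = nearest[best]
        if score > threshold:
            # Assumed stopping rule: every remaining candidate is too similar.
            break
        picks.append((best, score))
        remaining.discard(best)
        for i in remaining:
            sim = tanimoto(candidates[i], candidates[best])
            if sim > nearest[i]:
                nearest[i] = sim
    return picks


if __name__ == "__main__":
    fingerprints = [0b1111, 0b1110, 0b0001, 0b0011]
    for index, score in maxmin_picks(fingerprints, threshold=0.6):
        print(index, round(score, 3))
```

In this sketch, lowering threshold or num_picks shortens the pick list. A shuffle of the candidates before picking, as --randomize and --seed provide, would change which candidate wins a tie.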