The ens Command

The ens command is the generic command used to manipulate molecular ensembles. Ensembles are the most commonly used chemistry major object. Ensembles contain atoms, bonds, molecules and other minor objects.

The syntax of this command follows the standard schema of command/subcommand/majorhandle. Since molecular ensembles are major objects, they are not addressed via labels.

Similar to the functionality of molfile and dataset objects, ensembles can be persistent or transient. Persistent ensembles are those created by the ens create command or similar functions. They possess a handle and exist until explicitly deleted. Transient ensembles only exist for the duration of a single command. They are deleted as soon as the command finishes, regardless of whether the command was successful or not.

Examples:

ens get $ehandle E_SMILES
ens merge [ens create CCC] [ens create CCC]
ens get lycorine E_CID

This is the list of officially supported subcommands:

ens add

ens add ehandle ?ehandle_list?...
e.add(?eref/erefsequence?,...)
e += eref

This command performs the same operation as the ens merge command, but preserves the ensembles in the merge lists (argument four and onwards in the Tcl command variant). The base ensemble (third argument) is modified.

Please refer to the ens merge command for more detailed documentation.

The Python arithmetic command returns a reference to the original ensemble, not the new first atom label or reference of the merged ensemble (see again ens merge).

ens align3d

ens align3d ehandle box/center/masscenter/pmi ?usehydrogens? ?property?
e.align3d(?mode=?,?usehydrogens=?,?coordinateproperty=?)

Perform a 3D alignment by modifying the standard atom coordinate property A_XYZ, or an alternative explicitly specified atomic coordinate property.

The possible alignment modes are box, center, masscenter and pmi.

By default, all atoms are used to compute the alignment rotation and movement vectors, including hydrogens. If these should be omitted from computing the movement vectors (but not the subsequent atom movement), the optional usehydrogens parameter can be set to false.

The command returns the handle or reference of the ensemble.

ens append

ens append ehandle ?property value?...
e.append({?property:value,?...})
e.append(?property,value,?...)

Standard data manipulation command for appending property data. It is explained in more detail in the section about setting property data.

The command returns the first data value.

Example:

ens append $ehandle E_NAME "_linker"

ens assign

ens assign ehandle srcproperty dstproperty
e.assign(srcproperty=,dstproperty=)

Assign property data to another property on the same ensemble. Both properties must be associated with the ensemble object class. This process is more efficient than going through a pair of ens get/ens set commands, because in most cases no string or Tcl/Python script object representations of the property data need to be created.

Both source and destination properties may be addressed with field specifications. A data conversion path must exist between the data types of the involved properties. If any data conversion fails, the command fails. For example, it is possible to assign a string property to a numeric property - but only if all property values can be successfully converted to that numeric type. The reverse example case always succeeds, out-of-memory errors and similar global events excluded.
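The all-or-nothing conversion rule described above can be illustrated with a small pure-Python model. This is only a sketch of the described behavior and does not call the toolkit; the helper name `assign_values` is hypothetical.

```python
def assign_values(values, convert):
    """Convert every value or fail as a whole (hypothetical model helper)."""
    converted = []
    for value in values:
        try:
            converted.append(convert(value))
        except (TypeError, ValueError) as exc:
            # A single failed conversion aborts the whole assignment.
            raise ValueError(f"cannot convert {value!r}") from exc
    return converted


# String to numeric succeeds only if every value can be parsed.
numbers = assign_values(["1.5", "2", "3e2"], float)

# Numeric to string always succeeds.
strings = assign_values([1.5, 2, 300.0], str)
```

A value list such as `["1.5", "abc"]` would make the string-to-numeric call fail as a whole, which mirrors the failure of the command when any single conversion fails.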

The original property data remains valid. The command variant ens rename directly exchanges the property name without any data duplication or conversion, if that is possible. In any case, the original property data is no longer present after the execution of this command variant.

The command returns the original object handle for Tcl, or object reference for Python.

Examples:

ens assign $ehandle A_XY A_XY%
ens assign $ehandle E_NMRSPECTRUM(spectrometer) E_METHOD
ens rename $ehandle E_IDENT E_NAME

ens atoms

ens atoms ehandle ?filterset? ?filtermode?
e.atoms(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the atoms the ensemble contains as minor objects. This is explained in more detail in the section about object cross-references.

Examples:

ens atoms $ehandle
ens atoms $ehandle hydrogen
ens atoms $ehandle !hydrogen count

The first example simply returns a list of the labels of the atoms the ensemble contains as minor objects. The second example returns the atom label(s) of all hydrogen atoms in the ensemble. If there are no such atoms, an empty list is returned. The final example counts the number of non-hydrogen atoms in the ensemble.

ens bondangles

ens bondangles ehandle ?filterset? ?filtermode?
e.bondangles(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the bond angle objects the ensemble contains as minor objects. This is explained in more detail in the section about object cross-references.

ens bonds

ens bonds ehandle ?filterset? ?filtermode?
e.bonds(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the bonds the ensemble contains as minor objects. This is explained in more detail in the section about object cross-references.

Examples:

ens bonds $ehandle
ens bonds $ehandle doublebond
ens bonds $ehandle carbon count

The first example simply returns a list of the labels of the bonds the ensemble contains as minor objects. The second example returns the bond label(s) of all double bonds in the ensemble. If there are no such bonds, an empty list is returned. The final example counts the number of bonds which involve one or more carbon atoms in the ensemble.

ens cast

ens cast ehandle dataset/ens/reaction/table ?propertylist?
e.cast(objectclass=,?properties=?)

Transform the ensemble into a different object. Depending on the target object class, the result is as follows:

If the optional property list is specified, an attempt is made to compute the listed properties before the cast operation, so that they may become a part of the new object. No error is raised if a computation fails.

The command returns the handle (reference for Python) of the new object, or the input object in case of mode ens.

ens clear

ens clear ehandle ?keepensprops?
e.clear(?keepensproperties=?)

This command resets an ensemble to a virgin state. All minor objects and all property data of the ensemble are deleted. However, the ensemble handle or reference remains valid, representing an ensemble without any atoms, bonds, rings or other minor objects. If the optional argument is set to a true value, ensemble-class properties (E_*) are not deleted, but everything else still is.

Ensemble membership in datasets, reactions, etc. is not changed by this command.

The command returns the original handle or reference.

ens compare

ens compare ehandle ehandle2
e.compare(eref/ehandle)

Compare two ensembles, yielding a stable sort order. The compared attributes are, in this order, the number of atoms, the number of bonds, the ensemble molecular weight, the number of ESSSR rings and finally the stereo- and isotope-aware 64-bit hashcode (E_ISOTOPE_STEREO_HASHY). The command returns 1 if the first ensemble is larger, -1 if the second is larger, and 0 if they are identical according to the comparison scheme.
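The comparison scheme can be modeled as a lexicographic comparison of attribute tuples. The following is a minimal pure-Python sketch, not toolkit code: the `EnsSummary` class and the `compare` function are hypothetical stand-ins, the hashcode values are made-up placeholders, and the sketch assumes that a larger attribute value makes the ensemble "larger".

```python
from dataclasses import dataclass
from functools import cmp_to_key


@dataclass(frozen=True)
class EnsSummary:
    """Hypothetical stand-in holding the attributes the comparison looks at."""
    atoms: int
    bonds: int
    weight: float
    rings: int
    hashcode: int

    def key(self):
        # The field order mirrors the documented comparison order.
        return (self.atoms, self.bonds, self.weight, self.rings, self.hashcode)


def compare(first, second):
    """Return 1, -1 or 0, in the style of the documented command result."""
    a, b = first.key(), second.key()
    return (a > b) - (a < b)


# Ethane and propane with explicit hydrogens; hashcodes are placeholders.
ethane = EnsSummary(atoms=8, bonds=7, weight=30.07, rings=0, hashcode=0x1)
propane = EnsSummary(atoms=11, bonds=10, weight=44.10, rings=0, hashcode=0x2)

ordered = sorted([propane, ethane], key=cmp_to_key(compare))
```

Because the first differing attribute decides the result, the later attributes only act as tiebreakers.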

The compared property values, with the exception of the final hashcode tiebreaker, are compatible with the RDKit model.

ens copy

ens copy src_ehandle dst_ehandle
e.copy(eref_dst)

Create a copy of the input ensemble into the framework of an existing ensemble. The old data of the destination ensemble is destroyed, but its handle or reference is reused for the copy. The destination handle can be an empty string, #new, #auto or None for Python. In that case, the ensemble is duplicated and a new handle assigned.

This command is useful when an ensemble handle or reference is potentially stored in unknown locations and the ensemble data needs to be updated.

The return value of the command is the handle or reference of the destination ensemble. It is allowed to copy an ensemble onto itself.

Example:

set eh1 [ens create CC]
set eh2 [ens create CCC]
ens copy $eh1 $eh2

After the example code sequence, both ensembles represent ethane, the first compound. However, these are independent ensembles. Any further modifications of the ensemble data on any of the ensembles will not be seen by the other.

The command returns the handle or reference of the target ensemble.

ens create

ens create ?codestring? ?mode? ?datasethandle? ?macroset?
Ens(?data=?,?mode=?,?dataset=?,?macroset=?)
Ens.Create(?data=?,?mode=?,?dataset=?,?macroset=?)

This command creates a new molecular ensemble and returns its handle or reference. If none of the optional arguments are specified, or the argument string is an empty string (or None for Python), an empty ensemble without any atoms or bonds is created. These may later be populated with commands like atom create.

The data string may either begin with an automatically recognized prefix, or an automatic format detection process is initiated. Recognized prefixes are:

The colon in the prefix may be omitted (except for the name: item), but this is not recommended, since it may lead to misinterpretation of the data if the prefix is also part of a valid structure encoding.

In addition, URLs as structure data argument are automatically detected and handled specially. If the URL is a data URI, it is unpacked and its payload processed in a second cycle. If it is an HTTP or FTP URL, the file is downloaded and its contents read as a structure file with automatic format detection. This is not identical to data URI processing: data URIs are again interpreted as command arguments with all prefix and line notation interpretation, while file contents are only interpreted as a record in a structure data file.

If none of the above special cases are recognized, automatic interpretation is performed next. Currently, the encoding then may either be

In the absence of a prefix, the encoding is automatically detected. With the exception of PubChem CIDs, the long form of a database ID must be used, not its simple integer value (i.e. a simple 70 is interpreted as PubChem CID, while CHEMBL70 or chembl:70 are decoded as ChEMBL database IDs).

For the base64-encoded compressed records, the compression algorithm may be raw zlib, gzip or zip and its type is automatically detected.

In case one of the SMILES-class encoding schemes is used, the mode argument of the ens create command provides finer control of the decoding. By default, or when this argument is an empty string, the string is interpreted as standard SMILES, except when there are elements in the string which cannot occur in SMILES but can in SMARTS. In SMILES mode, query expressions are only recognized to a very limited degree, and implicit hydrogens are automatically added. This decoding scheme may also be explicitly selected by specifying hadd as mode.

In order to force a full hydrogen addition to the raw decoded structure even if it would not be done otherwise, use the mode forcehadd.

Mode strictsmiles decodes SMILES with hydrogen addition but as if the strictsmiles: prefix was set. This is described above.

Mode nohadd is essentially the same as basic SMILES decoding, but implicit hydrogen addition does not happen. In any case, explicitly encoded hydrogen is decoded and preserved.

Mode smarts (or query) also skips hydrogen addition. In addition, the decoder now fully parses SMARTS, including recursive SMARTS, but it becomes less lenient in the area of superatom encodings and similar gray areas, in order to avoid ambiguity. The recognized SMILES dialect may be switched via the control variable ::cactvs(smiles_version). The default is Daylight release 4.9 with Cactvs and Eli Lilly extensions.

Mode sln forces the interpretation of the input string as Sybyl Line Notation. If the SLN I/O module has already been loaded, interpretation as SLN is automatically attempted in any case, but only after SMILES decoding has failed. Since there are strings which are both valid SMILES and SLN, but mean something different, this automatism can lead to misinterpretation, so if you know you are dealing with SLN, it is a good idea to specify it. The sln mode attempts to auto-load the SLN I/O module if it is not yet loaded. In case it cannot be loaded, this mode raises an error. Mode querysln is similar, but assumes the input is query SLN, not plain SLN.

The 3D decoder mode prefers resolution of identifiers as 3D model instead of 2D connectivity. This has an effect only with a few select combinations of identifiers and resolvers and should be considered experimental.

Instead of using an explicit decoder mode or a data prefix, it is also possible to supply the name of a property the structure data is an instance of. Examples are E_SDF_STRING or E_SMILES. Such properties are expected to provide suitable default decoder configuration data in their fileformat and fileflags attributes, and these are then used to decode the structure.

In nohadd decoder mode, the structure code is finally, if everything else fails, interpreted as a plain molecular formula. If the string is parsed successfully as a formula, a collection of atoms of the specified elements is created, without any bonds.

By default, or if the optional target dataset parameter is an empty string, the new ensemble is not a member of any dataset. It may be directly made a dataset member if a dataset handle is specified.

If a macro set name is specified, SMILES and SMARTS with macro definitions can be processed. Any pattern names which belong to the specified set are expanded. Set names, pattern names and expansion fragments are specified in the system macro table. Macro expansion is not available if the toolkit was compiled without table support.

Examples:

set eh [ens create]
set eh [ens create CCC]
set sshandle [ens create {[CH3][Cl,Br,I]} smarts]
set eh [ens create [decode -url C%23C] nohadd]

In case a structure is encoded as a string in a format which cannot be directly decoded by the ens create command (such as a plain string representation of an MDL molfile), the standard method is to load the appropriate file format decoder (if not built in, this is needed so that automatic format detection of the memory image record works), open the structure string as a memory-based structure file, and read from this file. This technique allows the input of multiple records from the in-memory file and thus is also useful in cases like a multi-record SMILES file encoded as a string.

Example:

filex load cdx
set fh [molfile open [decode -base 64 $cdxstring] s]
set eh [molfile read $fh]
molfile close $fh

ens dataset

ens dataset ehandle ?filterlist?
e.dataset(?filters=?)

Return the dataset handle or reference of the dataset the ensemble is part of. If the ensemble is not a member of a dataset, or does not pass all of the optional filters, an empty string or None for Python is returned.

Example:

ens dataset $ehandle

ens defined

ens defined ehandle property
e.defined(property)

This command checks whether a property is defined for the ensemble. This is explained in more detail in the section about property validity checking. Note that this is not a check for the presence of property data! The ens valid command is used for this purpose.

The command returns a boolean result.

ens delete

ens delete all
ens delete ?ehandlelist?...
e.delete()
Ens.Delete("all")
Ens.Delete(?erefsequence/eref/ehandle?,...)

Delete ensembles and the minor objects which are part of the deleted ensembles. The special parameter all may be used to delete all ensembles currently registered in the application, including those which are part of reactions or other major objects. Alternatively, any number of lists of ensemble handles may be specified for specific deletions.

The command returns the number of deleted ensembles.

For historic reasons, the same command may also be invoked as ens destroy.

Example:

ens delete $ehandle
ens delete $ehandlelist1 $ehandlelist2

ens dget

ens dget ehandle propertylist ?filterset? ?parameterdict?
e.dget(property=,?filters=?,?parameters=?)
Ens.Dget(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The difference between ens get and ens dget is that the latter does not attempt computation of property data, but rather initializes the property values to the default and returns that default if the data is not yet available. For data already present, ens get and ens dget are equivalent.

The Python class method is a one-shot command. The transient ensemble created from the initialization items is automatically deleted when the command finishes. The data for the creation of the temporary ensemble is equivalent to the first argument of the standard constructor. Additional constructor parameters cannot be used.

ens dup

ens dup ehandle ?datasethandle? ?position? ?filterset? ?ctonlyflag?
e.dup(?dataset=?,?position=?,?filters=?,?ctonly=?)

Duplicate an ensemble. The return value is the handle or reference of the new ensemble.

The duplicate ensemble is placed into the same dataset as the source, if it is a member of a dataset. Specifying an explicitly empty dataset argument (including None for Python) places the duplicate outside any dataset, regardless of the dataset membership of the source ensemble.

If the duplicate is moved to a dataset, it is appended to the dataset end by default. This happens also if the position parameter is explicitly specified as end or an empty string. Otherwise, the ensemble is inserted at the given position, starting with 0. If the requested position is larger than the current size of the dataset, the ensemble is appended.
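The position rules can be modeled on a plain Python list standing in for the dataset. This is a hedged sketch of the described behavior only; `insert_at_position` is a hypothetical helper and not part of the toolkit.

```python
def insert_at_position(members, item, position="end"):
    """Place item into members according to the described position rules."""
    if position in ("end", ""):
        # Default, explicit "end" and empty string all append.
        members.append(item)
    elif position >= len(members):
        # A position beyond the current size also appends.
        members.append(item)
    else:
        # Positions are counted from 0.
        members.insert(position, item)
    return members


appended = insert_at_position(["a", "b"], "x")
at_front = insert_at_position(["a", "b"], "x", 0)
too_far = insert_at_position(["a", "b"], "x", 10)
```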

The filter parameter allows the selection of only a subset of atoms to be copied. All atoms which do not pass the filters are discarded, as are all bonds which connect to discarded atoms. If no atoms pass the filters, the result is an empty ensemble. By default, no atom filtering takes place, and all atoms and bonds of the original ensemble are part of the duplicate.

The final optional parameter can be used to make the duplicate lightweight. If this boolean parameter is set, the duplicate is limited to the basic connectivity information with all atom and bond properties, but it has no copies of properties of other object classes, and no copies of rings, molecules, groups or other minor object classes.

The ens hdup command is a variant of this command. It automatically adds a hydrogen set to the duplicate.

Examples:

ens dup $ehandle
ens dup $ehandle [dataset create] end ringatom

The first sample line is a standard use. The second example moves the duplicate into a newly created dataset, and isolates the ring systems. All other atoms are stripped.

ens exists

ens exists ehandle ?filterset?
e.exists(?filters=?)
Ens.Exists(eref=,?filters=?)

Check whether an ensemble handle or reference is valid. The command returns boolean 0 or 1. Optionally, the ensemble may be filtered by a standard filter list and it is reported as not valid if it does not pass the filters. If filters in the filter list operate on atoms, bonds, or other minor objects, it is sufficient if a single minor object of the ensemble passes the filter.

Example:

ens exists $ehandle chlorine

Check whether the ensemble with the handle in variable $ehandle exists and, if it exists, whether it contains one or more chlorine atoms.

ens expand

ens expand ehandle ?allowambiguous? ?noimplicith?
e.expand(?allowambiguous=?,?noimplicith=?)

This command expands all superatoms in the ensemble. The mechanisms for the expansion of superatoms are described in detail for the atom expand command. This command is functionally equivalent, working on all atoms in the ensemble instead of a single atom.

Example:

ens expand $ehandle

The command returns the total number of successfully expanded atoms.

ens expr

ens expr ehandle expression
e.expr(expression)

Compute a standard SQL-style property expression for the ensemble. This is explained in detail in the chapter on property expressions.

ens fill

ens fill ehandle ?property value?...
e.fill({?property:value,...})
e.fill(?property,value?,...)

Standard data manipulation command for setting data, ignoring possible mismatches between the lengths of the lists of objects associated with the property and the value list. It is explained in more detail in the section about setting property data.

Example:

ens fill $ehandle B_COLOR red

sets the color of the first bond in the ensemble to red.

ens filter

ens filter ehandle filterlist
e.filter(filters)

Check whether the ensemble passes a filter list. The return value is boolean 1 for success and 0 for failure.

Example:

ens filter [ens create CCCl] chlorine

checks whether the ensemble contains one or more chlorine atoms. If the filter operates on minor objects of the ensemble, it is sufficient to have a single ensemble minor object pass the filter condition.

ens forget

ens forget ehandle ?objclass?
e.forget(?objectclass=?)

Delete specific classes of minor objects and their data from the ensemble data structure. If no object class is specified, all minor object classes except atoms and bonds and the ensemble data are purged.

If the object class ens is specified, all property data attached to the ensemble object class (usually those properties starting with E_* ) are deleted, but not the ensemble itself.

The command returns the original ensemble handle or reference.

ens formulamatch

ens formulamatch ehandle formula_expression ?other_elements?
e.formulamatch(query=,?other_elements=?)

Match the ensemble against a formula expression. Its syntax is the same as in formula queries in molfile scan and other scan commands.

There are several methods to specify whether any elements not mentioned in the formula expression may or must be present. If the other_elements flag is used, it has the highest priority. It may be set to 0 (no other elements allowed), 1 (allowed) or 2 (required), and if it is set, any prefix in the formula expression is ignored. If it is not used, a prefix in the formula expression may be used to control the matching. Supported prefixes are = (no other elements), >= (other elements allowed) and > (required). If no prefix is used, the default mode is an exact match without other elements.
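The priority between the other_elements flag and the expression prefix can be summarized in a short pure-Python model. This sketch only resolves the matching mode and strips the prefix; it does not parse or match formulas, and `resolve_other_elements` is a hypothetical name.

```python
def resolve_other_elements(expression, other_elements=None):
    """Return (mode, bare_expression).

    mode is 0 (no other elements), 1 (allowed) or 2 (required).
    """
    # ">=" must be tested before ">" so that it is not split.
    prefix_modes = ((">=", 1), ("=", 0), (">", 2))
    prefix_mode = None
    for prefix, mode in prefix_modes:
        if expression.startswith(prefix):
            prefix_mode = mode
            expression = expression[len(prefix):]
            break
    if other_elements is not None:
        # The explicit flag has the highest priority; the prefix is ignored.
        return other_elements, expression
    if prefix_mode is not None:
        return prefix_mode, expression
    # No flag and no prefix: exact match without other elements.
    return 0, expression
```

For instance, `resolve_other_elements(">C6", 0)` yields mode 0 in this model, because the flag overrides the `>` prefix.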

The return value is the boolean match result.

Example:

ens formulamatch $eh >=C6

Matches any ensemble which has six carbon atoms, with other elements allowed.

ens formulamatch $eh C5-6(Cl+Br+I)2- 1

Matches an ensemble with five or six carbon atoms, two or more heavy halogens, and potentially any other elements.

ens fragment

ens fragment ehandle atomlist ?datasethandle? ?position?
e.fragment(atomsequence=,?dataset=?,?position=?)

Create a new ensemble from a set of atoms in another ensemble. All bonds existing between those atoms are also preserved. The atoms can be selected with any standard atom selection syntax, with one selector per list element. Duplicate atom specifications are ignored. Atom specifications which cannot be resolved generate an error.
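The selection rules correspond to building an induced subgraph. The following pure-Python sketch models this with plain atom labels and bond pairs; it is an illustration under that simplification, not toolkit code, and the function name `fragment` is used here only as a hypothetical helper.

```python
def fragment(atoms, bonds, selection):
    """Model of fragment extraction.

    atoms: set of atom labels in the source
    bonds: iterable of (label, label) pairs
    selection: requested atom labels, possibly with duplicates
    """
    selected = []
    for label in selection:
        if label not in atoms:
            # Unresolvable atom specifications are an error.
            raise KeyError(f"unknown atom {label!r}")
        if label not in selected:
            # Duplicate specifications are ignored.
            selected.append(label)
    keep = set(selected)
    # Only bonds with both ends inside the selection survive.
    kept_bonds = [(a, b) for a, b in bonds if a in keep and b in keep]
    return selected, kept_bonds


chain_atoms = {1, 2, 3, 4}
chain_bonds = [(1, 2), (2, 3), (3, 4)]
frag_atoms, frag_bonds = fragment(chain_atoms, chain_bonds, [2, 3, 3])
```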

By default, the new ensemble becomes a member of the same dataset (if any) as the source ensemble, but this can be changed with the optional fifth argument. If no explicit position is given, the ensemble is appended to the rear of the target dataset. The new ensemble only inherits the selected atoms and bonds plus stable atom and bond properties, but not other minor objects or ensemble data.

The command returns the handle or reference of the new ensemble object.

Example:

match ss $substructure $ehandle amap
set ehfrag [ens fragment $ehandle [unzip $amap 1]]

The above code sequence matches a substructure, and then extracts the matched structure part as a new ensemble.

ens get

ens get ehandle propertylist ?filterset? ?parameterdict?
ens get ehandle attribute
e.get(property=,?filters=?,?parameters=?)
e.get(attribute)
e[property/attribute]
e.property/attribute
Ens.Get(data,property=,?filters=?,?parameters=?)
Ens.Get(data,attribute)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

Examples:

ens get $ehandle {M_WEIGHT A_ELEMENT}

yields a nested list with two elements. The first element is a list of the molecular weights of all molecules in the ensemble. The second element is a list of the element numbers of all atoms in the ensemble. If the information is not yet available, an attempt is made to compute it. If the computation fails, an error results.

ens get $ehandle B_ORDER ringbond

gives the bond orders of all bonds of the ensemble which are ring bonds.

The format of the optional parameter list argument is a series of keyword/value pairs, as produced by the Tcl command array get or the standard Tcl dictionary commands. If this parameter list is present as argument, and the requested property data is already valid for the ensemble, a check is made whether all the specified parameters are the same as the parameters the present property data was computed with. If this is the case, the values are directly returned as usual. Otherwise, the data is discarded and re-computed.

If computation of the property data is performed, either because the parameter set was not matched, or the requested data was not valid, the computation integrates the specified parameter set into the parameters of the computation function. Parameters from the list temporarily override the global settings of these parameters in the property definition. Parameters used by the property computation function but not listed in the local parameter list are neither used for data validity checking, nor their value changed during the computation request. After the computation finishes, the old global parameter settings of the property definition are restored.
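The interplay of stored data, local parameters and global parameter settings can be modeled in a few lines of pure Python. This is a sketch of the described rules under simplifying assumptions, not the toolkit implementation; the class `CachedProperty` and its attributes are hypothetical.

```python
class CachedProperty:
    """Model of one property value plus the parameters it was computed with."""

    def __init__(self, compute, defaults):
        self.compute = compute          # function(params) -> value
        self.defaults = dict(defaults)  # global parameter settings
        self.value = None
        self.used = None                # parameters of the stored value
        self.computations = 0

    def get(self, params=None):
        params = params or {}
        valid = self.used is not None
        # Only the explicitly listed parameters take part in the check.
        if valid and all(self.used.get(k) == v for k, v in params.items()):
            return self.value
        saved = dict(self.defaults)
        try:
            self.defaults.update(params)        # temporary override
            self.value = self.compute(self.defaults)
            self.used = dict(self.defaults)
            self.computations += 1
        finally:
            self.defaults = saved               # restore global settings
        return self.value


image = CachedProperty(lambda p: f"{p['width']}x{p['height']}",
                       {"width": 100, "height": 100})
first = image.get({"width": 200})    # computed with the width override
second = image.get({"width": 200})   # parameters match, stored value reused
third = image.get({"height": 50})    # mismatch, recomputed with global width
```

In this model the third request falls back to the global width of 100, because the earlier override was only temporary.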

The use of a parameter list argument is primarily useful only if a single property is requested with this command, but its use with a multiple-property request is not illegal - the parameter list is simply applied to all properties in sequence.

The Python class method is a one-shot command. The transient ensemble created from the initialization items is automatically deleted when the command finishes. The data for the creation of the temporary ensemble is equivalent to the first argument of the standard constructor. Additional constructor parameters cannot be used.

Example:

ens get $ehandle E_GIF {} [dict create width 200 height 200 bgcolor white]

Variants of the ens get command are ens new, ens dget, ens jget, ens jnew, ens jshow, ens nget, ens show, ens sqldget, ens sqlget, ens sqlnew, and ens sqlshow.

Further examples:

ens get $ehandle E_NAME
ens get $ehandle A_FLAGS(boxed)

In addition to property data, the ensemble object possesses a few attributes, which can be retrieved with the ens get command (but not by its related sister subcommands like ens dget, ens sqlget, etc.). Some of them are also modifiable via ens set. These attributes are:

ens getparam

ens getparam ehandle property ?key? ?default?
e.getparam(property=,?key=?,?default=?)

Retrieve a named computation parameter from valid property data. If the key is not present in the parameter list, an empty string is returned (None for Python). If the default argument is supplied, that value is returned in case the key is not found.

If the key parameter is omitted, a complete set of the parameters used for computation of the property value is returned in dictionary format.

This command does not attempt to compute property data. If the specified property is not present, an error results.
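The lookup rules can be mirrored with a plain dictionary. The sketch below is a hypothetical pure-Python model (the `getparam` function and the `stored` mapping are illustrative names, and the parameter values are made up); it follows the Python convention of returning None for a missing key.

```python
def getparam(stored, prop, key=None, default=None):
    """Look up a computation parameter of already present property data."""
    if prop not in stored:
        # No computation is attempted; missing data is an error.
        raise KeyError(f"property {prop} is not present")
    params = stored[prop]
    if key is None:
        # Without a key, the complete parameter set is returned.
        return dict(params)
    return params.get(key, default)


# Made-up parameter values for illustration only.
stored = {"E_GIF": {"format": "png", "width": 200}}

fmt = getparam(stored, "E_GIF", "format")
missing = getparam(stored, "E_GIF", "bgcolor")
fallback = getparam(stored, "E_GIF", "bgcolor", "white")
everything = getparam(stored, "E_GIF")
```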

Example:

ens getparam $ehandle E_GIF format

returns the actual format of the image, which could be gif, png, or various bitmap formats.

ens groups

ens groups ehandle ?filterset? ?filtermode?
e.groups(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the groups the ensemble contains. This is explained in more detail in the section about object cross-references.

Example:

ens groups $ehandle

ens hadd

ens hadd ehandle ?filterset? ?flags? ?changeset?
e.hadd(?filters=?,?flags=?,?changeset=?)

Add a standard set of hydrogens to the ensemble. If the filterset parameter is specified, only those atoms which pass the filter set are processed.

Additional operation flags may be activated by setting the flags parameter to a list of flag names, or a numerical value representing the bit-ored values of the selected flags. By default, the flag set is empty, corresponding to the use of an empty string or none as parameter value. These flags are currently supported:

Adding hydrogens with this command, except with the protonate flag set, is less destructive to the property data set of the ensemble than adding them with individual atom create/bond create commands, because many properties are designed to be indifferent to explicit hydrogen status changes, but are invalidated if the structure is changed in other ways.

If the effects of the hydrogen addition step to the validity of the property data set should not be handled according to this standard procedure, it is possible to explicitly generate additional property invalidation events by specifying an event list as the optional last parameter, for example a list of atom and bond to trigger both the atom change and bond change events.

The command returns the number of hydrogens which were added.

Example:

set ehandle [ens create {[C].[C]}]
ens hadd $ehandle

adds a total of eight hydrogens to the two carbon atoms, transforming them into methane.

ens hdup

ens hdup ehandle ?datasethandle? ?position? ?filterset? ?ctonlyflag?
e.hdup(?dataset=?,?position=?,?filters=?,?ctonly=?)

This command is a convenience variant of the ens dup command. It has the same parameters, but also adds a full standard hydrogen set (equivalent to executing an ens hadd $eh command) to the duplicate.

The command arguments are documented in the paragraph on ens dup.

ens hfragment

ens hfragment ehandle atomlist ?datasethandle? ?position?
e.hfragment(atomsequence=,?dataset=?,?position=?)

This command has the same arguments as ens fragment. The only difference is that after the duplication all open valences in the fragment are plugged with hydrogen, as if an ens hadd command had been executed immediately after the fragment creation command.

The command returns the handle or reference of the new ensemble object.

ens hierarchy

ens hierarchy ehandle ?filterlist? ?root?
e.hierarchy(?filters=?,?root=?)

Return the hierarchy handle or reference of the hierarchy the ensemble is part of. If the ensemble is not a member of a hierarchy, or does not pass all of the optional filters, an empty string or None for Python is returned. By default, the hierarchy object which directly contains the ensemble is returned. If the root flag is set, the root hierarchy object is reported instead, which is the same only if the hierarchy has only a single level.

Example:

ens hierarchy $ehandle

ens hstrip

ens hstrip ehandle ?flags? ?changeset?
e.hstrip(?flags=?,?changeset=?)

This command removes hydrogens from the ensemble. By default, all hydrogen atoms in the ensemble are removed.

The flags parameter can be used to make the operation more selective. It may be a list of the following flags:

If the flags parameter is an empty string, or none, it is ignored. The default flag value is wedgetransfer, but this default is overridden if any flags are set!
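The consequence of this rule is that the default flag is replaced, not extended, as soon as any flag is passed. A small pure-Python model makes this explicit; the helper `effective_hstrip_flags` is hypothetical, and the sketch assumes that an ignored flags argument means the default applies.

```python
def effective_hstrip_flags(flags=None):
    """Return the flag list that would be in effect in this model."""
    if flags in (None, "", "none"):
        # Ignored argument: the default flag set applies (assumption).
        return ["wedgetransfer"]
    if isinstance(flags, str):
        flags = flags.split()
    # Any explicit flag replaces the default instead of adding to it.
    return list(flags)


default_flags = effective_hstrip_flags()
only_keep = effective_hstrip_flags(["keeporiginal"])
both = effective_hstrip_flags("keeporiginal wedgetransfer")
```

In this model, wedgetransfer has to be listed again explicitly if it is still wanted together with other flags, as in the ens hstrip example that passes both keeporiginal and wedgetransfer.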

If the changeset parameter is specified, the property change events listed in the parameter are triggered after the command.

Hydrogen stripping is not as disruptive to the ensemble data content as normal atom deletion, except when the deprotonate flag is set. The system assumes that this operation is done as part of some file output or visualization preparation. However, if any new data is computed after stripping, the computation functions see the stripped structure, and proceed to work on that reduced structure without knowledge that the structure may contain implicit hydrogens.

The command returns the number of stripped hydrogens.

Example:

ens hstrip $ehandle [list keeporiginal wedgetransfer]

ens hydrogenate

ens hydrogenate ehandle ?filterset? ?changeset?
e.hydrogenate(?filters=?,?changeset=?)

Reduce all bonds in the ensemble to single bonds, except those excluded by the filter set.

If a change set is supplied, its interpretation is the same as in ens hadd.

The command returns the number of added hydrogens.

Example:

ens hydrogenate $eh {!arobond !ccbond}

This reduces all non-aromatic bonds involving hetero atoms to single bonds.

ens image

ens image ehandle ?width? ?height? ?options?

This command generates a Tk image object displaying the ensemble as an icon. The command is only available in toolkit variants which are linked with the portable Tk GUI toolkit library and which are either statically linked with the GD image drawing library, or can load it dynamically. It is currently not supported in the Python interface.

The default image size is 64x64 pixels, but this may be overridden by the width and height parameters. If only width is set, it is also used for the height. The command returns a Tk image handle. These images may for example be placed on Tk canvases as canvas objects, or used on buttons and other GUI objects.

Because of the small size of the images, atoms are not displayed as symbols, but small color-coded squares. This is a command for the implementation of graphical structure-handling applications with icons. For serious structure visualization, use the E_GIF , E_EMF_IMAGE or E_EPS_IMAGE properties.

Additional options may be added by an arbitrary sequence of option/value pairs. Color names can be those registered in the X11 color database, or a numeric specification in the #rrggbb format. These options are currently supported:

Images are cached. If an image for the selected ensemble with the same display attributes exists, it is reused.

Example:

set img [ens image $ehandle 80 80 -border yellow -linecolor blue]
# .canvaswin is assumed to be an existing Tk canvas widget
.canvaswin create image 50 50 -image $img

ens index

ens index ehandle
e.index()

Get the position of the ensemble in the object list of its dataset. If the ensemble is not a member of a dataset, -1 is returned.

ens isotopecheck

ens isotopecheck ehandle ?failedatomvariable? ?extended?
e.isotopecheck(variable=,extended=)

Test whether the isotope labels on the atoms of the ensemble, if they exist, are physically reasonable. The command returns the number of failed atoms. If a capture variable is specified, the atom labels or references of these atoms are stored therein. If no isotope labels are set in A_ISOTOPE , the command always reports zero problems.

By default, a smaller isotope table is used which contains only isotopes which are sufficiently long-lived to perform chemistry on. These include naturally occurring isotopes as well as isotopes used for experimental labeling, such as 3H or 14C. If the extended boolean flag is set, a larger table containing all known isotopes of the elements is used.

The isocheck command is an alias.

ens jget

ens jget ehandle propertylist ?filterset? ?parameterdict?
e.jget(property=,?filters=?,?parameters=?)
Ens.Jget(data,property=,?filters=?,?parameters=?)

This is a variant of ens get which returns the result data as a JSON formatted string instead of Tcl interpreter objects. The command is usable only for property data, not attribute retrieval.

The Python class method is a one-shot command. The transient ensemble created from the initialization items is automatically deleted when the command finishes.

ens jnew

ens jnew ehandle propertylist ?filterset? ?parameterdict?
e.jnew(property=,?filters=?,?parameters=?)
Ens.Jnew(data,property=,?filters=?,?parameters=?)

This is a variant of ens new which returns the result data as a JSON formatted string instead of Tcl interpreter objects.

The Python class method is a one-shot command. The transient ensemble created from the initialization items is automatically deleted when the command finishes.

ens jshow

ens jshow ehandle propertylist ?filterset? ?parameterdict?
e.jshow(property=,?filters=?,?parameters=?)
Ens.Jshow(data,property=,?filters=?,?parameters=?)

This is a variant of ens show which returns the result data as a JSON formatted string instead of Tcl interpreter objects.

The Python class method is a one-shot command. The transient ensemble created from the initialization items is automatically deleted when the command finishes.

ens ldup

ens ldup ?ehandlelist?...
Ens.Ldup(?eref/erefsequence?,...)

Duplicate all ensembles in the argument list(s) in default mode.

The return value is a single list (even if multiple source lists are used) of the duplicated ensemble handles or references. If an argument list element is an empty string (or None for Python ), it indicates a missing object, and the output list also receives an empty string element (for Tcl ) or None (for Python ) at its position, without raising an error.
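The flattening and placeholder rules of the return value can be illustrated with a small pure-Python model. This is only a sketch of the described behavior, not toolkit code: the function ldup below and the use of dictionaries as stand-ins for ensembles are assumptions made for the illustration.

```python
import copy

def ldup(*sources):
    """Model of the described result list: one flat list, None kept in place."""
    duplicates = []
    for source in sources:
        # Each argument is either a single object (or None) or a sequence of them.
        items = source if isinstance(source, (list, tuple)) else [source]
        for item in items:
            # A missing object yields a placeholder instead of an error.
            duplicates.append(None if item is None else copy.deepcopy(item))
    return duplicates

ethane = {"smiles": "CC"}        # stand-in for an ensemble
result = ldup([ethane, None], ethane)
print(result)                    # [{'smiles': 'CC'}, None, {'smiles': 'CC'}]
```

The duplicates are independent objects, so later changes to a copy do not affect the source.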

ens lhdup

ens lhdup ?ehandlelist?...
Ens.Lhdup(?eref/erefsequence?,...)

Duplicate all ensembles in the argument list(s) in default mode, and add hydrogens.

The return value is a single list (even if multiple source lists are used) of the duplicated ensemble handles or references. If an argument list element is an empty string (or None for Python ), it indicates a missing object, and the output list also receives an empty string element (for Tcl ) or None (for Python ) at its position, without raising an error.

ens list

ens list ?filterlist?
Ens.List(?filters=?)

This command returns a list of the ensemble handles currently registered in the application. This list may optionally be filtered by a standard filter list. If the filter operates on ensemble minor objects such as atoms or bonds and not directly on the ensemble object, it is sufficient if a single minor object passes the filter.

Example:

ens list halogen

lists the handles of all ensembles in the application which contain one or more halogen atoms.

ens lock

ens lock ehandle propertylist/objclass/all ?compute?
e.lock(property=,?compute=?)

Lock property data of the ensemble, meaning that it is no longer managed by the standard data consistency manager. The data consistency manager deletes specific property data if anything is done to the ensemble which would invalidate the information. Blocking the consistency manager can be useful when building ensembles from components in a script. Property data remains locked until it is explicitly unlocked.

The property data to lock can be selected by providing a list of the following identifiers:

The lock can be released by an ens unlock command.

The return value is the original ensemble handle or reference.

Example:

set eh [ens create CCC]
ens lock $eh A_SYMBOL 1
ens purge $eh A_ELEMENT
atom set $eh 1 A_query(dsearch) 3
ens unlock $eh A_SYMBOL

In this example, an ensemble is created, and the atom symbol information is locked. Next, the element number property is deleted, and a query attribute is set. Finally, the lock is released. Had the element symbol information not been locked, the ensemble would have become unusable due to an overzealous data consistency manager. Setting query information in property A_query can have an influence on the atom symbol. So the default action of invalidating A_SYMBOL when manipulating A_query is correct. However, in case there is no element information A_ELEMENT , and no atom symbol information A_SYMBOL , the element information is completely lost, and the ensemble becomes unusable. So in this case, locking A_SYMBOL (or alternatively A_ELEMENT ) is required to avoid unexpected side effects of structure editing.

ens loop

ens loop ehandle objvariable ?maxmol? ?offset? body
e.loop(function=,?maxloop=?,?offset=?,?variable=?)
for m in e:

Loop over all molecules in the ensemble, by providing a temporary ensemble duplicate of each found molecule. The handle of the duplicate is stored in the object variable and visible to the loop code.

The loop code cannot delete the duplicate ensemble. It is automatically deleted at the end of each cycle. Changes made to the duplicate molecule are not seen in the base ensemble. It is however possible to explicitly assign data computed on the duplicate ensemble to the base ensemble.

The optional parameters allow more control over which molecules are processed. By default the maxmol parameter is -1, meaning an unlimited number of fragments are processed, and the offset is zero, meaning that processing begins with the first molecule in the molecule list of the base ensemble.

For Tcl scripts, within the loop code, the standard Tcl commands break and continue work as expected.

The Python version of the loop method intentionally has a different argument sequence for convenience. The function argument may either be a multi-line string (similar to the Tcl construct), or a function reference. Functions are called with the reference of the current loop object as single argument, and have their own context frame, so that the specification of a reference variable is not generally useful in that call style, though it is allowed. For string function blocks the code is executed in the local call frame, and the variable with the current object reference is visible locally. Script code blocks must be written with an initial indentation level of zero. Within the Python functions, the normal break and continue commands cannot be used due to scope limitations. Instead, the custom exceptions BreakLoop and ContinueLoop can be raised. These are automatically caught and processed in the loop body handler code.

In Python , there is also an object iterator so that simple loops over ensemble molecules can be written with a for statement. The ensemble object iterator is of the self style (i.e. there is one per ensemble, these are not independent objects), so nesting them is not possible on the same ensemble.

Python object loop constructs and their peculiarities are discussed in more detail in the general chapter on Python scripting.

The command returns the number of molecule fragments processed.

Example:

set midx 0
ens loop $ehandle ehdup {
	mol set $ehandle [mol mol $ehandle #$midx] M_MYPROP [ens get $ehdup E_MYPROP]
	incr midx
}

The example loop assigns a custom property where the compute function is only defined for a single-fragment ensemble to the equivalent molecule property in a multi-fragment base ensemble.
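The exception-based flow control of the Python loop method can be sketched with a pure-Python model. The classes BreakLoop and ContinueLoop below are local stand-ins defined only for this illustration, and the driver function loop is hypothetical. In particular, counting every started call as processed is an assumption of the model, not a documented rule.

```python
class BreakLoop(Exception):
    """Local stand-in for the toolkit exception of the same name."""

class ContinueLoop(Exception):
    """Local stand-in for the toolkit exception of the same name."""

def loop(items, function, maxloop=-1, offset=0):
    """Call function on each item and return the number of calls started."""
    processed = 0
    for item in items[offset:]:
        if maxloop >= 0 and processed >= maxloop:
            break
        processed += 1
        try:
            function(item)
        except ContinueLoop:
            continue          # skip the rest of this cycle
        except BreakLoop:
            break             # leave the loop early
    return processed

seen = []

def body(mol):
    if mol == "water":
        raise ContinueLoop    # plays the role of 'continue'
    if mol == "stop":
        raise BreakLoop       # plays the role of 'break'
    seen.append(mol)

count = loop(["ethanol", "water", "benzene", "stop", "ignored"], body)
print(seen, count)            # ['ethanol', 'benzene'] 4
```

Because the callback runs in its own frame, raising an exception is the only way for it to signal the enclosing loop, which is why the plain break and continue statements are not usable there.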

ens mask

ens mask ehandle labellist/all property onvalue ?offvalue?
e.mask(objects=,property=,onvalue=,?offvalue=?)
e.mask("all",property=,onvalue=,?offvalue=?)

This command sets property values of a subset of minor objects of one class in the ensemble to a specific value, and optionally resets the values of the same property for all other minor objects of the ensemble which are not selected.

The first argument after the ensemble handle is either a list of object identifiers, or the magic value all . Object identifiers are usually the standard numerical labels, but any construct which identifies an atom, a bond, etc. can be used. The next argument identifies the property. The object identifiers in the previous argument must correspond to the object class of the property, i.e. atom label pairs can only be used if the property is a bond property, but simple numerical labels work for all classes. If data for that property is not present on the ensemble, it is instantiated with the default value. The final one or two arguments must be decodable data values for that property.

If the all object subset identifier is used, all values of the property in the ensemble are set to the onvalue . Any offvalue specification is ignored.

Otherwise, the explicit label list is processed. If an off value is given, all values of the property in the ensemble are first reset to that value. If no off value is specified, no reset is performed and the current values remain valid. Then, all minor objects in the list are looked up from their labels or other identifiers, and their property values are set to the onvalue .

Example:

ens mask $eh [ens atoms $eh carbon] A_COLOR green black

This command sets the A_COLOR property value for all carbon atoms in the ensemble to green, and all other atoms to black. This is shorter and more efficient than explicitly coding a loop of atom set statements.

The command returns the original ensemble handle or reference.
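The on/off logic described above can be modeled in a few lines of plain Python. This is a sketch of the described semantics only, not toolkit code: the function mask below works on an ordinary dictionary of label/value pairs, and the use of None to mean "no off value given" is an assumption of the model.

```python
def mask(values, selected, onvalue, offvalue=None):
    """Return a new {label: value} dict after applying the mask rules."""
    if selected == "all":
        # 'all' sets every value to the on value; an off value is ignored.
        return {label: onvalue for label in values}
    result = dict(values)
    if offvalue is not None:
        # Reset pass: every object first receives the off value.
        result = {label: offvalue for label in result}
    for label in selected:
        if label not in result:
            raise KeyError(f"unknown label {label!r}")
        result[label] = onvalue
    return result

colors = {1: "red", 2: "red", 3: "blue"}
print(mask(colors, [1, 2], "green", "black"))  # {1: 'green', 2: 'green', 3: 'black'}
print(mask(colors, [1, 2], "green"))           # {1: 'green', 2: 'green', 3: 'blue'}
print(mask(colors, "all", "green", "black"))   # {1: 'green', 2: 'green', 3: 'green'}
```

The second call shows the case without an off value: unselected objects keep their current values.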

ens match

ens match ehandle ss_ehandle ?matchflags? ?ignoreflags? ?atommatchvar? ?bondmatchvar? ?molmatchvar?
e.match(substructure=,?matchflags=?,?ignoreflags=?,?atommatchvariable=?,?bondmatchvariable=?,?molmatchvariable=?)

Check whether the ensemble matches a substructure. The substructure may be any structure ensemble, and even be in the same ensemble as the primary command ensemble.

The precise operation of the substructure match routine can be tuned by providing a standard set of match flags and feature ignore flags. The default match flag set has set bits for the bondorder , atomtree and bondtree comparison features, and an empty ignore set. If a flag set is specified as an empty string, the default set is used. In order to reset a flag set, an explicit none value must be used. The bit options of the match flag are explained in the documentation of the match ss command.

The command returns boolean 1 for a successful match, 0 otherwise. If an optional atom, bond, or molecule match variable is specified, it is set to a nested list of matching substructure/structure atom, bond or molecule labels ( Tcl ) or references ( Python ). If no match can be found, the variable is set to an empty list. In case only a bond or molecule match variable is needed, an empty string can be used to skip the unused match variable argument positions.

This is a very simple variant of substructure matching. The match ss command provides many more advanced match determination and match processing options.

ens max

ens max ehandle propertylist ?filterset?
e.max(property=,?filters=?)

Get the maximum values of the properties named in the propertylist parameter. The return value of the command is a list of the maximum property values. The objects whose property values are used for the determination of the maximum values may optionally be filtered by a standard filter set. If no objects pass the filter, the result is an empty string.

Example:

ens max $ehandle A_ELEMENT

computes the maximum element number in the ensemble.

ens merge

ens merge ehandle ?ehandle_list?...
e.merge(?eref/erefsequence?,...)

Merge a set of ensembles into one ensemble. All structure information is accumulated in the first (base) ensemble. Its handle remains unchanged. All other ensembles are destroyed. It is not possible to name an ensemble more than once in the argument lists, and ensembles cannot be merged with themselves.

The merged ensemble has a consistent property set for all minor objects. If the information content of the input ensembles varies, an attempt is made to compute the missing information for ensembles which do not have valid data for each individual property. If the computation fails, the property data is discarded for all merged objects. In addition, a merge property invalidation event is issued, which may lead to additional loss of property data. For surviving properties which have defined a merge update function, this function is then called and may perform additional data adjustments. For example, the A_XY 2D plot coordinate property merge function transforms the structure plot coordinates in the new ensemble to a uniform scale and arranges the coordinates for the atoms from the merged ensembles as a sequence of plots from left to right.

The return value of this command is a list of the new first atom labels or references for every merged ensemble, excluding the base ensemble. All minor object labels in the merged ensembles are re-assigned to avoid collisions. The new labels begin with the highest respective minor object label in use in the base ensemble plus one, and are thereafter assigned in sequence. In case an empty ensemble was merged, the list contains an empty string ( Tcl ) or None ( Python ) at its merge position.

The ens add command performs the same operation as the ens merge command, but merges duplicates of the input ensembles, thus preserving them.

Example:

ens merge [ens create CC] [list [ens create CCC.CCCC] [ens create C]]

Merge three ensembles into one. The new ensemble contains the molecules ethane, propane, butane and methane in that order.
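The label re-assignment rule for the return value can be made concrete with a small arithmetic model in plain Python. It is not toolkit code: the function merged_first_labels is hypothetical, it only looks at atom labels, and it assumes that the labels of each merged ensemble occupy one consecutive block.

```python
def merged_first_labels(base_labels, merged_atom_counts):
    """Return the first new atom label per merged ensemble (None if empty)."""
    # New labels start one above the highest label in the base ensemble.
    next_label = max(base_labels, default=0) + 1
    first_labels = []
    for count in merged_atom_counts:
        if count == 0:
            # An empty ensemble leaves a placeholder at its merge position.
            first_labels.append(None)
            continue
        first_labels.append(next_label)
        next_label += count
    return first_labels

# Base ensemble with atom labels 1 and 2; merging ensembles of 7, 0 and 1 atoms.
print(merged_first_labels([1, 2], [7, 0, 1]))  # [3, None, 10]
```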

ens metadata

ens metadata ehandle property ?field ?value??
e.metadata(property=,?field=?,?value=?)

Obtain property metadata information, or set it. The handling of property metadata is explained in more detail in its own introductory section. The related commands ens setparam and ens getparam can be used for convenient manipulation of specific keys in the computation parameter field. Metadata can only be read from or set on valid property data.

Valid field names are bounds , comment , info , flags , parameters and unit .

Examples:

array set gifparams [ens metadata $ehandle E_GIF parameters]
ens metadata $ehandle E_NAME comment "This is a CAS name in 1995 revision. The IUPAC name, or any previous or later CAS revision name, looks completely different."

The first line retrieves the computation parameters of the property E_GIF as keyword/value pairs. These are read into the array variable gifparams , and may subsequently be accessed as $gifparams(format) , $gifparams(height) , etc. The second example shows how to attach a comment to a property value.

ens min

ens min ehandle propertylist ?filterset?
e.min(property=,?filters=?)

Get the minimum values of the properties named in the propertylist parameter. The return value of the command is a list of the minimum property values. The objects whose property values are used for the determination of the minimum values may optionally be filtered by a standard filter set. If no objects pass the filter, the result is an empty string.

Example:

ens min $ehandle A_FORMAL_CHARGE xatom

gets the lowest value of the formal charge of a hetero atom in the ensemble.

ens mols

ens mols ehandle ?filterset? ?filtermode?
e.mols(?filters=?,?mode=?)

Standard cross-referencing command to obtain the label(s) of the molecules the ensemble contains as minor objects. This is explained in more detail in the section about object cross-references.

Examples:

ens mols $ehandle
ens mols $ehandle heterocycle

The first example simply returns a list of the labels of the molecules the ensemble contains as minor objects. Note that it is possible that there is more than one molecule in the ensemble - this is the reason why the command name is mols , not mol . The second example returns the molecule label(s) of all the molecules in the ensemble which contain one or more heterocycles. If there are no such molecules, an empty list is returned.

ens move

ens move ehandle ?datasethandle|remotehandle? ?position?
e.move(?target=?,?position=?)

Make the ensemble a member of a dataset, or remove it from a dataset. If the dataset handle or reference parameter is omitted, or is an empty string, or None for Python , the object is removed from its current dataset. The dataset handle or reference may be the name of a remote dataset for moving objects over a network connection.

If a target dataset handle or reference is specified, the ensemble is added to the dataset, if allowed by the acceptance bits of the dataset, and removed from any dataset it was a member of before the execution of the command. By default the ensemble is added to the end of the dataset object list, but the final optional parameter allows the specification of an object list index. The first position is index zero. If the parameter value end is used, or the index is bigger than the current number of dataset objects minus one, the ensemble is appended as per the default. It is legal to use this command for moving ensembles within the same dataset.

Another special position value is random or rnd . This value moves the object to a random position in the dataset. Using this mode with remote datasets is currently not supported.

The dataset handle cannot be a transient dataset.

The return value of the command is the dataset of the object prior to the move operation. It is either a dataset handle/reference, or an empty string ( Tcl ) or None ( Python ) if it was not a member of a dataset.

This command interacts with the insert control mechanism of size-constrained datasets. More information is provided in the description of the sizecontrol dataset parameter.

Examples:

ens move $ehandle $dhandle 0
ens move $ehandle

In the first example, the ensemble is inserted as the first element in a dataset. The second line reverts this operation and removes the ensemble from the dataset.

This command can be used with a remote dataset descriptor. In that case, the ensemble is packed into a serialized object representation, transmitted over the network and restored as member of the remote dataset at the specified position. The local ensemble is deleted if the transfer succeeds.

Example:

ens move $ehandle blockbuster@server2:9998 end

This command moves the ensemble to the dataset which was set up as listener on port 9998 and pass phrase blockbuster on host server2 . The local ensemble is deleted, and its copy is inserted at the end of the remote dataset.
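The position rules for local datasets can be summarized in a short pure-Python model that uses an ordinary list in place of a dataset. The function insert_at is hypothetical and not part of the toolkit. Negative indices are not covered by the description, so the model simply rejects them.

```python
import random

def insert_at(objects, item, position="end"):
    """Insert item into the list 'objects' following the described rules."""
    if position in ("random", "rnd"):
        position = random.randint(0, len(objects))
    if position == "end" or position > len(objects) - 1:
        # 'end' and any index past the last element both append.
        objects.append(item)
    elif position < 0:
        raise ValueError("negative positions are not modeled")
    else:
        objects.insert(position, item)
    return objects

dataset = ["e1", "e2", "e3"]
insert_at(dataset, "first", 0)
insert_at(dataset, "far", 99)
insert_at(dataset, "last")
print(dataset)  # ['first', 'e1', 'e2', 'e3', 'far', 'last']
```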

ens mutex

ens mutex ehandle mode
e.mutex(mode)

Manipulate the object mutex.

During the execution of a script command, the mutex of the major object(s) associated with the command are automatically locked and unlocked, so that the operation of the command is thread-safe. This applies to toolkit builds that support multi-threading, either by allowing multiple parallel script interpreters in separate threads or by supporting helper threads for the acceleration of command execution or background information processing.

Going beyond this automatic per-statement protection, this command locks major objects for a period of time that exceeds a single command. A lock on the object can only be released from the same interpreter thread that set the lock. Any other threaded interpreters, or auxiliary threads, block until a mutex release command has been executed when accessing a locked command object. This command supports the following modes:

There is no trylock command variant because the command already needs to be able to acquire a transient object mutex lock for its execution.

The command returns the current lock count.

ens need

ens need ehandle propertylist ?mode? ?parameterdict?
e.need(property=,?mode=?,?parameters=?)

Standard command for the computation of property data, without immediate retrieval of results. This command is explained in more detail in the section about retrieving property data.

The return value is the original ensemble handle or reference.

Examples:

ens need $ehandle A_XY recalc
ens need $ehandle E_EINECS_ID threaded

ens new

ens new ehandle propertylist ?filterset? ?parameterdict?
e.new(property=,?filters=?,?parameters=?)
Ens.New(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The difference between ens get and ens new is that the latter forces the re-computation of the property data, regardless whether it is present and valid, or not.

The Python class method is a one-shot command. The transient ensemble created from the initialization items is automatically deleted when the command finishes.

ens nget

ens nget ehandle propertylist ?filterset? ?parameterdict?
e.nget(property=,?filters=?,?parameters=?)
Ens.Nget(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The difference between ens get and ens nget is that the latter returns numeric data, even if symbolic names for the values are available.

The Python class method is a one-shot command. The transient ensemble created from the initialization items is automatically deleted when the command finishes.

ens nnew

ens nnew ehandle propertylist ?filterset? ?parameterdict?
e.nnew(property=,?filters=?,?parameters=?)
Ens.Nnew(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data and attributes. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The difference between ens get and ens nnew is that the latter always returns numeric data, even if symbolic names for the values are available, and that property data re-computation is enforced.

The Python class method is a one-shot command. The transient ensemble created from the initialization items is automatically deleted when the command finishes.

ens nitrostyle

ens nitrostyle ehandle style
e.nitrostyle(style=)

Change the internal encoding of nitro groups and similar functional groups in the ensemble. Possible values for the style parameter are:

The command returns the original ensemble handle or reference.

ens op2d

ens op2d ehandle mode ?atomfilter_bit/degrees?
e.op2d(mode=,?atomfilter=?)

Perform various operations on the standard 2D layout coordinates of the structure (property A_XY ). Properties tightly connected to A_XY are also updated (most notably, B_FLAGS to keep wedges in sync with stereochemistry defined in other properties).

In mode rotate , the optional argument is the rotation angle in degrees. If it is not specified, the default is 30 degrees.

For alignment and flipping operations, the atoms which are used to determine the orientation can be filtered by specifying one or more value bits of property A_FLAGS . Only atoms where one or more of these bits are set in A_FLAGS are used for computing the alignment (in modes xalign , yalign , xyalign - all atoms are moved) or are flipped (modes hflip , vflip - unselected atoms are not moved). If no bit filter values are specified, or none is used, all ensemble atoms and bonds are processed.

The following modes are supported:

Additionally, the mode argument may be an ensemble handle or reference. In that case, it is interpreted as a substructure, matched onto the ensemble, and if a match is found, the 2D coordinates of the ensemble atoms are adjusted by scaling and rotation for maximum overlap between the 2D coordinates of the substructure and the matched part of the ensemble. This mode retains the relative positions of the matched atoms - this is not a full redraw operation around a match template.

The command returns 0 (nothing done) or 1 (coordinates changed).

ens pack

ens pack ehandle ?maxsize? ?requestprops? ?suppressedprops? ?compressionlib?
e.pack(?maxsize=?,?requestprops=?,?suppressedprops=?,?compressionlib=?)

Pack the ensemble object into a base64-encoded compressed serialized object string. This string does not contain any non-printable characters and is a full dump of the internal state of the object, omitting only property data that was declared to be so easily re-computed that a dump is not worthwhile. Outside object relationship information, such as the dataset the ensemble might be a member of, or tables the ensemble is associated with, is not included.

The maximum size of the object string (default -1, meaning unlimited) can be configured by the optional maxsize parameter. The size is specified in bytes. If the pack string would be longer than the maximum size, an error results.

The two optional property list parameters make it possible to request that specific properties be part of the package, even if they normally would not be included, and to explicitly omit properties from the dump. No property computation is performed, and suppressed properties are not purged from the source ensemble.

Ensembles can be restored from a packed object string by the ens unpack and ens create commands.

The ensemble object and its minor objects are unchanged after using this command.

The default compression library is zlib . Other useful variants include lzo and gzip (and there are other internal types), but these may not be available on all builds due to license issues, and the compression library then also needs to be specified when the packed string is unpacked. It is generally recommended to stay with zlib .

The return value of this command is the packed string.

In Python , ensembles support the standard pickle / unpickle protocol.

Example:

set dbstring [ens pack [ens create CC=O]]
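The general idea of a printable, compressed object string with a size limit can be illustrated with the Python standard library alone. The sketch below is not the toolkit's pack format, which is not specified here. It serializes a toy dictionary with JSON, compresses it with zlib and encodes it as base64. The functions pack and unpack and the toy object are assumptions made for the illustration.

```python
import base64
import json
import zlib

def pack(obj, maxsize=-1):
    """Serialize, compress and base64-encode obj; enforce an optional size limit."""
    raw = json.dumps(obj, sort_keys=True).encode("utf-8")
    packed = base64.b64encode(zlib.compress(raw)).decode("ascii")
    if maxsize >= 0 and len(packed) > maxsize:
        # Mirrors the described behavior: an oversized string is an error.
        raise ValueError(f"packed size {len(packed)} exceeds limit {maxsize}")
    return packed

def unpack(packed):
    """Reverse the three steps of pack()."""
    raw = zlib.decompress(base64.b64decode(packed))
    return json.loads(raw.decode("utf-8"))

toy = {"smiles": "CC=O", "atoms": ["C", "C", "O"]}
packed = pack(toy)
print(packed.isascii(), unpack(packed) == toy)  # True True
```

The source object is untouched by packing, and the same compression scheme has to be used in both directions for the round trip to succeed.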

ens pis

ens pis ehandle ?filterset? ?filtermode?
e.pis(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the π systems the ensemble contains. This is explained in more detail in the section about object cross-references.

Examples:

ens pis $ehandle

π systems are a rather exotic feature and not commonly used. These are essentially descriptions of bonding interactions which use p or d orbitals, such as in standard covalent multiple bonds. A simple double bond is described with one σ system and one π system in this representation.

ens prepare

ens prepare ehandle molfilehandle
e.prepare(molfileref)

Prepare the ensemble for output via the specified file handle, for example by pre-computing properties that are needed for output. This has only an effect if the I/O module for the format of the file handle provides an output object preparation function, which is currently only the case for the BDB database format. The output of prepared and unprepared ensembles sent to the same file handle is indistinguishable.

The purpose of this command is to allow the preparation of the ensembles for output in a separate thread. For unprepared ensembles, a significant part of the time to write the record may be spent in computing required data. During this time, the file handle is blocked. Prepared ensembles already contain all required data, and are thus faster to write to file. The total time required in single-thread scripts for a simple molfile write command vs. an ens prepare plus molfile write combo is not much different. However, these operations are largely independent, and in multi-threaded scripts the total time savings can be significant if the two commands are executed in different threads.

The command returns the molfile handle or reference.

ens properties

ens properties ehandle ?pattern? ?noempty?
e.properties(?pattern=?,?noempty=?)

Get a list of valid properties of the ensemble and its minor objects. Property subsets may be selected by a non-empty filter pattern, which the property names must match in order to be listed. If the ensemble is a member of a reaction, reaction properties are included in the list. The same mechanism is used for dataset properties.

If the noempty flag is set, only properties where at least one data element controlled by the ensemble (i.e. a value for an atom of the ensemble, etc.) is not the property default value are output. By default, the filter pattern is an empty string, and the noempty flag is not set.

This command may also be invoked as ens props or e.props() .

Example:

ens properties $ehandle X_*
ens props $ehandle

The first example returns a list of the currently valid reaction properties of the reaction the ensemble is a member of, or an empty list if it is not. The second example lists all properties, including those of the ensemble proper, its minor objects such as atoms and bonds, and possibly of the reaction the ensemble is a member of, if it is a reaction ensemble.

ens purge

ens purge ehandle propertylist/objectclass/specialname ?emptyonly?
e.purge(?properties=?,?emptyonly=?)

Delete property data from the ensemble. The properties may either be properties of a reaction the ensemble is a member of (prefix X_ ), properties of a dataset the ensemble is a member of (prefix D_ ), or properties of the ensemble proper and its minor objects, such as ensemble or atom properties. If a property marked for deletion is not present, it is silently ignored.

If an object class name, such as ens or atom , is used instead of a property name, all properties of that class set on the ensemble are deleted, if they are not locked, or filtered out by the optional empty-only flag.

Setting the optional boolean flag emptyonly restricts the deletion to those properties where all the values for a property associated with a major object (such as on all atoms in an ensemble for atom properties, or just the single ensemble property value for ensemble properties) are set to the default property value.

Besides normal property names, a few convenient special names for common property deletion tasks are defined and can be used as a replacement for the property list. These include:

Examples:

ens purge $ehandle X_IDENT
ens purge $ehandle E_IDENT 1
ens purge $ehandle stereochemistry

The first example deletes the property data X_IDENT from the reaction the ensemble is a member of - provided it actually is a reaction ensemble. The second example deletes property E_IDENT from the ensemble if the property value is equal to the default value for E_IDENT . The last example removes all stereochemistry information from the ensemble.

The command returns the original ensemble handle or reference.

ens reaction

ens reaction ehandle ?filterlist?
e.reaction(?filters=?)

Return the handle or reference of the reaction the ensemble is a member of. Optionally, the reaction may be filtered by a simple filter list. If the ensemble is not part of a reaction, or does not pass the filter, an empty string is returned for Tcl , and None for Python .

Because an ensemble can only participate in a single reaction, the command is spelled ens reaction in singular.

Example:

ens reaction $ehandle

ens rebuild

ens rebuild ehandle ?minor_objectclass?
e.rebuild(?objectclass=?)

This command discards all minor objects and attached property data of a specific class associated with the ensemble. Afterwards, the minor object set is re-populated by the standard set-up function of the object class, if such a set-up function is defined.

If no minor object class is specified, bonds are regenerated - for example from 3D atomic coordinates. Bonds , molecules ( mols ), sigma and pi systems ( sigmas , pis ), rings and ring systems ( rings , ringsystems ) can all be rebuilt. However, by default no reconstruction function is defined for groups and surface patches ( surfaces ), although it is possible to set one via the object class manipulation command.

Generally, object sets should only be regenerated under exceptional circumstances, for example in order to undo a manual manipulation. Object sets are automatically generated when they are required - for example, bonds are automatically derived from atomic 3D coordinates if any property data associated with bonds is used in any context, and the ensemble so far did not contain bond information. An explicit request to generate connectivity is rarely needed.

Under normal circumstances, the use of minor object information such as bonds encoded explicitly in an input file is preferable to indirectly derived sets, such as regenerated connectivity. The connectivity algorithm of the toolkit is rather capable, but has its limitations, especially when hydrogen-depleted charged structures are encountered.

Files encoded in a few notorious structure file formats, such as PDB , may contain an incomplete bond set - without any indication that the bond set is incomplete. The PDB input routine tries to detect this, and automatically augments the bond set if obvious deficiencies are found. However, in case of minor omissions in the input data, a PDB structure may be one of the rare cases when an explicit request for a rebuild of the bond set can be helpful.

Besides the set of ensemble minor objects, the pseudo object class aro is also recognized. This keyword triggers a re-evaluation of aromatic systems and re-assigns Kekulé bond orders, but does not completely redo the bond set.

Example:

ens rebuild $ehandle bonds

This command discards the old bond set, and generates a new one. This only works if there is information which can be used for regeneration, such as atomic 3D coordinates. If no such information is present, the loss of bonds is irreversible and the ensemble useless for almost all applications short of a simulated plasma torch atomization.

The command returns the original ensemble handle or reference.

ens ref

Ens.Ref(identifier)

Python-only method to get an ensemble reference from a string handle or another identifier. For ensembles, other recognized identifiers are ensemble references, or integers encoding the numeric part of the handle string.

ens rename

ens rename ehandle srcproperty dstproperty
e.rename(srcproperty=,dstproperty=)

This is a variant of the ens assign command. Please refer to the description of that command.

ens replace

ens replace ehandle property/enshandle/emptystring ?preserved_propertylist/all?
e.replace(source=,?keep=?)

Substitute the ensemble with a structure decoded from data held in an ensemble property of that ensemble, or with the structure and associated data of another ensemble identified by its handle.

The original handle of the command ensemble is always preserved. The original structure data, with the exception of explicitly saved properties, is discarded. If the structure source argument is an ensemble handle, that ensemble is deleted.

For convenience, the replacement data argument may also be an empty string, which results in a no-op.

If the replacement argument is a property name, the exact type of operation depends on the data type of the property. The following data types are currently supported:

Any other property data type, NULL values of the property, non-ensemble properties, or malformed data result in an error and the original structure remains unchanged.

The structure source property data does not become a property of the updated ensemble. In that ensemble, by default all other ensemble properties of the original are also purged, and all ensemble properties of the replacement structure are retained. However, by specifying a list of properties to be transferred, or using the special argument all, all or a subset of the ensemble property data of the original ensemble can be transferred to the replacement structure and thus saved. Under these circumstances, property data from the original ensemble has precedence and overwrites existing values of the same property on the replacement ensemble. However, all ensemble property data on the replacement ensemble which are not overwritten remain present in the updated ensemble. It is not possible to transfer atom, bond, or any other ensemble minor object property data to the replacement structure directly with this command.
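The precedence rules described above can be modelled with plain dictionaries. The following is only an illustrative sketch in pure Python, not toolkit code: the function name surviving_properties and its arguments are hypothetical, property values are arbitrary placeholders, and it assumes that the structure source property is dropped even when all is requested.

```python
def surviving_properties(original, replacement, keep=(), source_property=None):
    """Model which ensemble property values exist after a replacement.

    original, replacement: dicts mapping property name -> value.
    keep: iterable of property names to carry over, or the string "all".
    source_property: name of the property the structure was decoded from,
                     or None if the source was another ensemble.
    """
    kept_names = list(original) if keep == "all" else list(keep)
    # All ensemble properties of the replacement structure are retained ...
    result = dict(replacement)
    for name in kept_names:
        # Assumption: the structure source property is never carried over,
        # and names that are not set on the original are simply skipped.
        if name == source_property or name not in original:
            continue
        # ... but saved data from the original has precedence.
        result[name] = original[name]
    return result
```

With keep left empty, the result is simply the property set of the replacement structure, which corresponds to the default behaviour of the command.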

The command returns the original, unchanged ensemble handle or reference.

Examples:

ens replace $eh E_CANONIC_TAUTOMER [list E_IDENT E_NAME]

This command replaces the current structure with its canonic tautomer. The values of properties E_IDENT and E_NAME from the original ensemble are kept in the updated form, all other ensemble property data of the original is discarded.

ens replace $eh $ehnew

Replace the structure with the one in $ehnew . The second ensemble is destroyed in the process.

ens replicate

ens replicate ehandle ?count?
e.replicate(?count=?)

This command duplicates all molecules in the ensemble and appends them to the atom, bond and other minor object lists of the ensemble.

The default replication count is one, but any other number of duplications may be chosen by an appropriate count parameter. If the count is less than one, the command is silently ignored.

The command returns the original ensemble handle or reference. As part of the integration step, merge property invalidation events are generated.

The ens dup command generates a new ensemble, while this command expands the current ensemble.

Example:

echo [ens get [ens replicate [ens create C.CC]] E_SMILES]

This prints C.CC.C.CC as result SMILES string, because both molecules in the original ensemble were duplicated and appended to the existing ensemble data.

ens rings

ens rings ehandle ?filterset? ?filtermode?
e.rings(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the rings the ensemble contains. This is explained in more detail in the section about object cross-references.

Examples:

ens rings $ehandle
ens rings $ehandle [list heterocycle aroring]

The first example returns the labels of all rings the ensemble contains. If the ensemble does not contain any rings, an empty list is returned. Only labels of rings in the SSSR or ESSSR set are returned, even if the currently configured ring set is larger. The second example filters the rings - only heteroaromatic rings are reported.

ens ringsystems

ens ringsystems ehandle ?filterset? ?filtermode?
e.ringsystems(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the ring systems the ensemble contains. This is explained in more detail in the section about object cross-references.

Examples:

ens ringsystems $ehandle
ens ringsystems $ehandle [list heterocycle aroring]

The first example returns the labels of all ring systems the ensemble contains. If the ensemble does not contain any ring systems, an empty list is returned. The second example filters the ring systems - a ring system label is included in the output list only if that ring system contains one or more heteroaromatic rings.

ens rotate

ens rotate ehandle angle axis ?center? ?property?
e.rotate(angle=,axis=,?center=?,?coordinateproperty=?)

Rotate the ensemble in 3D space by manipulating property A_XYZ , or a custom atom float vector coordinate property.

The angle argument is a floating-point number in degrees. The axis argument is a 3D vector in standard notation, i.e. usually a list/tuple of three floating-point numbers for the x, y and z components. If the optional center argument is omitted, the center of rotation is the unweighted 3D coordinate average of all ensemble atoms with valid 3D coordinates, which is computed as property E_CENTER. If the center argument is specified, it is expected to be a 3D point which is used as center of rotation instead.

This operation triggers a 3dglop property invalidation event.

The command returns the original ensemble handle or reference.

Example:

ens rotate $eh 60 {0 0 1}

Rotate the ensemble 60 degrees counterclockwise around the z axis.
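The geometry behind this operation can be illustrated without the toolkit. The following is a minimal pure-Python sketch, not the toolkit implementation: the helper names centroid and rotate_points are hypothetical, and the code assumes the usual right-handed convention in which a positive angle turns counterclockwise when looking down the axis toward its origin. The default center mirrors the unweighted coordinate average described above.

```python
import math

def centroid(points):
    """Unweighted average of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def rotate_points(points, angle_deg, axis, center=None):
    """Rotate points by angle_deg around an axis through center.

    Uses Rodrigues' rotation formula:
    v' = v cos(t) + (k x v) sin(t) + k (k . v) (1 - cos(t))
    """
    if center is None:
        center = centroid(points)
    norm = math.sqrt(sum(c * c for c in axis))
    if norm == 0.0:
        raise ValueError("axis must not be the null vector")
    kx, ky, kz = (c / norm for c in axis)
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = []
    for p in points:
        # Shift so that the rotation center becomes the origin.
        vx, vy, vz = (p[i] - center[i] for i in range(3))
        # Cross product k x v and dot product k . v.
        cx = ky * vz - kz * vy
        cy = kz * vx - kx * vz
        cz = kx * vy - ky * vx
        dot = kx * vx + ky * vy + kz * vz
        rx = vx * cos_t + cx * sin_t + kx * dot * (1.0 - cos_t)
        ry = vy * cos_t + cy * sin_t + ky * dot * (1.0 - cos_t)
        rz = vz * cos_t + cz * sin_t + kz * dot * (1.0 - cos_t)
        # Shift back to the original frame.
        rotated.append((rx + center[0], ry + center[1], rz + center[2]))
    return rotated
```

Calling rotate_points(coords, 60, (0, 0, 1)) on a list of coordinate tuples corresponds, under the stated assumptions, to the Tcl example above.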

ens scan

ens scan ehandle expression/queryhandle ?mode? ?parameterdict?
e.scan(query=,?resultmode=?,?parameters=?)

Perform a query on the ensemble object. The syntax of the query expression and the optional selection list is the same as that of the dataset scan command with a transient dataset consisting of the current ensemble only. For more details, please refer to the paragraphs on dataset scan and molfile scan .

The return value depends on the mode. The default query mode is exists, which differs from the default in molfile scan.

ens set

ens set ehandle ?property value?...
e.set(property,value,...)
e.set({property:value,...})
e.property = value
e[property] = value

Standard data manipulation command for setting property data. It is explained in more detail in the section about setting property data.

Example:

ens set $ehandle E_NAME "Pharmacon X-25"

ens setparam

ens setparam ehandle property ?key value?...
ens setparam ehandle property dictionary
e.setparam(property,?key,value?...)
e.setparam(property,dict)

Set or update a property computation parameter in the metadata parameter list of a valid property. This command is described in the section about retrieving property data. The current settings of the computation parameters in the property definition are not changed.

The return value is the updated property computation parameter dictionary.

Example:

ens setparam $ehandle E_GIF comment "Top Secret Lead Structure"

ens setup

ens setup ehandle ?minorobjclass?
e.setup(?objectclass=?)

Query the status of the minor object lists in the ensemble, or initialize one of these to an empty list.

If no class is specified, a dictionary with all currently registered minor object classes of the ensemble is returned. The object class names are the key, the value is a boolean flag for the status.

If an object class argument is supplied, the object class is instantiated on the ensemble, if necessary by auto-loading an object class handler module. Unknown object class names result in an error. If the minor object class is already instantiated, it is not changed. Otherwise, an empty minor object set is added. This is even the case if the minor object class handler provides a default object setup function (see ens rebuild command). Instantiating an object class with this command always creates an empty collection of the minor objects associated with the ensemble.

Minor object lists are usually implicitly instantiated, as in

ens get $eh M_LABEL

which automatically sets up the molecule/fragment object set if it is not yet present, and populates it with objects identifying disconnected fragments in the ensemble, or

group create $eh [list $a1 $a2 $a3]

which adds a group to the ensemble, again automatically initializing the group object set if it was not initialized.

The ens setup command is intended for special circumstances and not commonly used.

ens show

ens show ehandle propertylist ?filterset? ?parameterdict?
e.show(property=,?filters=?,?parameters=?)
Ens.Show(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The difference between ens get and ens show is that the latter does not attempt computation of property data, but raises an error if the data is not present and valid. For data already present, ens get and ens show are equivalent.

The Python class method is a one-shot command. The transient dataset created from the initialization items is automatically deleted when the command finishes.

ens sigmas

ens sigmas ehandle ?filterset? ?filtermode?
e.sigmas(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the σ systems the ensemble contains. This is explained in more detail in the section about object cross-references.

Example:

ens sigmas $ehandle

σ systems are a rather exotic feature and not commonly used. These are essentially descriptions of bonding interactions which use s orbitals, such as normal, covalent single bonds, or the central bond in multiple bonds. A simple double bond is described with one σ system and one π system in this representation.

ens sort

ens sort ehandle ?sort_property? ?relabel? ?duplicate? ?datasethandle? ?position?
e.sort(?property=?,?relabel=?,?duplicate=?,?target=?,?position=?)

Sort the atoms in an ensemble according to a property value. The default property is A_LABEL , the standard atom label. The first optional argument can be used to sort on a different property, or a property field. However, the property must be either an atom property, or a molecule property. If the relabel flag is set, the ensemble atoms and molecules are renumbered after the sort in ascending order, starting with one. By default, atoms and molecules retain their original labels even if they change positions. If the duplicate flag is set, the sort operation works on a duplicate of the original ensemble. If the flag is unset, or the argument omitted, the operation modifies the original ensemble object.

The final two optional arguments allow the direct transfer of the modified ensemble or duplicate into a dataset, similar to an ens move command. The ensemble may be inserted into a specific position of a target dataset. If the special value end is used, or the zero-based position index is beyond the current end of the target dataset, the ensemble is simply appended. By default the ensemble is not moved, and if it is moved without an explicit position, it is appended.

The sequence of the atoms in the ensemble is rearranged so that the atoms are in ascending order of the values of the sort property or property field. Indirectly, molecules are also rearranged to correspond to the sequence of the first atoms in every molecule. This operation triggers a shuffle property invalidation event. If the renumbering option is selected, the atom and molecule sets are re-labeled with their standard label properties (i.e. A_LABEL for atoms, M_LABEL for molecules) in ascending order, starting with one. Other minor object collections remain in their original sequence and retain their current labels. Certain important properties which, if present, are dependent on atom label values, notably A_LABEL_STEREO , B_LABEL_STEREO and B_FLAGS , are specifically adjusted to the new labeling scheme instead of being invalidated.

The command returns an ensemble handle or reference. If the operation was operating on a duplicate, it is the handle or reference of the new ensemble, otherwise that of the original ensemble.
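The effect of the relabel flag on atom order and labels can be sketched with a small pure-Python model. This is not toolkit code: the function sort_atoms and the dictionary representation of atoms are hypothetical, and the sketch assumes a stable sort, which the text above does not state. It always works on copies, which loosely corresponds to the duplicate flag being set.

```python
def sort_atoms(atoms, key="label", relabel=False):
    """Return a sorted copy of a list of atom records.

    atoms:   list of dicts, each with a 'label' entry plus other values.
    key:     dict key holding the sort property value.
    relabel: if True, renumber labels in ascending order starting with one.
    """
    # Copy every record so that the input list stays untouched.
    ordered = sorted((dict(a) for a in atoms), key=lambda a: a[key])
    if relabel:
        for new_label, atom in enumerate(ordered, start=1):
            atom["label"] = new_label
    return ordered
```

Without relabel, the atoms change position but keep their original labels. With relabel, the labels follow the new sequence.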

ens split

ens split ehandle ?minsize? ?splitproperty?
e.split(?minsize=?,?splitproperty=?)

Split the molecules of the ensemble into individual ensembles. The return value is a list of the handles or references of the new ensembles. If the original structure contains only a single fragment, the result is the same as a simple ens dup command. The split structures do not become a member of a reaction or dataset, even if the original structure is.

The optional minsize parameter is a minimum value for the number of heavy atoms (property M_HEAVY_ATOM_COUNT) in the molecules. If this is not an empty string, molecules which have fewer heavy atoms than the minimum are not duplicated. If all molecules in the input ensemble are smaller than the required size, an empty list is returned.

The optional splitproperty argument can be used to split the ensemble on values of a molecule property, which needs to be either already set or computable, instead of simply separating fragments on connectivity. All molecules in the input ensemble which have a common value of this property are put into a joint result ensemble, and each distinct split property value starts a new result ensemble. Molecules with a common property value do not need to be present in the input ensemble in a consecutive sequence, nor are there any special requirements for the data type or value range of the split property, as long as the data type has a comparison function. If the values of the split property are distinct over all molecules in the input ensemble, the outcome of the command is indistinguishable from running it without any split property.

Examples:

lassign [ens split [ens create "CC.CC"]] eh1 eh2

This example creates an ensemble with two ethane molecules, splits it, and assigns the two new ensemble handles to variables eh1 and eh2 .

set elist [ens split $eh {} M_REACTION_LABEL]

Split ensemble along the original reagent or product data blocks found in an RXN or RDF file.
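The grouping rules for minsize and splitproperty can be summarized in a short pure-Python model. This is an illustrative sketch only: the function split_molecules and the dictionary representation of molecules are hypothetical, the split values are assumed to be hashable, and the order of the result groups (first occurrence of each value) is an assumption not stated in the text above.

```python
def split_molecules(molecules, minsize=None, splitkey=None):
    """Group molecule records into result ensembles (lists of records).

    molecules: list of dicts, each with a 'heavy_atoms' count.
    minsize:   skip molecules with fewer heavy atoms; None disables the check.
    splitkey:  dict key to group on; None puts every molecule in its own group.
    """
    groups = {}
    for index, mol in enumerate(molecules):
        if minsize is not None and mol["heavy_atoms"] < minsize:
            continue  # too small, not copied into any result
        # Without a split property, every molecule gets a unique key.
        key = index if splitkey is None else mol[splitkey]
        groups.setdefault(key, []).append(mol)
    return list(groups.values())
```

If every molecule has a distinct value for the split key, the groups are the same as those obtained without a split key, which matches the statement above.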

ens sqldget

ens sqldget ehandle propertylist ?filterset? ?parameterdict?
e.sqldget(property=,?filters=?,?parameters=?)
Ens.Sqldget(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The differences between ens get and ens sqldget are that the latter does not attempt computation of property data, but initializes the property value to the default and returns that default, if the data is not present and valid; and that the SQL command variant formats the data as SQL values rather than for Tcl or Python script processing.

The Python class method is a one-shot command. The transient dataset created from the initialization items is automatically deleted when the command finishes.

ens sqlget

ens sqlget ehandle propertylist ?filterset? ?parameterdict?
e.sqlget(property=,?filters=?,?parameters=?)
Ens.Sqlget(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The difference between ens get and ens sqlget is that the SQL command variant formats the data as SQL values rather than for Tcl or Python script processing.

The Python class method is a one-shot command. The transient dataset created from the initialization items is automatically deleted when the command finishes.

ens sqlnew

ens sqlnew ehandle propertylist ?filterset? ?parameterdict?
e.sqlnew(property=,?filters=?,?parameters=?)
Ens.Sqlnew(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The differences between ens get and ens sqlnew are that the latter forces re-computation of the property data, and that the SQL command variant formats the data as SQL values rather than for Tcl or Python script processing.

The Python class method is a one-shot command. The transient dataset created from the initialization items is automatically deleted when the command finishes.

ens sqlshow

ens sqlshow ehandle propertylist ?filterset? ?parameterdict?
e.sqlshow(property=,?filters=?,?parameters=?)
Ens.Sqlshow(data,property=,?filters=?,?parameters=?)

Standard data manipulation command for reading object data. It is explained in more detail in the section about retrieving property data.

For examples, see the ens get command. The differences between ens get and ens sqlshow are that the latter does not attempt computation of property data, but raises an error if the data is not present and valid, and that the SQL command variant formats the data as SQL values rather than for Tcl or Python script processing.

The Python class method is a one-shot command. The transient dataset created from the initialization items is automatically deleted when the command finishes.

ens subcommands

ens subcommands
dir(Ens)

Lists all subcommands of the ens command. Note that this command does not require an ensemble handle.

ens surfaces

ens surfaces ehandle ?filterset? ?filtermode?
e.surfaces(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of surface patches the ensemble contains. This is explained in more detail in the section about object cross-references.

Example:

ens surfaces $ehandle carbon

This example lists all surface patches which are associated with carbon atoms. Surface patches associated with other atoms, or with no atoms, are not listed.

ens swapin

ens swapin ehandle
e.swapin()

Swap an ensemble from the disk store fully back into memory, and disable further automatic loading and shelving. If the ensemble was not swapped out, the command does nothing.

The command returns the original ensemble handle or reference.

ens swapout

ens swapout ehandle
e.swapout()

Remove most of the ensemble data from memory and store it in a temporary disk store. The ensemble handle remains valid. As soon as it is used in a command again after this command has been executed, the swapped ensemble data is automatically reloaded from file, and then stored again when the object lock is released. To disable the automatic swapping of an ensemble, use the ens swapin command.

This command is intended to be used in cases where a large number of ensembles must be kept in memory. Its routine use is not encouraged - it is only useful in case the programmer knows about access patterns. In other cases, the standard virtual memory mechanism of the operating system might yield better performance results.

The ensembles are stored as binary blobs in a key/value store in a process-specific swap directory cactvs%d (where %d is replaced by the process ID), which is created automatically in the standard temporary directory. When an ensemble is deleted, its swap record is also removed, if one was created during the lifetime of the ensemble. When a Cactvs application program exits, the swap store as well as the swap directory are automatically deleted, even without explicit deletion of the last set of ensembles in memory. In case of program crashes, however, the swap directory and its contents may survive. If ensemble swapping is used with unstable applications, the temporary directory should be checked from time to time.

The command returns the original ensemble handle or reference.

Example:

ens swapout $ehandle

ens tables

ens tables ehandle ?filterlist?
e.tables(?filters=?)

Return a list of the handles of all table objects the ensemble is associated with. Optionally, the table set may be filtered by a simple filter list. If the ensemble is not related to any table, or none of these tables passes the filter list, an empty string is returned.

This command is only available if the toolkit was compiled with table support.

Example:

ens tables $ehandle

ens taint

ens taint ehandle propertylist/changeset ?purge?
e.taint(property=,?purge=?)

Issue a property data tainting event which acts on the ensemble data.

If the ensemble is a member of a dataset, the dataset and its objects are not tainted.

The event list may contain any number of the following items:

The command returns the original ensemble handle or reference.

ens torsions

ens torsions ehandle ?filterset? ?filtermode?
e.torsions(?filters=?,?mode=?)

Standard cross-referencing command to obtain the labels or references of the torsion objects the ensemble contains as minor objects. This is explained in more detail in the section about object cross-references.

ens transfer

ens transfer ehandle propertylist ?targethandle? ?targetpropertylist?
e.transfer(properties=,?target=?,?targetproperties=?)

Copy property data from one ensemble to another ensemble or other major object, without going through an intermediate scripting language object representation, or dissociate property data from the ensemble. If a property in the argument property list is not already valid on the source ensemble, an attempt is made to compute it.

If a target object is specified, and a property is not an ensemble but an ensemble minor object property, the number of property-associated minor objects is usually expected to be the same in both ensembles, and they are expected to have the same label set, though it is not required that they are in the same sequence. Property data is assigned to the target ensemble minor objects with the minor object label as reference key. In case of a label set or object count mismatch between the two ensembles, no error is raised. Excess source data items are discarded, and excess target minor objects, or those with unmatched labels, retain their original value if the property was present on the target, or are set to the default value if the property was freshly instantiated. In this command mode, the return value is the handle of the target ensemble. Source and target ensembles cannot be the same object.
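The label-keyed assignment described above can be modelled in a few lines of pure Python. This is a sketch for illustration only, not toolkit code: the function transfer_by_label and its arguments are hypothetical, and property values are represented as plain dictionaries keyed by minor object label.

```python
def transfer_by_label(source_values, target_labels, target_values=None, default=None):
    """Model the transfer of minor object property data between ensembles.

    source_values: dict mapping source object label -> property value.
    target_labels: labels of the minor objects in the target ensemble.
    target_values: existing values on the target (dict label -> value),
                   or None if the property is freshly instantiated there.
    default:       the property default value.
    """
    result = {}
    for label in target_labels:
        if label in source_values:
            # Matching label: the source value is assigned.
            result[label] = source_values[label]
        elif target_values is not None:
            # Unmatched target object keeps its original value.
            result[label] = target_values.get(label, default)
        else:
            # Property was not present on the target: use the default.
            result[label] = default
    # Source items without a matching target label are silently discarded.
    return result
```

The sequence of the target labels does not matter, because every assignment is looked up by label rather than by position.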

If a target property list is given, the data from the source is stored as content of a different property on the target. For this, the data types of the properties must be compatible, and the object class of the target property that of the target object. No attempt is made to convert data of mismatched types. In case of multiple properties, the source property list and the target property list are stepped through in parallel. If there is no target property list, or it is shorter than the source list, unmatched entries are stored as original property values, and this implies that the object class of the source and target objects are the same.

If no target object is specified, or it is spelled as an empty string or Python None, the visible effect of the command is the same as a simple ens get, i.e. the result is the property data value or value list. The property data is then deleted from the source object. In case the data type of the deleted property was a major object (i.e. an ensemble, reaction, table, dataset or network), it is only unlinked from the source object, but not destroyed. This means that the object handles returned by the command can henceforth be used as independent objects. They can be deleted by a normal object deletion command, and are no longer managed by the source object.

Properties which are ensemble minor object properties can only be transferred to another ensemble. Ensemble properties can be moved to other major objects.

Examples:

ens transfer $eh E_EMF_IMAGE $eh2

This copies property E_EMF_IMAGE from the first ensemble to the second. The property data remains valid on the source ensemble.

set ehc [ens transfer $eh E_CANONIC_TAUTOMER]

Get the handle of the canonic tautomer of the source ensemble, and dissociate it from the source ensemble.

ens transform

ens transform ehandle SMIRKSlist ?direction? ?reactionmode? ?selectionmode? ?flags? ?overlapmode? ?{?exclusionmode? excludesslist}? ?maxstructures? ?timeout? ?maxtransforms? ?niterations? ?statusvariable?
e.transform(transforms=,?direction=?,?reactionmode=?,?selectionmode=?,?flags=?,?overlapmode=?,?excludess=?,?maxstructures=?,?timeout=?,?maxtransforms=?,?iterations=?,?statusvariable=?)

This command applies one or more SMIRKS transforms to an ensemble and returns a list of ensemble handles or references of transformation products. The transformation products are filtered for duplicates. The original start structure is never returned - if a transform set does not match at all, an empty list is returned.

The required parameter after the ensemble handle is a list of SMIRKS lines, where each SMIRKS line is itself a list. A SMIRKS line is in the simplest case a simple SMIRKS transform without any extra data, but it may be padded by additional parameters which apply only to the application of that transform. If these optional parameters local to the current transform are not specified, their global counterpart on the command line is used instead. The syntax of an individual SMIRKS line is

SMIRKStransform ?step? ?direction? ?flags? ?overlapmode?

The SMIRKS transform part is the only required list element. It may be provided either as a string in standard Daylight notation, or as a handle of a reaction, which should have been decoded in SMIRKS mode (see reaction create command). Care should be taken to pass SMIRKS strings as proper elements of a list, even if only a single string is used, because they may contain whitespace and naming information after the actual transform code. Example:

ens transform $ehandle [list [list {[C:1][C:2]>>[C:1]=[C:2] Dehydrogenation} 1]]

The string Dehydrogenation is part of the transform specification string and not the transform step. The name string is attached to the (intermediate, in this case) transform reaction object as property X_NAME and can be used to track the reaction history of transform result structures.

The optional step element in a transform line (a positive integer or 0) identifies the reaction step of the transform. Transform sets of different step numbers are isolated from each other and do not interact. Transforms are executed in ascending order of step number. Transforms with different step numbers need not be sorted, and the step numbers neither need to begin with one, nor form an uninterrupted sequence. A step number of 0 disables the transform. The default step number is one. All transforms of the same step number are essentially executed in parallel and may interact with each other.
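The step rules above amount to a simple grouping and ordering scheme, which can be sketched in pure Python. This is a model for illustration, not toolkit code: the function schedule_transforms is hypothetical, and each SMIRKS line is represented as a Python list whose first element is the transform and whose optional second element is the step number.

```python
def schedule_transforms(smirks_lines):
    """Return [(step, [transform, ...]), ...] in ascending step order.

    smirks_lines: list of lists; element 0 is the transform, the optional
                  element 1 is the step number (a non-negative integer).
    """
    by_step = {}
    for line in smirks_lines:
        transform = line[0]
        step = line[1] if len(line) > 1 else 1  # default step number is one
        if step < 0:
            raise ValueError("step must be a positive integer or 0")
        if step == 0:
            continue  # a step number of 0 disables the transform
        by_step.setdefault(step, []).append(transform)
    # Step numbers are unique dict keys, so sorting orders by step only.
    return sorted(by_step.items())
```

Each tuple in the result stands for one group of transforms which share a step number and may therefore interact, while groups with different step numbers are processed separately in ascending order.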

The third and again optional element of transform lines is the direction identifier. It may be either forward, backward, or bidirectional. In forward mode, only the left part of a transform is used for matching, and the matched structure part is modified according to the description on the right side. backward works the other way around, and in bidirectional mode, both sides of the transform scheme are independently matched, and, if the match is successful, transformed to the other side. If this parameter is not specified, or specified as an empty string, the global direction parameter from the command line is substituted.

The fourth and once more optional element of a transform line is a list of flag words. Every word sets an additional flag. Currently, the following flag words are recognized:

The fifth, final, and again optional element of a SMIRKS line is the overlap mode. Again, if this parameter is omitted or supplied as an empty string, the global default from the command line is used. The overlap mode determines whether a transform substructure which consists of multiple disconnected fragments may match onto common target structure atoms or bonds. The following values are supported:

Every SMIRKS line follows the outlined scheme, and all settings within that line are applicable only to the current transform scheme.

There is no general limit on the number of transforms in this command. However, if transforms are combined with exclusion substructures, and these exclusion substructures are to be applied on a per-transform basis (see below), the highest transform index for which an applicability flag can be set is 63. Every transform which is applied in bidirectional fashion, either by global configuration or transform-specific flags, is counted twice toward this limit.

All parameters after the SMIRKS lines list act globally. The third and optional direction parameter, command word number five, sets the default for the directionality of all transforms for which no local override was set in their respective SMIRKS lines. If this parameter is not specified, the default is forward.

The optional reaction mode, parameter four and command word six, does not have a counterpart in the SMIRKS lines. This parameter determines how the possibility of multiple matches of a transform substructure in the target molecules is handled. It can be one of these values:

The default value for the reaction mode is first.

The next optional command parameter, the selection mode (command argument five and command word seven), again has no counterpart in the SMIRKS line parameters. It determines the interaction of transforms of the same step number. All these transforms form a group. This parameter determines which of the transforms from the current group are executed, and in which order. The parameter can be set to one of the following values:

The default selection mode is first.

The next and again optional flags parameter (command argument six, command word eight) defines the default for those transforms which do not possess an override flag set in their SMIRKS line. Note that if a flag set is specified on a SMIRKS line it completely replaces the default flag set. It does not simply add or bit-or more flags compared to the global setting. The default flag set is empty.

Similarly, the overlap mode parameter (command argument seven, command word nine) sets the default for handling potential overlap when matching disconnected transform fragments onto the structure to be transformed. The default setting is none, disallowing any fragment overlap. If a transform consists of only a single fragment in the applicable direction(s), this parameter has no effect.
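The interplay of per-line settings and global defaults can be modelled with a short Python sketch. The helper resolve_line is hypothetical and not part of the toolkit. It assumes that a missing or empty element on the transform line means "use the global default"; for the flag list, the text only states that a specified set replaces the default, so treating an empty flag list as unspecified is an assumption of this sketch. The flag words are taken from elsewhere in this section.

```python
def resolve_line(line, direction="forward", flags=(), overlap="none"):
    """Combine one transform line with the global defaults (illustrative model)."""
    def element(index):
        return line[index] if len(line) > index else None

    return {
        "transform": line[0],
        # Missing step defaults to one; an explicit 0 (disabled) is preserved.
        "step": element(1) if element(1) is not None else 1,
        # Omitted or empty direction: the global direction is substituted.
        "direction": element(2) or direction,
        # A local flag set REPLACES the global set; it is not merged with it.
        # Assumption: an empty local list counts as "not specified".
        "flags": tuple(element(3)) if element(3) else tuple(flags),
        # Omitted or empty overlap mode: the global overlap mode is used.
        "overlap": element(4) or overlap,
    }

plain = resolve_line(["t_a"], flags=("preservecharges",))
local = resolve_line(["t_b", 2, "backward", ["setpathname"]],
                     direction="bidirectional", flags=("preservecharges",))
print(plain["flags"], local["flags"])
```

Note that the second line ends up with setpathname only: the global preservecharges flag is not carried over once a local flag set is given.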

The excludesslist parameter (command parameter eight, command word ten) again has a potentially complex internal structure. It defines exclusion fragments. An exclusion fragment blocks all sections of the target structure from matching any transform substructure, either by preventing the match of transform atoms (the default) or transform bonds. This is a useful feature for example to easily prevent amide groups from matching amino group transforms. The default exclusion substructure list is empty. The parameter is a list. Every list element can be a simple structure identifier, or a list of a structure identifier and a transform index list.

Structure identifiers recognized by this command are:

If the exclusion substructure identifier is not associated with a transform index list, the substructure applies to all transforms. The optional transform index list consists of an arbitrary number of transform indices in the range 0...63. If a transform index list is supplied, the exclusion substructure applies only to the listed transforms. Note that it is not possible to set individual exclusion indices for transforms beyond the 64th, even though it is allowable to use any number of transforms in the transform list. All ensembles, including intermediate result ensembles, are checked against all applicable exclusion structures immediately before the application of a transform is attempted.

The exclusion substructure specification list may be prefixed with a special list element with value (marked)atoms, (marked)bonds, unmarkedatoms or unmarkedbonds. These control how matched substructures are marked in the transform source structure. The default mode is atoms, where excluded atoms are prevented from matching transform pattern atoms. The bonds mode switches this to preventing a bond match. The difference is that in bonds mode, transform pattern atoms can still overlap excluded regions by a single atom, but cannot change bonds therein, while in atoms mode absolutely no atom or bond overlap between excluded regions and transform patterns is allowed. The unmarked variants operate with a reversed exclusion set, i.e. atoms or bonds which are not matched are excluded from the structure region eligible for transform application.

In case the exclusion mode is (marked)atoms or unmarkedatoms, an atom identifier, i.e. any notation which is supported to identify an atom in the atom command, may also be used in addition to the three substructure specification styles listed above to directly exclude a single atom from matching by all transforms. In (marked)bonds or unmarkedbonds marker mode, bond identifications in the same style as supported by the bond command, such as bond labels or bond atom label pairs, are similarly allowed as additional direct bond exclusion specifications, and these again apply to all transforms.

Exclusion markings, once set for the input structure, are inherited by newly generated result structures, so that the protection remains active even for structures undergoing sequences of transformations.

The related dataset transform command does not support direct atom or bond exclusion marking, even if the dataset only contains a single structure.

An example for an exclusion list:

ens transform $eh $tlist ... [list "atoms" {C(=O)[NH2]} {{C[NH]C} {0 1}} 1]

This exclusion set protects amide groups (the first substructure) from all transforms, secondary amines including their immediate carbon neighbor atoms from the first two transforms in the set (index 0 and 1, the transform set is specified in the tlist variable), and the single atom with label 1 in the input ensemble. The exclusion marker mode is explicitly spelled out as atoms in the first exclusion list element, although this is already the default.

Another example:

ens transform $eh $tlist ... [list "unmarkedatoms" {*}$statoms]

This transform only operates on the atoms whose labels or other identifiers are included in the list in variable statoms. All other parts of the structure are excluded and cannot participate in the transform.

The next optional global command parameter (parameter nine, command word eleven) is the maximum number of result ensembles to generate. The input ensemble is not counted. As soon as the maximum is reached, the command finishes and returns the result ensembles which were generated so far. If the maximum number of results is set to a negative number (the default), no limit applies. If it is set to zero, the transform command is effectively disabled. The global control variable ::cactvs(setsize_exceeded) is set to 1 if the specified maximum number of result ensembles was going to be exceeded. At the beginning of the execution of the ens transform command, this control variable is reset to zero. The limit applies to the total of generated unique structures, which is not necessarily the same as the number of output structures in case the processing mode dictates that they are processed further and not included as intermediates in the result set. In the special case of exhaustive transform application, the parameter limits the size of the intermediate result set after each pass, not the overall total of unique structures.

The timeout parameter (command parameter ten, command word twelve) can be used to set a time limit in seconds for the command execution. If this parameter is set to 0 or a negative number, no timeout applies. This is the default. Otherwise, the generation of result ensembles is stopped after the specified time, and the command returns with the results generated so far. The global control variable ::cactvs(interrupted) is set to 1 if a timeout occurs. It is reset to 0 at the beginning of the execution of the command.

The next optional parameter (command parameter eleven, command word thirteen) can be used to limit the number of transforms applied to the starting structure and intermediate structures. If this parameter is not specified, or specified as an empty string or a negative value, no limit is imposed. If this parameter or the timeout option is used, the result set may become dependent on the atom and bond order of the input structure because the traversed part of the possible transform match space is different and might yield different and/or a different number of results when the timeout or application count restriction is triggered.

The second-to-last optional parameter (command parameter twelve, command word fourteen) is an iteration count. Its default value is one, meaning that the whole transformation process is only executed once. If set to a larger value, the transformation routine calls itself recursively. This is equivalent to first running ens transform with a start structure, and then repeatedly executing dataset transform commands for the second and later iterations with the last result set. All limits and other control parameters are passed in the original configuration, and apply only to the next iteration, not globally over the sum of all transform cycles. By default, the result set of this mode is what the last iteration produced, but this can be changed to the union of all iteration results by the keepiterationintermediates flag. Uniqueness checking of result structures is applied to the full return set. If the parameter is set to zero or a negative value, no transformations are executed. If the setpathname flag is set, it is automatically switched to appendpathname for the second and later cycles, so that the name mirrors the full transformation history and is not reset in each cycle.
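The iteration behaviour can be pictured with a toy Python model. Everything here is hypothetical: integers stand in for structures, toy_transform stands in for one complete transform pass, and returning an empty list for a non-positive iteration count is an assumption, since the text only states that no transformations are executed in that case.

```python
def unique(items):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(items))

def run_iterations(start, transform_once, iterations=1, keep_intermediates=False):
    """Feed each result set into the next pass (illustrative model only)."""
    if iterations <= 0:
        return []  # assumption: nothing is executed, nothing is returned
    current, collected = [start], []
    for _ in range(iterations):
        current = unique(transform_once(current))
        collected.extend(current)
    # Default: results of the last pass. With the flag: union of all passes.
    return unique(collected) if keep_intermediates else current

def toy_transform(values):
    # Two toy "transforms" per input: add one, and double.
    return [result for v in values for result in (v + 1, v * 2)]

last_only = run_iterations(1, toy_transform, iterations=2)
with_intermediates = run_iterations(1, toy_transform, iterations=2,
                                    keep_intermediates=True)
print(last_only, with_intermediates)
```

The second call corresponds to the effect described for the keepiterationintermediates flag: the result of the first pass is kept alongside the results of the second.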

The final optional parameter is an array variable name. If it is specified, various statistics about the transform application are collected and stored in that array. Some important array elements are:

Example:

set t1 {{[O,S;X1:1]=[C:2x1][C:3X4][#1:4]>>[#1:4][O,S;X2:1][C:2x1]=[C:3] enol/thioenol}}
set elist [ens transform $eh [list $t1] bidirectional multistep all preservecharges none]

This example is part of a tautomer generator. The full standard generator in the toolkit uses a lengthy list of transform schemes and not just the one sample keto/enol scheme displayed here. Because the operation is bidirectional, the transform converts ketones into enols, and vice versa. If more than one interchangeable group exists, all intermediate structures are generated (multistep reaction mode). All results are retained (all selection mode), and all intermediate structures are again subjected to all transforms (this does not have any effect with a single transform, but the real application uses a set of transforms). Finally, charges should not be changed (preservecharges flag), and fragment overlap is not allowed (none overlap mode). The latter again is without effect in this sample transform, because it does not consist of disconnected fragments on either side.

Multiple structures may be jointly transformed in a single command by means of the very similar dataset transform command.

ens translate

ens translate ehandle pt1 ?pt2? ?property?
e.translate(point1=,?point2=?,?coordinateproperty=?)

Move the atoms of the ensemble by modifying their 3D coordinates in property A_XYZ, or a custom atomic float vector coordinate property. This command requires atomic 3D coordinates and will attempt to compute them if they are not yet present. If no 3D atomic coordinates can be generated, the command fails with an error.

The first argument is interpreted as a 3D vector if this is the only coordinate argument. All atoms with valid 3D coordinates are moved according to the vector coordinates. In case a second argument is supplied, both arguments are interpreted as points in 3D space. The ensemble atoms are moved according to the difference vector between the second and the first point.
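The two argument forms can be sketched in plain Python. This is only a model of the arithmetic, with hypothetical function names; the difference vector is read here as the second point minus the first, which is an interpretation of the wording above. Atoms without valid coordinates are represented by None and left alone.

```python
def translation_vector(pt1, pt2=None):
    """One argument: pt1 is the vector. Two arguments: vector from pt1 to pt2."""
    if pt2 is None:
        return tuple(float(c) for c in pt1)
    # Assumption: "difference between the second and the first point" = pt2 - pt1.
    return tuple(float(b) - float(a) for a, b in zip(pt1, pt2))

def translate(coords, pt1, pt2=None):
    """Return shifted copies of the coordinates; None entries stay None."""
    dx, dy, dz = translation_vector(pt1, pt2)
    return [
        None if xyz is None else (xyz[0] + dx, xyz[1] + dy, xyz[2] + dz)
        for xyz in coords
    ]

atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), None]
moved = translate(atoms, (0, 0, 1))
vector = translation_vector((1, 1, 1), (3, 1, 0))
print(moved, vector)
```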

This operation triggers a 3dglop property invalidation event.

The command returns the original ensemble handle or reference.

Examples:

ens translate $eh {0 0 1}
ens translate $eh [atom get $eh $a1 A_XYZ] [atom get $eh $a2 A_XYZ]

ens trim

ens trim ehandle ?propertylist?
e.trim(?properties=?)

Reduce the information content of a structure to a standard minimum set and discard any additional information. This process minimizes the storage requirements of the ensemble. The properties of the internally defined minimum set are computed if required. The retained property set is designed to support a faithful representation of connectivity including bond and atom labels and types as well as formal charges, stereochemistry, isotopes, 2D and 3D coordinates, but not of auxiliary additional attributes of atoms, bonds or other minor objects.

The optional fourth argument is a list of properties which should be retained in addition to the standard set. If any of these are not present on the ensemble to be trimmed, they are silently ignored and no attempt is made to compute them. Specifying properties of the standard retention set in this list is allowed but has no additional effect.

The return value of the command is a list of the remaining properties of the ensemble.

Example:

ens trim $ehandle {E_GIF E_SMILES}

ens uncharge

ens uncharge ehandle ?filterset? ?flags?
e.uncharge(?filters=?,?flags=?)

Attempt to remove charges on atoms in a chemically sensible way. Charge removal by default happens via addition or removal of protons. In cases where this does not make chemical sense, a direct charge manipulation may be performed instead. Charged metal ions and other charged species without an obvious method for neutralization remain unchanged.

By default all atoms are processed, but the set of processed atoms can be limited by specifying a filter collection. Additional conditions on processed atoms can be set via the flag argument, which accepts the same values as ens hadd . Please refer to that command for a list and explanation of these flags.

The command returns the number of atoms which were neutralized.

Example:

ens uncharge [ens create {[NH3+]CC(=O)[O-]}]

This sample line removes a proton from the charged amino group and adds a proton to the charged carboxyl group of the initial glycine zwitterion. The returned result value is 2. In this example the total hydrogen count has not changed. If the modified positive and negative charge centers are not balanced, the hydrogen count usually does change.

ens unlock

ens unlock ehandle propertylist/objclass/all
e.unlock(property=)

Unlock property data for the ensemble, meaning that they are again under the control of the standard data consistency manager.

The property data to unlock can be selected by providing a list of the following identifiers:

Property data locks are obtained by the ens lock command.

The return value is the original ensemble handle or reference.

Example:

set eh [ens create CCC]
ens lock $eh A_SYMBOL 1
ens purge $eh A_ELEMENT
atom set $eh 1 A_query(dsearch) 3
ens unlock $eh A_SYMBOL

In this example, an ensemble is created, and the atom symbol information is locked. Next, the element number property is deleted, and a query attribute is set. Finally, the lock is released. Had the element symbol information not been locked, the ensemble would have become unusable due to an overzealous data consistency manager. Setting query information in property A_query can have an influence on the atom symbol. So the default action of invalidating A_SYMBOL when manipulating A_query is correct. However, in case there is no element information A_ELEMENT , and no atom symbol information A_SYMBOL , the element information is completely lost, and the ensemble becomes unusable. So in this case, locking A_SYMBOL (or alternatively A_ELEMENT ) is required to avoid unexpected side effects of structure editing.

ens unpack

ens unpack packstring ?compressionlib?
Ens.Unpack(data=,?compressionlib=?)

Unpack a base64-encoded serialized object string which was created by an ens pack command. The return value of this function is the handle of the newly created ensemble object, which is an exact duplicate of the packed original ensemble.

Packed ensembles may also be unpacked by the ens create command.

The default compression library is zlib. For more options, see ens pack.

Example:

set packdata [ens pack [ens create CCCl]]
set ehandle [ens unpack $packdata]

ens valencecheck

ens valencecheck ehandle ?failedatomvariable? ?nitrogenmode?
e.valencecheck(?variable=?,?nitrogenmode=?)

Perform a valence check on the ensemble, comparing the current bonding situation at all atoms to the list of element-specific valence states in the system element table. This command is intentionally quite picky, discouraging for example the use of pentavalent nitrogen by default. For the calculation of valence, only bonds of type normal (valence bonds) are taken into account. Complex bonds and pseudo bond types thus do not interfere in the calculation. Some more exotic metal atoms with many different valence states, or few well-defined covalent compounds, such as vanadium or rhodium, always pass.

The handling of nitrogen in pentavalent or ionic form can be controlled by setting the optional nitrogenmode argument, or modifying the global ::cactvs(nitrogen_valence_check) variable. Possible values are xionic, ionic (the default), asis, pentavalent and xpentavalent. These are the same values as with the ens nitrostyle command. Please refer to that command for more information. In asis mode, both ionic and pentavalent forms pass.

The return value of this command is the number of atoms which failed the valence check. If the optional failedatomvariable argument is specified as a non-empty string, it is the name of a variable which receives a list of the atom labels which failed the check, or is set to an empty list in case no problems were found.

Note that this command assumes that all hydrogen atoms are in place. Processing of structures with implicit hydrogen atoms is not supported.

mol valcheck is a short command alias.

Example:

ens valencecheck [ens create {CN(=O)=O.C[N+](=O)[O-]}] badatoms

This sample command checks the valence situation of nitromethane in two encoding formats. The first molecule, using a pentavalent nitrogen encoding, is responsible for the result value 1, indicating one failed atom, and the variable badatoms is set to 2, the label of the pentavalent nitrogen atom. The second molecule passes the check and reports no additional problems.

ens valcheck is a short alias.

ens valid

ens valid ehandle propertylist
e.valid(property/propertysequence)

Returns a list of boolean values indicating whether values for the named properties are currently set for the ensemble. No attempt at computation is made. For Python, where single-item lists are syntactically not the same as a single value, the return value is a single boolean if the argument was a string or a property reference, and only a single property was decoded.

Example:

ens valid $ehandle E_IDENT

reports whether the ensemble has a standard ID (has a valid E_IDENT property) or not.

ens has is an alias to this command.

ens vector

ens vector ehandle property vectorname ?invert? ?integrate?

Map ensemble property data to a Blt library vector object. Please refer to the Blt manual pages for more information on these. Blt vector objects are very useful, for example, for the efficient set-up of GUI graphing widgets which are provided by the Blt Tk extension. This command automatically attempts to load the Blt Tcl module if necessary. If that fails, an error results.

The vectorized property data must be of a vector type, and the element type of the vector must either be a simple numeric type, or a bit for bitvectors, or a floating-point pair. It is possible to address a property field, for example the X/Y data points of a spectrum which are typically stored as a field in a complex compound property.

If the invert flag is set, the stored Blt vector object values are set to 1.0 minus the property data value. By default, this flag is not active. If the integrate flag is set, the Blt vector object element values are set to the sum of all preceding property data values. This flag is also disabled by default.

If the property data type is a float pair vector, two vector objects are created in the Blt namespace, with suffixes _X and _Y. For simple vector types, the vector name is used directly. It is possible to overwrite existing Blt vectors of the same name with this command.

The return value of the command is a list of the generated name of the vector, followed by the minimum and maximum data values in that vector object. These may be different from the ensemble property data values because of the application of the invert or integrate flags. For float pair vectors, the same information is repeated for the second vector object.
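The effect of the invert and integrate flags on the stored values, and on the reported minimum and maximum, can be modelled in a few lines of Python. This sketch does not touch Blt and is not the toolkit implementation; the function name to_vector is hypothetical. Two assumptions are made: the integrated value at each position includes the element at that position (a running sum), and if both flags are set, inversion is applied first. The text does not specify either point.

```python
from itertools import accumulate

def to_vector(values, invert=False, integrate=False):
    """Return (data, minimum, maximum) after applying the optional flags."""
    data = [float(v) for v in values]
    if invert:
        data = [1.0 - v for v in data]   # 1.0 minus the property data value
    if integrate:
        data = list(accumulate(data))    # assumption: inclusive running sum
    return data, min(data), max(data)

plain = to_vector([0.25, 0.5, 1.0])
inverted = to_vector([0.25, 0.5, 1.0], invert=True)
integrated = to_vector([0.25, 0.5, 1.0], integrate=True)
print(plain, inverted, integrated)
```

The sample shows why the reported range can differ from the property data: the raw values span 0.25 to 1.0, while the inverted vector spans 0.0 to 0.75.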

The command is not supported in the Python interface.

ens verify

ens verify ehandle property
e.verify(property)

Verify the values of the specified property on the ensemble. The property data must be valid, and of an ensemble or ensemble minor object property. If the data can be found, it is checked against all constraints defined for the property, and, if such a function has been defined, is tested with the value verification function of the property.

If all tests are passed, the command returns boolean 1. If the data could be found but fails the tests, it returns 0. In all other cases an error results.

ens weed

ens weed ehandle keywords
e.weed(keywordsequence)
e.weed(?keyword?,...)

This command performs a number of common clean-up and standardization operations on the ensemble, which are especially useful in the context of processing PDB files. The ensemble is potentially modified, but keeps its handle or reference, which is returned as command result. In addition, properties A_XYZ and A_RESIDUE , which are normally susceptible to bond manipulations, are locked and retained.

The keywords argument selects the desired set of operations. Most of the keywords are single words, but the minsize and maxsize as well as the minaminoacids and maxaminoacids keywords take an additional integer number as argument. The following operations are currently supported:

The order of the keywords is not important. The sequence of operations is always

metalatoms > specialbonds > proteinspecialbonds,proteinhetatmbonds > metaloxygenbonds > disulphides > carbonless,hydrogenless,inorganic,maxsize,metalions,minsize,water > maxaminoacids,minaminoacids > duplicates

Applied operations which potentially change the set of molecules and rings trigger an automatic re-evaluation of this data after the operation block has been executed.
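The keyword handling can be illustrated with a stand-alone Python sketch. It is a model only: the helper names are hypothetical, the keyword spellings are copied from the sequence line above, and the order of operations within one stage (here: the order in which they are listed) is an assumption, since the text only fixes the order between stages.

```python
# Stages in the fixed execution order given in the text.
STAGES = [
    ("metalatoms",),
    ("specialbonds",),
    ("proteinspecialbonds", "proteinhetatmbonds"),
    ("metaloxygenbonds",),
    ("disulphides",),
    ("carbonless", "hydrogenless", "inorganic", "maxsize",
     "metalions", "minsize", "water"),
    ("maxaminoacids", "minaminoacids"),
    ("duplicates",),
]
KNOWN = {keyword for stage in STAGES for keyword in stage}
# These keywords consume the following word as an integer argument.
WITH_ARGUMENT = {"minsize", "maxsize", "minaminoacids", "maxaminoacids"}

def parse_keywords(words):
    """Map each keyword to its integer argument, or None if it takes none."""
    operations = {}
    iterator = iter(words)
    for word in iterator:
        if word not in KNOWN:
            raise ValueError(f"unknown keyword: {word}")
        operations[word] = int(next(iterator)) if word in WITH_ARGUMENT else None
    return operations

def execution_order(words):
    """Return (keyword, argument) pairs in stage order, ignoring input order."""
    operations = parse_keywords(words)
    return [(k, operations[k]) for stage in STAGES for k in stage if k in operations]

order = execution_order(
    ["duplicates", "minsize", "10", "water", "metaloxygenbonds", "maxaminoacids", "6"]
)
print(order)
```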

Example:

The code below is part of a reliable PDB ligand extractor.

ens weed $eh {metaloxygenbonds water proteinspecialbonds duplicates minsize 10 \
	maxsize 300 maxaminoacids 6 disulfides}
if {[ens get $eh E_NATOMS]==0} {
	# try again with additional bond cut step. Cannot do this by default, because
	# there are plenty of ligands with embedded amino acid parts
	# that are encoded as ATOM lines. PDB files suck.
	molfile backspace $fh
	set eh [molfile read $fh]
	ens weed $eh {metaloxygenbonds water proteinspecialbonds proteinhetatmbonds \
		duplicates minsize 10 maxsize 300 maxaminoacids 6 disulfides}
}

ens xhandle

ens xhandle ehandle

Return the remote handle of the ensemble if it was exported and is currently under the control of a live-linked application. In case the ensemble is not exported, an error results.

This command is not supported in the Python interface.


1. Do not use this mode with transforms which add a group which is again matchable by the transform - you will face a runaway polymerization-style reaction!