pyrexMD.analysis

pyrexMD.analysis.analyze

Hint

This module contains various functions for basic trajectory analyses, e.g., calculating RMSDs, distances, etc.

Example:

import MDAnalysis as mda
import pyrexMD.topology as top
import pyrexMD.analysis.analyze as ana

ref = mda.Universe("path/to/pdb")
mobile = mda.Universe("path/to/tpr", "path/to/xtc")

# get RMSD
top.norm_and_align_universe(mobile, ref)
FRAME, TIME, RMSD = ana.get_RMSD(mobile, ref)

# plot
ana.PLOT(xdata=FRAME, ydata=RMSD, alpha=1, marker=None, xlabel="frame", ylabel=r"RMSD ($\AA$)")

Content:

pyrexMD.analysis.analyze.PLOT(xdata, ydata, xlabel='', ylabel='', title='', xlim=None, ylim=None, **kwargs)[source]

General plot function for analysis.py

Parameters
  • xdata (array, list) –

  • ydata (array, list) –

  • xlabel (str) –

  • ylabel (str) –

  • title (None, str) –

  • xlim (None, list) –

  • ylim (None, list) –

Keyword Arguments
  • alpha (float) – 1.0

  • color (str) – “r”

  • lw (float) – 1.5

  • ms (int) – 1

  • marker (None, str) – None

Hint

Args and Keyword Args of misc.figure() are also valid.

Returns

fig (class)

matplotlib.figure.Figure

ax (class, list)

ax or list of axes ~ matplotlib.axes._subplots.Axes

Example

>> PLOT(TIME, RMSD, xlabel=”time (ns)”, ylabel=r”RMSD ($AA$)”,
title=”RMSD plot”, xlim=[50, 500], ylim=[0, 20])
pyrexMD.analysis.analyze.alignto(mobile, ref, sel1, sel2, weights='mass', tol_mass=0.1, strict=False)[source]

Modified version of MDAnalysis.analysis.align.alignto() which works properly with two different selection strings.

Description:

Perform a spatial superposition by minimizing the RMSD.

Spatially align the group of atoms mobile to reference by doing a RMSD fit on sel1 of mobile atoms and sel1 of referece atoms.

The superposition is done in the following way:

  1. A rotation matrix is computed that minimizes the RMSD between the coordinates of mobile.select_atoms(sel1) and reference.select_atoms(sel2); before the rotation, mobile is translated so that its center of geometry (or center of mass) coincides with the one of reference. (See below for explanation of how sel1 and sel2 are derived from select.)

  2. All atoms in Universe that contain mobile are shifted and rotated. (See below for how to change this behavior through the subselection keyword.)

Parameters
  • mobile (universe) – mobile structure

  • ref (universe) – reference structure

  • sel1 (str) – selection string of mobile structure

  • sel2 (str) – selection string of reference structure

  • weights (None, str, array) –

    None: weigh each atom equally
    ”mass”: weigh atoms based on mass
    array: If a float array of the same length as mobile is provided, use each element of the array_like as a weight for the corresponding atom in mobile.

  • tol_mass (float) – Reject match if the atomic masses for matched atoms differ by more than tol_mass, default [0.1]

  • strict (bool) –

    True: Will raise SelectionError if a single atom does not match between the two selections.
    False: Will try to prepare a matching selection by dropping residues with non-matching atoms.

Returns

old_rmsd (float)

RMSD before spatial alignment

new_rmsd (float)

RMSD after spatial alignment

pyrexMD.analysis.analyze.get_Distance_Matrices(mobile, sel='protein and name CA', sss=[None, None, None], flatten=False, verbose=True, **kwargs)[source]

Calculate distance matrices for mobile and return them.

Parameters
  • mobile (universe, list) –

    (MDA universe): structure with trajectory
    (list): list with paths to structure files (.pdb)

  • sel (str) – selection string

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

  • flatten (bool) – return flattened distance matrices

  • verbose (bool) – show progress bar

Keyword Arguments
  • dtype (dtype) – float (default)

  • start (None, int) – start frame

  • stop (None, int) – stop frame

  • step (None, int) – step size

Returns

DM (array)

array of distance matrices

pyrexMD.analysis.analyze.get_FASTA(pdb_file, verbose=True)[source]

get FASTA sequence for each polypeptide of pdb_file

Parameters
  • pdb_file (str) – path to pdb_file

  • verbose (bool) – print sequences

Returns

FASTA (list)

list with FASTA sequences (one per polypeptide)

pyrexMD.analysis.analyze.get_RMSD(mobile, ref, sel1, sel2, sss=[None, None, None], prec=3, weights='mass', superposition=True, plot=False, verbose=True, **kwargs)[source]

Returns rmsd for multiple frames.

Parameters
  • mobile (universe, atomgrp) – mobile structure

  • ref (universe, atomgrp) – reference structure

  • sel1 (str) – selection string of mobile structure

  • sel2 (str) – selection string of reference structure

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

  • prec (None, int) – rounding precission

  • weights (bool, str, array_like) –

    weights during superposition of mobile and reference
    None: no weights
    ”mass”: atom masses as weights
    array_like: If a float array of the same length as “mobile” is provided, use each element of the “array_like” as a weight for the corresponding atom in “mobile”.

  • superposition (bool) –

  • plot (bool) –

Hint

Args and Keyword Args of analysis.PLOT() are valid Keyword Args.

Returns

FRAME (array)

frame array

TIME (array)

time array in ps

RMSD (array)

rmsd array in Å

pyrexMD.analysis.analyze.get_RMSF(mobile, sel='protein and name CA', plot=False, **kwargs)[source]

Calculates the RMSF of mobile. Alias function of MDAnalysis.analysis.rms.RMSF. See help(MDAnalysis.analysis.rms.get_RMSF) for more information.

Parameters
  • mobile (universe, atomgrp) – mobile structure

  • sel (str) – selection string

  • plot (bool) – create plot

Returns

RMSF (array)

array with RMSD values

pyrexMD.analysis.analyze.get_rmsd(mobile, ref, sel1, sel2, prec=3, weights='mass', superposition=True)[source]

Returns rmsd for single frame.

Parameters
  • mobile (universe, atomgrp) – mobile structure

  • ref (universe, atomgrp) – reference structure

  • sel1 (str) – selection string of mobile structure

  • sel2 (str) – selection string of reference structure

  • prec (None, int) – rounding precission

  • weights (bool, str, array_like) –

    weights during superposition of mobile and reference
    None: no weights
    ”mass”: atom masses as weights
    array_like: If a float array of the same length as “mobile” is provided, use each element of the “array_like” as a weight for the corresponding atom in “mobile”.

  • superposition (bool) –

Returns

rmsd (float)

rmsd value in Å

pyrexMD.analysis.analyze.get_shortest_RES_distances(u, sel)[source]

Calculates shortest RES distances (current frame) for selection of universe u.

Attention

Displayed RES ids always start with 1

Parameters
  • u (universe, str) – structure universe or pdb path

  • sel (str) – selection string

Returns

SD (arrray)
NxN Matrix with shortest RES distances
i.e.: SD[0][0] is shortest distance RES0-RES0
SD[0][1] is shortest distance RES0-RES1
SD[0][2] is shortest distance RES0-RES2
etc.
SD_d (array, dtype=object)

detailed List with [d_min, (RES_pair), (ATM_pair), (ATM_names)]

pyrexMD.analysis.analyze.get_time_conversion(u, **kwargs)[source]

get/print time conversion of MDA universe <u>

Parameters

u (universe) –

Hint

Args and Keyword Args of misc.cprint() are valid Keyword Args.

pyrexMD.analysis.analyze.get_trendline(xdata, ydata, compress=10)[source]

Reduces <xdata, ydata> by <compress> factor and returns trendline data. (Trendline: sum and normalize each <compress> elements of input data)

Parameters
  • xdata (list, array) –

  • ydata (list, array) –

  • compress (int) – compress factor (sum and normalize <compress> elements)

Returns

trend_xdata (list, array)

reduced xdata

trend_ydata (list, array)

reduced ydata

pyrexMD.analysis.analyze.plot_HEATMAP(data, **kwargs)[source]

plot heatmap

Note

if data is incomplete: use min/max values or np.nan for missing values and the Keyword Arg annot=True or annot=False

Parameters
  • data (arrays) –

  • save_as (str) – save name or realpath to save file

Keyword Arguments
  • annot (bool, list) – toggle annotation within the heatmap squares. If a list is provided, annotates the list content.

  • cmap (str) – color map string

  • n_colors (None, int) –

  • cbar_min/vmin (None, int, float) – min value of colorbar and heatmap

  • cbar_max/vmax (None, int, float) – max value of colorbar and heatmap

  • cbar_label (None, str) –

  • title (None, str) –

  • xlabel (None, str) –

  • ylabel (None, str) –

  • xticklabels (None, list) –

  • yticklabels (None, list) –

  • show_ticks (bool) – toggle (label) ticks

  • save_as (None, str) –

Hint

Args and Keyword Args of misc.figure() are also valid.

Returns

fig (class)

matplotlib.figure.Figure

ax (class, list)

ax or list of axes ~ matplotlib.axes._subplots.Axes

pyrexMD.analysis.analyze.plot_HEATMAP_REX_RMSD(REX_RMSD_dir, cps=['RdBu_r', 50, 0, 8], auto_convert=True, save_as='', **kwargs)[source]

plot heatmap with REX RMSDs. This function uses a directory with RMSD files as input.

Parameters
  • REX_RMSD_dir (str) – directory to REX_RMSD files

  • cps (list) –

    color_palette_settings
    cps[0]: color_palette name (str)
    cps[1]: n_colors (int)
    cps[2]: heatmap vmin (int, float)
    cps[3]: heatmap vmax (int, float)

  • auto_convert (bool) –

    convert “displayed” RMSD and xticks for plot. Returned values stay unconverted
    True: convert RMSD (nm) -> RMSD (angstrom)
    convert frame -> time (ns)

  • save_as (str) – save name or realpath to save file

Keyword Arguments

title (None, str) –

Hint

Args and Keyword Args of misc.figure() are also valid.

Returns

TIME (list)

time values taken from first REX_RMSD file

RMSD (list of arrays)

RMSD values of each REX_RMSD file

pyrexMD.analysis.analyze.plot_RMSD(RMSD_file, sss=[None, None, None], verbose=None, save_as='', **kwargs)[source]

plot RMSD curve

Parameters
  • RMSD_file (str) – path to RMSD file

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

  • verbose (None, bool) –

    None: print step to user (as a reminder)
    True: print start, step, stop to user
    False: no print

  • save_as (str) – save name or realpath to save file

Keyword Arguments
  • alpha (float) –

  • color (str) –

  • cut_min (None, int, float) – min value for cutoff

  • cut_max (None ,int, float) – max value for cutoff

  • start (None ,int) – start frame

  • stop (None ,int) – stop frame

  • step (None ,int) – step size

  • filedir (str) – default directory where to save figure

  • title (None/str) –

Hint

Args and Keyword Args of misc.figure() are also valid.

Returns

fig (class)

matplotlib.figure.Figure

ax (class, list)

ax or list of axes ~ matplotlib.axes._subplots.Axes

pyrexMD.analysis.analyze.plot_deltahist(RMSD_file, RMSD_ref, sss=[None, None, None], show_all_hist=False, save_as='', **kwargs)[source]

plot delta histogram (i.e. delta = RMSD_file - RMSD_ref)

Parameters
  • RMSD_file (str) – path to RMSD file

  • RMSD_ref (str) – path to RMSD reference file

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

  • show_all_hist (bool) –

    True: show ref hist, non-ref hist and delta hist
    False: show only delta hist

  • title (str) –

  • save_as (str) – save name or realpath to save file

Keyword Arguments
  • alpha (float) –

  • colors (list) – two strings in list describing color_positive and color_negative, e.g. [“g”, “r”]

  • color_positive (str) –

  • color_negative (str) –

  • ec (None, str) – edge color

  • start (None, int) – start frame

  • stop (None, int) – stop frame

  • step (None, int) – step size

  • cut_min (None, int, float) – min value for cutoff (here: alias of vmin)

  • cut_max (None, int, float) – max value for cutoff (here: alias of vmax)

  • apply_cut_limits (bool) –

    apply plt.xlim(cut_min, cut_max) if orientation is “vertical”
    apply plt.ylim(cut_min, cut_max) if orientation is “horizontal”

  • bins (None, list) – use bins if passed, else construct bins based on n_bins, vmin, vmax, vbase

  • n_bins (int) – number of bins

  • n_bins_autoadd (int) – autocorrect n_bins (default: 1), because code requires 1 extra bin.

  • vmin (None, int, float) – min value of histogram/bins

  • vmax (None, int, float) – max value of histogram/bins

  • vbase (int) –

    rounding base of vmin, vmax.
    look up misc.round_down(number,base) or misc.round_up(number,base)

  • logscale (bool) – apply logscale on the “count” axis

  • minorticks (bool) – turns minorticks (~logscale) for hist figure on/off

  • norm (bool) – normalize histogram

  • orientation (str) – “vertical”, “horizontal”

  • title (None, str) –

Hint

Args and Keyword Args of misc.figure() are also valid.

Returns

fig (class)

matplotlib.figure.Figure

ax (class, list)

ax or list of axes ~ matplotlib.axes._subplots.Axes

pyrexMD.analysis.analyze.plot_hist(data, sss=[None, None, None], save_as=None, **kwargs)[source]

plot histogram

Parameters
  • data (list, array, list of lists) – input data. Dimensions of each list must be equal if multiple lists are provided.

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

  • save_as (str) – save name or realpath to save file

Keyword Arguments
  • start (None, int) – start frame

  • stop (None, int) – stop frame

  • step (None, int) – step size

  • cut_min (None, int, float) – min value for cutoff

  • cut_max (None, int, float) – max value for cutoff

  • apply_cut_limits (bool) –

    apply plt.xlim(cut_min, cut_max) if orientation is “vertical”
    apply plt.ylim(cut_min, cut_max) if orientation is “horizontal”

  • align_bins (str) – ‘center’, ‘edge’ (default, left edge alignment)

  • bins (None, list) – use bins if passed, else construct bins based on n_bins, vmin, vmax, vbase

  • n_bins (int) – number of bins

  • n_bins_autoadd (int) – autocorrect n_bins (default: 1), because code requires 1 extra bin.

  • vmin (None, int, float) – min value of histogram/bins

  • vmax (None, int, float) – max value of histogram/bins

  • vbase (int) –

    rounding base of vmin, vmax.
    look up misc.round_down(number,base) or misc.round_up(number,base)

  • logscale (bool) – apply logscale on the “count” axis

  • minorticks (bool) – turns minorticks (~logscale) for hist figure on/off

  • norm (bool) – normalize histogram

  • orientation (str) – “vertical”, “horizontal”

  • alpha (int, float) –

  • colors (list of str/sns.colorpalette) –

  • ec (None, str) – edge color

  • title (None, str) –

Hint

Args and Keyword Args of misc.figure() are also valid.

Returns

fig (class)

matplotlib.figure.Figure

ax (class, list)

ax or list of axes ~ matplotlib.axes._subplots.Axes

hist (tuple, list)

(n, bins, patches) or list of (n, bins, patches)

pyrexMD.analysis.analyze.plot_trendline(xdata, ydata, compress=10, fig=None, remove_previous=True, **kwargs)[source]

Plot trendline of <xdata, ydata> and return trendline object

Hint

trendline can be removed via trendline.remove()

Parameters
  • xdata (list, array) –

  • ydata (list, array) –

  • compress (int) – compress factor (sum and normalize <compress> elements)

  • fig (int) – figure number (get with plt.gcf().number)

  • remove_previous (bool) – remove previous trendline data

Keyword Arguments
  • "alpha" – 1.0

  • "color" – “black”

  • "ls" – “-”

  • "lw" – 2.0

  • "ms" – 1

  • "marker" – “.”

Returns

trendline (matplotlib.lines.Line2D)

”trendline” object

pyrexMD.analysis.analyze.remove_trendline(trendline=None, fig=None)[source]

remove trendline of figure.

Parameters
  • trendline (None, obj) – “trendline object”

  • fig (None, obj) – figure object

pyrexMD.analysis.contacts

Hint

This module contains functions for native contact and bias contact analyses.

Example:

import MDAnalysis as mda
import pyrexMD.analysis.contacts as con

ref = mda.Universe("path/to/pdb")
mobile = mda.Universe("path/to/tpr", "path/to/xtc")

# get and plot native contacts
con.get_Native_Contacts(ref, sel="protein")
con.plot_Contact_Map(ref, sel="protein")

# compare contact map for n bias contacts
n = 50
con.plot_Contact_Map(ref, DCA_fin="path/to/bias/contacts", n_DCA=n)

# check true positive rate for n bias contacts
n = 50
con.plot_DCA_TPR(ref, DCA_fin="path/to/bias/contacts", n_DCA=n)

Content:

pyrexMD.analysis.contacts.get_BC_distances(u, bc, sss=[None, None, None], sel='protein', method='shortest_distance', plot=False, **kwargs)[source]

get bias contact distances.

Parameters
  • u (universe, str) –

  • bc (list) – list of bias contacts represented by (RESi, RESj) tuple

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

  • sel (str) – selection string

  • method (str) –

    ‘shortest_distance’: uses shortest RES distance between selection atoms
    ’contact_distance’:
    protein target ~ use distances of CA atoms
    RNA target ~ use distances of N1 and N3 atoms based on nucleic residues according to topology.sel2selid()

  • plot (bool) –

Keyword Arguments
  • start (None, int) – start frame

  • stop (None, int) – stop frame

  • step (None, int) – step size

  • norm (bool) – apply topology.norm_universe()

  • ignh (bool) – ignore hydrogen (mass < 1.2)

  • save_as (None, str) – save native contacts logfile as…

Returns

BC (list)

bias contacts with unique RES pairs

BC_dist (list)

bias contact distances ~ distances of items in BC

DM (array)

array of distance matrices

pyrexMD.analysis.contacts.get_NC_distances(mobile, ref, sss=[None, None, None], sel='protein', method='shortest_distance', d_cutoff=6.0, plot=False, verbose=True, **kwargs)[source]

get native contact distances.

Parameters
  • mobile (universe, str) –

  • ref (universe, str) –

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

  • sel (str) – selection string

  • method (str) –

    ‘shortest_distance’: uses shortest RES distance between selection atoms
    ’contact_distance’:
    protein target ~ use distances of CA atoms
    RNA target ~ use distances of N1 and N3 atoms based on nucleic residues according to topology.sel2selid()

  • d_cutoff (float) – cutoff distance

  • plot (bool) –

  • verbose (bool) – show/hide progress bar

Keyword Arguments
  • start (None, int) – start frame

  • stop (None, int) – stop frame

  • step (None, int) – step size

  • norm (bool) – apply topology.norm_universe()

  • ignh (bool) – ignore hydrogen (mass < 1.2)

  • save_as (None, str) – save native contacts logfile as…

Returns

NC (list)

native contacts with unique RES pairs

NC_dist (list)

native contact distances ~ distances of items in NC

DM (array)

array of distance matrices

pyrexMD.analysis.contacts.get_Native_Contacts(ref, d_cutoff=6.0, sel='protein', **kwargs)[source]

Get list of unique RES pairs and list of detailed RES pairs with native contacts.

Note

len(NC) < len(NC_d) because NC contains only unique RES pairs whereas NC_d contains each ATOM pair.

Parameters
  • ref (str, universe, atomgrp) – reference structure

  • d_cutoff (float) – cutoff distance for native contacts

  • sel (str) – selection string

Keyword Arguments
  • method (str) –

    ‘1’ or ‘Contact_Matrix’: mda.contact_matrix() with d_cutoff
    ’2’ or ‘Shadow_Map’ #TODO

  • norm (bool) – apply topology.norm_universe()

  • ignh (bool) – ignore hydrogen

  • save_as (None, str) –

    save RES pairs and ATOM pairs of native contacts to a log file.
    selection is hardcoded to “protein and name CA” for proteins
    and “name N1 or name N3” for RNA/DNA.

Returns

NC (list)

native contacts with unique RES pairs

NC_d (list)

detailed list of NCs containing (RES pairs), (ATOM numbers), (ATOM names)

pyrexMD.analysis.contacts.get_QBias(mobile, bc, sss=[None, None, None], d_cutoff=8.0, prec=3, norm=True, plot=True, warn=True, verbose=True, **kwargs)[source]

Get QValue for formed bias contacts.

Note

selection of get_QBias() is hardcoded to sel=’protein and name CA’.

Reason: bias contacts are unique RES PAIRS and grow ‘slowly’, but going from sel=’protein and name CA’ to sel=’protein’ increases the atom count significantly. Most of them will count as non-native and thus make the QBias value very low.

Note

MDAnalysis’ qvalue algorithm includes selfcontacts. Comparison of both methods yields better results when the kwarg include_selfcontacts (bool) is set to True. However this improves the calculated QBias value artificially (e.g. even when all used bias contacts are never formed, QBias will not be zero due to the selfcontact counts)

Parameters
  • mobile (universe) – mobile structure with trajectory

  • bc (list) – list with bias contacts to analyze

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

  • d_cutoff (float) – cutoff distance for native contacts

  • prec (None, int) – rounding precision

  • norm (bool) – norm universe before calculation.

  • plot (bool) –

  • warn (bool) – print important warnings about usage of this function.

  • verbose (bool) –

Keyword Arguments
  • dtype (dtype) – data type of returned contact matrix CM

  • include_selfcontacts (bool) –

    sets norm of QBIAS. Default is False.
    True: includes selfcontacts on main diagonal of CM (max count > len(bc))
    False: ignores selfcontacts on main diagonal of CM (max count == len(bc))

  • softcut (bool) – Increase QBias if np.isclose(d, d_cutoff, atol=softcut_tol) is True. Defaults to True.

  • softcut_tol (float) – Softcut tolerance. Defatuls to 0.2 (Angstrom).

  • softcut_inc – Increase QBias by this value if softcut applies. Defaults to 0.25.

  • start (None, int) – start frame

  • stop (None, int) – stop frame

  • step (None, int) – step size

  • figsize (tuple) –

  • color (str) – “r”

  • alpha (float) – 1

  • marker (str) – None (old default: “.”)

  • ms (int) – 4

  • lw (int) – 1 (old default: 0)

  • save_plot (bool) –

  • save_as (str) – “QBias.png”

  • disable (bool) – hide/show progress bar

Returns

FRAMES (array)

array with frame numbers

QBIAS (array)

array with fraction of formed bias contacts

CM (array)

array with contact matrices

pyrexMD.analysis.contacts.get_QBias_TPFP(mobile, BC, NC, sss=[None, None, None], d_cutoff=8.0, prec=3, norm=True, plot=True, verbose=True, **kwargs)[source]

Get QValue of true positive (TP) and false positive (FP) bias contacts.

Note

selection of this function is hardcoded to sel=’protein and name CA’.

Reason: bias contacts are unique RES PAIRS and grow ‘slowly’, but going from sel=’protein and name CA’ to sel=’protein’ increases the atom count significantly. Most of them will count as non-native and thus make the QBias value very low.

Note

MDAnalysis’ qvalue algorithm includes selfcontacts. Comparison of both methods yields better results when the kwarg include_selfcontacts (bool) is set to True. However this improves the calculated QBias value artificially (e.g. even when all used bias contacts are never formed, QBias will not be zero due to the selfcontact counts)

Parameters
  • mobile (universe) – mobile structure with trajectory

  • BC (list) – list with bias contacts (saved as tuples (RESi, RESj))

  • NC (list) – list with native contacts (saved as tuples (RESi, RESj))

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

  • d_cutoff (float) – cutoff distance for native contacts

  • prec (None, int) – rounding precision

  • norm (bool) – norm universe before calculation.

  • plot (bool) –

  • verbose (bool) –

Keyword Arguments
  • dtype (dtype) – data type of returned contact matrix CM

  • include_selfcontacts (bool) –

    sets norm of QBIAS. Default is False.
    True: includes selfcontacts on main diagonal of CM (max count > len(bc))
    False: ignores selfcontacts on main diagonal of CM (max count == len(bc))

  • start (None, int) – start frame

  • stop (None, int) – stop frame

  • step (None, int) – step size

  • color_TP (str) – color of true positive QBias. Defaults to “g”.

  • color_FP (str) – color of false positive QBias. Defaults to “r”.

  • alpha (float) – 0.3

  • marker (str) – “.”

  • disable (bool) – hide/show progress bar

Returns

FRAMES (array)

array with frame numbers

TP (array)

array with Qvalue of true positive bias contacts

FP (array)

array with Qvalue of false positive bias contacts

CM (array)

array with distance matrices

pyrexMD.analysis.contacts.get_QNative(mobile, ref, sel='protein and name CA', sss=[None, None, None], d_cutoff=8.0, plot=True, verbose=True, **kwargs)[source]

Get QValue for native contacts.

  • norms and aligns mobile to ref so that atomgroups have same resids

  • performs native contact analysis

Parameters
  • mobile (universe) – mobile structure with trajectory

  • ref (universe) – reference structure

  • sel (str) – selection string

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

  • d_cutoff (float) – cutoff distance for native contacts

  • plot (bool) –

  • verbose (bool) –

Keyword Arguments
  • sel1 (str) – selection string for contacting group (1 to 1 mapping)

  • sel2 (str) – selection string for contacting group (1 to 1 mapping)

  • method (str) –

    method for QValues calculation
    ’radius_cut’: native if d < d_cutoff
    ’soft_cut’: see help(MDAnalysis.analysis.contacts.soft_cut_q)
    ’hardcut’: native if d < d_ref

  • start (None, int) – start frame

  • stop (None, int) – stop frame

  • step (None, int) – step size

  • color (str) – “r”

  • alpha (float) – 1

  • marker (str) – None (old default: “.”)

  • ms (int) – 4

  • lw (int) – 1 (old default: 0)

  • save_plot (bool) –

  • save_as (str) – “QNative.png”

Note

sel (arg) is ignored if sel1 or sel2 (kwargs) are passed.

Returns

FRAMES (list)

list with frames

QNATIVE (list)

list with fraction of native contacts

pyrexMD.analysis.contacts.get_formed_contactpairs(u, cm, sel='protein and name CA', norm=True, **kwargs)[source]

get formed contact pairs by converting a contact matrix cm into a list with tuples (RESi,RESj).

Parameters
  • u (universe) – used to obtain topology information

  • cm (array) – single contact matrix of size (n_atoms, n_atoms)

  • sel (str) – selection string

  • norm (bool) – norm universe before obtaining topology information

Keyword Arguments

include_selfcontacts (bool) – include contacts on main diagonal of cm. Default is False.

Returns

CP (list)

list with contact pairs (RESi,RESj)

pyrexMD.analysis.contacts.plot_Contact_Map(ref, DCA_fin=None, n_DCA=None, d_cutoff=6.0, sel='protein', pdbid='pdbid', **kwargs)[source]

Create contact map based on the reference pdb structure. If DCA file is passed, correct/wrong contacts are visualized in green/red.

Parameters
  • ref (str, universe, atomgrp) – reference path or structure

  • DCA_fin (None, str) – DCA input file

  • n_DCA (None, int) – number of used DCA contacts

  • d_cutoff (float) – cutoff distance for native contacts

  • sel (str) – selection string

  • pdbid (str) – pdbid which is used for plot title

Keyword Arguments
  • DCA_cols (tuple) – columns containing the RES pairs in DCA_fin

  • DCA_skiprows (int) –

    skip header rows of DCA_fin
    -1 or “auto”: auto detect

  • filter_DCA (bool) –

    True: ignore DCA pairs with abs(i-j) < 3.
    False: use all DCA pairs w/o applying filter.

  • RES_range (None, list) –

    include RES only within [RES_min, RES_max] range
    None: do not narrow down RES range

  • ignh (bool) – ignore hydrogen (mass < 1.2)

  • norm (bool) – apply topology.norm_universe()

  • save_plot (bool) –

  • ms (None, int) – marker size (should be divisible by 2)

Hint

Args and Keyword Args of misc.figure() are also valid.

Returns

fig (class)

matplotlib.figure.Figure

ax (class, list)

ax or list of axes ~ matplotlib.axes._subplots.Axes

pyrexMD.analysis.contacts.plot_Contact_Map_Distances(ref, NC, NC_dist, pdbid='pdbid', **kwargs)[source]

Create contact map with color-coded distances for selected contact pairs. Distances larger than the highest value of the colorbar will be displayed in grey.

Warning

vmin and vmax do only modify the limits of the color bar. They do not affect the color mapping of the contacts, which is defined in cmap.

Parameters
  • ref (universe, str) – reference universe or pdb. Used to detect if target is protein or nucleic to select pre-defined color map.

  • NC (list) – output of get_NC_distances()

  • NC_dist (list) – output of get_NC_distances()

  • pdbid (str) – pdbid which is used for plot title

Keyword Arguments
  • cmap (None, dict) –

    dictionary with “color”:”threshold” pairs, e.g. “red”:12.0
    None: use pre-defined cmaps for Protein and RNA targets

  • vmin (None, float) –

  • vmax (None, float) –

  • ms (None, int) – marker size (should be divisible by 2)

  • title (None, str) – None, “default” or any title string

  • save_plot (bool) –

Hint

Args and Keyword Args of misc.figure() and misc.add_cbar() are also valid.

Returns

fig (class)

matplotlib.figure.Figure

ax (class, list)

ax or list of axes ~ matplotlib.axes._subplots.Axes

pyrexMD.analysis.contacts.plot_DCA_TPR(ref, DCA_fin, n_DCA, d_cutoff=6.0, sel='protein', pdbid='pdbid', verbose=True, **kwargs)[source]

Plots true positive rate (TPR) for number of used DCA contacts.

  • calculates shortest RES distance of the selection (only heavy atoms if ignh is True)

  • if distance is below threshold: DCA contact is True

Note

figure shows

  • blue line: TPR

  • red line: 75% cutoff threshold (TPR of used number of contacts should be above 75% for contact-guided REX, see https://doi.org/10.1371/journal.pone.0242072)

  • orange lines: suggested/guessed optimum number of contacts and the corresponding TPR

  • orange region: region of interest between L/2 and L contacts

Parameters
  • ref (universe, atomgrp) – reference path

  • ref – reference structure

  • DCA_fin (str) – DCA input file (path)

  • n_DCA (int) – number of used DCA contacts

  • d_cutoff (float) – cutoff distance for native contacts

  • sel (str) – selection string

  • pdbid (str) – pdbid; used for plot title and figure name

  • verbose (bool) – prints the total residue length of reference

Keyword Arguments
  • DCA_cols (tuple, list) – columns containing the RES pairs in DCA_fin

  • DCA_skiprows (int) –

    skip header rows of DCA_fin
    -1 or “auto”: auto detect

  • filter_DCA (bool) –

    True: ignore DCA pairs with abs(i-j) < 4
    False: use all DCA pairs w/o applying filter

  • RES_range (None) –

    [RES_min, RES_max] range. Ignore invalid RES ids
    None: do not narrow down RES range

  • ignh (bool) – ignore hydrogen (mass < 1.2)

  • norm (bool) – apply topology.norm_universe()

  • TPR_layer (str) –

    plot TPR curve in foreground or background layer
    ”fg”, “foreground”, “bg”, “background”

  • color (str) –

  • shade_area (bool) – shade area between L/2 and L ranked contacts

  • shade_color (str) –

  • shade_alpha (float) –

  • hline_color (None, str) – color of 75p threshold (horizontal line)

  • hline_ls (str) – linestyle of hline (default: “-“)

  • nDCA_opt_color (None, str) –

    color of optimal/recommended nDCA threshold
    nDCA_opt = misc.round_up(3/4*len(ref.residues), base=5))
    (vertical line ~ nDCA_opt)
    (horizontal line ~ TPR(nDCA_opt))

  • nDCA_opt_ls (str) – linestyle of nDCA_opt (default: “–“)

  • save_plot (bool) –

  • save_log (bool) –

  • figsize (tuple) –

  • marker (None, str) – marker

  • ms (int) – markersize

  • ls (str) – linestyle

  • alpha (float) – alpha value

Returns

fig (class)

matplotlib.figure.Figure

ax (class, list)

ax or list of axes ~ matplotlib.axes._subplots.Axes

pyrexMD.analysis.dihedrals

Hint

This module contains functions for dihedral-angle analyses.

Example:

import MDAnalysis as mda
import pyrexMD.analysis.dihedrals as dih

mobile = mda.Universe("path/to/tpr", "path/to/xtc")

# get dihedral angles of current frame
phi = dih.get_phi_values(u, sel="protein and resid 1-3")
psi = dih.get_psi_values(u, sel="protein and resid 1-3")
omega = dih.get_omega_values(u, sel="protein and resid 1-3")

# plot ramachandran
rama = dih.get_ramachandran(mobile)

Content:

pyrexMD.analysis.dihedrals.get_chi1_values(u, sel='protein', sss=[None, None, None], warn=True, **kwargs)[source]

Get chi1 dihedral values.

Parameters
  • u (universe) –

  • sel (str) – selection string

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

  • warn (bool) – print warning “All ALA, CYS, GLY, PRO, SER, THR, and VAL residues have been removed from the selection.”

Keyword Arguments
  • start (None, int) – start frame

  • stop (None, int) – stop frame

  • step (None, int) – step size

  • norm (bool) – norm universe before calculation. Defaults to True.

Returns

chi1_values (array)

array with chi1 dihedral values

pyrexMD.analysis.dihedrals.get_chi2_values(u, sel='protein', sss=[None, None, None], warn=True, **kwargs)[source]

Get chi2 dihedral values.

Parameters
  • u (universe) –

  • sel (str) – selection string

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

  • warn (bool) – print warning “All ALA, CYS, GLY, PRO, SER, THR, and VAL residues have been removed from the selection.”

Keyword Arguments
  • start (None, int) – start frame

  • stop (None, int) – stop frame

  • step (None, int) – step size

  • norm (bool) – norm universe before calculation. Defaults to True.

Returns

chi2_values (array)

array with chi2 dihedral values

pyrexMD.analysis.dihedrals.get_janin(u, sel='protein', sss=[None, None, None], plot=True, verbose=True, **kwargs)[source]

Get Janin information (chi1 & chi2 dihedral values, frames, etc).

Note

chi1/chi2 angles are not defined for ALA, CYS, GLY, PRO, SER, THR, VAL

Parameters
  • u (universe) –

  • sel (str) – selection string

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

  • plot (bool) –

  • verbose (bool) – print warning “All ALA, CYS, GLY, PRO, SER, THR, and VAL residues have been removed from the selection.”

Keyword Arguments
  • start (None, int) – start frame

  • stop (None, int) – stop frame

  • step (None, int) – step size

  • color (str) – “black”

  • marker (str) – “.”

  • ref (bool) – True: adds a general Janin plot which shows allowed (dark blue ~ 90% data points) and marginally allowed regions (light blue ~ 99% data points). Lookup MDAnalysis online documentation about dihedrals for more information.

  • num (None, int) –

    figure number
    None: create new figure

  • norm (bool) – norm universe before calculation. Defaults to True.

Returns

janin (MDAnalysis.analysis.dihedrals.Janin)

class contains atom groups, angles, frames, times, etc

pyrexMD.analysis.dihedrals.get_omega_values(u, sel='protein', sss=[None, None, None], **kwargs)[source]

Get omega dihedral values.

Parameters
  • u (Muniverse) –

  • sel (str) – selection string

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

Keyword Arguments
  • start (None, int) – start frame

  • stop (None, int) – stop frame

  • step (None, int) – step size

  • norm (bool) – norm universe before calculation. Defaults to True.

Returns

omega_values (array)

array with omega dihedral values

pyrexMD.analysis.dihedrals.get_phi_values(u, sel='protein', sss=[None, None, None], **kwargs)[source]

Get phi dihedral values.

Note

phi/psi angles are not defined for first and last residues.

Parameters
  • u (universe) –

  • sel (str) – selection string

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

Keyword Arguments
  • start (None, int) – start frame

  • stop (None, int) – stop frame

  • step (None, int) – step size

  • norm (bool) – norm universe before calculation. Defaults to True.

Returns

phi (array)

array with phi dihedral values

pyrexMD.analysis.dihedrals.get_psi_values(u, sel='protein', sss=[None, None, None], **kwargs)[source]

Get psi dihedral values.

Note

phi/psi angles are not defined for first and last residues.

Parameters
  • u (universe) –

  • sel (str) – selection string

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

Keyword Arguments
  • start (None, int) – start frame

  • stop (None, int) – stop frame

  • step (None, int) – step size

  • norm (bool) – norm universe before calculation. Defaults to True.

Returns

psi_values (array)

array with psi dihedral values

pyrexMD.analysis.dihedrals.get_ramachandran(u, sel='protein', sss=[None, None, None], plot=True, **kwargs)[source]

Get Ramachandran information (phi & psi dihedral values, frames, etc).

Note

phi/psi angles are not defined for first and last residues.

Parameters
  • u (universe) –

  • sel (str) – selection string

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

  • plot (bool) –

Keyword Arguments
  • start (None, int) – start frame

  • stop (None, int) – stop frame

  • step (None, int) – step size

  • color (str) – “black”

  • marker (str) – “.”

  • ref (bool) – True: Adds a general Ramachandran plot which shows allowed (dark blue ~ 90% data points) and marginally allowed regions (light blue ~ 99% data points). Lookup MDAnalysis online documentation about dihedrals for more information.

  • num (None, int) –

    figure number
    None: create new figure

  • norm (bool) – norm universe before calculation. Defaults to True.

Returns

rama (MDAnalysis.analysis.dihedrals.Ramachandran)

class contains atom groups, angles, frames, times, etc

pyrexMD.analysis.gdt

Hint

This module contains functions for Global-Distance-Test (GDT) analyses.

Example:

import MDAnalysis as mda
import pyrexMD.misc as misc
import pyrexMD.topology as top
import pyrexMD.analysis.analyze as ana
import pyrexMD.analysis.gdt as gdt

ref = mda.Universe("path/to/pdb")
mobile = mda.Universe("path/to/tpr", "path/to/xtc")

# first norm and align universes
top.norm_and_align_universe(mobile, ref)

# run GDT
GDT = gdt.GDT(mobile, ref)
GDT_percent, GDT_resids, GDT_cutoff, RMSD, FRAME = GDT

# get individual scores
GDT_TS = gdt.get_GDT_TS(GDT_percent)
GDT_HA = gdt.get_GDT_HA(GDT_percent)
frames = [i for i in range(len(GDT_TS))]
misc.cprint("GDT TS    GDT HA    frame", "blue")
_ = misc.print_table([GDT_TS, GDT_HA, frames], verbose_stop=10, spacing=10)

# rank scores
SCORES = gdt.GDT_rank_scores(GDT_percent, ranking_order="GDT_TS", verbose=False)
GDT_TS_ranked, GDT_HA_ranked, GDT_ndx_ranked = SCORES
misc.cprint("GDT TS    GDT HA    frame", "blue")
_ = misc.print_table([GDT_TS_ranked, GDT_HA_ranked, GDT_ndx_ranked], spacing=10, verbose_stop=10)


# plot local accuracy
text_pos_kws = {"text_pos_Frame": [-8.8, -0.3],
                "text_pos_TS": [-14.2, -0.3],
                "text_pos_HA": [-6, -0.3]}
_ = gdt.plot_LA(mobile, ref, GDT_TS_ranked, GDT_HA_ranked, GDT_ndx_ranked, **text_pos_kws)

Content:

pyrexMD.analysis.gdt.GDT(mobile, ref, sel1='protein and name CA', sel2='protein and name CA', sss=[None, None, None], cutoff=[0.5, 10, 0.5], true_resids=True, verbose=True, **kwargs)[source]

performs Global-Distance-Test (GDT).

Algorithm to identify how good “mobile” matches to “reference” by calculating the sets of residues which do not deviate more than a specified (pairwise) CA distance cutoff.

Note

sel1 = mobile.select_atoms(sel1)
sel2 = ref.select_atoms(sel2)
weights=”mass” # hardcoded weights for alignment
Parameters
  • mobile (universe, atomgrp) – mobile structure with trajectory

  • ref (universe, atomgrp) – reference structure

  • sel1 (str) – selection string of mobile structure

  • sel2 (str) – selection string of reference structure

  • sss (list) –

    [start, stop, step]
    start (None, int): start frame
    stop (None, int): stop frame
    step (None, int): step size

  • cutoff (list) –

    parameters describing GDT
    cutoff[0]: min_cutoff (float)
    cutoff[1]: max_cutoff (float)
    cutoff[2]: step_cutoff (float)

  • true_resids (bool) –

    offset resids of “GDT_resids” to correct codestyle resids into real resids
    True: return true resids in output “GDT_resids”
    False: return codestyle resids in output “GDT_resids”

Keyword Arguments
  • start (None/int) – start frame

  • stop (None/int) – stop frame

  • step (None/int) – step size

  • min_cutoff (float) –

  • max_cutoff (float) –

  • step_cutoff (float) –

  • disable (bool) – disable progress bar

Returns

GDT_percent (list)

GDT_Px for each cutoff distance in GDT_cutoff, where GDT_Px denotes percent of residues under distance cutoff <= x Angstrom

GDT_resids (list)
list with resids within cutoff
if true_resids == True:
returns true resids; possible values are: minRES <= value <= maxRES
if true_resids == False:
returns codestyle resids; possible values are 0 <= value <= len(mobile.select_atoms(sel)))
GDT_cutoff (list)

list with cutoff distances in Angstrom

RMSD (list)

list with (RMSD before alignment, RMSD after elignment)

FRAME (list)

list of analyzed frames

pyrexMD.analysis.gdt.GDT_continuous_segments(GDT_resids)[source]

Get continous segments for multiple frames (via to GDT_resids)

Parameters

GDT_resids (array) – output of gdt.GDT()

Returns

SEGMENTS (list)

list of continuous segments for each frame

pyrexMD.analysis.gdt.GDT_match_cutoff_dim(GDT_percent, GDT_cutoff)[source]

Matches GDT_cutoff array to the dimensions of GDT_percent. Used in combination with analysis.PLOT() function to plot multiple curves on one canvas.

Parameters
  • GDT_percent (list, array) – output of gdt.GDT()

  • GDT_cutoff (list, array) – output of gdt.GDT()

Returns

GDT_cutoff (list, array)

transformed GDT_cutoff with same dimensions as GDT_percent

pyrexMD.analysis.gdt.GDT_rank_percent(GDT_percent, norm=True, verbose=False)[source]

Ranks GDT_percent based on the sum of GDT_Px for all x in cutoff. (Higher sum means better accuracy during protein alignment)

Ranked Psum can be view analogous to GDT scores but using all cutoffs instead of specific ones

Parameters
  • GDT_percent (list) – output of gdt.GDT()

  • norm (bool) – norms Psum ~ 1/N * sum of GDT_Px

  • verbose (bool) –

Returns

RANKED_Psum (list)

ranked percent sum

RANKED_Psum_ndx (list)

indices of ranked percent sum for correct mapping

pyrexMD.analysis.gdt.GDT_rank_scores(GDT_percent, ranking_order='GDT_TS', prec=3, verbose=True, **kwargs)[source]

rank scores.

Note

Similar function to gdt.rank_scores() but takes other arguments.

Parameters
  • GDT_percent (list) – output of gdt.GDT()

  • ranking_order (None, str) –

    “FRAME” or None: output ranking ordered by frame number
    ”GDT_TS”: output ranking ordered by GDT_TS
    ”GDT_HA”: output ranking ordered by GDT_HA

  • prec (None, int) –

    -1 or None: rounding off
    int: rounding on to this number of decimals

  • verbose (bool) –

Keyword Arguments

verbose_stop (None, int) – stop printing after N lines

Returns

GDT_TS_ranked (array)

ranked array with GDT_TS scores

GDT_HA_ranked (array)

ranked array with GDT_HA scores

GDT_ndx_ranked (array)

array with score indices for correct mapping

Example

>> mobile = mda.Universe(<top>, <traj>)
>> ref = mda.universe(<ref> )
>> GDT_cutoff, GDT_percent, GDT_resids, FRAME = gdt.GDT(mobile, ref)
>> GDT_TS_ranked, GDT_HA_ranked, GDT_ndx_ranked = gdt.GDT_rank_scores(GDT_percent, ranking_order=”GDT_TS”)
pyrexMD.analysis.gdt.GDT_rna(mobile, ref, sel1='nucleic', sel2='nucleic', sss=[None, None, None], cutoff=[2, 40, 2], true_resids=True, **kwargs)[source]

returns gdt.GDT() with RNA default values after converting selection strings to selection id strings.

Code:
selid1 = pyrexMD.topology.sel2selid(mobile, sel=sel1)
selid2 = pyrexMD.topology.sel2selid(ref, sel=sel2)
return GDT(mobile=mobile, ref=ref, sel1=selid1, sel2=selid2, sss=sss,
cutoff=cutoff, true_resids=true_resids, **kwargs)
pyrexMD.analysis.gdt.get_GDT_HA(GDT_percent, prec=3)[source]

Note

GlobalDistanceTest_HighAccuracy. Possible values: 0 <= GDT_HA <= 100

Calculate GDT_HA for any given GDT_percent array according to:
GDT_HA = (GDT_P0.5 + GDT_P1 + GDT_P2 + GDT_P4)/4,
where GDT_Px denotes the percentage of residues with distance cutoff <= x Angstrom
Parameters
  • GDT_percent (list, array) – output of gdt.GDT()

  • prec (None, int) – rounding precision

Returns

GDT_HA (array)

array with GDT_HA scores

pyrexMD.analysis.gdt.get_GDT_TS(GDT_percent, prec=3)[source]

Note

GlobalDistanceTest_TotalScore. Possible values: 0 <= GDT_TS <= 100

Calculates GDT_TS for any given GDT_percent array according to:
GDT_TS = (GDT_P1 + GDT_P2 + GDT_P4 + GDT_P8)/4,
where GDT_Px denotes the percentage of residues with distance cutoff <= x Angstrom
Parameters
  • GDT_percent (list, array) – output of gdt.GDT()

  • prec (None, int) – rounding precision

Returns

GDT_TS (array)

array with GDT_TS scores

pyrexMD.analysis.gdt.get_Pair_Distances(mobile, ref, sel1='protein and name CA', sel2='protein and name CA', **kwargs)[source]

Aligns mobile to ref and calculates pair distances (e.g. CA-CA distances).

Note

single frame function.

Parameters
  • mobile (universe, atomgrp) – mobile structure with trajectory

  • ref (universe, atomgrp) – reference structure

  • sel1 (str) – selection string of mobile structure

  • sel2 (str) – selection string of reference structure

Returns

PAIR_DISTANCES (array)

array with pair pair distances

RMSD (tuple)

(RMSD before alignment, RMSD after alignment)

_resids_mobile (array)

array with mobile RES ids

_resids_ref (array)

array with reference RES ids

pyrexMD.analysis.gdt.get_array_percent(dist_array, cutoff)[source]

Get percentage and indices of dist_array that fulfill: dist_array <= cutoff

Parameters
  • dist_array (array) – distance array

  • cutoff (int, float) – cutoff distance

Returns

p (float)

percentage of dist_array that fulfill: dist_array <= cutoff)

ndx (array)

indices of dist_array that fulfill: dist_array <= cutoff)

pyrexMD.analysis.gdt.get_continuous_segments(array)[source]

Get continuous segments for single frame.

Parameters

array (array) –

ordered array with integers representing resids ~ single frame
e.g.: array = [5, 6, 7, 12, 13, 18, 19, 20]

Returns

SEGMENTS (list)

list of continuous segments

Example

>> gdt.get_continuous_segments([1,2,3,22,23,50,51,52])
[[1, 2, 3], [22, 23], [50, 51, 52]]
pyrexMD.analysis.gdt.plot_GDT(GDT_percent, GDT_cutoff, **kwargs)[source]

Creates a GDT_PLOT.

Parameters
  • GDT_percent (list, array) – output of gdt.GDT()

  • GDT_cutoff (list, array) – output of gdt.GDT()

  • title (str) –

Hint

Args and Keyword Args of misc.figure() are valid Keyword Args.

Returns

fig (class)

matplotlib.figure.Figure

ax (class, list)

ax or list of axes ~ matplotlib.axes._subplots.Axes

pyrexMD.analysis.gdt.plot_LA(mobile, ref, GDT_TS, GDT_HA, GDT_ndx, sel1='protein and name CA', sel2='protein and name CA', cmap='GDT_HA', **kwargs)[source]

Create LocalAccuracy Plot (heatmap) with

  • xdata = residue ID

  • ydata = frame number

  • color = color-coded pair distance

Note

do not pass too many data points otherwise the plot will get squeezed

Parameters
  • mobile (universe, atomgrp) – mobile structure with trajectory

  • ref (universe, atomgrp) – reference structure

  • GDT_TS (array) – array with GDT_TS scores.

  • GDT_HA (array) – array with GDT_HA scores.

  • GTD_ndx (array) – array with corresponding index values (representative for frame numbers).

  • sel1 (str) – selection string of mobile structure (calculation of pair distances)

  • sel2 (str) – selection string of reference structure (calculation of pair distances)

  • cmap (str) –

    “GDT_TS” or “TS”: color map with new colors at values (0, 1, 2, 4, 8) and vmin, vmax = (0, 10).
    ”GDT_HA” or “HA”: color map with new colors at values (0, .5, 1, 2, 4) and vmin, vmax = (0, 5).
    ”nucleic” or “RNA” or “DNA”: color map with new colors at values (0, .5, 1, 2, 4) and vmin, vmax = (0, 20).
    other cmap names: see help(plt.colormaps) or alternatively https://matplotlib.org/examples/color/colormaps_reference.html

Keyword Arguments
  • prec (None, int) –

    rounding precision of scores
    None: rounding off
    int: rounding on to <int> decimals

  • ndx_offset (int) –

    offset/shift of GDT_ndx to match real “mobile” frames. Defaults to 0.
    Look up “start” parameter during execution of gdt.GDT()

  • rank_num (int) – plot only <rank_num> best ranked frames. Defaults to 30.

  • show_cbar (bool) – show/hide colorbar. Defaults to True.

  • show_frames (bool) – show/hide frame numbers. Defaults to False.

  • show_scores (bool) – show/hide GDT_TS and GDT_HA scores. Defaults to True.

  • save_as (None, str) – save name or realpath to save file. Defaults to None.

  • cbar_ticks (None, list) – color bar tick positions. Defaults to None.

  • cbar_label/label (str) –

  • cbar_fontweight/fontweight (str) – “normal”, “bold”

  • cbar_location/location (str) – “right”, “bottom”, “left”, “top”

  • cbar_orientation/orientation (str) – “horizontal”, “vertical”

  • cbar_min/vmin (None, int) – min value of colorbar and heatmap. Gets overwritten by cmaps such as “GDT_TS”, “GDT_HA”, “RNA” etc.

  • cbar_max/vmax (None, int) – max value of colorbar and heatmap. Gets overwritten by cmaps such as “GDT_TS”, “GDT_HA”, “RNA” etc.

  • text_pos_Frame (list) – [x0, y0] position of the “Frame” text box (label)

  • text_pos_TS (list) – [x0, y0] position of the “TS” text box (label)

  • text_pos_HA (list) – [x0, y0] position of the “HA” text box (label)

  • font_scale (float) –

Hint

Args and Keyword of misc.figure() are also valid.

Returns

fig (class)

matplotlib.figure.Figure

ax (class, list)

ax or list of axes ~ matplotlib.axes._subplots.Axes

LA_data (tuple)
LA_data[0]: PairDistances (list)
LA_data[1]: Frames (list)

Example

# obtain data
>> GDT = gdt.GDT(mobile, ref, sss=[None,None,None])
>> GDT_percent, GDT_resids, GDT_cutoff, RMSD, FRAME = GDT

# rank data
>> SCORES = gdt.GDT_rank_scores(GDT_percent, ranking_order=”GDT_HA”)
>> GDT_TS_ranked, GDT_HA_ranked, GDT_ndx_ranked = SCORES

# edit text box positions of labels “Frame”, “TS”, “HA”
>>text_pos_kws = {“text_pos_Frame”: [-8.8, -0.3],
“text_pos_TS”: [-4.2, -0.3],
“text_pos_HA”: [-1.9, -0.3]}

# plot
>> gdt.plot_LA(mobile, ref, SCORES[0], SCORES[1], SCORES[2], **text_pos_kws)
pyrexMD.analysis.gdt.plot_LA_rna(mobile, ref, GDT_TS, GDT_HA, GDT_ndx, sel1='nucleic', sel2='nucleic', cmap='RNA', **kwargs)[source]

plot_LA() with RNA default values after converting selection strings to selection id strings.

Code:
selid1 = pyrexMD.topology.sel2selid(mobile, sel=sel1)
selid2 = pyrexMD.topology.sel2selid(ref, sel=sel2)
return plot_LA(mobile, ref, GDT_TS=GDT_TS, GDT_HA=GDT_HA, GDT_ndx=GDT_ndx,
sel1=selid1, sel2=selid2, cmap=cmap, **kwargs)
pyrexMD.analysis.gdt.rank_scores(GDT_TS, GDT_HA, ranking_order='GDT_TS', prec=3, verbose=True, **kwargs)[source]

rank scores

Note

Similar function to gdt.GDT_rank_scores() but takes other arguments.

Parameters
  • GDT_TS (array) – output of gdt.get_GDT_TS()

  • GDT_HA (array) – output of gdt.get_GDT_HA()

  • ranking_order (None, str) –

    “FRAME” or None: output ranking ordered by frame number
    ”GDT_TS”: output ranking ordered by GDT_TS
    ”GDT_HA”: output ranking ordered by GDT_HA

  • prec (None, int) –

    -1 or None: rounding off
    int: rounding on to this number of decimals

  • verbose (bool) –

Keyword Arguments

verbose_stop (None, int) – stop printing after N lines

Returns

GDT_TS_ranked (array)

ranked array with GDT_TS scores

GDT_HA_ranked (array)

ranked array with GDT_HA scores

GDT_ndx_ranked (array)

array with score indices for correct mapping