Title: | Computing Statistically-Equivalent Path Models |
---|---|
Description: | Tools to compute and analyze the set of statistically-equivalent (Gaussian, linear) path models which generate the input precision or (partial) correlation matrix. This procedure is useful for understanding how statistical network models such as the Gaussian Graphical Model (GGM) perform as causal discovery tools. The statistical-equivalence set of a given GGM expresses the uncertainty we have about the sign, size and direction of directed relationships based on the weights matrix of the GGM alone. The derivation of the equivalence set and its use for understanding GGMs as causal discovery tools is described by Ryan, O., Bringmann, L.F., & Schuurman, N.K. (2022) <doi: 10.31234/osf.io/ryg69>. |
Authors: | Oisín Ryan |
Maintainer: | Oisín Ryan <[email protected]> |
License: | GPL-3 |
Version: | 1.0.1 |
Built: | 2024-11-09 03:37:56 UTC |
Source: | https://github.com/ryanoisin/seset |
Helper function. Takes a covariance matrix and ordering and generates a lower-triangular weights matrix.
cov_to_path(sigma, ordering = NULL, digits = 2)
sigma |
input matrix, with rows and columns in the desired topological ordering. Must be an invertible square matrix |
ordering |
character vector containing the dimension names of the input matrix in the desired ordering |
digits |
the number of digits used to round the output |
lower triangular matrix containing regression weights of the path model.
Element ij represents the effect of variable j on variable i.
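The idea behind this helper can be sketched in base R: given a covariance matrix whose rows and columns are already in topological order, each variable is regressed on all variables that precede it. This is a minimal sketch under that assumption, not necessarily how the package computes the weights; the name `path_from_cov` is hypothetical.

```r
# Hypothetical helper (not the package's implementation):
# regress each variable on all of its predecessors in the ordering.
path_from_cov <- function(sigma) {
  p <- nrow(sigma)
  B <- matrix(0, p, p, dimnames = dimnames(sigma))
  for (i in seq_len(p)[-1]) {
    pred <- seq_len(i - 1)
    # Population regression weights: solve(Sigma_pred,pred) %*% Sigma_pred,i
    B[i, pred] <- solve(sigma[pred, pred, drop = FALSE], sigma[pred, i])
  }
  B
}

# Toy correlation matrix for three variables
sigma <- matrix(c(1, .5, .25,
                  .5, 1, .5,
                  .25, .5, 1), 3, 3)
B <- path_from_cov(sigma)
print(B)
```

Row i of `B` holds the weights of the predictors of variable i, so the matrix is read "from column to row".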
Return parent indices from a (weighted) DAG for a given child
find_parents(mat, child)
mat |
A square weights matrix representing a (weighted) DAG |
child |
Index giving the position of the child node |
a vector containing index numbers defining the parent nodes
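As a rough illustration, and assuming the "from column to row" reading of the weights matrix used elsewhere in this manual, the parents of a child are the columns with non-zero entries in the child's row. The helper name `parents_of` is hypothetical and this is only a sketch of the idea.

```r
# Hypothetical helper: parents are the non-zero columns in the child's row
parents_of <- function(mat, child) which(mat[child, ] != 0)

# Toy weighted DAG: 1 -> 2 -> 3
B <- matrix(0, 3, 3)
B[2, 1] <- 0.5
B[3, 2] <- 0.5

p3 <- parents_of(B, 3)  # parents of node 3
p1 <- parents_of(B, 1)  # node 1 has no parents
print(p3)
print(p1)
```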
Ryan O, Bringmann LF, Schuurman NK (upcoming). “The challenge of generating causal hypotheses using network models.” in preparation.
Takes an ordered path model and corresponding variance-covariance matrix and computes the appropriate residual covariance matrix (psi)
get_psi(B, sigma, digits = 3)
B |
input weights matrix of the ordered path model |
sigma |
variance-covariance matrix of the variables |
digits |
how many digits to round the result to |
a residual variance-covariance matrix
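For a linear path model X = BX + e, the residual covariance matrix follows from the standard identity Psi = (I - B) Sigma (I - B)'. The sketch below applies that identity in base R; the helper name `residual_cov` is hypothetical, and rounding (the `digits` argument) is left out.

```r
# Hypothetical helper: Psi = (I - B) Sigma (I - B)'
residual_cov <- function(B, sigma) {
  I <- diag(nrow(B))
  (I - B) %*% sigma %*% t(I - B)
}

# Toy path model 1 -> 2 -> 3 and its covariance matrix
B <- matrix(0, 3, 3)
B[2, 1] <- 0.5
B[3, 2] <- 0.5
sigma <- matrix(c(1, .5, .25,
                  .5, 1, .5,
                  .25, .5, 1), 3, 3)

psi <- residual_cov(B, sigma)
print(zapsmall(psi))
```

When `B` contains the exact regression weights for the ordering, the off-diagonal elements of `psi` are zero, that is, the residuals are uncorrelated.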
Takes a precision matrix and generates a lower-triangular weights matrix.
network_to_path(omega, input_type = "precision", digits = 20)
omega |
input matrix, with rows and columns in the desired topological ordering. Must be an invertible square matrix |
input_type |
specifies what type of matrix 'omega' is. The default is "precision"; other options are a matrix of partial correlations ("parcor") or a covariance matrix ("covariance") |
digits |
desired rounding of the output matrix |
lower triangular matrix containing regression weights of the path model.
Element ij represents the effect of variable j on variable i.
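The input types mentioned above are closely related. As a sketch of that relationship (not of the package's internals), a precision matrix can be converted to partial correlations with the standard formula rho_ij = -omega_ij / sqrt(omega_ii * omega_jj). The helper name `precision_to_parcor` is hypothetical.

```r
# Hypothetical helper: partial correlations from a precision matrix
precision_to_parcor <- function(omega) {
  pc <- -cov2cor(omega)  # -omega_ij / sqrt(omega_ii * omega_jj)
  diag(pc) <- 0          # diagonal convention varies; 0 is used in this sketch
  pc
}

# Toy example: precision matrix obtained by inverting a correlation matrix
sigma <- matrix(c(1, .5, .25,
                  .5, 1, .5,
                  .25, .5, 1), 3, 3)
omega <- solve(sigma)
pc <- precision_to_parcor(omega)
print(round(pc, 3))
```

In this toy example variables 1 and 3 are conditionally independent given variable 2, so their partial correlation is zero up to numerical error.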
Ryan O, Bringmann LF, Schuurman NK (upcoming). “The challenge of generating causal hypotheses using network models.” in preparation.
Shojaie A, Michailidis G (2010). “Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs.” Biometrika, 97(3), 519–538.
Bollen KA (1989). Structural equations with latent variables. Oxford, England, John Wiley & Sons.
data(riskcor)
omega <- (qgraph::EBICglasso(riskcor, n = 69, returnAllResults = TRUE))$optwi
# qgraph method estimates a non-symmetric omega matrix, but uses forceSymmetric to create
# a symmetric matrix (see qgraph:::EBICglassoCore line 65)
omega <- as.matrix(Matrix::forceSymmetric(omega)) # returns the precision matrix

B <- network_to_path(omega, digits=2)

# Path model can be plotted as a weighted DAG
pos <- matrix(c(2,0,-2,-1,-2,1,0,2,0.5,0,0,-2),6,2,byrow=TRUE)

# qgraph reads matrix elements as "from row to column"
# regression weights matrices are read "from column to row"
# path model weights matrix must be transposed for qgraph
qgraph::qgraph(t(B), labels=rownames(riskcor), layout=pos,
               repulsion=.8, vsize=c(10,15), theme="colorblind", fade=FALSE)
Takes a precision matrix and generates the SE-set, a set of statistically equivalent path models. Unless otherwise specified, the SE-set will contain one weights matrix for every possible topological ordering of the variables in the input precision matrix.
network_to_SEset( omega, orderings = NULL, digits = 20, rm_duplicates = FALSE, input_type = "precision" )
omega |
input matrix, of the type given by input_type (a precision matrix by default) |
orderings |
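An optional matrix of variable orderings, with each column giving the dimension names in the desired order (for example, as generated by order_gen). If NULL (the default), all possible orderings are used |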
digits |
desired rounding of the output weights matrices in the SE-set, in decimal places. Defaults to 20. |
rm_duplicates |
Logical indicating whether only unique DAGs should be returned |
input_type |
specifies what type of matrix 'omega' is. The default is "precision"; other options are a matrix of partial correlations ("parcor") or a model-implied covariance or correlation matrix ("MIcov") |
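a matrix containing the SE-set, with one row per ordering: p! rows by default for an input with p variables, or one row per supplied ordering if a custom set of orderings is specified. Each row represents a lower-triangular weights matrix, stacked column-wise, so each row has p^2 elements.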
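The construction of an SE-set can be sketched in base R for a small example: enumerate all orderings, compute the regression weights for each ordering, and stack each weights matrix column-wise into one row. This is an illustrative sketch only. The helpers `perms` and `path_from_cov` are hypothetical, and expressing every weights matrix in the original variable order before stacking is an assumption of this sketch rather than a documented property of the package.

```r
# Hypothetical helper: all permutations of a vector, as a list
perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (rest in perms(v[-i])) out[[length(out) + 1]] <- c(v[i], rest)
  }
  out
}

# Hypothetical helper: regress each variable on its predecessors
path_from_cov <- function(sigma) {
  p <- nrow(sigma)
  B <- matrix(0, p, p, dimnames = dimnames(sigma))
  for (i in seq_len(p)[-1]) {
    pred <- seq_len(i - 1)
    B[i, pred] <- solve(sigma[pred, pred, drop = FALSE], sigma[pred, i])
  }
  B
}

vars <- c("A", "B", "C")
sigma <- matrix(c(1, .5, .25, .5, 1, .5, .25, .5, 1), 3, 3,
                dimnames = list(vars, vars))
omega <- solve(sigma)  # toy precision matrix

se_rows <- lapply(perms(vars), function(ord) {
  s <- solve(omega)[ord, ord]  # covariance matrix in this ordering
  B <- path_from_cov(s)        # lower-triangular weights for this ordering
  as.vector(B[vars, vars])     # back to the reference order, stacked column-wise
})
SE <- do.call(rbind, se_rows)
print(dim(SE))
```

With three variables there are 3! = 6 orderings, so `SE` has six rows of 3^2 = 9 elements each, and a single model can be recovered with `matrix(SE[k, ], 3, 3)`.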
Ryan O, Bringmann LF, Schuurman NK (upcoming). “The challenge of generating causal hypotheses using network models.” in preparation.
Shojaie A, Michailidis G (2010). “Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs.” Biometrika, 97(3), 519–538.
Bollen KA (1989). Structural equations with latent variables. Oxford, England, John Wiley & Sons.
network_to_path, reorder_mat, order_gen
# first estimate the precision matrix
data(riskcor)
omega <- (qgraph::EBICglasso(riskcor, n = 69, returnAllResults = TRUE))$optwi
# qgraph method estimates a non-symmetric omega matrix, but uses forceSymmetric to create
# a symmetric matrix (see qgraph:::EBICglassoCore line 65)
omega <- as.matrix(Matrix::forceSymmetric(omega)) # returns the precision matrix

SE <- network_to_SEset(omega, digits=3)

# each row of SE defines a path-model weights matrix.
# We can extract element 20 by writing it to a matrix
example <- matrix(SE[20,],6,6)

# Example path model can be plotted as a weighted DAG
pos <- matrix(c(2,0,-2,-1,-2,1,0,2,0.5,0,0,-2),6,2,byrow=TRUE)

# qgraph reads matrix elements as "from row to column"
# regression weights matrices are read "from column to row"
# path model weights matrix must be transposed for qgraph
qgraph::qgraph(t(example), labels=rownames(riskcor), layout=pos,
               repulsion=.8, vsize=c(10,15), theme="colorblind", fade=FALSE)
Takes a matrix and generates a matrix containing all orderings of the rows and columns
order_gen(omega)
omega |
input p-dimensional square matrix |
a matrix of dimension orderings. Each column represents an ordering of dimension names as character strings.
Chasalow S (2012). combinat: combinatorics utilities. R package version 0.0-8, https://CRAN.R-project.org/package=combinat.
data(riskcor)
orderings <- order_gen(riskcor)

# Each column of orderings defines an ordering of variables
print(orderings[,1])
# in the second element, the fifth and sixth variable are switched
print(orderings[,2])
Takes a path model and generates the corresponding (standardized) precision matrix or covariance matrix. The inverse of network_to_path.
path_to_network(B, psi = NULL, output = "precision")
B |
input weights matrix of the path model |
psi |
variance-covariance matrix for the residuals. If NULL (the default), the function imposes the constraint that the variables have variance 1 and the residuals are uncorrelated |
output |
Function returns the precision ("precision") or covariance ("covariance") matrix |
a precision or covariance matrix
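For a recursive linear path model with weights matrix B and residual covariance Psi, the model-implied matrices follow from standard structural equation results: Sigma = (I - B)^-1 Psi (I - B)^-T and Omega = (I - B)' Psi^-1 (I - B). The sketch below computes both for an explicitly supplied `psi`. The helper name `implied_matrices` is hypothetical, and the `psi = NULL` standardization described above is not reproduced here.

```r
# Hypothetical helper: model-implied covariance and precision matrices
implied_matrices <- function(B, psi) {
  I <- diag(nrow(B))
  A <- solve(I - B)
  sigma <- A %*% psi %*% t(A)                    # (I - B)^-1 Psi (I - B)^-T
  omega <- t(I - B) %*% solve(psi) %*% (I - B)   # (I - B)' Psi^-1 (I - B)
  list(covariance = sigma, precision = omega)
}

# Toy path model 1 -> 2 -> 3 with uncorrelated residuals
B <- matrix(0, 3, 3)
B[2, 1] <- 0.5
B[3, 2] <- 0.5
psi <- diag(c(1, 0.75, 0.75))

out <- implied_matrices(B, psi)
print(round(out$covariance, 3))
print(round(out$precision, 3))
```

The residual variances in this example were chosen so that every variable has variance 1, which matches the standardized case described for `psi = NULL`.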
Ryan O, Bringmann LF, Schuurman NK (upcoming). “The challenge of generating causal hypotheses using network models.” in preparation.
Shojaie A, Michailidis G (2010). “Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs.” Biometrika, 97(3), 519–538.
Bollen KA (1989). Structural equations with latent variables. Oxford, England, John Wiley & Sons.
network_to_path, SEset_to_network
A function used to analyse the SE-set results. Calculates the proportion of path models in a given SE-set in which a particular edge is present.
propcal(SEmatrix, names = NULL, rm_duplicate = TRUE, directed = TRUE)
SEmatrix |
A matrix containing the SE-set, with one weights matrix per row, as returned by network_to_SEset |
names |
optional character vector containing dimension names |
rm_duplicate |
Should duplicate weights matrices be removed from the SEset. Defaults to TRUE. |
directed |
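If TRUE (the default), the proportion is calculated for directed edges; if FALSE, for edges in either direction |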
a matrix showing in what proportion particular edges are present.
If directed=TRUE, element ij denotes the proportion of weights matrices containing a path from variable j to variable i. If directed=FALSE, the output will be a symmetric matrix, with element ij denoting in what proportion an edge of either direction connects variable i to variable j.
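The edge proportions can be sketched in base R by checking, for each position of the stacked weights matrices, how often the entry is non-zero. This is a minimal sketch assuming every row of the SE-set is a weights matrix stacked column-wise in a common variable order; the helper name `edge_proportions` is hypothetical and duplicate removal is not handled.

```r
# Hypothetical helper: proportion of models in which each edge is present
edge_proportions <- function(SEmatrix, directed = TRUE) {
  p <- sqrt(ncol(SEmatrix))
  present <- SEmatrix != 0
  if (!directed) {
    # An undirected edge i - j is present if either direction is non-zero
    present <- t(apply(present, 1, function(r) {
      m <- matrix(r, p, p)
      as.vector(m | t(m))
    }))
  }
  matrix(colMeans(present), p, p)
}

# Toy SE-set with two 2-variable models: 1 -> 2 and 2 -> 1
SE <- rbind(c(0, 0.5, 0, 0),   # B[2, 1] = 0.5
            c(0, 0, 0.5, 0))   # B[1, 2] = 0.5

prop_dir <- edge_proportions(SE, directed = TRUE)
prop_und <- edge_proportions(SE, directed = FALSE)
print(prop_dir)
print(prop_und)
```

In this toy example each direction appears in half of the models, while an edge in either direction appears in all of them.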
Ryan O, Bringmann LF, Schuurman NK (upcoming). “The challenge of generating causal hypotheses using network models.” in preparation.
A function used to analyse the SE-set results. For each member of the SE-set, calculates the proportion of explained variance in each child node when predicted by all of its parent nodes.
r2_distribution(SEmatrix, cormat, names = NULL, indices = NULL)
SEmatrix |
A matrix containing the SE-set, with one weights matrix per row, as returned by network_to_SEset |
cormat |
A correlation matrix of the variables |
names |
optional character vector containing dimension names |
indices |
optional vector of matrix indices, indicating which variables to compute the R^2 distribution for |
Returns a matrix of R^2 values, with one row per member of the SE-set.
For each member of the SE-set, this represents the variance explained in each node by its parents in that weighted DAG.
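For a single weighted DAG and a correlation matrix, the explained variance of a node given its parents can be sketched with the usual multiple-correlation formula R^2 = r' R_pa^-1 r, where r holds the correlations between the node and its parents and R_pa is the correlation matrix among the parents. The helper name `r2_for_model` is hypothetical; this illustrates the quantity being summarized, not the package's code.

```r
# Hypothetical helper: R^2 of each node given its parents in one weighted DAG
r2_for_model <- function(B, cormat) {
  sapply(seq_len(nrow(B)), function(i) {
    pa <- which(B[i, ] != 0)           # parents: non-zero entries in row i
    if (length(pa) == 0) return(0)     # no parents, no explained variance
    drop(cormat[i, pa, drop = FALSE] %*%
           solve(cormat[pa, pa, drop = FALSE], cormat[pa, i]))
  })
}

# Toy path model 1 -> 2 -> 3 and its correlation matrix
B <- matrix(0, 3, 3)
B[2, 1] <- 0.5
B[3, 2] <- 0.5
cormat <- matrix(c(1, .5, .25,
                   .5, 1, .5,
                   .25, .5, 1), 3, 3)

r2 <- r2_for_model(B, cormat)
print(r2)
```

Applying such a function to every row of an SE-set (after reshaping each row into a matrix) gives one R^2 value per node and per model.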
Ryan O, Bringmann LF, Schuurman NK (upcoming). “The challenge of generating causal hypotheses using network models.” in preparation.
Haslbeck JM, Waldorp LJ (2018). “How well do network models predict observations? On the importance of predictability in network models.” Behavior Research Methods, 50(2), 853–861.
network_to_SEset, find_parents
# first estimate the precision matrix
data(riskcor)
omega <- (qgraph::EBICglasso(riskcor, n = 69, returnAllResults = TRUE))$optwi
# qgraph method estimates a non-symmetric omega matrix, but uses forceSymmetric to create
# a symmetric matrix (see qgraph:::EBICglassoCore line 65)
omega <- as.matrix(Matrix::forceSymmetric(omega)) # returns the precision matrix

SEmatrix <- network_to_SEset(omega, digits=3)

r2set <- r2_distribution(SEmatrix, cormat = riskcor, names = NULL, indices = c(1,3,4,5,6))
# Plot results
apply(r2set,2,hist)
# For ggplot format, execute
# r2set <- tidyr::gather(r2set)
Takes a matrix and re-orders the rows and columns to match a target ordering.
reorder_mat(matrix, names)
matrix |
input matrix to be re-arranged. Must have rows and columns named |
names |
character vector containing the dimension names of the input matrix in the desired ordering |
input matrix with rows and columns sorted according to names
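In base R, this kind of reordering amounts to indexing a named matrix by the same character vector on rows and columns. The following is a minimal sketch of that idea with a hypothetical helper name, `reorder_by_names`; it returns a new matrix and leaves the input unchanged.

```r
# Hypothetical helper: reorder rows and columns of a named matrix
reorder_by_names <- function(mat, names) mat[names, names]

vars <- c("A", "B", "C")
m <- matrix(1:9, 3, 3, dimnames = list(vars, vars))

m_new <- reorder_by_names(m, c("A", "C", "B"))
print(m_new)
```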
data(riskcor)

# first define an ordered vector of names
row_names <- rownames(riskcor)
row_names_new <- row_names[c(1,2,3,4,6,5)]
reorder_mat(riskcor,row_names_new)

# The fifth and sixth row and column have been switched
print(riskcor)
Reported sample correlation matrix from a cross-sectional study on cognitive risk and resilience factors in remitted depression patients, from Hoorelbeke, Marchetti, De Schryver and Koster (2016). The study was conducted with 69 participants, and the correlation matrix consists of six variables. The variables are as follows:
data(riskcor)
A 6 by 6 correlation matrix
* 'BRIEF_WM': working memory complaints, a self-report measure of perceived cognitive control
* 'PASAT_ACC': PASAT accuracy, performance on a behavioural measure of cognitive control
* 'Adapt ER': self-report adaptive emotion regulation strategies
* 'Maladapt ER': self-report maladaptive emotion regulation strategies
* 'Resilience': self-report resilience
* 'Resid Depress': self-report residual depressive symptoms
<https://ars.els-cdn.com/content/image/1-s2.0-S0165032715313252-mmc1.pdf>
Hoorelbeke K, Marchetti I, De Schryver M, Koster EH (2016). “The interplay between cognitive risk and resilience factors in remitted depression: a network analysis.” Journal of Affective Disorders, 195, 96–104.
data(riskcor)
print(rownames(riskcor))
print(riskcor)
Takes the SE-set and calculates for each weights matrix the corresponding precision matrix. Used to check the results of network_to_SEset to assess deviations from statistical equivalence induced due to rounding, thresholding, and numerical approximations.
SEset_to_network( SEmatrix, order.ref = NULL, order.mat = NULL, output = "raw", omega = NULL )
SEmatrix |
a matrix containing the SE-set, with one weights matrix per row, as returned by network_to_SEset |
order.ref |
an optional character vector with variable names, the reference ordering of the precision matrix. |
order.mat |
a matrix of character strings giving the variable ordering that corresponds to each row of SEmatrix |
output |
Output as "raw" or "summary"; see the description of the return value |
omega |
Comparison precision matrix, e.g. the original precision matrix supplied to network_to_SEset |
If output = "raw", a matrix of precision matrices, each stacked column-wise into one row.
If output = "summary", returns a list containing the bias, MSE and RMSE for each re-calculated precision matrix, relative to the comparison omega matrix supplied.
Ryan O, Bringmann LF, Schuurman NK (upcoming). “The challenge of generating causal hypotheses using network models.” in preparation.
Shojaie A, Michailidis G (2010). “Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs.” Biometrika, 97(3), 519–538.
Bollen KA (1989). Structural equations with latent variables. Oxford, England, John Wiley & Sons.
network_to_path, path_to_network