Home > Analysis > Dynamic System Modeling

Conserved and Differential Interactions Detected from Microarray Data in Mouse Cerebellar Development

In this report, we compare gene expression among three developmental stages (according to Thomass request in August, 2010), to identify conserved and differential gene interactions using the comparative dynamical system modeling (DSM) method we have developed. The three stages include early embryonic (EE), late embryonic (LE) and postnatal (PN). By gene expression time courses, 1435 transcription factors (TFs) including alternative splicing variants were grouped into 291 clusters for DBA and 251 for BL6. We applied comparative DSM modeling on EE versus LE+PN, LE versus EE+PN, and PN versus EE+LE for the two strains separately. For each of the three comparisons on the DBA strain, we detected 272, 201, 260 differential and 19, 90, 31 conserved interactions among the cluster representatives, at a p-value of less than 0.05; for the BL6 strain, we identified 220, 206, 243 differential and 30, 45, 8 conserved interactions. From this result, we found a significant interaction between pax6 and klf4. We also tested interactions between splicing variants of klf4 and pax6 and found that some transcripts interact differentially during the three developmental stages.

1.      Designation of development stages

This study divides the developmental time window into three stages: early embryonic, late embryonic and postnatal. The early embryonic stage contains E12, E13 and E14; the late embryonic stage consists of E15, E16, E17, E18 and E19; the postnatal stage includes P0, P3, P6 and P9. (This is a revised developmental stage designation, according to Thomass email in August 2010, from our previous studies.)

2.      Modeling separately gene interactions in DBA and BL6 strains

We treated the DBA and BL6 strains separately in comparing gene interactions in the developmental stages. We observed that expression patterns of some genes in the two strains are quite different. We tried two methods to deal with the strain difference: one was to filter out genes showed significant difference between the two strains; the other is to perform comparative modeling on the two strains separately. The results in this report is produced with the second method, because many of the 1435 TFs are very different between the two strains and would have been filtered out if we had to use the first method.

3.      Clustering

We clustered transcripts of the two strains separately into linearly correlated groups. The clustering results are saved in TOTAL_cluster_DBA.txt and TOTAL_cluster_BL6.txt. Each cluster includes genes and their probes with linearly correlated transcripts across the entire time course. Technical replicates were likely to be grouped into the same cluster.

4.      Comparative dynamical system modeling

Comparative DSM was applied on analyzing interactions during cerebellar development from gene expression data. A DSM consists of a set of ordinary differential equations (ODE). And a nonlinear sigmoidal ODE model is used to describe the expression rate of a gene as

(1)

where is the expression level of gene i, influenced by gene j and k, s are constant rate coefficients to be estimated, and the last term models mRNA degradation with rate constant >0. Self-influence is allowed, i.e., a gene can influence its own rate of change. Comparative DSM is a framework to detect differential and conserved interactions through three statistical tests, explained in Table 1.

Table 1. Statistical tests to determine the strengths of total interaction, homogeneity and heterogeneity for a pair of interactions across experimental conditions.

Null hypothesis

Alternative hypothesis

Test statistic

Significance

No interaction

Active interaction

Ft

pt

No homogeneity

in interaction

Homogeneous interaction

Fc

pc

No heterogeneity

in interaction

Heterogeneous interaction

Fd

pd

Mathematically, any of the three test statistics can be derived from the other two. Comparative DSM searches the best model including the influential genes for each gene representative, according to pt, which measures the total interaction strength. Then according to pd and pc, differential (pd <= 0.05) or conserved interactions (pd > 0.05, pc <= 0.05) will be determined. The smaller the pd value is, the more significant is a differential interaction, likewise for pc to a conserved interaction.

5.      Results

5.1   Comparative modeling results across three development stages for individual strains

For studying interactions among clusters of transcription factors, we did three comparative runs for each strain: early embryonic, late embryonic, and postnatal versus the corresponding remaining stages. This resulted in, for each strain, 3 comma separated value (CSV) files describing interactions between transcript clusters. So there are a total of 6 cluster interaction files for both strains. The name of a CSV file, e.g., Early_Embryonic_DBA_0.csv, contains the development stage (Early Embryonic), a strain name (DBA), and a zero, for a comparative result between early embryonic and the other two stages for the DBA strain. Each file contains 6 columns, including symbol, probeID, cluster, cluster representative, parent cluster, total interaction strength p-value (pt), homogeneity p-value (pc) and heterogeneity p-value (pd). Symbol and probeID together determine a unique transcript of a gene. The cluster column contains the labels of the clusters to which each transcript is assigned; Cluster representative is the transcript chosen to represent a cluster; parent cluster is the cluster whose representative statistically significantly influences the expression pattern of the current cluster representative, as determined by comparative modeling; pt, pc and pd are described in Table 1.

In the BL6 strain, transcripts of klf4 were assigned to clusters C77, C95 and C119; and transcripts of pax6 were assigned to clusters C1, C28 and C35. A differential interaction between C28 and C119 showed up significantly when we compared early embryonic with the other two stages. And a more significantly differential interaction between C35 and C95 also appeared when we compared postnatal with the other two stages. We also noticed that cluster C165 is consistently active in influencing cluster C119 in three comparative analyses. It might thus be biologically interesting to examine those genes in C165 for their role in regulation of klf4.

In the DBA strain, transcripts of klf4 were assigned to clusters C64, C163 and C225; and pax6 into C31, C92 and C106. A conserved interaction between C31 and C163 was detected by comparative DSM in both comparisons of late embryonic v.s. the other two stages and postnatal v.s. the other two stages. Two differential interactions between C31 and C64 and between C31 and C225 were detected when we compared postnatal with the other two stages.

Thus, comparative modeling on both strains suggested that interactions between various transcripts of pax6 and klf4 might be actively involved in cerebellar development. This is based on the strong statistically differential expression pattern between the embryonic (early + late, i.e., EE+LE) and the postnatal (PN) stages, we detected above.

5.2   The interaction between various transcripts of pax6 and klf4 between pre-EGL and EGL expansion

We also studied interactions between the transcripts of pax6 and klf4, by contrasting the interactions between the time duration of pre-EGL expansion (E12, E13, E14, E15) and that of EGL expansion (E16, E17, E18, E19, P0, P3, P6, P9). We detected conserved and differential interactions between various transcripts of the two genes in both strains, as shown in Table 2 and Table 3. These results are obtained differently from previous ones in four aspects. First, manually choosing the order of smoothing splines, shown in Figure 3, may introduce inconsistency in estimating the expression rate, the derivative in equation (2). Second, >0 is required which means if it is not satisfied, all p-values would be set as 1. Third, dissociation consistent and hill coefficient was applied. and were chosen from {5,6,7,8,9,10,11} and {2,3,4,5} respectively. The range was decided in order to capture the non-linear dynamical change of TFs, since the average of all transcripts is 8. Forth, self-influence may exist, shown in equation (2). According to overall significance, , self-influence or not will be chosen along with a combination of and . Two interaction patterns are shown in the phase diagram in Figure 1, to illustrate possible differential and conserved interactions between pre-EGL expansion and EGL expansion involving different transcripts of the two genes by observations. These interaction patterns are also shown in Figure 2 through model predictions.

(2)

Table 2. Interactions between transcripts of pax6 and klf4 in the BL6 strain. Interactions in blue background are differential, and the ones in yellow background are conserved. The rest are inactive interactions.

child transcript

parent transcript

pt

pc

pd

Klf4.1850022

Pax6.101660253

0

0

0.163

Klf4.1850022

Pax6.102340114

0

0

0.05

Klf4.1850022

Pax6.105720411

1

1

1

Klf4.1850022

Pax6.1190025

0

0.018

0.003

Klf4.3830239

Pax6.101660253

0.004

0

0.181

Klf4.3830239

Pax6.102340114

1

1

1

Klf4.3830239

Pax6.105720411

0

0

0.19

Klf4.3830239

Pax6.1190025

0.029

0.017

0.096

Klf4.5570750

Pax6.101660253

1

1

1

Klf4.5570750

Pax6.102340114

1

1

1

Klf4.5570750

Pax6.105720411

1

1

1

Klf4.5570750

Pax6.1190025

0

0

0.178

Table 3. Interactions between transcripts of pax6 and klf4 in the DBA strain. The ones in yellow background are conserved and the rest are inactive interactions.

child transcript

parent transcript

pt

pc

pd

Klf4.1850022

Pax6.101660253

0

0

0.177

Klf4.1850022

Pax6.102340114

0.001

0.001

0.111

Klf4.1850022

Pax6.105720411

0

0

0.183

Klf4.1850022

Pax6.1190025

0.02

0.007

0.089

Klf4.3830239

Pax6.101660253

0.004

0

0.167

Klf4.3830239

Pax6.102340114

0.016

0.001

0.148

Klf4.3830239

Pax6.105720411

0

0

0.175

Klf4.3830239

Pax6.1190025

0.033

0.008

0.106

Klf4.5570750

Pax6.101660253

0.011

0

0.171

Klf4.5570750

Pax6.102340114

0.009

0

0.179

Klf4.5570750

Pax6.105720411

0.082

0

0.195

Klf4.5570750

Pax6.1190025

0.054

0.034

0.092


Figure 1. Differential and conserved interactions between different transcripts of pax6 and klf4 between pre-EGL expansion and EGL expansion viewed through the observations. A label on the plots represents the time of an observation, and its coordinates are the expression level of two transcripts of klf4 and pax6. An arrow represents a transition in expression level over time: the dashed is for pre-EGL expansion, and the solid is for EGL expansion. Left: In a differential interaction between transcripts klf4.1850022 and pax6.1190025, transition in time course shows a diverging trend between the two stages. Right: In a conserved interaction between transcripts klf4.1850022 and pax6.101660253, transitions in both time courses converge to the top right corner. Despite the difference in working zone between the two stages, we still consider this interaction conserved, as they appear to approach a common steady state.


Description: C:\Documents and Settings\ouyang\Desktop\BL6_Klf4.1850022_Pax6.1190025.pngDescription: C:\Documents and Settings\ouyang\Desktop\BL6_Klf4.1850022_Pax6.101660253.png

Figure 2. Differential and conserved interactions between different transcripts of pax6 and klf4 between pre-EGL expansion and EGL expansion viewed through the model predictions. In these plots, the x- and y-axis represent expression levels of a parent transcript and a child transcript while the z-axis is the expression rate of the child transcript. Since we consider degradation and self-influence, both observations of the parent and its child are taken into account. Each point represents observed expression levels of a parent and its child, and the corresponding child expression rate at each time point. The different colors of the points represent different developmental stages, pre-EGL expansion and EGL expansion. The child expression rate is estimated by smoothing splines shown in Figure 3. The surfaces are model predictions. Red and blue surfaces are produced by models for pre-EGL expansion and EGL expansion data, respectively, and the yellow one is generated by a model for pooled data. Top: In a differential interaction for klf4.1850022 from pax6.1190025, the expression levels of the pax6.1190025 have different influence patterns on the expression rate of klf4.1850022, as the blue and red surfaces substantially deviate from the yellow surface. Bottom: In a conserved interaction for klf4.1850022 from pax6.101660253, the expression levels of the pax6.101660253 have the same influence patterns on the expression rate of klf4.185002, as the blue and red surfaces both substantially overlap the yellow surface, though the ranges of the expression levels of pax6.101660253 are different for the two developmental stages.


Figure 3. Three smoothing splines fitted from time course observations. The x-axis in the plots represents time and y-axis the expression levels. The day of birth (P0) is marked as 0. Points are observed gene expression levels. The red lines are smoothing splines. A derivative, or expression rate, at each time point is estimated by smoothing splines. The three plots are for transcript pax6.1190025, pax6.101660253 and klf4.1850022, respectively, in the BL6 strain.

Project funded by funded by NIH Grant HD052472. Please view our funding & data sharing policy
Contact Webmaster