Top 50 frontier genes, graphical attachment view
Full network frontier, XLS spreadsheet 3.2MB
I've calculated the q-value for enrichment in GO categories (using Storey's method) and plotted the enrichment at each attachment point. Additionally, each of these pages plots the LAR scores for a GO category, using only the genes that are attached. Clicking on a plot leads to the full LAR scores for every E-gene in the category, not just the attached E-genes.
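As a rough illustration of the q-value step, here is a minimal sketch of a simplified form of Storey's method. It assumes a fixed tuning parameter `lam` (0.5) for estimating the null proportion, instead of the smoothed estimate over a range of values that the full method uses. The function name and the fallback when the estimate is zero are my own assumptions, not the code used to produce these pages.

```python
def storey_qvalues(pvalues, lam=0.5):
    """Simplified Storey q-values using a fixed tuning parameter `lam`.

    Assumption: pi0 (the proportion of true nulls) is estimated at a single
    lambda, not with the smoothing step of the full method.
    """
    m = len(pvalues)
    if m == 0:
        return []

    # Estimate pi0 from the p-values above lam, which are mostly nulls.
    pi0 = sum(1 for p in pvalues if p > lam) / (m * (1.0 - lam))
    pi0 = min(1.0, pi0)
    if pi0 == 0.0:
        # Assumption: fall back to the conservative pi0 = 1, which
        # makes the result equal to Benjamini-Hochberg adjusted p-values.
        pi0 = 1.0

    # Walk from the largest p-value down, keeping a running minimum so
    # that q-values are monotone in the p-values.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pi0 * m * pvalues[idx] / rank
        running_min = min(running_min, candidate)
        qvalues[idx] = running_min
    return qvalues
```

The q-values come back in the same order as the input p-values, so they can be paired directly with the GO categories they were computed for.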
Enrichment for all attached E-genes, with q-value for attachment
Predicted Network Interactions. A. Inferred S-gene network and Frontier. Nodes represent S-genes (ovals), E-genes (gray boxes), and Gene Ontology categories (white boxes). Arrows indicate activation, and tees indicate repression. Mixed arrow/tee line endings indicate GO set enrichment among both activated and inhibited E-genes. B. Expression values of selected E-genes. Each row shows the log-ratio expression of a single E-gene under various shRNA knockdowns relative to a GFP shRNA knockdown control. C. S-gene interaction confidence. Each pixel in the heatmap corresponds to an S-gene interaction’s bootstrap confidence. For each interaction, the parent S-gene is labeled to the right, and the child S-gene is labeled at the bottom. Note that although NEMs include all transitive interactions, these are not displayed in (A) for simplicity. In the heatmap, a row therefore shows the bootstrap confidence of an S-gene being upstream of other genes, and a column shows the bootstrap confidence of a gene being downstream of other genes.
Top 50 frontier genes, graphical attachment view
Full network frontier. Be careful clicking on the link: it's a 36 MB text file. It can be opened in Excel for easier viewing.
Enrichment for all attached E-genes
| | Signed Connection | Unsigned Connection |
|---|---|---|
| Top 100 | | |
| Top 50 | | |
| Top 30 | | |
| Any connection | | |
I collect all frontier genes connected to an attachment point (e.g. negatively attached to SCN5A), calculate their intersection with every GO category, then calculate a p-value for each intersection using the hypergeometric distribution (no multiple-testing correction).
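The enrichment calculation described above can be sketched as follows. This is a minimal illustration that assumes a one-sided (over-representation) test and a gene universe supplied by the caller. The function names and the dictionary layout of the GO categories are hypothetical, not taken from the actual analysis code.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_universe, n_category, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value, P(X >= n_overlap).

    X is the number of category members seen when drawing n_selected genes
    without replacement from a universe of n_universe genes, of which
    n_category belong to the category.
    """
    upper = min(n_category, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_category, k) * comb(n_universe - n_category, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def go_enrichment(frontier_genes, go_categories, universe):
    """Return {category name: p-value} for one attachment point.

    No multiple-testing correction is applied, matching the text above.
    """
    universe = set(universe)
    frontier = set(frontier_genes) & universe
    results = {}
    for name, members in go_categories.items():
        members = set(members) & universe
        overlap = len(frontier & members)
        results[name] = hypergeom_enrichment_pvalue(
            len(universe), len(members), len(frontier), overlap
        )
    return results
```

Restricting both the frontier set and each GO category to the same universe keeps the counts consistent. Which genes make up that universe (all E-genes on the array, or only annotated ones) is a choice the text does not specify.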
There are two methods of gathering a set of frontier genes for a connection point:
Predicted Network with notes and bootstrap confidence: PDF OmniGraffle
Bootstrap confidence matrix: PNG PDF
Full-genome Network Expansion XLS TAB
Frontier GO Category enrichment (GSEA): Minimum size 10 (TAB) Minimum size 5 (TAB) All categories (TAB)
Clustering of array replicates: PNG PDF
Heatmap of selected E-genes: PNG PDF
The GFP controls clustered in one of two ways: either independently of the knockdowns (for tier1 and tier3a), or by replicate set (for tier2 and tier3). An independent GFP cluster suggests that we should subtract the mean GFP levels from all replicates (MeanGFPControl). GFP replicates being mixed into each replicate set's cluster indicates that we should use a different GFP control for each replicate set (ReplicateSetGFPControl).
Also, SCN5A is not yet being treated correctly. Its expression log-ratios are very close to zero compared to the other arrays, so I should probably estimate separate differential expression parameters for it. See the MeanGFPControl boxplot.
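To make the difference between the two GFP control strategies concrete, here is a minimal sketch. It assumes each array is a dictionary with hypothetical keys `replicate_set`, `condition`, and `values` (log-scale expression, one entry per E-gene), so that subtracting a GFP baseline yields log-ratios. This layout is an assumption for illustration only, not the format of the actual data files.

```python
def mean_vector(vectors):
    """Element-wise mean of equal-length lists."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def subtract(values, baseline):
    """Element-wise difference, values minus baseline."""
    return [v - b for v, b in zip(values, baseline)]


def mean_gfp_control(arrays):
    """MeanGFPControl: one baseline, the mean over all GFP arrays."""
    gfp = [a["values"] for a in arrays if a["condition"] == "GFP"]
    baseline = mean_vector(gfp)
    return [
        dict(a, values=subtract(a["values"], baseline))
        for a in arrays
        if a["condition"] != "GFP"
    ]


def replicate_set_gfp_control(arrays):
    """ReplicateSetGFPControl: a separate GFP baseline per replicate set."""
    gfp_by_set = {}
    for a in arrays:
        if a["condition"] == "GFP":
            gfp_by_set.setdefault(a["replicate_set"], []).append(a["values"])
    baselines = {s: mean_vector(v) for s, v in gfp_by_set.items()}
    return [
        dict(a, values=subtract(a["values"], baselines[a["replicate_set"]]))
        for a in arrays
        if a["condition"] != "GFP"
    ]
```

Both functions return only the knockdown arrays, with values replaced by log-ratios against the chosen GFP baseline. The second one requires every replicate set to contain at least one GFP array.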
Wiki page for the methods paper: KnockoutNets