TIMEOR accepts two input types: (1) raw .fastq files and an SraRunTable (e.g. here), or (2) an RNA-seq time-series read count matrix (e.g. here) and a metadata file (e.g. here).
We strongly encourage the user to input a read count matrix (and associated metadata file) when possible, as the input file size limit is 10GB.
For larger dataset processing, the user is encouraged to use our ready-to-use Docker image. Read 4 steps here, in Tutorials (left side-bar), ‘Web Server’ tab, ‘Local Installation’ section. If that is not possible, please feel free to contact us for specific space requirements. We are happy to help.
While TIMEOR analysis is running, simply make sure to revisit the page at least once an hour.
We strongly encourage the user to keep ‘5.Compare multiple methods’ set to ‘Yes’ to see TIMEOR’s full functionality.
Please click each button just once.
Once analysis has begun, please proceed through TIMEOR sequentially. The user can revisit previous tabs, but should only move forward one tab at a time. Before beginning the analysis, the user can skim through each tab to see what is to come.
The user can download the demo data and go through the tutorial to get a sense for how long the analysis will take given user interface interaction (such as choosing a certain number of clusters besides 3).
TIMEOR supports these types of time-series data (note this is asked in Question 3 of “Determine Adaptive Default Methods”):
Please compare one set of time-series experiments at a time.
NOTE: Importantly, we assume that each replicate batch is sampled at every time point. For example, batch 1 across 4 time points corresponds to replicate 1 at time point 1, replicate 1 at time point 2, replicate 1 at time point 3, and replicate 1 at time point 4, and likewise for all other batches. This structure is adopted to work with all three differential expression methods. Moreover, it is a common structure to control for non-biological variation in a time-series experiment. For example, say RNA-seq was performed on a cell line after insulin stimulation at 10 consecutive time points, every 20 minutes, with three biological replicates (such as in our manuscript). Comparing the three biological replicates allows non-biological factors to be taken into account when determining temporal differential expression.
Some time-series experimental designs are complex. In those cases, it is advised to reach out to us with any questions before beginning the analysis. We are very willing to help, and are responsive!
It is advised to skip past the ‘Enrichment’ tab if time is limited, as programs such as MEME can take a long time, even though we limited the motif size to a maximum of 20 base pairs. The user can certainly go back to the ‘Enrichment’ tab once the rest of the analysis is complete.
It is advised to wait until any running process is finished before downloading results or logs to ensure a successful download.
Thank you for using TIMEOR! Please help us improve to better assist you. Please contact us with questions, ideas, and suggestions. If an error occurs with your data, please download the log file (far left) to check. When contacting us with questions, please send the time, the log file, and if possible a screenshot so we know where in TIMEOR you are.
Import SraRunTable from GEO*, where TIMEOR will process raw data by retrieving .fastq files and performing quality control, alignment, and read count matrix creation. Read the first tab of TIMEOR (Getting Started) for information about this input specification. Read this section for information about how to process these data in TIMEOR. We strongly encourage users to upload a read count matrix, or to process raw .fastq data via TIMEOR’s interface locally using Docker (see 4 steps here) in Tutorials (left side-bar), ‘Web Server’ tab, ‘Local Installation’ section.
Import metadata file** and count matrix*** (skipping raw data retrieval, quality control, alignment, and read count matrix creation) and proceed straight to normalization and correction. Read this section for information about how to process these data in TIMEOR.
Then simply follow the prompts. Fill out the grey boxes to begin interacting with each stage and tab.
NOTE: see first tab of TIMEOR called Getting Started for specifications.
ID, condition, time, batch
An example might be:
ID batch condition time
simT0.1 1 control 0
simT0.2 2 control 0
simT0.3 3 control 0
simT1.1 1 case 1
simT1.2 2 case 1
simT1.3 3 case 1
simT2.1 1 case 2
simT2.2 2 case 2
simT2.3 3 case 2
simT3.1 1 case 3
simT3.2 2 case 3
simT3.3 3 case 3
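The replicate structure described above (each batch sampled at every time point) can be checked before upload. The following is a minimal sketch, not part of TIMEOR: the function name `check_metadata` is hypothetical, and it assumes the metadata is whitespace-delimited exactly as displayed in the example above. If the actual file uses another delimiter, the splitting step would need to change.

```python
from collections import defaultdict

# Column names taken from the example metadata shown above.
REQUIRED_COLUMNS = {"ID", "batch", "condition", "time"}


def check_metadata(text):
    """Return a list of problems found in whitespace-delimited metadata text.

    Hypothetical helper: assumes one header row followed by one row per sample.
    """
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    header, records = rows[0], rows[1:]

    missing = REQUIRED_COLUMNS - set(header)
    if missing:
        return [f"missing columns: {sorted(missing)}"]

    table = [dict(zip(header, record)) for record in records]
    problems = []

    # Collect which batches were sampled at each time point.
    batches_by_time = defaultdict(set)
    for row in table:
        batches_by_time[row["time"]].add(row["batch"])

    # Every batch seen anywhere should appear at every time point.
    all_batches = set().union(*batches_by_time.values())
    for time, batches in sorted(batches_by_time.items()):
        absent = all_batches - batches
        if absent:
            problems.append(f"time {time} lacks batch(es) {sorted(absent)}")

    # Sample IDs should be unique.
    ids = [row["ID"] for row in table]
    if len(ids) != len(set(ids)):
        problems.append("duplicate sample IDs")

    return problems


EXAMPLE = """ID batch condition time
simT0.1 1 control 0
simT0.2 2 control 0
simT0.3 3 control 0
simT1.1 1 case 1
simT1.2 2 case 1
simT1.3 3 case 1
"""

print(check_metadata(EXAMPLE))  # prints []
```

An empty list means the first rows of the example pass the check; removing one replicate row would be reported as a time point lacking a batch.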
Overall the user must select at least the organism, sequencing, and experiment type, then load metadata or SraRunTable.txt.
Question 1 asks: “What type of organism?” The user can choose from fruit fly, human, or mouse.
Question 2 asks: “What type of sequencing?” If the user is uploading a read count matrix (strongly encouraged), the user can choose “not applicable”.
Question 3 asks: “What type of experiment?” There are two options, “case vs. control” and “just case or control”, which are the types of time-series that TIMEOR supports (see this section).
Question 4 asks: “What type of time-series?” There are three options - “close time point and long time series”, “close time point and short time series”, and “distant time point”. Based on the user’s understanding of the biological system, the user should decide whether the timepoints are considered close or far in time. This question is important to determine how to model differential gene expression (DE) trajectories over time.
Question 5 asks: “Compare multiple methods (alignment and differential expression)?” If this question is left set to ‘Yes’ (which is strongly encouraged), TIMEOR will run all methods so that the user can determine the best-suited method. This is important because in many cases the categorical method DESeq2, which does not consider gene trajectories, still returns a robust set of differentially expressed genes. If this is set to ‘No’, TIMEOR will run HISAT2 for alignment (if applicable) and, for DE, either DESeq2 (if distant time points were selected in Question 4) or ImpulseDE2 (if close time points were selected in Question 4).
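The defaults described for Question 5 can be summarized as a small decision function. This is only an illustrative sketch of the rule as stated above, not TIMEOR's own code: the function name, argument names, and the "all" placeholder are hypothetical.

```python
# Answers offered by Question 4, as listed above.
TIME_SERIES_TYPES = (
    "close time point and long time series",
    "close time point and short time series",
    "distant time point",
)


def default_methods(compare_multiple, time_series_type, raw_reads):
    """Sketch of which methods run for a given set of answers.

    compare_multiple: answer to Question 5 (True for 'Yes').
    time_series_type: answer to Question 4.
    raw_reads: True when starting from .fastq files, so alignment applies.
    """
    if time_series_type not in TIME_SERIES_TYPES:
        raise ValueError(f"unknown time-series type: {time_series_type!r}")

    if compare_multiple:
        # 'Yes': every available method is run; "all" is a placeholder here.
        return {"alignment": "all" if raw_reads else None, "de": "all"}

    close = time_series_type.startswith("close time point")
    return {
        "alignment": "HISAT2" if raw_reads else None,
        "de": "ImpulseDE2" if close else "DESeq2",
    }


print(default_methods(False, "distant time point", raw_reads=False))
# prints {'alignment': None, 'de': 'DESeq2'}
```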
Question 6 asks: “What is the maximum number of time steps over which one gene can influence the transcription of another gene?” This question prompts the user to tell TIMEOR the window of time over which one gene can directly influence another. Within this window all interactions are considered. It is advised to keep this value small if the time points are spaced far apart. Said differently, if the answer to Question 6 were 2, then at each time point \(t\) for a differentially expressed gene \(g\), TIMEOR would ask which potential interactions \(g\) has with other TFs across \(t+1\) and \(t+2\).
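To make the window in Question 6 concrete, the sketch below lists, for each sampled time point, the later time points that fall inside the window. It only illustrates the indexing described above; the function name and the sampling times are hypothetical, and it says nothing about how TIMEOR scores interactions.

```python
def influence_windows(time_points, max_steps):
    """Map each time point t to the time points t+1 .. t+max_steps.

    Steps are counted in sampled time points, not in elapsed time.
    """
    return {
        t: time_points[i + 1 : i + 1 + max_steps]
        for i, t in enumerate(time_points)
    }


# Hypothetical sampling times in minutes, with Question 6 answered as 2.
print(influence_windows([0, 20, 40, 60], max_steps=2))
# prints {0: [20, 40], 20: [40, 60], 40: [60], 60: []}
```

The window shrinks near the end of the series because no later samples exist there.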
“Normalize and Correct” tab: there are two normalization options - upper quartile and trimmed mean of M-values. It is advised to try both methods through TIMEOR’s interactive interface because the influence of normalization differs depending on the RNA-seq data structure.
“Normalize and Correct” tab: there are two options for correlating samples/replicates using the Pearson or Spearman correlation. The choice of correlation method depends heavily on the assumptions the user wants to make about their data, and it is encouraged to try both in TIMEOR’s interface. The user knows more about which samples/replicates (e.g. time points) should cluster together and how to identify outliers.
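The difference between the two correlation options can be seen on a toy pair of samples. The sketch below is a plain-Python illustration with hypothetical values, not TIMEOR's implementation: Pearson measures linear agreement, while Spearman applies the same formula to ranks and therefore only measures whether the ordering agrees.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical expression values for four genes in two samples.
sample_a = [1, 10, 100, 1000]
sample_b = [2, 4, 6, 8]

print(round(pearson(sample_a, sample_b), 3))
print(round(spearman(sample_a, sample_b), 3))
```

Here the two samples order the genes identically, so Spearman is at its maximum, while Pearson is lower because the relationship is not linear. This is why the two options can group replicates differently.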
“Primary Analysis” stage: the user can choose to allow TIMEOR to automatically cluster the DE gene trajectories, or the user can choose the number of gene trajectory clusters. Importantly, finding the optimal solution to this hierarchical clustering problem is NP-hard. Thus, user input is needed to assess a reasonable number of clusters for downstream analysis. To help, TIMEOR provides an automatic clustering option (PDF visible when folder is downloaded) which takes the mode among three unsupervised clustering methods (partition around medoids (Reynolds et al. 2006), Silhouette (Rousseeuw et al. 1987), and Calinski criterion (Calinski et al. 1974)) to automatically return the number of gene trajectory clusters to the user. TIMEOR also provides an Elbow plot to show the user how the explained variation changes as a function of the number of clusters. The user can leverage this plot by picking the elbow of the curve. The user is encouraged to use the interactive clustermap and the clustering plots (available on download) to determine whether the automatic clustering option provides suitable clusters.
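Taking the mode of three suggested cluster counts can be sketched as follows. This is an illustration only, with a hypothetical function name. The text above does not say what happens when all three methods disagree, so returning `None` in that case (leaving the choice to the user) is an assumption of this sketch.

```python
from collections import Counter


def consensus_cluster_count(k_pam, k_silhouette, k_calinski):
    """Return the most common of three suggested cluster counts.

    Returns None when all three differ (an assumption of this sketch,
    not documented TIMEOR behavior).
    """
    counts = Counter([k_pam, k_silhouette, k_calinski])
    k, votes = counts.most_common(1)[0]
    if votes == 1:
        return None  # no value was suggested more than once
    return k


print(consensus_cluster_count(3, 3, 5))  # prints 3
print(consensus_cluster_count(2, 3, 5))  # prints None
```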
“Primary Analysis” stage: NOTE, there is not a fold change cut-off for the DE gene trajectories, only an adjusted p-value cutoff. This allows the user to view significant differences in expression trajectories while the fold change might be smaller. This is useful to observe changes for genes including non-coding genes and genes involved in dosage compensation.
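Filtering on adjusted p-value alone, with no fold change cut-off, can be illustrated as below. The gene names, statistics, and the 0.05 threshold are all hypothetical values chosen for the example; the actual cutoff is set in TIMEOR's interface.

```python
def significant_genes(results, padj_cutoff=0.05):
    """Keep genes by adjusted p-value only; fold change is never consulted."""
    return [
        gene
        for gene, stats in results.items()
        if stats["padj"] is not None and stats["padj"] < padj_cutoff
    ]


# Hypothetical differential expression results.
results = {
    "geneA": {"padj": 0.001, "log2fc": 0.3},  # small change, significant
    "geneB": {"padj": 0.2, "log2fc": 4.0},    # large change, not significant
    "geneC": {"padj": 0.04, "log2fc": -2.1},
}

print(significant_genes(results))  # prints ['geneA', 'geneC']
```

geneA is retained despite its small fold change, which is the behavior the note above describes.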
“Secondary Analysis: Factor Binding”: the user is encouraged to “see each method’s predicted transcription factors” and search for protein-DNA data (in .bigWig format) to view the binding profile of that transcription factor across each gene trajectory cluster.
“Secondary Analysis: Temporal Relations”: the user can add additional genes or transcription factors (potentially viewed on Factor Binding tab) to the final gene regulatory network (GRN) within STRINGdb. NOTE: TIMEOR only reports the TF GRN using the observed and top one predicted TFs from the “Observed and Top Predicted Transcription Factors” table. The user is encouraged to view the results from individual methods (on Factor Binding tab) when constructing the final GRN, and view Temporal Relations Table to uncover the lead and lag relationships between TFs.
To run TIMEOR outside of the website (recommended for preprocessing from raw .fastq files), users may use Docker and Docker Hub. First, clone the TIMEOR repository (https://github.com/ashleymaeconard/TIMEOR.git). Docker must also be installed (version 20.10.0 recommended).
Download the genome folder (/genomes_info/) into a desired location (e.g. /Users/USERNAME/Desktop/test_folder/genomes_info/) to mount later.
/genomes_info/dme/
/genomes_info/mmu/
/genomes_info/hsa/
/genomes_info/: https://drive.google.com/drive/folders/1KEnpCOU0dQU5p1tnEy3o9l02NE0uYnpm?usp=sharing
Make sure the files in /genomes_info/ are readable, e.g.:
$ chmod -R 777 /Users/USERNAME/Desktop/test_folder/genomes_info/dme/
$ docker pull ashleymaeconard/timeor:latest
$ docker images
$ docker run -v /Users/USERNAME/Desktop/test_folder/:/srv/ -p 3838:3838 <IMAGE_ID>
Then open localhost:3838 in a browser.
NOTE: This could take a while. Please follow these commands:
$ cd /PATH/TO/TIMEOR/
$ docker build -t timeor_env .
$ docker container ls
$ docker exec -it <CONTAINER_NAME> /bin/bash