Chapter 15 A Case Study

Let's take a simple RNA seq experiment and follow the steps from start to finish.

Set up the project folder and repo

First, I navigate to my projects folder. Since this will be an RNA seq project, I clone the RNA seq template from GitHub:

git clone --depth 1 https://github.com/NielInfante/RNAseq.git Case_study_2019_02

This creates the folder Case_study_2019_02 and copies everything from the template repository into it. The --depth 1 flag tells git to fetch only the latest commit, without the revision history, which saves a little space and time.
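To see what the depth flag does without touching the network, here is a minimal sketch that builds a throwaway local "template" with two commits and shallow-clones it. The folder names and the demo identity are made up for illustration; only the git commands themselves are real.

```shell
set -eu
work=$(mktemp -d)

# Build a tiny stand-in for the template repository (two commits).
git init -q "$work/template"
cd "$work/template"
git config user.email "demo@example.com"   # demo identity, not a real account
git config user.name "Demo"
echo "v1" > README.md && git add README.md && git commit -q -m "first"
echo "v2" > README.md && git commit -q -am "second"

# The file:// prefix makes git use its normal transport, so --depth is honored
# even though the source is a local path.
git clone -q --depth 1 "file://$work/template" "$work/Case_study_demo"
cd "$work/Case_study_demo"

full_count=$(git -C "$work/template" rev-list --count HEAD)
shallow_count=$(git rev-list --count HEAD)
is_shallow=$(git rev-parse --is-shallow-repository)
echo "template commits: $full_count, clone commits: $shallow_count, shallow: $is_shallow"
```

The clone holds the current files but only a single commit, which is all a new project needs from the template.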

Now go to GitHub and create a new repository called Case_study_2019_02. Do not initialize it with a README; you want an empty repository that the existing project can be pushed into.

Then, substituting your own GitHub ID and the name you want for the project:

git remote add origin https://github.com/NielInfante/Case_study_2019_02.git
git push -u origin master

You now have your project initialized and connected to a GitHub repository.

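If the above gives you an error (for example, because the cloned folder already has a remote named origin that points at the template), you can start over with a fresh history: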

rm -rf .git
git init
git add -A
git commit -a -m 'Initial Commit'
git remote add origin https://github.com/NielInfante/Case_study_2019_02.git
git push -u origin master
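If the only problem is that origin already exists, a gentler alternative to deleting .git may be to repoint the existing remote. The sketch below reproduces the situation in a throwaway local repository; YOUR_ID and YOUR_PROJECT are placeholders, and nothing is pushed. It does not help when the push itself is rejected, in which case the fresh-history steps above remain the fallback.

```shell
set -eu
work=$(mktemp -d)

# Stand-in for the template, with a single empty commit.
git init -q "$work/template"
git -C "$work/template" -c user.email=demo@example.com -c user.name=Demo \
    commit -q --allow-empty -m "template"

git clone -q "$work/template" "$work/project"
cd "$work/project"

# A clone already has a remote called origin, so adding another one is refused.
if git remote add origin https://github.com/YOUR_ID/YOUR_PROJECT.git 2>/dev/null; then
    add_status="added"
else
    add_status="refused"
fi

# Repoint the existing remote instead (placeholder URL).
git remote set-url origin https://github.com/YOUR_ID/YOUR_PROJECT.git
new_url=$(git remote get-url origin)
echo "remote add: $add_status; origin now: $new_url"
```

Running git remote -v in the real project folder shows where origin currently points before you change anything.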

Get the data

Should I do this with Agazie data? Project ID 22934923

I will be getting the data from BaseSpace (see 16.1). The project ID is 22934923. This is a human RNA seq experiment (explanation here; link to paper, if available).

It's human RNA, and it's old. Check with him; it may be published by now.

Use getFastq.R to get data

15.0.1 Create Conda Environment

Find the versions that you want. Install specific versions (generally the latest) and keep track of them for reproducibility. Copying the conda create command into your README will record this.

conda search fastqc
conda search multiqc
conda search salmon

conda create -n case_study_2019_02 fastqc=0.11.8 multiqc=1.7 salmon=0.12.0
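One way to make sure the README and the shell agree is to keep the command in a variable and append it to the README from there. This is only a sketch: the README.md file name and the section heading are assumptions about the template, and the step that actually creates the environment is left commented out.

```shell
set -eu
readme=${README:-README.md}   # assumed README location; override with README=...

# Keep the exact command in one place so the record matches what was run.
create_cmd='conda create -n case_study_2019_02 fastqc=0.11.8 multiqc=1.7 salmon=0.12.0'

{
    printf '\n## Conda environment\n\n'
    printf 'Created on %s with:\n\n' "$(date +%Y-%m-%d)"
    printf '    %s\n' "$create_cmd"
} >> "$readme"

# Uncomment to actually create the environment:
# eval "$create_cmd"
# Optionally record the fully resolved package list as well:
# conda env export -n case_study_2019_02 > environment.yml
```

Committing the README right after this step ties the recorded versions to the project history.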

Do QC

Run fastqc on the reads, then summarize the reports with multiqc.
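A minimal sketch of this step, assuming the reads are gzipped FASTQ files in a folder called fastq and that reports should go under qc; both folder names and the .fastq.gz extension are assumptions, not something the template is known to prescribe. By default the script only prints the commands; set DRY_RUN=0 inside the activated conda environment to execute them.

```shell
set -eu
fastq_dir=${FASTQ_DIR:-fastq}   # assumed location of the downloaded reads
qc_dir=${QC_DIR:-qc}            # assumed output folder
dry_run=${DRY_RUN:-1}           # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$dry_run" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

mkdir -p "$qc_dir/fastqc"

# One FastQC report per FASTQ file.
run fastqc --outdir "$qc_dir/fastqc" "$fastq_dir"/*.fastq.gz

# A single MultiQC report that aggregates the FastQC output.
run multiqc "$qc_dir/fastqc" --outdir "$qc_dir"
```

Checking the printed commands first is a cheap way to catch a wrong folder name before any tool is launched.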

Quantification

Get transcripts, merge them, get IDs

Activate the conda environment, build the salmon index, and do the quantification.
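As a rough sketch of what the salmon step could look like: the file name transcripts.fa, the folder names, the _R1/_R2 naming, and the assumption of paired-end reads are all illustrative and should be adapted to the actual data. The script prints the commands by default; set DRY_RUN=0 in the activated environment to run them.

```shell
set -eu
transcripts=${TRANSCRIPTS:-transcripts.fa}   # hypothetical merged transcript FASTA
index_dir=${INDEX_DIR:-salmon_index}         # hypothetical index folder
fastq_dir=${FASTQ_DIR:-fastq}                # assumed location of the reads
quant_dir=${QUANT_DIR:-quant}                # assumed output folder
dry_run=${DRY_RUN:-1}                        # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$dry_run" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Derive the sample name from a read-1 file name (assumes an _R1.fastq.gz suffix).
sample_name() {
    basename "$1" _R1.fastq.gz
}

# Build the index once from the transcript sequences.
run salmon index -t "$transcripts" -i "$index_dir"

# Quantify each sample; -l A asks salmon to infer the library type.
for r1 in "$fastq_dir"/*_R1.fastq.gz; do
    [ -e "$r1" ] || continue                 # skip when no files match the pattern
    sample=$(sample_name "$r1")
    r2="$fastq_dir/${sample}_R2.fastq.gz"
    run salmon quant -i "$index_dir" -l A -1 "$r1" -2 "$r2" \
        --validateMappings -o "$quant_dir/$sample"
done
```

Each sample ends up in its own folder under the output directory, which keeps the per-sample results easy to find for the differential expression step.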

Differential Expression

DESeq2

GO, GSEA, pathway