DMS (dimethyl sulfate) has emerged as one of the pre-eminent choices for RNA structure determination. DMS can be added to cells, tissues, or in vitro solution, and it rapidly and specifically modifies solvent-accessible adenines and cytosines at their Watson–Crick base-pairing positions. Under standard experimental conditions, an accessible nucleotide has a ~2% chance of reacting with DMS, which results in multiple DMS modifications per RNA molecule. The DMS modifications therefore report on the folding of each individual RNA molecule.
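To see why a ~2% per-nucleotide reaction probability leads to multiple modifications per molecule, the sketch below treats each accessible nucleotide as an independent trial. Independence and the example count of 150 accessible nucleotides are illustrative assumptions, not values from the protocol, and the function names are hypothetical.

```python
def expected_modifications(n_accessible: int, p_react: float = 0.02) -> float:
    """Expected number of DMS modifications on one RNA molecule.

    Assumes each accessible A/C reacts independently with probability p_react
    (an illustrative simplification).
    """
    return n_accessible * p_react


def prob_at_least_two(n_accessible: int, p_react: float = 0.02) -> float:
    """Probability that a molecule carries two or more modifications (binomial model)."""
    p_zero = (1 - p_react) ** n_accessible
    p_one = n_accessible * p_react * (1 - p_react) ** (n_accessible - 1)
    return 1 - p_zero - p_one
```

Under these assumptions, an RNA with 150 accessible nucleotides carries about three modifications on average, and roughly 80% of molecules carry at least two.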
DMS mutational profiling with sequencing (DMS-MaPseq) encodes DMS modifications as mismatches that are incorporated during reverse transcription by a thermostable group II intron reverse transcriptase (TGIRT). Due to the high fidelity of TGIRT, the background incorporation of mismatches is typically lower than sequencing error. Thus, the observed rate of a mismatch at a given nucleotide is directly proportional to its DMS reactivity. A major advantage of DMS-MaPseq is that any RNA of interest can be targeted for library generation and analysis using sequence-specific primers.
The MaP analysis web tool provides a simple platform for analyzing the DMS reactivity of an RNA. The user input is a
raw sequencing file (.fastq) generated from a DMS-MaPseq experiment and the sequence of the RNA of interest
(.fasta). The DREEM algorithm performs sequence alignment using bowtie2 and outputs the mismatch rate per
nucleotide.
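The idea of a per-nucleotide mismatch rate can be illustrated with a minimal sketch. This is not the DREEM implementation: it assumes ungapped alignments supplied as (0-based start position, read sequence) pairs, skips ambiguous "N" calls, and reports 0.0 where there is no coverage. The function name and input format are hypothetical.

```python
from typing import List, Tuple


def mismatch_rates(reference: str, aligned_reads: List[Tuple[int, str]]) -> List[float]:
    """Return mismatches / coverage for every position of the reference."""
    coverage = [0] * len(reference)
    mismatches = [0] * len(reference)
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            pos = start + offset
            # Ignore bases that fall outside the reference or are uncalled.
            if pos < 0 or pos >= len(reference) or base == "N":
                continue
            coverage[pos] += 1
            if base != reference[pos]:
                mismatches[pos] += 1
    # Positions without coverage are reported as 0.0 rather than dividing by zero.
    return [m / c if c else 0.0 for m, c in zip(mismatches, coverage)]
```

For example, `mismatch_rates("ACGT", [(0, "ACGT"), (0, "ATGT"), (1, "CGT")])` gives a non-zero rate only at the second position, where one of three covering reads differs from the reference.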
Here we give a brief explanation of what is being done under the hood. The code is freely available and can
be downloaded here.
Each run generates three directories: input, output, and log.
fastqc --extract fq1 fq2 --outdir=output/Mapping_Files/
fq1 and fq2 are the two supplied fastq files. If only one is supplied, the fq2 argument is left out. It is
recommended to supply both the forward and reverse reads if available, as they give additional confidence in
nucleotide identity.
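The optional second fastq file can be handled as in the sketch below, which assembles the fastqc argument list for one or two input files. The function name and its default output directory are illustrative assumptions; only the flags shown in the command above are used.

```python
from typing import List, Optional


def fastqc_command(fq1: str, fq2: Optional[str] = None,
                   outdir: str = "output/Mapping_Files/") -> List[str]:
    """Build the fastqc argument list, leaving out fq2 when it is not supplied."""
    cmd = ["fastqc", "--extract", fq1]
    if fq2:
        cmd.append(fq2)
    cmd.append(f"--outdir={outdir}")
    return cmd
```

If fastqc is installed, the resulting list could be passed to `subprocess.run(cmd, check=True)`.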
trim_galore --fastqc --paired fq1 fq2 -o output/Mapping_Files/
fq1 and fq2 are the two supplied fastq files.
Using bowtie2-build, bowtie2 will create an index of each reference sequence in the supplied fasta file. This is
required for aligning. The command below puts these index files in the input/ directory:
bowtie2-build test.fasta input/test
bowtie2 --local --no-unal --no-discordant --no-mixed -X 1000 -L 12 -x input/test -1 fq1 -2 fq2 -S output/Mapping_Files/aligned.sam
--local runs bowtie2 in local mode, which allows a read to match part of a reference sequence.
--no-unal excludes reads that do not align to any of the reference sequences from the final sequence alignment file (SAM).
--no-discordant removes read pairs that do not align concordantly. Here is a full definition of concordant pairs.
--no-mixed disables aligning the two reads of a pair individually when no concordant alignment is possible.
-X 1000 sets the maximum fragment length for a valid paired-end alignment to 1000 nt, which allows for gaps between the two reads.
-L 12 uses a seed length of 12 nt.
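With the options above, every record in aligned.sam should be mapped and part of a concordant pair. The sketch below is one way to check this, relying on the SAM flag bits 0x2 (properly paired, which bowtie2 sets for concordant pairs) and 0x4 (unmapped). The function name and the returned dictionary layout are illustrative assumptions.

```python
from typing import Dict, Iterable


def summarize_sam(lines: Iterable[str]) -> Dict[str, int]:
    """Count alignment records, properly paired records, and unmapped records."""
    total = proper = unmapped = 0
    for line in lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        flag = int(fields[1])
        total += 1
        if flag & 0x2:
            proper += 1
        if flag & 0x4:
            unmapped += 1
    return {"total": total, "properly_paired": proper, "unmapped": unmapped}
```

A typical use would be `summarize_sam(open("output/Mapping_Files/aligned.sam"))`; with the bowtie2 options described above, one would expect "properly_paired" to equal "total" and "unmapped" to be zero.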
Each job generates a structural constraint file for each sequence in the supplied fasta. These files end with
"_struc_constraint.txt". You can use the RNAstructure software package (https://rna.urmc.rochester.edu/RNAstructure.html)
to predict a secondary structure from a sequence and this file. If you have the software installed, run:
Fold -m 3 test.fasta -dms test_struc_constraint.txt out.ct
ct2dot out.ct out.db
where test.fasta contains a single sequence of interest and test_struc_constraint.txt is the output structural
constraint file.
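Once a structure has been predicted, the base pairs can be read from the dot-bracket string. The sketch below assumes the structure line in out.db uses standard dot-bracket notation with round brackets only (no pseudoknot symbols); the function name is hypothetical and the code is not part of RNAstructure.

```python
from typing import List, Tuple


def dot_bracket_pairs(structure: str) -> List[Tuple[int, int]]:
    """Return sorted 0-based (i, j) base pairs encoded by a dot-bracket string."""
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)
```

For instance, the string "((..))" describes a small hairpin in which the first nucleotide pairs with the last and the second pairs with the fifth.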
You can also use their webserver:
Upload your fasta file containing a single sequence to the field "Select Sequence File:"
Upload the constraint file generated in "Select SHAPE Constraints File:". This will work even if you have DMS
For questions about DREEM, MaP analysis, and DMS-MaPseq, please contact Silvi Rouskin (silvi@hms.harvard.edu).
For questions about the server, please contact Joseph Yesselman (jyesselm@unl.edu).