[Lecture Notes] Fundamentals of Biotechnology - L5: RNA technology

Ribozymes: Non-coding RNA

Small RNAs such as snRNA, snoRNA and gRNA are all important in RNA processing (for example, removing introns). They are non-coding RNAs (ncRNAs), and those in which the RNA itself does the catalytic work are called ribozymes.

ncRNA can help modulate gene expression. For example, antisense RNA can bind to mRNA to block its translation. This spawned the idea of using antisense RNA to attenuate the synthesis of proteins that cause disease.

RNA interference (RNAi) protects against RNA viruses. Small interfering RNAs (siRNAs) find viral mRNA and signal its digestion. CRISPR RNAs are the bacterial counterpart of RNA interference, and the CRISPR system has been introduced into eukaryotes to make specific changes in genes, such as inserting tags or making deletions.

Telomeres shorten during DNA replication because the ends of chromosomes cannot be synthesised without a 3’-OH group to extend from. Telomerase uses RNA to regenerate the telomeres that were lost during replication. Telomerase contains an RNA template and a reverse transcriptase subunit. The RNA template is near the reverse transcriptase binding site, and from this assembly chromosomes can be lengthened by the addition of telomeric DNA.
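
The template mechanism can be sketched in a few lines of Python. This is a toy model and not from the textbook: the template CCCUAA is a simplified stand-in, chosen so that its reverse transcript is the human telomeric repeat TTAGGG, and the function names and the example chromosome end are hypothetical.

```python
# Toy model of telomerase: the RNA template is reverse transcribed
# onto the 3' end of the chromosome, one repeat per round.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """Return the DNA strand complementary to an RNA template, written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))


def extend_telomere(chromosome_end: str, template_rna: str, rounds: int) -> str:
    """Append `rounds` copies of the reverse transcribed template to the 3' end."""
    repeat = reverse_transcribe(template_rna)
    return chromosome_end + repeat * rounds


TEMPLATE = "CCCUAA"          # assumption: simplified telomerase RNA template
shortened = "GATTACATTAGGG"  # hypothetical chromosome end after replication
print(extend_telomere(shortened, TEMPLATE, rounds=3))
```

The point of the sketch is only that the repeat sequence is dictated by the RNA, so the enzyme carries its own template rather than copying the chromosome.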

Gene dosage compensation

We spoke about X chromosome inactivation previously. This is different in insects. In Drosophila, RNAs called roX1 and roX2 are complexed with 5 proteins to form the MSL complex, which binds to the male X chromosome and ramps up its transcription. So female insects do not inactivate one of their X chromosomes; instead, males make their single X chromosome hyperactive.

Humans and other mammals inactivate one of the two X chromosomes in female cells. Xist, which is a non-coding RNA, coats the inactive chromosome. The Xist gene on the active X chromosome is itself switched off by methylation, whereas the Xist gene on the inactivated chromosome is on, so the inactive chromosome is silenced by its own Xist RNA. The Xist locus also gives rise to Tsix, an antisense RNA which regulates Xist gene expression.

Piwi interacting RNAs (piRNAs)

These help maintain the eukaryotic genome. They are 24-30 nucleotides in length. They are coded for in the genome and are clustered, sometimes within introns. They have sequences complementary to endogenous transposons, which are found in the centromere or telomere areas. When piRNAs are expressed, Argonaute proteins find them and cleave them into pieces so that they bind to any RNA produced by a transposon, stopping transposons from moving location.

RNA protection from viruses

CRISPR is found in bacteria and archaea. It can protect against RNA or DNA viruses, hostile plasmids and transposons. CRISPR stores foreign-derived genetic sequences on the bacterial chromosome with palindromic repeats between them. When a foreign piece of DNA or RNA with the same sequence turns up in the bacterium, the CRISPR system identifies it and destroys it.
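
The store-then-recognise logic can be sketched as a small Python class. This is a minimal illustration under stated assumptions, not a model of any real CRISPR system: the repeat GTCGAC is a placeholder palindrome, the phage sequence is made up, and the class and method names are hypothetical.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the opposite strand of a DNA sequence, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))


class CrisprArray:
    """Toy CRISPR locus: stored foreign sequences separated by repeats."""

    def __init__(self, repeat: str) -> None:
        self.repeat = repeat
        self.spacers = []

    def acquire(self, foreign_dna: str, start: int, length: int) -> None:
        """Store a piece of foreign DNA as a new spacer."""
        self.spacers.append(foreign_dna[start:start + length])

    def locus(self) -> str:
        """Lay out the array as repeat-spacer-repeat-spacer-...-repeat."""
        return self.repeat + "".join(s + self.repeat for s in self.spacers)

    def recognises(self, invader: str) -> bool:
        """True if any stored spacer matches the invader on either strand."""
        return any(
            s in invader or reverse_complement(s) in invader
            for s in self.spacers
        )


array = CrisprArray(repeat="GTCGAC")  # placeholder palindromic repeat
phage = "ATGCCGTTAGGCTAAGCTTACG"      # hypothetical phage genome
array.acquire(phage, start=4, length=8)
print(array.locus())
print(array.recognises(phage), array.recognises("TTTTTTTTTTTT"))
```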

Transcription modulation

sRNAs are small RNAs that can regulate the amount of gene expression. They bind to mRNAs to prevent them from being translated. Some of them bind but change the secondary structure to activate translation and some sRNAs bind to proteins to modulate transcription.

MicroRNAs in eukaryotes can bind to mRNA to signal that it should be degraded. There are also enhancer RNAs, circular RNAs and lncRNAs, but their roles are not well known.

Antisense RNA binds to sense mRNA to prevent its translation, as was mentioned earlier. Antisense RNAs are produced by antisense genes. Neurospora is a fungus that forms hyphae during the day. Hyphae formation depends on the frq gene and its mRNA, so frq mRNA levels are high during the day. At night the level of antisense frq RNA is high, which stops hyphae formation. 20%-40% of all protein coding genes have corresponding antisense RNAs to modulate their expression. Cis antisense genes are found at or near the location of the main gene, whereas trans antisense genes are found far away.

Antisense RNA can be made through chemical oligo synthesis or by cloning genes in the opposite direction to get their corresponding antisense strand. Antisense genes can be cloned behind a promoter so that their transcription can be controlled through this promoter. Cloning antisense genes into bacteria is more efficient because the cell produces the RNA itself rather than needing to be constantly supplemented with RNA to knock out a certain gene. However, chemical synthesis is the more commonly used method, and the oligos are made as DNA instead of RNA because DNA is more stable in lab conditions.
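
As a small illustration of what "the corresponding antisense strand" means, the antisense sequence is the reverse complement of the sense sequence. The sketch below assumes simple Watson-Crick pairing only, and the example sequence and function names are hypothetical.

```python
# Watson-Crick base pairing for RNA.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(sense_rna: str) -> str:
    """Return the antisense strand (reverse complement), written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(sense_rna.upper()))


def is_antisense_of(candidate: str, sense_rna: str) -> bool:
    """True if `candidate` would pair along the full length of `sense_rna`."""
    return candidate.upper() == antisense_rna(sense_rna)


sense = "AUGGCCAUUG"  # hypothetical stretch of sense mRNA
print(antisense_rna(sense))
```

Cloning a gene in the opposite orientation behind a promoter gives the same result in the cell: the transcript is read from the other strand, so it comes out as this reverse complement.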

Antisense oligos do not cross into the cell with great ease. There is a mechanism for natural uptake but it is not known. It is an active process and depends on many factors. It is likely that endocytosis is used to take them up, but then the RNA is trapped in a vesicle. It is also possible that the RNAs enter through membrane bound receptors. Liposomes are one way to deliver RNAs into cells. Oligos can bind to liposomes if the liposomes are positively charged; the positive liposome is attracted to the negatively charged cell surface, and the liposome with the bound oligo is taken into the cell. The RNAs can also sit inside the liposome. Alternatively, basic peptides can be attached to the oligos, such as the Tat protein of HIV-1. These are able to enter the cell nucleus, and so the RNA is taken along with them.

Another way to get RNA into a cell is using streptolysin O (from the Streptococci bacteria) which generates membrane pores that the oligonucleotides can pass through to get to the nucleus directly. Alternatively microinjection with tiny needles can put oligonucleotides into cells. A mechanical method called scrape-loading disrupts the cell membrane allowing the oligos to pass through.

RNA interference

RNAi defends against viruses and transposon movement, and in some organisms affects their development. It occurs in two stages, the initiation and effector phases. Initiation occurs when external RNA from RNA viruses is found, when internal RNA is found that needs to be controlled, or when dsRNA from an engineered gene is detected. Any double stranded RNA that is detected is the trigger for RNAi. An endonuclease called Dicer digests the RNA into small oligonucleotides called small interfering RNAs (siRNAs). Dicer moves siRNAs to the RNA-induced silencing complex (RISC), which is activated by siRNAs. This is the effector phase. RISC uses RNA helicase to unwind the fragments, and the single stranded siRNA is used as a template to find complementary sequences. When RISC finds a complementary sequence, the Argonaute family of enzymes cleaves the mRNA, which is then degraded by exonucleases. This means that all mRNA complementary to the siRNAs is destroyed. 50 siRNAs are enough to destroy all target mRNA in a cell. This is possible because siRNAs are amplified using RNA-dependent RNA polymerase (RdRP), which creates more dsRNA, which Dicer recognises, and so the cycle begins again. RNAi can also repress gene expression permanently by converting target genes into heterochromatin. See figures 5.11, 5.12, 5.13 and 5.14.
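
The two phases can be sketched as two functions: one that dices a dsRNA trigger into siRNA guide strands, and one that checks whether any guide finds its complement in an mRNA. This is a rough illustration only. The siRNA length of 21 is an assumption (the notes do not give a length), amplification by RdRP is left out, and all sequences and function names are hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
SIRNA_LENGTH = 21  # assumption: a typical siRNA length, not stated in these notes


def reverse_complement(rna: str) -> str:
    """Return the strand that pairs with `rna`, written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def dicer(trigger_sense_strand: str, size: int = SIRNA_LENGTH) -> list:
    """Initiation phase: cut the trigger into pieces, return each guide strand.

    Only whole pieces of length `size` are kept; a shorter remainder is dropped.
    """
    last_start = len(trigger_sense_strand) - size
    return [
        reverse_complement(trigger_sense_strand[i:i + size])
        for i in range(0, last_start + 1, size)
    ]


def risc_cleaves(mrna: str, guides: list) -> bool:
    """Effector phase: the mRNA is cut if any guide is fully complementary to it."""
    return any(reverse_complement(guide) in mrna for guide in guides)


def surviving_mrnas(mrnas: dict, guides: list) -> dict:
    """Return only the mRNAs that no guide strand targets."""
    return {name: seq for name, seq in mrnas.items() if not risc_cleaves(seq, guides)}


viral_dsrna = "AUGC" * 11  # hypothetical 44-nt trigger
mrnas = {
    "viral": "GG" + viral_dsrna + "AA",
    "host": "GGGGCCCCAAAAUUUU" * 3,
}
guides = dicer(viral_dsrna)
print(sorted(surviving_mrnas(mrnas, guides)))
```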

Gene expression modulation

During development of C. elegans, small miRNAs are transcribed which regulate gene expression by blocking mRNA translation. These are present in plants and humans too. This is RNAi induced by genomic miRNAs. These miRNAs can bind to the 3’ UTR or elsewhere on the target mRNA. Drosophila uses pri-microRNAs, which are transcribed polycistronically and then cleaved into pre-miRNAs. These exit the nucleus and Dicer recognises them because they loop round to form dsRNA. The loop structure is cleaved by Dicer and RISC separates the two strands. There is a tolerance for a few mismatched pairs in some animals, whereas plants require perfect matches to cleave and degrade target mRNA. miRNAs do not fully deactivate mRNAs, and a class of RNAs called circular RNAs (circRNA) has binding sites for miRNAs to counteract their effect, binding to them to prevent them binding to mRNA.
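
The difference between perfect matching and tolerating a few mismatches can be made concrete with a sliding-window search. This is a simplified sketch: it counts plain Watson-Crick mismatches only and ignores G-U wobble pairs and any position-specific rules. The sequences and the function name find_target_sites are hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the strand that pairs with `rna`, written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_target_sites(mirna: str, mrna: str, max_mismatches: int) -> list:
    """Return start positions in `mrna` that pair with `mirna`.

    A window counts as a site if it differs from the perfectly
    complementary sequence at no more than `max_mismatches` positions.
    """
    perfect_site = reverse_complement(mirna)
    width = len(perfect_site)
    sites = []
    for start in range(len(mrna) - width + 1):
        window = mrna[start:start + width]
        mismatches = sum(a != b for a, b in zip(window, perfect_site))
        if mismatches <= max_mismatches:
            sites.append(start)
    return sites


mirna = "UUGACCAUGCAA"                 # hypothetical miRNA
imperfect_mrna = "GGGUUGCAUCGUCAACCC"  # one mismatch inside the site
print(find_target_sites(mirna, imperfect_mrna, max_mismatches=0))  # plant-like rule
print(find_target_sites(mirna, imperfect_mrna, max_mismatches=1))  # animal-like rule
```

With the strict setting the imperfect site is missed; allowing one mismatch recovers it, which is the behaviour the notes attribute to some animals.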

RNAi can be used to eliminate a protein from an organism without needing to knock out the gene with genetic engineering. C. elegans can take up dsRNA by eating E. coli which is expressing the dsRNA, by having its eggs injected with dsRNA, or simply by being placed in a solution of the dsRNA. The outcome is that the dsRNA enters the worm and Dicer cleaves it into chunks that then activate RISC to block target mRNAs.

Delivering RNAi to Drosophila is done by microinjecting it directly into a developing egg. The dsRNA enters the embryo while it develops and knocks out proteins of interest during the whole development.

Adding dsRNA to mammalian cell lines generally causes an antiviral response which produces interferon and kills the cells. Instead of using long dsRNA as in C. elegans and Drosophila, smaller dsRNAs of 30 or so bases long are used to activate the mammalian versions of Dicer and RISC. This mutes the expression of the target mRNA and prevents the protein in question from being synthesised. siRNAs made by the Dicer enzyme are short dsRNAs with 3’ overhangs 2 bases long (made of uracil). siRNAs like these can be made synthetically by taking an in vitro transcript of dsRNA and adding purified Dicer to it. The transcript is made by PCR amplifying the gene to be suppressed, including its promoter sequences, and then creating in vitro RNA transcripts. mRNAs can form secondary structures, which makes it difficult to predict which siRNAs will interfere, and so many siRNAs must be tested. siRNAs can be delivered to mammalian cells by microinjection, transfection, or liposomes.

One can also use short hairpin RNAs (shRNA) to induce RNAi in mammals. These mimic the structure of microRNA, and are created by transcribing an RNA strand with one end complementary to the other and an interrupting sequence in the middle, which allows the RNA to loop round and form hydrogen bonds (figure 5.18). In vivo these RNAs can be expressed by adding the promoter for RNA polymerase III. If 4-5 thymines are read by RNA polymerase III then it will stop, and it will not activate the enzymes that add the poly(A) tail and modified guanine cap to the RNA, therefore creating an shRNA capable of RNAi.
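
The layout of an shRNA insert (one arm, an interrupting loop, the complementary arm, then a run of thymines to stop RNA polymerase III) can be sketched as string assembly. This is an illustrative sketch, not a validated design tool: the loop sequence and the target are assumptions, and the function name shrna_template is hypothetical.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

POL3_TERMINATOR = "TTTTT"  # run of thymines that stops RNA polymerase III
LOOP = "TTCAAGAGA"         # assumption: an illustrative loop sequence


def reverse_complement(dna: str) -> str:
    """Return the opposite strand of a DNA sequence, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))


def shrna_template(target: str, loop: str = LOOP) -> str:
    """Build a DNA insert: sense arm + loop + antisense arm + terminator.

    Raises ValueError if the stem-loop itself contains four thymines in
    a row, because the polymerase would stop before the hairpin is complete.
    """
    target = target.upper()
    stem_loop = target + loop + reverse_complement(target)
    if "TTTT" in stem_loop:
        raise ValueError("stem-loop contains a premature polymerase III stop signal")
    return stem_loop + POL3_TERMINATOR


target = "GCTAGCATCGGATCCGATA"  # hypothetical 19-nt target sequence
print(shrna_template(target))
```

The check on the stem-loop reflects the termination rule in the paragraph above: the only thymine run should be the one placed deliberately at the end.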

Putting shRNAs into vectors has the effect of sustaining the gene suppression, whereas simply adding the RNA to a mammalian cell culture gives only a temporary gene knockout.

RNAi libraries

An entire genome can be screened by expressing each gene sequence as dsRNA. Every clone in the library suppresses a gene using RNAi, so a library like this can be used to screen every protein in an organism for its function. A C. elegans RNAi library covering its entire genome has been created in E. coli. These E. coli are fed to the worm, which triggers the RNAi response and removes one of the proteins from the entire worm. This has allowed the identification of genes whose suppression kills the embryo or causes developmental defects, where the function of these genes was not known before.

RNAi libraries are constructed by isolating an organism’s genes in cDNA form and amplifying them using PCR. These are then cloned into vectors. PCR is done with two polymerases and two promoters so that when the vector is transcribed, both the sense and antisense transcripts are made, which then anneal and activate RNAi.

Mammalian RNAi libraries are constructed with siRNAs and shRNAs instead of dsRNA for the reasons described before. These can be screened using live cell microarrays and analysed for symptoms.

RNA processing using RNA

RNase P (ribonuclease P) trims the ends of pre-tRNA. The catalytic activity of this enzyme comes from the RNA, not from the protein section, so it is a ribozyme.

Pre-mRNA is converted to mRNA in the nucleus of eukaryotes, with the introns removed using a non-coding RNA called snRNA (small nuclear RNA). The snRNA is bound to protein to form a complex called a spliceosome, which is able to find the boundary between an intron and an exon and remove the intron. The 5’ end of the intron is identified by the U1 snRNA snRNP ("snurp") and the 3’ end by the U2AF protein, while U2 snRNA finds the intron branch site. The U1 and U2 RNAs bring the ends of two exons close together, cleave the RNA in between, then ligate the exons into one piece, and the intron is released.

Alternative splicing adds combinatorial variety to the number of gene products that can be expressed, and ncRNAs are involved in this process.

rRNAs are transcribed in the nucleolus by RNA polymerase I, and the rRNA ribonucleotides are modified using small nucleolar RNAs (snoRNAs). There are 400 snoRNAs, which are mostly encoded within introns and transcribed at the same time as the gene they are in. After the intron is spliced out, another splicing event happens in which the snoRNA is itself spliced out of the intron and then processed by an exonuclease. snoRNAs guide proteins to rRNAs and snRNAs so that they can be modified.


Riboswitches are sections of RNA where the regulatory RNA forms part of the RNA to be regulated. They are found close to the 5’ end of an mRNA molecule, and the riboswitch sequence or domain changes its secondary structure, which affects the expression of the mRNA. Riboswitches bind to effector molecules to trigger the secondary structure change, which prematurely terminates transcription or blocks translation. They are most commonly found in bacteria, on the mRNAs for biosynthetic enzymes. Thiamine pyrophosphate binds to the THI box, which is a riboswitch, to prevent the production of more thiamine. When thiamine levels are low the riboswitch changes conformation and more thiamine can be made.

They work by causing conformational changes in the stem and loop structure of mRNA: Attenuation riboswitches on mRNAs bind to an effector molecule which causes a terminator loop to form, stopping transcription in its tracks and the unfinished mRNA is then degraded. This is transcriptional inhibition.

Translational inhibition is possible by changing the structure in which the Shine-Dalgarno sequence is found - if a loop is created so that the SD is on the loop then it will not be accessible to the ribosome and protein will not be synthesised. Again the formation of this secondary structure loop is modulated by effector molecules (5.22).
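
A rough way to picture this is to write the mRNA structure in dot-bracket notation (a dot for an unpaired base, brackets for paired bases) and ask whether the Shine-Dalgarno bases are free. The sketch below assumes the consensus SD sequence AGGAGG; the mRNA and both structures are made up for illustration, and sd_accessible is a hypothetical helper, not a folding program.

```python
SHINE_DALGARNO = "AGGAGG"  # assumption: consensus SD sequence used for the search


def sd_accessible(mrna: str, structure: str) -> bool:
    """True if every base of the SD sequence is unpaired in `structure`.

    `structure` is dot-bracket notation of the same length as `mrna`:
    '.' marks an unpaired base, '(' and ')' mark paired bases.
    """
    if len(mrna) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    start = mrna.find(SHINE_DALGARNO)
    if start == -1:
        return False  # no SD sequence, so nothing for the ribosome to bind
    site = structure[start:start + len(SHINE_DALGARNO)]
    return all(symbol == "." for symbol in site)


mrna = "CCUCCUAAAAAGGAGGAUAUG"            # hypothetical 5' region of an mRNA
with_effector = "((((((....))))))....."   # SD paired into a stem
without_effector = "." * len(mrna)        # SD free

print(sd_accessible(mrna, with_effector))
print(sd_accessible(mrna, without_effector))
```

Which of the two structures forms with or without the effector depends on the particular riboswitch; the labels here are just one possible arrangement.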

Bacillus subtilis has a gene called glmS which codes for glutamine fructose 6-phosphate amidotransferase, an enzyme involved in cell wall synthesis. It has a riboswitch which acts in a different way. This riboswitch instead changes the secondary structure of the mRNA so that it cleaves itself. When the products of the enzyme are abundant, they bind to the glmS mRNA and alter its secondary structure so that it cuts itself and no more translation can occur (5.23).

Only one riboswitch has been found in eukaryotes so far and it is the thiamine riboswitch, which has a different mechanism to the bacterial version.

Ribozymes for catalysis

Ribozymes are RNA molecules that can catalyse reactions like enzymes. Some are only RNA and some have proteins with them, but the defining characteristic of a ribozyme is that the RNA does the catalytic work. There are large and small ribozymes, with large ribozymes being a few hundred to 3000 nucleotides long. An example of a large ribozyme can be found in the introns of Tetrahymena, which when transcribed into precursor RNA are able to self-splice, with no snRNPs needed. This type of intron (the group I intron) is found in fungal and plant mitochondria, in chloroplasts, in viruses and in nuclear rRNA.

The linear structure of RNA is able to fold into secondary structures such as stem loops which are themselves able to fold differently leading to many 3D structures, like proteins. Group II introns can also self splice and are found in fungus and plant mitochondria and in some chloroplasts. These ribozymes change their surrounding microenvironment to be able to fold into conformations which enable them to self-splice. They bring the ends of two exons together, splice themselves out and ligate the exons together. These have similar structures to snRNAs which implies that group II introns came before snRNAs in the spliceosome.

Another ribozyme discussed earlier is RNase P which cleaves the 5’ ends of tRNA molecules.

Small ribozymes may only be 30 to 80 bases long. Examples are found in viroids, which are self-replicating plant pathogens made of pure RNA with no associated protein. A viroid is a single stranded RNA genome resistant to ribonucleases. The hammerhead ribozyme is a small RNA which self-cleaves during the replication of viroids. The hairpin ribozyme comes from another plant pathogen, a satellite virus; it can cleave concatemers of ssRNA genomes and then ligate them into circular genomes, making multiple copies for infection.

Engineering ribozymes

A ribozyme catalytic core can be linked to a sequence complementary to target mRNA. This chimeric ribozyme can then seek out target mRNA, bind to them and cleave them (5.29). They are engineered so that the 3’ and 5’ end of the sequence to be split flank the catalytic core so that the cut is in between these two ends. This is better than using antisense oligos to recruit RNase H to cleave target mRNA because the antisense oligos themselves get cleaved meaning they need to be continuously supplied. Chimeric ribozymes however are reusable.
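
The arrangement of the two binding arms around the catalytic core can be sketched as follows. This is a schematic illustration, not a design method from the textbook: the core is written as a run of N (unspecified nucleotides) because no core sequence is given in these notes, and the target sequence and function name are hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
PLACEHOLDER_CORE = "NNNNNNNN"  # assumption: stands in for a real catalytic core


def reverse_complement(rna: str) -> str:
    """Return the strand that pairs with `rna`, written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def design_chimeric_ribozyme(target: str, cut_index: int, arm_length: int,
                             core: str = PLACEHOLDER_CORE) -> str:
    """Flank `core` with arms complementary to both sides of the cut site.

    The cut falls between target[cut_index - 1] and target[cut_index].
    Because paired strands are antiparallel, the ribozyme's 5' arm pairs
    with the target sequence downstream of the cut, and its 3' arm pairs
    with the sequence upstream of the cut.
    """
    if cut_index - arm_length < 0 or cut_index + arm_length > len(target):
        raise ValueError("arms would run off the end of the target")
    upstream = target[cut_index - arm_length:cut_index]
    downstream = target[cut_index:cut_index + arm_length]
    return reverse_complement(downstream) + core + reverse_complement(upstream)


target_mrna = "ACGUACGGAUCCUUAG"  # hypothetical target
print(design_chimeric_ribozyme(target_mrna, cut_index=8, arm_length=4))
```

Longer arms would make binding more specific in this picture, but the notes do not say how long they should be, so arm_length is left as a free parameter.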

RNA SELEX: Systematic Evolution of Ligands by EXponential enrichment is a way of finding which substrates natural ribozymes act on. A random mixture of DNA oligos is synthesised and these are converted into dsDNA using Klenow polymerase and a 5’ primer which contains the promoter for T7 RNA polymerase, so that single stranded RNA copies of each oligo can be made. The ribozyme is mixed with the ssRNAs and any ribozymes that bind specifically to an RNA are isolated. These ribozymes are then attached to beads for purification and the RNA on them is reverse transcribed to cDNA. The cDNAs are then amplified and sequenced to find out what else the ribozymes can catalyse.

In vitro evolution and selection of ribozymes:

Given a random pool of RNA sequences one can create new ribozymes with new enzyme functions. Unlike in SELEX, the pool of random RNAs is a pool of potential ribozymes instead of a pool of potential substrates. The 5’ end is a substrate sequence for the wanted ligation reaction, with a terminal triphosphate to make ligation more favourable. The 3’ end has a sequence that binds to some effector molecule, which allows control of the ligation reaction (5.32). A second ligation substrate is added and incubated. If any of this substrate is ligated to a specific random RNA then that molecule becomes larger and can be separated by gel electrophoresis and sequenced after reverse transcription and PCR.

This can be enhanced by adding a mutagenesis step using error prone PCR. Random oligonucleotides are taken and then mutagenised with error prone PCR. This makes a larger pool with more diverse sequences, which can be screened for the reaction of interest. Once a ribozyme is found that can carry out the reaction, the error prone PCR can be repeated to keep screening for improved ribozymes. Examples of artificial ribozymes that have been found are ones that can carry out nucleophilic attack and isomerise ring structures. The selection steps must be stringent enough to wipe out any non-functional RNA, but not so stringent as to wipe out weakly active potential ribozymes that could be improved over time.
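
The mutate-then-select cycle can be simulated as a toy loop. Everything here is an assumption made for illustration: the activity function simply counts matches to a made-up "ideal" sequence and stands in for the real laboratory assay, and the pool size, mutation rate and fraction kept are arbitrary. Survivors are carried over unchanged alongside their mutated copies, so the best activity never drops between rounds.

```python
import random

BASES = "ACGU"
IDEAL = "GGCAUCGAUGCC"  # hypothetical sequence; stands in for the lab assay


def activity(sequence: str) -> int:
    """Toy score: number of positions that match the made-up ideal sequence."""
    return sum(a == b for a, b in zip(sequence, IDEAL))


def error_prone_copy(sequence: str, rate: float, rng: random.Random) -> str:
    """Copy a sequence, redrawing each base at random with probability `rate`."""
    return "".join(
        rng.choice(BASES) if rng.random() < rate else base
        for base in sequence
    )


def evolve(pool_size=200, rounds=10, keep_fraction=0.1, rate=0.05, seed=0):
    """Run rounds of selection followed by error prone amplification."""
    rng = random.Random(seed)
    pool = ["".join(rng.choice(BASES) for _ in IDEAL) for _ in range(pool_size)]
    best_per_round = []
    for _ in range(rounds):
        # Selection: keep only the most active fraction of the pool.
        pool.sort(key=activity, reverse=True)
        survivors = pool[:max(1, int(pool_size * keep_fraction))]
        best_per_round.append(activity(survivors[0]))
        # Amplification: refill the pool with mutated copies of survivors.
        copies = [
            error_prone_copy(rng.choice(survivors), rate, rng)
            for _ in range(pool_size - len(survivors))
        ]
        pool = survivors + copies
    return pool, best_per_round


final_pool, history = evolve()
print(history)
```

Raising keep_fraction makes the selection less stringent and keeps more weakly active sequences in play, which is the trade-off described in the last sentence of the paragraph above.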

Deoxyribozymes are the DNA analogue of RNA ribozymes. They are sometimes called DNAzymes and can catalyse certain reactions. Currently discovered DNAzymes catalyse RNA and DNA based reactions because these are what is selected for in SELEX experiments. DNAzymes can carry out RNA cleavage, DNA depurination, RNA ligation, DNA phosphorylation and thymine dimer cleavage. One DNAzyme can repair thymine dimers, which are caused by UV radiation. Some organisms repair these errors with excision and DNA replacement, or through photolyase enzymes which specifically repair thymine dimers and are activated by blue light. Scientists have made a DNA sequence that performs a photolyase reaction. Random sequences of DNA were created in a pool and joined by a thymine dimer. Any DNA oligo that split the dimer under blue light would become smaller, and so could be detected by gel electrophoresis. A DNAzyme called UV1C was found.

Allosteric riboswitches and ribozymes

It would be nice to be able to control the enzyme activity of a ribozyme or riboswitch using allosteric control. If a ribozyme and riboswitch are connected then the riboswitch can be triggered by a small molecule which would either enable or disable the ribozyme. Scientists use modular design with in vitro selection to take different domains from ribozymes and merge them together to create new molecules. The catalytic core of one ribozyme can be linked to the binding domain of another to create a new ribozyme which can be affected by one substrate and catalyse reactions of another (figure 5.35). The selection is done by taking a known catalytic core and combining it with random sequences; some of the random sequences will be able to bind to an effector molecule and some will self-cleave. Products that self-cleave move faster during gel electrophoresis, and the slow RNAs are isolated because they did not cleave themselves. Then a second round of selection is done where these uncleaved ribozymes are added to a solution of the effector molecule and incubated in conditions that promote cleavage. The products are again separated by gel electrophoresis, and this time the ones that move faster are selected because they cleaved in the presence of the effector molecule but not in its absence. Amplifying and sequencing this selection allows one to determine the riboswitch domain that became joined to the known catalytic core. Allosteric ribozymes that have been created can respond to cAMP, oligos, proteins and metal ions.
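
The logic of the two selection rounds is a pair of filters, which can be written out explicitly. This is only a bookkeeping sketch: the candidates and their cleavage behaviour are invented, and in the lab each boolean would be read off a gel rather than known in advance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A core-plus-random-sequence RNA with hypothetical assay outcomes."""
    name: str
    cleaves_without_effector: bool
    cleaves_with_effector: bool


def negative_round(pool):
    """Round 1, no effector: keep the slow band (RNAs that did not self-cleave)."""
    return [c for c in pool if not c.cleaves_without_effector]


def positive_round(pool):
    """Round 2, effector added: keep the fast band (RNAs that did self-cleave)."""
    return [c for c in pool if c.cleaves_with_effector]


def select_allosteric(pool):
    """Keep RNAs that cleave only when the effector is present."""
    return positive_round(negative_round(pool))


pool = [
    Candidate("always_on", True, True),
    Candidate("dead", False, False),
    Candidate("switch", False, True),
    Candidate("inverse", True, False),
]
print([c.name for c in select_allosteric(pool)])
```

Swapping which band is kept in each round would instead select ribozymes that the effector switches off.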

References: my notes are made from, and follow the structure of, my course textbook, which is Biotechnology, 2nd edition, by David P. Clark.