Modification of Genes and Proteins
Brian, Sam, and George

Transcription & Transcript Processing

A gene is a segment of DNA that contains the instructions for the production of a protein. It is composed of a series of nucleotides that form exons and introns. Exons make up the final RNA molecule once the introns are spliced out of the copy. The process of transcription involves, in essence, a strand of DNA and the enzyme RNA polymerase. It starts with RNA polymerase attaching itself (with the help of proteins called transcription factors) to a section of DNA at the beginning of the gene called the promoter region. In eukaryotes, this region commonly contains a sequence known as the TATA box, which is a short stretch rich in Thymine and Adenine nucleotides. This step is known as initiation.


Next is a step called elongation. RNA polymerase unwinds the double helix of DNA. The two uncoiled strands of DNA are labeled sense and anti-sense (otherwise known as the non-template and the template strand). The template strand is used as the model for building a complementary copy in RNA. The non-template strand can be used as a reference for what the RNA will look like (with Thymine in DNA replaced by Uracil in RNA). RNA polymerase moves down the template strand, simultaneously creating a strand of nucleotides that are complementary to the ones the enzyme reads (i.e. Adenine in the DNA is paired with Uracil in the RNA, Thymine with Adenine, and Cytosine with Guanine). As the RNA polymerase continues, the two strands of unzipped DNA attach back together behind it, recreating the double helix. The forming RNA molecule peels away from the RNA polymerase as it is created. A gene can be transcribed by multiple RNA polymerases at once, creating many RNA molecules (thus speeding up the production of proteins). This process continues until the RNA polymerase transcribes a terminator sequence in the DNA.
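The pairing rules used during elongation can be sketched in a few lines of Python. This is only an illustration: the function names and the nine-base example sequence are made up for this sketch, and the strands are treated as simple strings.

```python
# Pairing rules for building RNA along a DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_strand):
    """Build the RNA that is complementary to the DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template_strand.upper())


def rna_from_non_template(non_template_strand):
    """The RNA reads like the non-template strand, with T replaced by U."""
    return non_template_strand.upper().replace("T", "U")


# Hypothetical example: two complementary DNA strands, aligned base by base.
template = "TACGGCATT"
non_template = "ATGCCGTAA"

rna = transcribe(template)
print(rna)  # AUGCCGUAA
print(rna == rna_from_non_template(non_template))  # True
```

Both routes give the same RNA, which is why the non-template strand works as a reference for what the transcript will look like.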

In prokaryotic cells, transcription usually stops at the end of the termination signal, and when the polymerase reaches that point, it releases both the RNA and the DNA. In eukaryotic cells, the polymerase continues for several more nucleotides after which the manufactured pre-mRNA is cut loose from the enzyme. (4)

Transcript Processing (4)

After the pre-mRNA is created, it is processed by enzymes in the nucleus. During this processing, the ends of the pre-mRNA are modified and interior portions that will not be used for protein synthesis are spliced out. The 5’ end of the pre-mRNA is fitted with a Guanine cap, which has two functions - protection from degradation, and serving as the attachment point for the ribosome at the start of translation (discussed later). The 3’ end of the pre-mRNA is also changed. It gets a poly-A tail, consisting of up to about 250 adenine nucleotides. This facilitates the export of the RNA from the nucleus into the cytoplasm. The average length of an RNA transcript is about 8000 nucleotides, but an average-sized protein of 400 amino acids requires only about 1200 nucleotides to code for it (three per amino acid). This means that the majority of the transcript is not translated into protein. The segments that do not end up coding for a protein (introns) need to be removed from the molecule, and the remaining parts (exons) need to be attached together. Once this is done (by spliceosomes), the pre-mRNA molecule is simply called mRNA (messenger RNA).
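The arithmetic and the splicing step above can be sketched in Python. The sequence, the intron position, and the function name are all hypothetical, and real splicing is carried out by spliceosomes recognizing sequence signals rather than by index positions; the lowercase letters simply mark the intron so it is easy to see.

```python
# Arithmetic from the paragraph above: three nucleotides per amino acid.
amino_acids = 400
nucleotides_needed = amino_acids * 3          # 1200
transcript_length = 8000
coding_fraction = nucleotides_needed / transcript_length
print(nucleotides_needed, coding_fraction)    # 1200 0.15


def splice(pre_mrna, introns):
    """Remove introns, given as (start, end) index pairs with end excluded,
    and join the remaining exons in order."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


# Hypothetical pre-mRNA: exon, intron (lowercase), exon.
pre_mrna = "AUGGCC" + "guaaguag" + "UUUUAA"
mrna = splice(pre_mrna, [(6, 14)])
print(mrna)  # AUGGCCUUUUAA
```

Only about 15% of an average transcript would be needed to code for an average protein, which is why so much of the pre-mRNA is removed.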


After the mRNA exits the nucleus into the cytoplasm, translation, the synthesis of a protein, is ready to be initiated. The mRNA molecule carries an instruction sheet in the form of a series of codons - groups of three nucleotides that each specify an amino acid (like triplets on DNA). Transfer RNA (tRNA) molecules carry amino acids from the cytoplasm to the production zone (the ribosome). The ribosome adds each amino acid brought to it by tRNA to the growing end of a polypeptide chain. As each tRNA arrives at the ribosome, it holds a specific amino acid at one end. The other end has a nucleotide triplet called an anticodon, which pairs with a complementary codon on the mRNA. (10) The amino acids are bonded together by dehydration synthesis, forming peptide bonds. As the mRNA molecule passes through the ribosome, each new amino acid is attached to the one before it. The process is broken down into three steps, similar to those found in transcription: Initiation, Elongation, and Termination.
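Codon and anticodon pairing follows the same complementary rules as the rest of RNA. The short Python sketch below is a simplification: the function name is made up, and it ignores the relaxed "wobble" pairing that real tRNAs can use at the third position.

```python
# RNA-to-RNA base pairing between an mRNA codon and a tRNA anticodon.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the anticodon bases that line up, one for one, with the codon.
    (The anticodon strand runs in the opposite direction to the codon.)"""
    return "".join(RNA_PAIR[base] for base in codon.upper())


print(anticodon_for("AUG"))  # UAC
print(anticodon_for("GCC"))  # CGG
```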


Initiation: The small ribosomal subunit of a ribosome binds to both mRNA and a special initiator tRNA. The subunit attaches to the segment at the 5’ end of the mRNA. In bacteria, rRNA pairs its bases with a specific sequence of nucleotides within the mRNA leader, while in eukaryotes, the 5’ cap first tells the small subunit to attach to the 5’ end of the mRNA. After the small subunit gets the mRNA and tRNA, the large ribosomal subunit is attached, forming a full ribosome. Proteins called initiation factors are used to unite all these components. (10)

Elongation: In this stage, amino acids are added one at a time to the preceding amino acid. Each addition requires proteins called elongation factors and occurs in a cycle of three steps - codon recognition, peptide bond formation, and translocation. In codon recognition, an incoming tRNA pairs its anticodon with the mRNA codon in the A site of the ribosome. In peptide bond formation, the growing polypeptide is transferred from the tRNA in the P site onto the amino acid carried by the tRNA in the A site. In translocation, the ribosome shifts the tRNA in the A site, now carrying the polypeptide, over to the P site. As the tRNA moves from site A to P, its anticodon remains hydrogen-bonded to the mRNA codon, so the mRNA moves along with it and brings the next codon into the A site. Meanwhile, the tRNA that previously occupied the P site leaves the ribosome and returns to the cytoplasm (where it can pick up another amino acid). The cycle of polypeptide elongation takes less than a tenth of a second and is repeated as each amino acid is added to the chain until the polypeptide is completed. Since many mRNA molecules are created from each gene, proteins are synthesized relatively quickly, and in groups.

Termination: This is the final stage of translation. Once a stop codon on the mRNA reaches the A site of the ribosome, it is read as a stop signal. A protein called a release factor then binds to the stop codon in the A site and causes the completed polypeptide chain to be hydrolyzed from the tRNA holding it, freeing it from the ribosome. The ribosomal subunits and the other components assembled for translation then come apart. After the polypeptide is made, it may be modified further, for example by being folded or coiled into its final structure. This gives the protein its particular function in the cell. A protein may also bind to other proteins, forming a protein complex (such as those in the electron transport chain). (4)
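The three stages of translation can be mimicked with a short Python sketch. It is a toy model under stated assumptions: the codon table below holds only a handful of entries from the standard genetic code, the mRNA is invented, and the function name is hypothetical. Initiation is modeled as finding the first AUG, elongation as reading one codon at a time, and termination as stopping at a stop codon.

```python
# A small part of the standard genetic code (not the full 64-codon table).
CODON_TABLE = {"AUG": "Met", "GCC": "Ala", "UUU": "Phe", "AAA": "Lys", "GGA": "Gly"}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate(mrna):
    """Return the list of amino acids coded for by the mRNA."""
    start = mrna.find("AUG")                     # initiation at the start codon
    if start == -1:
        return []
    polypeptide = []
    for i in range(start, len(mrna) - 2, 3):     # elongation, one codon per step
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:                 # termination
            break
        polypeptide.append(CODON_TABLE[codon])   # fails on codons not in this table
    return polypeptide


print(translate("GGAUGGCCUUUAAAUAAGG"))  # ['Met', 'Ala', 'Phe', 'Lys']
```

The bases before the AUG and after the stop codon are not translated, matching the idea that only part of the mRNA codes for the protein.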


Protein Folding
Translation is defined as the synthesis of an amino acid chain from RNA. This said, translation cannot be considered the end of the protein-creation process because a chain of amino acids is not yet a functional protein. In order to carry out their intended tasks, proteins must be folded into the correct shape and conformation.

Protein folding is considered a co-translational process because it begins while the chain is still being assembled. Folding is spontaneous: the amino acid chain naturally folds into its specified shape, driven by hydrophobic interactions (non-polar side chains cluster away from the watery surroundings while polar side chains are attracted to them), intramolecular hydrogen bonds, and van der Waals forces.
When this amino acid chain is folded, hydrophobic (black) residues are pushed to the middle of the protein while hydrophilic (white) residues are pulled to the outside.

This said, protein folding is too error-prone to be left entirely to itself and therefore must be regulated by the cell. Many proteins are made on ribosomes attached to the Endoplasmic Reticulum (ER), so the ER is equipped to regulate protein folding. Factors such as the solvent (whether water, cytoplasm, or lipid), salt concentration, and temperature all affect how a protein will fold, so specialized proteins called chaperones are used to prevent misfolding. Chaperones are also necessary to protect against aggregation, the unintended sticking together of unfolded or folding proteins with other proteins in the crowded cytoplasm. A final function of chaperones is to protect against denaturing due to heat. Many chaperones are therefore classified as heat shock proteins: they help a protein keep or regain its folded shape at temperatures that would normally disrupt the conformation.

Many external factors can have a (usually negative) effect on the folding of synthesized proteins. Besides the obvious factors such as temperature and salt concentration, electric and magnetic fields and the amount of space available in the cytoplasm also have an effect. Because of this, cells must have backup mechanisms for removing denatured proteins from the cytoplasm (denatured proteins often cannot be fixed). High levels of misfolded and unfolded proteins in the ER result in what is called ER stress. ER stress triggers the unfolded protein response, which slows down the production of proteins, increases the cell's capacity for protein folding, and degrades misfolded amino acid chains.

Protein folding is extremely complicated. An unfolded amino acid chain can have an estimated 10,000,000,000,000,000,000,000,000,000,000,000,000,000,000 possible conformations, meaning that it would be impossible for a cell to fold a protein by trying all combinations (until the 1960s, it was thought that proteins were folded by sampling all possible conformations of the peptide chain). Instead, proteins must fold through intermediate states or particular partial conformations, the way a person would fold a paper airplane (it would be a waste of time to make a paper airplane by folding a piece of paper in every possible way until it was folded correctly). As a result, although it takes some proteins only a millisecond or two to fold, others take hours and go through many intermediate states.
Four different intermediate steps are required to reach conformation d.
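A rough calculation shows why trying every possibility cannot work. The Python sketch below takes the figure quoted above as its starting point; the sampling rate of ten trillion attempts per second is purely an assumption chosen to be generous, and the age of the universe is rounded to about 13.8 billion years.

```python
# Figure quoted in the text above, with the commas stripped out.
possibilities = int(
    "10,000,000,000,000,000,000,000,000,000,000,000,000,000,000".replace(",", "")
)

SAMPLES_PER_SECOND = 1e13             # assumed rate, not a measured value
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10       # approximate

years_needed = possibilities / SAMPLES_PER_SECOND / SECONDS_PER_YEAR
print(f"{years_needed:.1e} years to try everything")
print(years_needed > AGE_OF_UNIVERSE_YEARS)  # True
```

Even at this assumed speed, an exhaustive search would take far longer than the universe has existed, while real proteins fold in milliseconds to hours.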

Gene Repair

Genes are extremely important to cell and organism function, and must be kept intact to work properly. This can be a problem, as radiation, UV light, and even normal metabolic processes can leave molecular lesions in a cell’s DNA. A molecular lesion can affect a cell’s ability to transcribe an important gene, or can cause problems in its daughter cells after mitosis. To combat this, cells have evolved DNA repair processes. Because there are three broad types of damage DNA can sustain, three groups of repair processes exist.

Direct reversal:
A small amount of the damage done to DNA is caused by either a UV-induced reaction or methylation of a nucleotide. Because a single chemical reaction created the damage, all the cell has to do to repair it is carry out the reverse reaction on the damaged nucleotides. An example is the enzyme photolyase, which carries out the photoreactivation reaction on DNA damaged by UV radiation. Besides UV damage, methylation of guanine and methylation of adenine and cytosine can be undone in the same manner (adenine and cytosine require a different enzyme than guanine). Direct reversal is the most energy-efficient way to repair damage to a gene, but it can rarely be used: these reversible reactions account for a relatively small amount of the damage DNA sustains.

A direct reversal reaction is used to eliminate the dimer created by UV radiation.

Single-strand repair:
Because of the complementary nature of DNA (guanine always pairs with cytosine and adenine with thymine), single-strand damage is relatively straightforward to repair. The excision repair mechanisms that replace the damaged nucleotide can read the undamaged strand to determine exactly which base to insert. There are three basic types of single-strand repair: Base excision repair (BER), Nucleotide excision repair (NER) and Mismatch repair (MMR).

BER deals mostly with chemical alterations to a single base. Any damage caused by oxidation, alkylation, deamination, or hydrolysis is treated with BER. BER removes the damaged base, cuts the sugar-phosphate backbone at that position, and fills the gap with a new nucleotide. NER is tuned to notice the “helix-distorting lesions” that occur when two bases no longer fit together because one is damaged and misshapen. When such a lesion blocks transcription, specialized enzymes are sent to carry out transcription-coupled repair so that faulty RNA is not sent out of the nucleus to direct the synthesis of incorrect proteins. MMR does not deal with damaged nucleotides. Instead, this process is carried out when two undamaged bases are mispaired. MMR is usually only used after DNA replication and recombination are carried out.

The damaged nucleotide (top red) is replaced using the adjacent, matching nucleotide (bottom red left)
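The reason single-strand damage is straightforward to repair can be shown with a small Python sketch. Everything here is a simplification for illustration: the two strands are written so that paired bases share the same position, a damaged base is marked with a made-up placeholder "X", and the function names are hypothetical.

```python
# Complementary base pairing in DNA.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def repair_strand(damaged_strand, intact_strand):
    """Fill each damaged position ('X') with the base that pairs with the
    base at the same position on the intact strand."""
    repaired = []
    for base, partner in zip(damaged_strand, intact_strand):
        repaired.append(DNA_PAIR[partner] if base == "X" else base)
    return "".join(repaired)


def find_mismatches(strand, partner_strand):
    """List the positions where two undamaged bases do not pair correctly."""
    return [i for i, (a, b) in enumerate(zip(strand, partner_strand))
            if DNA_PAIR[a] != b]


print(repair_strand("ATGXCGT", "TACGGCA"))    # ATGCCGT
print(find_mismatches("ATGACGT", "TACGGCA"))  # [3]
```

The first function reflects what excision repair relies on (the intact strand says which base belongs in the gap), and the second reflects the kind of error mismatch repair looks for.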

Double-strand break repair:
A double-strand break (a break where both strands of the double helix are severed) is the most dangerous form of damage DNA can commonly receive. There is no easy way to repair this type of damage, and if it cannot be fixed the cell will either die or pass on a mutation at the next division. As with single-strand damage, there are three different processes for dealing with different kinds of double-strand break: non-homologous end joining (NHEJ), microhomology-mediated end joining (MMEJ), and homologous recombination.

The first method of repair, NHEJ, relies on the DNA having been cut at different places on each strand so that there are overhanging tails on each end. The overhanging tails are matched up and an enzyme called DNA Ligase IV joins the ends back together. If the tails match, repair is usually accurate and successful, but if not, mutations can occur. If nucleotides “fall off” the tails, deletions occur, and if ends that do not belong together are joined, a translocation is formed. NHEJ is important before DNA replication has taken place, because there is not yet a replicated copy to use as a template for the broken strand.

MMEJ is similar to NHEJ, but uses microhomologous sequences to match the tails of two broken strands back together. This means that MMEJ enzymes align a group of 5-25 matching base pairs on the overhanging tails and then delete and add new base pairs along the rest of the tails to make a complete DNA strand. MMEJ works well on less complete double-strand breaks, but requires much more energy than NHEJ.

The final repair method, homologous recombination, requires the presence of the damaged DNA’s sister chromatid. Because the two chromatids carry the same DNA sequence, the enzymatic machinery carrying out the repair has a template from which to reconstruct the broken DNA. Homologous recombination is usually used to repair double-strand breaks that arise when replication machinery tries to synthesize across a single-strand break (there are no tails to compare, so a separate template is needed).
Homologous recombination is being used to repair DNA with a double-strand break. The DNA’s sister chromatid is red and the broken DNA is blue.

Aside from these gene repair methods, many cells are capable of translesion synthesis. This is basically the ability of DNA replication machinery to synthesize right past a small DNA lesion. This process requires the temporary replacement of the DNA polymerase enzyme with a specialized Translesion polymerase.

There are three basic cell responses when all gene repair mechanisms fail and the damaged DNA starts directing the synthesis of incorrect proteins. Firstly, and most desirably, the cell will simply commit suicide, a process called apoptosis, and will be replaced by a new, undamaged cell. Secondly, the cell can enter a permanently non-dividing state called senescence. Although not harmful to the organism, senescence is irreversible and a waste of resources. Finally, and most dangerously, it is possible that the damaged cell avoids all programmed shutdowns and continues functioning, only with a different, altered genome. When this happens, unregulated cell division is common. This unregulated division is called cancer and is extremely dangerous to the organism.

RNA Interference
RNA interference is a way for a cell to protect itself from parasitic genetic elements, such as viruses and transposons. If a cell finds RNA in its cytoplasm that looks “fishy,” with one section being the mirror image of another section, the cell will destroy all RNA copies of that gene, whether they carry the “fishy” mirror message or the normal sequence. Although this prevents viral proteins from being made by the cell’s ribosomes, it also prevents the original gene from being expressed. (4)

This cellular mechanism was discovered in 1986 when geneticists at a biotech lab in California were trying to engineer a petunia with a more richly colored flower. To do this, they attempted to splice in extra genes for the purple pigment, thinking that this would cause more pigment to be expressed. Instead, the cell identified the “imposter” gene for purple pigment, and it proceeded to destroy all copies of it, both original and artificial. The resulting flowers were white, having no pigment at all.

When cells come in contact with double stranded RNA (dsRNA), an enzyme called Dicer chops it into several small segments called small interfering RNA (siRNA). The cell then uses these short nucleotide sequences to check for any single stranded RNA with a matching sequence. The target RNA segments are then found and destroyed by the RNA-induced silencing complex (RISC). (3)

Mechanism of RNA Interference (5)
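The chop-and-search logic of RNA interference can be imitated with a toy Python model. All sequences and function names below are invented for illustration, the fragment length of 21 is an assumed typical value, and matching is reduced to a plain text search, whereas the real process works through base pairing between a guide strand and its target.

```python
def dice(ds_rna, fragment_length=21):
    """Chop a double-stranded trigger (represented by one of its strands)
    into consecutive fragments of equal length, dropping any short leftover."""
    return [ds_rna[i:i + fragment_length]
            for i in range(0, len(ds_rna) - fragment_length + 1, fragment_length)]


def is_silenced(mrna, fragments):
    """An mRNA is targeted if any fragment matches part of its sequence."""
    return any(fragment in mrna for fragment in fragments)


# Hypothetical pigment mRNA and a dsRNA trigger copied from part of it.
pigment_mrna = "AUG" + "GCCUUUAAAGGA" * 4 + "UAA"
trigger = pigment_mrna[3:45]
unrelated_mrna = "AUG" + "CCCGGG" * 8 + "UGA"

fragments = dice(trigger)
print(len(fragments))                          # 2
print(is_silenced(pigment_mrna, fragments))    # True
print(is_silenced(unrelated_mrna, fragments))  # False
```

Only the mRNA that shares sequence with the trigger is targeted, which is why RNAi can shut down one specific gene while leaving unrelated genes alone.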

RNAi has several potential medical and scientific uses. The ability to block the expression of a specific gene is invaluable when trying to determine what individual genes do. Since the discovery in 1986, geneticists have figured out how to insert segments of double stranded “mirror image” RNA that trigger the RNA interference mechanism. (1)

Although more research must be done before RNAi treatments are readily available for humans, there has been a lot of success manipulating RNAi in nematodes and, more recently, mice. There have been experimental treatments for macular degeneration in humans, which have shown some success. RNAi has the potential to go much further in the future, as it has been used in mice to combat ailments like Huntington’s disease, Lou Gehrig’s disease, hepatitis, and breast cancer. There is also potential for RNAi treatments for things like HIV, cancer, arthritis, and Alzheimer’s. (1)

1.) Researchers at a laboratory were able to isolate a certain gene in dandelions that is responsible for depositing the yellow pigment in the flowers. If they introduce extra dsRNA versions of the gene to the flower's cells, which of the following is likely to occur?
A. The extra RNA will be transcribed and will result in a yellower flower.
B. Thinking the dsRNA is a virus, the cell will release a hormone to trigger an immune response.
C. The dsRNA will trigger the RISC to destroy all copies of it and the original pigment gene.
D. The dsRNA will act as silencer RNA to prevent the cell's ribosomes from transcribing proteins.
2.) Which of the following is an advantage of the RNA interference process?
A. It helps a cell defend against viral infection.
B. It helps a virus insert its genetic information into a cell's nucleus.
C. It helps to accelerate mitotic processes.
D. It prevents expression of all disadvantageous genes.
3.) If a cell's DNA is damaged beyond repair, what, hopefully, is the response of the cell?
A. Proceed with mitosis and continue life with an altered genome
B. "Commit suicide" via apoptosis
C. Initiate Transition-coupled repair
D. Destroy all copies of its own DNA and get some from another cell
4.) The enzyme DNA Ligase IV is used in which type of DNA repair?
A. MMEJ
B. NHEJ
C. translation synthesis
D. homologous recombination
5.) During Transcript Processing, pre-mRNA gets fitted with a Guanine Cap on its 5' end. What is the main purpose of this?
A. Facilitate transport out of the cell nucleus into the cytoplasm
B. Indicates the start region for RNA polymerase and other protein factors
C. Protection from degradation
D. Allow future repair through Direct Reversal
6.) A "Helix-distorting Lesion" occurs when two nucleotides no longer fit one another due to damage. Which method of repair is most effective?
A. MMR
B. BER
C. Transcription Coupled Repair
D. NER
7.) Protein folding is considered a "Co-translational Process" because:
A. It occurs at the same time the protein is being assembled
B. ER Stress occurs as a result of translation
C. During the attachment of the large ribosomal subunit during translation, mRNA's 5' Guanine cap causes the mRNA to fold
D. It is coupled with recombination during the end of protein repair
8.) A Dicer protein's actions result in:
A. dsRNA cleavage
B. mRNA cleavage
C. RISC Formation
D. DSB Formation
9.) Any damage to DNA caused by deamination is repaired by:
A. Transcription Coupled Repair
B. Base Excision Repair
C. Non-homologous End Joining
D. Mismatch Repair
10.) Cancer is often the result of:
A. A cell going into a state called Senescence
B. A cell bypassing typical shutdown procedures when its genome is altered
C. MMR malfunctioning during repair
D. Double stranded RNA
A) In eukaryotic cells, the DNA resides inside the nucleus all the time, except during mitosis. Describe how DNA is able to control protein synthesis from inside of the nucleus.
B) How does the structure of DNA lend itself to being repaired if there is damage?


1. Great video about RNAi and its discovery
2. Animation of the biological processes behind RNAi
4. Campbell, Neil A., and Jane B. Reece. Biology. Sixth Edition. San Francisco: Pearson Education, Inc., 2002.
10. Ragona, James. "Protein Synthesis." Honors Anatomy. Lecture