Control of Gene Expression
By Muhammad Adil Khan, Emily Hu, and Jason Kim
external image GeneExpression.gif

What are transcription factors? To answer this, one might also ask, “Why do we have different types of cells in our body?” Each cell in our body contains the same DNA, but the activation of certain genes in a cell allows it to differentiate. This distinguishes a neuron from a stomach cell. Transcription factors help determine which areas of the DNA are available for transcription and are very important in cell differentiation. In eukaryotes, most genes are turned off by default and transcription factors are used to turn them on. Conversely, genes in bacteria are generally turned on by default and transcription factors are used to turn those genes off.

What are Transcription Factors?
A transcription factor is a protein that binds to specific DNA sequences and controls the rate of transcription of DNA into mRNA. These proteins can work alone or with other transcription factors to form a complex before transcription occurs. Before we delve into transcription factors, it is important to know the specific sequences of nucleotides on a DNA strand. By knowing these, we can see where transcription begins and ends.
First is the promoter. The promoter is an area which contains a DNA sequence that initiates transcription. The sequence that ends transcription is the terminator. The section of DNA that is transcribed into an RNA molecule is referred to as the transcription unit.
Eukaryotic promoters usually include a TATA box. A TATA box is a nucleotide sequence containing TATA upstream from the transcription start point. When a transcription factor (TF) protein recognizes the TATA sequence on the DNA, it binds to the DNA molecule (5). At this point, one of two things can happen. First, RNA polymerase II, the enzyme responsible for transcribing DNA to mRNA, can attach to the DNA strand. Alternatively, a cascade reaction can occur. In a cascade reaction, the presence of small amounts of one protein triggers the production of larger amounts of a second, which triggers production of even larger amounts of a third protein, and so on (6). The combination of all these transcription proteins along with the RNA polymerase II attached to the DNA strand is known as a transcription initiation complex.
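
As a rough illustration of how a TATA box sits upstream of the transcription start point, the sketch below searches a DNA string for the letters TATA before a given start position. The function name, the example sequence, and the start position are all made up for illustration, and real promoter recognition depends on much more than a four-letter match.

```python
def find_tata_boxes(dna: str, start_site: int) -> list[int]:
    """Return 0-based positions of 'TATA' that lie upstream of start_site."""
    upstream = dna[:start_site].upper()  # only look before the start point
    positions = []
    idx = upstream.find("TATA")
    while idx != -1:
        positions.append(idx)
        idx = upstream.find("TATA", idx + 1)  # allow overlapping matches
    return positions


# Hypothetical sequence and start site, for illustration only.
sequence = "GGCCTATAAAGGCGCGCCATGACCGTAGCT"
print(find_tata_boxes(sequence, start_site=18))
```

Any TATA that appears after the chosen start site is ignored, since the text describes the TATA box as lying upstream of where transcription begins.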

The Initiation of Transcription at a Eukaryotic Promoter

Here is a video that condenses the purpose of transcription factors in gene expression.

How do transcription factors affect gene expression?
Here are some terms that are related to transcription factors (4):
  • Upregulation: An increase in the rate of transcription.
  • Downregulation: A decrease in the rate of transcription.
  • Coactivator: A protein that works with transcription factors to increase the rate of transcription.
  • Corepressor: A protein that works with transcription factors to decrease the rate of transcription.

One way transcription factors control gene expression is by stabilizing or blocking the RNA polymerase II binding sites on the DNA strand. TF proteins can also act as catalysts to increase or decrease the rate of acetylation of the histone proteins around which DNA is wrapped. Acetylation is the reaction in which a hydrogen atom is replaced by an acetyl group. Two activities are observed with acetylation (7):

  • Histone acetyltransferase (HAT) activity: TF protein acetylates histone proteins, which weakens the attraction of DNA with histones. This makes the DNA more susceptible to transcription.
  • Histone deacetylase (HDAC) activity: TF protein deacetylates histone proteins, which strengthens the attraction of DNA with histones. This inhibits transcription of the DNA.

Histone acetyltransferase (HAT) activity

Unfortunately, the interaction between RNA polymerase and the TATA box alone produces a low rate of transcription. To achieve a higher level of eukaryotic transcription, control elements are used. Control elements are segments of non-coding DNA that regulate transcription of a gene by binding to TF proteins. Some control elements are close to the promoter and some are far away from it. These are called proximal and distal control elements, respectively. First, activator proteins bind to the enhancers (distal control elements) on the DNA. Second, the DNA bends, bringing the bound activators closer to the promoter region. Lastly, the activator proteins bind to TF proteins, forming a complex. The bending of the DNA allows enhancers to affect distant promoters by binding additional transcription factors (8).

Enhancer Action by Transcription Factors

How do transcription proteins bind to DNA?
TF proteins share a few basic structural elements called DNA binding domains. There are four DNA binding domains, or "motifs" (9).
The first is the "helix-turn-helix motif". This motif is found in many DNA regulatory proteins. It consists of two segments of alpha helix separated by an irregular region, called a "turn". The C-terminal helix lies in the major groove of the DNA double helix. The sequence specificity of the DNA binding is maintained by contact between the amino acid R-groups of the helix and the nucleotides of the groove. This motif is found in prokaryotic cells in the "lac" and "trp" repressor proteins (9).


Another binding domain is the "Helix-loop-helix motif". It consists of a short segment of alpha helix connected to a longer segment of alpha helix by an irregular region, or "loop". A dimerization domain interacts with other helix-loop-helix proteins to form heterodimers. The two sections of alpha helix, one from each monomer, bind to the major groove of the DNA double helix (9).

Helix-loop-Helix Motif

The third DNA motif is the "leucine zipper motif". Proteins of this type form heterodimers. Two alpha helices, one from each monomer, form a coil-like structure at one end due to interactions between leucines that extend from one side of each helix. At the end of the coil, the alpha helices diverge, which allows them to fit into the major groove of the DNA double helix (9).

Leucine Zipper Motif
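
The leucine pattern described above can be checked with a short sketch. It assumes the commonly described spacing of one leucine (L) at every seventh residue, which this page does not state, and the function name and test sequences are hypothetical. It is a toy pattern check, not a real motif predictor.

```python
def has_leucine_heptad(protein: str, repeats: int = 4) -> bool:
    """True if some stretch has a leucine (L) at every 7th position, `repeats` times in a row."""
    protein = protein.upper()
    span = 7 * (repeats - 1)  # distance from the first to the last leucine checked
    for start in range(len(protein) - span):
        if all(protein[start + 7 * k] == "L" for k in range(repeats)):
            return True
    return False


# Hypothetical one-letter amino acid sequence: L, six other residues, repeated.
toy_zipper = ("L" + "A" * 6) * 3 + "L"
print(has_leucine_heptad(toy_zipper))
```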

The fourth motif is the "zinc finger motif". It consists of a segment of alpha helix that lies in the major groove of the DNA helix, bound to a beta sheet by a zinc ion. The zinc ion is held in place by two cysteine and two histidine R groups (9).

Zinc Finger Motif

What is transcript processing?

Transcript processing is the process in which the information contained within the genes of the DNA flows to proteins to express various genotypes and phenotypes. In the simplest terms, it may be viewed as protein synthesis. A living organism cannot exist without proteins. DNA holds all of the instructions for the construction of every protein that an organism needs to function. Each gene contains the instructions for the construction of one protein and is between 150 and 30,000 codons long. Human DNA contains approximately 23,688 protein-producing genes and is too long to leave the nucleus. Therefore, the information needed to construct a protein must be transferred from the DNA to mRNA. The mRNA takes this information out of the nucleus to the ribosome, where the protein is finally synthesized. Protein synthesis involves two major processes: transcription and translation. (8)

Transcription is the synthesis of RNA under the direction of DNA. Both nucleic acids use the same language, and the information is simply transcribed, or copied, from one molecule to the other. Just as a DNA strand provides a template for the synthesis of a new complementary strand during DNA replication, it also provides a template for assembling a sequence of RNA nucleotides. The resulting RNA molecule, or mRNA (messenger RNA), is a faithful transcript of the gene's protein-building instructions. The mRNA carries the genetic message from the DNA to the protein-building site of the cell. Only one strand of DNA is copied during transcription. In addition, a single gene may be transcribed numerous times. After transcription, the DNA strands rejoin. (12)

Steps involved in transcription (10)
Transcription occurs in the nucleus and consists of 3 steps: binding of polymerase, synthesis of mRNA (elongation), and modification of pre-mRNA.

1. Binding of Polymerase (Initiation)
Genes on the DNA, or sense strands, have a promoter region that is a sequence of repeating nitrogenous bases (e.g. CCCCCC, or TATATATATA, the TATA box in eukaryotes) that precedes the gene (11). The promoter identifies the start of a gene, which strand is to be copied, and the direction in which it is to be copied. Groups of proteins called transcription factors (TFs) help an enzyme called RNA polymerase II to bind to the promoter region. RNA polymerase assembles bases that are complementary to the DNA strand being copied (12). RNA contains uracil instead of thymine as a nitrogenous base.
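
The base-pairing rule that RNA polymerase follows can be sketched in a few lines. This is a minimal illustration, assuming the template strand is given as a string of A, T, G, and C; the function name and the example template are hypothetical. The RNA is written in the same left-to-right order as the template, so a template written 3' to 5' yields RNA that reads 5' to 3'.

```python
# Pairing rules for building RNA on a DNA template.
# Adenine in the template pairs with uracil, because RNA has no thymine.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template: str) -> str:
    """Return the RNA complementary to a DNA template strand, base by base."""
    return "".join(DNA_TO_RNA[base] for base in template.upper())


# Hypothetical template strand, for illustration only.
print(transcribe("TACGGCATT"))  # AUGCCGUAA
```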
Phases of Transcription (bacteria)

2. Synthesis of mRNA (Elongation)

Polymerase II begins to unwind and open the DNA molecule, starting at the promoter region. Polymerase II untwists and exposes 10 to 20 DNA nitrogenous bases at a time, allowing complementary RNA nucleotides to pair with the exposed DNA bases. Once the mRNA nucleotides attach to one another, the growing strand of mRNA disengages from the DNA and the DNA re-forms, rejoining the strands. Transcription begins to end after the polymerase transcribes two termination triplets (TTATTT). This termination code in the DNA indicates where transcription will stop. This process is called termination, and it signals the end of the transcription unit. The transcription unit is the stretch of DNA that is to be transcribed into an RNA molecule. Transcription continues for another 10 to 30 nucleotides before the pre-mRNA molecule disengages from polymerase II. The mRNA produced is called an mRNA transcript. Often several polymerase II molecules follow one after the other on a gene, resulting in the production of identical mRNA strands and increasing the amount of protein that can be synthesized. (8)

3. Modification of pre-mRNA (Termination)
mRNA must be modified after transcription and before it leaves the nucleus through nuclear pores. After transcription, the 5' (five prime) end, the end constructed first, has a modified guanine (guanine cap) added, which identifies the end that is to attach to the ribosome. At the 3' end, a tail of 50 to 250 adenines (poly-A tail) is added to help prevent degradation. Eukaryotic genes contain regions that are not translated into proteins. These regions are called introns and must be removed from the pre-mRNA. Their function is not well understood. The remaining portions, which are translated into protein, are called exons. Pre-mRNA includes introns as well as exons: snRNPs (small nuclear ribonucleoproteins) cut out the introns and, with other proteins, splice the exons together, creating a functional mRNA molecule, called the mature mRNA transcript, which exits the nucleus. (10)
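
The three modifications described above (cap, tail, and splicing) can be mimicked on a string. This is only a sketch: the function name, the intron coordinates, and the example sequence are hypothetical, the text label "m7G-" is just a placeholder standing in for the modified guanine cap, and the cell finds introns through snRNPs rather than being handed their positions.

```python
def process_pre_mrna(pre_mrna: str,
                     introns: list[tuple[int, int]],
                     tail_length: int = 50) -> str:
    """Splice out introns (given as [start, end) index pairs), then add a cap and a poly-A tail."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    spliced = "".join(exons)
    # "m7G-" is a placeholder label for the 5' cap; the tail is a run of adenines.
    return "m7G-" + spliced + "A" * tail_length


# Hypothetical pre-mRNA: exon + intron + exon, with the intron at positions 6 to 17.
pre = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
print(process_pre_mrna(pre, introns=[(6, 18)], tail_length=5))
```

The default tail length of 50 is simply the lower end of the 50 to 250 range given in the text.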

Here is a simplified diagram of the phases involved in Transcription:

Translation is the process where ribosomes synthesize proteins using the mature mRNA transcript produced during transcription. This is a three part process consisting of: Initiation, Elongation, and Termination. All three stages require protein "factors" that aid mRNA, tRNA, and ribosomes in the translation process. For chain initiation and elongation, energy is also required. It is provided by the hydrolysis of GTP (guanosine triphosphate), a molecule very similar to ATP. (8)

1. Initiation
Once the mRNA has been modified and is ready for translation, it binds to a specific site on the ribosome. Ribosomes consist of two parts, a large subunit and a small subunit. They contain a binding site for mRNA and two binding sites for tRNA located in the large ribosomal subunit. The 5' end of an mRNA binds to the small subunit of the ribosome. Just past the guanine cap is the start codon (AUG), the nucleotide triplet that serves as the start sequence. The key to translating a genetic message into a specific amino acid sequence is that each type of tRNA molecule (transfer RNA) links a particular mRNA codon with a particular amino acid. The function of tRNA is to transfer amino acids from the cytoplasm's amino acid pool to a ribosome. As a tRNA molecule arrives at a ribosome, it bears a specific amino acid at one end. At the other end is a nucleotide triplet called an anticodon, which base-pairs with a complementary codon on the mRNA. Each amino acid is joined to the correct tRNA by a specific enzyme called an aminoacyl-tRNA synthetase. There are 20 of these enzymes in the cell, one enzyme for each amino acid. The synthetase catalyzes the covalent attachment of the amino acid to its tRNA in a process driven by the hydrolysis of ATP. The resulting aminoacyl tRNA, also called an activated amino acid, is released from the enzyme and delivers its amino acid to a growing polypeptide chain on a ribosome. An initiator tRNA (anticodon UAC; amino acid methionine) binds to the mRNA start sequence through hydrogen bonds between its anticodon and the start codon. It resides in one binding site of the ribosome, called the P site, leaving the second binding site (the A site) open, and the major portion of the tRNA attaches briefly to the larger ribosomal subunit.
A peptide bond forms connecting the amino acid of the tRNA in the P site to the amino acid of the tRNA in the A binding site. The union of mRNA, initiator tRNA, and a small ribosomal subunit is followed by the attachment of a large ribosomal subunit, completing a translation initiation complex. (13) Proteins called initiation factors are required to bring all these components together. The cell also uses energy in the form of a GTP molecule to form the initiation complex. The ribosome adds each amino acid brought to it by tRNA to the growing end of a polypeptide chain. (8)
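
The codon and anticodon pairing mentioned above can be worked out mechanically. The sketch below is a minimal illustration with a hypothetical function name; it pairs bases position by position and writes the anticodon in the same left-to-right order as the codon, which is the convention the text uses when it gives UAC for the start codon AUG.

```python
# RNA-to-RNA base pairing between an mRNA codon and a tRNA anticodon.
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the anticodon that base-pairs with an mRNA codon, position by position."""
    return "".join(PAIRS[base] for base in codon.upper())


print(anticodon_for("AUG"))  # UAC, the initiator tRNA anticodon named in the text
```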

Initiation phase of Translation

Initiation and Termination Codes
An initiation code signals the start of a genetic message. As the ribosome moves along an mRNA transcript, it will not begin synthesizing protein until it reaches an initiation code.
Termination codes signal the end of the genetic message. Synthesis stops when the ribosome reaches a terminator codon (10).
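
A short sketch can show how initiation and termination codes bracket the message. It assumes the initiation code is AUG and the termination codes are the three stop codons UAA, UAG, and UGA; the function name and the example mRNA are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def reading_frame(mrna: str) -> list[str]:
    """Return the codons from the first AUG up to (not including) the first in-frame stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return []  # no initiation code, so nothing is translated
    codons = []
    for i in range(start, len(mrna) - 2, 3):  # step three bases at a time
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break  # termination code reached
        codons.append(codon)
    return codons


# Hypothetical mRNA with two leading bases before the start codon.
print(reading_frame("GGAUGGCCUUUUAAGG"))  # ['AUG', 'GCC', 'UUU']
```

Bases before the first AUG and after the stop codon are ignored, mirroring how the ribosome only synthesizes protein between the two signals.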

The Genetic Code
Living organisms depend on proteins for structure and function, and as enzymes to regulate the chemical reactions that sustain life. Life would be impossible without proteins. The genetic code is the set of codons that each code for a specific amino acid, and it is the same for nearly every species. This means that the genetic code is nearly universal. The term "genome" refers to the complete complement of an organism's genes. The table below can be used to determine what amino acid corresponds to any 3-letter codon. (8)

external image rnacode.gif
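
The same lookup the table provides can be written as a small dictionary. This is a sketch of the standard genetic code using one-letter amino acid abbreviations, with "*" standing for a stop codon; the variable and function names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes in standard table order (UUU, UUC, UUA, UUG, UCU, ...).
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(codons: list[str]) -> str:
    """Look up each codon in order and stop at the first stop codon."""
    protein = []
    for codon in codons:
        amino_acid = CODON_TABLE[codon.upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(CODON_TABLE["AUG"])                       # M (methionine)
print(translate(["AUG", "GCC", "UUU", "UAA"]))  # MAF
```

Because several codons map to the same letter, the dictionary has 64 keys but only 20 distinct amino acids plus the stop marker.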

The elongation cycle of Translation

2. Elongation
In this stage of translation, amino acids are added one by one to the preceding amino acid. Each addition involves the participation of several proteins called elongation factors and occurs in a three step cycle.

a. Codon Recognition: The next mRNA codon is exposed. The codon is recognized by a tRNA molecule with the appropriate anticodon. This step requires the hydrolysis of two molecules of GTP. An enzyme briefly attaches this next tRNA to the large subunit.
b. Peptide Bond Formation: An rRNA molecule (ribosomal RNA, which together with proteins composes the ribosomal subunits) of the large ribosomal subunit, functioning as a ribozyme, catalyzes the formation of a peptide bond joining the new amino acid to the carboxyl end of the growing polypeptide chain. In the first cycle, this bond forms between the first amino acid (methionine) and the second amino acid. The ribosome slides both tRNAs over, and the first tRNA separates from the growing polypeptide chain, leaving the amino acids attached to the second tRNA.
c. Translocation: The mRNA molecule is moved by the ribosome one codon over. This movement exposes the next codon. This movement also causes the first tRNA to separate from the mRNA. The tRNA returns to the cytosol, where it will attach to another amino acid (the same type of amino acid that was given to the polypeptide). The processes of elongation and translocation occur concurrently and continue until the "STOP" codon is reached. Simply put, the ribosome shifts the mRNA by one codon in this stage. (8)

3. Termination
When a stop codon (UAA, UAG, or UGA) is exposed on the ribosome, a protein that acts as a release factor binds to it, causing the polypeptide to be released. The two subunits of the ribosome then separate, releasing the mRNA. (8) The polypeptide chain then spontaneously folds and twists, often with the aid of a chaperone protein, forming a functional protein of specific conformation: a three-dimensional molecule with secondary and tertiary structure. Additional steps (posttranslational modifications) may be required before the protein can begin doing its particular job in the cell. Certain amino acids may be chemically altered by the attachment of sugars, lipids, phosphate groups, or other additions. The protein may also be further modified by enzymes, which may remove one or more amino acids from the leading (amino) end of the polypeptide chain.
Usually after a ribosome moves past the "START" codon, another ribosome attaches to the mRNA and begins building another polypeptide, forming a string of ribosomes called a polyribosome. This continues until the mRNA begins to degrade, which takes about 3 minutes. It takes less than a minute to construct an average-sized protein. (13)

"Transcript Processing" - By Jason Kim!

Termination stage of Translation

Here is an overview of the entire cycle of transcript processing to synthesize protein:

Overview of Transcription and Translation in a eukaryotic cell

Now, here is a short video that sums up transcript processing:

What is Epigenetics?
Epigenetics (literally meaning "above genetics") is the study of chemical reactions that switch parts of the genome off and on and the factors that influence these reactions. It is through epigenetic marks that environmental factors like diet, stress and prenatal nutrition can make an imprint on genes that are passed from one generation to the next. While the DNA code remains permanent for life, the epigenome is flexible. (3)

How are genes turned on or off?
Signals from the outside world work through the epigenome to change a cell's gene expression. Tags on the genes react to signals from the outside world like diet and stress. Epigenetic tags give the cell a way to "remember" long-term what its genes should be doing.
As cells grow and divide, cellular machinery copies epigenetic tags along with the DNA. This is especially important during embryonic development, as past experiences inform future choices. For example, a cell must first "know" that it is an eye cell before it can decide whether to become part of the lens or the cornea. The epigenome allows cells to remember their past experiences long after the signals fade away.
Two types of tags are methyl tags and acetyl tags. Methyl tags silence genes or keep them turned off. They are added to cytosine at the sequence CG. Methyl tags can silence genes by blocking transcription machinery from binding to DNA or by recruiting proteins that bind to the DNA and block transcription machinery from binding. The addition of methyl tags to repress DNA activity is called methylation. Acetyl tags, found near active genes, loosen the bond between DNA and histones, allowing easier access to DNA. They are added to the amino acid lysine on the tails of histone proteins. The attachment of different molecules to histones is called histone modification. (1)
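
Since methyl tags are added to cytosine at the sequence CG, the candidate sites in a stretch of DNA can be listed with a simple scan. This is a minimal sketch with a hypothetical function name and example sequence; it only finds where methylation could occur, not whether a given site is actually tagged.

```python
def cpg_sites(dna: str) -> list[int]:
    """Return 0-based positions of each cytosine that is immediately followed by guanine."""
    dna = dna.upper()
    return [i for i in range(len(dna) - 1) if dna[i:i + 2] == "CG"]


# Hypothetical sequence, for illustration only.
print(cpg_sites("ATCGGCGTACG"))  # [2, 5, 9]
```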


Gene Regulatory Proteins Have Two Functions:
1. Switch specific genes on or off
A gene regulatory protein attaches to a specific sequence of DNA on one or more genes. Once there, it acts like a switch, activating genes or shutting them down.
2. Recruit enzymes that add and remove epigenetic tags
Gene regulatory proteins also recruit enzymes that add or remove epigenetic tags. Enzymes add epigenetic tags to the DNA, the histones, or both. (1)

What types of signals does the epigenome respond to?
The epigenome changes in response to signals from inside the cells, neighboring cells, and the outside environment. (1)
  • Direct Contact- Cells can signal their neighbors through direct contact. This is especially important during embryonic development, for example, in early nervous system development.
  • Release Factors- Cells can release factors that are taken in by nearby cells. (This is similar to tossing a ball to someone.) Cells in the nervous system use release factors.
  • Hormone Signals- Hormone signals released in one part of the body travel through the bloodstream to affect other cells. Any cell can pick up the hormone signals.
  • Environmental Factors- Environmental factors can be direct, such as the things we eat, which are broken down and circulate through the body, or indirect.

Early in development: Most signals come from within cells or from neighboring cells. The mother's nutrition is also important at this stage because the food she brings into her body forms the building blocks for shaping the growing fetus and its developing epigenome. Other types of signals, such as stress hormones, can also travel from mother to fetus.
After birth and as life continues: A wider variety of environmental factors start to play a role in shaping the epigenome. Social interactions, physical activity, diet and other inputs generate signals that travel from cell to cell throughout the body. As in early development, signals from within the body continue to be important for many processes, including physical growth and learning. Hormonal signals trigger big changes at puberty.
In old age: Cells continue to listen for signals. Environmental signals trigger changes in the epigenome, allowing cells to respond dynamically to the outside world. Internal signals direct activities that are necessary for body maintenance, such as replenishing blood cells and skin, and repairing damaged tissues and organs. During these processes, just like during embryonic development, the cell's experiences are transferred to the epigenome, where they shut down and activate specific sets of genes. (1)


Epigenetic Inheritance
In normal genetic inheritance, two reproductive cells meet and grow and divide to form every type of cell in the adult organism. In order for this process to occur, the epigenome must be erased through a process called reprogramming. Reprogramming is important because genetic information of eggs and sperm is marked with epigenetic tags. Before the new organism can grow into a healthy embryo, the epigenetic tags are erased. During development, specialized cellular machinery erases the genome's epigenetic tags in order to return the cells to a genetic "blank slate." However, for a small minority of genes, epigenetic tags make it through this process and pass unchanged from parent to offspring. This means that a parent's experiences, in the form of epigenetic tags, can be passed down to future generations. (1)


Epigenetic marks can pass from parent to offspring in a way that completely bypasses egg or sperm, avoiding the epigenetic purging that happens during early development.
In mammals, about 1% of genes escape epigenetic reprogramming through imprinting. (1)

Nurturing Behavior in Rats
When the female pups become mothers themselves, the ones that received high quality care become high nurturing mothers. The pups that received low quality care become low nurturing mothers. The nurturing behavior itself transmits epigenetic information onto the pups' DNA, without passing through egg or sperm. (1)

Gestational diabetes
Mammals can experience a hormone-triggered type of diabetes during pregnancy, known as gestational diabetes. When the mother has gestational diabetes, the developing fetus is exposed to high levels of the sugar glucose. High glucose levels trigger epigenetic changes in the daughter's DNA, increasing the likelihood that she will develop gestational diabetes herself. (1)

Nutritional Effects
Shortage of food for the grandfather was associated with extended lifespan of his grandchildren. Food abundance, on the other hand, was associated with a greatly shortened lifespan of the grandchildren. Early death was the result of either diabetes or heart disease. (1)

Implications for Evolution
From our studies about genetics, we already know that the genome changes slowly through the processes of random mutation and natural selection. It takes many generations for a genetic trait to become common in a population. The epigenome, on the other hand, can change rapidly in response to signals from the environment. Also, epigenetic changes can happen in many individuals at once. Through epigenetic inheritance, some of the experiences of the parents may pass to future generations. The epigenome remains flexible as environmental conditions continue to change. Epigenetic inheritance may allow an organism to continually adjust its gene expression to fit its environment, without changing its DNA code. (1)
Some scholars have suggested that epigenetics is similar to Lamarckism, the idea that an organism can pass on characteristics that it acquired during its lifetime to its offspring. (2)

Review Questions

1.) The combination of all transcription proteins along with the RNA polymerase II attached to a DNA strand refers to the…

a) Promoter Region
b) Terminator Region
c) Coactivation Complex
d) Histone deacetylase activity
e) Transcription Initiation Complex

2.) Which of the following are ways in which transcription factors decrease the rate of transcription?

a) Histone acetyltransferase (HAT) activity
b) Histone deacetylase (HDAC) activity
c) Upregulation
d) Enhancer Action
e) All of the above

3.) Which of the following is a common feature of the DNA binding motifs?

a) Zinc Atom
b) Cysteine Groups
c) Histidine R Groups
d) Leucine
e) Alpha Helices

4.) The enzyme used in transcription is:
a) Ligase
b) Amino-acyl transferase
c) DNA polymerase
d) RNA polymerase
e) Acetylase

5.) The DNA strand with a sequence of AACGTAACG is transcribed. What is the sequence of the mRNA that is synthesized?

6.) Just like DNA polymerase, RNA polymerase performs template-directed synthesis in the direction:
a) 3' ---> 5'
b) 5' ---> 3'
c) 5' ---> 5'
d) 3' ---> 3'
e) 3' ---> 4'

7.) What is the role of ribosomes in protein synthesis?
a) they translate the basic DNA code using transfer RNA
b) they provide a site for transfer RNAs to link to messenger RNAs
c) they provide a source of amino acids
d) they carry the proteins to their site of action
e) they activate the unwinding of the DNA strand for transcription to occur

8.) Which of the following is a function of gene regulatory proteins?

a) turning genes on or off by binding to DNA
b) receive signals from nearby cells
c) recruit other enzymes to add or remove tags
d) a and b
e) a and c

9.) All of the following are characteristics of epigenetics except:

a) environmental factors can change a gene's expression
b) epigenetics is regulated by acetyl and methyl tags
c) the genome is influenced by internal and external factors
d) changes in the epigenome are permanent
e) epigenetic inheritance occurs because not all epigenetic tags are erased during reprogramming

10.) What type of epigenetic signals are present during early stages of development?

a) signals from neighboring cells
b) stress hormones from the mother
c) nutrition from the mother
d) signals from within the cell
e) all of the above

Essay Question
Diet-related disorders, such as diabetes, can be caused by many factors. Some of these factors may be attributed to genetic and environmental influences. Explain the likely causes of such disorders amongst humans in terms of each of the following:
a) Transcription Factors
b) Transcript Processing
c) Epigenetic Factors

Annotated Works Cited
1. Epigenetics A wonderful site all about epigenetics with a surplus of pictures
2. Epigenetics and Imprinted Genes Information about the basics of epigenetics and imprinting
3. Epigenetics in TIME An article called "Why Your DNA Isn't Your Destiny"
4. Transcription Factors Wiki Page An in-depth look at transcription factors on Wikipedia
5. Transcription Factors Terminology A couple of terms regarding TFs.
6. Transcription Factors A link on how transcription factors bind to DNA.
7. Regulating histone acetyltransferases and deacetylases A website that explains how TFs acetylate histones.
8. Campbell's Biology: 6th Edition Gotta love our bio book!
9. DNA Binding Domains A great website that explores the different methods of interactions between proteins and DNA.
10. Protein Synthesis: Transcription and Translation A simple overview of protein synthesis
11. Gene Expression: Transcription A closer look at transcription
12. Transcription and RNA Polymerase II Information regarding stages in Transcription
13. Protein Synthesis- Translation A link on processes involved in Translation