IB Biology · Theme A · A1.2

The four-letter alphabet of everything alive.

Two strands. Four letters. Three billion years of memory. A chemistry that copies itself well enough for life to last.

15 sub-topics
29 key terms
Level: SL + HL
Level of organisation: Molecules

Why this topic

What this topic answers.

Every sub-topic below feeds at least one of these questions.

Guiding question 1

How does the structure of nucleic acids allow hereditary information to be stored?

Guiding question 2

How does the structure of DNA facilitate accurate replication?

A1.2.1 – A1.2.10 · Standard Level

10 things to lock in.

The required syllabus content for A1.2, in order. Each card is one lesson-sized checkpoint.

A1.2.1

DNA as the genetic material of all living organisms

Some viruses use RNA as their genetic material but viruses are not considered to be living.

A1.2.2

Components of a nucleotide

In diagrams of nucleotides use circles, pentagons and rectangles to represent relative positions of phosphates, pentose sugars and bases.

A1.2.3

Sugar–phosphate bonding and the sugar–phosphate “backbone” of DNA and RNA


A1.2.4

Bases in each nucleic acid that form the basis of a code


A1.2.5

RNA as a polymer formed by condensation of nucleotide monomers


A1.2.6

DNA as a double helix made of two antiparallel strands of nucleotides with two strands linked by hydrogen bonding between complementary base pairs


A1.2.7

Differences between DNA and RNA


A1.2.8

Role of complementary base pairing in allowing genetic information to be replicated and expressed


A1.2.9

Diversity of possible DNA base sequences and the limitless capacity of DNA for storing information


A1.2.10

Conservation of the genetic code across all life forms as evidence of universal common ancestry


A1.2.1 · The genetic material

One molecule. All of life.

Every living organism on Earth — from E. coli to a blue whale — stores its genetic information in DNA. The single most universal fact in biology.

DNA (deoxyribonucleic acid) is the genetic material of all living organisms. It is found in chromosomes and contains the instructions for the growth, development and functioning of every cell. The sequence of bases along a strand of DNA is what we mean by genetic information.

🧬

The seven characteristics of life

Living organisms have at least one cell, carry out metabolism, maintain homeostasis, respond to stimuli, reproduce, grow and develop, and contain genetic information (in the form of DNA). Use this checklist to decide whether something is alive — or, more interestingly, isn't.

Why viruses aren't alive (despite having genetic material)

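Viruses do carry genetic material (DNA or RNA) surrounded by a protein coat. But they fail almost every other test of life: they are not made of cells, have no metabolism of their own, do not maintain homeostasis, do not grow, and cannot reproduce outside a host cell.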

This is why the syllabus says "some viruses use RNA as their genetic material but viruses are not considered to be living." A virus is genetic information looking for a cell to hijack.

A1.2.2 · Components of a nucleotide

Three pieces. Always.

A nucleotide is a phosphate, a sugar, and a nitrogenous base — held together by covalent bonds.

[Diagram: a nucleotide drawn as a circle (phosphate), a pentagon (pentose sugar) and a rectangle (nitrogenous base)]

Phosphate · circle
A PO₄³⁻ group. Negatively charged. Forms the backbone of the polymer when many nucleotides join.

Pentose sugar · pentagon
A 5-carbon ring sugar. Ribose in RNA, deoxyribose in DNA. The naming convention is no accident: deoxyribose is ribose with one less oxygen (on the 2' carbon).

Nitrogen base · rectangle
One of A, T, G, C in DNA (with U replacing T in RNA). The base is the information-carrying part of the nucleotide.

Drawing convention
In IB diagrams: circle for phosphate, pentagon for pentose sugar, rectangle for nitrogen base. Memorise it.

A1.2.3 · Sugar–phosphate backbone

A polymer built by condensation.

Both DNA and RNA are polynucleotides — long chains assembled one nucleotide at a time through condensation reactions, building a strong covalent backbone.

Each condensation reaction joins the phosphate of one nucleotide to the sugar of the next, releasing a water molecule. The result is a continuous chain of covalently bonded atoms — sugar, phosphate, sugar, phosphate — which forms a strong backbone for the molecule. The bases stick out sideways from this backbone, where they can pair with bases from another strand or be read by enzymes.

The backbone is strong because every bond in it is a covalent bond — energetically expensive to break. This is why DNA can survive decades inside a cell (and tens of thousands of years inside a frozen mammoth tusk) without degrading.

A1.2.4 · Four bases

The four-letter alphabet of everything alive.

The sequence of bases is the code. The sugar–phosphate backbone is just the rail it sits on.

DNA bases

A · T · G · C

Adenine, Thymine, Guanine, Cytosine. A pairs with T; G pairs with C. The sequence in a gene determines the amino acid sequence in a protein.

RNA bases

A · U · G · C

Adenine, Uracil, Guanine, Cytosine. Uracil replaces thymine. RNA uses U because it's metabolically cheaper to make than T — fine for short-lived transcripts.

From base to protein

The flow of information is direct:

  1. Transcription — the DNA code is copied into a complementary mRNA code in the nucleus.
  2. Translation — ribosomes read the mRNA code in triplets (codons) and assemble the matching amino acid sequence to form a polypeptide.

Each set of three bases (a codon) specifies one amino acid. With four possible bases at each position, there are 4³ = 64 possible codons — enough to code for the 20 standard amino acids, with redundancy and start/stop signals.
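
The codon arithmetic above can be checked with a short Python sketch. It assumes the standard genetic code written in the compact one-letter form commonly used for codon tables (bases in U, C, A, G order, `*` for stop). The names `CODON_TABLE` and `translate` are illustrative only, and the mRNA at the end is a made-up sequence.

```python
from itertools import product

BASES = "UCAG"  # RNA alphabet, in the order conventionally used for codon tables
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build every three-base combination and pair it with its amino acid.
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Read an mRNA 5' to 3' in triplets, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(len(CODON_TABLE))               # 64, which is 4 ** 3
print(len(set(AMINO_ACIDS) - {"*"}))  # 20 distinct amino acids
print(translate("AUGGCCUGGUAA"))      # MAW (Met, Ala, Trp, then a stop codon)
```

Because 64 codons map onto only 20 amino acids plus stop, most amino acids are specified by more than one codon; that is the redundancy mentioned above.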

A1.2.5 · RNA polymer

Single-stranded. Versatile.

RNA monomers (nucleotides) link by condensation reactions to form an RNA polymer. Usually single-stranded; the same chemistry, different geometry.

✏️

Drawing convention (exam-critical)

For RNA: vertical column of nucleotides; covalent bond joining the sugar of one nucleotide to the phosphate of the next; label the four RNA bases (U, A, C, G). For an explicit IB diagram you need 4+ nucleotides in a chain.

A1.2.6 · The double helix

Two strands. Antiparallel.

Two polynucleotide strands wind around each other in opposite directions, held together by hydrogen bonds between complementary base pairs. The structure is so elegant it feels inevitable.

[Diagram: two antiparallel DNA strands, one labelled 5' → 3' and the other 3' → 5', joined by the base pairs A–T, T–A, G–C, C–G, A–T, T–A, G–C]

Two antiparallel strands
One strand runs 5' → 3', the other runs 3' → 5'. Their backbones face outward; the bases face each other in the middle.

Complementary base pairing
A pairs only with T (2 hydrogen bonds). G pairs only with C (3 hydrogen bonds). The pairings are forced by the chemistry of the bases: no other combinations work.

Hydrogen bonds: the magic
Individually weak, but with millions per chromosome, collectively strong enough to hold the helix stable. Weak enough to be unzipped by enzymes when needed for replication or transcription.

Drawing convention
For IB diagrams: show the two strands antiparallel (pentagons pointing in opposite directions). A with T, G with C. Hydrogen bonds = dashed lines. You don't need to draw the helical twist.

A1.2.7 · DNA vs RNA

Same chemistry, different jobs.

Three structural differences. Each one matters.

Feature | DNA | RNA
Number of strands | 2 (double helix) | 1 (single strand)
Pentose sugar | Deoxyribose | Ribose
Bases | A, T, G, C | A, U, G, C (U replaces T)
Stability | Very stable, built to last | Less stable, built to be disposable
Examples | Chromosomes, plasmids | mRNA, tRNA, rRNA

You should be able to sketch deoxyribose and ribose and label the difference: deoxyribose lacks the –OH on its 2' carbon (the "deoxy" part).

A1.2.8 · Why complementarity matters

One strand encodes the other.

Because the pairings are obligate, each strand is a perfect template for the other. This is what makes replication and gene expression possible.

A=T (2 H-bonds)

Adenine pairs with thymine

The chemistry of thymine only allows adenine to bond with it — and they form two hydrogen bonds. In RNA, adenine pairs with uracil instead (also two H-bonds).

G≡C (3 H-bonds)

Guanine pairs with cytosine

Guanine pairs only with cytosine, held by three hydrogen bonds — slightly stronger than A=T. DNA regions rich in G–C are harder to denature.

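Two processes that rely on it

  • Replication. Each DNA strand acts as a template for a new complementary strand.
  • Gene expression. A DNA strand acts as a template for a complementary RNA copy.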
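
As a rough illustration of how one strand determines its partner, here is a minimal Python sketch of the pairing rules. The function names and the six-base template are made up for the example. Partners are listed position by position, so the 5' and 3' labels are ignored here.

```python
# Pairing rules from the text: A with T (or U in RNA), G with C.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}  # DNA template base -> RNA base
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}           # hydrogen bonds per base pair


def complementary_dna(template: str) -> str:
    """Partner base for every position of a DNA template (replication)."""
    return "".join(DNA_PAIR[base] for base in template)


def transcribe(template: str) -> str:
    """RNA base that pairs with every position of a DNA template."""
    return "".join(RNA_PAIR[base] for base in template)


def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds holding this strand to its partner."""
    return sum(H_BONDS[base] for base in strand)


template = "TACGGC"  # hypothetical template strand
print(complementary_dna(template))  # ATGCCG
print(transcribe(template))         # AUGCCG
print(hydrogen_bonds(template))     # 16
```

The last line shows why G–C-rich regions are harder to denature: swapping an A–T pair for a G–C pair adds one hydrogen bond to the total.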

A1.2.9 · Information capacity

An essentially limitless code.

Any length is possible. Any sequence is possible. So the number of possible DNA molecules is astronomically large — and so is the amount of information one cell can carry.

Human chromosomes
23 pairs

Total DNA per cell ≈ 2 metres when stretched out.

Longest human chromosome (1)
249 M bp

~249 million base pairs, the longest in our genome.

Shortest human chromosome (21)
48 M bp

~48 million base pairs — the smallest autosome.

Longest human gene (dystrophin)
2.3 M bp

Codes for the muscle protein dystrophin. Mutations cause muscular dystrophy.

Genes are sequences of DNA that code for specific proteins or functional RNA molecules. The shortest human gene (coding for a tRNA) is only 76 nucleotides long; the average is 10,000 – 15,000 nucleotides. One set of 23 human chromosomes holds about 3 billion base pairs, so the 23 pairs in a typical body cell hold roughly 6 billion. All of it fits inside the nucleus, and is copied every time a cell divides.
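
To put numbers on "essentially limitless", the sketch below counts the possible sequences of a given length and estimates the storage capacity of about 3 billion base pairs. It assumes each position is a free choice among four bases, so each base carries at most 2 bits. The function names are illustrative.

```python
def possible_sequences(length: int) -> int:
    """Number of distinct base sequences of a given length (4 choices per position)."""
    return 4 ** length


def information_bits(length: int) -> int:
    """Upper bound on information content: 2 bits per base, since 2 ** 2 == 4."""
    return 2 * length


print(possible_sequences(3))   # 64
print(possible_sequences(10))  # 1048576

genome_length = 3_000_000_000  # about 3 billion base pairs
print(information_bits(genome_length) / 8 / 1_000_000)  # 750.0 (megabytes)
```

Even a sequence of only 10 bases has over a million possible versions, and the count multiplies by four with every base added.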

A1.2.10 · Universal common ancestry

One code, every kingdom.

Bacteria, archaea, plants, fungi, animals: almost without exception, all use the same 64 codons to specify the same 20 amino acids. This shared code is the single strongest piece of evidence that modern life traces back to a single common ancestor.

The code is better described as near-universal, because there are a few small exceptions: some codons are read differently in certain mitochondrial genomes and in a handful of ciliate protozoans. But the rule holds for the overwhelming majority of life: AUG = methionine in E. coli, in yeast, in oak trees, in you. Translate the same mRNA coding sequence on an E. coli ribosome or on a human ribosome and you get the same amino acid sequence.

This is why we can put a human gene into bacteria and have them make human insulin (the basis of recombinant biotechnology). It's also why we can be confident that life on Earth has a single deep ancestor: LUCA, the Last Universal Common Ancestor, which lived roughly 3.5 to 4 billion years ago.

HL extension

Higher Level only.

An extra 5 sub-topics for HL — same syllabus, deeper mechanism.

HL only

Directionality of RNA and DNA


HL only

Purine-to-pyrimidine bonding as a component of DNA helix stability


HL only

Structure of a nucleosome


HL only

Evidence from the Hershey–Chase experiment for DNA as the genetic material


HL only

Chargaff’s data on the relative amounts of pyrimidine and purine bases across diverse life forms


A1.2.11 · Directionality

5' to 3'. Always.

Both DNA polymerase and RNA polymerase can only add nucleotides in one direction: 5' → 3'. This single constraint shapes everything about how DNA is replicated, transcribed and translated.

Numbering the carbons

In ribose and deoxyribose, the carbons are numbered 1' through 5' going clockwise from the oxygen atom:

  • 1' — the nitrogen base attaches here.
  • 2' — bears –H in deoxyribose, –OH in ribose. The single difference that names the two sugars.
  • 3' — the next nucleotide's phosphate attaches here in the chain.
  • 5' — bears the nucleotide's own phosphate group.

Why directionality matters

  • Replication. DNA polymerase reads its template 3' → 5', and synthesises the new strand 5' → 3'. This is why one daughter strand (leading) is built continuously and the other (lagging) in Okazaki fragments.
  • Transcription. RNA polymerase moves along its DNA template in the 3' → 5' direction, producing mRNA in the 5' → 3' direction.
  • Translation. Ribosomes read mRNA from 5' end to 3' end. The 5' end enters the ribosome first.
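
The replication and transcription rules above can be sketched in a few lines of Python. This is a minimal illustration, assuming sequences are written 5' → 3' by convention; the function name and the four-base template are hypothetical.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def new_strand_5_to_3(template_5_to_3: str) -> str:
    """Return the strand built on a DNA template, written 5' to 3'."""
    # The polymerase reads the template from its 3' end, so walk it backwards
    # and add the complementary base at each step.
    return "".join(PAIR[base] for base in reversed(template_5_to_3))


template = "ATGC"                   # 5'-ATGC-3'
print(new_strand_5_to_3(template))  # GCAT, i.e. 5'-GCAT-3'
```

Written antiparallel, 5'-ATGC-3' sits opposite 3'-TACG-5', which is the same molecule as 5'-GCAT-3'. The reversal is the only difference from a simple position-by-position complement.
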
A1.2.12 · Helix stability

Purine + pyrimidine = constant width.

The DNA double helix is exactly the same width all the way down the molecule — regardless of the sequence — because every base pair is a purine paired with a pyrimidine.

Purines

A, G · two rings

Adenine and guanine. Larger molecules built from two fused rings.

Pyrimidines

C, T (U) · one ring

Cytosine, thymine (uracil in RNA). Smaller molecules with a single ring.

A purine paired with a pyrimidine gives every base pair the same width. A purine paired with another purine would be too wide; a pyrimidine paired with another pyrimidine would be too narrow. This is why A pairs only with T (purine + pyrimidine) and G pairs only with C (purine + pyrimidine), never A with G or T with C. (Size alone would also allow A with C or G with T; those pairs are ruled out because their hydrogen-bonding groups do not line up.)

The consequence: the helix has the same three-dimensional structure regardless of base sequence. This is what lets the same enzymes replicate and transcribe any stretch of DNA.
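
The width argument can be made concrete by counting rings. The sketch below uses ring count only as a stand-in for size, which is a simplification; the function name is illustrative.

```python
# Rings per base: purines have two fused rings, pyrimidines have one.
RINGS = {"A": 2, "G": 2, "C": 1, "T": 1, "U": 1}


def rings_in_pair(base_1: str, base_2: str) -> int:
    """Total number of rings spanning the gap between the two backbones."""
    return RINGS[base_1] + RINGS[base_2]


for pair in ("AT", "GC", "AG", "TC"):
    print(pair, rings_in_pair(*pair))
# AT 3
# GC 3
# AG 4  (two purines: too wide)
# TC 2  (two pyrimidines: too narrow)
```

Both pairs found in DNA span three rings, which is why the helix keeps the same width whatever the sequence.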

A1.2.13 · Nucleosomes

Packaging 2 metres of DNA into a 10-μm nucleus.

The nucleosome is the first level of DNA packaging: a unit of about 147 base pairs of DNA wrapped almost twice around a core of eight histone proteins, with one more histone holding the structure in place.

  • The core. Eight histone proteins (two copies each of H2A, H2B, H3, H4) form an octamer. They are positively charged, which lets them bind the negatively charged DNA phosphate backbone electrostatically.
  • The wrap. DNA loops around the core almost twice.
  • The lock. A separate "linker" histone (H1) sits on the outside and stabilises the wrapped structure.
  • Why it matters. Nucleosomes enable the supercoiling of DNA during cell division (so 2 metres of DNA fits inside a 10-μm nucleus) and also regulate gene expression — when DNA is tightly wrapped on a nucleosome, transcription enzymes can't reach it.
A1.2.14 · Hershey & Chase (1952)

The experiment that settled the debate.

In the 1950s, biology was split: was the genetic material protein or DNA? Hershey and Chase used the T2 bacteriophage and two clever radioactive labels to settle it.

The setup

The T2 bacteriophage is a virus that infects E. coli. Its structure is simple: a DNA molecule inside a protein coat. When it infects a bacterium, the DNA enters the cell and the protein coat stays outside, attached to the surface.

Hershey and Chase produced two batches of radioactive viruses:

  • Batch 1 · radioactive ³²P — in the phosphate of DNA only (proteins contain no phosphorus).
  • Batch 2 · radioactive ³⁵S — in the sulfur of proteins only (DNA contains no sulfur).

The blender trick

Each batch was used to infect a separate flask of E. coli. After infection, the cultures were violently agitated in a kitchen blender — shaking the empty protein coats loose from the bacterial surface — then centrifuged. The bacteria (with whatever had entered them) pelleted at the bottom; the loose protein coats stayed in the supernatant.

The result

  • ³²P (DNA-labelled batch). Radioactivity was found mostly in the pellet, with the bacteria. DNA had entered the cells. Newly produced phage in the supernatant were also radioactive, meaning the bacteria had copied the DNA.
  • ³⁵S (protein-labelled batch). Radioactivity stayed mostly in the supernatant — outside the cells. Protein had not entered the bacteria, and therefore could not be the genetic material.
🧪

Nature of Science · technology enables experiments

The experiment was only possible because radioisotopes had recently become available to researchers. New tools open new questions — a recurring pattern in the history of biology.

A1.2.15 · Chargaff & falsification

The data that killed a hypothesis.

The early "tetranucleotide hypothesis" said DNA was a boring repeating unit. Chargaff's measurements showed it couldn't be — and his pattern eventually let Watson and Crick crack the helix.

The tetranucleotide hypothesis

In the early 20th century, Phoebus Levene proposed that DNA was a repeating unit of all four bases in equal amounts — A, T, G, C, A, T, G, C, …. If that were true, every DNA sample would contain 25% of each base.

Chargaff's measurements

Erwin Chargaff measured the relative amounts of the four bases in DNA from many different species. Two findings:

  • The four bases were not present in equal amounts — directly falsifying the tetranucleotide hypothesis.
  • But there was a pattern: in every species he tested, the amount of A always equalled the amount of T, and the amount of G always equalled the amount of C. This is now called Chargaff's rule.

Watson and Crick used Chargaff's rule directly: if A=T and G=C across all species, the simplest explanation was that the molecule contains paired strands with A always opposite T and G always opposite C. The double helix fell out of that constraint.
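
The link between base pairing and Chargaff's rule can be checked with a small Python sketch. The sequence is invented for illustration and is not Chargaff's data; the function name is hypothetical.

```python
from collections import Counter

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def double_stranded_counts(strand: str) -> Counter:
    """Count the bases in a strand plus its complementary partner strand."""
    partner = "".join(PAIR[base] for base in strand)
    return Counter(strand + partner)


counts = double_stranded_counts("AAGCGTA")  # made-up sequence
print(counts["A"] == counts["T"], counts["G"] == counts["C"])  # True True
print(counts["A"] == counts["G"])                              # False
```

Whatever sequence is chosen for the first strand, pairing forces A = T and G = C in the double-stranded molecule. Nothing forces A = G, so the four bases need not each make up 25%, which is what the tetranucleotide hypothesis required.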

Nature of Science · induction and falsification

Induction is the move from specific observations to a general hypothesis (Levene's tetranucleotide idea was inductive). The problem: induction can never prove a hypothesis — there's always a next observation that might falsify it. Falsification (Karl Popper's principle) reverses the logic: hypotheses are accepted as long as they survive attempts to break them. A single counter-example can disprove a hypothesis decisively. Chargaff's data did exactly that to the tetranucleotide hypothesis.

HL-only key terms

Purines · Pyrimidines · Nucleosome · Histones · Tetranucleotide hypothesis · Radioisotopes · 5’ to 3’ · Chargaff’s rule · Induction · Falsification

Vocabulary

19 terms to own.

If you can't define one of these in a sentence, that's where to revise next.

Genetics · DNA · RNA · Nucleotides · Phosphate · Deoxyribose · Ribose · Nitrogen base · Genetic code · Polymer · Condensation reaction · Complementary base pairs · Polynucleotides · Antiparallel · DNA Replication · Gene expression · Virus · Homeostasis · Metabolism

IB Linking Questions

“What makes RNA more likely to have been the first genetic material, rather than DNA?”

“How can polymerization result in emergent properties?”