In 1972, Horace Barlow, great-grandson of Darwin, wrote an article in which he put forth “A neuron doctrine for perceptual psychology?” (The question mark at the end shows that Barlow was a contemplative guy.) With this doctrine, Barlow tried to relate the firing of neurons in sensory pathways to subjectively experienced sensation. One of the principles he introduced is that at progressively higher levels of sensory processing, information is carried by fewer neurons, because the system is organized to achieve as complete a representation as possible with the fewest active neurons. In other words, the encoding of sensory information gets ‘sparser’ as one moves up into higher levels of sensory processing. Since then, a number of lines of evidence have converged, supporting Barlow’s proposition and the general value of ‘sparse codes.’
On the blog today I’ll be reviewing one such paper, “Sparse coding of sensory inputs” by Bruno Olshausen and David Field. This paper defines ‘sparse coding’ as a computational strategy in which brains encode sensory information using only a small number of active neurons at any given time.
This paper puts forth four ideas for why sparse coding theoretically might be a good strategy:
- Stores more memories
- Makes use of the statistical structure of natural signals
- Represents data in a convenient way for further processing
- Saves metabolic energy by decreasing neuronal firing rates
When model neurons are trained to optimize sparse representations of natural scenes, the receptive fields that emerge resemble those of simple cells in the primary visual cortex.
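To make “optimizing a sparse representation” a bit more concrete, here is a minimal Python sketch. It only covers inferring sparse activities for a fixed, hand-picked dictionary, not learning receptive fields from natural scenes. It also uses iterative soft-thresholding (ISTA) as a stand-in, which is not necessarily the optimization used in the paper. The function names, the toy dictionary, and the parameter values are all assumptions made for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink a value toward zero; values within the threshold become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sparse_code(signal, dictionary, sparsity_weight=0.1, iterations=500):
    """Approximately minimize
    0.5 * ||signal - sum_k a_k * basis_k||^2 + sparsity_weight * sum_k |a_k|.

    dictionary is a list of unit-length basis vectors (hypothetical "receptive fields").
    """
    n_units = len(dictionary)
    step = 1.0 / n_units  # safe step size when every basis vector has unit length
    activity = [0.0] * n_units
    for _ in range(iterations):
        # Reconstruct the signal from the current activities.
        reconstruction = [
            sum(activity[k] * dictionary[k][i] for k in range(n_units))
            for i in range(len(signal))
        ]
        residual = [s - r for s, r in zip(signal, reconstruction)]
        # Gradient step on the reconstruction error, then shrink toward zero.
        activity = [
            soft_threshold(
                activity[k] + step * sum(b * e for b, e in zip(dictionary[k], residual)),
                step * sparsity_weight,
            )
            for k in range(n_units)
        ]
    return activity


c = 2 ** -0.5
# Toy overcomplete dictionary: four unit vectors for a two-dimensional signal.
toy_dictionary = [[1.0, 0.0], [0.0, 1.0], [c, c], [c, -c]]
toy_activity = sparse_code([1.0, 1.0], toy_dictionary)
print(toy_activity)
```

In this toy case the activity ends up concentrated on the single diagonal unit aligned with the signal (about 1.31), while the other three units are driven to zero. That is the sense in which an overcomplete code can be sparse.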
One interesting feature of sensory representations that this paper points out is that as visual signals move from the thalamus to the visual cortex, there is a 25:1 expansion (of axonal projections to cortex versus axonal projections to the LGN of the thalamus). They suggest that this 25:1 expansion possibly emerges as a compromise between two trade-offs: high sparseness eventually ends in ‘grandmother cells,’ where a single unique neuron represents each element of a sensory input, while low sparseness incurs the developmental and metabolic costs of having to use many neurons to encode each element. Additionally, the more a neuron spikes, the more energy is needed to maintain its electrochemical gradients through pumps like the sodium-potassium ATPase. From what we know about the metabolic use of the cortex, scientists have estimated that only 1/50th of neurons are active at any given time.
Another interesting point this paper makes is that some experimental results show neurons with much higher firing rates than would be predicted by metabolic estimates. Because these experiments often involve searching for firing neurons with an electrode, we may be systematically biasing our studies towards a minority of neurons that fire “less sparsely” than the general population. Solutions to this bias include chronically implanting electrodes whose positioning is set anatomically, or using antidromic stimulation, rather than stimulus-elicited firing, to identify neurons.
Sparse coding beyond sensory systems
This paper also points out that sparsely firing neurons are observed in motor cortex during movements, and that experimentally driving a single neuron can be enough to initiate whisker movements in rats (Brecht et al., 2004). In the zebra finch song production pathway, HVC neurons fire sparsely at precise points in song, and precise spike-timing in RA may be important as well.
How does one actually measure “sparseness”?
Olshausen and Field write that a standard measure of “sparseness” is kurtosis, with a larger value indicating a “sparser” distribution.
They also describe another method developed by Rolls and Tovee, the activity ratio, which is specialized for one-sided distributions (and therefore good at modeling neurons, since firing rate cannot drop below zero).
Finally, the activity ratio a can be rescaled to run from 0 to 1 using Vinje and Gallant’s sparseness transformation, S = (1 − a) / (1 − 1/n), where n is the number of responses being summarized.
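The three measures described above can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook forms of excess kurtosis, the activity ratio, and the Vinje and Gallant rescaling. The function names and the toy firing rates are made up for the example and do not come from the paper.

```python
def kurtosis(rates):
    """Excess kurtosis of a set of responses (larger means a 'sparser' distribution)."""
    n = len(rates)
    mean = sum(rates) / n
    variance = sum((r - mean) ** 2 for r in rates) / n
    fourth_moment = sum((r - mean) ** 4 for r in rates) / n
    return fourth_moment / variance ** 2 - 3.0


def activity_ratio(rates):
    """Activity ratio: (mean rate)^2 / mean(rate^2). Near 1 is dense, near 0 is sparse."""
    n = len(rates)
    return (sum(rates) / n) ** 2 / (sum(r * r for r in rates) / n)


def vinje_gallant_sparseness(rates):
    """Rescale the activity ratio so that 0 is maximally dense and 1 is maximally sparse."""
    n = len(rates)
    return (1.0 - activity_ratio(rates)) / (1.0 - 1.0 / n)


# Hypothetical firing rates (spikes/s) of one neuron across 100 stimuli.
sparse_rates = [10.0] + [0.0] * 99   # responds to a single stimulus
dense_rates = [4.0, 5.0, 6.0, 5.0] * 25  # responds similarly to everything

for label, rates in (("sparse", sparse_rates), ("dense", dense_rates)):
    print(label, kurtosis(rates), activity_ratio(rates), vinje_gallant_sparseness(rates))
```

A neuron that responds to only one stimulus scores 1 on the rescaled measure, while a neuron that responds about equally to everything scores close to 0.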
I think there is compelling evidence that brains use sparse coding in sensory systems, and there are good theoretical reasons why brains should use sparse coding. It will be interesting to see if these findings hold up for motor systems as well. If researchers find evidence of sparse coding in relatively simple and blunt movements like locomotion, I would guess that sparse codes will be even more prevalent in higher premotor areas encoding complicated and learned movements.
A big issue in machine learning is how to process the data before unleashing machine learning algorithms on it. I think neurophysiology shows that a good strategy might be to duplicate the data and turn it into overcomplete sparse representations. I assume scientists working with big data are already doing this, but honestly I have no idea.
Barlow HB: Single units and sensation: a neuron doctrine for perceptual psychology? Perception 1972, 1:371-394.
Brecht M, Schneider M, Sakmann B, Margrie TW: Whisker movements evoked by stimulation of single pyramidal cells in rat motor cortex. Nature 2004, 427:704-710.
Olshausen BA, Field DJ: Sparse coding of sensory inputs. Curr Opin Neurobiol 2004, 14:481-487.
Rolls ET, Tovee MJ: Sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. J Neurophysiol 1995, 73:713-726.
Vinje WE, Gallant JL: Sparse coding and decorrelation in primary visual cortex during natural vision. Science 2000, 287:1273-1276.
 I had always thought of sparse coding from the perspective of individual neurons, as a strategy where a given sensory neuron fires only to a very specific stimulus or aspect of a stimulus, and therefore fires ‘sparsely.’ I realized in reading this that this paper’s definition and my mental understanding are really two sides of the same coin if the neurons are firing independently. However, even a population of neurons that fire only very rarely might not be ‘sparsely coding’ by Olshausen and Field’s definition if they all fire together and then fall silent together.
 How does that neuron spur the rest of the brain into action and what occurs when that neuron dies?
 This is the same Gallant who did some of the coolest experiments ever decoding neural activity from fMRI activation: https://neuroamer.wordpress.com/2011/10/31/scientists-record-lucid-dreams-with-eeg-and-fmri-simultaneously/
Filed under: Uncategorized | Leave a Comment
First off, I wanted to say I’m working in a songbird lab now, so while I’m keeping this a general neuroscience blog, you’re probably going to start seeing more blogposts about bird brains.
So, is the bird vocal learning pathway specialized for song and independent from other tasks? A new paper by the Okanoya lab addresses this very question.
What is the vocal learning pathway?
We know that the vocal learning pathway resembles general thalamo-cortico-basal ganglia circuitry, but it’s generally thought to be very specialized for song because:
- lesion studies show brain areas in the pathway are necessary for song
- the brain areas are much larger in males (and only males sing)
- the areas are much larger in songbirds than birds that aren’t vocal learners
- neuronal firing in these areas is modulated during singing (and in some conditions during playback of the bird’s own song), as shown by direct neural recordings and early gene expression studies.
One of the brain areas in the vocal learning pathway is the striatum-like Area X, which appears to be a specialized structure surrounded by the more general avian medial striatum (MSt). Like the mammalian striatum, Area X receives strong dopaminergic inputs, but unlike the classic thalamo-cortico-basal ganglia circuit, Area X does not reciprocally connect to the dopaminergic ventral tegmental area (VTA) or substantia nigra pars compacta (Person et al., 2008). It is unclear whether parts of the song system evolved out of basal ganglia circuits, but I believe all known vertebrates have basal ganglia, making the basal ganglia one of the most conserved brain regions.
Testing if Area X performs basal ganglia-like calculations
Because electrophysiological studies in mammalian striatum and avian MSt have shown that neurons fire in response to stimuli signaling food rewards and the rewards themselves, the Okanoya lab wondered whether Bengalese finch Area X neurons are modulated by food rewards as well. They designed two operant conditioning studies for birds to perform while they recorded from neurons in Area X.
In the first experiment, if the birds pecked at a red LED while it was illuminated, they were rewarded with food 50% of the time. If the birds pecked at the LED while it was off, they received no food, and there was a greater delay before it illuminated again.
In the second experiment, the LED lit up either red or green, each with 50% probability, in a random sequence. If the bird pecked the LED while it was red, it received food 100% of the time. If the LED was green when the bird pecked it, the bird received no food; however, the bird was required to peck it to proceed to further trials. This way experimenters could separate a visual stimulus from its association with reward and also control for the general motor movement of pecking. However, it seems to me that to properly learn the task, the bird must learn to peck the green LED to continue on in the trial, see more red lights, and get more food rewards.
After the animals learned these tasks, they performed the tasks while experimenters recorded extracellularly from neurons in Area X that were previously shown to be modulated by song (n=19 neurons for task one and n=25 neurons for task two), and from a few neurons that were not modulated by song (when looking at the inter-spike interval).
Results from Experiment 1
The paper shows two examples of Area X neurons firing during the task. Many neurons showed differential responses between reward trials (when the peck was followed by food) and non-reward trials (when it wasn’t). Standard deviations in spiking rate were significantly greater in reward trials than in non-reward trials (p < 0.001).
Results from Experiment 2
In experiment 2, birds were slower to key-peck the non-rewarded key. As in experiment 1, the SDs of firing rate were significantly higher in reward trials than in non-reward trials. Neurons recorded in experiment 2 fired differently on the red LED reward trials than on the green LED no-reward trials, before the birds had even pecked or received their food reward.
I’m going to ramble a bit here, to leave some notes for myself…
I think the authors did a fairly convincing job of showing that Area X neurons are modulated by non-song related tasks. How important this signaling is for normal behavior is unclear. The majority of neurons they recorded from seemed to modulate their firing much more during song.
I wonder if Area X might be not a song learning area per se, but an elaborated thalamo-cortico-basal ganglia loop that is specialized for beak and vocal organ movements. I wonder if the modulation seen in this study could relate to motor planning for the actual eating movements. For example, the birds showed a much longer delay in pecking for the unrewarded condition in experiment 2. Maybe the difference in activity upon seeing the reward or non-reward light related to a difference in the movement made, as opposed to directly coding the reward value of the visual signal. Monkey studies sometimes try to control for this by training the animals to respond to non-reward trials as quickly as to reward trials. As the experimenters mention, monkey studies also try to separate cognitive processes from related motor activity by recording directly from muscles with electromyography (EMG).
Another way to try to minimize this confound would be to change the reward from a food pellet to some sort of juice delivery (though the bird would still have to swallow), or perhaps an IV drug delivery, though the drug’s effect on dopamine systems might also modulate Area X activity. Similarly, perhaps the changes in activity we see are due to dopamine signaling, but will be irrelevant to the plasticity of synapses in this circuit because they are not directly locked to a movement. So perhaps Area X is song-specific, but VTA and SNpC are not: Area X is slightly modulated by VTA and SNpC but can effectively ignore their signaling.
One strange part of this study is that they implanted the birds with testosterone-secreting tubes to increase the rate of song. However, as they mention, testosterone has been shown to change behavior in some cognitive tasks, so it could confound the results. Getting Bengalese finches to sing with electrodes in their brains is challenging, but it is done all the time without using testosterone supplementation.
A possible confound of these experiments is that the birds only heard the mechanical feeder sound on the trials they received food. A better control would be to have the exact same sounds, to make sure the Area X neurons are not being modulated by sound.
Finally, I’m curious about the converse of this question. Is the rest of the striatum modulated during song, similarly to how this study showed Area X was modulated by non-song stimuli?
Hope you enjoyed this article. Please check out other posts on my blog. If you liked this blogpost about songbirds, you might also like: Of mice and birds – what is a zebra finch, really?
Person, A.L., Gale, S.D., Farries, M.A. & Perkel, D.J. (2008) Organization of the songbird basal ganglia, including area X. J. Comp. Neurol., 508, 840–866.
Feenders, G., Liedvogel, M., Rivas, M., Zapka, M., Horita, H., Hara, E., Wada, K., Mouritsen, H. & Jarvis, E.D. (2008) Molecular mapping of movement-associated areas in the avian brain: a motor theory for vocal learning origin. PLoS One, 3, e1768.
Seki, Y., Hessler, H.A., Xie, K. & Okanoya, K. (2014) Food rewards modulate the activity of song neurons in Bengalese finches. Eur. J. Neurosci., 39(6), 975–983.
 (also known as the anterior-forebrain pathway, AFP)
Zebra finch genetics
The zebra finch, Taeniopygia guttata, is an Australian songbird with a black and white striped breast. It is used by neuroscientists as a model organism to study the learning and production of a complex motor behavior–birdsong. Like other songbirds, zebra finches are genetically predisposed to learn their species’ particular song, but they have to hear the song being produced and slowly learn to mimic it through a trial and error process, similar to how babies learn to speak.
So out of the over 5,000 identified passerine songbirds, why did neuroscience choose the zebra finch as its model songbird? For reasons similar to why biology chose the mouse as a model mammal: zebra finches are small, easy to care for, and breed well in captivity. However, with any lab-bred animals there is a risk of inbreeding, which could alter the results of studies and effectively create subspecies, reducing comparability between labs.
Lab bred animals – inbred strains of mice
To reduce genetic variability within studies and increase comparability between labs, mouse geneticists have intentionally mated strains of mice to be inbred. By breeding mice with their siblings for at least twenty generations, they generated animals homozygous at essentially every locus. In other words, all of the animals born from this process have the same genome, with the same copies of each gene on each chromosome. These animals are essentially genetic clones of one another, and if they reproduce with each other their young will be clones as well. Through this process we created standardized inbred strains of mice such as C57BL/6J or BALB/cByJ that are used in labs around the world and maintained by organizations such as The Jackson Laboratory.
Inbred strains have been essential for research; however, they can sometimes produce idiosyncratic results that don’t generalize to other strains, let alone to mammals in general. For example, knocking out a gene in one strain may have an obvious phenotype that is conspicuously absent in another, likely because other genes can compensate in the second strain. In extreme cases, inbreeding can fundamentally alter normal characteristics of the species and predispose animals to pathological states. For example, BALB/cWah1 mice are predisposed to lack a corpus callosum (CC): about 20% of these mice have no CC and another 20 to 30% have an unusually small one. Scientists have therefore argued that while data generated from inbred strains might be more reproducible, it may also be less generalizable. You have reduced experimental noise and you can reproduce the same results, but are they robust findings that generalize to all mice, or only to a particular strain?
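As a rough illustration of how quickly repeated sibling mating drives a line toward homozygosity, here is a small Python sketch. It uses the standard textbook recurrence for the inbreeding coefficient under full-sib mating, F(t) = 1/4 · (1 + 2·F(t−1) + F(t−2)). The function name is made up for this example, and the model assumes unrelated, non-inbred founders and ignores new mutations and selection.

```python
def sib_mating_inbreeding(generations):
    """Expected inbreeding coefficient after each generation of full-sib mating.

    Returns a list whose element g-1 is the coefficient after g generations,
    assuming the founding pair is unrelated and not inbred.
    """
    f_two_back, f_one_back = 0.0, 0.0
    coefficients = []
    for _ in range(generations):
        f_now = 0.25 * (1.0 + 2.0 * f_one_back + f_two_back)
        coefficients.append(f_now)
        f_two_back, f_one_back = f_one_back, f_now
    return coefficients


inbreeding = sib_mating_inbreeding(20)
for generation in (1, 5, 10, 20):
    print(generation, round(inbreeding[generation - 1], 3))
```

Under this simple model, ten generations of sibling mating give an expected inbreeding coefficient of about 0.89, and twenty generations bring it to roughly 0.99.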
Are zebra finches inbred?
There have been no efforts (to my knowledge) to generate an inbred laboratory strain of zebra finches. However, there could potentially have been bottlenecks during the process of domestication and shipping birds between continents that have reduced genetic diversity within laboratory populations of the species. To investigate the population genetics of laboratory zebra finches, researchers genotyped 1000 zebra finches from 18 laboratory populations (held in Europe, North America, and Australia) and two wild populations at 10 microsatellites. Microsatellites are sequences of repeats of 2-5 base pairs that tend to be quite polymorphic, making them good markers for genetic diversity.
As might be expected, laboratory populations showed a loss of genetic variability due to genetic drift. If the frequency of a particular allele ever drifts to zero in a population, that allele is lost from the population. Lab populations on average had roughly 60% as many alleles per locus as wild zebra finches: 11.7 for captive versus 19.3 for wild. The most inbred population studied, a population bred for the recessive trait of white plumage, showed only 6.4 alleles per locus, roughly a third of the alleles per locus seen in the wild populations. The researchers also found genetic differences between their European and North American zebra finch populations, and pointed out that there could be functional differences between these populations as well.
The study looked at some functional differences between populations of zebra finches. Populations varied greatly in body mass, with some laboratory populations weighing almost double that of newly domesticated populations (and although body mass may be affected by animal husbandry and housing, this result was true even for three genetically separate populations housed in the same lab). Body weight can be important for scientists who wish to place electrodes, cannulae, or other devices on birds’ heads, but it was mainly used as an easy and reliable measure to show real differences between the populations. It is possible there are also differences in brain structures and in genes involved in learning and plasticity that may affect song structure. Such differences are more difficult to demonstrate, but also important to know about, as they could affect and possibly complicate results from studies.
In sum, the laboratory populations of zebra finches studied are somewhat inbred, but still show diversity. If studies conflict between North American and European laboratories, we should consider whether genetic effects might explain the differences. Perhaps we should begin inbreeding strains of zebra finches to facilitate genetic studies in the future. Also, we may need to investigate other lab-bred species of songbirds used by researchers, such as Bengalese finches, which have been bred in captivity for centuries and selectively bred for their plumage.
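For readers unfamiliar with the “alleles per locus” statistic, here is a minimal Python sketch of how it could be counted from genotype data. The function name and the toy genotypes are hypothetical. It is a raw count that ignores the sample-size corrections a real population-genetics analysis would likely apply. The last two lines simply redo the arithmetic on the averages quoted above.

```python
def mean_alleles_per_locus(genotypes_by_locus):
    """Average number of distinct alleles observed per locus.

    genotypes_by_locus maps a locus name to a list of (allele, allele) pairs,
    one pair per genotyped bird.
    """
    counts = []
    for genotypes in genotypes_by_locus.values():
        distinct_alleles = {allele for genotype in genotypes for allele in genotype}
        counts.append(len(distinct_alleles))
    return sum(counts) / len(counts)


# Hypothetical microsatellite genotypes (repeat counts), not data from the study.
toy_population = {
    "locus_A": [(12, 14), (12, 12), (16, 14)],  # alleles 12, 14, 16
    "locus_B": [(8, 8), (8, 10), (10, 10)],     # alleles 8, 10
}
print(mean_alleles_per_locus(toy_population))  # prints 2.5

# Share of wild allelic diversity retained, from the averages quoted in the text.
captive_fraction = 11.7 / 19.3  # about 0.61
white_fraction = 6.4 / 19.3     # about 0.33
print(round(captive_fraction, 2), round(white_fraction, 2))
```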
 (Parrots and hummingbirds are also vocal learners.)
 This process of sibling-sibling breeding is a form of close inbreeding, similar in spirit to “selfing” (self-fertilization) in plants. The Ancient Egyptians, believing the royal family to be gods, bred brothers and sisters for 10 generations, and likely created humans homozygous at many loci.
 Of course, due to the sex chromosomes, males and females will have different genomes, and a small number of new mutations occur with each generation.
 Though they are far from perfect: http://en.wikipedia.org/wiki/Microsatellite#Limitations. Although mutations can complicate analyses, previous studies by Forstmeier et al. demonstrated that they are extremely rare in zebra finches: they did not observe a single event at their microsatellite loci in 7368 parent-offspring comparisons.
Forstmeier W, Segelbacher G, Mueller JC, Kempenaers B. Genetic variation and differentiation in captive and wild zebra finches (Taeniopygia guttata). Mol Ecol. 2007 Oct;16(19):4039-50.
Alzheimer’s disease (AD) is by far the number one cause of dementia, and the number one risk factor for AD is age–with almost 50% of people above the age of 85 suffering from the disease.
With the aging baby boomer generation and medical advances squaring off the life expectancy curve, the United States is projected to have over 9 million cases of AD by 2050. This huge expected increase in cases has been dubbed by some as the “Alzheimer’s Disease Epidemic.”
Alzheimer’s disease is marked primarily by memory loss and inability to form new memories, but it can also impair language, higher thinking, visuospatial skills, and can even cause personality changes and delusions. By erasing patients’ identities, memories, and personalities, Alzheimer’s disease robs patients of their humanity, and can be devastating for the friends and family watching the disease unfold.
Alzheimer’s disease was first described in 1901 in a patient, Auguste D., a 51-year-old woman suffering from debilitating memory loss, other cognitive deficits, and persecutory delusions. She underwent a progressive decline and died five years later. The case was described by Alois Alzheimer, a German psychiatrist and neuropathologist (though now the disease is treated by neurologists).
The fact that Alzheimer was a pathologist was of crucial importance. Because of it, he performed an autopsy on Auguste D. and discovered proteinaceous plaques and neurofibrillary tangles in the woman’s brain, which to this day remain the pathological hallmarks of Alzheimer’s disease.
Alzheimer’s mentor, Kraepelin, was the first to use the eponym “Alzheimer’s Disease” in 1910, and his writing about the disease was prescient: “Although the anatomical findings suggest that we are dealing with a particularly serious form of senile dementia… this disease sometimes starts as early as in the late forties.”
This clinical observation was borne out by later genetics. There is an early-onset familial variant of AD that is caused by rare mutations. Familial AD is devastating, striking otherwise normal, healthy patients in their 40s or younger, but luckily it accounts for a minority of cases. The later-onset variant has complex genetic and environmental risk factors and accounts for the vast majority of cases. In this variant, genetic factors such as the ApoE4 allele are known to increase the risk of Alzheimer’s, but a doctor cannot genetically predict who will get it and who won’t.
What was really interesting about the first three genes identified in familial Alzheimer’s disease is that they are part of the same pathway. These genes code for the amyloid precursor protein (APP) and presenilin 1 and 2. Amyloid precursor protein is cleaved by the “gamma secretase complex,” an enzyme that is made up of multiple subunits, including presenilin 1 and 2. If APP is cleaved at the wrong locations, it forms a protein fragment, beta-amyloid, which can induce other proteins to misfold, forming a protein aggregate that grows like a snowball rolling down a hill. This protein aggregate eventually forms a giant extracellular plaque: the very same plaques that Alzheimer described in his patient Auguste D!
The Amyloid Cascade Hypothesis
This genetic work led to the amyloid cascade hypothesis: that extracellular amyloid accumulates and initiates a sequence of events, eventually leading to neurotoxicity and clinical symptoms in AD. This hypothesis has inspired a number of clinical trials of drugs intended to treat or slow the course of Alzheimer’s by stopping the formation of beta-amyloid plaques or removing them. Even when trials have successfully removed patients’ plaques, patients have not shown clinical improvements. It is unclear whether the amyloid cascade hypothesis is wrong (and perhaps amyloid plaques correlate with but do not cause AD), or whether we are just starting our treatments too late, once the cascade of neurodegeneration has already been initiated, in which case these treatments could be successful if we used them on patients earlier in their disease course.
In fact we know that patients with Alzheimer’s disease don’t show clinical symptoms until late in the disease, perhaps due to a phenomenon known as cognitive reserve. New techniques, however, allow us to determine whether people showing subtle cognitive deficits are at high or low risk of developing Alzheimer’s, and therefore we can continue to test the amyloid cascade hypothesis by performing clinical trials of the drugs we know clear plaques in people who are at high risk of developing Alzheimer’s.
For example, the same proteins found in the plaques and tangles in Alzheimer’s disease brains, such as amyloid beta, make their way into the cerebrospinal fluid, which bathes the brain and spinal cord. In patients with mild cognitive impairment (MCI), meaning they are starting to show cognitive problems compared to age-matched, education-matched controls, but not problems severe enough to be classified as Alzheimer’s, we can perform lumbar punctures to harvest their cerebrospinal fluid. Then, by measuring levels of these proteins found in plaques, we can predict which of these patients with MCI are at high risk of progressing to Alzheimer’s.
Additionally, we can look at the burden and distribution of plaques in the brain using radioactive PET ligands that bind to the plaques.
The Cholinergic Hypothesis
While the previously described techniques will inevitably assist with diagnosing AD, the amyloid cascade hypothesis is still unproven, and so far it has not led to any effective therapy. Currently, one of the best treatments for Alzheimer’s disease is lifestyle modification: labeling and arranging one’s life as memory declines. The mainstay of the pharmacological approach to AD is cholinesterase inhibitors, developed in the 1990s and 2000s, which can improve performance on cognitive tests.
These drugs block acetylcholinesterase, preventing the breakdown of the neurotransmitter acetylcholine into acetate and choline. The drugs therefore increase levels of acetylcholine in the synapse and boost signaling to the postsynaptic cell.
You might be wondering, how do these drugs help treat Alzheimer’s if it is a disease caused by plaques, tangles and neurodegeneration? Why did we develop and test them in the first place?
These drugs were actually developed based on an earlier theory of Alzheimer’s: the cholinergic hypothesis.
In the 1960s and ‘70s psychologists gave young healthy subjects cholinergic inhibitors (which prevent acetylcholine from signaling to post-synaptic cells). These young healthy subjects had memory problems and cognitive problems that resembled patients with Alzheimer’s disease, showing that acetylcholine was important for memory.
Most of the brain’s supply of acetylcholine derives from the nucleus basalis of Meynert in the basal forebrain, so doctors investigated this area during the autopsies of patients with Alzheimer’s and showed that the nucleus basalis seems to be particularly vulnerable to the neurodegeneration seen in AD. Patients showed decreased numbers of healthy neurons in this area, and decreased cholinergic projections to the hippocampus and entorhinal cortex (areas we also knew were important for memory), suggesting that patients with AD may have decreased acetylcholine.
Going back to the drugs we use for Alzheimer’s, cholinesterase inhibitors are thought to boost the remaining acetylcholine signaling going on in synapses, similar to how SSRIs increase the amount of serotonin. However, because they do not prevent the neurodegeneration that causes AD, the disease continues to progress and causes massive neurodegeneration, eventually killing patients by interfering with their ability to swallow and breathe.
However, if the cholinergic hypothesis is correct, and disruption of acetylcholine signaling from the basal forebrain is especially important for the symptoms of AD, then maybe if we can selectively prevent degeneration of this area we can slow the disease. Clinical trials are currently delivering Nerve Growth Factor (NGF) to the basal forebrain using gene therapy approaches: implanting stem cells that produce NGF or using viruses to modify cells in the basal forebrain to produce it themselves. Nerve Growth Factor is known to increase the growth and vitality of cholinergic neurons, and may therefore help preserve the cells as they suffer the early insults of Alzheimer’s.
Even if Nerve Growth Factor gene therapy can just slow the disease, we can give patients more time with their memories intact to enjoy their families and a dignified, independent life. We can decrease the time families and friends spend caring for a person whose face they recognize, but whose mind becomes increasingly unfamiliar, and we can decrease the time patients have to spend institutionalized or with a full-time caretaker.
Clinical research is slow, and research is never certain, but the new approaches give hope for Alzheimer’s disease, a disease where prognoses for patients like Auguste D. have remained gloomy for over a century, and where an epidemic of new cases looms ahead in the near future.
Here are the articles I thought were tweet-worthy in July 2012. If you find the topics interesting, follow me on twitter. I really appreciate your support.
July was an interesting month, including: unconscious passwords stored in procedural memory, neuroethics of the “gay gene,” virtually THC-free (but CBD-rich) marijuana, oh and let’s not forget the Higgs Boson (puts things in historical context).
Brain: The precuneus: review – involved in self-consciousness, engaged in self-related mental representations – http://bit.ly/LO2lnZ
http://scienceblog.com/55711/children-in-foster-care-develop-resilience-through-compassion/ … children in foster care develop resilience through compassion meditation
List of movies with psychological or cognitive science themes – Indiana http://bit.ly/PIANW8
A Single Brain Structure May Give Winners That Extra Physical Edge: Scientific American – t http://bit.ly/LNKb5Q
The American Scholar: Living With Voices – T. M. Luhrmann http://bit.ly/PBJ1iN – Thought-provoking alternative view on auditory hallucinations
5-hydroxymethylcytosines (5-hmCs) were first discovered in 2009 and shown to be enriched in the brain, but they remain a mysterious epigenetic mark, despite intriguing functional findings such as the reduction of 5-hmC by environmental enrichment, MeCP2’s preference for 5-mC over 5-hmC, and its possible role as an intermediate in demethylation. A new technique will aid their characterization by allowing absolute quantification and base-resolution localization of the marks. The technique also serves as a reminder of why you should pay attention in orgo, or at least why you should collaborate with people who did!
Now, Emory’s Peng Jin has collaborated with University of Chicago chemist He Chuan to develop a new derivative of bisulfite sequencing, Tet-Assisted Bisulfite Sequencing (TAB-Seq), that distinguishes 5-hmCs from 5-mCs, as they describe in Cell.
Traditional bisulfite sequencing (MethylC-Seq):
- Sequence the sample
- Treat the sample with bisulfite, which converts all non-methylated cytosines to uracils, but leaves 5-mCs and 5-hmCs as cytosines. (The even rarer base 5-carboxy-C (5-caC), is converted by bisulfite into 5-caU.)
- Compare your first sequence to your second. The unmethylated cytosines from the first sequence will show up as Ts in the second sequence (because when the uracils are amplified, they are amplified as thymines). The methylated and hydroxy-methylated cytosines show up as Cs in both sequences.
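The readout of traditional bisulfite sequencing can be sketched as a lookup table. This is a toy model in Python, not a real analysis pipeline: the state labels and the `bisulfite_read` helper are made up for illustration, and the table only encodes the conversions listed in the steps above.

```python
# Toy model of the bisulfite readout described above.
# The state labels and function name are illustrative, not from any library.
BISULFITE_READOUT = {
    "C": "T",     # unmodified C -> U, amplified and read as T
    "5mC": "C",   # methylated C survives bisulfite
    "5hmC": "C",  # hydroxymethylated C also survives
    "5caC": "T",  # 5caC -> 5caU, read as T
}


def bisulfite_read(states):
    """Return the base read at each cytosine after bisulfite treatment."""
    return "".join(BISULFITE_READOUT[state] for state in states)


cytosines = ["C", "5mC", "5hmC", "5caC"]
print(bisulfite_read(cytosines))  # TCCT
```

Note that 5mC and 5hmC both read as C, so this method alone cannot tell them apart.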
So, how can we differentiate 5-hmCs from 5-mCs? In a process that may remind you of all those organic chemistry synthesis problems, TAB-seq involves an extra step to protect the hydroxy-methylated cytosines from TET oxidation.
- Glucosylate 5hmC using β-glucosyltransferase (βGT).
- Oxidize 5mC to 5caC with an excess of recombinant Tet1. (The blocked 5hmCs, now β-glucosyl-5-hydroxymethylcytosine (5gmC), are not oxidized.)
- Treat the sample with bisulfite. This converts the Cs and 5-caCs to Us, but doesn’t affect the 5gmC.
- Compare back with traditional bisulfite sequencing. The 5-hmCs are the bases that show up as Cs in both sequences. (5-mCs will show up as Ts in TAB-seq, but Cs in traditional. Unmodified Cs and 5-caCs will show up as Ts in both sequences.)
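The comparison logic in the last step is easy to lose track of, so here is a small Python sketch that simulates both protocols on a single cytosine and then classifies it from the two reads. It is a toy under simplifying assumptions (every reaction goes to completion, no sequencing errors), and all of the function names are hypothetical, not taken from the paper or any library.

```python
# Toy simulation of TAB-seq followed by the comparison step.
# All names here are illustrative; none come from the paper or a library.

def tab_seq_read(state):
    """Base read at one cytosine after glucosylation, Tet1 oxidation, bisulfite."""
    if state == "5hmC":
        state = "5gmC"   # step 1: betaGT protects 5hmC as 5gmC
    if state == "5mC":
        state = "5caC"   # step 2: Tet1 oxidizes 5mC to 5caC
    # step 3: bisulfite converts C and 5caC to U (read as T); 5gmC stays C
    return "C" if state == "5gmC" else "T"


def traditional_read(state):
    """Base read at one cytosine after plain bisulfite treatment."""
    return "C" if state in ("5mC", "5hmC") else "T"


def classify(traditional, tab):
    """Infer the modification from the two reads at a reference cytosine."""
    calls = {
        ("C", "C"): "5hmC",
        ("C", "T"): "5mC",
        ("T", "T"): "unmodified C or 5caC",
    }
    return calls.get((traditional, tab), "unexpected")


for state in ["C", "5mC", "5hmC", "5caC"]:
    print(state, "->", classify(traditional_read(state), tab_seq_read(state)))
```

A T in traditional sequencing paired with a C in TAB-seq should not occur under these assumptions, so the sketch flags it as "unexpected" rather than guessing.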
Validation and Findings
Validation of new techniques (proof that they work) is always important, and the paper uses mass spectrometry to show that this one works.
They validated its practicality by using the technique to map 5-hmCs in human embryonic stem cells (hESCs) and mouse embryonic stem cells (mESCs). In hESCs they found 691,414 5hmCs with a false discovery rate of 5%. Interestingly, though mice have similarly sized genomes, they found many more 5hmCs in mESCs (2,057,636), which they hypothesize is due to the higher levels of Tet1 and Tet2 proteins.
So where are 5hmCs enriched, now that we can identify them precisely? In H1 distal-regulatory elements, including p300-binding sites (observed/expected [o/e] = 7.6), predicted enhancers (o/e = 7.8), CTCF-binding sites (o/e = 5.1), and DNase I hypersensitive sites (o/e = 3.4), which are associated with active gene expression. (CTCF is a transcriptional repressor that blocks interactions between promoters and enhancers and also plays a role in stopping the spread of heterochromatin.) Because 5-hmCs are enriched at enhancers, the authors speculate that 5-hmC may be specifically recognized by transcription factors as a core base in binding motifs.
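To make the observed/expected ratio concrete, here is a minimal Python sketch. It assumes "expected" means the count you would get if 5hmCs were scattered uniformly across the genome, which may not be exactly how the paper computes it, and every number in the example is made up.

```python
def observed_over_expected(observed_in_feature, total_sites, feature_bp, genome_bp):
    """o/e ratio, assuming sites would otherwise fall uniformly across the genome."""
    # Expected count: total sites scaled by the fraction of the genome in the feature
    expected = total_sites * (feature_bp / genome_bp)
    return observed_in_feature / expected


# Hypothetical numbers: 1,000 sites genome-wide, a feature class covering 1%
# of the genome, and 76 of the sites falling inside that feature class.
ratio = observed_over_expected(
    observed_in_feature=76,
    total_sites=1000,
    feature_bp=10_000,
    genome_bp=1_000_000,
)
print(round(ratio, 1))  # 7.6
```

A ratio above 1 means the feature holds more 5hmCs than chance would predict, and a ratio below 1 means it holds fewer.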
Many genes had significant enrichment of 5hmC, but genes with low expression had more than highly expressed genes. 5hmCs showed asymmetry, with more hydroxylation on strands where the CpG was surrounded by Gs. (A similar pattern wasn’t observed for 5mCs.)
5hmCs also tended to be enriched near low CpG areas
Previous studies identified 5hmCs in high CpG areas, such as CpG island-containing promoters, but these findings are likely due to the bias of mapping techniques, which can amplify frequent weak signals and overshadow sparse but strong ones. The present study found that 5hmCs tended to be enriched in lower CpG areas, especially those with H3K4me3 or bivalent (H3K4me3 and H3K27me3) chromatin modifications, but how 5hmC interacts with the histone code is still up in the air.
It will be interesting to see if these findings generalize to different cell types, but since hESCs and mESCs showed similar patterns, it suggests that the regulation, at least in stem cells, is evolutionarily conserved.
It seems this tree will have bountiful fruit, weighing down the branches for some time. I’ll leave a final summary in the authors’ own words:
“We have developed a genome-wide approach to determine 5hmC distribution at base resolution and have generated base-resolution maps of 5hmC in both hESCs and mESCs. These maps provide a template for further understanding the biological roles of 5hmC in stem cells as well as gene regulation in general. In conjunction with methylC-Seq, the TAB-Seq method described here represents a general approach to measure the absolute abundance of 5mC and 5hmC at specific sites or genome-wide, which could be widely applied to various cell types and tissues.”
Kriaucionis, S., & Heintz, N. (2009). The nuclear DNA base 5-hydroxymethylcytosine is present in brain and enriched in Purkinje neurons. Science, 324(5929), 929-930. (Free full text.)
Szulwach, K. E., Li, X., Li, Y., Song, C.-X., Wu, H., Dai, Q., Irier, H., et al. (2011). 5-hmC-mediated epigenetic dynamics during postnatal neurodevelopment and aging. Nature Neuroscience, 14(12), 1607-16. doi:10.1038/nn.2959
Yu M, Hon GC, Szulwach KE, Song CX, Zhang L, Kim A, Li X, Dai Q, Shen Y, Park B, Min JH, Jin P, Ren B, & He C (2012). Base-resolution analysis of 5-hydroxymethylcytosine in the Mammalian genome. Cell, 149 (6), 1368-80 PMID: 22608086
Guo, J. U., Su, Y., Zhong, C., Ming, G.-li, & Song, H. (2011). Emerging roles of TET proteins and 5-hydroxymethylcytosines in active DNA demethylation and beyond. Cell Cycle, 10(16), 2662-2668. doi:10.4161/cc.10.16.17093
You may also be interested in the brief article I wrote previously about 5-hmCs and a paper that showed that they are highly enriched in the cerebellum and hippocampus (10x higher than in stem cells), and they increase with age. Further, the authors showed that MeCP2–which strongly binds the unhydroxylated and more ubiquitously expressed version, 5-methyl CpGs–does not bind 5-hmCs. Overexpression of MeCP2 even seems to block TETs from converting 5-mCs into 5-hmCs.
Just learned about oxBS-Seq, another method for sequencing; need to look into this. Does anyone know offhand the advantages/disadvantages of either?