Despite decades of research and technological advancements, the human race still battles with several disruptive—if not deadly—diseases. There are many unanswered questions, including how cells shift from a healthy state to a diseased state in the first place. We do know, however, that modifications at the cell surface play a critical role in the development of viral infections, inflammation, autoimmune diseases, and cancer. In particular, carbohydrates determine how cells communicate with one another and their environment. Because of the bewildering diversity of carbohydrates and their compounds, you need an excellent understanding of the basics before delving deeper into glycobiology.
To help you navigate this complexity, we have prepared an introductory guide to glycobiology that explains the essentials of carbohydrates and their importance in cellular functions.
We cannot talk about glycobiology without mentioning monosaccharides, the building blocks of carbohydrates.
The word monosaccharide combines the Greek monos (single) and sakcharon (sugar), so it is actually just a fancy way of saying simple sugar. Most of these sugars have the general formula Cn(H2O)n, that is, one water's worth of hydrogen and oxygen per carbon.
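To make the formula concrete, here is a minimal sketch that checks whether a sugar's atom counts fit the pattern of one water per carbon, Cn(H2O)n. The function name `fits_hydrate_formula` and the dictionary layout are illustrative assumptions. Ribose and fucose are not discussed at this point in the article; fucose, a deoxy sugar (C6H12O5), is included only to show that not every biologically relevant sugar follows the pattern.

```python
# Atom counts (C, H, O) for a few monosaccharides.
SUGARS = {
    "glucose": {"C": 6, "H": 12, "O": 6},
    "fructose": {"C": 6, "H": 12, "O": 6},
    "galactose": {"C": 6, "H": 12, "O": 6},
    "ribose": {"C": 5, "H": 10, "O": 5},
    "fucose": {"C": 6, "H": 12, "O": 5},  # deoxy sugar: one oxygen short
}


def fits_hydrate_formula(atoms):
    """Return True if the atom counts match Cn(H2O)n for n = number of carbons."""
    n = atoms["C"]
    return atoms["H"] == 2 * n and atoms["O"] == n


for name, atoms in SUGARS.items():
    print(name, fits_hydrate_formula(atoms))
```

Note that glucose, fructose, and galactose share the same atom counts. They differ only in how those atoms are arranged, which is why the formula alone cannot tell them apart.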
Most of you are perfectly familiar with the three common simple sugars: glucose, fructose, and galactose. The human body acquires these from various food sources and uses them as primary fuel sources for the brain, muscles, and other organs.
Despite their structural differences, all monosaccharides, including our three friends mentioned above, share key physical and chemical properties. In their linear form, monosaccharides have one carbonyl group (C=O), while each of the other carbons carries a hydroxyl group (-OH). The carbonyl group can sit at the end of the chain, making the sugar an aldehyde H(C=O)- (an aldose), or somewhere in the middle, making it a ketone -(C=O)- (a ketose).
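The position rule can be written down in a few lines. Sugars with a terminal carbonyl are called aldoses and those with an internal carbonyl are called ketoses. The sketch below is a toy classifier under that rule; the function name and its two integer arguments are assumptions made for illustration, not a real cheminformatics API.

```python
def classify_by_carbonyl(chain_length, carbonyl_position):
    """Classify a linear monosaccharide by where its carbonyl carbon sits.

    A carbonyl on either terminal carbon gives an aldehyde (aldose);
    a carbonyl on any interior carbon gives a ketone (ketose).
    """
    if not 1 <= carbonyl_position <= chain_length:
        raise ValueError("carbonyl position must lie on the carbon chain")
    if carbonyl_position in (1, chain_length):
        return "aldose"
    return "ketose"


print(classify_by_carbonyl(6, 1))  # glucose: carbonyl at C1
print(classify_by_carbonyl(6, 2))  # fructose: carbonyl at C2
```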
Monosaccharides with five or more carbons exist primarily as five- or six-membered rings, formed via the nucleophilic addition of one of the hydroxyl groups to the carbonyl group.
The cyclization of monosaccharides has a crucial implication: they acquire a new asymmetric center called the anomeric carbon. It becomes the center of reactivity for monosaccharides, especially when simple sugar molecules come together to form more complex carbohydrates. In other words, glycobiology owes its structural diversity partly to the anomeric carbon.
Complex carbohydrate formation begins with two monosaccharides joined together by a linkage called a glycosidic bond. The link is formed between the anomeric carbon of the first monosaccharide and the hydroxyl group of the second. Also of note, a glycosidic bond can occur between a carbohydrate and another biological macromolecule, which we will explore later in the article during our discussion of glycoconjugates.
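One detail the description above leaves implicit is that forming a glycosidic bond is a condensation: one molecule of water is released for every bond made. The sketch below does the atom bookkeeping for two glucose units. The helper name `condense` is an assumption for illustration, and the code tracks atom counts only, not structure or stereochemistry.

```python
from collections import Counter

GLUCOSE = Counter({"C": 6, "H": 12, "O": 6})
WATER = Counter({"H": 2, "O": 1})


def condense(sugar_a, sugar_b):
    """Atom counts of the product when two sugars join via one glycosidic bond."""
    joined = sugar_a + sugar_b   # pool the atoms of both sugars
    joined.subtract(WATER)       # one water leaves per bond formed
    return joined


disaccharide = condense(GLUCOSE, GLUCOSE)
print(dict(disaccharide))
```

The result, C12H22O11, is the formula shared by glucose-glucose disaccharides such as maltose.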
In the grand ocean of glycobiology, the renowned simple sugars are only the tip of the iceberg. Besides supplying the body’s energy demand, they often serve as building blocks of complex sugars: oligosaccharides and polysaccharides.
Both words have Greek origins—oligo means few and poly means many. As the names suggest, oligosaccharides contain between 2 and 10 monosaccharides. Sugar chains beyond that can be considered polysaccharides.
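As a quick illustration of the naming rule, the sketch below classifies a sugar chain by its number of monosaccharide units. The cutoff of 10 is the one used in this article; it is a convention rather than a sharp chemical boundary, and the function name is an assumption.

```python
OLIGO_MAX_UNITS = 10  # cutoff used in this article; a convention, not a hard rule


def classify_saccharide(n_units):
    """Name a carbohydrate by the number of monosaccharide units it contains."""
    if n_units < 1:
        raise ValueError("a saccharide needs at least one sugar unit")
    if n_units == 1:
        return "monosaccharide"
    if n_units <= OLIGO_MAX_UNITS:
        return "oligosaccharide"
    return "polysaccharide"


for n in (1, 2, 10, 11, 3000):
    print(n, classify_saccharide(n))
```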
As we mentioned in the previous section, sugar molecules connect through glycosidic linkages. There isn’t a single way to connect two sugar molecules. In fact, there is a wide range of possibilities, depending not only on the choice of sugars but also on the way they are linked. The anomeric carbon of one sugar can bind to any of the unmodified hydroxyl groups of the other. In addition, the anomeric carbon is stereogenic (it bears four different substituents), meaning the glycosidic bond can assume one of two configurations, α or β.
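To get a feel for how quickly the options multiply, the sketch below enumerates the ways one glucose can be attached through its anomeric carbon (C1) to a second glucose. It rests on simplifying assumptions: both sugars are in the six-membered ring form, the acceptor offers free hydroxyls at C2, C3, C4, and C6, and anomeric-to-anomeric linkages are ignored. The linkage notation and function name are chosen for illustration.

```python
from itertools import product

ANOMERIC_CONFIGS = ("α", "β")
# Free, non-anomeric hydroxyl positions on a six-membered-ring glucose acceptor.
ACCEPTOR_HYDROXYLS = (2, 3, 4, 6)


def enumerate_linkages(donor="Glc", acceptor="Glc"):
    """List every donor(C1)-to-acceptor linkage under the assumptions above."""
    return [
        f"{donor}({config}1-{position}){acceptor}"
        for config, position in product(ANOMERIC_CONFIGS, ACCEPTOR_HYDROXYLS)
    ]


linkages = enumerate_linkages()
print(len(linkages))
for linkage in linkages:
    print(linkage)
```

Even with two copies of the same sugar, this gives eight distinct disaccharides. Glc(α1-4)Glc is maltose and Glc(β1-4)Glc is cellobiose, two molecules with very different properties despite identical building blocks.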
Here is the gist: the linkage diversity leads to an astonishing structural and functional diversity, laying the foundations for several essential biological processes in the body.
Oligosaccharides spark particular interest because they are rarely found as stand-alone molecules in nature. Instead, they are conjugated to proteins and lipids through glycosidic bonds, hence the name glycoconjugate. These glycoproteins and glycolipids populate the cell surface and are the key players in cell-to-cell communication, adhesion, and signaling. Not surprisingly, abnormalities in their formation and abundance are often associated with diseased states, including congenital disorders, AIDS, COVID-19, rheumatoid arthritis, cancer, and inflammatory bowel disease.
Polysaccharides are equally significant for the human body for a whole other set of reasons. Although found in several food sources, many of them (the dietary fibers) cannot be broken down by our own digestive enzymes and instead feed the microbes living in our gut. These microbes are at least as numerous as our very own cells and play an essential role in our health. Insufficient intake of polysaccharides like cellulose, resistant starch, and pectin (mainly found in plant-based foods) can cause imbalances in the gut microbiome. This disruption has been associated, time and time again, with several diseases, including diabetes, obesity, and cancer.
Now, let’s take a step back and focus on glycoconjugates, which we mentioned briefly in the previous section. From the simplest single-celled organisms to humans, all cells are densely covered with layers of glycans (the collective name for oligosaccharides and polysaccharides) attached to surface proteins and lipids. These structures facilitate cell–cell interactions, molecular transfer across the cell membrane, cell signaling, and determination of cell fate.
Take mucus and saliva, for example. These fluids, critical to your fight against pathogens, are rich in heavily glycosylated proteins called mucins.
Heavily what, you ask? Now would be the perfect time to introduce our next term: glycosylation, the conjugation between a glycan and the functional group of another molecule.
There are a few types of glycosylation, but the two commonly observed in eukaryotes are N-glycosylation and O-glycosylation, where N and O indicate the atom to which the carbohydrate is attached.
The binding mechanism is quite intricate for both types, so we are just scratching the surface here. Simply put, a glycosidic bond forms between a macromolecule (e.g., a peptide, protein, or lipid) and a precursor sugar, usually one of the following: N-acetylglucosamine (GlcNAc), N-acetylgalactosamine (GalNAc), galactose, fucose, or mannose.
It is important to note here that the mechanism of bond formation follows a site-specific pattern. In other words, especially in the glycosylation of proteins, certain amino acid residues are favored in glycosidic bond formation. In N-linked glycosylation, the carbohydrate is covalently attached to not just any nitrogen atom, but the one on the amide group of an asparagine (Asn) residue. In O-glycosylation, the attachment occurs between the anomeric carbon of the sugar molecule and the oxygen of serine (Ser) or threonine (Thr) residues.
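Site specificity is what makes glycosylation sites searchable in a protein sequence. The sketch below scans a one-letter amino acid sequence for the well-established N-glycosylation consensus motif (sequon) Asn-X-Ser/Thr, where X is any residue except proline. The motif is not spelled out in this article, the function names are assumptions, and the peptide is made up. A sequon is necessary but not sufficient, so hits are candidate sites only. O-glycosylation has no comparably simple motif, so the second function merely lists every Ser and Thr.

```python
import re

# N-glycosylation sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead keeps matches one residue long, so overlapping sequons are all found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glyco_sites(sequence):
    """Return 1-based positions of Asn residues that sit in an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


def find_o_glyco_candidates(sequence):
    """Return 1-based positions of every Ser/Thr (candidate O-sites only)."""
    return [i + 1 for i, residue in enumerate(sequence.upper()) if residue in "ST"]


peptide = "MKNATLLNPSGNGTQ"  # hypothetical peptide, for illustration only
print(find_n_glyco_sites(peptide))       # [3, 12]
print(find_o_glyco_candidates(peptide))  # [5, 10, 14]
```

The Asn at position 8 is skipped because it is followed by proline, which rules the site out despite the Ser two residues later.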
Upon glycosylation, the glycan often undergoes maturation, during which its chains are elongated via the addition of new sugars. This elongation follows either a linear or a branched pathway, bringing about structural and functional diversity across the final molecules.
Glycosylation does not happen on its own: forming a glycosidic bond requires an input of energy and has to get over an activation barrier. A group of enzymes called glycosyltransferases catalyze the reaction by lowering that barrier so that it can proceed. This reaction can be either the initial transfer of the sugar to the protein/lipid or elongation of the glycan moiety via additional sugar residues.
Many glycosyltransferase-mediated reactions follow a similar pattern, in which the reactants acquire the roles of donor and acceptor. The monosaccharide to be added is usually presented as a nucleotide sugar, where the sugar is attached to a nucleotide to form an activated glycosyl donor. This particular form makes that sugar residue ready for action (or reaction in this case).
The molecule to which the sugar is added is termed an acceptor. In the initial addition of the monosaccharide, the acceptor would be the protein or peptide. For further glycosylation, the main glycan chain acts as the acceptor.
Glycosylation is an intricate but well-coordinated process thanks to the nature of glycosyltransferases. First of all, many glycosyltransferases possess strict specificity towards their acceptors and donors. This is particularly true for the initial biosynthesis of the glycoconjugate (i.e., the transfer of the first monosaccharide to the side chain residue). In addition, glycosyltransferases act sequentially, meaning that the product of one glycosylation becomes the acceptor for the next. This phenomenon could explain the differences in prominent glycan structures and glycosyltransferase expressions across different cell types.
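The donor/acceptor logic and the sequential behavior can be captured in a toy model. In the sketch below, each enzyme delivers one sugar and acts only when the growing chain ends in the residue it recognizes. Everything here is hypothetical: the enzyme labels GT-A and GT-B, the sugar order, and the linear-only chain are assumptions chosen for illustration, not a real biosynthetic pathway, and branching and linkage positions are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glycosyltransferase:
    name: str          # hypothetical enzyme label
    donor_sugar: str   # sugar delivered by the activated nucleotide-sugar donor
    acceptor_end: str  # terminal residue the enzyme requires on the acceptor


def elongate(chain, enzymes):
    """Apply enzymes in order; each adds its sugar only if the chain end matches."""
    chain = list(chain)
    for enzyme in enzymes:
        if chain and chain[-1] == enzyme.acceptor_end:
            chain.append(enzyme.donor_sugar)
    return chain


ENZYMES = [
    Glycosyltransferase("GT-A", donor_sugar="Gal", acceptor_end="GalNAc"),
    Glycosyltransferase("GT-B", donor_sugar="Fuc", acceptor_end="Gal"),
]

start = ["GalNAc"]  # first sugar already attached to the protein
print(elongate(start, ENZYMES))        # GT-A acts, then GT-B
print(elongate(start, ENZYMES[::-1]))  # GT-B finds no Gal end and is skipped
```

Running the enzymes in the reverse order yields a shorter chain, because GT-B has nothing to act on until GT-A has made its product. A cell expressing a different set of enzymes would end up with a different glycan in the same way.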
Since glycosyltransferases are responsible for the glycan composition of the cell, it is no surprise that changes in their expression are often associated with disease. Especially in cancer, altered expression of glycosyltransferases results in abnormal glycan structures that disrupt the life cycle of the cell. In the end, you get cells that become immortalized, require extra nutrition, divide uncontrollably, and can easily metastasize.
Indeed, abnormal glycosyltransferase expression has been detected in various cancer types and is now an important lead in cancer biomarker discovery.
The final term before we wrap up our glycobiology discussion pertains to the recognition of glycans. It would be impossible to predict the functions of glycans without knowing what exactly interacts with them.
There is another class of proteins, collectively called lectins, that recognize specific glycan sequences. They are found not only in animals but also in plants and prokaryotes.
Just like glycosyltransferases, lectins are highly specific, owing to their distinct carbohydrate-recognition domains. This recognition pattern drives contact and communication between the glycan-bearing cell and the lectin-bearing cell. Such cell–cell interactions are key to immune response activities, such as phagocytosis, apoptosis, binding of pathogens, and regulation of cell movement (e.g., adhesion and migration).
Due to their ability to recognize specific glycan structures, lectins have become the rising stars of glycobiology research, particularly in glycan detection in cancer cells. Interestingly, we are not just talking about animal lectins. In fact, plant lectins are the primary tools for glycan detection in today’s cancer research. Sounds baffling, right? How can a plant-based protein be used to map out the human glycan profile?
This is a discussion for another post. In our next glycobiology article, we will dive into lectins—particularly plant lectins—and explore what makes them so enticing in glycan research. In the meantime, you can refer to our lectin guide to find out how these versatile tools can help your research.