Glycoproteins comprise a group of polypeptides that are decorated with oligosaccharide chains (glycans) (1). These proteins are well characterized in eukaryotes; glycoproteins have also been found in prokaryotes, but their precise role there is not well understood. Glycoproteins are ubiquitous in eukaryotes and comprise 20-50% of the yeast proteome, while more than half of the proteins that make up the human proteome are glycoproteins (2,3). Given their abundance, it is not surprising that glycoproteins fulfill important roles in eukaryotic cell biology, ranging from protein folding to maintenance of cell structure, receptor-ligand interactions and cell signaling (1). Structurally, glycoproteins are remarkably diverse and include many soluble, secreted proteins as well as numerous integral membrane proteins whose extracellular parts are equipped with glycans. The oligosaccharide chains of glycoproteins are functionally important. For example, some proteins require glycans for proper folding or stability, while the presence of glycans in other proteins increases their solubility (4). The carbohydrate moieties of glycoproteins are chemically complex and are covalently attached to amino acid side chains through a biochemical process known as glycosylation (5). In eukaryotes, glycoproteins are produced through different, chemically distinct glycosylation mechanisms, of which N-linked glycosylation is the predominant one (5). In humans, dysregulation of this process is the underlying cause of different neurological disorders and is also associated with other pathologies such as type 1 diabetes, Crohn’s disease and cancer (6). In N-linked glycosylation, carbohydrate chains are transferred to the side-chain amide nitrogen of an asparagine residue in the polypeptide. Occasionally, a side-chain nitrogen of arginine is used instead.
The biosynthesis of N-linked glycans occurs cotranslationally in the endoplasmic reticulum (ER) and comprises two main steps: assembly of a lipid-linked oligosaccharide precursor and its transfer to specific asparagine residues of the acceptor polypeptide (5). The precursor, which is the same for all eukaryotes, is shown in figure 1 (adapted from 4) and comprises a branched structure made up of 14 sugar moieties: three glucose, nine mannose and two N-acetylglucosamine residues. Following transfer onto the acceptor polypeptide by oligosaccharyltransferase, the glycan is modified within the subcompartments of the ER and Golgi complex. The biosynthesis of the precursor is a sequential process that is catalyzed by enzymes known as glycosyltransferases. These are embedded in the ER membrane with their active site exposed to either the cytosol or the ER lumen (4,5). The first steps occur on the cytosolic side of the ER membrane and utilize soluble nucleotide-activated sugars as building blocks. These are added to the growing precursor, which is attached to dolichol, a very long-chain polyisoprenoid lipid that is anchored in the ER membrane and serves as carrier for the oligosaccharide precursor (4). The final steps in the biosynthesis of the lipid-linked precursor occur in the lumen of the ER and require flipping of the oligosaccharide intermediate across the membrane. The building blocks for these reactions are lipid-linked hexoses embedded in the membrane (dolichol-linked mannose and dolichol-linked glucose). In the yeast Saccharomyces cerevisiae, the biosynthesis of the lipid-linked precursor requires 13 glycosyltransferases encoded by the Alg (asparagine-linked glycosylation) genes. These enzymes catalyze the step-wise addition of sugar monomers to the growing oligosaccharide precursor (5). The enzymes that facilitate elongation of the precursor at the luminal side of the ER membrane belong to superfamily C of glycosyltransferases (GT-Cs). 
Members of this group are typically integral membrane proteins with α-helical transmembrane domains (TMDs) and utilize a dolichol-linked carbohydrate as donor substrate (7). The detailed structure of the yeast glucosyltransferase ALG6 was reported recently, providing molecular insight into its catalytic mechanism (8). Here, I will discuss this structure as well as the current understanding of how the enzyme functions.
Overview of eukaryotic N-linked glycosylation
More than half of the eukaryotic proteins present in databases such as UniProt are probably glycoproteins, with 90% of these most likely equipped with N-linked glycans (9). The biosynthesis of eukaryotic N-linked glycans occurs in the lumen of the ER through the en bloc transfer of a pre-assembled oligosaccharide precursor onto the accepting polypeptide. This precursor, which is conserved throughout eukaryotes, has a defined structure (figure 1) and typically contains three glucose, nine mannose and two N-acetylglucosamine residues. It arises through the sequential addition of carbohydrate monomers onto dolichol, a strongly hydrophobic polyisoprenoid lipid carrier that comprises 75-95 carbon atoms. The assembly of the lipid-linked precursor is highly conserved and occurs on both the cytosolic and the luminal side of the ER membrane. The biochemistry of N-linked glycosylation was elucidated using the yeast Saccharomyces cerevisiae as model organism. For example, the glycosyltransferases that catalyze the assembly of the oligosaccharide precursor were identified using Alg (asparagine-linked glycosylation) yeast mutants that are defective in N-linked glycosylation (10). The assembly of the lipid-linked oligosaccharide precursor is affected in these mutants, resulting in the accumulation of lipid-linked precursor intermediates within the ER membrane as well as hypoglycosylation of proteins. The process of N-linked glycosylation in yeast is shown in figure 2 (courtesy of University of Zurich), which illustrates that biosynthesis of the lipid-linked precursor requires glycosyltransferases encoded by the Alg genes. All these enzymes are embedded in the ER membrane. 
The assembly of the oligosaccharide precursor is initiated at the cytosolic face of the ER membrane through the step-wise addition of two N-acetylglucosamine (GlcNAc, blue squares) and five mannose (Man, green circles) residues onto the lipid carrier (5). These steps are catalyzed by ALG7 (DPAGT1), ALG13, ALG14, ALG1, ALG2 and ALG11. ALG7 (DPAGT1) catalyzes the first step in the biosynthesis of the oligosaccharide precursor and is inhibited by the antibiotic tunicamycin, which thereby blocks the production of all N-linked glycans. The building blocks for these reactions are soluble nucleotide-activated sugars. The final product of the ALG enzymes on the cytosolic face of the ER membrane is the oligosaccharide intermediate Dol-PP-GlcNAc2Man5. Further elongation of the precursor requires flipping towards the luminal side of the ER membrane, which is probably facilitated by RFT1. Following translocation into the ER lumen, seven hexoses are added to the intermediate: four mannose residues and three terminal glucose residues (5). The building blocks for these final steps are lipid-linked hexoses embedded in the membrane (dolichol-linked mannose and dolichol-linked glucose). The mature 14-residue precursor is subsequently transferred by the oligosaccharyltransferase complex (OST) to a specific asparagine residue in the acceptor polypeptide as it emerges into the ER lumen. Following transfer of the mature precursor, the empty lipid carrier is flipped back to the cytosolic face of the ER membrane and recycled, while all three glucose residues and one mannose residue are removed from the protein-bound glycan by three different enzymes.
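The residue bookkeeping of the pathway described above can be checked with a short Python sketch. It only tallies sugar counts per stage as stated in the text; the names `assemble`, `formula` and the constants are hypothetical helpers introduced for illustration, and branching and linkage types are deliberately ignored.

```python
from collections import Counter

# Residues added at each stage, taken from the description in the text.
CYTOSOLIC_STEPS = ["GlcNAc"] * 2 + ["Man"] * 5   # cytosolic face of the ER membrane
LUMINAL_STEPS = ["Man"] * 4 + ["Glc"] * 3        # after flipping into the ER lumen
TRIMMED_AFTER_TRANSFER = Counter({"Glc": 3, "Man": 1})  # removed after transfer to protein


def assemble(steps, start=None):
    """Add one sugar per step to a copy of the starting composition."""
    glycan = Counter(start or {})
    for sugar in steps:
        glycan[sugar] += 1
    return glycan


def formula(glycan):
    """Render a composition such as GlcNAc2Man5 (illustrative notation only)."""
    order = ["GlcNAc", "Man", "Glc"]
    return "".join(f"{sugar}{glycan[sugar]}" for sugar in order if glycan[sugar] > 0)


cytosolic_intermediate = assemble(CYTOSOLIC_STEPS)
mature_precursor = assemble(LUMINAL_STEPS, cytosolic_intermediate)
trimmed_glycan = mature_precursor - TRIMMED_AFTER_TRANSFER

print(formula(cytosolic_intermediate))
print(formula(mature_precursor), sum(mature_precursor.values()))
print(formula(trimmed_glycan))
```

The tally reproduces the numbers given in the text: seven residues on the cytosolic face, fourteen in the mature precursor, and ten left on the protein-bound glycan after the initial trimming.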
Biochemical and structural features of glucosyltransferase ALG6
The final steps in the assembly of the lipid-linked oligosaccharide precursor occur in the lumen of the ER and comprise the addition of four mannose and three glucose residues. In yeast, these are added one at a time by ALG3, ALG9, ALG12, ALG6 and ALG8. These ALG enzymes are part of the GT-C superfamily, whose members are typically integral membrane proteins with α-helical TMDs and utilize either dolichol-activated mannose or glucose as donor substrate. ALG6 is an α-1,3-glucosyltransferase of 63 kDa with 13 to 14 predicted TMDs that adds the first glucose residue to the oligosaccharide intermediate in an α-1,3 linkage. Mutations in the human ALG6 gene are tightly associated with specific congenital disorders of glycosylation (CDGs), which are characterized by abnormal N-linked glycosylation and show a large number of glycoprotein abnormalities, such as hypoglycosylated serum proteins, that result in various organ disorders (11). Recently, detailed structures of yeast ALG6 with and without a substrate analogue were reported (8). To obtain the structure of ALG6 in the unbound form, apo ALG6 was purified and reconstituted into nanodiscs, after which its structure was determined by cryo-EM at 3.0 Å.
This structure is shown in figure 3 in surface representation (left panel, with hydrophobic residues in red and hydrophilic residues in white) and ribbon representation (right panel, with the catalytic base Asp69 shown as pink spheres). It reveals that ALG6 has 14 TMDs and contains two long loops termed EL1 (in grey) and EL4 (in blue), which adopt a helical conformation in the ER lumen. EL4 bridges TMD7 and TMD8, and a disulfide bridge links the luminal end of TMD14 with EL4. The overall structure of ALG6 is unique and is not conserved amongst GT-C members. However, the structure of the N-terminal half (in green), comprising TMD1-7 as well as the luminal loop helices EL-h1 and EL-h2, is also observed in other GT-C enzymes, while the C-terminal part (in orange), made up of TMD8-14, is structurally variable. Hence, GT-C enzymes have a modular architecture containing a conserved N-terminal module and a variable C-terminal module with distinct functional roles. The active site and substrate-binding cavities are located at the interface of the conserved and variable modules.
The structure also provides insight into how GT-C enzymes bind their dolichol-linked donor substrates. Specifically, the dolichol part interacts with TMD6 of the conserved module, while the carbohydrate moiety interacts with the variable module. ALG6 contains a large hydrophilic cavity that is oriented towards the ER lumen and a groove-shaped cavity that faces the ER membrane. The residues that line these cavities are highly conserved, suggesting that they are involved in substrate binding or make up the active site. To explore the role of these cavities in more detail, the structure of ALG6 with the synthetic donor substrate dolichol-linked glucose (dol-25-p-Glc) was solved by cryo-EM at 3.9 Å (8). This structure is presented in the left panel of figure 4 (in surface representation with coloring as in figure 3) and shows that the lipid-linked donor substrate (in green spheres) is present in a lipid-exposed groove made up of TMD6, 7 and 8. The dolichol moiety is located in the membrane-oriented groove and interacts mainly with hydrophobic residues of TMD6. Loop EL4 runs over this groove, resulting in a funnel-like entrance. A close-up of the active site is provided in the right panel of figure 4 and shows that the donor substrate is oriented in such a way that the anomeric carbon (C1) of glucose is accessible for nucleophilic attack by the C3 hydroxyl group of a mannose in the acceptor glycan. The phosphate group of dolichol-linked glucose functions as leaving group in the glucose transfer reaction and is located in a positively charged surface region of ALG6. A binding site for the lipid-linked acceptor glycan was not observed in this structure. The active site contains several acidic residues (Asp69, Asp99, Glu306, Asp307, and Glu379) that could function as general base. Of these, Asp69 represents the catalytic base, while Asp99 is probably involved in binding of the acceptor glycan. The catalytic base is located at the end of helix EL-h1 (figure 3). 
His378 is involved in binding and orientation of the donor substrate. No metal cofactor was observed in the structures, suggesting that the activity of ALG6 is metal ion independent. Indeed, purified ALG6 retained its activity in the presence of EDTA.
Catalytic mechanism of ALG6
Members of the GT-C superfamily typically use dolichol-bound substrates and catalyze glycosyl transfer with inversion of the anomeric configuration (so-called inverting enzymes). The glucose transfer reaction catalyzed by ALG6 probably occurs through an SN2 mechanism in which the nucleophilic hydroxyl of the acceptor attacks the anomeric carbon (C1) of glucose, thereby displacing the leaving group (phosphate) from the opposite face and forming an α-1,3 glycosidic bond (12). To this end, the attacking hydroxyl is activated via deprotonation by an aspartate side chain (Asp69) that acts as general base. The catalytic mechanism of ALG6 is shown in figure 5 (with ALG6 in surface representation, adapted from 8) and probably comprises three states. State 1 is the apo state, and state 2 is obtained through binding of dolichol-linked glucose. Subsequently, the mannose-containing acceptor glycan binds and Asp69 acts as general base, abstracting the proton from the OH group of the terminal mannose to activate it for nucleophilic attack (state 3). Conceivably, a conformational change is required to move the glucose of the donor substrate closer to the acceptor glycan and Asp69.
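The net glucose transfer can be summarized as a reaction scheme. This is a sketch assembled from the text rather than a scheme taken from the cited work. The acceptor composition (two GlcNAc and nine mannose residues) follows from the assembly steps described earlier, and writing the released carrier as dolichol phosphate (Dol-P) is an assumption based on the phosphate acting as leaving group. The `\xrightarrow` command assumes the amsmath package.

```latex
\[
\text{Dol-P-Glc} + \text{Dol-PP-GlcNAc}_2\text{Man}_9
\;\xrightarrow{\text{ALG6}}\;
\text{Dol-PP-GlcNAc}_2\text{Man}_9\text{Glc}_1 + \text{Dol-P}
\]
```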
Glycoproteins are proteins that are equipped with oligosaccharides, which are typically covalently attached to amino acid side chains (1). These proteins are present in all organisms, both as soluble and membrane-bound species. Glycoproteins are particularly abundant in eukaryotes, as evidenced, for example, by the finding that more than 50% of the proteins that make up the human proteome are glycoproteins (3). Not surprisingly, glycoproteins are involved in a variety of biological functions, ranging from protein folding to maintenance of cell structure, receptor-ligand interactions and cell signaling (1). The biochemical process by which oligosaccharides (glycans) are attached to selected acceptor sites within a polypeptide is known as glycosylation. Different forms of glycosylation have been identified, although N-linked glycosylation, in which carbohydrate chains are transferred to the side-chain amide nitrogen of an asparagine residue, is by far the most common one (5). The biosynthesis of eukaryotic N-linked glycans occurs in the ER through the en bloc transfer of a pre-assembled oligosaccharide precursor onto the accepting polypeptide. This precursor is built on dolichol, a polyisoprenoid lipid carrier, through the sequential addition of carbohydrate monomers by enzymes known as glycosyltransferases (figure 2). These are embedded in the ER membrane with their active site exposed to either the cytosol or the ER lumen. The first steps occur on the cytosolic side of the ER membrane, while the final steps occur in the lumen of the ER and require flipping of the oligosaccharide intermediate. In yeast, the biosynthesis of the lipid-linked precursor is catalyzed by 13 glycosyltransferases encoded by the Alg (asparagine-linked glycosylation) genes. ALG3, ALG9, ALG12, ALG6 and ALG8 facilitate the final steps in the biosynthesis of the lipid-linked precursor (5). 
These enzymes are all part of the GT-C superfamily, whose members are typically integral membrane proteins with α-helical TMDs and utilize either dolichol-activated mannose or glucose as donor substrate (7). ALG6 is an α-1,3-glucosyltransferase that adds the first glucose residue to the oligosaccharide intermediate (5). Mutations in the human ALG6 gene are tightly associated with specific congenital disorders of glycosylation (CDGs), which are characterized by abnormal N-linked glycosylation and show a large number of glycoprotein abnormalities, such as hypoglycosylated serum proteins, that result in various organ disorders (11). Recently, detailed structures of yeast ALG6 with and without a substrate analogue were reported (8), providing a molecular basis for understanding the catalytic mechanism of this enzyme (figure 5).
1. Stick RV, Williams SJ. 2009. Glycoproteins and Proteoglycans. Carbohydrates: The Essential Molecules of Life, 369-412.
2. Kung LA, Tao SC, Qian J. et al. 2009. Global analysis of the glycoproteome in Saccharomyces cerevisiae reveals new roles for protein glycosylation in eukaryotes. Mol Syst Biol. 5: 308.
3. Stadlmann J, Taubenschmid J, Wenzel D. et al. 2017. Comparative glycoproteomics of stem cells identifies new players in ricin toxicity. Nature. 549: 538-542.
4. Molecular Cell Biology, 4th edition. New York: W. H. Freeman; 2000.
5. Breitling J, Aebi M. 2013. N-linked protein glycosylation in the endoplasmic reticulum. Cold Spring Harb Perspect Biol. 5: a013359.
6. Lauc G, Pezer M, Rudan I, Campbell H. 2016. Mechanisms of disease: The human N-glycome. Biochim Biophys Acta. 1860: 1574-1582.
7. Albuquerque-Wendt A, Hütte HJ, Buettner FFR. et al. 2019. Membrane Topological Model of Glycosyltransferases of the GT-C Superfamily. Int J Mol Sci. 20(19). pii: E4842.
9. Helenius A, Aebi M. 2004. Roles of N-linked glycans in the endoplasmic reticulum. Annu Rev Biochem. 73: 1019-1049.
10. Huffaker TC, Robbins PW. 1983. Yeast mutants deficient in protein glycosylation. Proc Natl Acad Sci U S A. 80: 7466-7470.
11. Imbach T, Burda P, Kuhnert P. et al. 1999. A mutation in the human ortholog of the Saccharomyces cerevisiae ALG6 gene causes carbohydrate-deficient glycoprotein syndrome type-Ic. Proc Natl Acad Sci U S A. 96: 6982-6987.
12. Moremen KW, Haltiwanger RS. 2019. Emerging structural insights into glycosyltransferase-mediated synthesis of glycans. Nat Chem Biol. 15: 853-864.