Different specificities of two aldehyde dehydrogenases from Saccharomyces cerevisiae var. boulardii

Aldehyde dehydrogenases play crucial roles in the detoxification of exogenous and endogenous aldehydes by catalysing their oxidation to carboxylic acid counterparts. The present study reports the characterization of two such isoenzymes from the yeast Saccharomyces cerevisiae var. boulardii (NCYC 3264), one mitochondrial (Ald4p) and one cytosolic (Ald6p). Both Ald4p and Ald6p were oligomeric in solution and demonstrated positive kinetic cooperativity towards aldehyde substrates. Wild-type Ald6p showed activity only with aliphatic aldehydes. Ald4p, in contrast, showed activity with benzaldehyde along with a limited range of aliphatic aldehydes. Inspection of the modelled structure of Ald6p revealed that a bulky amino acid residue (Met177, compared with the equivalent residue Leu196 in Ald4p) might cause steric hindrance of cyclic substrates. Therefore, we hypothesized that the specificities of the two isoenzymes towards aldehyde substrates were partly driven by steric hindrance in the active site. A variant of wild-type Ald6p with the Met177 residue replaced by a valine was also characterized to address this hypothesis. It showed an increased specificity range and a gain of activity towards cyclohexanecarboxaldehyde. It also demonstrated an increased thermal stability when compared with both wild-type enzymes. These data suggest that steric bulk in the active site of yeast aldehyde dehydrogenases is partially responsible for controlling specificity.


Introduction
The multigene family of aldehyde dehydrogenases (EC 1.2.1.3) contributes primarily to acetaldehyde detoxification through its oxidation to acetate [1]. Aldehyde dehydrogenases also have physiological roles in lipid peroxidation [2], in the metabolism of amino acids and biogenic amines [3,4], corticosteroids [5] and retinoids [6,7], and in protein deglycation [8]. The superfamily also participates in the detoxification of exogenous aldehydes from beverages, foods, industrial pollutants etc. [9,10]. Mammalian aldehyde dehydrogenases are grouped into three classes on the basis of their kinetic mechanisms. The representative members ALDH1, ALDH2 and ALDH3 have rate-limiting steps in NADH dissociation, deacylation and hydride transfer respectively [11,12]. All three classes conform to an ordered ternary complex kinetic mechanism in which the NAD(P)+ coenzyme binds first in the reaction sequence [13].
ALDH superfamily members typically exhibit a broad substrate specificity and many of them are able to oxidize several highly reactive aliphatic and aromatic aldehydes. Exogenous aldehydes, whether intermediates or products, may be derived from the metabolism of drugs

Recombinant expression and purification of Ald6p and Ald4p
Genomic DNA was extracted from yeast cells using the YeaStar Genomic DNA Kit™ (Zymo Research Corp., U.S.A.). The coding sequences for ALD6 and ALD4 were PCR-amplified from this NCYC 3264 genomic DNA. The primers were designed based on the ORF of Saccharomyces sp. 'boulardii' whole genome shotgun sequence (available at: http://www.ncbi.nlm.nih.gov/nuccore/?term=saccharomyces%20boulardii%20genome%20) [33]. The amplicons were inserted into the Escherichia coli expression vector pET46 Ek/LIC (Merck-Millipore, Nottingham, U.K.) according to the manufacturer's instructions (note that this vector introduces bases coding for the amino acid sequence MAHHHHHHVDDDDK at the 5'-end of the coding sequence). Correct insertion into the vector was verified by PCR and by DNA sequencing (GATC, London, U.K.) of the insert.
The expression vector was used to transform competent E. coli Rosetta™ (DE3) cells (Merck-Millipore) and colonies resulting from this transformation were used to inoculate cultures (5 ml of LB medium supplemented with 100 μg·ml⁻¹ ampicillin and 34 μg·ml⁻¹ chloramphenicol), which were grown overnight at 37 °C with shaking. Each culture was then diluted into 1 litre of LB (supplemented with 100 μg·ml⁻¹ ampicillin and 34 μg·ml⁻¹ chloramphenicol) and grown at 30 °C until A_600 reached 0.6 to 1.0 (typically 5-6 h). Expression was then induced slowly by adding 1.3 mM IPTG and incubating overnight (12-16 h) at 16 °C. Cells were harvested by centrifugation (4200 g for 15 min at room temperature), resuspended in cell resuspension buffer (50 mM HEPES-OH, pH 7.5, 150 mM NaCl, 10% (v/v) glycerol) and stored frozen at −80 °C until the purification step.
For purification, cell suspensions were thawed, disrupted by sonication on ice (three pulses at 100 W for 30 s with 30 s gaps for cooling) and clarified by centrifugation (20000 g for 20 min at 4 °C). The supernatant was applied to a cobalt agarose column (1 ml, His-Select, Sigma, Poole, U.K.), which had been pre-equilibrated in buffer A (cell

Bioinformatics and molecular modelling
The well-studied human retinal dehydrogenase 1 [accession no: NP_000680.2] (class 1) and liver mitochondrial ALDH [accession no: CAG33272.1] (class 2) were identified using a protein BLAST search. The sequences of these ALDHs, along with the Ald4p and Ald6p sequences from the present study, were aligned using the Transitive Consistency Score (TCS) web server available from http://tcoffee.crg.cat/tcs to determine their homology and the conservation of domain residues [36]. The sequence homology was evaluated using ESPript 3.0 available at http://espript.ibcp.fr [37]. A BLOSUM50 matrix scoring the similarity and identity between the sequences was computed using the MatGAT v2.01 application [38].
The predicted protein sequences of Ald4p and Ald6p were submitted to Phyre2 in the intensive mode to generate an initial monomeric molecular model of each protein [39]. Each model was then minimized and computationally solvated using YASARA (http://www.yasara.org/minimizationserver.htm) [40]. To generate a tetrameric model, four copies of the monomeric model were aligned (using PyMOL) to the four subunits of sheep class 1 ALDH (PDB ID: 1BXS [41]), which shares >90% protein sequence similarity with human retinal dehydrogenase ALDH1A1 (PDB ID: 4WB9 [42]) as judged by the Sequence Similarity Cutoff parameters in the Protein Data Bank (PDB). The tetrameric structure was saved in a new pdb file together with the NAD+ cofactors associated with each subunit. This tetrameric, cofactor-bound structure was then minimized using YASARA. To identify the aldehyde-binding site, the structure of human mitochondrial aldehyde dehydrogenase complexed with crotonaldehyde (PDB: 1O01 [43]) was aligned to the tetrameric models of Ald4p and Ald6p. A new pdb file incorporating crotonaldehyde was generated for both Ald4p and Ald6p and then minimized in YASARA. Subsequent models with other aldehydes bound (octanal and benzaldehyde) were generated by overlaying the 'new' substrate over the aldehyde group and key atoms from the carbon backbone, removing the 'old' substrate and reminimizing the complex in YASARA. These models are provided as supplementary information.

Cross-linking
Cross-linking with bis(sulfosuccinimidyl) suberate (BS3, 50-800 μM) was carried out with 15 μM protein (diluted as required in 100 mM sodium phosphate buffer, pH 7.4) in a total volume of 10 μl. Reaction mixtures were incubated at 30 °C for 30 min before addition of the cross-linker and then incubated at the same temperature for a further 35 min. Reactions were stopped by addition of an equal volume of SDS-loading buffer (120 mM Tris/HCl, pH 6.8, 4% (w/v) SDS, 20% (v/v) glycerol, 5% (w/v) Bromophenol Blue and 1% (w/v) DTT) and analysed by SDS/PAGE (10% gel).

Analytical gel filtration
Ald6p (wild-type and p.M177V variant) and Ald4p (200 μl of a 60 μM purified protein aliquot) were chromatographed on a Sephacryl S-300 (Pharmacia) column (total volume, V_t = 65.2 ml; void volume, V_0 = 15.1 ml) at a flow rate of 1 ml·min⁻¹. The column was equilibrated and developed in buffer G (50 mM Tris/HCl, 17 mM Tris-base, 150 mM sodium chloride, pH 7.4) [44][45][46]. Fractions (1 ml) were collected and analysed for protein content by measuring the absorbance at 280 nm. Standard proteins (Thyroglobulin, 669 kDa; Albumin, 67 kDa; Chymotrypsinogen, 25 kDa and Ribonuclease A, 14 kDa) were used to calibrate the column. Their elution volumes (V_e) were used to calculate K_av according to the equation:
$$K_{av} = \frac{V_e - V_0}{V_t - V_0}$$
Molecular masses were estimated by making use of the inverse, linear correlation between K_av and the logarithm of the molecular mass [45].
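The calibration described above can be sketched in a few lines of Python. This is an illustration only, not the authors' analysis: the column volumes and standard masses are taken from the text, but the elution volumes of the standards are hypothetical placeholders (they are not reported in this section), and the function names are our own.

```python
import math

V_T = 65.2  # total column volume (ml), from the text
V_0 = 15.1  # void volume (ml), from the text


def k_av(v_e, v_0=V_0, v_t=V_T):
    """Partition coefficient K_av = (V_e - V_0) / (V_t - V_0)."""
    return (v_e - v_0) / (v_t - v_0)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Standards as (mass in kDa, elution volume in ml). Masses are from the
# text; the elution volumes are HYPOTHETICAL, for illustration only.
standards = [(669.0, 20.0), (67.0, 36.0), (25.0, 43.0), (14.0, 47.0)]

log_masses = [math.log10(mass) for mass, _ in standards]
k_avs = [k_av(v_e) for _, v_e in standards]
slope, intercept = fit_line(log_masses, k_avs)


def estimate_mass_kda(v_e):
    """Invert K_av = slope * log10(M) + intercept for an unknown protein."""
    return 10 ** ((k_av(v_e) - intercept) / slope)
```

With real calibration data, `estimate_mass_kda` would be called with the measured elution volume of each enzyme; the fitted slope is expected to be negative because larger proteins elute earlier.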

Enzyme kinetic analysis
Aldehyde dehydrogenase activity was measured at 30 °C using a ThermoScientific Multiskan™ Microplate spectrophotometer. For Ald6p, reactions contained 100 mM sodium phosphate buffer (pH 7.3), 40 μM NADP+, 20 μM MgCl2 (Mg2+ has been previously identified as a cation activator of S. cerevisiae Ald6p and NADP+ as the enzyme's preferred cofactor [47,48]) and varied concentrations of substrates ranging from 10 to 1200 μM. The long-chain (C8-C13) and phenolic aldehydes were dissolved in 1.7% (v/v) DMSO as a solvent carrier; this solvent has been previously shown to have little effect on the activity of aldehyde dehydrogenases [49,50]. Ald4p was assayed in the same buffer containing 20 μM KCl, 0.4 mM NAD+, 5 mM EDTA and 1 mM PMSF. K+ has been shown to act as a cation activator of S. cerevisiae Ald4p and this enzyme can function with either NAD+ or NADP+ as a cofactor [48,51].
Steady-state kinetic data were obtained in triplicate from the same 96-well plate with readings taken every 5 s. The initial, linear portion of the progress curve was identified by visual inspection and fitted by linear regression to give the initial rates (v) of change in absorbance at 340 nm. These rates were converted into molar units using the extinction coefficient of NADH (6.22 mM⁻¹·cm⁻¹) [52]. Rates of reactions were thus expressed as micromolar concentration of NAD(P)H formed per second.
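As a rough illustration of this conversion, the Python sketch below fits a straight line to the initial part of a progress curve and applies the Beer-Lambert law to express the slope in μM·s⁻¹. The absorbance readings and the optical path length are assumptions made for the example (the path length of the microplate wells is not stated in the text), and the function names are hypothetical.

```python
EPSILON_NADH = 6.22  # mM^-1 cm^-1 at 340 nm, from the text


def initial_slope(times_s, absorbances):
    """Least-squares slope (absorbance units per second) of a progress curve."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_a = sum(absorbances) / n
    num = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, absorbances))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return num / den


def rate_um_per_s(slope_abs_per_s, path_cm):
    """Beer-Lambert conversion of dA/dt into micromolar NAD(P)H per second."""
    rate_mm_per_s = slope_abs_per_s / (EPSILON_NADH * path_cm)
    return rate_mm_per_s * 1000.0  # mM -> uM


# HYPOTHETICAL readings taken every 5 s (the interval used in the text).
times = [0, 5, 10, 15, 20]
a340 = [0.100, 0.105, 0.110, 0.115, 0.120]
PATH_CM = 0.5  # ASSUMED optical path length; not given in the text

v = rate_um_per_s(initial_slope(times, a340), PATH_CM)
```

The path length matters: halving it doubles the calculated rate, so in practice it would need to be measured or taken from the plate reader's specification.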
The kinetic parameters (V_max, K_0.5 and Hill coefficient, h) were obtained by plotting the rates of reaction against substrate concentration and fitting the data to the equation below using non-linear regression as implemented in GRAPHPAD PRISM 6.0 (GraphPad Software Inc, CA, U.S.A.). All points were weighted equally.
$$v = \frac{V_{max}[S]^h}{K_{0.5}^h + [S]^h}$$
where v is the initial rate, V_max is the maximum enzyme velocity, [S] is the concentration of substrate, K_0.5 is the concentration of substrate that produces a half-maximal enzyme velocity (analogous to the Michaelis-Menten constant, K_m, in non-cooperative enzymes) and h is the Hill coefficient [53,54].
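To make the roles of the three parameters concrete, the following Python sketch evaluates the Hill rate law and recovers the parameters from synthetic data. It is not the GraphPad Prism procedure used in the study: as a dependency-free stand-in it searches a grid of K_0.5 and h values and solves for V_max in closed form (the model is linear in V_max), with all points weighted equally. The "true" parameters and grids are hypothetical.

```python
def hill_rate(s, v_max, k_half, h):
    """Hill equation: v = Vmax * [S]^h / (K0.5^h + [S]^h)."""
    return v_max * s ** h / (k_half ** h + s ** h)


def fit_hill(concs, rates, k_grid, h_grid):
    """Equally weighted least squares over a (K0.5, h) grid.

    For each grid point the best Vmax is computed analytically,
    because the model is linear in Vmax.
    """
    best = None
    for k_half in k_grid:
        for h in h_grid:
            shape = [hill_rate(s, 1.0, k_half, h) for s in concs]
            v_max = sum(v * f for v, f in zip(rates, shape)) / sum(f * f for f in shape)
            sse = sum((v - v_max * f) ** 2 for v, f in zip(rates, shape))
            if best is None or sse < best[0]:
                best = (sse, v_max, k_half, h)
    return best[1:]


# Synthetic, noise-free data generated from HYPOTHETICAL parameters
# (Vmax = 2.0 uM/s, K0.5 = 150 uM, h = 1.8); not values from the paper.
concs = [10, 50, 100, 200, 400, 800, 1200]  # uM, within the assayed range
true_params = (2.0, 150.0, 1.8)
rates = [hill_rate(s, *true_params) for s in concs]

k_grid = [50.0 + 10.0 * i for i in range(31)]  # 50 to 350 uM
h_grid = [1.0 + 0.1 * i for i in range(21)]    # 1.0 to 3.0
v_max_fit, k_half_fit, h_fit = fit_hill(concs, rates, k_grid, h_grid)
```

A grid search is crude compared with a proper non-linear optimizer, but it shows the key property of the equation: at [S] = K_0.5 the rate is V_max/2 regardless of h, while h > 1 steepens the curve around that point.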

Sequence and predicted structure of Ald4p and Ald6p
Following PCR amplification of S. cerevisiae var. boulardii ALD4 and ALD6, the coding sequences were determined and deposited in GenBank (accession numbers: KX022008 and KT869135 for ALD4 and ALD6 respectively). Alignment of the coding sequences using the S. cerevisiae S288c genome as reference revealed a single nucleotide difference in the S. cerevisiae var. boulardii ALD4 sequence (1342C>T) and five differences in the S. cerevisiae var. boulardii ALD6 sequence (294C>T, 487A>G, 753A>G, 813G>A, 894T>C). Only the 487A>G substitution in ALD6 altered the protein coding sequence, resulting in the replacement of the isoleucine at position 163 by a valine residue. No difference was observed in the protein coded for by ALD4. Multiple sequence alignment revealed an overall similarity of approximately 70% (50% identity) between the human class 1 and 2 enzymes and both Ald4p and Ald6p (Supplementary Figure S1b), with conserved motifs equivalent to the GXGXXG box at the Rossmann fold (residues ∼160-230) and residues at the catalytic channel (residues ∼200-410) (Supplementary Figure S1a). Although Ald4p is reported to be a mitochondrial enzyme in S. cerevisiae, it lacks a discernible targeting sequence (consensus: MLSLRQSIRFFKPATRTLCSSRYLL at the N-terminus [56]), as does the variant studied here.
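The link between nucleotide position 487 and residue 163 can be checked with simple arithmetic. The Python sketch below maps each reported ALD6 substitution to its codon number and position within the codon, assuming that positions are numbered from the first base of the start codon. The actual codon at position 163 is not given in the text, so the example only confirms that an A>G change at the first base of any isoleucine codon gives a valine codon in the standard genetic code. Function and variable names are our own.

```python
def codon_position(nt_pos):
    """Map a 1-based coding-sequence position to (codon number, base in codon)."""
    index = nt_pos - 1
    return index // 3 + 1, index % 3 + 1


# ALD6 substitutions reported in the text: position -> (reference, variant).
ald6_changes = {
    294: ("C", "T"),
    487: ("A", "G"),
    753: ("A", "G"),
    813: ("G", "A"),
    894: ("T", "C"),
}

# Codon number and base-in-codon for each substitution.
locations = {pos: codon_position(pos) for pos in ald6_changes}

# Standard genetic code (DNA alphabet) for the two residues involved.
ILE_CODONS = {"ATT", "ATC", "ATA"}
VAL_CODONS = {"GTT", "GTC", "GTA", "GTG"}

# Replace the first base (A) of every Ile codon with G.
mutated = {"G" + codon[1:] for codon in ILE_CODONS}
all_ile_become_val = mutated <= VAL_CODONS
```

Under this numbering, 487A>G falls on the first base of codon 163, whereas the other four ALD6 substitutions fall on the third base of their codons, which is consistent with their being silent.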
Using the derived protein sequences of S. cerevisiae var. boulardii Ald4p and Ald6p, molecular models were built (Figure 1). As expected, these models show a high degree of similarity to other aldehyde dehydrogenase structures. For example, Ald4p has an RMSD of 0.720 Å over 10088 equivalent atoms when compared with sheep liver aldehyde dehydrogenase (1BXS); Ald6p has an RMSD of 0.729 Å over 10476 equivalent atoms when compared with the same structure. Val163 in Ald6p (which differs from the previously reported sequences of S. cerevisiae Ald6p) does not form part of the active site and lies close to the oligomeric interface. It, therefore, seems unlikely that the conservative substitution of valine for isoleucine in our sequence compared with the S. cerevisiae S288c genome has any functional consequence for the enzyme. Each subunit in Ald4p and Ald6p contains one active site and the residues forming each of these sites come from one subunit only. In our models, the cofactor and the substrate lie end-to-end in the active site with the nicotinamide ring of the cofactor orientated towards the aldehyde group of the substrate. The distance between the cofactor and the aldehyde group appears too large for efficient catalysis. We, therefore, assume that some conformational change brings the two substrates closer together before the reaction can occur. This change is unlikely to affect the identity of the residues interacting with the substrates. While the cofactor is predicted to make many specific contacts with the enzyme, the aldehyde substrate appears to be located in a channel lined mainly with highly conserved hydrophobic residues as previously described in class 1, 2 and 3 human ALDHs [57][58][59]. Consequently, there are few specific contacts between the protein and this substrate, leading us to hypothesize that both Ald4p and Ald6p can accommodate a range of different aldehyde substrates. This is consistent with previous observations on bacterial, fungal and higher eukaryotic aldehyde dehydrogenases [58][59][60][61][62].
We further noted that part of the aldehyde-binding site in Ald6p is formed by two methionine residues (Met177 and Met178). The structurally equivalent residues in Ald4p are Leu196 and Met197 (Figure 1b,d). We postulated that the presence of the less bulky, hydrophobic leucine residue at this position in the aldehyde-binding pocket might enable Ald4p to accommodate bulkier substrates such as cyclic, aromatic or branched-chain aldehydes.

Expression and purification of mitochondrial Ald4p, cytosolic Ald6p and its variant Ald6p.M177V
Ald4p and Ald6p (wild-type and variant Ald6p.M177V) could be expressed in, and purified from, E. coli cells (Figure 2). Typical yields of wild-type Ald6p and its variant were approximately 2 mg/l of bacterial cell culture (Figure 2b,c), but the yield of Ald4p was typically less than 1 mg/l culture (Figure 2a).
Resolution of the protein samples by SDS/PAGE (10% gels) showed small amounts of Ald6p at approximately four times the expected molecular mass (Figure 2b), suggesting the presence of some protein oligomers that are resistant to separation by SDS and heat treatment. Heat- and SDS-resistant oligomerization has been observed in a number of other proteins [63,64]. The intensity of this band was not reduced by supplementation of the loading buffer with additional DTT (2 mM, results not shown).

Both cytosolic and mitochondrial yeast aldehyde dehydrogenases are oligomers
The ability of the three enzymes to form dimers and tetramers was further investigated using chemical cross-linking with BS3 (Figure 3). Resolution of the cross-linked products by SDS/PAGE (10% gel) revealed bands corresponding to an Ald6p dimer (∼110 kDa) and tetramer (∼220 kDa) (Figure 3b). However, the Ald6p.M177V variant showed oligomerization even in the absence of BS3, suggesting increased oligomeric stability compared with the wild-type protein. The intensity of these bands was greater following treatment with increasing concentrations of BS3 (Figure 3c). Oligomerization was also seen with Ald4p; however, neither discrete homodimer nor homotetramer forms were detected by SDS/PAGE (Figure 3a). Similar patterns of oligomerization were seen when cross-linking was done with 1-3% (v/v) glutaraldehyde (results not shown).
Oligomers were also detected by gel filtration chromatography for both Ald4p and Ald6p (Figure 4). The molecular mass of Ald6p was estimated to be approximately 220 kDa, which corresponds to a tetrameric assembly. The molecular mass of Ald4p, on the other hand, was estimated to be approximately 151 kDa, suggesting a trimeric arrangement. A non-tetrameric arrangement for an ALDH is unusual and our models of Ald4p show the protein as a tetramer. There is some limited previous evidence for trimeric forms of S. cerevisiae aldehyde dehydrogenases [65]. These studies were based on crude extracts and are likely to represent a mixture of the isoenzymes now known to be present in this species. In the present study, cross-linked species corresponding to dimers, trimers and tetramers were detected. It is hard to imagine how the protein could exist as a trimer without substantial conformational changes, assuming that the basic fold of the monomer has been predicted correctly. Therefore, we assume that either the protein exists largely as a tetramer that runs anomalously in gel filtration chromatography or there is a dynamic equilibrium between a dimer and a 'dimer of dimers'. Anomalies in gel filtration can arise from interactions between the protein and matrix, which retard the elution of the protein, resulting in a lower estimated apparent molecular mass [66]. The previously detected trimers may represent intermediates in the formation of tetramers. Human ALDH3 is known to exist largely as a dimer due to an extended C-terminal tail that prevents tetramerization [67]. However, S. cerevisiae var. boulardii Ald4p lacks a C-terminal tail and, in our models, the shorter N-terminal extension did not appear to sterically hinder the formation of tetramers since it lies on the exterior of the predicted structure.
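The assignment of oligomeric state from an apparent mass is a simple ratio, sketched below in Python. The monomer mass of 55 kDa is an assumption inferred from the ∼110 kDa dimer and ∼220 kDa tetramer bands reported for Ald6p; the monomer mass of Ald4p is not stated in this section and is taken to be the same purely for illustration. The function name is hypothetical.

```python
def apparent_subunits(apparent_mass_kda, monomer_mass_kda):
    """Return (mass ratio, nearest whole number of subunits)."""
    ratio = apparent_mass_kda / monomer_mass_kda
    return ratio, round(ratio)


# ASSUMED monomer mass, inferred from the ~110 kDa dimer band of Ald6p;
# applied to Ald4p only as an illustrative approximation.
MONOMER_KDA = 55.0

ald6p_state = apparent_subunits(220.0, MONOMER_KDA)  # gel filtration estimate
ald4p_state = apparent_subunits(151.0, MONOMER_KDA)  # gel filtration estimate
```

Under these assumptions the Ald4p ratio lies between 2 and 3 and rounds to 3, which is why the gel filtration result alone points to a trimer even though, as argued above, a retarded tetramer or a dimer-tetramer equilibrium could give the same apparent mass.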

Both Ald4p and Ald6p exhibit positive cooperativity, but the two enzymes show markedly different substrate specificity
Both Ald4p and Ald6p are active aldehyde dehydrogenases demonstrating positive cooperativity towards their aldehyde substrates (Table 1, Supplementary Figures S2, S3 and S4). This may be accounted for by the stability of the multi-subunit configuration of these enzymes, in which binding of the substrate to the first subunit increases its affinity for the subsequent subunits. Previously, positive cooperativity was reported in S. cerevisiae cytosolic aldehyde dehydrogenase for binding of the cofactor NAD(P)+ following the substitution of an arginine residue with glutamate at position 480 [68]. However, cooperative kinetics with respect to the aldehyde substrates have not previously been reported for yeast aldehyde dehydrogenases. The physiological significance of these cooperative kinetics with substrates is not yet known. In general, positively cooperative kinetics enable 'switch-like' responses to changes in substrate concentrations and are commonly seen in enzymes and pathways that need to be highly sensitive to small changes in substrate concentrations [69].
Ald6p showed activity with straight-chain aliphatic aldehydes ranging from C2 to C11 and with the branched-chain C4 aldehyde, isobutanal. Once the chain length extended beyond 11 carbon atoms, no activity was detected. However, it should be noted that, as the chain length increases, the aldehydes become less soluble in water and so the lack of activity may be partly explained by the failure of the compounds to dissolve rather than failure to interact with the enzyme's active site. Wild-type Ald6p showed no activity with any of the cyclic or aromatic aldehydes tested. Interestingly, there was little correlation between activity and chain length, with the highest activity (judged by the ratio of k_cat to K_0.5) being shown with propionaldehyde (Table 1). Like Ald6p, Ald4p was active with a range of aliphatic aldehydes, with hexanal being the 'preferred' substrate from those tested here. However, no activity could be detected with butyraldehyde (C4), nor with any aldehydes with a chain length greater than nine carbon atoms. In contrast with Ald6p, this enzyme was active with benzaldehyde, although it lacked activity with the other cyclic and aromatic aldehydes tested (Table 1).
Inspection of the molecular models of the active sites of Ald4p and Ald6p revealed a possible reason for these differences in specificity. On one side of the Ald6p active site are two methionine residues (Met177 and Met178). These bulky side chains protrude into the active site, filling some of its volume and potentially sterically hindering the binding of a cyclic or aromatic substrate. In contrast, in Ald4p, the residue structurally equivalent to Met177 is a leucine (Leu196). We, therefore, reasoned that replacement of the Met177 residue with a smaller valine residue might enable Ald6p to interact with cyclic and/or aromatic aldehydes. Our hypothesis is supported by the sequence alignment with human aldehyde dehydrogenases (Supplementary Figure S1): both human ALDH1 and ALDH2, which are active with aliphatic and cyclic aldehydes [26], have smaller amino acid residues (valine and leucine respectively) at the position corresponding to Met177 in Ald6p. Furthermore, we have previously shown that an aldehyde dehydrogenase from Candida dubliniensis (which has an isoleucine at the equivalent position) is active with some cyclic and aromatic substrates [70].
The Ald6p.M177V variant is active with aliphatic aldehydes and showed an increased activity with medium chain-length aliphatic aldehydes (valeraldehyde and hexanal) compared with both wild-type Ald4p and Ald6p. However, its activity was only observed up to a chain length of seven carbons. Interestingly, it also gained activity with cyclohexanecarboxaldehyde, a substrate towards which the wild-type enzyme has no activity (Table 1). However, even with this change, there was no activity with benzaldehyde. This may be due to the lower reactivity of this aromatic aldehyde. These results suggest that our hypothesis is partially correct and that the creation of additional volume at this point in the active site enables the binding and subsequent oxidation of bulkier aldehydes. However, the lack of activity of Ald6p.M177V towards benzaldehyde (unlike Ald4p) suggests that there are additional determinants of substrate specificity in these enzymes.
Ald6p.M177V, like Ald4p but not wild-type Ald6p, has no activity with butyraldehyde. This suggests that a valine or leucine at the position equivalent to 177 in Ald6p is incompatible with the binding of four-carbon aldehydes. It is hard to provide a definitive explanation in the absence of an experimentally determined structure. However, it may be that shorter aldehydes (those with three or fewer carbons, and isobutyraldehyde, which is branched) bind without significant interaction with this residue. The branched nature of valine or leucine may create some steric hindrance to four-carbon aldehydes, which can be overcome by the greater binding energy expected in substrates with longer carbon chains.

Substrates and cofactors affect the thermal stability of yeast aldehyde dehydrogenases
The Ald6p.M177V variant is more thermally stable than the wild-type, with a melting temperature (T_m) ∼4 °C higher (Table 2, Supplementary Figure S5). This demonstrates that the M177V substitution increases the stability of the protein. Given that it was easier to detect oligomers in the variant (see above, Figure 3c), this most likely results from tighter binding between the subunits of the oligomers. Ald4p has similar thermal stability to wild-type Ald6p (Table 2). NAD(P)+ binding caused a significant (P<0.05) increase in the T_m for Ald4p and wild-type Ald6p, demonstrating that this increases the overall stability towards heat denaturation of all three proteins (Table 2, Supplementary Figure S5). Addition of the aldehyde substrates (in the presence of the appropriate cofactor) generally increased the thermal stability of Ald4p, but most had little or no effect on Ald6p (Table 2, Supplementary Figure S5). Aldehyde dehydrogenases have an ordered, ternary complex mechanism in which NAD(P)+ binds first. These data suggest that cofactor binding results in a small stabilization of the enzyme, perhaps due to a reorganization of the structure which then permits aldehyde binding.