Title of Invention

"FUSION POLYPEPTIDE FOR EXPRESSION IN A HOST CELL"

Abstract The present invention relates to fusion proteins (fusion polypeptides), particularly for use in expression and/or purification systems. The present inventors have found that the TolAIII domain has remarkable properties which are of particular use as a fusion protein partner to achieve high levels of expression in a host cell. In one aspect of the invention, a TolAIII domain or a functional homologue, fragment, or derivative thereof is located towards the N-terminus of the fusion polypeptide and a non-TolA polypeptide is located towards the C-terminus of the fusion polypeptide.
Full Text FUSION PROTEINS
The present invention relates to fusion proteins (fusion polypeptides), particularly for use in expression and/or purification systems.
Purified proteins are required for several applications. However, the isolation of pure proteins in sufficient quantities is sometimes problematic. For protein function studies, large amounts of a protein of interest (for example, a mutated protein) are often needed. Various expression systems have been used for heterologous production of proteins. Escherichia coli (E. coli) is still the most common host despite huge advances in the area of protein expression in other hosts in the last ten years. E. coli is popular because expressing proteins in the bacterium is relatively simple, because a vast amount of knowledge about the bacterium itself exists, and (sometimes most importantly) because of the low costs associated with production.
Proteins can be expressed in E. coli either directly or as fusions (of a "fusion partner" and a protein or polypeptide), also known as fusion proteins. The purpose of fusion partners is to provide affinity tags (e.g. Hisn tag, glutathione-S-transferase, cellulose binding domain, intein tags), to make proteins more soluble (e.g. glutathione-S-transferase), to enable formation of disulphide bonds (e.g. thioredoxin), or to export fused proteins to the periplasm where conditions for the formation of disulphide bonds are more favourable (e.g. DsbA and DsbC). Proteins used as fusion partners are normally small (less than 30 kDa).
TolA is a periplasmic protein involved in (1) maintaining the integrity of the inner membrane and (2) the uptake of colicins and bacteriophages. The first function is evidenced by the increased outer membrane instability (e.g. SDS sensitivity) of TolA⁻ mutants. This function has been shown by various authors and may depend upon the interaction with the TolB protein (Levengood-Freyermuth et al., 1993, J. Bacteriol. 175: 222-228; Wan & Baneyx, 1998, Protein Expression & Purification 14: 13-22). Wan and Baneyx (1998, supra) have demonstrated that co-expression of the C-terminal TolAIII domain of TolA (see below) facilitates the recovery of periplasmic recombinant proteins into the growth medium of E. coli, confirming that overproduction of the TolAIII domain disrupts the outer membrane and causes periplasmic proteins to leach into the growth medium.
The second function of TolA is based upon the use of TolA as a receptor by phage proteins (Lubkowski, J. et al., 1999, Structure With Folding & Design 7: 711-722) and colicins (Gokce, I. et al., 2000, J. Mol. Biol. 304: 621-632). This has been revealed both by the phage/colicin resistance of tolA mutants and by direct demonstration of the TolA-protein interactions by physical methods. TolA is composed of three domains. A short N-terminal domain is composed of a single transmembrane helix, which anchors TolA in the inner membrane. The second, largest domain is polar and mainly α-helical. The C-terminal domain III (TolAIII) is small and composed of 92 amino acids. Its 3D structure was recently solved in a complex with the N1 domain of the minor coat gene 3 protein of Ff filamentous bacteriophage (Holliger, P. et al., 1999, J. Mol. Biol. 288: 649-657). It is tightly folded into a slightly elongated protein with the aid of one disulphide bond (Figure 1).
Lubkowski et al. (1999; supra) disclose a fusion protein comprising residues 1-86 (the N1 domain) of the filamentous Ff bacteriophage minor coat gene 3 protein g3p towards the N-terminus and residues 295-425 (including the TolAIII domain) of TolA, a coreceptor of g3p, towards the C-terminus, and a C-terminal Ala3His6 (SEQ ID NO: 1) tail. The fusion protein was used by Lubkowski et al. to elucidate the crystal structure of a complex formed between the g3p N1 and TolAIII domains.
Various homologues of the TolA protein are known, for example from E. coli (SwissProt Acc. No. P19934), Salmonella species (for example Genbank Acc. Nos gil6764117 and gil 675986), Pectobacterium species (for example Genbank Acc. No. gil6116636) and Haemophilus species (for example Genbank Acc. No. gi2126342).
The present inventors have found that the TolAIII domain has remarkable properties which are of particular use as a fusion protein partner to achieve high levels of expression in a host cell.
According to the present invention, there is provided a fusion polypeptide for expression in a host cell comprising a TolAIII domain or a functional homologue, fragment, or derivative thereof and a non-TolA polypeptide, wherein the TolAIII domain or functional homologue, fragment, or derivative thereof is located towards the N-terminus of the fusion polypeptide and the non-TolA polypeptide is located towards the C-terminus of the fusion polypeptide.
As used herein, the terms "polypeptide" and "protein" are synonymous and refer to a sequence of two or more linked amino acid residues.
The TolAIII domain, when located towards the N-terminus of a fusion polypeptide, has been shown by the present inventors to facilitate higher than expected levels of expression of the TolAIII fusion polypeptide in a host cell. The TolAIII domain fusions will be useful, for example, for obtaining purified protein and polypeptide partners and/or for studying the properties of these partners.
The fusion polypeptide may further comprise a signal peptide. This will allow the fusion polypeptide to be targeted to a specific intra- or extra-cellular location. The signal peptide may be located at or near the N-terminus of the fusion polypeptide. The signal peptide may be cleaved from the fusion polypeptide during the targeting process.
If the fusion polypeptide has the basic structure: N terminus - TolAIII - Protein partner - C terminus, it may be expected that it will be expressed in high yields in the cytoplasm. If, however, the fusion polypeptide has the basic structure: N terminus - Signal peptide - TolAIII - Protein partner - C terminus, the signal peptide may be used to target the construct to a non-cytoplasmic location. For example, in E. coli expression systems the ribose-binding-protein signal peptide (for example, the E. coli ribose-binding-protein signal peptide [SEQ ID NO: 2]) may be used to target a fusion protein to the periplasm. Signal peptides which may be suitable for use in the present invention conform to a set of general rules which are described in Von Heijne, G. 1985, J. Mol. Biol. 184(1): 99-105.
The TolAIII domain or functional homologue, fragment, or derivative thereof may be codon-optimised for expression in the host cell.

The fusion polypeptide may further comprise a linker between the TolAIII domain or functional homologue, fragment, or derivative thereof and the non-TolA polypeptide. The linker may provide a physical separation between the TolAIII domain or functional homologue, fragment, or derivative thereof and the non-TolA polypeptide or may be functional. The linker may comprise at least one cleavage site for an endopeptidase. For example, the cleavage site may comprise the amino acid sequence DDDDK (SEQ ID NO: 3; for enterokinase) and/or LVPR (SEQ ID NO: 4; for thrombin) and/or IEGR (SEQ ID NO: 5; for factor Xa).
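By way of illustration only, the following Python sketch shows how the cleavage motifs named above might be located in a fusion sequence and how the released partner could be predicted. The function names and the toy sequence (including the placeholder stretches standing in for the TolAIII domain and the partner) are hypothetical, and the sketch assumes that each endopeptidase cuts immediately C-terminal to its recognition motif.

```python
# Recognition motifs named in the text (SEQ ID NOs: 3-5).
CLEAVAGE_SITES = {
    "enterokinase": "DDDDK",
    "thrombin": "LVPR",
    "factor Xa": "IEGR",
}


def find_cleavage_sites(protein):
    """Map each enzyme to the cut positions found in `protein`.

    A cut position is the index of the first residue after the motif
    (assumption: cleavage occurs directly C-terminal to the motif).
    """
    hits = {}
    for enzyme, motif in CLEAVAGE_SITES.items():
        positions = []
        start = protein.find(motif)
        while start != -1:
            positions.append(start + len(motif))
            start = protein.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


def cleave(protein, enzyme):
    """Split `protein` at every cut position of the given enzyme."""
    fragments, previous = [], 0
    for cut in find_cleavage_sites(protein).get(enzyme, []):
        fragments.append(protein[previous:cut])
        previous = cut
    fragments.append(protein[previous:])
    return fragments


# Hypothetical toy construct: tag, placeholder domain, flexible part,
# thrombin motif, residual Gly-Ser, placeholder partner.
toy_fusion = "MHHHHHHSS" + "AEAEAEAEAE" + "GGGS" + "LVPR" + "GS" + "MKTAYIAK"

if __name__ == "__main__":
    print(find_cleavage_sites(toy_fusion))
    print(cleave(toy_fusion, "thrombin"))
```

In this toy case the last fragment returned for thrombin is the partner carrying the two residual linker residues; real sequences may of course contain additional, unintended motifs, which the same scan would reveal.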
In one embodiment, the fusion polypeptide according to the invention may further comprise an affinity purification tag. The affinity purification tag may be located at or near the N-terminus of the fusion polypeptide. For example, the affinity purification tag is an N-terminal Hisn tag, with n=4, 5, 6, 7, 8, 9 or 10 (SEQ ID NOs: 6 - 12, respectively; preferably n=6 [SEQ ID NO: 8]), optionally with the Hisn tag linked to the fusion polypeptide by one or more Ser residues (preferably two). The affinity purification tag will provide one means for immobilising the fusion polypeptide, for example as a step in purification.
In one embodiment, the fusion polypeptide comprises a signal peptide at the N-terminus and an affinity purification tag near the N-terminus. If the signal peptide is cleaved from the fusion polypeptide during targeting, then the affinity purification tag may be located at or nearer to the new N-terminus of the fusion protein.
Preferably, the TolAIII domain consists of amino acid residues 329-421 (SEQ ID NO: 13) of Escherichia coli TolA (SwissProt Acc. No. P19934).
The host cell may be bacterial (for example, Escherichia coli).
The non-TolA polypeptide of the fusion polypeptide may be human BCL-XL (SWISSPROT Accession No. B47537). The fusion polypeptide with human BCL-XL may comprise the amino acid sequence of SEQ ID NO: 14 or SEQ ID NO: 15. As shown in Example 2 below, large amounts of BCL-XL (an important protein in apoptosis and cancer research) can be generated by expression as a TolAIII fusion polypeptide.
Further provided according to the present invention is a DNA molecule encoding the fusion polypeptide as defined above. The mRNA properties of the DNA molecule when transcribed may be optimised for expression in the host cell.
Also provided is an expression vector comprising the DNA molecule as defined above for expression of the fusion polypeptide of the invention. The expression vector may have an inducible promoter (for example, the IPTG-inducible T7 promoter) which drives expression of the fusion polypeptide. The expression vector may also have an antibiotic resistance marker (for example, the bla gene, which confers resistance to ampicillin).
In another aspect of the invention there is provided a cloning vector for producing the expression vector as defined above, comprising DNA encoding the TolAIII domain or a functional homologue, fragment, or derivative thereof upstream or downstream from a cloning site which allows in-frame insertion of DNA encoding a non-TolA polypeptide. The cloning vector may further comprise DNA encoding at least one cleavage site (for example, the amino acid sequence DDDDK [SEQ ID NO: 3] and/or LVPR [SEQ ID NO: 4] and/or IEGR [SEQ ID NO: 5]) for an endopeptidase, the cleavage site located between the DNA encoding the TolAIII domain or a functional homologue, fragment, or derivative thereof and the cloning site. The cloning site may comprise at least one restriction endonuclease (for example, BamHI and/or KpnI) target sequence. The cloning vector may further comprise DNA encoding an affinity purification tag as defined above. The cloning vector may further comprise an inducible promoter (for example, the IPTG-inducible T7 promoter) and/or DNA encoding an antibiotic resistance marker (for example, the bla gene, which confers resistance to ampicillin).
For example, the cloning vector may have the structure of pTolE, pTolT or pTolX (as shown in Figure 2 with reference to the description).
Also provided is the use of the TolAIII domain or functional homologue, fragment, or derivative thereof for production of a fusion polypeptide as defined above.
Further provided is the use of the TolAIII domain or functional homologue, fragment, or derivative thereof for production of the DNA molecule as defined above.
Yet further provided is the use of the TolAIII domain or functional homologue, fragment, or derivative thereof for production of an expression vector as defined above.
Also provided is the use of the TolAIII domain or functional homologue, fragment, or derivative thereof for production of a cloning vector as defined above.
In one aspect there is provided a host cell containing the DNA as defined above and/or the expression vector as defined above and/or the cloning vector as defined above.
In another aspect there is provided the use of the fusion polypeptide as defined above for immobilisation of the non-TolA polypeptide, comprising the step of: binding the fusion polypeptide to a TolA binding polypeptide (e.g. the TolA-recognition site of colicin N [Gokce et al., 2000, supra] or other colicins, the TolA binding region of bacteriophage g3p-D1 protein [Riechmann & Holliger, 1997, Cell 90: 351-360], or the TolA binding region of TolB or other Tol proteins).
It is known that TolAIII interacts specifically with several naturally occurring proteins such as colicins, phage proteins and other Tol proteins. This range of existing binding partners makes the overexpression of TolAIII fusion proteins of particular utility since these proteins may be used in purification or immobilisation technologies. The TolAIII domain therefore not only drives high expression of the fusion polypeptide but also provides an affinity tag for purification, immobilisation or analysis of the fusion polypeptide. The TolAIII binding proteins (or binding polypeptide domains thereof) could be used to provide binding sites for the TolAIII fusions (as in Figure 6). Protein chips could be made using these TolAIII binding proteins, which then bind the TolAIII fusion proteins. This provides a way to immobilise a wide variety of proteins on a surface using the TolAIII fusion as the common interaction.

Alternatively, the fusion polypeptide comprising an affinity tag as defined above may be used for immobilisation of the non-TolA polypeptide, comprising the step of: binding the affinity tag of the fusion polypeptide to a binding moiety.
Also provided is the use of the fusion polypeptide as defined above for purification and isolation of the non-TolA polypeptide, comprising the steps of:
(i) binding the fusion polypeptide to a TolA binding polypeptide (e.g. the TolA-recognition site of colicin N or other colicins, the TolA binding region of bacteriophage g3p-D1 protein, or the TolA binding region of TolB or other Tol proteins);
(ii) cleaving the non-TolA polypeptide from the TolAIII domain or functional homologue, fragment, or derivative thereof using an endopeptidase; and
(iii) separating the cleaved non-TolA polypeptide from the TolAIII domain or functional homologue, fragment, or derivative thereof.
In an alternative embodiment, the fusion polypeptide comprising an affinity tag may be used for purification and isolation of the non-TolA polypeptide, comprising the steps of:
(i) binding the affinity tag of the fusion polypeptide to a binding moiety;
(ii) cleaving the non-TolA polypeptide from the TolAIII domain or functional homologue, fragment, or derivative thereof using an endopeptidase; and
(iii) separating the cleaved non-TolA polypeptide from the TolAIII domain or functional homologue, fragment, or derivative thereof.
The fusion polypeptide as disclosed herein may be used for studying interaction properties of the non-TolA polypeptide or the fusion polypeptide, for example self-interaction, interaction with another molecule, or interaction with a physical stimulus.
Also provided is a method for high expression of a polypeptide as a fusion polypeptide in a host cell, comprising the step of expressing the polypeptide as a fusion polypeptide as defined above in a host cell. Levels of expression of a polypeptide as a fusion protein defined herein will be high relative to levels of expression of a polypeptide not linked to the TolAIII domain.

The invention will be further described with reference to the accompanying figures. Of the figures:
Figure 1: (Prior art) Shows the structure and sequence of the third domain of TolA. The model is from the crystal structure of the complex between TolAIII and the N1 domain of the minor coat gene 3 protein from filamentous bacteriophage (Holliger et al., 1999, supra). The disulphide bond is labelled black. Residues 333-421 were resolved in the model;
Figure 2: Shows pTol expression vectors. pTol vectors are T7 based expression vectors derived from pETSc. The tagged TolAIII region, depicted generically in the middle panel sequence (SEQ ID NO: 16), is inserted in between XhoI and MluI sites. A His6-Ser2 linker (SEQ ID NO: 17) precedes the TolA gene for domain III, coding for TolA amino acids 329-421 (SEQ ID NO: 13). A short flexible part (Gly-Gly-Gly-Ser, SEQ ID NO: 18) then follows, and then the cleavage site for endopeptidases, composed of four or five amino acids (denoted by X in the middle panel and underlined in the bottom panel). The bottom panel shows the DNA sequences (SEQ ID NOs: 19-21, respectively) and encoded amino acid residues (SEQ ID NOs: 22-24, respectively) of the cleavage/cloning site of the tagged TolAIII region of pTolE, pTolT and pTolX. The cleavage site is denoted by an arrow. Stop codons are shown as asterisks;
Figure 3: Characterization of TolAIII expression. A: SDS-PAGE of TolAIII expressed using three different vectors. Lane 1, pTolT uninduced; lane 2, pTolX; lane 3, pTolE; lane 4, pTolT. B: Growth curve of bacteria with pTolT. Uninduced (solid squares) sample, induced (open squares) sample. 1 mM IPTG was added to the induced sample at the time denoted by an arrow. C: SDS-PAGE of fractionation of bacteria after expression of TolAIII from pTolT. Lane 1, uninduced sample; lane 2, induced bacteria; lane 3, periplasmic fraction; lane 4, cytoplasmic fraction; lane 5, insoluble (membrane + inclusion bodies) fraction. M, molecular weight marker;
Figure 4: Expression of different proteins in E. coli using the pTol system. A: Expression of fusions of TolAIII with prokaryotic proteins. Lane 1, colicin N 40-76; lane 2, Δ10 T-domain colicin N; lane 3, R-domain colicin N. The bottom panel presents an estimate of the proportion of expressed protein in bacterial cells as determined from scanned gels with the software package Tina. Values reported represent the average of estimates from 5-11 colonies ± SD. B: Expression of fusions of TolAIII with eukaryotic proteins. Lane 1, PDK2; lane 2, NBD1 domain; lane 3, EqtII; lane 4, PLA2. Values in the bottom panel are averages of estimates from 4-8 colonies ± SD. C: Expression of fusions of TolAIII with membrane proteins. Lane 1, uninduced pTolT; lane 2, induced BcrC; lane 3, induced TM1. The position where expressed BcrC and TM1 should appear on the gel is denoted by an asterisk and circle, respectively. M, molecular weight marker; C, control of bacterial cells from uninduced sample of pTolT;
Figure 5: Purification of R-domain of colicin N. Lane 1, uninduced cells containing pTolT-Rdomain vector; lane 2, induced cells; lane 3, bacterial cytoplasmic fraction; lane 4, flowthrough of Ni-NTA chromatography; lane 5, purified fusion TolT-Rdomain proteins; lane 6, purified R domain after cleavage and ion-exchange chromatography;
Figure 6: Depicts diagrammatically various uses of a His6-tagged fusion protein. (I) A TolAIII ("Tol") fusion partner (depicted as an oval) with a His6 (H6) affinity tag (depicted as a rectangle) is attached to a non-TolAIII polypeptide (depicted as a circle). (II) To obtain purified non-TolAIII polypeptide, it may be removed from the fusion protein by endopeptidase cleavage (depicted as a lightning bolt) and purified. For interaction studies and the creation of protein arrays, the fusion protein may be immobilised in a variety of ways, e.g. to a Nickel Chelate substrate via the His6 tag or (III) (as shown) using an immobilised tag made from all or part of a recognised TolAIII binding protein from bacteria or phage, allowing the non-TolAIII polypeptide (or the entire fusion) to be available for interaction studies. The interaction between the non-TolAIII polypeptide and a molecule that recognises it (protein, DNA, carbohydrate, lipid etc.) is shown in (IV). The partner is shown as a half circle;
Figure 7: Shows a circular plasmid map of a construct used to produce a TolAIII and BCL-XL fusion polypeptide;
Figure 8: Shows an SDS-PAGE of expressed TolAIII-BCLXL fusion protein. Lane 1, whole cell pellet; lane 2, supernatant after ultracentrifugation; lane 3, column wash with resuspension buffer; lane 4, wash with 50 mM imidazole; lane 5, molecular weight marker; lane 6, elution with 300 mM imidazole; and
Figure 9: Shows an SDS-PAGE of thrombin-cleaved TolAIII-BCLXL fusion protein. Lane 1, whole fusion protein; lanes 2 and 4, fusion protein after thrombin cleavage; lane 3, molecular weight marker; lane 5, flow through the column; lane 6, wash; lane 7, wash with 2 M NaCl; lane 8, elution with 300 mM imidazole.
EXPERIMENTAL
In our laboratory we first prepared fusion proteins between domain III of periplasmic TolA protein (TolAIII) and the T domain of colicin N. Huge amounts of fusion protein were isolated when TolAIII was at the N-terminus and the T-domain at the C-terminus. On the other hand, when the colicin N domain was the N-terminal partner no expression of fusion protein was obtained.
Here we describe cloning of pTol vectors that use TolAIII as a fusion partner at the N-terminal part of the expressed fusion protein. We show that levels of expression of various fusion proteins are around 20 % of total bacterial proteins and we were able to purify 50-90 mg of fusions per litre of bacterial broth. We prepared different components of colicin N by the use of this system.
In Example 1, several proteins were expressed using the system. These were different parts and domains of colicin N (TolA binding box (peptide of amino acids 40-76), deletion mutant of T-domain (Δ10) and R domain), representing prokaryotic proteins. Human phospholipase A2, the pore-forming protein from sea anemone equinatoxin II, nucleotide binding domain 1 (NBD1) of human cystic fibrosis transmembrane conductance regulator (CFTR) and human mitochondrial pyruvate dehydrogenase kinase 2 (PDK2) were examples of eukaryotic proteins. Transmembrane proteins were represented by BcrC, a component of the bacitracin resistance system from Bacillus licheniformis, and transmembrane domain 1 (TM1) of human CFTR. The expression of BCL-XL, an important protein in apoptosis and cancer research, as a TolAIII fusion polypeptide is shown in Example 2.
For Example 1, in all cases except for the two membrane proteins, the yields of fusion protein were higher than those of the individual proteins. The expression of small peptides and soluble proteins was consistently good. More difficult targets were also chosen. The membrane proteins did not express at all. The human PLA2, PDK2 and equinatoxin expressed well but, as in the case of the individual proteins, much ends up in the insoluble fraction. PLA2 has many SS bonds and PDK2 has consistently resisted soluble expression in other systems. The TolAIII was not able to overcome the insoluble behaviour of these fusion partners but their recovery from inclusion bodies is still possible. In Example 2, large amounts of BCL-XL were expressed.
MATERIALS AND METHODS Example 1 :
Cloning of pTol vectors:
The original vector used in cloning was a derivative of pETSc (Novagen) termed pETSc. The pETSc vector was constructed by adding to the pETSc vector nucleotides encoding methionine followed by six histidine and two serine residues downstream of the cloning site (Politou, A.S. et al., 1994, Biochemistry 33(15): 4730-4737). The pETSc vector was used for expression of a fusion between domain III of TolA protein (amino acids 329-421; SEQ ID NO: 13) and the T domain of colicin N. It is a T7 based expression vector with the bla gene, providing ampicillin selection. The fusion protein contains a methionine followed by six histidines and two serines at the N-terminal part. This linker enables easy purification using Ni-chelate affinity chromatography. The fusion partners were linked together via a BamHI site. The C-terminal end of the fusion was cloned via an MluI site. The T-domain gene was removed from the vector by restricting it with BamHI and MluI. An adaptor sequence was then ligated into the vector. It was composed in such a way that it removed the BamHI site within the flexible linker, but introduced a new BamHI site just after the cleavage sequence for endopeptidases (Figure 2). In this way fused partners can be cloned in a pTol vector via the BamHI or KpnI site, leaving a tag of two (Gly-Ser, SEQ ID NO: 25) or four (Gly-Ser-Gly-Thr; SEQ ID NO: 26) amino acids, respectively, at the N-terminus (see Figure 2).
The linker between TolAIII and the fused partner is, therefore, composed of a flexible part (Gly-Gly-Gly-Ser; SEQ ID NO: 18) and a cleavage sequence for endopeptidases (enterokinase, factor Xa or thrombin) (Figure 2). Oligonucleotides with the following sequences (all oligonucleotides from MWG Biotech) were used as adaptors:
E(+) (5'-GATCTGATGATGACGATAAAGGATCCGGTACCTGATGAA-3'; SEQ ID NO: 27) and
E(-) (5'-CGCGTTCATCAGGTACCGGATCCTTTATCGTCATCATCA-3'; SEQ ID NO: 28) for enterokinase;
X(+) (5'-GATCTATTGAAGGTCGCGGATCCGGTACCTGATGAA-3'; SEQ ID NO: 29) and
X(-) (5'-CGCGTTCATCAGGTACCGGATCCGCGACCTTCAATA-3'; SEQ ID NO: 30) for factor Xa;
T(+) (5'-GATCTCTGGTTCCGCGCGGATCCGGTACCTGATGAA-3'; SEQ ID NO: 31) and
T(-) (5'-CGCGTTCATCAGGTACCGGATCCGCGCGGAACCAGA-3'; SEQ ID NO: 32) for thrombin cleavage sites.
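The adaptor design described above can be checked with a short Python sketch. It tests that each (+)/(-) pair anneals to leave four-nucleotide 5' overhangs (GATC and CGCG) and translates the (+) strand in the frame that results when a single upstream G, taken here to be contributed by the BamHI-cut vector, precedes the adaptor. That upstream G, the helper names and the choice of the standard genetic code are assumptions made for illustration.

```python
# Adaptor oligonucleotides as listed in the text (SEQ ID NOs: 27-32).
ADAPTORS = {
    "enterokinase": ("GATCTGATGATGACGATAAAGGATCCGGTACCTGATGAA",
                     "CGCGTTCATCAGGTACCGGATCCTTTATCGTCATCATCA"),
    "factor Xa": ("GATCTATTGAAGGTCGCGGATCCGGTACCTGATGAA",
                  "CGCGTTCATCAGGTACCGGATCCGCGACCTTCAATA"),
    "thrombin": ("GATCTCTGGTTCCGCGCGGATCCGGTACCTGATGAA",
                 "CGCGTTCATCAGGTACCGGATCCGCGCGGAACCAGA"),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def anneals_with_overhangs(plus, minus, overhang=4):
    """True if the strands pair everywhere except a 5' overhang on each."""
    return plus[overhang:] == reverse_complement(minus)[:-overhang]


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def translate_adaptor(plus, upstream="G"):
    """Translate the (+) strand in frame with an assumed upstream base."""
    return translate(upstream + plus)


if __name__ == "__main__":
    for enzyme, (plus, minus) in ADAPTORS.items():
        print(enzyme, anneals_with_overhangs(plus, minus), translate_adaptor(plus))
```

Under these assumptions each adaptor reads Gly-Ser, then the cleavage motif (DDDDK, IEGR or LVPR), then Gly-Ser-Gly-Thr and two stop codons, and the junction formed with the upstream G reads GGATCT rather than GGATCC, consistent with the statement that the original BamHI site is removed while a new one is introduced after the cleavage sequence.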
Newly cloned vectors were named pTolE, pTolX and pTolT, and they comprise cleavage sequences for enterokinase, factor Xa, and thrombin, respectively. Fusion partners used to test the system were cloned into the pTol vectors via BamHI and MluI sites. If the nucleic acid sequence coding for a particular protein contained an internal BamHI site, a KpnI site was used instead. Nine different proteins were used to test the system (Table 1). Coding sequences were amplified by PCR. The reaction mixture contained (in 100 µl total volume): 10 µl of 10 X reaction buffer supplied by the producer, 2 µl of 100 mM MgSO4, 4 µl of dNTP mix (200 µM final concentration), 100 pmol of each oligonucleotide, approximately 20 ng of target DNA and 1 Unit of Vent DNA polymerase (New England BioLabs). Target DNA was obtained either from DNA cloned into plasmids (e.g. colicin sequences were from the plasmid pCHAP4 [Pugsley, A.P., 1984, Mol. Microbiol. 1: 317-325], equinatoxin sequences were from an equinatoxin-containing plasmid described in Anderluh G. et al., 1996, Biochem. Biophys. Res. Commun. 220: 437-42, and BcrC sequences were from a BcrC-containing plasmid described in Podlesek, Z. et al., 1995, Mol. Microbiol. 16: 969-976) or via direct PCR or RT-PCR from the host organism. The resulting DNA was sequenced after cloning into pTol to ensure that it corresponded precisely to the section of the published sequence shown in the table. Typically the following cycles were used: 10 min at 97°C; 30 cycles, each composed of 2 min of denaturation at 97°C, 1 min of annealing at 58°C and 1 min of extension at 72°C; 7 min at 72°C and soak at 10°C. PCR fragments were purified using commercial kits (Qiagen) and restricted by appropriate restriction endonucleases. Restricted fragments were cloned into pre-cleaved pTol vector. The correct nucleotide sequence of the fusion protein was verified by sequencing.
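As a worked check of the reaction mixture and cycling programme described above, the following Python sketch computes the final concentrations implied by the stated volumes and the total programmed hold time. The function names are hypothetical; the dNTP stock concentration is not given in the text and is back-calculated here from the stated final concentration, and ramp times between temperatures are ignored.

```python
TOTAL_UL = 100.0  # total reaction volume stated in the text


def final_concentration(stock_conc, added_ul, total_ul=TOTAL_UL):
    """Dilution rule C_final = C_stock * V_added / V_total (same unit as stock)."""
    return stock_conc * added_ul / total_ul


def implied_stock(final_conc, added_ul, total_ul=TOTAL_UL):
    """Stock concentration that gives `final_conc` after dilution."""
    return final_conc * total_ul / added_ul


# 2 µl of 100 mM MgSO4 in 100 µl.
mgso4_mM = final_concentration(100.0, 2.0)

# 10 µl of 10 X buffer in 100 µl.
buffer_x = final_concentration(10.0, 10.0)

# 100 pmol of each oligonucleotide in 100 µl; pmol/µl is numerically µM.
primer_uM = 100.0 / TOTAL_UL

# Back-calculated dNTP stock (µM) for 4 µl giving 200 µM final; an inference,
# not a value stated in the text.
dntp_stock_uM = implied_stock(200.0, 4.0)


def programme_minutes(initial=10, cycles=30, steps=(2, 1, 1), final=7):
    """Sum of programmed hold times, excluding ramping and the final soak."""
    return initial + cycles * sum(steps) + final


if __name__ == "__main__":
    print(f"MgSO4: {mgso4_mM:.1f} mM, buffer: {buffer_x:.1f} X")
    print(f"each primer: {primer_uM:.1f} µM, implied dNTP stock: {dntp_stock_uM:.0f} µM")
    print(f"programmed time: {programme_minutes()} min")
```

The arithmetic gives 2 mM added MgSO4, 1 µM of each primer and 137 min of programmed holds; any magnesium already present in the supplied buffer is not accounted for.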
Table 1: Proteins used to test pTol fusion expression system:

(Table Remove)
a Mr of fusion protein calculated from the sequence.
b Restriction site used for cloning at the N-terminal part of the fusion protein. In all cases the C-terminal site used was MluI.
c RefSeq accession number.
d Oligonucleotides to amplify the desired proteins were of the following sequences (all 5'-3'; see Table 1):
1. TTTTTGGATCCAATTCCAATGGATGGTCATGGAG (SEQ ID NO: 42)
2. AAGGATCCAAGCTTCAAGGTTTAGGCTTTGAATTATTGTCC (SEQ ID NO: 43)
3. TTTTTGGATCCAATGCTTTTGGTGGAGGGAAAAATC (SEQ ID NO: 44)
4. CTCAGCGGTGGCAGCAGCC (SEQ ID NO: 45)
5. CGCGGATCCCATGGGGACAATAATTCAAAGC (SEQ ID NO: 46)
6. GGCGAATTCACGCGTTAAAATAATAATTTCTGGCTCAC (SEQ ID NO: 47)
7. CCGGGGTACCAATTTGGTGAATTTCCACAGAATGATC (SEQ ID NO: 48)
8. GGCGAATTCACGCGTTAGCAACGAGGGGTGCTCCC (SEQ ID NO: 49)
9. CGCGGATCCGCAGACGTGGCTGGCGCC (SEQ ID NO: 50)
10. GGCGAATTCACGCGTTAAGCTTTGCTCACGTGAGTTTC (SEQ ID NO: 51)
11. CGCGGATCCTCTAATGGTGATGACAGCCTC (SEQ ID NO: 52)
12. GGCGAATTCACGCGTTAGAAAGAATCACATCCCATGAG (SEQ ID NO: 53)
13. CCGGGGTACCAAGTACATAGAGCACTTCAGCAAGTTC (SEQ ID NO: 54)
14. GGCGAATTCACGCGTTACGTGACGCGGTACGTGGTCG (SEQ ID NO: 55)
15. CGCGGATCCTTTTCAGAATTAAATATTGATG (SEQ ID NO: 56)
16. GGCGAATTCACGCGTTAAAAGTTCTTCGATTTATCG (SEQ ID NO: 57)
17. CGCGGATCCCAGAGGTCGCCTCTGG (SEQ ID NO: 58)
18. GGCGAATTCACGCGTTAGGGAAATTGCCGAGTGAC (SEQ ID NO: 59)
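The cloning sites carried by the primers listed above can be inspected with a small Python sketch. The recognition sequences used (GGATCC for BamHI, GGTACC for KpnI, ACGCGT for MluI) are the standard ones for these enzymes; the function name and the selection of primers shown are illustrative only. All three sites are palindromic, so a plain substring search on the primer as written is sufficient.

```python
# Standard recognition sequences of the enzymes used for cloning in the text.
SITES = {"BamHI": "GGATCC", "KpnI": "GGTACC", "MluI": "ACGCGT"}

# A few of the primers listed above, keyed by their number in the list.
PRIMERS = {
    1: "TTTTTGGATCCAATTCCAATGGATGGTCATGGAG",
    4: "CTCAGCGGTGGCAGCAGCC",
    6: "GGCGAATTCACGCGTTAAAATAATAATTTCTGGCTCAC",
    7: "CCGGGGTACCAATTTGGTGAATTTCCACAGAATGATC",
}


def sites_in(primer):
    """Return the names of the cloning sites present in a primer, sorted."""
    primer = primer.upper()  # tolerate lower-case letters in the listing
    return sorted(name for name, site in SITES.items() if site in primer)


if __name__ == "__main__":
    for number, primer in PRIMERS.items():
        print(number, sites_in(primer) or "no cloning site found")
```

A primer without any of the three sites, such as number 4 in the list, is reported as such rather than treated as an error, since the text does not say that every oligonucleotide carries a cloning site.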
Expression of proteins in E. coli
All proteins were expressed in an E. coli BL21(DE3)pLysE strain (from Novagen). The strain was transformed with plasmid and grown on LB plates with appropriate selection (Ampicillin, Chloramphenicol). One colony was used to inoculate 5 ml of LBAC medium (Ampicillin at 100 µg/ml, Chloramphenicol at 34 µg/ml, both from SIGMA). Bacteria were grown on a rotating wheel at 37°C. After 60 min the expression of recombinant proteins was induced by an addition of 1 mM (final) IPTG and bacteria were grown for an additional 4 h. Small samples (corresponding to a volume of bacteria which when resuspended in 1 ml yields A^xH^S) were analysed on SDS-PAGE. Gels were stained with Coomassie and scanned at 600 dpi using a commercial scanner. The amount of expressed proteins was estimated from the gels using the program Tina 2.0. For large-scale expression, 5 ml of bacterial culture in stationary phase was used to inoculate 250 ml of LBAC medium and grown at 37°C in an orbital shaker at 180 rpm overnight. The next morning 20-25 ml of overnight culture was used to inoculate 500 ml of M9 LBAC medium. In total 3-5 l of bacterial culture were grown for a single protein. Bacteria were grown at the same conditions until A600 reached approximately 0.8. Then the production of recombinant proteins was induced by adding IPTG to a final concentration of 1 mM. Bacteria were grown for an additional 4-5 h, centrifuged for 5 min at 5000 rpm at 4°C, and stored at -20°C.
Isolation of proteins from bacteria
Pelleted bacteria were resuspended (2 ml of buffer / g of cells) in 50 mM NaH2PO4, pH 8.0, 300 mM NaCl, 10 mM imidazole, 20 mM β-mercaptoethanol (buffer A), with the following enzymes and inhibitors of proteases (final concentrations): DNase (10 µg/ml), RNase (20 µg/ml), lysozyme (1 mg/ml of buffer), PMSF (0.5 mM), benzamidine (1 mM). They were incubated on ice for an hour and occasionally vigorously shaken. The resuspended bacteria were sonicated for 3 min with a Branson sonicator and then centrifuged in a Beckman ultra-centrifuge at 40000 rpm, 4°C in a 45 Ti rotor. Supernatant was removed and placed at 4°C. The pellet was resuspended in the same buffer without enzymes and inhibitors (1 ml / g of weight) and kept on ice for 15 min. Centrifugation at the same conditions followed after an additional 1 min of sonication. Supernatants from both centrifugations were merged and applied at approximately 1 ml/min to 1-3 ml of Ni-NTA resin (Qiagen) equilibrated with buffer A. Typically, the column with bound protein was washed with two fractions of 3 ml of buffer A, two fractions of buffer A with 20 mM imidazole and 6-10 fractions of buffer A with 300 mM imidazole. Fractions were analysed on SDS-PAGE. Fractions of interest were pooled and dialysed three times against water (5 l) at 4°C. Purity was checked by SDS-PAGE. Proteins were stored at 4°C in 3 mM NaN3. Protein concentration was determined by using extinction coefficients calculated from the sequence.
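The text does not state how the extinction coefficients were calculated from the sequence. One widely used approach, shown in the Python sketch below purely as an assumption, sums per-residue contributions at 280 nm (5500 M⁻¹cm⁻¹ per Trp, 1490 per Tyr and 125 per disulphide bond, the values commonly attributed to Pace et al.) and then applies the Beer-Lambert law. The function names and the example inputs are hypothetical.

```python
# Assumed molar absorbance contributions at 280 nm (M^-1 cm^-1).
EPSILON_TRP = 5500
EPSILON_TYR = 1490
EPSILON_CYSTINE = 125


def extinction_coefficient(sequence, cystines=0):
    """Estimate the molar extinction coefficient at 280 nm from a sequence.

    `cystines` is the number of disulphide bonds, which must be supplied
    by the user because it cannot be read from the sequence alone.
    """
    seq = sequence.upper()
    return (EPSILON_TRP * seq.count("W")
            + EPSILON_TYR * seq.count("Y")
            + EPSILON_CYSTINE * cystines)


def concentration_mg_per_ml(a280, sequence, molar_mass_da, cystines=0, path_cm=1.0):
    """Beer-Lambert: c (mol/l) = A / (epsilon * l); multiply by Mr for g/l = mg/ml."""
    epsilon = extinction_coefficient(sequence, cystines)
    if epsilon == 0:
        raise ValueError("no Trp, Tyr or cystine: A280 cannot be used")
    return a280 / (epsilon * path_cm) * molar_mass_da


if __name__ == "__main__":
    toy_sequence = "MWWYC"  # hypothetical sequence, for illustration only
    print(extinction_coefficient(toy_sequence))
    print(concentration_mg_per_ml(1.249, toy_sequence, molar_mass_da=10000.0))
```

Because the TolAIII domain is described as containing one disulphide bond, a calculation for an intact fusion would presumably pass at least `cystines=1`, although the contribution of that term is small.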
Fractionation of bacterial proteins
All bacterial proteins were fractionated in order to see the amount of insoluble expressed proteins. Pelleted bacteria from 100 ml of broth were resuspended in 40 ml of 20 % sucrose, 1 mM EDTA, 30 mM Tris-HCl, pH 8.0 and incubated for 10 min at room temperature. They were centrifuged at 9000 g for 10 min at 4°C. Supernatant was removed and the pellet was gently resuspended in 8 ml of ice-cold 5 mM MgSO4. Bacteria were gently shaken and incubated on ice for 10 min. Bacterial protoplasts were centrifuged again at the same conditions. Supernatant was removed as the periplasmic fraction. The pellet was resuspended in 10 ml of 20 mM NaH2PO4, pH 8.0, with 1 mg of lysozyme and benzamidine. It was shaken vigorously and incubated on ice for 30 min, and finally sonicated 5 x 30 s. Cytoplasmic proteins were removed from insoluble material by centrifugation at 35 000 g at 4°C for 30 min. Supernatant was removed as the cytoplasmic fraction and the pellet was resuspended in 2 ml of 8 M urea, 10 mM Tris-HCl, pH 7.4, 0.5 % Triton X-100 as the insoluble fraction (membrane proteins and putative inclusion bodies).
Cleavage and purification of TolAIII-R-domain colicin N fusion
Pure R-domain of colicin N was produced using the pTol expression system. 45 mg of TolAIII-R-domain was incubated in 35 ml of cleavage mixture at 20°C for 20 h. The cleavage mixture contained buffer as specified by the producer and thrombin (Restriction grade, Novagen) at 0.1 U/mg of fused protein. Cleaved products were dialysed three times against 5 l of 40 mM Tris-HCl, pH 8.4 at 4°C, each time for at least 4 h. Cleaved R-domain was separated from TolAIII and uncleaved fusion protein by ion-exchange chromatography on an FPLC system (Pharmacia). Proteins were applied to a Mono S column (Pharmacia) at 1 ml/min in 40 mM Tris-HCl, pH 8.4. After unbound material was washed from the column, R-domain was eluted by applying a gradient of NaCl from 0 to 500 mM in the same buffer over 30 min. A large peak at approximately 70 % of the NaCl gradient (approx. 350 mM) was collected and checked for purity by SDS-PAGE.
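The relation between gradient position and salt concentration quoted above (70 % of a 0 to 500 mM NaCl gradient, approx. 350 mM) can be illustrated with a short Python sketch. It assumes a strictly linear gradient and ignores column dead volume; the function names are illustrative only and do not appear in the original method.

```python
def gradient_concentration(fraction, start_mm=0.0, end_mm=500.0):
    """NaCl concentration (mM) at a given fraction (0-1) of a linear gradient."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return start_mm + fraction * (end_mm - start_mm)


def gradient_time(fraction, duration_min=30.0):
    """Minutes elapsed since the start of the gradient at a given fraction."""
    return fraction * duration_min


# Elution position reported for the R-domain peak (assumed linear gradient).
peak_mm = gradient_concentration(0.70)
peak_min = gradient_time(0.70)
print(f"{peak_mm:.0f} mM NaCl, about {peak_min:.0f} min into the gradient")
```

Under these assumptions the peak would be reached roughly two thirds of the way through the 30 min gradient.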
Example 2:
Cloning of pTol vector
A DNA fragment encoding BCL-XL was amplified by PCR from the plasmid pETBCLXL using the oligonucleotides SenseBCL-STU (5' - TTT TTT AGG CCT TCT CAG AGC AAC CGG GAG - 3'; SEQ ID NO: 60) and Mlu-BCL-Rev (5' - TTT TAG GCG TTC ATT TCC GAC TGA AGA G - 3'; SEQ ID NO: 61). BCL-XL was introduced into the pTOLT plasmid using the Stu I and Mlu I restriction sites. The final plasmid was named pTOLT-BCLXL (Figure 7), and DNA sequencing of this plasmid showed that the BCL-XL encoding DNA fragment was correctly inserted.
Protein purification
BCL-XL protein was expressed in an E. coli BL21 DE3 (pLysE) strain. The strain was transformed with the plasmid and grown on LB plates with ampicillin (200 µg/ml) and chloramphenicol (35 µg/ml) selection. 5 ml of LB medium with antibiotics was inoculated with a single colony and grown overnight at 37 °C. The 5 ml overnight culture was introduced into 500 ml of LB medium containing ampicillin and chloramphenicol in 2 litre flasks. Bacteria were grown until OD600 reached 0.8, induced by addition of IPTG to a final concentration of 1 mM and then grown for an additional 3 hours. Cells were harvested and resuspended in 20 mM phosphate, 300 mM NaCl, pH 8.0 buffer containing RNase, DNase, PMSF (1 mM) and benzamidine (1 mM). The cells were lysed by French press and the supernatant was obtained by ultracentrifugation at 40 000 rpm for 1 h. The N-terminal 6X Histidine-tag (SEQ ID NO: 8) facilitated purification of the Tol-BCL fusion by means of a Ni-NTA affinity column. The fusion protein was loaded onto the column in 20 mM phosphate, 300 mM NaCl, pH 8.0 buffer, additionally washed with the same buffer containing 50 mM imidazole and eluted in 300 mM imidazole, pH 7.0. The expression of the fusion protein was analysed by SDS-PAGE (Figure 8) and the protein concentration was determined by UV absorption at 280 nm.
Thrombin cleavage of the BCL-XL protein
20 mg of TolA-BCL fusion was incubated in 20 ml of cleavage buffer at 4 °C for 4 h. The cleavage buffer contained 50 mM Tris-HCl, 150 mM NaCl, 2.5 mM CaCl2, 5 mM DTT and thrombin (1 unit of thrombin (Sigma) per mg of fused protein). The released protein was recovered by applying the cleavage mixture, dialysed overnight, to a Ni-NTA column. After unbound protein was washed from the column, the remaining BCL-XL protein was washed off with 2 M NaCl. All flow-through and washes were collected and analysed by SDS-PAGE (Figure 9). The protein yields were calculated after thrombin cleavage using UV absorbance at 280 nm.
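Converting an absorbance reading at 280 nm into a protein concentration follows the Beer-Lambert law. The Python sketch below illustrates the arithmetic, using the extinction coefficient (47510 M-1 cm-1 at 280 nm) and molecular weight (26329.2) given for the cleaved BCL-XL protein in the ProtParam output of the Results. The absorbance reading of 0.90, the 1 cm path length and the function names are assumptions made for illustration, not values from the original work.

```python
def molar_concentration(a280, ext_coeff, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), in mol/l."""
    return a280 / (ext_coeff * path_cm)


def mass_concentration(a280, ext_coeff, mol_weight, path_cm=1.0):
    """Concentration in g/l (numerically equal to mg/ml)."""
    return molar_concentration(a280, ext_coeff, path_cm) * mol_weight


# Values for cleaved BCL-XL taken from the ProtParam output in the Results.
EXT_280 = 47510.0      # M-1 cm-1
MOL_WEIGHT = 26329.2   # g/mol

a280 = 0.90            # hypothetical reading, for illustration only
conc = mass_concentration(a280, EXT_280, MOL_WEIGHT)
print(f"A280 = {a280:.2f} corresponds to about {conc:.2f} mg/ml")
```

Multiplying the resulting concentration by the pooled volume would give the total yield in mg.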
RESULTS
Expression of TolAIII protein in E. coli
In Example 1, the third domain of TolA (TolAIII), with tags (Figure 2), was expressed from three different expression vectors (Figure 3), pTolE, pTolT, and pTolX. In each case, the expression of TolAIII was very high, sometimes reaching up to 40 % of all bacterial proteins (see Figure 3A). Specifically, the amount of expressed TolAIII from pTolT was 26.96 % ± 1.67 (n=5). The amount of expressed TolAIII was approximately the same regardless of which vector was used. TolA expressed in bacteria did not interfere with normal bacterial metabolism. The growth curve was very similar for induced and non-induced bacteria (Figure 3B). All of the TolAIII protein was expressed in soluble form. No inclusion bodies were revealed by visual inspection of pelleted remains of bacteria after osmotic lysis, lysozyme treatment, sonication, and centrifugation. Furthermore, none of the TolAIII was found in the insoluble cell fraction after fractionation of proteins from bacteria. The insoluble fraction represents membrane proteins and should also contain recombinant proteins in inclusion bodies (Figure 3C). Bacteria containing TolAIII were slightly more fragile than normal. TolAIII was released from the cells after even a mild hypo-osmotic treatment, which should release only periplasmic proteins.
Expression of other proteins in E. coli as fusions with TolAIII
Ten proteins were tested in order to check the suitability of the pTol expression system for expression and preparation of other proteins (see Example 1, Table 1, and Example 2). These included different parts and domains of colicin N (TolA binding box (peptide of amino acids 40-76), deletion mutant of T-domain (Δ10) and R-domain), representing prokaryotic proteins. Human phospholipase A2 (PLA2), the pore-forming protein from sea anemone equinatoxin II (EqtII), nucleotide binding domain 1 (NBD1) of human cystic fibrosis transmembrane conductance regulator (CFTR), human mitochondrial pyruvate dehydrogenase kinase 2 (PDK2) and BCL-XL were examples of eukaryotic proteins. Transmembrane proteins were represented by BcrC, a component of the bacitracin resistance system from B. licheniformis, and transmembrane domain 1 (TM1) of human CFTR. The proteins chosen represent variations in size (approx. 4.4 kDa for colicin 40-76 vs. 44 kDa for PDK2), genetic code (prokaryotic vs. eukaryotic proteins), protein location (soluble vs. membrane), and disulphide content (PLA2, 7 disulphides vs. equinatoxin, none). Fusion proteins were expressed at a high proportion in E. coli using the pTol system (Figure 4). Again, the expression was as high as 40 % in some cases, but the average was around 20-25 % (see Figure 4B and C bottom panels). The only two exceptions were the membrane proteins, BcrC and TM1. In these cases a band corresponding to their size was lacking from the gel (Figure 4C). As opposed to expression of TolAIII alone, expression of fusion proteins interferes with the growth of bacteria. In the case of PLA2 and the membrane proteins, TM1 and BcrC, the amount of bacteria at the end of the growth halved in some cases. Interestingly, expression of the PDK2 fusion in the bacterial cell had a positive effect, and there were always slightly more bacteria at the end of the growth (not shown). Some of the bacteria expressing fusions were further fractionated. PDK2 and PLA2 were expressed as insoluble inclusion bodies. EqtII and R-domain were found mainly in the insoluble fraction, but some proportion was also found in the cytoplasmic fraction (10-25 % of expressed proteins) (not shown).
Isolation and cleavage of fusion proteins
In Example 1, expressed fusions were isolated from the cytoplasm by simple extraction into buffered solution, which was applied onto a Ni-NTA column. After this single step, proteins were already more than 95 % pure (Figure 5). Yields of isolated fusions were on average approximately 50 mg/l of bacterial broth, but reached up to 90 mg/l (Table 2). Even proteins which were mainly expressed as inclusion bodies were isolated in significant quantities by this procedure, e.g. 11 mg/l of EqtII fusion was isolated. One of the fusion proteins, TolE-Tdomain 40-76, was used for the preparation of a peptide sample suitable for structure determination by NMR. It was expressed in M9 minimal media containing 15NH4Cl. Even in minimal media it was possible to express and produce the fusion in significant amounts: almost 70 mg of pure fusion was obtained per litre of bacterial culture.
Table 2: Yields of isolated fusion proteins by using pTol system
(Table Remove)

a Proteins are named after plasmid used for expression of fusion protein.
Pure R-domain was prepared from the TolT-Rdomain fusion by cleavage with thrombin and separation of the cleavage products by ion-exchange chromatography. The results of this purification scheme are presented in Figure 5. By the outlined procedure 13 mg of pure functional R-domain was prepared from 1 l of starting bacterial culture. The yield is slightly lower than expected from the amount of soluble fusion, which is a consequence of R-domain precipitation during the preparation. However, the yield presented here is still more than twice that of the system in which the R-domain is expressed directly.
We show in Example 2 that BCL-XL, an important protein in apoptosis and cancer research, can be expressed in large quantities as a fusion with TolAIII (see Figure 8). SDS-PAGE analysis of the TolA-BCL fusion protein revealed a band with an apparent molecular weight of about 35 kD, which is in agreement with the following theoretical calculations:
ProtParam parameters of TolA-BCL fusion protein (SEQ ID NO: 14):
Number of amino acids: 348
Molecular weight: 38048.5
Theoretical pI: 5.83
Amino acid composition:
Ala (A) 38 10.9%
Arg (R) 17 4.9%
Asn (N) 17 4.9%
Asp (D) 16 4.6%
Cys (C) 3 0.9%
Gln (Q) 13 3.7%
Glu (E) 24 6.9%
Gly (G) 29 8.3%
His (H) 10 2.9%
Ile (I) 12 3.4%
Leu (L) 28 8.0%
Lys (K) 16 4.6%
Met (M) 7 2.0%
Phe (F) 16 4.6%
Pro (P) 17 4.9%
Ser (S) 34 9.8%
Thr (T) 13 3.7%
Trp (W) 7 2.0%
Tyr (Y) 10 2.9%
Val (V) 21 6.0%
Asx (B) 0 0.0%
Glx (Z) 0 0.0%
Xaa (X) 0 0.0%
Total number of negatively charged residues (Asp + Glu): 40
Total number of positively charged residues (Arg + Lys): 33
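The percentages and charge totals in the composition above follow directly from the residue counts. A minimal Python sketch of that arithmetic is given here; the dictionary simply restates the counts listed for SEQ ID NO: 14, and the variable names are illustrative.

```python
# Residue counts as listed for the TolA-BCL fusion protein (SEQ ID NO: 14).
composition = {
    "Ala": 38, "Arg": 17, "Asn": 17, "Asp": 16, "Cys": 3,
    "Gln": 13, "Glu": 24, "Gly": 29, "His": 10, "Ile": 12,
    "Leu": 28, "Lys": 16, "Met": 7, "Phe": 16, "Pro": 17,
    "Ser": 34, "Thr": 13, "Trp": 7, "Tyr": 10, "Val": 21,
}

total = sum(composition.values())

# Percentage of each residue, rounded to one decimal place as in the listing.
percent = {aa: round(100.0 * n / total, 1) for aa, n in composition.items()}

negatively_charged = composition["Asp"] + composition["Glu"]
positively_charged = composition["Arg"] + composition["Lys"]

print(f"Number of amino acids: {total}")
print(f"Asp + Glu: {negatively_charged}, Arg + Lys: {positively_charged}")
for aa, n in composition.items():
    print(f"{aa} {n} {percent[aa]}%")
```

The counts sum to the stated length of 348 residues, which is a quick consistency check on the listing.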
Extinction coefficients:
Conditions: 6.0 M guanidium hydrochloride, 0.02 M phosphate buffer, pH 6.5
Extinction coefficients are in units of M⁻¹ cm⁻¹.
The first table lists values computed assuming ALL Cys residues appear as half cystines, whereas the second table assumes that NONE do.
(Tables removed; they listed the extinction coefficient and Abs 0.1% (=1 g/l) at 276, 278, 279, 280 and 282 nm.)

The TolAIII domain was cleaved from the TolA-BCL fusion using thrombin and the BCL partner was purified on a Ni-NTA column (Figure 9). We found that 1 litre of BL21 (DE3) pLysE E. coli cell culture gave 20 mg of highly pure, thrombin-cleaved BCL-XL protein. The SDS-PAGE apparent molecular weight following thrombin cleavage (see Figure 9) was in agreement with the following theoretical calculations:
ProtParam parameters of the cleaved BCL-XL component of the TolA-BCL fusion after thrombin treatment (SEQ ID NO: 15):
Number of amino acids: 236
Molecular weight: 26329.2
Theoretical pI: 4.94
Amino acid composition:

(Table Remove)
Total number of negatively charged residues (Asp + Glu): 31
Total number of positively charged residues (Arg + Lys): 21
Extinction coefficients:
Conditions: 6.0 M guanidium hydrochloride, 0.02 M phosphate buffer, pH 6.5
Extinction coefficients are in units of M⁻¹ cm⁻¹.
The first table lists values computed assuming ALL Cys residues appear as half cystines, whereas the second table assumes that NONE do.

(Table Remove)
Wavelength (nm): 276 278 279 280 282
Ext. coefficient: 46500 47600 47690 47510 46400
Abs 0.1% (=1 g/l): 1.766 1.808 1.811 1.804 1.762
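The extinction coefficients above can be reproduced from the tryptophan and tyrosine content of the cleaved protein. The Python sketch below does so under stated assumptions: the per-residue molar coefficients are assumed values (they reproduce the figures listed above), the counts of 7 Trp and 6 Tyr were tallied from SEQ ID NO: 15 for this illustration, and the small cystine term is omitted because the sequence contains a single Cys. The names used are illustrative only.

```python
# Assumed per-residue molar extinction coefficients (M-1 cm-1) for
# (Trp, Tyr) at each wavelength in nm; the cystine term is left out.
COEFFS = {
    276: (5400, 1450),
    278: (5600, 1400),
    279: (5660, 1345),
    280: (5690, 1280),
    282: (5600, 1200),
}

N_TRP = 7             # tallied from SEQ ID NO: 15 (assumption of this sketch)
N_TYR = 6             # tallied from SEQ ID NO: 15 (assumption of this sketch)
MOL_WEIGHT = 26329.2  # molecular weight stated for the cleaved protein


def extinction(wavelength, n_trp=N_TRP, n_tyr=N_TYR):
    """Molar extinction coefficient as a sum of Trp and Tyr contributions."""
    e_trp, e_tyr = COEFFS[wavelength]
    return n_trp * e_trp + n_tyr * e_tyr


def abs_01_percent(wavelength):
    """Absorbance of a 1 g/l solution in a 1 cm path."""
    return extinction(wavelength) / MOL_WEIGHT


for wl in sorted(COEFFS):
    print(wl, extinction(wl), round(abs_01_percent(wl), 3))
```

Dividing each molar coefficient by the molecular weight gives the Abs 0.1% row, which is the figure used in practice to convert an A280 reading into mg/ml.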
DISCUSSION
TolAIII is expressed in very large quantities in soluble form in the bacterial cytoplasm. The reasons most commonly cited for high expression of proteins in E. coli are appropriate codon usage, stability of the mRNA transcript, size, content of disulphide bonds, and non-toxicity to the cell. TolAIII is a small protein with only one disulphide bond. It is very stable and monomeric in solution even at concentrations as high as 30 mg/ml (data from analytical ultracentrifugation and gel filtration, not shown). The small size and the tendency not to aggregate are certainly important in the tolerance of heterologous material in the cytoplasm of bacteria. A further advantage of TolAIII is that it is a bacterial protein and as such its gene possesses only 5 codons (4.7 % of 106 amino acids, excluding the protease cleavage site) that are rarely used in the E. coli genome. They are scattered along the sequence. An improvement of its expression could be achieved by engineering the conformation of its mRNA transcript. It has been shown that, for a high yield of transcribed RNA, the conformation of the RNA should sometimes be such that the ribosome binding site and start codon are exposed and not involved in base pairing. In the case of TolAIII mRNA both are involved in building short stems and are not always completely exposed (analysis of transcribed RNAs of 60-120 nucleotides (step of 10 nt) by Mfold on http://bioinfo.math.rpi.edu/~zukerm/). The high expression of TolAIII protein in the T7-based vector and the high yields of pure product are comparable to, or even better than, published and existing systems for production of fusion proteins in E. coli.
We have employed a domain of a periplasmic bacterial protein as a fusion partner in the overexpression of various proteins of bacterial and eukaryotic origin. Some small peptides or domains could be attached to TolAIII without significantly changing its size. The same amount of expressed protein would then be expected. In fact, the yield of the fusion containing the colicin N 40-76 peptide was the same as for TolAIII itself. The system is suitable for the preparation of eukaryotic proteins as well. In particular, the level of expression of EqtII is much improved over the published one: approximately 20 % of total protein for the fusion, in contrast to approximately 5 % in the case of direct expression. The majority of EqtII expressed from the pTol system is in the insoluble fraction, but isolation of the soluble cytoplasmic fraction still resulted in a large improvement in yield over the published method. The pTol system might also be applicable to proteins expressed as inclusion bodies. For example, the amount of expressed PLA2 is similar to that in other expression systems; however, the fusion protein can easily be isolated by Ni-NTA chromatography and then refolded and cleaved on the column matrix. An interesting observation was that the two membrane proteins studied did not express as fusion proteins with the pTol system, although the reason for this is unclear at the moment.
Three expression vectors were constructed, providing three different cleavage sites for endopeptidases widely used in molecular biology, i.e. enterokinase, factor Xa and thrombin. The recognition sites for these endopeptidases differ in amino acid sequence and size. These differences dramatically change the properties of the small TolAIII partner in fusion proteins (Table 3). TolAT and TolAX are basic, with a calculated pI of more than 8.5, whereas TolAE is acidic in nature, with a pI of 6.6. This is the result of the four aspartates in the recognition sequence for enterokinase (DDDDK; SEQ ID NO: 3). The constructed vectors thus allow greater flexibility, i.e. one can easily choose the appropriate vector on the basis of the properties of the fused partner. In our case, the R-domain of colicin N was expressed in the pTolT vector since the R-domain is even more basic (pI 9.7) than cleaved TolAIII. On the other hand, colicin N peptide 40-76 has almost the same pI as TolAT or TolAX. This would make subsequent purification much more difficult, as the peaks representing the peptide and TolAIII would then overlap in ion-exchange chromatography. Therefore, the peptide was expressed in pTolE. Cleaved TolAIII was not bound to the column under the chosen conditions and the difference in pI between the uncleaved fusion (pI 7.2) and the peptide was large enough to give clearly resolved peaks (not shown).
Table 3: Physical properties of TolAIII proteins after endoproteinase cleavage

(Table Remove)

a Proteins are named according to the vector in which they were produced. b Calculated from the sequence.
We could produce functional parts of the colicin N toxin by using the pTol expression system. We produced functional R-domain and a 39 residue peptide composed of colicin residues 40-76. His-tagged R-domain expresses poorly and irreproducibly, whereas the TolA fusion expressed consistently well and improved the yield more than two-fold. The peptide was produced as a 15N-labelled sample for NMR structure determination. Preparation of large quantities of labelled peptide sample for NMR structure analysis can be problematic and a significant financial burden to research groups. The high yields and versatility of the pTol system should make preparation of short peptides and proteins much cheaper and an alternative to chemical synthesis and other expression systems. The system may be particularly useful for reproducible high level expression of small (
BCL-XL, an important protein in apoptosis and cancer research, is difficult to express at high yield since it has a hydrophobic C-terminal region which causes instability and toxicity. Thus most structural work has been carried out on truncated versions lacking this region. We were unable to express this protein in satisfactory yields for structural studies and thus used the TolAIII fusion protein system to improve our yields. We can now express large amounts of this protein as a TolAIII fusion (Figure 8). It is well folded as judged by CD spectroscopy (not shown). We can also produce large amounts in minimal media including 15NH4Cl as the only nitrogen source.
SEQUENCE LISTING
<110> University of Newcastle Upon Tyne
<120> Fusion Proteins
<130> 43952/JMD/MAR
GB 0200689.8 2002-01-10
61
PatentIn version 3.1
1 9 PRT
Artificial Sequence

Ala3-His6 tail
1
Ala Ala Ala His His His His His His
1 5
2
25
PRT
Escherichia coli
2
Met Asn Met Lys Lys Leu Ala Thr Leu Val Ser Ala Val Ala Leu Ser
1 5 10 15
Ala Thr Val Ser Ala Asn Ala Met Ala
20 25
3
5
PRT
Artificial Sequence


Cleavage site for enterokinase
3
Asp Asp Asp Asp Lys
1 5
<210> 4
4
PRT
<213> Artificial Sequence

Cleavage site for thrombin
4
Leu Val Pro Arg
1
5
4
PRT
Artificial Sequence

Cleavage site for factor Xa
5
Ile Glu Gly Arg
1
6
4
PRT
Artificial Sequence

4xHis tag
6
His His His His
1
7
5
PRT
Artificial Sequence

<220>
5xHis tag
7
His His His His His
1 5
8
6
PRT
Artificial Sequence

<223> 6xHis tag
8
His His His His His His
1 5
9
7
PRT
Artificial Sequence

7xHis tag
9
His His His His His His His
1 5
10
8
PRT
Artificial Sequence

8xHis tag
10
His His His His His His His His
1 5
11
9
<212> PRT
Artificial Sequence

<223> 9xHis tag
11
His His His His His His His His His
1 5
12
10
PRT
Artificial Sequence

10xHis tag
<400> 12
His His His His His His His His His His
1 5 10
13
<211> 93
PRT
Escherichia coli
13
Asn Asn Gly Ala Ser Gly Ala Asp Ile Asn Asn Tyr Ala Gly Gln Ile
1 5 10 15
Lys Ser Ala Ile Glu Ser Lys Phe Tyr Asp Ala Ser Ser Tyr Ala Gly
20 25 30
Lys Thr Cys Thr Leu Arg Ile Lys Leu Ala Pro Asp Gly Met Leu Leu
35 40 45
Asp Ile Lys Pro Glu Gly Gly Asp Pro Ala Leu Cys Gln Ala Ala Leu
50 55 60
Ala Ala Ala Lys Leu Ala Lys Ile Pro Lys Pro Pro Ser Gln Ala Val
65 70 75 80
Tyr Glu Val Phe Lys Asn Ala Pro Leu Asp Phe Lys Pro
85 90
14 348

PRT
<213> Artificial Sequence
<220>
TolA-BCL fusion protein
14
Met His His His His His His Ser Ser Asn Asn Gly Ala Ser Gly Ala
1 5 10 15
Asp Ile Asn Asn Tyr Ala Gly Gln Ile Lys Ser Ala Ile Glu Ser Lys
20 25 30
Phe Tyr Asp Ala Ser Ser Tyr Ala Gly Lys Thr Cys Thr Leu Arg Ile
35 40 45
Lys Leu Ala Pro Asp Gly Met Leu Leu Asp Ile Lys Pro Glu Gly Gly
50 55 60
Asp Pro Ala Leu Cys Gln Ala Ala Leu Ala Ala Ala Lys Leu Ala Lys
65 70 75 80
Ile Pro Lys Pro Pro Ser Gln Ala Val Tyr Glu Val Phe Lys Asn Ala
85 90 95
Pro Leu Asp Phe Lys Pro Gly Gly Gly Ser Gly Ser Leu Val Pro Arg
100 105 110
Gly Ser Arg Pro Ser Gln Ser Asn Arg Glu Leu Val Val Asp Phe Leu
115 120 125
Ser Tyr Lys Leu Ser Gln Lys Gly Tyr Ser Trp Ser Gln Phe Ser Asp
130 135 140
Val Glu Glu Asn Arg Thr Glu Ala Pro Glu Gly Thr Glu Ser Glu Met
145 150 155 160
Glu Thr Pro Ser Ala Ile Asn Gly Asn Pro Ser Trp His Leu Ala Asp
165 170 175
Ser Pro Ala Val Asn Gly Ala Thr Ala His Ser Ser Ser Leu Asp Ala
180 185 190
Arg Glu Val Ile Pro Met Ala Ala Val Lys Gln Ala Leu Arg Glu Ala
195 200 205
Gly Asp Glu Phe Glu Leu Arg Tyr Arg Arg Ala Phe Ser Asp Leu Thr
210 215 220
Ser Gln Leu His Ile Thr Pro Gly Thr Ala Tyr Gln Ser Phe Glu Gln
225 230 235 240
Val Val Asn Glu Leu Phe Arg Asp Gly Val Asn Trp Gly Arg Ile Val
245 250 255
Ala Phe Phe Ser Phe Gly Gly Ala Leu Cys Val Glu Ser Val Asp Lys
260 265 270
Glu Met Gln Val Leu Val Ser Arg Ile Ala Ala Trp Met Ala Thr Tyr
275 280 285
Leu Asn Asp His Leu Glu Pro Trp Ile Gln Glu Asn Gly Gly Trp Asp
290 295 300
Thr Phe Val Glu Leu Tyr Gly Asn Asn Ala Ala Ala Glu Ser Arg Lys
305 310 315 320
Gly Gln Glu Arg Phe Asn Arg Trp Phe Leu Thr Gly Met Thr Val Ala
325 330 335
Gly Val Val Leu Leu Gly Ser Leu Phe Ser Arg Lys
340 345
15
236
PRT
Artificial Sequence

TolA-BCL fusion-protein after thrombin cleavage
15
Gly Ser Arg Pro Ser Gln Ser Asn Arg Glu Leu Val Val Asp Phe Leu
1 5 10 15
Ser Tyr Lys Leu Ser Gln Lys Gly Tyr Ser Trp Ser Gln Phe Ser Asp
20 25 30
Val Glu Glu Asn Arg Thr Glu Ala Pro Glu Gly Thr Glu Ser Glu Met
35 40 45
Glu Thr Pro Ser Ala Ile Asn Gly Asn Pro Ser Trp His Leu Ala Asp
50 55 60
Ser Pro Ala Val Asn Gly Ala Thr Ala His Ser Ser Ser Leu Asp Ala
65 70 75 80
Arg Glu Val Ile Pro Met Ala Ala Val Lys Gln Ala Leu Arg Glu Ala
85 90 95
Gly Asp Glu Phe Glu Leu Arg Tyr Arg Arg Ala Phe Ser Asp Leu Thr
100 105 110
Ser Gln Leu His Ile Thr Pro Gly Thr Ala Tyr Gln Ser Phe Glu Gln
115 120 125
Val Val Asn Glu Leu Phe Arg Asp Gly Val Asn Trp Gly Arg Ile Val
130 135 140
Ala Phe Phe Ser Phe Gly Gly Ala Leu Cys Val Glu Ser Val Asp Lys
145 150 155 160
Glu Met Gln Val Leu Val Ser Arg Ile Ala Ala Trp Met Ala Thr Tyr
165 170 175
Leu Asn Asp His Leu Glu Pro Trp Ile Gln Glu Asn Gly Gly Trp Asp
180 185 190
Thr Phe Val Glu Leu Tyr Gly Asn Asn Ala Ala Ala Glu Ser Arg Lys
195 200 205
Gly Gln Glu Arg Phe Asn Arg Trp Phe Leu Thr Gly Met Thr Val Ala
210 215 220
Gly Val Val Leu Leu Gly Ser Leu Phe Ser Arg Lys
225 230 235
16
115
PRT
Artificial Sequence
Tagged TolAIII region of pTol vectors

MISC_FEATURE
<222> (107)..(111)
Xaa residues represent cleavage sites DDDDK (SEQ ID NO: 3), LVPR (SEQ ID NO: 4; no Xaa at position 111) or IEGR (SEQ ID NO: 5; no Xaa at position 111)
16
Met His His His His His His Ser Ser Asn Asn Gly Ala Ser Gly Ala
1 5 10 15
Asp Ile Asn Asn Tyr Ala Gly Gln Ile Lys Ser Ala Ile Glu Ser Lys
20 25 30
Phe Tyr Asp Ala Ser Ser Tyr Ala Gly Lys Thr Cys Thr Leu Arg Ile
35 40 45
Lys Leu Ala Pro Asp Gly Met Leu Leu Asp Ile Lys Pro Glu Gly Gly
50 55 60
Asp Pro Ala Leu Cys Gln Ala Ala Leu Ala Ala Ala Lys Leu Ala Lys
65 70 75 80
Ile Pro Lys Pro Pro Ser Gln Ala Val Tyr Glu Val Phe Lys Asn Ala
85 90 95
Pro Leu Asp Phe Lys Pro Gly Gly Gly Ser Xaa Xaa Xaa Xaa Xaa Gly
100 105 110
Ser Gly Thr
115
<210> 17
<211> 8
<212> PRT
<213> Artificial Sequence

His6-Ser2 linker
17
His His His His His His Ser Ser
1 5
18

<211> 4
PRT
Artificial Sequence
<220>
Short flexible polypeptide
18
Gly Gly Gly Ser
1
19
51
DNA
Artificial Sequence

Cleavage/cloning site of pTolE vector
19
ggtgggggat ctgatgatga cgataaagga tccggtacct gatgaacgcg t
20
48
DNA
Artificial Sequence

Cleavage/cloning site of pTolT vector
20
ggtgggggat ctctggttcc gcgcggatcc ggtacctgat gaacgcgt
21
48
DNA
<213> Artificial Sequence

Cleavage/cloning site of pTolX vector
21
ggtgggggat ctattgaagg tcgcggatcc ggtacctgat gaacgcgt
22
17

PRT
Artificial Sequence
<220>
Cleavage/cloning site of pTolE vector

MISC_FEATURE
(14)..(15)
Xaa represents stop codon site
22
Gly Gly Gly Ser Asp Asp Asp Asp Lys Gly Ser Gly Thr Xaa Xaa Thr
1 5 10 15
Arg
23
16
PRT
Artificial Sequence

Cleavage/cloning site of pTolT vector

MISC_FEATURE
(13)..(14)
Xaa represents stop codon site
23
Gly Gly Gly Ser Leu Val Pro Arg Gly Ser Gly Thr Xaa Xaa Thr Arg
24
16
PRT
Artificial Sequence

Cleavage/cloning site of pTolX vector

MISC_FEATURE
(13)..(14)
Xaa represents stop codon site

24
Gly Gly Gly Ser Ile Glu Gly Arg Gly Ser Gly Thr Xaa Xaa Thr Arg
1 5 10 15
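The correspondence between the cleavage/cloning-site DNA of SEQ ID NOs: 19-21 and the peptide stretches of SEQ ID NOs: 22-24 can be checked by translating the DNA in frame with the standard genetic code. The Python sketch below illustrates this; the function and variable names are illustrative and not part of the listing, and the pTolX entry assumes that the stray hyphens printed in SEQ ID NO: 21 are print artefacts. Stop codons are written as '*', corresponding to the Xaa positions.

```python
from itertools import product

# Standard genetic code, codons ordered t, c, a, g at each position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in frame 1; an incomplete trailing codon is ignored."""
    dna = dna.replace(" ", "").lower()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Cleavage/cloning sites as given in SEQ ID NOs: 19, 20 and 21.
SITES = {
    "pTolE": "ggtgggggat ctgatgatga cgataaagga tccggtacct gatgaacgcg t",
    "pTolT": "ggtgggggat ctctggttcc gcgcggatcc ggtacctgat gaacgcgt",
    "pTolX": "ggtgggggat ctattgaagg tcgcggatcc ggtacctgat gaacgcgt",
}

for vector, dna in SITES.items():
    print(vector, translate(dna))
```

Each translation begins with the Gly-Gly-Gly-Ser linker, followed by the protease recognition site and the Gly-Ser-Gly-Thr stretch, then two stop codons.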
25
2
PRT
Artificial Sequence

Gly-Ser tag
25
Gly Ser
1
26
4
PRT
Artificial Sequence

Gly-Ser-Gly-Thr tag
<400> 26
Gly Ser Gly Thr
1
27
39
DNA
Artificial Sequence

Synthetic oligonucleotide
27
gatctgatga tgacgataaa ggatccggta cctgatgaa
28
39
DNA
Artificial Sequence

Synthetic oligonucleotide
28

cgcgttcatc aggtaccgga tcctttatcg tcatcatca
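The oligonucleotides of SEQ ID NOs: 27 and 28 appear to be complementary to each other apart from a four-base extension at each 5' end (gatc and cgcg). This reading is an inference from the sequences themselves, not a statement made in the listing. A short Python sketch to check it is given below; the names are illustrative.

```python
def reverse_complement(dna):
    """Return the reverse complement of a lower- or upper-case DNA string."""
    pairs = {"a": "t", "t": "a", "g": "c", "c": "g"}
    cleaned = dna.replace(" ", "").lower()
    return "".join(pairs[base] for base in reversed(cleaned))


SEQ27 = "gatctgatga tgacgataaa ggatccggta cctgatgaa".replace(" ", "")
SEQ28 = "cgcgttcatc aggtaccgga tcctttatcg tcatcatca".replace(" ", "")

OVERHANG = 4  # assumed length of the unpaired 5' extension on each strand

# Strip the 5' extension from each strand and compare the remainders.
paired_27 = SEQ27[OVERHANG:]
paired_28 = SEQ28[OVERHANG:]
is_duplex = reverse_complement(paired_27) == paired_28

print("5' extensions:", SEQ27[:OVERHANG], SEQ28[:OVERHANG])
print("remaining 35 bases complementary:", is_duplex)
```

The same check can be applied to the oligonucleotide pairs of SEQ ID NOs: 29/30 and 31/32.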
<210> 29
36
DNA
Artificial Sequence

Synthetic oligonucleotide
29
gatctattga aggtcgcgga tccggtacct gatgaa
30
36
DNA
Artificial Sequence

Synthetic oligonucleotide
30
cgcgttcatc aggtaccgga tccgcgacct tcaata
31
36
DNA
Artificial Sequence


Synthetic oligonucleotide
31
gatctctggt tccgcgcgga tccggtacct gatgaa
32
36
DNA
Artificial Sequence

Synthetic oligonucleotide
32
cgcgttcatc aggtaccgga tccgcgcgga accaga
33 37

PRT
Escherichia coli
<400> 33
Asn Ser Asn Gly Trp Ser Trp Ser Asn Lys Pro His Lys Asn Asp Gly
1 5 10 15
Phe His Ser Asp Gly Ser Tyr His Ile Thr Phe His Gly Asp Asn Asn
20 25 30
Ser Lys Pro Lys Pro
35
34
80
<212> PRT
Escherichia coli
34
Asn Asn Ala Phe Gly Gly Gly Lys Asn Pro Gly Ile Gly Asn Thr Ser
1 5 10 15
Gly Ala Gly Ser Asn Gly Ser Ala Ser Ser Asn Arg Gly Asn Ser Asn
20 25 30
Gly Trp Ser Trp Ser Asn Lys Pro His Lys Asn Asp Gly Phe His Ser
35 40 45
Asp Gly Ser Tyr His Ile Thr Phe His Gly Asp Asn Asn Ser Lys Pro
50 55 60
Lys Pro Gly Gly Asn Ser Gly Asn Arg Gly Asn Asn Gly Asp Gly Ala
65 70 75 80
35
117
PRT
Escherichia coli
35
His Gly Asp Asn Asn Ser Lys Pro Lys Pro Gly Gly Asn Ser Gly Asn
1 5 10 15
Arg Gly Asn Asn Gly Asp Gly Ala Ser Ala Lys Val Gly Glu Ile Thr
20 25 30
Ile Thr Pro Asp Asn Ser Lys Pro Gly Arg Tyr Ile Ser Ser Asn Pro
35 40 45
Glu Tyr Ser Leu Leu Ala Lys Leu Ile Asp Ala Glu Ser Ile Lys Gly
50 55 60
Thr Glu Val Tyr Thr Phe His Thr Arg Lys Gly Gln Tyr Val Lys Val
65 70 75 80
Thr Val Pro Asp Ser Asn Ile Asp Lys Met Arg Val Asp Tyr Val Asn
85 90 95
Trp Lys Gly Pro Lys Tyr Asn Asn Lys Leu Val Lys Arg Phe Val Ser
100 105 110
Gln Phe Leu Leu Phe
115
36
124
PRT
Homo sapiens
36
Asn Leu Val Asn Phe His Arg Met Ile Lys Leu Thr Thr Gly Lys Glu
1 5 10 15
Ala Ala Leu Ser Tyr Gly Phe Tyr Gly Cys His Cys Gly Val Gly Gly
20 25 30
Arg Gly Ser Pro Lys Asp Ala Thr Asp Arg Cys Cys Val Thr His Asp
35 40 45
Cys Cys Tyr Lys Arg Leu Glu Lys Arg Gly Cys Gly Thr Lys Phe Leu
50 55 60
Ser Tyr Lys Phe Ser Asn Ser Gly Ser Arg Ile Thr Cys Ala Lys Gln
65 70 75 80
Asp Ser Cys Arg Ser Gln Leu Cys Glu Cys Asp Lys Ala Ala Ala Thr
85 90 95
Cys Phe Ala Arg Asn Lys Thr Thr Tyr Asn Lys Lys Tyr Gln Tyr Tyr
100 105 110
Ser Asn Lys His Cys Arg Gly Ser Thr Pro Arg Cys
115 120
<210> 37
179
PRT
Actinia equina
37
Ser Ala Asp Val Ala Gly Ala Val Ile Asp Gly Ala Ser Leu Ser Phe
1 5 10 15
Asp Ile Leu Lys Thr Val Leu Glu Ala Leu Gly Asn Val Lys Arg Lys
20 25 30
Ile Ala Val Gly Val Asp Asn Glu Ser Gly Lys Thr Trp Thr Ala Leu
35 40 45
Asn Thr Tyr Phe Arg Ser Gly Thr Ser Asp Ile Val Leu Pro His Lys
50 55 60
Val Pro His Gly Lys Ala Leu Leu Tyr Asn Gly Gln Lys Asp Arg Gly
65 70 75 80
Pro Val Ala Thr Gly Ala Val Gly Val Leu Ala Tyr Leu Met Ser Asp
85 90 95
Gly Asn Thr Leu Ala Val Leu Phe Ser Val Pro Tyr Asp Tyr Asn Trp
100 105 110
Tyr Ser Asn Trp Trp Asn Val Arg Ile Tyr Lys Gly Lys Arg Arg Ala
115 120 125
Asp Gln Arg Met Tyr Glu Glu Leu Tyr Tyr Asn Leu Ser Pro Phe Arg
130 135 140
Gly Asp Asn Gly Trp His Thr Arg Asn Leu Gly Tyr Gly Leu Lys Ser
145 150 155 160
Gly Phe Met Asn Ser Ser Gly His Ala Ile Leu Glu Ile His Val
165 170 175
Ser Lys Ala
<210> 38
<211> 191
PRT
Homo sapiens
38
Thr Gly Ala Gly Lys Thr Ser Leu Leu Met Met Ile Met Gly Glu Leu
1 5 10 15
Glu Pro Ser Glu Gly Lys Ile Lys His Ser Gly Arg Ile Ser Phe Cys
20 25 30
Ser Gln Phe Ser Trp Ile Met Pro Gly Thr Ile Lys Glu Asn Ile Ile
35 40 45
Phe Gly Val Ser Tyr Asp Glu Tyr Arg Tyr Arg Ser Val Ile Lys Ala
50 55 60
Cys Gln Leu Glu Glu Asp Ile Ser Lys Phe Ala Glu Lys Asp Asn Ile
65 70 75 80
Val Leu Gly Glu Gly Gly Ile Thr Leu Ser Gly Gly Gln Arg Ala Arg
85 90 95
Ile Ser Leu Ala Arg Ala Val Tyr Lys Asp Ala Asp Leu Tyr Leu Leu
100 105 110
Asp Ser Pro Phe Gly Tyr Leu Asp Val Leu Thr Glu Lys Glu Ile Phe
115 120 125
Glu Ser Cys Val Cys Lys Leu Met Ala Asn Lys Thr Arg Ile Leu Val
130 135 140
Thr Ser Lys Met Glu His Leu Lys Lys Ala Asp Lys Ile Leu Ile Leu
145 150 155 160
His Glu Gly Ser Ser Tyr Phe Tyr Gly Thr Phe Ser Glu Leu Gln Asn
165 170 175
Leu Gln Pro Asp Phe Ser Ser Lys Leu Met Gly Cys Asp Ser Phe
180 185 190
39
390
PRT
Homo sapiens
39
Lys Tyr Ile Glu His Phe Ser Lys Phe Ser Pro Ser Pro Leu Ser Met
1 5 10 15
Lys Gln Phe Leu Asp Phe Gly Ser Ser Asn Ala Cys Glu Lys Thr Ser
20 25 30
Phe Thr Phe Leu Arg Gln Glu Leu Pro Val Arg Leu Ala Asn Ile Met
35 40 45
Lys Glu Ile Asn Leu Leu Pro Asp Arg Val Leu Ser Thr Pro Ser Val
50 55 60
Gln Leu Val Gln Ser Trp Tyr Val Gln Ser Leu Leu Asp Ile Met Glu
65 70 75 80
Phe Leu Asp Lys Asp Pro Glu Asp His Arg Thr Leu Ser Gln Phe Thr
85 90 95
Asp Ala Leu Val Thr Ile Arg Asn Arg His Asn Asp Val Val Pro Thr
100 105 110
Met Ala Gln Gly Val Leu Glu Tyr Lys Asp Thr Tyr Gly Asp Asp Pro
115 120 125
Val Ser Asn Gln Asn Ile Gln Tyr Phe Leu Asp Arg Phe Tyr Leu Ser
130 135 140
Arg Ile Ser Ile Arg Met Leu Ile Asn Gln His Thr Leu Ile Phe Asp
145 150 155 160
Gly Ser Thr Asn Pro Ala His Pro Lys His Ile Gly Ser Ile Asp Pro
165 170 175
Asn Cys Asn Val Ser Glu Val Val Lys Asp Ala Tyr Asp Met Ala Lys
180 185 190
Leu Leu Cys Asp Lys Tyr Tyr Met Ala Ser Pro Asp Leu Glu Ile Gln
195 200 205
Glu Ile Asn Ala Ala Asn Ser Lys Gln Pro Ile His Met Val Tyr Val
210 215 220
Pro Ser His Leu Tyr His Met Leu Phe Glu Leu Phe Lys Asn Ala Met
225 230 235 240
Arg Ala Thr Val Glu Ser His Glu Ser Ser Leu Ile Leu Pro Pro Ile
245 250 255
Lys Val Met Val Ala Leu Gly Glu Glu Asp Leu Ser Ile Lys Met Ser
260 265 270
Asp Arg Gly Gly Gly Val Pro Leu Arg Lys Ile Glu Arg Leu Phe Ser
275 280 285
Tyr Met Tyr Ser Thr Ala Pro Thr Pro Gln Pro Gly Thr Gly Gly Thr
290 295 300
Pro Leu Ala Gly Phe Gly Tyr Gly Leu Pro Ile Ser Arg Leu Tyr Ala
305 310 315 320
Lys Tyr Phe Gln Gly Asp Leu Gln Leu Phe Ser Met Glu Gly Phe Gly
325 330 335
Thr Asp Ala Val Ile Tyr Leu Lys Ala Leu Ser Thr Asp Ser Val Glu
340 345 350
Arg Leu Pro Val Tyr Asn Lys Ser Ala Trp Arg His Tyr Gln Thr Ile
355 360 365
Gln Glu Ala Gly Asp Trp Cys Val Pro Ser Thr Glu Pro Lys Asn Thr
370 375 380
Ser Thr Tyr Arg Val Ser
385 390
<210> 40
<211> 202
PRT
Bacillus licheniformis

40
Ser Phe Ser Glu Leu Asn Ile Asp Ala Phe Arg Phe Ile Asn Asp Leu
1 5 10 15
Gly Lys Glu Tyr Ser Met Leu Asn Pro Val Val Tyr Phe Leu Ala Glu
20 25 30
Tyr Met Met Tyr Phe Leu Ala Leu Gly Leu Val Val Tyr Trp Leu Thr
35 40 45
Arg Thr Thr Lys Asn Arg Leu Met Val Ile Tyr Ala Val Ile Ala Phe
50 55 60
Val Val Ala Glu Ile Leu Gly Lys Ile Met Gly Ser Leu His Ser Asn
65 70 75 80
Tyr Gln Pro Phe Ala Thr Leu Pro Asn Val Asn Lys Leu Ile Glu His
85 90 95
Glu Ile Asp Asn Ser Phe Pro Ser Asp His Thr Ile Leu Phe Phe Ser
100 105 110
Ile Gly Phe Leu Ile Phe Leu Phe His Lys Lys Thr Gly Trp Leu Trp
115 120 125
Leu Val Leu Ala Phe Ala Val Gly Ile Ser Arg Ile Trp Ser Gly Val
130 135 140
His Tyr Pro Leu Asp Val Ala Ala Gly Ala Leu Leu Gly Val Leu Ser
145 150 155 160
Ala Leu Phe Val Phe Trp Thr Ala Pro Lys Leu Ser Phe Ile His Gln
165 170 175
Met Leu Ser Leu Tyr Glu Lys Val Glu Gln Arg Ile Val Pro Ser Lys
180 185 190
Asn Lys Ser Asn Asp Lys Ser Lys Asn Phe
195 200
41
354
PRT
Homo sapiens

41
Gln Arg Ser Pro Leu Glu Lys Ala Ser Val Val Ser Lys Leu Phe Phe
1 5 10 15
Ser Trp Thr Arg Pro Ile Leu Arg Lys Gly Tyr Arg Gln Arg Leu Glu
20 25 30
Leu Ser Asp Ile Tyr Gln Ile Pro Ser Val Asp Ser Ala Asp Asn Leu
35 40 45
Ser Glu Lys Leu Glu Arg Glu Trp Asp Arg Glu Leu Ala Ser Lys Lys
50 55 60
Asn Pro Lys Leu Ile Asn Ala Leu Arg Arg Cys Phe Phe Trp Arg Phe
65 70 75 80
Met Phe Tyr Gly Ile Phe Leu Tyr Leu Gly Glu Val Thr Lys Ala Val
85 90 95
Gln Pro Leu Leu Leu Gly Arg Ile Ile Ala Ser Tyr Asp Pro Asp Asn
100 105 110
Lys Glu Glu Arg Ser Ile Ala Ile Tyr Leu Gly Ile Gly Leu Cys Leu
115 120 125
Leu Phe Ile Val Arg Thr Leu Leu Leu His Pro Ala Ile Phe Gly Leu
130 135 140
His His Ile Gly Met Gln Met Arg Ile Ala Met Phe Ser Leu Ile Tyr
145 150 155 160
Lys Lys Thr Leu Lys Leu Ser Ser Arg Val Leu Asp Lys Ile Ser Ile
165 170 175
Gly Gln Leu Val Ser Leu Leu Ser Asn Asn Leu Asn Lys Phe Asp Glu
180 185 190
Gly Leu Ala Leu Ala His Phe Val Trp Ile Ala Pro Leu Gln Val Ala
195 200 205
Leu Leu Met Gly Leu Ile Trp Glu Leu Leu Gln Ala Ser Ala Phe Cys
210 215 220
Gly Leu Gly Phe Leu Ile Val Leu Ala Leu Phe Gln Ala Gly Leu Gly
225 230 235 240
Arg Met Met Met Lys Tyr Arg Asp Gln Arg Ala Gly Lys Ile Ser Glu
245 250 255
Arg Leu Val Ile Thr Ser Glu Met Ile Glu Asn Ile Gln Ser Val Lys
260 265 270
Ala Tyr Cys Trp Glu Glu Ala Met Glu Lys Met Ile Glu Asn Leu Arg
275 280 285
Gln Thr Glu Leu Lys Leu Thr Arg Lys Ala Ala Tyr Val Arg Tyr Phe
290 295 300
Asn Ser Ser Ala Phe Phe Phe Ser Gly Phe Phe Val Val Phe Leu Ser
305 310 315 320
Val Leu Pro Tyr Ala Leu Ile Lys Gly Ile Ile Leu Arg Lys Ile Phe
325 330 335
Thr Thr Ile Ser Phe Cys Ile Val Leu Arg Met Ala Val Thr Arg Gln
340 345 350
Phe Pro
42
34
DNA
Artificial Sequence

Synthetic oligonucleotide
<400> 42
tttttggatc caattccaat ggatggtcat ggag
43
<211> 41
DNA
Artificial Sequence
<220>
Synthetic oligonucleotide

43
aaggatccaa gcttcaaggt ttaggctttg aattattgtc c
44
36
DNA
Artificial Sequence

Synthetic oligonucleotide
44
tttttggatc caatgctttt ggtggaggga aaaatc
45
19
DNA
Artificial Sequence

Synthetic oligonucleotide
45
ctcagcggtg gcagcagcc
46
31
DNA
Artificial Sequence

Synthetic oligonucleotide
46
cgcggatccc atggggacaa taattcaaag c
47
38
<212> DNA
Artificial Sequence

Synthetic oligonucleotide
47
ggcgaattca cgcgttaaaa taataatttc tggctcac
48

37
DNA
Artificial Sequence

Synthetic oligonucleotide
48
ccggggtacc aatttggtga atttccacag aatgatc
49
35
DNA
Artificial Sequence

Synthetic oligonucleotide
49
ggcgaattca cgcgttagca acgaggggtg ctccc
50
27
DNA
Artificial Sequence

Synthetic oligonucleotide
50
cgcggatccg cagacgtggc tggcgcc
51
38
DNA
Artificial Sequence

Synthetic oligonucleotide
51
ggcgaattca cgcgttaagc tttgctcacg tgagtttc
52
30
DNA
Artificial Sequence


Synthetic oligonucleotide
52
cgcggatcct ctaatggtga tgacagcctc
<210> 53
38
DNA
Artificial Sequence

Synthetic oligonucleotide
53
ggcgaattca cgcgttagaa agaatcacat cccatgag
<210> 54
37
DNA
Artificial Sequence

<223> Synthetic oligonucleotide
54
ccggggtacc aagtacatag agcacttcag caagttc
55
37
DNA
Artificial Sequence

Synthetic oligonucleotide
55
ggcgaattca cgcgttacgt gacgcggtac gtggtcg
56
31
DNA
Artificial Sequence

<223> Synthetic oligonucleotide
56
cgcggatcct tttcagaatt aaatattgat g
57
36
DNA
Artificial Sequence

Synthetic oligonucleotide
57
ggcgaattca cgcgttaaaa gttcttcgat ttatcg
58
25
DNA
Artificial Sequence

<223> Synthetic oligonucleotide
58
cgcggatccc agaggtcgcc tctgg
59
35
DNA
Artificial Sequence

Synthetic oligonucleotide
59
ggcgaattca cgcgttaggg aaattgccga gtgac
60
30
DNA
Artificial Sequence

Synthetic oligonucleotide
60
ttttttaggc cttctcagag caaccgggag
61
28
DNA
Artificial Sequence

Synthetic oligonucleotide
61
ttttacgcgt tcatttccga ctgaagag
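
The oligonucleotide entries above each give a declared length followed by the sequence in blocks of ten bases, so they can be checked mechanically. The following is a minimal Python sketch, not part of the specification: it takes four entries (SEQ ID NOs: 42, 48, 49 and 50) as printed in the listing, confirms the declared length, and reports where some common restriction enzyme recognition sequences occur. The enzyme table holds standard recognition sequences, but the assumption that these particular enzymes were the ones used for cloning is ours and is not stated in this section. The helper names are hypothetical.

```python
# Declared length and sequence for a few primers, copied from the listing.
PRIMERS = {
    42: (34, "tttttggatc caattccaat ggatggtcat ggag"),
    48: (37, "ccggggtacc aatttggtga atttccacag aatgatc"),
    49: (35, "ggcgaattca cgcgttagca acgaggggtg ctccc"),
    50: (27, "cgcggatccg cagacgtggc tggcgcc"),
}

# Standard recognition sequences. ASSUMPTION: the listing does not say
# which enzymes the primers were designed for.
SITES = {
    "BamHI": "ggatcc",
    "EcoRI": "gaattc",
    "MluI": "acgcgt",
    "KpnI": "ggtacc",
}


def normalise(seq):
    """Remove the block spacing used in the listing and lower-case the bases."""
    return "".join(seq.split()).lower()


def check_length(declared, seq):
    """Return True if the sequence has exactly the declared number of bases."""
    return len(normalise(seq)) == declared


def find_sites(seq):
    """Map enzyme name to the 0-based index of its first recognition site."""
    s = normalise(seq)
    return {name: s.find(site) for name, site in SITES.items() if site in s}


if __name__ == "__main__":
    for seq_id, (declared, seq) in sorted(PRIMERS.items()):
        print(seq_id, check_length(declared, seq), find_sites(seq))
```

A length mismatch in such a check would usually point to a transcription error in the listing rather than in the primer itself.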







WE CLAIM:
1. An isolated recombinant fusion polypeptide for expression in a host cell, in which the fusion polypeptide has a basic structure comprising, in sequence, an N-terminus, a TolAIII domain defined by the amino acid sequence of SEQ ID NO: 13 or a functional homologue, fragment, or derivative thereof for achieving enhanced expression of the fusion polypeptide in the host cell, a non-TolA protein partner whose expression is desired, and a C-terminus, and which optionally comprises an affinity purification tag.
2. The isolated recombinant fusion polypeptide for expression in a host cell as claimed in claim 1, comprising a signal peptide.
3. The isolated recombinant fusion polypeptide for expression in a host cell as claimed in claim 2, in which the signal peptide is located at or near the N-terminus of the fusion polypeptide.
4. The isolated recombinant fusion polypeptide for expression in a host cell as claimed in any of the preceding claims, wherein the TolAIII domain or functional homologue, fragment, or derivative thereof has been codon-optimised for expression in the host cell.
5. The isolated recombinant fusion polypeptide for expression in a host cell as claimed in any of the preceding claims, comprising a linker between the TolAIII domain or functional homologue, fragment, or derivative thereof and the non-TolA protein partner.
6. The isolated recombinant fusion polypeptide for expression in a host cell as claimed in claim 5, wherein the linker comprises at least one cleavage site for an endopeptidase.
7. The isolated recombinant fusion polypeptide for expression in a host cell as claimed in claim 6, wherein the cleavage site comprises the amino acid sequence DDDDK (SEQ ID NO: 3) and/or LVPR (SEQ ID NO: 4) and/or IGER (SEQ ID NO: 5).

8. The isolated recombinant fusion polypeptide for expression in a host cell as claimed in claim 1, wherein the affinity purification tag is located at or near the N-terminus of the fusion polypeptide.
9. The isolated recombinant fusion polypeptide for expression in a host cell as claimed in claim 8, wherein the affinity purification tag is an N-terminal Hisn tag, with n=4, 5, 6, 7, 8, 9 or 10 (SEQ ID NOs: 6-12, respectively; preferably n=6 [SEQ ID NO: 8]), optionally with the Hisn tag linked to the fusion polypeptide by one or more Ser residues (preferably 2).
10. The isolated recombinant fusion polypeptide for expression in a host cell as claimed in any of the preceding claims, wherein the TolAIII domain consists of amino acid residues 329-421 (SEQ ID NO: 13) of the Escherichia coli TolA sequence (SwissProt Acc. No. P19934).
11. The isolated recombinant fusion polypeptide for expression in a host cell as claimed in any of the preceding claims, wherein the host cell is bacterial (for example, Escherichia coli).
12. The isolated recombinant fusion polypeptide for expression in a host cell as claimed in any of the preceding claims, wherein the non-TolA protein partner is BCL-XL.
13. The isolated recombinant fusion polypeptide as claimed in any of claims 1-12 as and when used for expression in a host cell for immobilization and/or purification and/or isolation of the non-TolA protein partner and/or for studying interaction properties of the non-TolA protein partner or the fusion polypeptide.
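
The domain order recited in claims 1 and 5 to 10 can be pictured as a simple concatenation of one-letter amino acid strings. The following Python sketch is illustrative only and forms no part of the claims. SEQ ID NO: 13 and the partner sequence are not reproduced in this section, so both are represented by placeholder strings; the 93-residue length of the placeholder follows from residues 329-421 in claim 10. The flanking linker residues and all function names are hypothetical.

```python
# Cleavage-site motifs as recited in claim 7.
CLEAVAGE_SITES = {
    "DDDDK": "SEQ ID NO: 3",
    "LVPR": "SEQ ID NO: 4",
    "IGER": "SEQ ID NO: 5",
}

# PLACEHOLDERS, not real sequences: SEQ ID NO: 13 is not shown in this
# section, so 'X' stands for each of its 93 residues (329-421 inclusive).
TOLA_III = "X" * (421 - 329 + 1)
PARTNER = "Z" * 10  # stands in for the non-TolA protein partner


def his_tag(n=6, n_ser=2):
    """N-terminal His tag with n = 4..10 (claim 9) plus optional Ser residues."""
    if not 4 <= n <= 10:
        raise ValueError("claim 9 recites n = 4 to 10")
    return "H" * n + "S" * n_ser


def assemble(tol_domain, partner, linker="", tag=""):
    """Concatenate N-terminus to C-terminus: tag, TolAIII, linker, partner."""
    return tag + tol_domain + linker + partner


def cleavage_sites(linker):
    """Return the claim 7 motifs that occur in the linker, sorted by name."""
    return sorted(motif for motif in CLEAVAGE_SITES if motif in linker)


if __name__ == "__main__":
    linker = "GSDDDDKGS"  # hypothetical linker carrying the DDDDK motif
    fusion = assemble(TOLA_III, PARTNER, linker=linker, tag=his_tag())
    print(len(fusion), cleavage_sites(linker))
```

The sketch places the tag at the N-terminus, as in claims 8 and 9; a signal peptide (claims 2 and 3) would be prepended in the same way.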


Patent Number 234560
Indian Patent Application Number 2305/DELNP/2004
PG Journal Number 26/2009
Publication Date 26-Jun-2009
Grant Date 08-Jun-2009
Date of Filing 09-Aug-2004
Name of Patentee NEWCASTLE UNIVERSITY VENTURES LIMITED
Applicant Address CENTRAL SQUARE SOUTH, ORCHARD STREET, NEWCASTLE-UPON-TYNE, NE1 3XX, UNITED KINGDOM
Inventors:
# Inventor's Name Inventor's Address
1 GOKCE, ISA GAZIOSMANPASA UNIVERSITESI, FEN EDEBIYAT FAKULTESI KIMYA BOLUMU, 60240 TOKAT, TURKEY
2 ANDERLUH, GREGOR DEPARTMENT OF BIOLOGY, BIOTECHNICAL FACULTY, UNIVERSITY OF LJUBLJANA, VECNA POT 111, 1000 LJUBLJANA, SLOVENIA
3 LAKEY, JEREMY HUGH SCHOOL OF CELL AND MOLECULAR BIOSCIENCES, THE MEDICAL SCHOOL, THE UNIVERSITY OF NEWCASTLE-UPON-TYNE NE2 4HH, UNITED KINGDOM
PCT International Classification Number C12N 15/62
PCT International Application Number PCT/GB2003/000078
PCT International Filing date 2003-01-10
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 0200689.8 2002-01-10 U.K.