Title of Invention

"SERINE PROTEASES, NUCLEIC ACIDS ENCODING SERINE ENZYMES AND VECTORS AND HOST CELLS INCORPORATING SAME"

Abstract The present invention provides novel serine proteases, novel genetic material encoding these enzymes, and proteolytic proteins obtained from Micrococcineae spp., including but not limited to Cellulomonas spp. and variant proteins developed therefrom. In particular, the present invention provides protease compositions obtained from a Cellulomonas spp., DNA encoding the protease, vectors comprising the DNA encoding the protease, host cells transformed with the vector DNA, and an enzyme produced by the host cells. The present invention also provides cleaning compositions (e.g., detergent compositions), animal feed compositions, and textile and leather processing compositions comprising protease(s) obtained from a Micrococcineae spp., including but not limited to Cellulomonas spp. In alternative embodiments, the present invention provides mutant (i.e., variant) proteases derived from the wild-type proteases described herein. These mutant proteases also find use in numerous applications.
Full Text SERINE PROTEASES, NUCLEIC ACIDS ENCODING SERINE ENZYMES AND VECTORS AND HOST CELLS INCORPORATING SAME
The present application claims priority under 35 U.S.C. §119 to co-pending U.S. Provisional Patent Application Serial Number 60/523,609, filed November 19, 2003.
FIELD OF THE INVENTION
The present invention provides novel serine proteases, novel genetic material encoding these enzymes, and proteolytic proteins obtained from Micrococcineae spp., including but not limited to Cellulomonas spp. and variant proteins developed therefrom. In particular, the present invention provides protease compositions obtained from a Cellulomonas spp., DNA encoding the protease, vectors comprising the DNA encoding the protease, host cells transformed with the vector DNA, and an enzyme produced by the host cells. The present invention also provides cleaning compositions (e.g., detergent compositions), animal feed compositions, and textile and leather processing compositions comprising protease(s) obtained from a Micrococcineae spp., including but not limited to Cellulomonas spp. In alternative embodiments, the present invention provides mutant (i.e., variant) proteases derived from the wild-type proteases described herein. These mutant proteases also find use in numerous applications.
BACKGROUND OF THE INVENTION
Serine proteases are a subgroup of carbonyl hydrolases comprising a diverse class of enzymes having a wide range of specificities and biological functions (See e.g., Stroud, Sci. Amer., 131:74-88). Despite their functional diversity, the catalytic machinery of serine proteases has been approached by at least two genetically distinct families of enzymes: 1) the subtilisins; and 2) the mammalian chymotrypsin-related and homologous bacterial serine proteases (e.g., trypsin and S. griseus trypsin). These two families of serine proteases show remarkably similar mechanisms of catalysis (See e.g., Kraut, Ann. Rev. Biochem., 46:331-358 [1977]). Furthermore, although the primary structure is unrelated, the tertiary structure of these two enzyme families brings together a conserved catalytic triad of amino acids consisting of serine, histidine and aspartate. The subtilisins and chymotrypsin-related serine proteases both have a catalytic triad comprising aspartate, histidine and serine. In the subtilisin-related proteases the relative order of these amino acids, reading from the amino to carboxy terminus, is aspartate-histidine-serine. However, in the chymotrypsin-related proteases, the relative order is histidine-aspartate-serine. Much research has been conducted on the subtilisins, due largely to their usefulness in cleaning and feed applications. Additional work has been focused on the adverse environmental conditions (e.g., exposure to oxidative agents, chelating agents, extremes of temperature and/or pH) which can adversely impact the functionality of these enzymes in various applications. Nonetheless, there remains a need in the art for enzyme systems that are able to resist these adverse conditions and retain or have improved activity over those currently known in the art.
SUMMARY OF THE INVENTION
The present invention provides novel serine proteases, novel genetic material encoding these enzymes, and proteolytic proteins obtained from Micrococcineae spp., including but not limited to Cellulomonas spp. and variant proteins developed therefrom. In particular, the present invention provides protease compositions obtained from a Cellulomonas spp., DNA encoding the protease, vectors comprising the DNA encoding the protease, host cells transformed with the vector DNA, and an enzyme produced by the host cells. The present invention also provides cleaning compositions (e.g., detergent compositions), animal feed compositions, and textile and leather processing compositions comprising protease(s) obtained from a Micrococcineae spp., including but not limited to Cellulomonas spp. In alternative embodiments, the present invention provides mutant (i.e., variant) proteases derived from the wild-type proteases described herein. These mutant proteases also find use in numerous applications.
The present invention provides isolated serine proteases obtained from a member of the Micrococcineae. In some embodiments, the proteases are cellulomonadins. In some preferred embodiments, the protease is obtained from an organism selected from the group consisting of Cellulomonas, Oerskovia, Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In some particularly preferred embodiments, the protease is obtained from Cellulomonas 69B4. In further embodiments, the protease comprises the amino acid sequence set forth in SEQ ID NO:8. In additional embodiments, the present invention provides isolated serine proteases comprising at least 45% amino acid identity with serine protease comprising SEQ ID NO:8. In some embodiments, the isolated serine proteases comprise at least 50% identity, preferably at least 55%, more preferably at least 60%, yet more preferably at least 65%, even more preferably at least 70%, more preferably at least 75%, still more preferably at least 80%, more preferably 85%, yet more preferably 90%, even more preferably at least 95%, and most preferably 99% identity.
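The identity thresholds above can be illustrated with a minimal sketch. The text here does not state how percent identity is computed, so the following assumes two sequences that have already been aligned to equal length with `-` marking gaps, and assumes that gapped columns are excluded from the count. The function names and the toy strings are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: inputs are pre-aligned, equal-length strings using '-' for gaps.
    Other conventions (e.g. dividing by full alignment length) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 45.0) -> bool:
    """True if identity is at least `threshold` percent (45% is the lowest tier named above)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under these assumptions, a candidate would be tested against each tier (45%, 50%, and so on up to 99%) by calling `meets_threshold` with the corresponding value.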
The present invention also provides compositions comprising isolated serine proteases having immunological cross-reactivity with the serine proteases obtained from the Micrococcineae. In some preferred embodiments, the serine proteases have immunological cross-reactivity with serine protease obtained from Cellulomonas 69B4. In alternative embodiments, the serine proteases have immunological cross-reactivity with serine protease comprising the amino acid sequence set forth in SEQ ID NO:8. In still further embodiments, the serine proteases have cross-reactivity with fragments (i.e., portions) of any of the serine proteases obtained from the Micrococcineae, the Cellulomonas 69B4 protease, and/or serine protease comprising the amino acid sequence set forth in SEQ ID NO:8.
In some embodiments, the present invention provides the amino acid sequence set forth in SEQ ID NO:8, wherein the sequence comprises substitutions at at least one amino acid position selected from the group comprising positions 2, 8, 10, 11, 12, 13, 14, 15, 16, 24, 26, 31, 33, 35, 36, 38, 39, 40, 43, 46, 49, 51, 54, 61, 64, 65, 67, 70, 71, 76, 78, 79, 81, 83, 85, 86, 90, 93, 99, 100, 105, 107, 109, 112, 113, 116, 118, 119, 121, 123, 127, 145, 155, 159, 160, 163, 165, 170, 174, 179, 183, 184, 185, 186, 187, and 188. In alternative embodiments, the sequence comprises substitutions at at least one amino acid position selected from the group comprising positions 1, 4, 22, 27, 28, 30, 32, 41, 47, 48, 55, 59, 63, 66, 69, 75, 77, 80, 84, 87, 88, 89, 92, 96, 110, 111, 114, 115, 117, 128, 134, 144, 143, 146, 151, 154, 156, 158, 161, 166, 176, 177, 181, 182, 187, and 189.
In some preferred embodiments, the present invention provides protease variants having an amino acid sequence comprising at least one substitution of an amino acid made at a position equivalent to a position in a Cellulomonas 69B4 protease comprising the amino acid sequence set forth in SEQ ID NO:8. In alternative embodiments, the present invention provides protease variants having an amino acid sequence comprising at least one substitution of an amino acid made at a position equivalent to a position in a Cellulomonas 69B4 protease comprising at least a portion of SEQ ID NO:8. In some embodiments, the substitutions are made at positions equivalent to positions 2, 8, 10, 11, 12, 13, 14, 15, 16, 24, 26, 31, 33, 35, 36, 38, 39, 40, 43, 46, 49, 51, 54, 61, 64, 65, 67, 70, 71, 76, 78, 79, 81, 83, 85, 86, 90, 93, 99, 100, 105, 107, 109, 112, 113, 116, 118, 119, 121, 123, 127, 145, 155, 159, 160, 163, 165, 170, 174, 179, 183, 184, 185, 186, 187, and 188 in a Cellulomonas 69B4 protease having an amino acid sequence set forth in SEQ ID NO:8. In alternative embodiments, the substitutions are made at positions equivalent to positions 1, 4, 22, 27, 28, 30, 32, 41, 47, 48, 55, 59, 63, 66, 69, 75, 77, 80, 84, 87, 88, 89, 92, 96, 110, 111, 114, 115, 117, 128, 134, 144, 143, 146, 151, 154, 156, 158, 161, 166, 176, 177, 181, 182, 187, and 189, in a Cellulomonas 69B4 protease having an amino acid sequence set forth in SEQ ID NO:8. In some preferred embodiments, the protease variants comprise the amino acid sequence comprising SEQ ID NO:8, wherein at least one amino acid position selected from the group consisting of 14, 16, 35, 36, 65, 75, 76, 79, 123, 127, 159, and 179 is substituted with another amino acid. In some particularly preferred embodiments, the proteases comprise at least one mutation selected from the group consisting of R14L, R16I, R16L, R16Q, R35F, T36S, G65Q, Y75G, N76L, N76V, R79T, R123L, R123Q, R127A, R127K, R127Q, R159K, R159Q, and R179Q. In some alternative preferred embodiments, the proteases comprise multiple mutations selected from the group consisting of R16Q/R35F/R159Q, R16Q/R123L, R14L/R127Q/R159Q, R14L/R179Q, R123L/R127Q/R179Q, R16Q/R79T/R127Q, and R16Q/R79T. In some particularly preferred embodiments, the proteases comprise the following mutations: R123L, R127Q, and R179Q.
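The substitution codes used above follow the pattern of original residue, position, replacement residue (e.g., R14L), with combinations joined by a slash (e.g., R123L/R127Q/R179Q). As a minimal sketch of how such codes could be parsed and applied, assuming 1-based positions and single-letter amino acid codes, consider the following. The helper names and the toy sequence in the usage note are hypothetical; SEQ ID NO:8 itself is not reproduced here.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    original: str      # single-letter code of the wild-type residue
    position: int      # 1-based position in the reference sequence
    replacement: str   # single-letter code of the substituted residue


_CODE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code: str) -> Substitution:
    """Parse one code such as 'R14L' into (original, position, replacement)."""
    match = _CODE.match(code.strip())
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def parse_combination(codes: str) -> List[Substitution]:
    """Parse a slash-separated combination such as 'R123L/R127Q/R179Q'."""
    return [parse_substitution(code) for code in codes.split("/")]


def apply_substitutions(sequence: str, subs: List[Substitution]) -> str:
    """Return a copy of `sequence` with each substitution applied.

    Raises ValueError if a position is out of range or the residue found
    there does not match the wild-type residue named in the code.
    """
    residues = list(sequence)
    for sub in subs:
        index = sub.position - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"position {sub.position} is outside the sequence")
        if residues[index] != sub.original:
            raise ValueError(
                f"expected {sub.original} at position {sub.position}, "
                f"found {residues[index]}"
            )
        residues[index] = sub.replacement
    return "".join(residues)
```

For example, with the hypothetical four-residue string `"ARGR"`, `apply_substitutions("ARGR", parse_combination("R2K/R4Q"))` yields `"AKGQ"`. The wild-type check is a design choice that catches codes written against a different reference numbering.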
The present invention also provides protease variants having amino acid sequences comprising at least one substitution selected from the group consisting of T36I, A38R, N170Y, N73T, G77T, N24A, T36G, N24E, L69S, T36N, T36S, E119R, N74G, T36W, S76W, N24T, N24Q, T36P, S76Y, T36H, G54D, G78A, S187P, R179V, N24V, V90P, T36D, L69H, G65P, G65R, N7L, W103M, N55F, G186E, A70H, S76V, G186V, R159F, T36Y, T36V, G65V, N24M, S51A, G65Y, Q71I, V66H, P118A, T116F, A38F, N24H, V66D, S76L, G177M, G186I, H85Q, Q71K, Q71G, G65S, A38D, P118F, A38S, G65T, N67G, T36R, P118R, S114G, Y75I, I181H, G65Q, Y75G, T36F, A38H, R179M, T183I, G78S, A64W, Y75F, G77S, N24L, W103I, V3L, Q81V, R179D, G54R, T36L, Q71M, A70S, G49F, G54L, G54H, G78H, R179I, Q81K, V90I, A38L, N67L, T109I, R179N, V66I, G78T, R179Y, S187T, N67K, N73S, E119K, V3I, Q71H, 111Q, A64H, R14E, R179T, L69V, V150L, Q71A, G65L, Q71N, V90S, A64N, 111 A, N145I, H85T, A64Y, N145Q, V66L, S92G, S188M, G78D, N67A, N7S, V80H, G54K, A70D, P118H, D2G, G54M, Q81H, D2Q, V66E, R79P, A38N, N145E, R179L, T109H, R179K, V66A, G54A, G78N, T109A, R179A, N7A, R179E, H104K, A64R, and V80L. In further embodiments, the amino acid sequences of the protease variants comprise at least one substitution selected from the group consisting of H85R, H85L, T62I, N67H, G54I, N24F, T40V, T86A, G63V, G54Q, A64F, G77Y, R35F, T129S, R61M, I126L, S76N, T182V, R79G, T109P, R127F, R123E, P118I, T109R, 171S, T183K, N67T, P89N, F1T, A64K, G78I, T109L, G78V, A64M, A64S, T10G, G77N, A64L, N67D, S76T, N42H, D184F, D184R, S76I, S78R, A38K, V72I, V3T, T107S, A38V, F47I, N55Q, S76E, P118Q, T109G, Q71D, P118K, N67S, Q167N, N145G, I28L, 111T, A64I, G49K, G49A, G65A, N170D, H85K, S185I, I181N, V80F, L69W, S76R, D184H, V150M, T183M, N67Q, S51Q, A38Y, T107V, N145T, Q71F, A83N, S76A, N67R, T151L, T163L, S51F, Q81I, F47M, A41N, P118E, N67Y, T107M, N73H, 67V, G63W, T10K, I181G, S187E, T107H, D2A, L142V, A143N, A8G, S187L, V90A, G49L, N170L, G65H, T36C, G12W, S76Q, A143S, F1A, N7H, S185V, A110T, N55K, N67F, N7I, A110S, N170A, Q81D, A64Q, Q71L, A38I, N112I, V90T, N145L, A64T, 111S, A30S, R123I, D2H, V66M, Q71R, V90L, L68W, N24S, R159E, V66N, D184Q, E133Q, A64V, D2N, G13M, T40S, S76K, G177S, G63Q, S15F, A8K, A70G, and A38G. In some preferred embodiments, these variants have improved casein hydrolysis performance as compared to wild-type Cellulomonas 69B4 protease.
The present invention also provides protease variants having amino acid sequences comprising at least one substitution selected from the group consisting of R35E, R35D, R14E, R14D, Q167E, G49C, S15R, S15H, 111W, S15C, G49Q, R35Q, R35V, G49E, R123D, R123Y, G49H, A38D, R35S, F47R, R123C, T151L, R14T, R35T, R123E, G49A, G49V, D56L, R35N, R35A, G12D, R35C, R123N, T46V, R123H, S155C, T121E, R127E, S113C, R123T, R16E, T46F, T121L, A38C, T46E, R123W, T44E, N55G, A8G, E119G, R35P, R14G, F59W, R127S, R61E, RMS, S155W, R123F, R123S, G49N, R127D, E119Y, A48E, N170D, R159T, S99A, G12Q, P118R, F165W, R127Q, R35H, G12N, A22C, G12V, R16T, Y57G, T100A, T46Y, R159E, E119R, T107R, T151C, G54C, E119T, R61V, 111E, R14I, R61M, S15E, A22S, R16C, T36C, R16V, L125Q, M180L, R123Q, R14A, R14Q, R35M, R127K, R159Q, N112P, G124D, R179E, G49L, A41D, G177D, R123V, E119V, T10L, T109E, R179D, G12S, T10C, G91Q, S15Y, S155Y, R14C, T163D, T121F, R14N, F165E, N24E, A41C, R61T, G12I, P118K, T46C, 111T, R159D, N170C, R159V, S155I, 111Q, D2P, T100R, R159S, S114C, R16D, and P134R.
In alternative embodiments, the protease variants have amino acid sequences comprising at least one substitution selected from the group consisting of S99G, T100K, R127A, F1P, S155V, T128A, F165H, G177E, A70M, S140P, A87E, D2I, R159K, T36V, R179C, E119N, T10Y, I172A, A8T, F47V, W103L, R61K, D2V, R179V, D2T, R159N, E119A, G54E, R16Q, G49S, R16I, S51L, S155E, S15M, R179I, T10Q, G12H, R159C, R179T, T163C, R159A, A132S, N157D, G13E, L141M, A41T, R123M, R14M, A8R, Q81P, N24T, T10D, A88F, R61Q, S99K, R179Y, T121A, N112E, S155T, T151V, S99Q, T10E, S92T, T109K, T44C, R123A, A87C, S15F, S155F, D56F, T10F, A83H, R179M, T121D, G13D, P118C, G49F, Q174C, S114E, T86E, F1N, T115C, R127C, R123K, V66N, G12Y, S113A, S15N, A175T, R79T, R123G, R179S, R179N, R123I, P118A, S187E, N112D, A70G, E119L, E119S, R159M, R14H, R179F, A64C, A41S, R179W, N24G, T100Q, P118W, Q81G, G49K, R14L, N55A, R35K, R79V, D2M, T160D, A83D, R179L, S51A, G12P, S99H, N42D, S188E, T10M, L125M, T116N, A70P, Q174S, G65D, S113D, E119Q, A83E, N170L, Q81A, S51C, P118G, Q174T, I28V, S15G, and T116G. In some preferred embodiments, these variants have improved LAS stability as compared to wild-type Cellulomonas 69B4 protease.
The present invention also provides protease variants having amino acid sequences comprising at least one substitution selected from the group consisting of G26I, G26K, G26Q, G26V, G26W, F27V, F27W, I28P, T29E, T129W, T40D, T40Q, R43D, P43H, P43K, P43L, A22C, T40H, P89W, G91L, S18E, F59K, A30M, A30N, G31M, C33M, G161L, G161V, P43N, G26E, N73P, G84C, G84P, G45V, C33L, Y9E, Y9P, A147E, C158H, I28W, A48P, A22S, T62R, S137R, S155P, S155R, G156I, G156L, Q81A, R96C, I4D, I4P, A70P, C105E, C105G, C105K, C105M, C105N, C105S, T128A, T128V, T128G, S140P, G12D, C33N, C33E, T164G, G45A, G156P, S99A, Q167L, S155W, I28T, R96F, A30P, R123W, T40P, T39R, C105P, T100A, C105W, S155K, T46Y, R123F, I4G, S155Y, T46V, A93S, Y57N, Q81S, G186S, G31H, T10Y, G31V, A83H, A38D, R123Y, R79T, C158G, G31Y, Q81P, R96E, A30Y, R159K, A22T, T40N, Y57M, G31N, Q81G, T164L, T121E, T10F, Q146P, R123N, VSR, P43G, Q81H, Q81D, G161I, C158M, N24T, T10W, T128S, T160I, Y176P, S155F, T128C, L125A, P168Y, T62G, F166S, S188A, Q81F, T46W, A70G, and A38G. 
In alternative embodiments, the protease variants have amino acid sequences comprising at least one substitution selected from the group consisting of S188E, S188V, Y117K, Y117Q, Y117R, Y117V, R127K, R127Q, R123L, T86S, R123I, Q81E, L125M, H32A, S188T, N74F, C33D, F27I, A83M, Q71Y, R123T, V90A, F59W, L141C, N170E, T46F, S51V, G162P, S185R, A41S, R79V, T151C, T107S, T129Y, M180L, F166C, C105T, T160E, P89A, R159T, T183P, S188M, T10L, G25S, N24S, E119L, T107L, T107Q, G161K, G15Q, S15R, G153K, G153V, S188G, A83E, G186P, T121D, G49A, S15C, C105Y, C105A, R127F, Q71A, T10C, R179K, T86I, W103N, A87S, F166A, A83F, R123Q, A132C, A143H, T163I, T39V, A93D, V90M, R123K, P134W, G177N, V115I, S155T, T110D, G105L, N170D, T107A, G84V, G84M, L111K, P168I, G154L, T183I, S99G, S15T, A8G, S15N, P189S, S188C, T100Q, A110G, A121A, G12A, R159V, G31A, G154R, T182L, V115L, T160Q, T107F, R159Q, G144A, S92T, T101S, A83R, G12HM S15H, T116Q, T36V, G154, Q81C, V130T, T183A, P118T, A87E, T86M, V150N, and N24E. In some preferred embodiments, these variants have improved thermostability as compared to wild-type Cellulomonas 69B4 protease.
The present invention also provides protease variants having amino acid sequences comprising at least one substitution selected from the group consisting of T36I, I172T, N24E, N170Y, G77T, G186N, I181L, N73T, A38R, N74G, N24A, G54D, S76D, R123E, 159E, N112E, R35E, R179V, R123D, N24T, R179T, R14L, A38D, V90P, R14Q, R123I, R179D, S76V, R79G, R35L, S76E, S76Y, R79D, R79P, R35Q, R179N, N112D, R179E, G65P, Y75G, V90S, R179M, R35F, R123F, A64I, N24Q, R14I, R179A, R127A, R179I, N170D, R35A, R159F, T109E, R14D, N67D, G49A, N112Q, G78D, T121E, L69S, T116E, V90I, T36S, T36G, N145E, T86D, S51D, R179K, T107E, T129S, L142V, R79A, R79E, A38H, T107S, R123A, N55E, R123L, R159N, G65D, R14N, G65Q, R123Q, N24V, R14G, T116Q, A38N, R159Q, R179Y, A83E, N112L, S99N, G78A, T10N, H85Q, R35Q, N24L, N24H, G49S, R79L, S76T, S76L, G65S, N55F, R79V, G65T, R123N, T86E, Y75F, F1T, S76N, S99V, R79T, N112V, R79M, T107V, R79S, G54E, G65V, R127Q, R159D, T107H, H85T, R35T, T36N, Q81E, R123H, S76I, A38F, V90T, and R14T. In alternative embodiments, the protease variants have amino acid sequences comprising at least one substitution selected from the group consisting of G65L, S99D, T107M, S113T, S99T, G77S, R14M, A64N, R61M, A70D, Q71G, A93D, S92G, N112Y, S15W, R159K, N67G, T10E, R127H, A64Y, R159C, A38L, T160E, T183E, R127S, A8E, S51Q, N7L, G63D, A38S, R35H, R14K, T107I, G12D, A64L, S76W, A41N, R35M, A64V, A38Y, T183I, W103M, A41D, R127K, T36D, R61T, G65Y, G13S, R35Y, R123T, A64H, G49H, A70H, A64F, R127Y, R61E, A64P, T121D, V115A, R123Y, T101S, T182V, H85L, N24M, R127E, N145D, Q71H, S76Q, A64T, G49F, A64Q, T10D, F1D, A70G, R35W, Q71D, N121I, A64M, T36H, A8G, T107N, R35S, N67T, S92A, N170L, N67E, S114A, R14A, R14S, Q81D, S51H, R123S, A93S, R127F, 119V, T40V, S185N, R123G, R179L, S51V, T163D, T109I, A64S, V72I, N67S, R159S, H85M, T109G, Q71S, R61H, T107A, Q81V, V90N, T109A, A38T, N145T, R159A, A110S, Q81H, A48E, S51T, A64W, R159L, N67H, A93E, T116F, R61S, R123V, V3L, and R159Y. In some preferred embodiments, these variants have improved keratin hydrolysis activity as compared to wild-type Cellulomonas 69B4 protease.
The present invention also provides protease variants having amino acid sequences comprising at least one substitution selected from the group consisting of T36I, P89D, A93T, A93S, T36N, N73T, T36G, R159F, T36S, A38R, S99W, S76W, T36P, G77T, G54D, R127A, R159E, H85Q, T36D, S76L, S99N, Y75G, S76Y, R127S, N24E, R127Q, D184F, N170Y, N24A, S76T, H85L, Y75F, S76V, L69S, R159K, R127K, G65P, N74G, R159H, G65Q, G186V, A48Q, T36H, N67L, R14I, R127L, T36Y, S76I, S114G, R127H, S187P, V3L, G78D, R123I, I181Q, R35F, H85R, R127Y, N67S, Q81P, R123F, R159N, S99A, S76D, A132V, R127F, A143N, S92A, N24T, R79P, S76N, R14M, G186E, N24Q, N67A, R127T, H85K, G65T, G65Y, R179V, Y75I, 111Q, A38L, T36L, R159Y, R159D, N24V, G65S, N157D, G186I, G54Q, N67Y, R127G, S76A, A38S, T109E, V66H, T116F, R123L, G49A, A64H, T36W, D184H, S99D, G161K, P134E, A64F, N67G, S99T, D2Q, S76E, R16Q, G54N, N67V, R35L, Q71I, N7L, N112E, L69H, N24H, G54I, R16L, N24M, A64Y, S113A, H85F, R79G, 111 A, T121D, R61V, and G65L. In alternative embodiments, the protease variants have amino acid sequences comprising at least one substitution selected from the group consisting of N67Q, S187Q, Q71H, T163D, R61K, R159V, Q71F, V31F, V90I, R79D, T160E, R123Q, A38Y, S113G, A88F, A70G, 111T, G78A, N24L, S92G, R14L, D184R, G54L, N112L, H85Y, R16N, G77S, R179T, V80L, G65V, T121E, Q71D, R16G, P89N, N42H, G49F, 111S, R61M, R159C, G65R, T183I, A93D, L111E, S51Q, G78N, N67T, A38N, T40V, A64W, R159L, T10E, R179K, R123E, V90P, A64N, G161E, H85T, A8G, L142V, A41N, S185I, Q71L, A64T, R16I, A38D, G54M, N112Q, R16A, R14E, V80H, N170D, S99G, R179N, S15E, G49H, A70P, A64S, G54A, S185W, R61H, T10Q, A38F, N170L, T10L, N67F, G12D, D184T, R14N, S187E, R14P, N112D, S140A, N112G, G49S, L111D, N67M, V150L, G12Y, R123K, P89V, V66D, G77N, S51T, A8D, I181H, T86N, R179D, N55F, N24S, D184L, R61S, N67K, G186L, F1T, R159A, 111L, R61T, D184Q, A93E, Q71T, R179E, L69W, T163I, S188Q, L125V, A38V, R35A, P134G, A64V, N145D, V90T, and A143S. In some preferred embodiments, these variants have improved BMI performance as compared to wild-type Cellulomonas 69B4 protease.
The present invention also provides protease variants having amino acid sequences comprising at least one substitution selected from the group consisting of T36I, N170Y, A38R, R79P, G77T, L69S, N73T, S76V, S76Y, R179V, T36N, N55F, R159F, G54D, G65P, L69H, T36G, G177M, N24E, N74G, R159E, T36S, Y75G, S76I, S76D, A8R, A24A, V90P, R159C, G65Q, T121E, A8V, S76L, T109E, R179M, A8T, T107N, G186E, S76W, R123E, A38F, T36P, N67G, Y75F, S76N, R179I, S187P, N67V, V90S, R127A, R179Y, R35F, N145S, G65S, R61M, S51A, R179N, R123D, N24T, N55E, R79C, G186V, R123I, G161E, G65Y, A38S, R14L, V90I, R79G, N145E, N67L, R127S, R150Y, M180D, N67T, A93D, T121D, Q81V, T109I, A93E, T107S, R179T, R179L, R179K, R159D, R179A, R79E, R123F, R79D, T36D, A64N, L142V, T109A, I172V, A83N, T85A, R179D, A38L, I126L, R127Q, R127L, L69W, R127K, G65T, R127H, P134A, N67D, R14M, N24Q, A143N, N55S, N67M, S51D, S76E, T163D, A38D, R159K, T183I, G63V, A8S, T107M, H85Q, N112E, N67F, N67S, A64H, T86I, P134E, T182V, N67Y, A64S, G78D, V90T, R61T, R16Q, G65R, T86L, V90N, R159Q, G54I, S76C, R179E, V66D, L69V, R127Y, R35L, R14E, and T86F. In alternative embodiments, the protease variants have amino acid sequences comprising at least one substitution selected from the group consisting of G186I, A64Q, T109G, G64L, N24L, A8E, N112D, A38H, R179W, S114G, R123L, A8L, T129S, N170D, R159N, N67C, S92C, T107A, G54E, T107E, T36V, R127T, A8M, H85L, A110S, N170C, A64R, A132V, T36Y, G63D, W103M, T151V, R123P, W103Y, S76T, S187T, R127F, N67A, P171M, A70S, R159H, S76Q, L125V, G54Q, G49L, R14I, R14Q, A83I, V90L, T183E, R159A, T101S, G65D, G54A, T107Q, Q71M, T86E, N24M, N55Q, R61V, P134D, R96K, A88F, N145Q, A64M, A64T, N24V, S140A, A8H, A64I, R123Q, T183Q, N24H, A64W, T62I, T129G, R35A, T40V, 111T, A38N, N145G, A175T, G77Q, T109H, A8P, R35E, T109N, A110T, N67Q, G63P, H85R, S92G, A175V, S51Q, G63Q, T116F, G65A, R79L, N145P, L69Q, Q146D, A83D, F166Y, R123A, T121L, R123H, A70P, T182W, S76A, A64F, T107H, G186L, Q81I, R123K, A64L, N67R, V3L, S187E, S161K, T86M, I4M, G77N, G49A, A41N, G54M, T107V, Q81E, A38I, T109L, T183K, A70G, Q71D, T183L, Q81H, A64V, A93Q, S188E, S51F, G186P, G186T, R159L, P134G, N145T, N55V, V66E, R159V, Y176L, and R16L. In some preferred embodiments, these variants have improved BMI performance under low pH conditions, as compared to wild-type Cellulomonas 69B4 protease.
The present invention also provides serine proteases comprising at least a portion of an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:6, SEQ ID NO:7, and SEQ ID NO:9. In some embodiments, the nucleotide sequences encoding these serine proteases comprise a nucleotide sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, and SEQ ID NO:5. In some embodiments, the serine proteases are variants having amino acid sequences that are similar to that set forth in SEQ ID NO:8. In some preferred embodiments, the proteases are obtained from a member of the Micrococcineae. In some particularly preferred embodiments, the proteases are obtained from an organism selected from the group consisting of Cellulomonas, Oerskovia, Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In some particularly preferred embodiments, the protease is obtained from variants of Cellulomonas 69B4.
The present invention also provides isolated protease variants having amino acid sequences comprising at least one substitution of an amino acid made at a position equivalent to a position in a Cellulomonas 69B4 protease comprising the amino acid sequence set forth in SEQ ID NO:8, wherein the amino acid of the protease comprises Arg14, Ser15, Arg16, Cys17, His32, Cys33, Phe52, Asp56, Thr100, Val115, Thr116, Tyr117, Pro118, Glu119, Ala132, Glu133, Pro134, Gly135, Asp136, Ser137, Thr151, Ser152, Gly153, Gly154, Ser155, Gly156, Asn157, Thr164, and Phe165. In some embodiments, the catalytic triad of the proteases comprises His32, Asp56, and Ser137. In alternative embodiments, the proteases comprise Cys131, Ala132, Glu133, Pro134, Gly135, Thr151, Ser152, Gly153, Gly154, Ser155, Gly156, Asn157 and Gly162, Thr163, and Thr164. In some preferred embodiments, the amino acid sequence of the proteases comprises Phe52, Tyr117, Pro118 and Glu119. In some particularly preferred embodiments, the amino acid sequences of the proteases have main-chain to main-chain hydrogen bonding from Gly154 to the substrate main-chain.
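The Background distinguishes the two serine protease families by the order of the catalytic triad residues along the chain: aspartate-histidine-serine for the subtilisin-related proteases and histidine-aspartate-serine for the chymotrypsin-related proteases. As an illustrative sketch only, that ordering rule can be applied to the triad positions given above (His32, Asp56, Ser137). The function name and return labels are hypothetical, and residue order alone is not offered as a complete classification method.

```python
def triad_family(asp: int, his: int, ser: int) -> str:
    """Classify a catalytic triad by the N- to C-terminal order of its residues.

    Uses only the ordering rule stated in the Background:
      aspartate-histidine-serine -> subtilisin-like
      histidine-aspartate-serine -> chymotrypsin-like
    Any other order is reported as 'unclassified'.
    """
    if len({asp, his, ser}) != 3:
        raise ValueError("triad residues must occupy three distinct positions")
    # Sort residue names by their position in the sequence.
    order = tuple(
        name
        for _, name in sorted(
            [(asp, "aspartate"), (his, "histidine"), (ser, "serine")]
        )
    )
    if order == ("aspartate", "histidine", "serine"):
        return "subtilisin-like"
    if order == ("histidine", "aspartate", "serine"):
        return "chymotrypsin-like"
    return "unclassified"
```

Calling `triad_family(asp=56, his=32, ser=137)` with the positions recited above returns `"chymotrypsin-like"`, since histidine precedes aspartate, which precedes serine.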

In embodiments, the proteases of the present invention comprise three disulfide bonds. In some preferred embodiments, the disulfide bonds are located between C17 and C33, C95 and C105, and C131 and C158. In some particularly preferred embodiments, the disulfide bonds are located between C17 and C33, C95 and C105, and C131 and C158 of SEQ ID NO:8. In alternative protease variant embodiments, the disulfide bonds are located at positions equivalent to the disulfide bonds in SEQ ID NO:8.
The present invention also provides isolated protease variants having amino acid sequences comprising at least one substitution of an amino acid made at a position equivalent to a position in a Cellulomonas 69B4 protease comprising the amino acid sequence set forth in SEQ ID NO:8, wherein the variants have altered substrate specificities as compared to wild-type Cellulomonas 69B4 protease. In some further preferred embodiments, the variants have altered pIs as compared to wild-type Cellulomonas 69B4 protease. In additional preferred embodiments, the variants have improved stability as compared to wild-type Cellulomonas 69B4 protease. In still further preferred embodiments, the variants exhibit altered surface properties. In some particularly preferred embodiments, the variants exhibit altered surface properties as compared to wild-type Cellulomonas 69B4 protease. In additional particularly preferred embodiments, the variants comprise at least one substitution at a site selected from the group consisting of 1, 2, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 22, 24, 25, 32, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 57, 59, 61, 62, 63, 64, 65, 66, 67, 68, 69, 71, 73, 74, 75, 76, 77, 78, 79, 80, 81, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 95, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 123, 124, 126, 127, 128, 130, 131, 132, 133, 134, 135, 137, 143, 144, 145, 146, 147, 148, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 170, 171, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, and 184.
The present invention also provides protease variants having at least one improved property as compared to the wild-type protease. In some particularly preferred embodiments, the variants are variants of a serine protease obtained from a member of the Micrococcineae. In some particularly preferred embodiments, the proteases are obtained from an organism selected from the group consisting of Cellulomonas, Oerskovia, Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In some particularly preferred embodiments, the protease is obtained from variants of Cellulomonas 69B4. In some preferred embodiments, at least one improved property is selected from the group consisting of acid stability, thermostability, casein hydrolysis, keratin hydrolysis, cleaning performance, and LAS stability.

The present invention also provides expression vectors comprising a polynucleotide sequence encoding protease variants having amino acid sequences comprising at least one substitution of an amino acid made at a position equivalent to a position in a Cellulomonas 69B4 protease comprising the amino acid sequence set forth in SEQ ID NO:8. In further embodiments, the present invention provides host cells comprising these expression vectors. In some particularly preferred embodiments, the host cells are selected from the group consisting of Bacillus sp., Streptomyces sp., Aspergillus sp., and Trichoderma sp. The present invention also provides the serine proteases produced by the host cells.
The present invention also provides variant proteases comprising an amino acid sequence selected from the group consisting of SEQ ID NOS:54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 78. In some preferred embodiments, the amino acid sequence is encoded by a polynucleotide sequence selected from the group consisting of SEQ ID NOS:53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, and 77. In further embodiments, the present invention provides expression vectors comprising a polynucleotide sequence encoding at least one protease variant. In additional embodiments, the present invention provides host cells comprising these expression vectors. In some particularly preferred embodiments, the host cells are selected from the group consisting of Bacillus sp., Streptomyces sp., Aspergillus sp., and Trichoderma sp. The present invention also provides the serine proteases produced by the host cells.
The present invention also provides compositions comprising at least a portion of an isolated serine protease obtained from a member of the Micrococcineae, wherein the protease is encoded by a polynucleotide sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, and SEQ ID NO:4. In some preferred embodiments, the sequence comprises at least a portion of SEQ ID NO:1. In further embodiments, the present invention provides host cells comprising these expression vectors. In some particularly preferred embodiments, the host cells are selected from the group consisting of Bacillus sp., Streptomyces sp., Aspergillus sp., and Trichoderma sp. The present invention also provides the serine proteases produced by the host cells.
The present invention also provides variant serine proteases, wherein the proteases comprise at least one substitution corresponding to the amino acid positions in SEQ ID NO:8, and wherein variant proteases have better performance in at least one property selected from the group consisting of keratin hydrolysis, thermostability, casein activity, LAS stability, and cleaning, as compared to wild-type Cellulomonas 69B4 protease.
The present invention also provides isolated polynucleotides comprising a nucleotide sequence (i) having at least 70% identity to SEQ ID NO:4, or (ii) being capable of hybridizing to a probe derived from the nucleotide sequence set forth in SEQ ID NO:4, under conditions of intermediate to high stringency, or (iii) being complementary to the nucleotide sequence set forth in SEQ ID NO:4. In embodiments, the present invention provides expression vectors encoding at least one such polynucleotide. In further embodiments, the present invention provides host cells comprising these expression vectors. In some particularly preferred embodiments, the host cells are selected from the group consisting of Bacillus sp., Streptomyces sp., Aspergillus sp., and Trichoderma sp. The present invention also provides the serine proteases produced by the host cells. In further embodiments, the present invention provides polynucleotides that are complementary to at least a portion of the sequence set forth in SEQ ID NO:4.
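The notion of a polynucleotide being complementary to at least a portion of a target sequence can be sketched as follows. This is an illustration under stated assumptions: sequences are plain DNA strings over A, C, G and T, complementarity means exact Watson-Crick pairing in antiparallel orientation, and hybridization stringency is not modeled at all. The function names and the toy sequences in the usage note are hypothetical; SEQ ID NO:4 is not reproduced.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an A/C/G/T string (case-insensitive input)."""
    upper = dna.upper()
    if set(upper) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    return upper.translate(_COMPLEMENT)[::-1]


def is_complementary_to_portion(candidate: str, target: str) -> bool:
    """True if `candidate` is the exact reverse complement of a contiguous part of `target`.

    Assumption: exact matching only; mismatches tolerated under real
    hybridization conditions are deliberately ignored in this sketch.
    """
    if not candidate:
        return False
    return reverse_complement(candidate) in target.upper()
```

For example, `reverse_complement("ATGC")` gives `"GCAT"`, and `is_complementary_to_portion("GCAT", "TTATGCAA")` is true because the reverse complement of the candidate, `ATGC`, occurs within the toy target.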
The present invention also provides methods of producing an enzyme having protease activity, comprising: transforming a host cell with an expression vector comprising a polynucleotide having at least 70% sequence identity to SEQ ID NO:4; and cultivating the transformed host cell under conditions suitable for the host cell to produce the protease. In some embodiments, the host cell is selected from the group consisting of Streptomyces, Aspergillus, Trichoderma and Bacillus species.
The present invention also provides probes comprising a sequence of 4 to 150 nucleotides that is substantially identical to a corresponding fragment of SEQ ID NO:4, wherein the probe is used to detect a nucleic acid sequence coding for an enzyme having proteolytic activity, and wherein the nucleic acid sequence is obtained from a member of the Micrococcineae. In some embodiments, the Micrococcineae is a Cellulomonas spp. In some preferred embodiments, the Cellulomonas is Cellulomonas strain 69B4.
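As an illustration only, the probe criteria above can be sketched in a few lines of Python. The sketch assumes exact matching as a stand-in for "substantially identical", and the helper names and the toy reference sequence are hypothetical; the toy sequence is not SEQ ID NO:4.

```python
# Illustrative sketch: the names and the toy sequence are assumptions,
# and exact matching stands in for "substantially identical".
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_candidate_probe(probe: str, reference: str,
                       min_len: int = 4, max_len: int = 150) -> bool:
    """Check the length window and whether the probe matches a fragment
    of the reference on either strand."""
    probe = probe.upper()
    reference = reference.upper()
    if not (min_len <= len(probe) <= max_len):
        return False
    return probe in reference or reverse_complement(probe) in reference


# Toy reference sequence, NOT the sequence of SEQ ID NO:4.
toy_reference = "ATGACCGGTTCAGCTA"
print(is_candidate_probe("GGTT", toy_reference))  # forward-strand fragment
print(is_candidate_probe("AACC", toy_reference))  # matches via the reverse complement
print(is_candidate_probe("AAC", toy_reference))   # shorter than 4 nucleotides
```

In practice, hybridization under intermediate to high stringency tolerates some mismatches, so a real screen would likely use an alignment tool rather than exact substring matching.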
The present invention also provides cleaning compositions comprising at least one serine protease obtained from a member of the Micrococcineae. In some embodiments, at least one protease is obtained from an organism selected from the group consisting of Cellulomonas, Oerskovia, Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In some preferred embodiments, the protease is obtained from Cellulomonas 69B4. In some particularly preferred embodiments, at least one protease comprises the amino acid sequence set forth in SEQ ID NO:8. In some further embodiments, the present invention provides isolated serine proteases comprising at least 45% amino acid identity with serine protease comprising SEQ ID NO:8. In some embodiments, the isolated serine proteases comprise at least 50% identity, preferably at least 55%, more preferably at least 60%, yet more preferably at least 65%, even more preferably at least 70%, more preferably at least 75%, still more preferably at least 80%, more preferably 85%, yet more preferably 90%, even more preferably at least 95%, and most preferably 99% identity.

The present invention further provides cleaning compositions comprising at least one serine protease, wherein at least one of the serine proteases has immunological cross-reactivity with the serine protease obtained from a member of the Micrococcineae. In some preferred embodiments, the serine proteases have immunological cross-reactivity with serine protease obtained from Cellulomonas 69B4. In alternative embodiments, the serine proteases have immunological cross-reactivity with serine protease comprising the amino acid sequence set forth in SEQ ID NO:8. In still further embodiments, the serine proteases have cross-reactivity with fragments (i.e., portions) of any of the serine proteases obtained from the Micrococcineae, the Cellulomonas 69B4 protease, and/or serine protease comprising the amino acid sequence set forth in SEQ ID NO:8.
The present invention further provides cleaning compositions comprising at least one serine protease, wherein the protease is a variant protease having an amino acid sequence comprising at least one substitution of an amino acid made at a position equivalent to a position in a Cellulomonas 69B4 protease having an amino acid sequence set forth in SEQ ID NO:8. In some embodiments, the substitutions are made at positions equivalent to positions 2, 8, 10, 11, 12, 13, 14, 15, 16, 24, 26, 31, 33, 35, 36, 38, 39, 40, 43, 46, 49, 51, 54, 61, 64, 65, 67, 70, 71, 76, 78, 79, 81, 83, 85, 86, 90, 93, 99, 100, 105, 107, 109, 112, 113, 116, 118, 119, 121, 123, 127, 145, 155, 159, 160, 163, 165, 170, 174, 179, 183, 184, 185, 186, 187, and 188 in a Cellulomonas 69B4 protease comprising an amino acid sequence set forth in SEQ ID NO:8. In alternative embodiments, the substitutions are made at positions equivalent to positions 1, 4, 22, 27, 28, 30, 32, 41, 47, 48, 55, 59, 63, 66, 69, 75, 77, 80, 84, 87, 88, 89, 92, 96, 110, 111, 114, 115, 117, 128, 134, 144, 143, 146, 151, 154, 156, 158, 161, 166, 176, 177, 181, 182, 187, and 189, in a Cellulomonas 69B4 protease comprising an amino acid sequence set forth in SEQ ID NO:8. In further embodiments, the protease comprises at least one amino acid substitution at positions 14, 16, 35, 36, 65, 75, 76, 79, 123, 127, 159, and 179, in an equivalent amino acid sequence to that set forth in SEQ ID NO:8. In still further embodiments, the protease comprises at least one mutation selected from the group consisting of R14L, R16I, R16L, R16Q, R35F, T36S, G65Q, Y75G, N76L, N76V, R79T, R123L, R123Q, R127A, R127K, R127Q, R159K, R159Q, and R179Q. In yet additional embodiments, the protease comprises a set of mutations selected from the group consisting of the sets R16Q/R35F/R159Q, R16Q/R123L, R14L/R127Q/R159Q, R14L/R179Q, R123L/R127Q/R179Q, R16Q/R79T/R127Q, and R16Q/R79T.
In some particularly preferred embodiments, the protease comprises the following mutations: R123L, R127Q, and R179Q. In some particularly preferred embodiments, the variant serine proteases comprise at least one substitution corresponding to the amino acid positions in SEQ ID NO:8, and wherein the variant proteases have better performance in at least one property selected from the group consisting of keratin hydrolysis, thermostability, casein activity, LAS stability, and cleaning, as compared to wild-type Cellulomonas 69B4 protease. In some embodiments, the variant protease comprises an amino acid sequence selected from the group consisting of SEQ ID NOS:54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 78. In alternative embodiments, the variant protease amino acid sequence is encoded by a polynucleotide sequence selected from the group consisting of SEQ ID NOS:53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, and 77.
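The substitution notation used above (for example, R123L denotes replacement of the arginine at position 123 with leucine, and R16Q/R79T denotes a set of two substitutions) can be illustrated with a minimal Python sketch. The function names and the five-residue toy sequence are hypothetical and serve only to show the notation; the toy sequence is not SEQ ID NO:8.

```python
import re

# Pattern for a single substitution code: wild-type residue, 1-based
# position, replacement residue (e.g., "R123L").
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(code: str) -> tuple[str, int, str]:
    """Split a code such as 'R123L' into ('R', 123, 'L')."""
    match = MUTATION_RE.match(code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_mutations(sequence: str, mutation_set: str) -> str:
    """Apply a slash-separated set such as 'R16Q/R79T' to a sequence,
    checking that each position holds the expected wild-type residue."""
    residues = list(sequence)
    for code in mutation_set.split("/"):
        wild_type, position, replacement = parse_mutation(code)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        residues[position - 1] = replacement
    return "".join(residues)


# Toy five-residue sequence, NOT the sequence of SEQ ID NO:8.
toy_sequence = "ARGRK"
print(apply_mutations(toy_sequence, "R2Q/R4L"))
```

Checking the wild-type residue before substituting guards against numbering a variant against the wrong reference sequence.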
The present invention also provides cleaning compositions comprising a cleaning effective amount of a proteolytic enzyme, the enzyme comprising an amino acid sequence having at least 70% sequence identity to SEQ ID NO:4, and a suitable cleaning formulation. In some preferred embodiments, the cleaning compositions further comprise one or more additional enzymes or enzyme derivatives selected from the group consisting of proteases, amylases, lipases, mannanases, pectinases, cutinases, oxidoreductases, hemicellulases, and cellulases.
The present invention also provides compositions comprising at least one serine protease obtained from a member of the Micrococcineae, wherein the compositions further comprise at least one stabilizer. In some embodiments, the stabilizer is selected from the group consisting of borax and glycerol. In some embodiments, the present invention provides competitive inhibitors suitable to stabilize the enzyme of the present invention to anionic surfactants. In some embodiments, at least one protease is obtained from an organism selected from the group consisting of Cellulomonas, Oerskovia, Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In some preferred embodiments, the protease is obtained from Cellulomonas 69B4. In some particularly preferred embodiments, at least one protease comprises the amino acid sequence set forth in SEQ ID NO:8.
The present invention further provides compositions comprising at least one serine protease obtained from a member of the Micrococcineae, wherein the serine protease is an autolytically stable variant. In some embodiments, at least one variant protease is obtained from an organism selected from the group consisting of Cellulomonas, Oerskovia, Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In some preferred embodiments, the variant protease is obtained from Cellulomonas 69B4. In some particularly preferred embodiments, at least one variant protease comprises the amino acid sequence set forth in SEQ ID NO:8.
The present invention also provides cleaning compositions comprising at least 0.0001 weight percent of the serine protease of the present invention, and optionally, an adjunct ingredient. In some embodiments, the composition comprises an adjunct ingredient. In some preferred embodiments, the composition comprises a sufficient amount of a pH modifier to provide the composition with a neat pH of from about 3 to about 5, the composition being essentially free of materials that hydrolyze at a pH of from about 3 to about 5. In some particularly preferred embodiments, the materials that hydrolyze comprise a surfactant material. In additional embodiments, the cleaning composition is a liquid composition. In further embodiments, the surfactant material comprises a sodium alkyl sulfate surfactant that comprises an ethylene oxide moiety.
The present invention additionally provides cleaning compositions that comprise at least one acid stable enzyme, the cleaning composition comprising a sufficient amount of a pH modifier to provide the composition with a neat pH of from about 3 to about 5, the composition being essentially free of materials that hydrolyze at a pH of from about 3 to about 5. In further embodiments, the materials that hydrolyze comprise a surfactant material. In some preferred embodiments, the cleaning composition is a liquid composition. In yet additional embodiments, the surfactant material comprises a sodium alkyl sulfate surfactant that comprises an ethylene oxide moiety. In some embodiments, the cleaning composition comprises a suitable adjunct ingredient. In some preferred embodiments, the composition comprises from about 0.001 to about 0.5 weight percent of ASP.
In some alternatively preferred embodiments, the composition comprises from about 0.01 to about 0.1 weight percent of ASP.
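By way of a worked illustration, the weight-percent figures above translate into enzyme mass per batch by simple arithmetic. The following Python sketch assumes a hypothetical 1000 g batch; the function names are illustrative only.

```python
def enzyme_mass_g(batch_mass_g: float, weight_percent: float) -> float:
    """Mass of enzyme (in grams) in a batch at a given weight percent."""
    return batch_mass_g * weight_percent / 100.0


def within_range(weight_percent: float, low: float, high: float) -> bool:
    """True if the enzyme level lies within an inclusive weight-percent range."""
    return low <= weight_percent <= high


batch_g = 1000.0  # hypothetical batch size
for level in (0.0001, 0.001, 0.01, 0.1, 0.5):
    print(f"{level} wt% of {batch_g:g} g -> {enzyme_mass_g(batch_g, level):g} g")

# A level of 0.05 weight percent falls inside both stated ranges.
print(within_range(0.05, 0.001, 0.5), within_range(0.05, 0.01, 0.1))
```

For the hypothetical 1000 g batch, the stated range of about 0.01 to about 0.1 weight percent corresponds to about 0.1 g to about 1 g of ASP.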
The present invention also provides methods of cleaning, comprising the steps of: a) contacting a surface and/or an article comprising a fabric with the cleaning composition comprising the serine protease of the present invention at an appropriate concentration; and b) optionally washing and/or rinsing the surface or material. In alternative embodiments, any suitable composition provided herein finds use in these methods.
The present invention also provides animal feed comprising at least one serine protease obtained from a member of the Micrococcineae. In some embodiments, at least one protease is obtained from an organism selected from the group consisting of Cellulomonas, Oerskovia, Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In some preferred embodiments, the protease is obtained from Cellulomonas 69B4. In some particularly preferred embodiments, at least one protease comprises the amino acid sequence set forth in SEQ ID NO:8.
The present invention provides an isolated polypeptide having proteolytic activity (e.g., a protease) having the amino acid sequence set forth in SEQ ID NO:8. In some embodiments, the present invention provides isolated polypeptides having approximately 40% to 98% identity with the sequence set forth in SEQ ID NO:8. In some preferred embodiments, the polypeptides have approximately 50% to 95% identity with the sequence set forth in SEQ ID NO:8. In some additional preferred embodiments, the polypeptides have approximately 60% to 90% identity with the sequence set forth in SEQ ID NO:8. In yet additional embodiments, the polypeptides have approximately 65% to 85% identity with the sequence set forth in SEQ ID NO:8. In some particularly preferred embodiments, the polypeptides have approximately 90% to 95% identity with the sequence set forth in SEQ ID NO:8.
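For illustration, percent identity between two sequences that have already been aligned can be computed as in the following Python sketch. The sketch assumes one common convention (identical residues divided by aligned columns, ignoring columns that are gaps in both sequences); other conventions use a different denominator, and the present text does not fix a particular one here. The function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a pairwise alignment of equal length.

    Assumed convention: matches / aligned columns, where columns that
    are gaps in both sequences are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def in_identity_range(identity: float, low: float, high: float) -> bool:
    """True if identity lies within an inclusive range such as 60% to 90%."""
    return low <= identity <= high


# Toy aligned fragments, NOT taken from SEQ ID NO:8.
identity = percent_identity("AC-EFGHK", "ACDEFGHR")
print(identity, in_identity_range(identity, 60.0, 90.0))
```

The alignment itself would normally come from a dedicated alignment program; this sketch covers only the final counting step.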
The present invention further provides proteases obtained from bacteria of the suborder Micrococcineae. In some preferred embodiments, the proteases are obtained from members of the family Promicromonosporaceae. In yet further embodiments, the proteases are obtained from any member of the genera Xylanimicrobium, Xylanibacterium, Xylanimonas, Myceligenerans, and Promicromonospora. In some preferred embodiments, the proteases are obtained from members of the family Cellulomonadaceae. In some particularly preferred embodiments, the proteases are obtained from members of the genera Cellulomonas and Oerskovia. In some further preferred embodiments, the proteases are derived from Cellulomonas spp. In some embodiments, the Cellulomonas spp. is selected from Cellulomonas fimi, Cellulomonas biazotea, Cellulomonas cellasea, Cellulomonas hominis, Cellulomonas flavigena, Cellulomonas persica, Cellulomonas iranensis, Cellulomonas gelida, Cellulomonas humilata, Cellulomonas turbata, Cellulomonas uda, Cellulomonas fermentans, Cellulomonas xylanilytica, Cellulomonas humilata and Cellulomonas strain 69B4 (DSM 16035).
In alternative embodiments, the proteases are derived from Oerskovia spp. In some preferred embodiments, the Oerskovia spp. is selected from Oerskovia jenensis, Oerskovia paurometabola, Oerskovia enterophila, Oerskovia turbata and Oerskovia turbata strain DSM 20577.
In some embodiments, the proteases have apparent molecular weights of about 17 kD to 21 kD as determined by a matrix assisted laser desorption/ionization - time of flight ("MALDI-TOF") spectrophotometer.
The present invention further provides isolated polynucleotides that encode proteases comprising an amino acid sequence having at least 40% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 50% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 60% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 70% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 80% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 90% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 95% amino acid sequence identity to SEQ ID NO:8. The present invention also provides expression vectors comprising any of the polynucleotides provided above.
The present invention further provides host cells transformed with the expression vectors of the present invention, such that at least one protease is expressed by the host cells. In some embodiments, the host cells are bacteria, while in other embodiments, the host cells are fungi. In some preferred embodiments, the bacterial host cells are selected from the group consisting of the genera Bacillus and Streptomyces. In some alternative preferred embodiments, the fungal host cells are members of the genus Trichoderma, while in other alternative preferred embodiments, the fungal host cells are members of the genus Aspergillus.
The present invention also provides isolated polynucleotides comprising a nucleotide sequence (i) having at least 70% identity to SEQ ID NOS:3 or 4, or (ii) being capable of hybridizing to a probe derived from the nucleotide sequence disclosed in SEQ ID NOS:3 or 4, under conditions of medium to high stringency, or (iii) being complementary to the nucleotide sequence disclosed in SEQ ID NOS:3 or 4. In some embodiments, the present invention provides vectors comprising such polynucleotides. In further embodiments, the present invention provides host cells transformed with such vectors.
The present invention further provides methods for producing at least one enzyme having protease activity, comprising the steps of: transforming a host cell with an expression vector comprising a polynucleotide comprising at least 70% sequence identity to SEQ ID NO:4; cultivating the transformed host cell under conditions suitable for the host cell to produce the protease; and recovering the protease. In some preferred embodiments, the host cell is a Streptomyces spp., while in other embodiments, the host cell is a Bacillus spp., a Trichoderma spp., and/or an Aspergillus spp. In some embodiments, the Streptomyces spp. is Streptomyces lividans. In alternative embodiments, the host cell is T. reesei. In further embodiments, the Aspergillus spp. is A. niger.
The present invention also provides fragments (i.e., portions) of the DNA encoding the proteases provided herein. These fragments find use in obtaining partial length DNA fragments capable of being used to isolate or identify polynucleotides encoding the mature protease enzyme described herein from Cellulomonas 69B4, or a segment thereof having proteolytic activity. In some embodiments, portions of the DNA provided in SEQ ID NO:1 find use in obtaining homologous fragments of DNA from other species, and particularly from Micrococcineae spp., which encode a protease or portion thereof having proteolytic activity.
The present invention further provides at least one probe comprising a polynucleotide substantially identical to a fragment of SEQ ID NOS:1, 2, 3 or 4, wherein the probe is used to detect a nucleic acid sequence coding for an enzyme having proteolytic activity, and wherein the nucleic acid sequence is obtained from a bacterial source. In some embodiments, the bacterial source is a Cellulomonas spp. In some preferred embodiments, the bacterial source is Cellulomonas strain 69B4.
The present invention further provides compositions comprising at least one of the proteases provided herein. In some preferred embodiments, the compositions are cleaning compositions. In some embodiments, the present invention provides cleaning compositions comprising a cleaning effective amount of at least one protease comprising an amino acid sequence having at least 40% sequence identity to SEQ ID NO:8, at least 90% sequence identity to SEQ ID NO:8, and/or having an amino acid sequence of SEQ ID NO:8. In some embodiments, the cleaning compositions further comprise at least one suitable cleaning adjunct. In some embodiments, the protease is derived from a Cellulomonas spp. In some preferred embodiments, the Cellulomonas spp. is selected from Cellulomonas fimi, Cellulomonas biazotea, Cellulomonas cellasea, Cellulomonas hominis, Cellulomonas flavigena, Cellulomonas persica, Cellulomonas iranensis, Cellulomonas gelida, Cellulomonas humilata, Cellulomonas turbata, Cellulomonas uda, and Cellulomonas strain 69B4 (DSM 16035). In some particularly preferred embodiments, the Cellulomonas spp. is Cellulomonas strain 69B4. In still further embodiments, the cleaning composition further comprises at least one additional enzyme or enzyme derivative selected from the group consisting of protease, amylase, lipase, mannanase and cellulase.
The present invention also provides isolated naturally occurring proteases comprising an amino acid sequence having at least 45% sequence identity to SEQ ID NO:8, at least 60% sequence identity to SEQ ID NO:8, at least 75% sequence identity to SEQ ID NO:8, at least 90% sequence identity to SEQ ID NO:8, at least 95% sequence identity to SEQ ID NO:8, and/or having the sequence of SEQ ID NO:8, the protease being isolated from a Cellulomonas spp. In some embodiments, the protease is isolated from Cellulomonas strain 69B4 (DSM 16035).

In additional embodiments, the present invention provides engineered variants of the serine proteases of the present invention. In some embodiments, the engineered variants are genetically modified using recombinant DNA technologies, while in other embodiments, the variants are naturally occurring. The present invention further encompasses engineered variants of homologous enzymes. In some embodiments, the engineered variant homologous proteases are genetically modified using recombinant DNA technologies, while in other embodiments, the variant homologous proteases are naturally occurring.
The present invention also provides serine proteases that immunologically cross-react with the Cellulomonas 69B4 protease (i.e., ASP) of the present invention. Indeed, it is intended that the present invention encompass fragments (e.g., epitopes) of the ASP protease that stimulate an immune response in animals (including, but not limited to humans) and/or are recognized by antibodies of any class. The present invention further encompasses epitopes on proteases that are cross-reactive with ASP epitopes. In some embodiments, the ASP epitopes are recognized by antibodies, but do not stimulate an immune response in animals (including, but not limited to humans), while in other embodiments, the ASP epitopes stimulate an immune response in at least one animal species (including, but not limited to humans) and are recognized by antibodies of any class. The present invention also provides means and compositions for identifying and assessing cross-reactive epitopes.
The present invention further provides at least one polynucleotide encoding a signal peptide (i) having at least 70% sequence identity to SEQ ID NO:9, or (ii) being capable of hybridizing to a probe derived from the polypeptide sequence encoding SEQ ID NO:9, under conditions of medium to high stringency, or (iii) being complementary to the polypeptide sequence provided in SEQ ID NO:9. In further embodiments, the present invention provides vectors comprising the polynucleotide described above. In yet additional embodiments, a host cell is provided that is transformed with the vector.
The present invention also provides methods for producing proteases, comprising: (a) transforming a host cell with an expression vector comprising a polynucleotide having at least 70% sequence identity to SEQ ID NO:4, at least 95% sequence identity to SEQ ID NO:4, and/or having a polynucleotide sequence of SEQ ID NO:4; (b) cultivating the transformed host cell under conditions suitable for the host cell to produce the protease; and (c) recovering the protease. In some embodiments, the host cell is a Bacillus species (e.g., B. subtilis, B. clausii, or B. licheniformis). In alternative embodiments, the host cell is a Streptomyces spp. (e.g., Streptomyces lividans). In additional embodiments, the host cell is a Trichoderma spp. (e.g., Trichoderma reesei). In yet further embodiments, the host cell is an Aspergillus spp. (e.g., Aspergillus niger).
As will be appreciated, an advantage of the present invention is that a polynucleotide has been isolated which provides the capability of isolating further polynucleotides which encode proteins having serine protease activity, wherein the backbone is substantially identical to that of the Cellulomonas protease of the present invention.
In further embodiments, the present invention provides means to produce host cells that are capable of producing the serine proteases of the present invention in relatively large quantities. In particularly preferred embodiments, the present invention provides means to produce protease with various commercial applications where degradation or synthesis of polypeptides is desired, including cleaning compositions, as well as feed components, textile processing, leather finishing, grain processing, meat processing, preparation of protein hydrolysates, digestive aids, microbicidal compositions, bacteriostatic compositions, fungistatic compositions, and personal care products, including oral care, hair care, and/or skin care.
The present invention further provides enzyme compositions that have comparable or improved wash performance, as compared to presently used subtilisin proteases. Other objects and advantages of the present invention are apparent from the present Specification.
The present invention provides an isolated polypeptide having proteolytic activity, (e.g., a protease) having the amino acid sequence set forth in SEQ ID NO:8. In some embodiments, the present invention provides isolated polypeptides having approximately 40% to 98% identity with the sequence set forth in SEQ ID NO:8. In some preferred embodiments, the polypeptides have approximately 50% to 95% identity with the sequence set forth in SEQ ID NO:8. In some additional preferred embodiments, the polypeptides have approximately 60% to 90% identity with the sequence set forth in SEQ ID NO:8. In yet additional embodiments, the polypeptides have approximately 65% to 85% identity with the sequence set forth in SEQ ID NO:8. In some particularly preferred embodiments, the polypeptides have approximately 90% to 95% identity with the sequence set forth in SEQ ID NO:8.
The present invention further provides proteases obtained from bacteria of the suborder Micrococcineae. In some preferred embodiments, the proteases are obtained from members of the family Promicromonosporaceae. In yet further embodiments, the proteases are obtained from any member of the genera Xylanimicrobium, Xylan/bacterium,

Xylanimonas, Myceligenerans, and Promicromonospora. In some preferred embodiments, the proteases are obtained from members of the family Cellulomonadaceae. In some particularly preferred embodiments, the proteases are obtained from members of the genera Cellulomonas and Oerskovia. In some further preferred embodiments, the proteases are derived from Cellulomonas spp. In some embodiments, the Cellulomonas spp. is selected from Cellulomonas fimi, Cellulomonas biazotea, Cellulomonas cellasea, Cellulomonas hominis, Cellulomonas flavigena, Cellulomonas persica, Cellulomonas iranensis, Cellulomonas gelida, Cellulomonas humilata, Cellulomonas turbata, Cellulomonas uda, Cellulomonas fermentans, Cellulomonas xylanilytica, Cellulomonas humilata and Cellulomonas strain 69B4 (DSM 16035).
In alternative embodiments, the proteases are derived from Oerskovia spp. In some preferred embodiments, the Oerskovia spp. is selected from Oerskovia jenensis, Oerskovia paurometabola, Oerskovia enterophila, Oerskovia turbata and Oerskovia turbata strain DSM 20577.
In some embodiments, the proteases have apparent molecular weights of about 17kD to 21 kD as determined by a matrix assisted laser desorption/ionizaton - time of flight ("MALDI-TOF") spectrophotometer.
The present invention further provides isolated polynucleotides that encode proteases comprise an amino acid sequence comprising at least 40% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 50% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 60% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 70% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 80% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 90% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 95% amino acid sequence identity to SEQ ID NO:8. The present invention also provides expression vectors comprising any of the polynucleotides provided above.
The present invention further provides host cells transformed with the expression vectors of the present invention, such that at least one protease is expressed by the host cells. In some embodiments, the host cells are bacteria, while in other embodiments, the host cells are fungi. In some preferred embodiments, the bacterial host cells are selected from the group consisting of the genera Bacillus and Streptomyces. In some alternative preferred embodiments, the fungal host cells are members of the genus Trichoderma, while in other alternative preferred embodiments, the fungal host cells are members of the genus

Aspergillus.
The present invention also provides isolated polynucleotides comprising a nucleotide sequence (i) having at least 70% identity to SEQ ID NOS:3 or 4, or (ii) being capable of hybridizing to a probe derived from the nucleotide sequence disclosed in SEQ ID NOS: 3 or 4, under conditions of medium to high stringency, or (iii) being complementary to the nucleotide sequence disclosed in SEQ ID NOS:3 or 4. .In some embodiments, the present invention provides vectors comprising such polynucleotide. In further embodiments, the present invention provides host cells transformed with such vector.
The present invention further provides methods for producing at least one enzyme having protease activity, comprising: the steps of transforming a host cell with an expression vector comprising a polynucleotide comprising at least 70% sequence identity to SEQ ID NO:4, cultivating the transformed host cell under conditions suitable for the host cell to produce the protease; and recovering the protease. In some preferred embodiments, the host cell is a Streptomyces spp, while in other embodiments, the host cell is a Bacillus spp,, a Trichoderma spp., and/or a Aspergillus spp. In some embodiments, the Streptomyces spp. is Streptomyces lividans. In alternative embodiments, the host cell is T. reesei. In further embodiments, the Aspergillus spp. is A. niger.
The present invention also provides fragments (i.e., portions) of the DNA encoding the proteases provided herein; These fragments find use in obtaining partial length DNA fragments capable of being used to isolate or identify polynucleotides encoding mature protease enzyme described herein from Cellulomonas 69B4, or a segment thereof having proteolytic activity. In some embodiments, portions of the DNA provided in SEQ ID NO:1 find use in obtaining homologous fragments of DNA from other species, and particularly from Micrococcineae spp. which encode a protease or portion thereof having proteolytic activity.
The present invention further provides at least one probe comprising a polynucleotide substantially identical to a fragment of SEQ ID NOS:1, 2, 3 or 4, wherein the probe is used to detect a nucleic acid sequence coding for an enzyme having proteolytic activity, and wherein the nucleic acid sequence is obtained from a bacterial source. In some embodiments, the bacterial source is a Cellulomonas spp. In some preferred embodiments, the bacterial source is Cellulomonas strain 69B4.
The present invention further provides compositions comprising at least one of the proteases provided herein. In some preferred embodiments, the compositions are cleaning compositions. In some embodiments, the present invention provides cleaning compositions comprising a cleaning effective amount of at least one protease comprising an amino acid

sequence having at least 40% sequence identity to SEQ ID NO:8, at least 90% sequence identity to SEQ ID NO:8, and/or having an amino acid sequence of SEQ ID NO:8. In some embodiments, the cleaning compositions further comprise at least one suitable cleaning adjunct. In some embodiments, the protease is derived from a Cellulomonas sp. In some preferred embodiments, the Cellulomonas spp. is selected from Cellulomonas fimi, Cellulomonas biazotea, Cellulomonas cellasea, Cellulomonas hominis, Cellulomonas flavigena, Cellulomonas persica, Cellulomonas iranensis, Cellulomonas gelida, Cellulomonas humilata, Cellulomonas turbata, Cellulomonas uda, and Cellulomonas strain 69B4 (DSM 16035). In some particularly preferred embodiments, the Cellulomonas spp is Cellulomonas. strain 69B4. In still further embodiments, the cleaning composition further comprises at least one additional enzymes or enzyme derivatives selected from the group consisting of protease, amylase, lipase, mannanase and cellulase.
The present invention also provides isolated naturally occurring proteases comprising an amino acid sequence having at least 45% sequence identity to SEQ ID NO:8, at least 60% sequence identity to SEQ ID NO:8, at least 75% sequence identity to SEQ ID NO:8, at least 90% sequence identity to SEQ ID NO:8, at least 95% sequence identity to SEQ ID NO:8, and/or having the sequence identity of SEQ ID NO:8, the protease being isolated from a Cellulomonas spp.. In some embodiments, the protease is isolated from Cellulomonas strain 69B4 (DSM 16035).
In additional embodiments, the present invention provides engineered variants of the serine proteases of the present invention. In some embodiments, the engineered variants are genetically modified using recombinant DNA technologies, while in other embodiments, the variants are naturally occurring. The present invention further encompasses engineered variants of homologous enzymes. In some embodiments, the engineered variant homologous proteases are genetically modified using recombinant DNA technologies, while in other embodiments, the variant homologous proteases are naturally occurring.
The present invention also provides serine proteases that immunologically cross-react with the ASP protease of the present invention. Indeed, it is intended that the present invention encompass fragments (e.g., epitopes) of the ASP protease that stimulate an immune response in animals (including, but not limited to humans) and/or are recognized by antibodies of any class. The present invention further encompasses epitopes on proteases that are cross-reactive with ASP epitopes. In some embodiments, the ASP epitopes are recognized by antibodies, but do not stimulate an immune response in animals (including, but not limited to humans), while in other embodiments, the ASP epitopes stimulate an immune response in at least one animal species (including, but not limited to humans) and are recognized by antibodies of any class. The present invention also provides means and compositions for identifying and assessing cross-reactive epitopes.
The present invention further provides at least one polynucleotide encoding a signal peptide (i) having at least 70% sequence identity to SEQ ID NO:9, or (ii) being capable of hybridizing to a probe derived from the polypeptide sequence encoding SEQ ID NO:9, under conditions of medium to high stringency, or (iii) being complementary to the polypeptide sequence provided in SEQ ID NO:9. In further embodiments, the present invention provides vectors comprising the polynucleotide described above. In yet additional embodiments, a host cell is provided that is transformed with the vector.
The present invention also provides methods for producing proteases, comprising: (a) transforming a host cell with an expression vector comprising a polynucleotide having at least 70% sequence identity to SEQ ID NO:4, at least 95% sequence identity to SEQ ID NO:4, and/or having a polynucleotide sequence of SEQ ID NO:4; (b) cultivating the transformed host cell under conditions suitable for the host cell to produce the protease; and (c) recovering the protease. In some embodiments, the host cell is a Bacillus species (e.g., B. subtilis, B. clausii, or B. licheniformis). In alternative embodiments, the host cell is a Streptomyces spp. (e.g., Streptomyces lividans). In additional embodiments, the host cell is a Trichoderma spp. (e.g., Trichoderma reesei). In yet further embodiments, the host cell is an Aspergillus spp. (e.g., Aspergillus niger).
As will be appreciated, an advantage of the present invention is that a polynucleotide has been isolated which provides the capability of isolating further polynucleotides which encode proteins having serine protease activity, wherein the backbone is substantially identical to that of the Cellulomonas protease of the invention.
In further embodiments, the present invention provides means to produce host cells that are capable of producing the serine proteases of the present invention in relatively large quantities. In particularly preferred embodiments, the present invention provides means to produce protease with various commercial applications where degradation or synthesis of polypeptides is desired, including cleaning compositions, as well as feed components, textile processing, leather finishing, grain processing, meat processing, cleaning, preparation of protein hydrolysates, digestive aids, microbicidal compositions, bacteriostatic compositions, fungistatic compositions, personal care products, including oral care, hair care, and/or skin care.
The present invention further provides enzyme compositions that have comparable or improved wash performance, as compared to presently used subtilisin proteases. Other objects and advantages of the present invention are apparent from the present Specification.
DESCRIPTION OF THE FIGURES
Figure 1 provides an unrooted phylogenetic tree illustrating the relationship of novel strain 69B4 to members of the family Cellulomonadaceae and other related genera of the suborder Micrococcineae.
Figure 2 provides a phylogenetic tree for ASP protease.
Figure 3 provides a MALDI TOF spectrum of a protease derived from Cellulomonas strain 69B4.
Figure 4 shows the sequence of the N-terminal most tryptic peptide from C. flavigena.
Figure 5 provides the plasmid map of the pSEGCT vector.
Figure 6 provides the plasmid map of the pSEGCT69B4 vector.
Figure 7 provides the plasmid map of the pSEA469BCT vector.
Figure 8 provides the plasmid map of the pHPLT-Asp-C1-1 vector.
Figure 9 provides the plasmid map of the pHPLT-Asp-C1-2 vector.
Figure 10 provides the plasmid map of the pHPLT-Asp-C2-1 vector.
Figure 11 provides the plasmid map of the pHPLT-Asp-C2-2 vector.
Figure 12 provides the plasmid map of the pHPLT-ASP-III vector.
Figure 13 provides the plasmid map of the pHPLT-ASP-IV vector.
Figure 14 provides the plasmid map of the pHPLT-ASP-VII vector.
Figure 15 provides the plasmid map of the pXX-KpnI vector.
Figure 16 provides the plasmid map of the p2JM103-DNNP1 vector.
Figure 17 provides the plasmid map of the pHPLT vector.
Figure 18 provides the map and MXL-prom sequences for the opened pHPLT-ASP-C1-2.
Figure 19 provides the plasmid map of the pENMxS vector.
Figure 20 provides the plasmid map of the pICatH vector.
Figure 21 provides the plasmid map of the pTREX4 vector.
Figure 22 provides the plasmid map of the pSLGAMpR2 vector.
Figure 23 provides the plasmid map of the pRAXdes2-ASP vector.
Figure 24 provides the plasmid map of the pAPDI vector.
Figure 25 provides graphs showing ASP autolysis. Panel A provides a graph showing the ASP autolysis peptides observed in a buffer without LAS. Panel B provides a graph showing the ASP autolysis peptides observed in a buffer with 0.1% LAS.
Figure 26 compares the cleaning activity (absorbance at 405 nm) dose (ppm) response curves of certain serine proteases (69B4 [-x-]; PURAFECT® [-*-]; RELASE™ [-A-]; and OPTIMASE™ [-•-]) in liquid TIDE® detergent under North American wash conditions.
Figure 27 provides a graph that compares the cleaning activity (absorbance at 405 nm) dose (ppm) response curves of certain serine proteases (69B4 [-x-]; PURAFECT® [-4-]; RELASE™ [-A-]; and OPTIMASE™ [-•-]) in Detergent Composition III powder detergent (0.66 g/l) North American concentration/detergent formulation under Japanese wash conditions.
Figure 28 provides a graph that compares the cleaning activity (absorbance at 405 nm) dose (ppm) response curves of certain serine proteases (69B4 [-x-]; PURAFECT® [-4-]; RELASE™ [-A-]; and OPTIMASE™ [-•-]) in ARIEL® REGULAR detergent powder under European wash conditions.
Figure 29 provides a graph that compares the cleaning activity (absorbance at 405 nm) dose (ppm) response curves of certain serine proteases (69B4 [-x-]; PURAFECT® [-4-]; RELASE™ [-A-]; and OPTIMASE™ [-•-]) in PURE CLEAN detergent powder under Japanese conditions.
Figure 30 provides a graph that compares the cleaning activity (absorbance at 405 nm) dose (ppm) response curves of certain serine proteases (69B4 [-x-]; PURAFECT® [-4-]; RELASE™ [-A-]; and OPTIMASE™ [-•-]) in Detergent Composition III powder (1.00 g/l) under North American conditions.
Figure 31 provides a graph that shows comparative oxidative inactivation of various serine proteases (100 ppm) as a measure of per cent enzyme activity over time (minutes) (69B4 [-x-]; BPN' variant 1 [-4- ]; PURAFECT® [-A-]; and GG36-variant 1 [-•-]) with 0.1 M H2O2 at pH 9.45, 25°C.
Figure 32 provides a graph that shows comparative chelator inactivation of various serine proteases (100 ppm) as a measure of per cent enzyme activity over time (minutes) (69B4 [-x-]; BPN'-variant 1 [-4-]; PURAFECT® [-A-]; and GG36-variant 1 [-•-]) with 10 mM EDTA at pH 8.20, 45°C.
Figure 33 provides a graph that shows comparative thermal inactivation of various serine proteases (100 ppm) as a measure of percent enzyme activity over time (minutes) (69B4 [-x-]; BPN'-variant [-4-]; PURAFECT® [-A-]; and GG36-variant 1 [-•-]) with 50 mM Tris at pH 8.0, 45°C.
Figure 34 provides a graph that shows comparative thermal inactivation of certain serine proteases (69B4 [-x-]; BPN'-variant [-+-]; PURAFECT® [-A-]; and GG36-variant 1 [-•-]) at pH 8.60, over a temperature gradient of 57°C to 62°C.
Figure 35 provides a graph that shows enzyme activity (hydrolysis of di-methyl casein measured by absorbance at 405 nm) of certain serine proteases (2.5 ppm) (69B4 [-•-]; BPN'-variant [-+-]; PURAFECT® [-A-]; and GG36-variant 1 [-•-]) at pHs ranging from 5 to 12 at 37°C.
Figure 36 provides a bar graph that shows enzyme stability as indicated by % remaining activity (hydrolysis of di-methyl casein measured by absorbance at 405 nm) of certain serine proteases (2.5 ppm) (69B4; BPN'-variant; PURAFECT®; and GG36-variant 1) at pHs ranging from 3 (| ), 4 (gg ), 5 ( g ) to 6 ( gg ) at 25°, 35°, and 45°C, respectively.
Figure 37 provides a graph that shows enzyme stability as indicated by % remaining activity of a BPN'-variant at pHs ranging from 3 (-4-), 4 (-•-), 5 (-A-) to 6 (-X-) at 25°, 35°, and 45°C, respectively.
Figure 38 provides a graph that shows enzyme stability as indicated by % remaining activity of PURAFECT® protease at pHs ranging from 3 (-*-), 4 (-•-), 5 (-A-) to 6 (-X-) at 25°, 35°, and 45°C, respectively.
Figure 39 provides a graph that shows enzyme stability as indicated by % remaining activity of 69B4 protease at pHs ranging from 3 (-*-), 4 (-•-), 5 (-A-) to 6 (-X-) at 25°, 35°, and 45°C, respectively.
DESCRIPTION OF THE INVENTION
The present invention provides novel serine proteases, novel genetic material encoding these enzymes, and proteolytic proteins obtained from Micrococcineae spp., including but not limited to Cellulomonas spp. and variant proteins developed therefrom. In particular, the present invention provides protease compositions obtained from a Cellulomonas spp., DNA encoding the protease, vectors comprising the DNA encoding the protease, host cells transformed with the vector DNA, and an enzyme produced by the host cells. The present invention also provides cleaning compositions (e.g., detergent compositions), animal feed compositions, and textile and leather processing compositions comprising protease(s) obtained from a Micrococcineae spp., including but not limited to Cellulomonas spp. In alternative embodiments, the present invention provides mutant (i.e., variant) proteases derived from the wild-type proteases described herein. These mutant proteases also find use in numerous applications.
Gram-positive alkalophilic bacteria have been isolated from in and around alkaline soda lakes (See e.g., U.S. Pat. No. 5,401,657, herein incorporated by reference). These alkalophilic bacteria were analyzed according to the principles of numerical taxonomy with respect to each other and also a collection of known bacteria, and taxonomically characterized. Six natural clusters or phenons of alkalophilic bacteria were generated. Amongst the strains isolated was a strain identified as 69B4.
Cellulomonas spp. are Gram-positive bacteria classified as members of the family Cellulomonadaceae, Suborder Micrococcineae, Order Actinomycetales, Class Actinobacteria. Cellulomonas grows as slender, often irregular rods that may occasionally show branching, but no mycelium is formed. In addition, there is no aerial growth and no spores are formed. Cellulomonas and Streptomyces are only distantly related at a genetic level. The large genetic (genomic) distinction between Cellulomonas and Streptomyces is reflected in a great difference in phenotypic properties. While serine proteases in Streptomyces have been previously examined, there apparently have been no reports of any serine proteases (approx. MW 18,000 to 20,000) secreted by Cellulomonas spp. In addition, there apparently have been no previous reports of Cellulomonas proteases being used in the cleaning and/or feed industry.
Streptomyces are Gram-positive bacteria classified as members of the Family Streptomycetaceae, Suborder Streptomycineae, Order Actinomycetales, class Actinobacteria. Streptomyces grows as an extensively branching primary or substrate mycelium and an abundant aerial mycelium that at maturity bears characteristic spores. Streptogrisins are serine proteases secreted in large amounts from a wide variety of Streptomyces species. The amino acid sequences of Streptomyces proteases have been determined from at least 9 different species of Streptomyces including Streptomyces griseus Streptogrisin C (accession no. P52320); alkaline proteinase (EC 3.4.21.-) from Streptomyces sp. (accession no. PC2053); alkaline serine proteinase I from Streptomyces sp. (accession no. S34672); serine protease from Streptomyces lividans (accession no. CAD4208); putative serine protease from Streptomyces coelicolor A3(2) (accession no. NP_625129); putative serine protease from Streptomyces avermitilis MA-4680 (accession no. NP_822175); serine protease from Streptomyces lividans (accession no. CAD42809); and putative serine protease precursor from Streptomyces coelicolor A3(2) (accession no. NP_628830). A purified native alkaline protease having an apparent molecular weight of 19,000 daltons and isolated from Streptomyces griseus var. alcalophilus, and cleaning compositions comprising this protease, have been described (See e.g., U.S. Patent No. 5,646,028, incorporated herein by reference).
The present invention provides protease enzymes produced by these organisms. Importantly, these enzymes have good stability and proteolytic activity. These enzymes find use in various applications, including but not limited to cleaning compositions, animal feed, textile processing, etc. The present invention also provides means to produce these enzymes. In some preferred embodiments, the proteases of the present invention are in pure or relatively pure form.
The present invention also provides nucleotide sequences which are suitable to produce the proteases of the present invention in recombinant organisms. In some embodiments, recombinant production provides means to produce the proteases in quantities that are commercially viable.
Unless otherwise indicated, the practice of the present invention involves conventional techniques commonly used in molecular biology, microbiology, and recombinant DNA, which are within the skill of the art. Such techniques are known to those of skill in the art and are described in numerous texts and reference works (See e.g., Sambrook et al., "Molecular Cloning: A Laboratory Manual", Second Edition (Cold Spring Harbor), [1989]; and Ausubel et al., "Current Protocols in Molecular Biology" [1987]). All patents, patent applications, articles and publications mentioned herein, both supra and infra, are hereby expressly incorporated herein by reference.
Unless defined otherwise herein, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. For example, Singleton and Sainsbury, Dictionary of Microbiology and Molecular Biology, 2d Ed., John Wiley and Sons, NY (1994); and Hale and Marham, The Harper Collins Dictionary of Biology, Harper Perennial, NY (1991) provide those of skill in the art with general dictionaries of many of the terms used in the invention. Although any methods and materials similar or equivalent to those described herein find use in the practice of the present invention, the preferred methods and materials are described herein. Accordingly, the terms defined immediately below are more fully described by reference to the Specification as a whole. Also, as used herein, the singular "a", "an" and "the" include the plural reference unless the context clearly indicates otherwise. Numeric ranges are inclusive of the numbers defining the range. Unless otherwise indicated, nucleic acids are written left to right in 5' to 3' orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively. It is to be understood that this invention is not limited to the particular methodology, protocols, and reagents described, as these may vary, depending upon the context in which they are used by those of skill in the art.

The practice of the present invention employs, unless otherwise indicated, conventional techniques of protein purification, molecular biology, microbiology, recombinant DNA techniques and protein sequencing, all of which are within the skill of those in the art.
Furthermore, the headings provided herein are not limitations of the various aspects or embodiments of the invention which can be had by reference to the specification as a whole. Accordingly, the terms defined immediately below are more fully defined by reference to the specification as a whole. Nonetheless, in order to facilitate understanding of the invention, a number of terms are defined below.
I. Definitions
As used herein, the terms "protease," and "proteolytic activity" refer to a protein or peptide exhibiting the ability to hydrolyze peptides or substrates having peptide linkages. Many well known procedures exist for measuring proteolytic activity (Kalisz, "Microbial Proteinases," In: Fiechter (ed.), Advances in Biochemical Engineering/Biotechnology. [1988]). For example, proteolytic activity may be ascertained by comparative assays which analyze the respective protease's ability to hydrolyze a commercial substrate. Exemplary substrates useful in such analysis of protease or proteolytic activity include, but are not limited to, di-methyl casein (Sigma C-9801), bovine collagen (Sigma C-9879), bovine elastin (Sigma E-1625), and bovine keratin (ICN Biomedical 902111). Colorimetric assays utilizing these substrates are well known in the art (See e.g., WO 99/34011; and U.S. Pat. No. 6,376,450, both of which are incorporated herein by reference). The pNA assay (See e.g., Del Mar et al., Anal. Biochem., 99:316-320 [1979]) also finds use in determining the active enzyme concentration for fractions collected during gradient elution. This assay measures the rate at which p-nitroaniline is released as the enzyme hydrolyzes the soluble synthetic substrate, succinyl-alanine-alanine-proline-phenylalanine-p-nitroanilide (sAAPF-pNA). The rate of production of yellow color from the hydrolysis reaction is measured at 410 nm on a spectrophotometer and is proportional to the active enzyme concentration. In addition, absorbance measurements at 280 nm can be used to determine the total protein concentration. The active enzyme/total-protein ratio gives the enzyme purity.
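By way of illustration only, the purity calculation described for the pNA assay can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the calibration factor relating the 410 nm rate to active enzyme concentration, and the 280 nm absorbance coefficient are hypothetical placeholders that would be determined experimentally; none of these values is taken from the present Specification.

```python
def absorbance_rate(times_min, a410):
    """Least-squares slope of A410 versus time (absorbance units per minute)."""
    n = len(times_min)
    if n < 2 or n != len(a410):
        raise ValueError("need at least two paired time/absorbance readings")
    mean_t = sum(times_min) / n
    mean_a = sum(a410) / n
    num = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, a410))
    den = sum((t - mean_t) ** 2 for t in times_min)
    return num / den


def active_enzyme_conc(rate_per_min, calibration_mg_ml_per_rate):
    """Active enzyme (mg/ml), assuming the rate is proportional to concentration.

    calibration_mg_ml_per_rate is a hypothetical, experimentally determined factor.
    """
    return rate_per_min * calibration_mg_ml_per_rate


def total_protein_conc(a280, a280_per_mg_ml):
    """Total protein (mg/ml) from A280, using an assumed absorbance coefficient."""
    return a280 / a280_per_mg_ml


def enzyme_purity(active_mg_ml, total_mg_ml):
    """Active enzyme / total protein ratio."""
    return active_mg_ml / total_mg_ml


# Example with made-up readings for a single elution fraction.
rate = absorbance_rate([0, 1, 2, 3], [0.10, 0.15, 0.20, 0.25])
active = active_enzyme_conc(rate, calibration_mg_ml_per_rate=4.0)
total = total_protein_conc(a280=0.5, a280_per_mg_ml=1.0)
purity = enzyme_purity(active, total)
```

Fitting a slope over several readings, rather than using two time points, is a design choice made here to reduce the effect of noise in any single reading.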
As used herein, the terms "ASP protease," "Asp protease," and "Asp," refer to the serine proteases described herein. In some preferred embodiments, the Asp protease is the protease designated herein as 69B4 protease obtained from Cellulomonas strain 69B4. Thus, in preferred embodiments, the term "69B4 protease" refers to a naturally occurring mature protease derived from Cellulomonas strain 69B4 (DSM 16035) having substantially identical amino acid sequences as provided in SEQ ID NO:8. In alternative embodiments, the present invention provides portions of the ASP protease.
The term "Cellulomonas protease homologues" refers to naturally occurring proteases having substantially identical amino acid sequences to the mature protease derived from Cellulomonas strain 69B4 or polynucleotide sequences which encode for such naturally occurring proteases, and which proteases retain the functional characteristics of a serine protease encoded by such nucleic acids. In some embodiments, these protease homologues are referred to as "cellulomonadins."
As used herein, the terms "protease variant," "ASP variant," "ASP protease variant," and "69B4 protease variant" are used in reference to proteases that are similar to the wild-type ASP, particularly in their function, but have mutations in their amino acid sequence that make them different in sequence from the wild-type protease.
As used herein, "Cellulomonas spp." refers to all of the species within the genus "Cellulomonas," which are Gram-positive bacteria classified as members of the Family Cellulomonadaceae, Suborder Micrococcineae, Order Actinomycetales, Class Actinobacteria. It is recognized that the genus Cellulomonas continues to undergo taxonomical reorganization. Thus, it is intended that the genus include species that have been reclassified.
As used herein, "Streptomyces spp." refers to all of the species within the genus "Streptomyces," which are Gram-positive bacteria classified as members of the Family Streptomycetaceae, Suborder Streptomycineae, Order Actinomycetales, class Actinobacteria. It is recognized that the genus Streptomyces continues to undergo taxonomical reorganization. Thus, it is intended that the genus include species that have been reclassified.
As used herein, "the genus Bacillus" includes all species within the genus "Bacillus," as known to those of skill in the art, including but not limited to B. subtilis, B. licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. clausii, B. halodurans, B. megaterium, B. coagulans, B. circulans, B. lautus, and B. thuringiensis. It is recognized that the genus Bacillus continues to undergo taxonomical reorganization. Thus, it is intended that the genus include species that have been reclassified, including but not limited to such organisms as B. stearothermophilus, which is now named "Geobacillus stearothermophilus." The production of resistant endospores in the presence of oxygen is considered the defining feature of the genus Bacillus, although this characteristic also applies to the recently named Alicyclobacillus, Amphibacillus, Aneurinibacillus, Anoxybacillus, Brevibacillus, Filobacillus, Gracilibacillus, Halobacillus, Paenibacillus, Salibacillus, Thermobacillus, Ureibacillus, and Virgibacillus.

The terms "polynucleotide" and "nucleic acid", used interchangeably herein, refer to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides. These terms include, but are not limited to, a single-, double- or triple-stranded DNA, genomic DNA, cDNA, RNA, DNA-RNA hybrid, or a polymer comprising purine and pyrimidine bases, or other natural, chemically, biochemically modified, non-natural or derivatized nucleotide bases. The following are non-limiting examples of polynucleotides: genes, gene fragments, chromosomal fragments, ESTs, exons, introns, mRNA, tRNA, rRNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. In some embodiments, polynucleotides comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs, uracil, other sugars and linking groups such as fluororibose and thioate, and nucleotide branches. In alternative embodiments, the sequence of nucleotides is interrupted by non-nucleotide components.
As used herein, the terms "DNA construct" and "transforming DNA" are used interchangeably to refer to DNA used to introduce sequences into a host cell or organism. The DNA may be generated in vitro by PCR or any other suitable technique(s) known to those in the art. In particularly preferred embodiments, the DNA construct comprises a sequence of interest (e.g., as an incoming sequence). In some embodiments, the sequence is operably linked to additional elements such as control elements (e.g., promoters, etc.). The DNA construct may further comprise a selectable marker. It may further comprise an incoming sequence flanked by homology boxes. In a further embodiment, the transforming DNA comprises other non-homologous sequences, added to the ends (e.g., stuffer sequences or flanks). In some embodiments, the ends of the incoming sequence are closed such that the transforming DNA forms a closed circle. The transforming sequences may be wild-type, mutant or modified. In some embodiments, the DNA construct comprises sequences homologous to the host cell chromosome. In other embodiments, the DNA construct comprises non-homologous sequences. Once the DNA construct is assembled in vitro it may be used to: 1) insert heterologous sequences into a desired target sequence of a host cell; 2) mutagenize a region of the host cell chromosome (i.e., replace an endogenous sequence with a heterologous sequence); 3) delete target genes; and/or 4) introduce a replicating plasmid into the host.
As used herein, the terms "expression cassette" and "expression vector" refer to nucleic acid constructs generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a target cell.

The recombinant expression cassette can be incorporated into a plasmid, chromosome, mitochondrial DNA, plastid DNA, virus, or nucleic acid fragment. Typically, the recombinant expression cassette portion of an expression vector includes, among other sequences, a nucleic acid sequence to be transcribed and a promoter. In preferred embodiments, expression vectors have the ability to incorporate and express heterologous DNA fragments in a host cell. Many prokaryotic and eukaryotic expression vectors are commercially available. The term "expression cassette" is used interchangeably herein with "DNA construct," and their grammatical equivalents. Selection of appropriate expression vectors is within the knowledge of those of skill in the art.
As used herein, the term "vector" refers to a polynucleotide construct designed to introduce nucleic acids into one or more cell types. Vectors include cloning vectors, expression vectors, shuttle vectors, plasmids, cassettes and the like. In some embodiments, the polynucleotide construct comprises a DNA sequence encoding the protease (e.g., precursor or mature protease) that is operably linked to a suitable prosequence (e.g., secretory, etc.) capable of effecting the expression of the DNA in a suitable host.
As used herein, the term "plasmid" refers to a circular double-stranded (ds) DNA construct used as a cloning vector, and which forms an extrachromosomal self-replicating genetic element in some eukaryotes or prokaryotes, or integrates into the host chromosome.
As used herein in the context of introducing a nucleic acid sequence into a cell, the term "introduced" refers to any method suitable for transferring the nucleic acid sequence into the cell. Such methods for introduction include but are not limited to protoplast fusion, transfection, transformation, conjugation, and transduction (See e.g., Ferrari et al., "Genetics," in Harwood et al. (eds.), Bacillus, Plenum Publishing Corp., pages 57-72, [1989]).
As used herein, the terms "transformed" and "stably transformed" refer to a cell that has a non-native (heterologous) polynucleotide sequence integrated into its genome or as an episomal plasmid that is maintained for at least two generations.
As used herein, the term "selectable marker-encoding nucleotide sequence" refers to a nucleotide sequence which is capable of expression in the host cells and where expression of the selectable marker confers to cells containing the expressed gene the ability to grow in the presence of a corresponding selective agent or lack of an essential nutrient.

As used herein, the terms "selectable marker" and "selective marker" refer to a nucleic acid (e.g., a gene) capable of expression in a host cell which allows for ease of selection of those hosts containing the vector. Examples of such selectable markers include but are not limited to antimicrobials. Thus, the term "selectable marker" refers to genes that provide an indication that a host cell has taken up an incoming DNA of interest or some other reaction has occurred. Typically, selectable markers are genes that confer antimicrobial resistance or a metabolic advantage on the host cell to allow cells containing the exogenous DNA to be distinguished from cells that have not received any exogenous sequence during the transformation. A "residing selectable marker" is one that is located on the chromosome of the microorganism to be transformed. A residing selectable marker encodes a gene that is different from the selectable marker on the transforming DNA construct. Selective markers are well known to those of skill in the art. As indicated above, preferably the marker is an antimicrobial resistant marker (e.g., ampR; phleoR; specR; kanR; eryR; tetR; cmpR; and neoR; See e.g., Guerot-Fleury, Gene, 167:335-337 [1995]; Palmeros et al., Gene 247:255-264 [2000]; and Trieu-Cuot et al., Gene, 23:331-341 [1983]). Other markers useful in accordance with the invention include, but are not limited to auxotrophic markers, such as tryptophan; and detection markers, such as β-galactosidase.
As used herein, the term "promoter" refers to a nucleic acid sequence that functions to direct transcription of a downstream gene. In preferred embodiments, the promoter is appropriate to the host cell in which the target gene is being expressed. The promoter, together with other transcriptional and translational regulatory nucleic acid sequences (also termed "control sequences") is necessary to express a given gene. In general, the transcriptional and translational regulatory sequences include, but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop sequences, translational start and stop sequences, and enhancer or activator sequences.
A nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic acid sequence. For example, DNA encoding a secretory leader (i.e., a signal peptide), is operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation. Generally, "operably linked" means that the DNA sequences being linked are contiguous, and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers do not have to be contiguous. Linking is accomplished by ligation at convenient restriction sites. If such sites do not exist, the synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice.
As used herein, the term "gene" refers to a polynucleotide (e.g., a DNA segment), that encodes a polypeptide and includes regions preceding and following the coding regions as well as intervening sequences (introns) between individual coding segments (exons).
As used herein, "homologous genes" refers to a pair of genes from different, but usually related species, which correspond to each other and which are identical or very similar to each other. The term encompasses genes that are separated by speciation (i.e., the development of new species) (e.g., orthologous genes), as well as genes that have been separated by genetic duplication (e.g., paralogous genes).
As used herein, "ortholog" and "orthologous genes" refer to genes in different species that have evolved from a common ancestral gene (i.e., a homologous gene) by speciation. Typically, orthologs retain the same function during the course of evolution. Identification of orthologs finds use in the reliable prediction of gene function in newly sequenced genomes.
As used herein, "paralog" and "paralogous genes" refer to genes that are related by duplication within a genome. While orthologs retain the same function through the course of evolution, paralogs evolve new functions, even though some functions are often related to the original one. Examples of paralogous genes include, but are not limited to genes encoding trypsin, chymotrypsin, elastase, and thrombin, which are all serine proteinases and occur together within the same species.
As used herein, "homology" refers to sequence similarity or identity, with identity being preferred. This homology is determined using standard techniques known in the art (See e.g., Smith and Waterman, Adv. Appl. Math., 2:482 [1981]; Needleman and Wunsch, J. Mol. Biol., 48:443 [1970]; Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85:2444 [1988]; programs such as GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package (Genetics Computer Group, Madison, WI); and Devereux et al., Nucl. Acid Res., 12:387-395 [1984]).
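For orientation only, a global alignment of the general kind introduced by Needleman and Wunsch can be sketched as below. This is a simplified illustration, not the procedure of any cited program: the function name and the scoring values (match = 1, mismatch = -1, linear gap penalty = -2) are assumptions chosen for readability, whereas practical tools typically use substitution matrices and affine gap penalties.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment by dynamic programming (illustrative scoring only).

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


# Example with short made-up peptide strings.
result = global_align("ACDE", "ADE")
```

The aligned strings returned by such a routine are the kind of input from which a sequence identity value can subsequently be computed.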
As used herein, an "analogous sequence" is one wherein the function of the gene is essentially the same as the gene based on the Cellulomonas strain 69B4 protease. Additionally, analogous genes include at least 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99% or 100% sequence identity with the sequence of the Cellulomonas strain 69B4 protease. Alternatively, analogous sequences have an alignment of between 70% and 100% of the genes found in the Cellulomonas strain 69B4 protease region and/or have at least between 5-10 genes found in the region aligned with the genes in the Cellulomonas strain 69B4 chromosome. In additional embodiments, more than one of the above properties applies to the sequence. Analogous sequences are determined by known methods of sequence alignment. A commonly used alignment method is BLAST, although as indicated above and below, there are other methods that also find use in aligning sequences.
One example of a useful algorithm is PILEUP. PILEUP creates a multiple sequence alignment from a group of related sequences using progressive, pair-wise alignments. It can also plot a tree showing the clustering relationships used to create the alignment. PILEUP uses a simplification of the progressive alignment method of Feng and Doolittle (Feng and Doolittle, J. Mol. Evol., 35:351-360 [1987]). The method is similar to that described by Higgins and Sharp (Higgins and Sharp, CABIOS 5:151-153 [1989]). Useful PILEUP parameters include a default gap weight of 3.00, a default gap length weight of 0.10, and weighted end gaps.
Another example of a useful algorithm is the BLAST algorithm, described by Altschul et al., (Altschul et al., J. Mol. Biol., 215:403-410, [1990]; and Karlin et al., Proc. Natl. Acad. Sci. USA 90:5873-5787 [1993]). A particularly useful BLAST program is the WU-BLAST-2 program (See, Altschul et al., Meth. Enzymol., 266:460-480 [1996]). WU-BLAST-2 uses several search parameters, most of which are set to the default values. The adjustable parameters are set with the following values: overlap span = 1, overlap fraction = 0.125, word threshold (T) = 11. The HSP S and HSP S2 parameters are dynamic values and are established by the program itself depending upon the composition of the particular sequence and composition of the particular database against which the sequence of interest is being searched. However, the values may be adjusted to increase sensitivity. A % amino acid sequence identity value is determined by the number of matching identical residues divided by the total number of residues of the "longer" sequence in the aligned region. The "longer" sequence is the one having the most actual residues in the aligned region (gaps introduced by WU-BLAST-2 to maximize the alignment score are ignored).
Thus, "percent (%) nucleic acid sequence identity" is defined as the percentage of nucleotide residues in a candidate sequence that are identical with the nucleotide residues of the starting sequence (i.e., the sequence of interest). A preferred method utilizes the BLASTN module of WU-BLAST-2 set to the default parameters, with overlap span and overlap fraction set to 1 and 0.125, respectively.
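By way of illustration only, the identity calculation described above (matching identical residues divided by the residue count of the "longer" sequence in the aligned region, with gaps ignored) can be sketched as follows. The function name percent_identity and the use of "-" as the gap character are assumptions made for this sketch; the sketch operates on two sequences that have already been aligned and is not the WU-BLAST-2 program itself.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over an existing pairwise alignment.

    Both inputs are rows of the same alignment (equal length, gaps marked
    with `gap`). The denominator is the number of actual residues in the
    row that has more of them, so gap characters are not counted.
    """
    a, b = aligned_a.upper(), aligned_b.upper()
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")

    # Count columns where both rows carry the same residue (not a gap).
    matches = sum(1 for x, y in zip(a, b) if x == y and x != gap)

    # Count actual residues (non-gap characters) in each row.
    residues_a = sum(1 for x in a if x != gap)
    residues_b = sum(1 for y in b if y != gap)
    longer = max(residues_a, residues_b)
    if longer == 0:
        raise ValueError("alignment contains no residues")

    return 100.0 * matches / longer


if __name__ == "__main__":
    # Five identical columns; the longer row has six residues.
    print(round(percent_identity("ACGT-A", "ACGTTA"), 2))
```

The same function applies to amino acid or nucleotide alignments, since it only compares characters column by column.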
As used herein, the term "hybridization" refers to the process by which a strand of nucleic acid joins with a complementary strand through base pairing, as known in the art.
A nucleic acid sequence is considered to be "selectively hybridizable" to a reference nucleic acid sequence if the two sequences specifically hybridize to one another under moderate to high stringency hybridization and wash conditions. Hybridization conditions are based on the melting temperature (Tm) of the nucleic acid binding complex or probe. For example, "maximum stringency" typically occurs at about Tm-5°C (5° below the Tm of the probe); "high stringency" at about 5-10°C below the Tm; "intermediate stringency" at about 10-20°C below the Tm of the probe; and "low stringency" at about 20-25°C below the Tm. Functionally, maximum stringency conditions may be used to identify sequences having strict identity or near-strict identity with the hybridization probe; while an intermediate or low stringency hybridization can be used to identify or detect polynucleotide sequence homologs.
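For illustration, the stringency ranges recited above can be expressed as a simple lookup on the difference between the probe Tm and the hybridization temperature. The function name classify_stringency is hypothetical, and the handling of the boundary values (which the ranges above share) and of temperatures less than 5°C below the Tm is an assumption of this sketch rather than part of the definition.

```python
def classify_stringency(tm_celsius: float, hyb_temp_celsius: float) -> str:
    """Map a hybridization temperature to one of the stringency labels.

    delta is the number of degrees the hybridization temperature lies
    below the probe Tm. Boundary values are assigned to the more
    stringent label (an assumption; the text uses "about").
    """
    delta = tm_celsius - hyb_temp_celsius
    if delta < 0:
        return "outside described ranges"  # above the Tm
    if delta <= 5:
        return "maximum"        # about Tm - 5 degrees C
    if delta <= 10:
        return "high"           # about 5-10 degrees C below Tm
    if delta <= 20:
        return "intermediate"   # about 10-20 degrees C below Tm
    if delta <= 25:
        return "low"            # about 20-25 degrees C below Tm
    return "outside described ranges"


if __name__ == "__main__":
    for temp in (65, 62, 55, 47, 40):
        print(temp, classify_stringency(70.0, temp))
```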
Moderate and high stringency hybridization conditions are well known in the art. An example of high stringency conditions includes hybridization at about 42°C in 50% formamide, 5X SSC, 5X Denhardt's solution, 0.5% SDS and 100 µg/ml denatured carrier DNA, followed by washing two times in 2X SSC and 0.5% SDS at room temperature and two additional times in 0.1X SSC and 0.5% SDS at 42°C. An example of moderately stringent conditions includes an overnight incubation at 37°C in a solution comprising 20% formamide, 5X SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5X Denhardt's solution, 10% dextran sulfate and 20 mg/ml denatured sheared salmon sperm DNA, followed by washing the filters in 1X SSC at about 37-50°C. Those of skill in the art know how to adjust the temperature, ionic strength, etc. as necessary to accommodate factors such as probe length and the like.
As used herein, "recombinant" includes reference to a cell or vector, that has been modified by the introduction of a heterologous nucleic acid sequence or that the cell is derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found in identical form within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under expressed or not expressed at all as a result of deliberate human intervention. "Recombination," "recombining," and generating a "recombined" nucleic acid are generally the assembly of two or more nucleic acid fragments wherein the assembly gives rise to a chimeric gene.
In a preferred embodiment, mutant DNA sequences are generated with site saturation mutagenesis in at least one codon. In another preferred embodiment, site saturation mutagenesis is performed for two or more codons. In a further embodiment, mutant DNA sequences have more than 50%, more than 55%, more than 60%, more than 65%, more than 70%, more than 75%, more than 80%, more than 85%, more than 90%, more than 95%, or more than 98% homology with the wild-type sequence. In alternative embodiments, mutant DNA is generated in vivo using any known mutagenic procedure such as, for example, radiation, nitrosoguanidine and the like. The desired DNA sequence is then isolated and used in the methods provided herein.
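As a non-limiting illustration of site saturation at a single codon, the following sketch enumerates, in silico, every sequence obtained by replacing one codon of a coding sequence with each of the other possible codons. The function name saturate_codon and the zero-based codon index are assumptions of this sketch; it models only the resulting set of sequences and not any particular laboratory protocol.

```python
from itertools import product

BASES = "ACGT"
# All 64 possible nucleotide triplets.
ALL_CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]


def saturate_codon(coding_seq: str, codon_index: int) -> list[str]:
    """Return every variant of coding_seq that differs at one codon.

    codon_index is zero-based. The original codon is excluded, so a
    sequence of valid bases yields 63 variants.
    """
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    start = 3 * codon_index
    if not 0 <= start < len(seq):
        raise IndexError("codon index out of range")

    original = seq[start:start + 3]
    return [
        seq[:start] + codon + seq[start + 3:]
        for codon in ALL_CODONS
        if codon != original
    ]


if __name__ == "__main__":
    variants = saturate_codon("ATGGCTAAA", 1)  # saturate the second codon
    print(len(variants), variants[:3])
```

Saturating two or more codons can be modeled by applying the same function to each variant in turn at a second index.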
As used herein, the term "target sequence" refers to a DNA sequence in the host cell that encodes the sequence where it is desired for the incoming sequence to be inserted into the host cell genome. In some embodiments, the target sequence encodes a functional wild-type gene or operon, while in other embodiments the target sequence encodes a functional mutant gene or operon, or a non-functional gene or operon.
As used herein, a "flanking sequence" refers to any sequence that is either upstream or downstream of the sequence being discussed (e.g., for genes A-B-C, gene B is flanked by the A and C gene sequences). In a preferred embodiment, the incoming sequence is flanked by a homology box on each side. In another embodiment, the incoming sequence and the homology boxes comprise a unit that is flanked by stuffer sequence on each side. In some embodiments, a flanking sequence is present on only a single side (either 3' or 5'), while in preferred embodiments, it is present on each side of the sequence being flanked.
As used herein, the term "stuffer sequence" refers to any extra DNA that flanks homology boxes (typically vector sequences). However, the term encompasses any non-homologous DNA sequence. Not to be limited by any theory, a stuffer sequence provides a noncritical target for a cell to initiate DNA uptake.
As used herein, the terms "amplification" and "gene amplification" refer to a process by which specific DNA sequences are disproportionately replicated such that the amplified gene becomes present in a higher copy number than was initially present in the genome. In some embodiments, selection of cells by growth in the presence of a drug (e.g., an inhibitor of an inhibitable enzyme) results in the amplification of either the endogenous gene encoding the gene product required for growth in the presence of the drug or by amplification of exogenous (i.e., input) sequences encoding this gene product, or both.
"Amplification" is a special case of nucleic acid replication involving template specificity. It is to be contrasted with non-specific template replication (i.e., replication that is template-dependent but not dependent on a specific template). Template specificity is here distinguished from fidelity of replication (i.e., synthesis of the proper polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template specificity is frequently described in terms of "target" specificity. Target sequences are "targets" in the sense that they are sought to be sorted out from other nucleic acid. Amplification techniques have been designed primarily for this sorting out.

As used herein, the term "co-amplification" refers to the introduction into a single cell of an amplifiable marker in conjunction with other gene sequences (i.e., comprising one or more non-selectable genes such as those contained within an expression vector) and the application of appropriate selective pressure such that the cell amplifies both the amplifiable marker and the other, non-selectable gene sequences. The amplifiable marker may be physically linked to the other gene sequences or alternatively two separate pieces of DNA, one containing the amplifiable marker and the other containing the non-selectable marker, may be introduced into the same cell.
As used herein, the terms "amplifiable marker," "amplifiable gene," and "amplification vector" refer to a gene or a vector encoding a gene which permits the amplification of that gene under appropriate growth conditions.
"Template specificity" is achieved in most amplification techniques by the choice of enzyme. Amplification enzymes are enzymes that, under the conditions in which they are used, will process only specific sequences of nucleic acid in a heterogeneous mixture of nucleic acid. For example, in the case of Qβ replicase, MDV-1 RNA is the specific template for the replicase (See e.g., Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038 [1972]). Other nucleic acids are not replicated by this amplification enzyme. Similarly, in the case of T7 RNA polymerase, this amplification enzyme has a stringent specificity for its own promoters (See, Chamberlin et al., Nature 228:227 [1970]). In the case of T4 DNA ligase, the enzyme will not ligate the two oligonucleotides or polynucleotides where there is a mismatch between the oligonucleotide or polynucleotide substrate and the template at the ligation junction (See, Wu and Wallace, Genomics 4:560 [1989]). Finally, Taq and Pfu polymerases, by virtue of their ability to function at high temperature, are found to display high specificity for the sequences bounded and thus defined by the primers; the high temperature results in thermodynamic conditions that favor primer hybridization with the target sequences and not hybridization with non-target sequences.
As used herein, the term "amplifiable nucleic acid" refers to nucleic acids which may be amplified by any amplification method. It is contemplated that "amplifiable nucleic acid" will usually comprise "sample template."
As used herein, the term "sample template" refers to nucleic acid originating from a sample which is analyzed for the presence of "target" (defined below). In contrast, "background template" is used in reference to nucleic acid other than sample template which may or may not be present in a sample. Background template is most often inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.
As used herein, the term "primer" refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced, (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH). The primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.
As used herein, the term "probe" refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, which is capable of hybridizing to another oligonucleotide of interest. A probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences. It is contemplated that any probe used in the present invention will be labeled with any "reporter molecule," so that it is detectable in any detection system, including, but not limited to enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label.
As used herein, the term "target," when used in reference to the polymerase chain reaction, refers to the region of nucleic acid bounded by the primers used for polymerase chain reaction. Thus, the "target" is sought to be sorted out from other nucleic acid sequences. A "segment" is defined as a region of nucleic acid within the target sequence.
As used herein, the term "polymerase chain reaction" ("PCR") refers to the methods of U.S. Patent Nos. 4,683,195, 4,683,202, and 4,965,188, hereby incorporated by reference, which include methods for increasing the concentration of a segment of a target sequence in a mixture of genomic DNA without cloning or purification. This process for amplifying the target sequence consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. The two primers are complementary to their respective strands of the double stranded target sequence. To effect amplification, the mixture is denatured and the primers then annealed to their complementary sequences within the target molecule. Following annealing, the primers are extended with a polymerase so as to form a new pair of complementary strands. The steps of denaturation, primer annealing and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one "cycle"; there can be numerous "cycles") to obtain a high concentration of an amplified segment of the desired target sequence. The length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter. By virtue of the repeating aspect of the process, the method is referred to as the "polymerase chain reaction" (hereinafter "PCR"). Because the desired amplified segments of the target sequence become the predominant sequences (in terms of concentration) in the mixture, they are said to be "PCR amplified".
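The effect of repeated cycles, and the dependence of the amplified segment length on primer position, can be illustrated with the following idealized sketch. The per-cycle efficiency parameter, the function names, and the 1-based inclusive coordinate convention are assumptions made for illustration; real reactions plateau and do not follow this simple model at high cycle numbers.

```python
def copies_after_cycles(initial_copies: float, cycles: int,
                        efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of PCR cycles.

    efficiency is the fraction of templates duplicated per cycle
    (1.0 means perfect doubling, 0.0 means no amplification).
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


def amplified_segment_length(forward_primer_start: int,
                             reverse_primer_end: int) -> int:
    """Length of the segment bounded by the two primers.

    Coordinates are 1-based and inclusive on the same template strand:
    the 5' end of the forward primer and the position on that strand
    matching the 5' end of the reverse primer.
    """
    if reverse_primer_end < forward_primer_start:
        raise ValueError("reverse primer must lie downstream of forward primer")
    return reverse_primer_end - forward_primer_start + 1


if __name__ == "__main__":
    print(copies_after_cycles(1, 30))            # single copy, perfect doubling
    print(amplified_segment_length(101, 600))    # primers bound a 500-nt segment
```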
As used herein, the term "amplification reagents" refers to those reagents (deoxyribonucleotide triphosphates, buffer, etc.) needed for amplification except for primers, nucleic acid template and the amplification enzyme. Typically, amplification reagents along with other reaction components are placed and contained in a reaction vessel (test tube, microwell, etc.).
With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment). In addition to genomic DNA, any oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of primer molecules. In particular, the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.
As used herein, the terms "PCR product," "PCR fragment," and "amplification product" refer to the resultant mixture of compounds after two or more cycles of the PCR steps of denaturation, annealing and extension are complete. These terms encompass the case where there has been amplification of one or more segments of one or more target sequences.
As used herein, the term "RT-PCR" refers to the replication and amplification of RNA sequences. In this method, reverse transcription is coupled to PCR, most often using a one-enzyme procedure in which a thermostable polymerase is employed, as described in U.S. Patent No. 5,322,770, herein incorporated by reference. In RT-PCR, the RNA template is converted to cDNA due to the reverse transcriptase activity of the polymerase, and then amplified using the polymerizing activity of the polymerase (i.e., as in other PCR methods).
As used herein, the terms "restriction endonucleases" and "restriction enzymes" refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific nucleotide sequence.
A "restriction site" refers to a nucleotide sequence recognized and cleaved by a given restriction endonuclease and is frequently the site for insertion of DNA fragments. In certain embodiments of the invention restriction sites are engineered into the selective marker and into 5' and 3' ends of the DNA construct.
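By way of example, locating restriction sites in a sequence amounts to searching for each occurrence of the recognition sequence. The sketch below uses the EcoRI recognition sequence GAATTC purely as an illustration; the function name find_restriction_sites and the zero-based positions are assumptions of this sketch. Only the given strand is scanned, which is sufficient for a palindromic recognition sequence such as this one.

```python
def find_restriction_sites(dna: str, recognition_seq: str) -> list[int]:
    """Return zero-based start positions of every recognition sequence.

    Overlapping occurrences are reported. Input is treated as
    case-insensitive.
    """
    seq = dna.upper()
    site = recognition_seq.upper()
    if not site:
        raise ValueError("recognition sequence must not be empty")

    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        # Resume one base further on so overlapping sites are found.
        start = seq.find(site, start + 1)
    return positions


if __name__ == "__main__":
    ECORI = "GAATTC"  # EcoRI recognition sequence
    print(find_restriction_sites("TTGAATTCAAGAATTC", ECORI))
```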
As used herein, the term "chromosomal integration" refers to the process whereby an incoming sequence is introduced into the chromosome of a host cell. The homologous regions of the transforming DNA align with homologous regions of the chromosome. Subsequently, the sequence between the homology boxes is replaced by the incoming sequence in a double crossover (i.e., homologous recombination). In some embodiments of the present invention, homologous sections of an inactivating chromosomal segment of a DNA construct align with the flanking homologous regions of the indigenous chromosomal region of the Bacillus chromosome. Subsequently, the indigenous chromosomal region is deleted by the DNA construct in a double crossover (i.e., homologous recombination).
"Homologous recombination" means the exchange of DNA fragments between two DNA molecules or paired chromosomes at the site of identical or nearly identical nucleotide sequences. In a preferred embodiment, chromosomal integration is homologous recombination.
"Homologous sequences" as used herein means a nucleic acid or polypeptide sequence having 100%, 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 88%, 85%, 80%, 75%, or 70% sequence identity to another nucleic acid or polypeptide sequence when optimally aligned for comparison. In some embodiments, homologous sequences have between 85% and 100% sequence identity, while in other embodiments there is between 90% and 100% sequence identity, and in more preferred embodiments, there is between 95% and 100% sequence identity.
As used herein "amino acid" refers to peptide or protein sequences or portions thereof. The terms "protein," "peptide," and "polypeptide" are used interchangeably.
As used herein, "protein of interest" and "polypeptide of interest" refer to a protein/polypeptide that is desired and/or being assessed. In some embodiments, the protein of interest is expressed intracellularly, while in other embodiments, it is a secreted polypeptide. In particularly preferred embodiments, these enzymes include the serine proteases of the present invention. In some embodiments, the protein of interest is a secreted polypeptide which is fused to a signal peptide (i.e., an amino-terminal extension on a protein to be secreted). Nearly all secreted proteins use an amino-terminal protein extension which plays a crucial role in the targeting to and translocation of precursor proteins across the membrane. This extension is proteolytically removed by a signal peptidase during or immediately following membrane transfer.
As used herein, the term "heterologous protein" refers to a protein or polypeptide that does not naturally occur in the host cell. Examples of heterologous proteins include enzymes such as hydrolases including proteases. In some embodiments, the genes encoding the proteins are naturally occurring genes, while in other embodiments, mutated and/or synthetic genes are used.
As used herein, "homologous protein" refers to a protein or polypeptide native or naturally occurring in a cell. In preferred embodiments, the cell is a Gram-positive cell, while in particularly preferred embodiments, the cell is a Bacillus host cell. In alternative embodiments, the homologous protein is a native protein produced by other organisms, including but not limited to E. coli, Streptomyces, Trichoderma, and Aspergillus. The invention encompasses host cells producing the homologous protein via recombinant DNA technology.
As used herein, an "operon region" comprises a group of contiguous genes that are transcribed as a single transcription unit from a common promoter, and are thereby subject to co-regulation. In some embodiments, the operon includes a regulator gene. In most preferred embodiments, operons that are highly expressed as measured by RNA levels, but have an unknown or unnecessary function are used.
As used herein, an "antimicrobial region" is a region containing at least one gene that encodes an antimicrobial protein.
A polynucleotide is said to "encode" an RNA or a polypeptide if, in its native state or when manipulated by methods known to those of skill in the art, it can be transcribed and/or translated to produce the RNA, the polypeptide or a fragment thereof. The anti-sense strand of such a nucleic acid is also said to encode the sequences.
As is known in the art, a DNA can be transcribed by an RNA polymerase to produce RNA, but an RNA can be reverse transcribed by reverse transcriptase to produce a DNA. Thus a DNA can encode a RNA and vice versa.
The term "regulatory segment" or "regulatory sequence" or "expression control sequence" refers to a polynucleotide sequence of DNA that is operatively linked with a polynucleotide sequence of DNA that encodes the amino acid sequence of a polypeptide chain to effect the expression of the encoded amino acid sequence. The regulatory sequence can inhibit, repress, or promote the expression of the operably linked polynucleotide sequence encoding the amino acid.
"Host strain" or "host cell" refers to a suitable host for an expression vector comprising DNA according to the present invention.
An enzyme is "overexpressed" in a host cell if the enzyme is expressed in the cell at a higher level than the level at which it is expressed in a corresponding wild-type cell.
The terms "protein" and "polypeptide" are used interchangeably herein. The 3-letter code for amino acids as defined in conformity with the IUPAC-IUB Joint Commission on Biochemical Nomenclature (JCBN) is used throughout this disclosure. It is also understood that a polypeptide may be coded for by more than one nucleotide sequence due to the degeneracy of the genetic code.
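To illustrate the degeneracy noted above, the number of distinct nucleotide sequences that can encode a given polypeptide is the product of the number of codons available for each residue. The sketch below assumes the standard genetic code and uses the 3-letter amino acid codes; the function name count_encoding_sequences is hypothetical, and stop codons are not counted.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (61 sense codons in total), keyed by 3-letter code.
CODON_COUNTS = {
    "Ala": 4, "Arg": 6, "Asn": 2, "Asp": 2, "Cys": 2,
    "Gln": 2, "Glu": 2, "Gly": 4, "His": 2, "Ile": 3,
    "Leu": 6, "Lys": 2, "Met": 1, "Phe": 2, "Pro": 4,
    "Ser": 6, "Thr": 4, "Trp": 1, "Tyr": 2, "Val": 4,
}


def count_encoding_sequences(peptide: list[str]) -> int:
    """Count the nucleotide sequences that encode the given peptide.

    peptide is a list of 3-letter amino acid codes, e.g. ["Met", "Leu"].
    """
    unknown = [aa for aa in peptide if aa not in CODON_COUNTS]
    if unknown:
        raise ValueError(f"unrecognized amino acid codes: {unknown}")
    return prod(CODON_COUNTS[aa] for aa in peptide)


if __name__ == "__main__":
    # Met has 1 codon, Leu has 6, Ser has 6.
    print(count_encoding_sequences(["Met", "Leu", "Ser"]))
```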
A "prosequence" is an amino acid sequence between the signal sequence and mature protease that is necessary for the secretion of the protease. Cleavage of the prosequence will result in a mature active protease.
The term "signal sequence" or "signal peptide" refers to any sequence of nucleotides and/or amino acids which may participate in the secretion of the mature or precursor forms of the protein. This definition of signal sequence is a functional one, meant to include all those amino acid sequences encoded by the N-terminal portion of the protein gene, which participate in the effectuation of the secretion of protein. They are often, but not universally, bound to the N-terminal portion of a protein or to the N-terminal portion of a precursor protein. The signal sequence may be endogenous or exogenous. The signal sequence may be that normally associated with the protein (e.g., protease), or may be from a gene encoding another secreted protein. One exemplary exogenous signal sequence comprises the first seven amino acid residues of the signal sequence from Bacillus subtilis subtilisin fused to the remainder of the signal sequence of the subtilisin from Bacillus lentus (ATCC 21536).
The term "hybrid signal sequence" refers to signal sequences in which part of the sequence is obtained from the expression host fused to the signal sequence of the gene to be expressed. In some embodiments, synthetic sequences are utilized.
The term "substantially the same signal activity" refers to the signal activity, as indicated by substantially the same secretion of the protease into the fermentation medium, for example a fermentation medium protease level being at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98% of the secreted protease levels in the fermentation medium as provided by the signal sequence of SEQ ID NOS:5 and/or 9.

The term "mature" form of a protein or peptide refers to the final functional form of the protein or peptide. To exemplify, a mature form of the protease of the present invention at least includes the amino acid sequence identical to residue positions 1-189 of SEQ ID NO:8.
The term "precursor" form of a protein or peptide refers to a mature form of the protein having a prosequence operably linked to the amino or carboxyl terminus of the protein. The precursor may also have a "signal" sequence operably linked to the amino terminus of the prosequence. The precursor may also have additional polynucleotides that are involved in post-translational activity (e.g., polynucleotides cleaved therefrom to leave the mature form of a protein or peptide).
"Naturally occurring enzyme" refers to an enzyme having the unmodified amino acid sequence identical to that found in nature. Naturally occurring enzymes include native enzymes, those enzymes naturally expressed or found in the particular microorganism.
The terms "derived from" and "obtained from" refer to not only a protease produced or producible by a strain of the organism in question, but also a protease encoded by a DNA sequence isolated from such strain and produced in a host organism containing such DNA sequence. Additionally, the term refers to a protease which is encoded by a DNA sequence of synthetic and/or cDNA origin and which has the identifying characteristics of the protease in question. To exemplify, "proteases derived from Cellulomonas" refers to those enzymes having proteolytic activity which are naturally produced by Cellulomonas, as well as to serine proteases like those produced by Cellulomonas sources but which through the use of genetic engineering techniques are produced by non-Cellulomonas organisms transformed with a nucleic acid encoding said serine proteases.
A "derivative" within the scope of this definition generally retains the characteristic proteolytic activity observed in the wild-type, native or parent form to the extent that the derivative is useful for similar purposes as the wild-type, native or parent form. Functional derivatives of serine protease encompass naturally occurring, synthetically or recombinantly produced peptides or peptide fragments which have the general characteristics of the serine protease of the present invention.
The term "functional derivative" refers to a derivative of a nucleic acid which has the functional characteristics of a nucleic acid which encodes serine protease. Functional derivatives of a nucleic acid which encode serine protease of the present invention encompass naturally occurring, synthetically or recombinantly produced nucleic acids or fragments and encode serine protease characteristic of the present invention. Wild type nucleic acid encoding serine proteases according to the invention include naturally occurring alleles and homologues based on the degeneracy of the genetic code known in the art.
The term "identical" in the context of two nucleic acids or polypeptide sequences refers to the residues in the two sequences that are the same when aligned for maximum correspondence, as measured using one of the following sequence comparison or analysis algorithms.
The term "optimal alignment" refers to the alignment giving the highest percent identity score.
"Percent sequence identity," "percent amino acid sequence identity," "percent gene sequence identity," and/or "percent nucleic acid/polynucleotide sequence identity," with respect to two amino acid, polynucleotide and/or gene sequences (as appropriate), refer to the percentage of residues that are identical in the two sequences when the sequences are optimally aligned. Thus, 80% amino acid sequence identity means that 80% of the amino acids in two optimally aligned polypeptide sequences are identical.
The phrase "substantially identical" in the context of two nucleic acids or polypeptides thus refers to a polynucleotide or polypeptide comprising at least 70% sequence identity, preferably at least 75%, preferably at least 80%, preferably at least 85%, preferably at least 90%, preferably at least 95%, preferably at least 97%, preferably at least 98% and preferably at least 99% sequence identity as compared to a reference sequence using programs or algorithms (e.g., BLAST, ALIGN, CLUSTAL) with standard parameters. One indication that two polypeptides are substantially identical is that the first polypeptide is immunologically cross-reactive with the second polypeptide. Typically, polypeptides that differ by conservative amino acid substitutions are immunologically cross-reactive. Thus, a polypeptide is substantially identical to a second polypeptide, for example, where the two peptides differ only by a conservative substitution. Another indication that two nucleic acid sequences are substantially identical is that the two molecules hybridize to each other under stringent conditions (e.g., within a range of medium to high stringency).
The phrase "equivalent," in this context, refers to serine protease enzymes that are encoded by a polynucleotide capable of hybridizing to the polynucleotide having the sequence as shown in SEQ ID NO:1, under conditions of medium to maximal stringency. For example, being equivalent means that an equivalent mature serine protease comprises at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98% and/or at least 99% sequence identity to the mature Cellulomonas serine protease having the amino acid sequence of SEQ ID NO:8.

The term "isolated" or "purified" refers to a material that is removed from its original environment (e.g., the natural environment if it is naturally occurring). For example, the material is said to be "purified" when it is present in a particular composition in a higher or lower concentration than exists in a naturally occurring or wild type organism or in combination with components not normally present upon expression from a naturally occurring or wild type organism. For example, a naturally-occurring polynucleotide or polypeptide present in a living animal is not isolated, but the same polynucleotide or polypeptide, separated from some or all of the coexisting materials in the natural system, is isolated. Such polynucleotides could be part of a vector, and/or such polynucleotides or polypeptides could be part of a composition, and still be isolated in that such vector or composition is not part of its natural environment. In preferred embodiments, a nucleic acid or protein is said to be purified, for example, if it gives rise to essentially one band in an electrophoretic gel or blot.
The term "isolated", when used in reference to a DNA sequence, refers to a DNA sequence that has been removed from its natural genetic milieu and is thus free of other extraneous or unwanted coding sequences, and is in a form suitable for use within genetically engineered protein production systems. Such isolated molecules are those that are separated from their natural environment and include cDNA and genomic clones. Isolated DNA molecules of the present invention are free of other genes with which they are ordinarily associated, but may include naturally occurring 5' and 3' untranslated regions such as promoters and terminators. The identification of associated regions will be evident to one of ordinary skill in the art (See e.g., Dynan and Tjian, Nature 316:774-78 [1985]). The term "an isolated DNA sequence" is alternatively referred to as "a cloned DNA sequence".
The term "isolated," when used in reference to a protein, refers to a protein that is found in a condition other than its native environment. In a preferred form, the isolated protein is substantially free of other proteins, particularly other homologous proteins. An isolated protein is more than 10% pure, preferably more than 20% pure, and even more preferably more than 30% pure, as determined by SDS-PAGE. Further aspects of the invention encompass the protein in a highly purified form (i.e., more than 40% pure, more than 60% pure, more than 80% pure, more than 90% pure, more than 95% pure, more than 97% pure, and even more than 99% pure), as determined by SDS-PAGE.
As used herein, the term "combinatorial mutagenesis" refers to methods in which libraries of variants of a starting sequence are generated. In these libraries, the variants contain one or several mutations chosen from a predefined set of mutations. In addition, the methods provide means to introduce random mutations which were not members of the predefined set of mutations. In some embodiments, the methods include those set forth in U.S. Patent Appln. Ser. No. 09/699,250, filed October 26, 2000, hereby incorporated by reference. In alternative embodiments, combinatorial mutagenesis methods encompass commercially available kits (e.g., QuikChange® Multisite, Stratagene, San Diego, CA).
As used herein, the term "library of mutants" refers to a population of cells which are identical in most of their genome but include different homologues of one or more genes. Such libraries can be used, for example, to identify genes or operons with improved traits.
As used herein, the term "starting gene" refers to a gene of interest that encodes a protein of interest that is to be improved and/or changed using the present invention.
As used herein, the term "multiple sequence alignment" ("MSA") refers to the sequences of multiple homologs of a starting gene that are aligned using an algorithm (e.g., Clustal W).
As used herein, the terms "consensus sequence" and "canonical sequence" refer to an archetypical amino acid sequence against which all variants of a particular protein or sequence of interest are compared. The terms also refer to a sequence that sets forth the nucleotides that are most often present in a DNA sequence of interest. For each position of a gene, the consensus sequence gives the amino acid that is most abundant in that position in the MSA.
As used herein, the term "consensus mutation" refers to a difference in the sequence of a starting gene and a consensus sequence. Consensus mutations are identified by comparing the sequences of the starting gene and the consensus sequence resulting from an MSA. In some embodiments, consensus mutations are introduced into the starting gene such that it becomes more similar to the consensus sequence. Consensus mutations also include amino acid changes that change an amino acid in a starting gene to an amino acid that is more frequently found in an MSA at that position relative to the frequency of that amino acid in the starting gene. Thus, the term consensus mutation comprises all single amino acid changes that replace an amino acid of the starting gene with an amino acid that is more abundant than the amino acid in the MSA.
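By way of illustration only, the derivation of a consensus sequence from an MSA, and the identification of consensus mutations relative to a starting sequence, may be sketched as follows. The function names, the gap character "-", and the example sequences are hypothetical, and the sketch assumes that the starting sequence has already been aligned to the MSA.

```python
from collections import Counter


def consensus_sequence(msa):
    """Return the most abundant residue at each column of an aligned MSA."""
    if len({len(seq) for seq in msa}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    consensus = []
    for column in zip(*msa):
        # Gap characters are ignored when counting residues in a column.
        counts = Counter(residue for residue in column if residue != "-")
        consensus.append(counts.most_common(1)[0][0] if counts else "-")
    return "".join(consensus)


def consensus_mutations(starting, msa):
    """List (1-based position, starting residue, consensus residue) differences."""
    cons = consensus_sequence(msa)
    return [
        (i + 1, start_res, cons_res)
        for i, (start_res, cons_res) in enumerate(zip(starting, cons))
        if start_res != cons_res and cons_res != "-"
    ]


# Hypothetical aligned homologs and starting sequence.
msa = ["ACDE", "ACDF", "GCDF", "ACEF"]
print(consensus_sequence(msa))
print(consensus_mutations("GCDE", msa))
```

Ties between equally abundant residues are resolved arbitrarily in this sketch; an actual design would need an explicit rule for such positions.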
As used herein, the term "initial hit" refers to a variant that was identified by screening a combinatorial consensus mutagenesis library. In preferred embodiments, initial hits have improved performance characteristics, as compared to the starting gene.
As used herein, the term "improved hit" refers to a variant that was identified by screening an enhanced combinatorial consensus mutagenesis library.
As used herein, the terms "improving mutation" and "performance-enhancing mutation" refer to a mutation that leads to improved performance when it is introduced into the starting gene. In some preferred embodiments, these mutations are identified by sequencing hits that were identified during the screening step of the method. In most embodiments, mutations that are more frequently found in hits are likely to be improving mutations, as compared to an unscreened combinatorial consensus mutagenesis library.
As used herein, the term "enhanced combinatorial consensus mutagenesis library" refers to a CCM library that is designed and constructed based on screening and/or sequencing results from an earlier round of CCM mutagenesis and screening. In some embodiments, the enhanced CCM library is based on the sequence of an initial hit resulting from an earlier round of CCM. In additional embodiments, the enhanced CCM is designed such that mutations that were frequently observed in initial hits from earlier rounds of mutagenesis and screening are favored. In some preferred embodiments, this is accomplished by omitting primers that encode performance-reducing mutations or by increasing the concentration of primers that encode performance-enhancing mutations relative to other primers that were used in earlier CCM libraries.
As used herein, the term "performance-reducing mutations" refers to mutations in the combinatorial consensus mutagenesis library that are less frequently found in hits resulting from screening as compared to an unscreened combinatorial consensus mutagenesis library. In preferred embodiments, the screening process removes and/or reduces the abundance of variants that contain "performance-reducing mutations."
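By way of illustration only, the comparison of mutation frequencies in hits against an unscreened library may be sketched as follows. The mutation labels, the counts, and the use of a ratio of 1 as the dividing line are assumptions made for the example and are not taken from the Examples.

```python
def mutation_enrichment(hit_counts, n_hits, library_counts, n_library):
    """Ratio of each mutation's frequency in hits to its frequency in the unscreened library."""
    ratios = {}
    for mutation, library_count in library_counts.items():
        if library_count == 0:
            continue  # frequency in the unscreened library is undefined
        library_freq = library_count / n_library
        hit_freq = hit_counts.get(mutation, 0) / n_hits
        ratios[mutation] = hit_freq / library_freq
    return ratios


def classify(ratios):
    """Label mutations by whether they are enriched or depleted among hits."""
    labels = {}
    for mutation, ratio in ratios.items():
        if ratio > 1:
            labels[mutation] = "performance-enhancing"
        elif ratio < 1:
            labels[mutation] = "performance-reducing"
        else:
            labels[mutation] = "neutral"
    return labels


# Hypothetical sequencing counts: 50 sequenced hits, 100 sequenced unscreened clones.
ratios = mutation_enrichment({"A10V": 30, "G25S": 5}, 50, {"A10V": 40, "G25S": 40}, 100)
print(classify(ratios))
```

In practice a statistical test on the counts would be preferable to a bare ratio, particularly when few hits have been sequenced.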
As used herein, the term "functional assay" refers to an assay that provides an indication of a protein's activity. In particularly preferred embodiments, the term refers to assay systems in which a protein is analyzed for its ability to function in its usual capacity. For example, in the case of enzymes, a functional assay involves determining the effectiveness of the enzyme in catalyzing a reaction.
As used herein, the term "target property" refers to the property of the starting gene that is to be altered. It is not intended that the present invention be limited to any particular target property. However, in some preferred embodiments, the target property is the stability of a gene product (e.g., resistance to denaturation, proteolysis or other degradative factors), while in other embodiments, the level of production in a production host is altered. Indeed, it is contemplated that any property of a starting gene will find use in the present invention.
The term "property" or grammatical equivalents thereof in the context of a nucleic acid, as used herein, refer to any characteristic or attribute of a nucleic acid that can be selected or detected. These properties include, but are not limited to, a property affecting binding to a polypeptide, a property conferred on a cell comprising a particular nucleic acid, a property affecting gene transcription (e.g., promoter strength, promoter recognition, promoter regulation, enhancer function), a property affecting RNA processing (e.g., RNA splicing, RNA stability, RNA conformation, and post-transcriptional modification), a property affecting translation (e.g., level, regulation, binding of mRNA to ribosomal proteins, post-translational modification). For example, a binding site for a transcription factor, polymerase, regulatory factor, etc., of a nucleic acid may be altered to produce desired characteristics or to identify undesirable characteristics.
The term "property" or grammatical equivalents thereof in the context of a polypeptide, as used herein, refer to any characteristic or attribute of a polypeptide that can be selected or detected. These properties include, but are not limited to, oxidative stability, substrate specificity, catalytic activity, thermal stability, alkaline stability, pH activity profile, resistance to proteolytic degradation, KM, kcat, kcat/KM ratio, protein folding, inducing an immune response, ability to bind to a ligand, ability to bind to a receptor, ability to be secreted, ability to be displayed on the surface of a cell, ability to oligomerize, ability to signal, ability to stimulate cell proliferation, ability to inhibit cell proliferation, ability to induce apoptosis, ability to be modified by phosphorylation or glycosylation, and ability to treat disease.
As used herein, the term "screening" has its usual meaning in the art and is, in general, a multi-step process. In the first step, a mutant nucleic acid or variant polypeptide therefrom is provided. In the second step, a property of the mutant nucleic acid or variant polypeptide is determined. In the third step, the determined property is compared to a property of the corresponding precursor nucleic acid, to the property of the corresponding naturally occurring polypeptide or to the property of the starting material (e.g., the initial sequence) for the generation of the mutant nucleic acid.
It will be apparent to the skilled artisan that the screening procedure for obtaining a nucleic acid or protein with an altered property depends upon the property of the starting material the modification of which the generation of the mutant nucleic acid is intended to facilitate. The skilled artisan will therefore appreciate that the invention is not limited to any specific property to be screened for and that the following description of properties lists illustrative examples only. Methods for screening for any particular property are generally described in the art. For example, one can measure binding, pH, specificity, etc., before and after mutation, wherein a change indicates an alteration. Preferably, the screens are performed in a high-throughput manner, including multiple samples being screened simultaneously, including, but not limited to assays utilizing chips, phage display, and multiple substrates and/or indicators.
As used herein, in some embodiments, screens encompass selection steps in which variants of interest are enriched from a population of variants. Examples of these embodiments include the selection of variants that confer a growth advantage to the host organism, as well as phage display or any other method of display, where variants can be captured from a population of variants based on their binding or catalytic properties. In a preferred embodiment, a library of variants is exposed to stress (heat, protease, denaturation) and subsequently variants that are still intact are identified in a screen or enriched by selection. It is intended that the term encompass any suitable means for selection. Indeed, it is not intended that the present invention be limited to any particular method of screening.
As used herein, the term "targeted randomization" refers to a process that produces a plurality of sequences where one or several positions have been randomized. In some embodiments, randomization is complete (i.e., all four nucleotides, A, T, G, and C can occur at a randomized position). In alternative embodiments, randomization of a nucleotide is limited to a subset of the four nucleotides. Targeted randomization can be applied to one or several codons of a sequence, coding for one or several proteins of interest. When expressed, the resulting libraries produce protein populations in which one or more amino acid positions can contain a mixture of all 20 amino acids or a subset of amino acids, as determined by the randomization scheme of the randomized codon. In some embodiments, the individual members of a population resulting from targeted randomization differ in the number of amino acids, due to targeted or random insertion or deletion of codons. In further embodiments, synthetic amino acids are included in the protein populations produced. In some preferred embodiments, the majority of members of a population resulting from targeted randomization show greater sequence homology to the consensus sequence than the starting gene. In some embodiments, the sequence encodes one or more proteins of interest. In alternative embodiments, the proteins have differing biological functions. In some preferred embodiments, the incoming sequence comprises at least one selectable marker.
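By way of illustration only, the relationship between the randomization scheme of a codon and the amino acids that can appear at the corresponding position may be sketched as follows. The sketch assumes the standard genetic code; the function name and the particular schemes shown (including the commonly used NNK scheme, in which the third position is limited to G or T) are examples chosen for illustration and are not prescribed by the present description.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def encoded_amino_acids(scheme):
    """Return the set of amino acids (and '*' for stop) encoded by a randomized codon.

    `scheme` gives the nucleotides allowed at each of the three codon positions.
    """
    if len(scheme) != 3:
        raise ValueError("a codon scheme needs exactly three positions")
    return {CODON_TABLE["".join(codon)] for codon in product(*scheme)}


complete = encoded_amino_acids(("ACGT", "ACGT", "ACGT"))  # all four nucleotides everywhere
nnk = encoded_amino_acids(("ACGT", "ACGT", "GT"))          # third position limited to G or T
subset = encoded_amino_acids(("G", "ACGT", "T"))           # only the second position randomized

print(len(complete - {"*"}), len(nnk - {"*"}), sorted(subset))
```

Limiting the nucleotides at a position therefore narrows the mixture of amino acids that the expressed library can contain at that position.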
The terms "modified sequence" and "modified genes" are used interchangeably herein to refer to a sequence that includes a deletion, insertion or interruption of a naturally occurring nucleic acid sequence. In some preferred embodiments, the expression product of the modified sequence is a truncated protein (e.g., if the modification is a deletion or interruption of the sequence). In some particularly preferred embodiments, the truncated protein retains biological activity. In alternative embodiments, the expression product of the modified sequence is an elongated protein (e.g., modifications comprising an insertion into the nucleic acid sequence). In some embodiments, an insertion leads to a truncated protein (e.g., when the insertion results in the formation of a stop codon). Thus, an insertion may result in either a truncated protein or an elongated protein as an expression product.
As used herein, the terms "mutant sequence" and "mutant gene" are used interchangeably and refer to a sequence that has an alteration in at least one codon occurring in a host cell's wild-type sequence. The expression product of the mutant sequence is a protein with an altered amino acid sequence relative to the wild-type. The expression product may have an altered functional capacity (e.g., enhanced enzymatic activity).
The terms "mutagenic primer" or "mutagenic oligonucleotide" (used interchangeably herein) are intended to refer to oligonucleotide compositions which correspond to a portion of the template sequence and which are capable of hybridizing thereto. With respect to mutagenic primers, the primer will not precisely match the template nucleic acid, the mismatch or mismatches in the primer being used to introduce the desired mutation into the nucleic acid library. As used herein, "non-mutagenic primer" or "non-mutagenic oligonucleotide" refers to oligonucleotide compositions which will match precisely to the template nucleic acid. In one embodiment of the invention, only mutagenic primers are used. In another preferred embodiment of the invention, the primers are designed so that for at least one region at which a mutagenic primer has been included, there is also a non-mutagenic primer included in the oligonucleotide mixture. By adding a mixture of mutagenic primers and non-mutagenic primers corresponding to at least one of the mutagenic primers, it is possible to produce a resulting nucleic acid library in which a variety of combinatorial mutational patterns are presented. For example, if it is desired that some of the members of the mutant nucleic acid library retain their precursor sequence at certain positions while other members are mutant at such sites, the non-mutagenic primers provide the ability to obtain a specific level of non-mutant members within the nucleic acid library for a given residue. The methods of the invention employ mutagenic and non-mutagenic oligonucleotides which are generally between 10-50 bases in length, more preferably about 15-45 bases in length. However, it may be necessary to use primers that are either shorter than 10 bases or longer than 50 bases to obtain the mutagenesis result desired.
With respect to corresponding mutagenic and non-mutagenic primers, it is not necessary that the corresponding oligonucleotides be of identical length, but only that there is overlap in the region corresponding to the mutation to be added.
Primers may be added in a pre-defined ratio according to the present invention. For example, if it is desired that the resulting library have a significant level of a certain specific mutation and a lesser amount of a different mutation at the same or different site, by adjusting the amount of primer added, it is possible to produce the desired biased library. Alternatively, by adding lesser or greater amounts of non-mutagenic primers, it is possible to adjust the frequency with which the corresponding mutation(s) are produced in the mutant nucleic acid library.
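By way of illustration only, the effect of primer ratios on a biased library may be estimated with the following sketch. It rests on a simplifying assumption that is not stated in the present description, namely that primers competing for the same site are incorporated in proportion to the amounts added; the primer labels and amounts are hypothetical.

```python
def expected_site_frequencies(primer_amounts):
    """Estimate the fraction of library members carrying each primer's sequence at one site.

    Assumes incorporation is proportional to the amount of each competing primer,
    which ignores differences in annealing efficiency between primers.
    """
    total = sum(primer_amounts.values())
    if total <= 0:
        raise ValueError("at least one primer must be added in a positive amount")
    return {name: amount / total for name, amount in primer_amounts.items()}


# Hypothetical mixture for one site: non-mutagenic primer plus two mutagenic primers.
mixture = {"non-mutagenic": 2.0, "mutation A": 5.0, "mutation B": 1.0}
print(expected_site_frequencies(mixture))
```

Under this assumption, raising the amount of the non-mutagenic primer lowers the expected frequency of every mutation at that site, consistent with the adjustment described above.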
As used herein, the phrase "contiguous mutations" refers to mutations which are presented within the same oligonucleotide primer. For example, contiguous mutations may be adjacent or nearby each other; however, they will be introduced into the resulting mutant template nucleic acids by the same primer.
As used herein, the phrase "discontiguous mutations" refers to mutations which are presented in separate oligonucleotide primers. For example, discontiguous mutations will be introduced into the resulting mutant template nucleic acids by separately prepared oligonucleotide primers.
The terms "wild-type sequence," or "wild-type gene" are used interchangeably herein, to refer to a sequence that is native or naturally occurring in a host cell. In some embodiments, the wild-type sequence refers to a sequence of interest that is the starting point of a protein engineering project. The wild-type sequence may encode either a homologous or heterologous protein. A homologous protein is one the host cell would produce without intervention. A heterologous protein is one that the host cell would not produce but for the intervention.
As used herein, the term "antibodies" refers to immunoglobulins. Antibodies include but are not limited to immunoglobulins obtained directly from any species from which it is desirable to produce antibodies. In addition, the present invention encompasses modified antibodies. The term also refers to antibody fragments that retain the ability to bind to the epitope that the intact antibody binds and include polyclonal antibodies, monoclonal antibodies, chimeric antibodies, anti-idiotype (anti-ID) antibodies. Antibody fragments include, but are not limited to the complementarity-determining regions (CDRs), single-chain fragment variable regions (scFv), heavy chain variable region (VH), light chain variable region (VL). Polyclonal and monoclonal antibodies are also encompassed by the present invention. Preferably, the antibodies are monoclonal antibodies.
The term "oxidation stable" refers to proteases of the present invention that retain a specified amount of enzymatic activity over a given period of time under conditions prevailing during the proteolytic, hydrolyzing, cleaning or other process of the invention, for example while exposed to or contacted with bleaching agents or oxidizing agents. In some embodiments, the proteases retain at least 50%, 60%, 70%, 75%, 80%, 85%, 90%, 92%, 95%, 96%, 97%, 98% or 99% proteolytic activity after contact with a bleaching or oxidizing agent over a given time period, for example, at least 1 minute, 3 minutes, 5 minutes, 8 minutes, 12 minutes, 16 minutes, 20 minutes, etc. In some embodiments, the stability is measured as described in the Examples.
The term "chelator stable" refers to proteases of the present invention that retain a specified amount of enzymatic activity over a given period of time under conditions prevailing during the proteolytic, hydrolyzing, cleaning or other process of the invention, for example while exposed to or contacted with chelating agents. In some embodiments, the proteases retain at least 50%, 60%, 70%, 75%, 80%, 85%, 90%, 92%, 95%, 96%, 97%, 98% or 99% proteolytic activity after contact with a chelating agent over a given time period, for example, at least 10 minutes, 20 minutes, 40 minutes, 60 minutes, 100 minutes, etc. In some embodiments, the chelator stability is measured as described in the Examples.
The terms "thermally stable" and "thermostable" refer to proteases of the present invention that retain a specified amount of enzymatic activity after exposure to identified temperatures over a given period of time under conditions prevailing during the proteolytic, hydrolyzing, cleaning or other process of the invention, for example while exposed to altered temperatures. Altered temperatures include increased or decreased temperatures. In some embodiments, the proteases retain at least 50%, 60%, 70%, 75%, 80%, 85%, 90%, 92%, 95%, 96%, 97%, 98% or 99% proteolytic activity after exposure to altered temperatures over a given time period, for example, at least 60 minutes, 120 minutes, 180 minutes, 240 minutes, 300 minutes, etc. In some embodiments, the thermostability is determined as described in the Examples.
The term "enhanced stability" in the context of an oxidation, chelator, thermal and/or pH stable protease refers to a higher retained proteolytic activity over time as compared to other serine proteases (e.g., subtilisin proteases) and/or wild-type enzymes.
The term "diminished stability" in the context of an oxidation, chelator, thermal and/or pH stable protease refers to a lower retained proteolytic activity over time as compared to other serine proteases (e.g., subtilisin proteases) and/or wild-type enzymes.
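By way of illustration only, the retained-activity comparison underlying the stability definitions above may be sketched as follows. The function names and the activity readings are hypothetical; the actual stability measurements are as described in the Examples.

```python
def retained_activity_percent(initial_activity, final_activity):
    """Percent of the initial proteolytic activity remaining after exposure."""
    if initial_activity <= 0:
        raise ValueError("initial activity must be positive")
    return 100.0 * final_activity / initial_activity


def is_stable(retained_percent, threshold_percent=50.0):
    """True when the retained activity meets the chosen threshold (e.g., at least 50%)."""
    return retained_percent >= threshold_percent


def has_enhanced_stability(test_retained, reference_retained):
    """True when the test protease retains more activity than the reference protease."""
    return test_retained > reference_retained


# Hypothetical readings (arbitrary units) before and after contact with an oxidizing agent.
test = retained_activity_percent(200.0, 150.0)
reference = retained_activity_percent(200.0, 100.0)
print(test, is_stable(test, 70.0), has_enhanced_stability(test, reference))
```

The same calculation applies whether the stress is an oxidizing agent, a chelating agent, or an altered temperature; only the exposure conditions and time period differ.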
As used herein, the term "cleaning composition" includes, unless otherwise indicated, granular or powder-form all-purpose or "heavy-duty" washing agents, especially cleaning detergents; liquid, gel or paste-form all-purpose washing agents, especially the so-called heavy-duty liquid types; liquid fine-fabric detergents; hand dishwashing agents or light duty dishwashing agents, especially those of the high-foaming type; machine dishwashing agents, including the various tablet, granular, liquid and rinse-aid types for household and institutional use; liquid cleaning and disinfecting agents, including antibacterial hand-wash types, cleaning bars, mouthwashes, denture cleaners, car or carpet shampoos, bathroom cleaners; hair shampoos and hair-rinses; shower gels and foam baths and metal cleaners; as well as cleaning auxiliaries such as bleach additives and "stain-stick" or pre-treat types.
It is to be understood that the test methods described in the Examples herein are used to determine the respective values of the parameters of the present invention, as such invention is described and claimed herein.
Unless otherwise noted, all component or composition levels are in reference to the active level of that component or composition, and are exclusive of impurities, for example, residual solvents or by-products, which may be present in commercially available sources.
Enzyme component weights are based on total active protein.
All percentages and ratios are calculated by weight unless otherwise indicated. All percentages and ratios are calculated based on the total composition unless otherwise indicated.
It should be understood that every maximum numerical limitation given throughout this specification includes every lower numerical limitation, as if such lower numerical limitations were expressly written herein. Every minimum numerical limitation given throughout this specification will include every higher numerical limitation, as if such higher numerical limitations were expressly written herein. Every numerical range given throughout this specification will include every narrower numerical range that falls within such broader numerical range, as if such narrower numerical ranges were all expressly written herein.
The term "cleaning activity" refers to the cleaning performance achieved by the protease under conditions prevailing during the proteolytic, hydrolyzing, cleaning or other process of the invention. In some embodiments, cleaning performance is determined by the application of various cleaning assays concerning enzyme sensitive stains, for example grass, blood, milk, or egg protein as determined by various chromatographic, spectrophotometric or other quantitative methodologies after subjection of the stains to standard wash conditions. Exemplary assays include, but are not limited to those described in WO 99/34011, and U.S. Pat. 6,605,458 (both of which are herein incorporated by reference), as well as those methods included in the Examples.
The term "cleaning effective amount" of a protease refers to the quantity of protease described hereinbefore that achieves a desired level of enzymatic activity in a specific cleaning composition. Such effective amounts are readily ascertained by one of ordinary skill in the art and are based on many factors, such as the particular protease used, the cleaning application, the specific composition of the cleaning composition, and whether a liquid or dry (e.g., granular, bar) composition is required, etc.
The term "cleaning adjunct materials," as used herein, means any liquid, solid or gaseous material selected for the particular type of cleaning composition desired and the form of the product (e.g., liquid, granule, powder, bar, paste, spray, tablet, gel, or foam composition), which materials are also preferably compatible with the protease enzyme used in the composition. In some embodiments, granular compositions are in "compact" form, while in other embodiments, the liquid compositions are in a "concentrated" form.
The term "enhanced performance" in the context of cleaning activity refers to an increased or greater cleaning activity of certain enzyme sensitive stains such as egg, milk, grass or blood, as determined by usual evaluation after a standard wash cycle and/or multiple wash cycles.
The term "diminished performance" in the context of cleaning activity refers to a decreased or lesser cleaning activity of certain enzyme sensitive stains such as egg, milk, grass or blood, as determined by usual evaluation after a standard wash cycle.
The term "comparative performance" in the context of cleaning activity refers to at least 60%, at least 70%, at least 80%, at least 90%, at least 95% of the cleaning activity of a comparative subtilisin protease (e.g., commercially available proteases), including but not limited to OPTIMASE™ protease (Genencor), PURAFECT™ protease products (Genencor), SAVINASE™ protease (Novozymes), BPN'-variants (See e.g., U.S. Pat. No. Re 34,606), RELASE™, DURAZYME™, EVERLASE™, KANNASE™ protease (Novozymes), MAXACAL™, MAXAPEM™, PROPERASE™ proteases (Genencor; See also, U.S. Pat. No. Re 34,606, U.S. Pat. Nos. 5,700,676; 5,955,340; 6,312,936; 6,482,628), and B. lentus variant protease products [for example those described in WO 92/21760, WO 95/23221 and/or WO 97/07770 (Henkel)]. Exemplary subtilisin protease variants include, but are not limited to those having substitutions or deletions at residue positions equivalent to positions 76, 101, 103, 104, 120, 159, 167, 170, 194, 195, 217, 232, 235, 236, 245, 248, and/or 252 of BPN'. Cleaning performance can be determined by comparing the proteases of the present invention with those subtilisin proteases in various cleaning assays concerning enzyme sensitive stains such as grass, blood or milk as determined by usual spectrophotometric or analytical methodologies after standard wash cycle conditions.
As used herein, a "low detergent concentration" system includes detergents where less than about 800 ppm of detergent components are present in the wash water. Japanese detergents are typically considered low detergent concentration systems, as they usually have approximately 667 ppm of detergent components present in the wash water.
As used herein, a "medium detergent concentration" system includes detergents wherein between about 800 ppm and about 2000 ppm of detergent components are present in the wash water. North American detergents are generally considered to be medium detergent concentration systems as they usually have approximately 975 ppm of detergent components present in the wash water. Brazilian detergents typically have approximately 1500 ppm of detergent components present in the wash water.
As used herein, a "high detergent concentration" system includes detergents wherein greater than about 2000 ppm of detergent components are present in the wash water. European detergents are generally considered to be high detergent concentration systems as they have approximately 3000-8000 ppm of detergent components in the wash water.
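By way of illustration only, the three detergent concentration systems defined above may be expressed as a simple classification. The function name is hypothetical, and because the limits are stated as "about" 800 ppm and "about" 2000 ppm, the treatment of values falling exactly on a limit is an assumption of the sketch.

```python
def detergent_concentration_system(ppm):
    """Classify wash water by ppm of detergent components present."""
    if ppm < 0:
        raise ValueError("concentration cannot be negative")
    if ppm < 800:
        return "low"
    if ppm <= 2000:
        return "medium"
    return "high"


# Typical values quoted in the definitions above.
for region, ppm in [("Japan", 667), ("North America", 975), ("Brazil", 1500), ("Europe", 3000)]:
    print(region, ppm, detergent_concentration_system(ppm))
```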
As used herein, "fabric cleaning compositions" include hand and machine laundry detergent compositions including laundry additive compositions and compositions suitable for use in the soaking and/or pretreatment of stained fabrics (e.g., clothes, linens, and other textile materials).
As used herein, "non-fabric cleaning compositions" include non-textile (i.e., non-fabric) surface cleaning compositions, including but not limited to dishwashing detergent compositions, oral cleaning compositions, denture cleaning compositions, and personal cleansing compositions.
The "compact" form of the cleaning compositions herein is best reflected by density and, in terms of composition, by the amount of inorganic filler salt. Inorganic filler salts are conventional ingredients of detergent compositions in powder form. In conventional detergent compositions, the filler salts are present in substantial amounts, typically 17-35% by weight of the total composition. In contrast, in compact compositions, the filler salt is present in amounts not exceeding 15% of the total composition. In some embodiments, the filler salt is present in amounts that do not exceed 10%, or more preferably, 5%, by weight of the composition. In some embodiments, the inorganic filler salts are selected from the alkali and alkaline-earth-metal salts of sulfates and chlorides. A preferred filler salt is sodium sulfate.
II. Serine Protease Enzymes and Nucleic Acid Encoding Serine Protease Enzymes
The present invention provides isolated polynucleotides encoding amino acid sequences, encoding proteases. In some embodiments, these polynucleotides comprise at least 65% amino acid sequence identity, preferably at least 70% amino acid sequence identity, more preferably at least 75% amino acid sequence identity, still more preferably at least 80% amino acid sequence identity, more preferably at least 85% amino acid sequence identity, even more preferably at least 90% amino acid sequence identity, more preferably at least 92% amino acid sequence identity, yet more preferably at least 95% amino acid sequence identity, more preferably at least 97% amino acid sequence identity, still more preferably at least 98% amino acid sequence identity, and most preferably at least 99% amino acid sequence identity to an amino acid sequence as shown in SEQ ID NOS:6-8 (e.g., at least a portion of the amino acid sequence encoded by the polynucleotide having proteolytic activity, including the mature protease catalyzing the hydrolysis of peptide linkages of substrates), and/or demonstrating comparable or enhanced washing performance under identified wash conditions.
In some embodiments, the percent identity (amino acid sequence, nucleic acid sequence, gene sequence) is determined by a direct comparison of the sequence information between two molecules by aligning the sequences, counting the exact number of matches between the two aligned sequences, dividing by the length of the shorter sequence, and multiplying the result by 100. Readily available computer programs find use in these analyses, such as those described above. Programs for determining nucleotide sequence identity are available in the Wisconsin Sequence Analysis Package, Version 8 (Genetics Computer Group, Madison, WI) for example, the BESTFIT, FASTA and GAP programs, which also rely on the Smith and Waterman algorithm. These programs are readily utilized with the default parameters recommended by the manufacturer and described in the Wisconsin Sequence Analysis Package referred to above.
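By way of illustration only, the direct-comparison calculation described above (count the matches, divide by the length of the shorter sequence, multiply by 100) may be sketched as follows. The sketch assumes that the two sequences have already been aligned and that gaps are written as "-"; the function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences, relative to the shorter sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Count positions where both sequences carry the same residue (gaps never match).
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    # Length of the shorter sequence, not counting gap characters.
    shorter = min(len(aligned_a.replace("-", "")), len(aligned_b.replace("-", "")))
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter


print(round(percent_identity("ACDEFG", "ACDQFG"), 1))  # 5 matches over 6 residues
print(percent_identity("ACDE--", "ACDEFG"))            # 4 matches over the shorter length of 4
```

Dividing by the shorter sequence means that a fragment fully contained in a longer sequence scores 100%, which differs from programs that divide by the alignment length.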
An example of an algorithm that is suitable for determining sequence similarity is the BLAST algorithm, which is described in Altschul, et al., J. Mol. Biol., 215:403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence that either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. These initial neighborhood word hits act as starting points to find longer HSPs containing them. The word hits are expanded in both directions along each of the two sequences being compared for as far as the cumulative alignment score can be increased. Extension of the word hits is stopped when: the cumulative alignment score falls off by the quantity X from a maximum achieved value; the cumulative score goes to zero or below; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLAST program uses as defaults a wordlength (W) of 11, the BLOSUM62 scoring matrix (See, Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1989)), alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
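The three stopping rules for word-hit extension can be illustrated with a short sketch. This is not the BLAST implementation; it is a simplified, ungapped, right-hand extension for nucleotide sequences, using the match reward M=5 and mismatch penalty N=-4 given above. The function name, the seed-score argument, and the default drop-off value of 20 are assumptions chosen for illustration.

```python
def extend_right(query, subject, q_start, s_start, seed_score,
                 match=5, mismatch=-4, x_drop=20):
    """Extend a word hit to the right without gaps.

    Returns (best_score, best_length), where best_length is the number of
    extended positions that achieved the maximum cumulative score.
    """
    score = seed_score
    best_score = seed_score
    best_len = 0
    i = 0
    # Stop rule 3: the end of either sequence is reached.
    while q_start + i < len(query) and s_start + i < len(subject):
        score += match if query[q_start + i] == subject[s_start + i] else mismatch
        i += 1
        if score > best_score:
            best_score, best_len = score, i
        # Stop rule 2: cumulative score goes to zero or below.
        # Stop rule 1: score falls off by X from the maximum achieved value.
        if score <= 0 or best_score - score >= x_drop:
            break
    return best_score, best_len
```

A left-hand extension would mirror this loop with decreasing indices; the final HSP is the seed plus the best extension in each direction.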
The BLAST algorithm then performs a statistical analysis of the similarity between two sequences (See e.g., Karlin and Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 [1993]). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a serine protease nucleic acid of this invention if the smallest sum probability in a comparison of the test nucleic acid to a serine protease nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001. Where the test nucleic acid encodes a serine protease polypeptide, it is considered similar to a specified serine protease nucleic acid if the comparison results in a smallest sum probability of less than about 0.5, and more preferably less than about 0.2.
In some embodiments of the present invention, sequences were analyzed by BLAST and protein translation sequence tools. In some experiments, the preferred version was BLAST (Basic BLAST version 2.0). The program chosen was "BlastX", and the database chosen was "nr". Standard/default parameter values were employed.
In some preferred embodiments, the present invention encompasses the polynucleotide of approximately 1621 base pairs in length set forth in SEQ ID NO:1. A start codon is shown in bold in SEQ ID NO:1. In another embodiment of the present invention, the polynucleotides encoding these amino acid sequences comprise a 1485 base pair portion (residues 1-1485 of SEQ ID NO:2) that, if expressed, is believed to encode a signal sequence (nucleotides 1-84 of SEQ ID NO:5) encoding amino acids 1-28 of SEQ ID NO:9; an N-terminal prosequence (nucleotides 84-594 encoding amino acid residues 29-198 of SEQ ID NO:6); a mature protease sequence (nucleotides 595-1161 of SEQ ID NO:2 encoding amino acid residues 1-189 of SEQ ID NO:8); and a C-terminal pro-sequence (nucleotides 1162-1486 encoding amino acid residues 388-495 of SEQ ID NO:6). Alternatively, the signal peptide, the N-terminal pro-sequence, mature serine protease sequence and C-terminal pro-sequence are numbered in relation to the amino acid residues of the mature protease of SEQ ID NO:6 being numbered 1-189, i.e., signal peptide (residues -198 to -171), an N-terminal pro-sequence (residues -171 to -1), the mature serine protease sequence (residues 1-189) and a C-terminal pro-sequence (residues 190-298). In another embodiment of the present invention, the polynucleotide encoding an amino acid sequence having proteolytic activity comprises a nucleotide sequence of nucleotides 1 to 1485 of the portion of SEQ ID NO:2 encoding the signal peptide and precursor protease. In another embodiment of the present invention, the polynucleotide encoding an amino acid sequence comprises the sequence of nucleotides 1 to 1412 of the polynucleotide encoding the precursor Cellulomonas protease (SEQ ID NO:3). In yet another embodiment, the polynucleotide encoding an amino acid sequence comprises the sequence of nucleotides 1 to 587 of the portion of the polynucleotide encoding the mature Cellulomonas protease (SEQ ID NO:4).
As will be understood by the skilled artisan, due to the degeneracy of the genetic code, a variety of polynucleotides can encode the signal peptide, precursor protease and/or mature protease provided in SEQ ID NOS:6, 7, and/or 8, respectively, or a protease having the % sequence identity described above. Another embodiment of the present invention encompasses a polynucleotide comprising a nucleotide sequence having at least 70% sequence identity, at least 75% sequence identity, at least 80% sequence identity, at least 85% sequence identity, at least 90% sequence identity, at least 92% sequence identity, at least 95% sequence identity, at least 97% sequence identity, at least 98% sequence identity and at least 99% sequence identity to the polynucleotide sequence of SEQ ID NOS:2, 3, and/or 4, respectively, encoding the signal peptide and precursor protease, the precursor protease and/or the mature protease, respectively.
In additional embodiments, the present invention provides fragments or portions of DNA that encode proteases, so long as the encoded fragment retains proteolytic activity. Another embodiment of the present invention encompasses polynucleotides having at least 20% of the sequence length, at least 30% of the sequence length, at least 40% of the sequence length, at least 50% of the sequence length, at least 60% of the sequence length, at least 70% of the sequence length, at least 75% of the sequence length, at least 80% of the sequence length, at least 85% of the sequence length, at least 90% of the sequence length, at least 92% of the sequence length, at least 95% of the sequence length, at least 97% of the sequence length, at least 98% of the sequence length and at least 99% of the sequence length of the polynucleotide sequence of SEQ ID NO:2, or residues 185-1672 of SEQ ID NO:1, encoding the precursor protease. In alternative embodiments, these fragments or portions of the sequence length are contiguous portions of the sequence length, useful for shuffling of the DNA sequence in recombinant DNA sequences (See e.g., U.S. Pat. No. 6,132,970).
Another embodiment of the invention includes fragments of the DNA described herein that find use according to art-recognized techniques in obtaining partial length DNA fragments capable of being used to isolate or identify polynucleotides encoding mature protease enzyme described herein from Cellulomonas 69B4, or a segment thereof having proteolytic activity. Moreover, the DNA provided in SEQ ID NO:1 finds use in identifying homologous fragments of DNA from other species, and particularly from Cellulomonas spp. which encode a protease or portion thereof having proteolytic activity.
In addition, the present invention encompasses using primer or probe sequences constructed from SEQ ID NO:1, or a suitable portion or fragment thereof (e.g., at least about 5-20 or 10-15 contiguous nucleotides), as a probe or primer for screening nucleic acid of either genomic or cDNA origin. In some embodiments, the present invention provides DNA probes of the desired length (i.e., generally between 100 and 1000 bases in length), based on the sequences in SEQ ID NOS:1, 2, 3, and/or 4.
In some embodiments, the DNA fragments are electrophoretically isolated, cut from the gel, and recovered from the agar matrix of the gel. In preferred embodiments, this purified fragment of DNA is then labeled (using, for example, the Megaprime labeling system according to the instructions of the manufacturer) to incorporate P32 in the DNA. The labeled probe is denatured by heating to 95°C for a given period of time (e.g., 5 minutes), and immediately added to the membrane and prehybridization solution. The hybridization reaction proceeds for an appropriate time and under appropriate conditions (e.g., 18 hours at 37°C), with gentle shaking or rotation. The membrane is rinsed (e.g., twice in SSC/0.3% SDS) and then washed in an appropriate wash solution with gentle agitation. The stringency desired is a reflection of the conditions under which the membrane (filter) is washed. In some embodiments herein, "low-stringency" conditions involve washing with a solution of 0.2X SSC/0.1% SDS at 20°C for 15 minutes, while in other embodiments, "medium-stringency" conditions involve a further washing step comprising washing with a solution of 0.2X SSC/0.1% SDS at 37°C for 30 minutes, while in other embodiments, "high-stringency" conditions involve a further washing step comprising washing with a solution of 0.2X SSC/0.1% SDS at 37°C for 45 minutes, and in further embodiments, "maximum-stringency" conditions involve a further washing step comprising washing with a solution of 0.2X SSC/0.1% SDS at 37°C for 60 minutes. Thus, various embodiments of the present invention provide polynucleotides capable of hybridizing to a probe derived from the nucleotide sequence provided in SEQ ID NOS:1, 2, 3, 4, and/or 5, under conditions of medium, high and/or maximum stringency.
After washing, the membrane is dried and the bound probe detected. If P32 or another radioisotope is used as the labeling agent, the bound probe is detected by autoradiography. Other techniques for the visualization of other probes are well-known to those of skill in the art. The detection of a bound probe indicates a nucleic acid sequence has the desired homology, and therefore identity to SEQ ID NOS:1, 2, 3, 4, and/or 5, and is encompassed by the present invention. Accordingly, the present invention provides methods for the detection of nucleic acid encoding a protease encompassed by the present invention which comprises hybridizing part or all of a nucleic acid sequence of SEQ ID NOS:1, 2, 3, 4, and/or 5 with other nucleic acid of either genomic or cDNA origin.
As indicated above, in other embodiments, hybridization conditions are based on the melting temperature (Tm) of the nucleic acid binding complex, to confer a defined "stringency" as explained below. "Maximum stringency" typically occurs at about Tm-5°C (5°C below the Tm of the probe); "high stringency" at about 5°C to 10°C below Tm; "intermediate stringency" at about 10°C to 20°C below Tm; and "low stringency" at about 20°C to 25°C below Tm. As known to those of skill in the art, medium, high and/or maximum stringency hybridization are chosen such that conditions are optimized to identify or detect polynucleotide sequence homologues or equivalent polynucleotide sequences.
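The Tm-based stringency ranges above reduce to simple arithmetic, sketched below. The offsets are taken directly from the text; the dictionary and function names are hypothetical, and the probe Tm itself is assumed to be known or estimated separately.

```python
# Offsets below Tm in degrees C, as (smallest offset, largest offset).
STRINGENCY_OFFSETS_C = {
    "maximum": (5, 5),         # about Tm - 5
    "high": (5, 10),           # about 5 to 10 below Tm
    "intermediate": (10, 20),  # about 10 to 20 below Tm
    "low": (20, 25),           # about 20 to 25 below Tm
}

def hybridization_window(tm_c: float, stringency: str) -> tuple:
    """Return the (lowest, highest) temperature in degrees C for a stringency level."""
    small, large = STRINGENCY_OFFSETS_C[stringency]
    return (tm_c - large, tm_c - small)
```

For a probe with an assumed Tm of 70°C, high stringency would correspond to roughly 60°C to 65°C, and low stringency to roughly 45°C to 50°C.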
In yet additional embodiments, the present invention provides nucleic acid constructs (i.e., expression vectors) comprising the polynucleotides encoding the proteases of the present invention. In further embodiments, the present invention provides host cells transformed with at least one of these vectors.
In further embodiments, the present invention provides polynucleotide sequences further encoding a signal sequence. In some embodiments, the invention encompasses polynucleotides having signal activity comprising a nucleotide sequence having at least 65% sequence identity, at least 70% sequence identity, preferably at least 75% sequence identity, more preferably at least 80% sequence identity, still further preferably at least 85% sequence identity, even more preferably at least 90% sequence identity, more preferably at least 95% sequence identity, more preferably at least 97% sequence identity, at least 98% sequence identity, and most preferably at least 99% sequence identity to SEQ ID NO:5. Thus, in these embodiments, the present invention provides a sequence with a putative signal sequence, and polynucleotides being capable of hybridizing to a probe derived from the nucleotide sequence disclosed in SEQ ID NO:5 under conditions of medium, high and/or maximal stringency, wherein the signal sequences have substantially the same signal activity as the signal sequence encoded by the polynucleotide of the present invention.
In some embodiments, the signal activity is indicated by substantially the same level of secretion of the protease into the fermentation medium, as the starting material. For example, in some embodiments, the present invention provides fermentation medium protease levels at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 98% of the secreted protease levels in the fermentation medium as provided by the signal sequence of SEQ ID NO:3. In some embodiments, the secreted protease levels are ascertained by protease activity analyses such as the pNA assay (See e.g., Del Mar, [1979], infra). Additional means for determining the levels of secretion of a heterologous or homologous protein in a Gram-positive host cell and detecting secreted proteins include using either polyclonal or monoclonal antibodies specific for the protein. Examples include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA) and fluorescence-activated cell sorting (FACS), as is well known to those in the art.
In further embodiments, the present invention provides polynucleotides encoding an amino acid sequence of a signal peptide (nucleotides 1-84 of SEQ ID NO:5), as shown in SEQ ID NO:9, nucleotide residue positions 1 to 85 of SEQ ID NO:2, and/or SEQ ID NO:5. The invention further encompasses nucleic acid sequences which hybridize to the nucleic acid sequence shown in SEQ ID NO:5 under low, medium, high stringency and/or maximum stringency conditions, but which have substantially the same signal activity as the sequence. The present invention encompasses all such polynucleotides.
In further embodiments, the present invention provides polynucleotides that are complementary to the nucleotide sequences described herein. Exemplary complementary nucleotide sequences include those that are provided in SEQ ID NOS:1-5.
Further aspects of the present invention encompass polypeptides having proteolytic activity comprising at least 65% amino acid sequence identity, at least 70% amino acid sequence identity, at least 75% amino acid sequence identity, at least 80% amino acid sequence identity, at least 85% amino acid sequence identity, at least 90% amino acid sequence identity, at least 92% amino acid sequence identity, at least 95% amino acid sequence identity, at least 97% amino acid sequence identity, at least 98% amino acid sequence identity and at least 99% amino acid sequence identity to the amino acid sequence of SEQ ID NO:6 (i.e., the signal and precursor protease), SEQ ID NO:7 (i.e., the precursor protease), and/or of SEQ ID NO:8 (i.e., the mature protease). The proteolytic activity of these polypeptides is determined using methods known in the art, including such methods as those used to assess detergent function. In further embodiments, the polypeptides are isolated. In additional embodiments of the present invention, the polypeptides comprise amino acid sequences that are identical to an amino acid sequence selected from the group consisting of the amino acid sequences of SEQ ID NOS:6, 7, or 8. In some further embodiments, the polypeptides are identical to portions of SEQ ID NOS:6, 7 or 8.
In some embodiments, the present invention provides isolated polypeptides having proteolytic activity, comprising the amino acid sequence approximately 495 amino acids in length, as provided in SEQ ID NO:6. In further embodiments, the present invention encompasses polypeptides having proteolytic activity comprising the amino acid sequence approximately 467 amino acids in length provided in SEQ ID NO:7. In some embodiments, these amino acid sequences comprise a signal sequence (amino acids 1-28 of SEQ ID NO:9); and a precursor protease (amino acids 1-467 of SEQ ID NO:7). In additional embodiments, the present invention encompasses polypeptides comprising an N-terminal prosequence (amino acids 1-170 of SEQ ID NO:7), a mature protease sequence (amino acids 1-189 of SEQ ID NO:8), and a C-terminal prosequence (amino acids 360-467 of SEQ ID NO:7). In still further embodiments, the present invention encompasses polypeptides comprising a precursor protease sequence (e.g., amino acids 1-467 of SEQ ID NO:7). In yet another embodiment, the present invention encompasses polypeptides comprising a mature protease sequence comprising amino acids (e.g., 1-189 of SEQ ID NO:8).
In further embodiments, the present invention provides polypeptides and/or proteases comprising amino acid sequences of the above described sequence derived from bacterial species including, but not limited to Micrococcineae which are identified through amino acid sequence homology studies. In some embodiments, an amino acid residue of a precursor Micrococcineae protease is equivalent to a residue of Cellulomonas strain 69B4, if it is either homologous (i.e., corresponding in position in either primary or tertiary structure) or analogous to a specific residue or portion of that residue in Cellulomonas strain 69B4 protease (i.e., having the same or similar functional capacity to combine, react, or interact chemically).
In some preferred embodiments, in order to establish homology to primary structure, the amino acid sequence of a precursor protease is directly compared to the Cellulomonas strain 69B4 mature protease amino acid sequence and particularly to a set of conserved residues which are discerned to be invariant in all or a large majority of Cellulomonas-like proteases for which sequence is known. After aligning the conserved residues, allowing for necessary insertions and deletions in order to maintain alignment (i.e., avoiding the elimination of conserved residues through arbitrary deletion and insertion), the residues corresponding to particular amino acids in the mature protease (SEQ ID NO:8) and Cellulomonas 69B4 protease are determined. Alignment of conserved residues preferably should conserve 100% of such residues. However, alignment of greater than 75% or as little as 45% of conserved residues is also adequate to define equivalent residues. In either case, conservation of the catalytic triad, His32/Asp56/Ser137 of SEQ ID NO:8, should be maintained.
For example, in some embodiments, the amino acid sequences of proteases from Cellulomonas strain 69B4 and other Micrococcineae spp. described above are aligned to provide the maximum amount of homology between amino acid sequences. A comparison of these sequences indicates that there are a number of conserved residues contained in each sequence. These are the residues that are identified and utilized to establish the equivalent residue positions of amino acids identified in the precursor or mature Micrococcineae protease in question.
These conserved residues are used to ascertain the corresponding amino acid residues of Cellulomonas strain 69B4 protease in one or more Micrococcineae homologues (e.g., Cellulomonas cellasea (DSM 20118) and/or a Cellulomonas homologue herein). These particular amino acid sequences are aligned with the sequence of Cellulomonas 69B4 protease to produce the maximum homology of conserved residues. By this alignment, the sequences and particular residue positions of Cellulomonas 69B4 are observed in comparison with other Cellulomonas spp. Thus, the equivalent amino acid for the catalytic triad (e.g., in Cellulomonas 69B4 protease) is identifiable in the other Micrococcineae spp. In some embodiments of the present invention, the protease homologs comprise the equivalent of His32/Asp56/Ser137 of SEQ ID NO:8.
Another indication that two polypeptides are substantially identical is that the first polypeptide is immunologically cross-reactive with the second polypeptide. Methodologies for determining immunological cross-reactivity are described in the art and are described in the Examples herein. Typically, polypeptides that differ by conservative amino acid substitutions are immunologically cross-reactive. Thus, a polypeptide is substantially identical to a second polypeptide, for example, where the two peptides differ only by a conservative substitution.
The present invention encompasses proteases obtained from various sources. In some preferred embodiments, the proteases are obtained from bacteria, while in other embodiments, the proteases are obtained from fungi.
In some particularly preferred embodiments, the bacterial source is selected from the members of the suborder Micrococcineae. In some embodiments, the bacterial source is the family Promicromonosporaceae. In some preferred embodiments, the Promicromonosporaceae spp. includes and/or is selected from the group consisting of Promicromonospora citrea (DSM 43110), Promicromonospora sukumoe (DSM 44121), Promicromonospora aerolata (CCM 7043), Promicromonospora vindobonensis (CCM 7044), Myceligenerans xiligouense (DSM 15700), Isoptericola variabilis (DSM 10177, basonym Cellulosimicrobium variabile), Cellulosimicrobium cellulans (DSM 20424, basonym Nocardia cellulans, Cellulomonas cellulans), Cellulosimicrobium funkei, Xylanimonas cellulosilytica (LMG 20990), Xylanibacterium ulmi (LMG 21721), and Xylanimicrobium pachnodae (DSM 12657, basonym Promicromonospora pachnodae).
In other particularly preferred embodiments, the bacterial source is the family Cellulomonadaceae. In some preferred embodiments, the Cellulomonadaceae spp. includes and/or is selected from the group of Cellulomonas fimi (ATCC 484, DSM 20113), Cellulomonas biazotea (ATCC 486, DSM 20112), Cellulomonas cellasea (ATCC 487, 21681, DSM 20118), Cellulomonas denverensis, Cellulomonas hominis (DSM 9581), Cellulomonas flavigena (ATCC 482, DSM 20109), Cellulomonas persica (ATCC 700642, DSM 14784), Cellulomonas iranensis (ATCC 700643, DSM 14785), Cellulomonas fermentans (ATCC 43279, DSM 3133), Cellulomonas gelida (ATCC 488, DSM 20111, DSM 20110), Cellulomonas humilata (ATCC 25174, basonym Actinomyces humiferus), Cellulomonas uda (ATCC 491, DSM 20107), Cellulomonas xylanilytica (LMG 21723), Cellulomonas septica, Cellulomonas parahominis, Oerskovia turbata (ATCC 25835, DSM 20577, synonym Cellulomonas turbata), Oerskovia jenensis (DSM 46000), Oerskovia enterophila (ATCC 35307, DSM 43852, basonym Promicromonospora enterophila), Oerskovia paurometabola (DSM 14281), and Cellulomonas strain 69B4 (DSM 16035). In further embodiments, the bacterial source also includes and/or is selected from the group of Thermobifida spp., Rarobacter spp., and/or Lysobacter spp. In yet additional embodiments, the Thermobifida spp. is Thermobifida fusca (basonym Thermomonospora fusca) (tfpA, AAC23545; See, Lao et al., Appl. Environ. Microbiol., 62: 4256-4259 [1996]). In an alternative embodiment, the Rarobacter spp. is Rarobacter faecitabidus (RPI, A45053; See e.g., Shimoi et al., J. Biol. Chem., 267:25189-25195 [1992]). In yet another embodiment, the Lysobacter spp. is Lysobacter enzymogenes.
In further embodiments, the present invention provides polypeptides and/or polynucleotides obtained and/or isolated from fungal sources. In some embodiments, the fungal source includes a Metarhizium spp. In some preferred embodiments, the fungal source is a Metarhizium anisopliae (CHY1 (CAB60729)).
In another embodiment, the present invention provides polypeptides and/or polynucleotides derived from a Cellulomonas strain selected from cluster 2 of the taxonomic classification described in U.S. Pat. No. 5,401,657, herein incorporated by reference. In U.S. Pat. No. 5,401,657, twenty strains of bacteria isolated from in and around alkaline lakes were assigned to the type of bacteria known as Gram-positive bacteria on the basis of: (1) the Dussault modification of the Gram's staining reaction (Dussault, J. Bacteriol., 70:484-485 [1955]); (2) the KOH sensitivity test (Gregersen, Eur. J. Appl. Microbiol. Biotechnol., 5:123-127 [1978]; Halebian et al., J. Clin. Microbiol., 13:444-448 [1981]); and (3) the aminopeptidase reaction (Cerny, Eur. J. Appl. Microbiol., 3:223-225 [1976]; Cerny, Eur. J. Appl. Microbiol., 5:113-122 [1978]). In addition, in most cases, confirmation was also made on the basis of quinone analysis (Collins and Jones, Microbiol. Rev., 45:316-354 [1981]) using the method described by Collins (See, Collins, In Goodfellow and Minnikin (eds), Chemical Methods in Bacterial Systematics. Academic Press, London [1985], pp. 267-288). In addition, strains can be tested for 200 characters and the results analyzed using the principles of numerical taxonomy (See e.g., Sneath and Sokal, Numerical Taxonomy. W.H. Freeman & Co., San Francisco, CA [1973]). Exemplary characters tested, testing methods, and codification methods are also described in U.S. Pat. No. 5,401,657.
As described in U.S. Pat. No. 5,401,657, the phenetic data, consisting of 200 unit characters, were scored and set out in the form of an "n x t" matrix, whose "t" columns represent the "t" bacterial strains to be grouped on the basis of resemblances, and whose "n" rows are the unit characters. Taxonomic resemblance of the bacterial strains was estimated by means of a similarity coefficient (Sneath and Sokal, supra, pp. 114-187). Although many different coefficients have been used for biological classification, only a few have found regular use in bacteriology. Three association coefficients (See e.g., Sneath and Sokal, supra, at p. 129), namely, the Gower, Jaccard and Simple Matching coefficients were applied. These have been frequently applied to the analysis of bacteriological data and are widely accepted by those skilled in the art, as they have been shown to result in robust classifications.
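For two-state (present/absent) characters, the Jaccard and Simple Matching coefficients named above differ only in whether shared negative states count as agreement. The following sketch uses the standard textbook definitions; the function names and the encoding of characters as 0/1 lists are assumptions for illustration and do not reproduce the TAXPAK routines.

```python
def association_counts(x, y):
    """Return (a, mismatches, d) for two equal-length 0/1 character vectors.

    a = characters positive in both strains, d = characters negative in both.
    """
    if len(x) != len(y):
        raise ValueError("character vectors must have the same length")
    a = sum(1 for p, q in zip(x, y) if p and q)
    d = sum(1 for p, q in zip(x, y) if not p and not q)
    return a, len(x) - a - d, d

def simple_matching(x, y):
    """S_SM = (a + d) / n: negative matches count as agreement."""
    a, mismatches, d = association_counts(x, y)
    return (a + d) / (a + mismatches + d)

def jaccard(x, y):
    """S_J = a / (a + mismatches): negative matches are ignored."""
    a, mismatches, _ = association_counts(x, y)
    if a + mismatches == 0:
        raise ValueError("undefined when both strains are negative for every character")
    return a / (a + mismatches)
```

For two strains scored [1, 1, 0, 0, 1] and [1, 0, 0, 1, 1], the Simple Matching coefficient is 0.6 and the Jaccard coefficient is 0.5; the difference comes entirely from the one shared negative character.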
The coded data were analyzed using the TAXPAK program package (Sackin, Meth. Microbiol., 19:459-494 [1987]), run on a DEC VAX computer at the University of Leicester, U.K.
A similarity matrix was constructed for all pairs of strains using the Gower Coefficient (SG) with the option of permitting negative matches (See, Sneath and Sokal, supra, at pp. 135-136), using the RTBNSIM program in TAXPAK. As the primary instrument of analysis and the one upon which most of the taxonomic data presented herein are based, the Gower Coefficient was chosen over other coefficients for generating similarity matrices because it is applicable to all types of characters or data, namely, two-state, multistate (ordered and qualitative), and quantitative.
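The reason the Gower Coefficient accommodates mixed character types can be seen in a small sketch. This follows the general textbook form of the coefficient (a mean of per-character scores) with negative matches permitted; it is not the RTBNSIM code. The character-type labels, the treatment of ordered multistate characters as quantitative, and the use of None for missing data are all assumptions made for illustration.

```python
def gower_similarity(x, y, kinds, ranges):
    """Gower similarity between two strains scored on mixed character types.

    kinds[k] is "binary", "qualitative" or "quantitative" (hypothetical labels).
    ranges[k] is the observed range of character k over all strains; it is only
    used for quantitative characters.
    """
    total = 0.0
    compared = 0
    for xi, yi, kind, rng in zip(x, y, kinds, ranges):
        if xi is None or yi is None:
            continue  # missing data: character is not compared
        if kind == "quantitative":
            # Score falls linearly from 1 (identical) to 0 (opposite ends of range).
            score = 1.0 - abs(xi - yi) / rng if rng else 1.0
        else:
            # Two-state and qualitative multistate: 1 on agreement, else 0.
            # Shared negative states (0, 0) count as agreement here.
            score = 1.0 if xi == yi else 0.0
        total += score
        compared += 1
    if compared == 0:
        raise ValueError("no comparable characters")
    return total / compared
```

With only two-state characters and negative matches permitted, this reduces to the Simple Matching Coefficient.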
Cluster analysis of the similarity matrix was accomplished using the Unweighted Pair Group Method with Arithmetic Averages (UPGMA) algorithm, also known as the Unweighted Average Linkage procedure, by running the SMATCLST sub-routine in TAXPAK.
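A minimal sketch of UPGMA clustering is given below for illustration; it is not the SMATCLST routine. It assumes the similarity matrix has first been converted to distances (for example, distance = 1 - similarity) and stored in a dictionary keyed by frozenset pairs, which is a hypothetical data layout.

```python
from itertools import combinations

def upgma(labels, dist):
    """Cluster labels by UPGMA; return a list of (cluster_a, cluster_b, distance) merges.

    dist maps frozenset({a, b}) to the distance between original labels a and b.
    """
    sizes = {label: 1 for label in labels}  # active cluster -> number of strains
    d = {frozenset(p): dist[frozenset(p)] for p in combinations(labels, 2)}
    merges = []
    while len(sizes) > 1:
        # Join the closest pair of active clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))]
        new = (a, b)
        for c in sizes:
            if c in (a, b):
                continue
            # Size-weighted mean of cluster distances, which equals the plain
            # average over all pairs of original strains (hence "unweighted").
            d[frozenset((new, c))] = (
                sizes[a] * d[frozenset((a, c))] + sizes[b] * d[frozenset((b, c))]
            ) / (sizes[a] + sizes[b])
        sizes[new] = sizes.pop(a) + sizes.pop(b)
        merges.append((a, b, height))
    return merges
```

The sequence of merge heights is what a dendrogram displays; cutting it at a chosen similarity level yields the clusters, or phenons.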
Dendrograms illustrate the levels of similarity between bacterial strains. In some embodiments, dendrograms are obtained by using the DENDGR program in TAXPAK. The phenetic data were re-analyzed using the Jaccard Coefficient (Sj) (Sneath and Sokal, supra, at p. 131) and Simple Matching Coefficient (SSM) (Sneath, P.H.A. and Sokal, R.R., ibid, p. 132) by running the RTBNSIM program in TAXPAK. An additional two dendrograms were obtained by using the SMATCLST with UPGMA option and DENDGR sub-routines in TAXPAK.
Using the SG/UPGMA method, six natural clusters or phenons of alkalophilic bacteria were generated at the 79% similarity level. These six clusters included 15 of the 20 alkalophilic bacteria isolated from alkaline lakes. Although the choice of 79% for the level of delineation was arbitrary, it was in keeping with current practices in numerical taxonomy (See e.g., Austin and Priest, Modern Bacterial Taxonomy. Van Nostrand Reinhold, Wokingham, U.K., [1986], p. 37). Placing the delineation at a lower percentage would combine groups of clearly unrelated organisms whose definition is not supported by the data. At the 79% level, 3 of the clusters exclusively contain novel alkalophilic bacteria representing 13 of the newly isolated strains (potentially representing new taxa). Protease 69B4 was classified as in cluster 2 by this method.
The significance of the clustering at this level was supported by the results of the TESTDEN program. This program tests the significance of all dichotomous pairs of clusters (comprising 4 or more strains) in a UPGMA-generated dendrogram with Squared Euclidean distances, or their complement as a measurement and assuming that the clusters are hyperspherical. The critical overlap was set at 0.25%. The separation of the clusters is highly significant.
The Sj coefficient is a useful adjunct to the SG coefficient, as it can be used to detect phenons in the latter that are based on negative matches or distortions owing to undue weight being put on potentially subjective qualitative data. Consequently, the Sj coefficient is useful for confirming the validity of clusters defined initially by the use of the SG coefficient. The Jaccard Coefficient is particularly useful in comparing biochemically unreactive organisms (Austin and Priest, supra, at p. 37). In addition, there may be some question about the admissibility of matching negative character states (See, Sneath and Sokal, supra, at p. 131), in which case the Simple Matching Coefficient is a widely applied alternative. Strain 69B4 was classified as in cluster 2 by this method.
In the main, all of the clusters (especially the clusters of the new bacteria) generated by the SG/UPGMA method were recovered in the dendrograms produced by the Sj/UPGMA method (cophenetic correlation, 0.795), and the SSM/UPGMA method (cophenetic correlation, 0.814). The main effect of these transformations was to gather all the Bacillus strains in a single large cluster which further serves to emphasize the separation between the alkalophilic Bacillus species and the new alkalophilic bacteria, and the uniqueness of the latter. Based on these methodologies, 69B4 is considered to be a cluster 2 bacterium.
In other aspects of the present invention, the polynucleotide is derived from a bacterium having a 16S rRNA gene nucleotide sequence with at least 70%, 75%, 80%, 85%, 88%, 90%, 92%, 95%, or 98% sequence identity with the 16S rRNA gene nucleotide sequence of Cellulomonas strain 69B4. The sequence of the 16S rRNA gene is deposited at GenBank under Accession Number X92152.
Figure 1 provides an unrooted phylogenetic tree illustrating the relationship of novel strain 69B4 to members of the family Cellulomonadaceae (including Cellulomonas strain 69B4) and other related genera of the suborder Micrococcineae. The dendrogram was constructed from aligned 16S rDNA sequences (1374 nt) using TREECONW v.1.3b (Van de Peer and De Wachter, Comput. Appl. Biosci., 10: 569-570 [1994]). Distance estimations were calculated using the substitution rate calibration of Jukes and Cantor (Jukes and Cantor, "Evolution of protein molecules," In, Munro (ed.), Mammalian Protein Metabolism, Academic Press, NY, at pp.21-132, [1969]) and tree topology inferred by the Neighbor-Joining algorithm (Saitou and Nei, Mol. Biol. Evol., 4:406-425 [1987]). The numbers at the nodes refer to bootstrap values from 100 resampled data sets (Felsenstein, Evol., 39:783-789 [1985]) and the scale bar indicates 2 nucleotide substitutions in 100 nt.
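The Jukes and Cantor correction used for the distance estimates converts the observed proportion of differing sites, p, into an estimated number of substitutions per site, d = -(3/4) ln(1 - (4/3)p). The sketch below illustrates that formula only; it is not the TREECONW implementation, and the function name and the rule of skipping non-ACGT columns are assumptions.

```python
import math

def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor distance between two aligned, upper-case nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    # Compare only columns where both sequences have an unambiguous base.
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(1 for x, y in pairs if x != y) / len(pairs)
    if p >= 0.75:
        raise ValueError("distance is undefined for p >= 0.75")
    # Corrects for multiple substitutions at the same site.
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)
```

On this scale, the bar in Figure 1 (2 substitutions in 100 nt) corresponds to a distance of 0.02, and an observed 10% difference corresponds to a corrected distance of about 0.107.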
Strain 69B4 exhibits the closest 16S rDNA relationship to members of Cellulomonas and Oerskovia of the family Cellulomonadaceae. The closest relatives are believed to be C. cellasea (DSM 20118) and C. fimi (DSM 20113), each having at least 95% sequence identity with the 16S rRNA gene nucleotide sequence of Cellulomonas strain 69B4 (e.g., 96% and 95% identity, respectively).
In some preferred embodiments of the present invention, the Cellulomonas spp. is Cellulomonas strain 69B4 (DSM 16035). This strain was originally isolated from a sample of sediment and water from the littoral zone of Lake Bogoria, Kenya at Acacia Camp (Lat. 0° 12'N, Long. 36° 07'E) collected on 10 October 1988. The water temperature was 33°C, pH 10.5 with a conductivity of 44 mS/cm. Cellulomonas strain 69B4 was determined to have the phenotypic characteristics described below. Fresh cultures were Gram-positive, slender, generally straight, rod-shaped bacteria, approximately 0.5-0.7 µm x 1.8-4 µm. Older cultures contained mainly short rods and coccoid cells. Cells occasionally occurred in pairs or as V-forms, but primary branching was not observed. Endospores were not detected. On alkaline GAM agar the strain forms opaque, glistening, pale-yellow coloured, circular and convex or domed colonies, with entire margins, about 2 mm in diameter after 2-3 days incubation at 37°C. The colonies were viscous or slimy with a tendency to clump when scraped with a loop. On neutral Tryptone Soya Agar, strain growth was less vigorous, giving translucent yellow colonies, generally
However, growth under anaerobic conditions was markedly reduced compared to aerobic growth. The strain also appeared to be negative in standard oxidase, urease, aminopeptidase, and KOH tests. In addition, nitrate was not reduced, although the organisms were catalase positive and DNase was produced under alkaline conditions. The preferred temperature range for growth was 20 - 37°C, with an optimum temperature at around 30-37°C. No growth was observed at 15°C or 45°C.
The strain is alkalophilic and slightly halophilic. The strain may also be characterized as having growth occurring at pH values between 6.0 and 10.5 with an optimum around pH 9-10. No growth was observed at pH 11 or pH 5.5. Growth below pH 7 was less vigorous and abundant than that of cultures grown at the optimal temperature. The strain was observed to grow in medium containing 0-8% (w/v) NaCl. Furthermore, the strain may also be characterized as a chemo-organotroph, since it grew on complex substrates such as yeast extract and peptone; and hydrolyzed starch, gelatin, casein, carboxymethylcellulose and amorphous cellulose.
The strain was observed to have metabolism that was respiratory and also fermentative. Acid was produced both aerobically and anaerobically from (API 50CH): L-arabinose, D-xylose, D-glucose, D-fructose, D-mannose, rhamnose (weak), cellobiose, maltose, sucrose, trehalose, gentiobiose, D-turanose, D-lyxose and 5-keto-gluconate (weak). Amygdalin, arbutin, salicin and esculin are also utilized. The strain was unable to utilize: ribose, lactose, galactose, melibiose, D-raffinose, glycogen, glycerol, erythritol, inositol, mannitol, sorbitol, xylitol, arabitol, gluconate and lactate.
The strain was determined to be susceptible to ampicillin, chloramphenicol, erythromycin, fusidic acid, methicillin, novobiocin, streptomycin, tetracycline, sulphafurazole, oleandomycin, polymyxin, rifampicin, vancomycin and bacitracin; but resistant to gentamicin, nitrofurantoin, nalidixic acid, sulphamethoxazole, trimethoprim, penicillin G, neomycin and kanamycin.
The following enzymes, aside from the protease of the present invention, were observed to be produced (ApiZym, API Coryne): C4-esterase, C8-esterase/lipase, leucine arylamidase, alpha-chymotrypsin, alpha-glucosidase, beta-glucosidase and pyrazinamidase.
The strain was observed to exhibit the following chemotaxonomic characteristics. Major fatty acids (>10% of total) were C16:1 (28.1%), C18:0 (31.1%), C18:1 (13.9%). N-saturated (79.1%), n-unsaturated (19.9%). Fatty acids with even numbers of carbons accounted for 98%. Main polar lipid components: phosphatidylglycerol (PG) and 3 unidentified glycolipids (alpha-naphthol positive) were present; DPG, PGP, PI and PE were not detected. Menaquinones MK-4, MK-6, MK-7 and MK-9 were the main isoprenoids present. The cell wall peptidoglycan type was A4β with L-ornithine as diamino acid and D-aspartic acid in the interpeptide bridge. With regard to toxicity evaluation, there are no known toxicity or pathogenicity issues associated with bacteria of the genus Cellulomonas.
Although there may be variations in the sequence of a naturally occurring enzyme within a given species of organism, enzymes of a specific type produced by organisms of the same species generally are substantially identical with respect to substrate specificity and/or proteolytic activity levels under given conditions (e.g., temperature, pH, water hardness, oxidative conditions, chelating conditions, and concentration), etc. Thus, for the purposes of the present invention, it is contemplated that other strains and species of Cellulomonas also produce the Cellulomonas protease of the present invention and thus provide useful sources for the proteases of the present invention. Indeed, as presented herein, it is contemplated that other members of the Micrococcineae will find use in the present invention.
In some embodiments, the proteolytic polypeptides of this invention are characterized physicochemically, while in other embodiments, they are characterized based on their functionality, while in further embodiments, they are characterized using both sets of properties. Physicochemical characterization takes advantage of well known techniques such as SDS electrophoresis, gel filtration, amino acid composition, mass spectrometry (e.g., MALDI-TOF-MS, LC-ES-MS/MS, etc.), and sedimentation to determine the molecular weight of proteins, isoelectric focusing to determine the pI of proteins, amino acid sequencing to determine the amino acid sequences of proteins, crystallography studies to determine the tertiary structures of proteins, and antibody binding to determine antigenic epitopes present in proteins.
In some embodiments, functional characteristics are determined by techniques well known to the practitioner in the protease field and include, but are not limited to, hydrolysis of various commercial substrates, such as di-methyl casein ("DMC") and/or AAPF-pNA. This preferred technique for functional characterization is described in greater detail in the Examples provided herein.
In some embodiments of the present invention, the protease has a molecular weight of about 17kD to about 21kD, for example about 18kD to 19kD, for example 18700 daltons to 18800 daltons, for example about 18764 daltons (as determined by MALDI-TOF-MS). In another aspect of the present invention, the protease has the measured MALDI-TOF-MS spectrum set forth in Figure 3.
The mature protease also displays proteolytic activity (e.g., hydrolytic activity on a substrate having peptide linkages, such as DMC). In further embodiments, proteases of the present invention provide enhanced wash performance under identified conditions. Although the present invention encompasses the protease 69B4 as described herein, in some embodiments, the proteases of the present invention exhibit at least 50%, 60%, 70%, 75%, 80%, 85%, 90%, 92%, 95%, 96%, 97%, 98% or 99% proteolytic activity as compared to the proteolytic activity of 69B4. In some embodiments, the proteases display at least 50%, 60%, 70%, 75%, 80%, 85%, 90%, 92%, 95%, 96%, 97%, 98% or 99% proteolytic activity as compared to the proteolytic activity of proteases sold under the tradenames SAVINASE® (Novozymes) or PURAFECT® (Genencor) under the same conditions. In some embodiments, the proteases of the present invention display comparative or enhanced wash performance under identified conditions as compared to 69B4 under the same conditions. In some preferred embodiments, the proteases of the present invention display comparative or enhanced wash performance under identified conditions, as compared to proteases sold under the tradenames SAVINASE® (Novozymes) or PURAFECT® (Genencor) under the same conditions.
In yet further embodiments, the proteases and/or polynucleotides encoding the proteases of the present invention are provided in purified form (i.e., present in a particular composition in a higher or lower concentration than exists in a naturally occurring or wild type organism), or in combination with components not normally present upon expression from a naturally occurring or wild-type organism. However, it is not intended that the present invention be limited to proteases of any specific purity level, as ranges of protease purity find use in various applications in which the proteases of the present invention are suitable.
III. Obtaining Polynucleotides Encoding Micrococcineae (e.g., Cellulomonas) Proteases of the Present Invention
In some embodiments, nucleic acid encoding a protease of the present invention is obtained by standard procedures known in the art from, for example, cloned DNA (e.g., a DNA "library"), chemical synthesis, cDNA cloning, PCR, cloning of genomic DNA or fragments thereof, or purified from a desired cell, such as a bacterial or fungal species (See, for example, Sambrook et al., supra [1989]; and Glover and Hames (eds.), DNA Cloning: A Practical Approach, Vols 1 and 2, Second Edition). Synthesis of polynucleotide sequences is well known in the art (See e.g., Beaucage and Caruthers, Tetrahedron Lett., 22:1859-1862 [1981]), including the use of automated synthesizers (See e.g., Needham-VanDevanter et al., Nucl. Acids Res., 12:6159-6168 [1984]). DNA sequences can also be custom made and ordered from a variety of commercial sources. As described in greater detail herein, in some embodiments, nucleic acid sequences derived from genomic DNA contain regulatory regions in addition to coding regions.
In some embodiments involving the molecular cloning of the gene from genomic DNA, DNA fragments are generated, some of which comprise at least a portion of the desired gene. In some embodiments, the DNA is cleaved at specific sites using various restriction enzymes. In some alternative embodiments, DNase is used in the presence of manganese to fragment the DNA, or the DNA is physically sheared (e.g., by sonication). The linear DNA fragments created are then separated according to size and amplified by standard techniques, including but not limited to, agarose and polyacrylamide gel electrophoresis, PCR and column chromatography.
Once nucleic acid fragments are generated, identification of the specific DNA fragment encoding a protease may be accomplished in a number of ways. For example, in some embodiments, a proteolytic hydrolyzing enzyme encoding the asp gene or its specific RNA, or a fragment thereof, such as a probe or primer, is isolated, labeled, and then used in hybridization assays well known to those in the art, to detect a generated gene (See e.g., Benton and Davis, Science 196:180 [1977]; and Grunstein and Hogness, Proc. Natl. Acad. Sci. USA 72:3961 [1975]). In preferred embodiments, DNA fragments sharing substantial sequence similarity to the probe hybridize under medium to high stringency.
In some preferred embodiments, amplification is accomplished using PCR, as known in the art. In some preferred embodiments, nucleic acid sequences of at least about 4 nucleotides and as many as about 60 nucleotides from SEQ ID NOS:1, 2, 3 and/or 4 (i.e., fragments), preferably about 12 to 30 nucleotides, and more preferably about 25 nucleotides, are used in any suitable combination as PCR primers. These same fragments also find use as probes in hybridization and product detection methods.
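As a non-limiting illustration, candidate primer or probe fragments of a chosen length (e.g., about 25 nucleotides) can be enumerated from a template sequence as sketched below. The function names are hypothetical, the template must be supplied by the user (none of SEQ ID NOS:1-4 is reproduced here), and the melting temperature uses the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C), which is only a rough estimate and is best suited to short oligonucleotides.

```python
def wallace_tm(oligo: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def candidate_primers(template: str, length: int = 25):
    """List every window of `length` nt as (start, fragment, GC fraction, Tm)."""
    template = template.upper()
    if not 1 <= length <= len(template):
        raise ValueError("length must be between 1 and the template length")
    results = []
    for start in range(len(template) - length + 1):
        fragment = template[start:start + length]
        gc_fraction = (fragment.count("G") + fragment.count("C")) / length
        results.append((start, fragment, gc_fraction, wallace_tm(fragment)))
    return results


if __name__ == "__main__":
    # Hypothetical 30-nt template used only to exercise the functions.
    demo = "ATGACCGGTACCGATCGGCTAGCTAGGCTA"
    for start, fragment, gc, tm in candidate_primers(demo, length=25):
        print(start, fragment, round(gc, 2), tm)
```

In practice, the enumerated fragments would be filtered further (e.g., for GC content and self-complementarity) before use as primers or probes.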
In some embodiments, isolation of nucleic acid constructs of the invention from a cDNA or genomic library utilizes PCR using degenerate oligonucleotide primers prepared on the basis of the amino acid sequence of the protein having the amino acid sequence as shown in SEQ ID NOS:1-5. The primers can be of any segment length, for example at least 4, at least 5, at least 8, at least 15, or at least 20 nucleotides in length. Exemplary probes in the present application utilized a primer comprising a TTGWHCGT and a GDSGG polynucleotide sequence as more fully described in the Examples.
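To illustrate why primers prepared from an amino acid sequence are degenerate, the following minimal sketch counts the number of distinct nucleotide sequences that can encode a short peptide under the standard genetic code. It assumes, for the purposes of the example only, that TTGWHCGT and GDSGG are read as one-letter amino acid motifs; the function name is hypothetical.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def backtranslation_degeneracy(motif: str) -> int:
    """Number of distinct coding sequences for a peptide motif (one-letter code)."""
    return prod(CODON_COUNTS[aa] for aa in motif.upper())


if __name__ == "__main__":
    for motif in ("TTGWHCGT", "GDSGG"):
        # A full-length primer over the motif spans 3 nt per residue.
        print(motif, 3 * len(motif), "nt,", backtranslation_degeneracy(motif), "sequences")
```

Under this reading, GDSGG back-translates to 768 possible 15-nucleotide sequences and TTGWHCGT to 4096 possible 24-nucleotide sequences, which is why a mixed (degenerate) primer pool is used rather than a single oligonucleotide.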
In view of the above, it will be appreciated that the polynucleotide sequences provided herein and based on the polynucleotide sequences provided in SEQ ID NOS:1-5 are useful for obtaining identical or homologous fragments of polynucleotides from other species, and particularly from bacteria that encode enzymes having the serine protease activity expressed by protease 69B4.
IV. Expression and Recovery of Serine Proteases of the Present Invention
Any suitable means for expression and recovery of the serine proteases of the present invention finds use herein. Indeed, those of skill in the art know many methods suitable for cloning a Cellulomonas-derived polypeptide having proteolytic activity, as well as an additional enzyme (e.g., a second peptide having proteolytic activity, such as a protease, cellulase, mannanase, or amylase, etc.). Numerous methods are also known in the art for introducing at least one (e.g., multiple) copies of the polynucleotide(s) encoding the enzyme(s) of the present invention in conjunction with any additional sequences desired, into the genes or genome of host cells.
In general, standard procedures for cloning of genes and introducing exogenous protease-encoding regions (including multiple copies of the exogenous encoding regions) into said genes find use in obtaining a Cellulomonas 69B4 protease derivative or homologue thereof. Indeed, the present Specification, including the Examples, provides such teaching. However, additional methods known in the art are also suitable (See e.g., Sambrook et al., supra [1989]; Ausubel et al., supra [1995]; Harwood and Cutting (eds.), Molecular Biological Methods for Bacillus, John Wiley and Sons [1990]; and WO 96/34946).
In some preferred embodiments, the polynucleotide sequences of the present invention are expressed by operatively linking them to an expression control sequence in an appropriate expression vector and employing that expression vector to transform an appropriate host according to techniques well established in the art. In some embodiments, the polypeptides produced on expression of the DNA sequences of this invention are isolated from the fermentation of cell cultures and purified in a variety of ways according to well established techniques in the art. Those of skill in the art are capable of selecting the most appropriate isolation and purification techniques.
More particularly, the present invention provides constructs, vectors comprising polynucleotides described herein, host cells transformed with such vectors, proteases expressed by such host cells, expression methods and systems for the production of serine protease enzymes derived from microorganisms, in particular, members of the Micrococcineae, including but not limited to Cellulomonas species. In some embodiments, the polynucleotide(s) encoding serine protease(s) are used to produce recombinant host cells suitable for the expression of the serine protease(s). In some preferred embodiments, the expression hosts are capable of producing the protease(s) in commercially viable quantities.
IV. Recombinant Vectors
As indicated above, in some embodiments, the present invention provides vectors comprising the aforementioned polynucleotides. In some embodiments, the vectors (i.e., constructs) of the invention encoding the protease are of genomic origin (e.g., prepared through use of a genomic library and screening for DNA sequences coding for all or part of the protease by hybridization using synthetic oligonucleotide probes in accordance with standard techniques). In some preferred embodiments, the DNA sequence encoding the protease is obtained by isolating chromosomal DNA from the Cellulomonas strain 69B4 and amplifying the sequence by PCR methodology (See, the Examples).
In alternative embodiments, the nucleic acid construct of the invention encoding the protease is prepared synthetically by established standard methods (See e.g., Beaucage and Caruthers, Tetrahedron Lett., 22:1859-1862 [1981]; and Matthes et al., EMBO J., 3:801-805 [1984]). According to the phosphoramidite method, oligonucleotides are synthesized (e.g., in an automatic DNA synthesizer), purified, annealed, ligated and cloned in suitable vectors.
In additional embodiments, the nucleic acid construct is of mixed synthetic and genomic origin. In some embodiments, the construct is prepared by ligating fragments of synthetic or genomic DNA (as appropriate), wherein the fragments correspond to various parts of the entire nucleic acid construct, in accordance with standard techniques.
In further embodiments, the present invention provides vectors comprising at least one DNA construct of the present invention. In some embodiments, the present invention encompasses recombinant vectors. It is contemplated that any suitable vector will find use in the present invention, including autonomously replicating vectors as well as vectors that integrate (either transiently or stably) within the host cell genome. Indeed, a wide variety of vectors and expression cassettes suitable for the cloning, transformation and expression in fungal (mold and yeast), bacterial, insect and plant cells are known to those of skill in the art. Typically, the vector or cassette contains sequences directing transcription and translation of the nucleic acid, a selectable marker, and sequences allowing autonomous replication or chromosomal integration. In some embodiments, suitable vectors comprise a region 5' of the gene which harbors transcriptional initiation controls and a region 3' of the DNA fragment which controls transcriptional termination. These control regions may be derived from genes homologous or heterologous to the host as long as the control region selected is able to function in the host cell.

The vector is preferably an expression vector in which the DNA sequence encoding the protease of the invention is operably linked to additional segments required for transcription of the DNA. In some preferred embodiments, the expression vector is derived from plasmid or viral DNA, or in alternative embodiments, contains elements of both. Exemplary vectors include, but are not limited to pSEGCT, pSEACT, and/or pSEA4CT, as well as all of the vectors described in the Examples herein. Construction of such vectors is described herein, and methods are well known in the art (See e.g., U.S. Pat. No. 6,287,839; and WO 02/50245). In some preferred embodiments, the vector pSEGCT (about 8302 bp; See, Figure 5) finds use in the construction of a vector comprising the polynucleotides described herein (e.g., pSEG69B4T; See, Figure 6). In alternative preferred embodiments, the vector pSEA469B4CT (See, Figure 7) finds use in the construction of a vector comprising the polynucleotides described herein. Indeed, it is intended that all of the vectors described herein will find use in the present invention.
In some embodiments, the additional segments required for transcription include regulatory segments (e.g., promoters, secretory segments, inhibitors, global regulators, etc.), as known in the art. One example includes any DNA sequence that shows transcriptional activity in the host cell of choice and is derived from genes encoding proteins either homologous or heterologous to the host cell. Specifically, examples of suitable promoters for use in bacterial host cells include but are not limited to the promoter of the Bacillus stearothermophilus maltogenic amylase gene, the Bacillus amyloliquefaciens (BAN) amylase gene, the Bacillus subtilis alkaline protease gene, the Bacillus clausii alkaline protease gene, the Bacillus pumilus xylosidase gene, the Bacillus thuringiensis cryIIIA, and the Bacillus licheniformis alpha-amylase gene. Additional promoters include the A4 promoter, as described herein. Other promoters that find use in the present invention include, but are not limited to phage Lambda PR or PL promoters, as well as the E. coli lac, trp or tac promoters.
In some embodiments, the promoter is derived from a gene encoding said protease or a fragment thereof having substantially the same promoter activity as said sequence. The invention further encompasses nucleic acid sequences which hybridize to the promoter sequences under intermediate, high, and/or maximum stringency conditions, or which have at least about 90% homology and preferably about 95% homology to such promoter, but which have substantially the same promoter activity. In some embodiments, this promoter is used to promote the expression of either the protease and/or a heterologous DNA sequence (e.g., another enzyme in addition to the protease of the present invention). In additional embodiments, the vector also comprises at least one selectable marker.

In some embodiments, the recombinant vectors of the invention further comprise a DNA sequence enabling the vector to replicate in the host cell. In some preferred embodiments involving bacterial host cells, these sequences comprise all the sequences needed to allow plasmid replication (e.g., ori and/or rep sequences).
In some particularly preferred embodiments, signal sequences (e.g., leader sequence or pre sequence) are also included in the vector, in order to direct a polypeptide of the present invention into the secretory pathway of the host cells. In some more preferred embodiments, a secretory signal sequence is joined to the DNA sequence encoding the precursor protease in the correct reading frame (See e.g., SEQ ID NOS:1 and 2). Depending on whether the protease is to be expressed intracellularly or is secreted, a polynucleotide sequence or expression vector of the invention is engineered with or without a natural polypeptide signal sequence or a signal sequence which functions in bacteria (e.g., Bacillus sp.), fungi (e.g., Trichoderma), other prokaryotes or eukaryotes. In some embodiments, expression is achieved by either removing or partially removing the signal sequence.
In some embodiments involving secretion from bacterial cells, the signal peptide is a naturally occurring signal peptide, or a functional part thereof, while in other embodiments, it is a synthetic peptide. Suitable signal peptides include but are not limited to sequences derived from Bacillus licheniformis alpha-amylase, Bacillus clausii alkaline protease, and Bacillus amyloliquefaciens amylase. One preferred signal sequence is the signal peptide derived from Cellulomonas strain 69B4, as described herein. Thus, in some particularly preferred embodiments, the signal peptide comprises the signal peptide from the protease described herein. This signal finds use in facilitating the secretion of the 69B4 protease and/or a heterologous DNA sequence (e.g., a second protease, such as another wild-type protease, a BPN' variant protease, a GG36 variant protease, a lipase, a cellulase, a mannanase, etc.). In some embodiments, these second enzymes are encoded by the DNA sequence and/or the amino acid sequences known in the art (See e.g., U.S. Pat. Nos. 6,465,235, 6,287,839, 5,965,384, and 5,795,764; as well as WO 98/22500, WO 92/05249, EP 0305216B1, and WO 94/25576). Furthermore, it is contemplated that in some embodiments, the signal sequence peptide is also operatively linked to an endogenous sequence to activate and secrete such endogenous encoded protease.
The procedures used to ligate the DNA sequences coding for the present protease, the promoter and/or secretory signal sequence, respectively, and to insert them into suitable vectors containing the information necessary for replication, are well known to those skilled in the art. As indicated above, in some embodiments, the nucleic acid construct is prepared using PCR with specific primers.
V. Host Cells
As indicated above, in some embodiments, the present invention also provides host cells transformed with the vectors described above. In some embodiments, the polynucleotide encoding the protease(s) of the present invention that is introduced into the host cell is homologous, while in other embodiments, the polynucleotide is heterologous to the host. In some embodiments in which the polynucleotide is homologous to the host cell (e.g., additional copies of the native protease produced by the host cell are introduced), it is operably connected to another homologous or heterologous promoter sequence. In alternative embodiments, another secretory signal sequence and/or terminator sequence finds use in the present invention. Thus, in some embodiments, the polypeptide DNA sequence comprises multiple copies of a homologous polypeptide sequence, a heterologous polypeptide sequence from another organism, or synthetic polypeptide sequence(s). Indeed, it is not intended that the present invention be limited to any particular host cells and/or vectors.
Indeed, the host cell into which the DNA construct of the present invention is introduced may be any cell which is capable of producing the present alkaline protease, including, but not limited to bacteria, fungi, and higher eukaryotic cells.
Examples of bacterial host cells which find use in the present invention include, but are not limited to Gram-positive bacteria such as Bacillus, Streptomyces, and Thermobifida, for example strains of B. subtilis, B. licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. clausii, B. amyloliquefaciens, B. coagulans, B. circulans, B. lautus, B. megaterium, B. thuringiensis, S. griseus, S. lividans, S. coelicolor, S. avermitilis and T. fusca; as well as Gram-negative bacteria such as members of the Enterobacteriaceae (e.g., Escherichia coli). In some particularly preferred embodiments, the host cells are B. subtilis, B. clausii, and/or B. licheniformis. In additional preferred embodiments, the host cells are strains of S. lividans (e.g., TK23 and/or TK21). Any suitable method for transformation of the bacteria finds use in the present invention, including but not limited to protoplast transformation, use of competent cells, etc., as known in the art. In some preferred embodiments, the method provided in U.S. Pat. No. 5,264,366 (incorporated by reference herein) finds use in the present invention. For S. lividans, one preferred means for transformation and protein expression is that described by Fernandez-Abalos et al. (See, Fernandez-Abalos et al., Microbiol., 149:1623-1632 [2003]; See also, Hopwood et al., Genetic Manipulation of Streptomyces: Laboratory Manual, Innes [1985], both of which are incorporated by reference herein). Of course, the methods described in the Example herein find use in the present invention.
Examples of fungal host cells which find use in the present invention include, but are not limited to Trichoderma spp. and Aspergillus spp. In some particularly preferred embodiments, the host cells are Trichoderma reesei and/or Aspergillus niger. In some embodiments, transformation and expression in Aspergillus is performed as described in U.S. Pat. 5,364,770, herein incorporated by reference. Of course, the methods described in the Example herein find use in the present invention.
In some embodiments, particular promoter and signal sequences are needed to provide effective transformation and expression of the protease(s) of the present invention. Thus, in some preferred embodiments involving the use of Bacillus host cells, the aprE promoter is used in combination with known Bacillus-derived signal and other regulatory sequences. In some preferred embodiments involving expression in Aspergillus, the glaA promoter is used. In some embodiments involving Streptomyces host cells, the glucose isomerase (GI) promoter of Actinoplanes missouriensis is used, while in other embodiments, the A4 promoter is used.
In some embodiments involving expression in bacteria such as E. coli, the protease is retained in the cytoplasm, typically as insoluble granules (i.e., inclusion bodies). However, in other embodiments, the protease is directed to the periplasmic space by a bacterial secretion sequence. In the former case, the cells are lysed, and the granules are recovered and denatured after which the protease is refolded by diluting the denaturing agent. In the latter case, the protease is recovered from the periplasmic space by disrupting the cells (e.g., by sonication or osmotic shock), to release the contents of the periplasmic space and recovering the protease.
In preferred embodiments, the transformed host cells of the present invention are cultured in a suitable nutrient medium under conditions permitting the expression of the present protease, after which the resulting protease is recovered from the culture. The medium used to culture the cells comprises any conventional medium suitable for growing the host cells, such as minimal or complex media containing appropriate supplements. Suitable media are available from commercial suppliers or may be prepared according to published recipes (e.g., in catalogues of the American Type Culture Collection). In some embodiments, the protease produced by the cells is recovered from the culture medium by conventional procedures, including, but not limited to separating the host cells from the medium by centrifugation or filtration, precipitating the proteinaceous components of the supernatant or filtrate by means of a salt (e.g., ammonium sulfate), and chromatographic purification (e.g., ion exchange, gel filtration, affinity, etc.). Thus, any method suitable for recovering the protease(s) of the present invention will find use. Indeed, it is not intended that the present invention be limited to any particular purification method.
VI. Applications for Serine Protease Enzymes
As described in greater detail herein, the proteases of the present invention have important characteristics that make them very suitable for certain applications. For example, the proteases of the present invention have enhanced thermal stability, enhanced oxidative stability, and enhanced chelator stability, as compared to some currently used proteases.
Thus, these proteases find use in cleaning compositions. Indeed, under certain wash conditions, the present proteases exhibit comparative or enhanced wash performance as compared with currently used subtilisin proteases. Thus, it is contemplated that the cleaning and/or enzyme compositions of the present invention will be provided in a variety of cleaning compositions. In some embodiments, the proteases of the present invention are utilized in the same manner as subtilisin proteases (i.e., proteases currently in use). Thus, the present proteases find use in various cleaning compositions, as well as animal feed applications, leather processing (e.g., bating), protein hydrolysis, and in textile uses. The identified proteases also find use in personal care applications.
Thus, the proteases of the present invention find use in a number of industrial applications, in particular within the cleaning, disinfecting, animal feed, and textile/leather industries. In some embodiments, the protease(s) of the present invention are combined with detergents, builders, bleaching agents and other conventional ingredients to produce a variety of novel cleaning compositions useful in the laundry and other cleaning arts such as, for example, laundry detergents (both powdered and liquid), laundry pre-soaks, all fabric bleaches, automatic dishwashing detergents (both liquid and powdered), household cleaners, particularly bar and liquid soap applications, and drain openers. In addition, the proteases find use in the cleaning of contact lenses, as well as other items, by contacting such materials with an aqueous solution of the cleaning composition. In addition, these naturally occurring proteases can be used, for example, in peptide hydrolysis, waste treatment, textile applications, medical device cleaning, biofilm removal and as fusion-cleavage enzymes in protein production, etc. The composition of these products is not critical to the present invention, as long as the protease(s) maintain their function in the setting used. In some embodiments, the compositions are readily prepared by combining a cleaning effective amount of the protease or an enzyme composition comprising the protease enzyme preparation with the conventional components of such compositions in their art recognized amounts.
A. Cleaning Compositions
The cleaning composition of the present invention may be advantageously employed for example, in laundry applications, hard surface cleaning, automatic dishwashing applications, as well as cosmetic applications such as dentures, teeth, hair and skin. However, due to the unique advantages of increased effectiveness in lower temperature solutions and the superior color-safety profile, the enzymes of the present invention are ideally suited for laundry applications such as the bleaching of fabrics. Furthermore, the enzymes of the present invention may be employed in both granular and liquid compositions.
The enzymes of the present invention may also be employed in a cleaning additive product. A cleaning additive product including the enzymes of the present invention is ideally suited for inclusion in a wash process when additional bleaching effectiveness is desired. Such instances may include, but are not limited to, low temperature solution cleaning applications. The additive product may be, in its simplest form, one or more proteases, including ASP. Such additive may be packaged in dosage form for addition to a cleaning process where a source of peroxygen is employed and increased bleaching effectiveness is desired. Such single dosage form may comprise a pill, tablet, gelcap or other single dosage unit such as pre-measured powders or liquids. A filler or carrier material may be included to increase the volume of such composition. Suitable filler or carrier materials include, but are not limited to, various salts of sulfate, carbonate and silicate as well as talc, clay and the like. Filler or carrier materials for liquid compositions may be water or low molecular weight primary and secondary alcohols including polyols and diols. Examples of such alcohols include, but are not limited to, methanol, ethanol, propanol and isopropanol. The compositions may contain from about 5% to about 90% of such materials. Acidic fillers can be used to reduce pH. Alternatively, the cleaning additive may include an activated peroxygen source defined below or the adjunct ingredients as fully defined below.
The present cleaning compositions and cleaning additives require an effective amount of the ASP enzyme and/or variants provided herein. The required level of enzyme may be achieved by the addition of one or more species of the enzymes of the present invention. Typically the present cleaning compositions will comprise at least 0.0001 weight percent, from about 0.0001 to about 1, from about 0.001 to about 0.5, or even from about 0.01 to about 0.1 weight percent of at least one of the enzymes of the present invention.
The cleaning compositions herein will typically be formulated such that, during use in aqueous cleaning operations, the wash water will have a pH of from about 5.0 to about 11.5 or even from about 7.5 to about 10.5. Liquid product formulations are typically formulated to have a neat pH from about 3.0 to about 9.0 or even from about 3 to about 5. Granular laundry products are typically formulated to have a pH from about 9 to about 11. Techniques for controlling pH at recommended usage levels include the use of buffers, alkalis, acids, etc., and are well known to those skilled in the art.
Suitable low pH cleaning compositions typically have a neat pH of from about 3 to about 5, and are typically free of surfactants that hydrolyze in such a pH environment. Such surfactants include sodium alkyl sulfate surfactants that comprise at least one ethylene oxide moiety or even from about 1 to 16 moles of ethylene oxide. Such cleaning compositions typically comprise a sufficient amount of a pH modifier, such as sodium hydroxide, monoethanolamine or hydrochloric acid, to provide such cleaning composition with a neat pH of from about 3 to about 5. Such compositions typically comprise at least one acid stable enzyme. Said compositions may be liquids or solids. The pH of such liquid compositions is measured as a neat pH. The pH of such solid compositions is measured as a 10% solids solution of said composition wherein the solvent is distilled water. In these embodiments, all pH measurements are taken at 20°C.
When the serine protease(s) is/are employed in a granular composition or liquid, it may be desirable for the enzyme to be in the form of an encapsulated particle to protect such enzyme from other components of the granular composition during storage. In addition, encapsulation is also a means of controlling the availability of the enzyme during the cleaning process and may enhance performance of the enzymes provided herein. In this regard, the serine proteases of the present invention may be encapsulated with any encapsulating material known in the art.
The encapsulating material typically encapsulates at least part of the catalyst for the enzymes of the present invention. Typically, the encapsulating material is water-soluble and/or water-dispersible. The encapsulating material may have a glass transition temperature (Tg) of 0°C or higher. Glass transition temperature is described in more detail in WO 97/11151, especially from page 6, line 25 to page 7, line 2.
The encapsulating material may be selected from the group consisting of carbohydrates, natural or synthetic gums, chitin and chitosan, cellulose and cellulose derivatives, silicates, phosphates, borates, polyvinyl alcohol, polyethylene glycol, paraffin waxes and combinations thereof. When the encapsulating material is a carbohydrate, it may typically be selected from the group consisting of monosaccharides, oligosaccharides, polysaccharides, and combinations thereof. Typically, the encapsulating material is a starch. Suitable starches are described in EP 0 922 499; US 4,977,252; US 5,354,559 and US 5,935,826.
The encapsulating material may be a microsphere made from plastic such as thermoplastics, acrylonitrile, methacrylonitrile, polyacrylonitrile, polymethacrylonitrile and mixtures thereof; commercially available microspheres that can be used are those supplied by Expancel of Stockviksverken, Sweden under the trademark Expancel®, and those supplied by PQ Corp. of Valley Forge, Pennsylvania U.S.A. under the tradename PM 6545, PM 6550, PM 7220, PM 7228, Extendospheres®, Luxsil®, Q-cel® and Sphericel®.
As described herein, the proteases of the present invention find particular use in the cleaning industry, including, but not limited to laundry and dish detergents. These applications place enzymes under various environmental stresses. The proteases of the present invention provide advantages over many currently used enzymes, due to their stability under various conditions.
Indeed, there are a variety of wash conditions including varying detergent formulations, wash water volumes, wash water temperatures, and lengths of wash time, to which proteases involved in washing are exposed. In addition, detergent formulations used in different geographical areas have different concentrations of their relevant components present in the wash water. For example, a European detergent typically has about 4500-5000 ppm of detergent components in the wash water, while a Japanese detergent typically has approximately 667 ppm of detergent components in the wash water. In North America, particularly the United States, detergents typically have about 975 ppm of detergent components present in the wash water.
A low detergent concentration system includes detergents where less than about 800 ppm of detergent components are present in the wash water. Japanese detergents are typically considered low detergent concentration systems as they have approximately 667 ppm of detergent components present in the wash water.
A medium detergent concentration system includes detergents where between about 800 ppm and about 2000 ppm of detergent components are present in the wash water. North American detergents are generally considered to be medium detergent concentration systems as they have approximately 975 ppm of detergent components present in the wash water. Brazil typically has approximately 1500 ppm of detergent components present in the wash water.
A high detergent concentration system includes detergents where greater than about 2000 ppm of detergent components are present in the wash water. European detergents are generally considered to be high detergent concentration systems as they have approximately 4500-5000 ppm of detergent components in the wash water.
Latin American detergents are generally high suds phosphate builder detergents and the range of detergents used in Latin America can fall in both the medium and high detergent concentrations as they range from 1500 ppm to 6000 ppm of detergent components in the wash water. As mentioned above, Brazil typically has approximately 1500 ppm of detergent components present in the wash water. However, other high suds phosphate builder detergent geographies, not limited to other Latin American countries, may have high detergent concentration systems up to about 6000 ppm of detergent components present in the wash water.
In light of the foregoing, it is evident that concentrations of detergent compositions in typical wash solutions throughout the world vary from less than about 800 ppm of detergent composition ("low detergent concentration geographies"), for example about 667 ppm in Japan, to between about 800 ppm and about 2000 ppm ("medium detergent concentration geographies"), for example about 975 ppm in the U.S. and about 1500 ppm in Brazil, to greater than about 2000 ppm ("high detergent concentration geographies"), for example about 4500 ppm to about 5000 ppm in Europe and about 6000 ppm in high suds phosphate builder geographies.
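By way of illustration only, the three concentration bands summarized above may be expressed as a simple lookup. The following Python sketch is not part of the disclosed methods. The function name `classify_detergent_concentration` is hypothetical, and the handling of the boundary values (exactly 800 ppm and exactly 2000 ppm are assigned to the medium band) is an assumption, since the thresholds are given only as "about" values.

```python
def classify_detergent_concentration(ppm: float) -> str:
    """Return the concentration band for a wash-water detergent level in ppm.

    Thresholds follow the text: below about 800 ppm is low, about 800 to
    about 2000 ppm is medium, above about 2000 ppm is high.
    Boundary handling is an assumption made for this sketch.
    """
    if ppm < 0:
        raise ValueError("concentration must be non-negative")
    if ppm < 800:
        return "low"
    if ppm <= 2000:
        return "medium"
    return "high"


# Representative values quoted in the text for each geography.
EXAMPLES = {"Japan": 667, "United States": 975, "Brazil": 1500, "Europe": 4500}

for region, ppm in EXAMPLES.items():
    print(f"{region}: {ppm} ppm -> {classify_detergent_concentration(ppm)}")
```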
The concentrations of the typical wash solutions are determined empirically. For example, in the U.S., a typical washing machine holds a volume of about 64.4 L of wash solution. Accordingly, in order to obtain a concentration of about 975 ppm of detergent within the wash solution about 62.79 g of detergent composition must be added to the 64.4 L of wash solution. This amount is the typical amount measured into the wash water by the consumer using the measuring cup provided with the detergent.
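The arithmetic in the preceding example may be sketched as follows. This Python fragment is illustrative only. It assumes that ppm can be treated as milligrams of detergent per liter of wash solution (reasonable for a dilute aqueous solution), and the function name `detergent_dose_grams` is hypothetical.

```python
def detergent_dose_grams(target_ppm: float, wash_volume_liters: float) -> float:
    """Mass of detergent (g) needed to reach target_ppm in the given volume.

    Assumption: 1 ppm is treated as 1 mg of detergent per liter of wash water.
    """
    milligrams = target_ppm * wash_volume_liters
    return milligrams / 1000.0


# U.S. example from the text: about 975 ppm in about 64.4 L of wash solution.
dose = detergent_dose_grams(975, 64.4)
print(f"{dose:.2f} g")  # prints "62.79 g"
```

Under the stated assumption, the result agrees with the approximately 62.79 g figure given above.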
As a further example, different geographies use different wash temperatures. The temperature of the wash water in Japan is typically less than that used in Europe. For example, the temperature of the wash water in North America and Japan can be between 10 and 30°C (e.g., about 20°C), whereas the temperature of wash water in Europe is typically between 30 and 60°C (e.g., about 40°C).
As a further example, different geographies typically have different water hardness. Water hardness is usually described in terms of grains per gallon of mixed Ca2+/Mg2+. Hardness is a measure of the amount of calcium (Ca2+) and magnesium (Mg2+) in the water. Most water in the United States is hard, but the degree of hardness varies. Moderately hard (60-120 ppm) to hard (121-181 ppm) water has 60 to 181 parts per million of hardness minerals (parts per million is converted to grains per U.S. gallon by dividing the ppm value by 17.1).

European water hardness is typically greater than 10.5 (for example 10.5-20.0) grains per gallon mixed Ca2+/Mg2+ (e.g., about 15 grains per gallon mixed Ca2+/Mg2+). North American water hardness is typically greater than Japanese water hardness, but less than European water hardness. For example, North American water hardness can be between 3 to 10 grains, 3-8 grains or about 6 grains. Japanese water hardness is typically lower than North American water hardness, usually less than 4, for example 3 grains per gallon mixed Ca2+/Mg2+.
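The conversion between parts per million and grains per U.S. gallon stated above (ppm divided by 17.1) may be sketched as follows. The Python fragment is illustrative only, and the function and constant names are hypothetical.

```python
PPM_PER_GRAIN_PER_GALLON = 17.1  # conversion factor given in the text


def ppm_to_grains_per_gallon(ppm: float) -> float:
    """Convert water hardness from ppm to grains per U.S. gallon."""
    return ppm / PPM_PER_GRAIN_PER_GALLON


def grains_per_gallon_to_ppm(gpg: float) -> float:
    """Convert water hardness from grains per U.S. gallon to ppm."""
    return gpg * PPM_PER_GRAIN_PER_GALLON


# Bounds of the "moderately hard" and "hard" ranges quoted in the text.
for ppm in (60, 120, 181):
    print(f"{ppm} ppm = {ppm_to_grains_per_gallon(ppm):.1f} grains per gallon")
```

On this conversion, the 60 to 181 ppm range corresponds to roughly 3.5 to 10.6 grains per gallon.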
Accordingly, in some embodiments, the present invention provides proteases that show surprising wash performance in at least one set of wash conditions (e.g., water temperature, water hardness, and/or detergent concentration). In some embodiments, the proteases of the present invention are comparable in wash performance to subtilisin proteases. In some embodiments, the proteases of the present invention exhibit enhanced wash performance as compared to subtilisin proteases. Thus, in some preferred embodiments of the present invention, the proteases provided herein exhibit enhanced oxidative stability, enhanced thermal stability, and/or enhanced chelator stability.
In some preferred embodiments, the present invention provides the ASP protease, as well as homologues and variants of the protease. These proteases find use in any applications in which it is desired to clean protein based stains from textiles or fabrics.
In some embodiments, the cleaning compositions of the present invention are formulated as hand and machine laundry detergent compositions including laundry additive compositions, and compositions suitable for use in the pretreatment of stained fabrics, rinse-added fabric softener compositions, and compositions for use in general household hard surface cleaning operations, as well as dishwashing operations. Those in the art are familiar with different formulations which can be used as cleaning compositions. In preferred embodiments, the proteases of the present invention comprise comparative or enhanced performance in detergent compositions (i.e., as compared to other proteases). In some embodiments, cleaning performance is evaluated by comparing the proteases of the present invention with subtilisin proteases in various cleaning assays that utilize enzyme-sensitive stains such as egg, grass, blood, milk, etc., in standard methods. Indeed, those in the art are familiar with the spectrophotometric and other analytical methodologies used to assess detergent performance under standard wash cycle conditions.
Assays that find use in the present invention include, but are not limited to those described in WO 99/34011, and U.S. Pat. No. 6,605,458 (See e.g., Example 3). In U.S. Pat. No. 6,605,458, at Example 3, a detergent dose of 3.0 g/l at pH 10.5, wash time 15 minutes, at 15°C, water hardness of 6°dH, 10 nM enzyme concentration in 150 ml glass beakers with stirring rod, 5 textile pieces (diameter 2.5 cm) in 50 ml, EMPA 117 test material from Center for Test Materials Holland are used. The measurement of reflectance "R" on the test material was done at 460 nm using a Macbeth ColorEye 7000 photometer. Additional methods are provided in the Examples herein. Thus, these methods also find use in the present invention.
The addition of proteases of the invention to conventional cleaning compositions does not create any special use limitation. In other words, any temperature and pH suitable for the detergent is also suitable for the present compositions, as long as the pH is within the range set forth herein, and the temperature is below the described protease's denaturing temperature. In addition, proteases of the present invention find use in cleaning compositions that do not include detergents, again either alone or in combination with builders and stabilizers.
When used in cleaning compositions or detergents, oxidative stability is a further consideration. Thus, in some applications, the stability is enhanced, diminished, or comparable to subtilisin proteases as desired for various uses. In some preferred embodiments, enhanced oxidative stability is desired. Some of the proteases of the present invention find particular use in such applications.
When used in cleaning compositions or detergents, thermal stability is a further consideration. Thus, in some applications, the stability is enhanced, diminished, or comparable to subtilisin proteases as desired for various uses. In some preferred embodiments, enhanced thermostability is desired. Some of the proteases of the present invention find particular use in such applications.
When used in cleaning compositions or detergents, chelator stability is a further consideration. Thus, in some applications, the stability is enhanced, diminished, or comparable to subtilisin proteases as desired for various uses. In some preferred embodiments, enhanced chelator stability is desired. Some of the proteases of the present invention find particular use in such applications.
In some embodiments of the present invention, naturally occurring proteases are provided which exhibit modified enzymatic activity at different pHs when compared to subtilisin proteases. A pH-activity profile is a plot of pH against enzyme activity and may be constructed as described in the Examples and/or by methods known in the art. In some embodiments, it is desired to obtain naturally occurring proteases with broader profiles (i.e., those having greater activity at a range of pHs than a comparable subtilisin protease). In other embodiments, the enzymes have no significantly greater activity at any pH, or naturally occurring homologues with sharper profiles (i.e., those having enhanced activity when compared to subtilisin proteases at a given pH, and lesser activity elsewhere). Thus, in various embodiments, the proteases of the present invention have differing pH optima and/or ranges. It is not intended that the present invention be limited to any specific pH or pH range.
In some embodiments of the present invention, the cleaning compositions comprise proteases of the present invention at a level from 0.00001% to 10% of 69B4 and/or other protease of the present invention by weight of the composition and the balance (e.g., 99.999% to 90.0%) comprising cleaning adjunct materials by weight of composition. In other aspects of the present invention, the cleaning compositions of the present invention comprise the 69B4 and/or other proteases at a level of 0.0001% to 10%, 0.001% to 5%, 0.001% to 2%, or 0.005% to 0.5% 69B4 or other protease of the present invention by weight of the composition and the balance of the cleaning composition (e.g., 99.9999% to 90.0%, 99.999% to 98%, 99.995% to 99.5% by weight) comprising cleaning adjunct materials.
In some embodiments, preferred cleaning compositions, in addition to the protease preparation of the invention, comprise one or more additional enzymes or enzyme derivatives which provide cleaning performance and/or fabric care benefits. Such enzymes include, but are not limited to other proteases, lipases, cutinases, amylases, cellulases, peroxidases, oxidases (e.g. laccases), and/or mannanases.
Any other protease suitable for use in alkaline solutions finds use in the compositions of the present invention. Suitable proteases include those of animal, vegetable or microbial origin. In particularly preferred embodiments, microbial proteases are used. In some embodiments, chemically or genetically modified mutants are included. In some embodiments, the protease is a serine protease, preferably an alkaline microbial protease or a trypsin-like protease. Examples of alkaline proteases include subtilisins, especially those derived from Bacillus (e.g., subtilisin, lentus, amyloliquefaciens, subtilisin Carlsberg, subtilisin 309, subtilisin 147 and subtilisin 168). Additional examples include those mutant proteases described in U.S. Pat. Nos. RE 34,606, 5,955,340, 5,700,676, 6,312,936, and 6,482,628, all of which are incorporated herein by reference. Additional protease examples include, but are not limited to trypsin (e.g., of porcine or bovine origin), and the Fusarium protease described in WO 89/06270. Preferred commercially available protease enzymes include those sold under the trade names MAXATASE®, MAXACAL™, MAXAPEM™, OPTICLEAN®, OPTIMASE®, PROPERASE®, PURAFECT® and PURAFECT® OXP (Genencor); those sold under the trade names ALCALASE®, SAVINASE®, PRIMASE®, DURAZYM™, RELASE® and ESPERASE® (Novozymes); and those sold under the trade name BLAP™ (Henkel Kommanditgesellschaft auf Aktien, Duesseldorf, Germany). Various proteases are described in WO 95/23221, WO 92/21760, and U.S. Pat. Nos. 5,801,039, 5,340,735, 5,500,364, 5,855,625. An additional BPN' variant ("BPN'-var 1" and "BPN'-variant 1"; as referred to herein) is described in US RE 34,606. An additional GG36-variant ("GG36-var.1" and "GG36-variant 1"; as referred to herein) is described in US 5,955,340 and 5,700,676. A further GG36-variant is described in US Patents 6,312,936 and 6,482,628. In one aspect of the present invention, the cleaning compositions of the present invention comprise additional protease enzymes at a level from 0.00001% to 10% of additional protease by weight of the composition and 99.999% to 90.0% of cleaning adjunct materials by weight of composition.
In other embodiments of the present invention, the cleaning compositions of the present invention also comprise proteases at a level of 0.0001% to 10%, 0.001% to 5%, 0.001% to 2%, or 0.005% to 0.5% 69B4 protease (or its homologues or variants) by weight of the composition and the balance of the cleaning composition (e.g., 99.9999% to 90.0%, 99.999% to 98%, 99.995% to 99.5% by weight) comprising cleaning adjunct materials.
In addition, any lipase suitable for use in alkaline solutions finds use in the present invention. Suitable lipases include, but are not limited to those of bacterial or fungal origin. Chemically or genetically modified mutants are encompassed by the present invention. Examples of useful lipases include Humicola lanuginosa lipase (See e.g., EP 258 068, and EP 305 216), Rhizomucor miehei lipase (See e.g., EP 238 023), Candida lipase, such as C. antarctica lipase (e.g., the C. antarctica lipase A or B; See e.g., EP 214 761), a Pseudomonas lipase such as P. alcaligenes and P. pseudoalcaligenes lipase (See e.g., EP 218 272), P. cepacia lipase (See e.g., EP 331 376), P. stutzeri lipase (See e.g., GB 1,372,034), P. fluorescens lipase, Bacillus lipase (e.g., B. subtilis lipase [Dartois et al., Biochem. Biophys. Acta 1131:253-260 [1993]]; B. stearothermophilus lipase [See e.g., JP 64/744992]; and B. pumilus lipase [See e.g., WO 91/16422]).
Furthermore, a number of cloned lipases find use in some embodiments of the present invention, including but not limited to Penicillium camembertii lipase (See, Yamaguchi et al., Gene 103:61-67 [1991]), Geotrichum candidum lipase (See, Schimada et al., J. Biochem., 106:383-388 [1989]), and various Rhizopus lipases such as R. delemar lipase (See, Mass et al., Gene 109:117-113 [1991]), a R. niveus lipase (Kugimiya et al., Biosci. Biotech. Biochem. 56:716-719 [1992]) and R. oryzae lipase.
Other types of lipolytic enzymes such as cutinases also find use in some embodiments of the present invention, including but not limited to the cutinase derived from Pseudomonas mendocina (See, WO 88/09367), or cutinase derived from Fusarium solani pisi (See, WO 90/09446).
Additional suitable lipases include commercially available lipases such as M1 LIPASE™, LUMA FAST™, and LIPOMAX™ (Genencor); LIPOLASE® and LIPOLASE® ULTRA (Novozymes); and LIPASE P™ "Amano" (Amano Pharmaceutical Co. Ltd., Japan).
In some embodiments of the present invention, the cleaning compositions of the present invention further comprise lipases at a level from 0.00001% to 10% of additional lipase by weight of the composition and the balance of cleaning adjunct materials by weight of composition. In other aspects of the present invention, the cleaning compositions of the present invention also comprise lipases at a level of 0.0001% to 10%, 0.001% to 5%, 0.001% to 2%, or 0.005% to 0.5% lipase by weight of the composition.
Any amylase (alpha and/or beta) suitable for use in alkaline solutions also finds use in some embodiments of the present invention. Suitable amylases include, but are not limited to those of bacterial or fungal origin. Chemically or genetically modified mutants are included in some embodiments. Amylases that find use in the present invention include, but are not limited to α-amylases obtained from B. licheniformis (See e.g., GB 1,296,839). Commercially available amylases that find use in the present invention include, but are not limited to DURAMYL®, TERMAMYL®, FUNGAMYL® and BAN™ (Novozymes) and RAPIDASE® and MAXAMYL® P (Genencor International).
In some embodiments of the present invention, the cleaning compositions of the present invention further comprise amylases at a level from 0.00001% to 10% of additional amylase by weight of the composition and the balance of cleaning adjunct materials by weight of composition. In other aspects of the present invention, the cleaning compositions of the present invention also comprise amylases at a level of 0.0001% to 10%, 0.001% to 5%, 0.001% to 2%, or 0.005% to 0.5% amylase by weight of the composition.
Any cellulase suitable for use in alkaline solutions finds use in embodiments of the present invention. Suitable cellulases include, but are not limited to those of bacterial or fungal origin. Chemically or genetically modified mutants are included in some embodiments. Suitable cellulases include, but are not limited to Humicola insolens cellulases (See e.g., U.S. Pat. No. 4,435,307). Especially suitable cellulases are the cellulases having color care benefits (See e.g., EP 0 495 257).
Commercially available cellulases that find use in the present invention include, but are not limited to CELLUZYME® (Novozymes), and KAC-500(B)™ (Kao Corporation). In some embodiments, cellulases are incorporated as portions or fragments of mature wild-type or variant cellulases, wherein a portion of the N-terminus is deleted (See e.g., U.S. Pat. No. 5,874,276).
In some embodiments, the cleaning compositions of the present invention can further comprise cellulases at a level from 0.00001% to 10% of additional cellulase by weight of the composition and the balance of cleaning adjunct materials by weight of composition. In other aspects of the present invention, the cleaning compositions of the present invention also comprise cellulases at a level of 0.0001% to 10%, 0.001% to 5%, 0.001% to 2%, or 0.005% to 0.5% cellulase by weight of the composition.
Any mannanase suitable for use in detergent compositions and/or alkaline solutions finds use in the present invention. Suitable mannanases include, but are not limited to those of bacterial or fungal origin. Chemically or genetically modified mutants are included in some embodiments. Various mannanases are known which find use in the present invention (See e.g., U.S. Pat. No. 6,566,114, U.S. Pat. No. 6,602,842, and U.S. Pat. No. 6,440,991, all of which are incorporated herein by reference).
In some embodiments, the cleaning compositions of the present invention can further comprise mannanases at a level from 0.00001% to 10% of additional mannanase by weight of the composition and the balance of cleaning adjunct materials by weight of composition. In other aspects of the present invention, the cleaning compositions of the present invention also comprise mannanases at a level of 0.0001% to 10%, 0.001% to 5%, 0.001% to 2%, or 0.005% to 0.5% mannanase by weight of the composition.
In some embodiments, peroxidases are used in combination with hydrogen peroxide or a source thereof (e.g., a percarbonate, perborate or persulfate). In alternative embodiments, oxidases are used in combination with oxygen. Both types of enzymes are used for "solution bleaching" (i.e., to prevent transfer of a textile dye from a dyed fabric to another fabric when the fabrics are washed together in a wash liquor), preferably together with an enhancing agent (See e.g., WO 94/12621 and WO 95/01426). Suitable peroxidases/oxidases include, but are not limited to those of plant, bacterial or fungal origin. Chemically or genetically modified mutants are included in some embodiments.
In some embodiments, the cleaning compositions of the present invention can further comprise peroxidase and/or oxidase enzymes at a level from 0.00001% to 10% of additional peroxidase and/or oxidase by weight of the composition and the balance of cleaning adjunct materials by weight of composition. In other aspects of the present invention, the cleaning compositions of the present invention also comprise peroxidase and/or oxidase enzymes at a level of 0.0001% to 10%, 0.001% to 5%, 0.001% to 2%, or 0.005% to 0.5% peroxidase and/or oxidase enzymes by weight of the composition.
Mixtures of the above mentioned enzymes are encompassed herein, in particular a mixture of the 69B4 enzyme, one or more additional proteases, at least one amylase, at least one lipase, at least one mannanase, and/or at least one cellulase. Indeed, it is contemplated that various mixtures of these enzymes will find use in the present invention.
It is contemplated that the levels of the protease and of the one or more additional enzymes may each independently range up to 10%, the balance of the cleaning composition being cleaning adjunct materials. The specific selection of cleaning adjunct materials is readily made by considering the surface, item, or fabric to be cleaned, and the desired form of the composition for the cleaning conditions during use (e.g., through the wash detergent use).
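As a purely illustrative sketch of the weight balance described above, the following Python fragment computes the adjunct balance for a composition containing the protease and one or more additional enzymes. The function name `adjunct_balance_percent` and the example enzyme levels are hypothetical, and the check that each enzyme level lies between 0% and 10% by weight reflects the upper limit stated above as interpreted for this sketch.

```python
from typing import Sequence


def adjunct_balance_percent(
    protease_pct: float, additional_enzyme_pcts: Sequence[float] = ()
) -> float:
    """Return the weight percent left for cleaning adjunct materials.

    Each enzyme level is checked independently against an assumed
    0% to 10% by weight window.
    """
    levels = [protease_pct, *additional_enzyme_pcts]
    for level in levels:
        if not 0.0 <= level <= 10.0:
            raise ValueError(f"enzyme level {level}% is outside 0-10% by weight")
    return 100.0 - sum(levels)


# Hypothetical formulation: 0.5% protease, 0.1% amylase, 0.05% lipase.
balance = adjunct_balance_percent(0.5, [0.1, 0.05])
print(f"adjunct balance: {balance:.2f}% by weight")
```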
Examples of suitable cleaning adjunct materials include, but are not limited to, surfactants, builders, bleaches, bleach activators, bleach catalysts, other enzymes, enzyme stabilizing systems, chelants, optical brighteners, soil release polymers, dye transfer agents, dispersants, suds suppressors, dyes, perfumes, colorants, filler salts, hydrotropes, photoactivators, fluorescers, fabric conditioners, hydrolyzable surfactants, preservatives, anti-oxidants, anti-shrinkage agents, anti-wrinkle agents, germicides, fungicides, color speckles, silvercare, anti-tarnish and/or anti-corrosion agents, alkalinity sources, solubilizing agents, carriers, processing aids, pigments, and pH control agents (See e.g., U.S. Pat. Nos. 6,610,642, 6,605,458, 5,705,464, 5,710,115, 5,698,504, 5,695,679, 5,686,014 and 5,646,101, all of which are incorporated herein by reference). Embodiments of specific cleaning composition materials are exemplified in detail below.
If the cleaning adjunct materials are not compatible with the proteases of the present invention in the cleaning compositions, then suitable methods of keeping the cleaning adjunct materials and the protease(s) separated (i.e., not in contact with each other) until combination of the two components is appropriate are used. Such separation methods include any suitable method known in the art (e.g., gelcaps, encapsulation, tablets, physical separation, etc.).

Preferably an effective amount of one or more protease(s) provided herein is included in compositions useful for cleaning a variety of surfaces in need of proteinaceous stain removal. Such cleaning compositions include cleaning compositions for such applications as cleaning hard surfaces, fabrics, and dishes. Indeed, in some embodiments, the present invention provides fabric cleaning compositions, while in other embodiments, the present invention provides non-fabric cleaning compositions. Notably, the present invention also provides cleaning compositions suitable for personal care, including oral care (including dentifrices, toothpastes, mouthwashes, etc., as well as denture cleaning compositions), skin, and hair cleaning compositions. It is intended that the present invention encompass detergent compositions in any form (i.e., liquid, granular, bar, semi-solid, gels, emulsions, tablets, capsules, etc.).
By way of example, several cleaning compositions wherein the proteases of the present invention find use are described in greater detail below. In embodiments in which the cleaning compositions of the present invention are formulated as compositions suitable for use in laundry machine washing method(s), the compositions of the present invention preferably contain at least one surfactant and at least one builder compound, as well as one or more cleaning adjunct materials preferably selected from organic polymeric compounds, bleaching agents, additional enzymes, suds suppressors, dispersants, lime-soap dispersants, soil suspension and anti-redeposition agents and corrosion inhibitors. In some embodiments, laundry compositions also contain softening agents (i.e., as additional cleaning adjunct materials).
The compositions of the present invention also find use as detergent additive products in solid or liquid form. Such additive products are intended to supplement and/or boost the performance of conventional detergent compositions and can be added at any stage of the cleaning process.
In embodiments formulated as compositions for use in manual dishwashing methods, the compositions of the invention preferably contain at least one surfactant and preferably at least one additional cleaning adjunct material selected from organic polymeric compounds, suds enhancing agents, group II metal ions, solvents, hydrotropes and additional enzymes.
In some embodiments, the density of the laundry detergent compositions herein ranges from 400 to 1200 g/liter, while in other embodiments, it ranges from 500 to 950 g/liter of composition measured at 20°C.
In some embodiments, various cleaning compositions such as those provided in U.S. Pat. No. 6,605,458 find use with the proteases of the present invention. Thus, in some embodiments, the composition comprising at least one protease of the present invention is a compact granular fabric cleaning composition, while in other embodiments, the composition is a granular fabric cleaning composition useful in the laundering of colored fabrics; in further embodiments, the composition is a granular fabric cleaning composition which provides softening through the wash capacity; and in additional embodiments, the composition is a heavy duty liquid fabric cleaning composition.
In some embodiments, the compositions comprising at least one protease of the present invention are fabric cleaning compositions such as those described in U.S. Pat. Nos. 6,610,642 and 6,376,450. In addition, the proteases of the present invention find use in granular laundry detergent compositions of particular utility under European or Japanese washing conditions (See e.g., U.S. Pat. No. 6,610,642).
In alternative embodiments, the present invention provides hard surface cleaning compositions comprising at least one protease provided herein. Thus, in some embodiments, the composition comprising at least one protease of the present invention is a hard surface cleaning composition such as those described in U.S. Pat. Nos. 6,610,642 and 6,376,450.
In yet further embodiments, the present invention provides dishwashing compositions comprising at least one protease provided herein. Thus, in some embodiments, the compositions comprising at least one protease of the present invention is a hard surface cleaning composition such as those in U.S. Pat. Nos. 6,610,642 and 6,376,450.
In still further embodiments, the present invention provides dishwashing compositions comprising at least one protease provided herein. Thus, in some embodiments, the compositions comprising at least one protease of the present invention comprise oral care compositions such as those in U.S. Pat. No. 6,376,450, and 6,376,450.
The formulations and descriptions of the compounds and cleaning adjunct materials contained in the aforementioned US Pat. Nos. 6,376,450, 6,605,458, 6,605,458, and 6,610,642, all of which are expressly incorporated by reference herein. Still further examples are set forth in the Examples below.
I) Processes of Making and Using the Cleaning Composition of the Present Invention
The cleaning compositions of the present invention can be formulated into any suitable form and prepared by any process chosen by the formulator, non-limiting examples of which are described in U.S. Pat. Nos. 5,879,584, 5,691,297, 5,574,005, 5,569,645, 5,565,422, 5,516,448, 5,489,392, and 5,486,303, all of which are incorporated herein by reference. When a low pH cleaning composition is desired, the pH of such composition may be adjusted via the addition of a material such as monoethanolamine or an acidic material such as HCl.
II) Adjunct Materials In Addition to the Serine Proteases of the Present Invention
While not essential for the purposes of the present invention, the adjuncts in the non-limiting list illustrated hereinafter are suitable for use in the instant cleaning compositions and may be desirably incorporated in certain embodiments of the invention, for example to assist or enhance cleaning performance, for treatment of the substrate to be cleaned, or to modify the aesthetics of the cleaning composition as is the case with perfumes, colorants, dyes or the like. It is understood that such adjuncts are in addition to the serine proteases of the present invention. The precise nature of these additional components, and levels of incorporation thereof, will depend on the physical form of the composition and the nature of the cleaning operation for which it is to be used. Suitable adjunct materials include, but are not limited to, surfactants, builders, chelating agents, dye transfer inhibiting agents, deposition aids, dispersants, additional enzymes, and enzyme stabilizers, catalytic materials, bleach activators, bleach boosters, hydrogen peroxide, sources of hydrogen peroxide, preformed peracids, polymeric dispersing agents, clay soil removal/anti-redeposition agents, brighteners, suds suppressors, dyes, perfumes, structure elasticizing agents, fabric softeners, carriers, hydrotropes, processing aids and/or pigments. In addition to the disclosure below, suitable examples of such other adjuncts and levels of use are found in U.S. Patent Nos. 5,576,282, 6,306,812, and 6,326,348, which are incorporated by reference. The aforementioned adjunct ingredients may constitute the balance of the cleaning compositions of the present invention.
Surfactants - The cleaning compositions according to the present invention may comprise a surfactant or surfactant system wherein the surfactant can be selected from nonionic surfactants, anionic surfactants, cationic surfactants, ampholytic surfactants, zwitterionic surfactants, semi-polar nonionic surfactants and mixtures thereof. When a low pH cleaning composition, such as a composition having a neat pH of from about 3 to about 5, is desired, such composition typically does not contain alkyl ethoxylated sulfate, as it is believed that such surfactant may be hydrolyzed by the acidic contents of such compositions.
The surfactant is typically present at a level of from about 0.1% to about 60%, from about 1% to about 50% or even from about 5% to about 40% by weight of the subject cleaning composition.
Builders - The cleaning compositions of the present invention may comprise one or more detergent builders or builder systems. When a builder is used, the subject cleaning composition will typically comprise at least about 1%, from about 3% to about 60% or even from about 5% to about 40% builder by weight of the subject cleaning composition.
Builders include, but are not limited to, the alkali metal, ammonium and alkanolammonium salts of polyphosphates, alkali metal silicates, alkaline earth and alkali metal carbonates, aluminosilicate builders, polycarboxylate compounds, ether hydroxypolycarboxylates, copolymers of maleic anhydride with ethylene or vinyl methyl ether, 1,3,5-trihydroxybenzene-2,4,6-trisulphonic acid, and carboxymethyloxysuccinic acid, the various alkali metal, ammonium and substituted ammonium salts of polyacetic acids such as ethylenediamine tetraacetic acid and nitrilotriacetic acid, as well as polycarboxylates such as mellitic acid, succinic acid, citric acid, oxydisuccinic acid, polymaleic acid, benzene 1,3,5-tricarboxylic acid, carboxymethyloxysuccinic acid, and soluble salts thereof.
Chelating Agents - The cleaning compositions herein may contain a chelating agent. Suitable chelating agents include copper, iron and/or manganese chelating agents and mixtures thereof.
When a chelating agent is used, the cleaning composition may comprise from about 0.1% to about 15% or even from about 3.0% to about 10% chelating agent by weight of the subject cleaning composition.
Deposition Aid - The cleaning compositions herein may contain a deposition aid. Suitable deposition aids include polyethylene glycol, polypropylene glycol, polycarboxylate, soil release polymers such as polyterephthalic acid, clays such as kaolinite, montmorillonite, attapulgite, illite, bentonite, halloysite, and mixtures thereof.
Dye Transfer Inhibiting Agents - The cleaning compositions of the present invention may also include one or more dye transfer inhibiting agents. Suitable polymeric dye transfer inhibiting agents include, but are not limited to, polyvinylpyrrolidone polymers, polyamine N-oxide polymers, copolymers of N-vinylpyrrolidone and N-vinylimidazole, polyvinyloxazolidones and polyvinylimidazoles or mixtures thereof.
When present in a subject cleaning composition, the dye transfer inhibiting agents may be present at levels from about 0.0001% to about 10%, from about 0.01% to about 5% or even from about 0.1% to about 3% by weight of the cleaning composition.
Dispersants - The cleaning compositions of the present invention can also contain dispersants. Suitable water-soluble organic materials include the homo- or co-polymeric acids or their salts, in which the polycarboxylic acid comprises at least two carboxyl radicals separated from each other by not more than two carbon atoms.
Enzymes - The cleaning compositions can comprise one or more detergent enzymes which provide cleaning performance and/or fabric care benefits. Examples of suitable enzymes include, but are not limited to, hemicellulases, peroxidases, proteases, cellulases, xylanases, lipases, phospholipases, esterases, cutinases, pectinases, keratinases, reductases, oxidases, phenol oxidases, lipoxygenases, ligninases, pullulanases, tannases, pentosanases, malanases, β-glucanases, arabinosidases, hyaluronidase, chondroitinase, laccase, and amylases, or mixtures thereof. A typical combination is a cocktail of conventional applicable enzymes like protease, lipase, cutinase and/or cellulase in conjunction with amylase.
Enzyme Stabilizers - Enzymes for use in detergents can be stabilized by various techniques. The enzymes employed herein can be stabilized by the presence of water-soluble sources of calcium and/or magnesium ions in the finished compositions that provide such ions to the enzymes.
Catalytic Metal Complexes - The cleaning compositions of the present invention may include catalytic metal complexes. One type of metal-containing bleach catalyst is a catalyst system comprising a transition metal cation of defined bleach catalytic activity, such as copper, iron, titanium, ruthenium, tungsten, molybdenum, or manganese cations, an auxiliary metal cation having little or no bleach catalytic activity, such as zinc or aluminum cations, and a sequestrate having defined stability constants for the catalytic and auxiliary metal cations, particularly ethylenediaminetetraacetic acid, ethylenediaminetetra (methylenephosphonic acid) and water-soluble salts thereof. Such catalysts are disclosed in U.S. Pat. No. 4,430,243.
If desired, the compositions herein can be catalyzed by means of a manganese compound. Such compounds and levels of use are well known in the art and include, for example, the manganese-based catalysts disclosed in U.S. Pat. No. 5,576,282.
Cobalt bleach catalysts useful herein are known, and are described, for example, in U.S. Pat. Nos. 5,597,936, and 5,595,967. Such cobalt catalysts are readily prepared by known procedures, such as taught for example in U.S. Pat. Nos. 5,597,936, and 5,595,967.
Compositions herein may also suitably include a transition metal complex of a macropolycyclic rigid ligand - abbreviated as "MRL". As a practical matter, and not by way of limitation, the compositions and cleaning processes herein can be adjusted to provide on the order of at least one part per hundred million of the active MRL species in the aqueous washing medium, and will preferably provide from about 0.005 ppm to about 25 ppm, more preferably from about 0.05 ppm to about 10 ppm, and most preferably from about 0.1 ppm to about 5 ppm, of the MRL in the wash liquor.

Preferred transition-metals in the instant transition-metal bleach catalyst include manganese, iron and chromium. Preferred MRLs herein are a special type of ultra-rigid ligand that is cross-bridged such as 5,12-diethyl-1,5,8,12-tetraazabicyclo[6.6.2]hexadecane.
Suitable transition metal MRLs are readily prepared by known procedures, such as taught for example in WO 00/332601, and U.S. Pat. No. 6,225,464.
III) Processes of Making and Using Cleaning Compositions
The cleaning compositions of the present invention can be formulated into any suitable form and prepared by any process chosen by the formulator, non-limiting examples of which are described in U.S. Pat. Nos. 5,879,584, 5,691,297, 5,574,005, 5,569,645, 5,516,448, 5,489,392, and 5,486,303, all of which are incorporated herein by reference.
IV) Method of Use
The cleaning compositions disclosed herein can be used to clean a situs, inter alia a surface or fabric. Typically at least a portion of the situs is contacted with an embodiment of the present cleaning composition, in neat form or diluted in a wash liquor, and then the situs is optionally washed and/or rinsed. For purposes of the present invention, washing includes, but is not limited to, scrubbing and mechanical agitation. The fabric may comprise most any fabric capable of being laundered in normal consumer use conditions. The disclosed cleaning compositions are typically employed at concentrations of from about 500 ppm to about 15,000 ppm in solution. When the wash solvent is water, the water temperature typically ranges from about 5°C to about 90°C and, when the situs comprises a fabric, the water to fabric mass ratio is typically from about 1:1 to about 30:1.
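By way of non-limiting illustration, the use concentration and water to fabric mass ratio stated above can be turned into a dose of cleaning composition as sketched below. The sketch assumes that ppm is a mass-based figure (mg of composition per kg of wash liquor) and that the mass of the wash liquor is approximately the mass of the water; the function names are hypothetical and are not part of the disclosure.

```python
# Illustrative sketch only. Ranges are taken from the text above; the
# mass-based reading of "ppm" and all names are assumptions.
PPM_RANGE = (500.0, 15_000.0)          # typical use concentration in solution
WATER_TO_FABRIC_RANGE = (1.0, 30.0)    # typical water:fabric mass ratio


def water_mass_kg(fabric_kg: float, water_to_fabric: float) -> float:
    """Mass of wash water for a fabric load at a given water:fabric mass ratio."""
    lo, hi = WATER_TO_FABRIC_RANGE
    if not lo <= water_to_fabric <= hi:
        raise ValueError("water:fabric ratio outside the typical 1:1 to 30:1 range")
    return fabric_kg * water_to_fabric


def composition_dose_g(wash_liquor_kg: float, target_ppm: float) -> float:
    """Grams of cleaning composition giving target_ppm (mg/kg) in the wash liquor."""
    lo, hi = PPM_RANGE
    if not lo <= target_ppm <= hi:
        raise ValueError("concentration outside the typical 500 to 15,000 ppm range")
    return wash_liquor_kg * target_ppm / 1000.0  # mg -> g


if __name__ == "__main__":
    water = water_mass_kg(fabric_kg=3.0, water_to_fabric=10.0)
    print(water, composition_dose_g(water, target_ppm=1000.0))
```

Under these assumptions, a 3 kg fabric load washed at a 10:1 ratio uses 30 kg of water, and a 1,000 ppm target corresponds to 30 g of composition.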
B. Animal Feed
Still further, the present invention provides compositions and methods for the production of a food or animal feed, characterized in that a protease according to the invention is mixed with food or animal feed. In some embodiments, the protease is added as a dry product before processing, while in other embodiments it is added as a liquid before or after processing. In some embodiments, in which a dry powder is used, the enzyme is diluted as a liquid onto a dry carrier such as milled grain. The proteases of the present invention find use as components of animal feeds and/or additives such as those described in U.S. Pat. No. 5,612,055, U.S. Pat. No. 5,314,692, and U.S. Pat. No. 5,147,642, all of which are hereby incorporated by reference.
The enzyme feed additive according to the present invention is suitable for preparation in a number of methods. For example, in some embodiments, it is prepared simply by mixing different enzymes having the appropriate activities to produce an enzyme mix. In some embodiments, this enzyme mix is mixed directly with a feed, while in other embodiments, it is impregnated onto a cereal-based carrier material such as milled wheat, maize or soya flour. The present invention also encompasses these impregnated carriers, as they find use as enzyme feed additives.
In some alternative embodiments, a cereal-based carrier (e.g., milled wheat or maize) is impregnated either simultaneously or sequentially with enzymes having the appropriate activities. For example, in some embodiments, a milled wheat carrier is first sprayed with a xylanase, secondly with a protease, and optionally with a β-glucanase. The present invention also encompasses these impregnated carriers, as they find use as enzyme feed additives. In preferred embodiments, these impregnated carriers comprise at least one protease of the present invention.
In some embodiments, the feed additive of the present invention is directly mixed with the animal feed, while in alternative embodiments, it is mixed with one or more other feed additives such as a vitamin feed additive, a mineral feed additive, and/or an amino acid feed additive. The resulting feed additive including several different types of components is then mixed in an appropriate amount with the feed.
In some preferred embodiments, the feed additive of the present invention, including cereal-based carriers is normally mixed in amounts of 0.01-50 g per kilogram of feed, more preferably 0.1-10 g/kilogram, and most preferably about 1 g/kilogram.
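As a non-limiting worked example of the inclusion rates stated above, the following sketch computes the mass of feed additive for a batch of feed. The function names and the default dose of 1 g/kg (the "most preferably" figure) are illustrative assumptions only.

```python
# Illustrative sketch only. Inclusion ranges (g additive per kg feed) are
# those stated in the text above; names are hypothetical.
PREFERRED = (0.01, 50.0)
MORE_PREFERRED = (0.1, 10.0)


def additive_mass_g(feed_kg: float, dose_g_per_kg: float = 1.0) -> float:
    """Grams of feed additive to mix into feed_kg of feed at the given dose."""
    lo, hi = PREFERRED
    if not lo <= dose_g_per_kg <= hi:
        raise ValueError("dose outside the stated 0.01-50 g/kg range")
    return feed_kg * dose_g_per_kg


def is_more_preferred(dose_g_per_kg: float) -> bool:
    """True when the dose falls inside the narrower 0.1-10 g/kg range."""
    lo, hi = MORE_PREFERRED
    return lo <= dose_g_per_kg <= hi


if __name__ == "__main__":
    print(additive_mass_g(500.0))        # 500 kg batch at about 1 g/kg
    print(is_more_preferred(25.0))
```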
In alternative embodiments, the enzyme feed additive of the present invention involves construction of recombinant microorganisms that produce the desired enzyme(s) in the desired relative amounts. In some embodiments, this is accomplished by increasing the copy number of the gene encoding at least one protease of the present invention, and/or by using a suitably strong promoter operatively linked to the polynucleotide encoding the protease(s). In further embodiments, the recombinant microorganism strain has certain enzyme activities deleted (e.g., cellulases, endoglucanases, etc.), as desired.
In additional embodiments, the enzyme feed additives provided by the present invention also include other enzymes, including but not limited to at least one xylanase, α-amylase, glucoamylase, pectinase, mannanase, α-galactosidase, phytase, and/or lipase. In some embodiments, the enzymes having the desired activities are mixed with the xylanase and protease either before impregnating these on a cereal-based carrier or alternatively such enzymes are impregnated simultaneously or sequentially on such a cereal-based carrier. The carrier is then in turn mixed with a cereal-based feed to prepare the final feed.

In alternative embodiments, the enzyme feed additive is formulated as a solution of the individual enzyme activities and then mixed with a feed material pre-formed as pellets or as a mash.
In still further embodiments, the enzyme feed additive is included in animals' diets by incorporating it into a second (i.e., different) feed or the animals' drinking water. Accordingly, it is not essential that the enzyme mix provided by the present invention be incorporated into the cereal-based feed itself, although such incorporation forms a particularly preferred embodiment of the present invention. The ratio of the units of xylanase activity per g of the feed additive to the units of protease activity per g of the feed additive is preferably 1:0.001-1,000, more preferably 1:0.01-100, and most preferably 1:0.1-10. As indicated above, the enzyme mix provided by the present invention preferably finds use as a feed additive in the preparation of a cereal-based feed.
In some embodiments, the cereal-based feed comprises at least 25% by weight, or more preferably at least 35% by weight, wheat or maize or a combination of both of these cereals. The feed further comprises a protease (i.e., at least one protease of the present invention) in such an amount that the feed includes 100-100,000 units of protease activity per kg.
Cereal-based feeds according to the present invention find use as feed for a variety of non-human animals, including poultry (e.g., turkeys, geese, ducks, chickens, etc.), livestock (e.g., pigs, sheep, cattle, goats, etc.), and companion animals (e.g., horses, dogs, cats, rabbits, mice, etc.). The feeds are particularly suitable for poultry and pigs, and in particular broiler chickens.
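For illustration only, the activity figures above can be checked with simple arithmetic, as in the following sketch. It assumes that the protease activity of the finished feed is the additive activity (units per g) multiplied by the inclusion rate (g per kg), and it classifies a xylanase to protease unit ratio against the three stated ranges. All names and example activities are hypothetical.

```python
# Illustrative sketch only. Ranges come from the text above; the example
# activities and all names are assumptions.
FEED_PROTEASE_U_PER_KG = (100.0, 100_000.0)

# Protease units per 1 unit of xylanase, narrowest range first.
RATIO_TIERS = (
    ("most preferred", 0.1, 10.0),
    ("more preferred", 0.01, 100.0),
    ("preferred", 0.001, 1000.0),
)


def protease_units_per_kg_feed(additive_u_per_g: float, dose_g_per_kg: float) -> float:
    """Protease activity delivered to the feed, in units per kg of feed."""
    return additive_u_per_g * dose_g_per_kg


def in_feed_range(units_per_kg: float) -> bool:
    """True when the feed carries 100-100,000 units of protease activity per kg."""
    lo, hi = FEED_PROTEASE_U_PER_KG
    return lo <= units_per_kg <= hi


def ratio_tier(xylanase_u_per_g: float, protease_u_per_g: float) -> str:
    """Classify the xylanase:protease ratio (expressed as 1:x) against the stated ranges."""
    x = protease_u_per_g / xylanase_u_per_g
    for name, lo, hi in RATIO_TIERS:
        if lo <= x <= hi:
            return name
    return "outside stated ranges"


if __name__ == "__main__":
    u = protease_units_per_kg_feed(additive_u_per_g=5000.0, dose_g_per_kg=1.0)
    print(u, in_feed_range(u), ratio_tier(2000.0, 5000.0))
```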
C. Textile and Leather Treatment
The present invention also provides compositions for the treatment of textiles that include at least one of the proteases of the present invention. In some embodiments, at least one protease of the present invention is a component of compositions suitable for the treatment of silk or wool (See e.g., U.S. RE Pat. No. 216,034, EP 134,267, U.S. Pat. No. 4,533,359, and EP 344,259).
In addition, the proteases of the present invention find use in a variety of applications where it is desirable to separate phosphorous from phytate. Accordingly, the present invention also provides methods for producing wool or animal hair material with improved properties. In some preferred embodiments, these methods comprise the steps of pretreating wool, wool fibres or animal hair material in a process selected from the group consisting of plasma treatment processes and the Delhey process; and subjecting the pretreated wool or animal hair material to a treatment with a proteolytic enzyme (e.g., at least one protease of the present invention) in an amount effective for improving the properties. In some embodiments, the proteolytic enzyme treatment occurs prior to the plasma treatment, while in other embodiments, it occurs after the plasma treatment. In some further embodiments, it is conducted as a separate step, while in other embodiments, it is conducted in combination with the scouring or the dyeing of the wool or animal hair material. In additional embodiments, at least one surfactant and/or at least one softener is present during the enzyme treatment step, while in other embodiments, the surfactant(s) and/or softener(s) are incorporated in a separate step wherein the wool or animal hair material is subjected to a softening treatment.
In some embodiments, the compositions of the present invention find use in methods for shrink-proofing wool fibers (See e.g., JP 4-327274). In some embodiments, the compositions are used in methods for shrink-proofing treatment of wool fibers by subjecting the fibers to a low-temperature plasma treatment, followed by treatment with a shrink-proofing resin such as a block-urethane resin, polyamide epichlorohydrin resin, glyoxalic resin, ethylene-urea resin or acrylate resin, and then treatment with a weight reducing proteolytic enzyme for obtaining a softening effect. In some embodiments, the plasma treatment step is a low-temperature treatment, preferably a corona discharge treatment or a glow discharge treatment.
In some embodiments, the low-temperature plasma treatment is carried out by using a gas, preferably a gas selected from the group consisting of air, oxygen, nitrogen, ammonia, helium, and argon. Conventionally, air is used but it may be advantageous to use any of the other indicated gases.
Preferably, the low-temperature plasma treatment is carried out at a pressure between about 0.1 torr and 5 torr for from about 2 seconds to about 300 seconds, preferably for about 5 seconds to about 100 seconds, more preferably from about 5 seconds to about 30 seconds.
As indicated above, the present invention finds use in conjunction with methods such as the Delhey process (See e.g., DE-A-43 32 692). In this process, the wool is treated in an aqueous solution of hydrogen peroxide in the presence of soluble wolframate, optionally followed by treatment in a solution or dispersion of synthetic polymers, for improving the anti-felting properties of the wool. In this method, the wool is treated in an aqueous solution of hydrogen peroxide (0.1-35% (w/w), preferably 2-10% (w/w)), in the presence of 2-60% (w/w), preferably 8-20% (w/w), of a catalyst (preferably Na2WO4), and in the presence of a nonionic wetting agent. Preferably, the treatment is carried out at pH 8-11, and room temperature. The treatment time depends on the concentrations of hydrogen peroxide and catalyst, but is preferably 2 minutes or less. After the oxidative treatment, the wool is rinsed with water. For removal of residual hydrogen peroxide, and optionally for additional bleaching, the wool is further treated in acidic solutions of reducing agents (e.g., sulfites, phosphites etc.).
In some embodiments, the enzyme treatment step is carried out for between about 1 minute and about 120 minutes. This step is preferably carried out at a temperature of between about 20°C and about 60°C, more preferably between about 30°C and about 50°C. Alternatively, the wool is soaked in or padded with an aqueous enzyme solution and then subjected to steaming at a conventional temperature and pressure, typically for about 30 seconds to about 3 minutes. In some preferred embodiments, the proteolytic enzyme treatment is carried out in an acidic or neutral or alkaline medium which may include a buffer.
In alternative embodiments, the enzyme treatment step is conducted in the presence of one or more conventional anionic, non-ionic or cationic surfactants. An example of a useful nonionic surfactant is Dobanol (from Henkel AG). In further embodiments, the wool or animal hair material is subjected to an ultrasound treatment, either prior to or simultaneous with the treatment with a proteolytic enzyme. In some preferred embodiments, the ultrasound treatment is carried out at a temperature of about 50°C for about 5 minutes. In some preferred embodiments, the amount of proteolytic enzyme used in the enzyme treatment step is between about 0.2 w/w % and about 10 w/w %, based on the weight of the wool or animal hair material. In some embodiments, in order to reduce the number of treatment steps, the enzyme treatment is carried out during dyeing and/or scouring of the wool or animal hair material, simply by adding the protease to the dyeing, rinsing and/or scouring bath. In some embodiments, enzyme treatment is carried out after the plasma treatment but in other embodiments, the two treatment steps are carried out in the opposite order.
Softeners conventionally used on wool are usually cationic softeners, either organic cationic softeners or silicone based products, but anionic or non-ionic softeners are also useful. Examples of useful softeners include, but are not limited to, polyethylene softeners and silicone softeners (i.e., dimethyl polysiloxanes (silicone oils), H-polysiloxanes, silicone elastomers, aminofunctional dimethyl polysiloxanes, aminofunctional silicone elastomers, and epoxyfunctional dimethyl polysiloxanes), and organic cationic softeners (e.g., alkyl quaternary ammonium derivatives).

In additional embodiments, the present invention provides compositions for the treatment of an animal hide that include at least one protease of the present invention. In some embodiments, the proteases of the present invention find use in compositions for treatment of animal hide, such as those described in WO 03/00865 (Insect Biotech Co., Taejeon-Si, Korea). In additional embodiments, the present invention provides methods for processing hides and/or skins into leather comprising enzymatic treatment of the hide or skin with the protease of the present invention (See e.g., WO 96/11285). In additional embodiments, the present invention provides compositions for the treatment of an animal skin or hide into leather that include at least one protease of the present invention.
Hides and skins are usually received in the tanneries in the form of salted or dried raw hides or skins. The processing of hides or skins into leather comprises several different process steps including the steps of soaking, unhairing and bating. These steps constitute the wet processing and are performed in the beamhouse. Enzymatic treatment utilizing the proteases of the present invention is applicable at any time during the processing of leather. However, proteases are usually employed during the wet processing (i.e., during soaking, unhairing and/or bating). Thus, in some preferred embodiments, the enzymatic treatment with at least one of the proteases of the present invention occurs during the wet processing stage.
In some embodiments, the soaking processes of the present invention are performed under conventional soaking conditions (e.g., at a pH in the range pH 6.0-11). In some preferred embodiments, the range is pH 7.0-10.0. In alternative embodiments, the temperature is in the range of 20-30°C, while in other embodiments it is preferably in the range 24-28°C. In yet further embodiments, the reaction time is in the range 2-24 hours, while the preferred range is 4-16 hours. In additional embodiments, tensides and/or preservatives are provided as desired.
The second phase of the bating step usually commences with the addition of the bate itself. In some embodiments, the enzymatic treatment takes place during bating. In some preferred embodiments, the enzymatic treatment takes place during bating, after the deliming phase. In some embodiments, the bating process of the present invention is performed using conventional conditions (e.g., at a pH in the range pH 6.0-9.0). In some preferred embodiments, the pH range is 6.0 to 8.5. In further embodiments, the temperature is in the range of 20-30°C, while in preferred embodiments, the temperature is in the range of 25-28°C. In some embodiments, the reaction time is in the range of 20-90 minutes, while in other embodiments, it is in the range 40-80 minutes. Processes for the manufacture of leather are well known to those skilled in the art (See e.g., WO 94/069429, WO 90/1121189, U.S. Pat. No. 3,840,433, EP 505920, GB 2233665, and U.S. Pat. No. 3,986,926, all of which are herein incorporated by reference).
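By way of illustration, the conventional soaking and bating conditions described above can be recorded as process windows and used to flag a planned run that falls outside them. The sketch below is not part of the disclosure; it assumes that the temperatures are in degrees Celsius, uses only the broader (non-preferred) ranges, and all names are hypothetical.

```python
# Illustrative sketch only. Windows reproduce the broader ranges in the text
# above; temperatures are assumed to be in degrees Celsius.
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessWindow:
    ph: tuple[float, float]
    temp_c: tuple[float, float]
    time_min: tuple[float, float]


SOAKING = ProcessWindow(ph=(6.0, 11.0), temp_c=(20.0, 30.0), time_min=(2 * 60.0, 24 * 60.0))
BATING = ProcessWindow(ph=(6.0, 9.0), temp_c=(20.0, 30.0), time_min=(20.0, 90.0))


def out_of_window(step: ProcessWindow, ph: float, temp_c: float, time_min: float) -> list[str]:
    """Return the names of the parameters that fall outside the given window."""
    checks = (
        ("pH", ph, step.ph),
        ("temperature", temp_c, step.temp_c),
        ("time", time_min, step.time_min),
    )
    return [name for name, value, (lo, hi) in checks if not lo <= value <= hi]


if __name__ == "__main__":
    print(out_of_window(BATING, ph=8.0, temp_c=25.0, time_min=60.0))
    print(out_of_window(BATING, ph=9.5, temp_c=25.0, time_min=120.0))
```

An empty list indicates that the planned conditions fall within the stated conventional window for that step.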
In further embodiments, the present invention provides bates comprising at least one protease of the present invention. A bate is an agent or an enzyme-containing preparation comprising the chemically active ingredients for use in beamhouse processes, in particular in the bating step of a process for the manufacture of leather. In some embodiments, the present invention provides bates comprising protease and suitable excipients. In some embodiments, the excipients are agents including, but not limited to, chemicals known and used in the art (e.g., diluents, emulgators, delimers and carriers). In some embodiments, the bate comprising at least one protease of the present invention is formulated as known in the art (See e.g., GB-A2250289, WO 96/11285, and EP 0784703).
In some embodiments, the bate of the present invention contains from 0.00005 to 0.01 g of active protease per g of bate, while in other embodiments, the bate contains from 0.0002 to 0.004 g of active protease per g of bate.
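As a non-limiting worked example, the protease loadings stated above translate into a mass of active protease for a given batch of bate as sketched below. The function name and the batch size are illustrative assumptions.

```python
# Illustrative sketch only. Loadings (g active protease per g bate) are the
# two ranges stated in the text above; names are hypothetical.
BROAD_LOADING = (0.00005, 0.01)
NARROW_LOADING = (0.0002, 0.004)


def active_protease_range_g(bate_g: float, loading: tuple[float, float] = BROAD_LOADING) -> tuple[float, float]:
    """Lower and upper mass (g) of active protease in bate_g grams of bate."""
    lo, hi = loading
    return bate_g * lo, bate_g * hi


if __name__ == "__main__":
    # 1 kg of bate: roughly 0.05 g to 10 g of active protease in the broad
    # range, and roughly 0.2 g to 4 g in the narrower range.
    print(active_protease_range_g(1000.0))
    print(active_protease_range_g(1000.0, NARROW_LOADING))
```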
Thus, the proteases of the present invention find use in numerous applications and settings.
EXPERIMENTAL
The present invention is described in further detail in the following Examples which are not in any way intended to limit the scope of the invention as claimed. The attached Figures are meant to be considered as integral parts of the specification and description of the invention. All references cited are herein specifically incorporated by reference for all that is described therein. The following Examples are offered to illustrate, but not to limit the claimed invention.
In the experimental disclosure which follows, the following abbreviations apply: PI (proteinase inhibitor), ppm (parts per million); M (molar); mM (millimolar); uM (micromolar); nM (nanomolar); mol (moles); mmol (millimoles); umol (micromoles); nmol (nanomoles); gm (grams); mg (milligrams); ug (micrograms); pg (picograms); L (liters); ml and ml (milliliters); ul and |jL (microliters); cm (centimeters); mm (millimeters); urn (micrometers); nm (nan9meters); U (units); V (volts); MW (molecular weight); sec (seconds); min(s) (minute/minutes); h(s) and hr(s) (hour/hours); °C (degrees Centigrade); OS (quantity sufficient); ND (not done); NA (not applicable); rpm (revolutions per minute); H2O (water); dH2O (deionized water); (HCI (hydrochloric acid); aa (amino acid); bp (base pair); kb (kilobase pair); kD (kilodaltons); cDNA (copy or complementary DNA); DNA

(deoxyribonucleic acid); ssDNA (single stranded DMA); dsDNA (double stranded DMA); dNTP (deoxyribonucleotide triphosphate); RNA (ribonucleic acid); MgCI2 (magnesium chloride); NaCI (sodium chloride); w/v (weight to volume); v/v (volume to volume); g (gravity); OD (optical density); Dulbecco's phosphate buffered solution (DPBS); SOC (2% Bacto-Tryptone, 0.5% Bacto Yeast Extract, 10 mM NaCI, 2.5 mM KCI); Terrific Broth (TB; 12 g/l Bacto Tryptone, 24 g/l glycerol, 2.31 g/l KH2PO4, and 12.54 g/l K2HPO4); OD280 (optical density at 280 nm); OD600 (optical density at 600 nm); A405 (absorbance at 405 nm); Vmax (the maximum initial velocity of an enzyme catalyzed reaction); PAGE (polyacrylamide gel electrophoresis); PBS (phosphate buffered saline [150 mM NaCI, 10 mM sodium phosphate buffer, pH 7.2]); PBST (PBS+0.25%TWEEN®20); PEG (polyethylene glycol); PCR (polymerase chain reaction); RT-PCR (reverse transcription PCR); SDS (sodium dodecyl sulfate); Tris (tris(hydroxymethyl)aminomethane); HEPES (N-[2-Hydroxyethyl]piperazine-N-[2-ethanesulfonic acid]); HBS (HEPES buffered saline); SDS (sodium dodecylsulfate); Tris-HCI (tris[Hydroxymethyl]aminomethane-hydrochloride); Tricine (N-[tris-(hydroxymethyl)-methyl]-glycine); CHES (2-(N-cyclo-hexylamino) ethane-sulfonic acid); TAPS (3-{[tris-(hydroxymethyl)-methyl]-amino}-propanesulfonic acid); CAPS (3-(cyclo-hexylamino)-propane-sulfonic acid; DMSO (dimethyl sulfoxide); DTT (1,4-dithio-DL-threitol); SA (sinapinic acid (s,5-dimethoxy-4-hydroxy cinnamic acid); TCA (trichloroacetic acid); Glut and GSH (reduced glutathione); GSSG (oxidized glutathione); TCEP (Tris[2-carboxyethyl] phosphine); Ci (Curies); mCi (milliCuries); uCi (microCuries); HPLC (high pressure liquid chromatography); RP-HPLC (reverse phase high pressure liquid chromatography); TLC (thin layer chromatography); MALDI-TOF (matrix-assisted laser desorption/ionization-time of flight); Ts (tosyl); Bn (benzyl); Ph (phenyl); Ms (mesyl); Et (ethyl), Me (methyl); Taq (Thermus 
aquaticus DNA polymerase); Klenow (DNA polymerase I large (Klenow) fragment); rpm (revolutions per minute); EGTA (ethylene glycol-bis((3-aminoethyl ether) N, N, N', N'-tetraacetic acid); EDTA (ethylenediaminetetracetic acid); bla 0-lactamase or ampicillin-resistance gene); HDL (heavy duty liquid detergent, i.e., laundry detergent); MJ Research (MJ Research, Reno.NV); Baseclear (Baseclear BV, Inc., Leiden, the Netherlands); PerSeptive (PerSeptive Biosystems, Framingham, MA); ThermoFinnigan (ThermoFinnigan, San Jose, CA); Argo (Argo BioAnalytica, Morris Plains, NJ);Seitz EKS (SeitzSchenk Filtersystems GmbH, Bad Kreuznach, Germany); Pall (Pall Corp., East Hills, NY); Spectrum (Spectrum Laboratories, Dominguez Rancho, CA); Molecular Structure (Molecular Structure Corp., Woodlands, TX); Accelrys (Accelrys, Inc., San Diego, CA); Chemical Computing (Chemical Computing Corp., Montreal, Canada); New Brunswick (New Brunswick Scientific, Co., Edison, NJ); CFT (Center for Test Materials, Vlaardingeng, the

Netherlands); Procter & Gamble (Procter & Gamble, Inc., Cincinnati, OH); GE Healthcare (GE Healthcare, Chalfont St. Giles, United Kingdom); DNA2.0 (DNA2.0, Menlo Park, CA); OXOID (Oxoid, Basingstoke, Hampshire, UK); Megazyme (Megazyme International Ireland Ltd., Bray Business Park, Bray, Co. Wicklow, Ireland); Finnzymes (Finnzymes Oy, Espoo, Finland); Kelco (CP Kelco, Wilmington, DE); Corning (Corning Life Sciences, Corning, NY); NEN (NEN Life Science Products, Boston, MA); Pharma AS (Pharma AS, Oslo, Norway); Dynal (Dynal, Oslo, Norway); Bio-Synthesis (Bio-Synthesis, Lewisville, TX); ATCC (American Type Culture Collection, Rockville, MD); Gibco/BRL (Gibco/BRL, Grand Island, NY); Sigma (Sigma Chemical Co., St. Louis, MO); Pharmacia (Pharmacia Biotech, Piscataway, NJ); NCBI (National Center for Biotechnology Information); Applied Biosystems (Applied Biosystems, Foster City, CA); BD Biosciences and/or Clontech (BD Biosciences CLONTECH Laboratories, Palo Alto, CA); Operon Technologies (Operon Technologies, Inc., Alameda, CA); MWG Biotech (MWG Biotech, High Point, NC); Oligos Etc (Oligos Etc. Inc, Wilsonville, OR); Bachem (Bachem Bioscience, Inc., King of Prussia, PA); Difco (Difco Laboratories, Detroit, MI); Mediatech (Mediatech, Herndon, VA); Santa Cruz (Santa Cruz Biotechnology, Inc., Santa Cruz, CA); Oxoid (Oxoid Inc., Ogdensburg, NY); Worthington (Worthington Biochemical Corp., Freehold, NJ); GIBCO BRL or Gibco BRL (Life Technologies, Inc., Gaithersburg, MD); Millipore (Millipore, Billerica, MA); Bio-Rad (Bio-Rad, Hercules, CA); Invitrogen (Invitrogen Corp., San Diego, CA); NEB (New England Biolabs, Beverly, MA); Pierce (Pierce Biotechnology, Rockford, IL); Takara (Takara Bio Inc., Otsu, Japan); Roche (Hoffmann-La Roche, Basel, Switzerland); EM Science (EM Science, Gibbstown, NJ); Qiagen (Qiagen, Inc., Valencia, CA); Biodesign (Biodesign Intl., Saco, Maine); Aptagen (Aptagen, Inc., Herndon, VA); Sorvall (Sorvall brand, from Kendro Laboratory Products, Asheville, NC); Molecular Devices (Molecular Devices, Corp., Sunnyvale, CA); R&D Systems (R&D Systems, Minneapolis, MN); Stratagene (Stratagene Cloning Systems, La Jolla, CA); Marsh (Marsh Biosciences, Rochester, NY); Bio-Tek (Bio-Tek Instruments, Winooski, VT); Biacore (Biacore, Inc., Piscataway, NJ); PeproTech (PeproTech, Rocky Hill, NJ); SynPep (SynPep, Dublin, CA); New Objective (New Objective brand; Scientific Instrument Services, Inc., Ringoes, NJ); Waters (Waters, Inc., Milford, MA); Matrix Science (Matrix Science, Boston, MA); Dionex (Dionex, Corp., Sunnyvale, CA); Monsanto (Monsanto Co., St. Louis, MO); Wintershall (Wintershall AG, Kassel, Germany); BASF (BASF Co., Florham Park, NJ); Huntsman (Huntsman Petrochemical Corp., Salt Lake City, UT); Enichem (Enichem Iberica, Barcelona, Spain); Fluka Chemie AG (Fluka Chemie AG, Buchs, Switzerland); Gist-Brocades (Gist-Brocades, NV, Delft, the Netherlands); Dow Corning (Dow Corning Corp., Midland, MI); and Microsoft (Microsoft, Inc., Redmond, WA).
EXAMPLE 1 Assays
In the following Examples, various assays were used, such as protein determinations, application-based tests, and stability-based tests. For ease in reading, the following assays are set forth below and referred to in the respective Examples. Any deviations from the protocols provided below in any of the experiments performed during the development of the present invention are indicated in the Examples.
Some of the detergents used in the following Examples had the following compositions. In Compositions I and II, the balance (to 100%) is perfume/dye and/or water. The pH of these compositions was from about 5 to about 7 for Composition I, and from about 7.5 to about 8.5 for Composition II. In Composition III, the balance (to 100%) comprised water and/or the minors perfume, dye, brightener/SRPI/sodium carboxymethylcellulose/photobleach/MgSO4/PVPVI/suds suppressor/high molecular PEG/clay.


Composition III
C14-C15AS or sodium tallow alkyl sulfate 3.0
LAS 8.0
C12-C15AE3S 1.0
C12-C15E5 or E3 5.0
QAS
Zeolite A 11.0
SKS-6 (dry add) 9.0
MA/AA 2.0
AA
3Na Citrate 2H2O
Citric Acid (Anhydrous) 1.5
DTPA
EDDS 0.5
HEDP 0.2
PB1

Percarbonate 3.8
NOBS
NACA OBS 2.0
TAED 2.0
BB1 0.34
BB2
Anhydrous Na Carbonate 8.0
Sulfate 2.0
Silicate
Protease B
Protease C -
Lipase
Amylase
Cellulase
Pectin Lyase 0.001
Aldose Oxidase 0.05
PAAC
A. TCA Assay for Protein Content Determination in 96-well Microtiter Plates
This assay was started using filtered culture supernatant from microtiter plates grown 4 days at 33 °C with shaking at 230 RPM and humidified aeration. A fresh 96-well flat bottom plate was used for the assay. First, 100 uL/well of 0.25 N HCl were placed in the wells. Then, 50 uL filtered culture broth were added to the wells. The light scattering/absorbance at 405 nm (use 5 sec mixing mode in the plate reader) was then determined, in order to provide the "blank" reading.
For the test, 100 uL/well 15% (w/v) TCA was placed in the plates and incubated between 5 and 30 min at room temperature. The light scattering/absorbance at 405 nm (use 5 sec mixing mode in the plate reader) was then determined.
The calculations were performed by subtracting the blank (i.e., no TCA) from the test reading with TCA. If desired, a standard curve can be created by calibrating the TCA readings with AAPF assays of clones with known conversion factors. However, the TCA results are linear with respect to protein concentration from 50 to 500 ppm and can thus be plotted directly against enzyme performance for the purpose of choosing good-performing variants.
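As a minimal sketch of the blank correction described for the TCA assay, assuming the A405 readings are available per well as plain lists (the function name and the example readings are hypothetical and are not taken from the assay records), the calculation could be written as:

```python
def tca_signal(blank_a405, test_a405):
    """Return the blank-corrected TCA reading for each well.

    blank_a405: A405 readings taken before TCA addition (the "blank").
    test_a405:  A405 readings of the same wells after TCA incubation.
    """
    if len(blank_a405) != len(test_a405):
        raise ValueError("blank and test plates must have the same number of wells")
    # Subtract the no-TCA blank from the reading with TCA, well by well.
    return [test - blank for blank, test in zip(blank_a405, test_a405)]


# Hypothetical readings for three wells (illustration only).
corrected = tca_signal([0.10, 0.12, 0.11], [0.35, 0.52, 0.11])
```

The corrected values can then be plotted against a performance readout, within the 50 to 500 ppm range over which the text states the response is linear.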
B. suc-AAPF-pNA Assay of Proteases in 96-well Microtiter Plates
In this assay system, the reagent solutions used were:
1. 100 mM Tris/HCl, pH 8.6, containing 0.005% TWEEN®-80 (Tris buffer)
2. 100 mM Tris buffer, pH 8.6, containing 10 mM CaCl2 and 0.005% TWEEN®-80 (Tris/Ca buffer)
3. 160 mM suc-AAPF-pNA in DMSO (suc-AAPF-pNA stock solution) (Sigma: S-7388)
To prepare suc-AAPF-pNA working solution, 1 ml AAPF stock was added to 100 ml Tris/Ca buffer and mixed well for at least 10 seconds.
The assay was performed by adding 10 ul of diluted protease solution to each well, followed by the addition (quickly) of 190 ul 1 mg/ml AAPF-working solution. The solutions were mixed for 5 sec., and the absorbance change was read at 410 nm in an MTP reader, at 25°C. The protease activity was expressed as AU (activity = ΔOD·min⁻¹·ml⁻¹).
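The activity unit for the suc-AAPF-pNA assay (absorbance change per minute per ml) can be illustrated with a short sketch. The assumptions here are not stated in the text: the slope is estimated by an ordinary least-squares fit of absorbance against time, and "per ml" is taken to refer to the 10 ul of diluted protease added to the well. The function name and readings are hypothetical.

```python
def aapf_activity(times_min, a410, sample_volume_ml=0.010):
    """Estimate activity as (change in A410 per minute) per ml of enzyme sample.

    times_min:        read times in minutes.
    a410:             absorbance at 410 nm at each read time.
    sample_volume_ml: volume of diluted protease in the well (assumed 10 ul).
    """
    n = len(times_min)
    if n < 2 or n != len(a410):
        raise ValueError("need at least two paired time/absorbance readings")
    mean_t = sum(times_min) / n
    mean_a = sum(a410) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("read times must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, a410))
    slope_per_min = sxy / sxx  # least-squares slope of A410 versus time
    return slope_per_min / sample_volume_ml


# Hypothetical kinetic trace: A410 rising by 0.05 per minute.
activity = aapf_activity([0.0, 1.0, 2.0, 3.0], [0.10, 0.15, 0.20, 0.25])
```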
C. Keratin Hydrolysis Assay
In this assay system, the chemical and reagent solutions used were:
Keratin: ICN 902111
Detergent: Detergent Composition II. 1.6 g detergent is dissolved in 1000 ml water (pH = 8.2). Then, 0.6 ml CaCl2/MgCl2 of 10,000 gpg is added, as well as 1190 mg HEPES, giving a hardness and buffer strength of 6 gpg and 5 mM, respectively. The pH is adjusted to 8.2 with NaOH.
Picrylsulfonic acid (TNBS): Sigma P-2297 (5% solution in water)
Reagent A: 45.4 g Na2B4O7.10 H2O (Merck 6308) and 15 ml of 4N NaOH are dissolved together to a final volume of 1000 ml (by heating if needed)
Reagent B: 35.2 g NaH2PO4.1 H2O (Merck 6346) and 0.6 g Na2SO3 (Merck 6657) are dissolved together to a final volume of 1000 ml.
Method:
Prior to the incubations, keratin was sieved on a 100 um sieve in small portions at a time. Then, 10 g of the [...] MTPs were filled with 60 ul TNBS reagent A per well. From the incubated plates, 10 ul was transferred to the MTPs with TNBS reagent A. The plates were covered with tape and shaken for 20 minutes in a bench shaker (BMG Thermostar) at room temperature and 500 rpm. Finally, 200 ul of reagent B was added to the wells, mixed for 1 minute on a shaker, and the absorbance at 405 nm was measured with the MTP-reader.
Calculation of the Keratin Hydrolyzing Activity:
The obtained absorbance value was corrected for the blank value (substrate without enzyme). The resulting absorbance provides a measure for the hydrolytic activity. For each sample (variant) the performance index was calculated. The performance index compares the performance of the variant (actual value) and the standard enzyme (theoretical value) at the same protein concentration. In addition, the theoretical values can be calculated, using the parameters of the Langmuir equation of the standard enzyme. A performance index (PI) that is greater than 1 (PI>1) identifies a better variant (as compared to the standard [e.g., wild-type]), while a PI of 1 (PI=1) identifies a variant that performs the same as the standard, and a PI that is less than 1 (PI<1) identifies a variant that performs worse than the standard.
D. Microswatch Assay for Testing Protease Performance
None of the detergents used in these assays contained enzymes.

Detergent Preparations:
1. European Detergent Solution:
Milli-Q water was adjusted to 15 gpg water hardness (Ca/Mg=4/1), 7.6 g/l ARIEL® Regular detergent was added, and the detergent solution was stirred vigorously for at least 30 minutes. The detergent was filtered before use in the assay through a 0.22 um filter (e.g., Nalgene top bottle filter).
2. Japanese Detergent Solution:
Milli-Q water was adjusted to 3 gpg water hardness (Ca/Mg=3/1), 0.66 g/l Detergent Composition III was added, and the detergent solution was stirred vigorously for at least 30 minutes. The detergent was filtered before use in the assay through a 0.22 um filter (e.g., Nalgene top bottle filter).
3. Cold Water Liquid Detergent (US Conditions):
Milli-Q water was adjusted to 6 gpg water hardness (Ca/Mg=3/1), 1.60 g/l TIDE® LVJ-1 detergent was added, and the detergent solution was stirred vigorously for at least 15 minutes. Then, 5 mM HEPES buffer was added and the pH was set at 8.2. The detergent was filtered before use in the assay through a 0.22 um filter (e.g., Nalgene top bottle filter).
4. Low pH Liquid Detergent (US Conditions):
Milli-Q water was adjusted to 6 gpg water hardness (Ca/Mg=3/1), 1.60 g/l Detergent Composition I was added, and the detergent solution was stirred vigorously for at least 15 minutes. The pH was set at 6.0 using 1N NaOH solution. The detergent was filtered before use in the assay through a 0.22 um filter (e.g., Nalgene top bottle filter).
Microswatches:
Microswatches of 1/4" circular diameter were ordered from and delivered by CFT Vlaardingen. The microswatches were pretreated using the fixation method described below. Single microswatches were placed in each well of a 96-well microtiter plate vertically to expose the whole surface area (i.e., not flat on the bottom of the well).
Bleach Fixation ("Superfixed"):

In a 10 L stainless steel beaker containing 10 L of water, the water was heated to 60°C for fixation of swatches used in European conditions (=Super fixed). For Japanese condition(s) and other conditions, the swatches were fixed at room temperature (=3K). Then, 10 ml of 30% hydrogen peroxide (1 ml/L of H2O2; final conc. of H2O2 is 300 ppm) were added. Then, 100 swatches (10 swatches/L) were added to the solution. The solution was allowed to sit for 30 minutes with occasional stirring and monitoring of the temperature. The swatches were rinsed 7-8 times with cold water and placed on a bench to dry. A towel was placed on top of the swatches, as this prevents the swatches from curling up. For the 3K swatches, the procedure is repeated (except the water was not heated and 10x the amount of hydrogen peroxide was added).
Alternative Fixation ("3K" Swatch Fixation):
This particular swatch fixation was done at room temperature; however, the amount of 30% H2O2 added was 10X more than in the Superfixed Swatch Fixation. Bubble formation (frothing) will be visible and therefore it is necessary to use a bigger beaker to account for this. First, 8 liters of distilled water were placed in a 10 L beaker, and 80 ml of 30% hydrogen peroxide were added. The water and peroxide were mixed well with a ladle. Then, 40 pieces of EMPA 116 swatches were spread into a fan before being added to the solution to ensure uniform fixation. The swatches were swirled in the solution (using the ladle) for 30 minutes, continuously for the first five minutes and occasionally for the remaining 25 minutes. The solution was discarded and the swatches were rinsed 6 times with approximately 6 liters of distilled water each time. The swatches were placed on top of paper towels to dry. The air-dried swatches were punched using a 1/4" circular die on an expulsion press. A single microswatch was placed vertically into each well of a 96-well microtiter plate to expose the whole surface area (i.e., not flat on the bottom of the well).
Enzyme Samples:
The enzyme samples were tested at appropriate concentrations for the respective geography, and diluted in 10 mM NaCl, 0.005% TWEEN®-80 solution.
Test Method:
The incubator was set at the desired temperature: 20°C for cold water liquid conditions; 30°C for low-pH liquid conditions; 40°C for European conditions; 20°C for Japanese and North American conditions. The pretreated and precut swatches were placed into the wells of a 96-well MTP, as described above. The enzyme samples were diluted, if needed, in 10 mM NaCl, 0.005% TWEEN®-80 to 20x the desired concentration. The desired detergent solutions were prepared as described above. Then, 190 ul of detergent solution were added to each well of the MTP. To this mixture, 10 ul of enzyme solution were added to each well (to provide a total volume of 200 ul/well). The MTP was sealed with a plate sealer and placed in an incubator for 60 minutes, with agitation at 350 rpm. Following incubation under the appropriate conditions, 100 ul of solution from each well were removed and placed into a fresh MTP. The new MTP containing 100 ul of solution/well was read at 405 nm in a MTP reader. Blank controls, as well as a control containing a microswatch and detergent but no enzyme, were also included.
Table 1-1. Detergent Composition and Incubation Conditions in the µSwatch Assay.

The stock solution was used at a concentration of 15,000 gpg.
Stock #1 = Ca/Mg 3:1 (1.92 M Ca2+ = 282.3 g/L CaCl2.2H2O; 0.64 M Mg2+ = 130.1 g/L MgCl2.6H2O)
Stock #2 = Ca/Mg 4:1 (2.05 M Ca2+ = 301.4 g/L CaCl2.2H2O; 0.51 M Mg2+ = 103.7 g/L MgCl2.6H2O)
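The arithmetic behind a 15,000 gpg hardness stock can be recomputed with a short sketch. The conversion factor (1 grain per US gallon taken as 17.118 mg/L expressed as CaCO3) and the molar masses are standard values supplied here as assumptions; they are not stated in the text, and the function name is hypothetical. The Ca/Mg ratio is treated as a molar ratio.

```python
# Conversion constants (standard values; assumptions, not given in the text).
MG_CACO3_PER_L_PER_GPG = 17.118   # 1 US grain per gallon, as mg/L CaCO3
MW_CACO3 = 100.09                 # g/mol
MW_CACL2_2H2O = 147.01            # g/mol
MW_MGCL2_6H2O = 203.30            # g/mol


def hardness_stock(gpg, ca_parts, mg_parts):
    """Molarities and salt masses per liter for a Ca/Mg hardness stock."""
    # Total divalent cation concentration in mol/L (hardness as CaCO3 equivalents).
    total_molar = gpg * MG_CACO3_PER_L_PER_GPG / 1000.0 / MW_CACO3
    ca_molar = total_molar * ca_parts / (ca_parts + mg_parts)
    mg_molar = total_molar - ca_molar
    return {
        "ca_molar": ca_molar,
        "mg_molar": mg_molar,
        "cacl2_2h2o_g_per_l": ca_molar * MW_CACL2_2H2O,
        "mgcl2_6h2o_g_per_l": mg_molar * MW_MGCL2_6H2O,
    }


stock_1 = hardness_stock(15000, 3, 1)  # Ca/Mg 3:1
stock_2 = hardness_stock(15000, 4, 1)  # Ca/Mg 4:1
```

Under these assumptions the 3:1 stock works out to roughly 1.92 M Ca2+ and 0.64 M Mg2+, and the 4:1 stock to roughly 2.05 M Ca2+ and 0.51 M Mg2+; small differences in the salt masses arise from rounding of the molarities.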
Calculation of the BMI Performance:
The obtained absorbance value was corrected for the blank value (obtained after incubation of microswatches in the absence of enzyme). The resulting absorbance was a measure for the hydrolytic activity. For each sample (variant) the performance index was calculated. The performance index compares the performance of the variant (actual value) and the standard enzyme (theoretical value) at the same protein concentration. In addition, the theoretical values can be calculated, using the parameters of the Langmuir equation of the standard enzyme. A performance index (PI) that is greater than 1 (PI>1) identifies a better variant (as compared to the standard [e.g., wild-type]), while a PI of 1 (PI=1) identifies a variant that performs the same as the standard, and a PI that is less than 1 (PI<1) identifies a variant that performs worse than the standard. Thus, the PI identifies winners, as well as variants that are less desirable for use under certain circumstances.
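The performance index calculation can be sketched as follows. The text does not give the functional form of the Langmuir equation, so the common saturation form A = A_max·c/(K + c) is assumed here; the parameter names (a_max, k_half) and all numerical values are hypothetical and serve only to illustrate the ratio of the observed variant signal to the theoretical signal of the standard enzyme at the same protein concentration.

```python
def langmuir(conc_ppm, a_max, k_half):
    """Theoretical blank-corrected signal of the standard enzyme.

    Assumed form: A = a_max * c / (k_half + c), with a_max and k_half
    fitted beforehand to a dose-response curve of the standard enzyme.
    """
    if conc_ppm < 0:
        raise ValueError("protein concentration must be non-negative")
    return a_max * conc_ppm / (k_half + conc_ppm)


def performance_index(variant_signal, conc_ppm, a_max, k_half):
    """Ratio of the variant's observed signal to the standard's theoretical signal."""
    theoretical = langmuir(conc_ppm, a_max, k_half)
    if theoretical <= 0:
        raise ValueError("theoretical signal must be positive to form a ratio")
    return variant_signal / theoretical


# Hypothetical standard-enzyme parameters and one variant reading at 1.0 ppm.
theoretical_at_1ppm = langmuir(1.0, a_max=1.2, k_half=0.5)
pi = performance_index(0.96, 1.0, a_max=1.2, k_half=0.5)
```

In this illustration the variant signal exceeds the theoretical value for the standard, so the resulting PI is greater than 1.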
D. Dimethylcasein Hydrolysis Assay (96 wells)
In this assay system, the chemical and reagent solutions used were:
Dimethylcasein (DMC): Sigma C-9801
TWEEN®-80: Sigma P-8074
PIPES buffer (free acid): Sigma P-1851; 15.1 g is dissolved in about 960 ml water; pH is adjusted to 7.0 with 4N NaOH, 1 ml 5% TWEEN®-80 is added and the volume brought up to 1000 ml. The final concentration of PIPES and TWEEN®-80 is 50 mM and 0.005%, respectively.
Picrylsulfonic acid (TNBS): Sigma P-2297 (5% solution in water)
Reagent A: 45.4 g Na2B4O7.10 H2O (Merck 6308) and 15 ml of 4N NaOH
are dissolved together to a final volume of 1000 ml (by
heating if needed)
Reagent B: 35.2 g NaH2PO4.1 H2O (Merck 6346) and 0.6 g Na2SO3 (Merck
6657) are dissolved together to a final volume of 1000 ml.
Method:
To prepare the substrate, 4 g DMC were dissolved in 400 ml PIPES buffer. The filtered
culture supernatants were diluted with PIPES buffer; the final concentration of the controls in the growth plate was 20 ppm. Then, 10 ul of each diluted supernatant were added to 200 ul substrate in the wells of a MTP. The MTP plate was covered with tape, shaken for a few seconds and placed in an oven at 37°C for 2 hours without agitation.
About 15 minutes before removal of the 1st plate from the oven, the TNBS reagent was prepared by mixing 1 ml TNBS solution per 50 ml of reagent A. MTPs were filled with 60 ul TNBS reagent A per well. The incubated plates were shaken for a few seconds, after which 10 ul were transferred to the MTPs with TNBS reagent A. The plates were covered with tape and shaken for 20 minutes in a bench shaker (BMG Thermostar) at room temperature and 500 rpm. Finally, 200 ul reagent B were added to the wells, mixed for 1 minute on a shaker, and the absorbance at 405 nm was determined using an MTP-reader.
Calculation of Dimethylcasein Hydrolyzing Activity:
The obtained absorbance value was corrected for the blank value (substrate without enzyme). The resulting absorbance is a measure for the hydrolytic activity. The (arbitrary) specific activity of a sample was calculated by dividing the absorbance by the determined protein concentration.
E. Thermostability Assay
This assay is based on the dimethylcasein hydrolysis, before and after heating of the buffered culture supernatant. The same chemical and reagent solutions were used as described in the dimethylcasein hydrolysis assay.
Method:
The filtered culture supernatants were diluted to 20 ppm in PIPES buffer (based on the concentration of the controls in the growth plates). Then, 50 ul of each diluted supernatant were placed in the empty wells of a MTP. The MTP plate was incubated in an iEMS incubator/shaker HT (Thermo Labsystems) for 90 minutes at 60°C and 400 rpm. The plates were cooled on ice for 5 minutes. Then, 10 ul of the solution was added to a fresh MTP containing 200 ul dimethylcasein substrate/well. This MTP was covered with tape, shaken for a few seconds and placed in an oven at 37 °C for 2 hours without agitation. The same detection method as used for the DMC hydrolysis assay was used.
Calculation of Thermostability:
The residual activity of a sample was expressed as the ratio of the final absorbance to the initial absorbance, both corrected for blanks.
F. LAS Stability Assay
LAS stability was measured after incubation of the test protease in the presence of 0.06% LAS (dodecylbenzenesulfonate sodium), and the residual activity was determined using the AAPF assay.

Reagents:
Dodecylbenzenesulfonate, Sodium salt (=LAS): Sigma D-2525
TWEEN®-80: Sigma P-8074
TRIS buffer (free acid): Sigma T-1378; 6.35 g is dissolved in about 960 ml water; pH is adjusted to 8.2 with 4N HCl. Final concentration of TRIS is 52.5 mM.
LAS stock solution: Prepare a 10.5% LAS solution in MQ water (=10.5 g per 100 ml MQ)
TRIS buffer, 100 mM, pH 8.6 (100 mM Tris/0.005% TWEEN®-80)
TRIS-Ca buffer, pH 8.6 (100 mM Tris/10 mM CaCl2/0.005% TWEEN®-80)
Hardware:
Flat bottom MTPs: Costar (#9017)
Biomek FX
ASYS Multipipettor
Spectramax MTP Reader
iEMS Incubator/Shaker
Innova 4330 Incubator/Shaker
Biohit multichannel pipette
BMG Thermostar Shaker
Method:
A 10 ul 0.063% LAS solution was prepared in 52.5 mM Tris buffer pH 8.2. The AAPF working solution was prepared by adding 1 ml of 100 mg/ml AAPF stock solution (in DMSO) to 100 ml (100 mM) TRIS buffer, pH 8.6. To dilute the supernatants, flat-bottomed plates were filled with dilution buffer and an aliquot of the supernatant was added and mixed well. The dilution ratio depended on the concentration of the ASP-controls in the growth plates (AAPF activity). The desired protein concentration was 80 ppm.
Ten ul of the diluted supernatant was added to 190 ul 0.063% LAS buffer/well. The MTP was covered with tape, shaken for a few seconds and placed in an incubator (Innova 4230) at 25°C, for 60 minutes at 200 rpm agitation. The initial activity (t=10 minutes) was determined after 10 minutes of incubation by transferring 10 ul of the mixture in each well to a fresh MTP containing 190 ul AAPF working solution. These solutions were mixed well and the AAPF activity was measured using a MTP Reader (20 readings in 5 minutes at 25°C).
The final activity (t=60 minutes) was determined by removing another 10 ul of solution from the incubating plate after 60 minutes of incubation. The AAPF activity was then determined as described above. The calculations were performed as follows: the % Residual Activity was [t-60 value]*100/[t-10 value].
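The LAS stability calculation, i.e., the activity after 60 minutes expressed as a percentage of the activity after 10 minutes, can be sketched as below, together with the in-well LAS concentration that results from adding 10 ul of supernatant to 190 ul of 0.063% LAS buffer. The function names and the example rates are hypothetical.

```python
def percent_residual_activity(rate_t10, rate_t60):
    """Residual AAPF activity after 60 min, as a percentage of the 10 min value."""
    if rate_t10 <= 0:
        raise ValueError("the 10-minute activity must be positive")
    return rate_t60 * 100.0 / rate_t10


def in_well_concentration(stock_percent, stock_volume_ul, total_volume_ul):
    """Concentration after dilution of the LAS buffer by the added supernatant."""
    return stock_percent * stock_volume_ul / total_volume_ul


# Hypothetical AAPF rates for one well (illustration only).
residual = percent_residual_activity(rate_t10=0.40, rate_t60=0.30)

# 190 ul of 0.063% LAS buffer plus 10 ul supernatant, 200 ul in total.
las_in_well = in_well_concentration(0.063, 190.0, 200.0)
```

The dilution works out to just under 0.06% LAS in the well, which agrees with the nominal concentration stated for this assay.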
G. Scrambled Egg Hydrolysis Assay
Proteases release insoluble particles from scrambled egg, which was baked into the wells of 96-well microtiter plates. The scrambled egg coated wells were treated with a mixture of protease-containing culture filtrate and ADW (automatic dishwash detergent) to determine the enzyme performance in scrambled egg removal. The rate of turbidity is a measure of the enzyme activity.
Materials:
Water bath
Oven with mechanical air circulation (Memmert ULE 400)
Incubator/shaker with amplitude of 0.25 cm (Multitron), equipped with MTP-holders and
aluminum covers and bottoms
Biomek FX liquid-handling system (Beckman)
Micro plate reader (Molecular Devices Spectramax 340, SOFTmax Pro Software)
Nichiryo 8800 multi channel syringe dispenser + syringes
Micro titer plate tape
Single and multi channel pipettes with tips
Grade A medium eggs
CaCl2.2H2O (Merck 102382); MgCl2.6H2O (Merck 105833); Na2CO3 (Merck 6392)
ADW product:
LH-powder (= Light House)
Procedure:
Three eggs were stirred with a fork in a glass beaker and 100 ml milk (at 4°C or room temperature) was added. The beaker was placed in an 85°C water bath, and the mixture was stirred constantly with a spoon. As the mixture became thicker, care was taken to scrape the solidifying material continuously from the walls and bottom of the beaker. When the mixture was slightly runny (after about 25 minutes) the beaker was removed from the bath. Another 40 ml milk was added to the mixture and blended with a hand mixer or blender for 2 minutes. The mixture was cooled to room temperature (an ice bath can be used). The substrate was then stirred with an additional amount of 5 to 15% water (usually 7.5%).
Test Method:
First, 50ul of scrambled egg substrate were dispensed into each well of a MTP. The plates were allowed to dry at room temperature overnight (about 17 hours), baked in oven at 80°C for 2 hours, then cooled to room temperature.
ADW product solution was prepared by dissolving 2.85 g of LH-powder into 1 L water. Only about 15 minutes dissolution time was needed and filtration of the solution was not needed. Then, 1.16 mL artificial hardness solution was added and 2120 mg Na2CO3 was dissolved in the solution.
Hardness solution was prepared by mixing 188.57 g CaCl2.2H2O and 86.92 g MgCl2.6H2O in 1 L demi water (equal to 1.28 M Ca + 0.43 M Mg, 10,000 gpg in total). The above-mentioned amounts of ADW, CaCl2 and MgCl2 were already proportionally increased (200/190x) because of the addition of 10 ul supernatant to 190 ul ADW solution.
ADW solution (190 ul) was added to each well of the substrate plate. The MTPs were processed by adding 10 ul of supernatant to each well and sealing the plate with tape. The plate was placed in a pre-warmed incubator/shaker and secured with a metal cover and clamp. The plate was then washed for 30 minutes at the appropriate temperature (50°C for US) at 700 rpm. The plate was removed from the incubator/shaker. With gentle up and down movements of the liquid, about 125 ul of the warm supernatant were transferred to an empty flat bottom plate. After cooling, exactly 100 ul of the dispersion was dispensed into the wells of an empty flat bottom plate. The absorbance at 405 nm was determined using a microtiter plate reader.
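The hardness of the artificial hardness solution and the effect of the 190 ul to 200 ul dilution in the well can be checked with a short sketch. The molar masses and the conversion factor (1 grain per US gallon taken as 17.118 mg/L expressed as CaCO3) are standard values supplied as assumptions; the function names are hypothetical.

```python
# Standard constants (assumptions; not given in the text).
MW_CACL2_2H2O = 147.01            # g/mol
MW_MGCL2_6H2O = 203.30            # g/mol
MW_CACO3 = 100.09                 # g/mol
MG_CACO3_PER_L_PER_GPG = 17.118   # 1 US grain per gallon, as mg/L CaCO3


def stock_hardness_gpg(cacl2_2h2o_g_per_l, mgcl2_6h2o_g_per_l):
    """Total hardness (gpg) of a solution made from the two chloride salts."""
    ca_molar = cacl2_2h2o_g_per_l / MW_CACL2_2H2O
    mg_molar = mgcl2_6h2o_g_per_l / MW_MGCL2_6H2O
    mg_caco3_per_l = (ca_molar + mg_molar) * MW_CACO3 * 1000.0
    return mg_caco3_per_l / MG_CACO3_PER_L_PER_GPG


def scale_to_well(prepared_value, added_ul=190.0, total_ul=200.0):
    """In-well value after 10 ul supernatant dilutes 190 ul of ADW solution."""
    return prepared_value * added_ul / total_ul


hardness = stock_hardness_gpg(188.57, 86.92)   # close to 10,000 gpg
powder_in_well = scale_to_well(2.85)           # g/L LH-powder in the well
carbonate_in_well = scale_to_well(2120.0)      # mg/L Na2CO3 in the well
```

The scaling shows why the prepared amounts are increased by 200/190: after the supernatant is added, the in-well detergent concentration is 190/200 of the prepared value.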
Calculation of the Scrambled Egg Hydrolyzing Activity:
The obtained absorbance value was corrected for the blank value (substrate without enzyme). The resulting absorbance is a measure for the hydrolytic activity. For each sample (variant) the performance index was calculated. The performance index compares the performance of the variant (actual value) and the standard enzyme (theoretical value) at the same protein concentration. In addition, the theoretical values can be calculated, using the parameters of the Langmuir equation of the standard enzyme. A performance index (PI) that is greater than 1 (PI>1) identifies a better variant (as compared to the standard [e.g., wild-type]), while a PI of 1 (PI=1) identifies a variant that performs the same as the standard, and a PI that is less than 1 (PI<1) identifies a variant that performs worse than the standard.
EXAMPLE 2 Production of 69B4 Protease From the Gram-Positive Alkaliphilic Bacterium 69B4
This Example provides a description of the Cellulomonas strain 69B4 used to initially isolate the novel protease 69B4 provided by the present invention. The alkaliphilic microorganism Cellulomonas strain 69B.4 (DSM 16035) was isolated at 37°C on an alkaline casein medium containing (g L-1) (See e.g., Duckworth et al., FEMS Microbiol. Ecol., 19:181-191 [1996]).
Glucose (Merck 1.08342) 10
Peptone (Difco 0118) 5
Yeast extract (Difco 0127) 5
K2HPO4 1
MgSO4.7H2O 0.2
NaCl 40
Na2CO3 10
Casein 20
Agar 20
An additional alkaline cultivation medium (Grant Alkaliphile Medium) was also used to cultivate Cellulomonas strain 69B.4, as provided below:
Grant Alkaliphile Medium ("GAM") solution A (g L-1)
Glucose (Merck 1.08342) 10
Peptone (Difco 0118) 5
Yeast extract (Difco 0127) 5
K2HPO4 1
MgSO4.7H2O 0.2
Dissolved in 800 ml distilled water and sterilized by autoclaving
GAM solution B (g L-1)
NaCl 40
Na2CO3 10
Dissolved in 200 ml distilled water and sterilized by autoclaving.
Complete GAM medium was prepared by mixing Solution A (800 ml) with Solution B (200 ml). Solid medium is prepared by the addition of agar (2% w/v).
Growth Conditions
From a freshly thawed glycerol vial of culture (stored as a frozen glycerol stock [20% v/v] at -80°C), the micro-organisms were inoculated using an inoculation loop on Grant Alkaliphile Medium (GAM) described above in agar plates and grown for at least 2 days at 37°C. One colony was then used to inoculate a 500 ml shake flask containing 100 ml of GAM at pH 10. This flask was then incubated at 37°C in a rotary shaker at 280 rpm for 1-2 days until good growth (according to visual observation) was obtained. Then, 100 ml of broth culture was used to inoculate a 7 L fermentor containing 5 liters of GAM. The fermentations were run at 37°C for 2-3 days in order to obtain maximal production of protease. Fully aerobic conditions were maintained throughout by injecting air, at a rate of 5 L/min, into the region of the impeller, which was rotating at about 500 rpm. The pH was set at pH 10 at the start, but was not controlled during the fermentation.
Preparation of 69B4 Crude Enzyme Samples
Culture broth was collected from the fermentor, and cells were removed by centrifugation for 30 min at 5000 x g at 10°C. The resulting supernatant was clarified by depth filtration over Seitz EKS (SeitzSchenk Filtersystems). The resulting sterile culture supernatant was further concentrated approximately 10 times by ultra filtration using an ultra filtration cassette with a 10 kDa cut-off (Pall Omega 10kDa Minisette; Pall). The resulting concentrated crude 69B4 samples were frozen and stored at -20°C until further use.
Purification
The cell-separated culture broth was dialyzed against 20 mM (2-(4-morpholino)-ethane sulfonic acid) ("MES"), pH 5.4, 1 mM CaCl2 using 8K Molecular Weight Cut Off (MWCO) Spectra-Por7 (Spectrum) dialysis tubing. The dialysis was performed overnight or until the conductivity of the sample was less than or equal to the conductivity of the MES buffer. The dialyzed enzyme sample was purified using a BioCad VISION (Applied Biosystems) with a 10x100 mm (7.845 mL) POROS High Density Sulfo-propyl (HS) 20 (20 micron) cation-exchange column (PerSeptive Biosystems). After loading the enzyme on the previously equilibrated column at 5 mL/min, the column was washed at 40 mL/min with a pH gradient from 25 mM MES, pH 6.2, 1 mM CaCl2 to 25 mM (N-[2-hydroxyethyl] piperazine-N'-[2-ethane] sulfonic acid [C8H18N2O4S, CAS # 7365-45-9]) ("HEPES"), pH 8.0, 1 mM CaCl2 in 25 column volumes. Fractions (8 mL) were collected across the run. The pH 8.0 wash step was held for 5 column volumes and then the enzyme was eluted using a gradient (0-100 mM NaCl in the same buffer in 35 column volumes). Protease activity in the fractions was monitored using the pNA assay (sAAPF-pNA assay; DelMar et al., supra). Protease activity which eluted at 40 mM NaCl was concentrated and buffer exchanged (using a 5K MWCO VIVA Science 20 mL concentrator) into 20 mM MES, pH 5.8, 1 mM CaCl2. This material was used for further characterization of the enzyme.

EXAMPLE 3 PCR Amplification of a Serine Protease Gene Fragment
In this Example, PCR amplification of a serine protease gene fragment is described.
Degenerate Primer Design
Based on alignments of published serine protease amino acid sequences, a range of degenerate primers were designed against conserved structural and catalytic regions. Such regions included those that were highly conserved among the serine proteases, as well as those known to be important for enzyme structure and function.
During the development of the present invention, protein sequences of nine published serine proteases (Streptogrisin C homologues) were aligned, as shown below. The sequences were Streptomyces griseus Streptogrisin C (accession no. P52320); alkaline serine protease precursor from Thermobifida fusca (accession no. AAC23545); alkaline proteinase (EC 3.4.21.-) from Streptomyces sp. (accession no. PC2053); alkaline serine proteinase I from Streptomyces sp. (accession no. S34672); serine protease from Streptomyces lividans (accession no. CAD42808); putative serine protease from Streptomyces coelicolor A3(2) (accession no. NP_625129); putative serine protease from Streptomyces avermitilis MA-4680 (accession no. NP_822175); serine protease from Streptomyces lividans (accession no. CAD42809); putative serine protease precursor from Streptomyces coelicolor A3(2) (accession no. NP_628830). All of these sequences are publicly available from GenBank. In this alignment, two conserved boxes are underlined and shown in bold.
AAC23545 (1) --MNHSSR--RTTSLLFTAALAATALVAATTPAS
PC2053 (1) --MRHTGR-NAIGAAIAASALAFALVPSQAAAN DTLTERAEAAV
S34672 (1) --MRLKGRTVAIGSALAASALALSLVPANASSELP SAETAKADALV
CAD42808 (1) MVGRHAAR-SRRAALTALGALVLTALPSAASAAPPPVPGPRPAVARTPDA
NP_625129 (1) MVGRHAAR-SRRAALTALGALVLTALPSAASAAPPPVPGPRPAVARTPDA
NP_822175 (1) MVHRHVG--AGCAGLSVLATLVLTGLPAAAAIEPP-GPAPAPSAVQPLGA
CAD42809 (1) MPHRHRHH-RAVGAAVAATAALLVAGLSGSASAGTAPAGSAPTAAETLRT
NP_628830 (1) MPHRHRHH-RAVGAAVAATAALLVAGLSGSASAGTAPAGSAPTAAETLRT
P52320 (1) MERTT-LRRRALVAGTATVAVGALALAGLTGVASADPAATAAPPVSA
51 100
AAC23545 (31) AQELALKRDLGLSDAEVAELRAAEAEAVELEEELRDSLGSDFGGV
PC2053 (42) ADLPAGVLDAMERDLGLSEQEAGLKLVAEHDAALLGETLSADLDAFAGSW
S34672 (45) EQLPAGMVDAMERDLGVPAAEVGNQLVAEHEAAVLEESLSEDLSGYAGSW
CAD42808 (50) ATAPARMLSAMERDLRLAPGQAAARPVNEAEAGTRAGMLRNTLGDRFAGA
NP_625129 (50) ATAPARMLSAMERDLRLAPGQAAARLVNEAEAGTRAGMLRNTLGDRFAGA
NP_822175 (48) GNPSTAVLGALQRDLHLTDTQAKTRLVNEMEAGTRAGRLQNALGKHFAGA
CAD42809 (50) DAAPPALLKAMQRDLGIDRRQAERRLVNEAEAGATAGRLRAALGGDFAGA
NP_628830 (50) DAAPPALLKAMQRDLGLDRRQAERRLVNEAEAGATAGRLRAALGGDFAGA
P52320 (47) DSLSPGMLAALERDLGLDEDAARSRIANEYRAAAVAAGLEKSLGARYAGA
101 150
AAC23545 (76) YLDADT-TEITVAVTDPAAVSRVDADDVTVDWDFGETALNDFVASLNAI
PC2053 (92) LAEGT ELWATTSEAEAAEITEAGATAEWDHTLAELDSVKDALDTA

S34672 (95) IVEGTS—EHWATTDRAEAAEITAAGATATWEHSLAELEAVKDILDEA
CAD42808 (100) WVSGATSAELTVATTDAADTAAIEAQGAKAAWGRNLAELRAVKEKLDAA
NP_625129 (100) WVSGATSAELTVATTDAADTAAIEAQGAKAAVVGRNLAELRAVKEKLDAA
NP_822175 (98) WVHGAASADLTVATTHATDIPAITAGGATAVWKTGLDDLKGAKKKLDSA
CAD42809 (100) WVRGAESGTLTVATTDAGDVAAVEARGAEAKWRHSLADLDAAKARLDTA
NP_628830 (100) WVRGAESGTLTVATTDAGDVAA.IEARGAEAKWRHSLADLDAAKARLDTA
P52320 (97) RVSGAK-ATLTVATTDASEAARITEAGARAEWGHSLDRFEGVKKSLDKA
151 200
AAC23545 (125) ADT—ADPKVTGWYTDLESDAWITTLRGGTPAAEELAERAGLDERAVRI
PC2053 (139) AES-YDTTDAPVWYVDVTTNGWLLTSD—VTEAEGFVEAAGVNAAAVDI
S34672 (143) ATA-NPEDAAPVWYVDVTTNEVWLASD—VPAAEAFVAASGADASTVRV
CAD42808 (150) AVR-TRTRQTPVWYVDVKTNRVTVQATG—ASAAAAFVEAAGVPAADVGV
NP_625129 (150) AVR-TRTRQTPVWYVDVKTNRVTVQATG--ASAAAAFVEAAGVPAADVGV
NP_822175 (148) VAHGGTAVNTPVRYVDVRTNRVTLQARS—RAAADALIAAAGVDSGLVDV
CAD42809 (150) AAG-LNTADAPVWYVDTRTNTVWEAIR—PAAARSLLTAAGVDGSLAHV
NP_628830 (150) AAG-LNTADAPVWYVDTRTNTVWEAIR—PAAARSLLTAAGVDGSLAHV
P52320 (146) ALD-KAPKNVPVWYVDVAANRVWNAAS--PAAGQAFLKVAGVDRGLVTV
201 250
AAC23545 (173) VEEDEEPQSLAAIIGGNPYYFGN-YRCSIGFSVRQGSQTGFATAGHCGST
PC2053 (186) QTSDEQPQAFYDLVGGDAYYMGG-GRCSVGFSVTQGSTPGFATAGHCGTV
S34672 (190) ERSDESPQPFYDLVGGDAYYIGN-GRCSIGFSVRQGSTPGFVTAGHCGSV
CAD42808 (197) RVSPDQPRVLEDLVGGDAYYIDDQARCSIGFSVTKDDQEGFATAGHCGDP
NP_625129 (197) RVSPDQPRVLEDLVGGDAYYIDDQARCSIGFSVTKDDQEGFATAGHCGDP
NP_822175 (196) KVSEDRPRALFDIRGGDAYYIDNTARCSVGFSVTKGNQQGFATAGHCGRA
CAD42809 (197) KNRTERPRTFYDLRGGEAYYINNSSRCSIGFPITKGTQQGFATAGHCDRA
NP_628830 (197) KNRTERPRTFYDLRGGEAYYINNSSRCSIGFPITKGTQQGFATAGHCGRA
P52320 (193) ARSAEQPRALADIRGGDAYYMNGSGRCSVGFSVTRGTQNGFATAGHCGRV
251 ' 300
AAC23545 (222)
PC2053 (235)
S34672 (239)
CAD42808 (247)
NP_625129 (247)
NP_822175 (246)
CAD42809 (247)
NP 628830 (247)
P*52320 (243)
GTRVS SPSGTVAGSYFPGRDMGWVRITSADTVTPLVNRYNGGTVTV
GTSTTGYNQAAQGTFEESSFPGDDMAWVSVNSDWNTTPTVNE--GE-VTV GNATTGFNRVSQGTFRGSWFPGRDMAWVAVNSNWTPTSLVRNS-GSGVRV GATTTGYNEADQGTFQASTFPGKDMAWVGVNSDWTATPDVKAEGGEKIQL GATTTGYNEADQGTFQASTFPGKDMAWVGVNSDWTATPDVKAEGGEKIQL GAPTAGFNEVAQGTVQASVFPGHDMAWVGVNSDWTATPDVAGAAGQNVSI GSSTTGANRVAQGTFQGSIFPGRDMAWVATNSSWTATPYVLGAGGQNVQV GSSTTGANRVAQGTFQGSIFPGRDMAWVATNSSWTATPYVLGAGGQNVQV GTTTNGVNQQAQGTFQGSTFPGRDIAWVATNANWTPRPLVNGYGRGDVTV
301 350
AAC23545
PC2053
S34672
CAD42808
NP_625129
NP_822175
CAD42809
NP_628830
P52320
(268) TGSQEAATGSSVCRSGATTGWRCGTIQSKNQTVRYAEGTVTGLTRTTACA
(282) SGSTEAAVGASICRSGSTTGWHCGTIQQHNTSVTYPEGTITGVTRTSVCA
(288) TGSTQATVGSSICRSGSTTGWRCGTIQQHNTSVTYPQGTITGVTRTSACA
(297) AGSVEALVGASVCRSGSTTGWHCGTIQQHDTSVTYPEGTVDGLTGTTVCA
(297) AGSVEALVGASVCRSGSTTGWHCGTIQQHDTSVTYPEGTVDGLTETTVCA
(296) AGSVQAIVGAAICRSGSTTGWHCGTVEEHDTSVTYEEGTVDGLTRTTVCA
(297) TGSTASPVGASVCRSGSTTGWHCGTVTQLNTSVTYQEGTISPVTRTTVCA
(297) TGSTASPVGASVCRSGSTTGWHCGTVTQLNTSVTYQEGTISPVTRTTVCA
(293) AGSTASWGASVCRSGSTTGWHCGTIQQLNTSVTYPEGTISGVTRTSVCA

AAC23545
PC2053
S34672
CAD42808
NP_625129
NP_822175
CAD42809
NP_628830
P52320
CAD42808 NP 625129
NP_628830 P52320

351 400
(318) EGGDSGGPWLTGSQAQGVTSGGTGDCRSGGITFFQPINPLLSYFGLQLVT (332) EPGDSGGSYISGSQAQGVTSGGSGNCTSGGTTYHQPINPLLSAYGLDLVT (338) QPGDSGGSFISGTQAQGVTSGGSGNCSIGGTTFHQPVNPILSQYGLTLVR (347) EPGDSGGPFVSGVQAQGTTSGGSGDCTNGGTTFYQPVNPLLSDFGLTLKT (347) EPGDSGGPFVSGVQAQGTTSGGSGDCTNGGTTFYQPVNPLLSDFGLTLKT
(346) EPGDSGGSFVSGSQAQGVTSGGSGDCTRGGTTYYQPVNPILSTYGLTLKT
(347) EPGDSGGSFISGSQAQGVTSGGSGDCRTGGGTFFQPINALLQNYGLTLKT
(347) EPGDSGGSFI SGSQAQGVTSGGSGDCRTGGETFFQPINALLQNYGLTLKT
(343) EPGDSGGSYISG5QAQGVTSGGSGNCSSGGTTYFQPINPLLQAYGLTLVT
401 450
(368) G
(382) G
(388) S
(397) TSAATQTPAPQDNAAA DAWTAGRVYEVGTTVS YDGVRYRCLQSH
(397) TSAATQTPAPQDNAAA DAWTAGRVYEVGTTVS YDGVRYRCLQSH
STAPTDTPSDPVDQSG VWAAGRVYEVGAQVTYAGVTYQCLQSH
TGGDDGGGDDGG EEPGG-TWAAGTVYQPGDTVTYGGATFRCLQGH
(397) TGGDDGGGDDGGGDDGGEEPGG-TWAAGTVYQPGDTVTYGGATFRCLQGH
(393 ) SGGGTPTDPPTTPPTDSP GGTWAVGTAYAAGATVTYGGATYRCLQAH



451 468
S34672 (389) (SEQ ID NO:650)
CAD42808 (441) QAQGVGSPASVPALWQRV (SEQ ID NO:651)
NP_625129 (441) QAQGVGSPASVPALWQRV (SEQ ID NO:652)
NP_822175 (439) QAQGVWQPAATPALWQRL (SEQ ID NO:653)
CAD42809 (441) QAYAGWEPPNVPALWQRV (SEQ ID NO:654)
NP_628830 (446) QAYAGWEPPNVPALWQRV (SEQ ID NO:655)
P52320 (440) TAQPGWTPADVPALWQRV (SEQ ID NO:656)
Two particular regions were chosen to meet the criteria above, and a forward and a reverse primer were designed based on these amino acid regions. The specific amino acid regions used to design the primers are highlighted in black in the sequences shown in the alignments directly above. Using the genetic code for codon usage, degenerate nucleotide PCR primers were synthesized by MWG-Biotech. The degenerate primer sequences produced were:
forward primer TTGWXCGT_FW: 5'-ACNACSGGSTGGCRGTGCGGCAC-3' (SEQ ID NO:10)
reverse primer GDSGGX_RV: 5'-ANGNGCCGCCGGAGTCNCC-3' (SEQ ID NO:11)
All primers were synthesized in the 5'-3' direction, and the standard IUB code for mixed base sites was used (e.g., "N" designates A/C/T/G). Degenerate primers TTGWXCGT_FW and GDSGGX_RV successfully amplified a 177 bp region from Cellulomonas sp. isolate 69B4 by PCR, as described below.
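The IUB mixed-base notation used for these primers can be made concrete with a short sketch. The Python below is illustrative only and is not part of the original work: it counts how many distinct oligonucleotides each degenerate primer represents and reverse-complements the reverse primer so it can be read on the coding strand. The function names are hypothetical; the ambiguity table is the standard IUB/IUPAC one.

```python
from itertools import product

# Standard IUB/IUPAC nucleotide ambiguity codes.
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each code (e.g. R = A/G pairs with Y = T/C).
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUB[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(combo) for combo in product(*(IUB[b] for b in primer))]


def reverse_complement(primer):
    """Reverse complement, keeping ambiguity codes consistent."""
    return "".join(COMPLEMENT[b] for b in reversed(primer))


# Primer sequences as given in the text (SEQ ID NO:10 and SEQ ID NO:11).
FORWARD = "ACNACSGGSTGGCRGTGCGGCAC"
REVERSE = "ANGNGCCGCCGGAGTCNCC"

if __name__ == "__main__":
    print("forward degeneracy:", degeneracy(FORWARD))
    print("reverse degeneracy:", degeneracy(REVERSE))
    print("reverse primer on coding strand:", reverse_complement(REVERSE))
```

Read in frame, the reverse complement of GDSGGX_RV begins GGN GAC TCC GGC GGC, which is consistent with the G-D-S-G-G motif after which the primer is named.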
PCR Amplification of a Serine Protease Gene Fragment
Cellulomonas sp. isolate 69B4 genomic DNA was used as a template for PCR amplification of putative serine protease gene fragments using the above-described primers. PCR was carried out using High Fidelity Platinum Taq polymerase (Catalog number 11304-102; Invitrogen). Conditions were determined by individual experiments, but typically thirty cycles were run in a thermal cycler (MJ Research). Successful amplification was verified by electrophoresis of the PCR reaction on a 1% agarose TBE gel. A PCR product that was amplified from Cellulomonas sp. 69B4 with the primers TTGWXCGT_FW and GDSGGX_RV was purified by gel extraction using the Qiaquick Spin Gel Extraction kit (Catalogue 28704; Qiagen) according to the manufacturer's instructions. The purified PCR product was cloned into the commercially available pCR2.1-TOPO vector system (Invitrogen) according to the manufacturer's instructions, and transformed into competent E. coli TOP10 cells. Colonies containing recombinant plasmids were visualized using blue/white selection. For rapid screening of recombinant transformants, plasmid DNA was prepared from cultures of putative positive (i.e., white) colonies. DNA was isolated using the Qiagen plasmid purification kit, and was sequenced by BaseClear. One of the clones contained a DNA insert of 177 bp that showed some homology with several streptogrisin-like protease genes of various Streptomyces species and also with serine protease genes from other bacterial species. The DNA and protein coding sequence of this 177 bp fragment is provided in Fig. 13.
Sequence Analysis
The sequences were analyzed by BLAST and other protein translation sequence tools. BLAST comparison at the nucleotide level showed various levels of identity to published serine protease sequences. Initially, nucleotide sequences were submitted to BLAST (Basic BLAST version 2.0). The program chosen was "BlastX", and the database chosen was "nr." Standard/default parameter values were employed. Sequence data for the putative Cellulomonas 69B4 protease gene fragment were entered in FASTA format and the query was submitted to BLAST to compare the sequences of the present invention to those already in the database. For the 177 bp fragment, the search returned a high number of hits for protease genes from various Streptomyces spp., including S. griseus, S. lividans, S. coelicolor, S. albogriseolus, S. platensis, S. fradiae, and Streptomyces sp. It was concluded that further analysis of the 177 bp fragment cloned from Cellulomonas sp. isolate 69B4 was warranted.
EXAMPLE 4
Isolation of a Polynucleotide Sequence from the Genome of Cellulomonas 69B4 Encoding a Serine Protease by Inverse PCR
In this Example, experiments conducted to isolate a polynucleotide sequence encoding a serine protease produced by Cellulomonas sp. 69B4 are described.
Inverse PCR of Cellulomonas sp. 69B4 Genomic DNA to Isolate the Gene Encoding Cellulomonas strain 69B4 Protease
Inverse PCR was used to isolate and clone the full-length serine protease gene from Cellulomonas sp. 69B4. Based on the DNA sequence of the 177 bp fragment of the Cellulomonas protease gene described in Example 3, novel DNA primers were designed:

69B4int_RV1 5'-CGGGGTAGGTGACCGAGGAGTTGAGCGCAGTG-3' (SEQ ID NO:14)
69B4int_FW2 5'-GCTCGCCGGCAACCAGGCCCAGGGCGTCACGTC-3' (SEQ ID NO:15)
Chromosomal DNA of Cellulomonas sp. 69B4 was digested with the restriction enzymes ApaI, BamHI, BssHII, KpnI, NarI, NcoI, NheI, PvuI, SalI or SstII, purified using the Qiagen PCR purification kit (Qiagen, Catalogue # 28106) and self-ligated with T4 DNA ligase (Invitrogen) according to the manufacturers' instructions. Ligation mixtures were purified using the Qiagen PCR purification kit, and PCR was performed with primers 69B4int_RV1 and 69B4int_FW2. PCR on DNA fragments that were digested with NcoI and then self-ligated resulted in a PCR product of approximately 1.3 kb. DNA sequence analysis (BaseClear) revealed that this DNA fragment covers the main part of a streptogrisin-like protease gene from Cellulomonas. This protease was designated as "69B4 protease," and the gene encoding Cellulomonas 69B4 protease was designated as the "asp gene." The entire sequence of the asp gene was derived by additional inverse PCR reactions with primer 69B4int_FW2 and another primer: 69B4-for4 (5'-AAC GGC GGG TTC ATC ACC GCC GGC CAC TGC GGC C-3' (SEQ ID NO:16)). Inverse PCR with these primers on NcoI, BssHII, ApaI and PvuI digested and self-ligated DNA fragments of genomic DNA of Cellulomonas sp. 69B4 resulted in the identification of the entire sequence of the asp gene.
Nucleotide and Amino Acid Sequences
For convenience, various sequences are included below. First, the DNA sequence of the asp gene (SEQ ID NO:1) provided below encodes the signal peptide (SEQ ID NO:9) and the precursor serine protease (SEQ ID NO:7) derived from Cellulomonas strain 69B4 (DSM 16035). The initiating polynucleotide encoding the signal peptide of the Cellulomonas strain 69B4 protease is in bold (ATG).

umn (5 µm particle size, 300 angstroms pore size). The elution gradient was formed from 0.1% (v/v) TFA in water and 0.08% (v/v) TFA in acetonitrile at a flow rate of 0.2 ml/min. The column compartment was heated to 50°C. Peptide elution was monitored at 215 nm and data were collected at 215 nm and 280 nm. The samples were then analyzed on an LCQ Advantage mass spectrometer with a Surveyor HPLC (both from Thermo Finnigan). The LCQ mass spectrometer was run with the following settings: Spray voltage: 4.5 kV; Capillary temperature: 225°C. Data processing was performed using TurboSEQUEST and Xcalibur (ThermoFinnigan). Sequencing of the tryptic digest portions was also performed in part by Argo BioAnalytica.
Analysis of the full sequence of the asp gene revealed that it encodes a prosequence protease of 495 amino acids (SEQ ID NO:6). The first 28 amino acids were predicted to form a signal peptide. The mature chain of 69B4 protease as produced by Cellulomonas strain 69B4 has a molecular weight of 18764 (determined by MALDI-TOF). The sequence of the N-terminus of the mature chain was also determined by MALDI-TOF analysis and starts with the sequence FDVIGGNAYTIGGR (SEQ ID NO:17). It is believed that the 69B4 protease has a unique precursor structure with NH2- and COOH-terminal pro-sequences, as is known to occur with some other enzymes (e.g., T. aquaticus aqualysin I; See e.g., Lee et al., FEMS Microbiol. Lett., 1:69-74 [1994]; Sakamoto et al., Biosci. Biotechnol. Biochem., 59:1438-1443 [1995]; Sakamoto et al., Appl. Microbiol. Biotechnol., 45:94-101 [1996]; Kim et al., Biochem. Biophys. Res. Commun., 231:535-539 [1997]; and Oledzka et al., Protein Expr. Purific., 29:223-229 [2003]). The predicted molecular weight of mature 69B4 protease, as provided in SEQ ID NO:8, was 18776.42, which corresponds well with the molecular weight of the purified enzyme with proteolytic activity isolated from Cellulomonas sp. 69B4 (i.e., 18764). The prediction of the COOH-terminal pro-sequence in 69B4 protease was also based on an alignment of the 69B4 protease with T. aquaticus aqualysin I, provided below. In this alignment, the amino acid sequence of the Cellulomonas 69B4 signal sequence and precursor protease are aligned with the signal sequence and precursor protease Aqualysin I of Thermus aquaticus (the COOH-terminal pro-sequence of Aqualysin I is underlined and in bold).
The sequences of three internal peptides of the purified enzyme from Cellulomonas sp. 69B4 having proteolytic activity were determined by MALDI-TOF analysis. All three peptides were also identified in the translation product of the isolated asp gene, confirming the identification of the correct protease gene (See, SEQ ID NO:1, above).
Percentage Identity Comparison Between Asp and Streptogrisin
The deduced polypeptide product of the asp gene (mature chain) was used in homology analysis with other serine proteases using the BLAST program and settings as described in Example 3. The preliminary analyses showed identities of about 44-48% (See, Table 4-1, below). Together with analysis of the translated sequence, these results provided evidence that the asp gene encodes a protease having less than 50% sequence identity with the mature chains of Streptogrisin-like serine proteases. An alignment of Asp with Streptogrisin A, Streptogrisin B, Streptogrisin C and Streptogrisin D of Streptomyces griseus is provided below. In this alignment, the amino acid sequence of Cellulomonas 69B4 mature protease ("69B4 mature" or "Asp mature") is aligned with the mature protease amino acid sequences of Streptogrisin C ("Sg-StreptogrisinC mature"), Streptogrisin B ("Sg-StreptogrisinBmature"), Streptogrisin A ("Sg-StreptogrisinAmature"), Streptogrisin D ("Sg-StreptogrisinDmature") and consensus residues.
1 50
69B4 mature (1) FDVIGGNAYTIGGRSRCSIGFAVN GGFITAGHCGRTGATT
Sg-StreptogrisinC mature (1) ADIRGGDAYYMNGSGRCSVGFSVTRGTQNGFATAGHCGRVGTTTNG--VN
Sg-StreptogrisinBmature (1) —ISGGDAIYSST-GRCSLGFNVRSGSTYYFLTAGHCTDGATTWWANSAR
Sg-StreptogrisinAmature (1) —IAGGEAITTGG-SRCSLGFNVSVNGVAHALTAGHCTNISASWS
Sg-StreptogrisinDmature (1) --IAGGDAIWGSG-SRCSLGFNVVKGGEPYFLTAGHCTESVTSWSD-TQG
Consensus (1) IAGGDAIY G SRCSLGFNV G YFLTAGHCT GTTW
51 • 100
Asp mature (41) ANPTGTFAGSSFPGNDYAFVRTGAGVNLLAQVNNYSGGRVQVAGHTAAPV
Sg-StreptogrisinC mature (49) QQAQGTFQGSTFPGRDIAWVATNANWTPRPLVNGYGRGDVTVAGSTASW
Sg-StreptogrisinBmature (48) TTVLGTTSGSSFPNNDYGIVRYTNTTIPKDGTVGG QDITSAANATV
Sg-StreptogrisinAmature (43) IGTRTGTSFPNNDYGIIRHSNPAAADGRVYLYNGSYQDITTAGNAFV
Sg-StreptogrisinDmature (47) GSEIGANEGSSFPENDYGLVKYTSDTAHPSEVNLYDGSTQAITQAGDATV
Consensus (51) IGT GSSFP NDYGIVRYTA VN Y G Q IT AG A V
101 150
Asp mature (91) GSAVCRSGSTTGWHCGTITALNSSVTYPEG-TVRGLIRTTVCAEPGDSGG
Sg-StreptogrisinC mature (99) GASVCRSGSTTGWHCGTIQQLNTSVTYPEG-TISGVTRTSVCAEPGDSGG
Sg-StreptogrisinBmature (94) GMAVTRRGSTTGTHSGSVTALNATVNYGGGDWYGMIRTNVCAEPGDSGG
Sg-StreptogrisinAmature (90) GQAVQRSGSTTGLRSGSVTGLNATVNYGSSGIVYGMIQTNVCAEPGDSGG
Sg-StreptogrisinDmature (97) GQAVTRSGSTTQVHDGEVTALDATVNYGNGDIVNGLIQTTVCAEPGDSGG
Consensus (101) G AV RSGSTTG H GSVTALNATVNYG G IV GLIRTTVCAEPGDSGG
151 200
Asp mature (140) SLLAGNQAQGVTSGGSGNCRTGGTTFFQPVNPILQAYGLRMITTDSGSSP
Sg-StreptogrisinC mature (148) SYISGSQAQGVTSGGSGNCSSGGTTYFQPINPLLQAYGLTLVTSGGGTPT
Sg-StreptogrisinBmature (144) PLYSGTRAIGLTSGGSGNCSSGGTTFFQPVTEALSAYGVSVY
Sg-StreptogrisinAmature (140) SLFAGSTALGLTSGGSGNCRTGGTTFYQPVTEALSAYGATVL
Sg-StreptogrisinDmature (147) ALFAGDTALGLTSGGSGDCSSGGTTFFQPVPEALAAYGAEIG
Consensus (151) SLFAGS ALGLTSGGSGNCSSGGTTFFQPV EALSAYGLTVI
201 250
Asp mature (190)
Sg-StreptogrisinC mature (198) DPPTTPPTDSPGGTWAVGTAYAAGATVTYGGATYRCLQAHTAQPGWTPAD
Sg-StreptogrisinBmature (186)
Sg-StreptogrisinAmature (182)
Sg-StreptogrisinDmature (189)
Consensus (201)
251
Asp mature (190) (SEQ ID NO:8)
Sg-StreptogrisinC mature (248) VPALWQRV (SEQ ID NO:639)
Sg-StreptogrisinBmature (186) (SEQ ID NO:640)
Sg-StreptogrisinAmature (182) (SEQ ID NO:641)
Sg-StreptogrisinDmature (189) (SEQ ID NO:642)
Consensus (251) (SEQ ID NO:643)
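As an illustration of how a percentage identity between two aligned mature chains can be computed from an alignment such as the one above, a minimal Python sketch follows. It is not the BLAST calculation used in this Example. The function name is hypothetical, and the convention of counting only columns in which neither sequence has a gap is an assumption; other tools divide by the full alignment length instead.

```python
def percent_identity(seq_a, seq_b, gap_chars="-."):
    """Percent identity over the gap-free columns of a pairwise alignment.

    Both inputs must already be aligned, i.e. have the same length with
    gaps written as '-' (or '.'). Columns in which either sequence has a
    gap are skipped (an assumed convention, see the lead-in).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy aligned strings, not taken from the sequence listing.
    print(percent_identity("ACDE-G", "ACDQFG"))
```

Applied to two full-length rows of a cleanly formatted alignment, the same function gives a figure comparable to, though not necessarily identical with, the identities reported by BLAST.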

Table 4-1. Percentage Identity: Comparison between Cellulomonas sp. 69B4 Protease Encoded by asp and Other Serine Proteases (identity between the mature chains)

Additional protease sequences were also investigated. In these analyses, proteases homologous in protein sequence to the mature domain of ASP were searched for using BLAST. Those identified were then aligned using the multiple sequence alignment program ClustalW. The numbers on the top of the alignment below refer to the amino acid sequence of the mature ASP protease. The numbers at the side of the alignment are sequence identifiers, as described at the bottom of the alignment.

Sequence
ASP
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27

1 10 20 30 40
TF
-IGTR -LGTR -IGQT -IGQT -IGST -LGTT -VGVR -IGVR -I GEN -I GEN -IGAN -ATVD -ATVD
FDVIGGNAYTIGGRSRCSIGFAVN GGFITAGHCGRTGATTANPTG
TPLIAGGEAITTGGSRC SLGFNV-SVNGVAHALTAGHCTNISASWS
--IAGGEAIYAAGGGRCSLGFNVRSSSGATYALTAGHCTEIASTWYTNSGQTSL-NKLIQGGDAIYAS SWRC SLGFNVRTS SGAEYFLTAGHCTDGAGAWRAS SGGTV--NKLIQGGDAIYASSWRCSLGFNVRTSSGAEYFLTAGHCTDGAGAWRASSGGTV— TKLIQGGDAIYASSWRC SLGFNVRSS SGVDYFLTAGHCTDGAGTWYSNSARTTA-TKLISGGDAIYSSTGRCSLGFNVRSGS-TYYFLTAGHCTDGATTWWANSARTTV-
VLGGGAIYGGGSRCSAAFNV-TKGGARYFVTAGHCTNISANWSASSGGSV--
QREVAGGDAIYGGGSRCSAAFNV-TKNGVRYFLTAGHCTNLSSTWSSTSGGTS--KPFIAGGDAITGNGGRCSLGFNVTKG-GEPHFLTAGHCTEGISTWSDSSG—QV-KPFVAGGDAITGGGGRCSLGFNVTKG-GEPYFITAGHCTESISTWSDSSG—NV-TPLIAGGDAIWGSGSRCSLGFNVVKG-GEPYFLTAGHCTESVTSWSDTQGG-SE-KTFASGGDAIFGGGARCSLGFNVTAGDGSAAFLTRGHCGGGATMWSDAQGGQPI-KTFASGGDAIFGGGARCSLGFNVTAGDGSPAFLTAGHCGVAADQWSDAQGGQPI-
TTRLNGAEPILSTAGRCSAGFNVTDG' ATVQGGDVYYINRS SRC SIGFAVT- - • ADIRGGDAYYMNGSGRCSVGFSVTRG-YDLRGGEAYYINNS SRC SIGFPITKG • YDLVGGDAYYIGN-GRCSIGFSVRQG-YDLVGGDAYYMGG-GRC SVGFSVTQG' EDLVGGDAYYIDDQARC SIGF S VTKD • LAAIIGGNPYYFGNYRC SIGFSVRQG-ANIVGGIEYSINNASLCSVGFSVTRG • AAGTVGGDPYYTGNVRCSIGFS VH --• VIVPVRDYWGGDALSGCTLAFPVYGG-DPPLRSGLAIYGTNVRCSSAFMAYSG-
-TSDFILTAGHCGPTGSVWFGDRPGDGQ—VGRT
---TGFVSAGHCGGSGASATTSSGEAL GTF
-TQNGFATAGHCGRVGTTTNGVWQQAQ GTF
-TQQGFATAGHCGRAGSSTTGANRVAQ GTF
- STPGFVTAGHCGSVGNATTGFNRVSQ GTF
-STPGFATAGHCGTVGTSTTGYNQAAQ GTF
-DQEGFATAGHCGDPGATTTGYNEADQ GTF
-SQTGFATAGHCGSTGTRVSSPSG TV
- ATKGFVTAGHCGTVNATARIGGAW GTF
---GGFVTAGHCGRAGAGVSGWDRSYI GTF
FLTAGHCAVEGKGHILKTEMTGGQ-IGTV
-SSYYMMTAGHCAEDSSYWEVPTYSYGYQGVGHV

50 60 70 80 90 100
ASP AGSSFPGN-DYAFVRTGAGVNLLAQVNNYSGGR-VQVAGHTAAPVGSAVCRSGSTTGWHC
2 TGTSFPNNDYGIIRHSNPAAA—DGRVYLYNGSYQDITTAGNAFVGQAVQRSGSTTGLRS
3 AGTSFPGNDYGLIRHSNASAA—DGRVYLYNGSYRDITGAGNAYVGQTVQRSGSTTGLHS
4 AGSSFPGNDYGIVQYTGS VSRPGTANGVDITRAATPSVGTTVIRDGSTTGTHS
5 AGSSFPGNDYGIVQYTGS VSRPGTANGVDITRAATPSVGTTVIRDGSTTGTHS
6 AGS SFPGNDYGIVRYTGS VSRPGTANGVDITRAATPSVGTTVIRDGSTTGTHS
7 SGSSFPNNDYGIVRYTNTT IPKDGTVGGQDITSAANATVGMAVTRRGSTTGTHS
8 EGTSFPTNDYGIVRYTDGSSP--AGTVDLYNGSTQDISSAANAWGQAIKKSGSTTKVTS
9 EGTSFPTNDYGIVRYTTTTNV--DGRVNLYNGGYQDIASAADAWGQAIKKSGSTTKVTS
10 AASSFPGDDYGLVKYTADVAH— PSQVNLYDGSSQSISGAAEAAVGMQVTRSGSTTQVHS
11 AASSFPDNDYGLVKYTADVDH—PSEVNLYNGSSQAISGAAEATVGMQVTRSGSTTQVHD
12 EGSSFPENDYGLVKYTSDTAH--PSEVNLYDGSTQAITQAGDATVGQAVTRSGSTTQVHD
13 QAVFPPEGDFGLVRYDGPSTE—APSEVDLGDQTLPISGAAEASVGQEVFRMGSTTGLAD
14 QAVFPGEGDFALVRYDDPATE—APSEVDLGDQTLPISGAAEAAVGQEVFRMGSTTGLAD
15
16 VAGSFPGDDFSLVEYANGKAGDGADWAVGDGKGVRITGAGEPAVGQRVFRSGSTSGLRD
17 SGSVFPGSADMAYVRTVSGTVLRGYINGYGQGS-FPVSGSSEAAVGASICRSGSTTQVHC
18 QGSTFPGR-DIAWATNANWTPRPLVNGYGRGD-VTVAGSTASWGASVCRSGSTTGWHC
19 QGSIFPGR-DMAWVATNSSWTATPYVLGAGGQN-VQVTGSTASPVGASVCRSGSTTGWHC
20 RGSWFPGR-DMAWVAVNSNWTPTSLVRNSGSG--VRVTGSTQATVGSSICRSGSTTGWRC
21 EESSFPGD-DMAWVSVNSDWNTTPTVNEGE VTVSGSTEAAVGASICRSGSTTGWHC
22 QASTFPGK-DMAWVGVNSDWTATPDVKAEGGEK-IQLAGSVEALVGASVCRSGSTTGWHC
23 AGSYFPGR-DMGWVRITSADTVTPLVNRYNGGT-VTVTGSQEAATGSSVCRSGATTGWRC
24 AARVFPGN-DRAWVSLTSAQTLLPRVANGSSF--VTVRGSTEAAVGAAVCRSGRTTGYQC
25 QGSSFPDN-DYAWVSVGSGWWTVPWLGWGTVSDQLVRGSNVAPVGASICRSGSTTHWHC
26 EASQFGDGIDAAWAKNYGDWNGRGRVTHWNGGGGVDIKGSNEAAVGAHMCKSGRTTKWTC
27 ADYTFGYYGDSAIVRVDDPGF WQPRGWVYPSTRITNWDYDYVGQYVCKQGSTTGYTC
110 120 130 140 150
ASP GTITALNSSVTYPEGTV-RGLIRTTVCAEPGDSGGSLLAGN-QAQGVTSGGS
2 GSVTGLNATVNYGS SGIVYGMIQTNVCAEPGDSGGSLF-AGSTALGLTSGGS
3 GRVTGLNATVNYGGGDIVSGLIQTNVCAEPGDSGGALF-AGSTALGLTSGGS
4 GRVTALNATVNYGGGDWGGLIQTTVCAEPGDSGGSLYGSNGTAYGLTSGGS
5 GRVTALNATVNYGGGDWGGLIQTTVCAEPGDSGGSLYGSNGTAYGLTSGGS
6 GRVTALNATVNYGGGDIVSGLIQTTVCAEPGDSGGPLYGSNGTAYGLTSGGS
7 GSVTALNATVNYGGGDWYGMIRTNVCAEPGDSGGPLY-SGTRAIGLTSGGS
8 GTVTAVNVTVNYGDGP-VYNMGRTTACSAGGDSGGAHF-AGSVALGIHSGSS
9 GTVSAVNVTVNYSDGP-VYGMVRTTACSAGGDSGGAHF-AGSVALGIHSGSS
10 GTVTGLDATVNYGNGDIVNGLIQTDVCAEPGDSGGSLFSGDK-AVGLTSGGS
11 GTVTGLDATVNYGNGDIVNGLIQTDVCAEPGDSGGSLFSGDQ-AIGLTSGGS
12 GEVTALDATVNYGNGDIVNGLIQTTVCAEPGDSGGALFAGDT-ALGLTSGGS
13 GQVLGLDVTVNYPEG-TVTGLIQTDVCAEPGDSGGSLFTRDGLAIRLTSGGT
14 GQVLGLDATVNYPEG-MVTGLIQTDVCAEPGDSGGSLFTRDGLAIGLTSGGS
15 VDGLIQTDVCAEPGDSGGALFDGDA-AIGLTSGGS
16 GRVTALDATVNYPEG-TVTGLIETDVCAEPGDSGGPMFSEGV-ALGVTSGGS
17 GTIGAKGATVNYPQGAV-SGLTRTSVCAEPGDSGGSFYSGS-QAQGVTSGGS
18 GTIQQLNTSVTYPEGTI-SGVTRTSVCAEPGDSGGSYISGS-QAQGVTSGGS
19 GTVTQLNTSVTYQEGTI-SPVTRTTVCAEPGDSGGSFISGS-QAQGVTSGGS
20 GTIQQHNTSVTYPQGTI-TGVTRTSACAQPGDSGGSFISGT-QAQGVTSGGS
21 GTIQQHNTSVTYPEGTI-TGVTRTSVCAEPGDSGGSYISGS-QAQGVTSGGS
22 GTIQQHDTSVTYPEGTV-DGLTETTVCAEPGDSGGPFVSGV-QAQGTTSGGS
23 GTIQSKNQTVRYAEGTV-TGLTRTTACAEGGDSGGPWLTGS-QAQGVTSGGT
24 GTITAKNVTANYAEGAV-RGLTQGNACMGRGDSGGSWITSAGQAQGVMSGGNVQSNGNNC
25 GTVLAHNETVNYSDGSWHQLTKTSVCAEGGDSGGSFISGD-QAQGVTSGGW
26 GYLLRKDVSVNYGNGHI-VTLNETSACALGGDSGGAYVWND-QAQGITSGSN
27 GQITETNATVSYPGRTL-TGMTWSTACDAPGDSGSGVYDGSTAHGILSGGPN
160 170 180 189
ASP GNCRTGGTTFFQPVNPILQAYGLRMITTDSGSSP (SEQ ID NO:18)
2 GNCRTGGTTFYQPVTEALSAYGATVL (SEQ ID NO: 19)
3 GNCRTGGTT (SEQ ID NO:20)
4 GNCSSGGTTFFQPVTEALSAYGVSVY (SEQ ID NO:21)
5 GNCSSGGTTFFQPVTEALSAYGVSVY (SEQ ID NO:22)
6 GNCSSGGTTFFQPVTEALSAYGVSVY (SEQ ID NO: 23)
7 GNCSSGGTTFFQPVTEALSAYGVSVY (SEQ ID NO: 24)
8 GCSGTAGSAIHQPVTKALSAYGVTVYL (SEQ ID NO:25)

9 GCTGTNGSAIHQPVREALSAYGVNVY (SEQ ID NO: 26)
10 GDCTSGGTTFFQPVTEALSATGTQIG (SEQ ID NO: 27)
11 GDCTSGGETFFQPVTEALSATGTQIG (SEQ ID NO: 28)
12 GDCSSGGTTFFQPVPEALAAYGAEIG (SEQ ID NO:29)
13 RDCTSGGETFFQPVTTALAAVGGTLGGEDGGDG- (SEQ ID NO:30)
14 GDCTVGGETFFQPVTTALAAVGATLGGEDGGAGA (SEQ ID NO:31)
15 GDCSQGGETFFQPVTEALKAYGAQIGGGQGEPPE (SEQ ID NO:32)
16 GDCAKGGTTFFQPLPEAMASLGVRLIVPGREGAA (SEQ ID NO:33)
17 GDCSRGGTTYFQPVNRILQTYGLTLVTA (SEQ ID NO:34)
18 GNCSSGGTTYFQPINPLLQAYGLTLVTSGG--GT (SEQ ID NO:35)
19 GDCRTGGETFFQPINALLQNYGLTLKTTGGDDGG (SEQ ID NO:36)
20 GNCSIGGTTFHQPVNPILSQYGLTLVRS (SEQ ID NO:37)
21 GNCTSGGTTYHQPINPLLSAYGLDLVTG (SEQ ID NO:38)
22 GDCTNGGTTFYQPVNPLLSDFGLTLKTTSA (SEQ ID NO:39)
23 GDCRSGGITFFQPINPLLSYFGLQLVTG (SEQ ID NO:40)
24 GIPASQRSSLFERLQPILSQYGLSLVTG (SEQ ID NO:41)
25 GNCSSGGETWFQPVNEILNRYGLTLHTA (SEQ ID NO:42)
26 -MDTNNCRSFYQPVNTVLNKWKLSLVTSTDVTTS (SEQ ID NO:43)
27 SGCGMIHEPISRALADRGVTLLAG (SEQ ID NO:44)
In the above listing, the numbers correspond as follows:
1 ASP Protease
2 Streptogrisin A (Streptomyces griseus)
3 Glutamyl endopeptidase (Streptomyces fradiae)
4 Streptogrisin B (Streptomyces lividans)
5 SAM-P20 (Streptomyces coelicolor)
6 SAM-P20 (Streptomyces albogriseolus)
7 Streptogrisin B (Streptomyces griseus)
8 Glutamyl endopeptidase II (Streptomyces griseus)
9 Glutamyl endopeptidase II (Streptomyces fradiae)
10 Streptogrisin D (Streptomyces albogriseolus)
11 Streptogrisin D (Streptomyces coelicolor)
12 Streptogrisin D (Streptomyces griseus)
13 Subfamily S1E unassigned peptidase (SalO protein) (Streptomyces lividans)
14 Subfamily S1E unassigned peptidase (SALO protein) (Streptomyces coelicolor)
15 Streptogrisin D (Streptomyces platensis)
16 Subfamily S1E unassigned peptidase (3SC5B7.10 protein) (Streptomyces coelicolor)
17 CHY1 protease (Metarhizium anisopliae)
18 Streptogrisin C (Streptomyces griseus)
19 Streptogrisin C (SCD40A.16c protein) (Streptomyces coelicolor)
20 Subfamily S1E unassigned peptidase (I) (Streptomyces sp.)
21 Subfamily S1E unassigned peptidase (II) (Streptomyces sp.)
22 Subfamily S1E unassigned peptidase (SCF43A.19 protein) (Streptomyces coelicolor)
23 Subfamily S1E unassigned peptidase (Thermobifida fusca; basonym Thermomonospora fusca)
24 Alpha-lytic endopeptidase (Lysobacter enzymogenes)
25 Subfamily S1E unassigned peptidase (SC10G8.13C protein) (Streptomyces coelicolor)
26 Yeast-lytic endopeptidase (Rarobacter faecitabidus)
27 Subfamily S1E unassigned peptidase (SC10A5.18 protein) (Streptomyces coelicolor)
EXAMPLE 5
Screening for Novel Homologues of 69B4 Protease by PCR
In this Example, methods used to screen for novel homologues of 69B4 protease are described. Bacterial strains of the suborder Micrococcineae, and in particular from the families Cellulomonadaceae and Promicromonosporaceae, were ordered from the German culture collection, DSMZ (Braunschweig), and received as freeze-dried cultures. Additional strains were received from the Belgian Coordinated Collections of Microorganisms, BCCM™/LMG (University of Ghent). The freeze-dried ampoules were opened according to DSMZ instructions and the material was rehydrated with sterile physiological saline (1.5 ml) for 1 h. Well-mixed, rehydrated cell suspensions (300 µl) were transferred to sterile Eppendorf tubes for subsequent PCR.
PCR Methods
(i) Pretreatment of the Samples
The rehydrated microbial cell suspensions were placed in a boiling water bath for 10 min. The suspensions were then centrifuged at 16000 rpm for 5 min (Sigma 1-15 centrifuge) to remove cell debris and remaining cells, with the clear supernatant fraction serving as the template for the PCR reaction.
(ii) PCR Test Conditions
The DNA from these types of bacteria (Actinobacteria) is characteristically highly GC rich (typically >55 mol%), so addition of DMSO is a necessity. The chosen concentration based on earlier work with the Cellulomonas sp. strain 69B4 was 4% v/v DMSO.
(iii) PCR Primers (chosen from the following pairs)

Prot-int_FW1 5'-TGCGCCGAGCCCGGCGACTC-3' (SEQ ID NO:45)
Prot-int_RV1 5'-GAGTCGCCGGGCTCGGCGCA-3' (SEQ ID NO:46)
Prot-int_FW2 5'-TTCCCCGGCAACGACTACGCGTGGGT-3' (SEQ ID NO:47)
Prot-int_RV2 5'-ACCCACGCGTAGTCGTTGCCGGGGAA-3' (SEQ ID NO:48)
Cellu-FW1 5'-GCCGCTGCTCGATCGGGTTC-3' (SEQ ID NO:49)
Cellu-RV1 5'-GCAGTTGCCGGAGCCGCCGGACGT-3' (SEQ ID NO:50)
(iv) PCR Mixture (all materials supplied by Invitrogen)
Template DNA 4 µl
10x PCR buffer 5 µl
50 mM MgSO4 2 µl
10 mM dNTPs 1 µl
Primers (10 µM soln.) 1 µl each
Platinum Taq hifi polymerase 0.5 µl
DMSO 2 µl
MilliQ water 33.5 µl
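The arithmetic behind this mixture can be checked with a short sketch. The Python below is illustrative only. It assumes the listed volumes read as 4, 5, 2, 1, 0.5, 2 and 33.5 µl, that "each" means one forward and one reverse primer at 1 µl apiece, and that the dNTP stock is 10 mM; the dictionary keys and function names are hypothetical.

```python
# Component volumes in microlitres, as read from the PCR mixture list.
# Assumption: two primers (forward and reverse), 1 µl each.
MIX_UL = {
    "template DNA": 4.0,
    "10x PCR buffer": 5.0,
    "50 mM MgSO4": 2.0,
    "10 mM dNTPs": 1.0,
    "forward primer (10 uM)": 1.0,
    "reverse primer (10 uM)": 1.0,
    "Platinum Taq hifi polymerase": 0.5,
    "DMSO": 2.0,
    "MilliQ water": 33.5,
}


def total_volume(mix):
    """Total reaction volume in microlitres."""
    return sum(mix.values())


def percent_v_v(mix, component):
    """Volume fraction of one component, as % v/v of the whole reaction."""
    return 100.0 * mix[component] / total_volume(mix)


def final_concentration(stock, volume_ul, total_ul):
    """Dilution C1*V1 = C2*V2; the result has the same unit as `stock`."""
    return stock * volume_ul / total_ul


if __name__ == "__main__":
    total = total_volume(MIX_UL)
    print("total volume (µl):", total)
    print("DMSO (% v/v):", percent_v_v(MIX_UL, "DMSO"))
    print("MgSO4 (mM):", final_concentration(50.0, MIX_UL["50 mM MgSO4"], total))
```

Under these assumptions the components sum to a 50 µl reaction, and 2 µl of DMSO corresponds to the 4% v/v stated under the PCR test conditions.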
(v) PCR Protocol
1) 94°C 5 min
2) 94°C 30 sec
3) 55°C 30 sec
4) 68°C 3 min
5) Repeat steps 2-4 for 29 cycles
6) 68°C 10 min
7) 15°C 1 min
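For planning purposes, the programmed hold time of this protocol can be tallied as in the sketch below. This is illustrative only. It assumes that "repeat for 29 cycles" means steps 2-4 run once and are then repeated 29 times (30 cycles in total), and it ignores the ramp time between temperatures, which depends on the instrument. All names are hypothetical.

```python
# Hold times in seconds, taken from the protocol steps above.
INITIAL_DENATURATION_S = 5 * 60                 # step 1: 94°C
CYCLE_STEPS_S = [(94, 30), (55, 30), (68, 180)]  # steps 2-4: (°C, seconds)
CYCLES = 1 + 29                                  # first pass plus 29 repeats (assumed reading)
FINAL_EXTENSION_S = 10 * 60                      # step 6: 68°C
FINAL_HOLD_S = 1 * 60                            # step 7: 15°C


def programmed_minutes():
    """Sum of all programmed holds, in minutes (ramp times excluded)."""
    per_cycle = sum(seconds for _temp, seconds in CYCLE_STEPS_S)
    total_s = (
        INITIAL_DENATURATION_S
        + CYCLES * per_cycle
        + FINAL_EXTENSION_S
        + FINAL_HOLD_S
    )
    return total_s / 60.0


if __name__ == "__main__":
    print("programmed hold time (min):", programmed_minutes())
```

The cycle count of 30 agrees with the "typically thirty cycles" mentioned for the amplification in Example 3, but that correspondence is an inference rather than a statement in the text.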
The amplified PCR products were examined by agarose gel electrophoresis. Distinct bands for each organism were excised from the gel, purified using the Qiagen gel extraction kit, and sequenced by BaseClear, using the same primer combinations.
(vi) Sequence Analysis
Nucleotide sequence data were analyzed and the DNA sequences were translated into amino acid sequences to review the homology to 69B4-mature protein. Sequence alignments were performed using AlignX, a component of Vector NTI suite 9.0.0. The results are compiled in Table 5-1. The numbering is that used in SEQ ID NO:8.
Table 5-1. Percent Identity of (translated) Amino Acid Sequences found in Natural Isolate Strains Compared to 69B4 Mature Protease

These results show that PCR primers based on polynucleotide sequences of the 69B4 protease gene (mature chain; SEQ ID NO:4) are successful in detecting homologous genes in bacterial strains of the suborder Micrococcineae, and in particular from the families Cellulomonadaceae and Promicromonosporaceae.
Figure 2 provides a phylogenetic tree of ASP protease. The phylogeny of this protease was examined by a variety of approaches from mature sequences of similar members of the chymotrypsin superfamily of proteins and ASP homologues for which significant mature sequence has been deduced. Using protein distance methods known in the art (See e.g., Kimura, The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, UK [1983]), similar trees were obtained either including or excluding gaps. The phylogenetic tree of Figure 2 was constructed from aligned sequences (positions 16-181 of SEQ ID NO:8) using TREECONW v.1.3b (Van de Peer and De Wachter, Comput. Appl. Biosci., 10:569-570 [1994]) and with tree topology inferred by the Neighbor-Joining algorithm (Saitou and Nei, Mol. Biol. Evol., 4:406-425 [1987]). As indicated by this tree, the ASP series of homologous proteases ("cellulomonadins") forms a separate subfamily of proteins. In Figure 2, the numbers provided in brackets correspond to the sequences provided herein.
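As an illustration of the kind of protein distance referred to above, the sketch below computes Kimura's empirical correction, d = -ln(1 - p - 0.2p²), where p is the fraction of differing residues between two aligned sequences. This is a minimal sketch: whether TREECONW applied exactly this correction and gap treatment to produce Figure 2 is an assumption, and the function names are hypothetical.

```python
import math


def p_distance(seq_a, seq_b, gap_chars="-."):
    """Fraction of differing residues over gap-free alignment columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    different = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue  # gapped columns are excluded here
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return different / compared


def kimura_protein_distance(p):
    """Kimura (1983) empirical correction for multiple substitutions."""
    inner = 1.0 - p - 0.2 * p * p
    if inner <= 0.0:
        raise ValueError("sequences too divergent for this correction")
    return -math.log(inner)


if __name__ == "__main__":
    # Toy aligned strings, not taken from the sequence listing.
    p = p_distance("ACDE", "ACDF")
    print(p, kimura_protein_distance(p))
```

A matrix of such pairwise distances is the input that a Neighbor-Joining implementation uses to infer the tree topology.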
The following is an alignment between the Cellulomonas 69B4 ASP protease and homologous proteases of related genera described herein.
1 50
69B4 (ASP) complete (1) MTPRTVTRALAVATAAATLLAGGMAAQANEPAPPGSASAPPRLAEKLDPD
Cellulomonas gelida (1)
Cellulomonas flavigena (1)
Cellulomonas biazotea (1)
Cellulomonas fimi (1)
Cellulomonas iranensis (1)
Cellulomonas cellasea (1)
C. xylanilytica (1)
Oerskovia turbata (1) MARSFWRTLATACAATALVAGPAALTANAATPTPDTPTVSPQTSSKVSPE
Oerskovia jenensis (1)
Cm. cellulans (1)
Pm. citrea (1)
Pm. sukumoe (1)
69B4 (ASP) mature (1)
Consensus (1)
51 100
69B4 (ASP) complete (51) LLEAMERDLGLDAEEAAATLAFQHDAAETGEALAEELDEDF-AGTWVEDD
Cellulomonas gelida (1)
Cellulomonas flavigena (1)
Cellulomonas biazotea (1)
Cellulomonas fimi (1)
Cellulomonas iranensis (1)
Cellulomonas cellasea (1) V
C. xylanilytica (1)
Oerskovia turbata (51) VLRALQRDLGLSAKDATKRLAFQSDAASTEDALADSLDAYAGAWVDPARN
Oerskovia jenensis (1)
Cm. cellulans (1) PRAAGRAARSSGSRASAS
Pm. citrea (1)
Pm. sukumoe (1)
69B4 (ASP) mature (1)
Consensus (51)
101 150
69B4 (ASP) complete (100) VLYVATTDEDAVEEVEGEGATAVTVEHSLADLEAWKTVLDAALEGHDDVP
Cellulomonas gelida (1)
Cellulomonas flavigena (1)
Cellulomonas biazotea (1) KQTASEFVIRLTIGELNLAAANSPLPIGHAWSTAL
Cellulomonas fimi (1)
Cellulomonas iranensis (1)
Cellulomonas cellasea (2) GRVRQLPLRGHDVLPARERDPAGLRSASRPGLTRSRRARLDAAGPSARVA
C. xylanilytica (1)
Oerskovia turbata (101) TLYVGVADRAEAKEVRSAGATPVWDHTLAELDTWKAALDGELNDPAGVP
Oerskovia jenensis (1)
Cm. cellulans (19) TSPGPTSVTASASSCGRATGRRQRWTFEADGTVRAGGKCMDVAWAPRPTA
Pm. citrea (1)
Pm. sukumoe (1)
69B4 (ASP) mature (1)
Consensus (101)
151 200
69B4 (ASP) complete (150) TWYVDVPTNSVWAVKAGAQDVAAGLVEGADVPSDAVTFVETDETPRTMF
Cellulomonas gelida (1)
Cellulomonas flavigena (1) V
Cellulomonas biazotea (36) GWYVDVTTNTVWNATALAVAQATEIVAAATVPADAVRvA/ETTEAPRTFI
Cellulomonas fimi (1) V
Cellulomonas iranensis (1)
Cellulomonas cellasea (52) AWYVDVPTNKLWESVG--DTAAAADAVAAAGLPADAVTLATTEAPRTFV
C. xylanilytica (1)
Oerskovia turbata (151) SWFVDVTTNQWVNVHDGGRALAELAAASAGVPADAITYVTTTEAPRPLV
O.jenenensis revi (1)
Cm. cellulans (69) RRSSSRTARQRGPEVRAQRRGRPRVGAGEQSASTPPGAHRGTRGAVRAHG
Pm. citrea (1)
Pm. sukumoe (1)
69B4 (ASP) mature (1) F
Consensus (151)

201

250

DVIGGNAYTIGGRSR CSIGFAVNGGFITAGHCGRTGA TTA
DVIGGNAYYINASSR CSVGFAVEGGFVTAGHCGRAGA STS
R CSIGFAVTGGFVTAGHCGRSGA TTT
DWGGNAYTMGSGGR CSVGFAVNGGFITAGHCGSVGT RTS
R CSVGFAVNGGFVTAGHCGTVGT RTS
DVRGGDRYITRDPGASSGSACSIGYAVQGGFVTAGHCGRGGTRRVLTASW
69B4(ASP)complete (200)
Cellulomonas gelida (1)
Cellulomonas flavigena (2)
Cellulomonas biazotea (86)
C. find., revi (2)
C.iranensis revi (1)
Cellulomonas cellasea (100)
C. xylanilytica (1)
Oerskovia turbata (201)
Oerskovia jenensis (1)
Cm. cellulans (119)
Pm. citrea (1)
Pm. sukumoe (1)
69B4 (ASP) mature (2)
Consensus (201)

DVIGGNAYYIGSRSR
DVIGGNRYRINNTSR
DVIGGDAYYIGGRSR
DVIGGNAYTIGGRSR-DVIGG Y I R

-CSIGFAVEGGFVTAGHCGRAGA STS
-CSVGFAVSGGFVTAGHCGTTGA TTT
- C SIGFAVTGGF VTAGHCGRTGA ATT
-CSIGFAVNGGFITAGHCGRTGA TTA
CSIGFAV GGFVTAGHCGR GA TS



69B4(ASP)complete (240)
Cellulomonas gelida (1)
Cellulomonas flavigena (42)
Cellulomonas biazotea (126)
Cellulomonas fimi (42)
Cellulomonas iranensis (1)
Cellulomonas cellasea (140)
C. xylanilytica (27)
Oerskovia turbata (241)
Oerskovia jenensis (27)
Cm. cellulans (169)
Pm. citrea (1)
Pm. sukumoe (1)
69B4 (ASP) mature (42)
Consensus (251)

251 300
NPTGTFAGSSFPGNDYAFVRTGAGVNLLAQVNNYSGGRVQVAGHTAAPVG
SPSGTFRGSSFPGNDYAWVQVASGNTPRGLVNNHSGGTVRVTGSQQAAVG KPSGTFAGSSFPGNDYAWVRVASGNTPVGAVNNYSGGTVAVAGSTQATVG SPSGTFAGSSFPGNDYAWVRVASGNTPVGAVNNYSGGTVAVAGSTQAAVG
FPGNDYAWVQVGSGDTPRGLVNNYAGGTVRVTGSQQAAVG
SPSGTFRGSSFPGNDYAWVQVASGNTPRGLVNNHSGGTVRVTGSQQAAVG SPSGTFAGSSFPGNDYAWVRAASGNTPVGAVNRYDGSRVTVAGSTDAAVG GPGGTFRGSNFPGNDYAWVQVDAGNTPVGAVNNYSGGRVAVAGSTAAPVG GPGGTFRGSSFPGNDYAWVQVDAGNTPVGAVNNYSGGRVAVAGSTAAPVG ARMGTVQAASFPGHDYAWVRVDAGFSPVPRVNNYAGGTVDVAGSAEAPVG
FPGNDYAWVNTGTDDTLVGAVNNYSGGTVNVAGSTRAAVG '
FPGNDYAWVNVGSDDTPIGAVNWYSGGTVNVAGSTQAAVG
NPTGTFAGSSFPGNDYAFVRTGAGVNLLAQVWNYSGGRVQVAGHTAAPVG P GTF GSSFPGNDYAWVQVASGNTPVGAVNNYSGGTV VAGST AAVG



69B4(ASP)complete (290)
Cellulomonas gelida (1)
Cellulomonas flavigena (92)
Cellulomonas biazotea (176)
Cellulomonas fimi (92)
Cellulomonas iranensis (41)
Cellulomonas cellasea (190)
C. xylanilytica (77)
Oerskovia turbata (291)
Oerskovia jenensis (77)
Cm. cellulans (219)
Pm. citrea (41)
Pm. sukumoe (41)
69B4 (ASP) mature (92)
Consensus (301)
69B4(ASP)complete (340)
Cellulomonas gelida (1)
Cellulomonas flavigena (142)
Cellulomonas biazotea (226)
Cellulomonas fimi (142)
Cellulomonas iranensis (86)
Cellulomonas cellasea (240)
C. xylanilytica (127)
Oerskovia turbata (341)
Oerskovia jenensis (127)
Cm. cellulans (269)
Pm. citrea (86)
Pm. sukumoe (86)
69B4 (ASP) mature (142)
Consensus (351)

301 350
SAVCRSGSTTGWHCGTITALNSSVTYPEGTVRGLIRTTVCAEPGDSGGSL
SYVCRSGSTTGWRCGYVRAYNTTVRYAEGSVSGLIRTSVCAEPGDSGGSL ASVCRSGSTTGWRCGTIQAFNSTVNYAQGSVSGLIRTNVCAEPGDSGGSL ATVCRSGSTTGWRCGTIQAFNATVNYAEGSVSGLIRTNVCAEPGDSGGSL
AYVCRSGSTTGWRCGTVQAYNASVRYAEGTVSGLIRTNVCAEPGD
SYVCRSGSTTGWRCGYVRAYNTTVRYAEGSVSGLIRTSVCAEPGDSGGSL AAVCRSGSTTAWGCGTIQSRGASVTYAQGTVSGLIRTNVCAEPGDSGGSL ASVCRSGSTTGWHCGTIGAYNTSVTYPQGTVSGLIRTNVCAEPGDSGGSL SSVCRSGSTTGWRCGTIAAYNSSVTYPQGTVSGLIRTNVCAEPGDSGGSL ASVCRSGATTGWRCGVIEQKNITVNYGNGDVPGLVRGSACAEGGDSGGSV
ATVCRSGSTTGWHCGTIQALNASVTYAEGTVSGLIRTNVCAEPGD
STVCRSGSTTGWHCGTIQAFNASVTYAEGTVSGLIRTNVCAEPGD
SAVCRSGSTTGWHCGTITALNSSVTYPEGTVRGLIRTTVCAEPGDSGGSL ASVCRSGSTTGWRCGTI AYNASV YAEGTVSGLIRTNVCAEPGDSGGSL
351 400
LAGNQAQGVTSGGSGNCRTGGTTFFQPVNPILQAYGLRMITT-DSGSSPA LAGNQAQGVTSGGSGNCSSGGTTYFQPVNEALRVYGLTLVTS-DGGGTE-
VAGTQAQGVTSGGSGNCRYGGTTYFQPVNEILQDQPGPSTTR-AL
IAGNQAQGLTSGGSGNCTTGGTTYFQPVNEALSAYGLTLVTSSGGGGGGG
VAG
VAGTQAQGVTSGGSGNCRYGGTTYFQPVNEILQAYGLRLVLG-HARGGPS
IAGTQARGVTSGGSGNC
LAGNQAQGVTSGGSGNCSSGGTTYFQPVNEALGGYGLTLVTSDGGGPSRR LAGNQAQGLTSGGSGNCSSGGTTYFQPVNEALSAYGLTLVTSGGRGNC— ISGNQAQGVTSGRINDCSNGGKFLYQPDRRPVARDHGRRVGQRARRARGQ
LAGNQAQGVTSGGSGNCRTGGTTFFQPVNPILQAYGLRMITTDSGSSP— LAGNQAQGVTSGGSGNC GGTTYFQPVN L YGL LV



69B4(ASP)complete
Cellulomonas gelida
Cellulomonas flavigena
Cellulomonas biazotea
Cellulomonas fimi
Cellulomonas iranensis
Cellulomonas cellasea
C. xylanilytica

(389) -PAPTSCTGYARTFTGTLAAGRAAAQPNGSYVQVNRSGTHSVCLNGPSGA
(49) -PPPTGCQGYARTYQGSVSAGTSVAQPNGSYVTTG-GGTHRVCLSGPAGT
(186)
(276) TTCTGYARTYTGSLASRQSAVQPSGSYVTVGSSGTIRVCLDGPSGT
(145)
(86)
(289) -PARRAPAPPARA
(144)

Oerskovia turbata
Oerskovia jenensis
Cm. cellulans
Pm. citrea
Pm. sukumoe
69B4 (ASP) mature
Consensus

(391) RPGARAMRGPTRAASRPGRRSRSERF VRHDRGRATGCA-
(175)
(319) VHRRPRVRLQ
(86)
(86)
(190)
(401)



69B4(ASP)complete
Cellulomonas gelida
Cellulomonas flavigena
Cellulomonas biazotea
Cellulomonas fimi
Cellulomonas iranensis
Cellulomonas cellasea
C. xylanilytica
Oerskovia turbata
Oerskovia jenensis
Cm. cellulans
Pm. citrea
Pm. sukumoe
69B4 (ASP) mature
Consensus



EXAMPLE 6 Detection of Novel Homologues of 69B4 Protease by Immunoblotting
In this Example, immunoblotting experiments used to detect homologues of 69B4 are described. The following organisms were used in these experiments:
1. Cellulomonas biazotea DSM 20112
2. Cellulomonas flavigena DSM 20109
3. Cellulomonas fimi DSM 20113
4. Cellulomonas cellasea DSM 20118
5. Cellulomonas uda DSM 20107
6. Cellulomonas gelida DSM 20111
7. Cellulomonas xylanilytica LMG 21723
8. Cellulomonas iranensis DSM 14785
9. Oerskovia jenensis DSM 46000
10. Oerskovia turbata DSM 20577
11. Cellulosimicrobium cellulans DSM 20424
12. Xylanibacterium ulmi LMG 21721
13. Isoptericola variabilis DSM 10177
14. Xylanimicrobium pachnodae DSM 12657

15. Promicromonospora citrea DSM 43110
16. Promicromonospora sukumoe DSM 44121
17. Agromyces ramosus DSM 43045
The strains were first grown on Heart Infusion/skim milk agar plates (72 h, 30°C) to confirm strain purity, to confirm protease activity by clearing of the skim milk, and to serve as inoculum. Bacterial strains were cultivated on Brain Heart Infusion broth supplemented with casein (0.8% w/v) in 100/500 Erlenmeyer flasks with baffles at 230 rpm, 30°C for 5 days. Microbial growth was checked by microscopy. Supernatants were separated from cells by centrifugation for 30 min at 4766 x g. Further solids were removed by centrifugation at 9500 rpm. Supernatants were concentrated using a Vivaspin 20 ml concentrator (Vivascience), cutoff 10 kDa, by centrifugation at 4000 x g. Concentrates were stored in aliquots of 0.5 mL at -20°C.
Primary antibody
The primary antibody (EP034323) for the immunoblotting reaction, prepared by Eurogentec (Liege Science Park, Seraing, Belgium) was raised against 2 peptides consisting of amino acids 151-164 and 178-189 in the 69B4 mature protease (SEQ ID NO:8), namely:
TSGGSGNCRTGGTT (epitope 1; SEQ ID NO:51) and LRMITTDSGSSP (epitope 2; SEQ ID NO:52) as shown below in the amino acid sequence of 69B4 mature protease:
1 FDVIGGNAYT IGGRSRCSIG FAVNGGFITA GHCGRTGATT ANPTGTFAGS
51 SFPGNDYAFV RTGAGVNLLA QVNNYSGGRV QVAGHTAAPV GSAVCRSGST
101 TGWHCGTITA LNSSVTYPEG TVRGLIRTTV CAEPGDSGGS LLAGNQAQGV
151 TSGGSGNCRT GGTTFFQPVN PILQAYGLRM ITTDSGSSP (SEQ ID NO: 8)
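The stated epitope positions can be checked with a short script. The following is a minimal sketch rather than part of the described method: it assumes the mature sequence as written out in the code (residues 151-189 are taken from the two epitope sequences and the "69B4 (ASP) mature" row of the alignment), and the names MATURE_69B4, EPITOPES and locate are illustrative only.

```python
# Mature 69B4 protease (SEQ ID NO:8), written without spaces.
# Assumption: residues 151-189 are read from the two epitope peptides and the
# "69B4 (ASP) mature" alignment row; residue 73 is taken as N, as in the
# complete-sequence alignment row.
MATURE_69B4 = (
    "FDVIGGNAYTIGGRSRCSIGFAVNGGFITAGHCGRTGATTANPTGTFAGS"
    "SFPGNDYAFVRTGAGVNLLAQVNNYSGGRVQVAGHTAAPVGSAVCRSGST"
    "TGWHCGTITALNSSVTYPEGTVRGLIRTTVCAEPGDSGGSLLAGNQAQGV"
    "TSGGSGNCRTGGTTFFQPVNPILQAYGLRMITTDSGSSP"
)

EPITOPES = {
    "epitope 1 (SEQ ID NO:51)": "TSGGSGNCRTGGTT",
    "epitope 2 (SEQ ID NO:52)": "LRMITTDSGSSP",
}


def locate(peptide, protein):
    """Return the 1-based (start, end) of the first occurrence, or None."""
    index = protein.find(peptide)
    if index < 0:
        return None
    return (index + 1, index + len(peptide))


positions = {name: locate(pep, MATURE_69B4) for name, pep in EPITOPES.items()}

for name, span in positions.items():
    print(name, span)
```

Both peptides are located by exact string match, so a single residue change in either region of a homologue would cause the lookup to return None.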
Electrophoresis and Immunoblotting
Sample preparation
1. Concentrated culture supernatant (50 µL)
2. PMSF (1 µL; 20 mg/ml)
3. 1 M HCl (25 µL)
4. NuPAGE LDS sample buffer (25 µL) (Invitrogen, Carlsbad, CA, USA)
The components were mixed and heated at 90°C for 10 min.
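The final concentrations in the sample mixture follow from the listed volumes. The sketch below is illustrative arithmetic only; it assumes the volumes are in microlitres, that the third component is 1 M HCl, and that volumes are additive. The names COMPONENTS_UL and final_concentration are hypothetical.

```python
# Volumes in microlitres, as listed under "Sample preparation".
# Assumption: the garbled unit in the source is microlitres.
COMPONENTS_UL = {
    "concentrated culture supernatant": 50.0,
    "PMSF stock (20 mg/ml)": 1.0,
    "1 M HCl": 25.0,
    "LDS sample buffer": 25.0,
}

total_ul = sum(COMPONENTS_UL.values())


def final_concentration(stock_concentration, added_ul, total_volume_ul):
    """Concentration after dilution, in the same unit as the stock."""
    return stock_concentration * added_ul / total_volume_ul


pmsf_mg_per_ml = final_concentration(20.0, 1.0, total_ul)
hcl_molar = final_concentration(1.0, 25.0, total_ul)
supernatant_fraction = COMPONENTS_UL["concentrated culture supernatant"] / total_ul

print(f"total volume: {total_ul:.0f} uL")
print(f"PMSF: {pmsf_mg_per_ml:.3f} mg/ml")
print(f"HCl: {hcl_molar:.3f} M")
print(f"supernatant fraction: {supernatant_fraction:.2f}")
```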
Electrophoresis
SDS-PAGE was performed in duplicate using NuPAGE 10% Bis-Tris gels (Invitrogen) with MES-SDS running buffer at 100 V for 5 min. and then at 200 V constant. Where possible, 25 µL of sample was loaded in each slot. One gel of each pair was stained with Coomassie Blue and the other gel was used for immunoblotting using the Boehringer Mannheim chromogenic Western blotting protocol (Roche).
Immunoblotting
The transfer buffer was Tris (0.25 M)-glycine (1.92 M)-methanol (20% v/v). The PVDF membrane was pre-wetted by successive moistening in methanol, deionized water, and finally transfer buffer.
The PAGE gel was briefly washed in deionized water and transferred to blotting pads soaked in transfer buffer, covered with pre-wetted PVDF membrane and pre-soaked blotting pads. Blotting was performed in transfer buffer at 400 mA constant for 2.5-3 h. The membrane was briefly washed (2x) in Tris buffered saline (TBS) (0.5 M Tris, 0.15 M NaCl, pH 7.5). Non-specific antibody binding was prevented by incubating the membrane in 1% v/v mouse/rabbit Blocking Reagent (Roche) in maleic acid solution (100 mM maleic acid, 150 mM NaCl, pH 7.5) overnight at 4°C.
The primary antibody used in these reactions was EP034323 diluted 1:1000. The reaction was performed with the Ab diluted in 1% Blocking Solution with a 30 min. reaction time. The membrane was washed 4x 10 min. in TBST (TBS + 0.1% v/v Tween 20).
The secondary antibody consisted of anti-mouse/anti-rabbit IgG (Roche), 73 µL in 20 ml of 1% Blocking Solution, with a reaction time of 30 min. The membrane was washed 4x 15 min. in TBST and the substrate reaction (alkaline phosphatase) performed with BM Chromogenic Western Blotting Reagent (Roche) until staining occurred.
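The antibody dilutions can be expressed as simple ratios. This is a hedged sketch: it assumes the secondary antibody volume is 73 microlitres, treats "1:1000" as one part stock in 1000 parts final volume, and uses a hypothetical 20 ml working volume for the primary antibody, which the text does not specify.

```python
def dilution_factor(stock_ul, final_volume_ml):
    """Fold dilution when stock_ul of stock is made up to final_volume_ml."""
    return final_volume_ml * 1000.0 / stock_ul


def stock_needed_ul(final_volume_ml, fold_dilution):
    """Microlitres of stock needed for a given final volume and fold dilution."""
    return final_volume_ml * 1000.0 / fold_dilution


# Secondary antibody: 73 uL made up to 20 ml (volumes as read from the text).
secondary_fold = dilution_factor(73.0, 20.0)

# Primary antibody at 1:1000; the 20 ml working volume is an assumption.
primary_ul = stock_needed_ul(20.0, 1000.0)

print(f"secondary antibody: about 1:{secondary_fold:.0f}")
print(f"primary antibody stock for 20 ml: {primary_ul:.0f} uL")
```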
The results of the cross-reactivity with primary polyclonal antibody are shown in Table 6-1.

Based on these results, the antibody used in these experiments is highly specific for homologues with a very high percentage of amino acid sequence identity to 69B4 protease. Furthermore, these results indicate that the C-terminal portion of the 69B4 mature protease chain is fairly variable, especially in the region of the two peptide epitopes. In these experiments, a negative Western blotting reaction resulted whenever there were more than 2 amino acid differences in this region.
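The number of amino acid differences within an epitope region can be counted directly. The sketch below compares epitope 1 with the corresponding stretch of three homologue sequences listed in Example 7. It only counts differences; it does not reproduce the results of Table 6-1, and it covers epitope 1 only. The dictionary and function names are illustrative.

```python
EPITOPE_1 = "TSGGSGNCRTGGTT"  # 69B4 mature protease, residues 151-164

# Corresponding 14-residue stretches copied from the homologue protein
# listings (SEQ ID NO:54, 68 and 72). No gaps occur in this region.
HOMOLOGUE_REGIONS = {
    "Cellulomonas flavigena": "TSGGSGNCRYGGTT",
    "Oerskovia turbata": "TSGGSGNCSSGGTT",
    "Cellulosimicrobium cellulans": "TSGRINDCSNGGKF",
}


def count_differences(reference, candidate):
    """Number of mismatched positions between two equal-length peptides."""
    if len(reference) != len(candidate):
        raise ValueError("peptides must have the same length")
    return sum(a != b for a, b in zip(reference, candidate))


differences = {
    organism: count_differences(EPITOPE_1, region)
    for organism, region in HOMOLOGUE_REGIONS.items()
}

for organism, n in differences.items():
    print(f"{organism}: {n} difference(s) in the epitope 1 region")
```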
EXAMPLE 7 Inverse PCR and Genome Walking
In this Example, experiments conducted to elucidate polynucleotide sequences of ASP are described. The microorganisms utilized in these experiments were:
1. Cellulomonas biazotea DSM 20112
2. Cellulomonas flavigena DSM 20109
3. Cellulomonas fimi DSM 20113
4. Cellulomonas cellasea DSM 20118
5. Cellulomonas gelida DSM 20111
6. Cellulomonas iranensis (DSM 14785)
7. Oerskovia jenensis DSM 46000

8. Oerskovia turbata DSM 20577
9. Cellulosimicrobium cellulans DSM 20424
10. Promicromonospora citrea DSM 43110
11. Promicromonospora sukumoe DSM 44121
These bacterial strains were cultivated on Brain Heart Infusion broth or Tryptone Soya broth in 100/500 Erlenmeyer flasks with baffles at 230 rpm, 30°C for 2 days. Cells were separated from the culture broth by centrifugation for 30 min at 4766 x g.
Chromosomal DNA was obtained by a standard phenol/chloroform extraction method known in the art from cells digested by lysozyme/EDTA (See e.g., Sambrook et al., supra). Chromosomal DNA was digested with restriction enzymes selected from the following list: ApaI, BamHI, BssHII, KpnI, NarI, NcoI, NheI, PvuI, SalI or SstII.
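For illustration, the recognition sequences of the listed enzymes can be located in a DNA sequence by simple string scanning. The sketch below uses the commonly cited recognition sequences (SstII recognizes the same site as SacII) and, as an example input only, the first 100 nucleotides of the C. flavigena fragment (SEQ ID NO:53). The function name find_sites is hypothetical; chromosomal DNA would of course contain many more sites.

```python
# Recognition sequences (5'->3'); all are palindromic, so one strand suffices.
RECOGNITION_SITES = {
    "ApaI": "GGGCCC",
    "BamHI": "GGATCC",
    "BssHII": "GCGCGC",
    "KpnI": "GGTACC",
    "NarI": "GGCGCC",
    "NcoI": "CCATGG",
    "NheI": "GCTAGC",
    "PvuI": "CGATCG",
    "SalI": "GTCGAC",
    "SstII": "CCGCGG",
}

# Nucleotides 1-100 of the C. flavigena fragment (top strand only).
FRAGMENT = (
    "GTCGACGTCATCGGGGGCAACGCGTACTACATCGGGTCGCGCTCGCGGTG"
    "CTCGATCGGGTTCGCGGTCGAGGGCGGGTTCGTCACCGCGGGGCACTGCG"
)


def find_sites(sequence, sites):
    """Map enzyme name to 1-based start positions of its recognition site."""
    hits = {}
    for enzyme, site in sites.items():
        starts = [
            i + 1
            for i in range(len(sequence) - len(site) + 1)
            if sequence.startswith(site, i)
        ]
        if starts:
            hits[enzyme] = starts
    return hits


sites_found = find_sites(FRAGMENT, RECOGNITION_SITES)
for enzyme, starts in sorted(sites_found.items()):
    print(enzyme, starts)
```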
The nucleotide and amino acid sequences of these organisms are provided below. In these listings, the mature protease is indicated in bold and the signal sequence is underlined.
C. flavigena (DSM 20109)
1 GTCGACGTCA TCGGGGGCAA CGCGTACTAC ATCGGGTCGC GCTCGCGGTG CAGCTGCAGT AGCCCCCGTT GCGCATGATG TAGCCCAGCG CGAGCGCCAC
51 CTCGATCGGG TTCGCGGTCG AGGGCGGGTT CGTCACCGCG GGGCACTGCG GAGCTAGCCC AAGCGCCAGC TCCCGCCCAA GCAGTGGCGC CCCGTGACGC
101 GGCGCGCGGG CGCGAGCACG TCGTCACCGT CGGGGACCTT CCGCGGCTCG CCGCGCGCCC GCGCTCGTGC AGCAGTGGCA GCCCCTGGAA GGCGCCGAGC
151 TCGTTCCCCG GCAACGACTA CGCGTGGGTC CAGGTCGCCT CGGGCAACAC AGCAAGGGGC CGTTGCTGAT GCGCACCCAG GTCCAGCGGA GCCCGTTGTG
201 GCCGCGCGGG CTGGTGAACA ACCACTCGGG CGGCACGGTG CGCGTCACCG CGGCGCGCCC GACCACTTGT TGGTGAGCCC GCCGTGCCAC GCGCAGTGGC
251 GCTCGCAGCA GGCCGCGGTC GGCTCGTACG TGTGCCGATC GGGCAGCACG CGAGCGTCGT CCGGCGCCAG CCGAGCATGC ACACGGCTAG CCCGTCGTGC
301 ACGGGATGGC GGTGCGGCTA CGTCCGGGCG TACAACACGA CCGTGCGGTA TGCCCTACCG CCACGCCGAT GCAGGCCCGC ATGTTGTGCT GGCACGCCAT
351 CGCGGAGGGC TCGGTCTCGG GCCTCATCCG CACGAGCGTG TGCGCCGAGC GCGCCTCCCG AGCCAGAGCC CGGAGTAGGC GTGCTCGCAC ACGCGGCTCG
401 CGGGCGACTC CGGCGGCTCG CTGGTCGCCG GCACGCAGGC CCAGGGCGTC GCCCGCTGAG GCCGCCGAGC GACCAGCGGC CGTGCGTCCG GGTCCCGCAG
451 ACGTCGGGCG GGTCCGGCAA CTGCCGCTAC GGGGGCACGA CGTACTTCCA TGCAGCCCGC CCAGGCCGTT GACGGCGATG CCCCCGTGCT GCATGAAGGT
501 GCCCGTGAAC GAGATCCTGC AGGACCAGCC CGGGCCGTCG ACCACGCGTG CGGGCACTTG CTCTAGGACG TCCTGGTCGG GCCCGGCAGC TGGTGCGCAC
551 CCCTA
GGGAT (SEQ ID NO:53)
Cellulomonas flavigena (DSM 20109)
1 VDVIGGNAYY IGSRSRCSIG FAVEGGFVTA GHCGRAGAST SSPSGTFRGS
51 SFPGNDYAWV QVASGNTPRG LVNNHSGGTV RVTGSQQAAV GSYVCRSGST
101 TGWRCGYVRA YNTTVRYAEG SVSGLIRTSV CAEPGDSGGS LVAGTQAQGV
151 TSGGSGNCRY GGTTYFQPVN EILQDQPGPS TTRAL (SEQ ID NO:54)
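Each nucleotide row in these listings gives fifty bases of the coding strand followed by the fifty bases of the complementary strand. The relationship between the two strands and the protein listing can be checked as follows. This is a minimal sketch using the standard genetic code and the first row of the C. flavigena listing; the names TOP, BOTTOM, complement and translate are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Row 1 of the C. flavigena listing: coding strand, then the listed complement.
TOP = "GTCGACGTCATCGGGGGCAACGCGTACTACATCGGGTCGCGCTCGCGGTG"
BOTTOM = "CAGCTGCAGTAGCCCCCGTTGCGCATGATGTAGCCCAGCGCGAGCGCCAC"


def complement(dna):
    """Base-by-base complement (not reversed), as printed in the listings."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))


def translate(dna):
    """Translate in frame 1, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


protein_start = translate(TOP)
print(protein_start)
print(complement(TOP) == BOTTOM)
```

The translated residues correspond to the start of the C. flavigena amino acid listing (SEQ ID NO:54).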
Cellulomonas biazotea (DSM 20112)
1 TAAAACAGAC GGCCAGTGAA TTTGTAATAC GACTCACTAT AGGCGAATTG ATTTTGTCTG CCGGTCACTT AAACATTATG CTGAGTGATA TCCGCTTAAC
51 AATTTAGCGG CCGCGAATTC GCCCTTACCT ATAGGGCACG CGTGGTCGAC TTAAATCGCC GGCGCTTAAG CGGGAATGGA TATCCCGTGC GCACCAGCTG
101 GGCCCTGGGC TGGTACGTCG ACGTCACTAC CAACACGGTC GTCGTCAACG CCGGGACCCG ACCATGCAGC TGCAGTGATG GTTGTGCCAG CAGCAGTTGC
151 CCACCGCCCT CGCCGTGGCC CAGGCGACCG AGATCGTCGC CGCCGCAACG GGTGGCGGGA GCGGCACCGG GTCCGCTGGC TCTAGCAGCG GCGGCGTTGC
201 GTGCCCGCCG ACGCCGTCCG GGTCGTCGAG ACCACCGAGG CGCCCCGCAC CACGGGCGGC TGCGGCAGGC CCAGCAGCTC TGGTGGCTCC GCGGGGCGTG
251 GTTCATCGAC GTCATCGGCG GCAACCGTTA CCGGATCAAC AACACCTCGC CAAGTAGCTG CAGTAGCCGC CGTTGGCAAT GGCCTAGTTG TTGTGGAGCG
301 GCTGCTCGGT CGGCTTCGCC GTCAGCGGCG GCTTCGTCAC CGCCGGGCAC CGACGAGCCA GCCGAAGCGG CAGTCGCCGC CGAAGCAGTG GCGGCCCGTG
351 TGCGGCACGA CCGGCGCGAC CACGACGAAA CCGTCCGGCA CGTTCGCCGG ACGCCGTGCT GGCCGCGCTG GTGCTGCTTT GGCAGGCCGT GCAAGCGGCC
401 CTCGTCGTTC CCCGGCAACG ACTACGCGTG GGTGCGCGTC GCGTCCGGCA GAGCAGCAAG GGGCCGTTGC TGATGCGCAC CCACGCGCAG CGCAGGCCGT
451 ACACCCCGGT CGGCGCCGTG AACAACTACA GCGGCGGCAC CGTGGCCGTC TGTGGGGCCA GCCGCGGCAC TTGTTGATGT CGCCGCCGTG GCACCGGCAG
501 GCCGGCTCGA CGCAGGCGAC CGTCGGTGCG TCCGTCTGCC GCTCCGGCTC CGGCCGAGCT GCGTCCGCTG GCAGCCACGC AGGCAGACGG CGAGGCCGAG
551 CACCACGGGG TGGCGCTGCG GGACGATCCA GGCGTTCAAC TCCACCGTCA GTGGTGCCCC ACCGCGACGC CCTGCTAGGT CCGCAAGTTG AGGTGGCAGT

601 ACTACGCGCA GGGCAGCGTC TCCGGCCTCA TCCGCACGAA CGTGTGCGCC TGATGCGCGT CCCGTCGCAG AGGCCGGAGT AGGCGTGCTT GCACACGCGG
651 GAGCCCGGCG ACTCCGGCGG CTCGCTCATC GCCGGCAACC AGGCCCAGGG CTCGGGCCGC TGAGGCCGCC GAGCGAGTAG CGGCCGTTGG TCCGGGTCCC
701 CCTGACGTCC GGCGGGTCGG GCAACTGCAC CACCGGCGGG ACGACGTACT GGACTGCAGG CCGCCCAGCC CGTTGACGTG GTGGCCGCCC TGCTGCATGA
751 TCCAGCCCGT CAACGAGGCG CTCTCCGCCT ACGGCCTGAC GCTCGTCACG AGGTCGGGCA GTTGCTCCGC GAGAGGCGGA TGCCGGACTG CGAGCAGTGC
801 TCGTCCGGCG GCGGCGGTGG CGGCGGCACG ACCTGCACCG GGTACGCGCG AGCAGGCCGC CGCCGCCACC GCCGCCGTGC TGGACGTGGC CCATGCGCGC
851 GACCTACACC GGCTCGCTCG CCTCGCGGCA GTCCGCCGTC CAGCCGTCCG CTGGATGTGG CCGAGCGAGC GGAGCGCCGT CAGGCGGCAG GTCGGCAGGC
901 GCAGCTATGT GACCGTCGGG TCCAGCGGCA CCATCCGCGT CTGCCTCGAC CGTCGATACA CTGGCAGCCC AGGTCGCCGT GGTAGGCGCA GACGGAGCTG
951 GGCCCGAGCG GGACGGACTT CGACCTGTAC CTGCAGAAGT GGAACGGGTC CCGGGCTCGC CCTGCCTGAA GCTGGACATG GACGTCTTCA CCTTGCCCAG
1001 CGCGTGGGC (SEQ ID NO:55) GCGCACCCG
Cellulomonas biazotea (DSM 20112)
1 KQTASEFVIR LTIGELNLAA ANSPLPIGHA WSTALGWYVD VTTNTVVVNA
51 TALAVAQATE IVAAATVPAD AVRVVETTEA PRTFIDVIGG NRYRINNTSR
101 CSVGFAVSGG FVTAGHCGTT GATTTKPSGT FAGSSFPGND YAWVRVASGN
151 TPVGAVNNYS GGTVAVAGST QATVGASVCR SGSTTGWRCG TIQAFNSTVN
201 YAQGSVSGLI RTNVCAEPGD SGGSLIAGNQ AQGLTSGGSG NCTTGGTTYF
251 QPVNEALSAY GLTLVTSSGG GGGGGTTCTG YARTYTGSLA SRQSAVQPSG
301 SYVTVGSSGT IRVCLDGPSG TDFDLYLQKW NGSAW (SEQ ID NO:56)
Cellulomonas fimi (DSM 20113)
1 GTGGACGTGA TCGGCGGCGA CGCCTACTAC ATCGGCGGCC GCAGCCGCTG CACCTGCACT AGCCGCCGCT GCGGATGATG TAGCCGCCGG CGTCGGCGAC
51 TTCGATCGGG TTCGCCGTCA CCGGGGGCTT CGTGACCGCC GGGCACTGCG AAGCTAGCCC AAGCGGCAGT GGCCCCCGAA GCACTGGCGG CCCGTGACGC
101 GCCGCACCGG CGCGGCCACG ACGAGCCCGT CGGGCACGTT CGCCGGCTCG CGGCGTGGCC GCGCCGGTGC TGCTCGGGCA GCCCGTGCAA GCGGCCGAGC
151 AGCTTCCCGG GCAACGACTA CGCGTGGGTG CGGGTCGCGT CGGGCAACAC TCGAAGGGCC CGTTGCTGAT GCGCACCCAC GCCCAGCGCA GCCCGTTGTG

201 GCCCGTCGGC GCGGTGAACA ACTACAGCGG CGGCACGGTC GCCGTCGCCG CGGGCAGCCG CGCCACTTGT TGATGTCGCC GCCGTGCCAG CGGCAGCGGC
251 GCTCGACCCA GGCCGCCGTC GGTGCGACCG TGTGCCGCTC GGGCTCCACC CGAGCTGGGT CCGGCGGCAG CCACGCTGGC ACACGGCGAG CCCGAGGTGG
301 ACCGGCTGGC GGTGCGGCAC CATCCAGGCG TTCAACGCGA CCGTCAACTA TGGCCGACCG CCACGCCGTG GTAGGTCCGC AAGTTGCGCT GGCAGTTGAT
351 CGCCGAGGGC AGCGTCTCCG GCCTCATCCG CACGAACGTG TGCGCCGAGC GCGGCTCCCG TCGCAGAGGC CGGAGTAGGC GTGCTTGCAC ACGCGGCTCG
401 CCGGCGACTC GGGCGGCTCG CTCGTCGCCG GCAACCAGGC GCAGGGCATG GGCCGCTGAG CCCGCCGAGC GAGCAGCGGC CGTTGGTCCG CGTCCCGTAC
451 ACGTCCGGCG GCTCCGACAA CTGC (SEQ ID NO:57) TGCAGGCCGC CGAGGCTGTT GACG
Cellulomonas fimi (DSM 20113)
1 VDVIGGDAYY IGGRSRCSIG FAVTGGFVTA GHCGRTGAAT TSPSGTFAGS
51 SFPGNDYAWV RVASGNTPVG AVNNYSGGTV AVAGSTQAAV GATVCRSGST
101 TGWRCGTIQA FNATVNYAEG SVSGLIRTNV CAEPGDSGGS LVAG (SEQ ID NO:58)
Cellulomonas gelida (DSM 20111)
1 CTCGCGGGCA ACCAGGCGCA GGGCGTGACG TCGGGCGGGT CGGGCAACTG GAGCGCCCGT TGGTCCGCGT CCCGCACTGC AGCCCGCCCA GCCCGTTGAC
51 CTCGTCGGGC GGGACGACGT ACTTCCAGCC CGTCAACGAG GCCCTCCGGG GAGCAGCCCG CCCTGCTGCA TGAAGGTCGG GCAGTTGCTC CGGGAGGCCC
101 TGTACGGGCT CACGCTCGTG ACCTCTGACG GTGGGGGCAC CGAGCCGCCG ACATGCCCGA GTGCGAGCAC TGGAGACTGC CACCCCCGTG GCTCGGCGGC
151 CCGACCGGGT GCCAGGGCTA TGCGCGGACC TACCAGGGCA GCGTCTCGGC GGCTGGCCCA CGGTCCCGAT ACGCGCCTGG ATGGTCCCGT CGCAGAGCCG
201 CGGGACGTCG GTCGCGCAGC CGAACGGTTC GTACGTCACG ACCGGGGGCG GCCCTGCAGC CAGCGCGTCG GCTTGCCAAG CATGCAGTGC TGGCCCCCGC
251 GGACGCACCG GGTGTGCCTG AGCGGACCGG CGGGCACGGA CCTGGACCTG CCTGCGTGGC CCACACGGAC TCGCCTGGCC GCCCGTGCCT GGACCTGGAC
301 TACCTGCAGA AGTGGAACGG GTACTCGTGG GCCAGCGTCG CGCAGTCGAC ATGGACGTCT TCACCTTGCC CATGAGCACC CGGTCGCAGC GCGTCAGCTG
351 GTCGCCTGGT GCCACGGAGG CGGTCACGTA CACCGGGACC GCCGGCTACT CAGCGGACCA CGGTGCCTCC GCCAGTGCAT GTGGCCCTGG CGGCCGATGA
401 ACCGCTACGT GGTCCACGCG TACGCGGGTT CGGGGGCGTA CACCCTGGGG TGGCGATGCA CCAGGTGCGC ATGCGCCCAA GCCCCCGCAT GTGGGACCCC

451 GCGACGACCC CG (SEQ ID NO:59) CGCTGCTGGG GC
Cellulomonas gelida (DSM 20111)
1 LAGNQAQGVT SGGSGNCSSG GTTYFQPVNE ALRVYGLTLV TSDGGGTEPP
51 PTGCQGYART YQGSVSAGTS VAQPNGSYVT TGGGTHRVCL SGPAGTDLDL
101 YLQKWNGYSW ASVAQSTSPG ATEAVTYTGT AGYYRYVVHA YAGSGAYTLG
151 ATTP (SEQ ID NO:60)
Cellulomonas iranensis (DSM 14785)
1 TTCCCCGGCA ACGACTACGC GTGGGTCCAG GTCGGGTCGG GCGACACCCC AAGGGGCCGT TGCTGATGCG CACCCAGGTC CAGCCCAGCC CGCTGTGGGG
51 CCGCGGCCTG GTCAACAACT ACGCGGGCGG CACCGTGCGG GTCACCGGGT GGCGCCGGAC CAGTTGTTGA TGCGCCCGCC GTGGCACGCC CAGTGGCCCA
101 CGCAGCAGGC CGCGGTCGGC GCGTACGTCT GCCGGTCGGG CAGCACGACG GCGTCGTCCG GCGCCAGCCG CGCATGCAGA CGGCCAGCCC GTCGTGCTGC
151 GGCTGGCGCT GCGGCACCGT GCAGGCCTAC AACGCGTCGG TCCGCTACGC CCGACCGCGA CGCCGTGGCA CGTCCGGATG TTGCGCAGCC AGGCGATGCG
201 CGAGGGCACC GTCTCGGGCC TCATCCGCAC CAACGTCTGC GCCGAGCCCG GCTCCCGTGG CAGAGCCCGG AGTAGGCGTG GTTGCAGACG CGGCTCGGGC
251 GCGACTC (SEQ ID NO:61) CGCTGAG
Cellulomonas iranensis (DSM 14785)
1 FPGNDYAWVQ VGSGDTPRGL VNNYAGGTVR VTGSQQAAVG AYVCRSGSTT
51 GWRCGTVQAY NASVRYAEGT VSGLIRTNVC AEPGD (SEQ ID NO:62)
Cellulomonas cellasea (DSM 20118)
1 GTCGGGCGGG TCCGGCAACT GCCGCTACGG GGGCACGACG TACTTCCAGC CAGCCCGCCC AGGCCGTTGA CGGCGATGCC CCCGTGCTGC ATGAAGGTCG
51 CCGTGAACGA GATCCTGCAG GCCTACGGTC TGCGTCTCGT CCTGGGCTGA GGCACTTGCT CTAGGACGTC CGGATGCCAG ACGCAGAGCA GGACCCGACT
101 CACGCTCGCG GCGGGCCCGG CTCGACGCGG CCGGCCCGTC GGCCCGGGTC GTGCGAGCGC CGCCCGGGCC GAGCTGCGCC GGCCGGGCAG CCGGGCCCAG
151 GCCGCCTGGT ACGTCGACGT GCCGACCAAC AAGCTCGTCG TCGAGTCGGT CGGCGGACCA TGCAGCTGCA CGGCTGGTTG TTCGAGCAGC AGCTCAGCCA

201 CGGCGACACC GCGGCGGCCG CCGACGCCGT CGCCGCCGCG GGCCTGCCTG GCCGCTGTGG CGCCGCCGGC GGCTGCGGCA GCGGCGGCGC CCGGACGGAC
251 CCGACGCCGT GACGCTCGCG ACCACCGAGG CGCCACGGAC GTTCGTCGAC GGCTGCGGCA CTGCGAGCGC TGGTGGCTCC GCGGTGCCTG CAAGCAGCTG
301 GTCATCGGCG GCAACGCGTA CTACATCAAC GCGAGCAGCC GCTGCTCGGT CAGTAGCCGC CGTTGCGCAT GATGTAGTTG CGCTCGTCGG CGACGAGCCA
351 CGGCTTCGCG GTCGAGGGCG GGTTCGTCAC CGCGGGCCAC TGCGGGCGCG GCCGAAGCGC CAGCTCCCGC CCAAGCAGTG GCGCCCGGTG ACGCCCGCGC
401 CGGGCGCGAG CACGTCGTCA CCGTCGGGGA CCTTCCGCGG CTCGTCGTTC GCCCGCGCTC GTGCAGCAGT GGCAGCCCCT GGAAGGCGCC GAGCAGCAAG
451 CCCGGCAACG ACTACGCGTG GGTCCAGGTC GCCTCGGGCA ACACGCCGCG GGGCCGTTGC TGATGCGCAC CCAGGTCCAG CGGAGCCCGT TGTGCGGCGC
501 CGGGCTGGTG AACAACCACT CGGGCGGCAC GGTGCGCGTC ACCGGCTCGC GCCCGACCAC TTGTTGGTGA GCCCGCCGTG CCACGCGCAG TGGCCGAGCG
551 AGCAGGCCGC GGTCGGCTCG TACGTGTGCC GATCGGGCAG CACGACGGGA TCGTCCGGCG CCAGCCGAGC ATGCACACGG CTAGCCCGTC GTGCTGCCCT
601 TGGCGGTGCG GCTACGTCCG GGCGTACAAC ACGACCGTGC GGTACGCGGA ACCGCCACGC CGATGCAGGC CCGCATGTTG TGCTGGCACG CCATGCGCCT
651 GGGCTCGGTC TCGGGCCTCA TCCGCACGAG CGTGTGCGCC GAGCCGGGCG CCCGAGCCAG AGCCCGGAGT AGGCGTGCTC GCACACGCGG CTCGGCCCGC
701 ACTCCGGCGG CTCGCTGGTC GCCGGCACGC AGGCCCAGGG CGTCACGTCG TGAGGCCGCC GAGCGACCAG CGGCCGTGCG TCCGGGTCCC GCAGTGCAGC
751 GGCGGGTCCG GCAACTGCCG CTACGGGGGC ACGACGTACT TCCAGCCCGT CCGCCCAGGC CGTTGACGGC GATGCCCCCG TGCTGCATGA AGGTCGGGCA
801 GAACGAGATC CTGCAGGCCT ACGGTCTGCG TCTCGTCCTG GGCTGACACG CTTGCTCTAG GACGTCCGGA TGCCAGACGC AGAGCAGGAC CCGACTGTGC
851 CTCGCGGCGG GCCCTCCCCT GCCCGTCGCG CGCCGGCCCC ACCAGCCCGG GAGCGCCGCC CGGGAGGGGA CGGGCAGCGC GCGGCCGGGG TGGTCGGGCC
901 GCCG (SEQ ID NO:63) CGGC
Cellulomonas cellasea (DSM 20118)
1 VGRVRQLPLR GHDVLPARER DPAGLRSASR PGLTRSRRAR LDAAGPSARV
51 AAWYVDVPTN KLVVESVGDT AAAADAVAAA GLPADAVTLA TTEAPRTFVD
101 VIGGNAYYIN ASSRCSVGFA VEGGFVTAGH CGRAGASTSS PSGTFRGSSF
151 PGNDYAWVQV ASGNTPRGLV NNHSGGTVRV TGSQQAAVGS YVCRSGSTTG
201 WRCGYVRAYN TTVRYAEGSV SGLIRTSVCA EPGDSGGSLV AGTQAQGVTS

251 GGSGNCRYGG TTYFQPVNEI LQAYGLRLVL G*HARGGPSP ARRAPAPPAR
301 A (SEQ ID NO:64)
Cellulomonas xylanilytica (LMG21723)
1 CGCTGCTCGA TCGGGTTCGC CGTGACGGGC GGCTTCGTGA CCGCCGGCCA CTGCGGACGG TCCGGCGCGA CGACGACGTC GCCGAGCGGC ACGTTCGCCG
GCGACGAGCT AGCCCAAGCG GCACTGCCCG CCGAAGCACT GGCGGCCGGT GACGCCTGCC AGGCCGCGCT GCTGCTGCAG CGGCTCGCCG TGCAAGCGGC
101 GGTCCAGCTT TCCCGGCAAC GACTACGCCT GGGTCCGCGC GGCCTCGGGC AACACGCCGG TCGGTGCGGT GAACCGCTAC GACGGCAGCC GGGTGACCGT
CCAGGTCGAA AGGGCCGTTG CTGATGCGGA CCCAGGCGCG CCGGAGCCCG TTGTGCGGCC AGCCACGCCA CTTGGCGATG CTGCCGTCGG CCCACTGGCA
201 GGCCGGGTCC ACCGACGCGG CCGTCGGTGC CGCGGTCTGC CGGTCGGGGT CGACGACCGC GTGGGGCTGC GGCACGATCC AGTCCCGCGG CGCGAGCGTC
CCGGCCCAGG TGGCTGCGCC GGCAGCCACG GCGCCAGACG GCCAGCCCCA GCTGCTGGCG CACCCCGACG CCGTGCTAGG TCAGGGCGCC GCGCTCGCAG
301 ACGTACGCCC AGGGCACCGT CAGCGGGCTC ATCCGCACCA ACGTGTGCGC CGAGCCGGGT GACTCCGGGG GGTCGCTGAT CGCGGGCACC CAGGCGCGGG
TGCATGCGGG TCCCGTGGCA GTCGCCCGAG TAGGCGTGGT TGCACACGCG GCTCGGCCCA CTGAGGCCCC CCAGCGACTA GCGCCCGTGG GTCCGCGCCC
401 GCGTGACGTC CGGCGGCTCC GGCAACTGC (SEQ ID NO:65) CGCACTGCAG GCCGCCGAGG CCGTTGACG
Cellulomonas xylanilytica (LMG 21723)
1 RCSIGFAVTG GFVTAGHCGR SGATTTSPSG TFAGSSFPGN DYAWVRAASG
51 NTPVGAVNRY DGSRVTVAGS TDAAVGAAVC RSGSTTAWGC GTIQSRGASV
101 TYAQGTVSGL IRTNVCAEPG DSGGSLIAGT QARGVTSGGS GNC (SEQ ID NO:66)
Oerskovia turbata (DSM 20577)
1 ATGGCACGAT CATTCTGGAG GACGCTCGCC ACGGCGTGCG CCGCGACGGC TACCGTGCTA GTAAGACCTC CTGCGAGCGG TGCCGCACGC GGCGCTGCCG
51 ACTGGTTGCC GGCCCCGCAG CGCTCACCGC GAACGCCGCG ACGCCCACCC TGACCAACGG CCGGGGCGTC GCGAGTGGCG CTTGCGGCGC TGCGGGTGGG
101 CCGACACCCC GACCGTTTCA CCCCAGACCT CCTCGAAGGT CTCGCCCGAG GGCTGTGGGG CTGGCAAAGT GGGGTCTGGA GGAGCTTCCA GAGCGGGCTC

151 GTGCTCCGCG CCCTCCAGCG GGACCTGGGG CTGAGCGCCA AGGACGCGAC CACGAGGCGC GGGAGGTCGC CCTGGACCCC GACTCGCGGT TCCTGCGCTG
201 GAAGCGTCTG GCGTTCCAGT CCGACGCGGC GAGCACCGAG GACGCTCTCG CTTCGCAGAC CGCAAGGTCA GGCTGCGCCG CTCGTGGCTC CTGCGAGAGC
251 CCGACAGCCT GGACGCCTAC GCGGGCGCCT GGGTCGACCC TGCGAGGAAC GGCTGTCGGA CCTGCGGATG CGCCCGCGGA CCCAGCTGGG ACGCTCCTTG
301 ACCCTGTACG TCGGCGTCGC CGACAGGGCC GAGGCCAAGG AGGTCCGTTC TGGGACATGC AGCCGCAGCG GCTGTCCCGG CTCCGGTTCC TCCAGGCAAG
351 GGCCGGAGCG ACCCCCGTGG TCGTCGACCA CACGCTCGCC GAGCTCGACA CCGGCCTCGC TGGGGGCACC AGCAGCTGGT GTGCGAGCGG CTCGAGCTGT
401 CGTGGAAGGC GGCGCTCGAC GGTGAGCTCA ACGACCCCGC GGGCGTCCCG GCACCTTCCG CCGCGAGCTG CCACTCGAGT TGCTGGGGCG CCCGCAGGGC
451 AGCTGGTTCG TCGACGTCAC GACCAACCAG GTCGTCGTCA ACGTGCACGA TCGACCAAGC AGCTGCAGTG CTGGTTGGTC CAGCAGCAGT TGCACGTGCT
501 CGGCGGACGC GCCCTCGCGG AGCTGGCTGC CGCGAGCGCG GGCGTGCCCG GCCGCCTGCG CGGGAGCGCC TCGACCGACG GCGCTCGCGC CCGCACGGGC
551 CCGACGCCAT CACCTACGTG ACGACGACCG AGGCTCCTCG TCCCCTCGTC GGCTGCGGTA GTGGATGCAC TGCTGCTGGC TCCGAGGAGC AGGGGAGCAG
601 GACGTGGTGG GCGGCAACGC GTACACCATG GGTTCGGGCG GGCGCTGCTC CTGCACCACC CGCCGTTGCG CATGTGGTAC CCAAGCCCGC CCGCGACGAG
651 GGTCGGCTTC GCGGTGAACG GGGGCTTCAT CACGGCCGGG CACTGCGGCT CCAGCCGAAG CGCCACTTGC CCCCGAAGTA GTGCCGGCCC GTGACGCCGA
701 CGGTCGGCAC CCGCACCTCG GGGCCGGGCG GCACGTTCCG GGGGTCGAAC GCCAGCCGTG GGCGTGGAGC CCCGGCCCGC CGTGCAAGGC CCCCAGCTTG
751 TTCCCCGGCA ACGACTACGC CTGGGTGCAG GTCGACGCGG GTAACACCCC AAGGGGCCGT TGCTGATGCG GACCCACGTC CAGCTGCGCC CATTGTGGGG
801 GGTCGGCGCG GTCAACAACT ACAGCGGTGG GCGCGTCGCG GTCGCAGGGT CCAGCCGCGC CAGTTGTTGA TGTCGCCACC CGCGCAGCGC CAGCGTCCCA
851 CGACGGCCGC GCCCGTGGGG GCCTCGGTCT GCCGGTCCGG TTCCACGACG GCTGCCGGCG CGGGCACCCC CGGAGCCAGA CGGCCAGGCC AAGGTGCTGC
901 GGCTGGCACT GCGGCACCAT CGGCGCGTAC AACACCTCGG TGACGTACCC CCGACCGTGA CGCCGTGGTA GCCGCGCATG TTGTGGAGCC ACTGCATGGG
951 GCAGGGCACC GTCTCGGGGC TCATCCGCAC GAACGTGTGC GCCGAGCCCG CGTCCCGTGG CAGAGCCCCG AGTAGGCGTG CTTGCACACG CGGCTCGGGC
1001 GCGACTCGGG CGGCTCGCTC CTCGCGGGCA ACCAGGCGCA GGGCGTGACC CGCTGAGCCC GCCGAGCGAG GAGCGCCCGT TGGTCCGCGT CCCGCACTGG

1051 TCGGGCGGGT CGGGCAACTG CTCGTCGGGC GGGACGACGT ACTTCCAGCC AGCCCGCCCA GCCCGTTGAC GAGCAGCCCG CCCTGCTGCA TGAAGGTCGG
1101 CGTCAACGAG GCCCTCGGGG GGTACGGGCT CACGCTCGTG ACCTCTGACG GCAGTTGCTC CGGGAGCCCC CCATGCCCGA GTGCGAGCAC TGGAGACTGC
1151 GTGGGGGCCC GAGCCGCCGC CGACCGGGTG CCAGGGCTAT GCGCGGACCT CACCCCCGGG CTCGGCGGCG GCTGGCCCAC GGTCCCGATA CGCGCCTGGA
1201 ACCAGGGCAG CGTCTCGGCC GGGACGTCGG TCGCGCAGCG AACGGTTCGT TGGTCCCGTC GCAGAGCCGG CCCTGCAGCC AGCGCGTCGC TTGCCAAGCA
1251 ACGTCACGAC CGGGGGCGGG CGACCGGGTG TGCC (SEQ ID NO:67) TGCAGTGCTG GCCCCCGCCC GCTGGCCCAC ACGG
Oerskovia turbata (DSM 20577)
1 MARSFWRTLA TACAATALVA GPAALTANAA TPTPDTPTVS PQTSSKVSPE
51 VLRALQRDLG LSAKDATKRL AFQSDAASTE DALADSLDAY AGAWVDPARN
101 TLYVGVADRA EAKEVRSAGA TPVVVDHTLA ELDTWKAALD GELNDPAGVP
151 SWFVDVTTNQ VVVNVHDGGR ALAELAAASA GVPADAITYV TTTEAPRPLV
201 DVVGGNAYTM GSGGRCSVGF AVNGGFITAG HCGSVGTRTS GPGGTFRGSN
251 FPGNDYAWVQ VDAGNTPVGA VNNYSGGRVA VAGSTAAPVG ASVCRSGSTT
301 GWHCGTIGAY NTSVTYPQGT VSGLIRTNVC AEPGDSGGSL LAGNQAQGVT
351 SGGSGNCSSG GTTYFQPVNE ALGGYGLTLV TSDGGGPSRR RPGARAMRGP
401 TRAASRPGRR SRSERFVRHD RGRATGCA (SEQ ID NO:68)
Oerskovia jenensis (DSM 46000)
1 GCCGCTGCTC GGTCGGCTTC GCGGTGAACG GCGGCTTCGT CACCGCAGGC CGGCGACGAG CCAGCCGAAG CGCCACTTGC CGCCGAAGCA GTGGCGTCCG
51 CACTGCGGGA CGGTGGGCAC CCGCACCTCG GGGCCGGGCG GCACGTTCCG GTGACGCCCT GCCACCCGTG GGCGTGGAGC CCCGGCCCGC CGTGCAAGGC
101 CGGGTCGAGC TTCCCCGGCA ACGACTACGC CTGGGTGCAG GTCGACGCGG GCCCAGCTCG AAGGGGCCGT TGCTGATGCG GACCCACGTC CAGCTGCGCC
151 GGAACACCCC GGTCGGGGCC GTCAACAACT ACAGCGGTGG ACGCGTCGCG CCTTGTGGGG CCAGCCCCGG CAGTTGTTGA TGTCGCCACC TGCGCAGCGC
201 GTCGCGGGCT CGACGGCCGC ACCCGTGGGT TCCTCGGTCT GCCGGTCCGG CAGCGCCCGA GCTGCCGGCG TGGGCACCCA AGGAGCCAGA CGGCCAGGCC
251 TTCCACGACG GGCTGGCGCT GCGGCACGAT CGCGGCCTAC AACAGCTCGG AAGGTGCTGC CCGACCGCGA CGCCGTGCTA GCGCCGGATG TTGTCGAGCC
301 TGACGTACCC GCAGGGGACC GTCTCCGGGC TCATCCGCAC CAACGTGTGC ACTGCATGGG CGTCCCCTGG CAGAGGCCCG AGTAGGCGTG GTTGCACACG
351 GCCGAGCCGG GCGACTCGGG CGGCTCGCTC CTCGCGGGCA ACCAGGCACA CGGCTCGGCC CGCTGAGCCC GCCGAGCGAG GAGCGCCCGT TGGTCCGTGT
401 GGGCCTGACG TCGGGCGGGT CGGGCAACTG CTCGTCGGGC GGCACGACGT CCCGGACTGC AGCCCGCCCA GCCCGTTGAC GAGCAGCCCG CCGTGCTGCA
451 ACTTCCAGCC CGTCAACGAG GCGCTCTCGG CCTACGGCCT CACGCTCGTG TGAAGGTCGG GCAGTTGCTC CGCGAGAGCC GGATGCCGGA GTGCGAGCAC
501 ACCTCCGGCG GCAGGGGCAA CTGC (SEQ ID NO:69) TGGAGGCCGC CGTCCCCGTT GACG
Oerskovia jenensis (DSM 46000)
1 RCSVGFAVNG GFVTAGHCGT VGTRTSGPGG TFRGSSFPGN DYAWVQVDAG
51 NTPVGAVNNY SGGRVAVAGS TAAPVGSSVC RSGSTTGWRC GTIAAYNSSV
101 TYPQGTVSGL IRTNVCAEPG DSGGSLLAGN QAQGLTSGGS GNCSSGGTTY
151 FQPVNEALSA YGLTLVTSGG RGNC (SEQ ID NO:70)
Cellulosimicrobium cellulans (DSM 20424)
1 CCACGGGCGG CGGGTCGGGC AGCGCGCTCG TCGGGCTCGC GGGCAAGTGC GGTGCCCGCC GCCCAGCCCG TCGCGCGAGC AGCCCGAGCG CCCGTTCACG
51 ATCGACGTCC CCGGGTCCGA CTTCAGTGAC GGCAAGCGCC TCCAGCTGTG TAGCTGCAGG GGCCCAGGCT GAAGTCACTG CCGTTCGCGG AGGTCGACAC
101 GACGTGCAAC GGGTCGCAGG CAGCGCTGGA CGTTCGAAGC CGACGGCACC CTGCACGTTG CCCAGCGTCC GTCGCGACCT GCAAGCTTCG GCTGCCGTGG
151 GTACGCGCGG GCGGCAAGTG CATGGACGTC GCGTGGGCGC CGCGGCCGAC CATGCGCGCC CGCCGTTCAC GTACCTGCAG CGCACCCGCG GCGCCGGCTG
201 GGCACGGCGC TCCAGCTCGC GAACTGCACG GCAACGCGGC CCAGAAGTTC CCGTGCCGCG AGGTCGAGCG CTTGACGTGC CGTTGCGCCG GGTCTTCAAG
251 GTGCTCAACG GCGCGGGCGA CCTCGTGTCG GTGCTGGCGA ACAAAGTGCG CACGAGTTGC CGCGCCCGCT GGAGCACAGC CACGACCGCT TGTTTCACGC
301 TCGACGCCGC CGGGTGCGCA CCGAGGTACT CGCGGCGCCG TACGAGCTCA AGCTGCGGCG GCCCACGCGT GGCTCCATGA GCGCCGCGGC ATGCTCGAGT
351 CGGCGACGTG CGCGGCGGCG ACCGCTACAT CACACGGGAC CCGGGCGCGT GCCGCTGCAC GCGCCGCCGC TGGCGATGTA GTGTGCCCTG GGCCCGCGCA
401 CGTCGGGCTC GGCCTGCTCG ATCGGGTACG CCGTCCAGGG CGGCTTCGTC GCAGCCCGAG CCGGACGAGC TAGCCCATGC GGCAGGTCCC GCCGAAGCAG
451 ACGGCGGGGC ACTGCGGACG CGGCGGGACA AGGAGAGTGC TCACCGCGAG TGCCGCCCCG TGACGCCTGC GCCGCCCTGT TCCTCTCACG AGTGGCGCTC
501 CTGGGCGCGC ATGGGGACGG TCCAGGCGGC GTCGTTCCCC GGCCACGACT GACCCGCGCG TACCCCTGCC AGGTCCGCCG CAGCAAGGGG CCGGTGCTGA
551 ACGCGTGGGT GCGCGTCGAC GCCGGGTTCT CCCCCGTCCC GCGGGTGAAC TGCGCACCCA CGCGCAGCTG CGGCCCAAGA GGGGGCAGGG CGCCCACTTG
601 AACTACGCCG GCGGCACCGT CGACGTCGCC GGCTCGGCCG AGGCGCCCGT TTGATGCGGC CGCCGTGGCA GCTGCAGCGG CCGAGCCGGC TCCGCGGGCA
651 GGGTGCGTCG GTGTGCCGCT CGGGCGCCAC GACCGGCTGG CGCTGCGGCG CCCACGCAGC CACACGGCGA GCCCGCGGTG CTGGCCGACC GCGACGCCGC
701 TCATCGAGCA GAAGAACATC ACCGTCAACT ACGGCAACGG CGACGTTCCC AGTAGCTCGT CTTCTTGTAG TGGCAGTTGA TGCCGTTGCC GCTGCAAGGG
751 GGCCTCGTGC GCGGCAGCGC GTGCGCGGAG GGCGGCGACT CGGGCGGGTC CCGGAGCACG CGCCGTCGCG CACGCGCCTC CCGCCGCTGA GCCCGCCCAG
801 GGTGATCTCC GGCAACCAGG CGCAGGGCGT CACGTCGGGC AGGATCAACG CCACTAGAGG CCGTTGGTCC GCGTCCCGCA GTGCAGCCCG TCCTAGTTGC
851 ACTGCTCGAA CGGCGGCAAG TTCCTCTACC AGCCCGATCG ACGGCCTGTC TGACGAGCTT GCCGCCGTTC AAGGAGATGG TCGGGCTAGC TGCCGGACAG
901 GCTCGTGACC ACGGGCGGCG GGTCGGGCAG CGCGCTCGTC GGGCTCGCGG CGAGCACTGG TGCCCGCCGC CCAGCCCGTC GCGCGAGCAG CCCGAGCGCC
951 GCAAGTGCAT CGACGTCCCC GGGTCCGACT TCAG (SEQ ID NO:71) CGTTCACGTA GCTGCAGGGG CCCAGGCTGA AGTC
Cellulosimicrobium cellulans (DSM 20424)
1 PRAAGRAARS SGSRASASTS PGPTSVTASA SSCGRATGRR QRWTFEADGT
51 VRAGGKCMDV AWAPRPTARR SSSRTARQRG PEVRAQRRGR PRVGAGEQSA
101 STPPGAHRGT RGAVRAHGDV RGGDRYITRD PGASSGSACS IGYAVQGGFV
151 TAGHCGRGGT RRVLTASWAR MGTVQAASFP GHDYAWVRVD AGFSPVPRVN
201 NYAGGTVDVA GSAEAPVGAS VCRSGATTGW RCGVIEQKNI TVNYGNGDVP
251 GLVRGSACAE GGDSGGSVIS GNQAQGVTSG RINDCSNGGK FLYQPDRRPV
301 ARDHGRRVGQ RARRARGQVH RRPRVRLQ (SEQ ID NO:72)
Promicromonospora citrea (DSM 43110)
1 TTCCCCGGCA ACGACTACGC GTGGGTGAAC ACGGGCACGG ACGACACCCT AAGGGGCCGT TGCTGATGCG CACCCACTTG TGCCCGTGCC TGCTGTGGGA
51 CGTCGGCGCC GTGAACAACT ACAGCGGCGG CACGGTCAAC GTCGCGGGCT GCAGCCGCGG CACTTGTTGA TGTCGCCGCC GTGCCAGTTG CAGCGCCCGA
101 CGACCCGTGC CGCCGTCGGC GCGACGGTCT GCCGCTCGGG CTCCACGACC GCTGGGCACG GCGGCAGCCG CGCTGCCAGA CGGCGAGCCC GAGGTGCTGG
151 GGCTGGCACT GCGGCACCAT CCAGGCGCTG AACGCGTCGG TCACCTACGC CCGACCGTGA CGCCGTGGTA GGTCCGCGAC TTGCGCAGCC AGTGGATGCG
201 CGAGGGCACC GTGAGCGGCC TCATCCGCAC CAACGTGTGC GCCGAGCCCG GCTCCCGTGG CACTCGCCGG AGTAGGCGTG GTTGCACACG CGGCTCGGGC
251 GCGACTC (SEQ ID NO:73) CGCTGAG
Promicromonospora citrea (DSM 43110)
1 FPGNDYAWVN TGTDDTLVGA VNNYSGGTVN VAGSTRAAVG ATVCRSGSTT
51 GWHCGTIQAL NASVTYAEGT VSGLIRTNVC AEPGD (SEQ ID NO:74)
Promicromonospora sukumoe (DSM 44121)
1 TTCCCCGGCA ACGACTACGC GTGGGTGAAC GTCGGCTCCG ACGACACCCC AAGGGGCCGT TGCTGATGCG CACCCACTTG CAGCCGAGGC TGCTGTGGGG
51 GATCGGTGCG GTCAACAACT ACAGCGGCGG CACCGTGAAC GTCGCGGGCT CTAGCCACGC CAGTTGTTGA TGTCGCCGCC GTGGCACTTG CAGCGCCCGA
101 CGACCCAGGC CGCCGTCGGC TCCACCGTCT GCCGCTCCGG TTCCACGACC GCTGGGTCCG GCGGCAGCCG AGGTGGCAGA CGGCGAGGCC AAGGTGCTGG
151 GGCTGGCACT GCGGCACCAT CCAGGCCTTC AACGCGTCGG TCACCTACGC CCGACCGTGA CGCCGTGGTA GGTCCGGAAG TTGCGCAGCC AGTGGATGCG
201 CGAGGGCACC GTGTCCGGCC TGATCCGCAC CAACGTCTGC GCCGAGCCCG GCTCCCGTGG CACAGGCCGG ACTAGGCGTG GTTGCAGACG CGGCTCGGGC
251 GCGACTC (SEQ ID NO:75) CGCTGAG
Promicromonospora sukumoe (DSM 44121)
1 FPGNDYAWVN VGSDDTPXGA VNNYSGGTVN VAGSTQAAVG STVCRSGSTT
51 GWHCGTIQAF NASVTYAEGT VSGLIRTNVC AEPGD (SEQ ID NO:76)
Xylanibacterium ulmi (LMG21721)
1 GCCGCTGCTC GATCGGGTTC GCCGTGACGG GCGGCTTCGT GACCGCCGGC CGGCGACGAG CTAGCCCAAG CGGCACTGCC CGCCGAAGCA CTGGCGGCCG
51 CACTGCGGAC GGTCCGGCGC GACGACGACG TCCGCGAGCG GCACGTTCGC GTGACGCCTG CCAGGCCGCG CTGCTGCTGC AGGCGCTCGC CGTGCAAGCG
101 CGGGTCCAGC TTTCCCGGCA ACGACTACGC CTGGGTCCGC GCGGCCTCGG GCCCAGGTCG AAAGGGCCGT TGCTGATGCG GACCCAGGCG CGCCGGAGCC

151 GAACACGCCG GTCGGTGCGG TGAACCGCTA CGACGGCAGC CGGGTGACCG CTTGTGCGGC CAGCCACGCC ACTTGGCGAT GCTGCCGTCG GCCCACTGGC
201 TGGCCGGGTC CACCGACGCG GCCGTCGGTG CCGCGGTCTG CCGGTCGGGG ACCGGCCCAG GTGGCTGCGC CGGCAGCCAC GGCGCCAGAC GGCCAGCCCC
251 TCGACGACCG CGTGGCGCTG CGGCACGATC CAGTCCCGCG GCGCGACGGT AGCTGCTGGC GCACCGCGAC GCCGTGCTAG GTCAGGGCGC CGCGCTGCCA
301 CACGTACGCC CAGGGCACCG TCAGCGGGCT CATCCGCACC AACGTGTGCG GTGCATGCGG GTCCCGTGGC AGTCGCCCGA GTAGGCGTGG TTGCACACGC
351 CCGAGCCGGG TGACTCCGGG GGGTCGCTGA TCGCGGGCAC CCAGGCGCAG GGCTCGGCCC ACTGAGGCCC CCCAGCGACT AGCGCCCGTG GGTCCGCGTC
401 GGCGTGACGT CCGGCGGCTC CGGCAACTGC (SEQ ID NO:77) CCGCACTGCA GGCCGCCGAG GCCGTTGACG
Xylanibacterium ulmi (LMG 21721)
1 RCSIGFAVTG GFVTAGHCGR SGATTTSASG TFAGSSFPGN DYAWVRAASG
51 NTPVGAVNRY DGSRVTVAGS TDAAVGAAVC RSGSTTAWRC GTIQSRGATV
101 TYAQGTVSGL IRTNVCAEPG DSGGSLIAGT QAQGVTSGGS G (SEQ ID NO:78)
Inverse PCR
Inverse PCR was used to determine the full-length serine protease genes from chromosomal DNA of bacterial strains of the suborder Micrococcineae shown by PCR or immunoblotting to be novel homologues of the new Cellulomonas sp. 69B4 protease described herein.
Digested DNA was purified using the PCR purification kit (Qiagen, Catalogue # 28106), and self-ligated with T4 DNA ligase (Invitrogen) according to the manufacturers' instructions. Ligation mixtures were purified with the PCR purification kit (Qiagen) and a PCR was performed with primers selected from the following list:
RV-1 Rest 5' - ACCCACGCGTAGTCGTTGCC - 3' (SEQ ID NO:79)
RV-1 Cellul 5' - ACCCACGCGTAGTCGTKGCCGGGG - 3' (SEQ ID NO:80)
RV-2 biaz-fimi 5' - TCGTCGTGGTCGCGCCGG - 3' (SEQ ID NO:81)
RV-2 cella-flavi 5' - CGACGTGCTCGCGCCCG - 3' (SEQ ID NO:82)
RV-2 cellul 5' - CGCGCCCAGCTCGCGGTG - 3' (SEQ ID NO:83)
RV-2 turb 5' - CGGCCCCGAGGTGCGGGTGCCG - 3' (SEQ ID NO:84)
Fw-1 biaz-fimi 5' - CAGCGTCTCCGGCCTCATCCGC - 3' (SEQ ID NO:85)
Fw-1 cella-flavi 5' - CTCGGTCTCGGGCCTCATCCGC - 3' (SEQ ID NO:86)
Fw-1 cellul 5' - CGACGTTCCCGGCCTCGTGCGC - 3' (SEQ ID NO:87)
Fw-1 turb 5' - CACCGTCTCGGGGCTCATCCGC - 3' (SEQ ID NO:88)
Fw-2 rest 5' - AGCARCGTGTGCGCCGAGCC - 3' (SEQ ID NO:89)
Fw-2 cellul 5' - GGCAGCGCGTGCGCGGAGGG - 3' (SEQ ID NO:90)
Fw-1 gelida 5' - GCCGCTGCTCGATCGGGTTC - 3' (SEQ ID NO:91)
Rv-1 gelida 5' - GCAGTTGCCGGAGCCGCCGGACGT - 3' (SEQ ID NO:92)
The amplified PCR products were examined by agarose gel electrophoresis (0.8% agarose in TBE buffer (Invitrogen)). Distinct bands in the range 1.3 - 2.2 kbp for each organism were excised from the gel and purified using the Qiagen gel extraction kit, and the sequence was analyzed by BaseClear. Sequence analysis revealed that these DNA fragments covered some additional parts of protease genes homologous to the Cellulomonas 69B4 protease gene.
Genome Walking Using Rapid Amplification of Genomic Ends (RAGE)
A genome walking methodology (RAGE) known in the art was used to determine the full-length serine protease genes from chromosomal DNA of bacterial strains of the suborder Micrococcineae shown by PCR or immunoblotting to be novel homologues of the new Cellulomonas sp. 69B4 protease. RAGE was performed using the Universal GenomeWalker™ Kit (BD Biosciences Clontech), with some modifications to the manufacturer's protocol (BD Biosciences user manual PT3042-1, Version # PR03300). The modifications included the addition of DMSO (3 uL) to the reaction mixture (50 uL total volume), due to the high GC content of the template DNA, and the use of Advantage™-GC Genomic Polymerase Mix (BD Biosciences Clontech) for the PCR reactions, which were performed as follows:
Step                                PCR 1        PCR 2
99°C - 0.05 sec
94°C - 0.25 sec/72°C - 3.00 min     7 cycles     4 cycles
94°C - 0.25 sec/67°C - 4.00 min     39 cycles    24 cycles
67°C - 7.00 min
15°C - 1.00 min
PCR was performed with primers (Invitrogen, Paisley, UK) selected from the following list (listed in 5' to 3' orientation);
RV-1 Rest ACCCACGCGTAGTCGTTGCC (SEQ ID NO:79)
RV-1 Cellul ACCCACGCGTAGTCGTKGCCGGGG (SEQ ID NO:80)
RV-2 biaz-fimi TCGTCGTGGTCGCGCCGG (SEQ ID NO:81)
RV-2 cella-flavi CGACGTGCTCGCGCCCG (SEQ ID NO:82)
RV-2 cellul CGCGCCCAGCTCGCGGTG (SEQ ID NO:83)
RV-2 turb CGGCCCCGAGGTGCGGGTGCCG (SEQ ID NO:84)
Fw-1 biaz-fimi CAGCGTCTCCGGCCTCATCCGC (SEQ ID NO:85)
Fw-1 cella-flavi CTCGGTCTCGGGCCTCATCCGC (SEQ ID NO:86)
Fw-1 cellul CGACGTTCCCGGCCTCGTGCGC (SEQ ID NO:87)
Fw-1 turb CACCGTCTCGGGGCTCATCCGC (SEQ ID NO:88)
Fw-2 rest AGCARCGTGTGCGCCGAGCC (SEQ ID NO:89)
Fw-2 cellul GGCAGCGCGTGCGCGGAGGG (SEQ ID NO:90)
Fw-1 gelida GCCGCTGCTCGATCGGGTTC (SEQ ID NO:91)
Rv-1 gelida GCAGTTGCCGGAGCCGCCGGACGT (SEQ ID NO:92)
Flavi FW1 TGCGCCGAGCCCGGCGACTCCGGC (SEQ ID NO:93)
Flavi FW2 GGCACGACGTACTTCCAGCCCGTGAAC (SEQ ID NO:94)
Flavi RV1 GACCCACGCGTAGTCGTTGCCGGGGAACGACGA (SEQ ID NO:95)
Flavi RV2 GAAGGTCCCCGACGGTGACGACGTGCTCGCGCC (SEQ ID NO:96)
Turb FW1 CAGGCGCAGGGCGTGACCTCGGGCGGGTCG (SEQ ID NO:97)
Turb FW2 GGCGGGACGACGTACTTCCAGCCCGTCAA (SEQ ID NO:98)
Cellu RV1 CACCCACGCGTAGTCGTGGCCGGGGAACGA (SEQ ID NO:99)
Cellu RV2 GAAGCCGCCCTGGACGGCGTACCCGATCGAGCA (SEQ ID NO:100)
Cellu FW1 TGCGCGGAGGGCGGCGACTCGGGCGGGTCG (SEQ ID NO:101)
Cellu FW2 TTCCTCTACCAGCCCGTCAACCCGATCCTA (SEQ ID NO:102)
Cella RV2 CGCCGCGGGGACGAACCCGCCCTCGACCGCGAA (SEQ ID NO:103)
Cella RV1 CGCGTAGTCGTTGCCGGGGAACGACGAGCC (SEQ ID NO:104)
Cella FW1 GGCCTCATCCGCACGAGGGTGTGCGCCGAG (SEQ ID NO:105)
Cella FW2 ACGTCGGGCGGGTCCGGCAACTGCCGCTACGGGGGC (SEQ ID NO:106)
Gelida RV1 GAGCCCGTACACCCGGAGGGCCTCGTTGACGGGCTGGAA (SEQ ID NO:107)
Gelida RV2 CGTCACGCCCTGCGCCTGGTTGCCCGCGAG (SEQ ID NO:108)
Gelida FW1 TCCAGCCCGTCAACGAGGCCCTCCGGGTGTACGGGCTC (SEQ ID NO:109)
Gelida FW2 ACGTCGGTCGCGCAGCCGAACGGTTCGTACGTC (SEQ ID NO:110)
Biazot RV1 CGTGGTCGCGCCGGTCGTGCCGCAGTGCCC (SEQ ID NO:111)
Biazot RV2 GACGACGACCGTGTTGGTAGTGACGTCGACGTACCA (SEQ ID NO:112)
Biazot FW1 TCCACCACGGGGTGGCGCTGCGGGACGATC (SEQ ID NO:113)
Biazot FW2 GTGTGCGCCGAGCCCGGCGACTCCGGCGGC (SEQ ID NO:114)
Turb RV C-mature GCTCGGGCCCCCACCGTCAGAGGTCACGAGCGTGAG (SEQ ID NO:115)
Turb FW signal ATGGCACGATCATTCTGGAGGACGCTCGCCACGGCG (SEQ ID NO:116)
Cellu internal FW TGCTCGATCGGGTACGCCGTCCAGGGCGGCTTC (SEQ ID NO:117)
Cellu internal RV TAGGATCGGGTTGACGGGCTGGTAGAGGAA (SEQ ID NO:118)
Biazot Int Fw TGGTACGTCGACGTCACTACCAACACGGTCGTCGTC (SEQ ID NO:119)
Biazot Int Rv 5' - GCCGCCGGAGTCGCCGGGCTCGGCGCACAC (SEQ ID NO:120)
flavi Nterm 5' - GTSGACGTSATCGGSGGSAACGCSTACTAC (SEQ ID NO: 121)
flavi Cterm 5' - SGCSGTSGCSGGNGANGA (SEQ ID NO:122)
fimi Nterm 5' - GTSGAYGTSATCGGCGGCGAYGCSTAC (SEQ ID NO:123)
fimi Cterm 5' - SGASGCGTANCCCTGNCC (SEQ ID NO:124)
The PCR products were subcloned in the pCR4-TOPO TA cloning vector (Invitrogen) and transformed to E. coli Top10 one-shot electrocompetent cells (Invitrogen). The transformants were incubated (37°C, 260 rpm, 16 hours) in 2xTY medium with 100 µg/ml ampicillin. The isolated plasmid DNA (isolated using the Qiagen Qiaprep pDNA isolation kit) was sequenced by BaseClear.
Sequence Analysis
Full length polynucleotide sequences were assembled from PCR fragment sequences using the ContigExpress and AlignX programs in Vector NTI suite v. 9.0.0 (Invitrogen), using the original polynucleotide sequence obtained in Example 4 as template and the ASP mature protease and ASP full-length sequence for alignment. The results for the polynucleotide sequences are displayed in Table 7-1 and the translated amino acid sequences are displayed in Table 7-2. For each of the natural bacterial strains, the polynucleotide sequences and translated amino acid sequences for each of the homologous proteases are provided above.
Table 7-1 provides comparison information between ASP protease and various other sequences obtained from other bacterial strains. Amino acid sequence information for Asp-mature-protease homologues is available from 13 species:
1. Cellulomonas biazotea DSM 20112
2. Cellulomonas flavigena DSM 20109
3. Cellulomonas fimi DSM 20113
4. Cellulomonas cellasea DSM 20118
5. Cellulomonas gelida DSM 20111
6. Cellulomonas iranensis DSM 14784
7. Cellulomonas xylanilytica LMG 21723
8. Oerskovia jenensis DSM 46000
9. Oerskovia turbata DSM 20577
10. Cellulosimicrobium cellulans DSM 20424
11. Promicromonospora citrea DSM 43110
12. Promicromonospora sukumoe DSM 44121
13. Xylanibacterium ulmi LMG 21721
Notably, the sequence from Cellulomonas gelida, at 48 amino acids, is too short for useful consensus alignment. Sequence alignments against Asp-mature for the remaining 12 species are provided herein. To date, complete mature sequence has been determined for Oerskovia turbata, Cellulomonas cellasea, Cellulomonas biazotea and Cellulosimicrobium cellulans. However, there are some problems, and sequence fidelity is not guaranteed for the sequence information known to the public. The Cellulomonas cellasea protease is clearly homologous to Asp (61.4% identity). However, the sequencing of 10 independent PCR fragments of the C-terminal region all gave a stop codon at position 184, suggesting that there is no C-terminal prosequence. In addition, Cellulosimicrobium cellulans is a close relative of Cellulomonas and clearly has an Asp homologous protease. However, the sequence identity is low, only 47.7%. It contains an insertion of 4 amino acids at position 43-44 and it is uncertain where the N-terminus of the protein begins. Nonetheless, the data provided here clearly show that there are enzymes homologous to the ASP protease described herein. Thus, it is intended that the present invention encompass the ASP protease isolated from Cellulomonas strain 69B4, as well as other homologous proteases.
In this Table, the nucleotide numbering is based on the full-length gene of 69B4 protease (SEQ ID NO:2), where nt 1 - 84 encode the signal peptide, nt 85 - 594 encode the N-terminal prosequence, nt 595 - 1161 encode the mature 69B4 protease, and nt 1162 - 1485 encode the C-terminal prosequence.
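The nucleotide coordinates stated above can be checked for internal consistency with a short script. The following is a minimal sketch in Python (the names REGIONS, region_lengths and region_of are illustrative and not part of the specification); it only uses the boundaries given in the preceding paragraph and assumes 1-based, inclusive numbering. Whether the last segment includes a stop codon is not stated here, so the codon count is not equated with a residue count.

```python
# Segment boundaries (1-based, inclusive) as stated in the text for SEQ ID NO:2.
REGIONS = [
    ("signal peptide", 1, 84),
    ("N-terminal prosequence", 85, 594),
    ("mature protease", 595, 1161),
    ("C-terminal prosequence", 1162, 1485),
]


def region_lengths(regions=REGIONS):
    """Return {name: (nucleotides, codons)}; raise if a segment is out of frame."""
    lengths = {}
    for name, start, end in regions:
        nt = end - start + 1
        if nt % 3 != 0:
            raise ValueError(f"{name} is not a whole number of codons ({nt} nt)")
        lengths[name] = (nt, nt // 3)
    return lengths


def region_of(position, regions=REGIONS):
    """Return the name of the segment containing a 1-based nucleotide position."""
    for name, start, end in regions:
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} lies outside the annotated gene")


if __name__ == "__main__":
    for name, (nt, codons) in region_lengths().items():
        print(f"{name}: {nt} nt, {codons} codons")
```

Under these assumptions every segment is a whole number of codons, which is what one would expect if the four segments are read in a single frame.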

The following Table (Table 7-2) provides information regarding the translated amino acid sequence data in natural isolate strains compared with full-length ASP.

These results clearly show that bacterial strains of the suborder Micrococcineae, including the families Cellulomonadaceae and Promicromonosporaceae, possess genes that are homologous with the 69B4 protease. Over the region of the mature 69B4 protease, the gene sequence identities range from about 60%-80%. The amino acid sequences of these homologous sequences exhibit about 45%-80% identity with the mature 69B4 protease protein. In contrast to the majority of streptogrisin proteases derived from members of the suborder Streptomycineae, these 69B4 (Asp) protease homologues from the suborder Micrococcineae possess six cysteine residues, which form three disulfide bridges in the mature 69B4 protease protein.
Indeed, in spite of the incomplete sequences provided herein and questions regarding fidelity, the present invention provides essential elements of the Asp group of proteases and comparisons with streptogrisins. Asp is uniquely characterized, along with Streptogrisin C, as having 3 disulfide bridges. In the following sequence, the Asp amino acids are printed in bold and the fully conserved residues are underlined. The active site residues are marked with # and double underlined. The cysteine residues are marked with * and underlined. The disulfide bonds are located between C17 and C38, C95 and
homologues were subjected to mass spectrometry-based protein sequencing procedures, which consisted of these major steps: micropurification, gel electrophoresis, in-gel proteolytic digestion, capillary liquid chromatography electrospray tandem mass spectrometry (nanoLC-ESI-MS/MS), database searching of the mass spectrometric data, and de novo sequencing. Details of these steps are described in what follows. As described previously in Example 6, concentrated culture sample (about 200 ml) was added to 500 ml 1M CaCl2 and centrifuged at 14,000 rpm (model 5415C Eppendorf) for 5 min. The supernatant was cooled on ice and acidified with 200 ml 1N HCl. After 5 min, 200 ml 50% trichloroacetic acid were added and the sample was centrifuged for 4 min at 14,000 rpm (model 5415C Eppendorf). The supernatant was discarded and the pellet was washed first with water and then with 90% acetone. The pellet, after being dried in the speed vac, was dissolved in 2X Protein Preparation (Tris-Glycine Sample Buffer; Novex) buffer and diluted 1 + 1 with water before being applied to the SDS-PAGE gel. SDS-PAGE was run with NuPAGE MES SDS Running Buffer. The SDS-PAGE gel (1 mm NuPAGE 10% Bis-Tris; Novex) was developed and stained using standard protocols known in the art. Following SDS-PAGE, bands corresponding to ASP homologues were excised and processed for mass spectrometric peptide sequencing using standard protocols in the art.
Peptide mapping and sequencing was performed using capillary liquid chromatography electrospray tandem mass spectrometry (nanoLC-ESI-MS/MS). This analysis system consisted of a capillary HPLC system (model CapLC; Waters) and a mass spectrometer (model Qtof Ultima API; Waters). Peptides were loaded on a pre-column (PepMap100 C18, 5 um, 100 Å, 300 um ID x 1 mm; Dionex) and chromatographed on capillary columns (Biobasic C18 75 um x 10 cm; New Objectives) using a gradient from 0 to 100% solvent B in 45 min at a flow rate of 200 nL/min (generated using a static split from a pump flow rate of 5 uL/min). Solvent A consisted of 0.1% formic acid in water, and solvent B was 0.1% formic acid in acetonitrile. The mass spectrometer was operated with the following parameters: spray voltage of 3.1 kV, desolvation zone at 150°C, mass spectra acquired from 400 to 1900 m/z, resolution of 6000 in v-mode. Tandem MS spectra were acquired in data dependent mode with the two most intense peaks selected and fragmented with mass dependent collision energy (as specified by vendor) and collision gas (argon) at 2.5 x 10^-5 torr.
The identities of the peptides were determined using a database search program (Mascot, Matrix Science) with a database containing ASP homologue DNA-obtained sequences. Database searches were performed with the following parameters: no enzyme selected, peptide error of 2.5 Da, MS/MS ions error of 0.1 Da, and variable modification of carboxyaminomethyl cysteine. For unmatched MS/MS spectra, manual de novo sequence assignments were performed. For example, Figure 4 shows the sequence of the N-terminal most tryptic peptide from C. flavigena determined from this tandem mass spectrum. In Table 8-1, the percentage of the sequence verified on the protein level for various homologues is reported, along with N-terminal and C-terminal peptide sequences.



Table 8-1. Mass Spec. Sequencing of ASP Homologues

ASP Homologue             Sequence Verified (%)              N-terminal and C-terminal Sequences
                          (Trypsin, Chymotrypsin Digests)    (Peptide Mass in Da)

Cellulomonas cellasea     81, 81                             [IY]AWDAFAENVVDWSSR (SEQ ID NO:126) (2026.7)
                                                             YGGTTYFQPVNEILQAY (SEQ ID NO:127) (1961.8)

Cellulomonas flavigena    70, 50                             VDVI/LGGNAYYI/L[...]R (SEQ ID NO:128) (1697.7)

Cellulomonas fimi         21, ND                             VDVI/LGGDAY[...]R (SEQ ID NO:129) (1697.6)

Notes:
ND: not determined
sequence not determined indicated by [...]
sequence order not determined indicated by [ ]
isobaric residues not distinguished indicated by I/L
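The note on isobaric residues reflects the fact that isoleucine and leucine have the same elemental composition and therefore the same residue mass, so a mass measurement alone cannot tell them apart. The following is a minimal illustrative sketch in Python. The monoisotopic residue masses are standard reference values supplied here as an assumption (they do not come from this specification), and the function name peptide_mass is hypothetical. The sketch does not attempt to reproduce the peptide masses reported in Table 8-1.

```python
# Standard monoisotopic residue masses in Da (reference values, not taken from the text).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added for the free N- and C-termini


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified linear peptide given in one-letter code."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


if __name__ == "__main__":
    # Two candidate readings of an ambiguous I/L position give identical masses.
    for candidate in ("VDVIGG", "VDVLGG"):
        print(candidate, round(peptide_mass(candidate), 4))
```

Because the two candidates differ only at an I/L position, their computed masses are identical, which is why such positions are reported as I/L rather than resolved.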

EXAMPLE 9 Protease Production in Streptomyces lividans
This Example describes experiments conducted to develop methods for production of protease by S. lividans. Thus, a plasmid comprising a polynucleotide encoding a polypeptide having proteolytic activity was constructed and used to transform Streptomyces lividans host cells. The methods used for this transformation are more fully described in US Patent No. 6,287,839 and WO 02/50245, both of which are herein expressly incorporated by reference.
One plasmid developed during these experiments was designated as "pSEG69B4T." The construction of this plasmid made use of one pSEGCT plasmid vector (See, WO 02/50245). A glucose isomerase ("GI") promoter operably linked to the structural gene encoding the 69B4 protease was used to drive the expression of the protease. A fusion between the GI-promoter and the 69B4 signal-sequence, N-terminal prosequence and mature sequence was constructed by fusion-PCR techniques as a XbaI-BamHI fragment. The fragment was ligated into plasmid pSEGCT digested with XbaI and BamHI, resulting in plasmid pSEG69B4T (See, Figure 6). Although the present Specification provides specific expression vectors, it is contemplated that additional vectors utilizing different promoters and/or signal sequences combined with various prosequences of the 69B4 protease will find use in the present invention.
An additional plasmid developed during the experiments was designated as "pSEA469B4CT" (See, Figure 7). As with the pSEG69B4T plasmid, one pSEGCT plasmid vector was used to construct this plasmid. To create the pSEA469B4CT, the Aspergillus niger (regulatory sequence) ("A4") promoter was operably linked to the structural gene encoding the 69B4 protease, and used to drive the expression of the protease. A fusion between the A4-promoter and the Cel A (from Streptomyces coelicolor) signal-sequence, the asp-N-terminal prosequence and the asp mature sequence was constructed by fusion-PCR techniques, as a XbaI-BamHI fragment. The fragment was ligated into plasmid pSEMGCT digested with XbaI and BamHI, resulting in plasmid pSEA469B4CT (See, Figure 7). The sequence of the A4 (A. niger) promoter region is:
1 TCGAA CTTCAT GTTCGA GTTCTT GTTCAC GTAGAA GCCGGA GATGTG AGAGGT
AGCTT GAAGTA CAAGCT CAAGAA CAAGTG CATCTT CGGCCT CTACAC TCTCCA
61 GATCTG GAACTG CTCACC CTCGTT GGTGGT GACCTG GAGGTA AAGCAA GTGACC CTTCTG
CTAGAC CTTGAC GAGTGG GAGCAA CCACCA CTGGAC CTCCAT TTCGTT CACTGG GAAGAC
121 GCGGAG GTGGTA AGGAAC GGGGTT CCACGG GGAGAG AGAGAT GGCCTT GACGGT CTTGGG
CGCCTC CACCAT TCCTTG CCCCAA GGTGCC CCTCTC TCTCTA CCGGAA CTGCCA GAACCC
181 AAGGGG AGCTTC NGCGCG GGGGAG GATGGT CTTGAG AGAGGG GGAGCT AGTAAT GTCGTA
TTCCCC TCGAAG NCGCGC CCCCTC CTACCA GAACTC TCTCCC CCTCGA TCATTA CAGCAT
241 CTTGGA CAGGGA GTGCTC CTTCTC CGACGC ATCAGC CACCTC AGCGGA GATGGC ATCGTG
GAACCT GTCCCT CACGAG GAAGAG GCTGCG TAGTCG GTGGAG TCGCCT CTACCG TAGCAC
301 CAGAGA CAGACC
GTCTCT GTCTGG (SEQ ID NO:130)

In these experiments, the host Streptomyces lividans TK23 was transformed with either of the vectors described above using protoplast methods known in the art (See e.g., Hopwood et al., Genetic Manipulation of Streptomyces, A Laboratory Manual. The John Innes Foundation, Norwich, United Kingdom [1985]).
The transformed culture was expanded to provide two fermentation cultures. At various time points, samples of the fermentation broths were removed for analysis. For the purposes of this experiment, a skimmed milk procedure was used to confirm successful cloning. In these methods, 30 ul of the shake flask supernatant was spotted in punched out holes in skim milk agar plates and incubated at 37°C. The incubated plates were visually reviewed after overnight incubation for the presence of halos. For purposes of this experiment, the same samples were also assayed for protease activity and for molecular weight (SDS-PAGE). At the end of the fermentation run, full length protease was observed by SDS-PAGE.
A sample of the fermentation broth was assayed as follows: 10 ul of the diluted supernatant was taken and added to 190 ul AAPF substrate solution (conc. 1 mg/ml, in 0.1 M Tris/0.005% TWEEN, pH 8.6). The rate of increase in absorbance at 410 nm due to release of p-nitroaniline was monitored (25°C). The assay results of the fermentation broth of 3 clones (X, Y, W) obtained using the pSEG69B4T and two clones using the pSEA469B4T indicated that Asp was expressed by both constructs.
Table XXI. Results for Two Clones (pSEA469B4T).
Indeed, the results obtained in these experiments showed that the polynucleotide encoding a polypeptide having proteolytic activity was expressed in Streptomyces lividans, using both of these expression vectors. Although two vectors are described in this Example, it is contemplated that additional expression vectors using different promoters and/or signal sequences combined with different combinations of 69B4 protease: + / - N terminal and C terminal prosequence in the pSEA4CT backbone (vector), as well as other constructs, will find use in the present invention.
EXAMPLE 10 Protease Production in B. subtilis
In this Example, experiments conducted to produce protease 69B4 (also referred to herein as "ASP," "Asp," "ASP protease," and "Asp protease") in B. subtilis are described. In this Example, the transformation of plasmid pHPLT-ASP-C1-2 (See, Table 10-1; and Figure 9) into B. subtilis is described. Transformation was performed as known in the art (See e.g., WO 02/14490, incorporated herein by reference). To optimize ASP expression in B. subtilis, a synthetic DNA sequence was produced by DNA2.0 and utilized in these expression experiments. The DNA sequence (synthetic ASP DNA sequence) provided below, with codon usage adapted for Bacillus species, encodes the wild type ASP precursor protein:
ATGACACCACGAACTGTCACAAGAGCTCTGGCTGTGGCAACAGCAGGTGCTACACTCTTGGCTGGGGGTAT
GGCAGCACAAGCTAACGAACCGGCTCCTCCAGGATCTGCATCAGCCCCTCCACGATTAGCTGAAAAACTTGA
CCCTGACTTACTTGAAGCAATGGAACGCGATCTGGGGTTAGATGCAGAGGAAGCAGCTGCAACGTTAGCTTT
TCAGCATGACGCAGCTGAAACGGGAGAGGCTCTTGCTGAGGAACTCGACGAAGATTTCGCGGGCACGTGGG
TTGAAGATGATGTGCTGTATGTTGCAACCACTGATGAAGATGCTGTTGAAGAAGTCGAAGGCGAAGGAGCAA
CTGCTGTGACTGTTGAGCATTCTCTTGCTGATTTAGAGGCGTGGAAGACGGTTTTGGATGCTGCGCTGGAGG
GTCATGATGATGTGCCTACGTGGTACGTCGACGTGCCTACGAATTCGGTAGTCGTTGCTGTAAAGGCAGGAG
CGCAGGATGTAGCTGCAGGACTTGTGGAAGGCGCTGATGTGCCATCAGATGCGGTCACTTTTGTAGAAACG
GACGAAACGCCTAGAACGATGTTCGACGTAATTGGAGGCAACGCATATACTATTGGCGGCCGGTCTAGATG
TTCTATCGGATTCGCAGTAAACGGTGGCTTCATTACTGCCGGTCACTGCGGAAGAACAGGAGCCACTACTG
CCAATCCGACTGGCACATTTGCAGGTAGCTCGTTTCCGGGAAATGATTATGCATTCGTCCGAACAGGGGCA
GGAGTAAATTTGCTTGCCCAAGTCAATAACTACTCGGGCGGCAGAGTCCAAGTAGCAGGACATACGGCCG
CACCAGTTGGATCTGCTGTATGCCGCTCAGGTAGCACTACAGGTTGGCATTGCGGAACTATCACGGCGCT
GAATTCGTCTGTCACGTATCCAGAGGGAACAGTCCGAGGACTTATCCGCACGACGGTTTGTGCCGAACCA
GGTGATAGCGGAGGTAGCCTTTTAGCGGGAAATCAAGCCCAAGGTGTCACGTCAGGTGGTTCTGGAAATT
GTCGGACGGGGGGAACAACATTCTTTCAACCAGTCAACCCGATTTTGCAGGCTTACGGCCTGAGAATGATT
ACGACTGACTCTGGAAGTTCCCCTGCTCCAGCACCTACATCATGTACAGGCTACGCAAGAACGTTCACAGG
AACCCTCGCAGCAGGAAGAGCAGCAGCTCAACGGAACGGTAGCTATGTTCAGGTCAACCGGAGCGGTACAC
ATTCCGTGTGTCTCAATGGACCTAGCGGTGCGGACTTTGATTTGTATGTGCAGCGATGGAATGGCAGTAGCT
GGGTAACCGTCGCTCAATCGACATCGCCGGGAAGCAATGAAACCATTACGTACCGCGGAAATGCTGGATATT
ATCGCTACGTGGTTAACGCTGCGTCAGGATCAGGAGCTTACACAATGGGACTCACCCTCCCCTGA (SEQ ID
NO:131)
In the above sequence, bold indicates the DNA that encodes the mature protease, standard font indicates the leader sequence, and the underline indicates the N-terminal and C-terminal prosequences.
Expression of the Synthetic ASP Gene
Asp expression cassettes were constructed in the pXX-KpnI (See, Figure 15) or p2JM103-DNNDPI (See, Figure 16) vectors and subsequently cloned into the pHPLT vector (See, Figure 17) for expression of ASP in B. subtilis. pXX-KpnI is a pUC based vector with the aprE promoter (B. subtilis) driving expression, a cat gene, and a duplicate aprE promoter for amplification of the copy number in B. subtilis. The bla gene allows selective growth in E. coli. The KpnI site, introduced in the ribosomal binding site downstream of the aprE promoter region, together with the HindIII site, enables cloning of Asp expression cassettes in pXX-KpnI. The vector p2JM103-DNNDPI contains the aprE promoter (B. subtilis) to drive expression of the BCE103 cellulase core (endo-cellulase from an obligatory alkaliphilic Bacillus; See, Shaw et al., J. Mol. Biol., 320:303-309 [2002]), in frame with an acid labile linker (DDNDPI [SEQ ID NO:132]; See, Segalas et al., FEBS Lett., 371:171-175 [1995]). The ASP expression cassette (BamHI and HindIII) was fused to the BCE103-DDNDPI fusion protein. When secreted, ASP is cleaved from the cellulase core to yield the mature protease.
pHPLT (See, Figure 17; and Solingen et al., Extremophiles 5:333-341 [2001]) contains the thermostable amylase LAT promoter (PLAT) of Bacillus licheniformis, followed by XbaI and HpaI restriction sites for cloning ASP expression constructs. The following sequence is that of the BCE103 cellulase core with DNNDPI acid labile linker. In this sequence, the bold indicates the acid-labile linker, while the standard font indicates the BCE103 core.
VR S K KL W I S L L F A L TL IF TM 1 GTGAGA AGCAAA AAATTG TGGATC AGCTTG TTGTTT GCGTTA ACGTTA ATCTTT ACGATG CACTCT TCGTTT TTTAAC ACCTAG TCGAAC AACAAA CGCAAT TGCAAT TAGAAA TGCTAC
AF S N MS AQ AD DY S V VE EH GQ 61 GCGTTC AGCAAC ATGAGC GCGCAG GCTGAT GATTAT TCAGTT GTAGAG GAACAT GGGCAA CGCAAG TCGTTG TACTCG CGCGTC CGACTA CTAATA AGTCAA CATCTC CTTGTA CCCGTT
L S IS NG EL VN ER GE QV QL KG 121 CTAAGT ATTAGT AACGGT GAATTA GTCAAT GAACGA GGCGAA CAAGTT CAGTTA AAAGGG GATTCA TAATCA TTGCCA CTTAAT CAGTTA CTTGCT CCGCTT GTTCAA GTCAAT TTTCCC
MS S H GL QW YG QF VN YE SM KW 181 ATGAGT TCCCAT GGTTTG CAATGG TACGGT CAATTT GTAAAC TATGAA AGCATG AAATGG TACTCA AGGGTA CCAAAC GTTACC ATGCCA GTTAAA CATTTG ATACTT TCGTAC TTTACC
LR DD WG IT VF RA AM YT SS GG 241 CTAAGA GATGAT TGGGGA ATAACT GTATTC CGAGCA GCAATG TATACC TCTTCA GGAGGA GATTCT CTACTA ACCCCT TATTGA CATAAG GCTCGT CGTTAC ATATGG AGAAGT CGTCCT
YI DD PS VK EK VK ET VE AA ID 301 TATATT GACGAT CCATCA GTAAAG GAAAAA GTAAAA GAGACT GTTGAG GCTGCG ATAGAC ATATAA CTGCTA GGTAGT CATTTC CTTTTT CATTTT CTCTGA CAACTC CGACGC TATCTG
LG IY VI ID WH IL SD ND.PN IY 361 CTTGGC ATATAT GTGATC ATTGAT TGGCAT ATCCTT TCAGAC AATGAC CCGAAT ATATAT GAACCG TATATA CACTAG TAACTA ACCGTA TAGGAA AGTCTG TTACTG GGCTTA TATATA
KE EA KD FF DE MS EL YG DY PN 421 AAAGAA GAAGCG AAGGAT TTCTTT GATGAA ATGTCA GAGTTG TATGGA GACTAT CCGAAT TTTCTT CTTCGC TTCCTA AAGAAA CTACTT TACAGT CTCAAC ATACCT CTGATA GGCTTA
VI YE IA NE PN GS DV TW DN QI 481 GTGATA TACGAA ATTGCA AATGAA CCGAAT GGTAGT GATGTT ACGTGG GACAAT CAAATA CACTAT ATGCTT TAACGT TTACTT GGCTTA CCATCA CTACAA TGCACC CTGTTA GTTTAT
KP YA EE VI PV IR DN DP NN IV 541 AAACCG TATGCA GAAGAA GTGATT CCGGTT ATTCGT GACAAT GACCCT AATAAC ATTGTT TTTGGC ATACGT CTTCTT CACTAA GGCCAA TAAGCA CTGTTA CTGGGA TTATTG TAACAA
IV GT GT W S QD VH HA AD NQ LA 601 ATTGTA GGTACA GGTACA TGGAGT CAGGAT GTCCAT CATGCA GCCGAT AATCAG CTTGCA TAACAT CCATGT CCATGT ACCTCA GTCCTA CAGGTA GTACGT CGGCTA TTAGTC GAACGT
DP NV MY AF HF YA GT HG QN LR 661 GATCCT AACGTC ATGTAT GCATTT CATTTT TATGCA GGAACA CATGGA CAAAAT TTACGA CTAGGA TTGCAG TACATA CGTAAA GTAAAA ATACGT CCTTGT GTACCT GTTTTA AATGCT
DQ VD YA LD QG AA IF VS EW GT 721 GACCAA GTAGAT TATGCA TTAGAT CAAGGA GCAGCG ATATTT GTTAGT GAATGG GGGACA CTGGTT CATCTA ATACGT AATCTA GTTCCT CGTCGC TATAAA CAATCA CTTACC CCCTGT
SA AT GD GG VF LD EA QV WI DF 781 AGTGCA GCTACA GGTGAT GGTGGT GTGTTT TTAGAT GAAGCA CAAGTG TGGATT GACTTT TCACGT CGATGT CCACTA CCACCA CACAAA AATCTA CTTCGT GTTCAC ACCTAA CTGAAA
MD ER NL SW AN WS LT HK DE SS 841 ATGGAT GAAAGA AATTTA AGCTGG GCCAAC TGGTCT CTAACG CATAAG GATGAG TCATCT TACCTA CTTTCT TTAAAT TCGACC CGGTTG ACCAGA GATTGC GTATTC CTACTC AGTAGA
AA LM PG AN PT GG WT EA EL SP 901 GCAGCG TTAATG CCAGGT GCAAAT CCAACT GGTGGT TGGACA GAGGCT GAACTA TCTCCA CGTCGC AATTAC GGTCCA CGTTTA GGTTGA CCACCA ACCTGT CTCCGA CTTGAT AGAGGT
SG TF VR EK IR ES AS DN ND PI 961 TCTGGT ACATTT GTGAGG GAAAAA ATAAGA GAATCA GCATCT GACAAC AATGAT CCCATA AGACCA TGTAAA CACTCC CTTTTT TATTCT CTTAGT CGTAGA CTGTTG TTACTA GGGTAT
(DNA; SEQ ID NO:133) and (Amino Acid; SEQ ID NO:134)
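The relationship between the nucleotide and amino acid rows of this listing can be checked by translating a stretch of the coding strand. The following is a minimal sketch in Python using the standard genetic code; the function name translate and the constant LAST_ROW are illustrative only. LAST_ROW is the coding strand printed for nucleotides 961 onward in the listing above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA string (spaces allowed) in reading frame 1."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Coding strand of the final row (nt 961 onward) of the BCE103 listing.
LAST_ROW = "TCTGGT ACATTT GTGAGG GAAAAA ATAAGA GAATCA GCATCT GACAAC AATGAT CCCATA"

if __name__ == "__main__":
    print(translate(LAST_ROW))
```

Translating this row gives a peptide whose last six residues are DNNDPI, which agrees with the linker named in the vector designation p2JM103-DNNDPI.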

The Asp expression cassettes were cloned in the pXX-KpnI vector containing DNA encoding the wild type Asp signal peptide, or a hybrid signal peptide constructed of 5 subtilisin AprE N-terminal signal peptide amino acids fused to the 25 Asp C-terminal signal peptide amino acids (MRSKKRTVTRALAVATAAATLLAGGMAAQA (SEQ ID NO:135)), or a hybrid signal peptide constructed of 11 subtilisin AprE N-terminal signal peptide amino acids fused to the 19 Asp C-terminal signal peptide amino acids (MRSKKLWISLLLAVATAAATLLAGGMAAQA (SEQ ID NO:136)). These expression cassettes were also constructed with the asp C-terminal prosequence encoding DNA in frame. Another expression cassette, for cloning in the p2JM103-DNNDPI vector, encodes the ASP N-terminal pro- and mature sequence.
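The composition of the two hybrid signal peptides can be illustrated by joining an N-terminal stretch of one signal peptide to a C-terminal stretch of the other. The following is a minimal sketch in Python; the function name hybrid_signal and the two constants are illustrative only. The constants are simply read off the hybrid sequences printed above (the first 11 residues of SEQ ID NO:136 and the last 25 residues of SEQ ID NO:135), not taken from an independent source.

```python
# First 11 residues of the AprE-derived part, as printed in SEQ ID NO:136.
APRE_SIGNAL_START = "MRSKKLWISLL"
# Last 25 residues of the Asp-derived part, as printed in SEQ ID NO:135.
ASP_SIGNAL_END = "RTVTRALAVATAAATLLAGGMAAQA"


def hybrid_signal(n_apre, n_asp):
    """Join the first n_apre AprE residues to the last n_asp Asp signal residues."""
    if not 0 < n_apre <= len(APRE_SIGNAL_START):
        raise ValueError("n_apre is outside the available AprE stretch")
    if not 0 < n_asp <= len(ASP_SIGNAL_END):
        raise ValueError("n_asp is outside the available Asp stretch")
    return APRE_SIGNAL_START[:n_apre] + ASP_SIGNAL_END[-n_asp:]


if __name__ == "__main__":
    print(hybrid_signal(5, 25))   # 5 AprE residues + 25 Asp residues
    print(hybrid_signal(11, 19))  # 11 AprE residues + 19 Asp residues
```

Both combinations yield a 30-residue signal peptide, consistent with the two hybrid sequences given in the text.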
The Asp expression cassettes cloned in the pXX-KpnI or p2JM103-DNNDPI vector were transformed into E. coli (Electromax DH10B, Invitrogen, Cat.No. 12033-015). The primers and cloning strategy used are provided in Table 10-1. Subsequently, the expression cassettes were cloned from these vectors and introduced in the pHPLT expression vector for transformation into a B. subtilis (ΔaprE, ΔnprE, oppA, ΔspoIIE, degUHy32, ΔamyE::(xylR,pxylA-comK)) strain. The primers and cloning strategy for ASP expression cassettes cloning in pHPLT are provided in Table 10-2. Transformation to B. subtilis was performed as described in WO 02/14490, incorporated herein by reference. Figures 12-21 provide plasmid maps for various plasmids described herein.
Table 10-1. ASP in pXX-KpnI and p2JM103-DNNDPI

Primers were obtained from MWG and Invitrogen. Invitrogen Platinum Taq DNA polymerase High Fidelity (Cat.No. 11304-029) was used for PCR amplification (0.2 uM primers, 25 to 30 cycles) according to Invitrogen's protocol. Ligase reactions of ASP expression cassettes and host vectors were completed using Invitrogen T4 DNA Ligase (Cat. No. 15224-025), following Invitrogen's protocol as recommended for general cloning of cohesive ends.
Selective growth of B. subtilis (ΔaprE, ΔnprE, oppA, ΔspoIIE, degUHy32, ΔamyE::(xylR,pxylA-comK)) transformants harboring the p2JM103-ASP vector or one of the pHPLT-ASP vectors was performed in shake flasks containing 25 ml Synthetic Maxatase Medium (SMM), with 0.97 g/l CaCl2.6H2O instead of 0.5 g/l CaCl2 (See, U.S. Pat. No. 5,324,653, herein incorporated by reference) with either 25 mg/L chloramphenicol or 20 mg/L neomycin. This growth resulted in the production of secreted ASP protease with proteolytic activity. Gel analysis was performed using NuPage Novex 10% Bis-Tris gels (Invitrogen, Cat.No. NP0301BOX). To prepare samples for analysis, 2 volumes of supernatant were mixed with 1 volume 1M HCl, 1 volume 4xLDS sample buffer (Invitrogen, Cat.No. NP0007), and 1% PMSF (20 mg/ml) and subsequently heated for 10 minutes at 70°C. Then, 25 uL of each sample was loaded onto the gel, together with 10 uL of SeeBlue plus 2 pre-stained protein standards (Invitrogen, Cat.No. LC5925). The results clearly demonstrated that all asp cloning strategies described in this Example yield sufficient amounts of active Asp produced by B. subtilis.
In addition, samples of the same fermentation broths were assayed as follows: 10 ul of the diluted supernatant was taken and added to 190 ul AAPF substrate solution (conc. 1 mg/ml, in 0.1 M Tris/0.005% TWEEN®, pH 8.6). The rate of increase in absorbance at 410 nm due to release of p-nitroaniline was monitored (25°C), as it provides a measure of the ASP concentration produced. These results indicated that all of the constructs resulted in the production of measurable ASP protease.
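The "rate of increase in absorbance" in an assay of this kind is commonly taken as the slope of a straight line fitted to absorbance readings over time. The following is a minimal sketch in Python of such a least-squares slope. The function name absorbance_rate and the example readings are hypothetical and are not data from these experiments, and converting the slope into an enzyme concentration would require a calibration that is not given here.

```python
def absorbance_rate(times_s, a410):
    """Least-squares slope of absorbance at 410 nm versus time (absorbance units per second)."""
    n = len(times_s)
    if n != len(a410) or n < 2:
        raise ValueError("need at least two paired time/absorbance readings")
    mean_t = sum(times_s) / n
    mean_a = sum(a410) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, a410))
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical readings taken every 30 seconds (illustration only).
    times = [0, 30, 60, 90]
    readings = [0.10, 0.13, 0.16, 0.19]
    print(absorbance_rate(times, readings))
```

A fitted slope is less sensitive to noise in any single reading than a two-point difference, which is why it is a common choice when several readings are available.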
The impact of the synthetic asp gene was investigated in Bacillus subtilis by comparing the expression levels of the pHPLT-ASP-c-1-2 construct with the synthetic and native asp gene in a B. subtilis (ΔaprE, ΔnprE, oppA, ΔspoIIE, degUHy32, ΔamyE::(xylR,pxylA-comK)) strain. The native gene was amplified from a plasmid containing the native asp gene, using Platinum Pfx polymerase (Invitrogen) with the following primers:
AK04-12.1: NheI thru RBS
TTATGCGAGGCTAGCAAAAGGAGAGGGTAAAGAGTGAGAAGCAAAAAACG (SEQ ID NO:165)
AK04-11: RBS thru 5 aa aprE for ASP native C1 fusion in pHPLT
taaagagtgagaagcaaaaaacgcacagtcacgcgggccctg (SEQ ID NO:166)
AK04-13: HpaI 3' of native ASP mature
gtcctctgttaacttacgggctgctgcccgagtcc (SEQ ID NO:167)
The following conditions were used for these PCRs: 94°C for 2 min.; followed by 25 cycles of 94°C for 45 sec., 60°C for 30 sec., and 68°C for 2 min. 30 sec.; followed by 68°C for 5 min. The resulting PCR product was run on an E-gel (Invitrogen), excised, and purified with a gel extraction kit (Qiagen). This fragment containing the native ASP was ligated with the pHPLT vector (T4 DNA Ligase, NEB) and transformed directly into B. subtilis (ΔaprE, ΔnprE, oppA, ΔspoIIE, degUHy32, ΔamyE::(xylR,pxylA-comK)). Transformation to B. subtilis was performed as described in WO 02/14490 A2, herein incorporated by reference.
The Asp protein was produced by growth in shake flasks at 37°C in medium containing the following ingredients: 0.03 g/L MgSO4, 0.22 g/L K2HPO4, 21.3 g/L Na2HPO4*7H2O, 6.1 g/L NaH2PO4*H2O, 3.6 g/L Urea, 7 g/L soymeal, 70 g/L Maltrin M150, and 42 g/L glucose, with a final pH of 7.5. In these experiments, the production level of the host carrying the synthetic gene cassette was found to be 3-fold higher than the host carrying the native gene cassette.
In additional experiments, expression of ASP was investigated in Bacillus subtilis using the sacB promoter and aprE signal peptide. The gene was amplified from plasmid containing the synthetic asp gene using TGO polymerase (Roche) and the primers:
CF 520 (+) Fuse ASP (pro) to aprE ss GCAACATGTCTGCGCAGGCTAACGAACCGGCTCCTCCAGGA (SEQ ID NO:168)
CF 525 (-) End of Asp gene HindIII GACATGACATAAGCTTAAGGGGAACTTCCAGAGTC (SEQ ID NO:169)
The sacB promoter (Bacillus subtilis), the start of the messenger RNA (+1) from aprE, and the aprE signal peptide were amplified from the plasmid pJHsacBJ2 using TGO polymerase (Roche) and the primers:
CF 161 (+) EcoRI at start of sacB promoter GAGCCGAATTCATATACCTGCCGTT (SEQ ID NO:170)
CF 521 (-) Reverse complement of CF 520 TCCTGGAGGAGCCGGTTCGTTAGCCTGCGCAGACATGTTGC (SEQ ID NO:171)
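The annotation that CF 521 is the reverse complement of CF 520 can be checked directly. The following is a minimal sketch; the function name `reverse_complement` is illustrative and not part of the original disclosure, and the sequences are copied from SEQ ID NO:168 and SEQ ID NO:171 as printed above.

```python
# Primer sequences as printed in the text (SEQ ID NO:168 and SEQ ID NO:171).
CF520 = "GCAACATGTCTGCGCAGGCTAACGAACCGGCTCCTCCAGGA"
CF521 = "TCCTGGAGGAGCCGGTTCGTTAGCCTGCGCAGACATGTTGC"

# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


# True when the two primers anneal to each other over their full length,
# which is what the overlap-extension fusion step relies on.
print(reverse_complement(CF520) == CF521)
```

Because the two primers are fully complementary, the 3' end of the promoter/signal-peptide fragment and the 5' end of the asp fragment share the same 41-nucleotide junction.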
The following PCR conditions were used to amplify both pieces:
94°C for 2 min.; followed by 30 cycles of 94°C for 30 sec., 50°C for 1 min., and 66°C for 1 min.; followed by 72°C for 7 min. The resulting PCR products were run on an E-gel (Invitrogen), excised, and purified with a gel extraction kit (Qiagen).
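As a rough planning aid, the programmed hold time of the cycling profile above can be totalled as follows. This is a sketch only: the helper name `program_seconds` is hypothetical, and the figure excludes ramp times between temperatures, which depend on the instrument.

```python
def program_seconds(initial_s, cycles, step_seconds, final_s):
    """Sum the programmed hold times of a simple three-stage PCR profile."""
    return initial_s + cycles * sum(step_seconds) + final_s


# Profile from the text: 94°C 2 min; 30 x (94°C 30 s, 50°C 1 min, 66°C 1 min); 72°C 7 min.
total = program_seconds(initial_s=120, cycles=30, step_seconds=(30, 60, 60), final_s=420)
print(total, "s =", total / 60, "min")  # 5040 s = 84.0 min
```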
In addition, a PCR overlap extension fusion (Ho, Gene, 15:51-59 [1989]) was used to fuse the above gene fragment to the sacB promoter-aprE signal peptide fragment with PFX polymerase (Invitrogen) using the following primers:
CF 161 (+)EcoRI at start of sacB promoter GAGCCGAATTCATATACCTGCCGTT (SEQ ID NO:170)
CF 525 (-) End of Asp gene HindIII GACATGACATAAGCTTAAGGGGAACTTCCAGAGTC (SEQ ID NO:169)
The following conditions were used for these PCRs:
94°C for 2 min.; followed by 25 cycles of 94°C for 45 sec., 60°C for 30 sec., and 68°C for 2 min. 30 sec.; followed by 68°C for 5 min. The resulting PCR fusion products were run on an E-gel (Invitrogen), excised, and purified with a gel extraction kit (Qiagen). The purified fusions were cut (EcoRI/HindIII) and ligated (T4 DNA Ligase, NEB) into an EcoRI/HindIII-cut pJH101 vector (Ferrari et al., J. Bacteriol., 152:809-814 [1983]) containing a strong transcriptional terminator. The ligation mixture was transformed into competent E. coli cells (Top 10 chemically competent cells, Invitrogen) and plasmid preps were done to retrieve the plasmid (Qiagen spin-prep).
The plasmid, pJHsacB-ASP (1-96 sacB promoter; 97-395 aprE +1 through end of aprE ss; and 396-1472 pro + mature asp; See, sequence provided below), was transformed into B. subtilis. Transformation of the B. subtilis (ΔaprE, ΔnprE, oppA, ΔspoIIE, degUHy32, ΔamyE::(xylR,pxylA-comK)) strain was performed as described in WO 02/14490 A2, herein incorporated by reference. Chromosomal DNA was extracted from an overnight culture of the transformed strain (grown in LB medium) and then transformed into strain BG 3594; the resulting strain was named "CF 202." This strain produced a clear halo on the indicator plate (LA + 1.6% skim milk).
pJHsacB-ASP Sequence:
CATCACATATACCTGCCGTTCACTATTATTTAGTGAAATGAGATATTATGATATTTTCTG
AATTGTGATTAAAAAGGCAACTTTATGCCCATGCAACAGAAACTATAAAAAATACAGAGA
ATGAAAAGAAACAGATAGATTTTTTAGTTCTTTAGGCCCGTAGTCTGCAAATCCTTTTAT
GATTTTCTATCAAACAAAAGAGGAAAATAGACCAGTTGCAATCCAAACGAGAGTCTAAT
AGAATGAGGTCacaGAATAGTCTTTTAAGTAAGTCTACTCTGAATTTTTTTAAAAGGAGA
GGGTAAAGAgtgAGAAGCAAAAAATTGTGGATCAGCTTGTTGTTTGCGTTAACGTTAATC
TTTACGATGGCGTTCAGCAACATGTCTGCGCAGGCTaacgaaccggctcctccaggatctgcatcag
cccctccacgattagctgaaaaacttgaccctgacttacttgaagcaatggaacgcgatctggggttagatgcagaggaagca
gctgcaacgttagcttttcagcatgacgcagctgaaacgggagaggctcttgctgaggaactcgacgaagatttcgcgggcac
gtgggttgaagatgatgtgctgtatgttgcaaccactgatgaagatgctgttgaagaagtcgaaggcgaaggagcaactgctgt
gactgttgagcattctcttgctgatttagaggcgtggaagacggttttggatgctgcgctggagggtcatgatgatgtgcctacgtg
gtacgtcgacgtgcctacgaattcggtagtcgttgctgtaaaggcaggagcgcaggatgtagctgcaggacttgtggaaggcg
ctgatgtgccatcagatgcggtcacttttgtagaaacggacgaaacgcctagaacgatgttcgacgtaattggaggcaacgcat
atactattggcggccggtctagatgttctatcggattcgcagtaaacggtggcttcattactgccggtcactgcggaagaacagg
agccactactgccaatccgactggcacatttgcaggtagctcgtttccgggaaatgattatgcattcgtccgaacaggggcagg
agtaaatttgcttgcccaagtcaataactactcgggcggcagagtccaagtagcaggacatacggccgcaccagttggatctg
ctgtatgccgctcaggtagcactacaggttggcattgcggaactatcacggcgctgaattcgtctgtcacgtatccagagggaac
agtccgaggacttatccgcacgacggtttgtgccgaaccaggtgatagcggaggtagccttttagcgggaaatcaagcccaag
gtgtcacgtcaggtggttctggaaattgtcggacggggggaacaacattctttcaaccagtcaacccgattttgcaggcttacggc
ctgagaatgattacgactgactctggaagttcccctTAAGCTTAAAAAACCGGCCTTGGCCCCGCCGGTT
TTTTATTATTTTTCTTCCTCCGCATGTTCAATCCGCTCCATAATCGACGGATGGCTCCCT
CTGAAAATTTTAACGAGAAACGGCGGGTTGACCCGGCTCAGTCCCGTAACGGCCAAGT
CCTGAAACGTCTCAATCGCCGCTTCCCGGTTTCCGGTCAGCTCAATGCCGTAACGGTC
GGCGGCGTTTTCCTGATACCGGGAGACGGCATTCGTAATCGGATCCCGGACGCATCG
TGGCCGGCATCACCGGCGCCACAGGTGCGGTTGCTGGCGCCTATATCGCCGACATCA
CCGATGGGGAAGATCGGGCTCGCCACTTCGGGCTCATGAGCGCTTGTTTCGGCGTGG
GTATGGTGGCAGGCCCCGTGGCCGGGGGACTGTTGGGCGCCATCTCCTTGCATGCAC
CATTCCTTGCGGCGGCGGTGCTCAACGGCCTCAACCTACTACTGGGCTGCTTCCTAAT
GCAGGAGTCGCATAAGGGAGAGCGTCGACCGATGCCCTTGAGAGCCTTCAACCCAGT
CAGCTCCTTCCGGTGGGCGCGGGGCATGACTATCGTCGCCGCACTTATGACTGTCTTC
TTTATCATGCAACTCGTAGGACAGGTGCCGGCAGCGCTCTGGGTCATTTTCGGCGAGG
ACCGCTTTCGCTGGAGCGCGACGATGATCGGCCTGTCGCTTGCGGTATTCGGAATCTT
GCACGCCCTCGCTCAAGCCTTCGTCACTGGTCCCGCCACCAAACGTTTCGGCGAGAA
GCAGGCCATTATCGCCGGCATGGCGGCCGACGCGCTGGGCTACGTCTTGCTGGCGTT
CGCGACGCGAGGCTGGATGGCCTTCCCCATTATGATTCTTCTCGCTTCCGGCGGCATC
GGGATGCCCGCGTTGCAGGCCATGCTGTCCAGGCAGGTAGATGACGACCATCAGGGA
CAGCTTCAAGGATCGCTCGCGGCTCTTACCAGCCTAACTTCGATCACTGGACCGCTGA
TCGTCACGGCGATTTATGCCGCCTCGGCGAGCACATGGAACGGGTTGGCATGGATTG
AGGCGCCGCCCTATACCTTATTTATGTTACAGTAATATTGACTTTTAAAAAAGGATTGAT
TCTAATGAAGAAAGCAGACAAGTAAGCCTCCTAAATTCACTTTAGATAAAAATTTAGGAG
GCATATCAAATGAACTTTAATAAAATTGATTTAGACAATTGGAAGAGAAAAGAGATATTT
AATCATTATTTGAACCAACAAACGACTTTTAGTATAACCACAGAAATTGATATTAGTGTTT

TATACCGAAACATAAAACAAGAAGGATATAAATTTTACCCTGCATTTATTTTCTTAGTGA
CAAGGGTGATAAACTCAAATACAGCTTTTAGAACTGGTTACAATAGCGACGGAGAGTTA
GGTTATTGGGATAAGTTAGAGCCACTTTATACAATTTTTGATGGTGTATCTAAAACATTC
TCTGGTATTTGGACTCCTGTAAAGAATGACTTCAAAGAGTTTTATGATTTATACCTTTCT
GATGTAGAGAAATATAATGGTTCGGGGAAATTGTTTCCCAAAACACCTATACCTGAAAA
TGCTTTTTCTCTTTCTATTATTCCATGGACTTCATTTACTGGGTTTAACTTAAATATCAAT
AATAATAGTAATTACCTTCTACCCATTATTACAGCAGGAAAATTCATTAATAAAGGTAATT
CAATATATTTACCGCTATCTTTACAGGTACATCATTCTGTTTGTGATGGTTATCATGCAG
GATTGTTTATGAACTCTATTCAGGAATTGTCAGATAGGCCTAATGACTGGCTTTTATAAT
ATGAGATAATGCCGACTGTACTTTTTACAGTCGGTTTTCTAATGTCACTAACCTGCCCC
GTTAGTTGAAGAAGGTTTTTATATTACAGCTCCAGATCCTGCCTCGCGCGTTTCGGTGA
TGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAA
GCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTG
TCGGGGCGCAGCCATGACCCAGTCACGTAGCGATAGCGGAGTGTATACTGGCTTAAC
TATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATGCGGTGTGAAATACCGC
ACAGATGCGTAAGGAGAAAATACCGCATCAGGCGCTCTTCCGCTTCCTCGCTCACTGA
CTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGT
AATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGC
CAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCC
GCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGA
CAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGT
TCCGACCCTGCCGCTTACCGGATACCTGTGCGCCTTTCTCCCTTCGGGAAGCGTGGCG
CTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTGCAAGCT
GGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTA
TCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGT
AACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGG
CCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAG
TTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAG
CGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAG
ATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGG
ATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGA
AGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTA
ATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACT
CCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCA
ATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAG
CCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTAT
TAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTG
TTGCCATTGCTGCAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAG
CTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCG
GTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCAC
TCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTT
TCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGA
GTTGCTCTTGCCCGGCGTCAACACGGGATAATACCGCGCCACATAGCAGAACTTTAAA
AGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTG
TTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTAC
TTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGA
ATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGC
ATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAA
CAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCTAAGAAACCA
TTATTATCATGACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTTCGTCTTCAA
(SEQ ID NO:172)
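The feature coordinates given for pJHsacB-ASP (1-96, 97-395, and 396-1472) can be sanity-checked with a few lines of arithmetic. The sketch below assumes 1-based, inclusive coordinates, which the text does not state explicitly; the dictionary and helper names are illustrative.

```python
# Feature coordinates as listed in the text (assumed 1-based and inclusive).
FEATURES = {
    "sacB promoter": (1, 96),
    "aprE +1 through end of aprE signal sequence": (97, 395),
    "pro + mature asp": (396, 1472),
}


def feature_length(start, end):
    """Length in nucleotides of an inclusive 1-based interval."""
    return end - start + 1


lengths = {name: feature_length(*span) for name, span in FEATURES.items()}
for name, n in lengths.items():
    print(f"{name}: {n} nt")

# The three features should abut with no gaps or overlaps.
spans = list(FEATURES.values())
contiguous = all(spans[i][1] + 1 == spans[i + 1][0] for i in range(len(spans) - 1))

# A coding region should be a whole number of codons.
asp_nt = lengths["pro + mature asp"]
print(contiguous, asp_nt % 3 == 0, asp_nt // 3)
```

Under that assumption the pro + mature asp region spans 1077 nucleotides, that is 359 codons; whether the stop codon is counted within this interval is not stated in the text.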

Expression of the asp gene was investigated in a nine-protease delete Bacillus subtilis host. The plasmid pHPLT-ASP-C1-2 (See, Table 10-2, and Figure 9) was transformed into B. subtilis (ΔaprE, ΔnprE, Δepr, ΔispA, Δbpr, Δvpr, ΔwprA, Δmpr-ybfJ, ΔnprB) and (degUHy32, oppA, ΔspoIIE3501, amyE::(xylRPxylAcomK-ermC)). Transformation was performed as known in the art (See e.g., WO 02/14490, incorporated herein by reference). The Asp protein was produced by growth in shake flasks at 37°C in MBD medium, a MOPS-based defined medium. MBD medium was made essentially as known in the art (See, Neidhardt et al., J. Bacteriol., 119:736-747 [1974]), except that NH4Cl, FeSO4, and CaCl2 were left out of the base medium, 3 mM K2HPO4 was used, and the base medium was supplemented with 60 mM urea, 75 g/L glucose, and 1% soytone. Also, the micronutrients were made up as a 100X stock containing, in one liter, 400 mg FeSO4.7H2O, 100 mg MnSO4.H2O, 100 mg ZnSO4.7H2O, 50 mg CuCl2.2H2O, 100 mg CoCl2.6H2O, 100 mg Na2MoO4.2H2O, 100 mg Na2B4O7.10H2O, 10 ml of 1 M CaCl2, and 10 ml of 0.5 M sodium citrate. The expression levels obtained in these experiments were found to be fairly high.
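The working (1X) concentrations implied by the 100X micronutrient stock can be worked out as below. This is an illustrative sketch: the dictionary keys are shorthand labels, the helper `stock_millimolar` is hypothetical, and it assumes the stock is diluted exactly 100-fold into the final medium.

```python
# Amounts per liter of 100X stock, as listed in the text (mg/L).
STOCK_MG_PER_L = {
    "FeSO4.7H2O": 400,
    "MnSO4.H2O": 100,
    "ZnSO4.7H2O": 100,
    "CuCl2.2H2O": 50,
    "CoCl2.6H2O": 100,
    "sodium molybdate.2H2O": 100,
    "Na2B4O7.10H2O": 100,
}
DILUTION_FOLD = 100  # assumption: the 100X stock is used at exactly 1X

working_mg_per_l = {salt: mg / DILUTION_FOLD for salt, mg in STOCK_MG_PER_L.items()}


def stock_millimolar(volume_ml, molarity, total_volume_l=1.0):
    """Concentration (mM) after adding volume_ml of a molar solution to the stock."""
    return volume_ml / 1000 * molarity / total_volume_l * 1000


cacl2_1x_mM = stock_millimolar(10, 1.0) / DILUTION_FOLD    # 10 ml of 1 M CaCl2
citrate_1x_mM = stock_millimolar(10, 0.5) / DILUTION_FOLD  # 10 ml of 0.5 M citrate

for salt, mg in working_mg_per_l.items():
    print(f"{salt}: {mg} mg/L at 1X")
print(f"CaCl2: {cacl2_1x_mM} mM, sodium citrate: {citrate_1x_mM} mM at 1X")
```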
In additional embodiments, "consensus" promoters such as those developed through site-saturation mutagenesis to create promoters that more perfectly conform to the established consensus sequences for the "-10" and "-35" regions of the vegetative "sigma A-type" promoters for B. subtilis (See, Voskuil et al., Mol. Microbiol., 17:271-279 [1995]) find use in the present invention. However, it is not intended that the present invention be limited to any particular consensus promoter, as it is contemplated that other promoters that function in Bacillus cells will find use in the present invention.
EXAMPLE 11 Protease Production in Bacillus clausii
In this Example, experiments conducted to produce protease 69B4 (also referred to as "Asp" herein) in B. clausii are described. In order to express the Asp protein in Bacillus clausii, it was necessary to use a promoter that works in this alkaliphilic microorganism, due to its unique regulation systems. The production profile of the alkaline serine protease of B. clausii PB92 (MAXACAL® protease) has shown that it must have a strong promoter (referred to as "MXL-prom." herein; SEQ ID NOS:173, 174, and 175, See, Figure 18) with delicate regulation. Besides the promoter region, signal sequences (leader sequences) are also known to be very important for secreting proteins in B. clausii. Therefore, 3 constructs were designed with the MAXACAL® protease promoter region and separate fusions of the MAXACAL® protease leader sequence and the Asp leader sequence in front of the N-terminal Pro and the mature Asp protein, with 3, 6 and 27 amino acids of the MAXACAL® protease leader fused to 25, 25 and 0 amino acids of the Asp leader, respectively.
To make these constructs, amplification of DNA fragments needed to be done in order to enable the fusion. Therefore, PCRs were performed on both MAXACAL® protease and Asp template DNA with Phusion high fidelity polymerase (Finnzymes) according to the manufacturer's instructions.
PCR reactions were executed with the following primers (bold indicates the MAXACAL® protease part of the primer) synthesized at MWG-Biotech AG:
1: B. clau-3F: agggaaccgaatgaagaaacgaactgtcacaagagctctg (SEQ ID NO:176)
2: B. clau-3R: cagagctcttgtgacagttcgtttcttcattcggttccct (SEQ ID NO:177)
3: B. clau-6F: aatgaagaaaccgttggggcgaactgtcacaagagctctg (SEQ ID NO:178)
4: B. clau-6R: cagagctcttgtgacagttcgccccaacggtttcttcatt (SEQ ID NO:179)
5: B. clau-27F: agttcatcgatcgcatcggctaacgaaccggctcctccagga (SEQ ID NO:180)
6: B. clau-27R: tcctggaggagccggttcgttagccgatgcgatcgatgaact (SEQ ID NO:181)
7: B. clau-vector 5': tcagggggatcctagattctgttaacttaacgtt (SEQ ID NO:182). This primer contains the HpaI site (GTTAAC) from the promoter region and a BamHI site (GGATCC) for cloning reasons (both underlined).
8: pHPLT-HindIII-R: gtgctgttttatcctttaccttgtctcc (SEQ ID NO:183). The sequence of this primer lies just upstream of the HindIII site of pHPLT-ASP-C1-2 (See, Table 10-2).
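The two cloning sites described for the B. clau-vector 5' primer can be located with a simple string search. In this sketch the primer is written with "g" wherever the printed text shows "q", on the assumption that those characters are transcription artifacts; the helper `find_sites` is illustrative and reports 0-based positions.

```python
# B. clau-vector 5' primer (SEQ ID NO:182); "q" in the printed text is read as "g" (assumption).
VECTOR_5P = "tcagggggatcctagattctgttaacttaacgtt"

# Recognition sequences named in the text.
SITES = {"BamHI": "GGATCC", "HpaI": "GTTAAC"}


def find_sites(seq, sites):
    """Return 0-based start positions of each recognition sequence in seq."""
    seq = seq.upper()
    return {
        name: [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]
        for name, site in sites.items()
    }


print(find_sites(VECTOR_5P, SITES))  # {'BamHI': [6], 'HpaI': [20]}
```

Each site occurs exactly once in the primer, with the BamHI site placed 5' of the HpaI site.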

In Table 11-1, "pMAX4" refers to the template described in WO 88/06623, herein incorporated by reference. PCR fragments 3F3R, 6F6R, and 27F27R were digested with both BamHI and HindIII. The digested PCR fragments were ligated with T4 ligase (Invitrogen) into BamHI + HindIII-opened plasmid pHPLT-ASP-C1-2 (See, Figure 18). The ligation product was transformed to competent B. subtilis cells (ΔaprE, ΔnprE, oppA, ΔspoIIE, degUHy32, ΔamyE::(xylR,pxylA-comK); See e.g., WO 02/14490, incorporated herein by reference) and selected on neomycin (20 mg/l). Heart Infusion-agar plates containing neomycin were used to identify neomycin resistant colonies. DNA of the B. subtilis transformants was isolated using Qiagen's plasmid isolation kit according to the manufacturer's instructions, and was tested for the presence of the fused MAXACAL® protease-Asp fragment by the restriction pattern obtained after digestion with both NcoI and HpaI together in one tube. The restriction enzymes used in this Example (i.e., BamHI, HindIII, NcoI and HpaI) were all purchased from NEB, and used following the instructions of the supplier. DNA of B. subtilis transformants that showed 2 bands with restriction enzymes (NcoI + HpaI) was used to transform protoplast cells of the protease negative B. clausii strain PBT142 (derived from PB92).
The protoplast transformation of B. clausii strain PBT142 was performed according to the protocol mentioned for the protoplast transformation of B. alkalophilus (renamed B. clausii) strain PB92 in patent WO 88/06623, herein incorporated by reference. A modification to this protocol was the use of an alternative recipe for the regeneration plates, in that instead of 1.5% agar, 8.0 g/l Gelrite gellan gum (Kelco) was used. In addition, instead of 1000 mg/l neomycin, 20 mg/l neomycin was used, as described by Van der Laan et al. (Appl. Environ. Microbiol., 57:901-909 [1991]).
DNA from all 3 constructs isolated from B. subtilis (see above) was transformed into B. clausii PBT142 protoplasts using the same protocol as above. Transformants in B. clausii PBT142 were selected by replica-plating on Heart Infusion agar plates containing 20 mg/l neomycin. The B. clausii strains with the different constructs were produced as indicated in Table 11-2.


27 MXL/0 ASP: PMAX-ASP1

These 3 strains were fermented in shake flasks containing 100 ml Synthetic Maxatase Medium (SMM) (See, U.S. Pat. No. 5,324,653, herein incorporated by reference). However, instead of 0.97 g/l CaCl2.6H2O, 0.5 g/l CaCl2 was used. Also, instead of 0.5 ml/l antifoam 5693, 0.25 ml/l Basildon was used. The 100 ml SMM shake flasks were inoculated with 0.2 ml of a pre-culture of the 3 B. clausii strains containing the leader constructs, grown in 10 ml TSB (Tryptone Soya Broth) with 20 mg/l neomycin. The protease production values were measured via the AAPF-assay (as described above) after growth in the shake flasks for 3 days. The results indicated that these constructs were able to express protease with proteolytic activity.
In an additional experiment, integration of the leader construct with the entire MAXACAL® protease leader length (27 amino acids) was investigated. However, it is not intended that the present invention be limited to any particular mechanism.
Stable integration of heterologous DNA in the B. alcalophilus (now, B. clausii) chromosome is described in several publications (See e.g., WO 88/06623, and Van der Laan et al., supra). The procedure described in patent WO 88/06623 for integration of 1 or 2 copies of the MAXACAL® protease gene in the chromosome of B. alcalophilus (now, B. clausii) was used to integrate at least 1 copy of the asp gene in the chromosome of B. clausii PBT142. However, a derivative of pE194-neo, pENM#3 (See, Figure 19), was used instead of the integration vector pE194-neo (used to make pMAX4 containing the MAXACAL® protease gene). In the integration vector pENM#3, the Asp leader PCR product 27F27R was cloned in the unique blunt-end site HpaI in between the 5' and the 3' flanking regions of the MAXACAL® protease gene. For this purpose, 27F27R was made blunt-ended as follows: it was first digested with HpaI (5' end), purified with the Qiagen PCR purification kit, and then digested with HindIII (3' end). The treated PCR fragment 27F27R was purified again after HindIII digestion (using the same Qiagen kit), filled in with dNTPs using T4 polymerase (Invitrogen), and purified again with the Qiagen kit. The HpaI-opened pENM#3 and the blunt-ended PCR product 27F27R were ligated with T4 ligase (Invitrogen). The ligation product was transformed directly to B. clausii PBT142 protoplasts and selected after replica-plating on HI agar plates with 20 mg/l neomycin. Two transformants with the correct orientation of the asp gene in the integration vector were identified and taken into the integration procedure as described in patent WO 88/06623. Selections were done at 2 mg/l and 20 mg/l neomycin for integration in the MAXACAL® protease locus and at an illegitimate locus, respectively. These results indicated that B. clausii is also suitable as an expression host for the Asp protease.
EXAMPLE 12 Protease Production in B. licheniformis
In this Example, experiments conducted to produce protease 69B4 in B. licheniformis are described. During these experiments, various expression constructs were created to produce protease 69B4 (also referred to as "ASP protease") in Bacillus licheniformis. Constructs were cloned into expression plasmid pHPLT (replicating in Bacillus) and/or into integration vector pICatH. Plasmid pHPLT (See, Figure 17; and U.S. Pat. No. 6,562,612, herein incorporated by reference) is a pUB110 derivative, has a neomycin resistance marker for selection, and contains the B. licheniformis α-amylase (LAT) promoter (PLAT) and a sequence encoding the LAT signal peptide (preLAT), followed by PstI and HpaI restriction sites for cloning and the LAT transcription terminator. The pICatH vector (See, Figure 20) contains a temperature sensitive origin of replication (ori pE194, for replication in Bacillus), ori pBR322 (for amplification in E. coli), a neomycin resistance gene for selection, and the native B. licheniformis chloramphenicol resistance gene (cat) with repeats for selection, chromosomal integration and cassette amplification.
Construct ASPc1 was created as a PstI-HpaI fragment by fusion PCR with High Fidelity Platinum Taq Polymerase (Invitrogen) according to the manufacturer's instructions, and with the following primers:
pHPLT-BglII_FW AGTTAAGCAATCAGATCTTCTTCAGGTTA (SEQ ID NO:184)
fusionC1_FW CATTGAAAGGGGAGGAGAATCATGAGAAGCAAGAAGCGAACTGTCAC (SEQ ID NO:185)
fusionC1_RV GTGACAGTTCGCTTCTTGCTTCTCATGATTCTCCTCCCCTTTCAATG (SEQ ID NO:186)
pHPLT-HindIII_RV CTTTACCTTGTCTCCAAGCTTAAAATAAAAAAACGG (SEQ ID NO:187)
These primers were obtained from MWG Biotech. PCR reactions were typically performed on a thermocycler for 30 cycles with High Fidelity Platinum Taq polymerase (Invitrogen) according to the manufacturer's instructions, with an annealing temperature of 55°C. PCR-I was performed with the primers pHPLT-BglII_FW and fusionC1_RV on pHPLT as template DNA. PCR-II was performed with primers fusionC1_FW and pHPLT-HindIII_RV on plasmid pHPLT-ASP-C1-2. The fragments from PCR-I and PCR-II were assembled in a fusion PCR with the primers pHPLT-BglII_FW and pHPLT-HindIII_RV. This final PCR fragment was purified using the Qiagen PCR purification kit, digested with BglII and HindIII, and ligated with T4 DNA ligase according to the manufacturer's instructions into BglII- and HindIII-digested pHPLT. The ligation mixture was transformed into B. subtilis strain OS14 as known in the art (See, U.S. Pat. Appl. No. US20020182734 and WO 02/14490, both of which are incorporated herein by reference). Correct transformants produced a halo on a skim milk plate, and one of them was selected to isolate plasmid pHPLT-ASPc1. This plasmid was introduced into B. licheniformis host BML780 (BRA7 derivative, cat-, amyL-, spo-, aprL-, endoGluC-) by protoplast transformation as known in the art (See, Pragai et al., Microbiol., 140:305-310 [1994]). Neomycin resistant transformants formed halos on skim milk plates, whereas the parent strain without pHPLT-ASPc1 did not. This result shows that B. licheniformis is capable of expressing and secreting ASP protease when expression is driven by the LAT promoter and when it is fused to a hybrid signal peptide (MRSKKRTVTRALAVATAAATLLAGGMAAQA; SEQ ID NO:135).
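The assembly step of a fusion (overlap-extension) PCR, in which two fragments sharing a junction sequence are joined into one product, can be modelled on strings. The sketch below is a simplified illustration, not the protocol itself: the junction is the fusionC1_FW primer sequence from the text, while the flanking sequences and the function `fuse_by_overlap` are hypothetical.

```python
def fuse_by_overlap(upstream, downstream, min_overlap=15):
    """Join two same-strand fragments on their longest suffix/prefix overlap."""
    longest = min(len(upstream), len(downstream))
    for n in range(longest, min_overlap - 1, -1):
        if upstream[-n:] == downstream[:n]:
            return upstream + downstream[n:]
    raise ValueError("fragments share no overlap of the required length")


# Junction shared by both PCR products (fusionC1_FW, SEQ ID NO:185).
JUNCTION = "CATTGAAAGGGGAGGAGAATCATGAGAAGCAAGAAGCGAACTGTCAC"

# Hypothetical flanks standing in for the rest of each fragment.
UP_FLANK = "AGTTAAGTAATGAGAT"
DOWN_FLANK = "AAGAGCTCTG"

pcr_1 = UP_FLANK + JUNCTION    # ends with the junction
pcr_2 = JUNCTION + DOWN_FLANK  # begins with the junction

fused = fuse_by_overlap(pcr_1, pcr_2)
print(len(pcr_1), len(pcr_2), len(fused))
```

The fused product contains the junction once, so its length equals the sum of both fragments minus the overlap.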
Construct ASPc3 was created as a PstI-HpaI fragment by fusion PCR (necessary to remove the internal PstI site in the synthetic asp gene) as described above with the following primers:
ASPdelPstI_FW GCGCAGGATGTAGCAGCTGGACTTGTGG (SEQ ID NO:188)
ASPdelPstI_RV CCACAAGTCCAGCTGCTACATCCTGCGC (SEQ ID NO:189)
AspPstI_FW GCCTCATTCTGCAGCTTCAGCAAACGAACCGGCTCCTCCAGG (SEQ ID NO:190)
AspHpaI_RV CGTCCTCTGTTAACTCAGTCGTCACTTCCAGAGTCAGTCGTAATC (SEQ ID NO:191)
After purification, the PCR product was digested with PstI and HpaI, ligated into PstI- and HpaI-digested pHPLT, and then transformed into B. subtilis strain OS14. Plasmid pHPLT-ASPc3 was isolated from a neomycin resistant transformant that formed a relatively large halo (compared to other transformants) on a skim milk plate. Plasmid DNA was isolated using the Qiagen plasmid purification kit and sequenced by BaseClear.

Sequencing confirmed that the ASPc3 construct encodes mature ASP that has two aspartic acid residues at the extreme C-terminal end (S188D, P189D). These mutations were deliberately introduced by PCR to make the C-terminus of ASP less susceptible to proteolytic degradation (See, WO 02055717). It also appeared that two mutations were introduced into the coding region of the N-terminal pro region by the PCR methods. These mutations caused two amino acid changes in the N-terminal pro-region: L42I and Q141P. Since this particular clone with these two pro(N) mutations gives a somewhat larger halo than other clones without these mutations, it was contemplated that expression and/or secretion of ASP protease in Bacillus is positively affected by these N-terminal pro mutations. However, it is not intended that the present invention be limited to these specific mutations, as it is also contemplated that further mutations will find use in the present invention.
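Substitutions such as S188D, P189D, L42I and Q141P follow the usual "wild-type residue, position, new residue" shorthand. A minimal sketch of parsing and applying this notation is given below; the function names and the short test peptide are hypothetical, and positions are assumed to be 1-based within whichever region (mature or pro) the mutation refers to.

```python
import re

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code):
    """Split e.g. 'S188D' into (wild-type residue, 1-based position, new residue)."""
    match = _SUBSTITUTION.match(code)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(protein, codes):
    """Apply substitutions, checking that each wild-type residue matches the sequence."""
    residues = list(protein)
    for code in codes:
        wild_type, position, new = parse_substitution(code)
        if residues[position - 1] != wild_type:
            raise ValueError(f"{code}: position {position} is {residues[position - 1]}")
        residues[position - 1] = new
    return "".join(residues)


print([parse_substitution(c) for c in ("S188D", "P189D", "L42I", "Q141P")])
print(apply_substitutions("MKLSP", ["S4D", "P5D"]))  # hypothetical peptide -> MKLDD
```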
Next, pHPLT-ASPc3 was transformed into BML780 as described above. In contrast to the parental strain without the plasmid, BML780(pHPLT-ASPc3) produced a halo on a skim milk plate, indicating that this ASPc3 construct also leads to ASP expression in B. licheniformis. To make an integrated, amplified strain containing the ASPc3 expression cassette, the C3 construct was amplified from pHPLT-ASPc3 with the following primers:
EBS2XhoI_FW ATCCTACTCGAGGCTTTTCTTTTGGAAGAAAATATAGGG (SEQ ID NO:192)
EBS2XhoI_RV TGGAATCTCGAGGTTTTATCCTTTACCTTGTCTCC (SEQ ID NO:193)
The PCR product was digested with XhoI, ligated into XhoI-digested pICatH (See, Figure 20) and transformed into B. subtilis OS14 as described above. The plasmid from an ASP expressing clone (judged by halo formation on skim milk plates) was isolated and designated pICatH-ASPc3. DNA sequencing by BaseClear confirmed that no further mutations were introduced in the ASPc3 cassette in pICatH-ASPc3. The plasmid was then transformed into BML780 at the permissive temperature (37°C), and one neomycin resistant (neoR) and chloramphenicol resistant (capR) transformant was selected and designated BML780(pICatH-ASPc3). The plasmid in BML780(pICatH-ASPc3) was integrated into the cat region on the B. licheniformis genome by growing the strain at a non-permissive temperature (50°C) in medium with chloramphenicol. One capR clone was selected and designated BML780-pICatH-ASPc3. BML780-pICatH-ASPc3 was grown again at the permissive temperature for several generations without antibiotics to loop out vector sequences, and then one neomycin sensitive (neoS), capR clone was selected. In this clone, vector sequences of pICatH on the chromosome were excised (including the neomycin resistance gene) and only the ASPc3-cat cassette was left. Note that the cat gene is a native B. licheniformis gene and that the asp gene is the only heterologous piece of DNA introduced into the host. Next, the ASPc3-cat cassette on the chromosome was amplified by growing the strain in/on media with increasing concentrations of chloramphenicol. After various rounds of amplification, one clone (resistant to 75 µg/ml chloramphenicol) was selected and designated "BML780-ASPc3." This clone produced a clear halo on a skim milk plate, whereas the parental strain BML780 did not, indicating that ASP protease is produced and secreted by the BML780-ASPc3 strain.
Construct ASPc4 is similar to ASPc3, but ASP protease expressed from ASPc4 does not have two aspartic acid residues at the C-terminal end of the mature chain. ASPc4 was created by amplification of the asp gene in pHPLT-ASPc3 with the following Hypur primers from MWG Biotech (Germany):
XhoPlatPRElat_FW
acccccctcgaggcttttcttttggaagaaaatatagggaaaatggtacttgttaaaaattcggaatatttatacaatatcatatgtttcacattgaaaggggaggagaatcatgaaacaacaaaaacggctttac (SEQ ID NO:194)
ASPendTERMXhoI_RV
gtcgacctcgaggttttatcctttaccttgtctccaagcttaaaataaaaaaacggatttccttcaggaaatccgtcctctgttaactcaaggggaacttccagagtcagtcgtaatc (SEQ ID NO:195)
The ASPc4 PCR product was purified and digested with XhoI, ligated into XhoI-digested pICatH, and transformed into B. subtilis OS14 as described above for ASPc3. Plasmid was isolated from a neoR, capR clone and designated pICatH-ASPc4. pICatH-ASPc4 was transformed into BML780 and integrated in the genome, vector sequences were excised, and the cat-ASPc4 cassette was amplified as described above for the ASPc3 construct. Strains with the ASPc4 cassette did not produce smaller halos on skim milk plates than strains with the ASPc3 cassette, suggesting that the polarity of the C-terminus of mature ASP is not a significant factor for ASP production, secretion and/or stability in Bacillus. However, it is not intended that the present invention be limited to any particular method.
To explore whether the native ASP signal peptide can drive export in Bacillus, ASPc5 was constructed. PCR was performed on the synthetic asp gene of DNA2.0 with primers ASPendTERMXhoI_RV (above) and XhoPlatPREasp_FW.
XhoPlatPREasp_FW
acccccctcgaggcttttcttttggaagaaaatatagggaaaatggtacttgttaaaaattcggaatatttatacaatatcatatgtttcacattgaaaggggaggagaatcatgacaccacgaactgtcacaag (SEQ ID NO:196)
The ASPc5 PCR product was purified and digested with XhoI, ligated into XhoI-digested pICatH, and transformed into B. subtilis OS14 as described above for ASPc3. Plasmid was isolated from a neoR, capR clone and designated "pICatH-ASPc5." DNA sequencing confirmed that no unwanted mutations were introduced into the asp gene by the PCR. pICatH-ASPc5 was transformed into BML780 and integrated in the genome, vector sequences were excised, and the cat-ASPc5 cassette was amplified as described above for the ASPc3 construct. It was observed that B. licheniformis strains with the ASPc5 construct also form halos on skim milk plates, confirming that the native signal peptide of ASP functions as a secretion signal in Bacillus species.
Finally, construct ASPc6 was created. It has the B. licheniformis subtilisin (aprL) promoter, RBS and signal peptide sequence fused in-frame to the DNA sequence encoding mature ASP from the optimized DNA2.0 gene. It was created by a fusion PCR with primer ASPendTERMXhoI_RV and the following primers:
AprLupXhoI_FW attagtctcgaggatcgaccggaccgcaacctcc (SEQ ID NO:197)
AprL_Asp_FW cgatggcattcagcgattccgcttctgctaacgaaccggctcctccaggatctgc (SEQ ID NO:198)
AprL_Asp_RV gcagatcctggaggagccggttcgttagcagaagcggaatcgctgaatgccatcg (SEQ ID NO:199)
PCR-I was performed with the primers AprLupXhoI_FW and AprL_Asp_RV on chromosomal DNA of BRA7 as template DNA. PCR-II was performed with primers AprL_Asp_FW and ASPendTERMXhoI_RV on the synthetic asp gene of DNA2.0. The fragments from PCR-I and PCR-II were assembled in a fusion PCR with the primers ASPendTERMXhoI_RV and AprLupXhoI_FW. This final PCR fragment was purified using Qiagen's PCR purification kit (according to the manufacturer's instructions), digested with XhoI, ligated into pICatH, and transformed into B. subtilis OS14, as described above for ASPc3. Plasmid was isolated from a neoR, capR clone and designated "pICatH-ASPc6." DNA sequencing confirmed that no unwanted mutations were introduced into the asp gene or aprL region by the PCRs. pICatH-ASPc6 was transformed into BML780 and integrated in the genome, vector sequences were excised, and the cat-ASPc6 cassette was amplified as described above for the ASPc3 construct. B. licheniformis strains with the ASPc6 construct also formed halos on skim milk plates, indicating that the aprL promoter in combination with the AprL signal peptide drives expression/secretion of ASP protease in B. licheniformis.
EXAMPLE 13 Protease Production in T. reesei
In this Example, experiments conducted to produce protease 69B4 in T. reesei are described. In these experiments, three different fungal constructs (fungal expression vectors comprising cbh1 fusions) were developed. One contained the ASP 5' pro region, mature gene, and 3' pro region; the second contained the ASP 5' pro region and the mature gene; and the third contained only the ASP mature gene.
The following primer pairs were used to PCR-amplify (in the presence of 10% DMSO) the different fragments from the chromosomal DNA K25.10 carrying the ASP gene, and to introduce SpeI-AscI sites for cloning the fragments into the vector pTREX4 (See, Figure 21) digested with the SpeI and AscI restriction enzymes.
1. CBHI fusion with the ASP 5'pro region, mature gene, and 3'pro region:
AspproF forward primer (SpeI-Kexin site-ATG-pro sequence):
5'-ACTAGTAAGCGGATGAACGAGCCCGCACCACCCGGGAGCGCGAGC (SEQ ID NO:200)
AspproR reverse primer (AscI site; C-term pro region from the TAA stop codon to the end of the gene):
5'- GGCGCGCC TTA GGGGAGGGTGAGCCCCATGGTGTAGGCACCG (SEQ ID NO:201)
2. The ASP 5'pro region and mature gene:
AspproF forward primer (SpeI-Kexin site-ATG-pro sequence):
5'-ACTAGTAAGCGGATGAACGAGCCCGCACCACCCGGGAGCGCGAGC (SEQ ID NO:202)
AspmatR reverse primer (AscI site: TAA stop to the end of the mature sequence):
5'- GGCGCGCC TTA CGGGCTGCTGCCCGAGTCCGTGGTGATCA-3' (SEQ ID NO:203)
3. The ASP mature gene only:
AspmatF forward primer (SpeI-Kexin site-ATG-mature):
5'-ACTAGT AAGCGG ATG TTCGACGTGATCGGCGGCAACGCCTACACCAT (SEQ ID NO:204)
AspmatR reverse primer (AscI site: TAA stop to the end of the mature sequence):
5'- GGCGCGCC TTA CGGGCTGCTGCCCGAGTCCGTGGTGATCA-3' (SEQ ID NO:205)
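The layout of the AspproF primer (SpeI site, kexin site, ATG, pro sequence) can be illustrated by translating the codons that follow the SpeI site. The sketch below builds the standard genetic code table locally; the function `translate` and the variable names are illustrative, and reading the frame immediately after ACTAGT is an assumption based on the primer annotation.

```python
# Standard genetic code, enumerated in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons from the first base; trailing bases are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


ASPPRO_F = "ACTAGTAAGCGGATGAACGAGCCCGCACCACCCGGGAGCGCGAGC"  # SEQ ID NO:200
SPEI_SITE = "ACTAGT"

in_frame = ASPPRO_F[len(SPEI_SITE):]  # assumed reading frame after the SpeI site
peptide = translate(in_frame)
print(peptide)  # KRMNEPAPPGSAS
```

The first two residues, Lys-Arg, correspond to the kexin cleavage site encoded by AAGCGG, followed by the Met of the ATG and the start of the pro sequence.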
After construction, the different plasmids were transformed into a Trichoderma reesei strain with disruptions in the cbh1, cbh2, egl1, and egl2 genes, using biolistic transformation methods known in the art. Stable transformants were screened, based on morphology. Ten stable transformants for each construct were screened in shake flasks. The initial inoculum medium contained 30 g/L α-lactose, 6.5 g/L (NH4)2SO4, 2 g/L KH2PO4, 0.3 g/L MgSO4*7H2O, 0.2 g/L CaCl2, 1 ml/L 1000X T. reesei Trace Salts, 2 ml/L 10% TWEEN®-80, 22.5 g/L Proflo, and 0.72 g/L CaCO3, in which the transformants were grown for approximately 48 hr. After this incubation period, 10% of the culture was transferred into flasks containing minimal medium known in the art (See, Foreman et al., J. Biol. Chem., 278:31988-31997 [2003]), with 16 g/L of lactose to induce expression. The flasks were placed in a 28°C shaker. Four-day samples were run on NuPAGE 4-12% gels, and stained with Coomassie Blue. After five days, the protease activity was measured by adding 10 µl of the supernatant to 190 µl AAPF substrate solution (conc. 1 mg/ml, in 0.1 M Tris/0.005% TWEEN, pH 8.6). The rate of increase in absorbance at 410 nm due to release of p-nitroaniline was monitored (25°C).
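The rate of increase in absorbance at 410 nm is typically taken as the slope of a straight-line fit to the readings over time. The following sketch shows one way to compute it with an ordinary least-squares fit; the readings are invented for illustration only and are not data from these experiments, and the function name `absorbance_rate` is hypothetical.

```python
def absorbance_rate(times_s, absorbances):
    """Least-squares slope of absorbance versus time (absorbance units per second)."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_a = sum(absorbances) / n
    s_xy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, absorbances))
    s_xx = sum((t - mean_t) ** 2 for t in times_s)
    return s_xy / s_xx


# Hypothetical A410 readings, for illustration only.
times = [0, 30, 60, 90, 120]
sample = [0.10, 0.16, 0.22, 0.28, 0.34]
control = [0.10, 0.112, 0.124, 0.136, 0.148]

sample_rate = absorbance_rate(times, sample)
control_rate = absorbance_rate(times, control)
print(sample_rate, control_rate, sample_rate / control_rate)
```

Dividing the sample rate by the rate of the parent strain gives the fold difference in activity under the same assay conditions.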
The activity data showed that there was a 5x higher production over the control strain (i.e., the parent strain), indicating that T. reesei is suitable for the expression of ASP protease.
EXAMPLE 14 Protease Production in A. niger
In this Example, experiments conducted to produce protease 69B4 in Aspergillus niger var. awamori (PCT WO90/00192) are described. In these experiments, four different fungal constructs (fungal expression vectors comprising glaA fusions) were developed. One contained the ASP pre-region, 5' pro-region, mature gene, and the 3' pro-region; the second contained the ASP pre-region, 5' pro-region, and the mature gene; the third contained the ASP 5' pro-region, mature gene, and the 3' pro-region; the fourth contained the ASP 5' pro-region and the mature gene.
Primers selected from the following primer pairs were used to PCR-amplify (in the presence of 10% DMSO) the different fragments from the chromosomal DNA of 69B4 carrying the asp gene and to introduce the NheI-BstEII sites, so that the fragments could be cloned into the vector pSLGAMpR2 (See, Figure 22) digested with the NheI and BstEII restriction enzymes.
Primers Anforward 01 and Anforward 02 contained attB1 Gateway cloning sequences (Invitrogen) at the 5' end of the primer. Primers Anreversed 01 and Anreversed 02 contained attB2 Gateway cloning sequences (Invitrogen) at the 5' end of the primer. These primers were used to PCR-amplify (in the presence of 10% DMSO) the different fragments from the chromosomal DNA of 69B4 carrying the ASP genes.
The different constructs were transferred to an A. niger Gateway-compatible destination vector pRAXdes2 (See, Figure 23; See also, U.S. Pat. Appln. Ser. No. 10/804,785, and PCT Appln. No. US04/08520, both of which are incorporated herein by reference).
Anforward 01 (without the attB1 sequence)
5'- ATGACACCACGAACTGTCACAAGAGCTCTG-3' (SEQ ID NO:206)
Anforward 02 (without the attB1 sequence)
5'- AACGAACCGGCTCCTCCAGGATCTGCATCA-3' (SEQ ID NO:207)
Anreversed 01 (without the attB2 sequence)
5'- AGGGGAACTTCCAGAGTCAGTCGTAATCATTCTCAGGCC-3' (SEQ ID NO:208)
Anreversed 02 (without the attB2 sequence)
5'- GGGGAGGGTGAGTCCCATTGTGTAAGCTCCTGA-3' (SEQ ID NO:209)
pSLGAM-NT_FW
5'-ACCGCGACTGCTAGCAACGTCATCTCCAAGCGCGGCGGTGGCAACGAACCGGCTCCTCCAGGATCt-3' (SEQ ID NO:210)
pSLGAM-MAT_FW
5'-ACCGCGACTGCTAGCAACGTCATCTCCAAGCGCGGCGGTGGCAACGAACCGGCTCCTCCAGGATCT-3' (SEQ ID NO:211)
pSLGAM-MAT_RV
5'-CCGCCAGGTGTCGGTCACCTAAGGGGAACTTCCAGAGTCAGTCGTAATCATTCT-3' (SEQ ID NO:212)
PCR conditions were as follows: 5 µL of 10X PCR reaction buffer (Invitrogen); 20 mM MgSO4; 0.2 mM each of dATP, dTTP, dGTP, dCTP (final concentration); 1 µL of 10 ng/µL genomic DNA; 1 µL of High Fidelity Taq polymerase (Invitrogen) at 1 unit per µL; 0.2 µM of each primer (final concentration); 5 µL DMSO; and water to 50 µL. The PCR protocol was: 94°C for 5 min.; followed by 30 cycles of 94°C for 30 sec., 55°C for 30 sec., and 68°C for 3 min.; followed by 68°C for 10 min., and 15°C for 1 min.
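As a quick consistency check on the conditions above, the following minimal Python sketch adds up the programmed hold times and confirms that 5 µL of DMSO in a 50 µL reaction corresponds to the stated 10% DMSO. The function and variable names are illustrative only, and ramp times between temperatures are ignored, so the real run time on an instrument would be somewhat longer.

```python
# Thermocycler program as stated in the text: (label, temperature in C, hold in seconds).
INITIAL = [("initial denaturation", 94, 5 * 60)]
CYCLE = [("denature", 94, 30), ("anneal", 55, 30), ("extend", 68, 3 * 60)]
FINAL = [("final extension", 68, 10 * 60), ("cool", 15, 60)]
N_CYCLES = 30


def total_hold_seconds(initial, cycle, final, n_cycles):
    """Sum the programmed hold times; ramping between steps is not included."""
    hold = lambda steps: sum(seconds for _, _, seconds in steps)
    return hold(initial) + n_cycles * hold(cycle) + hold(final)


def percent_v_v(component_ul, total_ul):
    """Volume/volume percentage of one component in the reaction."""
    return 100.0 * component_ul / total_ul


total_s = total_hold_seconds(INITIAL, CYCLE, FINAL, N_CYCLES)
dmso_pct = percent_v_v(5, 50)  # 5 uL DMSO in a 50 uL reaction

print(f"programmed hold time: {total_s} s ({total_s / 60:.0f} min)")
print(f"DMSO: {dmso_pct:.0f}% (v/v)")
```

The hold times come to 8160 s (136 min), most of which is the 3 min extension step repeated over 30 cycles.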
After construction, the different plasmids and a helper plasmid (HM 396 pAPDI) were transformed into Aspergillus niger var. awamori (Delta Ap4 strain), using protoplast transformation methods known in the art. Stable transformants were screened based on morphology. Ten stable transformants for each construct were screened in shake flasks. After this period, a piece of agar containing the strain was transferred into flasks containing RoboSoy medium or a medium with the formula 12 g/l Tryptone, 8 g/l Soytone, 15 g/l ammonium sulfate, 12.1 g/l NaH2PO4.H2O, 2.19 g/l Na2HPO4, 5 ml 20% MgSO4.7H2O, 10 ml 10% Tween 80, 500 ml 30% maltose, 50 ml 1 M phosphate buffer pH 5.8, and 2 g/l uridine, to induce expression. The flasks were placed in a 28°C shaker. Four-day samples were run on NuPAGE 10% Bis-Tris protein gels and stained with Coomassie Blue. Five-day samples were assayed for protease activity using the AAPF method.
The amount of ASP expressed was found to be low, such that it could not be detected in the Coomassie-stained gel. However, colonies on skim milk agar plates showed clear halos that were significantly larger than those of the control strain. Thus, although the expression was low, these results clearly indicate that A. niger is suitable for the expression of ASP protease.
EXAMPLE 15 Generation of Asp Site-Saturated Mutagenesis (SSM) Libraries
In this Example, experiments conducted to develop site-saturation mutagenesis libraries of asp are described. Site-saturated Asp libraries each contained 96 B. subtilis (ΔaprE, ΔnprE, oppA, ΔspoIIE, degUHy32, ΔamyE::(xylR,pxylA-comK)) clones harboring the pHPLT-ASP-c1-2 expression vector. This vector, containing the Asp expression cassette composed of the synthetic DNA sequence (See, Example 10) encoding the Asp hybrid signal peptide and the Asp N-terminal pro and mature protein, was found to enable expression of the protein indicated below (the signal peptide and precursor protease) and secretion of the mature Asp protease.
DNA sequence encoding the synthetic Asp hybrid signal peptide:
ATGAGAAGCAAGAAGCGAACTGTCACAAGAGCTCTGGCTGTGGCAACAGCAGCTGCTACACTCTTGGCTGGGGGTATGGCAGCACAAGCT (SEQ ID NO:213)
The signal peptide and precursor protease are provided in the following sequence (SEQ ID NO:214) (in this sequence, bold indicates the mature protease, underlining indicates the N-terminal prosequence, and the standard font indicates the signal peptide):
MRSKKRTVTRALAVATAAATLLAGGMAAQANEPAPPGSASAPPRLAEKLDPDLLEAMERDL
GLDAEEAAATLAFQHDAAETGEALAEELDEDFAGTWVEDDVLYVATTDEDAVEEVEGEGA
TAVTVEHSLADLEAWKTVLDAALEGHDDVPTWYVDVPTNSVVVAVKAGAQDVAAGLVEGA
DVPSDAVTFVETDETPRTMFDVIGGNAYTIGGRSRCSIGFAVNGGFITAGHCGRTGATTAN
PTGTFAGSSFPGNDYAFVRTGAGVNLLAQVNNYSGGRVQVAGHTAAPVGSAVCRSGSTT
GWHCGTITALNSSVTYPEGTVRGLIRTTVCAEPGDSGGSLLAGNQAQGVTSGGSGNCRT
GGTTFFQPVNPILQAYGLRMITTDSGSSP (SEQ ID NO:214)
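The relationship between the signal peptide DNA (SEQ ID NO:213) and the precursor protein (SEQ ID NO:214) can be checked by translation. The sketch below is illustrative only: it assumes the standard genetic code, uses hypothetical helper names, and compares the translated 90-nucleotide signal sequence against the first residues of SEQ ID NO:214 as printed above.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# SEQ ID NO:213 as given in the text (synthetic Asp hybrid signal peptide).
SIGNAL_DNA = (
    "ATGAGAAGCAAGAAGCGAACTGTCACAAGAGCTCTGGCTGTGGCAACAGCAGCTGCTA"
    "CACTCTTGGCTGGGGGTATGGCAGCACAAGCT"
)

# First residues of SEQ ID NO:214 as given in the text.
PRECURSOR_START = "MRSKKRTVTRALAVATAAATLLAGGMAAQANEPAPPGSAS"


def translate(dna):
    """Translate an in-frame DNA string codon by codon."""
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


signal_peptide = translate(SIGNAL_DNA)
print(signal_peptide)
print(PRECURSOR_START.startswith(signal_peptide))
```

The translated signal peptide is 30 residues long and is immediately followed in SEQ ID NO:214 by the N-terminal prosequence beginning NEPAPP.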
Construction of the 189 asp site-saturated mutagenesis libraries was completed by using the pHPLT-ASP-c1-2 expression vector as template and the primers listed in Table 15-1. The mutagenesis primers used in these experiments all contain the triple DNA sequence code NNS (N = A, C, T or G and S = C or G) at the position that corresponds with the codon of the Asp mature sequence to be mutated, which guaranteed random incorporation of nucleotides at that position. Construction of each SSM library started with two PCR amplifications, one using the pHPLT-BglII-FW primer and a specific reverse mutagenesis primer, and the other using the pHPLT-BglII-RV primer and a specific forward mutagenesis primer (equal positions for the mutagenesis primers). Platinum Taq DNA polymerase High Fidelity (Cat. No. 11304-029; Invitrogen) was used for PCR amplification (0.2 µM primers, 20 up to 30 cycles) according to the protocol provided by Invitrogen. Briefly, 1 µL of amplified DNA fragment from each of the two specific PCR mixes, both targeting the same codon, was added to 48 µL of fresh PCR reaction solution together with primers pHPLT-BglII-FW and pHPLT-BglII-RV. This fusion PCR amplification (22 cycles) resulted in a linear pHPLT-ASP-c1-2 DNA fragment with a specific Asp mature codon randomly mutated and a unique BglII restriction site on both ends. Purification of this DNA fragment (Qiagen PCR purification kit, Cat. No. 28106), digestion with BglII, an additional purification step, and a ligation reaction (Invitrogen T4 DNA Ligase, Cat. No. 15224-025) generated circular and multimeric DNA that was subsequently transformed into B. subtilis (ΔaprE, ΔnprE, oppA, ΔspoIIE, degUHy32, ΔamyE::(xylR,pxylA-comK)). For each library, after overnight incubation at 37°C, 96 single colonies were picked from Heart Infusion agar plates with 20 mg/L neomycin and grown for 4 days at 37°C in MOPS media with 20 mg/ml neomycin and 1.25 g/L yeast extract (See, WO 03/062380, incorporated herein by reference, for the exact medium formulation used herein) for sequence analysis (BaseClear) and protease expression for screening purposes. The library numbers ranged from 1 up to 189, with each number representing the codon of the mature asp sequence that is randomly mutated. After selection, each library included a maximum of 20 Asp protease variants.
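The coding capacity of the NNS degenerate codon described above can be enumerated directly. The following sketch is illustrative and assumes the standard genetic code; it lists every codon that NNS can represent and the amino acids those codons encode, which is consistent with each library containing at most 20 distinct variants at the targeted position.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# NNS: N = A, C, G or T at positions 1 and 2; S = C or G at position 3.
nns_codons = ["".join(codon) for codon in product("ACGT", "ACGT", "CG")]

encoded = {CODON_TABLE[codon] for codon in nns_codons}
amino_acids = sorted(encoded - {"*"})
stop_codons = [codon for codon in nns_codons if CODON_TABLE[codon] == "*"]

print(len(nns_codons), "codons")
print(len(amino_acids), "amino acids:", "".join(amino_acids))
print("stop codons:", stop_codons)
```

NNS covers 32 codons that together encode all 20 amino acids while admitting only one stop codon (TAG), compared with three stop codons among the 64 codons of a fully random NNN position.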
Table 15-1. Primers Used to Generate Synthetic ASP SSM Libraries
pHPLT-BglII-FW GCAATCAGATCTTCCTTCAGGTTATGACC (SEQ ID NO:215)
pHPLT-BglII-RV GCATCGAAGATCTGATTGCTTAACTGCTTC (SEQ ID NO:216)
Forward mutagenesis primer    DNA sequence, 5' to 3'
asp1F GAAACGCCTAGAACGATGNNSGACGTAATTGGAGGCAAC (SEQ ID NO:217)
asp2F ACGCCTAGAACGATGTTCNNSGTAATTGGAGGCAACGCA (SEQ ID NO:218)
asp3F CCTAGAACGATGTTCGACNNSATTGGAGGCAACGCATAT (SEQ ID NO:219)
asp4F AGAACGATGTTCGACGTANNSGGAGGCAACGCATATACT (SEQ ID NO:220)
asp5F ACGATGTTCGACGTAATTNNSGGCAACGCATATACTATT (SEQ ID NO:221)
asp6F ATGTTCGACGTAATTGGANNSAACGCATATACTATTGGC (SEQ ID NO:222)
asp7F TTCGACGTAATTGGAGGCNNSGCATATACTATTGGCGGC (SEQ ID NO:223)
asp8F GACGTAATTGGAGGCAACNNSTATACTATTGGCGGCCGG (SEQ ID NO:224)
asp9F GTAATTGGAGGCAACGCANNSACTATTGGCGGCCGGTCT (SEQ ID NO:225)
asp10F ATTGGAGGCAACGCATATNNSATTGGCGGCCGGTCTAGA (SEQ ID NO:226)
asp11F GGAGGCAACGCATATACTNNSGGCGGCCGGTCTAGATGT (SEQ ID NO:227)
asp12F GGCAACGCATATACTATTNNSGGCCGGTCTAGATGTTCT (SEQ ID NO:228)
asp13F AACGCATATACTATTGGCNNSCGGTCTAGATGTTCTATC (SEQ ID NO:229)
asp14F GCATATACTATTGGCGGCNNSTCTAGATGTTCTATCGGA (SEQ ID NO:230)
asp15F TATACTATTGGCGGCCGGNNSAGATGTTCTATCGGATTC (SEQ ID NO:231)
asp16F ACTATTGGCGGCCGGTCTNNSTGTTCTATCGGATTCGCA (SEQ ID NO:232)
asp17F ATTGGCGGCCGGTCTAGANNSTCTATCGGATTCGCAGTA (SEQ ID NO:233)
asp18F GGCGGCCGGTCTAGATGTNNSATCGGATTCGCAGTAAAC (SEQ ID NO:234)
asp19F GGCCGGTCTAGATGTTCTNNSGGATTCGCAGTAAACGGT (SEQ ID NO:235)
asp20F CGGTCTAGATGTTCTATCNNSTTCGCAGTAAACGGTGGC (SEQ ID NO:236)
asp21F TCTAGATGTTCTATCGGANNSGCAGTAAACGGTGGCTTC (SEQ ID NO:237)
asp22F AGATGTTCTATCGGATTCNNSGTAAACGGTGGCTTCATT (SEQ ID NO:238)
asp23F TGTTCTATCGGATTCGCANNSAACGGTGGCTTCATTACT (SEQ ID NO:239)
asp24F TCTATCGGATTCGCAGTANNSGGTGGCTTCATTACTGCC (SEQ ID NO:240)
asp25F ATCGGATTCGCAGTAAACNNSGGCTTCATTACTGCCGGT (SEQ ID NO:241)
asp26F GGATTCGCAGTAAACGGTNNSTTCATTACTGCCGGTCAC (SEQ ID NO:242)
asp27F TTCGCAGTAAACGGTGGCNNSATTACTGCCGGTCACTGC (SEQ ID NO:243)
asp28F GCAGTAAACGGTGGCTTCNNSACTGCCGGTCACTGCGGA (SEQ ID NO:244)
asp29F GTAAACGGTGGCTTCATTNNSGCCGGTCACTGCGGAAGA (SEQ ID NO:245)
asp30F AACGGTGGCTTCATTACTNNSGGTCACTGCGGAAGAACA (SEQ ID NO:246)
asp31F GGTGGCTTCATTACTGCCNNSCACTGCGGAAGAACAGGA (SEQ ID NO:247)
asp32F GGCTTCATTACTGCCGGTNNSTGCGGAAGAACAGGAGCC (SEQ ID NO:248)
asp33F TTCATTACTGCCGGTCACNNSGGAAGAACAGGAGCCACT (SEQ ID NO:249)
asp34F ATTACTGCCGGTCACTGCNNSAGAACAGGAGCCACTACT (SEQ ID NO:250)
asp35F ACTGCCGGTCACTGCGGANNSACAGGAGCCACTACTGCC (SEQ ID NO:251)
asp36F GCCGGTCACTGCGGAAGANNSGGAGCCACTACTGCCAAT (SEQ ID NO:252)
asp37F GGTCACTGCGGAAGAACANNSGCCACTACTGCCAATCCG (SEQ ID NO:253)
asp38F CACTGCGGAAGAACAGGANNSACTACTGCCAATCCGACT (SEQ ID NO:254)
asp39F TGCGGAAGAACAGGAGCCNNSACTGCCAATCCGACTGGC (SEQ ID NO:255)
asp40F GGAAGAACAGGAGCCACTNNSGCCAATCCGACTGGCACA (SEQ ID NO:256)
asp41F AGAACAGGAGCCACTACTNNSAATCCGACTGGCACATTT (SEQ ID NO:257)
asp42F ACAGGAGCCACTACTGCCNNSCCGACTGGCACATTTGCA (SEQ ID NO:258)
asp43F GGAGCCACTACTGCCAATNNSACTGGCACATTTGCAGGT (SEQ ID NO:259)
asp44F GCCACTACTGCCAATCCGNNSGGCACATTTGCAGGTAGC (SEQ ID NO:260)
asp45F ACTACTGCCAATCCGACTNNSACATTTGCAGGTAGCTCG (SEQ ID NO:261)
asp46F ACTGCCAATCCGACTGGCNNSTTTGCAGGTAGCTCGTTT (SEQ ID NO:262)
asp47F GCCAATCCGACTGGCACANNSGCAGGTAGCTCGTTTCCG (SEQ ID NO:263)
asp48F AATCCGACTGGCACATTTNNSGGTAGCTCGTTTCCGGGA (SEQ ID NO:264)
asp49F CCGACTGGCACATTTGCANNSAGCTCGTTTCCGGGAAAT (SEQ ID NO:265)
asp50F ACTGGCACATTTGCAGGTNNSTCGTTTCCGGGAAATGAT (SEQ ID NO:266)
asp51F GGCACATTTGCAGGTAGCNNSTTTCCGGGAAATGATTAT (SEQ ID NO:267)
asp52F ACATTTGCAGGTAGCTCGNNSCCGGGAAATGATTATGCA (SEQ ID NO:268)
asp53F TTTGCAGGTAGCTCGTTTNNSGGAAATGATTATGCATTC (SEQ ID NO:269)
asp54F GCAGGTAGCTCGTTTCCGNNSAATGATTATGCATTCGTC (SEQ ID NO:270)
asp55F GGTAGCTCGTTTCCGGGANNSGATTATGCATTCGTCCGA (SEQ ID NO:271)
asp56F AGCTCGTTTCCGGGAAATNNSTATGCATTCGTCCGAACA (SEQ ID NO:272)
asp57F TCGTTTCCGGGAAATGATNNSGCATTCGTCCGAACAGGG (SEQ ID NO:273)
asp58F TTTCCGGGAAATGATTATNNSTTCGTCCGAACAGGGGCA (SEQ ID NO:274)
asp59F CCGGGAAATGATTATGCANNSGTCCGAACAGGGGCAGGA (SEQ ID NO:275)
asp60F GGAAATGATTATGCATTCNNSCGAACAGGGGCAGGAGTA (SEQ ID NO:276)
asp61F AATGATTATGCATTCGTCNNSACAGGGGCAGGAGTAAAT (SEQ ID NO:277)
asp62F GATTATGCATTCGTCCGANNSGGGGCAGGAGTAAATTTG (SEQ ID NO:278)
asp63F TATGCATTCGTCCGAACANNSGCAGGAGTAAATTTGCTT (SEQ ID NO:279)
asp64F GCATTCGTCCGAACAGGGNNSGGAGTAAATTTGCTTGCC (SEQ ID NO:280)
asp65F TTCGTCCGAACAGGGGCANNSGTAAATTTGCTTGCCCAA (SEQ ID NO:281)
asp66F GTCCGAACAGGGGCAGGANNSAATTTGCTTGCCCAAGTC (SEQ ID NO:282)
asp67F CGAACAGGGGCAGGAGTANNSTTGCTTGCCCAAGTCAAT (SEQ ID NO:283)
asp68F ACAGGGGCAGGAGTAAATNNSCTTGCCCAAGTCAATAAC (SEQ ID NO:284)
asp69F GGGGCAGGAGTAAATTTGNNSGCCCAAGTCAATAACTAC (SEQ ID NO:285)
asp70F GCAGGAGTAAATTTGCTTNNSCAAGTCAATAACTACTCG (SEQ ID NO:286)
asp71F GGAGTAAATTTGCTTGCCNNSGTCAATAACTACTCGGGC (SEQ ID NO:287)
asp72F GTAAATTTGCTTGCCCAANNSAATAACTACTCGGGCGGC (SEQ ID NO:288)
asp73F AATTTGCTTGCCCAAGTCNNSAACTACTCGGGCGGCAGA (SEQ ID NO:289)
asp74F TTGCTTGCCCAAGTCAATNNSTACTCGGGCGGCAGAGTC (SEQ ID NO:290)
asp75F CTTGCCCAAGTCAATAACNNSTCGGGCGGCAGAGTCCAA (SEQ ID NO:291)
asp76F GCCCAAGTCAATAACTACNNSGGCGGCAGAGTCCAAGTA (SEQ ID NO:292)
asp77F CAAGTCAATAACTACTCGNNSGGCAGAGTCCAAGTAGCA (SEQ ID NO:293)
asp78F GTCAATAACTACTCGGGCNNSAGAGTCCAAGTAGCAGGA (SEQ ID NO:294)
asp79F AATAACTACTCGGGCGGCNNSGTCCAAGTAGCAGGACAT (SEQ ID NO:295)
asp80F AACTACTCGGGCGGCAGANNSCAAGTAGCAGGACATACG (SEQ ID NO:296)
asp81F TACTCGGGCGGCAGAGTCNNSGTAGCAGGACATACGGCC (SEQ ID NO:297)
asp82F TCGGGCGGCAGAGTCCAANNSGCAGGACATACGGCCGCA (SEQ ID NO:298)
asp83F GGCGGCAGAGTCCAAGTANNSGGACATACGGCCGCACCA (SEQ ID NO:299)
asp84F GGCAGAGTCCAAGTAGCANNSCATACGGCCGCACCAGTT (SEQ ID NO:300)
asp85F AGAGTCCAAGTAGCAGGANNSACGGCCGCACCAGTTGGA (SEQ ID NO:301)
asp86F GTCCAAGTAGCAGGACATNNSGCCGCACCAGTTGGATCT (SEQ ID NO:302)
asp87F CAAGTAGCAGGACATACGNNSGCACCAGTTGGATCTGCT (SEQ ID NO:303)
asp88F GTAGCAGGACATACGGCCNNSCCAGTTGGATCTGCTGTA (SEQ ID NO:304)
asp89F GCAGGACATACGGCCGCANNSGTTGGATCTGCTGTATGC (SEQ ID NO:305)
asp90F GGACATACGGCCGCACCANNSGGATCTGCTGTATGCCGC (SEQ ID NO:306)
asp91F CATACGGCCGCACCAGTTNNSTCTGCTGTATGCCGCTCA (SEQ ID NO:307)
asp92F ACGGCCGCACCAGTTGGANNSGCTGTATGCCGCTCAGGT (SEQ ID NO:308)
asp93F GCCGCACCAGTTGGATCTNNSGTATGCCGCTCAGGTAGC (SEQ ID NO:309)
asp94F GCACCAGTTGGATCTGCTNNSTGCCGCTCAGGTAGCACT (SEQ ID NO:310)
asp95F CCAGTTGGATCTGCTGTANNSCGCTCAGGTAGCACTACA (SEQ ID NO:311)
asp96F GTTGGATCTGCTGTATGCNNSTCAGGTAGCACTACAGGT (SEQ ID NO:312)
asp97F GGATCTGCTGTATGCCGCNNSGGTAGCACTACAGGTTGG (SEQ ID NO:313)
asp98F TCTGCTGTATGCCGCTCANNSAGCACTACAGGTTGGCAT (SEQ ID NO:314)
asp99F GCTGTATGCCGCTCAGGTNNSACTACAGGTTGGCATTGC (SEQ ID NO:315)
asp100F GTATGCCGCTCAGGTAGCNNSACAGGTTGGCATTGCGGA (SEQ ID NO:316)
asp101F TGCCGCTCAGGTAGCACTNNSGGTTGGCATTGCGGAACT (SEQ ID NO:317)
asp102F CGCTCAGGTAGCACTACANNSTGGCATTGCGGAACTATC (SEQ ID NO:318)
asp103F TCAGGTAGCACTACAGGTNNSCATTGCGGAACTATCACG (SEQ ID NO:319)
asp104F GGTAGCACTACAGGTTGGNNSTGCGGAACTATCACGGCG (SEQ ID NO:320)
asp105F AGCACTACAGGTTGGCATNNSGGAACTATCACGGCGCTG (SEQ ID NO:321)
asp106F ACTACAGGTTGGCATTGCNNSACTATCACGGCGCTGAAT (SEQ ID NO:322)
asp107F ACAGGTTGGCATTGCGGANNSATCACGGCGCTGAATTCG (SEQ ID NO:323)
asp108F GGTTGGCATTGCGGAACTNNSACGGCGCTGAATTCGTCT (SEQ ID NO:324)
asp109F TGGCATTGCGGAACTATCNNSGCGCTGAATTCGTCTGTC (SEQ ID NO:325)
asp110F CATTGCGGAACTATCACGNNSCTGAATTCGTCTGTCACG (SEQ ID NO:326)
asp111F TGCGGAACTATCACGGCGNNSAATTCGTCTGTCACGTAT (SEQ ID NO:327)
asp112F GGAACTATCACGGCGCTGNNSTCGTCTGTCACGTATCCA (SEQ ID NO:328)
asp113F ACTATCACGGCGCTGAATNNSTCTGTCACGTATCCAGAG (SEQ ID NO:329)
asp114F ATCACGGCGCTGAATTCGNNSGTCACGTATCCAGAGGGA (SEQ ID NO:330)
asp115F ACGGCGCTGAATTCGTCTNNSACGTATCCAGAGGGAACA (SEQ ID NO:331)
asp116F GCGCTGAATTCGTCTGTCNNSTATCCAGAGGGAACAGTC (SEQ ID NO:332)
asp117F CTGAATTCGTCTGTCACGNNSCCAGAGGGAACAGTCCGA (SEQ ID NO:333)
asp118F AATTCGTCTGTCACGTATNNSGAGGGAACAGTCCGAGGA (SEQ ID NO:334)
asp119F TCGTCTGTCACGTATCCANNSGGAACAGTCCGAGGACTT (SEQ ID NO:335)
asp120F TCTGTCACGTATCCAGAGNNSACAGTCCGAGGACTTATC (SEQ ID NO:336)
asp121F GTCACGTATCCAGAGGGANNSGTCCGAGGACTTATCCGC (SEQ ID NO:337)
asp122F ACGTATCCAGAGGGAACANNSCGAGGACTTATCCGCACG (SEQ ID NO:338)
asp123F TATCCAGAGGGAACAGTCNNSGGACTTATCCGCACGACG (SEQ ID NO:339)
asp124F CCAGAGGGAACAGTCCGANNSCTTATCCGCACGACGGTT (SEQ ID NO:340)
asp125F GAGGGAACAGTCCGAGGANNSATCCGCACGACGGTTTGT (SEQ ID NO:341)
asp126F GGAACAGTCCGAGGACTTNNSCGCACGACGGTTTGTGCC (SEQ ID NO:342)
asp127F ACAGTCCGAGGACTTATCNNSACGACGGTTTGTGCCGAA (SEQ ID NO:343)
asp128F GTCCGAGGACTTATCCGCNNSACGGTTTGTGCCGAACCA (SEQ ID NO:344)
asp129F CGAGGACTTATCCGCACGNNSGTTTGTGCCGAACCAGGT (SEQ ID NO:345)
asp130F GGACTTATCCGCACGACGNNSTGTGCCGAACCAGGTGAT (SEQ ID NO:346)
asp131F CTTATCCGCACGACGGTTNNSGCCGAACCAGGTGATAGC (SEQ ID NO:347)
asp132F ATCCGCACGACGGTTTGTNNSGAACCAGGTGATAGCGGA (SEQ ID NO:348)
asp133F CGCACGACGGTTTGTGCCNNSCCAGGTGATAGCGGAGGT (SEQ ID NO:349)
asp134F ACGACGGTTTGTGCCGAANNSGGTGATAGCGGAGGTAGC (SEQ ID NO:350)
asp135F ACGGTTTGTGCCGAACCANNSGATAGCGGAGGTAGCCTT (SEQ ID NO:351)
asp136F GTTTGTGCCGAACCAGGTNNSAGCGGAGGTAGCCTTTTA (SEQ ID NO:352)
asp137F TGTGCCGAACCAGGTGATNNSGGAGGTAGCCTTTTAGCG (SEQ ID NO:353)
asp138F GCCGAACCAGGTGATAGCNNSGGTAGCCTTTTAGCGGGA (SEQ ID NO:354)
asp139F GAACCAGGTGATAGCGGANNSAGCCTTTTAGCGGGAAAT (SEQ ID NO:355)
asp140F CCAGGTGATAGCGGAGGTNNSCTTTTAGCGGGAAATCAA (SEQ ID NO:356)
asp141F GGTGATAGCGGAGGTAGCNNSTTAGCGGGAAATCAAGCC (SEQ ID NO:357)
asp142F GATAGCGGAGGTAGCCTTNNSGCGGGAAATCAAGCCCAA (SEQ ID NO:358)
asp143F AGCGGAGGTAGCCTTTTANNSGGAAATCAAGCCCAAGGT (SEQ ID NO:359)
asp144F GGAGGTAGCCTTTTAGCGNNSAATCAAGCCCAAGGTGTC (SEQ ID NO:360)
asp145F GGTAGCCTTTTAGCGGGANNSCAAGCCCAAGGTGTCACG (SEQ ID NO:361)
asp146F AGCCTTTTAGCGGGAAATNNSGCCCAAGGTGTCACGTCA (SEQ ID NO:362)
asp147F CTTTTAGCGGGAAATCAANNSCAAGGTGTCACGTCAGGT (SEQ ID NO:363)
asp148F TTAGCGGGAAATCAAGCCNNSGGTGTCACGTCAGGTGGT (SEQ ID NO:364)
asp149F GCGGGAAATCAAGCCCAANNSGTCACGTCAGGTGGTTCT (SEQ ID NO:365)
asp150F GGAAATCAAGCCCAAGGTNNSACGTCAGGTGGTTCTGGA (SEQ ID NO:366)
asp151F AATCAAGCCCAAGGTGTCNNSTCAGGTGGTTCTGGAAAT (SEQ ID NO:367)
asp152F CAAGCCCAAGGTGTCACGNNSGGTGGTTCTGGAAATTGT (SEQ ID NO:368)
asp153F GCCCAAGGTGTCACGTCANNSGGTTCTGGAAATTGTCGG (SEQ ID NO:369)
asp154F CAAGGTGTCACGTCAGGTNNSTCTGGAAATTGTCGGACG (SEQ ID NO:370)
asp155F GGTGTCACGTCAGGTGGTNNSGGAAATTGTCGGACGGGG (SEQ ID NO:371)
asp156F GTCACGTCAGGTGGTTCTNNSAATTGTCGGACGGGGGGA (SEQ ID NO:372)
asp157F ACGTCAGGTGGTTCTGGANNSTGTCGGACGGGGGGAACA (SEQ ID NO:373)
asp158F TCAGGTGGTTCTGGAAATNNSCGGACGGGGGGAACAACA (SEQ ID NO:374)
asp159F GGTGGTTCTGGAAATTGTNNSACGGGGGGAACAACATTC (SEQ ID NO:375)
asp160F GGTTCTGGAAATTGTCGGNNSGGGGGAACAACATTCTTT (SEQ ID NO:376)
asp161F TCTGGAAATTGTCGGACGNNSGGAACAACATTCTTTCAA (SEQ ID NO:377)
asp162F GGAAATTGTCGGACGGGGNNSACAACATTCTTTCAACCA (SEQ ID NO:378)
asp163F AATTGTCGGACGGGGGGANNSACATTCTTTCAACCAGTC (SEQ ID NO:379)
asp164F TGTCGGACGGGGGGAACANNSTTCTTTCAACCAGTCAAC (SEQ ID NO:380)
asp165F CGGACGGGGGGAACAACANNSTTTCAACCAGTCAACCCG (SEQ ID NO:381)
asp166F ACGGGGGGAACAACATTCNNSCAACCAGTCAACCCGATT (SEQ ID NO:382)
asp167F GGGGGAACAACATTCTTTNNSCCAGTCAACCCGATTTTG (SEQ ID NO:383)
asp168F GGAACAACATTCTTTCAANNSGTCAACCCGATTTTGCAG (SEQ ID NO:384)
asp169F ACAACATTCTTTCAACCANNSAACCCGATTTTGCAGGCT (SEQ ID NO:385)
asp170F ACATTCTTTCAACCAGTCNNSCCGATTTTGCAGGCTTAC (SEQ ID NO:386)
asp171F TTCTTTCAACCAGTCAACNNSATTTTGCAGGCTTACGGC (SEQ ID NO:387)
asp172F TTTCAACCAGTCAACCCGNNSTTGCAGGCTTACGGCCTG (SEQ ID NO:388)
asp173F CAACCAGTCAACCCGATTNNSCAGGCTTACGGCCTGAGA (SEQ ID NO:389)
asp174F CCAGTCAACCCGATTTTGNNSGCTTACGGCCTGAGAATG (SEQ ID NO:390)
asp175F GTCAACCCGATTTTGCAGNNSTACGGCCTGAGAATGATT (SEQ ID NO:391)
asp176F AACCCGATTTTGCAGGCTNNSGGCCTGAGAATGATTACG (SEQ ID NO:392)
asp177F CCGATTTTGCAGGCTTACNNSCTGAGAATGATTACGACT (SEQ ID NO:393)
asp178F ATTTTGCAGGCTTACGGCNNSAGAATGATTACGACTGAC (SEQ ID NO:394)
asp179F TTGCAGGCTTACGGCCTGNNSATGATTACGACTGACTCT (SEQ ID NO:395)
asp180F CAGGCTTACGGCCTGAGANNSATTACGACTGACTCTGGA (SEQ ID NO:396)
asp181F GCTTACGGCCTGAGAATGNNSACGACTGACTCTGGAAGT (SEQ ID NO:397)
asp182F TACGGCCTGAGAATGATTNNSACTGACTCTGGAAGTTCC (SEQ ID NO:398)
asp183F GGCCTGAGAATGATTACGNNSGACTCTGGAAGTTCCCCT (SEQ ID NO:399)
asp184F CTGAGAATGATTACGACTNNSTCTGGAAGTTCCCCTTAA (SEQ ID NO:400)
asp185F AGAATGATTACGACTGACNNSGGAAGTTCCCCTTAACCC (SEQ ID NO:401)
asp186F ATGATTACGACTGACTCTNNSAGTTCCCCTTAACCCAAC (SEQ ID NO:402)
asp187F ATTACGACTGACTCTGGANNSTCCCCTTAACCCAACAGA (SEQ ID NO:403)
asp188F ACGACTGACTCTGGAAGTNNSCCTTAACCCAACAGAGGA (SEQ ID NO:404)
asp189F ACTGACTCTGGAAGTTCCNNSTAACCCAACAGAGGACGG (SEQ ID NO:405)
Reverse mutagenesis primer    DNA sequence, 5' to 3'
asp1R GTTGCCTCCAATTACGTCSNNCATCGTTCTAGGCGTTTC (SEQ ID NO:406)
asp2R TGCGTTGCCTCCAATTACSNNGAACATCGTTCTAGGCGT (SEQ ID NO:407)
asp3R ATATGCGTTGCCTCCAATSNNGTCGAACATCGTTCTAGG (SEQ ID NO:408)
asp4R AGTATATGCGTTGCCTCCSNNTACGTCGAACATCGTTCT (SEQ ID NO:409)
asp5R AATAGTATATGCGTTGCCSNNAATTACGTCGAACATCGT (SEQ ID NO:410)
asp6R GCCAATAGTATATGCGTTSNNTCCAATTACGTCGAACAT (SEQ ID NO:411)
asp7R GCCGCCAATAGTATATGCSNNGCCTCCAATTACGTCGAA (SEQ ID NO:412)
asp8R CCGGCCGCCAATAGTATASNNGTTGCCTCCAATTACGTC (SEQ ID NO:413)
asp9R AGACCGGCCGCCAATAGTSNNTGCGTTGCCTCCAATTAC (SEQ ID NO:414)
asp10R TCTAGACCGGCCGCCAATSNNATATGCGTTGCCTCCAAT (SEQ ID NO:415)
asp11R ACATCTAGACCGGCCGCCSNNAGTATATGCGTTGCCTCC (SEQ ID NO:416)
asp12R AGAACATCTAGACCGGCCSNNAATAGTATATGCGTTGCC (SEQ ID NO:417)
asp13R GATAGAACATCTAGACCGSNNGCCAATAGTATATGCGTT (SEQ ID NO:418)
asp14R TCCGATAGAACATCTAGASNNGCCGCCAATAGTATATGC (SEQ ID NO:419)
asp15R GAATCCGATAGAACATCTSNNCCGGCCGCCAATAGTATA (SEQ ID NO:420)
asp16R TGCGAATCCGATAGAACASNNAGACCGGCCGCCAATAGT (SEQ ID NO:421)
asp17R TACTGCGAATCCGATAGASNNTCTAGACCGGCCGCCAAT (SEQ ID NO:422)
asp18R GTTTACTGCGAATCCGATSNNACATCTAGACCGGCCGCC (SEQ ID NO:423)
asp19R ACCGTTTACTGCGAATCCSNNAGAACATCTAGACCGGCC (SEQ ID NO:424)
asp20R GCCACCGTTTACTGCGAASNNGATAGAACATCTAGACCG (SEQ ID NO:425)
asp21R GAAGCCACCGTTTACTGCSNNTCCGATAGAACATCTAGA (SEQ ID NO:426)
asp22R AATGAAGCCACCGTTTACSNNGAATCCGATAGAACATCT (SEQ ID NO:427)
asp23R AGTAATGAAGCCACCGTTSNNTGCGAATCCGATAGAACA (SEQ ID NO:428)
asp24R GGCAGTAATGAAGCCACCSNNTACTGCGAATCCGATAGA (SEQ ID NO:429)
asp25R ACCGGCAGTAATGAAGCCSNNGTTTACTGCGAATCCGAT (SEQ ID NO:430)
asp26R GTGACCGGCAGTAATGAASNNACCGTTTACTGCGAATCC (SEQ ID NO:431)
asp27R GCAGTGACCGGCAGTAATSNNGCCACCGTTTACTGCGAA (SEQ ID NO:432)
asp28R TCCGCAGTGACCGGCAGTSNNGAAGCCACCGTTTACTGC (SEQ ID NO:433)
asp29R TCTTCCGCAGTGACCGGCSNNAATGAAGCCACCGTTTAC (SEQ ID NO:434)
asp30R TGTTCTTCCGCAGTGACCSNNAGTAATGAAGCCACCGTT (SEQ ID NO:435)
asp31R TCCTGTTCTTCCGCAGTGSNNGGCAGTAATGAAGCCACC (SEQ ID NO:436)
asp32R GGCTCCTGTTCTTCCGCASNNACCGGCAGTAATGAAGCC (SEQ ID NO:437)
asp33R AGTGGCTCCTGTTCTTCCSNNGTGACCGGCAGTAATGAA (SEQ ID NO:438)
asp34R AGTAGTGGCTCCTGTTCTSNNGCAGTGACCGGCAGTAAT (SEQ ID NO:439)
asp35R GGCAGTAGTGGCTCCTGTSNNTCCGCAGTGACCGGCAGT (SEQ ID NO:440)
asp36R ATTGGCAGTAGTGGCTCCSNNTCTTCCGCAGTGACCGGC (SEQ ID NO:441)
asp37R CGGATTGGCAGTAGTGGCSNNTGTTCTTCCGCAGTGACC (SEQ ID NO:442)
asp38R AGTCGGATTGGCAGTAGTSNNTCCTGTTCTTCCGCAGTG (SEQ ID NO:443)
asp39R GCCAGTCGGATTGGCAGTSNNGGCTCCTGTTCTTCCGCA (SEQ ID NO:444)
asp40R TGTGCCAGTCGGATTGGCSNNAGTGGCTCCTGTTCTTCC (SEQ ID NO:445)
asp41R AAATGTGCCAGTCGGATTSNNAGTAGTGGCTCCTGTTCT (SEQ ID NO:446)
asp42R TGCAAATGTGCCAGTCGGSNNGGCAGTAGTGGCTCCTGT (SEQ ID NO:447)
asp43R ACCTGCAAATGTGCCAGTSNNATTGGCAGTAGTGGCTCC (SEQ ID NO:448)
asp44R GCTACCTGCAAATGTGCCSNNCGGATTGGCAGTAGTGGC (SEQ ID NO:449)
asp45R CGAGCTACCTGCAAATGTSNNAGTCGGATTGGCAGTAGT (SEQ ID NO:450)
asp46R AAACGAGCTACCTGCAAASNNGCCAGTCGGATTGGCAGT (SEQ ID NO:451)
asp47R CGGAAACGAGCTACCTGCSNNTGTGCCAGTCGGATTGGC (SEQ ID NO:452)
asp48R TCCCGGAAACGAGCTACCSNNAAATGTGCCAGTCGGATT (SEQ ID NO:453)
asp49R ATTTCCCGGAAACGAGCTSNNTGCAAATGTGCCAGTCGG (SEQ ID NO:454)
asp50R ATCATTTCCCGGAAACGASNNACCTGCAAATGTGCCAGT (SEQ ID NO:455)
asp51R ATAATCATTTCCCGGAAASNNGCTACCTGCAAATGTGCC (SEQ ID NO:456)
asp52R TGCATAATCATTTCCCGGSNNCGAGCTACCTGCAAATGT (SEQ ID NO:457)
asp53R GAATGCATAATCATTTCCSNNAAACGAGCTACCTGCAAA (SEQ ID NO:458)
asp54R GACGAATGCATAATCATTSNNCGGAAACGAGCTACCTGC (SEQ ID NO:459)
asp55R TCGGACGAATGCATAATCSNNTCCCGGAAACGAGCTACC (SEQ ID NO:460)
asp56R TGTTCGGACGAATGCATASNNATTTCCCGGAAACGAGCT (SEQ ID NO:461)
asp57R CCCTGTTCGGACGAATGCSNNATCATTTCCCGGAAACGA (SEQ ID NO:462)
asp58R TGCCCCTGTTCGGACGAASNNATAATCATTTCCCGGAAA (SEQ ID NO:463)
asp59R TCCTGCCCCTGTTCGGACSNNTGCATAATCATTTCCCGG (SEQ ID NO:464)
asp60R TACTCCTGCCCCTGTTCGSNNGAATGCATAATCATTTCC (SEQ ID NO:465)
asp61R ATTTACTCCTGCCCCTGTSNNGACGAATGCATAATCATT (SEQ ID NO:466)
asp62R CAAATTTACTCCTGCCCCSNNTCGGACGAATGCATAATC (SEQ ID NO:467)
asp63R AAGCAAATTTACTCCTGCSNNTGTTCGGACGAATGCATA (SEQ ID NO:468)
asp64R GGCAAGCAAATTTACTCCSNNCCCTGTTCGGACGAATGC (SEQ ID NO:469)
asp65R TTGGGCAAGCAAATTTACSNNTGCCCCTGTTCGGACGAA (SEQ ID NO:470)
asp66R GACTTGGGCAAGCAAATTSNNTCCTGCCCCTGTTCGGAC (SEQ ID NO:471)
asp67R ATTGACTTGGGCAAGCAASNNTACTCCTGCCCCTGTTCG (SEQ ID NO:472)
asp68R GTTATTGACTTGGGCAAGSNNATTTACTCCTGCCCCTGT (SEQ ID NO:473)
asp69R GTAGTTATTGACTTGGGCSNNCAAATTTACTCCTGCCCC (SEQ ID NO:474)
asp70R CGAGTAGTTATTGACTTGSNNAAGCAAATTTACTCCTGC (SEQ ID NO:475)
asp71R GCCCGAGTAGTTATTGACSNNGGCAAGCAAATTTACTCC (SEQ ID NO:476)
asp72R GCCGCCCGAGTAGTTATTSNNTTGGGCAAGCAAATTTAC (SEQ ID NO:477)
asp73R TCTGCCGCCCGAGTAGTTSNNGACTTGGGCAAGCAAATT (SEQ ID NO:478)
asp74R GACTCTGCCGCCCGAGTASNNATTGACTTGGGCAAGCAA (SEQ ID NO:479)
asp75R TTGGACTCTGCCGCCCGASNNGTTATTGACTTGGGCAAG (SEQ ID NO:480)
asp76R TACTTGGACTCTGCCGCCSNNGTAGTTATTGACTTGGGC (SEQ ID NO:481)
asp77R TGCTACTTGGACTCTGCCSNNCGAGTAGTTATTGACTTG (SEQ ID NO:482)
asp78R TCCTGCTACTTGGACTCTSNNGCCCGAGTAGTTATTGAC (SEQ ID NO:483)
asp79R ATGTCCTGCTACTTGGACSNNGCCGCCCGAGTAGTTATT (SEQ ID NO:484)
asp80R CGTATGTCCTGCTACTTGSNNTCTGCCGCCCGAGTAGTT (SEQ ID NO:485)
asp81R GGCCGTATGTCCTGCTACSNNGACTCTGCCGCCCGAGTA (SEQ ID NO:486)
asp82R TGCGGCCGTATGTCCTGCSNNTTGGACTCTGCCGCCCGA (SEQ ID NO:487)
asp83R TGGTGCGGCCGTATGTCCSNNTACTTGGACTCTGCCGCC (SEQ ID NO:488)
asp84R AACTGGTGCGGCCGTATGSNNTGCTACTTGGACTCTGCC (SEQ ID NO:489)
asp85R TCCAACTGGTGCGGCCGTSNNTCCTGCTACTTGGACTCT (SEQ ID NO:490)
asp86R AGATCCAACTGGTGCGGCSNNATGTCCTGCTACTTGGAC (SEQ ID NO:491)
asp87R AGCAGATCCAACTGGTGCSNNCGTATGTCCTGCTACTTG (SEQ ID NO:492)
asp88R TACAGCAGATCCAACTGGSNNGGCCGTATGTCCTGCTAC (SEQ ID NO:493)
asp89R GCATACAGCAGATCCAACSNNTGCGGCCGTATGTCCTGC (SEQ ID NO:494)
asp90R GCGGCATACAGCAGATCCSNNTGGTGCGGCCGTATGTCC (SEQ ID NO:495)
asp91R TGAGCGGCATACAGCAGASNNAACTGGTGCGGCCGTATG (SEQ ID NO:496)
asp92R ACCTGAGCGGCATACAGCSNNTCCAACTGGTGCGGCCGT (SEQ ID NO:497)
asp93R GCTACCTGAGCGGCATACSNNAGATCCAACTGGTGCGGC (SEQ ID NO:498)
asp94R AGTGCTACCTGAGCGGCASNNAGCAGATCCAACTGGTGC (SEQ ID NO:499)
asp95R TGTAGTGCTACCTGAGCGSNNTACAGCAGATCCAACTGG (SEQ ID NO:500)
asp96R ACCTGTAGTGCTACCTGASNNGCATACAGCAGATCCAAC (SEQ ID NO:501)
asp97R CCAACCTGTAGTGCTACCSNNGCGGCATACAGCAGATCC (SEQ ID NO:502)
asp98R ATGCCAACCTGTAGTGCTSNNTGAGCGGCATACAGCAGA (SEQ ID NO:503)
asp99R GCAATGCCAACCTGTAGTSNNACCTGAGCGGCATACAGC (SEQ ID NO:504)
asp100R TCCGCAATGCCAACCTGTSNNGCTACCTGAGCGGCATAC (SEQ ID NO:505)
asp101R AGTTCCGCAATGCCAACCSNNAGTGCTACCTGAGCGGCA (SEQ ID NO:506)
asp102R GATAGTTCCGCAATGCCASNNTGTAGTGCTACCTGAGCG (SEQ ID NO:507)
asp103R CGTGATAGTTCCGCAATGSNNACCTGTAGTGCTACCTGA (SEQ ID NO:508)
asp104R CGCCGTGATAGTTCCGCASNNCCAACCTGTAGTGCTACC (SEQ ID NO:509)
asp105R CAGCGCCGTGATAGTTCCSNNATGCCAACCTGTAGTGCT (SEQ ID NO:510)
asp106R ATTCAGCGCCGTGATAGTSNNGCAATGCCAACCTGTAGT (SEQ ID NO:511)
asp107R CGAATTCAGCGCCGTGATSNNTCCGCAATGCCAACCTGT (SEQ ID NO:512)
asp108R AGACGAATTCAGCGCCGTSNNAGTTCCGCAATGCCAACC (SEQ ID NO:513)
asp109R GACAGACGAATTCAGCGCSNNGATAGTTCCGCAATGCCA (SEQ ID NO:514)
asp110R CGTGACAGACGAATTCAGSNNCGTGATAGTTCCGCAATG (SEQ ID NO:515)
asp111R ATACGTGACAGACGAATTSNNCGCCGTGATAGTTCCGCA (SEQ ID NO:516)
asp112R TGGATACGTGACAGACGASNNCAGCGCCGTGATAGTTCC (SEQ ID NO:517)
asp113R CTCTGGATACGTGACAGASNNATTCAGCGCCGTGATAGT (SEQ ID NO:518)
asp114R TCCCTCTGGATACGTGACSNNCGAATTCAGCGCCGTGAT (SEQ ID NO:519)
asp115R TGTTCCCTCTGGATACGTSNNAGACGAATTCAGCGCCGT (SEQ ID NO:520)
asp116R GACTGTTCCCTCTGGATASNNGACAGACGAATTCAGCGC (SEQ ID NO:521)
asp117R TCGGACTGTTCCCTCTGGSNNCGTGACAGACGAATTCAG (SEQ ID NO:522)
asp118R TCCTCGGACTGTTCCCTCSNNATACGTGACAGACGAATT (SEQ ID NO:523)
asp119R AAGTCCTCGGACTGTTCCSNNTGGATACGTGACAGACGA (SEQ ID NO:524)
asp120R GATAAGTCCTCGGACTGTSNNCTCTGGATACGTGACAGA (SEQ ID NO:525)
asp121R GCGGATAAGTCCTCGGACSNNTCCCTCTGGATACGTGAC (SEQ ID NO:526)
asp122R CGTGCGGATAAGTCCTCGSNNTGTTCCCTCTGGATACGT (SEQ ID NO:527)
asp123R CGTCGTGCGGATAAGTCCSNNGACTGTTCCCTCTGGATA (SEQ ID NO:528)
asp124R AACCGTCGTGCGGATAAGSNNTCGGACTGTTCCCTCTGG (SEQ ID NO:529)
asp125R ACAAACCGTCGTGCGGATSNNTCCTCGGACTGTTCCCTC (SEQ ID NO:530)
asp126R GGCACAAACCGTCGTGCGSNNAAGTCCTCGGACTGTTCC (SEQ ID NO:531)
asp127R TTCGGCACAAACCGTCGTSNNGATAAGTCCTCGGACTGT (SEQ ID NO:532)
asp128R TGGTTCGGCACAAACCGTSNNGCGGATAAGTCCTCGGAC (SEQ ID NO:533)
asp129R ACCTGGTTCGGCACAAACSNNCGTGCGGATAAGTCCTCG (SEQ ID NO:534)
asp130R ATCACCTGGTTCGGCACASNNCGTCGTGCGGATAAGTCC (SEQ ID NO:535)
asp131R GCTATCACCTGGTTCGGCSNNAACCGTCGTGCGGATAAG (SEQ ID NO:536)
asp132R TCCGCTATCACCTGGTTCSNNACAAACCGTCGTGCGGAT (SEQ ID NO:537)
asp133R ACCTCCGCTATCACCTGGSNNGGCACAAACCGTCGTGCG (SEQ ID NO:538)
asp134R GCTACCTCCGCTATCACCSNNTTCGGCACAAACCGTCGT (SEQ ID NO:539)
asp135R AAGGCTACCTCCGCTATCSNNTGGTTCGGCACAAACCGT (SEQ ID NO:540)
asp136R TAAAAGGCTACCTCCGCTSNNACCTGGTTCGGCACAAAC (SEQ ID NO:541)
asp137R CGCTAAAAGGCTACCTCCSNNATCACCTGGTTCGGCACA (SEQ ID NO:542)
asp138R TCCCGCTAAAAGGCTACCSNNGCTATCACCTGGTTCGGC (SEQ ID NO:543)
asp139R ATTTCCCGCTAAAAGGCTSNNTCCGCTATCACCTGGTTC (SEQ ID NO:544)
asp140R TTGATTTCCCGCTAAAAGSNNACCTCCGCTATCACCTGG (SEQ ID NO:545)
asp141R GGCTTGATTTCCCGCTAASNNGCTACCTCCGCTATCACC (SEQ ID NO:546)
asp142R TTGGGCTTGATTTCCCGCSNNAAGGCTACCTCCGCTATC (SEQ ID NO:547)
asp143R ACCTTGGGCTTGATTTCCSNNTAAAAGGCTACCTCCGCT (SEQ ID NO:548)
asp144R GACACCTTGGGCTTGATTSNNCGCTAAAAGGCTACCTCC (SEQ ID NO:549)
asp145R CGTGACACCTTGGGCTTGSNNTCCCGCTAAAAGGCTACC (SEQ ID NO:550)
asp146R TGACGTGACACCTTGGGCSNNATTTCCCGCTAAAAGGCT (SEQ ID NO:551)
asp147R asp148R asp149R asp150R asp151R asp152R asp153R asp154R asp155R asp156R asp157R asp158R asp159R asp160R asp161R asp162R asp163R asp164R asp165R asp166R asp167R asp168R asp169R asp170R
asp171R asp172R

ACCTGACGTGACACCTTGSNNTTGATTTCCCGCTAAAAG (SEQ ID NO:552)
ACCACCTGACGTGACACCSNNGGCTTGATTTCCCGCTAA (SEQ ID NO:553)
AGAACCACCTGACGTGACSNNTTGGGCTTGATTTCCCGC (SEQ ID NO:554)
TCCAGAACCACCTGACGTSNNACCTTGGGCTTGATTTCC (SEQ ID N0:555)
ATTTCCAGAACCACCTGASNNGACACCTTGGGCTTGATT (SEQ ID N0:556)
ACAATTTCCAGAACCACCSNNCGTGACACCTTGGGCTTG (SEQ ID NO:557)
CCGACAATTTCCAGAACCSNNTGACGTGACACCTTGGGC (SEQ ID NO:558)
CGTCCGACAATTTCCAGASNNACCTGACGTGACACCTTG (SEQ ID NO:559)
CCCCGTCCGACAATTTCCSNNACCACCTGACGTGACACC (SEQ ID N0:560)
TCCCCCCGTCCGACAATTSNNAGAACCACCTGACGTGAC (SEQIDNO:561)
TGTTCCCCCCGTCCGACASNNTCCAGAACCACCTGACGT (SEQ ID NO:562)
TGTTGTTCCCCCCGTCCGSNNATTTCCAGAACCACCTGA (SEQ ID NO:563)
GAATGTTGTTCCCCCCGTSNNACAATTTCCAGAACCACC (SEQ ID NO:564)
AAAGAATGTTGTTCCCCCSNNCCGACAATTTCCAGAACC (SEQ ID NO:565)
TTGAAAGAATGTTGTTCCSNNCGTCCGACAATTTCCAGA (SEQ ID NO:566)
TGGTTGAAAGAATGTTGTSNNCCCCGTCCGACAATTTCC (SEQ ID N0:567)
GACTGGTTGAAAGAATGTSNNTCCCCCCGTCCGACAATT (SEQ ID NO:568)
GTTGACTGGTTGAAAGAASNNTGTTCCCCCCGTCCGACA (SEQ ID NO:569)
CGGGTTGACTGGTTGAAASNNTGTTGTTCCCCCCGTCCG (SEQ ID NO:570)
AATCGGGTTGACTGGTTGSNNGAATGTTGTTCCCCCCGT (SEQ ID N0:571)
CAAAATCGGGTTGACTGGSNNAAAGAATGTTGTTCCCCC (SEQ ID N0:572)
CTGCAAAATCGGGTTGACSNNTTGAAAGAATGTTGTTCC (SEQ ID NO:573)
AGCCTGCAAAATCGGGTTSNNTGGTTGAAAGAATGTTGT (SEQ ID NO:574)
GTAAGCCTGCAAAATCGGSNNGACTGGTTGAAAGAATGT (SEQ ID NO:575)
GCCGTAAGCCTGCAAAATSNNGTTGACTGGTTGAAAGAA (SEQ ID NO:576)

asp173R asp174R asp175R asp176R asp177R asp178R asp179R asp180R asp181R asp182R asp183R asp184R asp185R asp186R asp187R asp188R asp189R

CAGGCCGTAAGCCTGCAASNNCGGGTTGACTGGTTGAAA (SEQ ID NO:577)
TCTCAGGCCGTAAGCCTGSNNAATCGGGTTGACTGGTTG (SEQ ID NO:578)
CATTCTCAGGCCGTAAGCSNNCAAAATCGGGTTGACTGG (SEQ ID NO:579)
AATCATTCTCAGGCCGTASNNCTGCAAAATCGGGTTGAC (SEQ ID NO:580)
CGTAATCATTCTCAGGCCSNNAGCCTGCAAAATCGGGTT (SEQ ID NO:581)
AGTCGTAATCATTCTCAGSNNGTAAGCCTGCAAAATCGG (SEQ ID NO:582)
GTCAGTCGTAATCATTCTSNNGCCGTAAGCCTGCAAAAT (SEQ ID NO:583)
AGAGTCAGTCGTAATCATSNNCAGGCCGTAAGCCTGCAA (SEQ ID NO:584)
TCCAGAGTCAGTCGTAATSNNTCTCAGGCCGTAAGCCTG (SEQ ID NO:585)
ACTTCCAGAGTCAGTCGTSNNCATTCTCAGGCCGTAAGC (SEQ ID NO:586)
GGAACTTCCAGAGTCAGTSNNAATCATTCTCAGGCCGTA (SEQ ID NO:587)
AGGGGAACTTCCAGAGTCSNNCGTAATCATTCTCAGGCC (SEQ ID NO:588)
TTAAGGGGAACTTCCAGASNNAGTCGTAATCATTCTCAG (SEQ ID NO:589)
GGGTTAAGGGGAACTTCCSNNGTCAGTCGTAATCATTCT (SEQ ID NO:590)
GTTGGGTTAAGGGGAACTSNNAGAGTCAGTCGTAATCAT (SEQ ID NO:591)
TCTGTTGGGTTAAGGGGASNNTCCAGAGTCAGTCGTAAT (SEQ ID NO:592)
TCCTCTGTTGGGTTAAGGSNNACTTCCAGAGTCAGTCGT (SEQ ID NO:593)
CCGTCCTCTGTTGGGTTASNNGGAACTTCCAGAGTCAGT (SEQ ID NO:594)
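
The SNN triplet carried by each reverse primer listed above reads as NNS on the complementary (sense) strand, where N is any base and S is C or G. The following Python sketch is illustrative only and is not part of the described method; it enumerates the NNS codons under the standard genetic code to show which amino acids such a degenerate position can encode. The function and variable names are assumptions introduced for this illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# N (any base) and S (C or G) are their own complements.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "N": "N", "S": "S"}


def reverse_complement(seq):
    """Return the reverse complement of a sequence containing A, C, G, T, N or S."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def nns_codons():
    """Enumerate every codon matched by the degenerate triplet NNS."""
    return ["".join(c) for c in product("ACGT", "ACGT", "CG")]


codons = nns_codons()
amino_acids = sorted({CODON_TABLE[c] for c in codons} - {"*"})
stop_codons = [c for c in codons if CODON_TABLE[c] == "*"]

# One of the reverse primers from the listing above (SEQ ID NO:594).
primer = "CCGTCCTCTGTTGGGTTASNNGGAACTTCCAGAGTCAGT"
sense_strand = reverse_complement(primer)

print(len(codons), "codons;", len(amino_acids), "amino acids; stops:", stop_codons)
```

Restricting the third position to C or G halves the number of codons relative to NNN while still covering all twenty amino acids, and leaves a single stop codon in the pool.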

EXAMPLE 16 Construction of Arginine and Cysteine Combinatorial Mutants
In this Example, the construction of multiple arginine and cysteine mutants of ASP is described. These experiments were conducted in order to determine whether the use of surface arginine and cysteine combinatorial libraries would lead to mutants with increased expression at the protein level.
The QuikChange® multi site-directed mutagenesis (QCMS) kit (Stratagene) was used to construct the two libraries. The 5' phosphorylated primers used to create the two libraries are shown in Table 16-1. It was noted that HPLC-purified, PAGE-purified, or otherwise purified primers gave far better results in terms of incorporation of full-length primers, as well as a significant reduction in primer-containing errors. However, in these experiments, purified primers were not used, which probably accounts for the approximately 12% of clones that had undesired mutations.
Table 16-1. Primers and Sequences
Primer name Primer sequence
ASPR14L gcatatactattggcggcctgtctagatgttctatcgga (SEQ ID NO:595)
ASPR16Q actattggcggccggtctcagtgttctatcggattcgc (SEQ ID NO:596)
ASPR35F ctgccggtcactgcggatttacaggagccactactgc (SEQ ID NO:597)
ASPR61S atgattatgcattcgtctcaacaggggcaggagtaaat (SEQ ID NO:598)
ASPR79T ataactactcgggcggcacagtccaagtagcaggacatac (SEQ ID NO:599)
ASPR123L atccagagggaacagtcctgggacttatccgcacgac (SEQ ID NO:600)
ASPR127Q cagtccgaggacttatccagacgacggtttgtgccgaac (SEQ ID NO:601)
ASPR159Q gtggttctggaaattgtcagacggggggaacaacattc (SEQ ID NO:602)
ASPR179Q tgcaggcttacggcctgcagatgattacgactgactc (SEQ ID NO:603)
ASPC17S ttggcggccggtctagatcatctatcggattcgcagta (SEQ ID NO:604)
ASPC33S tcattactgccggtcactcaggaagaacaggagccact (SEQ ID NO:605)
ASPC95S cagttggatctgctgtatctcgctcaggtagcactac (SEQ ID NO:606)
ASPC105S cactacaggttggcattcaggaactatcacggcgctg (SEQ ID NO:607)
ASPC131S cttatccgcacgacggtttcagccgaaccaggtgatag (SEQ ID NO:608)
ASPC158S caggtggttctggaaattcacggacggggggaacaac (SEQ ID NO:609)
ASPSEQF1 tgcctcacatttgtgccac (SEQ ID NO:610)
ASPSEQF4 caggatgtagctgcaggac (SEQ ID NO:611)
ASPSEQR4 ctcggttatgagttagttc (SEQ ID NO:612)

pHPLT-ASP-C1-2 Plasmid Preparation and In vitro Methylation
To construct the cysteine and arginine libraries using the QCMS kit, the template plasmid pHPLT-ASP-C1-2 was first methylated in vitro, since it was derived from a Bacillus strain that does not methylate DNA at GATC sites. This method was used because the more common approach of ensuring methylation of plasmids used in the QCMS protocol, which involves deriving the DNA from dam+ E. coli strains, was not an option here: the plasmid pHPLT-ASP-C1-2 does not grow in E. coli.
Miniprep DNA was prepared from Bacillus cells harboring the pHPLT-ASP-C1-2 plasmid. Specifically, the strain was grown overnight in 5 ml of LB with 10 ppm of neomycin, after which the cells were spun down. The Qiagen spin miniprep DNA kit was used for preparing the plasmid DNA, with an additional step wherein 100 uL of 10 mg/mL lysozyme was added after the addition of 250 uL of P1 buffer from the kit. The sample was incubated at 37°C for 15 min with shaking, after which the remaining steps outlined in the Qiagen miniprep kit manual were carried out. The miniprep DNA was eluted with 30 uL of Qiagen buffer EB provided in the kit.
Next, the pHPLT-ASP-C1-2 plasmid DNA was methylated in vitro using a dam methylase kit from NEB (NEB catalog # M0222S). Briefly, 25 uL of the miniprep DNA (about 1-2 ug) was incubated with 20 uL of the 10x NEB dam methylase buffer, 0.5 uL of S-adenosylmethionine (80 uM), 4 uL of the dam methylase and 150.5 uL of sterile distilled water. The reaction was incubated at 37°C for 4 hours, after which the DNA was purified using a Qiagen PCR purification kit. The methylated DNA was eluted with 40 uL of buffer EB provided in the kit. To confirm methylation of the DNA, 4 uL of the purified, methylated DNA was digested with MboI (NEB; this enzyme cuts unmethylated GATC sites) or DpnI (Roche; this enzyme cuts methylated GATC sites) in a 20 uL reaction using 2 uL of each enzyme. The reactions were incubated at 37°C for 2 hours and then analyzed on a 1.2% E-gel (Invitrogen). A small molecular weight DNA smear/ladder was observed for the DpnI digest, whereas the MboI digest showed intact DNA, which indicated that the pHPLT-ASP-C1-2 plasmid was successfully methylated.
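
As a check on the methylation reaction set-up described above, the following Python sketch sums the listed component volumes and computes the resulting final concentration of the 10x buffer. It is a worked-arithmetic illustration only; the dictionary keys and variable names are assumptions chosen for readability.

```python
# Component volumes (uL) of the in vitro dam methylation reaction, as listed in the text.
components_ul = {
    "miniprep DNA": 25.0,
    "10x dam methylase buffer": 20.0,
    "S-adenosylmethionine": 0.5,
    "dam methylase": 4.0,
    "sterile distilled water": 150.5,
}

# Total reaction volume is the sum of all components.
total_ul = sum(components_ul.values())

# Final strength of the buffer after dilution of the 10x stock into the full volume.
buffer_final_x = 10.0 * components_ul["10x dam methylase buffer"] / total_ul

print(f"total volume: {total_ul} uL, buffer: {buffer_final_x}x")
```

The volumes add up to 200 uL, so the 20 uL of 10x buffer is diluted to 1x, consistent with the water volume having been chosen to bring the reaction to a round total.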
Library Construction
The cysteine (cys) and arginine (arg) combinatorial libraries were constructed as outlined in the Stratagene QCMS kit, with the exception of the primer concentration used in the reactions. Specifically, 4 uL of the methylated, purified pHPLT-ASP-C1-2 plasmid (about 25 to 50 ng) was mixed with 15 uL of sterile distilled water, 1.5 uL of dNTP, 2.5 uL of 10x buffer, 1 uL of the enzyme blend and 1.0 uL of arginine or cysteine mutant primer mix (i.e., for a total of 100 ng of primers). The primer mix was prepared using 10 uL of each of the nine arginine primers (100 ng/uL) or each of the six cysteine primers (100 ng/uL). In a previous round of mutagenesis, adding 50 ng of each primer for both the arg and cys libraries, as recommended in the Stratagene manual, resulted in less than 50% of the clones containing mutations. Thus, the protocol was modified in the present round of mutagenesis to include a total of 100 ng of primers in each reaction. The cycling conditions were 95°C for 1 min, followed by 30 cycles of 95°C for 1 min, 55°C for 1 min, and 65°C for 9 min, in an MJ Research thermocycler using thin-walled 0.2 mL PCR tubes. The reaction product was digested with 1 uL of DpnI from the QCMS kit by incubating at 37°C overnight. An additional 0.5 uL of DpnI was then added, and the reaction was incubated for 1 hour.
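
The primer amounts described above can be worked through as follows. This Python sketch is an illustration under the stated volumes and concentrations, not a prescribed calculation; the function name `primer_mix` and its parameters are assumptions introduced here.

```python
def primer_mix(n_primers, vol_each_ul=10.0, stock_ng_per_ul=100.0, mix_used_ul=1.0):
    """Return (total ng of primer, ng of each primer) delivered by mix_used_ul of an equal-volume mix."""
    mix_volume_ul = n_primers * vol_each_ul
    total_ng_in_mix = n_primers * vol_each_ul * stock_ng_per_ul
    total_conc_ng_per_ul = total_ng_in_mix / mix_volume_ul
    total_ng_used = total_conc_ng_per_ul * mix_used_ul
    return total_ng_used, total_ng_used / n_primers


# Nine arginine primers and six cysteine primers, as in Table 16-1.
arg_total_ng, arg_each_ng = primer_mix(9)
cys_total_ng, cys_each_ng = primer_mix(6)

# Component volumes (uL) of one QCMS reaction, as listed in the text.
reaction_ul = {
    "plasmid": 4.0,
    "water": 15.0,
    "dNTP": 1.5,
    "10x buffer": 2.5,
    "enzyme blend": 1.0,
    "primer mix": 1.0,
}
reaction_total_ul = sum(reaction_ul.values())

print(f"arg: {arg_total_ng:.0f} ng total, {arg_each_ng:.1f} ng per primer")
print(f"cys: {cys_total_ng:.0f} ng total, {cys_each_ng:.1f} ng per primer")
print(f"reaction volume: {reaction_total_ul} uL")
```

Because every primer stock is at the same concentration, pooling equal volumes keeps the total primer concentration at 100 ng/uL, so 1.0 uL of either mix delivers 100 ng in total, split across nine or six primers.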

To transform the library DNA directly into Bacillus cells without going through E. coli, the library DNA (single-stranded QCMS product) was amplified using the TempliPhi kit (Amersham cat. #25-6400), because Bacillus requires double-stranded multimeric DNA for transformation. For this purpose, 1 uL of the arginine or cysteine QCMS reaction was mixed with 5 uL of sample buffer from the TempliPhi kit and heated for 3 minutes at 95°C to denature the DNA. The reaction was placed on ice to cool for 2 minutes and then spun down briefly. Next, 5 uL of reaction buffer and 0.2 uL of phi29 polymerase from the TempliPhi kit were added, and the reactions were incubated at 30°C in an MJ Research PCR machine for 4 hours. The phi29 enzyme was heat inactivated in the reactions by incubation at 65°C for 10 min in the PCR machine.
For transformation of the libraries into Bacillus, 0.5 uL of the TempliPhi amplification reaction product was mixed with 100 uL of comK competent cells, followed by vigorous shaking at 37°C for 1 hour. The transformation was serially diluted up to 10 fold, and 50 uL of each dilution was plated on LA plates containing 10 ppm neomycin and 1.6% skim milk. Twenty-four clones from each library were picked for sequencing. Briefly, the colonies were resuspended in 20 uL of sterile distilled water and 2 uL was then used for PCR with ReadyTaq beads (Amersham) in a total volume of 25 uL. Primers ASPF1 and ASPR4 were added at a concentration of 0.5 uM. Cycling conditions were 94°C for 4 min once, followed by 30 cycles of 94°C for 1 min, 55°C for 1 min, and 72°C for 1 min, followed by one round at 72°C for 7 min. A 1.5 kb fragment was obtained in each case and the product was purified using a Qiagen PCR purification kit. The purified PCR products were sequenced with ASPF4 and ASPR4 primers.
A total of 48 clones were sequenced (24 from each library). The mutagenesis worked quite well, in that only about 15% of the clones were WT. However, about 20% of the clones had mixed sequences, either because the plate was crowded with colonies or because the TempliPhi amplification resulted in very concentrated DNA for transformation. Also, as indicated above, about 12% of clones had extra mutations. The remaining clones were all mutant, and of these about 60-80% were unique mutants. The sequencing results for the arginine and cysteine libraries are provided below in Tables 16-2 and 16-3.



C24 yes

Of the mutants identified in sequencing, the following mutants from the arginine library (See, Table 16-4) were found to be of interest. See the Examples below for additional data regarding the properties of these mutants.

Importantly, the activity results indicated that mutations in the cysteine residues produced ASP proteases with very low or no activity, suggesting that the disulfide bridges play an important role in the stability of the molecule. However, it is not intended that the present invention be limited to any particular mechanism(s).
EXAMPLE 17 Expression of Homologous O. turbata Protease in S. lividans
In this Example, expression in S. lividans of a protease produced by O. turbata that is homologous to the protease 69B4 is described. Thus, this Example describes plasmids comprising polynucleotides encoding a polypeptide having proteolytic activity, and the use of such vectors to transform a Streptomyces lividans host cell. The transformation methods used herein are known in the art (See e.g., U.S. Pat. No. 6,287,839; and WO 02/50245, herein incorporated by reference).
The vector (i.e., plasmid) used in these experiments comprised a polynucleotide encoding a protease of the present invention obtained from Oerskovia turbata DSM 20577. This plasmid was used to transform Streptomyces lividans. The final plasmid vector is referred to herein as "pSEA4CT-O.turbata."
As with previous vectors, the construction of pSEA4CT-O.turbata made use of the pSEGCT plasmid vector (See, above).
An Aspergillus niger ("A4") regulatory sequence operably linked to the structural gene encoding the Oerskovia turbata protease (Otp) was used to drive the expression of the protease. A fusion between the A4-regulatory sequence and the Oerskovia turbata signal-sequence, N-terminal prosequence and mature protease sequence (i.e., without the C-terminal prosequence) was constructed by fusion-PCR techniques known in the art, as an XbaI-BamHI fragment. The polynucleotide primers for the cloning of Oerskovia turbata protease (Otp) in pSEA4CT were based on SEQ ID NO:67. The primer sequences used were:
A4-turb Fw
5'-CAGAGACAGACCCCCGGAGGTAACCATGGCACGATCATTCTGGAGGACGC-3' (SEQ ID NO:613)
A4- turb RV
5'-GCGTCCTCCAGAATGATCGTGCCATGGTTACCTCCGGGGGTCTGTCTCTG-3' (SEQ ID NO:614)
A4- turb Bam Rv
5'-ATCCGCTCGCGGATCCCCATTGTCAGCTCGGGCCCCCACCGTCAGAGGTCACGAG-3' (SEQ ID NO:615)
5'-GCAGCCTGAACTAGTTGCGATCCTCTAGAGATCGAACTTCAT-3' (SEQ ID NO:616)
The fragment was ligated into plasmid pSEA4CT digested with XbaI and BamHI, resulting in plasmid pSEA4CT-O.turbata.
The host Streptomyces lividans TK23 was transformed with plasmid vector pSEA4CT-O.turbata using the protoplast method described in the previous Example (i.e., using the method of Hopwood et al., supra).
The transformed culture was expanded to provide two fermentation cultures in TS* medium. The composition of TS* medium was (g/L) tryptone (Difco) 16, soytone (Difco) 4, casein hydrolysate (Merck) 20, K2HPO4 10, glucose 15, Basildon antifoam 0.6, pH 7.0. At various time points, samples of the fermentation broths were removed for analysis. For the purposes of this experiment, a skim milk procedure was used to confirm successful cloning. 30 uL of the shake flask supernatant was pipetted into holes punched in skim milk agar plates and incubated at 37°C.
The incubated plates were visually reviewed after overnight incubation for the presence of clearing zones (halos) indicating the expression of proteolytic enzyme. For purposes of this experiment, the samples were also assayed for protease activity and for molecular weight (SDS-PAGE). At the end of the fermentation, full length protease was observed by SDS-PAGE.
A sample of the fermentation broth was assayed as follows: 10uL of the diluted supernatant was collected and analyzed using the Dimethylcasein Hydrolysis Assay described in Example 1. The assay results of the fermentation broth of 2 clones clearly show that the polynucleotide from Oerskovia turbata encoding a polypeptide having proteolytic activity was expressed in Streptomyces lividans.
EXAMPLE 18 Expression of Homologous Cellulomonas and Cellulosimicrobium Proteases in S. lividans
In this Example, expression in S. lividans of proteases produced by Cellulomonas cellasea DSM 20118 and Cellulosimicrobium cellulans DSM 204244 that are homologous to the protease 69B4 is described. Thus, this Example describes plasmids comprising polynucleotides encoding a polypeptide having proteolytic activity, and the use of such vectors to transform a Streptomyces lividans host cell. The transformation methods used herein are known in the art (See e.g., U.S. Pat. No. 6,287,839; and WO 02/50245, herein incorporated by reference).
The final plasmid vectors are referred to as pSEA4CT-C.cellasea and pSEA4CT-Cm.cellulans. The construction of pSEA4CT-C.cellasea and pSEA4CT-Cm.cellulans made use of the pSEGCT plasmid vector described above.
An Aspergillus niger ("A4") regulatory sequence operably linked to the structural gene encoding the Cellulomonas cellasea mature protease (Ccp) or, alternatively, the structural gene encoding the Cellulosimicrobium cellulans mature protease (Cmcp) was used to drive the expression of the protease. A fusion between the A4-regulatory sequence and the 69B4 protease signal-sequence, N-terminal prosequence of the 69B4 protease gene and mature sequence of the native protease gene obtained from genomic DNA of a strain of Micrococcineae (herein, Cellulomonas cellasea or Cellulosimicrobium cellulans) was constructed by fusion-PCR techniques, as an XbaI-BamHI fragment. The polynucleotide primers for the cloning of Cellulomonas cellasea protease (Ccp) in pSEA4CT were based on SEQ ID NO:63, and are as follows:
Asp-npro fw-cell 5'-AGACCGACGAGACCCCGCGGACCATGGTCGACGTCATCGGCGGCAACGCGTACTAC-3' (SEQ ID NO:617)
Cell-BH1-rv 5'-TCAGCCGATCCGCTCGCGGATCCCCATTGTCAGCCCAGGACGAGACGCAGACOGTA-3' (SEQ ID NO:618)
Asp-npro rv-cell 5'-GTAGTACGCGTTGCCGCCGATGACGTCGACCATGGTCCGCGGGGTCTCGTCGGTCT-3' (SEQ ID NO:619)
Xba-1 fwA4 5'-GCAGCCTGAACTAGTTGCGATCCTCTAGAGATCGAACTTCATGTTCGA-3' (SEQ ID NO:620)
The polynucleotide primers for the cloning of Cellulosimicrobium cellulans protease (Cmcp) in pSEA4CT were based on SEQ ID NO:71, and are as follows:
ASP-npro fw cellu 5'-ACCGAGGAGACCCCGCGGACCATGCACGGCGACGTGCGCGGCGGCGACCGCTA-3' (SEQ ID NO:621)
ASP-npro rv cellu 5'-TAGCGGTCGCCGCCGCGCACGTCGCCGTGCATGGTCCGCGGGGTCTCGTCGGT-3' (SEQ ID NO:622)
Cellu-BH1-rv 5'-TCAGCCGATCCGCTCGCGGATCCCCATTGTCAGCGAGCCCGACGAGCGCGCTGCCCGAC-3' (SEQ ID NO:623)
Xba-1 fw A4 5'-GCAGCCTGAACTAGTTGCGATCCTCTAGAGATCGAACTTCATGTTCGA-3' (SEQ ID NO:620)
The host Streptomyces lividans TK23 was transformed with plasmid vector pSEA4CT using the protoplast method described above (i.e., Hopwood et al., supra). The transformed culture was expanded to provide two fermentation cultures in TS* medium. The composition of TS* medium was (g/L) tryptone (Difco) 16, soytone (Difco) 4, casein hydrolysate (Merck) 20, K2HPO4 10, glucose 15, Basildon antifoam 0.6, pH 7.0. At various time points, samples of the fermentation broths were removed for analysis. For the purposes of this experiment, a skim milk procedure was used to confirm successful cloning. 30 uL of the shake flask supernatant was pipetted into holes punched in skim milk agar plates and incubated at 37°C.
The incubated plates were visually reviewed after overnight incubation for the presence of clearing zones (halos) indicating the expression of proteolytic enzyme. For purposes of this experiment, the samples were also assayed for protease activity and for molecular weight (SDS-PAGE). At the end of the fermentation full length protease was observed by SDS-PAGE.
A sample of the fermentation broth was assayed as follows: 10 uL of the diluted supernatant was taken and added to 190 uL AAPF substrate solution (conc. 1 mg/ml, in 0.1 M Tris/0.005% Tween 80, pH 8.6). The rate of increase in absorbance at 410 nm due to release of p-nitroaniline was monitored (25°C).
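
The rate of increase in absorbance described above can be estimated as the slope of a straight line fitted to absorbance readings over time. The following Python sketch shows one way to do this with an ordinary least-squares fit. The readings are hypothetical values invented for illustration, not data from these experiments, and the function name is likewise an assumption.

```python
def least_squares_slope(times, values):
    """Slope of the least-squares line through (time, value) points."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    numerator = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    denominator = sum((t - mean_t) ** 2 for t in times)
    return numerator / denominator


# Hypothetical absorbance readings at 410 nm (illustrative only).
times_s = [0, 30, 60, 90, 120]
a410 = [0.100, 0.130, 0.160, 0.190, 0.220]

rate_per_s = least_squares_slope(times_s, a410)

# Dilution of the sample in the well: 10 uL sample added to 190 uL substrate.
in_well_dilution = (10 + 190) / 10

print(f"rate: {rate_per_s:.4f} A410 units per second")
print(f"in-well dilution: {in_well_dilution:.0f}-fold")
```

A fit over several readings is less sensitive to noise in any single measurement than a two-point difference, provided the readings are taken within the linear phase of the reaction.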
As in previous Examples, the results obtained clearly indicated that the polynucleotides from Cellulomonas cellasea and from Cellulosimicrobium cellulans, both encoding polypeptides having proteolytic activity, were expressed in Streptomyces lividans.
EXAMPLE 19 Determination of the Crystal Structure of ASP Protease
In this Example, methods used to determine the crystal structure of ASP protease are described. Indeed, high quality single crystals were obtained from purified ASP protease. The crystallization conditions were as follows: 25% PEG 8000, 0.2 M ammonium sulphate, and 15% glycerol. These crystallization conditions are cryo-protective, so transfer to a cryoprotectant was not required. The crystals were frozen in liquid nitrogen, and kept frozen during data collection using an Xstream (Molecular Structure). Data were collected with an R-axis IV (Molecular Structure), equipped with focusing mirrors. X-ray reflection data were obtained to 1.9 Å resolution. The space group was P2₁2₁2₁, with cell dimensions a=35.65 Å, b=51.82 Å and c=76.86 Å. There was one molecule per asymmetric unit.
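
For an orthorhombic cell such as the one reported above, the unit cell volume is simply the product of the three cell edges, and with one molecule per asymmetric unit the P2₁2₁2₁ space group gives four molecules per cell. The following Python sketch works through that arithmetic and shows how a Matthews coefficient (cell volume per dalton of protein) could be derived from it. The molecular weight used is a placeholder assumed for illustration only, since no value is stated in this Example, and the function names are likewise assumptions.

```python
# Cell dimensions (Å) as reported in the text.
A, B, C = 35.65, 51.82, 76.86

# P2(1)2(1)2(1) has four symmetry-equivalent positions per unit cell.
ASYMMETRIC_UNITS_PER_CELL = 4


def orthorhombic_cell_volume(a, b, c):
    """Volume of an orthorhombic unit cell (all angles 90 degrees), in cubic Å."""
    return a * b * c


def matthews_coefficient(cell_volume, mol_weight_da, molecules_per_asu=1):
    """Cell volume per dalton of protein, in cubic Å per Da."""
    n_molecules = ASYMMETRIC_UNITS_PER_CELL * molecules_per_asu
    return cell_volume / (n_molecules * mol_weight_da)


volume = orthorhombic_cell_volume(A, B, C)

# Placeholder molecular weight; NOT taken from this Example.
assumed_mw_da = 19000.0
vm = matthews_coefficient(volume, assumed_mw_da)

print(f"cell volume: {volume:.0f} cubic Å")
print(f"Matthews coefficient (assumed MW {assumed_mw_da:.0f} Da): {vm:.2f} cubic Å/Da")
```

Substituting the actual molecular weight of the crystallized protein for the placeholder would give the corresponding coefficient for this crystal form.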
The crystal structure was solved using the molecular replacement method. The program used was X-MR (Accelrys Inc.). The starting model for the molecular replacement calculations was Streptogrisin. It is clear from the electron density map obtained from X-MR that the molecular replacement solution is correct. Thus, 98% of the model was built correctly, with some minor errors that were fixed manually. The R-factor for data to 1.9 Å was 0.23.
The structure was found to largely consist of β-sheets, with 2 very short α-helices, and a longer helix toward the C-terminal end. There are two sets of β-sheets, with a considerable interface between them. The active-site is found in a cleft formed at this interface. The catalytic triad is formed by His 32, Asp 56, and Ser 137. Table 19-1 provides the atomic coordinates identified for ASP.
Table 19-1 Atomic Coordinates for ASP
1HPG = Streptomyces griseus glutamic acid specific protease
1SGP = Streptomyces griseus proteinase B
1SGT = Streptomyces griseus strain K1 trypsin
1TAL = Lysobacter enzymogenes alpha-lytic protease
2SFA = Streptomyces fradiae serine proteinase
2SGA = Streptomyces griseus protease A
EXAMPLE 20 Enzyme Substrate Modeling and Mapping of the ASP Active-Site
In this Example, methods for enzyme-substrate modeling and mapping of the ASP active site are described. Preliminary inspection of the active-site revealed a P1 binding pocket that is large enough to accommodate large hydrophobic groups such as the side-chains of Trp, Tyr, and Phe.
The crystal structure of Streptogrisin A with the turkey third domain of the ovomucoid inhibitor (pdb code 2SGB) has been determined. 2SGB was structurally aligned to ASP, using MOE (Chemical Computing Corp), which places the inhibitor in the active-site of ASP.

All of the 2SGB co-ordinates were removed, except for those which define a hexa-peptide bound in the ASP active-site, corresponding to binding at the S4 to S2' binding sites. The Pro-ASP protein self-cleaves the pro domain-mature domain junction, to release the mature protease enzyme. The last four residues of the pro domain are expected to occupy the S1-S4 sites, and the first two residues of the mature protease occupy the S1' and S2' sites. Therefore the hexapeptide in the active-site was in-silico mutated to sequence PRTMFD (SEQ ID NO:630).
From inspection of the structure of the initial substrate-bound model, the backbone amides of Gly135 and Asp136 would be expected to form the oxy-anion hole. However, the amide nitrogen of Gly135 appears to point in the wrong direction. Comparison with streptogrisin A confirms this. Thus, it is presumed that a conformational change in ASP is required to form the oxy-anion hole. However, it is not intended that the present invention be limited by any particular mechanism or hypothesis. The peptide backbone between residues 134 and 135 was altered to an orientation similar to that of the structurally equivalent atoms in the streptogrisin A structure. The enzyme-substrate model was then energy minimized.
Residues within 6 Å of the modeled substrate were determined using the proximity tools within the program QUANTA. These residues were identified as: Arg14, Ser15, Arg16, Cys17, His32, Cys33, Phe52, Asp56, Thr100, Val115, Thr116, Tyr117, Pro118, Glu119, Ala132, Glu133, Pro134, Gly135, Asp136, Ser137, Thr151, Ser152, Gly153, Gly154, Ser155, Gly156, Asn157, Thr164, Phe165. Of these, His32, Asp56, and Ser137 form the catalytic triad.
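
A distance-based selection of the kind described above can be sketched in a few lines. The following Python example is not the QUANTA proximity tool and makes no claim about how that program works; it simply illustrates the idea of reporting every residue that has at least one atom within a cutoff distance of any substrate atom. The coordinates and residue labels are hypothetical toy values.

```python
import math


def residues_near(protein_atoms, substrate_atoms, cutoff=6.0):
    """Return sorted residue labels having any atom within `cutoff` Å of any substrate atom.

    protein_atoms: iterable of (residue_label, (x, y, z)) tuples
    substrate_atoms: iterable of (x, y, z) tuples
    """
    near = set()
    for residue_label, coord in protein_atoms:
        if any(math.dist(coord, s) <= cutoff for s in substrate_atoms):
            near.add(residue_label)
    return sorted(near)


# Toy coordinates (Å), invented for illustration only.
toy_protein = [
    ("ResA", (0.0, 0.0, 0.0)),
    ("ResB", (3.0, 4.0, 0.0)),
    ("ResB", (9.0, 9.0, 9.0)),   # a second, more distant atom of the same residue
    ("ResC", (20.0, 0.0, 0.0)),
]
toy_substrate = [(0.0, 0.0, 1.0)]

print(residues_near(toy_protein, toy_substrate))
```

A residue is reported once even if several of its atoms fall within the cutoff, which mirrors the per-residue list given in the text.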
The P1 pocket is formed by Cys131, Ala132, Glu133, Pro134, Gly135, Thr151, Ser152, Gly153, Gly154, Ser155, Gly156, Asn157 and Gly162, Thr163, Thr164. The P2 pocket is defined by Phe52, Tyr117, Pro118 and Glu119. The P3 pocket has main-chain to main-chain hydrogen bonding from Gly154 to the substrate main-chain. The P1' pocket is defined by Arg16 and His32. The P2' pocket is defined by Thr100 and Pro134. The atomic coordinates of ASP with the modeled octapeptide substrate are provided in Table 20-1 below.
Table 20-1. Atomic Coordinates of ASP with the Modeled Octapeptide Substrate

EXAMPLE 21 Oxidative Stability of ASP
This Example describes experiments conducted to determine the oxidative stability of the ASP protease and mutant proteases. The resistance to oxidation of Cellulomonas 69B4 protease was compared to that of: a BPN'-variant protease (BPN'-variant 1; Genencor; See, RE 34,606 [incorporated herein by reference], for a description of this enzyme); a GG36 variant protease (GG36-variant 1; Genencor; See e.g., U.S. Pat. Nos. 5,955,340 and 5,700,676, herein incorporated by reference); and PURAFECT protease (Genencor).
The assay was conducted by incubating a sample of the protease with 0.1 M H2O2. A 2.0 ml volume of 0.1 M Borate buffer (45.4 g Na2B4O7·10H2O), pH 9.45, containing 0.1 M H2O2 and 100 ppm protease was incubated at 25°C for 20 minutes and assayed for enzyme activity.
The enzyme activity was determined as follows: 50 ul of the incubation mixture was combined with 950 ul 0.1 M Tris buffer, pH 8.6, and a 10 ul sample of this dilution was taken and added to 990 ul AAPF substrate solution, conc. 1 mg/ml, in 0.1 M Tris / 0.005% TWEEN®, pH 8.6. The rate of increase in absorbance at 410 nm due to release of p-nitroaniline was monitored. The results obtained for these proteases are provided in Figure 31. As indicated in this graph, protease 69B4 showed greatly enhanced stability under oxidative conditions relative to the subtilisin proteases.
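
The two dilution steps of the activity assay described above can be worked through as follows. This Python sketch is a worked-arithmetic illustration only; the function name is an assumption, and the result assumes the protease concentration scales simply with dilution.

```python
def dilution_factor(sample_ul, diluent_ul):
    """Fold dilution when sample_ul is added to diluent_ul."""
    return (sample_ul + diluent_ul) / sample_ul


# Step 1: 50 ul of incubation mixture into 950 ul Tris buffer.
step1 = dilution_factor(50, 950)

# Step 2: 10 ul of that dilution into 990 ul AAPF substrate solution.
step2 = dilution_factor(10, 990)

overall = step1 * step2

# Protease concentration in the incubation mixture, as stated in the text.
incubation_ppm = 100.0
assay_ppm = incubation_ppm / overall

print(f"step 1: {step1:.0f}-fold, step 2: {step2:.0f}-fold, overall: {overall:.0f}-fold")
print(f"protease in the assay: {assay_ppm} ppm")
```

The same two-step scheme is used in the chelate and thermal stability assays, so the overall dilution applies there as well when the incubation mixture contains 100 ppm protease.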

EXAMPLE 22 Chelate Stability of ASP
In this Example, experiments to determine the chelate stability of ASP are described. The resistance to the presence of a chelator of 69B4 protease was assayed by incubating an aliquot of the enzyme with 10 mM EDTA in 50 mM Tris, pH 8.2. The same enzyme preparations as used in Example 21 were used in these experiments.
Specifically, a volume of 2.0 ml 50 mM Tris buffer, pH 8.2, containing 10 mM EDTA and 100 ppm protease was incubated at 45°C for 100 minutes and assayed for enzyme activity as follows: 50 ul of the incubation mixture was combined with 950 ul 0.1 M Tris buffer, pH 8.6, and a 10 ul sample of this dilution was taken and added to 990 ul AAPF substrate solution, conc. 1 mg/ml, in 0.1 M Tris / 0.005% TWEEN®, pH 8.6.
The rate of increase in absorbance at 410 nm due to release of p-nitroaniline was monitored. The results obtained for these four proteases are shown in Figure 32. As indicated by these results, protease 69B4 showed greatly enhanced stability in the presence of a chelator compared with BPN' variant-1, PURAFECT®, or GG36 variant-1.
EXAMPLE 23 Thermal Stability of ASP
In this Example, experiments conducted to determine the thermostability of ASP protease are described. In one set of experiments, 69B4 protease was tested for resistance to thermal inactivation in solution. As in Examples 21 and 22, a BPN' variant (BPN'-variant-1), PURAFECT®, and a GG36 variant (GG36-variant-1) were also tested and compared with ASP.
The thermal inactivation was performed by incubating a volume of 2.0 ml 50 mM Tris buffer, pH 8.0, containing 100 ppm protease at 45°C for 300 minutes, and enzyme activity was assayed as follows: 50 ul of the incubation mixture was combined with 950 ul 0.1 M Tris buffer, pH 8.6, and a 10 ul sample of this dilution was taken and added to 990 ul AAPF substrate solution, conc. 1 mg/ml, in 0.1 M Tris / 0.005% TWEEN®, pH 8.6. The rate of increase in absorbance at 410 nm due to release of p-nitroaniline was monitored. The results for these four proteases are shown in Figure 33. As shown by these results, protease 69B4 showed enhanced or comparable thermal stability at 45°C relative to the BPN' variant, PURAFECT®, or the GG36 variant.
In addition to the above experiments, an alternative method for determining the thermostability of ASP was also tested. In these experiments, a temperature gradient between 57°C and 62°C was used. The thermal inactivation (using a Thermocycler-MTP plate DNA Engine Tetrad; MJ Research) was performed by incubating a volume of 180 ul 100 mM Tris buffer, pH 8.6, containing 1 mM CaCl2 and 5 ppm protease for 60 minutes, and enzyme activity was assayed as follows: 10 ul was taken and added to 190 ul AAPF substrate solution, conc. 1 mg/ml, in 0.1 M Tris / 0.005% TWEEN®, pH 8.6. The rate of increase in absorbance at 410 nm due to release of p-nitroaniline was monitored (at 25°C). The results for the 4 proteases are shown in Figure 34.
EXAMPLE 24 pH profile of ASP Protease on DMC Substrate
In this Example, experiments conducted to determine the pH profile of the ASP protease are described. The Cellulomonas 69B4 protease of the present invention, isolated and purified by methods described herein, and three currently used subtilisin proteases (PURAFECT®, BPN'-variant-1, GG36-variant-1) described in Examples 21-23, were analyzed for their ability to hydrolyze a commercial synthetic substrate, di-methyl casein ("DMC"/ Sigma C-9801) in the pH range from 4 to 12.
The DMC method described at the beginning of the Experimental section was used, with modifications, as indicated below. Briefly, a 5 mg/ml DMC substrate solution was prepared in the appropriate buffer (5 mg/ml DMC, 0.005% (w/w) TWEEN-80® (polyoxyethylene sorbitan mono-oleate, Sigma P-1754)). The appropriate DMC buffers were composed as follows: 40 mM MES for pH 4 and 5 ; 40 mM HEPES for pH 6 and 7, 40 mM TRIS for pH 8 and 9; and 40 mM Carbonate for pH 10, 11 and 12.
For the determination, 180 ul of each pH-substrate solution was transferred into a 96-well microtiter plate and pre-incubated at 37°C for twenty minutes prior to enzyme addition. The respective enzyme solutions (BPN'-variant-1; GG36-variant-1; PURAFECT®; and 69B4 protease) were prepared at about 25 ppm, and 20 ul of each enzyme solution was pipetted into the substrate-containing wells in order to achieve a 2.5 ppm final enzyme concentration in each well. The 96-well plate containing enzyme-substrate mixtures was incubated at 37°C and 300 rpm for one hour in an IKS-Multitron incubator/shaker.
A 2,4,6-trinitrobenzene sulfonate ("TNBS") color reaction method was used to determine the amount of peptides and amino acids released from the DMC substrate. The free amino groups (of the peptides and amino acids) react with 2,4,6-trinitrobenzene sulfonic acid to form a yellow colored complex. The absorbance was measured at 405 nm in a SpectraMax 250 MTP Reader.

The TNBS assay was conducted as follows. A 1 mg/ml solution of TNBS (5% 2,4,6-trinitrobenzene sulfonic acid; Sigma P-2297) was prepared in reagent buffer A (2.4 g NaOH, 45.4 g Na2B4O7·10H2O dissolved by heating in 1000 ml). Then, 60 ul per well were aliquoted into a 96-well plate and 10 ul of the incubation mixture described above were added to each well and mixed for 20 minutes at room temperature. Then, 200 ul of reagent B (70.4 g NaH2PO4·H2O and 1.2 g Na2SO3 in 2000 ml) were added to each well and mixed to stop the reaction. The absorbance at 405 nm was measured in a SpectraMax 250 MTP Reader. The absorbance value was corrected for a blank (without enzyme). The data in Table 24-1 show the comparative ability of the 69B4 protease to hydrolyze this substrate versus known mutant variant proteases (BPN' variant-1 and GG36 variant-1).
Also, as shown in Figure 35, the serine protease of the present invention showed comparable or increased hydrolysis of DMC substrate, with optimal DMC-hydrolysis activity over a broad pH range from 7 to 12.

EXAMPLE 25 pH Stability of ASP Protease
In this Example, experiments conducted to determine the pH stability of the ASP protease are described. As in Examples 21-24, two currently used subtilisin proteases (PURAFECT® and BPN'-variant-1) were also tested.
The respective enzyme solutions (i.e., BPN'-variant-1, PURAFECT®, and 69B4 protease) were prepared containing 90 ppm protease in 0.1 M Citrate buffer, pH 3, 4, 5 and 6. Then, 10 ml tubes containing 1 ml of buffered enzyme solutions were placed in a GFL 1083 water bath set at 25°C, 35°C and 45°C, respectively, for 60 minutes. AAPF activity was determined for each enzyme sample at time 0 and 60 minutes as described above. The remaining enzyme activity was calculated, and the results are provided in Table 25-1 below and are shown in Figures 25-28.
As indicated by the data in Table 25-1, the ASP protease is exceptionally stable at pH 3, 4, 5, and 6, at temperatures between 25°C and 45°C, as compared to the BPN' variant-1 and PURAFECT®.
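
The remaining enzyme activity referred to above is the activity measured after the 60-minute incubation expressed as a percentage of the activity at time 0. The following Python sketch illustrates that calculation. The activity values are hypothetical numbers invented for the example, not results from Table 25-1, and the function name is an assumption.

```python
def percent_remaining(activity_t0, activity_t60):
    """Activity at 60 minutes as a percentage of the activity at time 0."""
    if activity_t0 <= 0:
        raise ValueError("activity at time 0 must be positive")
    return 100.0 * activity_t60 / activity_t0


# Hypothetical AAPF rates (absorbance units per minute), for illustration only.
rate_t0 = 0.0200
rate_t60 = 0.0190

print(f"remaining activity: {percent_remaining(rate_t0, rate_t60):.1f}%")
```

Expressing each sample relative to its own time-0 value allows enzymes with different specific activities to be compared on the same scale.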

EXAMPLE 26 Stability and Specificity of ASP
In this Example, experiments conducted to determine the stability and specificity differences between ASP, ASP mutants, and FNA are described. These experiments were performed by formulating liquid TIDE® detergent (Procter & Gamble) with calcium formate (an anionic surfactant titrant), borate (a P1 binder/inhibitor), and glycerol (water ordering), either independently of or in combination with each other. The enzyme was tested under these conditions and the residual enzyme activity was determined over time at a fixed temperature.
The experiments are described in greater detail below. Unformulated liquid TIDE® detergent (i.e., without added enzyme stabilizing chemicals) was divided into eleven aliquots. Then, glycerol, borax, or calcium formate were added to the detergent aliquots in the proportions shown in Table 26-1.


Each aliquot was pre-warmed to 90°F, and either FNA, ASP (wild-type) or an ASP R18 variant was added to approximately one gram per liter protease. After thorough mixing, a portion was removed and assayed for activity with synthetic AAPF-pNA substrate, as described above. After the assay, each aliquot was placed back into a 90°F oven. The assay process was repeated over time, and the decline in activity relative to T0 was plotted as % T0 activity remaining.
Surprisingly, it was found that ASP did not have the same calcium formate or glycerol dependency as FNA. Furthermore, it was determined that borate (alone) had the most dramatic effect on stabilizing ASP. It was also found that the addition of stabilizing chemicals provided significant benefits to the wild-type ASP, as well as the ASP R18 variant, indicating that the variant site is independent of the borate-activated site.
EXAMPLE 27 LAS Stability of ASP
In this Example, experiments conducted to determine the stability of ASP to anionic surfactants are described. LAS (linear alkyl sulfonate), an anionic surfactant, is a component of HDL detergents known to inactivate enzymes. The methods used are described above.
It was determined that wild-type ASP incubated in LAS dissolved in Tris HCl pH 8.6 is inactivated (See, Table 27-1, below). Further study revealed that inactivation is rapid (See, Table 27-2). As LAS is a negatively charged molecule, it was hypothesized that electrostatic attraction between LAS and the positively charged amino acid side chains of ASP was the cause of the LAS sensitivity. To test this hypothesis, arginine residues (wild-type ASP contains no lysine residues) were mutated to other amino acids.
Incubation of these mutants in 0.05%(w/v) LAS in Tris HCI pH8.6, for one hour revealed that all arginine replacement mutants were more stable than wild-type ASP. In contrast, non-arginine replacement mutations that were also tested for LAS stability were

generally not improved compared to wild-type (See, Table 27-3). Subsequent multiple arginine replacement mutations revealed that the enzyme is substantially more stable than the wild-type enzyme, and more stable that single arginine replacement mutations (See, Table 27-4).
Another anionic surfactant that is used in HDL detergents is AES. Wild-type ASP was found to be unstable in high concentrations of AES (See, Table 27-5). The mutant ASP R18 was found to be more stable than wild-type in AES (See, Table 27-5). Also, the rate of inactivation of activity by 5% AES was found to be higher for the wild-type than the ASP R18 mutant (See, Table 27-6). These results confirm that replacement of arginine residues of ASP improves the stability of ASP in anionic detergents in general. It is not intended that the present invention be limited to any specific anionic detergents or mutations. Indeed, it is contemplated that various anionic detergents (as well as other detergents) will find use in the present invention, as will various ASP mutants.
Table 27-1. Inactivation of ASP by LAS in Tris HCl pH 8.6
%LAS (w/v) % Activity of Control
Control (0 LAS) 100
0.01 87
0.03 77
0.06 59
0.10 47
0.30 31
0.60 20
1.00 12
Table 27-2. Time-course of ASP Inactivation by 0.1% LAS
Time (secs) % Remaining Activity
0 100
60 45
120 26
240 20
600 11

Table 27-3. Stability of ASP and Single Mutants (Incubated in 0.05% LAS in Tris HCl, pH 8.6, for 60 mins)
Mutant % Remaining Activity of 0 LAS Control
Wild-type 18
R14L 47
R16I 49
R16L 56
R16Q 51
R35F 43
R127A 59
R127K 31
R127Q 52
R159K 25
T36S 11
G65Q 22
Y75G 7
N76L 17
S76V 17
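The comparison drawn from Table 27-3 can be made explicit by dividing each variant's remaining activity by the wild-type value. The following is a minimal sketch only: the values are copied from Table 27-3, while the function names and the rule of treating any variant code that begins with "R" as an arginine replacement are illustrative assumptions and not part of the described methods.

```python
# Percent remaining activity after 60 min in 0.05% LAS, copied from Table 27-3.
remaining = {
    "Wild-type": 18, "R14L": 47, "R16I": 49, "R16L": 56, "R16Q": 51,
    "R35F": 43, "R127A": 59, "R127K": 31, "R127Q": 52, "R159K": 25,
    "T36S": 11, "G65Q": 22, "Y75G": 7, "N76L": 17, "S76V": 17,
}


def fold_vs_wild_type(table, reference="Wild-type"):
    """Return each variant's remaining activity divided by the reference value."""
    ref = table[reference]
    return {name: value / ref for name, value in table.items() if name != reference}


def is_arginine_replacement(code):
    """Assumption: a code starting with 'R' replaces a wild-type arginine."""
    return code.startswith("R")


folds = fold_vs_wild_type(remaining)
arginine = {k: v for k, v in folds.items() if is_arginine_replacement(k)}
others = {k: v for k, v in folds.items() if not is_arginine_replacement(k)}

for name, fold in sorted(folds.items(), key=lambda item: -item[1]):
    print(f"{name:6s} {fold:.2f}x wild-type")
```

With these values every arginine replacement gives a ratio above 1, and the best non-arginine variant stays below the weakest arginine replacement.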
Table 27-4. Stability of ASP and Multiple Arginine Replacements (Incubated in 0.05% LAS in Tris HCl, pH 8.6, for 60 mins)
Mutant % Remaining Activity of 0 LAS Control
Wild-type 27.5
ASP R-1 98.8
ASP R-2 69.6
ASP R-3 100.2
ASP R-7 103.9
ASP R-10B 98.9
ASP R-18 100.9
ASP R-23 79.4
In this Table,
R-1=R16Q/R35F/R159Q
R-2=R159Q
R-3=R16Q/R123L
R-7=R14L/R127Q/R159Q
R-10B=R14L/R179Q
R-18=R123L/R127Q/R179Q.
R-21=R16Q/R79T/R127Q
R-23=R16Q/R79T

Table 27-5. Inactivation of ASP and ASP Mutant R-18 by AES in Tris HCl pH 8.6
% Remaining Activity of 0% AES Control
%AES (v/w) Wild-type ASP ASP R-18
0 100 100
1 70 94
5 32 57
Table 27-6. Time-course of ASP and Mutant R-18 Inactivation by 5% AES in Tris HCl, pH 8.6
% Remaining Activity of 0% AES Control
Time (Mins) Wild-type ASP ASP R-18
0 100 100
90 99 105
4020 15 83
EXAMPLE 28 Determination of ASP Autolysis Sites in the Presence and Absence of LAS Detergent
In this Example, experiments conducted to determine the ASP autolysis sites in the presence and absence of LAS are described. ASP autolysis was evaluated in a buffer with and without LAS (dodecylbenzene-sulfonic acid). Autolysis peptide assignments were made based on molecular weight and sequence of each peptide (from MS and MS/MS data, respectively).
ASP (at a concentration of 0.35 ug/uL) was incubated (at 4°C) in 100 mM Tris pH 8.6 with and without 0.1% LAS (dodecylbenzene-sulfonic acid). Aliquots were taken at time periods from 0 to 30 min of incubation and autolysis was terminated by addition of TFA (final concentration 1%). Aliquots (10 uL) were analyzed by liquid chromatography coupled with electrospray tandem mass spectrometry (LC-ESI-MS/MS). Peptides were resolved using an HPLC system (model 1100, Agilent Technologies) using a reversed-phase column (Vydac C4, O.SmmID x 150mm), and a gradient from 0 to 100% solvent B in 60 min at a flow rate of 5 uL/min (generated using a static split from a pump flow rate of 250 uL/min). Solvent A consisted of 0.1% formic acid in water, and solvent B was 0.1% formic acid in acetonitrile.
Mass spectra were acquired using an ion trap mass spectrometer (model LCQ Classic, Thermo). The mass spectrometer was tuned for optimum detection of m/z of 785 and operated with a spray voltage of 2.5 kV and a heated capillary at 250°C. Mass spectra were acquired with an injection time of 500 msec and 5 microscans. Tandem MS spectra were acquired in data-dependent mode, with the most intense peak selected and fragmented with a normalized collision energy of 35%. For relative peptide quantitation, peak areas were determined using vendor software. The identity of the autolysis peptides was determined using a database search program (TurboSequest, Thermo) run on a database containing the ASP sequence. Database searches were performed with no enzyme selected, threshold of 10000, dta file parameters (peptide m/z error of 1.7, group 11, minimum ion count 15), and database parameters (peptide error of 2.2, MS/MS ions error of 0.0, both B,Y ions).
Without LAS in the sample buffer, ASP cleavages were primarily observed at the termini and in the middle of the molecule (positions Y9, F47, Y59, F165, Q174, Y176; See Table 28-1, below). Relative quantitative data for observed peptides and intact ASP were plotted over the course of the experiment (See, Figure 25, Panel A). The majority of the ASP remained intact and only 1% was in the form of cleaved peptides (protein:peptide ratio of 99:1). These data indicated that the majority of ASP remains intact, folded, and resistant to further autolytic cleavage.
With 0.1% LAS in the sample buffer, ASP cleavages were observed throughout the protein (positions Y9, T40, F47, Y57, F59, R61, L69, F165, Q174, Y176). The majority of the ASP was in the peptide form after 10 min (See, Figure 25, Panel B). After 60 min, the protein:peptide ratio was
Table 28-1. ASP Autolysis Peptides Observed With and Without 0.1% LAS


EXAMPLE 29 Use of Reversible Inhibitors to Reduce LAS-Induced Degradation of ASP
In this Example, experiments conducted to assess the use of reversible inhibitors to reduce LAS-induced degradation of ASP are described. Benzamidine (BZA) is a known reversible inhibitor of serine proteases. Using the standard succ-AAPF-pNA assay as described above, BZA was shown to inhibit the activity of approximately 2ug/ml ASP, with complete inhibition occurring at 1000mM (1M), as indicated in Table 29-1, below:

Approximately 200 ug/ml ASP was then incubated with 0.1% LAS, with and without 1M BZA, for up to 4 days. Enzyme activity was measured at different time points by addition of 10 ul of incubated sample to 990 ul of assay solution. This reduces the BZA concentration to 10 mM, which by reference to the table above is not inhibitory. Therefore, any loss of activity will be due to enzyme degradation. As indicated in the results below, enzyme incubated with 0.1% LAS and without BZA lost all activity (i.e., it was degraded), while enzyme incubated with 0.1% LAS and 1M BZA retained activity over the 4 day time-course of the study, demonstrating that inhibition of ASP activity prevents degradation by LAS.
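The dilution arithmetic behind this measurement can be checked with a short calculation. This is a sketch under the assumption of simple volumetric mixing; the function name is illustrative and not part of the described assay.

```python
def diluted_concentration(stock_conc, sample_volume, diluent_volume):
    """Concentration after mixing sample_volume of stock into diluent_volume.

    Both volumes must use the same unit; the result keeps the unit of stock_conc.
    """
    return stock_conc * sample_volume / (sample_volume + diluent_volume)


# 10 ul of incubated sample added to 990 ul of assay solution (a 100-fold dilution).
bza_in_assay_mM = diluted_concentration(1000.0, 10.0, 990.0)         # 1 M BZA in the sample
asp_in_assay_ug_per_ml = diluted_concentration(200.0, 10.0, 990.0)   # 200 ug/ml ASP in the sample

print(bza_in_assay_mM, asp_in_assay_ug_per_ml)
```

The same 100-fold dilution that takes BZA from 1 M to 10 mM also takes ASP from approximately 200 ug/ml to approximately 2 ug/ml, the enzyme concentration used in the inhibition assay described above.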

EXAMPLE 30 Testing of Mutant ASPs
In addition to the tests described above, tests were conducted on various mutants of ASP. The methods described above in Example 1 were used. In the following Tables, "Variant Code" provides the wild-type amino acid, the position in the amino acid sequence, and the replacement amino acid (i.e., "F001A" indicates that the phenylalanine at position 1 in the amino acid sequence has been replaced by alanine in this particular variant).
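The "Variant Code" convention can be read mechanically. The following is a minimal sketch, assuming one-letter codes for the 20 standard amino acids and the "/" separator used for multiple substitutions in Table 27-4; the function and variable names are hypothetical.

```python
import re

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

# Wild-type residue, position (leading zeros allowed), replacement residue.
VARIANT_CODE = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_variant_code(code):
    """Split a code such as 'F001A' into (wild-type, position, replacement)."""
    match = VARIANT_CODE.fullmatch(code.strip())
    if match is None:
        raise ValueError(f"not a variant code: {code!r}")
    wild_type, position, replacement = match.groups()
    if wild_type not in AMINO_ACIDS or replacement not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid in {code!r}")
    return wild_type, int(position), replacement


def parse_combination(codes):
    """Parse several substitutions joined by '/', e.g. 'R16Q/R35F/R159Q'."""
    return [parse_variant_code(part) for part in codes.split("/")]


print(parse_variant_code("F001A"))
print(parse_combination("R16Q/R35F/R159Q"))
```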
Keratin Hydrolysis
The table (Table 30-1) below provides the keratin hydrolysis data obtained for the ASP variants which show activity on this substrate in the keratin assay as described above ("Protease Assay with Keratin in Microtiter Plates"). The values are relative to wild type (WT) and calculated as described in the assay procedure. Values greater than 1 are indicative of better activity than WT ASP.
Table 30-1. Keratin Hydrolysis Results






S188L 1.04
Thermostability Assays
The data in the following table (Table 30-3) represent the thermostability of ASP variants relative to that of WT ASP under the same conditions. The stability was measured by determining casein activity before and after incubation at elevated temperature (See, "Thermostability Assays" above). Each value in the table is the quotient (Variant residual activity/WT residual activity). A value greater than one indicates higher thermostability than WT.
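The quotient can be written out as a short calculation. This is a sketch only: residual activity is taken here as activity after incubation divided by activity before incubation, and the numbers in the usage lines are hypothetical illustrations, not measured data.

```python
def residual_activity(activity_before, activity_after):
    """Fraction of casein activity left after incubation at elevated temperature."""
    return activity_after / activity_before


def relative_thermostability(variant_before, variant_after, wt_before, wt_after):
    """Variant residual activity divided by WT residual activity."""
    return (residual_activity(variant_before, variant_after)
            / residual_activity(wt_before, wt_after))


# Hypothetical readings: the variant keeps 60% of its activity, WT keeps 40%.
value = relative_thermostability(1.00, 0.60, 1.00, 0.40)
print(f"relative thermostability: {value:.2f}")
```

A value above one, as in this hypothetical case, would be reported as higher thermostability than WT.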

Table 30-3. Thermostability Assay Results


Cleaning Activity
In this Example, experiments conducted to determine the cleaning activity of ASP under various conditions, as well as the properties of the various wash conditions are described.
There is a wide variety of wash conditions including varying detergent formulations, wash water volume, wash water temperature, and length of wash time. Thus, detergent components such as proteases must be able to tolerate and function under adverse environmental conditions. For example, detergent formulations used in different areas have different concentrations of their relevant components present in the wash water. A European detergent typically has about 3000-8000 ppm of detergent components in the wash water, while a Japanese detergent typically has less than 800 ppm (e.g., 667 ppm) of detergent components in the wash water. In North America, particularly the United States, detergents typically have about 800 to 2000 ppm (e.g., 975 ppm) of detergent components present in the wash water.
Latin American detergents are generally high suds phosphate builder detergents and the range of detergents used in Latin America can fall in both the medium and high detergent concentrations, as they range from 1500 ppm to 6000 ppm of detergent components in the wash water. Brazilian detergents typically have approximately 1500 ppm of detergent components present in the wash water. However, other high suds phosphate builder detergent geographies, not limited to other Latin American countries, may have high detergent concentration systems up to about 6000 ppm of detergent components present in the wash water.
In light of the foregoing, it is evident that concentrations of detergent compositions in typical wash solutions throughout the world vary from less than about 800 ppm of detergent composition ("low detergent concentration geographies"), for example about 667 ppm in Japan, to between about 800 ppm and about 2000 ppm ("medium detergent concentration geographies"), for example about 975 ppm in the U.S. and about 1500 ppm in Brazil, to greater than about 2000 ppm ("high detergent concentration geographies"), for example about 3000 ppm to about 8000 ppm in Europe and about 6000 ppm in high suds phosphate builder geographies.
The concentrations of the typical wash solutions are determined empirically. For example, in the U.S., a typical washing machine holds a volume of about 64.4 L of wash solution. Accordingly, in order to obtain a concentration of about 975 ppm of detergent within the wash solution, about 62.79 g of detergent composition must be added to the 64.4 L of wash solution. This amount is the typical amount measured into the wash water by the consumer using the measuring cup provided with the detergent.
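The dose arithmetic and the concentration ranges described above can be sketched as follows. The sketch assumes that 1 ppm corresponds to 1 mg of detergent per liter of wash solution and uses the approximate range boundaries stated above as hard cut-offs; the function names are illustrative.

```python
def detergent_dose_grams(target_ppm, wash_volume_liters):
    """Grams of detergent needed to reach target_ppm, taking 1 ppm as 1 mg/L."""
    return target_ppm * wash_volume_liters / 1000.0


def concentration_geography(ppm):
    """Classify a wash concentration using the approximate ranges given above."""
    if ppm < 800:
        return "low"
    if ppm <= 2000:
        return "medium"
    return "high"


us_dose = detergent_dose_grams(975, 64.4)
print(f"U.S. dose: {us_dose:.2f} g")

for region, ppm in [("Japan", 667), ("U.S.", 975), ("Brazil", 1500), ("Europe", 3000)]:
    print(region, ppm, concentration_geography(ppm))
```

For the U.S. figures the calculation reproduces the approximately 62.79 g stated in the text.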
As a further example, different geographies use different wash temperatures. The temperature of the wash water in Japan is typically less than that used in Europe. For example, the temperature of the wash water in North America and Japan can be between 10 and 30°C (e.g., about 20°C), whereas the temperature of wash water in Europe is typically between 30 and 50°C (e.g., about 40°C).
As a further example, different geographies may have different water hardness. Water hardness is typically described as grains per gallon mixed Ca2+/Mg2+. Hardness is a measure of the amount of calcium (Ca2+) and magnesium (Mg2+) in the water. Most water in the United States is hard, but the degree of hardness varies from area to area. Moderately hard (60-120 ppm) to hard (121-181 ppm) water has 60 to 181 parts per million of hardness minerals (parts per million is converted to grains per U.S. gallon by dividing the ppm value by 17.1). Table 31-1 provides ranges of water hardness.


European water hardness is typically greater than 10.5 (e.g., 10.5-20.0) grains per gallon mixed Ca2+/Mg2+ (e.g., about 15 grains per gallon mixed Ca2+/Mg2+). North American water hardness is typically greater than Japanese water hardness, but less than European water hardness. For example, North American water hardness can be between 3 to 10 grains, 3-8 grains or about 6 grains. Japanese water hardness is typically lower than North American water hardness, typically less than 4, for example 3 grains per gallon mixed Ca2+/Mg2+.
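The conversion between parts per million and grains per U.S. gallon can be sketched as below, using the factor of 17.1 given above. The function names are illustrative, and the regional values are the example figures from the text.

```python
PPM_PER_GRAIN_PER_GALLON = 17.1  # conversion factor stated in the text


def ppm_to_grains_per_gallon(ppm):
    """Convert hardness in ppm to grains per U.S. gallon."""
    return ppm / PPM_PER_GRAIN_PER_GALLON


def grains_per_gallon_to_ppm(gpg):
    """Convert hardness in grains per U.S. gallon to ppm."""
    return gpg * PPM_PER_GRAIN_PER_GALLON


# Example hardness values from the text, in grains per gallon.
for region, gpg in [("Japan", 3), ("North America", 6), ("Europe", 15)]:
    print(f"{region}: {gpg} gpg is about {grains_per_gallon_to_ppm(gpg):.1f} ppm")

# The 60-181 ppm range for moderately hard to hard water, in grains per gallon.
low, high = ppm_to_grains_per_gallon(60), ppm_to_grains_per_gallon(181)
print(f"{low:.1f} to {high:.1f} gpg")
```

On this conversion the 60 to 181 ppm range corresponds to roughly 3.5 to 10.6 grains per gallon.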
The present invention provides protease variants that provide improved wash performance in at least one set of wash conditions and typically in multiple wash conditions.
As described herein, the protease variants are tested for performance in different types of detergent and wash conditions using a microswatch assay (See above, and U.S. Pat. Appln. Ser. No. 09/554,992; and WO 99/34011, both of which are incorporated by reference herein). Protease variants are tested for other soil substrates also in a similar fashion.
In the experiments conducted to determine cleaning activity of ASP, the following methods were used. Incubators (Innova 4330 Model Incubator, New Brunswick) were pre-warmed for 60 minutes to 40°C for "European" conditions and to 20°C for "Japanese" conditions. Blood-Milk-Ink swatches (EMPA 116) were obtained from the Swiss Federal Laboratories for Material Testing and from CFT Research, and were modified by exposure to 0.03% hydrogen peroxide for 30 minutes at 60°C, then dried. Circles of 1/4" diameter were cut from the dried swatches and placed vertically, one per well, in a 96 well microplate.
Protease samples of ASP were diluted in 10 mM NaCl, 0.005% TWEEN®-80 to provide the desired concentration of 10 ppm (protein). To provide "North American wash conditions," 1 gram per liter TIDE® laundry detergent (Procter & Gamble) without bleach was prepared in deionized water, and a concentrated stock of calcium and magnesium was added to result in a final water hardness value of 6 grains per gallon. To provide "European wash conditions," 7.6 gram per liter ARIEL® REGULAR laundry detergent (Procter & Gamble) without bleach was prepared in deionized water, and a concentrated stock of calcium and magnesium was added to result in a final water hardness value of 15 grains per gallon. To provide "Japanese wash conditions," 0.67 gram per liter PURE CLEAN laundry detergent (Procter & Gamble) without bleach was prepared in deionized water, and a concentrated stock of calcium and magnesium was added to result in a final water hardness value of 3 grains per gallon.
In yet another detergent composition to provide "Japanese wash conditions with North American detergent formulation," 0.66 gram per liter Detergent Composition III without bleach was prepared in deionized water, and a concentrated stock of calcium and magnesium was added to result in a final water hardness value of 3 grains per gallon.
The detergent solutions were allowed to mix for 15 minutes and were then filtered through a 0.2 micron cellulose acetate filter. Then, 190 ul of the respective detergent solution was added to the appropriate wells of a microplate, followed by 10 ul of the enzyme preparation, in order to obtain a final concentration of 0.25-3.0 ppm (micrograms per milliliter) of enzyme in a total volume of 200 ul. The microplate was then sealed to prevent leakage, placed in a holder on an incubator/shaker set to 20°C and 350/400 RPM and allowed to shake for one hour.
The plate was then removed from the incubator/shaker and an aliquot of 100ul of solution was removed from each well, and placed on a fresh Costar microtiter plate (Corning). The absorbance at 405 nm wavelength was read for each aliquot on a Microtiter plate reader (SpectraMax 340, Molecular Devices), and reported. The detergent composition and incubation conditions in the microswatch assay are set forth in Table 31-2.
Table 31-2. Detergent Composition and Incubation Conditions

The dose response curves depicting absorbance at 405 nm as a function of concentration (ppm in well), for PURAFECT® (Genencor), OPTIMASE® (Genencor), RELASE™ (Genencor; GG36-variant described above), and ASP are provided in Figures 23-27.
As indicated in Figure 26, under North American conditions, in liquid TIDE® detergent, the ASP protease showed enhanced cleaning performance as compared to PURAFECT®, RELASE™ and OPTIMASE™ proteases under the same conditions. Under Japanese conditions, in Detergent Composition III powder (0.66 g/l), ASP showed enhanced or the same cleaning performance as compared to PURAFECT®, RELASE™ and OPTIMASE™ proteases under the same conditions (See, Figure 27). Under European conditions, in ARIEL® REGULAR powder detergent, the ASP protease showed enhanced cleaning performance as compared to PURAFECT®, RELASE™ and OPTIMASE™ proteases under the same conditions (See, Figure 28). In both tests, ASP and OPTIMASE™ provided results that were 2 to 10 times the absorbance at 405 nm as compared to PURAFECT® and RELASE™. Under Japanese conditions, in PURE CLEAN powder detergent (See, Figure 29), the ASP protease showed enhanced or comparative cleaning performance as compared to PURAFECT®, RELASE™ and OPTIMASE™ proteases under the same conditions. Under North American conditions, in Detergent Composition III powder detergent (See, Figure 30), the ASP protease showed enhanced or comparative cleaning performance as compared to PURAFECT®, RELASE™ and OPTIMASE™ proteases under the same conditions.
EXAMPLE 32 Liquid Fabric Cleaning Compositions
This Example provides liquid fabric cleaning compositions that find use in conjunction with the present invention. These compositions are contemplated to find particular utility under Japanese machine wash conditions, as well as for applications involving cleaning of fine and/or delicate fabrics. Table 32-1 provides a suitable composition. However, it is not intended that the present invention be limited to this specific formulation, as many other formulations find use with the present invention.

EXAMPLE 33 Liquid Dishwashing Compositions
This Example provides liquid dishwashing compositions that find use in conjunction with the present invention. These compositions are contemplated to find particular utility under Japanese dish washing conditions. Table 33-1 provides suitable compositions. However, it is not intended that the present invention be limited to these specific formulations, as many other formulations find use with the present invention.


EXAMPLE 34 Liquid Fabric Cleaning Compositions
The proteases of the present invention find particular use in cleaning compositions. For example, it is contemplated that liquid fabric cleaning compositions of particular utility under Japanese machine wash conditions be prepared in accordance with the invention. In some preferred embodiments, these compositions comprise the components shown in Table 34-1.


EXAMPLE 35 Granular Fabric Cleaning Compositions
In this Example, various granular fabric cleaning compositions that find use with the present invention are provided. The following Tables provide suitable compositions. However, it is not intended that the present invention be limited to these specific formulations, as many other formulations find use with the present invention.



The following laundry detergent compositions are contemplated to provide particular utility under European machine wash conditions.
EXAMPLE 36 Detergent Formulations
In this Example, various detergent formulations which find use with ASP and/or ASP variants are provided. It is understood that the test methods provided in this section must be used to determine the respective values of the parameters of the present invention.
In the exemplified detergent compositions, the enzyme levels are expressed as pure enzyme by weight of the total composition and, unless otherwise specified, the detergent ingredients are expressed by weight of the total composition. The abbreviated component identifications therein have the following meanings:

Table 36-1. Definitions Used in this Example
LAS : Sodium linear C11-13 alkyl benzene sulfonate.
TAS : Sodium tallow alkyl sulphate.
CxyAS : Sodium C1x - C1y alkyl sulfate.
CxyEz : C1x - C1y predominantly linear primary alcohol condensed with an average of z moles of ethylene oxide.
CxyAEzS : C1x - C1y sodium alkyl sulfate condensed with an average of z moles of ethylene oxide. Added molecule name in the examples.
Nonionic : Mixed ethoxylated/propoxylated fatty alcohol e.g. Plurafac LF404 being an alcohol with an average degree of ethoxylation of 3.8 and an average degree of propoxylation of 4.5.
QAS : R2.N+(CH3)2(C2H4OH) with R2 = C12-C14.
Silicate : Amorphous Sodium Silicate (SiO2:Na2O ratio = 1.6-3.2:1).
Metasilicate : Sodium metasilicate (SiO2:Na2O ratio = 1.0).
Zeolite A : Hydrated Aluminosilicate of formula Na12(AlO2SiO2)12.27H2O.
SKS-6 : Crystalline layered silicate of formula δ-Na2Si2O5.
Sulfate : Anhydrous sodium sulphate.
STPP : Sodium Tripolyphosphate.
MA/AA : Random copolymer of 4:1 acrylate/maleate, average molecular weight about 70,000-80,000.
AA : Sodium polyacrylate polymer of average molecular weight 4,500.
Polycarboxylate : Copolymer comprising mixture of carboxylated monomers such as acrylate, maleate and methyacrylate with a MW ranging between 2,000-80,000 such as Sokolan commercially available from BASF, being a copolymer of acrylic acid, MW 4,500.
BB1 : 3-(3,4-Dihydroisoquinolinium)propane sulfonate.
BB2 : 1-(3,4-dihydroisoquinolinium)-decane-2-sulfate.
PB1 : Sodium perborate monohydrate.
PB4 : Sodium perborate tetrahydrate of nominal formula NaBO3.4H2O.
Percarbonate : Sodium percarbonate of nominal formula 2Na2CO3.3H2O2.
TAED : Tetraacetyl ethylene diamine.
NOBS : Nonanoyloxybenzene sulfonate in the form of the sodium salt.
DTPA : Diethylene triamine pentaacetic acid.
HEDP : 1,1-hydroxyethane diphosphonic acid.
DETPMP : Diethyltriamine penta (methylene) phosphonate, marketed by Monsanto under the Trade name Dequest 2060.
EDDS : Ethylenediamine-N,N'-disuccinic acid, (S,S) isomer in the form of its sodium salt.
Diamine : Dimethyl aminopropyl amine; 1,6-hexane diamine; 1,3-propane diamine; 2-methyl-1,5-pentane diamine; 1,3-pentanediamine; 1-methyl-diaminopropane.
DETBCHD : 5,12-diethyl-1,5,8,12-tetraazabicyclo [6,6,2] hexadecane, dichloride, Mn(II) salt.
PAAC : Pentaamine acetate cobalt(III) salt.
Paraffin : Paraffin oil sold under the tradename Winog 70 by Wintershall.
Paraffin Sulfonate : A Paraffin oil or wax in which some of the hydrogen atoms have been replaced by sulfonate groups.
Aldose oxidase : Oxidase enzyme sold under the tradename Aldose Oxidase by Novozymes A/S.
Galactose oxidase : Galactose oxidase from Sigma.
Protease : Proteolytic enzyme sold under the tradename Savinase, Alcalase, Everlase by Novo Nordisk A/S, and the following from Genencor International, Inc: "Protease A" described in US RE 34,606 in Figures 1A, 1B, and 7, and at column 11, lines 11-37; "Protease B" described in US 5,955,340 and US 5,700,676 in Figures 1A, 1B and 5, as well as Table 1; and "Protease C" described in US 6,312,936 and US 6,482,628 in Figures 1-3 [SEQ ID 3], and at column 25, line 12; "Protease D" being the variant 101G/103A/104I/159D/232V/236H/245R/248D/252K (BPN' numbering) described in WO 99/20723.
Amylase : Amylolytic enzyme sold under the tradename Purafect® Ox Am described in WO 94/18314, WO 96/05295 sold by Genencor; Natalase®, Termamyl®, Fungamyl® and Duramyl®, all available from Novozymes A/S.
Lipase : Lipolytic enzyme sold under the tradename Lipolase, Lipolase Ultra by Novozymes A/S and Lipomax by Gist-Brocades.
Cellulase : Cellulytic enzyme sold under the tradename Carezyme, Celluzyme and/or Endolase by Novozymes A/S.
Pectin Lyase : Pectaway® and Pectawash® available from Novozymes A/S.
PVP : Polyvinylpyrrolidone with an average molecular weight of 60,000.
PVNO : Polyvinylpyridine-N-Oxide, with an average molecular weight of 50,000.
PVPVI : Copolymer of vinylimidazole and vinylpyrrolidone, with an average molecular weight of 20,000.
Brightener 1 : Disodium 4,4'-bis(2-sulphostyryl)biphenyl.
Silicone antifoam : Polydimethylsiloxane foam controller with siloxane-oxyalkylene copolymer as dispersing agent with a ratio of said foam controller to said dispersing agent of 10:1 to 100:1.
Suds Suppressor : 12% Silicone/silica, 18% stearyl alcohol, 70% starch in granular form.
SRP1 : Anionically end capped poly esters.
PEGX : Polyethylene glycol, of a molecular weight of x.
PVP K60® : Vinylpyrrolidone homopolymer (average MW 160,000).
Jeffamine® ED-2001 : Capped polyethylene glycol from Huntsman.
Isachem® AS : A branched alcohol alkyl sulphate from Enichem.
MME PEG (2000) : Monomethyl ether polyethylene glycol (MW 2000) from Fluka Chemie AG.
DC3225C : Silicone suds suppresser, mixture of Silicone oil and Silica from Dow Corning.
TEPAE : Tetraethylenepentaamine ethoxylate.
BTA : Benzotriazole.
Betaine : (CH3)3N+CH2COO-
Sugar : Industry grade D-glucose or food grade sugar.
CFAA : C12-C14 alkyl N-methyl glucamide.
TPKFA : C12-C14 topped whole cut fatty acids.
Clay : A hydrated aluminium silicate in a general formula Al2O3SiO2.xH2O. Types: Kaolinite, montmorillonite, atapulgite, illite, bentonite, halloysite.
pH : Measured as a 1% solution in distilled water at 20°C.

The following Table (Table 36-2) provides liquid laundry detergent compositions that are prepared.

# added to product to adjust the neat pH of the product to about 4.2 for (I) and about 3.8 for (II).

The following Table (Table 36-3) provides hand dish liquid detergent compositions that are prepared.

The pH of these compositions is about 8 to about 11

Table 36-4 provides liquid automatic dishwashing detergent compositions that are prepared.
Table 36-4. Liquid Automatic Dishwashing Detergent Compositions
Component I II III IV V
STPP 16 16 18 16 16
Potassium Sulfate - 10 8 - 10
1,2 propanediol 6.0 0.5 2.0 6.0 0.5
Boric Acid 4.0 3.0 3.0 4.0 3.0
CaCl2 dihydrate 0.04 0.04 0.04 0.04 0.04
Nonionic 0.5 0.5 0.5 0.5 0.5
ASP 0.1 0.03 0.05 0.03 0.06
Protease B ... 0.01
Amylase 0.02 - 0.02 0.02
Aldose Oxidase - 0.15 0.02 - 0.01
Galactose Oxidase - - 0.01 - 0.01
PAAC 0.01 - - 0.01
DETBCHD - 0.01 - - 0.01
Balance to 100% perfume / dye and/or water
Table 36-5 provides laundry compositions that are prepared and that may be in the form of granules or tablets.
Table 36-5. Laundry Compositions
Base Product I II III IV V
C14-C15AS or TAS 8.0 5.0 3.0 3.0 3.0
LAS 8.0 - 8.0 - 7.0
C12-C15AE3S 0.5 2.0 1.0
C12-C15E5 or E3 2.0 - 5.0 2.0 2.0
QAS - - - 1.0 1.0
Zeolite A 20.0 18.0 11.0 - 10.0
SKS-6 (dry add) - - 9.0 -
MA/AA 2.0 2.0 2.0
AA 4.0
3Na Citrate 2H2O - 2.0 -
Citric Acid (Anhydrous) 2.0 - 1.5 2.0
DTPA 0.2 0.2 -
EDDS - - 0.5 0.1
HEDP - - 0.2 0.1
PB1 3.0 4.8 - - 4.0
Percarbonate - - 3.8 5.2
NOBS 1.9 - - - -
NACAOBS - - 2.0 -
TAED 0.5 2.0 2.0 5.0 1.00
BB1 0.06 - 0.34 - 0.14
BB2 - 0.14 - 0.20
Anhydrous Na Carbonate 15.0 18.0 8.0 15.0 15.0
Sulfate 5.0 12.0 2.0 17.0 3.0
Silicate - 1.0 - - 8.0
ASP 0.03 0.05 1.0 0.06 0.1
Protease B - 0.01
Protease C ... 0.01
Lipase - 0.008 ...
Amylase 0.001 - - - 0.001
Cellulase - 0.0014 -
Pectin Lyase 0.001 0.001 0.001 0.001 0.001
Aldose Oxidase 0.03 - 0.05
PAAC - 0.01 - - 0.05
Balance to 100% Moisture and/or Minors*
* Perfume, Dye, Brightener / SRP1 / Na Carboxymethylcellulose/ Photobleach / MgSO4 / PVPVI/ Suds suppressor /High Molecular PEG/Clay.
Table 36-6 provides liquid laundry detergent formulations which are prepared.
Table 36-6. Liquid Laundry Detergent Formulations
Component I I II III IV V
LAS 11.5 11.5 9.0 - 4.0
C12-C15AE2.85S - - 3.0 18.0 - 16.0
C14-C15E25S 11.5 11.5 3.0 - 16.0
C12-C13E9 - - 3.0 2.0 2.0 1.0
C12-C15E7 3.2 3.2 - -
CFAA - - 5.0 3.0
TPKFA 2.0 2.0 - 2.0 0.5 2.0
Citric Acid (Anhydrous) 3.2 3.2 0.5 1.2 2.0 1.2
Ca formate 0.1 0.1 0.06 0.1
Na formate 0.5 0.5 0.06 0.1 0.05 0.05
Na Cumene Sulfonate 4.0 4.0 1.0 3.0 1.2
Borate 0.6 0.6 - 3.0 2.0 3.0
Na Hydroxide 6.0 6.0 2.0 3.5 4.0 3.0
Ethanol 2.0 2.0 1.0 4.0 4.0 3.0
1,2 Propanediol 3.0 3.0 2.0 8.0 8.0 5.0
Monoethanolamine 3.0 3.0 1.5 1.0 2.5 1.0
TEPAE 2.0 2.0 - 1.0 1.0 1.0
ASP 0.03 0.05 0.01 0.03 0.08 0.02
Protease A - - 0.01
Lipase - - - 0.002
Amylase - - - - 0.002
Cellulase - - ... 0.0001
Pectin Lyase 0.005 0.005
Aldose Oxidase 0.05 - - 0.05 - 0.02
Galactose oxidase - 0.04


Balance to 100% Moisture and/or Minors*
*Brightener / Dye / SRP1 / Na Carboxymethylcellulose/ Photobleach / MgSO4 / PVPVI/ Suds
suppressor /High Molecular PEG/Clay.
The pH of the above compositions is from about 9.6 to about 11.3.

Table 36-8 provides tablet detergent compositions of the present invention that are prepared by compression of a granular dishwashing detergent composition at a pressure of 13 kN/cm2 using a standard 12 head rotary press:

Table 36-8. Tablet Detergent Compositions
Component
STPP
3Na Citrate 2H2O 20.0
Na Carbonate
Silicate
Lipase
Protease B
Protease C - 0.01
ASP
Amylase
Pectin Lyase
Aldose Oxidase
PB1
Percarbonate
BB1
BB2
Nonionic
PAAC
DETBCHD - - - 0.02 0.02
TAED ----- 2.1 - 1.6
HEDP
DETPMP
Paraffin
BTA
Polycarboxylate
PEG 400-30,000
Glycerol
Perfume
Balance to 100% Moisture and/or Minors*
*Brightener / SRP1 / Na Carboxymethylcellulose/ Photobleach / MgSO4 / PVPVI/ Suds
suppressor /High Molecular PEG/Clay.
The pH of these compositions is from about 10 to about 11.5.
The tablet weight of these compositions is from about 20 grams to about 30 grams.
Table 36-9 provides liquid hard surface cleaning detergent compositions of the present invention that are prepared.
Table 36-9. Liquid Hard Surface Cleaning Detergent Compositions
Component I II III IV V VI VII
C9-C11E5
LAS
Sodium cumene sulfonate 1.5
Isachem ® AS
Na2CO3
3Na Citrate 2H2O
NaOH
Fatty Acid
2-butyl octanol
PEG DME-2000®
PVP
MME PEG (2000) ®
Jeffamine ® ED-2001
PAAC
DETBCHD
ASP
Protease B
Amylase
Lipase
Pectin Lyase
PB1
Aldose Oxidase
Balance to 100% perfume / dye and/or water
The pH of these compositions is from about 7.4 to about 9.5.


EXAMPLE 37 Animal Feed Comprising ASP
The present invention also provides animal feed compositions comprising ASP and/or ASP variants. In this Example, one such feed, suitable for poultry, is provided. However, it is not intended that the present invention be limited to this specific formulation, as the proteases of the present invention find use with numerous other feed formulations. It is further intended that the feeds of the present invention be suitable for administration to any animal, including but not limited to livestock (e.g., cattle, pigs, sheep, etc.), as well as companion animals (e.g., dogs, cats, horses, rodents, etc.). The following Table provides a formulation for a mash, namely a maize-based starter feed suitable for administration to turkey poults up to 3 weeks of age.

In some embodiments, this feed formulation is supplemented with various concentrations of the protease(s) of the present invention (e.g., 2,000 units/kg, 4,000 units/kg and 6,000 units/kg).
All patents and publications mentioned in the specification are indicative of the levels of those skilled in the art to which the invention pertains. All patents and publications are herein incorporated by reference to the same extent as if each individual publication was specifically and individually indicated to be incorporated by reference. However, the citation of any publication is not to be construed as an admission that it is prior art with respect to the present invention.
Having described the preferred embodiments of the present invention, it will appear to those ordinarily skilled in the art that various modifications may be made to the disclosed embodiments, and that such modifications are intended to be within the scope of the present invention.




We claim:
1. An isolated serine protease obtained from a Cellulomonas species which has at least 70% sequence identity to SEQ ID NO:8.
2. The serine protease as claimed in Claim 1, wherein said protease is obtained from Cellulomonas 69B4.
3. The serine protease as claimed in Claim 2, wherein said protease comprises the amino acid sequence set forth in SEQ ID NO:6, SEQ ID NO:7 or SEQ ID NO:8.
4. The serine protease as claimed in Claim 3, wherein said protease comprises the amino acid sequence set forth in SEQ ID NO:8.
5. The amino acid sequence as claimed in Claim 2, wherein said sequence comprises substitutions at at least one amino acid position selected from the group comprising positions 2, 8, 10, 11, 12, 13, 14, 15, 16, 24, 26, 31, 33, 35, 36, 38, 39, 40, 43, 46, 49, 51, 54, 61, 64, 65, 67, 70, 71, 76, 78, 79, 81, 83, 85, 86, 90, 93, 99, 100, 105, 107, 109, 112, 113, 116, 118, 119, 121, 123, 127, 145, 155, 159, 160, 163, 165, 170, 174, 179, 183, 184, 185, 186, 187, and 188.
6. The amino acid sequence as claimed in Claim 2, wherein said sequence comprises substitutions at at least one amino acid position selected from the group comprising positions 1, 4, 22, 27, 28, 30, 32, 41, 47, 48, 55, 59, 63, 66, 69, 75, 77, 80, 84, 87, 88, 89, 92, 96, 110, 111, 114, 115, 117, 128, 134, 144, 143, 146, 151, 154, 156, 158, 161, 166, 176, 177, 181, 182, 187, and 189.
7. An isolated protease variant from a Cellulomonas species comprising at least one substitution of an amino acid made at a position equivalent to a position in a Cellulomonas 69B4 protease comprising the amino acid sequence set forth in SEQ ID NO:8, wherein said variant protease comprises at least 70% sequence identity to SEQ ID NO:8.
8. The isolated protease as claimed in Claim 7, wherein said substitution is made at a position equivalent to position 2, 8, 10, 11, 12, 13, 14, 15, 16, 24, 26, 31, 33, 35, 36, 38, 39, 40, 43, 46, 49, 51, 54, 61, 64, 65, 67, 70, 71, 76, 78, 79, 81, 83, 85, 86, 90, 93, 99, 100, 105, 107, 109, 112, 113, 116, 118, 119, 121, 123, 127, 145, 155, 159, 160, 163, 165, 170, 174, 179, 183, 184, 185, 186, 187, or 188 in a Cellulomonas 69B4 protease comprising an amino acid sequence set forth in SEQ ID NO:8.
9. The isolated protease as claimed in Claim 7, wherein said substitution is made at a position equivalent to position 1, 4, 22, 27, 28, 30, 32, 41, 47, 48, 55, 59, 63, 66, 69, 75, 77, 80, 84, 87, 88, 89, 92, 96, 110, 111, 114, 115, 117, 128, 134, 144, 143, 146, 151, 154, 156, 158, 161, 166, 176, 177, 181, 182, 187, or 189, in a Cellulomonas 69B4 protease comprising an amino acid sequence set forth in SEQ ID NO:8.
10. An isolated protease comprising the amino acid sequence set forth in SEQ ID NO:8, wherein at least one amino acid position at positions selected from the group consisting of 14, 16, 35, 36, 65, 75, 76, 79, 123, 127, 159, and 179, is substituted with another amino acid.
11. The protease as claimed in Claim 10, wherein said protease comprises at least one mutation selected from the group consisting of R14L, R16I, R16L, R16Q, R35F, T36S, G65Q, Y75G, N76L, N76V, R79T, R123L, R123Q, R127A, R127K, R127Q, R159K, R159Q, and R179Q.
12. The protease as claimed in Claim 11, wherein said protease comprises multiple mutations selected from the group consisting of R16Q/R35F/R159Q, R16Q/R123L, R14L/R127Q/R159Q, R14L/R179Q, R123L/R127Q/R179Q, R16Q/R79T/R127Q, and R16Q/R79T.
13. The protease as claimed in Claim 12, wherein said protease comprises the following mutations R123L, R127Q, and R179Q.
14. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises at least one substitution selected from the group consisting of T36I, A38R, N170Y, N73T, G77T, N24A, T36G, N24E, L69S, T36N, T36S, E119R, N74G, T36W, S76W, N24T, N24Q, T36P, S76Y, T36H, G54D, G78A, S187P, R179V, N24V, V90P, T36D, L69H, G65P, G65R, N7L, W103M, N55F, G186E, A70H, S76V, G186V, R159F, T36Y, T36V, G65V, N24M, S51A, G65Y, Q71I, V66H, P118A, T116F, A38F, N24H, V66D, S76L, G177M, G186I, H85Q, Q71K, Q71G, G65S, A38D, P118F, A38S, G65T, N67G, T36R, P118R, S114G, Y75I, I181H, G65Q, Y75G, T36F, A38H, R179M, T183I, G78S, A64W, Y75F, G77S, N24L, W103I, V3L, Q81V, R179D, G54R, T36L, Q71M, A70S, G49F, G54L, G54H, G78H, R179I, Q81K, V90I, A38L, N67L, T109I, R179N, V66I, G78T, R179Y, S187T, N67K, N73S, E119K, V3I, Q71H, I11Q, A64H, R14E, R179T, L69V, V150L, Q71A, G65L, Q71N, V90S, A64N, I11A, N145I, H85T, A64Y, N145Q, V66L, S92G, S188M, G78D, N67A, N7S, V80H, G54K, A70D, P118H, D2G, G54M, Q81H, D2Q, V66E, R79P, A38N, N145E, R179L, T109H, R179K, V66A, G54A, G78N, T109A, R179A, N7A, R179E, H104K, A64R, and V80L.
15. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises at least one substitution selected from the group consisting of H85R, H85L, T62I, N67H, G54I, N24F, T40V, T86A, G63V, G54Q, A64F, G77Y, R35F, T129S, R61M, I126L, S76N, T182V, R79G, T109P, R127F, R123E, P118I, T109R, 171S, T183K, N67T, P89N, F1T, A64K, G78I, T109L, G78V, A64M, A64S, T10G, G77N, A64L, N67D, S76T, N42H, D184F, D184R, S76I, S78R, A38K, V72I, V3T, T107S, A38V, F47I, N55Q, S76E, P118Q, T109G, Q71D, P118K, N67S, Q167N, N145G, I28L, I11T, A64I, G49K, G49A, G65A, N170D, H85K, S185I, I181N, V80F, L69W, S76R, D184H, V150M, T183M, N67Q, S51Q, A38Y, T107V, N145T, Q71F, A83N, S76A, N67R, T151L, T163L, S51F, Q81I, F47M, A41N, P118E, N67Y, T107M, N73H, 67V, G63W, T10K, I181G, S187E, T107H, D2A, L142V, A143N, A8G, S187L, V90A, G49L, N170L, G65H, T36C, G12W, S76Q, A143S, F1A, N7H, S185V, A110T, N55K, N67F, N7I, A110S, N170A, Q81D, A64Q, Q71L, A38I, N112I, V90T, N145L, A64T, I11S, A30S, R123I, D2H, V66M, Q71R, V90L, L68W, N24S, R159E, V66N, D184Q, E133Q, A64V, D2N, G13M, T40S, S76K, G177S, G63Q, S15F, A8K, A70G, and A38G.
16. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises at least one substitution selected from the group consisting of R35E, R35D, R14E, R14D, Q167E, G49C, S15R, S15H, I11W, S15C, G49Q, R35Q, R35V, G49E, R123D, R123Y, G49H, A38D, R35S, F47R, R123C, T151L, RUT, R35T, R123E, G49A, G49V, D56L, R35N, R35A, G12D, R35C, R123N, T46V, R123H, S155C, T121E, R127E, S113C, R123T, R16E, T46F, T121L, A38C, T46E, R123W, T44E, N55G, A8G, E119G, R35P, R14G, F59W, R127S, R61E, R14S, S155W, R123F, R123S, G49N, R127D, E119Y, A48E, N170D, R159T, S99A, G12Q, P118R, F165W, R127Q, R35H, G12N, A22C, G12V, R16T, Y57G, T100A, T46Y, R159E, E119R, T107R, T151C, G54C, E119T, R61V, I11E, R14I, R61M, S15E, A22S, R16C, T36C, R16V, L125Q, M180L, R123Q, R14A, R14Q, R35M, R127K, R159Q, N112P, G124D, R179E, G49L, A41D, G177D, R123V, E119V, T10L, T109E, R179D, G12S, T10C, G91Q, S15Y, S155Y, R14C, T163D, T121F, R14N, F165E, N24E, A41C, R61T, G12I, P118K, T46C, I11T, R159D, N170C, R159V, S155I, I11Q, D2P, T100R, R159S, S114C, R16D, and P134R.
17. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises at least one substitution selected from the group consisting of S99G, T100K, R127A, F1P, S155V, T128A, F165H, G177E, A70M, S140P, A87E, D2I, R159K, T36V, R179C, E119N, T10Y, I172A, A8T, F47V, W103L, R61K, D2V, R179V, D2T, R159N, E119A, G54E, R16Q, G49S, R16I, S51L, S155E, S15M, R179I, T10Q, G12H, R159C, R179T, T163C, R159A, A132S, N157D, G13E, L141M, A41T, R123M, R14M, A8R, Q81P, N24T, T10D, A88F, R61Q, S99K, R179Y, T121A, N112E, S155T, T151V, S99Q, T10E, S92T, T109K, T44C, R123A, A87C, S15F, S155F, D56F, T10F, A83H, R179M, T121D, G13D, P118C, G49F, Q174C, S114E, T86E, F1N, T115C, R127C, R123K, V66N, G12Y, S113A, S15N, A175T, R79T, R123G, R179S, R179N, R123I, P118A, S187E, N112D, A70G, E119L, E119S, R159M, R14H, R179F, A64C, A41S, R179W, N24G, T100Q, P118W, Q81G, G49K, R14L, N55A, R35K, R79V, D2M, T160D, A83D, R179L, S51A, G12P, S99H, N42D, S188E, T10M, L125M, T116N, A70P, Q174S, G65D, S113D, E119Q, A83E, N170L, Q81A, S51C, P118G, Q174T, I28V, S15G, and T116G.
18. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises at least one substitution selected from the group consisting of G26I, G26K, G26Q, G26V, G26W, F27V, F27W, I28P, T29E, T129W, T40D, T40Q, R43D, P43H, P43K, P43L, A22C, T40H, P89W, G91L, S18E, F59K, A30M, A30N, G31M, C33M, G161L, G161V, P43N, G26E, N73P, G84C, G84P, G45V, C33L, Y9E, Y9P, A147E, C158H, I28W, A48P, A22S, T62R, S137R, S155P, S155R, G156I, G156L, Q81A, R96C, I4D, I4P, A70P, C105E, C105G, C105K, C105M, C105N, C105S, T128A, T128V, T128G, S140P, G12D, C33N, C33E, T164G, G45A, G156P, S99A, Q167L, S155W, I28T, R96F, A30P, R123W, T40P, T39R, C105P, T100A, C105W, S155K, T46Y, R123F, I4G, S155Y, T46V, A93S, Y57N, Q81S, G186S, G31H, T10Y, G31V, A83H, A38D, R123Y, R79T, C158G, G31Y, Q81P, R96E, A30Y, R159K, A22T, T40N, Y57M, G31N, Q81G, T164L, T121E, T10F, Q146P, R123N, V3R, P43G, Q81H, Q81D, G161I, C158M, N24T, T10W, T128S, T160I, Y176P, S155F, T128C, L125A, P168Y, T62G, F166S, S188A, Q81F, T46W, A70G, and A38G.
19. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises at least one substitution selected from the group consisting of S188E, S188V, Y117K, Y117Q, Y117R, Y117V, R127K, R127Q, R123L, T86S, R123I, Q81E, L125M, H32A, S188T, N74F, C33D, F27I, A83M, Q71Y, R123T, V90A, F59W, L141C, N170E, T46F, S51V, G162P, S185R, A41S, R79V, T151C, T107S, T129Y, M180L, F166C, C105T, T160E, P89A, R159T, T183P, S188M, T10L, G25S, N24S, E119L, T107L, T107Q, G161K, G15Q, S15R, G153K, G153V, S188G, A83E, G186P, T121D, G49A, S15C, C105Y, C105A, R127F, Q71A, T10C, R179K, T86I, W103N, A87S, F166A, A83F, R123Q, A132C, A143H, T163I, T39V, A93D, V90M, R123K, P134W, G177N, V115I, S155T, T110D, G105L, N170D, T107A, G84V, G84M, L111K, P168I, G154L, T183I, S99G, S15T, A8G, S15N, P189S, S188C, T100Q, A110G, A121A, G12A, R159V, G31A, G154R, T182L, V115L, T160Q, T107F, R159Q, G144A, S92T, T101S, A83R, G12HM S15H, T116Q, T36V, G154, Q81C, V130T, T183A, P118T, A87E, T86M, V150N, and N24E.
20. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises at least one substitution selected from the group consisting of T36I, I172T, N24E, N170Y, G77T, G186N, I181L, N73T, A38R, N74G, N24A, G54D, S76D, R123E, 159E, N112E, R35E, R179V, R123D, N24T, R179T, R14L, A38D, V90P, R14Q, R123I, R179D, S76V, R79G, R35L, S76E, S76Y, R79D, R79P, R35Q, R179N, N112D, R179E, G65P, Y75G, V90S, R179M, R35F, R123F, A64I, N24Q, R14I, R179A, R127A, R179I, N170D, R35A, R159F, T109E, R14D, N67D, G49A, N112Q, G78D, T121E, L69S, T116E, V90I, T36S, T36G, N145E, T86D, S51D, R179K, T107E, T129S, L142V, R79A, R79E, A38H, T107S, R123A, N55E, R123L, R159N, G65D, R14N, G65Q, R123Q, N24V, R14G, T116Q, A38N, R159Q, R179Y, A83E, N112L, S99N, G78A, T10N, H85Q, R35Q, N24L, N24H, G49S, R79L, S76T, S76L, G65S, N55F, R79V, G65T, R123N, T86E, Y75F, F1T, S76N, S99V, R79T, N112V, R79M, T107V, R79S, G54E, G65V, R127Q, R159D, T107H, H85T, R35T, T36N, Q81E, R123H, S76I, A38F, V90T, and R14T.
21. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises at least one substitution selected from the group consisting of G65L, S99D, T107M, S113T, S99T, G77S, R14M, A64N, R61M, A70D, Q71G, A93D, S92G, N112Y, S15W, R159K, N67G, T10E, R127H, A64Y, R159C, A38L, T160E, T183E, R127S, A8E, S51Q, N7L, G63D, A38S, R35H, R14K, T107I, G12D, A64L, S76W, A41N, R35M, A64V, A38Y, T183I, W103M, A41D, R127K, T36D, R61T, G65Y, G13S, R35Y, R123T, A64H, G49H, A70H, A64F, R127Y, R61E, A64P, T121D, V115A, R123Y, T101S, T182V, H85L, N24M, R127E, N145D, Q71H, S76Q, A64T, G49F, A64Q, T10D, F1D, A70G, R35W, Q71D, N121I, A64M, T36H, A8G, T107N, R35S, N67T, S92A, N170L, N67E, S114A, R14A, RMS, Q81D, S51H, R123S, A93S, R127F, 119V, T40V, S185N, R123G, R179L, S51V, T163D, T109I, A64S, V72I, N67S, R159S, H85M, T109G, Q71S, R61H, T107A, Q81V, V90N, T109A, A38T, N145T, R159A, A110S, Q81H, A48E, S51T, A64W, R159L, N67H, A93E, T116F, R61S, R123V, V3L, and R159Y.
22. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises at least one substitution selected from the group consisting of T36I, P89D, A93T, A93S, T36N, N73T, T36G, R159F, T36S, A38R, S99W, S76W, T36P, G77T, G54D, R127A, R159E, H85Q, T36D, S76L, S99N, Y75G, S76Y, R127S, N24E, R127Q, D184F, N170Y, N24A, S76T, H85L, Y75F, S76V, L69S, R159K, R127K, G65P, N74G, R159H, G65Q, G186V, A48Q, T36H, N67L, R14I, R127L, T36Y, S76I, S114G, R127H, S187P, V3L, G78D, R123I, I181Q, R35F, H85R, R127Y, N67S, Q81P, R123F, R159N, S99A, S76D, A132V, R127F, A143N, S92A, N24T, R79P, S76N, R14M, G186E, N24Q, N67A, R127T, H85K, G65T, G65Y, R179V, Y75I, I11Q, A38L, T36L, R159Y, R159D, N24V, G65S, N157D, G186I, G54Q, N67Y, R127G, S76A, A38S, T109E, V66H, T116F, R123L, G49A, A64H, T36W, D184H, S99D, G161K, P134E, A64F, N67G, S99T, D2Q, S76E, R16Q, G54N, N67V, R35L, Q71I, N7L, N112E, L69H, N24H, G54I, R16L, N24M, A64Y, S113A, H85F, R79G, I11A, T121D, R61V, and G65L.
23. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises at least one substitution selected from the group consisting of N67Q, S187Q, Q71H, T163D, R61K, R159V, Q71F, V31F, V90I, R79D, T160E, R123Q, A38Y, S113G, A88F, A70G, I11T, G78A, N24L, S92G, R14L, D184R, G54L, N112L, H85Y, R16N, G77S, R179T, V80L, G65V, T121E, Q71D, R16G, P89N, N42H, G49F, I11S, R61M, R159C, G65R, T183I, A93D, L111E, S51Q, G78N, N67T, A38N, T40V, A64W, R159L, T10E, R179K, R123E, V90P, A64N, G161E, H85T, A8G, L142V, A41N, S185I, Q71L, A64T, R16I, A38D, G54M, N112Q, R16A, R14E, V80H, N170D, S99G, R179N, S15E, G49H, A70P, A64S, G54A, S185W, R61H, T10Q, A38F, N170L, T10L, N67F, G12D, D184T, R14N, S187E, R14P, N112D, S140A, N112G, G49S, L111D, N67M, V150L, G12Y, R123K, P89V, V66D, G77N, S51T, A8D, I181H, T86N, R179D, N55F, N24S, D184L, R61S, N67K, G186L, F1T, R159A, I11L, R61T, D184Q, A93E, Q71T, R179E, L69W, T163I, S188Q, L125V, A38V, R35A, P134G, A64V, N145D, V90T, and A143S.
24. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises at least one substitution selected from the group consisting of T36I, N170Y, A38R, R79P, G77T, L69S, N73T, S76V, S76Y, R179V, T36N, N55F, R159F, G54D, G65P, L69H, T36G, G177M, N24E, N74G, R159E, T36S, Y75G, S76I, S76D, A8R, A24A, V90P, R159C, G65Q, T121E, A8V, S76L, T109E, R179M, A8T, T107N, G186E, S76W, R123E, A38F, T36P, N67G, Y75F, S76N, R179I, S187P, N67V, V90S, R127A, R179Y, R35F, N145S, G65S, R61M, S51A, R179N, R123D, N24T, N55E, R79C, G186V, R123I, G161E, G65Y, A38S, R14L, V90I, R79G, N145E, N67L, R127S, R150Y, M180D, N67T, A93D, T121D, Q81V, T109I, A93E, T107S, R179T, R179L, R179K, R159D, R179A, R79E, R123F, R79D, T36D, A64N, L142V, T109A, I172V, A83N, T85A, R179D, A38L, I126L, R127Q, R127L, L69W, R127K, G65T, R127H, P134A, N67D, RUM, N24Q, A143N, N55S, N67M, S51D, S76E, T163D, A38D, R159K, T183I, G63V, A8S, T107M, H85Q, N112E, N67F, N67S, A64H, T86I, P134E, T182V, N67Y, A64S, G78D, V90T, R61T, R16Q, G65R, T86L, V90N, R159Q, G54I, S76C, R179E, V66D, L69V, R127Y, R35L, R14E, and T86F.
25. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises at least one substitution selected from the group consisting of G186I, A64Q, T109G, G64L, N24L, A8E, N112D, A38H, R179W, S114G, R123L, A8L, T129S, N170D, R159N, N67C, S92C, T107A, G54E, T107E, T36V, R127T, A8N, H85L, A110S, N170C, A64R, A132V, T36Y, G63D, W103M, T151V, R123P, W103Y, S76T, S187T, R127F, N67A, P171M, A70S, R159H, S76Q, L125V, G54Q, G49L, R14I, R14Q, A83I, V90L, T183E, R159A, T101S, G65D, G54A, T107Q, Q71M, T86E, N24M, N55Q, R61V, P134D, R96K, A88F, N145Q, A64M, A64T, N24V, S140A, A8H, A64I, R123Q, T183Q, N24H, A64W, T62I, T129G, R35A, T40V, I11T, A38N, N145G, A175T, G77Q, T109H, A8P, R35E, T109N, A110T, N67Q, G63P, H85R, S92G, A175V, S51Q, G63Q, T116F, G65A, R79L, N145P, L69Q, Q146D, A83D, F166Y, R123A, T121L, R123H, A70P, T182W, S76A, A64F, T107H, G186L, Q81I, R123K, A64L, N67R, V3L, S187E, S161K, T86M, I4M, G77N, G49A, A41N, G54M, T107V, Q81E, A38I, T109L, T183K, A70G, Q71D, T183L, Q81H, A64V, A93Q, S188E, S51F, G186P, G186T, R159L, P134G, N145T, N55V, V66E, R159V, Y176L, and R16L.
26. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises at least one substitution selected from the group consisting of T36I, N73T, P134R, G77T, N24E, P134E, P134L, N24T, 159F, L69S, T10G, G186S, S140A, T36S, N112S, N24Q, T36G, P134H, G34A, N24A, A38T, E119R, G186E, R14M, S76W, T10A, A38F, L142V, N170Y, P134V, A22V, S76V, T182V, S76Y, I11A, I11S, S118A, G186V, L69H, I11T, T36N, G65V, G49F, V90I, R179V, R16K, T163I, R127F, R159K, N24L, Q71I, S15G, S15F, R14G, S99N, T10L, S15E, T107R, F166Y, G49A, V90P, P134D, Q167N, S76D, S51A, V80A, V150L, N74G, T107K, S76L, N24V, G12I, S99V, and R16N.

27. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises Arg14, Ser15, Arg16, Cys17, His32, Cys33, Phe52, Asp56, Thr100, Val115, Thr116, Tyr117, Pro118, Glu119, Ala132, Glu133, Pro134, Gly135, Asp136, Ser137, Thr151, Ser152, Gly153, Gly154, Ser155, Gly156, Asn157, Thr164, and Phe165.
28. The protease as claimed in Claim 27, wherein the catalytic triad of said protease comprises His32, Asp56, and Ser137.

29. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises Cys131, Ala132, Glu133, Pro134, Gly135, Thr151, Ser152, Gly153, Gly154, Ser155, Gly156, Asn157, Gly162, Thr163, and Thr164.
30. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease comprises Phe52, Tyr117, Pro118 and Glu119.
31. The protease as claimed in Claim 7, wherein the amino acid sequence of said protease has main-chain to main-chain hydrogen bonding from Gly154 to the substrate main-chain.
32. The protease as claimed in Claim 7, wherein said protease comprises three disulfide bonds.
33. The protease as claimed in Claim 7, wherein said variant has an altered substrate specificity as compared to wild-type Cellulomonas 69B4 protease.
34. The protease as claimed in Claim 7, wherein said variant has an altered pI as compared to wild-type Cellulomonas 69B4 protease.
35. The protease as claimed in Claim 7, wherein said variant has improved stability as compared to wild-type Cellulomonas 69B4 protease.
36. The protease as claimed in Claim 7, wherein said variant exhibits an altered surface property.

37. The protease as claimed in Claim 7, wherein said variant comprises at least one substitution at sites selected from the group consisting of 1, 2, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 22, 24, 25, 32, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 57, 59, 61, 62, 63, 64, 65, 66, 67, 68, 69, 71, 73, 74, 75, 76, 77, 78, 79, 80, 81, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 95, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 123, 124, 126, 127, 128, 130, 131, 132, 133, 134, 135, 137, 143, 144, 145, 146, 147, 148, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 170, 171, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, and 184.
38. The protease as claimed in Claim 1, wherein said protease is a variant protease having at least one improved property as compared to wild-type protease selected from the group consisting of acid stability, thermostability, casein hydrolysis, keratin hydrolysis, cleaning performance, and LAS stability.
39. The protease as claimed in Claim 4, wherein said protease is a variant protease having at least one improved property as compared to wild-type protease selected from the group consisting of acid stability, thermostability, casein hydrolysis, keratin hydrolysis, cleaning performance, and LAS stability.
40. An expression vector comprising a polynucleotide sequence encoding the protease variant as claimed in Claim 7.
41. A host cell comprising said expression vector as claimed in Claim 40.
42. The host cell as claimed in Claim 41, wherein said host is selected from the group consisting of Bacillus sp., Streptomyces sp., Aspergillus sp., and Trichoderma sp.
43. A serine protease produced by said host cell as claimed in Claim 42.
44. A variant protease obtained from a Cellulomonas species, comprising an amino acid sequence selected from the group consisting of SEQ ID NOs:54, 56, 58, 60, 62, 64 and 66, wherein said variant protease has at least 70% sequence identity to SEQ ID NO:8.

45. The variant protease as claimed in Claim 44, wherein said amino acid sequence is encoded by a polynucleotide sequence selected from the group consisting of SEQ ID NOS:53, 55, 57, 59, 61, 63 and 65.
46. An expression vector comprising a polynucleotide sequence encoding the protease variant as claimed in Claim 45.
47. A host cell comprising said expression vector as claimed in Claim 46.
48. The host cell as claimed in Claim 47, wherein said host is selected from the group consisting of Bacillus sp., Streptomyces sp., Aspergillus sp., and Trichoderma sp.
49. A serine protease produced by said host cell as claimed in Claim 48.
50. A composition comprising at least a portion of the isolated serine protease as claimed in Claim 1, wherein said protease is encoded by a polynucleotide sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, and SEQ ID NO:4.
51. A polynucleotide sequence comprising at least a portion of SEQ ID NO:1.
52. An expression vector comprising the polynucleotide sequence as claimed in Claim 51 or a polynucleotide sequence encoding the protease variant as claimed in claim 4 or claim 44.
53. A host cell comprising said expression vector as claimed in Claim 52.
54. The host cell as claimed in Claim 53, wherein said host is selected from the group consisting of Bacillus sp., Streptomyces sp., Aspergillus sp., and Trichoderma sp.
55. A serine protease produced by said host cell as claimed in Claim 54.
56. A variant serine protease, wherein said protease comprises at least one substitution corresponding to the amino acid positions in SEQ ID NO:8, and wherein said variant protease has at least 70% sequence identity to SEQ ID NO:8, and wherein said variant has better performance in at least one property selected from the group consisting of keratin hydrolysis, thermostability, casein activity, LAS stability, and cleaning, as compared to wild-type Cellulomonas 69B4 protease.
57. An isolated polynucleotide from a Cellulomonas species comprising a nucleotide sequence (i) having at least 70% identity to SEQ ID NO:4, or (ii) being capable of hybridizing to a probe derived from the nucleotide sequence set forth in SEQ ID NO:4, under conditions of intermediate to high stringency, or (iii) being complementary to the nucleotide sequence set forth in SEQ ID NO:4.
58. A vector comprising the polynucleotide as claimed in Claim 57.
59. A host cell transformed with the vector as claimed in Claim 58.
60. A polynucleotide comprising a sequence complementary to at least a portion of the sequence set forth in SEQ ID NO:4.
61. A method of producing an enzyme having protease activity, comprising:

(a) transforming a host cell with an expression vector comprising a polynucleotide having at least 70% sequence identity to SEQ ID NO:4;
(b) cultivating said transformed host cell under conditions suitable for said host cell to produce said protease; and
(c) recovering said protease.

62. The method as claimed in Claim 61, wherein said host cell is a Streptomyces, Aspergillus, Trichoderma or Bacillus species.
63. A probe comprising a 4 to 150 polynucleotide sequence substantially identical to a corresponding fragment of SEQ ID NO:4, wherein said probe is obtained from a Cellulomonas species and wherein said probe is used to detect a nucleic acid sequence coding for an enzyme having proteolytic activity.
64. The probe as claimed in Claim 63, wherein said Cellulomonas is Cellulomonas strain 69B4.

65. A cleaning composition comprising at least one serine protease obtained from a Cellulomonas species wherein said protease has at least 70% sequence identity to SEQ ID NO:8.
66. The cleaning composition as claimed in Claim 65, wherein said protease is obtained from Cellulomonas 69B4.
67. The cleaning composition as claimed in Claim 66, wherein said protease comprises the amino acid sequence set forth in SEQ ID NO:8.
68. The cleaning composition as claimed in Claim 67, wherein said serine protease has at least 60% amino acid identity with the amino acid sequence set forth in SEQ ID NO:8.
69. A cleaning composition comprising a serine protease obtained from a Cellulomonas species, wherein said serine protease has at least 70% sequence identity to SEQ ID NO:8, and wherein said serine protease has immunological cross-reactivity with the serine protease as claimed in Claim 1 or claim 2.
70. The cleaning composition as claimed in Claim 69, wherein said protease is a variant protease having an amino acid sequence comprising at least one substitution of an amino acid made at a position equivalent to a position in a Cellulomonas 69B4 protease having an amino acid sequence set forth in SEQ ID NO:8.
71. The cleaning composition as claimed in Claim 70, wherein said substitution is made at a position equivalent to position 2, 8, 10, 11, 12, 13, 14, 15, 16, 24, 26, 31, 33, 35, 36, 38, 39, 40, 43, 46, 49, 51, 54, 61, 64, 65, 67, 70, 71, 76, 78, 79, 81, 83, 85, 86, 90, 93, 99, 100, 105, 107, 109, 112, 113, 116, 118, 119, 121, 123, 127, 145, 155, 159, 160, 163, 165, 170, 174, 179, 183, 184, 185, 186, 187, and 188 in a Cellulomonas 69B4 protease comprising the amino acid sequence set forth in SEQ ID NO:8.
72. The cleaning composition as claimed in Claim 70, wherein said substitutions are made at positions equivalent to positions 1, 4, 22, 27, 28, 30, 32, 41, 47, 48, 55, 59, 63, 66, 69, 75, 77, 80, 84, 87, 88, 89, 92, 96, 110, 111, 114, 115, 117, 128, 134, 144, 143, 146, 151, 154, 156, 158, 161, 166, 176, 177, 181, 182, 187, and 189, in a Cellulomonas 69B4 protease comprising the amino acid sequence set forth in SEQ ID NO:8.
73. The cleaning composition as claimed in Claim 70, wherein said protease comprises at least one amino acid substitution at positions 14, 16, 35, 36, 65, 75, 76, 79, 123, 127, 159, and 179, in an equivalent amino acid sequence to that set forth in SEQ ID NO:8.
74. The cleaning composition as claimed in Claim 73, wherein said protease comprises at least one mutation selected from the group consisting of R14L, R16I, R16L, R16Q, R35F, T36S, G65Q, Y75G, N76L, N76V, R79T, R123L, R123Q, R127A, R127K, R127Q, R159K, R159Q, and R179Q.
75. The cleaning composition as claimed in Claim 74, wherein said protease comprises a set of mutations selected from the group consisting of the sets R16Q/R35F/R159Q, R16Q/R123L, R14L/R127Q/R159Q, R14L/R179Q, R123L/R127Q/R179Q, R16Q/R79T/R127Q, and R16Q/R79T.
76. The cleaning composition as claimed in Claim 75, wherein said protease comprises the following mutations R123L, R127Q, and R179Q.
77. The cleaning composition as claimed in Claim 73, wherein said variant serine protease comprises at least one substitution corresponding to the amino acid positions in SEQ ID NO:8, and wherein said variant protease has better performance in at least one property selected from the group consisting of keratin hydrolysis, thermostability, casein activity, LAS stability, and cleaning, as compared to wild-type Cellulomonas 69B4 protease.
78. The cleaning composition as claimed in Claim 65, wherein said variant protease comprises an amino acid sequence selected from the group consisting of SEQ ID NOS:54, 56, 58, 60, 62, 64 and 66.
79. The cleaning composition as claimed in Claim 65, wherein said variant protease amino acid sequence is encoded by a polynucleotide sequence selected from the group consisting of SEQ ID NOS:53, 55, 57, 59, 61, 63 and 65.

80. A cleaning composition comprising a cleaning effective amount of a proteolytic enzyme obtained from a Cellulomonas species, said enzyme comprising a polynucleotide sequence having at least 70% sequence identity to SEQ ID NO:4, and a suitable cleaning formulation.
81. The cleaning composition as claimed in Claim 80, further comprising one or more additional enzymes or enzyme derivatives selected from the group consisting of proteases, amylases, lipases, mannanases, pectinases, cutinases, oxidoreductases, hemicellulases, and cellulases.
82. A composition comprising the serine protease as claimed in Claim 1 and at least one stabilizing agent.
83. The composition as claimed in Claim 82, wherein said stabilizing agent is selected from the group consisting of borax, glycerol, and competitive inhibitors.
84. The composition as claimed in Claim 83, wherein said competitive inhibitors stabilize said serine protease to anionic surfactants.
85. The composition as claimed in Claim 1, wherein said serine protease is an autolytically stable variant.

86. A cleaning composition comprising at least 0.0001 weight percent of the serine protease as claimed in Claim 1, and optionally, an adjunct ingredient.
87. A cleaning composition as claimed in Claim 86, said composition comprising a sufficient amount of a pH modifier to provide said composition with a neat pH of from about 3 to about 5, said composition being essentially free of materials that hydrolyze at a pH of from about 3 to about 5.
88. A cleaning composition as claimed in Claim 87, wherein said materials that hydrolyze comprise a surfactant material.
89. A cleaning composition as claimed in Claim 87, said cleaning composition being a liquid composition.

90. A cleaning composition as claimed in Claim 88, wherein said surfactant material comprises a sodium alkyl sulfate surfactant that comprises an ethylene oxide moiety.
91. A composition as claimed in Claim 87, said composition comprising from about 0.001 to about 0.5 percent weight of said serine protease.
92. A composition as claimed in Claim 91, said composition comprising from about 0.01 to about 0.1 percent weight of said serine protease.
93. A method of cleaning, said method comprising the steps of:
a) contacting a surface and/or an article comprising a fabric with the cleaning composition as claimed in Claim 94 and/or a composition comprising the cleaning composition as claimed in Claim 98; and
b) optionally washing and/or rinsing said surface or material.
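
The substitution codes recited in the claims above (e.g., R123L, or sets such as R123L/R127Q/R179Q) name the wild-type residue, its 1-based position in the reference sequence, and the replacement residue. The following minimal sketch shows one way such codes could be parsed, applied to a parent sequence, and related to a percent-identity threshold such as the 70% recited in the claims. It rests on stated assumptions: the parent sequence is a made-up ten-residue placeholder and not SEQ ID NO:8, all function names are hypothetical, and identity is computed position by position over equal-length sequences, which is a simplification of alignment-based identity and applies only to substitution-only variants.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code):
    """Split a code such as 'R123L' into ('R', 123, 'L')."""
    match = _SUBSTITUTION.match(code.strip())
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(parent, codes):
    """Apply a slash-separated set of codes (e.g. 'R2L/T4S') to a parent sequence."""
    residues = list(parent)
    for code in codes.split("/"):
        wild_type, position, replacement = parse_substitution(code)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{code}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{code}: parent has {residues[position - 1]} at {position}")
        residues[position - 1] = replacement
    return "".join(residues)


def percent_identity(seq_a, seq_b):
    """Position-by-position identity of two equal-length, ungapped sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


PARENT = "FRVTGAAQCS"  # hypothetical placeholder, NOT SEQ ID NO:8

variant = apply_substitutions(PARENT, "R2L/T4S")
identity = percent_identity(PARENT, variant)
print(variant, f"{identity:.1f}% identical to parent")
print("meets a 70% threshold:", identity >= 70.0)
```

Checking the wild-type residue before substituting is a deliberate design choice: it catches position numbering that does not match the reference sequence, which is the most likely way for a code to be misapplied.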


Patent Number 257598
Indian Patent Application Number 2866/DELNP/2006
PG Journal Number 43/2013
Publication Date 25-Oct-2013
Grant Date 18-Oct-2013
Date of Filing 19-May-2006
Name of Patentee GENENCOR INTERNATIONAL INC.
Applicant Address 925, PAGE MILL ROAD, PALO ALTO, CA 94304, USA
Inventors:
# Inventor's Name Inventor's Address
1 JONES BRIAN EDWARD GRAVIN JULIANA VAN STOLBERGLAAN 24, NL-2263 VA LEIDSCHENDAM, THE NETHERLANDS
2 KOLKMAN MARC LINNAEUSHOF 25, NL-2341 PA OEGSTGEEST, THE NETHERLANDS
3 LEEFLANG CHRIS GRANAATHORST 345, NL-2592 SZ THE HAGUE, THE NETHERLANDS
4 OH HIROSHI 8541 MEADOW BLUFF COURT, CINCINNATI, OH 45249, USA
5 POULOSE AYROOKARAN J 2848 WAKEFIELD DRIVE, BELMONT, CA 94002, USA
6 SADLOWSKI EUGENE S 9980 PEBBLEKNOLL DRIVE, CINCINNATI, OH 45252, USA
7 SHAW ANDREW 2560 HYDE STREET, SAN FRANCISCO, CA 94109, USA
8 VAN DER KLEIJ WILHELMUS A.H LAAN VAN WATERINGEVELD 1418, NL-2671 DE THE NETHERLANDS
9 VAN MARREWIJK LEO LAAN VAN WATERINGEVELD 1418, NL-2671 DG THE HAGUE, THE NETHERLANDS
PCT International Classification Number C12N9/50; C11D3/386; C12N9/52
PCT International Application Number PCT/US2004/039066
PCT International Filing date 2004-11-19
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 60/523,609 2003-11-19 U.S.A.