Title:
Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics
Document Type and Number:
United States Patent 7404958

Abstract:
The invention provides isolated polypeptide and nucleic acid sequences derived from Streptococcus pneumoniae that are useful in diagnosis and therapy of pathological conditions; antibodies against the polypeptides; and methods for the production of the polypeptides. The invention also provides methods for the detection, prevention and treatment of pathological conditions resulting from bacterial infection.

Inventors:
Doucette-Stamm, Lynn (Framingham, MA, US)
Bush, David (Somerville, MA, US)
Zeng, Qiandong (Waltham, MA, US)
Opperman, Timothy (Somerville, MA, US)
Houseweart, Chad Eric (Waltham, MA, US)
Application Number:
11/524942
Publication Date:
07/29/2008
Filing Date:
09/21/2006
Assignee:
Sanofi Pasteur Limited (Toronto, Ontario, CA)
Primary Class:
International Classes:
A61K39/00
Field of Search:
424/185.1
US Patent References:
5302527; Nitrate reductase as marker for filamentous fungi; April, 1994; Birkett et al.
5994066; Species-specific and universal DNA probes and amplification primers to rapidly detect and identify common bacterial pathogens and associated antibiotic resistance genes from clinical specimens for routine diagnosis in microbiology laboratories; November, 1999; Bergeron et al.
6420135; Streptococcus pneumoniae polynucleotides and sequences; July, 2002; Kunsch et al.
6573082; Streptococcus pneumoniae antigens and vaccines; June, 2003; Choi et al.
6582706; Vaccine compositions comprising Streptococcus pneumoniae polypeptides having selected structural MOTIFS; June, 2003; Johnson et al.
6689369; Immunogenic pneumococcal protein and vaccine compositions thereof; February, 2004; Koenig et al.
6699703; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; March, 2004; Doucette-Stamm et al.
6800744; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; October, 2004; Doucette-Stamm et al.
6887663; Streptococcus pneumoniae SP036 polynucleotides; May, 2005; Choi et al.
6929930; Streptococcus pneumoniae SP042 polynucleotides; August, 2005; Choi et al.
6936252; Streptococcus pneumoniae proteins and nucleic acid molecules; August, 2005; Gilbert et al.
7056510; Streptococcus pneumoniae SP036 polynucleotides, polypeptides, antigens and vaccines; June, 2006; Choi et al.
7074914; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; July, 2006; Doucette-Stamm et al.
7081530; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; July, 2006; Doucette-Stamm et al.
7098023; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; August, 2006; Doucette-Stamm et al.
7115731; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; October, 2006; Doucette-Stamm et al.
7122194; Vaccine compositions comprising Streptococcus pneumoniae polypeptides having selected structural motifs; October, 2006; Johnson et al.
7122368; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; October, 2006; Doucette-Stamm et al.
7129339; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; October, 2006; Doucette-Stamm et al.
7129340; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; October, 2006; Doucette-Stamm et al.
7135560; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; November, 2006; Doucette-Stamm et al.
7141418; Streptococcus pneumoniae polynucleotides and sequences; November, 2006; Kunsch et al.
7151171; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; December, 2006; Doucette-Stamm et al.
7153952; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; December, 2006; Doucette-Stamm et al.
20030022181; Streptococcus pneumoniae antigens; January, 2003; Cripps et al.
20030134407; Nucleic acids and proteins from Streptococcus pneumoniae; July, 2003; Le Page et al.
20040005331; Vaccine compositions comprising Streptococcus pneumoniae polypeptides having selected structural motifs; January, 2004; Johnson et al.
20040052781; Vaccine compositions comprising Streptococcus pneumoniae polypeptides having selected structural motifs; March, 2004; Johnson et al.
20040219165; Streptococcus pneumoniae antigens; November, 2004; Cripps et al.
20040265933; Proteins; December, 2004; Le Page et al.
20050181439; Streptococcus pneumoniae antigens and vaccines; August, 2005; Choi et al.
20050276814; Streptococcus pneumoniae proteins and nucleic acid molecules; December, 2005; Gilbert et al.
20070009901; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2007; Doucette-Stamm et al.
20070009902; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2007; Doucette-Stamm et al.
20070009903; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2007; Doucette-Stamm et al.
20070009904; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2007; Doucette-Stamm et al.
20070009905; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2007; Doucette-Stamm et al.
20070009906; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2007; Doucette-Stamm et al.
20070015255; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2007; Doucette-Stamm et al.
20070015256; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2007; Doucette-Stamm et al.
20070021368; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2007; Doucette-Stamm et al.
20070021369; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2007; Doucette-Stamm et al.
20070021370; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2007; Doucette-Stamm et al.
20070021371; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2007; Doucette-Stamm et al.
20070021372; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2007; Doucette-Stamm et al.
20070021374; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2007; Doucette-Stamm et al.
20070021601; Nucleic acid amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2007; Doucette-Stamm et al.
20070031852; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; February, 2007; Doucette-Stamm et al.
20070037766; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; February, 2007; Doucette-Stamm et al.
20070059801; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; March, 2007; Doucette-Stamm et al.
20070059802; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; March, 2007; Doucette-Stamm et al.
20070065458; Vaccine compositions comprising Streptococcus pneumoniae polypeptides having selected structural motifs; March, 2007; Johnson et al.
20070082005; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; April, 2007; Doucette-Stamm et al.
20070083038; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; April, 2007; Doucette-Stamm et al.
20070088150; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; April, 2007; Doucette-Stamm et al.
20070092946; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; April, 2007; Doucette-Stamm et al.
20070093647; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; April, 2007; Doucette-Stamm et al.
20070093648; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; April, 2007; Doucette-Stamm et al.
20070099861; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; May, 2007; Doucette-Stamm et al.
20070117965; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; May, 2007; Doucette-Stamm et al.
20070154986; Streptococcus pneumoniae Polynucleotides and Sequences; July, 2007; Kunsch et al.
20070207976; Nucleic acids and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; September, 2007; Doucette-Stamm et al.
20070243207; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; October, 2007; Doucette-Stamm et al.
20070243585; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; October, 2007; Doucette-Stamm et al.
20070287172; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; December, 2007; Doucette-Stamm et al.
20080009035; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2008; Doucette-Stamm et al.
20080020442; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; January, 2008; Doucette-Stamm et al.
20080032339; Nucleic acid and amino acid sequences relating to Streptococcus pneumoniae for diagnostics and therapeutics; February, 2008; Doucette-Stamm et al.
Foreign References:
AU200131407; June, 2001
AU2004210523; October, 2004
EP0942983; September, 1999; STREPTOCOCCUS PNEUMONIAE ANTIGENS AND VACCINES
EP1400592; March, 2004; Streptococcus pneumoniae polynucleotides and sequences
EP1770164; April, 2007; Streptococcus pneumoniae antigens and vaccines
WO/1995/014712; June, 1995; STREPTOCOCCUS PNEUMONIAE HEMIN/HEMOGLOBIN-BINDING ANTIGENS
WO/1995/031548; November, 1995; STREPTOCOCCUS PNEUMONIAE CAPSULAR POLYSACCHARIDE GENES AND FLANKING REGIONS
WO/1996/008582; March, 1996; SPECIFIC AND UNIVERSAL PROBES AND AMPLIFICATION PRIMERS TO RAPIDLY DETECT AND IDENTIFY COMMON BACTERIAL PATHOGENS AND ANTIBIOTIC RESISTANCE GENES FROM CLINICAL SPECIMENS FOR ROUTINE DIAGNOSIS IN MICROBIOLOGY LABORATORIES
WO/1998/006734; February, 1998; NOVEL PROKARYOTIC POLYNUCLEOTIDES, POLYPEPTIDES AND THEIR USES
WO/1998/018930; May, 1998; STREPTOCOCCUS PNEUMONIAE ANTIGENS AND VACCINES
WO/1998/018931; May, 1998; STREPTOCOCCUS PNEUMONIAE POLYNUCLEOTIDES AND SEQUENCES
WO/1998/026072; June, 1998; STREPTOCOCCUS PNEUMONIAE DNA SEQUENCES
WO/1999/033871; July, 1999; ESSENTIAL BACTERIAL GENES AND THEIR USE
WO/2000/006737; February, 2000; STREPTOCOCCUS PNEUMONIAE PROTEINS AND NUCLEIC ACID MOLECULES
WO/2000/006738; February, 2000; NUCLEIC ACIDS AND PROTEINS FROM STREPTOCOCCUS PNEUMONIAE
WO/2000/014200; March, 2000; ESSENTIAL BACTERIAL GENES AND THEIR USE
WO/2000/037105; June, 2000; STREPTOCOCCUS PNEUMONIAE PROTEINS AND IMMUNOGENIC FRAGMENTS FOR VACCINES
WO/2000/058475; October, 2000; STREPTOCOCCUS PNEUMONIAE ANTIGENS
WO/2000/076540; December, 2000; STREPTOCOCCUS PNEUMONIAE PROTEINS AND VACCINES
WO/2002/022168; March, 2002; VACCINE AGAINST STREPTOCOCCUS PNEUMONIAE
WO/2002/079241; October, 2002; SECRETED STREPTOCOCCUS PNEUMONIAE PROTEINS
Other References:
US 6,159,469, 12/2000, Choi et al. (withdrawn)
U.S. Appl. No. 12/001,413, by Lynn Doucette-Stamm, et al., filed Dec. 11, 2007.
U.S. Appl. No. 12/001,752, by Lynn Doucette-Stamm, et al., filed Dec. 12, 2007.
U.S. Appl. No. 12/001,437, by Lynn Doucette-Stamm, et al., filed Dec. 11, 2007.
U.S. Appl. No. 12/001,619, by Lynn Doucette-Stamm, et al., filed Dec. 12, 2007.
U.S. Appl. No. 12/001,753, by Lynn Doucette-Stamm, et al., filed Dec. 12, 2007.
U.S. Appl. No. 12/001,605, by Lynn Doucette-Stamm, et al., filed Dec. 12, 2007.
U.S. Appl. No. 11/785,503, by Christophe Francois Guy Gilbert, et al., filed Apr. 20, 2007.
U.S. Appl. No. 11/785,507, by Christophe Francois Guy Gilbert, et al., filed Apr. 20, 2007.
U.S. Appl. No. 11/785,513, by Christophe Francois Guy Gilbert, et al., filed Apr. 20, 2007.
U.S. Appl. No. 11/785,517, by Christophe Francois Guy Gilbert, et al., filed Apr. 18, 2007.
Wizemann, et al., “Use of a Whole Genome Approach to Identify Vaccine Molecules Affording Protection against Streptococcus pneumoniae Infection,” Infection and Immunity, 69(3):1593-1598 (Mar. 2001).
U.S. Appl. No. 11/524,834, by Lynn Doucette-Stamm, et al., filed Sep. 21, 2006.
U.S. Appl. No. 11/524,363, by Lynn Doucette-Stamm, et al., filed Sep. 20, 2006.
U.S. Appl. No. 11/525,165, by Lynn Doucette-Stamm, et al., filed Sep. 21, 2006.
U.S. Appl. No. 11/524,794, by Lynn Doucette-Stamm, et al., filed Sep. 21, 2006.
U.S. Appl. No. 11/523,424, by Lynn Doucette-Stamm, et al., filed Sep. 19, 2006.
U.S. Appl. No. 11/523,686, by Lynn Doucette-Stamm, et al., filed Sep. 19, 2006.
U.S. Appl. No. 11/796,426, by Lynn Doucette-Stamm, et al., filed Apr. 27, 2007.
U.S. Appl. No. 11/796,386, by Lynn Doucette-Stamm, et al., filed Apr. 27, 2007.
U.S. Appl. No. 11/796,731, by Lynn Doucette-Stamm, et al., filed Apr. 27, 2007.
U.S. Appl. No. 11/796,730, by Lynn Doucette-Stamm, et al., filed Apr. 27, 2007.
U.S. Appl. No. 11/799,735, by Lynn Doucette-Stamm, et al., filed Apr. 27, 2007.
U.S. Appl. No. 11/801,737, by Lynn Doucette-Stamm, et al., filed May 10, 2007.
U.S. Appl. No. 11/801,963, by Lynn Doucette-Stamm, et al., filed May 11, 2007.
U.S. Appl. No. 11/801,901, by Lynn Doucette-Stamm, et al., filed May 10, 2007.
U.S. Appl. No. 11/803,173, by Lynn Doucette-Stamm, et al., filed May 11, 2007.
U.S. Appl. No. 11/803,180, by Lynn Doucette-Stamm, et al., filed May 11, 2007.
Gerhold, D., and Caskey, C.T., “It's the genes! EST access to human genome content,” BioEssays 18(12): 973-981 (Dec. 1996).
Revised Interim Utility Guidelines Training Materials, United States Patent and Trademark Office (2001).
Plotkin, S.A. and Mortimer, Jr., E.A., “New Technologies for Making Vaccines,” Vaccines, W.B. Saunders Co., p. 571 (1988).
Camara et al., “A Neuraminidase from Streptococcus pneumoniae Has the Features of a Surface Protein,” Infection and Immunity, 62(9), pp. 3688-3695 (1994).
Buck, M.A, et al., “Single Protein Omission Reconstitution Studies of Tetracycline Binding to the 30S Subunit of Escherichia coli Ribosomes,” Abstract only, Biochemistry, Jun. 1990, 5(22):5374-5379, American Chemical Society Publications, Columbus, Ohio, USA.
Burgess, W.H., et al., “Possible Dissociation of the Heparin-binding and Mitogenic Activities of Heparin-binding (Acidic Fibroblast) Growth Factor-1 from Its Receptor-binding Activities by Site-directed Mutagenesis of a Single Lysine Residue,” Journal of Cell Biology 111:2129-2138 (1990).
Crickmore, N., et al., “The Escherichia coli Heat Shock Regulatory Gene is Immediately Downstream of a Cell Division Operon: The Fam Mutation is Allelic with rpoH,” Abstract only, Mol Gen Genet, Dec. 1986, 205(3):535-539, Springer-Verlag, Berlin, Germany.
Fleck, R.A., et al., “Use of HL-60 Cell Line To Measure Opsonic Capacity of Pneumococcal Antibodies,” Clinical and Diagnostic Laboratory Immunology 12(1): 19-27 (2005).
Gill, D.R., et al., “The Identification of the Escherichia coli ftsY Gene Product: An Unusual Protein,” Abstract only, Mol Microbiol, Apr. 1990, 4(4):575-583, Blackwell Science, Ltd., Boston, MA, USA.
Gosink, K.K., et al., “Role of Novel Choline Binding Proteins in Virulence of Streptococcus pneumoniae,” Infection and Immunity 68(10):5690-5695 (Oct. 2000).
Haasum, Y., et al., “Amino Acid Repetitions in the Dihydropteroate Synthase of Streptococcus pneumonae Lead to Sulfonamide Resistance with Limited Effects on Substrate Km,” Antimicrobial Agents and Chemotherapy 45(3): 805-809 (2001).
Hoffman, J.A., et al., “Streptococcus pneumoniae Infections in the Neonate,” Pediatrics 112(5): 1095-1102 (2003).
Jobling, M.G. and Holmes, R.K., “Analysis of structure and function of the B subunit of cholera toxin by the use of site-directed mutagenesis,” Molecular Microbiology 5(7): 1755-1767 (1991).
Klugman, K.P., and Lonks, J.R., “Hidden Epidemic of Macrolide-resistant Pneumococci,” Emerging Infectious Diseases 11(6): 802-807 (2005).
Lazar, E., et al., “Transforming Growth Factor α: Mutation of Aspartic Acid 47 and Leucine 48 Results in Different Biological Activities,” Molecular and Cellular Biology 8(3): 1247-1252 (1988).
López, R., “Streptcoccus pneumoniae and its bacteriophages: one long argument,” International Microbiology 7: 163-171 (2004).
Menzies, B.E., and Kernodle, D.S., “Site-Directed Mutagenesis of the Alpha-Toxin Gene of Staphylococcus aureus: Role of Histidines in Toxin Activity In Vitro and in a Murine Model,” Infection and Immunity 62(5): 1843-1847 (1994).
Moelling, K., “DNA for Genetic Vaccination and Therapy,” Abstract only, Cytokines Cell Mol Ther., Jun. 1997, 3(2):127-135, Elsevier Science Ltd., New York, New York, USA.
Nishi, K., et al., “DNA Sequence and Complementation Analysis of a Mutation in the rp1X Gene from Escherichia coli Leading to Loss of Ribosomal Protein L24”, Abstract only, J. Bacteriol, Sep. 1985, 163(3):890-894, American Society for Microbiology, Washington, DC, USA.
Parikh, S., et al., “Roles of Tyrosine 158 and Lysine 165 in the Catalytic Mechanism in InhA, the Enoyl-ACP Reductase from Mycobacterium tuberculosis,” Biochemistry 38: 13623-13634 (1999).
Rost, R., “Twilight Zone of Protein Sequence Alignments,” Protein Entineering, 1999, 12(2):85-94, Oxford University Press, Cary, North Carolina, USA and Oxford, United Kingdom.
Rudinger, J., “Characteristics of the amino acids as components of a peptide hormone sequence.” In Peptide Hormones, J.A. Parsons, ed. (University Park Press) pp. 1-7 (1976).
Russell, R.B., and Barton, G.J., “Structural Features can be Unconserved in Proteins with Similar Folds: An Analysis of Side-chain to Side-chain Contacts Secondary Structure and Accessibility,” J. Mol. Biol 244: 332-350 (1994).
Smith, D.R., “Microbial Pathogen Genomes—New Strategies for Identifying Therapeutics and Vaccine Targets,” Tibtech, Aug. 1996, 14:290-293, Elsevier Science Ltd., New York, New York, USA.
Stephens, C., et al., “Bacterial Protein Secretion—A Target for New Antibiotics?,” Abstract only, Chem Biol, Sep. 1997, 4(9):637-641, Elsevier Science Ltd., New York, New York, USA.
Wells, T.N.C., and Peitsch, M., “The chemokine information source: identification and characterization of novel chemokines using the WorldWideWeb and Expressed Sequence Tag Databases,” Journal of Leukocyte Biology 61: 545-551 (1997).
Willison, J.C, et al., “The Escherichia coli efg Gene and the Rhodobacter capsulatus adgA Gene Code for NH3-Dependent NAD Synthetase,” Abstract only, J. Bacteriol, Jun. 1994, 176(11):3400-3402, American Society for Microbiology, Washington, DC, USA.
Wower, I.K., et al., “Ribosomal Protein L27 Participates in both 50 S Subunit Assembly and the Peptidyl Transferase Reaction,” J Biol Chem, Jul. 1998, 273(31):19847-19852, American Society for Biochemistry and Molecular Biology, Bethesda, MD, USA.
AAT28529, Genbank, Apr. 1, 1997.
AAV52227, Genbank, Oct. 23, 1998.
Database sequence, Genbank acc. No. AAV52268, Oct. 23, 1998.
Database sequence, Genbank acc. No. AAV52490, Oct. 23, 1998.
Database sequence, Genbank acc. No. W65693, Jun. 11, 1996.
Database sequence, Genbank acc. No. AA025574, Aug. 14, 1996.
Database sequence, Genbank acc. No. X67663, Jul. 18, 1996.
Database sequence, Genbank acc. No. AAV42980, Nov. 8, 1998.
Database sequence, Genbank acc. No. AAZ96269, Apr. 10, 2000.
Database sequence, Genbank acc. No. AAX30819, May 20, 1999.
Database sequence, Genbank acc. No. AAZ96466, Apr. 10, 2000.
Database sequence, Genbank acc. No. AAT98768, Nov. 10, 1998.
Database sequence, Genbank acc. No. AAZ96379, Apr. 10, 2000.
Database sequence, Genbank acc. No. AAV52231, Oct. 23, 1998.
Database sequence, Genbank acc. No. AAT98563, Nov. 6, 1998.
Database sequence, Genbank acc. No. AAT98628, Nov. 6, 1998.
L26052, Genbank, Aug. 3, 1994.
M15328, Genbank, Oct. 23, 1995.
M57624, Genbank, Apr. 26, 1993.
M81748, Genbank, Nov. 8, 1995.
T58840, Genbank, Feb. 9, 1995.
U66912, Genbank, Sep. 5, 1996.
X02656, Genbank, Feb. 18, 1992.
X16548, Genbank, Sep. 12, 1993.
X54994, Genbank, Jan. 15, 1993.
Z33011, Genbank, Aug. 18, 1995.
Primary Examiner:
Navarro, Mark
Attorney, Agent or Firm:
Hamilton, Brook, Smith & Reynolds, P.C.
Parent Case Data:

RELATED APPLICATIONS

This application is a divisional of U.S. application Ser. No. 11/027,892, filed Dec. 30, 2004, which is a divisional of U.S. application Ser. No. 10/640,833, filed Aug. 14, 2003, now abandoned, which is a continuation of U.S. application Ser. No. 09/583,110 (now U.S. Pat. No. 6,699,703) filed May 26, 2000, which is a continuation-in-part of U.S. application Ser. No. 09/107,433 (now U.S. Pat. No. 6,800,744), filed Jun. 30, 1998, which claims the benefit of U.S. Application No. 60/085,131, filed May 12, 1998 and of U.S. Application No. 60/051,553, filed Jul. 2, 1997. The entire teachings of the above applications are incorporated herein by reference.

Claims:
What is claimed is:

1. A method of treating a subject, comprising the step of administering to the subject a composition that includes a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 4263, wherein administration of the polypeptide elicits an immune response to S. pneumoniae, thereby treating the subject.

2. A method of treating a subject, comprising the step of administering to the subject a composition that includes a S. pneumoniae surface protein having at least 90% identity to SEQ ID NO: 4263, wherein administration of the protein elicits an immune response to S. pneumoniae, thereby treating the subject.

3. A method of treating a S. pneumoniae infection in a subject, comprising the step of administering to a subject having an S. pneumoniae infection a composition that includes a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 4263.

4. A method of treating a S. pneumoniae infection in a subject, comprising the step of administering to a subject having an S. pneumoniae infection a composition that includes a S. pneumoniae surface protein having at least 90% identity to SEQ ID NO: 4263.

5. The method of claim 1, wherein the S. pneumoniae surface protein has at least 95% identity to SEQ ID NO: 4263.

6. The method of claim 1, wherein the S. pneumoniae surface protein has at least 98% identity to SEQ ID NO: 4263.

7. The method of claim 1, wherein the S. pneumoniae surface protein has at least 99% identity to SEQ ID NO: 4263.

8. The method of claim 4, wherein the S. pneumoniae surface protein has at least 95% identity to SEQ ID NO: 4263.

9. The method of claim 4, wherein the S. pneumoniae surface protein has at least 98% identity to SEQ ID NO: 4263.

10. The method of claim 4, wherein the S. pneumoniae surface protein has at least 99% identity to SEQ ID NO: 4263.

Description:

INCORPORATION BY REFERENCE OF MATERIAL ON COMPACT DISK

This application incorporates by reference the Sequence Listing contained on the two compact disks (Copy 1 and Copy 2), filed concurrently herewith, containing the following file:

File name: 3687.1000-038SequenceList.txt; created Sep. 16, 2006, 8,135 KB in size.

This application also incorporates by reference Table 2 contained on the two compact disks (Copy 1 and Copy 2), filed concurrently herewith, containing the following file:

File name: Table2 2.txt; created Aug. 21, 2006, 351 KB in size.

FIELD OF THE INVENTION

The invention relates to isolated nucleic acids and polypeptides derived from Streptococcus pneumoniae that are useful as molecular targets for diagnostics, prophylaxis and treatment of pathological conditions, as well as materials and methods for the diagnosis, prevention, and amelioration of pathological conditions resulting from bacterial infection.

BACKGROUND OF THE INVENTION

Streptococcus pneumoniae (S. pneumoniae) is a common, spherical, gram-positive bacterium. Worldwide it is a leading cause of illness among children, the elderly, and individuals with debilitating medical conditions (Breiman, R. F. et al., 1994, JAMA 271: 1831). S. pneumoniae is estimated to be the causal agent in 3,000 cases of meningitis, 50,000 cases of bacteremia, 500,000 cases of pneumonia, and 7,000,000 cases of otitis media annually in the United States alone (Reichler, M. R. et al., 1992, J. Infect. Dis. 166: 1346; Stool, S. E. and Field, M. J., 1989 Pediatr. Infect. Dis. J. 8: S11). In the United States alone, 40,000 deaths result annually from S. pneumoniae infections (Williams, W. W. et al., 1988 Ann. Intern. Med. 108: 616) with a death rate approaching 30% from bacteremia (Butler, J. C. et al., 1993, JAMA 270: 1826). Pneumococcal pneumonia is a serious problem among the elderly of industrialized nations (Käyhty, H. and Eskola, J., 1996 Emerg. Infect. Dis. 2: 289) and is a leading cause of death among children in developing nations (Käyhty, H. and Eskola, J., 1996 Emerg. Infect. Dis. 2: 289; Stansfield, S. K., 1987 Pediatr. Infect. Dis. 6: 622).

Vaccines against S. pneumoniae have been available for a number of years. There are a large number of serotypes based on the polysaccharide capsule (van Dam, J. E., Fleer, A., and Snippe, H., 1990 Antonie van Leeuwenhoek 58: 1) although only a fraction of the serotypes seem to be associated with infections (Martin, D. R. and Brett, M. S., 1996 N. Z. Med. J. 109: 288). A multivalent vaccine against capsular polysaccharides of 23 serotypes (Smart, L. E., Dougall, A. J. and Gridwood, R. W., 1987 J. Infect. 14: 209) has provided protection for some groups but not for several groups at risk for pneumococcal infections, such as infants and the elderly (Mäkel, P. H. et al., 1980 Lancet 2: 547; Sankilampi, U., 1996 J. Infect. Dis. 173: 387). Conjugated pneumococcal capsular polysaccharide vaccines have somewhat improved efficacy, but are costly and, therefore, are not likely to be in widespread use (Käyhty, H. and Eskola, J., 1996 Emerg. Infect. Dis. 2: 289).

At one time, S. pneumoniae strains were uniformly susceptible to penicillin. The report of a penicillin-resistant strain of S. pneumoniae (Hansman, D. and Bullen, M. M., 1967 Lancet 1: 264) was followed rapidly by many reports indicating the worldwide emergence of penicillin-resistant and penicillin non-susceptible strains (Klugman, K. P., 1990 Clin. Microbiol. Rev. 3: 171). S. pneumoniae strains which are resistant to multiple antibiotics (including penicillin) have also been observed recently within the United States (Welby, P. L., 1994 Pediatr. Infect. Dis. J. 13: 281; Ducin, J. S. et al., 1995 Pediatr. Infect. Dis. J. 14: 745; Butler, J. C., 1996 J. Infect. Dis. 174: 986) as well as internationally (Boswell, T. C. et al., 1996 J. Infect. 33: 17; Catchpole, C., Fraise, A., and Wise, R., 1996 Microb. Drug Resist. 2: 431; Tarasi, A. et al., 1997 Microb. Drug Resist. 3: 105).

A high incidence of morbidity is associated with invasive S. pneumoniae infections (Williams, W. W. et al., 1988 Ann. Intern. Med. 108: 616). Because of the incomplete effectiveness of currently available vaccines and antibiotics, the identification of new targets for antimicrobial therapies, including, but not limited to, the design of vaccines and antibiotics, which may help prevent infection or that may be useful in fighting existing infections, is highly desirable.

SUMMARY OF THE INVENTION

The present invention fulfills the need for diagnostic tools and therapeutics by providing bacterial-specific compositions and methods for detecting, treating, and preventing bacterial infection, in particular S. pneumoniae infection.

The present invention encompasses isolated polypeptides and nucleic acids derived from S. pneumoniae that are useful as reagents for diagnosis of bacterial infection, components of effective antibacterial vaccines, and/or as targets for antibacterial drugs, including anti-S. pneumoniae drugs. The nucleic acids and peptides of the present invention also have utility for diagnostics and therapeutics for S. pneumoniae and other Streptococcus species. They can also be used to detect the presence of S. pneumoniae and other Streptococcus species in a sample; and in screening compounds for the ability to interfere with the S. pneumoniae life cycle or to inhibit S. pneumoniae infection. More specifically, this invention features compositions of nucleic acids corresponding to entire coding sequences of S. pneumoniae proteins, including surface or secreted proteins or parts thereof, nucleic acids capable of binding mRNA from S. pneumoniae proteins to block protein translation, and methods for producing S. pneumoniae proteins or parts thereof using peptide synthesis and recombinant DNA techniques. This invention also features antibodies and nucleic acids useful as probes to detect S. pneumoniae infection. In addition, vaccine compositions and methods for the protection or treatment of infection by S. pneumoniae are within the scope of this invention.

The nucleotide sequences provided in SEQ ID NO: 1-SEQ ID NO: 2661, a fragment thereof, or a nucleotide sequence at least 99.5% identical to a sequence contained within SEQ ID NO: 1-SEQ ID NO: 2661 may be “provided” in a variety of media to facilitate use thereof. As used herein, “provided” refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence of the present invention, i.e., the nucleotide sequence provided in SEQ ID NO: 1-SEQ ID NO: 2661, a fragment thereof, or a nucleotide sequence at least 99.5% identical to a sequence contained within SEQ ID NO: 1-SEQ ID NO: 2661. Uses for and methods for providing nucleotide sequences in a variety of media are well known in the art (see e.g., EPO Publication No. EP 0 756 006).
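The "at least 99.5% identical" criterion above can be illustrated with a minimal sketch. The text does not state how identity is computed, so the following assumes a simple column-by-column comparison of two sequences that have already been aligned to equal length, with "-" marking gaps; the function names and the gap handling are illustrative assumptions, not the method of the invention.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns whose residues are identical.

    Assumption: both inputs are pre-aligned to the same length and use '-'
    for gaps. Columns that are gaps in both sequences are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # shared gap column carries no information
        columns += 1
        if x == y:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 99.5) -> bool:
    """True when percent identity is at or above the threshold (default 99.5)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under these assumptions, a 200-column alignment with a single mismatch sits exactly at the 99.5% threshold, while two mismatches fall below it.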

In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer readable media. As used herein, “computer readable media” refers to any media which can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage media, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. A person skilled in the art can readily appreciate how any of the presently known computer readable media can be used to create a manufacture comprising computer readable media having recorded thereon a nucleotide sequence of the present invention.

As used herein, “recorded” refers to a process for storing information on computer readable media. A person skilled in the art can readily adopt any of the presently known methods for recording information on computer readable media to generate manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a person skilled in the art for creating computer readable media having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable media. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. A person skilled in the art can readily adapt any number of data processor structuring formats (e.g. text file or database) in order to obtain computer readable media having recorded thereon the nucleotide sequence information of the present invention.
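As one illustration of recording sequence information in an ASCII text file, the following sketch writes and reads records in a FASTA-style layout (a ">" header line followed by wrapped sequence lines). The choice of FASTA, the function names, the identifiers, and the short placeholder sequences are all assumptions made for the example; they are not taken from the Sequence Listing.

```python
import io
from typing import Dict, Iterable, TextIO, Tuple


def write_fasta(records: Iterable[Tuple[str, str]], handle: TextIO,
                width: int = 60) -> None:
    """Write (identifier, sequence) pairs as FASTA-style ASCII text."""
    for identifier, sequence in records:
        handle.write(f">{identifier}\n")
        # Wrap the sequence into fixed-width lines for readability.
        for start in range(0, len(sequence), width):
            handle.write(sequence[start:start + width] + "\n")


def read_fasta(handle: TextIO) -> Dict[str, str]:
    """Read FASTA-style text back into an identifier -> sequence mapping."""
    records: Dict[str, str] = {}
    identifier = None
    chunks = []
    for raw in handle:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if identifier is not None:
                records[identifier] = "".join(chunks)
            identifier = line[1:].strip()
            chunks = []
        elif identifier is not None:
            chunks.append(line)
    if identifier is not None:
        records[identifier] = "".join(chunks)
    return records


# Usage with an in-memory text buffer standing in for a file on disk.
# The identifiers and sequences below are placeholders, not real data.
example_records = [("example_seq_1", "ACGT" * 40), ("example_seq_2", "TTGACA")]
buffer = io.StringIO()
write_fasta(example_records, buffer)
buffer.seek(0)
recovered = read_fasta(buffer)
```

The same two functions work unchanged with a file opened in text mode; a database application, as the paragraph notes, would be an equally valid storage structure.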

By providing the nucleotide sequence of SEQ ID NO: 1-SEQ ID NO: 2661, a fragment thereof, or a nucleotide sequence at least 99.5% identical to a sequence contained within SEQ ID NO: 1-SEQ ID NO: 2661 in computer readable form, a person skilled in the art can routinely access the sequence information for a variety of purposes. Computer software is publicly available which allows a person skilled in the art to access sequence information provided in a computer readable media. Examples of such computer software include programs of the “Staden Package”, “DNA Star”, “MacVector”, GCG “Wisconsin Package” (Genetics Computer Group, Madison, Wis.) and “NCBI toolbox” (National Center for Biotechnology Information).

Computer algorithms enable the identification of S. pneumoniae open reading frames (ORFs) within SEQ ID NO: 1-SEQ ID NO: 2661 which contain homology to ORFs or proteins from other organisms. Examples of such similarity-search algorithms include the BLAST [Altschul et al., J. Mol. Biol. 215:403-410 (1990)] and Smith-Waterman [Smith and Waterman (1981) Advances in Applied Mathematics, 2:482-489] search algorithms. These algorithms are utilized on computer systems as exemplified below. The ORFs so identified represent protein encoding fragments within the S. pneumoniae genome and are useful in producing commercially important proteins such as enzymes used in fermentation reactions and in the production of commercially useful metabolites.

The present invention further provides systems, particularly computer-based systems, which contain the sequence information described herein. Such systems are designed to identify commercially important fragments of the S. pneumoniae genome. As used herein, “a computer-based system” refers to the hardware means, software means, and data storage means used to analyze the nucleotide sequence information of the present invention. The minimum hardware means of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, output means, and data storage means. A person skilled in the art can readily appreciate that any one of the currently available computer-based systems is suitable for use in the present invention. The computer-based systems of the present invention comprise a data storage means having stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means for supporting and implementing a search means. As used herein, “data storage means” refers to memory which can store nucleotide sequence information of the present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide sequence information of the present invention.

As used herein, “search means” refers to one or more programs which are implemented on the computer-based system to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of the S. pneumoniae genome which are similar to, or “match”, a particular target sequence or target motif. A variety of algorithms are known in the art and have been disclosed publicly, and a variety of commercially available software packages for conducting homology-based similarity searches are available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, FASTA (GCG Wisconsin Package), Bic_SW (Compugen Bioccelerator), BLASTN2, BLASTP2 and BLASTX2 (NCBI) and Motifs (GCG). A person skilled in the art can readily recognize that any one of the available algorithms or implementing software packages for conducting homology searches can be adapted for use in the present computer-based systems.

As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A person skilled in the art can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. The most preferred sequence length of a target sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that many genes are longer than 500 amino acids, or 1.5 kb in length, and that commercially important fragments of the S. pneumoniae genome, such as sequence fragments involved in gene expression and protein processing, will often be shorter than 30 nucleotides.
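The relationship between target length and random occurrence can be made concrete with a short calculation. The following is a minimal sketch under simplifying assumptions that are not part of the specification: residues are treated as independent and equally frequent, only one strand is searched, and the database length of 2,000,000 residues is an arbitrary illustrative figure. The function name is hypothetical.

```python
def expected_random_matches(target_length, database_length, alphabet_size=4):
    """Expected number of chance occurrences of one fixed target.

    Assumes independent, equally frequent residues (a simplification).
    Use alphabet_size=4 for nucleotides and 20 for amino acids.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)


if __name__ == "__main__":
    database_length = 2_000_000  # arbitrary illustrative size
    for length in (6, 12, 30):
        expected = expected_random_matches(length, database_length)
        print(f"{length}-nucleotide target: {expected:.3g} expected chance matches")
```

Under these assumptions each additional nucleotide divides the expected number of chance matches by four, which is why longer targets are less likely to occur at random.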

As used herein, “a target structural motif,” or “target motif,” refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a specific functional domain or three-dimensional configuration which is formed upon the folding of the target polypeptide. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzymatic active sites, membrane spanning regions, and signal sequences. Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

A variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention. A preferred format for an output means ranks fragments of the S. pneumoniae genome possessing varying degrees of homology to the target sequence or target motif. Such presentation provides a person skilled in the art with a ranking of sequences which contain various amounts of the target sequence or target motif and identifies the degree of homology contained in the identified fragment.

A variety of comparing means can be used to compare a target sequence or target motif with the data storage means to identify sequence fragments of the S. pneumoniae genome. In the present examples, software implementing the BLASTP2 and Bic_SW algorithms (Altschul et al., J. Mol. Biol. 215:403-410 (1990); Compugen Bioccelerator) was used to identify open reading frames within the S. pneumoniae genome. A person skilled in the art can readily recognize that any one of the publicly available homology search programs can be used as the search means for the computer-based systems of the present invention.
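For illustration, a comparing means that returns ranked output can be sketched with a basic Smith-Waterman local alignment score. This is a simplified sketch and not the BLASTP2 or Bic_SW implementation: the match, mismatch and linear gap scores are assumed values, and the fragment names and sequences are invented placeholders.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (linear gap penalty)."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor of 0.
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best


def rank_fragments(target, fragments):
    """Return (score, name) pairs, highest-scoring fragment first."""
    scored = [(smith_waterman_score(target, seq), name)
              for name, seq in fragments.items()]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Placeholder fragments; not sequences of the invention.
    fragments = {"frag_a": "GGACGTGG", "frag_b": "GGACCTGG"}
    for score, name in rank_fragments("ACGT", fragments):
        print(name, score)
```

Sorting by score gives the kind of ranked presentation described above, with the fragment most similar to the target listed first.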

The invention features S. pneumoniae polypeptides, preferably a substantially pure preparation of an S. pneumoniae polypeptide, or a recombinant S. pneumoniae polypeptide. In preferred embodiments: the polypeptide has biological activity; the polypeptide has an amino acid sequence at least 60%, 70%, 80%, 90%, 95%, 98%, or 99% identical to an amino acid sequence of the invention contained in the Sequence Listing, preferably it has about 65% sequence identity with an amino acid sequence of the invention contained in the Sequence Listing, and most preferably it has about 92% to about 99% sequence identity with an amino acid sequence of the invention contained in the Sequence Listing; the polypeptide has an amino acid sequence essentially the same as an amino acid sequence of the invention contained in the Sequence Listing; the polypeptide is at least 5, 10, 20, 50, 100, or 150 amino acid residues in length; the polypeptide includes at least 5, preferably at least 10, more preferably at least 20, more preferably at least 50, 100, or 150 contiguous amino acid residues of the invention contained in the Sequence Listing. In yet another preferred embodiment, the amino acid sequence which differs in sequence identity by about 7% to about 8% from the S. pneumoniae amino acid sequences of the invention contained in the Sequence Listing is also encompassed by the invention.

In preferred embodiments: the S. pneumoniae polypeptide is encoded by a nucleic acid of the invention contained in the Sequence Listing, or by a nucleic acid having at least 60%, 70%, 80%, 90%, 95%, 98%, or 99% homology with a nucleic acid of the invention contained in the Sequence Listing.

In a preferred embodiment, the subject S. pneumoniae polypeptide differs in amino acid sequence at 1, 2, 3, 5, 10 or more residues from a sequence of the invention contained in the Sequence Listing. The differences, however, are such that the S. pneumoniae polypeptide exhibits an S. pneumoniae biological activity, e.g., the S. pneumoniae polypeptide retains a biological activity of a naturally occurring S. pneumoniae enzyme.

In preferred embodiments, the polypeptide includes all or a fragment of an amino acid sequence of the invention contained in the Sequence Listing; fused, in reading frame, to additional amino acid residues, preferably to residues encoded by genomic DNA 5′ or 3′ to the genomic DNA which encodes a sequence of the invention contained in the Sequence Listing.

In yet other preferred embodiments, the S. pneumoniae polypeptide is a recombinant fusion protein having a first S. pneumoniae polypeptide portion and a second polypeptide portion, e.g., a second polypeptide portion having an amino acid sequence unrelated to S. pneumoniae. The second polypeptide portion can be, e.g., any of glutathione-S-transferase, a DNA binding domain, or a polymerase activating domain. In a preferred embodiment, the fusion protein can be used in a two-hybrid assay.

Polypeptides of the invention include those which arise as a result of alternative transcription events, alternative RNA splicing events, and alternative translational and posttranslational events.

In a preferred embodiment, the encoded S. pneumoniae polypeptide differs (e.g., by amino acid substitution, addition or deletion of at least one amino acid residue) in amino acid sequence at 1, 2, 3, 5, 10 or more residues, from a sequence of the invention contained in the Sequence Listing. The differences, however, are such that the encoded S. pneumoniae polypeptide exhibits an S. pneumoniae biological activity, e.g., the encoded S. pneumoniae polypeptide retains a biological activity of a naturally occurring S. pneumoniae enzyme.

In preferred embodiments, the encoded polypeptide includes all or a fragment of an amino acid sequence of the invention contained in the Sequence Listing; fused, in reading frame, to additional amino acid residues, preferably to residues encoded by genomic DNA 5′ or 3′ to the genomic DNA which encodes a sequence of the invention contained in the Sequence Listing.

The S. pneumoniae strain 14453, from which the genomic sequences were obtained, was deposited on Jun. 26, 1997 with the American Type Culture Collection, 10801 University Blvd., Manassas, Va. 20110-2209, and assigned the ATCC designation # 55987.

Included in the invention are: allelic variations; natural mutants; induced mutants; proteins encoded by DNA that hybridizes under high or low stringency conditions to a nucleic acid which encodes a polypeptide of the invention contained in the Sequence Listing (for definitions of high and low stringency see Current Protocols in Molecular Biology, John Wiley & Sons, New York, 1989, 6.3.1-6.3.6, hereby incorporated by reference); and polypeptides specifically bound by antisera to S. pneumoniae polypeptides, especially by antisera to an active site or binding domain of an S. pneumoniae polypeptide. The invention also includes fragments, preferably biologically active fragments. These and other polypeptides are also referred to herein as S. pneumoniae polypeptide analogs or variants.

The invention further provides nucleic acids, e.g., RNA or DNA, encoding a polypeptide of the invention. This includes double stranded nucleic acids as well as coding and antisense single strands.

In preferred embodiments, the subject S. pneumoniae nucleic acid will include a transcriptional regulatory sequence, e.g. at least one of a transcriptional promoter or transcriptional enhancer sequence, operably linked to the S. pneumoniae gene sequence, e.g., to render the S. pneumoniae gene sequence suitable for expression in a recombinant host cell.

In yet a further preferred embodiment, the nucleic acid which encodes an S. pneumoniae polypeptide of the invention, hybridizes under stringent conditions to a nucleic acid probe corresponding to at least 8 consecutive nucleotides of the invention contained in the Sequence Listing; more preferably to at least 12 consecutive nucleotides of the invention contained in the Sequence Listing; more preferably to at least 20 consecutive nucleotides of the invention contained in the Sequence Listing; more preferably to at least 40 consecutive nucleotides of the invention contained in the Sequence Listing.

In another aspect, the invention provides a substantially pure nucleic acid having a nucleotide sequence which encodes an S. pneumoniae polypeptide. In preferred embodiments: the encoded polypeptide has biological activity; the encoded polypeptide has an amino acid sequence at least 60%, 70%, 80%, 90%, 95%, 98%, or 99% homologous to an amino acid sequence of the invention contained in the Sequence Listing; the encoded polypeptide has an amino acid sequence essentially the same as an amino acid sequence of the invention contained in the Sequence Listing; the encoded polypeptide is at least 5, 10, 20, 50, 100, or 150 amino acids in length; the encoded polypeptide comprises at least 5, preferably at least 10, more preferably at least 20, more preferably at least 50, 100, or 150 contiguous amino acids of the invention contained in the Sequence Listing.

In another aspect, the invention encompasses: a vector including a nucleic acid which encodes an S. pneumoniae polypeptide or an S. pneumoniae polypeptide variant as described herein; a host cell transfected with the vector; and a method of producing a recombinant S. pneumoniae polypeptide or S. pneumoniae polypeptide variant; including culturing the cell, e.g., in a cell culture medium, and isolating an S. pneumoniae polypeptide or an S. pneumoniae polypeptide variant, e.g., from the cell or from the cell culture medium.

In another series of embodiments, the invention provides isolated nucleic acids comprising sequences at least about 8 nucleotides in length, more preferably at least about 12 nucleotides in length, and most preferably at least about 15-20 nucleotides in length, that correspond to a subsequence of any one of SEQ ID NO: 1-SEQ ID NO: 2661 or complements thereof. Alternatively, the nucleic acids comprise sequences contained within any ORF (open reading frame), including a complete protein-coding sequence, of which any of SEQ ID NO: 1-SEQ ID NO: 2661 forms a part. The invention encompasses sequence-conservative variants and function-conservative variants of these sequences. The nucleic acids may be DNA, RNA, DNA/RNA duplexes, protein-nucleic acid (PNA), or derivatives thereof.

In another aspect, the invention features a purified recombinant nucleic acid having at least 50%, 60%, 70%, 80%, 90%, 95%, 98%, or 99% homology with a sequence of the invention contained in the Sequence Listing.

In another aspect, the invention features nucleic acids capable of binding mRNA of S. pneumoniae. Such nucleic acid is capable of acting as antisense nucleic acid to control the translation of mRNA of S. pneumoniae. A further aspect features a nucleic acid which is capable of binding specifically to an S. pneumoniae nucleic acid. These nucleic acids are also referred to herein as complements and have utility as probes and as capture reagents.

In another aspect, the invention features an expression system comprising an open reading frame corresponding to S. pneumoniae nucleic acid. The nucleic acid further comprises a control sequence compatible with an intended host. The expression system is useful for making polypeptides corresponding to S. pneumoniae nucleic acid.

In another aspect, the invention features a cell transformed with the expression system to produce S. pneumoniae polypeptides.

In yet another embodiment, the invention encompasses reagents for detecting bacterial infection, including S. pneumoniae infection, which comprise at least one S. pneumoniae-derived nucleic acid defined by any one of SEQ ID NO: 1-SEQ ID NO: 2661, or sequence-conservative or function-conservative variants thereof. Alternatively, the diagnostic reagents comprise polypeptide sequences that are contained within any open reading frames (ORFs), including complete protein-coding sequences, contained within any of SEQ ID NO: 1-SEQ ID NO: 2661, or polypeptide sequences contained within any of SEQ ID NO: 2662-SEQ ID NO: 5322, or polypeptides of which any of the above sequences forms a part, or antibodies directed against any of the above peptide sequences or function-conservative variants and/or fragments thereof.

The invention further provides antibodies, preferably monoclonal antibodies, which specifically bind to the polypeptides of the invention. Methods are also provided for producing antibodies in a host animal. The methods of the invention comprise immunizing an animal with at least one S. pneumoniae-derived immunogenic component, wherein the immunogenic component comprises one or more of the polypeptides encoded by any one of SEQ ID NO: 1-SEQ ID NO: 2661 or sequence-conservative or function-conservative variants thereof; or polypeptides that are contained within any ORFs, including complete protein-coding sequences, of which any of SEQ ID NO: 1-SEQ ID NO: 2661 forms a part; or polypeptide sequences contained within any of SEQ ID NO: 2662-SEQ ID NO: 5322; or polypeptides of which any of SEQ ID NO: 2662-SEQ ID NO: 5322 forms a part. Host animals include any warm blooded animal, including without limitation mammals and birds. Such antibodies have utility as reagents for immunoassays to evaluate the abundance and distribution of S. pneumoniae-specific antigens.

In yet another aspect, the invention provides a method for detecting bacterial antigenic components in a sample, which comprises the steps of: (i) contacting a sample suspected to contain a bacterial antigenic component with a bacterial-specific antibody, under conditions in which a stable antigen-antibody complex can form between the antibody and bacterial antigenic components in the sample; and (ii) detecting any antigen-antibody complex formed in step (i), wherein detection of an antigen-antibody complex indicates the presence of at least one bacterial antigenic component in the sample. In different embodiments of this method, the antibodies used are directed against a sequence encoded by any of SEQ ID NO: 1-SEQ ID NO: 2661 or sequence-conservative or function-conservative variants thereof, or against a polypeptide sequence contained in any of SEQ ID NO: 2662-SEQ ID NO: 5322 or function-conservative variants thereof.

In yet another aspect, the invention provides a method for detecting antibacterial-specific antibodies in a sample, which comprises: (i) contacting a sample suspected to contain antibacterial-specific antibodies with a S. pneumoniae antigenic component, under conditions in which a stable antigen-antibody complex can form between the S. pneumoniae antigenic component and antibacterial antibodies in the sample; and (ii) detecting any antigen-antibody complex formed in step (i), wherein detection of an antigen-antibody complex indicates the presence of antibacterial antibodies in the sample. In different embodiments of this method, the antigenic component is encoded by a sequence contained in any of SEQ ID NO: 1-SEQ ID NO: 2661 or sequence-conservative and function-conservative variants thereof, or is a polypeptide sequence contained in any of SEQ ID NO: 2662-SEQ ID NO: 5322 or function-conservative variants thereof.

In another aspect, the invention features a method of generating vaccines for immunizing an individual against S. pneumoniae. The method includes: immunizing a subject with an S. pneumoniae polypeptide, e.g., a surface or secreted polypeptide, or active portion thereof, and a pharmaceutically acceptable carrier. Such vaccines have therapeutic and prophylactic utilities.

In another aspect, the invention features a method of evaluating a compound, e.g. a polypeptide, e.g., a fragment of a host cell polypeptide, for the ability to bind an S. pneumoniae polypeptide. The method includes: contacting the candidate compound with an S. pneumoniae polypeptide and determining if the compound binds or otherwise interacts with an S. pneumoniae polypeptide. Compounds which bind S. pneumoniae are candidates as activators or inhibitors of the bacterial life cycle. These assays can be performed in vitro or in vivo.

In another aspect, the invention features a method of evaluating a compound, e.g. a polypeptide, e.g., a fragment of a host cell polypeptide, for the ability to bind an S. pneumoniae nucleic acid, e.g., DNA or RNA. The method includes: contacting the candidate compound with an S. pneumoniae nucleic acid and determining if the compound binds or otherwise interacts with the S. pneumoniae nucleic acid. Compounds which bind S. pneumoniae are candidates as activators or inhibitors of the bacterial life cycle. These assays can be performed in vitro or in vivo.

DETAILED DESCRIPTION OF THE INVENTION

The sequences of the present invention include the specific nucleic acid and amino acid sequences set forth in the Sequence Listing that forms a part of the present specification, and which are designated SEQ ID NO: 1-SEQ ID NO: 5322. Use of the terms “SEQ ID NO: 1-SEQ ID NO: 2661”, “SEQ ID NO: 2662-SEQ ID NO: 5322”, “the sequences depicted in Table 2”, etc., is intended, for convenience, to refer to each individual SEQ ID NO individually, and is not intended to refer to the genus of these sequences. In other words, it is a shorthand for listing all of these sequences individually. The invention encompasses each sequence individually, as well as any combination thereof.

Definitions

“Nucleic acid” or “polynucleotide” as used herein refers to purine- and pyrimidine-containing polymers of any length, either polyribonucleotides or polydeoxyribonucleotides or mixed polyribo-polydeoxyribo nucleotides. This includes single- and double-stranded molecules, i.e., DNA-DNA, DNA-RNA and RNA-RNA hybrids, as well as “protein nucleic acids” (PNA) formed by conjugating bases to an amino acid backbone. This also includes nucleic acids containing modified bases.

A nucleic acid or polypeptide sequence that is “derived from” a designated sequence refers to a sequence that corresponds to a region of the designated sequence. For nucleic acid sequences, this encompasses sequences that are homologous or complementary to the sequence, as well as “sequence-conservative variants” and “function-conservative variants.” For polypeptide sequences, this encompasses “function-conservative variants.” Sequence-conservative variants are those in which a change of one or more nucleotides in a given codon position results in no alteration in the amino acid encoded at that position. Function-conservative variants are those in which a given amino acid residue in a polypeptide has been changed without altering the overall conformation and function of the native polypeptide, including, but not limited to, replacement of an amino acid with one having similar physico-chemical properties (such as, for example, acidic, basic, hydrophobic, and the like). “Function-conservative” variants also include any polypeptides that have the ability to elicit antibodies specific to a designated polypeptide.
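The definition of a sequence-conservative variant can be illustrated with a short check that two coding sequences differ in nucleotide sequence but encode the same amino acids. The sketch assumes the standard genetic code codon assignments, assumes both inputs are in frame and of equal length, and uses hypothetical function names and made-up sequences.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_sequence_conservative(reference, variant):
    """True if the nucleotides differ but the encoded amino acids do not."""
    return (
        len(reference) == len(variant)
        and reference.upper() != variant.upper()
        and translate(reference) == translate(variant)
    )


if __name__ == "__main__":
    reference = "ATGGCTAAA"  # made-up example
    print(is_sequence_conservative(reference, "ATGGCCAAG"))  # synonymous changes
    print(is_sequence_conservative(reference, "ATGGATAAA"))  # changes one residue
```

A function-conservative variant, by contrast, cannot be recognized from sequence alone, since it depends on the conformation and function of the resulting polypeptide.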

An “S. pneumoniae-derived” nucleic acid or polypeptide sequence may or may not be present in other bacterial species, and may or may not be present in all S. pneumoniae strains. This term is intended to refer to the source from which the sequence was originally isolated. Thus, an S. pneumoniae-derived polypeptide, as used herein, may be used, e.g., as a target to screen for a broad spectrum antibacterial agent, to search for homologous proteins in other species of bacteria or in eukaryotic organisms such as fungi and humans, etc.

A purified or isolated polypeptide or a substantially pure preparation of a polypeptide are used interchangeably herein and, as used herein, mean a polypeptide that has been separated from other proteins, lipids, and nucleic acids with which it naturally occurs. Preferably, the polypeptide is also separated from substances, e.g., antibodies or gel matrix, e.g., polyacrylamide, which are used to purify it. Preferably, the polypeptide constitutes at least 10, 20, 50, 70, 80 or 95% dry weight of the purified preparation. Preferably, the preparation contains: sufficient polypeptide to allow protein sequencing; at least 1, 10, or 100 mg of the polypeptide.

A purified preparation of cells refers to, in the case of plant or animal cells, an in vitro preparation of cells and not an entire intact plant or animal. In the case of cultured cells or microbial cells, it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

A purified or isolated or a substantially pure nucleic acid, e.g., a substantially pure DNA, (are terms used interchangeably herein) is a nucleic acid which is one or both of the following: not immediately contiguous with both of the coding sequences with which it is immediately contiguous (i.e., one at the 5′ end and one at the 3′ end) in the naturally-occurring genome of the organism from which the nucleic acid is derived; or which is substantially free of a nucleic acid with which it occurs in the organism from which the nucleic acid is derived. The term includes, for example, a recombinant DNA which is incorporated into a vector, e.g., into an autonomously replicating plasmid or virus, or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a cDNA or a genomic DNA fragment produced by PCR or restriction endonuclease treatment) independent of other DNA sequences. Substantially pure DNA also includes a recombinant DNA which is part of a hybrid gene encoding additional S. pneumoniae DNA sequence.

A “contig” as used herein is a nucleic acid representing a continuous stretch of genomic sequence of an organism.

An “open reading frame”, also referred to herein as ORF, is a region of nucleic acid which encodes a polypeptide. This region usually represents the total coding region for the polypeptide and can be determined from a stop to stop codon or from a start to stop codon.
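As an illustration of the start-to-stop determination, the following minimal sketch scans the three forward reading frames of a DNA string. It makes assumptions beyond the definition above: only ATG is treated as a start codon, only the given strand is scanned, and a minimum length is applied. The function name and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2, start_codon="ATG"):
    """Return (start, end) index pairs of start-to-stop ORFs.

    `end` is exclusive and includes the stop codon. `min_codons` counts
    the codons from the start codon up to, but not including, the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == start_codon:
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    example = "CCATGAAATTTTAGGG"  # made-up sequence
    for start, end in find_orfs(example):
        print(start, end, example[start:end])
```

A stop-to-stop determination would instead record every stretch between consecutive in-frame stop codons, without requiring a start codon.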

As used herein, a “coding sequence” is a nucleic acid which is transcribed into messenger RNA and/or translated into a polypeptide when placed under the control of appropriate regulatory sequences. The boundaries of the coding sequence are determined by a translation start codon at the five prime terminus and a translation stop codon at the three prime terminus. A coding sequence can include but is not limited to messenger RNA, synthetic DNA, and recombinant nucleic acid sequences.

A “complement” of a nucleic acid as used herein refers to an anti-parallel or antisense sequence that participates in Watson-Crick base-pairing with the original sequence.
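For a DNA sequence written 5′ to 3′, the anti-parallel complement can be computed as sketched below. The function name is hypothetical, and only the four standard DNA bases are handled.

```python
# Watson-Crick pairing for DNA: A with T, C with G.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the anti-parallel complement, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATTGCC"))
```

Applying the function twice returns the original sequence, which reflects the symmetry of base-pairing.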

A “gene product” is a protein or structural RNA which is specifically encoded by a gene.

As used herein, the term “probe” refers to a nucleic acid, peptide or other chemical entity which specifically binds to a molecule of interest. Probes are often associated with or capable of associating with a label. A label is a chemical moiety capable of detection. Typical labels comprise dyes, radioisotopes, luminescent and chemiluminescent moieties, fluorophores, enzymes, precipitating agents, amplification sequences, and the like. Similarly, a nucleic acid, peptide or other chemical entity which specifically binds to a molecule of interest and immobilizes such molecule is referred to herein as a “capture ligand”. Capture ligands are typically associated with or capable of associating with a support such as nitro-cellulose, glass, nylon membranes, beads, particles and the like. The specificity of hybridization is dependent on conditions such as the base pair composition of the nucleotides, and the temperature and salt concentration of the reaction. These conditions are readily discernible to one of ordinary skill in the art using routine experimentation.

“Homologous” refers to the sequence similarity or sequence identity between two polypeptides or between two nucleic acid molecules. When a position in both of the two compared sequences is occupied by the same base or amino acid monomer subunit, e.g., if a position in each of two DNA molecules is occupied by adenine, then the molecules are homologous at that position. The percent of homology between two sequences is a function of the number of matching or homologous positions shared by the two sequences divided by the number of positions compared×100. For example, if 6 of 10 of the positions in two sequences are matched or homologous then the two sequences are 60% homologous. By way of example, the DNA sequences ATTGCC and TATGGC share 50% homology. Generally, a comparison is made when two sequences are aligned to give maximum homology.
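The percent homology calculation described above can be expressed as a short function. The sketch assumes the two sequences have already been aligned, are of equal length and contain no gaps; the function name is hypothetical.

```python
def percent_homology(seq_a, seq_b):
    """Matching positions divided by positions compared, times 100."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # The example from the text: positions 3, 4 and 6 match.
    print(percent_homology("ATTGCC", "TATGGC"))  # 50.0
```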

Nucleic acids are hybridizable to each other when at least one strand of a nucleic acid can anneal to the other nucleic acid under defined stringency conditions. Stringency of hybridization is determined by: (a) the temperature at which hybridization and/or washing is performed; and (b) the ionic strength and polarity of the hybridization and washing solutions. Hybridization requires that the two nucleic acids contain complementary sequences; depending on the stringency of hybridization, however, mismatches may be tolerated. Typically, hybridization of two sequences at high stringency (such as, for example, in a solution of 0.5×SSC, at 65° C.) requires that the sequences be essentially completely homologous. Conditions of intermediate stringency (such as, for example, 2×SSC at 65° C.) and low stringency (such as, for example 2×SSC at 55° C.), require correspondingly less overall complementarity between the hybridizing sequences. (1×SSC is 0.15 M NaCl, 0.015 M Na citrate).
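The salt concentrations implied by the example conditions follow directly from the stated composition of 1×SSC. The short sketch below tabulates them; the dictionary and function names are hypothetical, and the conditions are only the examples given above.

```python
# Composition of 1x SSC as stated in the text (molar concentrations).
SSC_1X = {"NaCl": 0.15, "Na citrate": 0.015}

# Example conditions from the text: (SSC strength, temperature in deg C).
EXAMPLE_CONDITIONS = {
    "high": (0.5, 65),
    "intermediate": (2.0, 65),
    "low": (2.0, 55),
}


def ssc_molarity(strength):
    """Molar concentration of each component at the given SSC strength."""
    return {name: strength * molar for name, molar in SSC_1X.items()}


if __name__ == "__main__":
    for label, (strength, temperature) in EXAMPLE_CONDITIONS.items():
        salts = ssc_molarity(strength)
        print(f"{label}: {strength}x SSC at {temperature} C, "
              f"NaCl {salts['NaCl']:.3f} M, Na citrate {salts['Na citrate']:.4f} M")
```

Lower salt and higher temperature both raise stringency, which is why 0.5×SSC at 65° C. is the most demanding of the three example conditions.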

The terms peptides, proteins, and polypeptides are used interchangeably herein.

As used herein, the term “surface protein” refers to all surface accessible proteins, e.g. inner and outer membrane proteins, proteins adhering to the cell wall, and secreted proteins.

A polypeptide has S. pneumoniae biological activity if it has one, two and preferably more of the following properties: (1) when expressed in the course of an S. pneumoniae infection, it can promote or mediate the attachment of S. pneumoniae to a cell; (2) it has an enzymatic activity, structural or regulatory function characteristic of an S. pneumoniae protein; or (3) the gene which encodes it can rescue a lethal mutation in an S. pneumoniae gene. A polypeptide also has biological activity if it is an antagonist, agonist, or super-agonist of a polypeptide having one of the above-listed properties.

A biologically active fragment or analog is one having an in vivo or in vitro activity which is characteristic of the S. pneumoniae polypeptides of the invention contained in the Sequence Listing, or of other naturally occurring S. pneumoniae polypeptides, e.g., one or more of the biological activities described herein. Especially preferred are fragments which exist in vivo, e.g., fragments which arise from post-transcriptional processing or which arise from translation of alternatively spliced RNAs. Fragments include those expressed in native or endogenous cells as well as those made in expression systems, e.g., in CHO cells. Because peptides such as S. pneumoniae polypeptides often exhibit a range of physiological properties and because such properties may be attributable to different portions of the molecule, a useful S. pneumoniae fragment or S. pneumoniae analog is one which exhibits a biological activity in any biological assay for S. pneumoniae activity. Most preferably the fragment or analog possesses 10%, preferably 40%, more preferably 60%, 70%, 80% or 90% or greater of the activity of S. pneumoniae, in any in vivo or in vitro assay.

Analogs can differ from naturally occurring S. pneumoniae polypeptides in amino acid sequence or in ways that do not involve sequence, or both. Non-sequence modifications include changes in acetylation, methylation, phosphorylation, carboxylation, or glycosylation. Preferred analogs include S. pneumoniae polypeptides (or biologically active fragments thereof) whose sequences differ from the wild-type sequence by one or more conservative amino acid substitutions or by one or more non-conservative amino acid substitutions, deletions, or insertions which do not substantially diminish the biological activity of the S. pneumoniae polypeptide. Conservative substitutions typically include the substitution of one amino acid for another with similar characteristics, e.g., substitutions within the following groups: valine, glycine; glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic acid; asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine. Other conservative substitutions can be made in view of the table below.

TABLE 1
CONSERVATIVE AMINO ACID REPLACEMENTS
For Amino Acid | Code | Replace with any of
Alanine | A | D-Ala, Gly, β-Ala, L-Cys, D-Cys
Arginine | R | D-Arg, Lys, D-Lys, homo-Arg, D-homo-Arg, Met, Ile, D-Met, D-Ile, Orn, D-Orn
Asparagine | N | D-Asn, Asp, D-Asp, Glu, D-Glu, Gln, D-Gln
Aspartic Acid | D | D-Asp, D-Asn, Asn, Glu, D-Glu, Gln, D-Gln
Cysteine | C | D-Cys, S-Me-Cys, Met, D-Met, Thr, D-Thr
Glutamine | Q | D-Gln, Asn, D-Asn, Glu, D-Glu, Asp, D-Asp
Glutamic Acid | E | D-Glu, D-Asp, Asp, Asn, D-Asn, Gln, D-Gln
Glycine | G | Ala, D-Ala, Pro, D-Pro, β-Ala, Acp
Isoleucine | I | D-Ile, Val, D-Val, Leu, D-Leu, Met, D-Met
Leucine | L | D-Leu, Val, D-Val, Leu, D-Leu, Met, D-Met
Lysine | K | D-Lys, Arg, D-Arg, homo-Arg, D-homo-Arg, Met, D-Met, Ile, D-Ile, Orn, D-Orn
Methionine | M | D-Met, S-Me-Cys, Ile, D-Ile, Leu, D-Leu, Val, D-Val
Phenylalanine | F | D-Phe, Tyr, D-Thr, L-Dopa, His, D-His, Trp, D-Trp, trans-3,4, or 5-phenylproline, cis-3,4, or 5-phenylproline
Proline | P | D-Pro, L-1-thioazolidine-4-carboxylic acid, D- or L-1-oxazolidine-4-carboxylic acid
Serine | S | D-Ser, Thr, D-Thr, allo-Thr, Met, D-Met, Met(O), D-Met(O), L-Cys, D-Cys
Threonine | T | D-Thr, Ser, D-Ser, allo-Thr, Met, D-Met, Met(O), D-Met(O), Val, D-Val
Tyrosine | Y | D-Tyr, Phe, D-Phe, L-Dopa, His, D-His
Valine | V | D-Val, Leu, D-Leu, Ile, D-Ile, Met, D-Met
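
The substitution groups listed in the paragraph preceding Table 1 can be expressed as a small lookup, which may help when screening a candidate analog against a wild-type sequence. The following is a minimal sketch only: the function names are illustrative and not part of the disclosure, it covers only the groups named in that paragraph (not the D-amino acid or non-standard replacements of Table 1), and it assumes the two sequences are already aligned and of equal length.

```python
from typing import List, Set, Tuple

# Substitution groups copied from the paragraph preceding Table 1,
# written with standard one-letter amino acid codes.
CONSERVATIVE_GROUPS: List[Set[str]] = [
    {"V", "G"},       # valine, glycine
    {"G", "A"},       # glycine, alanine
    {"V", "I", "L"},  # valine, isoleucine, leucine
    {"D", "E"},       # aspartic acid, glutamic acid
    {"N", "Q"},       # asparagine, glutamine
    {"S", "T"},       # serine, threonine
    {"K", "R"},       # lysine, arginine
    {"F", "Y"},       # phenylalanine, tyrosine
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the two residues differ and share a listed group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)


def count_substitutions(wild_type: str, analog: str) -> Tuple[int, int]:
    """Count (conservative, non-conservative) differences position by position.

    Assumes pre-aligned sequences of equal length (a simplification).
    """
    if len(wild_type) != len(analog):
        raise ValueError("sequences must have equal length")
    conservative = non_conservative = 0
    for a, b in zip(wild_type, analog):
        if a == b:
            continue
        if is_conservative(a, b):
            conservative += 1
        else:
            non_conservative += 1
    return conservative, non_conservative
```

Whether a given substitution actually preserves biological activity still has to be established by assay; the lookup only reports membership in the listed groups.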

Other analogs within the invention are those with modifications which increase peptide stability; such analogs may contain, for example, one or more non-peptide bonds (which replace the peptide bonds) in the peptide sequence. Also included are: analogs that include residues other than naturally occurring L-amino acids, e.g., D-amino acids or non-naturally occurring or synthetic amino acids, e.g., β or γ amino acids; and cyclic analogs.

As used herein, the term “fragment”, as applied to an S. pneumoniae analog, will ordinarily be at least about 20 residues, more typically at least about 40 residues, preferably at least about 60 residues in length. Fragments of S. pneumoniae polypeptides can be generated by methods known to those skilled in the art. The ability of a candidate fragment to exhibit a biological activity of S. pneumoniae polypeptide can be assessed by methods known to those skilled in the art as described herein. Also included are S. pneumoniae polypeptides containing residues that are not required for biological activity of the peptide or that result from alternative mRNA splicing or alternative protein processing events.

An “immunogenic component” as used herein is a moiety, such as an S. pneumoniae polypeptide, analog or fragment thereof, that is capable of eliciting a humoral and/or cellular immune response in a host animal.

An “antigenic component” as used herein is a moiety, such as an S. pneumoniae polypeptide, analog or fragment thereof, that is capable of binding to a specific antibody with sufficiently high affinity to form a detectable antigen-antibody complex.

The term “antibody” as used herein is intended to include fragments thereof which are specifically reactive with S. pneumoniae polypeptides.

As used herein, the term “cell-specific promoter” means a DNA sequence that serves as a promoter, i.e., regulates expression of a selected DNA sequence operably linked to the promoter, and which effects expression of the selected DNA sequence in specific cells of a tissue. The term also covers so-called “leaky” promoters, which regulate expression of a selected DNA primarily in one tissue, but cause expression in other tissues as well.

Misexpression, as used herein, refers to a non-wild type pattern of gene expression. It includes: expression at non-wild type levels, i.e., over or under expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of decreased expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

As used herein, “host cells” and other such terms denoting microorganisms or higher eukaryotic cell lines cultured as unicellular entities refer to cells which can become or have been used as recipients for a recombinant vector or other transfer DNA, and include the progeny of the original cell which has been transfected. It is understood by individuals skilled in the art that the progeny of a single parental cell may not necessarily be completely identical in genomic or total DNA complement to the original parent, due to accidental or deliberate mutation.

As used herein, the term “control sequence” refers to a nucleic acid having a base sequence which is recognized by the host organism to effect the expression of encoded sequences to which it is ligated. The nature of such control sequences differs depending upon the host organism; in prokaryotes, such control sequences generally include a promoter, ribosomal binding site, terminators, and in some cases operators; in eukaryotes, generally such control sequences include promoters, terminators and in some instances, enhancers. The term control sequence is intended to include at a minimum, all components whose presence is necessary for expression, and may also include additional components whose presence is advantageous, for example, leader sequences.

As used herein, the term “operably linked” refers to sequences joined or ligated to function in their intended manner. For example, a control sequence is operably linked to coding sequence by ligation in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequence and host cell.

The “metabolism” of a substance, as used herein, means any aspect of the expression, function, action, or regulation of the substance. The metabolism of a substance includes modifications, e.g., covalent or non-covalent modifications of the substance. The metabolism of a substance includes modifications, e.g., covalent or non-covalent modification, the substance induces in other substances. The metabolism of a substance also includes changes in the distribution of the substance. The metabolism of a substance includes changes the substance induces in the distribution of other substances.

A “sample” as used herein refers to a biological sample, such as, for example, tissue or fluid isolated from an individual (including without limitation plasma, serum, cerebrospinal fluid, lymph, tears, saliva and tissue sections) or from in vitro cell culture constituents, as well as samples from the environment.

Technical and scientific terms used herein have the meanings commonly understood by one of ordinary skill in the art to which the present invention pertains, unless otherwise defined. Reference is made herein to various methodologies known to those of skill in the art. Publications and other materials setting forth such known methodologies to which reference is made are incorporated herein by reference in their entireties as though set forth in full. The practice of the invention will employ, unless otherwise indicated, conventional techniques of chemistry, molecular biology, microbiology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques are explained fully in the literature. See e.g., Sambrook, Fritsch, and Maniatis, Molecular Cloning: A Laboratory Manual, 2nd ed. (1989); DNA Cloning, Volumes I and II (D. N. Glover ed. 1985); Oligonucleotide Synthesis (M. J. Gait ed., 1984); Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); the series, Methods in Enzymology (Academic Press, Inc.), particularly Vol. 154 and Vol. 155 (Wu and Grossman, eds.); PCR - A Practical Approach (McPherson, Quirke, and Taylor, eds., 1991); Immunology, 2d Edition, 1989, Roitt et al., C. V. Mosby Company, New York; Advanced Immunology, 2d Edition, 1991, Male et al., Gower Medical Publishing, New York; DNA Cloning: A Practical Approach, Volumes I and II, 1985 (D. N. Glover ed.); Oligonucleotide Synthesis, 1984 (M. J. Gait ed.); Transcription and Translation, 1984 (Hames and Higgins eds.); Animal Cell Culture, 1986 (R. I. Freshney ed.); Immobilized Cells and Enzymes, 1986 (IRL Press); Perbal, 1984, A Practical Guide to Molecular Cloning; and Gene Transfer Vectors for Mammalian Cells, 1987 (J. H. Miller and M. P. Calos eds., Cold Spring Harbor Laboratory).

Any suitable materials and/or methods known to those of skill can be utilized in carrying out the present invention; however, preferred materials and/or methods are described. Materials, reagents and the like to which reference is made in the following description and examples are obtainable from commercial sources, unless otherwise noted.

S. pneumoniae Genomic Sequence

This invention provides nucleotide sequences of the genome of S. pneumoniae which thus comprises a DNA sequence library of S. pneumoniae genomic DNA. The detailed description that follows provides nucleotide sequences of S. pneumoniae, and also describes how the sequences were obtained and how ORFs and protein-coding sequences were identified. Also described are methods of using the disclosed S. pneumoniae sequences, including diagnostic and therapeutic applications. Furthermore, the library can be used as a database for identification and comparison of medically important sequences in this and other strains of S. pneumoniae.

To determine the genomic sequence of S. pneumoniae, DNA was isolated from strain 14453 of S. pneumoniae and mechanically sheared by nebulization to a median size of 2 kb. Following size fractionation by gel electrophoresis, the fragments were blunt-ended, ligated to adapter oligonucleotides, and cloned into each of 20 different pMPX vectors (Rice et al., abstracts of Meeting of Genome Mapping and Sequencing, Cold Spring Harbor, N.Y., May 11-May 15, 1994, p. 225) and the pUC19 vector to construct a series of “shotgun” subclone libraries.

DNA sequencing was achieved using two sequencing methods. The first method used multiplex sequencing procedures essentially as disclosed in Church et al., 1988, Science 240:185 and U.S. Pat. Nos. 4,942,124 and 5,149,625. DNA was extracted from pooled cultures and subjected to chemical or enzymatic sequencing. Sequencing reactions were resolved by electrophoresis, and the products were transferred and covalently bound to nylon membranes. Finally, the membranes were sequentially hybridized with a series of labelled oligonucleotides complementary to “tag” sequences present in the different shotgun cloning vectors. In this manner, a large number of sequences could be obtained from a single set of sequencing reactions. The remainder of the sequencing was performed on ABI377 automated DNA sequencers. The cloning and sequencing procedures are described in more detail in the Exemplification.

Individual sequence reads were assembled using PHRAP (P. Green, Abstracts of DOE Human Genome Program Contractor-Grantee Workshop V, January 1996, p. 157). The average contig length was about 3-4 kb.

A variety of approaches are used to order the contigs so as to obtain a continuous sequence representing the entire S. pneumoniae genome. Synthetic oligonucleotides are designed that are complementary to sequences at the end of each contig. These oligonucleotides may be hybridized to libraries of S. pneumoniae genomic DNA in, for example, lambda phage vectors or plasmid vectors to identify clones that contain sequences corresponding to the junctional regions between individual contigs. Such clones are then used to isolate template DNA and the same oligonucleotides are used as primers in polymerase chain reaction (PCR) to amplify junctional fragments, the nucleotide sequence of which is then determined.
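
As an illustration of the oligonucleotide design step described above, the following minimal sketch derives one outward-facing oligonucleotide from each end of a contig. The 20-nucleotide default length, the outward orientation, and the function names are assumptions made for the example; the text does not specify how the oligonucleotides were chosen.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def contig_end_oligos(contig: str, length: int = 20) -> tuple:
    """Return (left_oligo, right_oligo), both written 5' to 3'.

    left_oligo is the reverse complement of the first `length` bases, so
    extension from it proceeds past the left end of the contig.
    right_oligo is the last `length` bases, so extension from it proceeds
    past the right end of the contig.
    """
    if len(contig) < 2 * length:
        raise ValueError("contig is too short for two non-overlapping oligos")
    left_oligo = reverse_complement(contig[:length])
    right_oligo = contig[-length:]
    return left_oligo, right_oligo
```

A pair consisting of the right-end oligonucleotide of one contig and the left-end oligonucleotide of a neighboring contig would then flank the junctional region between them.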

The S. pneumoniae sequences were analyzed for the presence of open reading frames (ORFs) comprising at least 180 nucleotides. Because the initial analysis of ORFs was based on stop-to-stop codon reads, it should be understood that these ORFs may not correspond to the ORF of a naturally-occurring S. pneumoniae polypeptide. These ORFs may contain start codons which indicate the initiation of protein synthesis of a naturally-occurring S. pneumoniae polypeptide. Such start codons within the ORFs provided herein can be identified by those of ordinary skill in the relevant art, and the resulting ORF and the encoded S. pneumoniae polypeptide are within the scope of this invention. For example, within the ORFs a codon such as AUG or GUG (encoding methionine or valine) which is part of the initiation signal for protein synthesis can be identified and the portion of an ORF corresponding to a naturally-occurring S. pneumoniae polypeptide can be recognized.
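
The stop-to-stop analysis described above can be sketched as a scan of all six reading frames for stop-free stretches of at least 180 nucleotides. This is an illustrative sketch, not the program actually used: the function names are hypothetical, the stop codons are those of the standard bacterial code, and open stretches that run into a sequence end without a bounding stop codon are ignored for simplicity.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_ORF_NT = 180  # minimum ORF length stated in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def stop_to_stop_orfs(seq: str, min_len: int = MIN_ORF_NT) -> list:
    """Return (strand, start, end, orf_seq) for each stop-to-stop ORF.

    start and end are 0-based, end-exclusive coordinates on the scanned
    strand; orf_seq excludes both bounding stop codons.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            prev_stop = None  # index of the most recent in-frame stop codon
            for i in range(frame, len(s) - 2, 3):
                if s[i:i + 3] in STOP_CODONS:
                    if prev_stop is not None:
                        start = prev_stop + 3
                        if i - start >= min_len:
                            orfs.append((strand, start, i, s[start:i]))
                    prev_stop = i
    return orfs
```

Each ORF returned this way is only a candidate; as the text notes, the true coding region begins at a start codon somewhere inside it.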

The second analysis of the ORFs included identifying the start codons and the predicted coding regions. The ORFs provided in this invention were defined by one or more of the following methods: evaluating the coding potential of such sequences with the program GENEMARK™ (Borodovsky and McIninch, 1993, Comp. Chem. 17:123); distinguishing the coding from noncoding regions using the program Glimmer (Fraser et al., Nature, 1997); determining codon usage (Staden et al., Nucleic Acids Research 10:141); and comparing each predicted ORF amino acid sequence with all protein sequences found in current GENBANK, SWISS-PROT, and PIR databases using the BLAST algorithm. BLAST identifies local alignments between the ORF sequence and sequences in the databank and reports the probability that each alignment occurred by chance (Altschul et al., 1990, J. Mol. Biol. 215:403-410). Homologous ORFs (probabilities less than 10⁻⁵ by chance) and ORFs that are probably non-homologous (probabilities greater than 10⁻⁵ by chance) but have good codon usage were identified. Both homologous sequences and non-homologous sequences with good codon usage are likely to encode proteins and are encompassed by the invention.
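
The two-way classification described above can be sketched as a simple decision rule. This is a minimal sketch under stated assumptions: the text does not define how “good codon usage” was scored, so it is passed in here as a precomputed boolean, and the category labels and function name are illustrative. A probability exactly equal to the cutoff is not addressed in the text and is treated as non-homologous here.

```python
from typing import Optional

P_VALUE_CUTOFF = 1e-5  # chance-probability threshold stated in the text


def classify_orf(p_value: Optional[float], good_codon_usage: bool) -> str:
    """Classify an ORF from its best BLAST chance probability.

    p_value is None when no database match was reported for the ORF.
    good_codon_usage is assumed to come from a separate codon-usage analysis.
    """
    if p_value is not None and p_value < P_VALUE_CUTOFF:
        return "homologous"
    if good_codon_usage:
        return "non-homologous with good codon usage"
    return "not retained"
```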

S. pneumoniae Nucleic Acids

The nucleic acids of this invention may be obtained directly from the DNA of the above referenced S. pneumoniae strain by using the polymerase chain reaction (PCR). See “PCR, A Practical Approach” (McPherson, Quirke, and Taylor, eds., IRL Press, Oxford, UK, 1991) for details about the PCR. High fidelity PCR can be used to ensure a faithful DNA copy prior to expression. In addition, the authenticity of amplified products can be verified by conventional sequencing methods. Clones carrying the desired sequences described in this invention may also be obtained by screening the libraries by means of the PCR or by hybridization of synthetic oligonucleotide probes to filter lifts of the library colonies or plaques as known in the art (see, e.g., Sambrook et al., Molecular Cloning, A Laboratory Manual 2nd edition, 1989, Cold Spring Harbor Press, NY).

It is also possible to obtain nucleic acids encoding S. pneumoniae polypeptides from a cDNA library in accordance with protocols herein described. A cDNA encoding an S. pneumoniae polypeptide can be obtained by isolating total mRNA from an appropriate strain. Double stranded cDNAs can then be prepared from the total mRNA. Subsequently, the cDNAs can be inserted into a suitable plasmid or viral (e.g., bacteriophage) vector using any one of a number of known techniques. Genes encoding S. pneumoniae polypeptides can also be cloned using established polymerase chain reaction techniques in accordance with the nucleotide sequence information provided by the invention. The nucleic acids of the invention can be DNA or RNA. Preferred nucleic acids of the invention are contained in the Sequence Listing.

The nucleic acids of the invention can also be chemically synthesized using standard techniques. Various methods of chemically synthesizing polydeoxynucleotides are known, including solid-phase synthesis which, like peptide synthesis, has been fully automated in commercially available DNA synthesizers (See e.g., Itakura et al. U.S. Pat. No. 4,598,049; Caruthers et al. U.S. Pat. No. 4,458,066; and Itakura U.S. Pat. Nos. 4,401,796 and 4,373,071, incorporated by reference herein).

Nucleic acids isolated or synthesized in accordance with features of the present invention are useful, by way of example, without limitation, as probes, primers, capture ligands, antisense genes and for developing expression systems for the synthesis of proteins and peptides corresponding to such sequences. As probes, primers, capture ligands and antisense agents, the nucleic acid normally consists of all or part (approximately twenty or more nucleotides for specificity as well as the ability to form stable hybridization products) of the nucleic acids of the invention contained in the Sequence Listing. These uses are described in further detail below.

Probes

A nucleic acid isolated or synthesized in accordance with the sequence of the invention contained in the Sequence Listing can be used as a probe to specifically detect S. pneumoniae. With the sequence information set forth in the present application, sequences of twenty or more nucleotides are identified which provide the desired inclusivity and exclusivity with respect to S. pneumoniae, and extraneous nucleic acids likely to be encountered during hybridization conditions. More preferably, the sequence will comprise at least twenty to thirty nucleotides to convey stability to the hybridization product formed between the probe and the intended target molecules.

Sequences larger than 1000 nucleotides in length are difficult to synthesize but can be generated by recombinant DNA techniques. Individuals skilled in the art will readily recognize that the nucleic acids, for use as probes, can be provided with a label to facilitate detection of a hybridization product.

Nucleic acid isolated and synthesized in accordance with the sequence of the invention contained in the Sequence Listing can also be useful as probes to detect homologous regions (especially homologous genes) of other Streptococcus species using appropriate stringency hybridization conditions as described herein.

Capture Ligand

For use as a capture ligand, the nucleic acid selected in the manner described above with respect to probes can be readily associated with a support. The manner in which nucleic acid is associated with supports is well known. Nucleic acid having twenty or more nucleotides in a sequence of the invention contained in the Sequence Listing has utility to separate S. pneumoniae nucleic acids from one another and from the nucleic acid of other organisms. Nucleic acid having twenty or more nucleotides in a sequence of the invention contained in the Sequence Listing can also have utility to separate other Streptococcus species from each other and from other organisms. Preferably, the sequence will comprise at least twenty nucleotides to convey stability to the hybridization product formed between the probe and the intended target molecules. Sequences larger than 1000 nucleotides in length are difficult to synthesize but can be generated by recombinant DNA techniques.

Primers

Nucleic acid isolated or synthesized in accordance with the sequences described herein has utility as primers for the amplification of S. pneumoniae nucleic acid. These nucleic acids may also have utility as primers for the amplification of nucleic acids in other Streptococcus species. With respect to polymerase chain reaction (PCR) techniques, nucleic acid sequences of ≥10-15 nucleotides of the invention contained in the Sequence Listing have utility in conjunction with suitable enzymes and reagents to create copies of S. pneumoniae nucleic acid. More preferably, the sequence will comprise twenty or more nucleotides to convey stability to the hybridization product formed between the primer and the intended target molecules. Binding conditions of primers greater than 100 nucleotides are more difficult to control to obtain specificity. High fidelity PCR can be used to ensure a faithful DNA copy prior to expression. In addition, amplified products can be checked by conventional sequencing methods.

The copies can be used in diagnostic assays to detect specific sequences, including genes from S. pneumoniae and/or other Streptococcus species. The copies can also be incorporated into cloning and expression vectors to generate polypeptides corresponding to the nucleic acid synthesized by PCR, as is described in greater detail herein.

Antisense

Nucleic acid or nucleic acid-hybridizing derivatives isolated or synthesized in accordance with the sequences described herein have utility as antisense agents to prevent the expression of S. pneumoniae genes. These sequences also have utility as antisense agents to prevent expression of genes of other Streptococcus species.

In one embodiment, nucleic acid or derivatives corresponding to S. pneumoniae nucleic acids are loaded into a suitable carrier such as a liposome or bacteriophage for introduction into bacterial cells. For example, a nucleic acid having twenty or more nucleotides is capable of binding to bacterial nucleic acid or bacterial messenger RNA. Preferably, the antisense nucleic acid is comprised of 20 or more nucleotides to provide necessary stability of a hybridization product of non-naturally occurring nucleic acid and bacterial nucleic acid and/or bacterial messenger RNA. Nucleic acid having a sequence greater than 1000 nucleotides in length is difficult to synthesize but can be generated by recombinant DNA techniques. Methods for loading antisense nucleic acid in liposomes are known in the art as exemplified by U.S. Pat. No. 4,241,046 issued Dec. 23, 1980 to Papahadjopoulos et al.

The present invention encompasses isolated polypeptides and nucleic acids derived from S. pneumoniae that are useful as reagents for diagnosis of bacterial infection, components of effective antibacterial vaccines, and/or as targets for antibacterial drugs, including anti- S. pneumoniae drugs.

Expression of S. pneumoniae Nucleic Acids

Table 2 provides a list of open reading frames (ORFs) in both strands. An ORF is a region of nucleic acid which encodes a polypeptide. This region normally represents a complete coding sequence or a total sequence and was determined from an initial analysis of stop to stop codons followed by the prediction of start codons. The first column lists the ORF designation. The second and third columns list the SEQ ID numbers for the nucleic acid and amino acid sequences corresponding to each ORF, respectively. The fourth and fifth columns list the length of the nucleic acid ORF and the length of the amino acid ORF, respectively. Most of the nucleotide sequences corresponding to each ORF begin at the first nucleotide of the start codon and end at the nucleotide immediately preceding the next downstream stop codon in the same reading frame. It will be recognized by one skilled in the art that the natural translation initiation sites will correspond to ATG, GTG, or TTG codons located within the ORFs. The natural initiation sites depend not only on the sequence of a start codon but also on the context of the DNA sequence adjacent to the start codon. Usually, a recognizable ribosome binding site is found within 20 nucleotides upstream from the initiation codon. In some cases where genes are translationally coupled and coordinately expressed together in “operons”, ribosome binding sites are not present, but the initiation codon of a downstream gene may occur very close to, or overlap, the stop codon of an upstream gene in the same operon. The correct start codons can be generally identified rapidly and efficiently because only a few codons need be tested. It is recognized that the translational machinery in bacteria initiates most polypeptide chains with the amino acid methionine. In some cases, polypeptides are post-translationally modified, resulting in an N-terminal amino acid other than methionine in vivo.
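
The start-codon reasoning above can be sketched as a scan for in-frame ATG, GTG, or TTG codons, each flagged by whether a ribosome-binding-site-like motif lies within the 20 nucleotides upstream. This is a hedged illustration only: the motif list is an assumed set of Shine-Dalgarno-like cores that does not come from the text, the function name is hypothetical, and translationally coupled genes lacking a ribosome binding site would not be flagged by this rule.

```python
START_CODONS = {"ATG", "GTG", "TTG"}  # initiation codons named in the text
RBS_WINDOW = 20                       # upstream distance stated in the text
# Assumed Shine-Dalgarno-like core motifs; not taken from the source text.
RBS_MOTIFS = ("AGGAGG", "GGAGG", "AGGAG", "GGAG", "GAGG")


def candidate_starts(dna: str, orf_start: int, orf_end: int) -> list:
    """List in-frame candidate start codons inside dna[orf_start:orf_end].

    Returns (position, codon, has_rbs) tuples, where has_rbs is True when an
    assumed motif occurs wholly within RBS_WINDOW nucleotides upstream.
    Coordinates are 0-based; orf_end is exclusive.
    """
    dna = dna.upper()
    hits = []
    for i in range(orf_start, orf_end - 2, 3):
        codon = dna[i:i + 3]
        if codon in START_CODONS:
            upstream = dna[max(0, i - RBS_WINDOW):i]
            has_rbs = any(motif in upstream for motif in RBS_MOTIFS)
            hits.append((i, codon, has_rbs))
    return hits
```

Because only a few candidates are typically returned per ORF, each can then be tested experimentally, consistent with the observation above that only a few codons need be tested.
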
The sixth and seventh columns provide metrics for assessing the likelihood of the homology match (determined by the BLASTP2 algorithm), as is known in the art, to the genes indicated in the description field. Specifically, the sixth column represents the “Score” for the match (a higher score is a better match), and the seventh column represents the “P-value” for the match (the probability that such a match could have occurred by chance; the lower the value, the more likely the match is valid). If a BLASTP2 score of less than 46 was obtained, no value is reported in the table for the “P-value”. The description field provides, where available, the accession number (AC) or the Swissprot accession number (SP), the locus name (LN), Superfamily Classification (CL), the Organism (OR), Source of variant (SR), E.C. number (EC), the gene name (GN), the product name (PN), the Function Description (FN), the Map Position (MP), Left End (LE), Right End (RE), Coding Direction (DI), the Database from which the sequence originates (DB), and the description (DE) or notes (NT) for each ORF. This information allows one of ordinary skill in the art to determine a potential use and function for each identified coding sequence and, as a result, allows the use of the polypeptides of the present invention for commercial and industrial purposes.
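
For readers handling Table 2 programmatically, the two-letter description codes and the score rule described above can be captured as follows. This is a sketch under assumptions: the function names are hypothetical, the on-disk layout of Table 2 is not specified in the text, so the example works on an already-parsed mapping of code to value, and the gene name used in the usage note is a made-up example.

```python
from typing import Dict, Optional

# Two-letter codes of the description field, as enumerated in the text.
FIELD_CODES: Dict[str, str] = {
    "AC": "accession number",
    "SP": "Swissprot accession number",
    "LN": "locus name",
    "CL": "superfamily classification",
    "OR": "organism",
    "SR": "source of variant",
    "EC": "E.C. number",
    "GN": "gene name",
    "PN": "product name",
    "FN": "function description",
    "MP": "map position",
    "LE": "left end",
    "RE": "right end",
    "DI": "coding direction",
    "DB": "database",
    "DE": "description",
    "NT": "notes",
}

MIN_REPORTED_SCORE = 46  # below this BLASTP2 score no P-value is reported


def reported_p_value(score: float, p_value: Optional[float]) -> Optional[float]:
    """Return the P-value only when the score meets the reporting threshold."""
    return p_value if score >= MIN_REPORTED_SCORE else None


def expand_description(fields: Dict[str, str]) -> Dict[str, str]:
    """Replace two-letter codes with readable names; unknown codes are kept."""
    return {FIELD_CODES.get(code, code): value for code, value in fields.items()}
```

For example, `expand_description({"GN": "exampleGene", "OR": "Streptococcus pneumoniae"})` yields a mapping keyed by "gene name" and "organism".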

Using the information provided in SEQ ID NO: 1-SEQ ID NO: 2661 and in Table 2 together with routine cloning and sequencing methods, one of ordinary skill in the art will be able to clone and sequence all the nucleic acid fragments of interest including open reading frames (ORFs) encoding a large variety of proteins of S. pneumoniae.

Nucleic acid isolated or synthesized in accordance with the sequences described herein has utility to generate polypeptides. The nucleic acid of the invention exemplified in SEQ ID NO: 1-SEQ ID NO: 2661 and in Table 2 or fragments of said nucleic acid encoding active portions of S. pneumoniae polypeptides can be cloned into suitable vectors or used to isolate nucleic acid. The isolated nucleic acid is combined with suitable DNA linkers and cloned into a suitable vector.

The function of a specific gene or operon can be ascertained by expression in a bacterial strain under conditions where the activity of the gene product(s) specified by the gene or operon in question can be specifically measured. Alternatively, a gene product may be produced in large quantities in an expressing strain for use as an antigen, an industrial reagent, for structural studies, etc. This expression can be accomplished in a mutant strain which lacks the activity of the gene to be tested, or in a strain that does not produce the same gene product(s). This includes, but is not limited to, Eucaryotic species such as the yeast Saccharomyces cerevisiae, Methanobacterium strains or other Archaea, and Eubacteria such as E. coli, B. subtilis, S. aureus, S. pneumoniae or Pseudomonas putida. In some cases the expression host will utilize the natural S. pneumoniae promoter whereas in others, it will be necessary to drive the gene with a promoter sequence derived from the expressing organism (e.g., an E. coli beta-galactosidase promoter for expression in E. coli).

To express a gene product using the natural S. pneumoniae promoter, a procedure such as the following can be used. A restriction fragment containing the gene of interest, together with its associated natural promoter element and regulatory sequences (identified using the DNA sequence data) is cloned into an appropriate recombinant plasmid containing an origin of replication that functions in the host organism and an appropriate selectable marker. This can be accomplished by a number of procedures known to those skilled in the art. It is most preferably done by cutting the plasmid and the fragment to be cloned with the same restriction enzyme to produce compatible ends that can be ligated to join the two pieces together. The recombinant plasmid is introduced into the host organism by, for example, electroporation and cells containing the recombinant plasmid are identified by selection for the marker on the plasmid. Expression of the desired gene product is detected using an assay specific for that gene product.

In the case of a gene that requires a different promoter, the body of the gene (coding sequence) is specifically excised and cloned into an appropriate expression plasmid. This subcloning can be done by several methods, but is most easily accomplished by PCR amplification of a specific fragment and ligation into an expression plasmid after treating the PCR product with a restriction enzyme or exonuclease to create suitable ends for cloning.

A suitable host cell for expression of a gene can be any procaryotic or eucaryotic cell. For example, an S. pneumoniae polypeptide can be expressed in bacterial cells such as E. coli or B. subtilis, insect cells (baculovirus), yeast, or mammalian cells such as Chinese hamster ovary (CHO) cells. Other suitable host cells are known to those skilled in the art.

Expression in eucaryotic cells such as mammalian, yeast, or insect cells can lead to partial or complete glycosylation and/or formation of relevant inter- or intra-chain disulfide bonds of a recombinant peptide product. Examples of vectors for expression in yeast S. cerevisiae include pYepSec1 (Baldari et al., (1987) EMBO J. 6:229-234), pMFa (Kurjan and Herskowitz, (1982) Cell 30:933-943), pJRY88 (Schultz et al., (1987) Gene 54:113-123), and pYES2 (Invitrogen Corporation, San Diego, Calif.). Baculovirus vectors available for expression of proteins in cultured insect cells (SF 9 cells) include the pAc series (Smith et al., (1983) Mol. Cell Biol. 3:2156-2165) and the pVL series (Luckow, V. A., and Summers, M. D., (1989) Virology 170:31-39). Generally, COS cells (Gluzman, Y., (1981) Cell 23:175-182) are used in conjunction with such vectors as pCDM 8 (Aruffo, A. and Seed, B., (1987) Proc. Natl. Acad. Sci. USA 84:8573-8577) for transient amplification/expression in mammalian cells, while CHO (dhfr Chinese Hamster Ovary) cells are used with vectors such as pMT2PC (Kaufman et al. (1987), EMBO J. 6:187-195) for stable amplification/expression in mammalian cells. Vector DNA can be introduced into mammalian cells via conventional techniques such as calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, or electroporation. Suitable methods for transforming host cells can be found in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press (1989)), and other laboratory textbooks.

Expression in procaryotes is most often carried out in E. coli with either fusion or non-fusion inducible expression vectors. Fusion vectors usually add a number of NH2-terminal amino acids to the expressed target protein. These NH2-terminal amino acids often are referred to as a reporter group or an affinity purification group. Such reporter groups usually serve two purposes: 1) to increase the solubility of the target recombinant protein; and 2) to aid in the purification of the target recombinant protein by acting as a ligand in affinity purification. Often, in fusion expression vectors, a proteolytic cleavage site is introduced at the junction of the reporter group and the target recombinant protein to enable separation of the target recombinant protein from the reporter group subsequent to purification of the fusion protein. Enzymes suitable for this cleavage, each with its cognate recognition sequence, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors include pGEX (Amrad Corp., Melbourne, Australia), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.), which fuse glutathione S-transferase, maltose E binding protein, or protein A, respectively, to the target recombinant protein. A preferred reporter group is poly(His), which may be fused to the amino or carboxy terminus of the protein and which renders the recombinant fusion protein easily purifiable by metal chelate chromatography.
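
The fusion layout described above (reporter group, protease cleavage site, target protein) can be modeled at the amino acid level with a short Python sketch. The function names, the amino-terminal placement of a six-histidine tag, and the example target sequence are assumptions made for illustration only. The recognition motifs used are the commonly cited ones: Ile-Glu-Gly-Arg for Factor Xa (cut after Arg), Leu-Val-Pro-Arg-Gly-Ser for thrombin (cut between Arg and Gly), and Asp-Asp-Asp-Asp-Lys for enterokinase (cut after Lys).

```python
# Hypothetical model for illustration; not part of the patent disclosure.
# Commonly cited recognition motifs (one-letter amino acid codes).
CLEAVAGE_SITES = {
    "factor_xa": "IEGR",
    "thrombin": "LVPRGS",
    "enterokinase": "DDDDK",
}

# Number of motif residues that remain on the reporter-group side of the cut.
CUT_OFFSET = {
    "factor_xa": 4,     # cuts after Arg
    "thrombin": 4,      # cuts between Arg and Gly, leaving Gly-Ser on the target
    "enterokinase": 5,  # cuts after Lys
}


def build_fusion(target, protease="factor_xa", his_len=6):
    """Return Met + poly(His) + cleavage site + target as one sequence."""
    site = CLEAVAGE_SITES[protease]
    if site in target:
        # The protease would also cut inside the target protein.
        raise ValueError("target already contains the " + protease + " motif")
    return "M" + "H" * his_len + site + target


def cleave(fusion, protease):
    """Split a fusion at the first occurrence of the protease motif."""
    site = CLEAVAGE_SITES[protease]
    pos = fusion.find(site)
    if pos < 0:
        raise ValueError("no " + protease + " motif found")
    cut = pos + CUT_OFFSET[protease]
    return fusion[:cut], fusion[cut:]


if __name__ == "__main__":
    target = "MKTAIAL"  # made-up peptide used only as example input
    for protease in CLEAVAGE_SITES:
        fusion = build_fusion(target, protease)
        tag, released = cleave(fusion, protease)
        print(protease, fusion, tag, released)
```

In this simplified model, Factor Xa and enterokinase release the target with no extra residues, whereas thrombin leaves Gly-Ser attached to its amino terminus, which is one consideration when choosing among the three.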

Inducible non-fusion expression vectors include pTrc (Amann et al., (1988) Gene 69:301-315) and pET11d (Studier et al., Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 60-89). While target gene expression relies on host RNA polymerase transcription from the hybrid trp-lac fusion promoter in pTrc, expression of target genes inserted into pET11d relies on transcription from the T7 gn10-lac 0 fusion promoter mediated by coexpressed viral RNA polymerase (T7 gn1). This viral polymerase is supplied by host strains BL21(DE3) or HMS174(DE3) from a resident λ prophage harboring a T7 gn1 under the transcriptional control of the lacUV5 promoter.

For example, a host cell transfected with a nucleic acid vector directing expression of a nucleotide sequence encoding an S. pneumoniae polypeptide can be cultured under appropriate conditions to allow expression of the polypeptide to occur. The polypeptide may be secreted and isolated from a mixture of cells and medium containing the peptide. Alternatively, the polypeptide may be retained cytoplasmically and the cells harvested, lysed and the protein isolated. A cell culture includes host cells, media and other byproducts. Suitable media for cell culture are well known in the art. Polypeptides of the invention can be isolated from cell culture medium, host cells, or both using techniques known in the art for purifying proteins including ion-exchange chromatography, gel filtration chromatography, ultrafiltration, electrophoresis, and immunoaffinity purification with antibodies specific for such polypeptides. Additionally, in many situations, polypeptides can be produced by chemical cleavage of a native protein (e.g., tryptic digestion) and the cleavage products can then be purified by standard techniques.

Membrane-bound proteins can be isolated from a host cell by contacting a membrane-associated protein fraction with a detergent to form a solubilized complex, in which the membrane-associated protein is no longer entirely embedded in the membrane fraction and is solubilized at least to an extent that allows it to be chromatographically isolated from the membrane fraction. Several different criteria are used for choosing a detergent suitable for solubilizing these complexes. For example, one property considered is the ability of the detergent to solubilize the S. pneumoniae protein within the membrane fraction with minimal denaturation of the membrane-associated protein, allowing the activity or functionality of the membrane-associated protein to return upon reconstitution of the protein. Another property considered when selecting the detergent is the critical micelle concentration (CMC) of the detergent; the detergent of choice preferably has a high CMC value, allowing for ease of removal after reconstitution. A third property considered when selecting a detergent is the hydrophobicity of the detergent. Typically, membrane-associated proteins are very hydrophobic, and therefore detergents which are also hydrophobic, e.g., the Triton series, would be useful for solubilizing the hydrophobic proteins. A fourth property that can be important is the capability of the detergent to remove the S. pneumoniae protein with minimal protein-protein interaction, which facilitates further purification. A fifth property of the detergent which should be considered is the charge of the detergent. For example, if it is desired to use ion exchange resins in the purification process, then the detergent should preferably be uncharged. Chromatographic techniques which can be used in the final purification step are known in the art and include hydrophobic interaction, lectin affinity, ion exchange, dye affinity and immunoaffinity.

One strategy to maximize recombinant S. pneumoniae peptide expression in E. coli is to express the protein in a host bacterium with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 119-128). Another strategy would be to alter the nucleic acid encoding an S. pneumoniae peptide to be inserted into an expression vector so that the individual codons for each amino acid would be those preferentially utilized in highly expressed