ARRAY HAVING SUBSTANCES FIXED ON SUPPORT ARRANGED WITH CHROMOSOMAL ORDER OR SEQUENCE POSITION INFORMATION ADDED THERETO
Abstract
In fabricating various types of arrays such as a micro array, different kinds of biosubstances, or synthetic substances interacting with the biosubstances, are arranged and immobilized on a support such that the chromosomal order of base sequence blocks, corresponding to the biosubstances, is ascertainable. The biosubstances may be nucleic acids such as DNA, or polypeptides such as protein. The synthetic substances may be compounds that react with the biosubstances. By thus specifying the order of the biosubstances or synthetic substances immobilized on the support, the array can be used, for example, for screening in variety improvement of living organisms.
Full Text
TECHNICAL FIELD
The present invention relates, for example, to a novel
array and a fabrication method of the array, various analytical
systems using the array, and representative methods of using
these techniques.
More specifically, the invention relates to (1) an array,
such as a DNA micro array, in which biosubstances derived
from a living organism, or synthetic substances that interact
with the biosubstances, are immobilized on a support by being
arranged in an orderly manner, (2) a system for analyzing a
genotype of the organism of interest for display, and, in
particular a genotype analyzing and display system that enables
locations of crossovers on the chromosomes to be visually
recognized in hybrid individuals obtained by crossing, (3) a
system for analyzing quantitative trait loci of the organism of
interest, and representative methods of using it, and in
particular a quantitative loci analyzing system for analyzing
QTL by effectively using the analysis result obtained from a
nucleic acid array, and (4) a gene interaction analyzing system,
and in particular a gene interaction analyzing system for
effectively analyzing which genes or a group of genes are
associated with the traits or genes being analyzed, by effectively
using the analysis result obtained from a nucleic acid array.
The invention also relates to representative methods of using
such arrays and analyzing systems.
BACKGROUND ART
With the recent advance of the worldwide genome project,
the entire genomes of many model organisms have been
sequenced. Sequencing of the entire genomes of many other
organisms is underway, as in the sequencing of the human
genome in the Human Genome Project. As evidenced by these
advances, research in molecular biology has entered the
post-genome (post-sequence) era.
In the post-genome era, a new approach has been used for
the analysis of genome functions. Specifically, the emphasis of
genome function analysis has shifted, rather drastically, from
the conventional pinpoint approach whereby analysis is made
by cloning individual genes associated with certain living
phenomena, to a systematic and comprehensive approach
whereby gene functions are analyzed on a genome scale.
The genome information is also used for the analysis of
transcripts and proteins. Specifically, "transcriptome analysis"
and "proteome analysis" have gained recognition. The
transcriptome analysis is used for the analysis of transcripts,
whereby the expression of all transcripts in an organism or cells
is analyzed both systematically and comprehensively using
genome information. The proteome analysis is a systematic and
comprehensive method of analyzing proteins, in which the
properties or expression of all proteins expressed at any given
location and any given time in an organism or cells are analyzed
using genome information.
For the systematic and comprehensive analyses, various
array techniques are often used. The array technique refers to a
technique using an array, in which biosubstances, such as DNA
or various proteins obtained from the organism of interest being
analyzed, or synthetic substances (for example, compounds
with hydrophobic groups or ion exchange groups) that interact
with such biosubstances are immobilized on a support in an
orderly manner.
With the array technique, the systematic and
comprehensive analysis can be performed efficiently. For
example, for the analysis of gene transcription control
mechanism, it is required to measure transcription level of
genes, which varies according to the state of the cell. For this
purpose, use of a DNA micro array, one form of the array
technique, allows for systematic measurement of transcription
level of several thousand to several tens of thousands of genes (see
Non-Patent Documents 1-6, for example).
Among such DNA micro array techniques, one that has
been widely used is the DNA micro array technique developed
by Affymetrix. In this technique, oligonucleotides are directly
synthesized on a silica substrate using a microfabrication
technique employed in the fabrication of semiconductors (see
Patent Document 1, for example).
As noted above, the DNA micro array allows for systematic
measurement of the transcription level of several thousand to
several tens of thousands of genes. Thus, through hybridization,
a nucleic acid array such as the DNA micro array can produce a
large amount of data concerning gene expression.
However, it is practically impossible to manually process
the gene expression data obtained from the nucleic acid array
since the amount of data obtained in biotechnology is enormous.
In view of this, there have been proposed various types of
bioinformatics techniques, whereby a large volume of data is
analyzed using computers. As a technique of analyzing gene
expression data, it has been known to analyze gene expression
patterns in clusters, as disclosed in Patent Document 2, or
analyze gene expression data based on parameters and use it
for clinical purposes, as disclosed in Patent Document 3.
With the large data volume to be analyzed, the analysis
may yield complex results. Therefore, the bioinformatics
technique requires a technique of desirably displaying the
analysis results. For example, as a technique concerning gene
expression display, a technique for two-dimensionally
displaying expression level has been known, as disclosed in
Patent Document 4.
With the recent advance in the gene recombination
technique, foreign genes have been introduced into various plants
to confer new traits. Actual application of such plants as crop
plants is also underway. The development of genetically
modified crops (GMO) was once believed to have a promising
future in bio-industries. However, the GMO could not win
customer acceptance, and, today, safety of processed foods is
often promoted by not using GMO.
It is therefore inconceivable that the traditional crossing
or mutant induction will fade away in the variety improvement
of crops. On the contrary, for improving the market value of
crops or processed foods using crops, crossing or other
traditional methods are still favored as a primary method of
variety improvement of crops.
However, in actual variety improvement by crossing for
example, a group of hybrid individuals, numbering several
thousand to several tens of thousands, is screened for useful
individuals by observing or analyzing traits of the hybrid
individuals. As such, the efficiency of screening for superior
individuals is considerably poor.
The array technique and bioinformatics technique are
believed to facilitate the variety improvement employing
traditional crossing.
One known technique of crossing is screening of a
genotype using genetic markers. In variety improvement using
genetic markers, it is important to recognize loci associated with
target quantitative traits (QTL). The quantitative traits are
governed by the polygene system, and therefore it is not
possible to directly deal with the effects of expression of
individual genes. This is where statistical analysis is important
for the recognition of QTL. Specifically, in order to recognize
QTL, selected genetic markers are scattered along the entire
chromosomes, and any linkage between the genetic markers
and the quantitative traits is determined in order to map
locations of QTL on a linkage map.
The QTL analysis requires development of genetic markers
or other materials such as hybrid lines (family lines), which are
used to construct a linkage map. In addition, the QTL analysis
produces a vast amount of information concerning analysis,
such as measurement of traits, or typing of genetic markers
(number of genetic markers × number of individuals). The array
technique and bioinformatics technique are considered to
facilitate the QTL analysis.
In the analysis of gene expression data, the term
"expression profile" is used to refer to patterns of gene
expression or amount which vary depending on the cell type or
cell stage. By measuring and analyzing the expression profile,
important findings concerning gene functions or regulation
mechanisms can be obtained. Such findings can be effectively
used for the variety improvement of industrially useful species.
In the case of humans, the analysis of expression profile can
yield useful results for drug discovery, pharmacology, toxicology,
and diagnosis.
One technique of expression profile analysis is one that
employs clustering, as disclosed in Patent Document 2 and
Patent Document 5. In clustering, a group of genes that shows
similar expression patterns under different measurement
conditions is identified and sorted into clusters on a nucleic
acid array. Another technique is one that analyzes expression
networks between genes, as disclosed in Patent Document 6.
The expression level of a gene is directly or indirectly regulated
by other genes, and therefore finding expression networks
between genes provides important information in the expression
profile analysis, as does clustering.
For human applications, Patent Document 7 discloses an
evaluation index estimation technique, in which genes for
quantitatively estimating an evaluation index of interest are
suitably selected from data obtained from each sample. For
example, in the case of measuring changes in gene expression
profile caused by human illness, the number of samples is
considerably smaller than the number of genes on which
changes in expression level are measured, owing to the
difficulty in collecting a large number of samples. Thus, it is
often difficult to analyze the correlation with the illness by a
common statistical method. In order to overcome such problems,
the technique disclosed in Patent Document 7 extracts genes
closely related to an evaluation index of interest and estimates
evaluation index data.
[Non-Patent Document 1]
Genome Functions, Expression Profile and Transcriptome;
Editor-in-Chief, Ken-ich Matsubara, Yoshiyuki Sakaki,
Nakayama-Shoten Co., Ltd., published on September 13, 2000
[Non-Patent Document 2]
DNA Micro Array; chief translator, Ikunoshinn Kato,
Maruzen, published on September 25, 2000
[Non-Patent Document 3]
DNA Micro Array Practical Manual for Successful Data
Acquisition, Basic Principle, from Chip Fabrication to
Bioinformatics, Editor-in-Chief, Yoshihide Hayashizaki,
YODOSHA Co., Ltd., published on December 1, 2000
[Non-Patent Document 4]
Concise and Practical Introductions to DNA Micro Array
Data Analysis, YODOSHA Co., Ltd., published on November 20,
2002
[Non-Patent Document 5]
DNA Microarrays Associate Editor: Kaaren Janssen, Cold
Spring Harbor Laboratory Press, 2003
[Non-Patent Document 6]
Microarray Analysis, Mark Schena, John Wiley & Sons,
Inc., 2003
[Patent Document 1]
Japanese Unexamined Patent Publication No.
228999/2000 (Tokukai 2000-228999; published on August 22,
2000)
[Patent Document 2]
Japanese Unexamined Patent Publication No.
342299/2000 (Tokukai 2000-342299; published on December
12, 2000)
[Patent Document 3]
Japanese PCT Laid-Open Publication No. 508853/2003
(published on March 4, 2003; International Publication No.
WO01/016860, published on March 8, 2001)
[Patent Document 4]
Japanese Unexamined Patent Publication No.
342000/1999 (Tokukaihei 11-342000; published on December
14, 1999)
[Patent Document 5]
Japanese Unexamined Patent Publication No. 30093/2004
(Tokukai 2004-30093; published on January 29, 2004)
[Patent Document 6]
Japanese Unexamined Patent Publication No.
175305/2002 (Tokukai 2002-175305; published on June 21,
2002)
[Patent Document 7]
Japanese Unexamined Patent Publication No. 4739/2003
(Tokukai 2003-4739; published on January 8, 2003)
Conventionally, the array technique has been developed
primarily for academic purposes centered on genome analysis,
or for providing a research tool. As such, there has been no
active development for more practical purposes. A problem of
the array technique then is that it is often not suitable for
practical purposes such as identification of individuals, or
genetic analysis.
Specifically, in the array technique, the biosubstances or
synthetic substances are immobilized on a support in an orderly
fashion, but the order is not specific and the biosubstances or
synthetic substances are randomly arranged in most cases. The
random arrangement of biosubstances or synthetic substances,
however, does not cause any problem as long as the array
technique is used for the systematic and comprehensive
analysis of genes, etc. That is, there was no special meaning in
arranging the biosubstances or synthetic substances in a
predetermined order based on some criteria.
However, the systematic and comprehensive analysis of
genes, etc., has potential use in more practical applications such
as variety improvement of plants, for example. In using the
array technique for such purposes, it is desirable that the
biosubstances and synthetic substances be analyzed with
additional position information of chromosomes. In some cases,
it may be required to use some kind of reference to set the order
of arrangement.
A problem of the conventional bioinformatics technique is
that it cannot be used to efficiently perform crossing for variety
improvement, QTL analysis, and the like.
Specifically, in crossing, a large number of
individuals in the hybrid generations need to be screened for
individuals in which target traits are expressed. Conventionally,
it has been required to grow the hybrid generation for several
years until the traits are confirmed. Further, depending on the
type of trait, the traits cannot be easily recognized by simply
growing the hybrid individuals. On the other hand, if the
screening is performed with large gene expression data obtained
from the nucleic acid array, whether target traits have been
inherited or not can be efficiently confirmed with good
reproducibility only by obtaining nucleic acids from the
individuals of the hybrid generation.
However, since the conventional bioinformatics technique
concerning gene expression is not intended for such a purpose,
the gene expression data obtained from the DNA micro array
has not been effectively used for crossing.
The QTL analysis involves statistical analysis. When only
this aspect of QTL analysis is considered, the bioinformatics
technique is easily applicable to the QTL analysis. However, no
technique is known that uses the array technique and the
bioinformatics technique in combination for the QTL analysis.
Further, as to the conventional bioinformatics technique
concerning gene expression, it has not been possible to
effectively use the technique in the QTL analysis.
Further, while the conventional technique allows
information concerning gene functions or regulating functions
to be obtained by performing an expression profile analysis on
cells of a particular type or particular stage, the technique
cannot provide enough information concerning expression of
genes associated with particular traits.
More specifically, since the expression profile analysis
analyzes expression profiles of cells of a particular type or
particular stage, a comprehensive gene expression analysis can
be carried out and expression patterns specific to a particular
cell type or particular cell stage can be obtained. However, while
the technique is useful in finding target genes or a target gene
group, it is not sufficient to analyze which genes or a group of
genes are associated with predetermined specific traits or genes
of interest.
That is, the comprehensive gene expression analysis is
useful in finding clusters or networks in a vast amount of
expression information, and obtaining therefrom specific genes
or a group of genes. However, the technique is not effective in
analyzing which genes or a group of genes are associated with
specific traits or genes of interest, because the technique in
which a vast amount of expression information is narrowed
down to desired information involves unnecessary information
processing and may cause difficulties in accurately narrowing
down the information.
DISCLOSURE OF INVENTION
The present invention was made in view of the foregoing
problems, and an object of the invention is to provide an array
technique in which the order of arrangement of biosubstances
or synthetic substances immobilized on a support is specified,
and which is therefore applicable to, for example, screening in
variety improvement of organisms.
Another object of the invention is to provide (1) a genotype
analyzing and display system to be suitably used in effectively
using gene expression data of a nucleic acid array in crossing
for variety improvement, (2) a quantitative loci analyzing system
to be suitably used in effectively using data of a nucleic acid
array in QTL analysis, (3) a gene interaction analyzing system
for effectively analyzing, using the result of analysis obtained
from the nucleic acid array, which genes or a group of genes are
associated with target traits or genes that have been specified
beforehand, and (4) representative methods of using such
analyzing systems.
The inventors of the present invention diligently worked to
solve the foregoing problems, and accomplished the invention
by finding that, for example, a DNA micro array can be used for
screening in variety improvement of living organisms when DNA
fragments immobilized on a glass substrate (support) are
arranged in the order they are coded for on the chromosomes,
or when information obtained from the array is analyzed with
such order information.
In order to achieve the foregoing objects, the present
invention provides an array in which different kinds of
biosubstances obtained from an organism of interest, or
synthetic substances interacting with such biosubstances are
arranged and immobilized on a support in an orderly manner,
the different kinds of biosubstances or the synthetic substances
being arranged such that a chromosomal order of base
sequence blocks corresponding to the biosubstances is
ascertainable.
In one specific example of the array in which the
biosubstances or synthetic substances are arranged in such a
manner that their chromosomal order is recognizable, different
kinds of biosubstances or synthetic substances are arranged in
the chromosomal order of respective base sequence blocks of
the biosubstances. Such an arrangement will be called a "direct
arrangement" (see First Embodiment).
In the direct-arrangement array, it is not necessarily
required that all of the biosubstances or synthetic substances
are arranged in the chromosomal order of respective base
sequence blocks of the biosubstances. As such, only some of
the biosubstances or synthetic substances may be arranged in
the chromosomal order of their respective base sequence blocks.
The support may include labels that indicate the chromosomal
order of the respective base sequence blocks of the
biosubstances.
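The direct arrangement can be illustrated with a minimal sketch, assuming each immobilized substance carries a chromosome number and a map position. The names `Probe` and `direct_arrangement`, the probe identifiers, and the use of a simple row-by-row grid are hypothetical choices made for illustration only and are not taken from the embodiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    name: str          # hypothetical probe identifier
    chromosome: int    # chromosome number taken from a genetic or physical map
    position: float    # map position on that chromosome (units are an assumption)


def direct_arrangement(probes, columns):
    """Map (row, column) spot coordinates to probes, filled in chromosomal order."""
    ordered = sorted(probes, key=lambda p: (p.chromosome, p.position))
    # divmod(i, columns) yields (row, column) for the i-th spot, row by row.
    return {divmod(i, columns): p for i, p in enumerate(ordered)}


probes = [
    Probe("m3", 2, 10.5),
    Probe("m1", 1, 4.0),
    Probe("m2", 1, 32.1),
    Probe("m4", 2, 55.0),
]
layout = direct_arrangement(probes, columns=2)
```

In this sketch, reading the spots row by row follows the chromosomal order, so neighboring spots correspond to neighboring base sequence blocks.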
In another example of the chromosomal order recognizable
array, the biosubstances or synthetic substances immobilized
on the support are each appended with sequence position
information corresponding to the chromosomal order of the
respective sequence blocks of the biosubstances. In use,
data is acquired and the sequence position information is read
out, so as to rearrange sequences of the data in the
chromosomal order. Such an arrangement will be called an
"indirect arrangement" (see First Embodiment).
In a more specific example of the indirect-arrangement
array, the support is realized by a collection of micro supports
individually immobilizing the biosubstances or synthetic
substances, and each micro support is appended with sequence
position information corresponding to the chromosomal order of
the respective base sequence blocks of the biosubstances.
Based on the sequence position information, the order of
acquired data is rearranged in the chromosomal order.
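The rearrangement step of the indirect arrangement can be sketched as follows, assuming the acquired data is a mapping from a micro support identifier to a signal value, and the sequence position information is a mapping from the same identifier to a (chromosome, position) pair. The function name `rearrange_by_position` and the identifiers such as `bead7` are hypothetical.

```python
def rearrange_by_position(readings, position_info):
    """Return (support_id, (chromosome, position), signal) tuples in chromosomal order.

    readings:      dict of support_id -> measured signal (hypothetical format)
    position_info: dict of support_id -> (chromosome, position)
    """
    missing = [sid for sid in readings if sid not in position_info]
    if missing:
        raise KeyError(f"no sequence position information for: {missing}")
    # Tuples compare element by element, so this sorts by chromosome, then position.
    ordered_ids = sorted(readings, key=lambda sid: position_info[sid])
    return [(sid, position_info[sid], readings[sid]) for sid in ordered_ids]


readings = {"bead7": 0.82, "bead2": 0.15, "bead9": 0.40}
position_info = {"bead7": (1, 12.0), "bead2": (2, 3.5), "bead9": (1, 2.0)}
ordered = rearrange_by_position(readings, position_info)
```

The physical order of the micro supports is irrelevant here; only the appended position information determines the order of the output.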
In the array, nucleic acids or polypeptides can be used as
the biosubstances. The nucleic acid may be DNA, for example.
The type of DNA is not particularly limited, but a genetic marker,
genomic DNA, genomic DNA treated with restriction enzyme,
cDNA, EST, and synthetic oligoDNA are preferably used, for
example. It is preferable that a plurality of DNA molecules
immobilized on the support be arranged based on a genetic map
or physical map.
As a rule, in order to quantify gene expression level, cDNA
or cRNA derived from mRNA is generally used as a target
sample. In addition to cDNA and cRNA, the target sample used
in an array of the present invention may be genomic DNA
treated with restriction enzyme, when the biosubstance is
nucleic acid. Here, it is preferable that the target DNA have
been fractionated by size after being treated with restriction enzyme.
When the biosubstance is polypeptide, proteins, fragments
of proteins, or oligopeptides can be used as the biosubstances.
The type of protein is not particularly limited. For example,
enzymes, kinase, antibodies, receptors, and proteins with an
SH3 region may be used. It is preferable that the proteins be
arranged based on a genetic map or physical map (see Second
Embodiment).
In an array according to the present invention, the
support or micro support may be an inorganic substrate, an
organic membrane, or a bead. More specifically, an array
according to the present invention may be a micro array, a
macro array, a bead array, or a protein chip.
A producing process of an array according to the present
invention includes the step of orderly arranging and
immobilizing on a support different kinds of biosubstances
obtained from an organism of interest, or synthetic substances
interacting with such biosubstances, the step including
arranging and immobilizing the biosubstances or the synthetic
substances according to the order in which genes corresponding
to the biosubstances are coded for on a chromosome of the
organism. In the process, nucleic acids or polypeptides may be
used as the biosubstances.
Use of the present invention is not particularly limited.
For example, the invention can be used for identification of a
genotype, in which a chromosome fragment including a target
trait is identified from hybrids obtained by crossing, with the
use of an array using DNA as the biosubstance. The organism
used for the identification of such a chromosome fragment is
not particularly limited, and experimental animals and plants
can be used, for example. Further, the organism used for this
purpose may be a human. In this case, the genotype
identification method can be used as a gene diagnosis method.
The present invention can also be used, for example, for
screening in variety improvement, whereby a variety including a
target trait is selected, with the use of an array using DNA as
the biosubstance, from hybrids obtained by crossing of
organisms whose characteristics are to be improved. Here, the
type of organism used for variety improvement is not
particularly limited. For example, domestic animals or crops
can be used. Specific examples of crops include cereals such as
rice, wheat, corn, and barley.
The inventors of the present invention diligently worked to
achieve the foregoing objects, and accomplished the invention
based on the following finding. Namely, the inventors found that,
in analyzing gene expression data obtained from hybrid
individuals with the nucleic acid array, use of at least (1)
genetic information of parents of the hybrid individuals and (2)
a genetic map of the species to which these individuals belong
allows the gene expression data to be analyzed based on
graphical representation of locations of crossovers on the
chromosomes, and thereby enables the gene expression data
obtained with the nucleic acid array to be effectively used in
crossing for variety improvement.
Namely, a genotype analyzing and display system
according to the present invention includes: a genotype origin
detecting section for comparing (a) gene expression level
information comprehensively obtained through a hybridization
analysis of hybrid individuals with a nucleic acid array, with (b)
genetic information of parents of the hybrid individuals, and a
genetic map of a species to which the hybrid individuals belong,
so as to determine from which parent a genotype of a hybrid
individual of interest derives; and a display information
generating section for gathering a plurality of results obtained
from the genotype origin detecting section and, based on the
results, generating display information used to display a
plurality of genotypes altogether on a chromosome basis, so as
to determine from which parent the individual genotypes derive
(see Fourth Embodiment).
In the genotype analyzing and display system, it is highly
preferable that the nucleic acid array be a chromosomal location
recognizable array in which a plurality of nucleic acid molecules
immobilized thereon are arranged such that a chromosomal
order of base sequence blocks corresponding to the nucleic acid
molecules is ascertainable.
It is preferable that the genotype analyzing and display
system include a genetic map constructing section for
constructing, based on genetic map constructing information, a
genetic map of a species to which the hybrid individuals belong.
It is preferable that the genetic map constructing information
includes names of genes and/or genetic markers known in the
species, and chromosomal loci of the genes and/or genetic
markers.
In the genotype analyzing and display system, it is
preferable that the genotype origin detecting section determines
a genotype as being homozygous for one of the parents,
heterozygous, or unrecognizable to yield a result. Further, it is
preferable that the genotype origin detecting section use
genotype information and/or gene expression profile
information of parents as genetic information of parents.
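The four possible results of the genotype origin detecting section can be illustrated with a minimal sketch. It assumes, purely for illustration, that each locus yields two normalized signals, one specific to each parent, and that a fixed threshold decides presence. The function name `call_genotype`, the threshold of 0.5, and the result codes are hypothetical and are not the determination method of the embodiments.

```python
def call_genotype(signal_a, signal_b, threshold=0.5):
    """Classify one locus from two parent-specific signals (hypothetical inputs)."""
    has_a = signal_a >= threshold   # signal matching parent A is present
    has_b = signal_b >= threshold   # signal matching parent B is present
    if has_a and has_b:
        return "H"   # heterozygous
    if has_a:
        return "A"   # homozygous for parent A
    if has_b:
        return "B"   # homozygous for parent B
    return "-"       # unrecognizable


calls = [
    call_genotype(a, b)
    for a, b in [(0.9, 0.1), (0.8, 0.7), (0.2, 0.95), (0.1, 0.2)]
]
```

Keeping an explicit "unrecognizable" result, rather than forcing a call, lets later steps skip loci whose origin could not be determined.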
In the genotype analyzing and display system, it is
preferable that the display information generating section
generate display information including at least one of
recombination number and recombination frequency of
individual chromosomes. Further, it is preferable that the
display information generating section generate display
information such that an origin of a genotype is identifiable
based on different display colors or patterns.
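As a rough sketch of how a recombination number and a recombination frequency might be derived from genotype results ordered along one chromosome, the following assumes the single-letter codes "A", "B", "H", and "-" (unrecognizable) for each marker. These codes, the function names, and the definitions used (any change of call between adjacent recognizable markers counts as one crossover; frequency is the fraction of individuals showing a change in a given interval) are simplifying assumptions, not definitions given in this specification.

```python
def count_crossovers(calls):
    """Count changes of parental origin along one chromosome, skipping '-' calls."""
    informative = [c for c in calls if c != "-"]
    return sum(1 for prev, cur in zip(informative, informative[1:]) if prev != cur)


def interval_recombination_frequency(population, i):
    """Fraction of individuals whose calls differ between markers i and i + 1.

    Individuals with an unrecognizable call at either marker are excluded.
    Returns None when no individual is informative for the interval.
    """
    pairs = [
        (ind[i], ind[i + 1])
        for ind in population
        if "-" not in (ind[i], ind[i + 1])
    ]
    if not pairs:
        return None
    return sum(a != b for a, b in pairs) / len(pairs)


# One list of calls per hybrid individual, markers in chromosomal order.
population = [
    list("AAABB"),
    list("AA-BB"),
    list("BBBBB"),
    list("ABBBA"),
]
crossovers = [count_crossovers(ind) for ind in population]
```

Because the calls are in chromosomal order, each change of call marks an approximate crossover location that can be drawn with a change of display color or pattern.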
It is preferable that the genotype analyzing and display
system include at least one of an input section and an output
section. The input section preferably receives at least one of
comprehensive expression level information of genes of the
hybrid individuals, and genetic information of parents. Further,
the input section preferably receives genetic map constructing
information.
The input section may be, for example, a scanner for
enabling a hybridization result of the nucleic acid array to be
read out as image information. Preferably, an image information
processing section is also provided that analyzes an expression
level of gene based on the image information and generates
comprehensive expression level information of gene.
It is preferable that the input section be a manual input
section for modifying at least one of: the comprehensive
expression level information of gene of the hybrid individuals;
the genetic information of parents; and the genetic map
constructing information.
It is preferable that the output section include at least one
of: a display for displaying the display information on a screen;
and a printer for printing the display information. Preferably,
the input section and output section are realized by an external
communications section for sending and receiving information
to and from an external device.
In the genotype analyzing and display system, the nucleic
acid array is generally, but not limited to, a DNA array on which
DNA is immobilized. Specific examples of DNA immobilized on
the DNA array include a genetic marker, genomic DNA, genomic
DNA treated with a restriction enzyme, cDNA, EST, and
synthetic oligoDNA. Specific examples of the nucleic acid array
include a micro array, a macro array, and a bead array.
Use of the present invention is not particularly limited.
For example, the invention can be used for identifying a target
trait-including chromosome fragment, using the genotype
analyzing and display system, from hybrids obtained by
crossing organisms. The organisms may be experimental
animals and plants.
The invention can also be used for screening for a target
trait-carrying variety from hybrids obtained by crossing
organisms whose characteristics are to be improved, using the
genotype analyzing and display system. The organisms crossed
for variety improvement may be experimental animals and
plants, domestic animals, or crops.
The inventors of the present invention diligently worked to
achieve the foregoing objects, and accomplished the invention
by finding that the gene expression data obtained with the
nucleic acid array can be effectively used for the QTL analysis
when the result of hybridization obtained from the spots of the
nucleic acid array is used as genetic marker information.
Namely, a quantitative loci analyzing system according to
the present invention includes: a genetic marker specifying
section for comparing (a) comprehensive presence information
of genes of hybrid individuals, obtained by hybridizing a
genomic sample of the hybrid individuals of a certain hybrid
line with a nucleic acid array on which a genetic marker of a
species of interest is immobilized, with (b) a genetic map of a
species to which the hybrid individuals belong, and genetic
marker information known in the species, so as to specify a
genetic marker that exists in the hybrid line; and a quantitative
loci detecting section for detecting a quantitative locus of a
phenotype of interest of the hybrid individual, by confirming
whether a phenotypic value indicative of the phenotype is linked
to the genetic marker (see Fifth Embodiment).
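The idea of confirming whether a phenotypic value is linked to a genetic marker can be sketched with a simple single-marker comparison. This is an illustrative substitute chosen for brevity, not the detection method of the embodiments: it assumes each individual has a call of "A" or "B" at each marker and one numeric phenotypic value, and it computes a Welch t statistic between the two genotype groups. The function names and the example data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def marker_trait_association(genotypes, phenotypes):
    """Welch t statistic comparing phenotypic values of 'A' and 'B' individuals."""
    group_a = [y for g, y in zip(genotypes, phenotypes) if g == "A"]
    group_b = [y for g, y in zip(genotypes, phenotypes) if g == "B"]
    if len(group_a) < 2 or len(group_b) < 2:
        return None  # too few individuals in one group to estimate a variance
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    if se == 0:
        return None  # no variation within either group; statistic undefined
    return (mean(group_a) - mean(group_b)) / se


def scan_markers(marker_calls, phenotypes):
    """marker_calls: dict of marker name -> list of calls, one per individual."""
    return {
        name: marker_trait_association(calls, phenotypes)
        for name, calls in marker_calls.items()
    }


# Made-up illustrative data: six individuals, two markers.
phenotypes = [10.1, 9.8, 10.3, 14.9, 15.2, 15.0]
marker_calls = {"m1": list("AAABBB"), "m2": list("ABABAB")}
scores = scan_markers(marker_calls, phenotypes)
```

A marker whose statistic has a large absolute value, such as the hypothetical `m1` here, is a candidate for lying near a quantitative locus; a full analysis would also need a significance threshold and, where markers are sparse, a method that uses the intervals between markers.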
In the quantitative loci analyzing system, it is highly
preferable that the nucleic acid array be a chromosomal
location recognizable array in which a plurality of nucleic acid
molecules immobilized thereon are arranged such that a
chromosomal order of base sequence blocks corresponding to
the nucleic acid molecules is ascertainable.
It is preferable that the quantitative loci analyzing system
include a genetic map constructing section for constructing,
based on genetic map constructing information, a genetic map
of a species to which the hybrid individuals belong. The genetic
map constructing information preferably includes names of
genes and/or genetic markers known in the species, and
chromosomal loci of the genes and/or genetic markers.
In the quantitative loci analyzing system, it is preferable
that the genetic marker information used by the genetic marker
specifying section include a genetic marker with polymorphism.
More specifically, the genetic marker is preferably SNP or RFLP.
In the quantitative loci analyzing system, it is preferable
that the quantitative loci detecting section detect a quantitative
locus of phenotype by interval mapping.
It is preferable that the quantitative loci analyzing system
include: a scanner for enabling a hybridization result of the
nucleic acid array to be read out as image information; and an
image information processing section for analyzing an
expression level of gene based on the image information and
generating comprehensive expression level information of gene.
It is preferable that the quantitative loci analyzing system
include at least one of an input section and an output section.
Here, the scanner can be used as an input section. The input
section preferably receives at least one of the genetic marker
information and the phenotypic value. Further, the input
section preferably receives at least one of the genetic map and
the genetic map constructing information.
Further, it is preferable that the input section be a
manual input section for modifying at least one of: the
comprehensive presence information of gene of the hybrid
individuals; the genetic marker information, and the genetic
map constructing information.
It is preferable that the output section be at least one of a
display for displaying an analysis result on a screen; and a
printer for printing an analysis result. Preferably, the input
section and output section are realized by an external
communications section for sending and receiving information
to and from an external device.
In the quantitative loci analyzing system, the nucleic acid
array is generally, but not limited to, a DNA array on which
DNA is immobilized. Specific examples of the nucleic acid array
include a micro array, a macro array, and a bead array.
Use of the present invention is not particularly limited.
For example, the invention can be used as a quantitative trait
analyzing method for analyzing a quantitative trait of an
organism, using the quantitative loci analyzing system, or a
gene searching method for searching for a gene associated with
expression of a trait of interest, using the quantitative loci
analyzing system, or a variety improvement method for
organisms, which uses the quantitative loci analyzing system.
The organisms used for variety improvement are preferably
laboratory animals and plants, domestic animals, or crops.
The inventors of the present invention diligently worked to
achieve the foregoing objects, and accomplished the invention
by finding that an analysis of which gene or which group of
genes is associated with a previously specified
trait or gene of interest can be effectively performed when
hereditary factors for regulating the expression level of
individual genes are described based on the hybridization
results of the genetic markers immobilized on the nucleic acid
array.
Namely, a gene interaction analyzing system according to
the present invention includes: a genetic marker specifying
section for comparing (a) comprehensive presence information
of genes of hybrid individuals, obtained by hybridizing a
genomic sample of the hybrid individuals of a certain hybrid
line with a nucleic acid array on which a genetic marker of a
species of interest is immobilized (b) with a genetic map of a
species to which the hybrid individuals belong, and genetic
marker information known in the species, so as to specify a
genetic marker that exists in the hybrid line; a spot marker
information generating section for comparing the specified
genetic marker with the genetic marker immobilized on the
nucleic acid array, so as to generate spot marker information,
being genetic marker information for use in analysis, from
hybridization results obtained from individual spots on the
nucleic acid array; and a hereditary factor specifying section for
specifying, with regard to an arbitrarily selected phenotype and
gene to be analyzed, a hereditary factor of the selected
phenotype by determining whether the phenotypic value
indicative of the phenotype, and an expressed gene included in
expression profile information obtained from the hybrid
individual are linked to a plurality of spot marker information
(see Sixth Embodiment).
In the gene interaction analyzing system, it is highly
preferable that the nucleic acid array be a chromosomal
location recognizable array in which a plurality of nucleic acid
molecules immobilized thereon are arranged such that a
chromosomal order of base sequence blocks corresponding to
the nucleic acid molecules is ascertainable.
It is preferable that the gene interaction analyzing system
include a genetic map constructing section for constructing,
based on genetic map constructing information, a genetic map
of a species to which the hybrid individuals belong. Further, it
is preferable that the genetic map constructing information be
names of genes and/or genetic markers known in the species,
and chromosomal loci of the genes and/or genetic markers.
In the gene interaction analyzing system, it is preferable
that the genetic marker information used by the genetic marker
specifying section be a genetic marker with polymorphism. More
specifically, the genetic marker is preferably SNP or RFLP.
In the gene interaction analyzing system, the spot marker
information generating section generates spot marker
information only for a genetic marker spot found by
hybridization. Here, it is preferable that the spot marker
information generating section generate spot marker
information by including position information of a genetic
marker immobilized on the nucleic acid array.
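The generation of spot marker information can be sketched, under
stated assumptions, as a filter over the spots of the array: only spots
that hybridized and that carry a marker specified for the hybrid line
yield an entry, and each entry carries position information. In the
Python sketch below, the record fields, the grid coordinates, the
centimorgan positions, and all marker names are hypothetical; whether
"position information" means the spot position, the chromosomal
position, or both is also an assumption (both are kept here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpotMarker:
    marker: str         # name of the genetic marker on the spot
    chromosome: int     # chromosome carrying the marker (assumed field)
    position_cM: float  # map position of the marker (assumed unit)
    row: int            # spot row on the array
    col: int            # spot column on the array


def generate_spot_marker_info(array_layout, specified_markers, hybridized):
    """array_layout: {(row, col): marker name} for the whole array.
    specified_markers: {marker name: (chromosome, position_cM)} for the
    markers found to exist in the hybrid line.
    hybridized: set of (row, col) spots that gave a hybridization signal.
    Returns entries only for hybridized spots, in chromosomal order."""
    info = []
    for (row, col), name in array_layout.items():
        if (row, col) in hybridized and name in specified_markers:
            chrom, pos = specified_markers[name]
            info.append(SpotMarker(name, chrom, pos, row, col))
    return sorted(info, key=lambda s: (s.chromosome, s.position_cM))


# Hypothetical 1 x 4 array and hybridization result.
layout = {(0, 0): "M1", (0, 1): "M2", (0, 2): "M3", (0, 3): "M4"}
specified = {"M1": (1, 5.0), "M2": (1, 20.5), "M4": (2, 3.2)}
signal_spots = {(0, 0), (0, 2), (0, 3)}

spot_info = generate_spot_marker_info(layout, specified, signal_spots)
print([s.marker for s in spot_info])
```

Here M2 is dropped because its spot did not hybridize, and M3 is
dropped because it was not specified as a marker of the hybrid line.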
It is preferable that the gene interaction analyzing system
include an expression profile information generating section for
analyzing an expression profile in regard to a comprehensive
gene expression level obtained from the hybrid individual, so as
to generate expression profile information of the hybrid
individual. The expression profile information generating
section generates expression profile information of the hybrid
individual by comprehensively measuring gene expression,
using at least one of a micro array, a macro array, a bead array,
and a differential display. Here, it is preferable that the
expression profile information generating section generate
expression profile information using a nucleic acid array used
to obtain comprehensive presence information of gene of the
hybrid individual, or a nucleic acid array on which the sample
has been spotted.
The DNA array on which DNA is immobilized can be
suitably used as the nucleic acid array for obtaining the
gene-presence-information, or the nucleic acid array for
obtaining expression profiles. Specifically, the nucleic acid array may
be a micro array, a macro array, or a bead array.
In the gene interaction analyzing system, the hereditary
factor specifying section specifies a hereditary factor of a
phenotype based on a quantitative trait locus (QTL) that exists
among genetic markers obtained by interval mapping. Here, the
hereditary factor specifying section may use information of
expression level of a gene associated with the genetic marker, so
as to specify a hereditary factor of the phenotype.
The gene interaction analyzing system includes at least
one of an input section and an output section. The input
section receives at least one of: comprehensive presence
information of gene of the hybrid individual; the genetic marker
information; the phenotypic value; and the expression profile
information. Preferably, the input section receives at least one
of the genetic map and the genetic map constructing
information.
The input section is not limited to a particular structure.
For example, the input section may be provided as a scanner for
enabling a hybridization result of the nucleic acid array to be
read out as image information. Here, it is preferable that an
image information processing section be provided that analyzes
an expression level of gene based on the image information and
generates comprehensive expression level information of gene.
The scanner may be used as an input section for entering the
expression profile information.
Further, it is preferable that the input section be provided
as a manual input section for modifying at least one of: the
comprehensive presence information of gene of the hybrid
individuals; the genetic marker information, and the genetic
map constructing information.
It is preferable that the output section be at least one of a
display for displaying an analysis result on a screen; and a
printer for printing an analysis result. Further, it is preferable
that the input section and the output section be realized by an
external communications section for sending and receiving
information to and from an external device.
Use of the present invention is not particularly limited.
For example, the present invention may be used as a gene
interaction analyzing method for analyzing interaction between
genes, using the gene interaction analyzing system, or a gene
searching method for searching for a gene associated with a
trait of interest, using the gene interaction analyzing system, or
a variety improvement method for organisms, which uses the
gene interaction analyzing system. The organisms used for
variety improvement may be laboratory animals and plants,
domestic animals, or crops.
For a fuller understanding of the nature and advantages
of the invention, reference should be made to the ensuing
detailed description taken in conjunction with the
accompanying drawings.
BRIEF DESCRIPTION OF DRAWINGS
Fig. 1 is a schematic diagram showing a specific
exemplary structure of an array according to the present
invention, when the substance immobilized on a support
(substrate) is DNA.
Figs. 2(a) and 2(b) are plan views schematically
illustrating expression of genes with particular characteristics
in the array of Fig. 1.
Fig. 3 is a schematic diagram showing expression of genes
with particular characteristics, concerning a resulting
segregating population of the cross between varieties
respectively expressing genes as shown in Figs. 2(a) and 2(b),
and a specific variety selected from the segregating population.
Fig. 4 is a schematic diagram showing a specific
exemplary structure of an array according to the present
invention, when the substance immobilized on a support
(substrate) is protein.
Fig. 5 is a schematic diagram showing a specific
exemplary structure of an array according to the present
invention, when the substance immobilized on a support
(substrate) is a compound (synthetic substance) which
specifically interacts with protein.
Fig. 6 is a schematic diagram showing a specific
exemplary structure of a bead array as one example of an array
according to the present invention.
Fig. 7 is a block diagram illustrating an example of a
genotype analyzing and display system according to the present
invention.
Fig. 8 is a view illustrating an example of display
information displayed in the genotype analyzing and display
system according to the present invention.
Fig. 9 is a flowchart representing an example of an
analysis method employed by the genotype analyzing and
display system according to the present invention.
Fig. 10 is a block diagram illustrating an example of a
quantitative loci analyzing system according to the present
invention.
Fig. 11 is a flowchart representing an example of an
analysis method employed in the quantitative loci analyzing
system according to the present invention.
Fig. 12 is a block diagram illustrating an example of a
gene interaction analyzing system according to the present
invention.
Fig. 13 is a flowchart representing an example of an
analysis method employed by the gene interaction analyzing
system according to the present invention.
BEST MODE FOR CARRYING OUT THE INVENTION
[First Embodiment]
The following will describe one embodiment of the present
invention with reference to Fig. 1 through Fig. 3. It should be
appreciated that the present invention is not just limited to the
particular embodiment described below.
According to the present invention, there is provided an
array in which substances are immobilized on a support by
being arranged in a chromosomal order. The invention is
applicable to a wide range of array techniques. As used herein,
the "array techniques" refer to techniques concerning arrays in
which different kinds of substances are orderly arranged and
immobilized on a support.
An array according to the present invention can be
classified according to the type of substance immobilized, the
type of support, use, or the like. The invention, to a large extent,
is characterized by the order of substances immobilized on a
support, and therefore the following specifically describes
representative examples of the invention based on different
types of substances immobilized on a support. First, in the
present embodiment, the invention will be described through
the case where the substance immobilized on a support is
nucleic acid.
The basic structure of an array used in the present
invention is not particularly limited. As noted above, the
invention provides an array in which a substance is immobilized
on a support. Here, the support (substrate) is not particularly
limited and may have any shape and may be made of any
material as long as it can immobilize the substance.
Examples of support materials include, generally,
inorganic materials such as glass or silicon wafer; natural
polymers such as paper; synthetic polymers such as
nitrocellulose or nylon; and gels using synthetic polymer or
natural polymer. The shape of the support is not particularly
limited either as long as it has a sufficient area on which the
substance can be immobilized. Generally, those with a two
dimensional plane, for example, such as a substrate with little
or no flexibility, a flexible membrane, or a flexible substrate
with intermediate flexibility can be preferably used. The
thickness of the substrate or membrane is not particularly
limited either, and it can be suitably set according to the
material or use of the substrate or membrane.
The invention can also use bead arrays, as will be
described later. As such, the support may be a collection of
micro-supports on which biosubstances or synthetic substances
are individually immobilized. As such micro-supports, various
beads may be used, for example.
Here, a collection (group) of micro-supports makes up a
single support. Such a group of micro-supports is prepared and
used as a dispersion liquid (or a solution) charged into a small
container, in which micro-supports immobilizing biosubstances
(nucleic acid, protein, etc.) are dispersed. In this way, data can
be freely acquired from the micro-supports. Each micro-support
is appended with an ID code, and data is acquired from the
micro-support with the ID code. Thus, the order of substances
immobilized on the micro-supports corresponds to the arranged
order of data acquired from the micro-supports based on the ID
codes.
As used herein, the "substances immobilized on a
support" refer to different kinds of biosubstances obtained from
a living organism of interest, or synthetic substances which
interact with such biosubstances. In other words, in an array
according to the present invention, it is required that the
substances immobilized on a support be at least substances
associated with biosubstances derived from living organisms.
Substances which are not associated with biosubstances cannot
be used because, in this case, the coding order of chromosomes
cannot be used as a basis of arranging these substances.
Nucleic acids and polypeptides are specific examples of
such biosubstances. As nucleic acids, DNA and RNA can be
used. Use of polypeptides as the biosubstances will be described in
detail in the Second Embodiment. As to use of synthetic
substances that interact with biosubstances, detailed
description will be given in the Third Embodiment. Note that,
the biosubstances may include sugar chains, etc.
In an array according to the present invention, different
kinds of biosubstances or synthetic substances are arranged in
such a manner that the chromosomal order of respective base
sequence blocks of these biosubstances is recognizable. Thus,
for convenience of explanation, an array according to the
present invention will be referred to as a chromosomal location
recognizable array. In one specific implementation of such a
chromosomal location recognizable array, different kinds of
biosubstances are arranged in the chromosomal order. For
convenience of explanation, such an arrangement will be called
a "direct arrangement," because the order of the substances
arranged on the array directly corresponds to the order in which
these substances are sequenced on the chromosome.
In another implementation of a chromosomal location
recognizable array, the order of the substances arranged on the
array indirectly corresponds to the chromosomal order. This will
be called an "indirect arrangement."
The present embodiment is described below in more detail
based on an example (direct-arrangement array) in which DNA,
as an example of nucleic acid, is arranged on a support in a
chromosomal order.
For example, it is assumed here that an array is
fabricated for an organism Z based on an organism Z
chromosome in which 10 genes ABC1 through ABC10 are
present that are lined up in this order on the chromosome, as
schematically illustrated in Fig. 1. It is also assumed that the
genes ABC1 through ABC10 respectively have corresponding
DNA fragments (assuming that such DNA fragments are
obtained). In this case, an array is fabricated by spotting these
DNA fragments in an orderly manner on a substrate. Note that,
in the following, the biosubstances immobilized on a substrate
will be referred to as "spots" where appropriate.
In spotting the DNA fragments on the substrate, a device
called a spotter or arrayer is generally used. The operation of
the spotter is controlled in such a manner that the DNA
fragments are spotted in the order their corresponding genes
are found on the chromosome. In this way, the DNA fragments
are immobilized on the support by being arranged in the order
"respective base sequence blocks of the biosubstances are
sequenced on the chromosome."
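The control of the spotter described above amounts to sorting the
DNA fragments by the chromosomal position of their corresponding genes
and assigning grid positions in that order. The following Python sketch
shows one way this could be done. The gene names ABC1 through ABC10
follow the example of Fig. 1, but the chromosome number, the base-pair
coordinates, the row-major grid, and the function name are assumptions
introduced only for illustration.

```python
def spotting_plan(fragments, columns):
    """fragments: list of (name, chromosome, start_bp) in any order.
    Returns {name: (row, col)} so that reading the grid row by row,
    left to right, follows the chromosomal order of the fragments."""
    ordered = sorted(fragments, key=lambda f: (f[1], f[2]))
    return {name: divmod(i, columns) for i, (name, _, _) in enumerate(ordered)}


# Hypothetical coordinates for genes ABC1..ABC10 on one chromosome,
# supplied in a deliberately scrambled order.
fragments = [(f"ABC{n}", 1, 10_000 * n) for n in (7, 2, 10, 1, 5, 3, 9, 4, 8, 6)]

plan = spotting_plan(fragments, columns=5)
for name in sorted(plan, key=plan.get):
    print(name, plan[name])
```

With five columns, ABC1 to ABC5 occupy the first row and ABC6 to ABC10
the second, regardless of the order in which the fragments were listed.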
As used herein, the "base sequence block" refers to a
region of a certain length in the base sequence of a chromosome.
A typical example is a region corresponding to a gene that
encodes a protein. It should be noted, however, that the "base
sequence block" is not just limited to gene but may be a large
DNA fragment like a BAC (Bacterial Artificial Chromosome)
clone, or a region corresponding to only an exon. Further, the
"base sequence block" may be a region, like EST, that does not
necessarily include a coding region of a protein.
Referring to the foregoing example, the chromosomal order
may be simply the order of the genes ABC1, ABC2, ABC3, ... up
to ABC10, or the order of three different fragments of ABC1
gene, three different fragments of ABC2 gene, and three
different fragments of ABC3 gene, and so on. Here, the number
of fragments may be three for ABC1 gene, two for ABC2 gene,
and five for ABC3 gene. Namely, the order of substances
immobilized on the support is not particularly limited as long as,
when taken as a whole, it corresponds to the order in which
these substances are sequenced on the chromosome.
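The condition that the arrangement corresponds to the chromosomal
order "when taken as a whole" can be expressed as a simple check: the
gene rank of successive spots must never decrease, however many
fragments each gene contributes. The Python sketch below is only an
illustration of that reading; the function name and the sample layouts
are hypothetical.

```python
def follows_chromosomal_order(spot_genes, gene_order):
    """spot_genes: gene name of each spot, in the order the spots are
    arranged on the support. gene_order: genes in chromosomal order.
    True if the spots never step back to an earlier gene."""
    rank = {gene: i for i, gene in enumerate(gene_order)}
    ranks = [rank[g] for g in spot_genes]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


gene_order = ["ABC1", "ABC2", "ABC3"]

# Three fragments for ABC1, two for ABC2, and five for ABC3.
layout = ["ABC1"] * 3 + ["ABC2"] * 2 + ["ABC3"] * 5
print(follows_chromosomal_order(layout, gene_order))

# A layout that returns to ABC1 after ABC2 breaks the order.
print(follows_chromosomal_order(["ABC1", "ABC2", "ABC1"], gene_order))
```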
In the example illustrated in Fig. 1, a plurality of DNA
fragments occurs on a single chromosome. However, the present
invention is not just limited to this example, and the DNA
fragments may occur in more than one chromosome. In this
case, as with the foregoing, the DNA fragments are arranged on
the array in the order they are sequenced on the chromosomes.
Further, in the example illustrated in Fig. 1, a plurality of
DNA fragments is arranged as they are sequenced on the
chromosome. However, the present invention is not just limited
to this example. For example, in order to meet different
purposes, only some of the DNA fragments may be arranged in
the chromosomal order. That is, an array according to the
present invention may immobilize substances other than
nucleic acids, and at least some of the different kinds of
biosubstances or synthetic substances may be arranged in the
order the respective base sequence blocks of the biosubstances
are sequenced on the chromosome.
Further, in the direct arrangement, the chromosomal
order can be recognized by techniques other than arranging the
substances in the chromosomal order. For example, labels
indicative of the chromosomal order of respective base sequence
blocks of the biosubstances may be appended on the support.
As an example, labels may be provided that can
distinguish between first and second rows of DNA fragments
obtained from an organism of interest, wherein the first row
includes 10 kinds of DNA fragments (spots) obtained from
chromosome 1 and arranged in the chromosomal order, and the
second row includes 10 kinds of DNA fragments (spots) obtained
from chromosome 2 and arranged in the chromosomal order.
Further, as in the indirect arrangement described below,
information indicative of the type of DNA fragment immobilized
on each spot may be appended as a label in the vicinity of each
spot.
The following describes the indirect-arrangement array. In
the indirect-arrangement array, sequence position information
corresponding to the chromosomal order of the base sequence
blocks of the biosubstances is added to each of the
biosubstances or synthetic substances immobilized on the
support. This enables acquired data to be rearranged in the
chromosomal order based on the sequence position information,
irrespective of the order of the immobilized substances.
A specific example of the indirect-arrangement array is a
bead array, in which the support is a collection of
micro-supports individually immobilizing biosubstances or
synthetic substances (bead array will be described later). In this
arrangement, each micro-support is appended with sequence
position information corresponding to the order in which
respective base sequence blocks of the biosubstances are
sequenced on the chromosome.
In use, data is acquired and the sequence position
information is read out. Based on the sequence position
information, the sequence of the acquired data is rearranged in
the chromosomal order. By thus recognizing the chromosomal
order, the substances immobilized on the micro-supports can
be arranged in the chromosomal order.
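The rearrangement step of the indirect arrangement can be sketched as
a sort keyed on the sequence position information. In the Python sketch
below, the ID codes, the signal values, and the representation of the
sequence position information as a (chromosome, rank) pair are
assumptions made for illustration; as noted in this description, the
specific form of the sequence position information is not limited.

```python
def rearrange_by_sequence_position(readings, position_info):
    """readings: list of (id_code, signal) in the order the data was
    acquired, which is arbitrary for an indirect-arrangement array.
    position_info: {id_code: (chromosome, rank on that chromosome)}.
    Returns the readings rearranged in the chromosomal order; readings
    whose ID code has no position information are left out."""
    known = [(position_info[i], i, s) for i, s in readings if i in position_info]
    return [(i, s) for _, i, s in sorted(known)]


# Hypothetical bead readings and sequence position information.
readings = [("B07", 0.8), ("B02", 1.5), ("B11", 0.3), ("B99", 9.9)]
position_info = {"B02": (1, 1), "B07": (1, 2), "B11": (2, 1)}

ordered = rearrange_by_sequence_position(readings, position_info)
print(ordered)
```

Dropping readings without position information is a design choice of
this sketch; an actual system might instead report them as errors.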
Note that, a specific form of the sequence position
information is not particularly limited as long as it corresponds
to the chromosomal order of the respective base sequence
blocks of the DNA immobilized on the micro-supports.
In the present embodiment, DNA is used as the
biosubstance. The type of DNA (DNA fragment) is not
particularly limited, but a genetic marker, genomic DNA,
genomic DNA treated with restriction enzyme, cDNA, EST, and
synthetic oligoDNA are preferably used, for example. It is
preferable that the DNA be arranged based on a genetic map or
physical map. For example, for a group of different kinds of
genetic markers, it is preferable that these genetic markers
make up a genetic map. Based on the genetic map, the DNA
fragments can be arranged on a substrate.
The genetic marker or a group of genetic markers are not
particularly limited as long as they can serve as genetic labels
on the chromosome. Non-limiting examples include an EST
marker using EST, a SNP marker including SNP (Single
Nucleotide Polymorphism), a RFLP (Restriction Fragment Length
Polymorphism) marker, and a micro satellite marker (SSR
(simple sequence repeat) marker). Thus, the genetic marker or a
group of genetic markers include genomic DNA treated with
restriction enzyme, EST, synthetic oligoDNA, and the like, if
they can be used as markers.
The number of biosubstances immobilized on the support
is not particularly limited, and it is generally on the order of
several thousand (10^3). The number of immobilized (or
arranged) biosubstances varies greatly depending on the type of
device, such as a spotter, used for the fabrication of the array,
or the area of the support (substrate), for example.
It should be noted that, in the DNA array, information
concerning gene expression can only be obtained for genes
corresponding to the immobilized DNA fragments. It is therefore
preferable to increase the number of immobilized biosubstances
(DNA fragments) as much as possible, in order to perform gene
expression analysis more systematically and comprehensively.
The type of array used in the present invention is not
particularly limited and various conventional arrays can be
used. Specifically, a micro array, a macro array, a bead array,
or a protein chip can be used, for example. The present
embodiment uses nucleic acid as the biosubstance, and
therefore more specific examples include a DNA micro array and
DNA macro array, for example.
The DNA micro array is also known as a DNA chip, and
the immobilized DNA is often referred to as a probe. The micro
array is smaller in size than macro array and provides more
density. This enables the number of genes (DNA fragments)
immobilized as probes to be increased, allowing for more
comprehensive gene expression analysis.
The DNA micro array can be classified based on types of
immobilized DNA. However, structural differences can be
revealed more clearly if the DNA micro array is classified based
on fabrication methods. Specifically, based on fabrication
methods, the micro array can be broadly classified into the
Stanford type and the Affymetrix type.
A DNA micro array of the Stanford type is fabricated by
spotting a DNA solution onto a substrate (support) with a
spotter, wherein a slide glass for a microscope is used as the
substrate. One advantage of a DNA micro array of a Stanford
type is that it can always be fabricated with the use of a spotter.
However, this comes with a drawback in that it requires
expensive hardware (spotter, etc.), or complex procedures for
the preparation of biosubstances as necessitated by a large
number of probes required for spotting.
On the other hand, a DNA micro array of the Affymetrix
type, as described in the BACKGROUND ART section, does not
employ the method of immobilizing DNA fragments on a
substrate with a spotter, etc., but is fabricated by chemically
synthesizing oligoDNA of about 25 mer on a substrate using a
micro fabrication technique commonly used in the fabrication of
semiconductors, namely, a photolithography technique.
Specifically, for each gene, 11 to 20 oligos (25 mers) (for
example, 11 oligos in the case of a barley DNA array) are set
based on base sequence data, and a pair of oligo DNA: one with
a perfect match to each 25 mer, and one with a forced
single-base mismatch at the 13th base, is used as a probe. The
array can be fabricated without using a spotter or other devices
when it is designed with data of a known database. Further,
since the probe (DNA fragment) has a constant length and the
sequence is known, the CG content, which influences the
strength of hybridization, can remain constant. It should be
noted, however, that since the probe is synthesized based on
information of a database, clones to be analyzed need to be
separately isolated.
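The pairing of a perfect-match oligo with a single-base mismatch oligo
can be illustrated with a short Python sketch. It is assumed here that
the forced mismatch is made by replacing the 13th base with its
complementary base, which leaves the number of G and C bases unchanged;
the 25-mer shown is an arbitrary made-up sequence, and the function
names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_probe(perfect_match, position=13):
    """Return the mismatch partner of a 25-mer perfect-match probe.
    Assumption: the base at the given 1-based position is replaced by
    its complement."""
    if len(perfect_match) != 25:
        raise ValueError("expected a 25-mer")
    i = position - 1
    return perfect_match[:i] + COMPLEMENT[perfect_match[i]] + perfect_match[i + 1:]


def gc_content(probe):
    """Fraction of G and C bases in the probe."""
    return sum(base in "GC" for base in probe) / len(probe)


pm = "ATGGCTAGCTAGGATCCGATCGTAC"  # made-up 25-mer, 13th base is G
mm = mismatch_probe(pm)

print(pm)
print(mm)
print(gc_content(pm), gc_content(mm))
```

Because the 13th base is swapped for its complement, the two probes of
the pair differ at exactly one position and have the same GC content.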
As described above, the present invention is characterized
by the order information of the sequence of the nucleic acids
(biosubstances) immobilized on a substrate (support), and the
invention can use various types of DNA, including synthetic
oligoDNA, as the nucleic acids. This makes the techniques of
the present invention suitable for both Stanford DNA micro
array and Affymetrix DNA micro array.
The following describes an exemplary method of using the
DNA micro array. First, the DNA micro array is hybridized with
fluorescent-labeled target DNA (hereinafter, "targets"). Here, the
target molecules containing complementary sequences to the
probes on the DNA micro array bind to (hybridize with) their
complementary probe molecules, leaving other target molecules
unbound. Then, these target molecules not bound to the probes
are washed and removed, leaving only the hybridized target
molecules on the micro array. Since the target molecules are
fluorescence-labeled, the fluorescence of the targets is
measured as signal intensity and hybridized probes are
identified.
The fluorescent-labeled targets are generally prepared first
by extracting mRNA from cells of two different states (first state
and second state) to be compared, and then performing a
reverse transcription reaction in the presence of fluorescent
nucleotides. Here, two kinds of fluorescent dyes with different
detection wavelengths are used for the first state and second
state, respectively. The more highly a gene is expressed, the more
of its cDNA is contained in the targets, and the fluorescent signal
intensity is in accord with the expression level of genes in each
state. Thus, from the measured signal intensity, the expression
level of a specific gene can be detected.
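A common way to compare the two states from the measured signal
intensities is to take, for each probe, the logarithm of the ratio of
the two fluorescent signals. The Python sketch below illustrates this
under stated assumptions: the intensity values are made up, the gene
names reuse the ABC example, and the floor applied to very weak signals
is an arbitrary choice to avoid taking the logarithm of zero. Background
subtraction and normalization, which a real analysis would need, are
omitted.

```python
from math import log2


def log_ratios(first_state, second_state, floor=1.0):
    """first_state, second_state: {gene: signal intensity} measured
    with the two fluorescent dyes. Returns {gene: log2(second / first)}
    for genes measured in both states. Intensities below the (assumed)
    floor are raised to the floor before the ratio is taken."""
    return {
        gene: log2(max(second_state[gene], floor) / max(first_state[gene], floor))
        for gene in first_state
        if gene in second_state
    }


# Hypothetical signal intensities for three spots.
first = {"ABC1": 500.0, "ABC2": 800.0, "ABC3": 1200.0}
second = {"ABC1": 2000.0, "ABC2": 800.0, "ABC3": 300.0}

ratios = log_ratios(first, second)
print(ratios)
```

A positive value indicates a stronger signal in the second state, a
negative value a stronger signal in the first state, and zero no
difference.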
The DNA macro array basically has the same structure as
the DNA micro array, but differs from the DNA micro array in
that it uses a common membrane filter like a nylon membrane
as a substrate. An advantage of the macro array is that it allows
for an expression profile analysis, genome-wide, according to
methods based on conventional blotting methods. Another
advantage is that, unlike the micro array, the DNA does not
detach in washing, owing to the fact that the spotted DNA is
immobilized on a membrane filter after being denatured by an alkali
treatment. Therefore, the macro array and micro array should
be suitably selected according to use.
The following describes an exemplary method of using the
macro array. The macro array is used basically in the same way
as the micro array. Specifically, the macro array is hybridized
with isotope (32P, etc.)-labeled targets. Then, target molecules
that did not bind to the array are washed and removed, leaving
only hybridized target molecules on the macro array. Here,
since the target molecules are isotope-labeled, the spots are
exposed on an imaging plate and the expression level of the
targets is determined by measuring signal intensity from the
imaging plate—a procedure not performed in the micro array.
The techniques of the present invention can also be
applied to the mass array. In the mass array, genomic DNA
fragments are arranged and immobilized in an orderly manner
on a silicon substrate, and therefore the structure is basically
the same as that of the micro array. The mass array was
developed for SNP analysis, and as such it is used differently
from the DNA micro array.
Specifically, oligonucleotides corresponding to regions in
the vicinity of target SNP are synthesized and hybridized with
the mass array. Then, by using the oligonucleotides as primers,
a DNA fragment having a SNP single base difference is
synthesized through elongation catalyzed by DNA polymerase.
The DNA fragment is eluted and then ionized with MALDI. The
SNP type can be determined by detecting a single base mass
difference using TOF-MS. Note that, as to the MALDI-TOF-MS,
details will be described later in the Third Embodiment.
The DNA micro array and macro array are both
direct-arrangement arrays, whereas the bead array is classified
as an indirect-arrangement array. The bead array is used in
such a manner that, in a small container, a probe such as a
nucleic acid or antibody is immobilized on a surface of each
bead to which an ID code has been added, and that the probe
immobilized on the bead surface is specified by reading the ID
code of the bead. With use of a two-wavelength laser beam, 100
kinds of beads can be quantified. That is, in an array according
to the present invention, the support may be a collection of
micro-supports (beads, for example) on which biosubstances or
synthetic substances are individually immobilized.
In applying the invention to the bead array, each bead is
appended with an ID code containing sequence position
information, as described above. In this way, measurement can
be performed in the same manner as in the other techniques.
Further, since the bead array allows for detection in a liquid
phase, it is effective in efficiently quantifying proteins in
particular. This will be described in detail in the Third
Embodiment.
The target DNA is not particularly limited. In quantifying
the expression level of genes, cDNA or cRNA derived from mRNA
is generally used as a target sample. In the present invention,
genomic DNA treated with restriction enzyme can also be used,
for example.
A gene expression analysis with a common DNA array
(represented by the DNA micro array) is based on the principle of
Northern blotting. This is effective in detecting genes having
different expression patterns between two samples that differ
from each other by the presence or absence of a particular
disease, for example. However, if the purpose of the analysis is
to detect genetic differences between the two samples, finding
different gene expression is often not effective in meeting such a
purpose because different gene expression does not necessarily
mean that the samples are genetically different.
For a comparative expression analysis of a large number
of samples (lines) using known DNA micro array techniques, a
strict coordination (synchronization) of growth stage is required
between tested samples, or only specific tissues need to be
collected. Further, since the mRNA (cDNA) used as target DNA
is a collection of expressed genes, comparison can only be made
for the information of genes whose expression is specifically
activated or suppressed in a tested growth stage.
Further, there have been many reports that suggest
difficulties of a DNA micro array analysis in detecting a specific
mutated gene in the genome even if it is present, owing to the
fact that the expression level may not reflect the amount of
transcripts, that the genes may be expressed only in limited
tissues or stages, or that the amount of transcripts may be too
small to be detected by the Northern blotting method.
Meanwhile, diversity of genes is not necessarily governed
by mutations in the coding regions of genes. For example, there
have been many reports that address the presence or absence of
insertion and/or deletion in the introns, or structural
differences (for example, differences in promoter activities) in
the expression regulating region like a promoter sequence.
One applicable area of the present invention is variety
improvement. In this application, cereals can be suitably used
for variety improvement, for example. Among cereals, the
genome size of barley for example is greater than that of rice by
more than 10 fold. It is then highly likely that the non-coding
regions, which account for the majority of the barley genome,
contribute to the intraspecies diversity in barley.
In a DNA array according to the present invention, the
DNA fragments (biosubstances) immobilized on a support are
arranged in the chromosomal order. Thus, with an array of the
present invention, the location of chromosomal recombination
can be grasped by a single round of testing. Thus, in an
analysis using an array of the present invention, target DNA is
prepared so as to allow for use of the Southern blotting method.
In this way, structural mutations in the non-coding regions of
genes can also be efficiently detected, in addition to solving the
conventional problems associated with the Northern blotting
method.
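Because the spots follow the chromosomal order, a
recombination site appears as a change of parental type
between neighboring spots. The following Python sketch
illustrates this under the assumption that each spot has
already been scored as parent 'A', parent 'B', or None (no
usable signal). The scoring step and the function name are
hypothetical and are not specified in the present description.

```python
def find_crossovers(calls):
    """Locate intervals where the parental type switches.

    calls: list of (position, parent) tuples in chromosomal order,
    where parent is 'A', 'B', or None for an unscored spot.
    Returns a list of (left_position, right_position) intervals, each
    bounded by the two nearest scored spots of differing parental type.
    """
    crossovers = []
    last = None  # most recent scored (position, parent)
    for position, parent in calls:
        if parent is None:
            continue  # skip spots with no usable signal
        if last is not None and parent != last[1]:
            crossovers.append((last[0], position))
        last = (position, parent)
    return crossovers
```

Each returned interval only brackets the crossover; its
resolution is limited by the spacing of the immobilized
fragments along the chromosome.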
The method by which target DNA is prepared for Southern
blotting is not particularly limited, and genomic DNA is
fragmented by known methods. Specifically, genomic DNA
digested with a restriction enzyme is used as target DNA. In other
words, RFLP analysis is performed with an array of the present
invention.
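The principle of RFLP can be sketched in Python as follows.
The sketch cuts a linear sequence at each occurrence of a
recognition site; the default site and cut position are those
of EcoRI (G^AATTC), chosen here only as a familiar example.
The function name and the short test sequences are
hypothetical.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut a linear DNA sequence at every occurrence of `site`.

    cut_offset is the cut position inside the site; 1 means the cut
    falls after the first base, as for EcoRI (G^AATTC).
    Returns the fragments in order along the sequence.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])  # remainder after the last cut
    return fragments


# A single base change that destroys a site alters the fragment lengths.
line_1 = "AAGAATTCCCCGAATTCTT"
line_2 = "AAGAATTCCCCGAGTTCTT"  # second site lost
lengths_1 = [len(f) for f in digest(line_1)]
lengths_2 = [len(f) for f in digest(line_2)]
```

The two hypothetical lines yield different fragment length
patterns, which is the kind of length polymorphism that the
analysis detects.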
Digestion of genomic DNA with restriction enzymes
produces probe DNA fragments of many different sizes as
compared with using mRNA (cDNA). This can be a drawback
where accurate detection of polymorphism, such as a length
difference, for example, between 500 bp and 5 kbp is required
on the array (detection sensitivity of imaging means for
detecting image information of array is brought into question).
In order to avoid such a problem, DNA fragments obtained
by the treatment of genomic DNA with restriction enzymes are
fractionated by size to be used as target DNA. In this way, a
length difference can be effectively detected as a polymorphism,
enabling an array of the present invention to be effectively used
in the analysis employing the Southern blotting method.
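As a minimal sketch of size fractionation, the following
Python function groups fragment lengths into size fractions.
The boundary values in the test data are hypothetical and
would in practice depend on the fractionation technique used.

```python
from bisect import bisect_right


def fractionate(lengths_bp, boundaries_bp):
    """Group fragment lengths (in bp) into size fractions.

    boundaries_bp must be sorted in ascending order. Fraction 0 holds
    lengths below the first boundary; fraction i holds lengths from
    boundaries_bp[i - 1] (inclusive) up to boundaries_bp[i] (exclusive);
    the last fraction holds lengths at or above the final boundary.
    """
    fractions = [[] for _ in range(len(boundaries_bp) + 1)]
    for length in lengths_bp:
        fractions[bisect_right(boundaries_bp, length)].append(length)
    return fractions
```

Using one fraction at a time as target DNA keeps fragments of
very different lengths, such as 500 bp and 5 kbp, from being
compared on the same array.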
The method of size fractionation is not particularly limited,
and any technique can be used as long as the method allows
the genomic DNA treated with restriction enzymes to be
fractionated to required sizes. For example, a commercially
available nucleic acid purification column kit using a
centrifugal tube can be used. Further, size fractionation can be
performed by setting PCR conditions such that DNA fragments
of certain sizes are specifically amplified. The labeling method of
genomic DNA is not particularly limited, and labeling can be
made by a known method using PCR, for example.
A fabrication method of array according to the present
invention at least includes the step of arranging and
immobilizing on a support different kinds of biosubstances
obtained from a living organism of interest, or synthetic
substances interacting with such biosubstances. In the step,
the biosubstances or synthetic substances immobilized on the
support are arranged in the order in which genes of the organism are
coded on the chromosome.
When the biosubstances are nucleic acids as in the
present embodiment, the step follows the following procedure,
for example. After preparing genomic DNA, the genomic DNA is
fragmented by restriction enzymes, and a solution of DNA
fragments is spotted on the support using a spotter. Here, the
DNA fragments are spotted with a spotter in such a manner
that chromosome information of the corresponding genes can be
identified, as described above.
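One possible way of assigning spot coordinates so that the
chromosomal order can be identified is sketched below in
Python. The row-by-row filling scheme, the field names, and
the function name are assumptions made for illustration; the
present description does not prescribe a particular layout.

```python
def layout_spots(fragments, columns):
    """Assign (row, column) coordinates in chromosomal order.

    fragments: list of dicts with keys 'name', 'chromosome' (int) and
    'position' (map position on that chromosome).
    columns: number of spots per row on the support.
    Returns {name: (row, column)}, filling the support row by row.
    """
    ordered = sorted(fragments, key=lambda f: (f["chromosome"], f["position"]))
    return {f["name"]: divmod(i, columns) for i, f in enumerate(ordered)}
```

With such a table, a spotter can be driven in any order while
the resulting image still reads along each chromosome from
left to right and from top to bottom.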
The spotter is not particularly limited and known
instruments can be suitably used. Specifically, for example, an
instrument that spots a DNA solution onto a substrate
through a capillary pen, or an ink jet device that plots a DNA
solution on a substrate is available.
In the case where the support is a collection of micro
supports (beads) like a bead array (micro support group), DNA
or other substances are individually immobilized on the beads,
and sequence position information indicative of chromosomal
locations of the immobilized DNA is added, together with an
identification code, to each bead. The group of beads so
obtained is dispersed in a known liquid to prepare a bead
solution, which is then charged into a small container and used
as a bead array.
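The association between an identification code and sequence
position information can be sketched in Python as follows.
Beads are read in arbitrary order in the liquid phase, and the
readout is then rearranged into chromosomal order. The class,
field, and function names are hypothetical, as is the form of
the ID codes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bead:
    id_code: str     # identification code read from the bead
    chromosome: int  # sequence position information: chromosome
    position: float  # sequence position information: map position
    probe: str       # name of the immobilized DNA or other substance


def order_readout(beads, signals):
    """Rearrange bead signals into chromosomal order.

    beads: iterable of Bead records prepared at fabrication time.
    signals: {id_code: intensity} as measured, in arbitrary order.
    Returns [(probe, intensity)] sorted by chromosome and position.
    ID codes without a matching bead record are ignored.
    """
    registry = {bead.id_code: bead for bead in beads}
    hits = [registry[code] for code in signals if code in registry]
    hits.sort(key=lambda bead: (bead.chromosome, bead.position))
    return [(bead.probe, signals[bead.id_code]) for bead in hits]
```

In this sketch the sequence position information is held in a
lookup table keyed by the ID code; encoding it directly in the
ID code would serve the same purpose.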