Description
RECORDING MEDIUM, REPRODUCING DEVICE, RECORDING METHOD, AND
REPRODUCING METHOD
Technical Field
[0001]
The present invention belongs to the technical field of the
Out-of-MUX framework.
Background Art
[0002]
The Out-of-MUX framework is a technology that simultaneously
reads a digital stream recorded on a read-only recording medium, such
as a BD-ROM, and a digital stream recorded in a local storage, which
is a rewritable recording medium, supplies them to a decoder, and
then plays them back synchronously.
Here, assume that the digital stream recorded on a BD-ROM is a
main portion of a movie while the digital stream recorded in a local
storage is a commentary of the director of the movie. In this case,
by realizing the above-mentioned Out-of-MUX framework, the main
portion of the movie on the BD-ROM and the commentary can be played
back together, which thereby improves and expands content on the BD-
ROM.
The prior art regarding read-only recording media includes the
following patent application. Japanese Laid-Open Patent Application No.
H8-83478
Disclosure of the Invention
[Problems that the Invention is to Solve]
[0003]
In the above-described Out-of-MUX framework, a stream recorded
on the BD-ROM and a stream recorded in the local storage must be read
simultaneously, and TS packets constituting these streams need to be
supplied to the decoder. According to an examination of how much
bandwidth is required for the supply to the decoder, in the worst case
where the supply bit rate of the BD-ROM is 48 Mbps and the supply bit
rate of the local storage is 48 Mbps, a data supply of as much as
96 Mbits (48 Mbits + 48 Mbits) per second may occur during the period
of the simultaneous readout. If such a worst case is likely to occur, the
bandwidth in the device must be increased so as to supply TS packets at 96
Mbps. If this cannot be done, it is necessary to provide a large
buffer in the decoder and cause the decoder to perform a prior read
operation to read TS packets in advance so that the supply does not
concentrate at a point in time. If the period of simultaneous
readout is short, this may be possible; however, in the case of playing
back a two-hour movie, the buffer capacity is insufficient,
and the prior read operation is therefore not successfully performed.
[0004]
Since the prior read operation is not successfully performed,
an underflow occurs in the buffer for the prior reading operation.
This then causes loss of video and audio, and therefore the playback
quality is significantly reduced. However, high-bit-rate data supply
impedes price reduction of such playback
apparatuses.
The present invention aims at providing a recording medium
capable of supplying, to a decoder, digital streams supplied from
different recording media without the need to increase the
bandwidth.
[Means to Solve the Problem]
[0005]
In order to achieve the above-mentioned object, the recording
medium of the present invention is characterized in that: the
playlist information includes main-path information and sub-path
information; the main-path information specifies, among a plurality
of digital streams, one digital stream as a main stream, and defines
a primary playback section on the main stream; the sub-path
information specifies, among the rest of the plurality of digital streams,
one digital stream as a substream, and defines, on the substream, a
secondary playback section which is to be synchronized with the
primary playback section; the playlist information further includes a
stream table showing at least one pair of elementary streams which
are allowed to be simultaneously played back, the pair of elementary
streams being made up of one of a plurality of elementary streams
multiplexed into the main stream and one of a plurality of elementary
streams multiplexed into the substream; and the total data size of a
digital stream per unit time is less than or equal to a predetermined
value, the digital stream including the pair of elementary streams
and not including an elementary stream which is not allowed in the
stream table to be simultaneously played back.
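The relationship among the main-path information, the sub-path information and the stream table can be pictured with the following minimal sketch. It is an illustration only: the class and field names are hypothetical, and the check is simplified to the sum of the two elementary streams of each permitted pair, which is an assumption made for brevity rather than the exact condition recited above.

```python
from dataclasses import dataclass, field


@dataclass
class ElementaryStream:
    pid: int                  # packet identifier of the elementary stream
    kind: str                 # e.g. "video" or "audio" (illustrative labels)
    bits_per_unit_time: int   # data size per unit time, in bits


@dataclass
class PlayListInfo:
    main_stream: list         # elementary streams multiplexed into the main stream
    substream: list           # elementary streams multiplexed into the substream
    # Stream table: permitted (main PID, substream PID) pairs.
    stream_table: list = field(default_factory=list)


def within_limit(playlist, limit_bits):
    """Return True if every permitted pair stays at or below limit_bits.

    Simplifying assumption: only the two streams of each pair are summed.
    """
    main = {es.pid: es for es in playlist.main_stream}
    sub = {es.pid: es for es in playlist.substream}
    return all(
        main[m].bits_per_unit_time + sub[s].bits_per_unit_time <= limit_bits
        for m, s in playlist.stream_table
    )


# Hypothetical example values, not taken from the specification.
video = ElementaryStream(pid=0x1011, kind="video", bits_per_unit_time=40_000_000)
commentary = ElementaryStream(pid=0x1A00, kind="audio", bits_per_unit_time=6_000_000)
example = PlayListInfo([video], [commentary], [(0x1011, 0x1A00)])
```

With these example values, the pair totals 46 Mbits per unit time and therefore passes a 48-Mbit limit.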
[Advantageous Effects of the Invention]
[0006]
The total data size, per unit time, of a plurality of
elementary streams allowed in the stream table to be played back is
less than or equal to the predetermined value. Even in the worst
case, the amount of TS packets transferred per unit time does not
exceed the predetermined value.
For example, in the case where the unit time is one second and
the predetermined value is 48 Mbits, if the supply amount of TS
packets locally reaches 96 Mbits due to the simultaneous readout of
the streams, the bit amount per second is controlled to be less than
or equal to 48 Mbits. Accordingly, the worst case—i.e. the data
supply amount of 96 Mbits—does not continue for 0.5 seconds or more.
[0007]
Since it is ensured that "the worst case does not continue for
0.5 seconds or more" at any point on the time axis of stream playback,
an underflow in the buffer of the decoder can be prevented by
building the playback apparatus in such a manner that TS packets with
a size of 96 Mbits x 0.5 seconds are always read in advance and
supplied to the decoder.
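The arithmetic of this example can be sketched as follows. The sketch assumes that each TS packet is 188 bytes and that the time at which each packet is supplied is known in seconds; the function names are hypothetical and serve only to illustrate the per-unit-time restriction.

```python
TS_PACKET_BITS = 188 * 8  # size of one TS packet in bits


def max_bits_in_window(supply_times, window=1.0):
    """Largest number of bits supplied within any window [t, t + window)
    that starts at a packet supply time."""
    times = sorted(supply_times)
    best = 0
    end = 0
    for start in range(len(times)):
        # Advance 'end' past every packet that falls inside the window.
        while end < len(times) and times[end] < times[start] + window:
            end += 1
        best = max(best, (end - start) * TS_PACKET_BITS)
    return best


def satisfies_limit(supply_times, limit_bits=48_000_000, window=1.0):
    """True if no window carries more than limit_bits."""
    return max_bits_in_window(supply_times, window) <= limit_bits


def worst_case_duration(limit_bits=48_000_000, peak_rate=96_000_000):
    """Longest time (seconds) the peak rate can last within the limit."""
    return limit_bits / peak_rate
```

For the values in the text, `worst_case_duration()` evaluates to 0.5, which corresponds to the statement that the 96-Mbit worst case does not continue for 0.5 seconds or more.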
The prior reading operation with the upper limit of "96 Mbits
x 0.5 seconds" prevents the occurrence of an underflow, and therefore
TS packets can be stably supplied to the decoder. This eliminates
the risk that simultaneous readout to realize the Out-of-MUX
framework has an influence on the quality of the digital stream. It
is possible to realize the Out-of-MUX framework on a playback
apparatus that performs BD-ROM playback only without requiring the
bandwidth to be increased. As a result, playback apparatuses that
realize the Out-of-MUX framework can be introduced to the market at
low prices.
[0008]
In addition, with the limitation of "48 Mbits or less per
second," if the playback apparatus executes the simple control of
"always performing a prior reading operation" as described above, it
is possible to prevent the occurrence of an underflow even if the
worst-case data supply occurs. This eliminates the need to
implement a process for predicting the timings at which the
worst-case data supply would occur, thereby facilitating development
of the playback apparatuses.
Brief Description of the Drawings
[0009]
FIG. 1 shows a usage application of a recording medium
according to the present invention;
FIG. 2 shows an internal structure of a BD-ROM;
FIG. 3 shows a schematic structure of a file with an extension
of .m2ts attached thereto;
FIG. 4 shows further details of how video and audio streams
are stored in a PES packet sequence;
FIG. 5 shows how the video and audio are multiplexed into a
program stream and a transport stream;
FIG. 6 shows details of a transport stream;
FIG. 7 shows internal structures of a PAT packet and a PMT
packet;
FIG. 8 shows what processes TS packets constituting an AVClip
are subjected to before they are written to the BD-ROM;
FIG. 9 shows an internal structure of an Aligned Unit;
FIG. 10 shows an internal structure of Clip information;
FIG. 11 shows EP_map settings for a video stream of a movie;
FIG. 12 shows a data structure of PlayList information;
FIG. 13 shows relationships between AVClip and PlayList
information;
FIG. 14 shows an internal structure of a local storage 200;
FIG. 15 shows the way a Primary TS and a Secondary TS making
up an Out_of_MUX application are supplied to a decoder within a BD-ROM
playback apparatus;
FIG. 16 shows a data structure of PlayList information;
FIG. 17 shows a close-up of an internal structure of Subpath
information;
FIG. 18 shows a relationship among SubClips in the local storage
200, PlayList information in the local storage 200 and MainClip in
the BD-ROM;
FIG. 19A shows an internal structure of an STN_table;
[0010]
FIG. 19B shows a Stream_attribute corresponding to a video
stream;
FIG. 19C shows a Stream_attribute corresponding to an audio
stream;
FIG. 19D shows a Stream_entry of the audio stream;
FIG. 20 shows TS packets read from a BD-ROM and from a local
storage, and illustrates, of these TS packets, ones to be supplied to
the decoder;
FIGs. 21A-21D show shift of Window;
FIG. 22 is a graph showing temporal transition regarding a
data amount of TS packets read from the BD-ROM as well as a data
amount of TS packets read from the local storage;
FIGs. 23A and 23B show the comparison between the
transmittable amount and the amount supplied to the decoder for each
Window;
FIG. 24 shows a connection state of PlayItems and SubPlayItems
constituting the Out_of_MUX;
FIG. 25 shows a relationship between In_Times and Out_Times of
PlayItems and In_Times and Out_Times of SubPlayItems in the case
where connection_condition information of PlayItem and
sp_connection_condition information of SubPlayItem shown in FIG. 24
are set to "= 5";
FIG. 26 shows an STC value to be referred to when part
existing from In_Time to Out_Time of PlayItem is played back and an
STC value to be referred to when part existing from In_Time to
Out_Time of SubPlayItem is played back;
FIG. 27 shows how TS1s and TS2s are identified in a MainClip
referred to in the previous PlayItem and a SubClip referred to in the
current PlayItem;
FIG. 28 shows details of CC = 5 and SP_CC = 5;
FIG. 29 shows a relationship among multiple Video Presentation
Units specified by a previous PlayItem and the current PlayItem,
multiple Audio Presentation Units, and STC time axes;
FIG. 30 shows an internal structure of the playback apparatus
of the present invention;
FIG. 31 is a flowchart showing a playback procedure based on
PlayList information;
FIG. 32 is a flowchart showing a processing procedure of a
seamless connection of SubPlayItems;
FIG. 33 shows an internal structure of an authoring system of
Embodiment 2;
FIG. 34 is a flowchart showing the verification procedure on
Primary TSs and Secondary TSs;
FIG. 35 is a flowchart showing a procedure of verification on
a Primary TS and a Secondary TS when there are multiple elementary
streams of the same type;
FIG. 36 shows a detailed explanation of CC = 6;
FIG. 37 shows a correlation between PlayItems and
SubPlayItems;
FIG. 38 schematically shows the way multiple TS packets
present on an ATC time axis are multiplexed;
FIG. 39 schematically shows, in the case where a subtitle (PG)
and a menu (IG) are also replaced in addition to audio, the way
multiple TS packets constituting the Primary TS and multiple TS
packets constituting the Secondary TS are multiplexed together;
FIG. 40 shows the way a Primary TS and a Secondary TS
constituting an audio mixing application are supplied to a decoder
within the BD-ROM playback apparatus;
FIG. 41 shows an internal structure of the playback apparatus
according to Embodiment 5;
FIG. 42 shows a correlation between PlayItems and SubPlayItems
specified by a PlayList indicating audio mixing; and
FIG. 43 shows an example of PlayList information making up
both a theatrical version and a director's cut.
Explanation of References
[0011]
1a BD-ROM drive
1b, c read buffer
1b, a, c ATC counter
2a, d source depacketizer
2c, d ATC counter
3a, c STC counter
3b, d PID filter
4 video decoder
5 video plane
6 transport buffer
7 elementary buffer
8 audio decoder
10a, b, c, d switch
11 interactive graphics decoder
12 interactive graphics plane
13 presentation graphics decoder
14 presentation graphics plane
17 synthesis unit
21 memory
22 controller
23 PSR set
24 PID conversion unit
25 network unit
26 operation receiving unit
100 BD-ROM
200 local storage
300 playback apparatus
400 television
500 AV amplifier
Best Mode for Carrying Out the Invention
[0012]
EMBODIMENT 1
The following gives an account of a preferred embodiment of a
recording medium according to the present invention. First, a usage
application is described in relation to the implementation of the
recording medium of the present invention. FIG. 1 shows a usage
application of the recording medium according to the present
invention. A local storage 200 in FIG. 1 is the recording medium of
the present invention. The local storage 200 is used for the purpose
of supplying a movie to a home theater system composed of a playback
apparatus 300, a television 400, an AV amplifier 500 and speakers 600.
[0013]
The following explains a BD-ROM 100, the local storage 200 and
the playback apparatus 300.
The BD-ROM 100 is a recording medium on which a movie is
recorded.
The local storage 200 is a hard disk that is built in the
playback apparatus, and is used for storing content distributed from
a server of a movie distributor.
[0014]
The playback apparatus 300 is a network-capable digital home
electrical appliance, and has a function to play the BD-ROM 100.
The playback apparatus 300 is also able to download content
from a server 700 of a movie distributor via a network, store the
downloaded content in the local storage 200, and combine this content
with content recorded on the BD-ROM 100 to expand/update the content
of the BD-ROM 100. A technology called "virtual package" combines
content recorded on the BD-ROM 100 with content stored in the local
storage 200 and treats data not recorded on the BD-ROM 100 as if
it were recorded on the BD-ROM 100.
[0015]
Thus concludes the description of the usage application of the
recording medium of the present invention.
Next is described a production application of the recording
medium of the present invention. The recording medium of the present
invention can be realized as a result of improvements in the file
system of a BD-ROM.
FIG. 2 shows an internal structure of a BD-ROM. Level 4 in
the figure shows the BD-ROM, and Level 3 shows a track on the BD-ROM.
The figure depicts the track in a laterally drawn-out form, although
the track is, in fact, formed in a spiral, winding from the inside
toward the outside of the BD-ROM. The track is composed of a lead-in
area, a volume area, and a lead-out area. The volume area in the
figure has a layer model made up of a physical layer, a filesystem
layer, and an application layer. Level 1 in the figure shows a
format of the application layer of the BD-ROM by using a directory
structure. In Level 1, the BD-ROM has a BDMV directory under the Root
directory.
[0016]
Furthermore, three subdirectories are located under the BDMV
directory: PLAYLIST directory; CLIPINF directory; and STREAM
directory.
The PLAYLIST directory includes a file to which an extension
of mpls is attached (00001.mpls).
The CLIPINF directory includes files to each of which an
extension of clpi is attached (00001.clpi and 00002.clpi).
[0017]
The STREAM directory includes files to each of which an
extension of m2ts is attached (00001.m2ts and 00002.m2ts).
Thus, it can be seen that multiple files of different types
are arranged in the BD-ROM according to the directory structure above.
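The directory structure described above can be summarized in a short sketch. The file names are those given in the description; the mapping from each extension to the kind of data it stores is inferred from the directory names and the surrounding explanation, and the helper name is hypothetical.

```python
from pathlib import PurePosixPath

# Application-layer layout under the Root directory, as described above.
LAYOUT = [
    "BDMV/PLAYLIST/00001.mpls",
    "BDMV/CLIPINF/00001.clpi",
    "BDMV/CLIPINF/00002.clpi",
    "BDMV/STREAM/00001.m2ts",
    "BDMV/STREAM/00002.m2ts",
]

# Assumed association between extension and stored data.
KIND_BY_EXTENSION = {
    ".mpls": "PlayList information",
    ".clpi": "Clip information",
    ".m2ts": "AVClip",
}


def classify(path):
    """Return the kind of data a file holds, judged by its extension."""
    return KIND_BY_EXTENSION.get(PurePosixPath(path).suffix, "unknown")


kinds = [classify(p) for p in LAYOUT]
```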
First, files to which the extension "m2ts" is attached are
explained. FIG. 3 shows a schematic structure of the file to which
the extension "m2ts" is attached. The files to each of which the
extension "m2ts" is attached (00001.m2ts and 00002.m2ts) store an
AVClip. The AVClip is a digital stream in the MPEG2-Transport Stream
format. The digital stream is generated by converting the digitized
video and audio (upper Level 1) into an elementary stream composed of
PES packets (upper Level 2), and converting the elementary stream
into TS packets (upper Level 3), and similarly, converting the
Presentation Graphics (PG) stream for the subtitles or the like and
the Interactive Graphics (IG) stream for the interactive purposes
(lower Level 1 and lower Level 2) into the TS packets (lower Level 3),
and then finally multiplexing these TS packets.
[0018]
The following describes the video stream, audio stream, PG
stream and IG stream.
The video stream is a stream forming moving images of the
movie, and is composed of picture data of SD images and HD images.
The video stream is in VC-1, MPEG4-AVC or MPEG2-Video
format. When the video stream is a video stream in MPEG4-AVC format,
time stamps such as PTS and DTS are attached to IDR, I, P and B
pictures, and playback control is performed in units of pictures. A
unit of a video stream, which is a unit for playback control with PTS
and DTS attached thereto, is called the "Video Presentation Unit".
The audio stream is a stream for an audio track of the movie,
and the formats of the audio stream include LPCM audio stream format,
DTS-HD audio stream format, DD/DD+ audio stream format, and DD/MLP
audio stream format. Time stamps are attached to audio frames in the
audio stream, and playback control is performed in units of audio
frames. A unit of an audio stream, which is a unit for playback
control with a time stamp attached thereto, is called the "Audio
Presentation Unit".
The PG stream is a graphics stream constituting a subtitle
written in a language. There are a plurality of streams that
respectively correspond to a plurality of languages such as English,
Japanese and French. The PG stream is composed of functional
segments such as: PCS (Presentation Control Segment); PDS (Palette
Define Segment); WDS (Window Define Segment); ODS (Object Define
Segment); and END (END of Display Set Segment). The ODS (Object
Define Segment) is a functional segment that defines a graphics
object which is a subtitle.
[0019]
The WDS (Window Define Segment) is a functional segment that
defines a bit amount of a graphics object on the screen. The PDS
(Palette Define Segment) is a functional segment that defines a color
in drawing a graphics object. The PCS (Presentation Control Segment)
is a functional segment that defines a page control in displaying a
subtitle. Such page control includes Cut-In/Out, Fade-In/Out, Color
Change, Scroll, and Wipe-In/Out. It is possible with the page
control by the PCS to achieve a display effect—for example, making
the current subtitle fade out while displaying the next subtitle.
[0020]
The IG stream is a graphics stream for achieving interactive
control. The interactive control defined by an IG stream is an
interactive control that is compatible with an interactive control on
a DVD playback apparatus. The IG stream is composed of functional
segments such as: ICS (Interactive Composition Segment); PDS
(Palette Definition Segment); and ODS (Object Definition Segment).
The ODS (Object Definition Segment) is a functional segment that
defines a graphics object. Buttons on the interactive screen are
drawn by a plurality of such graphics objects. The PDS (Palette
Definition Segment) is a functional segment that defines a color in
drawing a graphics object. The ICS (Interactive Composition Segment)
is a functional segment that achieves a state change in which the
button state changes in accordance with a user operation. The ICS
includes a button command that is executed when a confirmation
operation is performed on a button.
[0021]
Here, an AVClip is made up of at least one "STC_Sequence".
The "STC_Sequence" is a section in which there is no discontinuity
(system time-base discontinuity) in the STC (System Time Clock),
which is a system base time of AV streams. A discontinuity in the
STC is a point at which discontinuity information
(discontinuity_indicator) of a PCR packet carrying a PCR (Program
Clock Reference) referred to by the decoder to obtain the STC is ON.
[0022]
FIG. 4 shows further details of how video and audio streams
are stored in a PES packet sequence. Level 1 in the figure shows a
video stream and Level 3 shows an audio stream. Level 2 shows a PES
packet sequence. As shown by the arrows yy1, yy2, yy3 and yy4 in the
figure, it can be seen that the IDR pictures, B pictures and P
pictures, which are multiple Video Presentation Units in the video
stream, are divided into multiple sections, and each of the divided
sections is stored in one of the payloads (V#1, V#2, V#3 and V#4 in
the figure) of the PES packets. It can be also understood that each
of the audio frames, which are Audio Presentation Units constituting
the audio stream, is stored in one of the payloads (A#1 and A#2 in
the figure) of PES packets, as shown by the arrows aa1 and aa2.
[0023]
FIG. 5 shows how the video and audio are multiplexed into a
program stream and a transport stream. The lower part of the figure
shows multiple PES packets (V#1, V#2, V#3, V#4, A#1 and A#2 in the
figure) which have stored therein the video and audio streams. It
can be seen from the figure that the video and audio streams are
stored in different PES packets. The upper part shows a program
stream and a transport stream in which the PES packets shown in the
lower part are stored. When multiplexed into a program stream, each
PES packet is fit into one pack. When multiplexed into a transport
stream, a PES packet is divided into sections, each of which is then
stored in one of the payloads of multiple TS packets. The BD-ROM uses
the format of the transport stream, not the format of the program
stream, as its storage format. It is common that a video PES
packet used for a transport stream stores therein one frame or two
paired fields although FIG. 5 does not illustrate such a case.
[0024]
FIG. 6 shows details of a transport stream. Level 1 of the
figure shows a sequence of multiple TS packets forming an MPEG2
transport stream and Level 2 shows the internal structure of a TS
packet. As shown in Level 2, one TS packet is composed of a "header",
an "adaptation field" and a "payload". The lead line th1 shows up-close
details of the structure of the header of a TS packet. As
shown by the lead line, the header of a TS packet includes: a "unit
start indicator (payload_unit_start_indicator)" indicating that the start
of a PES packet is stored; a "PID (Packet Identifier)" indicating a
type of an elementary stream which is multiplexed into the transport
stream; and an "adaptation field control" indicating whether an
adaptation field is present in the TS packet.
[0025]
The lead line th2 shows up-close details of the internal
structure of an adaptation field. An adaptation field is given to a
TS packet in the case when the adaptation field control of the header
of the TS packet is set to "1". Specifically speaking, the
adaptation field stores therein: a "random access indicator
(random_access_indicator)" indicating that the TS packet is the
beginning of a video or audio frame and an entry point; and a "PCR
(Program Clock Reference)" that gives an STC (System Time Clock) of
the T-STD (Transport System Target Decoder).
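The header and adaptation-field items named above can be read out of a TS packet as in the following minimal sketch. The bit positions follow the general MPEG-2 Systems packet layout (sync byte 0x47, 13-bit PID, 2-bit adaptation field control, 33-bit PCR base) and are stated here as an assumption, since the description itself does not give them; the function name is hypothetical.

```python
def parse_ts_header(packet: bytes) -> dict:
    """Extract the fields discussed above from one 188-byte TS packet."""
    if len(packet) != 188 or packet[0] != 0x47:
        raise ValueError("expected a 188-byte TS packet starting with 0x47")
    info = {
        "payload_unit_start_indicator": bool(packet[1] & 0x40),
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "adaptation_field_control": (packet[3] >> 4) & 0x03,
        "random_access_indicator": False,
        "pcr_base": None,
    }
    # The upper bit of adaptation_field_control signals an adaptation field.
    has_adaptation = bool(info["adaptation_field_control"] & 0x02)
    if has_adaptation and packet[4] > 0:       # packet[4]: field length
        flags = packet[5]
        info["random_access_indicator"] = bool(flags & 0x40)
        if flags & 0x10 and packet[4] >= 7:    # PCR flag: 6 PCR bytes follow
            b = packet[6:12]
            # 33-bit PCR base: four full bytes plus the top bit of the fifth.
            info["pcr_base"] = (
                (b[0] << 25) | (b[1] << 17) | (b[2] << 9) | (b[3] << 1) | (b[4] >> 7)
            )
    return info


# Hand-built example packet: PID 0x100, start indicator set,
# adaptation field with random access indicator and a PCR base of 3.
example_packet = bytes(
    [0x47, 0x41, 0x00, 0x30, 7, 0x50, 0x00, 0x00, 0x00, 0x01, 0x80, 0x00]
) + bytes([0xFF]) * 176
example_info = parse_ts_header(example_packet)
```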
[0026]
FIG. 7 shows the internal structures of a PAT packet and a PMT
packet. These packets describe the program structure of a transport
stream.
The lead line hm1 of the figure shows up-close details of the
structure of a TS packet with PID = 0 in the transport stream. Such
a TS packet is called the PAT (Program Association Table) packet, and
indicates a program structure of the entire transport stream. The
PID of a PAT packet is always "0". In a PAT packet, a PAS (Program
Association Section) is stored. The lead line hm2 shows up-close
details of the internal structure of a PAS. As shown by the lead
line, a PAS shows the correspondence between program_number (program
number) and a program map table (a PID of the PMT). The lead line
hm3 shows up-close details of the structure of a TS packet with PID =
0x100 present in the transport stream. Such a TS packet is called
the PMT packet. As shown by the lead line hm4, a PMS of the PMT
packet includes: "stream_type" indicating a type of the stream
included in a program corresponding to the PMS; and "elementary_PID"
which is a PID of the stream. According to the example of the figure,
the program with the program number #1 has a PMT with PID = 0x100,
and an MPEG2 video with PID = 0x200 and an ADTS audio with PID = 0x201
make up the program with the program number #1. A program in the
transport stream as well as a PID of a stream constituting the
transport stream and a type of the stream can be found by obtaining
the PID of the PMT from the PAT whose PID is always 0, then obtaining
the PMT packet according to the PID of the PMT, and referring to the
PMS.
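The two-step lookup described above, from the PAT to the PMT and then to the streams of the program, can be sketched with plain dictionaries. The values are those of the example in the figure; the stream types are kept as descriptive strings rather than numeric stream_type codes, and the function name is hypothetical.

```python
# PAT: program_number -> PID of the PMT (the PAT itself always has PID 0).
PAT = {1: 0x100}

# PMT packets by PID: list of (stream_type, elementary_PID) entries.
PMT_BY_PID = {
    0x100: [("MPEG2 video", 0x200), ("ADTS audio", 0x201)],
}


def streams_of_program(program_number, pat=PAT, pmt_by_pid=PMT_BY_PID):
    """Follow PAT -> PMT and return the streams making up the program."""
    pmt_pid = pat[program_number]      # step 1: obtain the PID of the PMT
    return pmt_by_pid[pmt_pid]         # step 2: read the entries of the PMS


program_1 = streams_of_program(1)
```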
[0027]
Next, how an AVClip having the above-described structure is
written to the BD-ROM is explained. FIG. 8 shows what processes TS
packets constituting an AVClip are subjected to before they are
written to the BD-ROM. Level 1 of the figure shows the TS packets
constituting the AVClip.
As shown in Level 2 of FIG. 8, a 4-byte TS_extra_header
(hatched portions in the figure) is attached to each 188-byte TS
packet constituting the AVClip to generate each 192-byte Source
Packet. The TS_extra_header includes Arrival_Time_Stamp that is
information indicating the time at which the TS packet is input to
the decoder. The reason for attaching an ATS header to each TS
packet to form a stream is to assign, to each TS packet, a time at
which the TS packet is input to the decoder (STD). In digital
broadcasting, a transport stream is treated as a stream having a
fixed bit rate. Therefore, dummy TS packets, called NULL packets,
are also multiplexed together to form a transport stream so that the
transport stream is broadcast at a fixed bit rate. However, in the
case where streams are recorded on an optical disk or another
recording medium having a limited recording capacity, such a fixed-
bit-rate recording method is a disadvantage because it consumes the
capacity wastefully. Therefore, NULL packets are not recorded on BD-
ROMs. In order to comply with a variable-bit-rate recording method,
an ATS is attached to each TS packet, and then the transport stream
is recorded on a BD-ROM. The use of the ATS allows for restoring the
decoder input time for each TS packet, and thus can comply with a
variable-bit-rate recording method. Hereinafter, a pair of an ATS
header and a TS packet is called a Source Packet.
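The formation of a 192-byte Source Packet from a 188-byte TS packet can be sketched as follows. The description states only that the 4-byte header carries the Arrival_Time_Stamp; the split into a 2-bit field followed by a 30-bit time stamp used below is an assumption for illustration, as are the function names.

```python
import struct

TS_PACKET_SIZE = 188
SOURCE_PACKET_SIZE = 192


def make_source_packet(ts_packet: bytes, ats: int, upper_bits: int = 0) -> bytes:
    """Prefix a TS packet with a 4-byte header carrying its arrival time.

    Assumed layout: 2 upper bits (unused here) + 30-bit Arrival_Time_Stamp.
    """
    if len(ts_packet) != TS_PACKET_SIZE:
        raise ValueError("a TS packet must be 188 bytes")
    if not 0 <= ats < (1 << 30):
        raise ValueError("the time stamp must fit in 30 bits")
    header = struct.pack(">I", ((upper_bits & 0x3) << 30) | ats)
    return header + ts_packet


def read_ats(source_packet: bytes) -> int:
    """Recover the decoder input time from the header of a Source Packet."""
    (word,) = struct.unpack(">I", source_packet[:4])
    return word & 0x3FFFFFFF


example_sp = make_source_packet(bytes([0x47]) + bytes(187), ats=12345)
```

Because each packet carries its own input time, packets need not be spaced by NULL packets to keep a fixed bit rate.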
[0028]
The AVClip shown in Level 3 includes one or more
"ATC_Sequences," each of which is a sequence of Source Packets. The
"ATC_Sequence" is a sequence of Source Packets, where
Arrival_Time_Clocks referred to by the Arrival_Time_Stamps included in
the ATC_Sequence do not include "arrival time-base discontinuity".
In other words, the "ATC_Sequence" is a sequence of Source Packets,
where Arrival_Time_Clocks referred to by the Arrival_Time_Stamps
included in the ATC_Sequence are continuous.
[0029]
Such ATC_Sequences constitute the AVClip, and are recorded on
the BD-ROM with a file name "xxxxx.m2ts".
The AVClip is, as is the case with normal computer files,
divided into one or more file extents, which are then recorded in
areas on the BD-ROM. Level 4 shows how the AVClip is recorded on the
BD-ROM. In Level 4, each file extent constituting the file has a
data length that is equal to or larger than a predetermined length
called Sextent.
[0030]
Sextent is the minimum data length of each file extent, where
an AVClip is divided into a plurality of file extents to be recorded.
The time required for the optical pickup to jump to a location
on the BD-ROM is obtained by the following equation:
Tjump = Taccess + Toverhead.
The "Taccess" is a time that corresponds to a jump
distance (a distance to a jump-destination physical address).
The TS packets read out from the BD-ROM are stored in a buffer
called read buffer, and then output to the decoder. The "Toverhead"
is obtained by the following equation when the input to the read
buffer is performed with a bit rate called "Rud" and the number of
sectors in the ECC block is represented by Secc:
Toverhead
TS packets read out from the BD-ROM are stored in the read
buffer in the state of Source Packets, and then supplied to the
decoder at a transfer rate called "TS_Recording_rate".
[0031]
To keep the transfer rate of the TS_Recording_rate while the TS
packets are supplied to the decoder, it is necessary that, during
Tjump, the TS packets are continuously output from the read buffer to
the decoder. Here, Source Packets, not TS packets, are output from
the read buffer. As a result, since the ratio of the Source Packet to the
TS packet in size is 192/188, it is necessary that during Tjump,
the Source Packets are continuously output from the read buffer at a
transfer rate of "192/188 x TS_Recording_rate".
Accordingly, the amount of occupied buffer capacity of the
read buffer that does not cause an underflow is represented by the
following equation:
Boccupied ≥ (Tjump/1000x8) x ((192/188) x TS_Recording_rate).
The input rate to the read buffer is represented by Rud, and
the output rate from the read buffer is represented by
TS_Recording_rate x (192/188). Therefore, the occupation rate of the
read buffer is obtained by performing "(input rate) - (output rate)",
and thus obtained by "Rud - TS_Recording_rate x (192/188)".
[0032]
The time "Tx" required to occupy the read buffer by
"Boccupied" is obtained by the following equation:
Tx = Boccupied / (Rud - TS_Recording_rate x (192/188)).
When reading from the BD-ROM, it is necessary to continue to
input TS packets with the bit rate Rud for the time period "Tx". As
a result, the minimum data length Sextent per extent when the AVClip
is divided into a plurality of file extents to be recorded is
obtained by the following equations:
Sextent = Rud x Tx
= Rud x Boccupied / (Rud - TS_Recording_rate x (192/188))
≥ Rud x (Tjump/1000x8) x ((192/188) x TS_Recording_rate)
/ (Rud - TS_Recording_rate x (192/188))
≥ (Rud x Tjump/1000x8) x TS_Recording_rate x 192
/ (Rud x 188 - TS_Recording_rate x 192).
Hence,
Sextent ≥ (Tjump x Rud/1000x8) x (TS_Recording_rate x 192 / (Rud x 188 -
TS_Recording_rate x 192)).
If each file extent constituting the AVClip has the data
length equal to or larger than Sextent that is calculated as a value
that does not cause an underflow of the decoder, even if the file
extents constituting the AVClip are located discretely on the BD-ROM,
TS packets are continuously supplied to the decoder so that the data
is read out continuously during the playback.
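The derivation of Sextent can be checked numerically with the following minimal sketch. It assumes that Tjump is given in milliseconds and the rates in bits per second, and it reads the term "Tjump/1000x8" as a division by 1000 x 8 so that the result is in bytes; this reading, the function names and the example values are assumptions and are not stated in the description.

```python
def min_extent_size(t_jump_ms, rud, ts_recording_rate):
    """Step-by-step form: Boccupied, then Tx, then Sextent = Rud x Tx."""
    out_rate = ts_recording_rate * 192 / 188      # Source Packet output rate
    if rud <= out_rate:
        raise ValueError("Rud must exceed the output rate of the read buffer")
    b_occupied = (t_jump_ms / (1000 * 8)) * out_rate
    t_x = b_occupied / (rud - out_rate)
    return rud * t_x


def min_extent_size_closed_form(t_jump_ms, rud, ts_recording_rate):
    """Final form of the inequality, evaluated at equality."""
    return (t_jump_ms * rud / (1000 * 8)) * (
        ts_recording_rate * 192 / (rud * 188 - ts_recording_rate * 192)
    )


# Hypothetical values: 200 ms jump, 54 Mbps read rate, 48 Mbps recording rate.
stepwise = min_extent_size(200, 54_000_000, 48_000_000)
closed = min_extent_size_closed_form(200, 54_000_000, 48_000_000)
```

Both forms give the same value, which illustrates that multiplying the numerator and denominator by 188 in the last step leaves the bound unchanged.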
[0033]
The minimum constituent unit of the above-mentioned file
extent is an Aligned Unit (the data size is 6 Kbytes) that is
composed of a group of 32 Source Packets. Accordingly, the size of a
stream file (XXXX.AVClip) on a BD is always a multiple of 6 Kbytes.
FIG. 9 shows the internal structure of an "Aligned Unit". The
Aligned Unit is composed of 32 Source Packets and is then written
into a set of three consecutive sectors. The group of 32 Source
Packets is 6144 bytes (= 32x192), which is equivalent to the size of
three sectors (= 2048x3). As to sectors on the BD-ROM, an error
correction code is attached for every 32 Source Packets to thereby
form an ECC block. As long as it accesses the BD-ROM in units of
Aligned Units, the playback apparatus can obtain 32 complete Source
Packets. Thus concludes the description of the process of writing an
AVClip to the BD-ROM. An AVClip that is recorded on the BD-ROM and
with which high-resolution video streams are multiplexed together is
hereinafter referred to as the "MainClip". On the other hand, an
AVClip that is stored in the local storage and played back with a
MainClip is called the "SubClip".
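The size relationship of the Aligned Unit described above (32 Source Packets of 192 bytes occupying three 2048-byte sectors) can be confirmed with a short sketch; the constant and function names are hypothetical.

```python
SOURCE_PACKET_SIZE = 192          # bytes per Source Packet
PACKETS_PER_ALIGNED_UNIT = 32     # Source Packets per Aligned Unit
SECTOR_SIZE = 2048                # bytes per sector

ALIGNED_UNIT_SIZE = SOURCE_PACKET_SIZE * PACKETS_PER_ALIGNED_UNIT  # 6144 bytes


def split_aligned_units(stream: bytes) -> list:
    """Split a stream file into Aligned Units of 6144 bytes each."""
    if len(stream) % ALIGNED_UNIT_SIZE != 0:
        raise ValueError("stream size must be a multiple of 6144 bytes")
    return [
        stream[i:i + ALIGNED_UNIT_SIZE]
        for i in range(0, len(stream), ALIGNED_UNIT_SIZE)
    ]


units = split_aligned_units(bytes(2 * ALIGNED_UNIT_SIZE))
```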
[0034]
A partial transport stream is obtained by demultiplexing a
MainClip recorded on the BD-ROM. A partial transport stream
corresponds to each elementary stream. A partial transport stream
obtained by demultiplexing a MainClip and corresponding to each
elementary stream is called the "Primary TS".
Next are described files to which an extension "clpi" is
attached. Files (00001.clpi and 00002.clpi) to which an extension
"clpi" is attached store Clip information. The Clip information is
management information on each AVClip. FIG. 10 shows the internal
structure of Clip information. As shown on the left-hand side of the
figure, the Clip information includes:
i) "ClipInfo()" storing therein information regarding the
AVClip;
ii) "Sequence Info()" storing therein information regarding
the ATC Sequence and the STC Sequence;
iii) "Program Info()" storing therein information regarding
the Program Sequence; and
iv) "Characteristic Point Info (CPI())".
[0035]
The "ClipInfo" includes "application_type" indicating the
application type of the AVClip referred to by the Clip information.
Referring to the ClipInfo allows identification of whether the
application type is the MainClip or SubClip, whether video is
contained, or whether still pictures (slide show) are contained. In
addition, the above-mentioned TS_recording_rate is described in the
ClipInfo.
The Sequence Info is information regarding one or more STC-
Sequences and ATC-Sequences contained in the AVClip. The reason that
this information is provided is to preliminarily notify the
playback apparatus of the system time-base discontinuity and the
arrival time-base discontinuity. That is to say, if such
discontinuity is present, there is a possibility that a PTS and an
ATS that have the same value appear in the AVClip. This might be a
cause of defective playback. The Sequence Info is provided to
indicate from where to where in the transport stream the STCs or the
ATCs are sequential.
[0036]
The Program Info is information that indicates a section
(called "Program Sequence") of the program where the contents are
constant. Here, "Program" is a group of elementary streams that have
in common a time axis for synchronous playback. The reason that the
Program Info is provided is to preliminarily notify the playback
apparatus of a point at which the Program contents change. It should
be noted here that the point at which the Program contents change is,
for example, a point at which the PID of the video stream changes, or
a point at which the type of the video stream changes from SDTV to
HDTV.
Next is described the Characteristic Point Info. The lead
line cu2 in FIG. 10 indicates a close-up of the structure of CPI. As
indicated by the lead line cu2, the CPI is composed of Ne pieces of
EP_map_for_one_stream_PIDs (EP_map_for_one_stream_PID[0] to
EP_map_for_one_stream_PID[Ne-1]). These EP_map_for_one_stream_PIDs
are EP_maps of the elementary streams that belong to the AVClip. The
EP_map is information that indicates, in association with an entry
time (PTS_EP_start), a packet number (SPN_EP_start) at an entry
position where the Access Unit is present in one elementary stream.
The lead line cu3 in the figure indicates a close-up of the internal
structure of EP_map_for_one_stream_PID.
[0037]
It is understood from the close-up that the
EP_map_for_one_stream_PID is composed of Nc pieces of EP_Highs
(EP_High(0) to EP_High(Nc-1)) and Nf pieces of EP_Lows (EP_Low(0) to
EP_Low(Nf-1)). Here, the EP_High plays a role of indicating upper
bits of the SPN_EP_start and the PTS_EP_start of the Access Unit
(Non-IDR I-Picture, IDR-Picture), and the EP_Low plays a role of
indicating lower bits of the SPN_EP_start and the PTS_EP_start of the
Access Unit (Non-IDR I-Picture, IDR-Picture).
[0038]
The lead line cu4 in the figure indicates a close-up of the
internal structure of EP_High. As indicated by the lead line cu4,
the EP_High(i) is composed of: "ref_to_EP_Low_id[i]" that is a
reference value to EP_Low; "PTS_EP_High[i]" that indicates upper bits
of the PTS of the Access Unit (Non-IDR I-Picture, IDR-Picture); and
"SPN_EP_High[i]" that indicates upper bits of the SPN of the Access
Unit (Non-IDR I-Picture, IDR-Picture). Here, "i" is an identifier of
a given EP_High.
[0039]
The lead line cu5 in the figure indicates a close-up of the
structure of EP_Low. As indicated by the lead line cu5, the
EP_Low(i) is composed of: "is_angle_change_point(EP_Low_id)" that
indicates whether the corresponding Access Unit is an IDR picture;
"I_end_position_offset(EP_Low_id)" that indicates the size of the
corresponding Access Unit; "PTS_EP_Low(EP_Low_id)" that indicates
lower bits of the PTS of the Access Unit (Non-IDR I-Picture,
IDR-Picture); and "SPN_EP_Low(EP_Low_id)" that indicates lower bits of
the SPN of the Access Unit (Non-IDR I-Picture, IDR-Picture). Here,
"EP_Low_id" is an identifier for identifying a given EP_Low.
[0040]
Here, the EP_map is explained using a specific example. FIG.
11 shows EP_map settings for a video stream of a movie. Level 1
shows a plurality of pictures (IDR picture, I-Picture, B-Picture, and
P-Picture defined in MPEG4-AVC) arranged in the order of display.
Level 2 shows the time axis for the pictures. Level 4 indicates a TS
packet sequence on the BD-ROM, and Level 3 indicates settings of the
EP_map.
[0041]
Assume here that, in the time axis of Level 2, an IDR picture
or an I picture is present at each time point t1 to t7. The interval
between adjacent ones of the time points t1 to t7 is approximately
one second. The EP_map used for the movie is set to indicate t1 to
t7 with the entry times (PTS_EP_start), and indicate entry positions
(SPN_EP_start) in association with the entry times.
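The way a playback apparatus could use such an EP_map for random access can be sketched as follows. This is a minimal illustration, not the method of the embodiment: the EP_map is modeled as a list of (PTS_EP_start, SPN_EP_start) pairs sorted by entry time, the function name is hypothetical, and the numeric values in the usage note are made up.

```python
from bisect import bisect_right


def find_entry(ep_map, target_pts):
    """Return the (PTS_EP_start, SPN_EP_start) pair of the last entry whose
    entry time is at or before target_pts.

    ep_map is assumed to be a list of (PTS_EP_start, SPN_EP_start) tuples
    sorted by PTS_EP_start (an assumption of this sketch).
    """
    entry_times = [pts for pts, _ in ep_map]
    index = bisect_right(entry_times, target_pts) - 1
    if index < 0:
        raise ValueError("target time precedes the first entry point")
    return ep_map[index]
```

For example, with seven hypothetical entries t1 to t7 spaced one second apart, a jump to a time between t3 and t4 resolves to the packet number registered for t3, so decoding can start from an I picture or IDR picture.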
Next is described the PlayList information. A file
(00001.mpls) to which extension "mpls" is attached is a file storing
therein the PlayList (PL) information.
[0042]
FIG. 12 shows the data structure of the PlayList information.
As indicated by the lead line mp1 in the figure, the PlayList
information includes: MainPath information (MainPath()) that defines
MainPath; PlayListMark information (PlayListMark()) that defines a
chapter; and other extension data (Extension Data).
First is described the MainPath. The MainPath is a playback
path that is defined in terms of a video stream, such as the main
video, and an audio stream.
[0043]
As indicated by the arrow mp1, the MainPath is defined by a
plurality of pieces of PlayItem information: PlayItem information #1
to PlayItem information #m. PlayItem information defines one or more
logical playback sections that constitute the MainPath. The lead
line hs1 in the figure indicates a close-up of the structure of
PlayItem information. As indicated by the lead line hs1, PlayItem
information is composed of: "Clip_Information_file_name" that
indicates the file name of the playback section information of the
AVClip to which the IN point and the OUT point of the playback
section belong; "Clip_codec_identifier" that indicates the AVClip
encoding method; "is_multi_angle" that indicates whether or not
PlayItem is multi angle; "connection_condition" that indicates
whether or not to seamlessly connect the current PlayItem and the
preceding PlayItem; "ref_to_STC_id[0]" that indicates uniquely the
STC_Sequence targeted by PlayItem; "In_time" that is time information
indicating the start point of the playback section; "Out_time" that
is time information indicating the end point of the playback section;
"UO_mask_table" that indicates which user operation should be masked
by PlayItem; "PlayItem_random_access_flag" that indicates whether to
permit a random access to a mid-point in PlayItem; "Still_mode" that
indicates whether to continue a still display of the last picture
after the playback of PlayItem ends; and "STN_table". Among these,
the time information "In_time" indicating the start point of the
playback section and the time information "Out_time" indicating the
end point of the playback section constitute a playback path. The
playback path information is composed of "In_time" and "Out_time".
[0044]
FIG. 13 shows the relationships between the AVClip and the
PlayList information. Level 1 shows the time axis of the PlayList
information (PlayList time axis). Levels 2 to 5 show the video
stream that is referenced by the EP_map.
The PlayList information includes two pieces of PlayItem
information: PlayItem information #1; and PlayItem information #2.
Two playback sections are defined by "In_time" and "Out_time"
included in PlayItem information #1 and PlayItem information #2,
respectively. When these playback sections are arranged, a time axis
that is different from the AVClip time axis is defined. This is the
PlayList time axis shown in Level 1. Thus, it is possible to define
a playback path that is different from the AVClip by defining
PlayItem information.
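How the playback sections of successive pieces of PlayItem information line up into a PlayList time axis can be sketched as below. The class and function names are hypothetical, only three of the PlayItem fields are modeled, and the time values are treated as plain integers in an unspecified common unit; all of these are assumptions of the sketch.

```python
from dataclasses import dataclass


@dataclass
class PlayItem:
    clip_information_file_name: str  # identifies the referenced AVClip
    in_time: int                     # start point on the AVClip time axis
    out_time: int                    # end point on the AVClip time axis


def playlist_to_clip_time(play_items, playlist_time):
    """Map a point on the PlayList time axis to (clip name, AVClip time)."""
    if playlist_time < 0:
        raise ValueError("negative PlayList time")
    offset = 0  # PlayList time at which the current playback section begins
    for item in play_items:
        duration = item.out_time - item.in_time
        if playlist_time < offset + duration:
            return (item.clip_information_file_name,
                    item.in_time + (playlist_time - offset))
        offset += duration
    raise ValueError("time lies beyond the end of the PlayList")
```

Two sections taken from different positions of the same AVClip thus appear back to back on the PlayList time axis, which is why that axis differs from the AVClip time axis.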
[0045]
Thus concludes the description of the BD-ROM 100.
The following describes the local storage 200 that is a
recording medium of the present invention. FIG. 14 shows an internal
structure of the local storage 200. As shown in the figure, the
recording medium of the present invention can be produced by
improving the application layer.
[0046]
Level 4 of the figure shows the local storage 200 and Level 3
shows a track on the local storage 200. The figure depicts the track
in a laterally drawn-out form, although the track is, in fact, formed
in a spiral, winding from the inside toward the outside of the local
storage 200. The track is composed of a lead-in area, a volume area,
and a lead-out area. The volume area in the figure has a layer model
made up of a physical layer, a filesystem layer, and an application
layer. Level 1 in the figure shows a format of the application layer
of the local storage 200 by using a directory structure.
[0047]
In the directory structure shown in FIG. 14, there is a
subdirectory "organization#1" under a root directory. Also, there is
a subdirectory "disk#1" under the directory "organization#1". The
directory "organization#1" is assigned to a specific provider of a
movie. The directory "disk#1" is assigned to each BD-ROM provided
from the provider.
[0048]
With this construction in which the directory assigned to a
specific provider includes directories that correspond to BD-ROMs,
download data for each BD-ROM is stored separately. Similarly to the
information stored in the BD-ROM, under the subdirectory "disk#1",
the following information is stored: PlayList information
("00002.mpls"); Clip information ("00003.clpi" and "00004.clpi"); and
AVClips ("00003.m2ts" and "00004.m2ts").
The following describes components of the local storage 200:
the PlayList information, Clip information and AVClips.
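The directory layout of the local storage described above (one directory per provider, one subdirectory per disc) can be illustrated with a short sketch. The function name and the use of integer identifiers are hypothetical; only the directory and file names come from the description.

```python
from pathlib import PurePosixPath


def local_storage_path(organization_id: int, disc_id: int,
                       file_name: str) -> PurePosixPath:
    """Build the path of a downloaded file under the root directory,
    following the organization#N / disk#N layout (illustrative only)."""
    return (PurePosixPath("/")
            / f"organization#{organization_id}"
            / f"disk#{disc_id}"
            / file_name)


# Files named in the description for provider 1, disc 1.
paths = [local_storage_path(1, 1, name)
         for name in ("00002.mpls", "00003.clpi", "00004.clpi",
                      "00003.m2ts", "00004.m2ts")]
```

Keeping one subdirectory per disc is what allows download data for different BD-ROMs of the same provider to be stored separately.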
[0049]
The AVClips (00003.m2ts and 00004.m2ts) in the local storage
200 make up SubClips. A partial transport stream is obtained by
demultiplexing a SubClip. A partial transport stream obtained by
demultiplexing a SubClip is called the "Secondary TS". Such a
Secondary TS is a constituent of the Out_of_MUX application. The
following describes the Out_of_MUX application.
(Out_of_MUX Application)
The Out_of_MUX application is an application that, for example,
selects two TSs (a Primary TS in the BD-ROM and a Secondary TS, which
is obtained via a network or the like and recorded in the local
storage) and plays them back simultaneously, thereby allowing various
combinations of elementary streams between these two TSs.
[0050]
FIG. 15 shows the way a Primary TS and a Secondary TS making
up the Out_of_MUX application are supplied to the decoder within the
BD-ROM playback apparatus. In the figure, among the internal
structural components of the BD-ROM playback apparatus, a BD-ROM
drive, a local storage and a network are shown on the left side while
the decoder is shown on the right side. A PID Filter that performs
stream demultiplexing is shown in the center. The Primary TS (Video 1,
Audio 1 (English), Audio 2 (Spanish), PG 1 (English Subtitle), IG 1
(English Menu)) and the Secondary TS (Audio 2 (Japanese), Audio 3
(Korean), PG 2 (Japanese Subtitle), PG 3 (Korean Subtitle), IG 2
(Japanese Menu), IG 3 (Korean Menu)) in the figure are transport
streams supplied from the BD-ROM and the local storage, respectively.
Since only English (Audio 1) and Spanish (Audio 2) are recorded on
the disk, a Japanese-dubbed version, for example, cannot be selected
on the disk. However, by downloading, to the local storage, the
Secondary TS which includes the Japanese-dubbed version (Audio 2)
provided by the content provider, the Japanese-dubbed audio (Audio 2),
Japanese subtitle (PG 2), and Japanese menu screen (IG 2) can be sent
to the decoder. Herewith, the user is able to select any of the
Japanese-dubbed audio (Audio 2), Japanese subtitle (PG 2), and
Japanese menu screen (IG 2), and play it back with the video (Video
1).
[0051]
The Out_of_MUX application allows the user to freely make a
selection on an audio and a subtitle under the condition that the
selection can be made for up to one for each type of the elementary
streams that are stored in the two TSs to be played back
simultaneously (in other words, up to one video, one audio, one
subtitle and one menu stored in the Primary and Secondary TSs).
Any BD-ROM playback apparatus is able to decode a Primary TS,
but cannot decode two TSs simultaneously. Accordingly, the
introduction of the Out_of_MUX application without restriction would
cause an increase in the size of the hardware and/or a large addition
of software, which results in an increase in the cost of BD-ROM
playback apparatuses. Therefore, when it comes to the realization of
the Out_of_MUX application, whether the Out_of_MUX application can be
realized on resources capable of decoding only a Primary TS is a key
issue.
[0052]
The limitation of allowing for playback of up to one for each
type of the elementary streams can be assumed as "replacing" the
elementary streams of Primary TS with those of the Secondary TS.
Herewith, the Out_of_MUX application can be realized on resources
capable of decoding only a single TS, avoiding an increase in costs
of the decoders. According to the example of the figure, the audio
stream, subtitle stream (PG), and menu stream (IG) of the Primary TS
are replaced with those of the Secondary TS.
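The "replacement" view of the Out_of_MUX application can be sketched as follows: at most one elementary stream per type is sent to the decoder, and a stream chosen from the Secondary TS takes the place of the stream of the same type in the Primary TS. The function name, the dictionary layout and the type labels are assumptions made for illustration.

```python
STREAM_TYPES = ("video", "audio", "PG", "IG")


def build_selection(primary_defaults, replacements):
    """Return {type: (source TS, stream name)} with at most one stream
    per type. A stream listed in `replacements` (from the Secondary TS)
    replaces the Primary TS stream of the same type."""
    selection = {}
    for stream_type in STREAM_TYPES:
        if stream_type in replacements:
            selection[stream_type] = ("Secondary TS", replacements[stream_type])
        elif stream_type in primary_defaults:
            selection[stream_type] = ("Primary TS", primary_defaults[stream_type])
    return selection


# Stream names follow the example of FIG. 15.
selection = build_selection(
    {"video": "Video 1", "audio": "Audio 1 (English)",
     "PG": "PG 1 (English Subtitle)", "IG": "IG 1 (English Menu)"},
    {"audio": "Audio 2 (Japanese)", "PG": "PG 2 (Japanese Subtitle)",
     "IG": "IG 2 (Japanese Menu)"},
)
```

Since the result never holds more than one stream per type, the decoder sees the same number of elementary streams as when it decodes a Primary TS alone.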
[0053]
The Secondary TS may be input not only from a built-in HDD,
such as the above-mentioned local storage, but also from a flash
memory, a primary storage memory, and an HDD via a network, or by
streaming via a direct network. For ease of explanation, assume that
the Secondary TS is supplied from a built-in HDD like one shown in
FIG. 1.
Clip information (00003.clpi, 00004.clpi) in the local storage
has the same data structure as Clip information recorded in the
BD-ROM. Here, TS_Recording_Rate of Clip information in the local
storage is set to be the same as the bit rate for reading the AVClip
from the BD-ROM. That is, TS_Recording_Rate written in Clip
information of a SubClip is the same as TS_Recording_Rate written in
Clip information of a MainClip. If TS_Recording_Rate of a MainClip
is different from TS_Recording_Rate of a SubClip, the data rate for
transmission from each source depacketizer to the buffer changes
according to which TS is transmitted. This fails to establish the
assumption that the Out_of_MUX application can be regarded as one
input TS.
[0054]
In addition, since the elementary streams to be played back
are freely selected from two TSs, all the source depacketizer and the
buffer in the decoder are set for a Primary TS bit rate when an audio
of the Primary TS is selected, and all the source depacketizer and
the buffer in the decoder are set for a Secondary TS bit rate when an
audio of the Secondary TS is selected. This makes processes and
verification of the playback apparatus cumbersome and complicated.
Next is described PlayList information in the local storage
200. A file (00002.mpls) to which extension "mpls" is attached is
information that defines a group made by binding up two types of
playback paths called MainPath and Subpath as Playlist (PL). FIG. 16
shows the data structure of the PlayList information. As shown in
the figure, the PlayList information includes: MainPath information
(MainPath()) that defines MainPath; PlayListMark information
(PlayListMark()) that defines a chapter; and Subpath information
(Subpath()) that defines Subpath. The internal structures of the
PlayList information and PlayItem information are the same as those
in the BD-ROM, and therefore their descriptions are omitted here.
The following describes the Subpath information.
Whereas the MainPath is a playback path defined for the
MainClip which is a main video, the Subpath is a playback path
defined for the SubClip which synchronizes with the MainPath.
[0055]
FIG. 17 shows a close-up of the internal structure of the
Subpath information. As indicated by the arrow hc0 in the figure,
each Subpath includes "SubPath_type" indicating a type of the SubClip
and one or more pieces of SubPlayItem information (...SubPlayItem()...).
The lead line hc1 in the figure indicates a close-up of the
structure of SubPlayItem information. As indicated by the arrow hc1
in the figure, SubPlayItem information includes:
"Clip_information_file_name"; "Clip_codec_identifier";
"SP_connection_condition"; "ref_to_STC_id[0]"; "SubPlayItem_In_time";
"SubPlayItem_Out_time"; "sync_PlayItem_id"; and
"sync_start_PTS_of_PlayItem".
[0056]
The "Clip_information_file_name" is information that uniquely
specifies a SubClip corresponding to SubPlayItem by describing a file
name of the Clip information.
The "Clip_codec_identifier" indicates an encoding system of the
AVClip.
The "SP_connection_condition" indicates a state of connection
between SubPlayItem (current SubPlayItem) and
SubPlayItem (previous SubPlayItem) immediately preceding
SubPlayItem (current SubPlayItem).
[0057]
The "ref_to_STC_id[0]" uniquely indicates an STC_Sequence at
which PlayItem aims.
The "SubPlayItem_In_time" is information indicating a start
point of SubPlayItem on the playback time axis of the SubClip.
The "SubPlayItem_Out_time" is information indicating an end
point of SubPlayItem on the playback time axis of the SubClip.
[0058]
The "sync_PlayItem_id" is information uniquely specifying, from
among PlayItems making up the MainPath, PlayItem with which
SubPlayItem synchronizes. The "SubPlayItem_In_time" is present on the
playback time axis of PlayItem specified with the sync_PlayItem_id.
The "sync_start_PTS_of_PlayItem" indicates, with a time
accuracy of 45 kHz, where the start point of SubPlayItem specified by
SubPlayItem_In_time is present on the playback time axis of PlayItem
specified with the sync_PlayItem_id.
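The synchronization defined by sync_PlayItem_id and sync_start_PTS_of_PlayItem can be sketched as a time mapping. The sketch assumes that all four time fields are expressed in ticks of the same 45 kHz clock; the text states this accuracy only for sync_start_PTS_of_PlayItem, so the common unit, like the class and function names, is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

CLOCK_HZ = 45_000  # time accuracy stated for sync_start_PTS_of_PlayItem


@dataclass
class SubPlayItem:
    in_time: int                      # SubPlayItem_In_time (SubClip time axis)
    out_time: int                     # SubPlayItem_Out_time (SubClip time axis)
    sync_play_item_id: int            # PlayItem to synchronize with
    sync_start_pts_of_play_item: int  # start point on that PlayItem's time axis


def subclip_time_at(sub: SubPlayItem, play_item_id: int,
                    play_item_time: int) -> Optional[int]:
    """Return the SubClip time to present at play_item_time, or None when
    the SubPlayItem is not active at that point."""
    if play_item_id != sub.sync_play_item_id:
        return None
    elapsed = play_item_time - sub.sync_start_pts_of_play_item
    if elapsed < 0:
        return None
    subclip_time = sub.in_time + elapsed
    return subclip_time if subclip_time < sub.out_time else None
```

In this model a SubPlayItem that starts 5 seconds into PlayItem #1 presents its own second 2 when the PlayItem reaches second 7.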
[0059]
Objects>
Here, the three objects mean SubClips in the local storage 200,
PlayList information in the local storage 200 and the MainClip in the
BD-ROM.
FIG. 18 shows the relationship among SubClips in the local storage
200, PlayList information in the local storage 200 and the MainClip
on the BD-ROM. Level 1 of the figure indicates SubClips present in
the local storage 200. As shown in Level 1, there are different
types of Secondary TS in SubClips of the local storage 200: an audio
stream, a PG stream and an IG stream. Any one of them is used as a
SubPath for the synchronous playback.
[0060]
Level 2 indicates two time axes defined by PlayList
information. The lower time axis in Level 2 is a PlayList time axis
defined by PlayItem information and the upper time axis is
SubPlayItem time axis defined by SubPlayItem.
As shown in the figure, it can be seen that
SubPlayItem_Clip_information_file_name of SubPlayItem information
plays a role of selecting, from among .m2ts files storing SubClips,
a .m2ts file as a target for the playback section.
[0061]
SubPlayItem_In_time and SubPlayItem_Out_time play roles in defining
the start point and end point of the playback section.
The arrow Sync_PlayItem_Id plays a role in specifying which
PlayItem is synchronized with SubPlayItem. The
sync_start_PTS_of_PlayItem plays a role in determining a time point
of SubPlayItem_In_time on the PlayList time axis.
[0062]
Thus concludes the description of the SubPath information.
A feature of the PlayList information in the local storage 200
is an STN_Table. The following describes PlayList information in the
local storage 200.
The STN_table is a table showing at least one combination of
elementary streams that are allowed to be played back simultaneously.
The combination of elementary streams has been selected from
multiple elementary streams multiplexed into a MainClip specified by
Clip_Information_file_name of PlayItem information as well as multiple
elementary streams multiplexed into a SubClip specified by
Clip_Information_file_name of SubPlayItem information. Such multiple
elementary streams allowed to be played back simultaneously in the
STN_table in the PlayList information form the so-called "system
stream".
[0063]
Specifically speaking, the STN_table is formed by associating
a Stream_entry of each of the multiple elementary streams multiplexed
into the MainClip and those multiplexed into the SubClip with a
Stream_attribute.
FIG. 19A shows an internal structure of the STN_table. As
shown in the figure, the STN_table includes multiple pairs of an
entry and an attribute (entry-attribute), and has a data structure
showing the count of these entry-attribute pairs
(number_of_video_stream_entries, number_of_audio_stream_entries,
number_of_PG_stream_entries, number_of_IG_stream_entries).
[0064]
The entry-attribute pairs respectively correspond to each of
the video streams, audio streams, PG streams and IG streams that can
be played back in PlayItem, as shown by the symbol of "{" in the
figure.
The following describes the details of the entry-attribute.
[0065]
FIG. 19B shows a Stream_attribute corresponding to a video
stream.
The Stream_attribute of the video stream includes
"Video_format" indicating a display format of the video stream and
"frame_rate" indicating a frequency for displaying the video stream.
FIG. 19C shows a Stream_attribute corresponding to an audio
stream.
The Stream_attribute of the audio stream is composed of:
"stream_coding_type" indicating an encoding method of the audio
stream; "audio_presentation_type" indicating a channel structure of
the corresponding audio stream; "Sampling_frequency" indicating a
sampling frequency of the corresponding audio stream; and
"audio_language_code" indicating a language attribute of the audio
stream.
[0066]
FIG. 19D shows a Stream_entry of the video stream. As shown
in the figure, the Stream_entry of the video stream includes
"ref_to_Stream_PID_of_Main_Clip" indicating a PID used for
demultiplexing the video stream.
Stream_entry of an audio stream, an IG stream and a PG
stream multiplexed into a MainClip has a format shown in FIG. 19D.
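The shape of the STN_table described above (entry-attribute pairs grouped by stream type, plus a count per type) can be modeled with a small sketch. The class names, the "in_sub_clip" flag and the PID values in the usage note are hypothetical; only the count field names and the idea of pairing a Stream_entry with a Stream_attribute come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class StreamEntry:
    pid: int                   # PID used for demultiplexing the stream
    in_sub_clip: bool = False  # hypothetical flag: stream belongs to a SubClip


@dataclass
class STNTable:
    # Each list holds (StreamEntry, attribute dict) pairs.
    video: list = field(default_factory=list)
    audio: list = field(default_factory=list)
    pg: list = field(default_factory=list)
    ig: list = field(default_factory=list)

    def counts(self) -> dict:
        """Counts of entry-attribute pairs, named as in the STN_table."""
        return {
            "number_of_video_stream_entries": len(self.video),
            "number_of_audio_stream_entries": len(self.audio),
            "number_of_PG_stream_entries": len(self.pg),
            "number_of_IG_stream_entries": len(self.ig),
        }

    def is_allowed(self, stream_type: str, pid: int,
                   in_sub_clip: bool = False) -> bool:
        """True if the stream is registered, i.e. allowed to be played back."""
        pairs = getattr(self, stream_type)
        return any(entry.pid == pid and entry.in_sub_clip == in_sub_clip
                   for entry, _ in pairs)
```

A playback apparatus following this model would consult is_allowed before routing a PID to the decoder, so that only streams registered in the STN_table are selected.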
[0067]
Be Played Back>
The STN_table shows, among elementary streams read from the
BD-ROM and the local storage, ones allowed to be played back.
However, if such an STN_table allows elementary streams to be played
back with no restriction, the decoder system may break down.
[0068]
The reason for this is as follows. According to the MPEG2
decoder system standard, an overlap between TS packets on the ATC
time axis in one transport stream is not allowed. This is a basic
principle in order to cause the decoder system to perform a proper
decoding process. On the other hand, in the case where both playback
of a stream read from the BD-ROM and playback of a stream read from
the local storage are allowed, and then playback of an AVClip read
from the BD-ROM and playback of an AVClip read from the local storage
are performed simultaneously, an overlap is created between a TS
packet from the BD-ROM and a TS packet from the local storage.
[0069]
Given this factor, the following restriction is imposed on
decoding elementary streams.
The decoding elementary streams are a video stream, an audio
stream, a PG stream and an IG stream that have been allowed in the
STN_table to be played back and have been selected for simultaneous
playback. Some decoding elementary streams are read from the local
storage and others are read from the BD-ROM.
[0070]
The restriction imposed on the decoding elementary streams is
that the bit amount of TS packets (Decoding TS packets) constituting
an AVClip (MainClip, SubClip) that includes elementary streams
allowed in the STN_table to be simultaneously played back but does
not include elementary streams not allowed to be played back must be
48 Mbits/second or less.
The unit time of one second is called the "Window", and can be
located at any point on the time axis of the ATC Sequence. That is
to say, the bit amount of the decoding elementary streams during one
second at any point must be 48 Mbits or less.
[0071]
FIG. 20 shows TS packets read from the BD-ROM and from the
local storage, and illustrates, of these TS packets, ones to be
supplied to the decoder. Level 1 of the figure shows multiple TS
packets read from the BD-ROM; Level 3 shows multiple TS packets read
from the local storage. Among the TS packets in Levels 1 and 3,
hatched ones in the figure are TS packets constituting a decoding
elementary stream (Decoding TS packets). Level 2 in the figure shows,
of the Decoding TS packets shown in Levels 1 and 3, ones occurring in
a period of one second. As has been described above, according to
the MPEG2 decoder system standard, an overlap is not allowed between
TS packets on the ATC time axis in one transport stream. However, it
can be seen from the figure that overlaps rp1, rp2 and rp3 between TS
packets occur on the ATC time axis. Thus, overlaps in the TS packet
operations are allowed in the unit time of the Window. However,
another requirement, which is not applied to the MPEG2 decoder system
standard, is imposed. That is the above-mentioned restriction of 48
Mbits/Window or less. Level 4 presents mathematical expressions of
the condition that the Decoding TS packets must satisfy. The
mathematical expressions indicate that a value obtained by converting
the count of the above-mentioned Decoding TS packets into a bit count
(the count of the Decoding TS packets is multiplied by the number of
bytes of a TS packet, 188, and the result is multiplied by 8 to
express it in bits) is 48 Mbits or less.
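The conversion described for Level 4 can be written out as a short sketch. It assumes that 1 Mbit means 1,000,000 bits, which the text does not state explicitly; the function names are hypothetical.

```python
TS_PACKET_BYTES = 188
BITS_PER_BYTE = 8
WINDOW_LIMIT_BITS = 48_000_000  # 48 Mbits, assuming 1 Mbit = 10**6 bits


def decoding_bits(packet_count: int) -> int:
    """Bit amount of a given number of Decoding TS packets."""
    return packet_count * TS_PACKET_BYTES * BITS_PER_BYTE


def within_window_limit(packet_count: int) -> bool:
    """True if the Decoding TS packets in one Window fit in 48 Mbits."""
    return decoding_bits(packet_count) <= WINDOW_LIMIT_BITS


# Largest packet count that satisfies the condition under this assumption.
MAX_PACKETS_PER_WINDOW = WINDOW_LIMIT_BITS // (TS_PACKET_BYTES * BITS_PER_BYTE)
```

Under the stated assumption each TS packet contributes 1504 bits, so a Window may hold at most 31914 Decoding TS packets.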
[0072]
Imposing the above-mentioned condition on the Decoding TS
packets in any period of one second is the restriction of the bit
amount according to the present embodiment. When the authoring is
performed for the Out_of_MUX application, it is checked whether the
bit amount of a Decoding TS packet over the period of one second is
48 Mbits or less while keeping the Window shifting on the Source
Packet sequence by one packet each time. When the limitation is
satisfied, the Window is shifted to the next TS packet. If the
limitation is not satisfied, it is determined that there is a
violation of the BD-ROM standard. When the Out_Time of the Window
reaches the last Source Packet after the repetition of such shifts,
it is determined that the Source Packets conform to the BD-ROM
standard.
[0073]
An ATS having a time accuracy of 27 MHz is attached to each TS
packet. Coordinates on the ATC time axis have a time accuracy of
1/27,000,000 second; however, an ATS is not always present at each
coordinate on the ATC time axis. On the ATC time axis, periods
having no ATS and periods having an ATS appear in an irregular manner.
The occurrence of ATSs is varied, and therefore when the Window is
shifted, how to adjust the Out_Time of the Window becomes an issue in
the case where an ATS is absent 1 second after the In_Time.
[0074]
The Out_Time of the Window is, in principle, set to be 1
second after the In_Time. Here, if an ATS is present, on the ATC
time axis, at a coordinate corresponding to 1 second after the
In_Time, the coordinate of the In_Time + 1 second is set as the
Out_Time. If an ATS is absent at the coordinate corresponding to 1
second after the In_Time, a coordinate at which an ATS appears on the
ATC time axis for the first time after the In_Time + 1 second is set
as the Out_Time. Since the Out_Time of the Window shifting is
adjusted by taking into account time periods during which no ATS is
present, a different bit value is calculated each time when the
Window shifts. The In_Time is shifted by one TS packet each time,
and the Out_Time is adjusted in accordance with the shift, and
thereby the transition of the bit values in the ATC time axis can be
calculated with precision.
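The Window shift with Out_Time adjustment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the verification procedure of the embodiment: the input is a sorted list of the ATS values (27 MHz ticks) of the Decoding TS packets only, the packet at the Out_Time is counted as part of the Window, 1 Mbit is taken as 1,000,000 bits, and the function and parameter names are hypothetical.

```python
from bisect import bisect_left

ATC_HZ = 27_000_000          # ATS time accuracy
TS_PACKET_BITS = 188 * 8     # bits per TS packet
LIMIT_BITS = 48_000_000      # 48 Mbits per Window (assumed 10**6 bits/Mbit)


def check_windows(ats_list, window_ticks=ATC_HZ, limit_bits=LIMIT_BITS):
    """Shift the Window one packet at a time over sorted ATS values.

    Returns the index of the In_Time packet of the first violating Window,
    or None if every Window satisfies the limit.
    """
    last = len(ats_list) - 1
    for i, in_time in enumerate(ats_list):
        # Out_Time: the packet at In_Time + 1 second if one exists there,
        # otherwise the first packet appearing after that coordinate.
        j = min(bisect_left(ats_list, in_time + window_ticks), last)
        if (j - i + 1) * TS_PACKET_BITS > limit_bits:
            return i
        if j == last:
            break  # the Out_Time has reached the last packet
    return None
```

Because bisect_left returns the first packet at or after In_Time + 1 second, the same call covers both the case where an ATS is present at that coordinate and the case where it is absent.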
[0075]
FIGs. 21A-21D show the shifts of the Window. In each of FIGs.
21A to 21D, the upper part shows a Source Packet sequence which is a
target for verification, and the lower part shows the In_Time and
Out_Time of the Window. In FIG. 21A, the In_Time of the Window
specifies a Source Packet #i. A TS packet #j corresponding to 1
second after the In_Time of the Window is set as the Out_Time of the
Window.
In FIG. 21B, the In_Time of the Window specifies a Source
Packet #i+1. On the other hand, no ATS is present at the coordinate
corresponding to the Source Packet #j+1, which is 1 second after the
In_Time of the Window. The Out_Time of the Window of FIG. 21B should
specify one TS packet beyond the TS packet #j; however, since a
Source Packet is not present immediately after the TS packet #j, the
bit rate of the Window of FIG. 21B becomes smaller than the bit rate
of the Window of FIG. 21A. In such a case, there is no point in
performing the check for the Window of FIG. 21B. Given this factor, by
adjusting the Out_Time of the Window, a TS packet #j+2, which appears
for the first time after 1 second from the In_Time of the Window, is
set as the Out_Time. Setting the Out_Time in this way makes the check
of the Window of FIG. 21B worth performing.
[0076]
In FIG. 21C, the In_Time of the Window specifies a Source
Packet #i+2. On the other hand, the TS packet #j+2 is located at a
position corresponding to 1 second after the In_Time of the Window.
The count of the TS packets for the Window of FIG. 21C is the same as
that for the Window of FIG. 21B, and therefore there is no point in
performing the check. Accordingly, no check is performed in FIG. 21C,
and the In_Time of the Window is shifted.
In FIG. 21D, the In_Time of the Window specifies a Source
Packet #i+3. On the other hand, no Source Packet is present at a
position corresponding to a Source Packet #j+3, which is 1 second
after the In_Time of the Window. Given this factor, by adjusting the
Out_Time of the Window in a manner described above, a TS packet #j+4,
which appears for the first time after 1 second from the In_Time of
the Window is set as the Out_Time. Herewith, the count of the TS
packets in the Window becomes different from that for the Window of
FIG. 21B, and the check of the Window of FIG. 21D is made to be worth
performing.
[0077]
By performing the bit amount check with the Window shift in
the above described manner when the authoring is carried out, it is
guaranteed that no underflow or overflow is caused when TS packets
are read from the local storage and the BD-ROM and supplied to the
decoder.
The assurance of the Window shift is described next with
reference to specific examples of FIGs. 22-26.
Level 1 in FIG. 22 is a graph showing temporal transition
regarding the data amount of TS packets read from the BD-ROM as well
as the data amount of TS packets read from the local storage. The
horizontal axis is time and the vertical axis is transmission amounts
in relation to each point on the time axis. In the graph, the bit
amounts at the time when TS packets are being read from the BD-ROM
and the local storage undergo a transition as indicated by the dashed
curves.
[0078]
Level 2 in FIG. 22 shows the total data amount of, from among
the TS packets read from the BD-ROM and the local storage, TS packets
which are to be supplied to the decoder. The temporal transition of
the total transmission amount is as shown by the solid curve. The
total data amount is the sum amount of TS packets belonging to
streams that have been allowed in the STN_table. In the worst case,
the total transmission amount would reach close to 96 Mbits and TS
packets having this data amount would be supplied to the decoder.
Here, the time axis of the graph is divided into seven Windows, and a
comparison is made between the supply amount in each Window and the
transmittable amount for each Window.
[0079]
Level 3 in FIG. 22 is the graph of Level 2 divided for
every 1 second. FIGs. 23A and 23B show the comparison between the
transmittable amount and the amount supplied to the decoder for each
Window. The transmittable amount for a Window is 48 Mbits per second,
which corresponds to a bit rate of 96 Mbps if the amount is transmitted
within 0.5 seconds.
A hatching pattern pn1 in the figure indicates the data amount
supplied to the decoder. A hatching pattern pn2 in the figure
indicates the transmittable amount in each Window. In any Window,
the portion with the hatching pattern pn1 has the same or smaller
area than the portion with the hatching pattern pn2. This indicates
that the data amount supplied from the BD-ROM and the local storage
is limited to the transmittable amount or less in any Window.
43
[0080]
At any point on the ATC time axis, the transmittable amount to
the decoder is 48 Mbits/second or less. Therefore, even if the
transmission rate to the decoder locally reaches close to 96
Mbits/second, transmission at 96 Mbits/second cannot continue for
longer than 0.5 seconds, as evidenced by the calculation of
48 Mbits = 96 Mbits/second x 0.5 seconds.
Accordingly, if the decoder performs a prior read operation to read
in advance Source Packets from the BD-ROM and the local storage
before the peak is reached, no underflow or overflow is caused in the
buffer of the decoder.
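As an illustration only, the bit amount check with the Window shift can be sketched in Python as follows. The function name window_violations, the representation of the supply as a sorted list of arrival times in seconds, and the choice to open one Window at every packet are assumptions made for this sketch; they are not taken from the embodiment.

```python
from bisect import bisect_left

TS_PACKET_BITS = 188 * 8      # one TS packet is 188 bytes
LIMIT_BITS = 48_000_000       # transmittable amount per Window (48 Mbits)


def window_violations(arrival_times, window=1.0, limit=LIMIT_BITS):
    """Return the start times of Windows whose supply exceeds the limit.

    arrival_times: sorted times (seconds on the ATC time axis) at which
    TS packets from the BD-ROM and the local storage reach the decoder.
    Assumption: a Window is opened at every packet, imitating the shift.
    """
    violations = []
    for i, start in enumerate(arrival_times):
        # Index of the first packet that falls outside [start, start + window).
        end = bisect_left(arrival_times, start + window, lo=i)
        if (end - i) * TS_PACKET_BITS > limit:
            violations.append(start)
    return violations


# 96 Mbits/second sustained for 0.5 seconds fills exactly one Window.
assert 96_000_000 * 0.5 == LIMIT_BITS
```

An empty result would indicate that no Window exceeds the transmittable amount. An authoring tool would presumably run such a check over the TS packets of all streams allowed in the STN_table.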
[0081]
The transmittable amount in each Window, i.e. 48 Mbits/second,
has been determined using, as a guide, an amount that a decoder
complying with MPEG can read in advance into the buffer. If the
amount of data that can be read in advance into the buffer is larger,
the data amount per second can be made larger, or the period for the
Window can be set longer. Thus, the present invention is not limited
to the rate of 48 Mbits/second.
[0082]
Thus concludes the description of the restriction of the data
amount on a Secondary TS that is allowed in the STN_table to be
played back.
<sp_connection_condition Information>
The following describes settings of connection_condition
information in PlayItem and sp_connection_condition information in
SubPlayItem for realizing the Out_of_MUX application. The fields of
connection_condition information and sp_connection_condition
information can take values of "1", "5", and "6", the meanings of
which are as follows.
[0083]
connection_condition=1 (CC = 1): There is no guarantee for a
seamless connection between PlayItem (current PlayItem) and the
immediately preceding PlayItem (previous PlayItem). That is, it is a
connection mode that allows a freeze to occur and the playback to be
interrupted (non-seamless connection).
connection_condition=5 (CC = 5): There is a guarantee for a
seamless connection between a video stream, a PG stream and an IG
stream multiplexed into the MainClip of the current PlayItem and a
video stream, a PG stream and an IG stream multiplexed into the
MainClip of the previous PlayItem. On the other hand, this is not
the case with an audio stream multiplexed into the MainClip.
[0084]
connection_condition=6 (CC = 6): Respective TS streams
belonging to the current PlayItem and to the previous PlayItem
are logically continuous (they are continuous on the
time axis, and the encoding methods are also the same), and there is
a guarantee for a seamless connection of both audio and video streams.
sp_connection_condition information written in SubPlayItem#n
can be defined as follows.
[0085]
sp_connection_condition information (SP_CC = 1): There is no
guarantee for a seamless connection between SubPlayItem (current
SubPlayItem) and the immediately preceding SubPlayItem (previous
SubPlayItem).
sp_connection_condition information (SP_CC = 5): There is a
guarantee for a seamless connection between a PG stream and an IG
stream multiplexed into the SubClip of the current SubPlayItem and a
PG stream and an IG stream multiplexed into the SubClip of the
previous SubPlayItem. On the other hand, this is not the case with
an audio stream multiplexed into the SubClip.
[0086]
sp_connection_condition information (SP_CC = 6): Respective
TS streams belonging to the current SubPlayItem and to the previous
SubPlayItem are logically continuous (they are
continuous on the time axis, and the encoding methods are also the
same), and there is a guarantee for a seamless connection.
SubPlayItem to be set for PlayItem that realizes the
Out_of_MUX application should not cause discordance even if a video
stream, an audio stream, a PG stream or an IG stream of SubPlayItem
is within PlayItem. Therefore, they have identical connection
conditions. That is, if PlayItem#1 and PlayItem#2 are connected by
CC = 1, SubPlayItem#1 and SubPlayItem#2 corresponding to them are also
connected by SP_CC = 1. Similarly, if PlayItem#1 and PlayItem#2 are
connected by CC = 5, the corresponding SubPlayItem#1 and SubPlayItem#2
are connected while satisfying the condition of SP_CC = 5.
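The rule that PlayItem#n and SubPlayItem#n carry identical connection conditions could be verified at authoring time with a check such as the following Python sketch. The Item class and the function mismatched_connections are hypothetical names introduced here for illustration only; the pairing by position assumes one SubPlayItem per PlayItem.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    connection_condition: int   # 1, 5 or 6 (CC for PlayItem, SP_CC for SubPlayItem)


def mismatched_connections(play_items, sub_play_items):
    """Pair PlayItem#n with SubPlayItem#n and list the pairs whose values differ."""
    if len(play_items) != len(sub_play_items):
        raise ValueError("one SubPlayItem is expected per PlayItem")
    return [(p.name, s.name)
            for p, s in zip(play_items, sub_play_items)
            if p.connection_condition != s.connection_condition]
```

An empty list means that every pair satisfies the rule.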
The following describes the relationship of In_Times and
Out_Times of PlayItems and SubPlayItems constituting the Out_of_MUX
application as well as the detail of connection_condition information
with reference to FIGs. 24, 25 and 26.
[0087]
<Relationship of In_Times and Out_Times>
FIG. 24 shows a connection state of PlayItems and SubPlayItems
constituting the Out_of_MUX application. Level 1 of the figure is a
SubClip time axis; and Levels 2 and 3 are a SubPlayItem time axis and
a PlayList time axis, respectively. Level 4 is a MainClip time axis.
In the figure, in the case where connection_condition information of
PlayItem is "= 5", sp_connection_condition information of SubPlayItem
is also SP_CC = 5.
[0088]
FIG. 25 shows the relationship between In_Times and Out_Times
of PlayItems and In_Times and Out_Times of SubPlayItems in the case
where connection_condition information of PlayItem and
sp_connection_condition information of SubPlayItem shown in FIG. 24
are set to "= 5". Levels 1 and 4 are the same as those in FIG. 24.
Of two PlayItems (PlayItem information #1 and PlayItem information
#2) shown in FIG. 24, PlayItem information #1 has In_Time indicating
a time point t1 and has Out_Time indicating a time point t2. In_Time
of PlayItem information #2 indicates a time point t3, and Out_Time of
PlayItem information #2 indicates a time point t4.
[0089]
When the connection state of PlayItem is CC = 5,
Sync_Start_Pts_of_PlayItem of SubPlayItem indicates the same time
point as In_Time of PlayItem. In_Time and Out_Time of SubPlayItem
show the same time points as In_Time and Out_Time of PlayItem. Thus,
in the case where connection_condition information of PlayItem is "=
5", sp_connection_condition information of SubPlayItem is also set to
"= 5", and In_Time and Out_Time of PlayItem indicate the same time
points as In_Time and Out_Time of SubPlayItem.
[0090]
In_Time and Out_Time of PlayItem and In_Time and Out_Time of
SubPlayItem respectively refer to PTSs of a Video Presentation Unit
and an Audio Presentation Unit. In_Time and Out_Time of PlayItem and
In_Time and Out_Time of SubPlayItem matching each other means that
PTS values of the Video Presentation Unit and Audio Presentation Unit
referred to by In_Time and Out_Time of PlayItem are the same as PTS
values of the Video Presentation Unit and Audio Presentation Unit
referred to by In_Time and Out_Time of SubPlayItem. In this case, it
is necessary that Primary TS and Secondary TS be encoded so as
to have the same length of time and to cause PTSs of the Video
Presentation Unit and Audio Presentation Unit to be the same when the
authoring is performed. Creating Primary TS and Secondary TS in this
way is also a condition for realizing CC = 5 and SP_CC = 5.
[0091]
FIG. 26 shows an STC value to be referred to when the part
from In_Time to Out_Time of PlayItem is played back and an
STC value to be referred to when the part from In_Time to
Out_Time of SubPlayItem is played back. Levels 2 and 3 are the same
as those in the previous figure. Level 1 shows, in graph format, an
STC value to be referred to when the part from In_Time to
Out_Time of SubPlayItem is played back. Level 4 shows, in graph
format, an STC value to be referred to when the part from
In_Time to Out_Time of PlayItem is played back. The horizontal axis
of Level 1 is a time axis, and the vertical axis shows STC values in
relation to each time point on the time axis. The STC values of
Level 1 include a monotonic increase zk1 from In_Time to Out_Time of
SubPlayItem information #1 and a monotonic increase zk2 from In_Time
to Out_Time of SubPlayItem information #2. The STC values of Level 4
include a monotonic increase zk3 from In_Time to Out_Time of PlayItem
information #1 and a monotonic increase zk4 from In_Time to Out_Time
of PlayItem information #2.
[0092]
As In_Time of PlayItem indicates the same time point as
In_Time of SubPlayItem, the initial values of the STCs in the above
graph are the same and the STC values at intermediate time points are
also the same. That is, STC2(i), which is an STC value to be
referred to when a Source Packet located at an arbitrary time point
i between In_Time and Out_Time of PlayItem is supplied to the decoder,
is the same as STC1(i), which is an STC value to be referred to when
a Source Packet located at the same time point i between In_Time and
Out_Time of SubPlayItem is supplied to the decoder. When the STC
values are the same, all the STC counters in the apparatus have to do
is to create the same clock values and supply them to the
demultiplexing units, thus simplifying controls on the playback
apparatus.
[0093]
Hypothetically speaking, in the case where two or more
SubPlayItems are prepared for one PlayItem, contrary to the controls
illustrated in FIGs. 25 and 26, the video and audio are interrupted
at the boundary of these SubPlayItems, and inconveniences, such as
playback suspension in the middle of PlayItem, will result.
Additionally, when a process of replacing a Primary TS with a
Secondary TS is realized in the Out-of-MUX application, the STC time
axis has to be changed at the replacement, which complicates
the synchronous controls on the playback apparatus.
On the other hand, by defining both In_Time and Out_Time of PlayItem
or SubPlayItem on a continuous STC time axis, it is possible to
prevent the above-mentioned inconveniences, i.e. an interruption of
video and audio and replacement of transport streams. For these
reasons, with respect to one PlayItem, one SubPlayItem having the
same start and end points as those of the PlayItem is assigned.
[0094]
Here, an exact match between In_Time and Out_Time of PlayItem
and those of SubPlayItem is not required, and some degree of error
can be allowed. The errors of In_Time and Out_Time are described
next.
STC times of In_Time and Out_Time of PlayItem are set for
video frames of PlayItem. On the other hand, STC times of In_Time
and Out_Time of SubPlayItem are set for audio frames of SubPlayItem.
This is because SubPlayItem is mainly used for commentary and
therefore it is often the case that a video stream is not multiplexed
thereinto. In this case, strictly speaking, because the playback
periods of the respective presentation units differ in length, their
start and end times do not match each other. Accordingly, it is
necessary to allow an error of, at least, less than one frame. The
start and end times of PlayItem#n and SubPlayItem#n are also
specified on the same STC time axis as follows:
|(PlayItem#n.Out - PlayItem#n.In) - (SubPlayItem#n.Out_time -
SubPlayItem#n.In_time)| < the playback period of 1 progressive frame
or two interlace fields of video having the shortest playback period
in PlayItem#n
Alternatively, the playback period of 1 progressive frame or two
interlace fields of video having the longest playback period in
PlayItem#n may be used, or the value can be set to be 1 second or less.
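The permitted error can be illustrated with the following Python sketch. It assumes, for illustration only, that all times are expressed in 90 kHz PTS ticks and that the error bound is the playback period of one video frame; the function name durations_match is hypothetical.

```python
PTS_HZ = 90_000   # assumption: times are given in 90 kHz PTS ticks


def durations_match(pi_in, pi_out, spi_in, spi_out, frame_ticks):
    """True if the PlayItem and SubPlayItem lengths differ by less than one frame.

    frame_ticks is the playback period of one progressive frame (or two
    interlace fields) of the video in the PlayItem, in the same tick unit.
    """
    return abs((pi_out - pi_in) - (spi_out - spi_in)) < frame_ticks


# Example: 25 frames per second gives a frame period of 3600 ticks.
FRAME_25FPS = PTS_HZ // 25
```

For instance, a PlayItem of 900000 ticks and a SubPlayItem of 901920 ticks differ by 1920 ticks, which is within one 25 fps frame period.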
Thus concludes the description of the relationship of In_Times
and Out_Times of PlayItem and SubPlayItem.
[0095]
The following describes connection_condition information and
sp_connection_condition information in detail. In order to satisfy
CC = 5 and SP_CC = 5, the following conditions have to be met at all
the levels of AV stream, transport stream, Video Presentation Unit
and Audio Presentation Unit, and elementary stream.
connection_condition information of the current PlayItem and
sp_connection_condition information being set to "5" means that there
is a "Clean Break" between the end point of an AV stream played back in
the previous PlayItem and the start point of the AV stream played
back in the current PlayItem.
[0096]
In order to realize Clean Break, the AV stream played back in
the previous PlayItem and the AV stream played back in the current
PlayItem must satisfy the following requirements.
(1) An unnecessary Access Unit is absent at the end point of
MainClip specified in the previous PlayItem, and an unnecessary
Access Unit having a PTS has been excluded from the period following
Out_Time of the previous PlayItem.
[0097]
Similarly, an unnecessary Access Unit is absent at the end
point of SubClip specified in the previous SubPlayItem, and an
unnecessary Access Unit having a PTS has been excluded from the
period following Out_Time of the previous SubPlayItem.
(2) At the start of the AV stream specified in the current
PlayItem, an unnecessary Access Unit having a PTS has been excluded
from the period prior to In_Time of the current PlayItem. In
addition, the first Audio Presentation Unit of MainClip includes a
Sample to be played back at In_Time on the STC time axis.
[0098]
Similarly, at the start of the AV stream specified in the
current SubPlayItem, an unnecessary Access Unit having a PTS has been
excluded from the period prior to In_Time of the current SubPlayItem.
In addition, the first Audio Presentation Unit of the SubClip
includes a Sample to be played back at In_Time on the STC time axis.
(3) Source Packets constituting the MainClip specified in the
previous PlayItem must be multiplexed in a manner that all of them
are taken into the decoder system before the first packet of the
MainClip specified in the current PlayItem is sent to the decoder.
[0099]
Similarly, data of the SubClip specified in the previous
SubPlayItem must be multiplexed in a manner that all the data is
taken into the decoder system before the first packet of the SubClip
specified in the current SubPlayItem is sent to the decoder.
Thus concludes the description of conditions that should be
satisfied at the level of the AV stream. Now, conditions that should
be satisfied at the level of transport streams are described.
Here, two Primary TSs that are targets of a seamless
connection when CC = 5 are called Primary TS1 and Primary TS2. Two
Secondary TSs that are targets of a seamless connection when SP_CC = 5
are called Secondary TS1 and Secondary TS2.
[0100]
FIG. 27 shows how TS1s and TS2s are identified in an AVClip
referred to in the previous PlayItem and the previous SubPlayItem and
in an AVClip referred to in the current PlayItem and the current
SubPlayItem. Level 4 in the figure shows Primary TS1 and Primary
TS2; and Level 3 shows MainClip1 of the previous PlayItem and
MainClip2 of the current PlayItem. Level 1 shows Secondary TS1 and
Secondary TS2; and Level 2 shows SubClip1 of the previous SubPlayItem
and SubClip2 of the current SubPlayItem.
[0101]
Primary TS1 is composed of a portion of data which is hatched
in MainClip1 in the figure. This data portion in MainClip1 starts
with a Source Packet from which decoding of In_Time in the previous
PlayItem can be started. This Source Packet is located at the
beginning of a Video Presentation Unit and an Audio Presentation Unit
that are referred to by In_Time. Then, the data portion ends with
the last packet of MainClip1.
[0102]
Primary TS2 is composed of a portion of data which is hatched
in MainClip2 in the figure. This data portion in MainClip2 starts
with the first Source Packet of MainClip2. Then, the data portion in
MainClip2 ends with a Source Packet at which decoding of the current
PlayItem is finished. This Source Packet is located
at the end of a Video Presentation Unit and an Audio Presentation
Unit that are referred to by Out_Time of the current PlayItem.
[0103]
Secondary TS1 is composed of a portion of data which is
hatched in SubClip1 in the figure. This data portion in SubClip1
starts with a Source Packet from which decoding of In_Time in the
previous SubPlayItem can be started. This Source Packet is located
at the beginning of a Video Presentation Unit and an Audio
Presentation Unit that are referred to by In_Time. Then, the data
portion ends with the last packet of SubClip1.
[0104]
Secondary TS2 is composed of a portion of data which is
hatched in SubClip2 in the figure. This data portion in SubClip2
starts with the first Source Packet of SubClip2. Then, the data
portion in SubClip2 ends with a Source Packet at which decoding of
the current SubPlayItem is finished. This Source Packet is located at
the end of a Video Presentation Unit and an Audio Presentation Unit
that are referred to by Out_Time of the current SubPlayItem.
[0105]
According to the description above, it can be understood how
two transport streams to be connected are arranged in a MainClip and
a SubClip when CC = 5 and SP_CC = 5. The MainClip of the previous
PlayItem must end with a Video Presentation Unit and an Audio
Presentation Unit that are referred to by Out_Time of the previous
PlayItem, and the MainClip of the current PlayItem must start with a
Video Presentation Unit and an Audio Presentation Unit which are
referred to by In_Time of the current PlayItem. This relationship is
also true for the previous SubPlayItem. That is, the SubClip of the
previous SubPlayItem must end with an Audio Presentation Unit which
is referred to by Out_Time of the previous SubPlayItem, and the
SubClip of the current SubPlayItem must start with an Audio
Presentation Unit which is referred to by In_Time of the current
SubPlayItem. This is because an unnecessary Audio Presentation Unit
should not be present at or after a Video Presentation Unit and an
Audio Presentation Unit which are referred to by Out_Time of the
previous SubPlayItem, as described above. On the other hand, the
SubClip of the previous SubPlayItem does not have to start with an
Audio Presentation Unit which is referred to by In_Time of the
previous SubPlayItem, and the SubClip of the current SubPlayItem also
does not have to end with an Audio Presentation Unit which is
referred to by Out_Time of the current SubPlayItem.
[0106]
According to FIGs. 24 and 27, Primary TS and Secondary TS must
be made to have the same length of time, and PTS values of the Video
Presentation Unit and Audio Presentation Unit must be made to have
the same value. In addition, the MainClip of the previous PlayItem
and the SubClip of the previous SubPlayItem must be multiplexed in such
a manner as to end with a Video Presentation Unit and an Audio
Presentation Unit corresponding to Out_Time. The MainClip of the
current PlayItem and the SubClip of the current SubPlayItem must be
multiplexed in such a manner as to start with a Video Presentation Unit
and an Audio Presentation Unit corresponding to In_Time.
[0107]
Additionally, these transport streams must meet the following
conditions:
the number of programs in TS1 and TS2 is one;
the number of video streams is one;
the number of audio streams is the same;
the content of STN_table of the previous PlayItem is the
same as that of STN_table of the current PlayItem; and
the playback period of the transport stream in each
PlayItem is three seconds.
These are the conditions that should be satisfied at the level
of transport streams for connecting two streams when CC = 5 and SP_CC
= 5. Now, conditions that should be satisfied at the level of a
Video Presentation Unit and an Audio Presentation Unit are described.
Although the end time of the last Video Presentation Unit in
the video stream of Primary TS1 is originally different from the start
time of the first Video Presentation Unit in the video stream of
Primary TS2, CC = 5 makes the end time and the start time match each
other. When the end time and start time of the Video Presentation
Units are made to match each other, how to handle such Video
Presentation Units and Audio Presentation Units for synchronous
playback becomes an issue. This is because video and audio have
different sampling frequencies, and the lengths of time of a Video
Presentation Unit and an Audio Presentation Unit do not match each
other.
[0108]
FIG. 28 shows details of CC = 5 and SP_CC = 5. Levels 1 to 3
show sp_connection_condition of SubPlayItem, and Levels 4 to 7 show
connection_condition of PlayItem. Level 4 shows multiple Video
Presentation Units of TS1 and TS2, and Level 5 shows Audio
Presentation Units in TS1 and Audio Presentation Units in TS2. Level
6 shows STC values in the MainClip. Level 7 shows a Source Packet
sequence of the MainClip.
[0109]
Hatched parts in the figure represent Video Presentation Units,
Audio Presentation Units, and Source Packets of TS1, while parts with
no shade represent Video Presentation Units, Audio Presentation Units,
and Source Packets of TS2.
In the figure, CC = 5 represents the state in which Video
Presentation Units are aligned to have a common boundary (Level 4),
there is a gap between ATCs in the MainClip (Level 7), and there is
an overlap between Audio Presentation Units in the MainClip (Level 5).
SP_CC = 5 represents the state in which there is a gap between ATCs
in the SubClip (Level 1), and there is an overlap between Audio
Presentation Units in the SubClip (Level 2).
[0110]
The above-mentioned boundary between Video Presentation Units
is located at, from the perspective of TS1, an end point
PTS1(1stEnd)+Tpp of the last Video Presentation Unit of Level 4, and
is located at, from the perspective of TS2, a start point
PTS2(2ndSTART) of the Video Presentation Unit of Level 4.
Assume that in TS1, the end point of an Audio Presentation
Unit corresponding to a boundary time point T4 is T5a, and in TS2,
the start point of an Audio Presentation Unit corresponding to the time
point T4 is T3a. Here, the overlap of Audio Presentation Units in
the MainClip extends from T3a to T5a.
[0111]
In the figure, each Audio Presentation Unit of the SubClip is
set longer than each Audio Presentation Unit of the MainClip. This
is because the audio stream of the SubClip is set to have a low
sampling frequency since it is supplied via a network, and
accordingly, the period of time for each Audio Presentation Unit
becomes longer. In the packet sequence of Level 1, there is a gap
similar to the one in Level 7. Also, in Audio Presentation Units of
Level 2, there is an overlap similar to the one in Level 5. Assume
that, in TS1 of the SubClip, the end point of an Audio Presentation Unit
corresponding to the boundary time point T4 is T5b, and in TS2 of the
SubClip, the start point of an Audio Presentation Unit corresponding to
the time point T4 is T3b. Here, the overlap extends from T3b to T5b.
[0112]
From the figure, it can be seen that, in order to realize CC =
5 and SP_CC = 5, the following four conditions must be met at the
levels of Video Presentation Units, Audio Presentation Units, and
packets.
(1) The last Audio Presentation Unit of the audio stream in
TS1 includes a sample having a playback time which coincides with the
end of the display period of the last picture in TS1 specified in the
previous PlayItem and the previous SubPlayItem.
[0113]
(2) The first Audio Presentation Unit of the audio stream in
TS2 includes a sample having a playback time which coincides with the
start of the display period of the first picture in TS2
specified in the current PlayItem and the current SubPlayItem.
(3) There is no gap at a connection point in the Audio
Presentation Unit sequence. This means that an overlap in the Audio
Presentation Unit sequence can occur at a connection point. However,
the extent of such an overlap must be shorter than the playback
period of two audio frames.
[0114]
(4) The first packet of TS2 includes a PAT, which can be
immediately followed by one or more PMTs. If a PMT is larger than a
payload of a TS packet, the PMT may be divided into two or more
packets. A TS packet storing a PMT may include a PCR and an SIT.
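Condition (3) above lends itself to a simple numeric check, sketched below in Python. The function name audio_connection_ok and the assumption that both time points have already been mapped onto one common time axis, in one tick unit, are introduced for this illustration only.

```python
def audio_connection_ok(ts1_last_end, ts2_first_start, frame_ticks):
    """Check condition (3) at the connection point.

    ts1_last_end:    end of the last Audio Presentation Unit of TS1 (e.g. T5a)
    ts2_first_start: start of the first Audio Presentation Unit of TS2 (e.g. T3a)
    frame_ticks:     playback period of one audio frame
    A negative overlap is a gap, which is not permitted; an overlap must be
    shorter than the playback period of two audio frames.
    """
    overlap = ts1_last_end - ts2_first_start
    return 0 <= overlap < 2 * frame_ticks
```

With a frame period of 2880 ticks, an overlap of 100 ticks passes, whereas a gap of 100 ticks or an overlap of 5760 ticks does not.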
<Relationship of In_Time and Out_Time with Video Presentation
Unit>
FIG. 29 shows a relationship among multiple Video Presentation
Units specified by the previous PlayItem and the current PlayItem,
multiple Audio Presentation Units, and STC time axes. Level 1 shows
multiple Video Presentation Units belonging to TS1 to which the
previous PlayItem refers and multiple Video Presentation Units
belonging to TS2 to which the current PlayItem refers. Level 2 shows
multiple Audio Presentation Units belonging to TS1 to which
the previous SubPlayItem refers and multiple Audio Presentation Units
belonging to TS2 to which the current SubPlayItem refers. Level 3
shows an STC time axis of TS1 in the previous SubPlayItem and an STC
time axis of TS2 in the current SubPlayItem. As shown in FIG. 28,
within Audio Presentation Units of TS1 and Audio Presentation Units
of TS2 at Level 2, the portion from the start point T3b to the end
point T5b overlaps. In_Time of the current SubPlayItem and Out_Time
of the previous SubPlayItem respectively specify the time point T4,
which is a boundary of Video Presentation Units. Since In_Time of
the current PlayItem and Out_Time of the previous PlayItem also specify
the time point T4 of the boundary of Video Presentation Units, In_Time
and Out_Time of PlayItem coincide with In_Time and Out_Time of
SubPlayItem. Thus, although Out_Time of the previous SubPlayItem and
In_Time of the current SubPlayItem are recorded on a recording
medium different from the BD-ROM, it can be seen that they correspond
to the boundary of Video Presentation Units in the MainClip, and also
correspond to Out_Time of the previous PlayItem and In_Time of the
current PlayItem, respectively.
[0115]
Thus concludes the detailed description of conditions that
should be satisfied at the level of Video Presentation Units and
Audio Presentation Units.
The following describes encoding conditions at the level of
elementary streams in order to realize CC = 5 and SP_CC = 5.
[0116]
The following encoding conditions must be satisfied at the
level of each elementary stream.
(1) Video Stream
the video resolution and the frame rate do not change
before and after a seamless connection; and
a video stream immediately before a seamless connection
ends with sequence_end_code (for MPEG-2 Video) or
end_of_sequence_rbsp (for MPEG-4 AVC).
(2) Audio Stream
the encoding format of audio streams having the same PID does
not change; and
the sampling frequency, the quantization bit rate and the
number of channels do not change.
(3) PG Stream
a) The number of PG streams in TS1 and in TS2 is the same.
[0117]
b) The PG stream of TS1 ends with a function segment called "End
of Display Set".
c) The PTS of a PES packet carrying the last PCS in TS1 indicates
a time point before the playback time corresponding to Out_Time of
the previous PlayItem and the previous SubPlayItem.
d) The PG stream of TS2 must start with an Epoch Start-type or Epoch
Continue-type Display Set.
[0118]
e) The PTS of a PES packet carrying the first PCS in TS2 indicates
a time point at or after the playback time corresponding to In_Time
of the current PlayItem and the current SubPlayItem.
f) Taking out of Source Packets from TS1, which is followed by
taking out of Source Packets from TS2, can be defined as STC1 and
STC2 on the same system time axis, and there is no overlap in their
DTS values/PTS values.
(4) IG Stream
a) The number of IG streams in TS1 and in TS2 is the same.
[0119]
b) The IG stream of TS1 ends with the function segment called "End
of Display Set".
c) The PTS of a PES packet carrying the last ICS in TS1 indicates
a time point before the playback time corresponding to Out_Time of
the previous PlayItem and the previous SubPlayItem.
d) The IG stream of TS2 must start with an Epoch Start-type or Epoch
Continue-type Display Set.
[0120]
e) The PTS of a PES packet carrying the first ICS in TS2 indicates
a time point at or after the playback time corresponding to In_Time
of the current PlayItem and the current SubPlayItem.
f) Taking out of Source Packets from TS1, which is followed by
taking out of Source Packets from TS2, can be defined as STC1 and STC2
on the same system time axis, and there is no overlap in their DTS
values/PTS values.
In order to connect the previous PlayItem and the current
PlayItem with CC = 5 and connect the previous SubPlayItem and the
current SubPlayItem with SP_CC = 5, all the above-mentioned
conditions for the levels of AV stream, transport stream, Video
Presentation Units and Audio Presentation Units, and elementary
stream must be met.
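As one example of checking an elementary-stream-level condition, the audio stream requirement (streams having the same PID keep their encoding format, sampling frequency, quantization and channel count) might be tested as in the following Python sketch. The AudioAttrs class, its field names and the function changed_audio_pids are hypothetical and are introduced solely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioAttrs:
    pid: int
    coding: str            # encoding format, e.g. "AC-3"
    sampling_hz: int
    bits_per_sample: int
    channels: int


def changed_audio_pids(ts1_audio, ts2_audio):
    """Return the PIDs whose audio attributes differ between TS1 and TS2."""
    before = {a.pid: a for a in ts1_audio}
    after = {a.pid: a for a in ts2_audio}
    # Only PIDs present on both sides of the connection are compared.
    return sorted(pid for pid in before.keys() & after.keys()
                  if before[pid] != after[pid])
```

An empty list indicates that the audio condition holds for every PID common to TS1 and TS2; the other conditions listed above would need separate checks.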
[0121]
Thus concludes the explanation of PlayList information which
is a constituent of the storage content of the local storage 200.
Thus concludes the explanation of the recording medium
according to the present invention. Next, the playback apparatus of
the present invention is explained.
FIG. 30 shows an internal structure of the playback apparatus
of the present invention. The playback apparatus of the present
invention is commercially manufactured based on the internal
structure shown in the figure. The playback device is mainly
composed of two parts, a system LSI and a drive device, and can be
produced commercially by mounting these parts on the cabinet and
substrate of the device. The system LSI is an integrated circuit
that integrates a variety of processing units for carrying out the
functions of the playback device. The playback apparatus
manufactured in this way comprises: a BD-ROM drive 1a; read buffers
1b and 1c; ATC counters 2a and 2c; source depacketizers 2b and 2d;
STC counters 3a and 3c; PID filters 3b and
3d; a video decoder 4; a transport buffer (TB) 4a; a multiplexed
buffer (MB) 4b; a coded picture buffer (CPB) 4c; a video decoder 4d;
a re-order buffer 4e; a switch 4f; a video plane 5; an audio decoder
9; a transport buffer 6; an elementary buffer 7; a decoder 8;
switches 10a, 10b, 10c and 10d; an interactive graphics decoder 11; a
transport buffer (TB) 11a; a coded data buffer (CDB) 11b; a stream
graphics processor (SGP) 11c; an object buffer 11d; a composition
buffer 11e; a graphics controller 11f; an Interactive Graphics plane
12; a presentation graphics decoder 13; a transport buffer (TB) 13a;
a coded data buffer (CDB) 13b; a stream graphics processor (SGP) 13c;
an object buffer 13d; a composition buffer 13e; a graphics controller
13f; a presentation graphics plane 14; a transport buffer 15a; an
elementary buffer 15b; a decoder 15c; a transport buffer 16a; an
elementary buffer 16b; a decoder 16c; a synthesis unit 17; a memory
21; a controller 22; a PSR set 23; a PID conversion unit 24; a
network unit 25; an operation receiving unit 26; and the local
storage 200.
[0122]
The BD-ROM drive 1a loads/ejects a BD-ROM, and executes access
to the BD-ROM.
The read buffer (RB) 1b accumulates Source Packet sequences
read from the BD-ROM.
The read buffer (RB) 1c accumulates Source Packet sequences
read from the local storage 200.
[0123]
The ATC counter 2a is reset by using an ATS of the Source
Packet located at the beginning of the playback section within Source
Packets constituting Primary TS, and subsequently outputs ATCs to the
source depacketizer 2b.
The source depacketizer 2b takes out TS packets from source
packets constituting Primary TS and sends out the TS packets. When
sending, the source depacketizer 2b adjusts the time of an input to
the decoder according to an ATS of each TS packet. To be more
specific, at the moment when the value of the ATC generated by the
ATC counter 2a becomes the same as the ATS value of a Source Packet,
the source depacketizer 2b transfers only the TS packet to the PID
filter 3b at TS_Recording_Rate.
[0124]
The ATC counter 2c is reset by using an ATS of the Source
Packet located at the beginning of the playback section within Source
Packets constituting Secondary TS, and subsequently outputs ATCs to
the source depacketizer 2d.
The source depacketizer 2d takes out TS packets from source
packets constituting Secondary TS and sends out the TS packets. At
the sending, the source depacketizer 2d adjusts the time of an input
to the decoder according to an ATS of each TS packet. To be more
specific, at the moment when the value of the ATC generated by the
ATC counter 2c becomes the same as the ATS value of a Source Packet,
the source depacketizer 2d transfers only the TS packet to the PID
filter 3d at TS_Recording_Rate.
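The timing behaviour of the source depacketizers 2b and 2d described above can be illustrated by the following minimal sketch. It is not the implementation of the embodiment. It assumes that a Source Packet is 192 bytes, consisting of a 4-byte header whose low 30 bits carry the ATS followed by a 188-byte TS packet; the function names and the use of elapsed ATC ticks as the output time are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

SOURCE_PACKET_SIZE = 192   # assumed: 4-byte header + 188-byte TS packet
ATS_MASK = 0x3FFFFFFF      # assumed: ATS occupies the low 30 bits of the header


def split_source_packet(sp: bytes) -> Tuple[int, bytes]:
    """Return (ATS, TS packet) for one Source Packet."""
    if len(sp) != SOURCE_PACKET_SIZE:
        raise ValueError("unexpected Source Packet size")
    header = int.from_bytes(sp[:4], "big")
    return header & ATS_MASK, sp[4:]


def depacketize(source_packets: Iterable[bytes]) -> Iterator[Tuple[int, bytes]]:
    """Yield (ATC ticks elapsed since the start of the section, TS packet).

    The ATC counter is reset by the ATS of the first Source Packet, and each
    TS packet is released at the moment the counter equals its own ATS.
    """
    prev_ats = None
    elapsed = 0
    for sp in source_packets:
        ats, ts_packet = split_source_packet(sp)
        if prev_ats is None:
            prev_ats = ats                      # counter reset by the first ATS
        # Ticks the counter must advance before it matches this ATS
        # (the mask handles wrap-around of the 30-bit counter).
        elapsed += (ats - prev_ats) & ATS_MASK
        prev_ats = ats
        yield elapsed, ts_packet
```

Only the 188-byte TS packet is passed on; the header carrying the ATS is consumed by the depacketizer, which corresponds to the statement that only the TS packet is transferred to the PID filter.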
[0125]
The STC counter 3a is reset by a PCR of Primary TS and outputs
an STC.
The PID filter 3b is a demultiplexing unit for Primary TS and
outputs, among Source Packets output from the source depacketizer 2b,
ones having PID reference values informed by the PID conversion unit
24 to the video decoder 4, the audio decoder 9, the interactive
graphics decoder 11 and the presentation graphics decoder 13. Each
of the decoders receives elementary streams passed through the PID
filter 3b and performs the processing from decoding through playback
according to the PCR of Primary TS (STC1 time axis). Thus,
the elementary streams input to each decoder after being passed
through the PID filter 3b are subjected to decoding and playback
based on the PCR of Primary TS.
[0126]
The STC counter 3c is reset by a PCR of Secondary TS and
outputs an STC. The PID filter 3d performs demultiplexing with
reference to this STC.
The PID filter 3d is a demultiplexing unit for the SubClip and
outputs, among Source Packets output from the source depacketizer 2d,
ones having PID reference values informed by the PID conversion unit
24 to the audio decoder 9, the interactive graphics decoder 11 and
the presentation graphics decoder 13. Thus, the elementary streams
input to each decoder after being passed through the PID filter 3d
are subjected to decoding and playback based on the PCR of Secondary
TS.
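The routing performed by the PID filters 3b and 3d can be sketched as follows. This is an illustrative sketch only: it assumes the standard 188-byte TS packet layout in which the 13-bit PID is carried in the second and third bytes, and the mapping from PID reference values to decoder names stands in for what the PID conversion unit 24 would supply. The function names are hypothetical.

```python
from typing import Dict, Iterable, List

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def pid_of(ts_packet: bytes) -> int:
    """Extract the 13-bit PID from a TS packet header."""
    if len(ts_packet) != TS_PACKET_SIZE or ts_packet[0] != SYNC_BYTE:
        raise ValueError("not a TS packet")
    return ((ts_packet[1] & 0x1F) << 8) | ts_packet[2]


def pid_filter(ts_packets: Iterable[bytes],
               pid_to_decoder: Dict[int, str]) -> Dict[str, List[bytes]]:
    """Route packets whose PID is a notified reference value to its decoder.

    Packets whose PID is not among the reference values are discarded.
    """
    routed: Dict[str, List[bytes]] = {name: [] for name in set(pid_to_decoder.values())}
    for packet in ts_packets:
        decoder = pid_to_decoder.get(pid_of(packet))
        if decoder is not None:
            routed[decoder].append(packet)
    return routed
```

In this sketch one filter instance would be created per TS, so that packets of Primary TS and Secondary TS are demultiplexed separately before reaching the shared decoders.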
[0127]
As described in the explanation of the recording medium above,
In_Time and Out_Time of PlayItem correspond to In_Time and Out_Time
of SubPlayItem. Therefore, if the ATC counters 2a and 2c have the
same value (time) and tick at the same speed, time axes of Primary TS
and Secondary TS become aligned together. As a result, Primary TS
and Secondary TS constituting the Out-of-MUX application can be
handled as a single stream. In addition, the ATC time axes showing
times for data input to the decoder can be synchronized, and also the
STC time axes showing decoder base time can be synchronized.
[0128]
According to the synchronization of ATC time axes, the above-
mentioned two source depacketizers can respectively process Source
Packets read from the BD-ROM and Source Packets read from the local
storage.
The STC counters 3a and 3c have the same time and tick at the
same speed according to the synchronization of STC time axes, and
therefore two TSs can be processed as a single TS. Since the decoder
of the playback apparatus operates on a single STC time axis, the
management of STC time can be standardized in the same manner as when
usual Primary TS-only playback is performed. Being able to cause all
the video decoder 4, IG decoder 11, PG decoder 13, system decoders
15c and 16c, and audio decoder 9 to operate on the same STC time axis
is desirable from the perspective of the development of playback
apparatuses since the control is exactly the same as that used on usual
playback apparatuses that perform BD-ROM playback only. Furthermore,
when the authoring is performed, the buffer state can be observed by
controlling the input timing of one TS, thereby facilitating
verification at the authoring stage.
[0129]
The video decoder 4 decodes multiple PES packets output from
the PID filter 3b, obtains uncompressed pictures and writes the
pictures to the video plane 5. The video decoder 4 is composed of
the transport buffer 4a, multiplexed buffer 4b, elementary buffer 4c,
decoder 4d, re-order buffer 4e and switch 4f.
The transport buffer (TB) 4a is a buffer in which TS packets
belonging to a video stream are temporarily accumulated after being
output from the PID filter 3b.
[0130]
The multiplexed buffer (MB) 4b is a buffer in which PES
packets are temporarily accumulated when a video stream is output
from the transport buffer 4a to the elementary buffer 4c.
The elementary buffer (EB) 4c is a buffer in which pictures in
an encoded state (I pictures, B pictures, P pictures) are stored.
[0131]
The decoder (DEC.) 4d obtains multiple frame images by
decoding individual frame images of a video elementary stream for
each predetermined encoding time (DTS) and writes the frame images to
the video plane 5.
The re-order buffer 4e is a buffer used for changing the order
of the decoded pictures so that they are arranged in the order of
display.
[0132]
The switch 4f realizes the order change of the decoded
pictures so that they are arranged in the display order.
The video plane 5 is a plane for storing therein uncompressed
pictures. The plane is a memory area of the playback apparatus for
storing pixel data of a single screen capacity. The resolution of
the video plane 5 is 1920 x 1080, and the picture data stored in the
video plane 5 is composed of pixel data represented by a 16-bit YUV.
[0133]
The audio decoder 9 is composed of the transport buffer 6,
elementary buffer 7 and decoder 8, and decodes an audio stream.
The transport buffer 6 stores therein TS packets output from
the PID filter 3b in a first-in first-out manner, and sends the TS
packets to the audio decoder 8.
The elementary buffer 7 stores therein, among TS packets
output from the PID filter 3b, only those having PID of an audio
stream to be played back in a first-in first-out manner, and sends
them to the audio decoder 8.
[0134]
The decoder 8 converts TS packets stored in the transport
buffer 6 into PES packets, decodes the PES packets to obtain
noncompressed audio data in the LPCM state, and outputs the obtained
audio data.
The switch 10a selectively provides TS packets read from the
BD-ROM or TS packets read from the local storage 200 to the video
decoder 4.
[0135]
The switch 10b selectively provides TS packets read from the
BD-ROM or TS packets read from the local storage 200 to the
interactive graphics decoder 11.
The switch 10c selectively provides TS packets read from the
BD-ROM or TS packets read from the local storage 200 to the
presentation graphics decoder 13.
[0136]
The interactive graphics (IG) decoder 11 decodes an IG stream
read from the BD-ROM 100 or the local storage 200 and writes the
noncompressed graphics to the IG plane 12. The IG decoder 11 is
composed of the transport buffer (TB) 11a, coded data buffer (CDB)
11b, stream graphics processor (SGP) 11c, object buffer 11d,
composition buffer 11e and graphics controller (Ctrl) 11f.
[0137]
The transport buffer (TB) 11a is a buffer in which TS packets
belonging to an IG stream are temporarily accumulated.
The coded data buffer (CDB) 11b is a buffer in which PES
packets constituting an IG stream are stored.
The stream graphics processor (SGP) 11c decodes PES packets
storing therein graphics data and writes a noncompressed bitmap
composed of index colors obtained by the decode processing to the
object buffer 11d as a graphics object.
[0138]
In the object buffer 11d, a graphics object obtained by decode
processing performed by the stream graphics processor 11c is
positioned.
The composition buffer 11e is a memory in which control
information for drawing graphics data is positioned.
The graphics controller (Ctrl) 11f decodes control information
positioned in the composition buffer 11e and performs control based
on the result of the decode processing.
[0139]
To the Interactive Graphics (IG) plane 12, uncompressed
graphics obtained by decode processing of the IG decoder 11 are
written.
The presentation graphics (PG) decoder 13 decodes a PG stream
read from a BD-ROM or the local storage 200 and writes the
uncompressed graphics to the presentation graphics plane 14. The PG
decoder 13 is composed of the transport buffer (TB) 13a, coded data
buffer (CDB) 13b, stream graphics processor (SGP) 13c, object buffer
(OB) 13d, composition buffer (CB) 13e and graphics controller (Ctrl)
13f.
[0140]
The transport buffer (TB) 13a is a buffer in which TS packets
belonging to a PG stream are temporarily accumulated after being
output from the PID filter 3b or 3d.
The coded data buffer (CDB) 13b is a buffer in which PES
packets constituting a PG stream are stored.
The stream graphics processor (SGP) 13c decodes PES packets
(ODS) storing therein graphics data and writes a noncompressed bitmap
composed of index colors obtained by the decode processing to the
object buffer 13d as a graphics object.
[0141]
In the object buffer 13d, a graphics object obtained by decode
processing performed by the stream graphics processor 13c is
positioned.
The composition buffer (CB) 13e is a memory in which control
information (PCS) for drawing graphics data is positioned.
The graphics controller (Ctrl) 13f decodes PCS positioned in
the composition buffer 13e and performs control based on the result
of the decode processing.
[0142]
The Presentation Graphics (PG) plane 14 is a memory having a
single screen capacity area, and is able to store therein
uncompressed graphics of a single screen capacity.
The system decoder 15 processes system control packets (PAT
and PMT) of Secondary TS and controls the decoders as a whole.
The transport buffer 15a stores therein system control packets
(PAT and PMT) present in Primary TS.
[0143]
The elementary buffer 15b sends system control packets to the
decoder 15c.
The decoder 15c decodes system control packets stored in the
elementary buffer 15b.
The transport buffer 16a stores therein system control packets
present in Secondary TS.
[0144]
The elementary buffer 16b sends system control packets of
Secondary TS to the decoder 16c.
The decoder 16c decodes system control packets stored in the
elementary buffer 16b.
The memory 21 is a memory for storing therein current PlayList
information and current Clip information. The current PlayList
information is PlayList information that is currently processed,
among a plurality of pieces of PlayList information stored in the BD-
ROM. The current Clip information is Clip information that is
currently processed, among a plurality of pieces of Clip information
stored in the BD-ROM/local storage.
[0145]
The controller 22 achieves a playback control of the BD-ROM by
performing PlayList playback (i.e. playback control in accordance
with the current PlayList information). The controller 22 also
performs the above-mentioned control on the ATC and STC. In this
control, the controller 22 performs a prior read operation to read,
in advance, Source Packets for a period of 1 second from the BD-ROM
or the local storage into the buffer of the decoder. Because the
Window is controlled as described above, this prior read operation
ensures that underflow and overflow are prevented.
[0146]
The PSR set 23 is a register built in the playback apparatus,
and is composed of 64 pieces of Player Setting/Status Registers (PSR)
and 4096 pieces of General Purpose Registers (GPR). Among the values
(PSR) set in the Player Setting/Status Registers, PSR4 to PSR8 are
used to represent the current playback point.
The PID conversion unit 24 converts audio streams and stream
numbers of the audio streams stored in the PSR set 23 into PID
reference values based on the STN_table, and notifies the PID
reference values of the conversion results to the PID filters 3b and
3d.
[0147]
The network unit 25 achieves a communication function of the
playback apparatus. When a URL is specified, the network unit
25 establishes a TCP connection or an FTP connection with the web site
of the specified URL. The establishment of such a connection allows
for downloading from web sites.
The operation receiving unit 26 receives specification of an
operation made by a user on the remote controller, and notifies User
Operation information, which indicates the operation specified by the
user, to the controller 22.
[0148]
Thus concludes the description of the internal structure of
the playback apparatus. The following describes implementation of
the controller 22 on the playback apparatus. The controller 22 can be
implemented on the playback apparatus by creating a program which
causes the CPU to perform the process procedure of the flowcharts
shown in FIGs. 31 and 32, writing the program to an instruction ROM
and sending it to the CPU.
FIG. 31 is a flowchart showing a playback procedure based on
PlayList information. The flowchart shows a loop structure in which
an mpls file structuring the PlayList information is read in (Step
S11), a PlayItem at the beginning of the PlayList information is set
as the current PlayItem (Step S12), and Steps S13 to S25 are repeated
for the current PlayItem. This loop structure has Step S23 as an
ending condition. The BD-ROM drive is instructed to read Access
Units starting with one corresponding to In_Time and ending with one
corresponding to Out_Time of the current PlayItem (Step S13). A
judgment is made whether a previous PlayItem is present for the
current PlayItem (Step S14). Step S15 or Steps S16 to S21 is
selectively executed according to the judgment result. To be more
specific, if the current PlayItem does not have a previous PlayItem
(Step S14: NO), the decoder is instructed to perform playback from the
PlayItem_In_Time to the PlayItem_Out_Time (Step S15).
[0149]
If the current PlayItem has a previous PlayItem (Step S14:
YES), a judgment is made whether the current PlayItem is CC = 5 (Step
S16). When CC = 5 (Step S16: YES), the processing of Steps S17 to
S20 is carried out.
When the previous PlayItem above is present, an ATC_Sequence
in the MainClip is switched. For the switch of the ATC_Sequence, an
offset value for Primary TS, called ATC_delta1, is calculated (Step
S17). An ATC value (ATC2) for a new ATC_Sequence is obtained by
adding the ATC_delta1 to an ATC value (ATC1) of the original
ATC_Sequence (Step S18).
[0150]
In addition, when the previous PlayItem above is present, an
STC_Sequence in Primary TS is switched. For the switch of the
STC_Sequence, an offset value called STC_delta1 is calculated (Step
S19). An STC value (STC2) of a new STC_Sequence is obtained by
adding the STC_delta1 to an STC value of the original STC_Sequence
(Step S20).
After the audio decoder 9 is instructed to mute the Audio
Overlap, the decoder is instructed to perform playback from the
PlayItem_In_Time to the PlayItem_Out_Time (Step S21). When the
current PlayItem is not CC = 5, the processing of CC = 1 and CC = 6
is performed.
[0151]
After either one of the processing of Step S15 and the
processing of Steps S16 to S21 is carried out, the processing of Step
S25 is executed. Step S25 is a process of checking whether there is a
SubPlayItem to be synchronously played back with the current PlayItem.
Here, each SubPlayItem constituting the SubPath information has
information called Sync_PlayItem_Id, and Sync_PlayItem_Id of a
SubPlayItem to be synchronously played back with the current PlayItem
is set to this current PlayItem. Therefore, in Step S25, a check is
made whether a SubPlayItem whose Sync_PlayItem_Id has been set to the
current PlayItem is present in multiple SubPlayItems constituting the
SubPath information.
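The check of Step S25 amounts to a search over the SubPath information, which may be sketched as follows. The class and field names are hypothetical stand-ins for the SubPlayItem fields named above, and identifying the current PlayItem by an integer is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class SubPlayItem:
    sync_play_item_id: int   # stands in for Sync_PlayItem_Id
    in_time: int             # stands in for SubPlayItem_In_Time
    out_time: int            # stands in for SubPlayItem_Out_Time


def find_synchronized_sub_play_item(sub_path: Sequence[SubPlayItem],
                                    current_play_item_id: int) -> Optional[SubPlayItem]:
    """Step S25: return the SubPlayItem whose Sync_PlayItem_Id designates
    the current PlayItem, or None when no such SubPlayItem is present."""
    for sub_play_item in sub_path:
        if sub_play_item.sync_play_item_id == current_play_item_id:
            return sub_play_item
    return None
```

A result of None corresponds to the branch toward Step S22, and any other result to the branch toward the procedure of FIG. 32.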
[0152]
If no such SubPlayItem is present, the process moves to Step
S22. In Step S22, a judgment is made whether the current playback
time (Current PTM (Presentation TiMe)) on the AVClip time axis
reaches Out_Time of the current PlayItem (Step S22). If it has
reached Out_Time, the process moves to Step S23. In Step S23, a judgment is
made whether the current PlayItem is the last PlayItem of the
PlayList information. If it is not the last PlayItem, the next
PlayItem in the PlayList information is set as the current PlayItem
(Step S24), and the process moves to Step S13. In this way, the
processing of Steps S13-S24 is performed on all PlayItems in the
PlayList information.
[0153]
FIG. 32 is a flowchart showing a processing procedure of a
seamless connection of SubPlayItems.
When it is determined in Step S25 that a SubPlayItem whose
Sync_PlayItem_Id has been set to the current PlayItem is present, the
SubPlayItem is set as the current SubPlayItem (Step S31). Then, the
local storage 200 is instructed to output Access Units starting with
one corresponding to In_Time of the SubPlayItem and ending with one
corresponding to Out_Time (Step S32). Then, a judgment is made
whether a previous SubPlayItem is present in the current PlayItem
(Step S33), and one of Step S34, Step S35, and Steps S36-S41 is
selectively executed according to the judgment result. To be more
specific, if no previous SubPlayItem is present in the current
PlayItem (Step S33: No), the process waits until the current PTM reaches
Sync_Start_Pts_of_PlayItem (Step S34). When the current PTM has
reached it, the decoder is instructed to play back from
SubPlayItem_In_Time to SubPlayItem_Out_Time (Step S35).
[0154]
When a previous SubPlayItem is present in the current
PlayItem (Step S33: Yes), a judgment is made whether the current
PlayItem is SP_CC = 5 (Step S36). When it is SP_CC = 5 (Step S36:
Yes), the processing of Steps S37-S41 is carried out.
When the current PlayItem has a previous SubPlayItem, the
ATC_Sequence is switched. For the switch of the ATC_Sequence, an
offset value for Secondary TS, called ATC_delta2, is calculated (Step
S37), and an ATC value (ATC2) for a new ATC_Sequence is obtained by
adding the ATC_delta2 to an ATC value (ATC1) of the original
ATC_Sequence (Step S38).
[0155]
The ATC_delta means an offset value representing an offset
from the input time point T1 of the last TS packet of a transport
stream (TS1) that has been originally read out to the input time
point T2 of the last TS packet of a transport stream (TS2) that has
been newly read out. The ATC_delta satisfies "ATC_delta >
N1/TS_recording_rate", where N1 is the count of TS packets following
the last video PES packet of the TS1.
[0156]
In addition, when the previous PlayItem above is present, an
STC_Sequence is switched. For the switch of the STC_Sequence,
STC_delta2 is calculated (Step S39), and an STC value (STC2) of a new
STC_Sequence is obtained by adding the STC_delta2 to an STC value of
the original STC_Sequence (Step S40).
Assume that the display start time of a picture lastly played
in the preceding STC_Sequence is PTS1(1stEND), the display time
period of the picture is TPP, and the start time of a picture
initially displayed in the following STC_Sequence is PTS2(2ndSTART).
Here, for CC = 5, since it is necessary to match the time of
PTS1(1stEND) + TPP with the time of PTS2(2ndSTART), the STC_delta2
can be calculated from the following equation:
STC_delta2 = PTS1(1stEND) + TPP - PTS2(2ndSTART).
After the audio decoder 9 is instructed to mute the Audio
Overlap, the decoder is instructed to play back from PlayItem_In_Time
to PlayItem_Out_Time (Step S41).
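As a worked illustration of the equation for STC_delta2, the following sketch evaluates it for hypothetical values. The numbers are not taken from the embodiment, and expressing the time stamps in 90 kHz ticks is an assumption.

```python
def stc_delta2(pts1_1st_end: int, tpp: int, pts2_2nd_start: int) -> int:
    """STC_delta2 = PTS1(1stEND) + TPP - PTS2(2ndSTART), all in the same clock units."""
    return pts1_1st_end + tpp - pts2_2nd_start


# Hypothetical values, assumed to be in 90 kHz ticks.
PTS1_1ST_END = 900_000    # display start of the last picture of the preceding STC_Sequence
TPP = 3_003               # display period of that picture
PTS2_2ND_START = 90_000   # display start of the first picture of the following STC_Sequence

delta = stc_delta2(PTS1_1ST_END, TPP, PTS2_2ND_START)

# The offset is exactly the gap between the two time axes at the connection point,
# so the end of the last picture coincides with the start of the next one.
assert PTS1_1ST_END + TPP == PTS2_2ND_START + delta
```

With these values the offset evaluates to 813,003 ticks.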
[0157]
The controller 22 performs the STC switch process as described
above, and this process is performed in a playback apparatus with
general implementation when the decoder is in a free-run state. The
free-run state means the state where the decoder is not performing
synchronous control. Subsequently, when the STC returns to the
condition where the STC time axis can be set, the decoder makes the
transition from the free-run state to synchronous control with the
STC. On the other hand, when the current PlayItem is judged not
to be CC = 5 in Step S36 (Step S36: NO), the processing of CC = 1 and
CC = 6 is performed.
[0158]
Thus, according to the present embodiment, the transmittable
amount called Window is limited to 48 Mbits/second or less.
Therefore, if TS packets with a size of 96 Mbits x 0.5 seconds are
read to the decoder in advance, the buffer of the decoder will not
cause underflow or overflow even when the transmittable amount
locally reaches 96 Mbits within a period of 1 second. Since the data
amount is "96 Mbits x 0.5 seconds" or less at any period of time in a
digital stream and TS packets can be supplied without underflow or
overflow, loss of video and audio can be prevented. This eliminates
the risk that simultaneous readout to realize the Out-of-MUX
framework has an influence on the quality of the digital stream.
[0159]
In addition, if In_Time and Out_Time of a PlayItem and In_Time
and Out_Time of a SubPlayItem match each other and the connection
state of PlayItems is CC = 5, the connection state of SubPlayItems
becomes SP_CC = 5. Therefore, when a PlayItem is switched, the
switch from the PlayItem to another PlayItem and a switch from a
SubPlayItem to another SubPlayItem can be performed simultaneously
without reset of the demultiplexing units. Thus, while STC time axes
to which the demultiplexing units refer are synchronized with
each other, the playback process based on PlayList information can
proceed.
[0160]
EMBODIMENT 2
In the present embodiment, the production of the BD-ROM of the
previous embodiment is described in detail. The BD-ROM of the
previous embodiment can be produced by sequentially performing the
following processes.
First, an outline with which the BD-ROM is played back is
planned (Planning Process), materials such as moving image records
and audio records are created (Material Production Process), and
volume configuration information is created based on the outline
created in the planning process (Scenario Production Process).
[0161]
The volume configuration information is information indicating
a format of the application layer on the optical disk using an
abstract description.
Subsequently, each of video materials, audio materials,
subtitle materials, and menu materials is encoded to thereby create
elementary streams (Material Encoding Process). Then, multiple
elementary streams are multiplexed (Multiplexing Process).
[0162]
Then, an operation is carried out to fit the multiplexed
streams and the volume configuration information into the format of
the application layer of the BD-ROM, and the entire data (generally
called the "volume data") to be recorded in the volume area of the
BD-ROM is obtained (Formatting Process).
Instances of a class structure described in a programming
language are the format of the application layer of the recording
medium according to the present invention. Clip information,
PlayList information and the like can be created by describing
instances of the class structure based on syntaxes specified in the
BD-ROM standard. In this case, data in a table format can be defined
using "for" statements of a programming language, and data required
under specific conditions can be defined using "if" statements.
[0163]
When the volume data is obtained after such a fitting process,
the volume data is played back to see whether the result of the
scenario production process is correct (Emulation Process). In the
emulation process, it is desirable to conduct a simulation of the
buffer state of the BD-ROM player model.
Lastly, a press process is carried out. In this press process,
volume images are converted into physical data sequences, and master
disk cutting is conducted by using the physical data sequences to
create a master disk. Then, BD-ROMs are produced from a master
created by a press apparatus. The production is composed of various
processes, mainly including substrate molding, reflective film
coating, protective film coating, laminating and printing a label.
[0164]
By completing these processes, the recording medium (BD-ROM)
described in the embodiment above can be created.
When a motion picture is composed of BD-ROM contents and
additional contents, the above-mentioned planning process to
formatting process are carried out. Then, AVClips, Clip information
and PlayList information making up one piece of volume data are
obtained. Ones which will be provided by the BD-ROM are removed from
the obtained AVClips, Clip information and PlayList information, and
the remaining information is assembled into one file as additional
contents by an archiver program or the like. When such additional
contents are obtained after these processes, the additional contents
are provided to a WWW server and sent to playback apparatuses upon
request.
[0165]
The verification described in the above embodiment is
conducted when AVClips, Clip information and PlayList information are
completed and elementary streams to be played back are determined by
the STN_table in the PlayList information, i.e. in the formatting
process. The following explains an authoring system that creates
such application format.
FIG. 33 shows an internal structure of an authoring system of
Embodiment 2. As shown in the figure, the authoring system is
composed of: an input apparatus 51; an encode apparatus 52; a server
apparatus 53; a material storage 54; a BD configuration information
storage 55; client apparatuses 56-58; a multiplexer 60; a BD scenario
converter 61; a formatter 62; and a verification unit 63.
[0166]
On the input apparatus 51, a videocassette on which HD images
and SD images are recorded is mounted, and then the input apparatus
51 plays the videocassette back and outputs playback signals to the
encode apparatus 52.
The encode apparatus 52 encodes the playback signals output
from the input apparatus 51 to thereby obtain elementary streams such
as video streams and audio streams. The elementary streams obtained
in this way are output to the server apparatus 53 via a LAN and
written to the material storage 54 in the server apparatus 53.
[0167]
The server apparatus 53 is composed of two drive devices, the
material storage 54 and the BD configuration information storage 55.
The material storage 54 is a built-in disk apparatus of the
server apparatus 53, and sequentially stores therein elementary
streams obtained by the encoding operations by the encode apparatus
52. The material storage 54 has two directories, an HD stream
directory and an SD stream directory. Elementary streams obtained by
encoding HD images are written to the HD stream directory.
[0168]
The BD configuration information storage 55 is a drive device
in which the BD volume configuration information is stored.
The multiplexer 60 reads, among elementary streams stored in
the HD stream directory and the SD stream directory in the material
storage 54, ones specified in the BD volume configuration information,
and then multiplexes the read elementary streams according to the BD
volume configuration information to thereby obtain a multiplexed
stream, i.e. an AVClip.
[0169]
The BD scenario converter 61 obtains a BD scenario by
converting the BD volume configuration information stored in the BD
configuration information storage 55 into the BD-ROM application
format.
The formatter 62 adapts the Clip obtained by the multiplexer
60 and the BD scenario obtained by the BD scenario converter 61 to
the format of the application layer on the BD-ROM. Herewith, a
master of the BD-ROM and contents for downloading which are to be
stored in the local storage can be obtained from the adapted BD
scenario.
[0170]
The verification unit 63 judges, by referring to the STN_table
in the PlayList information generated by the scenario converter 61,
whether Primary TSs for the BD-ROM and Secondary TSs for the local
storage obtained by the multiplexer 60 satisfy the restrictions for
realizing the Out_of_MUX application.
Thus concludes the internal structure of the authoring system.
The following explains the implementation of the verification unit 63
of the authoring system.
[0171]
The verification unit 63 can be implemented in the authoring
system by creating a program which causes the CPU to perform the
process procedures of the flowcharts shown in FIGs. 34 and 35,
writing the program to an instruction ROM and sending it to the CPU.
FIG. 34 is a flowchart showing the verification procedure on
Primary TSs and Secondary TSs. The flowchart shows that an ATS of
the first Source Packet in the Source Packet sequence is set as
In_Time of the current Window in Step S1 and the processes of Steps
S2 to S7 are repeated. The loop structure repeats the following
Steps S2 to S5 until the judgment in Step S6 becomes Yes: an ATS
appearing after 1 second from the In_Time of the current Window is
set as the Out_Time of the current Window (Step S2); TS packets
present between the In_Time and the Out_Time of the current Window are
counted (Step S3); a bit count of the current Window is calculated
from the In_Time (Step S4); and a judgment is made whether the bit
value is 48 Mbits or less (Step S5). Step S6 is a judgment whether
the Out_Time of the current Window has reached the last Source Packet
on the ATC time axis. If Step S6 is No, the next ATS in the Source
Packet sequence is set to the In_Time of the current Window (Step S7),
and Steps S2-S6 are repeated. If, with any Window, Step S5 is No, it
is determined that there is a violation of the BD-ROM standard
(Step S9). When Step S5 is Yes for all Windows, and then Step S6 is
Yes, it is determined that the Primary TSs and Secondary TSs comply
with the BD-ROM standard (Step S8).
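The Window-based verification of FIG. 34 can be sketched as follows. This is a minimal sketch under stated assumptions rather than the verification unit 63 itself: the ATS values are assumed to have been unwrapped into a non-decreasing list of 27 MHz ticks, one per Source Packet; a TS packet is counted as 188 x 8 bits; 48 Mbits is taken as 48,000,000 bits; and a packet whose ATS falls exactly on Out_Time is treated as outside the Window. The function name is hypothetical.

```python
from bisect import bisect_left
from typing import Sequence

TS_PACKET_BITS = 188 * 8          # bits carried by one TS packet
WINDOW_LIMIT_BITS = 48_000_000    # 48 Mbits (Step S5), assumed decimal megabits
ATC_TICKS_PER_SECOND = 27_000_000 # assumed ATC tick rate


def verify_windows(ats_ticks: Sequence[int],
                   window_ticks: int = ATC_TICKS_PER_SECOND,
                   limit_bits: int = WINDOW_LIMIT_BITS) -> bool:
    """Return True when every 1-second Window stays within the limit."""
    n = len(ats_ticks)
    for start in range(n):                                # Step S1, then Step S7
        out_time = ats_ticks[start] + window_ticks        # Step S2
        end = bisect_left(ats_ticks, out_time, lo=start)  # first packet at or after Out_Time
        packet_count = end - start                        # Step S3
        window_bits = packet_count * TS_PACKET_BITS       # Step S4
        if window_bits > limit_bits:                      # Step S5: No
            return False                                  # Step S9: violation
        if end >= n:                                      # Step S6: Window reached the last packet
            break
    return True                                           # Step S8: compliant
```

Because the Window is re-anchored at every Source Packet rather than at fixed one-second boundaries, a burst that straddles two fixed intervals is still detected.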
[0172]
Since Primary TSs and Secondary TSs have been subject to the
verification process, the above-mentioned restrictions are always
satisfied even when Primary TSs and Secondary TSs are supplied from
the BD-ROM and the local storage, respectively.
As to the video streams, audio streams, PG streams and IG
streams, if there are multiple elementary streams of the same type,
it is desirable to conduct the verification according to the
procedure shown in FIG. 35. In the verification procedure of FIG. 35,
Steps S3 and S4 of FIG. 34 are replaced with Steps S81-S83.
[0173]
Steps S81-S83 are as follows: regarding TS packets belonging to the
current Window, from among TS packets making up elementary streams
that are allowed in the STN_table to be played back, the bit rate is
calculated for each elementary stream each time one current Window is
determined (Step S81); for each type of stream, i.e. multiple video
streams, multiple audio streams, multiple PG streams and multiple IG
streams, the one having the highest calculated bit rate is selected (Step
S82); the highest bit rate of the video stream, the highest bit rate
of the audio stream, the highest bit rate of the PG stream, and the
highest bit rate of the IG stream are summed (Step S83); and a
judgment is made whether the sum total is 48 Mbits or less (Step S5).
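Steps S81 to S83 for a single Window may be sketched as follows. The sketch assumes that each TS packet in the Window has already been labelled with its stream type and PID, that only streams permitted by the STN_table are passed in, and that the Window is 1 second long so that the bit amount in the Window equals the bit rate. The function name and the labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

TS_PACKET_BITS = 188 * 8          # bits carried by one TS packet
WINDOW_LIMIT_BITS = 48_000_000    # 48 Mbits, assumed decimal megabits


def verify_window_by_type(packets: Iterable[Tuple[str, int]],
                          limit_bits: int = WINDOW_LIMIT_BITS) -> Tuple[bool, int]:
    """packets: one (stream_type, pid) pair per TS packet in the current Window.

    Returns (within_limit, summed_bits).
    """
    # Step S81: bit amount of each elementary stream in the Window
    bits_per_stream: Dict[Tuple[str, int], int] = defaultdict(int)
    for stream_type, pid in packets:
        bits_per_stream[(stream_type, pid)] += TS_PACKET_BITS

    # Step S82: keep only the highest value within each stream type
    peak_per_type: Dict[str, int] = defaultdict(int)
    for (stream_type, _pid), bits in bits_per_stream.items():
        peak_per_type[stream_type] = max(peak_per_type[stream_type], bits)

    # Step S83: sum the per-type maxima, then apply the judgment of Step S5
    total = sum(peak_per_type.values())
    return total <= limit_bits, total
```

Taking the maximum per type, rather than the sum of all streams, reflects that only one stream of each type is selected for playback at a time.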
[0174]
In the Out_of_MUX application, an elementary stream is always
solely and exclusively selected among the same type of elementary
streams, and therefore it is more reasonable that the verification is
conducted in the above-mentioned procedure.
Regarding the verification, it is effective to check locations
with locally high bit rates, i.e. bit values of locations at which
local peaks appear. The locations where local peaks appear are as
follows.
[0175]
(1) the beginning of TS packet indicated by In_Time of the
Window;
(2) the end of TS packet indicated by In_Time of the Window;
(3) the beginning of TS packet indicated by Out_Time of the
Window; and
(4) the end of TS packet indicated by Out_Time of the Window.
The verification process in the authoring can be more
simplified by specifically focusing on the bit amounts of these
locations.
[0176]
Thus, according to the present embodiment, when an STN_table
which allows playback of Secondary TSs is created, it can be verified
in advance when the authoring is performed whether underflow or
overflow would be caused in the playback process based on the
STN_table.
EMBODIMENT 3
In the present embodiment, a new type of CC = 6 is provided as
to the connection between PlayItems and between SubPlayItems.
[0177]
CC = 6 specifies a connection state among multiple pieces of
PlayItem information constituting Progressive PlayList information.
The Progressive PlayList information is PlayList information used for
specifying, as one playback path, multiple AVClips for streaming
playback.
The Progressive PlayList information has an advantage of
making the cache size smaller or being able to start playback without
waiting for all files to be downloaded, by dividing Secondary TSs for
downloading/streaming into piecemeal files.
[0178]
Since contents allowing for streaming transfer are specified
by many short AVClips, the Progressive PlayList information is
composed of many pieces of PlayItem information, each of which
corresponds to a different one of the multiple AVClips. On the other
hand, the AVClips divided into small units have been divided for
streaming transfer, and therefore discontinuity is not present in STC
and ATC. Accordingly, such a connection state between AVClips must
be specified as a different state from CC = 5. This type of
connection state is specified as CC = 6.
When CC = 6, TS1 and TS2 specified by two PlayItems and TS1
and TS2 specified by two SubPlayItems must satisfy the following
conditions.
[0179]
1) A video stream of TS2 has to start with a GOP.
2) There is no gap, in an Audio Presentation Unit sequence, at
the connection point between the audio stream of TS2 and the audio
stream of TS1 having the same PID as that of the audio stream of TS2.
The audio stream of TS1 may finish as an incomplete audio
stream. Then, the audio stream of TS2 having the same PID as TS1 may
start with an incomplete Audio Presentation Unit. By playing back
these TS1 and TS2 based on multiple PlayItems and multiple
SubPlayItems, one complete Audio Presentation Unit can be obtained
from two Audio Presentation Units.
[0180]
In the case of CC = 6, the stream is actually continuous.
Therefore, all elementary streams are connected seamlessly, unlike the
case of CC = 5, where only the video is seamlessly connected while the
audio is connected in a discontinuous manner or muted.
Thus, CC = 6 means a divisional boundary created when a
logically continuous stream is divided into multiple parts for
the purpose of streaming transfer. Note that, since a stream to
be recorded on the BD-ROM has to be composed in units of 32 Source
Packets, one stream file forming one SubPlayItem needs to be a
multiple of 6 Kbytes in size.
[0181]
FIG. 36 shows a detailed explanation of CC = 6. Level 1 shows
a file (20000.m2ts) that has a single continuous ATC/STC time series
and in which the encoding method does not change. Level 2 shows three
files (20001.m2ts, 20002.m2ts and 20003.m2ts) storing therein three
streams. These three files store therein three Primary TSs that have
been obtained by dividing the single stream of Level 1 in units of
Aligned Units (6 Kbytes).
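The division in units of Aligned Units can be illustrated with the following minimal sketch. It assumes a Source Packet of 192 bytes (a 188-byte TS packet plus a 4-byte header), which is consistent with 32 Source Packets amounting to 6 Kbytes; the function names are hypothetical and are not part of the embodiment.

```python
SOURCE_PACKET_BYTES = 192          # assumption: 188-byte TS packet + 4-byte header
PACKETS_PER_ALIGNED_UNIT = 32      # stated in the text
ALIGNED_UNIT_BYTES = SOURCE_PACKET_BYTES * PACKETS_PER_ALIGNED_UNIT  # 6144 bytes


def is_aligned(size_bytes):
    """Return True if a file size is a whole number of Aligned Units."""
    return size_bytes % ALIGNED_UNIT_BYTES == 0


def split_on_aligned_units(total_bytes, parts):
    """Split one continuous stream into `parts` file sizes, each of which
    is a multiple of the Aligned Unit size (hypothetical helper)."""
    if not is_aligned(total_bytes):
        raise ValueError("stream length is not a multiple of an Aligned Unit")
    units = total_bytes // ALIGNED_UNIT_BYTES
    base, extra = divmod(units, parts)
    # The first `extra` files receive one additional Aligned Unit.
    return [(base + (1 if i < extra else 0)) * ALIGNED_UNIT_BYTES
            for i in range(parts)]


sizes = split_on_aligned_units(10 * ALIGNED_UNIT_BYTES, 3)
print(sizes, all(is_aligned(s) for s in sizes))
```

Every file produced this way ends on an Aligned Unit boundary, so the divided files can be concatenated back into the original stream without padding.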
[0182]
FIG. 37 shows a correlation between PlayItems and SubPlayItems.
Level 1 shows three PlayItems (PlayItem information #1, PlayItem
information #2 and PlayItem information #3) in PlayList information.
These three PlayItems specify a Primary TS, and the connection
between PlayItem information #1 and #2 is set to CC = 1 while the
connection between PlayItem information #2 and #3 is set to CC = 5.
Level 2 shows three SubPlayItems (SubPlayItem #1, SubPlayItem #2 and
SubPlayItem #3) in PlayList information. These three SubPlayItems
specify a Secondary TS, and the connection between SubPlayItems #1
and #2 is set to CC = 1 while the connection between SubPlayItems #2
and #3 is set to CC = 5. Level 3 shows nine SubPlayItems (SubPlayItem
#1, SubPlayItem #2, SubPlayItem #3 to SubPlayItem #9) in the
Progressive PlayList information. These nine SubPlayItems specify a
Secondary TS. Here, the connection between SubPlayItems #3 and #4 is
set to CC = 1, the connection between SubPlayItems #6 and #7 is set
to CC = 5, and the remaining connections are set to CC = 6.
SubPlayItems in the Progressive PlayList are generally connected with
CC = 6; however, where PlayItems are connected with CC = 1 or CC = 5,
the corresponding SubPlayItems are also connected so as to satisfy the
condition of CC = 1 or CC = 5, respectively, like the PlayItems.
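The rule for Level 3 can be sketched as follows. This is only an illustration under assumed inputs: boundary positions are expressed as hypothetical time values, and a SubPlayItem boundary is taken to correspond to a PlayItem boundary when the two values coincide.

```python
def sub_connection_conditions(play_item_ccs, sub_boundaries, default_cc=6):
    """Derive the CC value for each boundary between consecutive SubPlayItems.

    play_item_ccs:  dict mapping a boundary time to the CC of the PlayItem
                    connection at that time (hypothetical representation)
    sub_boundaries: times at which consecutive SubPlayItems meet
    """
    # A SubPlayItem boundary inherits CC = 1 or CC = 5 from a coinciding
    # PlayItem boundary; every other boundary is a CC = 6 division point.
    return [play_item_ccs.get(t, default_cc) for t in sub_boundaries]


# Three PlayItems: #1/#2 connected with CC = 1, #2/#3 with CC = 5.
play_item_ccs = {30: 1, 60: 5}
# Nine SubPlayItems of equal (hypothetical) length give eight boundaries.
sub_boundaries = [10, 20, 30, 40, 50, 60, 70, 80]
print(sub_connection_conditions(play_item_ccs, sub_boundaries))
```

With these inputs the third boundary (SubPlayItems #3 and #4) receives CC = 1 and the sixth (SubPlayItems #6 and #7) receives CC = 5, as at Level 3 of FIG. 37.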
[0183]
Thus, the present embodiment introduces the new connection
state of CC = 6 for PlayItems and SubPlayItems, thereby realizing a
process of dividing AVClips constituting the Progressive PlayList
information into small sections and providing them by means of a
streaming transfer.
EMBODIMENT 4
In Embodiment 1, how to limit the bit amount for each Window
is explained; the present embodiment presents how to perform
multiplexing to satisfy such restrictions.
[0184]
FIG. 38 schematically shows, in the case where audio
constituting a Primary TS is replaced with audio constituting a
Secondary TS, how multiple TS packets constituting the Primary TS and
multiple TS packets constituting the Secondary TS are multiplexed
together.
FIG. 38 schematically shows the way multiple TS packets present
on the ATC time axis are multiplexed together. Level 1 shows a
Primary TS. The Primary TS is composed of TS packets storing
therein V, A1 and A2 (one set of video and two sets of audio). These
TS packets are obtained by multiplexing these three elementary
streams of two types together.
[0185]
Level 2 shows a Secondary TS. The Secondary TS is composed of
TS packets storing therein two sets of audio A3 and A4. A time
period p3 during which these TS packets of the Secondary TS are
multiplexed is, on the ATC time axis indicating input timings to the
decoder, made up of a time period p1 during which audio packets of
the Primary TS are multiplexed and a time period p2 during which TS
packets constituting the Primary TS are not being transferred.
[0186]
By multiplexing the streams in this way, it can be ensured
that the sum total of the bit rates of the elementary streams to be
decoded does not exceed the allowable maximum bit rate of the Primary TS
(48 Mbps) no matter which elementary stream is selected for each type
of elementary stream. The example shown in FIG. 38 is the
simplest case, in which the Secondary TS includes only audio.
FIG. 39 schematically shows, in the case where a subtitle (PG
stream) and a menu (IG stream) are also replaced in addition to the
audio, the way multiple TS packets constituting the Primary TS and
multiple TS packets constituting the Secondary TS are multiplexed
together.
[0187]
In the figure, a time period k3 during which packets of the
Secondary TS are transferred is the sum total of:
1) a time period k1 during which a packet whose type is the
same as in the Primary TS is transferred; and
2) a time period during which the Primary TS is not being
transferred.
The above rules 1) and 2) are applied in the same manner to
other types of streams (Video, IG, PG and the like) stored in the
Secondary TS. Therefore, it is efficient if, for each stream, a
judgment is first made whether the stream can be multiplexed into the
Secondary TS during the time period when a packet whose type is the
same as in the Primary TS is transferred, and when the judgment is
negative, multiplexing is performed in the time period during which
no Primary TS is being transferred.
[0188]
The following specifically describes the process of the
multiplexer 60 of the present embodiment.
To realize the multiplexing described above, the multiplexer
60 simulates, according to a decoder model, the state of the buffer
entered when a Primary TS is played back, and finds a time period for
transferring each packet of the Primary TS and a time period for no
Primary TS being transferred. After finding these time periods, the
multiplexer 60 converts each PES packet constituting the Secondary TS
into TS packets so that each of the PES packets is transferred during
the time period when a packet whose type is the same as in the
Primary TS is transferred or during the time period when the Primary
TS is not being transferred, and attaches an ATS to each TS packet.
Since an ATS attached in this way indicates the time period when a
packet whose type is the same as in the Primary TS is transferred or
the time period when the Primary TS is not being transferred, each
PES packet constituting the Secondary TS is sent to the decoder
during the time period when a packet whose type is the same as in the
Primary TS is transferred or the time period when the Primary TS is
not being transferred, as shown in FIG. 39.
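The slot selection performed by the multiplexer 60 can be sketched as follows. This is a simplified illustration, not the multiplexer itself: the simulation of the buffer state according to the decoder model is omitted, the Primary TS is assumed to be already reduced to a list of (ATS, kind) slots in which None marks a period with no Primary TS transfer, and each Secondary TS packet is simply given the earliest admissible slot. All names are hypothetical.

```python
def assign_ats(primary_slots, secondary_kinds):
    """Attach an ATS to each Secondary TS packet.

    primary_slots:   list of (ats, kind); kind is e.g. "V" or "A", or None
                     when no Primary TS packet is being transferred
    secondary_kinds: kinds of the Secondary TS packets in transfer order
    Returns a list of (ats, kind) for the Secondary TS packets.
    """
    scheduled = []
    i = 0
    for kind in secondary_kinds:
        while i < len(primary_slots):
            ats, primary_kind = primary_slots[i]
            i += 1
            # Admissible: the Primary TS carries a packet of the same type
            # at this time, or the Primary TS is idle.
            if primary_kind is None or primary_kind == kind:
                scheduled.append((ats, kind))
                break
        else:
            raise ValueError("no admissible slot left for a Secondary TS packet")
    return scheduled


primary = [(0, "V"), (1, "A"), (2, None), (3, "V"), (4, "A"), (5, None)]
print(assign_ats(primary, ["A", "A", "A"]))
```

In this example the Secondary audio packets avoid the slots occupied by Primary video packets, so selecting the Secondary audio in place of the Primary audio does not raise the bit amount beyond what the Primary TS alone would require.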
[0189]
In the case when elementary streams supplied from the local
storage are made not in the transport stream format but in the
program stream format, the multiplexer 60 converts PES packets
constituting the elementary streams into packs, and an SCR (System
Clock Reference) is attached to the TS header of each pack. An SCR
attached in this way also indicates, like an ATS, the time period
when a packet whose type is the same as in the Primary TS is
transferred or the time period when the Primary TS is not being
transferred. Therefore, each PES packet constituting a Secondary PS
(a program stream supplied from the local storage) is sent to the
decoder during the time period when a packet whose type is the same
as in the Primary PS (a program stream supplied from the BD-ROM) is
transferred or the time period when the Primary TS is not being
transferred, as shown in FIG. 39. In the case when elementary streams
supplied from the local storage are made in the program stream format,
the time period when a packet whose type is the same as in the Primary
TS is transferred and the time period when the Primary TS is not being
transferred are expressed in a large unit of time, the pack (PES packet).
Therefore, the burden during authoring is significantly reduced,
which makes the Out_of_MUX application easier to realize.
This is an advantage when the Out_of_MUX application is realized on a
DVD playback apparatus.
[0190]
Thus, the present embodiment performs multiplexing by
selecting, as input periods for packets constituting the Secondary TS,
the time period when a packet whose type is the same as in the
Primary TS is transferred or the time period when the Primary TS is
not being transferred. This makes it easier to satisfy the restriction
on the bit amount shown in Embodiment 1. Realizing such
multiplexing on the authoring system of Embodiment 2 makes it easier
to produce a movie that uses the Out_of_MUX application. As a result,
it can easily be guaranteed at authoring time that no overflow occurs
during playback.
EMBODIMENT 5
In the present embodiment, an audio mixing application is
explained in detail. This application includes an exception to the
Out_of_MUX rule of selecting only one elementary stream for each type.
That is, the audio mixing application selects an audio stream of the
Primary TS and an audio stream of the Secondary TS at the same time,
and decodes the two audio streams simultaneously.
[0191]
FIG. 40 shows the way a Primary TS and a Secondary TS
constituting the audio mixing application are supplied to the decoder
in the BD-ROM playback apparatus. In the figure, among the internal
structural components of the BD-ROM playback apparatus, a BD-ROM
drive 1a, the local storage 200 and a network unit 25 are shown on
the left side while the respective decoders are shown on the right
side. A PID Filter that performs stream demultiplexing is shown in
the center. The Primary TS (Video 1, Audio 1 (English), Audio 2
(Spanish), PG 1 (English Subtitle), IG 1 (English Menu)) and the
Secondary TS (Audio 3 (Commentary), PG 2 (Japanese Subtitle), PG 3
(Korean Subtitle), PG 4 (Chinese Subtitle), IG 2 (English Menu)) in
the figure are transport streams supplied from the BD-ROM and the
local storage, respectively. Since only English (Audio 1) and Spanish
(Audio 2) are recorded on the disk, the commentary of the movie
director cannot be selected from the disk alone. However, by
downloading, to the local storage, the Secondary TS which includes
Audio 3 provided by the content provider, the English audio (Audio 1)
and Audio 3 (Commentary) can be sent to the decoder. Then, the decoder
mixes the English audio (Audio 1) and Audio 3 (Commentary) and outputs
the result, which allows the user to play back, together with the video
(Video 1), the English audio to which the commentary is attached.
[0192]
Here, the only difference from the Out_of_MUX application is
that two audio streams are decoded at the same time. With any Primary
TS, the case may occur where a director's commentary audio, for example,
is desired to be added after the release of the disk. Accordingly, a
restriction on the bit rate of the Primary TS is not preferable, and
therefore a restriction on the Secondary TS is introduced as in the
case of the Out_of_MUX application. Since the audio mixing needs to
decode one more audio stream in addition to each elementary stream
(a video, an audio, a subtitle and a menu), two audio decoder
resources are necessary.
[0193]
In the realization of the audio mixing application, an audio
stream that will belong to a Primary TS is referred to as a primary
audio stream while an audio stream that will belong to a Secondary TS
is referred to as a secondary audio stream. The following describes
such primary and secondary audio streams.
[0194]
There are 32 primary audio streams, each of which has a
different PID from among 0x1100 to 0x111F. On the other hand,
similar to the primary audio streams, there are 32 secondary audio
streams, each of which has a different PID from among 0x1A00 to 0x1A1F.
The difference of the secondary audio streams from the primary
audio streams is that audio frames of the secondary audio streams
include metadata made up of "downmixing information" and "gain
control information".
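The two PID ranges can be expressed as the following short sketch, which classifies an audio PID as primary or secondary. The ranges are those given above; the function name is a hypothetical one chosen for illustration.

```python
PRIMARY_AUDIO_PIDS = range(0x1100, 0x1120)    # 0x1100 to 0x111F, 32 values
SECONDARY_AUDIO_PIDS = range(0x1A00, 0x1A20)  # 0x1A00 to 0x1A1F, 32 values


def classify_audio_pid(pid):
    """Return "primary", "secondary", or None for a PID outside both ranges."""
    if pid in PRIMARY_AUDIO_PIDS:
        return "primary"
    if pid in SECONDARY_AUDIO_PIDS:
        return "secondary"
    return None


for pid in (0x1100, 0x111F, 0x1A00, 0x1A1F, 0x1120):
    print(hex(pid), classify_audio_pid(pid))
```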
[0195]
The "downmixing information" is information for downmixing.
Downmixing is a conversion that reduces the number of audio
playback channels to fewer than the number of encoded channels. The
downmixing information specifies a conversion factor matrix for
downmixing, and thereby causes the playback apparatus to perform
downmixing. Playing back a 5.1 ch audio stream after converting it
into a 2 ch audio stream is one example of downmixing.
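As an illustration of applying a conversion factor matrix, the following sketch converts one 5.1 ch sample frame into a 2 ch frame. The channel order and the coefficient values are assumptions made only for this example; in the embodiment the matrix is whatever the downmixing information of the secondary audio stream specifies.

```python
import math

G = 1 / math.sqrt(2)  # assumed attenuation for centre and surround channels

# Rows: left output, right output.
# Columns (assumed order): L, R, C, LFE, Ls, Rs. LFE is dropped here.
DOWNMIX_MATRIX = [
    [1.0, 0.0, G, 0.0, G, 0.0],
    [0.0, 1.0, G, 0.0, 0.0, G],
]


def downmix(frame, matrix=DOWNMIX_MATRIX):
    """Multiply one multi-channel sample frame by the conversion matrix."""
    return [sum(coeff * sample for coeff, sample in zip(row, frame))
            for row in matrix]


# A frame with signal only in the centre channel is spread to both outputs.
print(downmix([0.0, 0.0, 1.0, 0.0, 0.0, 0.0]))
```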
[0196]
The "gain control information" is information for increasing
or decreasing a gain of the audio output of a primary audio stream;
however, the gain control information here only has to decrease the
gain. Thus, the metadata of a secondary audio stream is able to
decrease, in real time, the output of a primary audio stream which is
played back with the secondary audio stream at the same time. In the
case of superimposing a Secondary audio onto a Primary audio, since a
pair of a Primary audio and a Secondary audio to be mixed is known in
advance, there is no need to control the gain of the two audios in
real time. In this case, mixing (superposition) can be realized well
by only reducing the gain of the Primary audio while keeping the gain
of the Secondary audio unchanged. Providing such metadata prevents
the output sound volume of the primary audio stream playback and the
output sound volume of the secondary audio stream playback from
adding up and thereby damaging the speakers. Thus concludes the
description of the audio streams of the present embodiment.
Improvements of the PlayList information of the present embodiment
are described next.
[0197]
Elementary streams of the same type are to be decoded by the
decoder at the same time, and therefore regarding the PlayList
information of the present embodiment, multiple primary audio streams
and multiple secondary audio streams allowed to be played back are
shown in the STN_table of each PlayItem.
[0198]
The following describes the STN_table of the present
embodiment. To realize the audio mixing application, pairs of
Stream_entry and Stream_attribute in the secondary audio streams are
present in STN_table in addition to pairs of Stream_entry and
Stream_attribute in the primary audio streams. Each pair of
Stream_entry and Stream_attribute in the secondary audio streams is
associated with Comb_info_Secondary_audio_Primary_audio.
[0199]
(Comb_info_Secondary_audio_Primary_audio)
Comb_info_Secondary_audio_Primary_audio uniquely specifies one
or more primary audio streams with which the playback output of the
secondary audio stream can be mixed. This allows for, when the
authoring is performed, making a setting of the necessity of mixing
according to the audio attribute so that, for example, a secondary
audio stream is not mixed when a primary audio stream having a
predetermined attribute is to be played back while a secondary audio
stream is mixed when a primary audio stream having an attribute other
than the predetermined attribute is to be played back.
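The role of Comb_info_Secondary_audio_Primary_audio can be sketched with a small data structure. The field names and the use of stream numbers as identifiers are assumptions for illustration; the actual layout of the STN_table entry is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class SecondaryAudioEntry:
    """Hypothetical view of one secondary audio entry in the STN_table."""
    pid: int
    # Stream numbers of the primary audio streams with which this
    # secondary audio stream may be mixed (the Comb_info contents).
    comb_info: set = field(default_factory=set)


def can_mix(entry, primary_stream_number):
    """Return True if mixing with the given primary audio stream is allowed."""
    return primary_stream_number in entry.comb_info


# The commentary may be mixed with primary audio stream 1 but not with stream 2.
commentary = SecondaryAudioEntry(pid=0x1A00, comb_info={1})
print(can_mix(commentary, 1), can_mix(commentary, 2))
```

A playback apparatus following this sketch would consult the entry each time the selected primary audio stream changes, and mix the secondary audio stream only when the combination is permitted.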
[0200]
(sp_connection_condition Information)
In PlayList information, the same value as
connection_condition information of the PlayItem information is set
for sp_connection_condition information of a SubPlayItem. Therefore,
when connection_condition information of PlayItem information is "=
5", sp_connection_condition information of SubPlayItem information is
also set as "SP_CC = 5". In addition, In_Time and Out_Time of
SubPlayItem information show the same points of time as In_Time and
Out_Time of PlayItem information.
[0201]
Thus concludes the improvement of the recording medium of the
present embodiment. The internal structure of the playback apparatus
of the present embodiment is described next.
FIG. 41 shows an internal structure of the playback apparatus
according to Embodiment 5. The TB 6, EB 7 and audio decoder 8 are
replaced with an audio mixing processor (enclosed by the dotted
lines), as shown in the figure. The audio mixing processor inputs
two audio streams from a Primary TS and a Secondary TS, decodes them
at the same time, and mixes them. The rest of the internal structure
is the same as that for realizing the Out_of_MUX application. The
audio mixing processor is described next. The audio mixing processor
is composed of: transport buffers 6a and 6b; EBs 7a and 7b; a
preload buffer 7c; audio decoders 8a and 8b; and mixers 9a and 9b.
[0202]
The transport buffer 6a stores therein TS packets having PIDs
of audio streams and output from the PID filter 3b in a first-in
first-out manner, and sends the TS packets to the audio decoder 8a.
The transport buffer 6b stores therein TS packets having PIDs
of audio streams and output from the PID filter 3d in a first-in
first-out manner, and sends the TS packets to the audio decoder 8b.
[0203]
The EB 7a is a buffer that stores therein PES packets obtained
by converting the TS packets stored in the transport buffer 6a.
The EB 7b is a buffer that stores therein PES packets obtained
by converting the TS packets stored in the transport buffer 6b.
The preload buffer 7c is a memory for preloading the sound.bdmv
file read from the BD-ROM/local storage. The sound.bdmv file is a
file that stores therein audio data to be output in response to an
operation made on the menu.
[0204]
The audio decoder 8a decodes PES packets constituting a
Primary TS to thereby obtain noncompressed audio data in the LPCM
state, and outputs the obtained audio data. This achieves a digital
output of an audio stream.
The audio decoder 8b decodes PES packets constituting a
Secondary TS to thereby obtain noncompressed audio data in the LPCM
state, and outputs the obtained audio data. This achieves a digital
output of an audio stream.
[0205]
The mixer 9a mixes digital audio in the LPCM state output from
the audio decoder 8a and digital audio in the LPCM state output from
the audio decoder 8b.
The mixer 9b mixes digital audio in the LPCM state output from
the mixer 9a and sound data stored in the preload buffer 7c. This
mixing operation by the mixer 9b is performed when the controller 22
decodes a navigation command intended to emit a clicking sound.
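The operation of the mixers can be illustrated with the following sketch. It assumes 16-bit signed LPCM samples and a single scalar gain applied to the primary audio, standing in for the gain control information; the function name and the clipping behaviour are assumptions, not details taken from the embodiment.

```python
SAMPLE_MIN, SAMPLE_MAX = -32768, 32767  # assumed 16-bit signed LPCM


def mix(primary, secondary, primary_gain=1.0):
    """Mix two LPCM sample sequences, reducing only the primary gain.

    Samples are added pairwise and clipped to the 16-bit range.
    """
    mixed = []
    for p, s in zip(primary, secondary):
        value = int(round(p * primary_gain)) + s
        mixed.append(max(SAMPLE_MIN, min(SAMPLE_MAX, value)))
    return mixed


primary = [20000, -20000]
secondary = [20000, -20000]
print(mix(primary, secondary))        # without gain control the sum clips
print(mix(primary, secondary, 0.5))   # reduced primary gain avoids clipping
```

The same function can model the mixer 9b by passing the output of the first mix as `primary` and the sound data of the preload buffer 7c as `secondary`.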
[0206]
Thus concludes the description of the playback apparatus of
the present embodiment.
Since the audio mixing application is composed of primary
audio streams and secondary audio streams, as described above, the
verification as shown in Embodiment 2 is conducted assuming that a
primary audio stream and a secondary audio stream have been read at
the same time. Specifically speaking, the Window is shifted by one
packet each time on the ATC time axis to which the MainClip and
SubClip refer. This shifting procedure is the same as one shown in
the flowchart of FIG. 35. On each coordinate of the ATC time axis
indicated by an ATS, a stream having the highest calculated bit rate
is selected with respect to each type of a video stream, multiple
primary audio streams, multiple secondary audio streams, multiple PG
streams and multiple IG streams. The highest bit rate of the video
stream, the highest bit rate of the primary audio stream, the highest
bit rate of the secondary audio stream, the highest bit rate of the
PG stream and the highest bit rate of the IG stream are summed, and a
judgment is made whether the sum total is 48 Mbits or less. If the
sum total exceeds 48 Mbits, it is determined that there is a
violation of the BD-ROM standard.
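The verification described above can be sketched as follows. The sketch rests on several assumptions: the Window is taken to be one second long, each TS packet is counted as 188 bytes, 48 Mbits is taken as 48,000,000 bits, and packets are represented as hypothetical (ATS in seconds, stream type, PID) tuples. It is a straightforward illustration, not an optimized authoring tool.

```python
from bisect import bisect_left
from collections import defaultdict

TS_PACKET_BITS = 188 * 8     # assumed size of one TS packet
WINDOW_SECONDS = 1.0         # assumed Window length
LIMIT_BITS = 48_000_000      # 48 Mbits, assuming 1 Mbit = 1,000,000 bits


def find_violations(packets, limit=LIMIT_BITS, window=WINDOW_SECONDS):
    """Return the ATS values at which a Window exceeds the limit.

    packets: iterable of (ats_seconds, stream_type, pid), where stream_type
             is e.g. "video", "primary_audio", "secondary_audio", "PG", "IG"
    """
    packets = sorted(packets)
    times = [ats for ats, _kind, _pid in packets]
    violations = []
    # Shift the Window by one packet each time along the ATC time axis.
    for start_index, start in enumerate(times):
        end_index = bisect_left(times, start + window)
        bits_per_pid = defaultdict(int)
        kind_of_pid = {}
        for _ats, kind, pid in packets[start_index:end_index]:
            bits_per_pid[pid] += TS_PACKET_BITS
            kind_of_pid[pid] = kind
        # For each stream type, keep only the stream with the highest amount.
        highest = defaultdict(int)
        for pid, bits in bits_per_pid.items():
            kind = kind_of_pid[pid]
            highest[kind] = max(highest[kind], bits)
        if sum(highest.values()) > limit:
            violations.append(start)
    return violations


sample = [
    (0.0, "video", 0x1011),
    (0.1, "primary_audio", 0x1100),
    (0.2, "secondary_audio", 0x1A00),
    (1.5, "video", 0x1011),
]
# A deliberately tiny limit is used so that the example reports a violation.
print(find_violations(sample, limit=3 * TS_PACKET_BITS - 1))
```

Because primary and secondary audio streams are treated as separate types, both contribute to the sum, which reflects the fact that they are decoded at the same time in the audio mixing application.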
[0207]
Thus, according to the present embodiment, it is guaranteed
that the bit amount per second does not exceed a predetermined upper
limit even when primary and secondary audio streams are read
from the BD-ROM and the local storage at the same time and supplied to
the decoders for primary and secondary audio streams. With such a
guarantee, the audio mixing application can be created efficiently.
This enables a supply system that downloads, to the local storage,
additional contents that realize the audio mixing application and
supplies them to the decoder from the local storage. Therefore, a
supply arrangement in which, for example, a commentary is added after
shipment of the BD-ROM can be readily realized.
[0208]
EMBODIMENT 6
In Embodiment 1, connection points between PlayItems and
between SubPlayItems are matched by matching In_Times and Out_Times
of PlayItems and In_Times and Out_Times of SubPlayItems. On the
other hand, the present embodiment does not require the connection
points to be matched and allows some degree of time difference in
order to realize audio mixing.
[0209]
In the case of allowing the time difference, another
restriction is required. The above-mentioned process of changing
STCs is performed at seamless connections between PlayItems and
between SubPlayItems, and this changing process is performed when the
decoder is in the free-run state. Here, in the seamless connection,
the decoder cannot move to synchronous control until an STC returns,
and therefore seamless connections involving an STC change cannot be
accepted in rapid succession due to implementation issues. Accordingly,
the connection points of CC = 3 continuing both in PlayItems and
SubPlayItems should be controlled to occur at a predetermined interval
(e.g. three seconds or so) from each other.
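The interval restriction can be checked with a sketch such as the following. It assumes that the points of time at which an STC change occurs on the PlayItem side and on the SubPlayItem side are already known in seconds, and it uses three seconds as the predetermined interval given as an example above; the function name is hypothetical.

```python
def too_close(connection_times, min_interval=3.0):
    """Return pairs of neighbouring STC-change points closer than min_interval.

    connection_times: points of time (seconds) of connections involving an
                      STC change, from PlayItems and SubPlayItems combined
    """
    times = sorted(connection_times)
    return [(earlier, later)
            for earlier, later in zip(times, times[1:])
            if later - earlier < min_interval]


play_item_points = [60.0]
sub_play_item_points = [57.0]
print(too_close(play_item_points + sub_play_item_points))  # interval respected
print(too_close([60.0, 59.0]))                             # one second apart
```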
[0210]
FIG. 42 shows a correlation between PlayItems and SubPlayItems
specified by a PlayList indicating audio mixing. Level 1 of FIG. 42
shows three PlayItems (PlayItem information #1, PlayItem information
#2 and PlayItem information #3) in PlayList information. These three
PlayItems specify a Primary TS, and the connection between PlayItem
information #1 and #2 is set to CC = 1 while the connection between
PlayItem information #2 and #3 is set to CC = 5. Level 2 of FIG. 42
shows three SubPlayItems (SubPlayItem #1, SubPlayItem #2 and
SubPlayItem #3) in PlayList information. These three SubPlayItems
specify a Secondary TS, and the connection between SubPlayItems #1
and #2 is set to CC = 1 while the connection between SubPlayItems #2
and #3 is set to CC = 5. Level 3 of FIG. 42 shows nine SubPlayItems
(SubPlayItem #1, SubPlayItem #2, SubPlayItem #3 to SubPlayItem #9) in
the Progressive PlayList information. These nine SubPlayItems
specify a Secondary TS. Here, the connection between SubPlayItems #3
and #4 is set to CC = 1, the connection between SubPlayItems #4 and
#5 is set to CC = 5, and the remaining connections are set to CC = 6.
[0211]
In the figure, the start point of SubPlayItem #3 of Level 2 is 3
seconds before the start point of PlayItem #3 of Level 1. Similarly,
the start point of SubPlayItem #5 of Level 3 is 3 seconds before the
start point of PlayItem #3 of Level 1.
The time interval for changing the STC time axes of the
PlayItems and SubPlayItems is 3 seconds, and therefore the change of
the STC time axes does not occur too often.
[0212]
The timing of CC = 1 for PlayItems is set in accordance with
SP_CC = 1. This is for preventing the playback of PlayItems and
SubPlayItems from getting out of synchronization in the case where
only playback of the SubPlayItems is continued when the connection is
nonseamless with CC = 1.
The connection mode of connecting SubPlayItems with SP_CC = 5
in the middle of PlayItems becomes useful when both a theatrical
version and a director's cut are stored on a single disk.
[0213]
Level 1 of FIG. 43 shows one example of PlayList information
constituting both a theatrical version and a director's cut. Within
the PlayList information, the director's cut is composed of PlayItem
#1, PlayItem #2 and PlayItem #4 while the theatrical version is
composed of PlayItem #1, PlayItem #3 and PlayItem #4. Thus, since
PlayItem #1 and PlayItem #4 are shared by the two versions, titles
can be created efficiently. Because the part of the video that differs
between the two versions is shorter than the entire length of
the video, the data volume recorded on the disk can be reduced
effectively. Level 2 of FIG. 43 shows an example in which
commentaries corresponding to PlayItem #1, PlayItem #2 and PlayItem
#4 of Level 1 are defined as one SubPlayItem and commentaries
corresponding to PlayItem #1, PlayItem #3 and PlayItem #4 are defined
as another SubPlayItem. In this case, the commentaries corresponding
to PlayItem #1 and PlayItem #4 have to be prepared for each of the
two SubPlayItems, which is unfavorable in terms of the volume of data.
[0214]
Level 3 of FIG. 43 shows an example in which SubPlayItems
(SubPlayItem #1, SubPlayItem #2, SubPlayItem #3 and SubPlayItem #4)
corresponding respectively to PlayItem #1, PlayItem #2, PlayItem #3
and PlayItem #4 are defined. Assume that the connections of SubPlayItem
#1 with SubPlayItem #2 and with SubPlayItem #3 as well as the
connections of SubPlayItem #2 with SubPlayItem #3 and with
SubPlayItem #4 are CC = 5. These connection points occur at points
of time apart from the connection points of PlayItems. That is, on
the commentary side, branching to SubPlayItem #2 or SubPlayItem #3 is
caused 3 seconds before PlayItem #1 ends, using CC = 5 (or CC = 6).
[0215]
In addition, branching to SubPlayItem #4 is caused 3 seconds
after PlayItem #2 and PlayItem #3 end, using CC = 5 (or CC = 6). The
starts of SubPlayItem #2 and SubPlayItem #3 and the start of
SubPlayItem #4 are respectively 3 seconds apart from the starts of
PlayItem #2 and PlayItem #3 and the start of PlayItem #4. By
providing such time intervals, the change of the STC time axes does
not occur too often.
[0216]
In a precise sense, CC = 5 is required only to cause a return
from SubPlayItem #3 to SubPlayItem #4 (seamless connection at which
the ATC/STC time axes are reset), and CC = 6 can be used instead of
CC = 5 for the remaining branchings.
Thus, according to the present embodiment, since In_Time and
Out_Time of PlayItems do not match In_Time and Out_Time of
SubPlayItems, the synchronization of the ATC counters 2a and 2c as
well as the STC counters 3a and 3c is not necessary, which increases
the freedom of design of playback apparatuses.
[0217]
EMBODIMENT 7
In Embodiment 6, the primary and secondary audio streams are
targets of the restriction of the bit amount when they are read from
the BD-ROM and the local storage at the same time and supplied to the
decoder. The present embodiment explains the restriction of the bit
amount imposed when a Picture in Picture (PiP) playback application is
realized.
[0218]
PiP playback is a technology that, when MainClips constituting moving
images are specified by MainPath information of PlayList information
and SubClips constituting another set of moving images are specified
by SubPlayItem information of PlayList information, displays the
former moving images (Primary Video) and the latter moving images
(Secondary Video) on the same screen. Here, the
Primary Video is composed of HD images while the Secondary Video is
composed of SD images. The HD images have a resolution of 1920 x
1080 with a frame clock cycle of 3750 (alternatively 3753 or 3754),
like a film material. The SD images have a resolution of 720 x 480
with a display clock cycle of 1501 like an NTSC material or with a
frame clock cycle of 1800 like a PAL material.
[0219]
The SD images have about 1/4 the resolution of the HD images,
and therefore if the Primary Video, which is HD images, and the
Secondary Video are displayed on the same screen, the size of the
Secondary Video is about 1/4 in relation to the Primary Video.
Here, assume that the Secondary Video is moving images in
which only the director and/or the cast appear and give a performance
of, for example, pointing at the video content of the Primary Video.
In this case, by combining the video content of the Secondary Video
with the video content of the Primary Video, it is possible to
realize an amusing screen effect where the movie director and/or cast
is giving commentary while pointing at the contents in the playback
video of the movie.
[0220]
A video stream for the Secondary Video (secondary video
stream) is specified by multiple pieces of SubPlayItem information in
SubPath information of PlayList information. To such SubPlayItem
information, information elements of PiP_Position and PiP_Size are
newly added.
"PiP_Position" indicates, using X and Y coordinates on the
screen plane used for the playback of the Primary Video, a position
at which the playback video of the Secondary Video is to be located.
[0221]
"PiP_Size" indicates the height and width of the playback
video of the Secondary Video.
Additionally, sp_connection_condition information of
SubPlayItems in the present embodiment is set to "= 5". This means a
guarantee of a seamless connection between a secondary video stream
multiplexed into the SubClip of the current SubPlayItem and a secondary
video stream multiplexed into the SubClip of the previous SubPlayItem.
sp_connection_condition information of such SubPlayItems is set to the
same value as connection_condition information of PlayItem
information. Therefore, if connection_condition information of
PlayItem information is "= 5", sp_connection_condition information of
SubPlayItem information must also be set to "= 5". That is, if the
primary video stream on the PlayItem side is seamlessly connected,
the secondary video stream on the SubPlayItem side must be seamlessly
connected. In addition, In_Time and Out_Time of SubPlayItem
information must indicate the same points of time as In_Time and
Out_Time of PlayItem information.
[0222]
Thus concludes the description of the recording medium of the
present embodiment.
The following explains improvements of the playback apparatus.
In order to perform decode processing of secondary video streams, the
hardware of the playback apparatus of the present embodiment includes
another set of structural elements used to decode the video streams.
Here, the structural elements used to decode the video streams are
a transport buffer, a multiplexed buffer, an elementary buffer, a
decoder and a video plane, which together decode secondary video
streams. In addition, the playback apparatus of the present embodiment
includes a scaler and a synthesis unit described hereinafter.
[0223]
The scaler enlarges or reduces the size of the playback video
in the Secondary Video plane based on the height and width indicated
by PiP_Size of SubPlayItem information.
The synthesis unit realizes PiP playback by synthesizing
playback video, the size of which has been enlarged or reduced by the
scaler, and playback video obtained by the video decoder. The synthesis
of the playback video of the Primary Video and the playback video of the
Secondary Video is performed in accordance with PiP_Position
specified by SubPlayItem information. In this way, synthesized video
which is created by synthesizing the playback video of the Primary
Video and the playback video of the Secondary Video can be played
back. The synthesis unit is able to perform Chroma-key synthesis,
106
layer synthesis and the like, and perform a process of, for example,
removing the background of the Secondary Video, extracting image of a
person, and synthesizing the image of the person with the playback
video of the Primary Video. Thus concludes the description of the
playback apparatus of the present embodiment.
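As a rough sketch of the scaler and the synthesis unit described above,
the following models a video plane as a list of pixel rows. The
functions scale and synthesize, the nearest-neighbour resizing and the
single-value chroma key are assumptions made for illustration; the
embodiment does not prescribe a particular algorithm.

```python
def scale(frame, width, height):
    """Nearest-neighbour resize of a frame given as a list of pixel rows.

    width and height play the role of the size indicated by PiP_Size.
    """
    src_h, src_w = len(frame), len(frame[0])
    return [[frame[y * src_h // height][x * src_w // width]
             for x in range(width)]
            for y in range(height)]


def synthesize(primary, secondary, x0, y0, chroma_key=None):
    """Overlay the secondary frame on the primary frame at (x0, y0).

    (x0, y0) plays the role of PiP_Position. Pixels equal to chroma_key
    are treated as background and left transparent.
    """
    out = [row[:] for row in primary]  # do not modify the primary plane
    for dy, row in enumerate(secondary):
        for dx, pixel in enumerate(row):
            y, x = y0 + dy, x0 + dx
            if pixel == chroma_key:
                continue
            if 0 <= y < len(out) and 0 <= x < len(out[0]):
                out[y][x] = pixel
    return out
```

For example, synthesize(primary, scale(secondary, w, h), x, y) first
resizes the Secondary Video and then places it on the Primary Video.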
[0224]
In the case where a primary video stream and a secondary video
stream are read at the same time and supplied to the decoders in
order to realize PiP playback, both the primary and secondary video
streams are targets of the verification for restricting the bit
amount.
[0225]
Specifically speaking, as the Window is shifted on the ATC
time axis, a stream having the highest calculated bit rate is
selected, on each coordinate of the ATC time axis indicated by an ATS,
with respect to each type of primary video stream, secondary video
stream, multiple primary audio streams, multiple secondary audio
streams, multiple PG streams and multiple IG streams. The highest
bit rate of the primary video stream, the highest bit rate of the
secondary video stream, the highest bit rate of the primary audio
stream, the highest bit rate of the secondary audio stream, the
highest bit rate of the PG stream and the highest bit rate of the IG
stream are summed, and a judgment is made whether the sum total is 48
Mbits or less.
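The verification described above may be sketched as follows. This is a
minimal illustration under stated assumptions, not the authoring
procedure itself: each ATS is assumed to mark one 188-byte TS packet,
the Window is assumed to be one second long on a 27 MHz ATC time axis,
and the data layout and the names bits_in_window and verify are
hypothetical.

```python
from bisect import bisect_left

TS_PACKET_BITS = 188 * 8      # assumed: one 188-byte TS packet per ATS
WINDOW_TICKS = 27_000_000     # assumed: one second on a 27 MHz ATC time axis
LIMIT_BITS = 48_000_000       # 48 Mbits, the upper limit named in the text


def bits_in_window(ats_list, start, window):
    """Bits of one stream whose ATS falls in [start, start + window)."""
    first = bisect_left(ats_list, start)
    last = bisect_left(ats_list, start + window)
    return (last - first) * TS_PACKET_BITS


def verify(streams_by_type, window=WINDOW_TICKS, limit=LIMIT_BITS):
    """Return (True, None) if every Window position is within the limit,
    otherwise (False, ats) for the first offending Window position.

    streams_by_type maps a stream type to a list of streams; each stream
    is a sorted list of ATS values (hypothetical layout).
    """
    # The Window is shifted to each coordinate indicated by an ATS.
    starts = sorted({ats
                     for streams in streams_by_type.values()
                     for stream in streams
                     for ats in stream})
    for start in starts:
        total = 0
        for streams in streams_by_type.values():
            # Only the stream with the highest bit amount of each type counts.
            total += max((bits_in_window(s, start, window) for s in streams),
                         default=0)
        if total > limit:
            return False, start
    return True, None
```

Taking only the maximum of each type reflects that at most one stream of
each type is supplied to the decoder at a time.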
[0226]
Thus, according to the present embodiment, it is guaranteed
that the bit amount per second does not exceed a predetermined upper
limit even when primary and secondary video streams are read both
from the BD-ROM and local storage at the same time and supplied to
the respective decoders. With such a guarantee, the PiP application
can be created efficiently.
[0227]
(Supplementary Notes)
The best modes for carrying out the invention, as far as known
to the applicant at the time of filing the present application, have
been described. However, further improvements or modifications can
be made on the present invention in terms of the following technical
topics. It should be noted here that whether or not to make such
improvements or modifications is optional, and depends on the
implementer of the invention.
[0228]
(In_Time, Out_Time)
In FIG. 27, the last Video Presentation Unit of TS1 is
selected for Out_Time of the previous PlayItem while the first Video
Presentation Unit of TS2 is selected for In_Times of the current
PlayItem and the current SubPlayItem. Instead, however, a middle
Video Presentation Unit in TS1 may be selected for Out_Time of the
previous PlayItem while a middle Video Presentation Unit in TS2 may
be selected for In_Times of the current PlayItem and the current
SubPlayItem. In this case, seamless connections cannot be realized
for the current PlayItem and the current SubPlayItem, and they must
be connected using CC = 1 and SP_CC = 1.
[0229]
(All PlayList Information)
When it is desired to connect two PlayItems with CC = 5, all
PlayItem information and all SubPlayItem information that belong to
one piece of PlayList information must be connected with CC = 5.
(Data Amount Supplied to Decoder)
As to the Out_of_MUX framework, the data amount supplied to the
decoder does not always become large. For example, assume the case in
which the primary audio stream in the MainClip is composed of DD
(Dolby Digital) of CBR and MLP of VBR, and the MLP is replaced with
the DD of the CBR supplied from the local storage. In this case, the
data amount supplied to the decoder in fact decreases. If such a
decrease is evident, the verification process can be omitted.
[0230]
(Difference in Playback Times)
In order to realize CC = 5 and SP_CC = 5, it is desirable that a
difference in playback time of the video and audio streams in one
PlayItem be small. An allowable difference may be: a time period
equivalent to one video frame (1/60 to 1/25 seconds); one second or
less; a time period corresponding to a certain percentage of the
entire playback period (e.g. 1% or less); or a combination of two of
these. This is also the case for a difference in playback time of
the video and audio streams in one SubPlayItem.
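The allowable difference may be expressed, purely as an illustration, by
the following check. The function name and parameters are hypothetical.
It is assumed here that a "combination" means every selected criterion
must hold, and that the "entire playback period" is the longer of the
two playback times; the text does not fix either point.

```python
def within_allowable_difference(video_s, audio_s, frame_rate=None,
                                max_seconds=None, max_fraction=None):
    """Check the playback-time difference against the selected criteria.

    video_s, audio_s: playback times in seconds.
    frame_rate:   if given, the difference must not exceed one video frame.
    max_seconds:  if given, the difference must not exceed this many seconds.
    max_fraction: if given, the difference must not exceed this fraction
                  of the longer playback time (e.g. 0.01 for 1%).
    """
    diff = abs(video_s - audio_s)
    limits = []
    if frame_rate is not None:
        limits.append(1.0 / frame_rate)
    if max_seconds is not None:
        limits.append(max_seconds)
    if max_fraction is not None:
        limits.append(max_fraction * max(video_s, audio_s))
    if not limits:
        raise ValueError("select at least one criterion")
    # Assumed: all selected criteria must be satisfied together.
    return all(diff <= limit for limit in limits)
```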
[0231]
In the case where two elementary streams are stored in one PID,
it is desirable that a difference in playback time of the two streams
stored in the same PID is the same as or less than the minimum
playback unit (1 frame) of a stream having a shorter playback time.
This condition can be realized by storing Dolby Digital (AC-3) and
MLP (Meridian Lossless Packing) in a single elementary stream and
then recording the elementary stream on the BD-ROM.
[0232]
(Processing of Additional Contents)
It is desirable to make the initial setting of the playback
apparatus in a manner that additional contents downloaded to the
local storage 200 will be automatically deleted when several months
or several years have elapsed after the downloading.
(Substitution of PID)
When the audio mixing application is realized, PIDs are used
to distinguish between the primary and secondary audio streams; when
MPEG2-PS is used, however, it is desirable to make stream_id of PES
packet headers different from each other.
[0233]
In addition, the primary and secondary audio streams only have
to be distinguished on a system stream level so that two audio
streams can be differentiated by one demultiplexer. Alternatively,
before multiplexing two streams, PIDs of one of the streams may be
changed to different PIDs.
[0234]
(Preloading)
It is desirable that preloading of audio data (a file
"sound.bdmv") for a clicking sound is performed when the BD-ROM is
being loaded or when a title is switched. This is because, if
reading of the file sound.bdmv is attempted during the playback of an
AVClip, a seek operation of optical pickup for reading a file
different from the AVClip is caused. On the other hand, when the BD-
ROM is being loaded or when a title is switched, it is rare that the
playback of an AVClip is being continued. Therefore, by reading the
file sound.bdmv at such a timing, it is possible to enhance the
responsiveness of the apparatus and reduce the likelihood of
interrupting the AVClip playback.
[0235]
(Java™ Platform)
A Java™ platform can be structured by fully mounting, on the
playback apparatus of each embodiment, the Java™ 2 Micro Edition (J2ME)
Personal Basis Profile (PBP 1.0) and the Globally Executable MHP
specification (GEM 1.0.2) for package media targets, and then the
playback apparatus may be caused to execute a BD-J application. In
executing the application, the playback apparatus may be caused to
perform the Out_of_MUX framework.
[0236]
(Title)
It is preferable to create a "module manager" in the playback
apparatus, which selects a title according to the mount of the BD-ROM,
a user operation, or a state of the apparatus. The decoder in the
BD-ROM playback apparatus performs playback of an AVClip based on the
PlayList information according to the title selection by the "module
manager".
[0237]
When the "module manager" selects a title, the application
manager executes signaling using an application management table
(AMT) corresponding to a previous title and an AMT corresponding to
the current title. The signaling takes control that terminates the
operation of an application described in the AMT of the previous
title but not described in the AMT of the current title, while
commencing the operation of an application not described in the AMT
of the previous title but described in the AMT of the current title.
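The signaling amounts to two set differences, which may be sketched as
below. The function name and the representation of an AMT as a
collection of application identifiers are assumptions for illustration.

```python
def signal_title_change(previous_amt, current_amt):
    """Decide which applications to terminate, start and keep running.

    previous_amt, current_amt: iterables of application identifiers
    described in the AMT of the previous and the current title.
    """
    previous, current = set(previous_amt), set(current_amt)
    to_terminate = previous - current   # in previous AMT only
    to_start = current - previous       # in current AMT only
    to_continue = previous & current    # described in both AMTs
    return to_terminate, to_start, to_continue
```

An application described in both AMTs is neither terminated nor started
again in this sketch; it simply keeps running across the title change.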
[0238]
(Directory Structure in Local Storage)
Individual areas in the local storage described in each
embodiment are preferably created under a directory corresponding to
a disk's root certificate of the BD-ROM.
The disk's root certificate is a root certificate that is
distributed by the root certificate authority and assigned to the
BD-ROM by the creator of the BD-ROM. The disk's root certificate is
encoded in, for example, the X.509 format. The specifications of
X.509 have been issued by the International Telegraph and Telephone
Consultative Committee, and are described in CCITT Recommendation
X.509 (1988), "The Directory - Authentication Framework".
[0239]
In addition, it is preferable that the contents recorded in
the BD-ROM and local storage be encoded using the Advanced Access
Content System (AACS), that signature information be attached thereto,
and that a use authorization be specified in a permission file.
(Package to be Mounted)
When the BD-ROM playback apparatus is implemented as the Java™
platform, it is desirable to mount the following BD-J Extension on
the playback apparatus. The BD-J Extension includes various packages
specialized to provide functions beyond GEM[1.0.2] to the Java™
platform. Packages included in the BD-J Extension are shown below.
• org.bluray.media
This package provides special functions to be added to Java™
Media Framework. Controls for selecting angle, audio and subtitle
are added to the package.
• org.bluray.ti
This package includes: API for mapping "services" of
GEM[1.0.2] on a "title"; a mechanism to inquire about title
information from the BD-ROM; and a mechanism to select a new title.
• org.bluray.application
This package includes APIs for managing active periods of an
application. In addition, the package includes APIs for inquiring
about information required for signaling when an application is
performed.
• org.bluray.ui
This package includes classes that define constants used for
key events specialized for the BD-ROM and realize synchronization
with video playback.
• org.bluray.vfs
This package provides a mechanism (Binding Scheme) to bind
contents recorded on the BD-ROM (on-disc contents) and contents in
the local storage (off-disc contents), which are not recorded on the
BD-ROM, in order to play back the data seamlessly regardless of where
the data is recorded.
[0240]
Binding Scheme associates contents on the BD-ROM (AVClip,
subtitle, and BD-J application) with related contents in the local
storage. Binding Scheme realizes seamless playback regardless of
where the contents are recorded.
(Virtual Package)
The BD-ROM playback apparatus may be caused to perform a
process of creating Virtual Package. This is realized by the
playback apparatus creating Virtual Package information. Virtual
Package information is information obtained by expanding the volume
management information on the BD-ROM. Here, the volume management
information is information that specifies a directory-file structure
existing on a recording medium, and is composed of directory
management information related to the directories and file management
information related to the files. Virtual Package information is
designed to expand the directory-file structure of the BD-ROM by
adding new file management information to the volume management
information showing the directory-file structure of the BD-ROM.
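The expansion of the directory-file structure can be pictured with the
following sketch, in which the volume management information is reduced
to a mapping from a path to the place where the file actually resides.
The function name, the mapping layout and the example paths are
hypothetical. The handling of a path present in both sources is also an
assumption; the text only describes adding new file management
information.

```python
def build_virtual_package(disc_files, local_files):
    """Return a mapping: virtual path -> (medium, actual path).

    disc_files:  paths in the directory-file structure of the BD-ROM.
    local_files: mapping of virtual path -> path in the local storage.
    """
    # Start from the file management information of the BD-ROM itself.
    virtual = {path: ("BD-ROM", path) for path in disc_files}
    # Add new file management information for the off-disc contents.
    for virtual_path, local_path in local_files.items():
        # Assumed: on a path collision the local-storage entry is used.
        virtual[virtual_path] = ("local storage", local_path)
    return virtual
```

A player that resolves every file access through such a mapping can read
on-disc and off-disc contents without regard to where they are recorded.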
[0241]
(Realization of Control Procedure)
Both the control procedures explained in the above-described
embodiments using the flowcharts and the control procedures of the
functional components explained in the above-described embodiments
satisfy the requirements for the "program invention" since the above-
mentioned control procedures are realized concretely using the
hardware resources and are the creation of a technical idea utilizing
natural laws.
[0242]
• Production of Program of Present Invention
The program of the present invention is an object program that
can execute on a computer. The object program is composed of one or
more program codes that cause the computer to execute each step in
the flowchart or each procedure of the functional components. There
are various types of program codes such as the native code of the
processor, and JAVA™ byte code. There are also various forms of
realizing the steps of the program codes. For example, when each
step can be realized by using an external function, the call
statements for calling the external functions are used as the program
codes. Program codes that realize one step may belong to different
object programs. In the RISC processor in which the types of
instructions are limited, each step of flowcharts may be realized by
combining arithmetic operation instructions, logical operation
instructions, branch instructions and the like.
[0243]
The program of the present invention can be produced as
follows. First, the software developer writes, using a programming
language, a source program that achieves each flowchart and
functional component. In this writing, the software developer uses
the class structure, variables, array variables, calls to external
functions, and so on, which conform to the syntax of the
programming language s/he uses.
[0244]
The written source program is sent to the compiler as files.
The compiler translates the source program and generates an object
program.
The translation performed by the compiler includes processes
such as syntax analysis, optimization, resource allocation, and code
generation. In the syntax analysis, the characters and phrases,
syntax, and meaning of the source program are analyzed and the source
program is converted into an intermediate program. In the
optimization, the intermediate program is subjected to such processes
as the basic block setting, control flow analysis, and data flow
analysis. In the resource allocation, to adapt to the instruction
sets of the target processor, the variables in the intermediate
program are allocated to the register or memory of the target
processor. In the code generation, each intermediate instruction in
the intermediate program is converted into a program code, and an
object program is obtained.
[0245]
After the object program is generated, the programmer
activates a linker. The linker allocates the memory spaces to the
object programs and the related library programs, and links them
together to generate a load module. The generated load module is
based on the presumption that it is read by the computer and causes
the computer to execute the procedures indicated in the flowcharts
and the procedures of the functional components. The program of the
present invention can be produced in this way.
[0246]
The program of the present invention can be used as follows.
When the program of the present invention is used as an embedded
program, the load module as the program is written into an
instruction ROM, together with the Basic Input/Output System (BIOS)
program and various pieces of middleware (operating systems). The
program of the present invention is used as the control program of
the playback apparatus 300 as the instruction ROM is embedded in the
control unit and is executed by the CPU.
[0247]
When the playback apparatus is a bootstrap model, the Basic
Input/Output System (BIOS) program is embedded in an instruction ROM,
and various pieces of middleware (operating systems) are preinstalled
in a hard disk. Also, a boot ROM for activating the system from the
hard disk is provided in the playback apparatus. In this case, only
the load module is supplied to the playback apparatus via a
transportable recording medium and/or a network, and is installed in
the hard disk as one application. This enables the playback
apparatus to perform the bootstrapping by the boot ROM to activate an
operating system, and then causes the CPU to execute the installed
load module as one application so that the program of the present
application can be used.
[0248]
As described above, when the playback apparatus is a hard-disk
model, the program of the present invention can be used as one
application. Accordingly, it is possible to transfer, lend, or
supply, via a network, the program of the present invention
separately.
(Controller 22)
The controller 22 can be realized as one system LSI.
[0249]
The system LSI is obtained by implementing a bare chip on a
high-density substrate and packaging them. The system LSI is also
obtained by implementing a plurality of bare chips on a high-density
substrate and packaging them, so that the plurality of bare chips
have an outer appearance of one LSI (such a system LSI is called a
multi-chip module).
The system LSI has a QFP (Quad Flat Package) type and a PGA
(Pin Grid Array) type. In the QFP-type system LSI, pins are attached
to the four sides of the package. In the PGA-type system LSI, many
pins are attached to the entire bottom.
[0250]
These pins function as an interface with other circuits. The
system LSI, which is connected with other circuits through such pins
as an interface, plays a role as the core of the playback apparatus.
Each of the bare chips packaged in the system LSI is composed
of: a front end unit; a back end unit; and a digital processing unit.
The front end unit digitizes an analog signal while the back end
unit changes the obtained data into analog form and outputs it.
[0251]
Each structural element shown in the diagram of the internal
structure in the above embodiment is mounted in the digital
processing unit.
As described above in "Used as Embedded Program", the load
module as the program, the Basic Input/Output System (BIOS) program
and various pieces of middleware (operating systems) are written into
an instruction ROM. The major improvement of the embodiments is
achieved by the load module as the program. It is therefore possible
to produce a system LSI of the present invention by packaging, as a
bare chip, the instruction ROM in which the load module as the
program is stored.
[0252]
It is desirable to employ SoC implementation or SiP
implementation for the actual implementation. The SoC (System on
chip) implementation is a technique that burns multiple circuits onto
one chip. The SiP (System in package) implementation is a technique
that puts multiple chips in one package using resin. By the above
procedure, the system LSI of the present invention can be produced
based on the internal structure diagrams of the playback apparatus
described in each embodiment.
[0253]
It should be noted here that although the term LSI is used
here, it may be called IC, LSI, super LSI, ultra LSI or the like,
depending on the level of integration.
Further, part or all of the components of each playback
apparatus may be achieved as one chip. The integrated circuit is not
limited to the SoC implementation or the SiP implementation, but may
be achieved by a dedicated circuit or a general purpose processor.
It is also possible to achieve the integrated circuit by using the
FPGA (Field Programmable Gate Array) that can be re-programmed after
it is manufactured, or a reconfigurable processor that can
reconfigure the connection and settings of the circuit cells inside
the LSI. Furthermore, a technology for an integrated circuit that
replaces the LSI may appear in the near future as the semiconductor
technology improves or branches into other technologies. In that
case, the new technology may be incorporated into the integration of
the functional blocks constituting the present invention as described
above. Such possible technologies include biotechnology.
Industrial Applicability
[0254]
The recording medium and playback apparatus of the present
invention can be mass-produced based on the internal structures of
them shown in the embodiments above. As such, the recording medium and
playback apparatus of the present invention have industrial
applicability.
Claims
1. A recording medium on which playlist information is recorded,
wherein the playlist information includes main-path
information and sub-path information,
the main-path information specifies, among a plurality of
digital streams, one digital stream as a main stream, and defines a
primary playback section on the main stream,
the sub-path information specifies, among rest of the
plurality of digital streams, one digital stream as a substream, and
defines, on the substream, a secondary playback section which is to
be synchronized with the primary playback section,
the playlist information further includes a stream table
showing at least one pair of elementary streams which are allowed to
be simultaneously played back, the pair of elementary streams being
made up of one of a plurality of elementary streams multiplexed into
the main stream and one of a plurality of elementary streams
multiplexed into the substream, and
a total data size of a digital stream per unit time is less
than or equal to a predetermined value, the digital stream including
the pair of elementary streams and not including an elementary stream
which is not allowed in the stream table to be simultaneously played
back.
2. The recording medium of Claim 1,
wherein in a case where at least one of the main stream and
the substream includes a plurality of elementary streams of same type,
an elementary stream having a highest bit rate among the elementary
streams of same type is used for calculation of the total data size
of the digital stream per unit time.
3. The recording medium of Claim 2,
wherein types of the elementary streams include a primary-
audio stream type and a secondary-audio stream type,
an elementary stream of the primary-audio stream type and an
elementary stream of the secondary-audio stream type make up an audio
mixing application, and
with respect to each of the primary-audio stream type and the
secondary-audio stream type, the elementary stream having the highest
bit rate is selected from a plurality of elementary streams.
4. The recording medium of Claim 2,
wherein types of the elementary streams include a
primary-video stream type and a secondary-video stream type,
an elementary stream of the primary-video stream type and an
elementary stream of the secondary-video stream type make up a
picture-in-picture application, and
with respect to each of the primary-video stream type
and the secondary-video stream type, the elementary stream having the
highest bit rate is selected.
5. A playback apparatus for playing back, in accordance with playlist
information, a main stream in which a primary playback section is
defined and a substream in which a secondary playback section is
defined,
wherein the playlist information defines, with respect to each
of a plurality of digital streams, a playback section, and includes
main-path information and sub-path information, and
the playback apparatus comprising:
a 1st reading unit operable to read packets constituting the
primary playback section in accordance with the main-path
information;
a 2nd reading unit operable to read packets constituting the
secondary playback section in accordance with the sub-path
information;
a decoder; and
a demultiplexing unit operable to demultiplex the primary
playback section and the secondary playback section to obtain packets,
which are supplied to the decoder, and
wherein a total data size of the supplied packets per unit
time is less than or equal to a predetermined value.
6. The playback apparatus of Claim 5,
wherein in a case where at least one of the main stream and
the substream includes a plurality of elementary streams of same type,
an elementary stream having a highest bit rate among the elementary
streams of same type is used for calculation of the total data size
of the digital stream per unit time.
7. The playback apparatus of Claim 6,
wherein types of the elementary streams include a primary-
audio stream type and a secondary-audio stream type,
the decoder includes:
a 1st decoder operable to decode an elementary stream of the
primary-audio stream type;
a 2nd decoder operable to decode an elementary stream of the
secondary-audio stream type; and
a synthesis unit operable to synthesize decoded results
obtained by the 1st and 2nd decoders, and
with respect to each of the primary-audio stream type and the
secondary-audio stream type, the elementary stream having the highest
bit rate is selected from a plurality of elementary streams.
8. The playback apparatus of Claim 6,
wherein types of the elementary streams include a primary-
video stream type and a secondary-video stream type,
the decoder includes:
a 1st decoder operable to decode an elementary stream of the
primary-video stream type;
a 2nd decoder operable to decode an elementary stream of the
secondary-video stream type; and
a synthesis unit operable to synthesize decoded results
obtained by the 1st and 2nd decoders, and
with respect to each of the primary-video stream type and the
secondary-video stream type, the elementary stream having the highest
bit rate is selected.
9. A recording method, for recording application data on a recording
medium, comprising the steps of:
(a) generating the application data;
(b) verifying the application data; and
(c) obtaining the recording medium to which the application
data, whose authenticity has been verified, is written,
wherein the application data includes playlist information and
a plurality of digital streams,
the playlist information includes main-path information and
sub-path information,
the main-path information specifies, among the plurality of
digital streams, one digital stream as a main stream, and defines a
primary playback section on the main stream,
the sub-path information specifies, among rest of the
plurality of digital streams, one digital stream as a substream, and
defines, on the substream, a secondary playback section which is to
be synchronized with the primary playback section,
the playlist information further includes a stream table
showing at least one pair of elementary streams which are allowed to
be simultaneously played back, the pair of elementary streams being
made up of one of a plurality of elementary streams multiplexed into
the main stream and one of a plurality of elementary streams
multiplexed into the substream, and
the step (b) verifies whether a total data size of a digital
stream per unit time is less than or equal to a predetermined value,
the digital stream including the pair of elementary streams and not
including an elementary stream which is not allowed in the stream
table to be simultaneously played back.
10. The recording method of Claim 9,
wherein in a case where at least one of the main stream and
the substream includes a plurality of elementary streams of same type,
an elementary stream having a highest bit rate among the elementary
streams of same type is used for calculation of the total data size
of the digital stream per unit time.
11. The recording method of Claim 10,
wherein types of the elementary streams include a primary-
audio stream type and a secondary-audio stream type,
an elementary stream of the primary-audio stream type and an
elementary stream of the secondary-audio stream type make up an audio
mixing application, and
with respect to each of the primary-audio stream type and the
secondary-audio stream type, the elementary stream having the highest
bit rate is selected from a plurality of elementary streams.
12. The recording method of Claim 10,
wherein types of the elementary streams include a
primary-video stream type and a secondary-video stream type,
an elementary stream of the primary-video stream type and an
elementary stream of the secondary-video stream type make up a
picture-in-picture application, and
with respect to each of the primary-video stream type and the
secondary-video stream type, the elementary stream having the highest
bit rate is selected.
13. The recording method of Claim 9,
wherein each of the digital streams includes a plurality of
packets, each of which has an arrival time stamp attached thereto,
and
the recording method further comprising a step of:
defining, on a time axis that functions as a reference for the
arrival time stamps, a window having a length of the unit time, and
shifting the window along the time axis in accordance with a
coordinate indicated by each of the arrival time stamps, and
wherein verification of the total data size of step (b) is
performed each time the window shifts.
14. A playback method for playing back, in accordance with playlist
information, a main stream in which a primary playback section is
defined and a substream in which a secondary playback section is
defined,
wherein the playlist information defines, with respect to each
of a plurality of digital streams, a playback section, and includes
main-path information and sub-path information, and
the playback method comprising:
a 1st reading step of reading packets constituting the primary
playback section in accordance with the main-path information;
a 2nd reading step of reading packets constituting the
secondary playback section in accordance with the sub-path
information; and
a demultiplexing step of demultiplexing the primary playback
section and the secondary playback section to obtain packets, which
are supplied to a decoder, and
wherein a total data size of the supplied packets per unit
time is less than or equal to a predetermined value.
On a BD-ROM, PlayList information is recorded. The PlayList
information includes MainPath information and SubPath information.
The MainPath information specifies one of a plurality of AVClips as a
MainClip, and defines a primary playback section on the MainClip.
The SubPath information specifies, among the rest of the AVClips, one
AVClip as a SubClip, and defines, on the SubClip, a secondary
playback section to be synchronized with the primary playback section.
The PlayList information includes an STN_table, which indicates
SubClip and, from among a plurality of elementary streams multiplexed
into the SubClip, elementary streams allowed to be played back. A
total data size of AVClip per unit time is, for example, less than or
equal to 48 Mbits when the AVClip includes a plurality of elementary
streams allowed in the STN_table to be played back and does not
include elementary streams which are not allowed in the STN_table to
be played back.