Title of Invention

SYSTEMS AND METHODS FOR DIGITAL DOCUMENT PROCESSING

Abstract

The invention relates to display technologies that separate the underlying functionality of an application program from the graphical display process, thereby eliminating or reducing the application's need to control the device display and to provide graphical user interface tools and controls for the display. Additionally, such systems reduce or eliminate the need for an application program to be present on a processing system when displaying data created by or for that application program, such as a document or video stream. Thus it will be understood that in one aspect, the systems and methods described herein can display content, including documents, video streams, or other content, and will provide the graphical user functions for viewing the displayed document, such as zoom, pan, or other such functions, without need for the underlying application to be present on the system that is displaying the content. The advantages over the prior art of the systems and methods described herein include the advantage of allowing different types of content from different application programs to be shown on the same display within the same work space.
Full Text

Systems and Methods for Digital Document Processing
Related Applications

This application claims priority to earlier filed British Patent Application No. 0009129.8, filed 14 April 2000, and US Patent Application Serial Number 09/703,502, filed 31 October 2000, both having Majid Anwar as an inventor, the contents of which are hereby incorporated by reference.

Field of the Invention

The invention relates to data processing systems, and more particularly, to methods and systems for processing digital documents to generate an output representation of a source document as a visual display, a hardcopy, or in some other display format.

Background

As used herein, the term "digital document" is used to describe a digital representation of any type of data processed by a data processing system which is intended, ultimately, to be output in some form, in whole or in part, to a human user, typically by being displayed or reproduced visually (e.g., by means of a visual display unit or printer), or by text-to-speech conversion, etc. A digital document may include any features capable of representation, including but not limited to the following: text; graphical images; animated graphical images; full motion video images; interactive icons, buttons, menus or hyperlinks. A digital document may also include non-visual elements such as audio (sound) elements.
Data processing systems, such as personal computer systems, are typically required to process "digital documents," which may originate from any one of a number of local or remote sources and which may exist in any one of a wide variety of data formats ("file formats"). In order to generate an output version of the document, whether as a visual display or printed copy, for example, it is necessary for the computer system to interpret the original data file and to generate an output compatible with the relevant output device (e.g., monitor or other visual display device, or printer). In general, this process will involve an application program adapted to interpret the data file, the operating system of the computer, a software "driver" specific to the desired output device and, in some cases (particularly for monitors or other visual display units), additional hardware in the form of an expansion card.
This conventional approach to the processing of digital documents in order to generate an output is inefficient in terms of hardware resources, software overheads and processing time, and is completely unsuitable for low power, portable data processing systems, including wireless telecommunication systems, or for low cost data processing systems such as network terminals, etc. Other problems are encountered in conventional digital document processing systems, including the need to configure multiple system components (including both hardware and software components) to interact in the desired manner, and inconsistencies in the processing of identical source material by different systems (e.g., differences in formatting, color reproduction, etc.). In addition, the conventional approach to digital document processing is unable to exploit the commonality and/or re-usability of file format components.

Summary of the Invention

It is an object of the present invention to provide digital document processing methods and systems, and devices incorporating such methods and systems, which obviate or mitigate the aforesaid disadvantages of conventional methods and systems.

The systems and methods described herein provide a display technology that separates the underlying functionality of an application program from the graphical display process, thereby eliminating or reducing the application's need to control the device display and to provide graphical user interface tools and controls for the display. Additionally, such systems reduce or eliminate the need for an application program to be present on a processing system when displaying data created by or for that application program, such as a document or video stream. Thus it will be understood that in one aspect, the systems and methods described herein can display content, including documents, video streams, or other content, and will provide the graphical user functions for viewing the displayed document, such as zoom, pan, or other such functions, without need for the underlying application to be present on the system that is displaying the content. The advantages over the prior art of the systems and methods described herein include the advantage of allowing different types of content from different application programs to be shown on the same display within the same work space. Many more advantages will be apparent to those of ordinary skill in the art, who will also be able to see numerous ways of employing the underlying technology of the invention for creating additional systems, devices, and applications. These modified systems and alternate systems and practices will be understood to fall within the scope of the invention.

More particularly, the systems and methods described herein include a digital content processing system that comprises an application dispatcher for receiving an input byte stream representing source data in one of a plurality of predetermined data formats and for associating the input byte stream with one of the predetermined data formats. The system may also comprise a document agent for interpreting the input byte stream as a function of the associated predetermined data format and for parsing the input byte stream into a stream of document objects that provide an internal representation of primitive structures within the input byte stream. The systems also include a core document engine for converting the document objects into an internal representation data format and for mapping the internal representation data to a location on a display. A shape processor within the system processes the internal representation data to drive an output device to present the content as expressed through the internal representation.
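The pipeline just summarized (dispatcher, document agent, core document engine, shape processor) can be sketched in code. This is a minimal illustration only; every class, method, and format name below is an assumption made for the sketch and is not taken from the patent:

```python
# Illustrative sketch of the claimed pipeline; all names are hypothetical.

class ApplicationDispatcher:
    """Associates an input byte stream with one of the known data formats."""
    def __init__(self, known_formats):
        self.known_formats = known_formats

    def classify(self, byte_stream: bytes) -> str:
        # A real dispatcher would consult the loaded document agents; here
        # a couple of well-known magic prefixes stand in for that step.
        if byte_stream.startswith(b"%PDF"):
            return "pdf"
        if byte_stream.lstrip().startswith(b"<"):
            return "html"
        return "text"

class DocumentAgent:
    """Parses a byte stream of one format into primitive document objects."""
    def parse(self, byte_stream: bytes) -> list:
        return [{"type": "text", "content": byte_stream.decode("utf-8", "replace")}]

class CoreDocumentEngine:
    """Converts document objects into the internal representation and
    maps each object to a location on the display."""
    def layout(self, objects: list) -> list:
        return [dict(obj, x=0, y=40 * i) for i, obj in enumerate(objects)]

class ShapeProcessor:
    """Drives the output device from the internal representation."""
    def render(self, placed_objects: list) -> str:
        return "\n".join(f"({o['x']},{o['y']}): {o['content']}" for o in placed_objects)

def process(byte_stream: bytes) -> str:
    dispatcher = ApplicationDispatcher(["pdf", "html", "text"])
    fmt = dispatcher.classify(byte_stream)
    agents = {f: DocumentAgent() for f in dispatcher.known_formats}  # one agent per format
    engine = CoreDocumentEngine()
    return ShapeProcessor().render(engine.layout(agents[fmt].parse(byte_stream)))
```

The point of the sketch is the division of labor: only the agent knows the source format, and only the shape processor knows the output device.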
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings.

Brief Description of the Drawings

The foregoing and other objects and advantages of the invention will be appreciated more fully from the following further description thereof, with reference to the accompanying drawings, wherein:

Figure 1 is a block diagram illustrating an embodiment of a digital document processing system in accordance with the present invention;

Figure 2 is a block diagram that presents in greater detail the system depicted in Figure 1;

Figure 3 is a flowchart diagram of one document agent;

Figure 4 depicts schematically an exemplary document of the type that can be processed by the system of Figure 1;

Figure 5 depicts flowchart diagrams of two exemplary processes employed to reduce redundancy within the internal representation of a document; and

Figures 6-8 depict an exemplary data structure for storing an internal representation of a processed source document.

Detailed Description of Certain Illustrated Embodiments
The systems and methods described herein include computer programs that operate to process an output stream or output file generated by an application program for the purpose of presenting the output on an output device, such as a video display. The applications according to the invention can process these streams to create an internal representation of that output and can further process that internal representation to generate a new output stream that may be displayed on an output device as the output generated by the application according to the invention. Accordingly, the systems of the invention decouple the application program from the display process, thus relieving the application program from having to display its output onto a particular display device, and further remove the need to have the application program present when processing the output of that application for the purpose of displaying that output.
To illustrate this operation, Figure 1 provides a high-level functional block diagram of a system 10 that allows a plurality of application programs, shown collectively as element 13, to deliver their output streams to a computer process 8 that processes those output streams and generates a representation of the collective output created by those streams for display on the device 26. The collective output of the application programs 13 is depicted in Figure 1 by the output display device 26 that presents the output content generated by the different application programs 13. It will be understood by those of skill in the art that the output device 26 is presenting output generated by the computer process 8 and that this output collectively carries the content of the plural application programs 13. In the illustration provided by Figure 1, the presented content comprises a plurality of images and the output device 26 is a display. However, it will be apparent to those of skill in the art that in other practices the content may be carried in a format other than images, such as auditory, tactile, or any other format, or combination of formats, suitable for conveying information to a user. Moreover, it will be understood by those of skill in the art that the type of output device 26 will vary according to the application and may include devices for presenting audio content, video content, printed content, plotted content or any other type of content. For the purpose of illustration, the systems and methods described herein will largely be shown as displaying graphical content through display devices, yet it will be understood that these exemplary systems are only for the purpose of illustration, and not to be understood as limiting in any way. Thus the output generated by the application programs 13 is processed and aggregated by the computer process 8 to create a single display that includes all the content generated by the individual application programs 13.
In the depicted embodiment, each of the representative outputs appearing on display 26 is termed a document, and each of the depicted documents can be associated with one of the application programs 13. It will be understood that the term document as used herein will encompass documents, streamed video, streamed audio, web pages, and any other form of data that can be processed and displayed by the computer process 8. The computer process 8 generates a single output display that includes within that display one or more of the documents generated from the application programs 13. The collection of displayed documents represents the content generated by the application programs 13, and this content is displayed within the program window generated by the computer process 8. The program window for the computer process 8 also may include a set of icons representative of tools provided with the graphical user interface and capable of allowing a user to control the operation, in this case the display, of the documents appearing in the program window.
In contrast, the conventional approach of having each application program form its own display would result in a presentation on the display device 26 that included several program windows, typically one for each application program 13. Additionally, each different type of program window would include a different set of tools for manipulating the content displayed in that window. Thus the system 10 of the invention has the advantage of providing a consistent user interface, and only requiring knowledge of one set of tools for displaying and controlling the different documents. Additionally, the computer process 8 operates on the output of the application programs 13, thus only requiring that output to create the documents that appear within the program window. Accordingly, it is not necessary that the application programs 13 be resident on the same machine as the process 8, nor that the application programs 13 operate in concert with the computer process 8. The computer process 8 needs only the output from these application programs 13, and this output can be derived from stored data files that were created by the application programs 13 at an earlier time. However, the systems and methods described herein may be employed as part of systems wherein an application program is capable of presenting its own content, controlling at least a portion of the display 26 and presenting that content within a program window associated with that application program. In these embodiments the systems and methods of the invention can work as separate applications that appear on the display within a portion of the display provided for their use.
More particularly, Figure 1 depicts a plurality of application programs 13. These application programs can include word processing programs such as Word, WordPerfect, or any other similar word processing program. They can further include programs such as Netscape Composer that generate HTML files, Adobe Acrobat that processes PDF files, a web server that delivers XML or HTML, a streaming server that generates a stream of audio-visual data, an e-mail client or server, a database, a spreadsheet, or any other kind of application program that delivers output either as a file, a data stream, or in some other format suitable for use by a computer process.
In the embodiment of Figure 1 each of the application programs 13 presents its output content to the computer process 8. In operation this can occur by having the application program 13 direct its output stream as an input byte stream to the computer process 8. The use of data streams is well known to those of ordinary skill in the art and described in the literature, including, for example, Stephen G. Kochan, Programming in C, Hayden Publishing (1983). Optionally, the application program 13 can create a data file, such as a Word document, that can be streamed into the computer process 8 either by a separate application or by the computer process 8.
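The streaming arrangement just described can be illustrated with a short sketch: a stored data file is delivered to a consuming process as an anonymous sequence of byte chunks, so the consumer never needs the creating application. The chunk size and helper name are illustrative choices, not part of the patent:

```python
# Minimal sketch: a stored file delivered as an application-independent
# byte stream, read in fixed-size chunks by the consuming process.

import io

def stream_bytes(source, chunk_size=4096):
    """Yield the source's content as a sequence of byte chunks."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield chunk

# The consumer sees only bytes; the application that produced them
# need not be present (here a BytesIO stands in for a stored data file).
source_file = io.BytesIO(b"output produced earlier by some application")
received = b"".join(stream_bytes(source_file, chunk_size=8))
```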
The computer process 8 is capable of processing the various input streams to create the aggregated display shown on display device 26. To this end, and as will be shown in greater detail hereinafter, the computer process 8 processes the incoming streams to generate an internal representation of each of these input streams. In one practice this internal representation is meant to look as close as possible to the output stream of the respective application program 13. However, in other embodiments the internal representation may be created to have a selected, simplified or partial likeness to the output stream generated by the respective application program 13. Additionally and optionally, the systems and methods described herein may also apply filters to the content being translated, thereby allowing certain portions of the content to be removed from the content displayed or otherwise presented. Further, the systems and methods described herein may allow alteration of the structure of the source document, allowing for repositioning content within a document, rearranging the structure of the document, or selecting only certain types of data. Similarly, in an optional embodiment, content can be added during the translation process, including active content such as links to web sites. In either case, the internal representation created by computer process 8 may be further processed by the computer process 8 to drive the display device 26 to create the aggregated image represented in Figure 1.
Turning to Figure 2, a more detailed representation of the system of Figure 1 is presented. Specifically, Figure 2 depicts the system 10, which includes the computer process 8, the source documents 11, and a display device 26. The computer process 8 includes a plurality of document agents 12, an internal representation format file and process 14, buffer storage 15, a library of generic objects 16, a core document engine that in this embodiment comprises a parsing module 18 and a rendering module 19, an internal view 20, a shape processor 22 and a final output 24. Figure 2 further depicts an optional input device 30 for transmitting user input 40 to the computer process 8. The depicted embodiment includes a process 8 that comprises a shape processor 22. However, it will be apparent to those of ordinary skill in the art that the depicted process 8 is only exemplary and that the process 8 may be realized through alternate processes and architectures. For example, the shape processor 22 may optionally be realized as a hardware component, such as a semiconductor device, that supports the operation of the other elements of the process 8. Moreover, it will be understood that although Figure 2 presents process 8 as a functional block diagram that comprises a single system, it may be that process 8 is distributed across a number of different platforms, and optionally it may be that the elements operate at different times and that the output from one element of process 8 is delivered at a later time as input to the next element of process 8.
As discussed above, each source document 11 is associated with a document agent 12 that is capable of translating the incoming document into an internal representation of the content of that source document 11. To identify the appropriate document agent 12 to process a source document 11, the system 10 of Figure 1 includes an application dispatcher (not shown) that controls the interface between application programs and the system 10. In one practice, the use of an external application programming interface (API) is handled by the application dispatcher, which passes data, calls the appropriate document agent 12, or otherwise carries out the request made by the application program. To select the appropriate document agent 12 for a particular source document 11, the application dispatcher advertises the source document 11 to all the loaded document agents 12. These document agents 12 then respond with information regarding their particular suitability for translating the content of the published source document 11. Once the document agents 12 have responded, the application dispatcher selects a document agent 12 and passes a pointer, such as a URI of the source document 11, to the selected document agent 12.
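The advertise-and-select handshake described above can be sketched as follows. The agent classes and the 0.0-1.0 suitability scale are assumptions made for this illustration; the patent does not specify how agents report suitability:

```python
# Sketch of the dispatcher advertising a source document to all loaded
# agents and selecting the one that reports the highest suitability.

class WordAgent:
    def suitability(self, name: str) -> float:
        return 1.0 if name.endswith(".doc") else 0.0

class HtmlAgent:
    def suitability(self, name: str) -> float:
        return 1.0 if name.endswith((".html", ".htm")) else 0.0

def select_agent(source_name, loaded_agents):
    """Advertise the source document to every loaded agent and return the
    best responder, or None when no agent claims any suitability."""
    scored = [(agent.suitability(source_name), agent) for agent in loaded_agents]
    best_score, best_agent = max(scored, key=lambda pair: pair[0])
    return best_agent if best_score > 0 else None

agents = [WordAgent(), HtmlAgent()]
chosen = select_agent("report.doc", agents)  # dispatcher then passes the URI to it
```

Returning None for an unrecognized format corresponds to the fallback behavior discussed below, where the dispatcher must signal the application or locate a new agent.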
In one practice, the computer process 8 may be run as a service under which a plurality of threads may be created, thereby supporting multi-processing of plural document sources 11. In other embodiments, the process 8 does not support multi-threading and the document agent 12 selected by the application dispatcher will be called in the current thread.
It will be understood that the exemplary embodiment of Figure 2 provides a flexible and extensible front end for processing incoming data streams of different file formats. For example, optionally, if the application dispatcher determines that the system lacks a document agent 12 suitable for translating the source document 11, the application dispatcher can signal the respective application program 13 indicating that the source document 11 is in an unrecognized format. Optionally, the application program 13 may choose to allow the reformatting of the source document 11, such as by converting the source document 11 produced by the application program 13 from its present format into another format supported by that application program 13. For example, an application program 13 may determine that the source document 11 needs to be saved in a different format, such as an earlier version of the file format. To the extent that the application program 13 supports that format, the application program 13 can resave the source document 11 in this supported format in order that a document agent 12 provided by the system 10 will be capable of translating the source document 11. Optionally, the application dispatcher, upon detecting that the system 10 lacks a suitable document agent 12, can indicate to a user that a new document agent of a particular type may be needed for translating the present source document 11. To this end, the computer process 8 may indicate to the user that a new document agent needs to be loaded into the system 10 and may direct the user to a location, such as a web site, from where the new document agent 12 may be downloaded. Optionally, the system could fetch the document agent automatically without asking the user, or could identify a generic agent 12, such as a generic text agent that can extract portions of the source document 11 representative of text. Further, agents that prompt a user for input and instruction during the translation process may also be provided.

In a still further optional embodiment, an application dispatcher in conjunction with the document agents 12 acts as an input module that identifies the file format of the source document 11 on the basis of any one of a variety of criteria, such as an explicit file-type identification within the document, from the file name, including the file name extension, or from known characteristics of the content of particular file types. The bytestream is input to the document agent 12 specific to the file format of the source document 11.
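The three identification criteria just named can be sketched in order of preference. The magic-number and extension tables below are small illustrative samples, not an exhaustive or authoritative mapping:

```python
# Sketch of file-format identification using, in turn: an explicit
# type marker inside the document (magic bytes), the file-name
# extension, and known characteristics of the content.

MAGIC = {b"%PDF": "pdf", b"\x89PNG": "png", b"\xd0\xcf\x11\xe0": "doc"}
EXTENSIONS = {".pdf": "pdf", ".png": "png", ".doc": "doc", ".htm": "html", ".html": "html"}

def identify_format(file_name: str, head: bytes) -> str:
    # 1. Explicit file-type identification within the document.
    for magic, fmt in MAGIC.items():
        if head.startswith(magic):
            return fmt
    # 2. The file name, including the file name extension.
    for ext, fmt in EXTENSIONS.items():
        if file_name.lower().endswith(ext):
            return fmt
    # 3. Known characteristics of the content (e.g. markup starts with '<').
    if head.lstrip()[:1] == b"<":
        return "html"
    return "unknown"
```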
Although the above description has discussed input data being provided by a stream or computer file, it shall be understood by those of skill in the art that the system 10 may also be applied to input received from an input device such as a digital camera or scanner, as well as from an application program that can directly stream its output to the process 8, or that has its output streamed by an operating system to the process 8. In this case the input bytestream may originate directly from the input device, rather than from a source document 11. However, the input bytestream will still be in a data format suitable for processing by the system 10 and, for the purposes of the invention, input received from such an input device may be regarded as a source document 11.
As shown in Figure 2, the document agent 12 employs the library 16 of standard objects to generate the internal representation 14, which describes the content of the source document in terms of a collection of document objects whose generic types are as defined in the library 16, together with parameters defining the properties of specific instances of the various document objects within the document. Thus, the library 16 provides a set of types of objects of which the document agents 12, the parser 18 and the system 10 have knowledge. For example, the document objects employed in the internal representation 14 may include: text, bitmap graphics and vector graphics document objects, which may or may not be animated and which may be two- or three-dimensional; video, audio and a variety of types of interactive objects such as buttons and icons. Vector graphics document objects may be PostScript-like paths with specified fill and transparency. Bitmap graphic document objects may include a set of sub-object types such as, for example, JPEG, GIF and PNG object types. Text document objects may declare a region of stylized text. The region may include a paragraph of text, typically understood as a set of characters that appears between two delimiters, like a pair of carriage returns. Each text object may include a run of characters and the styling information for that character run, including one or more associated typefaces, point sizes and other such styling information.
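The relationship between the library's generic types and the per-instance parameters can be sketched with simple data classes. All field names here are assumptions chosen to mirror the examples in the text (character runs, PostScript-like paths, bitmap sub-types):

```python
# Sketch of a library of generic document-object types; each instance
# carries its own parameters while the types themselves are shared
# knowledge between the document agents and the parser.

from dataclasses import dataclass

@dataclass
class TextObject:
    chars: str                 # the character run (content)
    typeface: str = "serif"    # styling information for the run
    point_size: int = 12

@dataclass
class VectorObject:
    path: list                 # PostScript-like path points
    fill: str = "none"
    transparency: float = 0.0

@dataclass
class BitmapObject:
    sub_type: str              # e.g. "jpeg", "gif", "png"
    data: bytes = b""

# The library is simply the set of generic types known to the system.
LIBRARY = {"text": TextObject, "vector": VectorObject, "bitmap": BitmapObject}

def make_object(kind, **params):
    """Instantiate a generic type from the library with instance parameters."""
    return LIBRARY[kind](**params)
```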
The parameters defining specific instances of document objects will generally include dimensional co-ordinates defining the physical shape, size and location of the document object, and any relevant temporal data for defining document objects whose properties vary with time, thereby allowing the system to deal with dynamic document structures and/or display functions. For example, a stream of video input may be treated by the system 10 as a series of figures that are changing at a rate of, for example, 30 frames per second. In this case the temporal characteristic of this figure object indicates that the figure object is to be updated 30 times per second. As discussed above, for text objects, the parameters will normally also include a font and size to be applied to a character string. Object parameters may also define other properties, such as transparency. It will be understood that the internal representation may be saved/stored in a file format native to the system and that the range of possible source documents 11 input to the system 10 may include documents in the system's native file format. It is also possible for the internal representation 14 to be converted into any of a range of other file formats if required, using suitable conversion agents.
Figure 3 depicts a flow chart diagram of one exemplary process that may be carried out by a document agent 12. Specifically, Figure 3 depicts a process 50 that represents the operation of an example document agent 12, in this case a document agent 12 suitable for translating the contents of a Microsoft Word document into an internal representation format. Specifically, the process 50 includes an initialization step 52 wherein the process 50 initializes the data structures, memory space, and other resources that the process 50 will employ while translating the source document 11. After step 52 the process 50 proceeds to a series of steps, 54, 58 and 60, wherein the source document 11 is analyzed and divided into subsections. In the process 50 depicted in Figure 3, steps 54, 58 and 60 subdivide the source document 11, as it is streamed into the document agent 12, first into sections, then subdivide the sections into paragraphs, and then subdivide the paragraphs into the individual characters that make up each paragraph. The sections, paragraphs and characters identified within the source document 11 may be identified within a piece table that contains pointers to the different subsections identified within the source document 11. It will be understood by those of skill in the art that the piece table depicted in Figure 3 represents a construct employed by MSWord for providing pointers to different subsections of a document. It will further be understood that the use of a piece table or a piece-table-like construct is optional and depends on the application at hand, including on the type of document being processed.
As the process 50 in step 60 begins to identify different characters that appear within a particular paragraph, the process 50 may proceed to step 62 wherein a style is applied to the character or set of characters identified in step 60. The application of a style is understood to associate the identified characters with a style of presentation that is being employed with those characters. The style of presentation may include properties associated with the character, including font type, font size, and whether the characters are bold, italic, or otherwise stylized. Additionally, in step 62 the process can determine whether the characters are rotated, or are being positioned to follow a curved path or other shape. Additionally, in step 62 a style associated with the paragraph in which the characters occur may also be identified and associated with the characters. Such properties can include the line spacing associated with the paragraph, the margins associated with the paragraph, the spacing between characters, and other such properties.
After step 62 the process 50 proceeds to step 70 wherein the internal representation is built up. The object which describes the structure of the document is created in step 64 as an object within the internal representation, and the associated style of this object, together with the character run it contains, is created separately within the internal representation at step 68. Figures 6, 7 and 8, which will be explained in more detail hereinafter, depict figuratively the file structure created by the process 50, wherein the structure of a document is captured by a group of document objects and the data associated with the document objects is stored in a separate data structure. After step 70, the process 50 proceeds to decision block 72 wherein the process 50 determines whether the paragraph associated with the last processed character is complete. If the paragraph is not complete, the process 50 returns to step 60 wherein the next character from the paragraph is read. Alternatively, if the paragraph is complete, the process 50 proceeds to decision block 74 wherein the process 50 determines whether the section is complete. If the section is not complete, the process returns to step 58 and the next paragraph is read from the piece table. Alternatively, if the section is complete, the process 50 proceeds to step 54 wherein the next section, if there is a next section, is read from the piece table and processing continues. Once the document has been processed the system 8 can transmit, save, export or otherwise store the translated document for subsequent use. The system can store the translated file in a format compatible with the internal representation, and optionally in other formats as well, including formats compatible with the file formats of the source documents 11 (for which it may employ 'export document agents', not shown, capable of receiving internal representation data and creating source document data), or in a binary form, a textual document description structure, marked-up text or in any other suitable format; and may employ a universal text encoding model, including Unicode, shift-mapping, big-5, and a luminance/chrominance model.

As can be seen from the above, the format of the internal representation 14 separates the "structure" (or "layout") of the documents, as described by the object types and their parameters, from the "content" of the various objects; e.g. the character string (content) of a text object is separated from the dimensional parameters of the object; the image data (content) of a graphic object is separated from its dimensional parameters. This allows document structures to be defined in a compact manner and provides the option for content data to be stored remotely and to be fetched by the system only when needed. The internal representation 14 describes the document and its constituent objects in terms of "high-level" descriptions.
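The structure/content separation described above can be sketched in code. This is only an illustrative model, not the actual implementation: the names ContentStore and TextObject, the handle scheme, and the bounding-box tuple are all assumptions introduced for the example.

```python
# Illustrative sketch of separating object structure from content.
# ContentStore, TextObject and the handle scheme are hypothetical names.

class ContentStore:
    """Holds object content separately; in a remote deployment the
    fetch would be a network request issued only when needed."""
    def __init__(self):
        self._data = {}

    def put(self, handle, content):
        self._data[handle] = content

    def fetch(self, handle):
        return self._data[handle]

class TextObject:
    """Structure only: dimensional parameters plus a content handle."""
    def __init__(self, bounding_box, content_handle):
        self.bounding_box = bounding_box      # (x, y, width, height)
        self.content_handle = content_handle  # content fetched on demand

store = ContentStore()
store.put("run-1", "Hello, world")
obj = TextObject((0, 0, 200, 20), "run-1")

# The character string is only materialized when the object is rendered.
text = store.fetch(obj.content_handle)
```

Because the structure holds only a handle, the same document skeleton can be transmitted compactly while the content is fetched lazily or substituted later.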
The document agent 12 described above with reference to Figure 3 is capable of processing a data file created by the MSWord word processing application and translating that data file into an internal representation, formed from a set of object types selected from the library 16, that represents the content of the processed document. Accordingly, the document agent 12 analyzes the Word document and translates the structure and content of that document into an internal representation known to the computer process 8. One example of one type of Word document that may be processed by the document agent 12 is depicted in Figure 4. Specifically, Figure 4 depicts a Word document 32 of the type created by the MSWord application program. The depicted document 32 comprises one page of information, wherein that one page includes two columns of text 34 and one figure 36. Figure 4 further depicts that the columns of text 34 and the figure 36 are positioned on the page 38 in such a way that one column of text runs from the top of the page 38 to the bottom of the page 38 and the second column of text runs from about the center of the page to the bottom of the page, with the figure 36 being disposed above the second column of text 34.

As discussed above with reference to Figure 3, the document agent 12 begins processing the document 32 by determining that the document 32 comprises one page and contains a plurality of different objects. For the one page found by the document agent 12, the document agent 12 identifies the style of the page, which for example may be a page style of an 8.5 x 11 page in portrait format. The page style identified by the document agent 12 is embodied in the internal representation for later use by the parser 18 in formatting and flowing text into the document created by the process 8.
For the document 32 depicted in Figure 4 only one page is present. However, it will be understood that the document agent 12 may process Word documents comprising a plurality of pages. In such a case the document agent 12 would process each page separately, by creating a page and then filling it with objects of the type found in the library. Thus page style information can include that a document comprises a plurality of pages and that the pages are of a certain size. Other page style information may be identified by the document agent 12, and the page style information identified can vary according to the application. Thus different page style information may be identified by a document agent capable of processing a Microsoft Excel document or a real media data stream.
As further described with reference to Figures 3 and 4, once the document agent 12 has identified the page style, the document agent 12 may begin to break the document 32 down into objects that can be mapped to document objects known to the system and typically stored in the library 16. For example, the document agent 12 may process the document 32 to find text objects, bitmap objects and vector graphic objects. Other object types may optionally be provided, including video types, animation types, button types, and script types. In this practice, the document agent 12 will identify a text object 34 whose associated style has two columns. The paragraphs of text that occur within the text object 34 may be analyzed for identifying each character in each respective paragraph. Process 50 may apply style properties to each identified character run, and each character run identified within the document 32 may be mapped to a text object of the type listed within the library 16. Each character run and the applied style can be understood as an object identified by the document agent 12 as having been found within the document 32 and having been translated to a document object, in this case a text object of the type listed within the library 16. This internal representation object may be streamed from the document agent 12 into the internal representation 14. The document agent 12 may continue to translate the objects that appear within the document 32 into document objects that are known to the system 10 until each object has been translated. The object types may be appropriate for the application and may include object types suitable for translating source data representative of a digital document, an audio/visual presentation, a music file, an interactive script, a user interface file and an image file, as well as any other file types.
Turning to Figure 5, it can be seen that the process 80 depicted in Figure 5 allows for compacting similar objects appearing within the internal representation of the source document 11, for the purpose of reducing the size of the internal representation. For example, Figure 5 depicts a process 80 wherein step 82 has a primitive library object A being processed by, in step 84, inserting that primitive object into the document that is becoming the internal representation of the source document 11. In step 88 another object B, provided by the document agent 12, is delivered to the internal representation file process 14. The process 80 then undertakes the depicted sequence of steps 92 through 98, wherein characteristics of object A are compared to the characteristics of object B to determine if the two objects have the same characteristics. For example, if object A and object B represent two characters such as the letter P and the letter N, and if both characters P and N are the same color, same font, same size and the same style, such as bold or italicized, then the process 80 in step 94 joins the two objects together within one object classification stored within the internal representation. If these characteristics do not match, then the process 80 adds them to the internal representation as two separate objects.
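The comparison-and-join behavior of steps 92 through 98 can be sketched as a simple run-merging pass. This is a minimal illustration, not the patented process: the style tuple layout and the function name compact are assumptions made for the example.

```python
# Illustrative sketch of the adjacency compaction of process 80.
# Objects are (text, style) pairs; the style tuple is hypothetical.

def compact(objects):
    """Merge physically adjacent character objects sharing a style.

    Mirrors steps 92-98: each new object's characteristics are
    compared with the previous object's; matching objects are joined
    (step 94), otherwise they are kept as separate objects.
    """
    result = []
    for text, style in objects:
        if result and result[-1][1] == style:
            prev_text, _ = result[-1]
            result[-1] = (prev_text + text, style)   # join into one object
        else:
            result.append((text, style))             # keep as separate object
    return result

runs = [("P", ("Times", 12, "black", "bold")),
        ("N", ("Times", 12, "black", "bold")),
        ("Q", ("Times", 12, "black", "italic"))]
compacted = compact(runs)
```

Here P and N share color, font, size and style, so they collapse into one stored object, while Q, differing in style, remains separate.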
Figure 5 depicts a process 80 wherein the internal representation file 14 compacts the objects as a function of the similarity of physically adjacent objects. Those of ordinary skill in the art will understand that this is merely one process for compacting the objects and that other techniques may be employed. For example, in an optional practice, the compaction process may comprise a process for compacting objects that are visually adjacent.

Figures 6, 7 and 8 depict the structure of the internal representation of a document that has been processed by the system depicted in Figures 1 and 2. The internal representation of the document may be embodied as a computer file or as data stored in core memory. However, it will be apparent to those of ordinary skill in the art that the data structure selected for capturing or transporting the internal representation may vary according to the application, and any suitable data structure may be employed with the systems and methods described herein without departing from the scope of the invention.
As will be described in greater detail hereinafter, the structure of the internal representation of the processed document separates the structure of the document from the content of the document. Specifically, the structure of the document is captured by a data structure that shows the different document objects that make up the document, as well as the way that these document objects are arranged relative to each other. This separation of structure from content is shown in Figure 6, wherein the data structure 110 captures the structure of the document being processed and stores that structure in a data format that is independent of the actual content associated with that document.
Specifically, the data structure 110 includes a resource table 112 and a document structure 114. The resource table 112 provides a list of resources for constructing the internal representation of the document. For example, the resource table 112 can include one or more tables of common structures that occur within the document, such as type faces, links, and color lists. These common structures may be referenced numerically within the resource table 112. The resources of resource table 112 relate to the document objects that are arranged within the document structure 114. As Figure 6 shows, the document structure 114 includes a plurality of containers 118 that are represented by the sets of nested parentheses. Within the containers 118 are a plurality of document objects 120. As shown in Figure 6, the containers 118 represent collections of document objects that appear within the document being processed. As further shown by Figure 6, the containers 118 are also capable of holding sub-containers. For example, the document structure 114 includes one top-level container, identified by the set of outer parentheses labeled 1, which holds three nested containers 2, 3 and 4. Additionally, the container 4 is double nested, within container 1 and container 3.
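The nested container arrangement just described can be sketched as a small tree. This is an assumed model for illustration only: the class names Container and DocumentObject, and the depth helper, are not from the source.

```python
# Minimal sketch of the nested container structure of Figure 6.
# Container and DocumentObject are hypothetical illustrative names.

class DocumentObject:
    def __init__(self, name):
        self.name = name

class Container:
    """Holds document objects and, recursively, sub-containers."""
    def __init__(self, label):
        self.label = label
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child

# Mirror the example: top-level container 1 holds containers 2 and 3,
# and container 4 is nested inside container 3 (double nested in 1).
c1 = Container(1)
c2 = c1.add(Container(2))
c3 = c1.add(Container(3))
c4 = c3.add(Container(4))
c2.add(DocumentObject("text run"))

def depth(container, target, d=1):
    """Nesting depth of a target container within the tree."""
    for child in container.children:
        if child is target:
            return d + 1
        if isinstance(child, Container):
            found = depth(child, target, d + 1)
            if found:
                return found
    return None
```

Container 4's depth of three reflects its double nesting within containers 1 and 3.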
Each container 118 represents features within a document, wherein the features may be a collection of individual document objects, such as the depicted document objects 120. Thus for example, a document, such as the document 32 depicted in Figure 4, may include a container representative of the character run, wherein the character run includes the text that appears within the columns 34. The different document objects 120 that occur within the character run container may, for example, be representative of the different paragraphs that occur within that character run. The character run container has a style associated with it. For example, the character run depicted in Figure 4 can include style information representative of the character font type, font size, styling, such as bold or italic styling, and style information representative of the size of the column, including width and length, in which the character run, or at least a portion of that character run, occurs. This style information may be later used by the parser 18 to reformat and reflow the text within the context specific view 20. Another example of a container may be a table that, for example, could appear within a column 34 of text in document 32. The table may be a container with objects. The other types and uses of containers will vary according to the application at hand, and the systems and methods of the invention are not limited to any particular set of object types or containers.
10 containers.
Thus, as the document agent 12 translates the source document 11, it will encounter objects that are of known object types, and the document agent 12 will request the library 16 to create an object of the appropriate object type. The document agent 12 will then lodge that created document object into the appropriate location within document structure 114 to preserve the overall structure of the source document 11. For example, as the document agent 12 encounters the image 36 within the source document 11, the document agent 12 will recognize the image 36, which may for example be a JPEG image, as an object of type bitmap, and optionally sub-type JPEG. The document agent 12, as shown in steps 64 and 68 of Figure 3, can create the appropriate document object 120 and can lodge the created document object 120 into the structure 114. Additionally, the data for the JPEG image document object 120, or in another example, the data for the characters and their associated style for a character run, may be stored within the data structure 150 depicted in Figure 8.
As the source document 11 is being processed, the document agent 12 may identify other containers, wherein these other containers may be representative of a subfeature appearing within an existing container, such as a character run. For example, these subfeatures may include links to referenced material, or clipped visual regions or features that appear within the document and that contain collections of individual document objects 120. The document agent 12 can place these document objects 120 within a separate container that will be nested within the existing container. The arrangement of these document objects 120 and the containers 118 is shown in Figure 7A as a tree structure 130, wherein the individual containers 1, 2, 3 and 4 are shown as container objects 132, 134, 138 and 140 respectively. The containers 118 and the document objects 120 are arranged in a tree structure that shows the nested container structure of document structure 114 and the different document objects 120 that occur within the containers 118. The tree structure of Figure 7A also illustrates that the structure 114 records and preserves the structure of the source document 11, showing the source document as a hierarchy of document objects 120, wherein the document objects 120 include the style information, such as for example the size of columns in which a run of characters appears, or temporal information, such as the frame rate for streamed content. Thus,


As can be seen, Table 1 presents an example of parameters that may be used to describe a document's graphical structure. Table 1 presents examples of such parameters, such as the object type, which in this case is a Bitmap object type. A bounding box parameter is provided and gives the location of the document object within the source document 11. Table 1 further provides the Fill employed and an alpha factor that is representative of the degree of transparency for the object. A Shape parameter provides a handle to the shape of the object, which in this case could be a path that defines the outline of the object, including irregularly shaped objects. Table 1 also presents a time parameter representative of the temporal changing of that object. In this example, the image is stable and does not change with time. However, if the image object presented streamed media, then this parameter could contain a temporal characteristic that indicates the rate at which the object should change, such as a rate comparable to the desired frame rate for the content.
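The parameter set just described can be sketched as a simple record. The field names and values below are illustrative assumptions; the source names the parameters but not any concrete encoding.

```python
# Hedged sketch of the graphical parameters discussed for Table 1.
# Field names and the example values are hypothetical.

from dataclasses import dataclass

@dataclass
class PrimitiveObject:
    object_type: str     # e.g. "Bitmap"
    bounding_box: tuple  # location within the source document (x0, y0, x1, y1)
    fill: str            # handle to the fill / data content
    alpha: float         # degree of transparency; 1.0 = fully opaque
    shape: str           # handle to an outline path, possibly irregular
    time: float          # temporal rate of change; 0.0 for a static image

image = PrimitiveObject("Bitmap", (10, 10, 210, 160),
                        fill="jpeg-data-1", alpha=1.0,
                        shape="rect-path", time=0.0)

def is_static(obj):
    """A zero time parameter means the object does not change with time."""
    return obj.time == 0.0
```

For streamed media, the time field would instead carry a rate comparable to the desired frame rate for the content.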
Thus, the structural elements are containers with flowable data content, with this flowable data held separately and referenced by a handle from the container. In this way, any or all data content can be held remotely from the document structure. This allows for rendering of the document in a manner that can be achieved with a mixture of locally held and remotely held data content. Additionally, this data structure allows for rapid progressive rendering of the internal representation of the source document 11, as the broader and higher level objects can be rendered first, and the finer features can be rendered in subsequent order. Thus, the separate structure and data allows a visual document to be rendered while streaming data to "fill" the content. Additionally, the separation of content and structure allows the content of the document to readily be edited or changed. As the document structure is independent from the content, different content can be substituted into the document structure. This can be done on a container-by-container basis or for the whole document. The structure of the document can be delivered separately from the content, and the content provided later, or made present on the platform to which the structure is delivered.
Additionally, Figure 7A shows that the structure of a source document 11 can be represented as a tree structure 130. In one practice the tree structure may be modified and edited to change the presentation of the source document 11. For example, the tree structure may be modified to add additional structure and content to the tree 130. This is depicted in Figure 7B, which shows the original tree structure of Figure 7A duplicated and presented under a higher level container. Thus, Figure 7B shows that a new document structure, and therefore a new representation, may be created by processing the tree structure 130 produced by the document agent 12. This allows the visual position of objects within a document to change, while the relative position of different objects 120 may remain the same. By adjusting the tree structure 130, the systems described herein can edit and modify content. For example, in those applications where the content within the tree structure 130 is representative of visual content, the systems described herein can edit the tree structure to duplicate the image of the document, and present side by side images of the document. Alternatively, the tree structure 130 can be edited and supplemented to add additional visual information, such as by adding the image of a new document or a portion of that document. Moreover, by controlling the rate at which the tree structure is changed, the systems described herein can create the illusion of a document gradually changing, such as sliding across a display, such as display device 26, or gradually changing into a new document. Other effects, such as the creation of thumbnail views and other similar results, can be achieved by those of ordinary skill making modifications to the systems and methods described herein, and such modified systems and methods will fall within the scope of the invention.
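The Figure 7B edit, in which the original tree is duplicated and placed under a new higher-level container, can be sketched as follows. The Node class and function name are assumptions for illustration.

```python
# Sketch of duplicating a document tree under a new top-level
# container, as in Figure 7B. Node is a hypothetical illustrative name.

import copy

class Node:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []

def duplicate_under_new_root(tree):
    """Create a new higher-level container holding the original tree
    and a deep copy of it, e.g. to present side-by-side images."""
    return Node("root", [tree, copy.deepcopy(tree)])

# A tree like that of Figure 7A: container 1 holding 2 and 3, with 4 in 3.
original = Node(1, [Node(2), Node(3, [Node(4)])])
edited = duplicate_under_new_root(original)
```

The relative positions of the objects within each copy remain the same; only their position under the new root changes.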

The data of the source document 11 is stored separately from the structure 114. To this end, each document object 120 includes a pointer to the data associated with that object, and this information may be arranged within an indirection list such as the indirection list 152 depicted in Figure 8. In this practice, and as shown in Figure 8, each document object 120 is numbered and an indirection list 152 is created wherein each document object number 154 is associated with an offset value 158. For example, the document object number 1, identified by reference number 160, may be associated with the offset 700, identified by reference number 162. Thus, the indirection list associates the object number 1 with the offset 700. The offset 700 may represent a location in core memory, or a file offset, wherein the data associated with object 1 may reside. As further shown in Figure 8, a data structure 150 may be present wherein the data that is representative of the content associated with a respective document object 120 may be stored. Thus for example, the depicted object 1 at jump location 700 may include the Unicode characters representative of the characters that occur within the character run of the container 1 depicted in Figure 6. Similarly, the object 2 data, depicted in Figure 8 by reference number 172, and associated with in-core memory location 810, identified by reference numeral 170, may be representative of the JPEG bitmap associated with a bitmap document object 120 referenced within the document structure 114 of Figure 6.
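The indirection list of Figure 8 can be sketched as a mapping from object numbers to offsets into a shared data store. The dict-based list, the byte buffer, and the fixed lengths below are assumptions made for this illustration.

```python
# Sketch of the indirection list of Figure 8: each numbered document
# object maps to the offset at which its data resides. The buffer
# layout and example offsets (700, 810) follow the text's example.

data = bytearray(1024)
data[700:712] = b"Hello, world"   # Unicode-style content for object 1
data[810:814] = b"JPEG"           # stand-in for object 2's bitmap data

indirection_list = {1: 700, 2: 810}  # object number -> offset

def content_for(object_number, length):
    """Fetch an object's data by following the indirection list
    to its offset in core memory or in a file."""
    offset = indirection_list[object_number]
    return bytes(data[offset:offset + length])
```

Because every object's data is reached through the same list, the content for a whole source document sits in one centralized repository, which is what enables compression across different types of data objects.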
It will be noted by those of skill in the art that, as the data is separated from the structure, the content for a source document is held in a centralized repository. As such, the systems described herein allow for compressing across different types of data objects. Such processes provide for greater storage flexibility in limited resource systems.
Returning to Figure 2, it will be understood that once the process for compacting the content of an internal representation file completes compacting different objects, these objects are passed to the parser 18. The parser 18 parses the objects identified in the structure section of the internal representation and, with reference to the data content associated with each object, re-applies the position and styling information to each object. The renderer 19 generates a context-specific representation or "view" 20 of the documents represented by the internal representation 14. The required view may be of all the documents, a whole document, or of parts of one or some of the documents. The renderer 19 receives view control inputs 40 which define the viewing context and any related temporal parameters of the specific document view which is to be generated. For example, the system 10 may be required to generate a zoomed view of part of a document, and then to pan or scroll the zoomed view to display adjacent portions of the document. The view control inputs 40 are interpreted by the renderer 19 to determine which parts of the internal representation are required for a particular view and how, when and for how long the view is to be displayed.
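One way the renderer could determine which parts of the internal representation a view requires is a bounding-box intersection test against the view rectangle. This is an assumed implementation sketched for illustration; the source does not specify the selection mechanism.

```python
# Illustrative sketch: selecting the objects a zoomed or panned view
# needs. The intersection test and function names are assumptions.

def intersects(a, b):
    """Axis-aligned rectangle overlap; rectangles are (x0, y0, x1, y1)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def objects_for_view(objects, view_rect):
    """Return the names of objects whose bounding boxes fall within
    the view defined by the view control inputs."""
    return [name for name, box in objects if intersects(box, view_rect)]

page = [("text-run", (10, 10, 40, 40)),
        ("figure", (60, 60, 100, 100))]

# A zoomed view over the top-left of the page needs only the text run.
visible = objects_for_view(page, (0, 0, 50, 50))
```

As the view is panned or scrolled, repeating the test against the shifted view rectangle brings adjacent objects into the required set.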
The context-specific representation/view 20 is expressed in terms of primitive shapes and parameters.
The renderer 19 may also perform additional pre-processing functions on the relevant parts of the internal representation 14 when generating the required view 20 of the source document 11. The view representation 20 is input to a shape processor 22 for processing to generate an output in a format suitable for driving an output device 26, such as a display device or printer.

The pre-processing functions of the renderer 19 may include colour correction, resolution adjustment/enhancement and anti-aliasing. Resolution enhancement may comprise scaling functions which preserve the legibility of the content of objects when displayed or reproduced by the target output device. Resolution adjustment may be context-sensitive; e.g. the display resolution of particular objects may be reduced while the displayed document view is being panned or scrolled, and increased when the document view is static. Optionally, there may be a feedback path 42 between the parser 18 and the internal representation 14, e.g. for the purpose of triggering an update of the content of the internal representation 14, such as in the case where the source document 11 represented by the internal representation comprises a multi-frame animation.
The output from the renderer 19 expresses the document in terms of primitive objects. For each document object, the representation from the renderer 19 defines the object at least in terms of a physical, rectangular bounding box, the actual outline path of the object bounded by the bounding box, the data content of the object, and its transparency.
The shape processor 22 interprets the primitive objects and converts them into an output frame format appropriate to the target output device 26; e.g. a dot-map for a printer, a vector instruction set for a plotter, or a bitmap for a display device. An output control input 44 to the shape processor 22 provides information to the shape processor 22 to generate output suitable for a particular output device 26. The shape processor 22 preferably processes the objects defined by the view representation 20 in terms of "shape" (i.e. the outline shape of the object), "fill" (the data content of the object) and "alpha" (the transparency of the object), performs scaling and clipping appropriate to the required view and output device, and expresses the object in terms appropriate to the output device (typically in terms of pixels, by scan conversion or the like, for most types of display device or printer). The shape processor 22 optionally includes an edge buffer which defines the shape of an object in terms of scan-converted pixels, and preferably applies anti-aliasing to the outline shape. Anti-aliasing may be performed in a manner determined by the characteristics of the output device 26, by applying a grey-scale ramp across the object boundary. This approach enables memory-efficient shape-clipping and shape-intersection processes, and is processor efficient as well. A look-up table, or other technique, may be employed to define multiple tone response curves, allowing non-linear rendering control. The individual primitive objects processed by the shape processor 22 are combined in the composite output frame. The design of one shape processor suitable for use with the systems described herein is shown in greater detail in the patent application entitled Shape Processor, filed on even date herewith, the contents of which are incorporated by reference. However, any suitable shape processor system or process may be employed without departing from the scope of the invention.
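The shape/fill/alpha model can be sketched as a simple compositing step: each primitive's filled shape is blended into the output frame according to its transparency. The one-dimensional pixel frame and the blend rule below are assumptions introduced for this illustration, not the shape processor's actual algorithm.

```python
# Minimal sketch of compositing a primitive into the output frame
# using its shape (covered pixels), fill (data content) and alpha
# (transparency). The blend rule and float pixels are assumptions.

def composite(frame, shape_pixels, fill_value, alpha):
    """Blend a primitive's filled shape into a 1-D frame of pixels.

    frame        -- list of float pixel values (the composite output frame)
    shape_pixels -- indices covered by the object's scan-converted shape
    fill_value   -- the object's data content at those pixels
    alpha        -- transparency: 0.0 fully transparent, 1.0 opaque
    """
    for i in shape_pixels:
        frame[i] = (1.0 - alpha) * frame[i] + alpha * fill_value
    return frame

frame = [0.0] * 8
composite(frame, [2, 3, 4], fill_value=1.0, alpha=0.5)
```

Pixels outside the object's shape are untouched, which is how clipping to the outline falls out of the same representation.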
As discussed above, the process 8 depicted in Figure 1 can be realized as a software component operating on a data processing system such as a hand held computer, a mobile telephone, a set top box, a facsimile machine, a copier or other office equipment, an embedded computer system, a Windows or Unix workstation, or any other type of computer/processing platform capable of supporting, in whole or in part, the document processing system described above. In these embodiments, the system can be implemented as a C language computer program, or a computer program written in any high level language including C++, Fortran, Java or Basic. Additionally, in an embodiment where microcontrollers or DSPs are employed, the systems can be realized as a computer program written in microcode, or written in a high level language and compiled down to microcode that can be executed on the platform employed. The development of such systems is known to those of skill in the art, and such techniques are set forth in the Intel® StrongARM SA-1110 Microprocessor Advanced Developer's Manual. Additionally, general techniques for high level programming are known, and set forth in, for example, Stephen G. Kochan, Programming in C, Hayden Publishing (1983). It is noted that DSPs are particularly suited for implementing signal processing functions, including preprocessing functions such as image enhancement through adjustments in contrast, edge definition and brightness. Developing code for the DSP and microcontroller systems follows from principles well known in the art.
Accordingly, although Figs. 1 and 2 graphically depict the computer process 8 as comprising a plurality of functional block elements, it will be apparent to one of ordinary skill in the art that these elements can be realized as computer programs, or portions of computer programs, that are capable of running on the data processing platform to thereby configure the data processing platform as a system according to the invention. Moreover, although Fig. 1 depicts the system 10 as an integrated unit of a document processing process 8 and a display device 26, it will be apparent to those of ordinary skill in the art that this is only one embodiment, and that the systems described herein can be realized through other architectures and arrangements, including system architectures that separate the document processing functions of the process 8 from the document display operation performed by the display 26. Moreover, it will be understood that the systems of the invention are not limited to those systems that include a display or output device, but that the systems of the invention encompass those processing systems that process one or more digital documents to create output that can be presented on an output device. This output may instead be stored in a data file for subsequent presentation on a display device, for long term storage, for delivery over a network, or for some other purpose than immediate display. Accordingly, it will be apparent to those of skill in the art that the systems and methods described herein can support many different document and content processing applications, and that the structure of the system or process employed for a particular application will vary according to the application and the choice of the designer.
11 From the foregoing, it will be understood that the
12 system of the present invention may be "hard-wired";
13 e.g. implemented in ROM and/or integrated into ASICs
14 or other single-chip systems, or may be implemented
15 as firmware (programmable ROM such as flashable
16 ePROM), or as software, being stored locally or
17 remotely and being fetched and executed as required
18 by a particular device. Such improvements and
19 modifications may be incorporated without departing
20 from the scope of the present invention.
21 Those skilled in the art will know or be able to
22 ascertain using no more than routine
23 experimentation, many equivalents to the embodiments
24 and practices described herein. For example, the

25 systems and methods described herein may be stand
26 alone systems for processing source documents 11,
27 but optionally these systems may be incorporated
28 into a variety of types of data processing systems
29 and devices, and into peripheral devices, in a
3 0 number of different ways. In a general purpose data

1 processing system (the "host system"), the system of
2 the present invention may be incorporated alongside
3 the operating system and applications of the host
4 system or may be incorporated fully or partially
5 into the host operating system. For example, the
6 systems described herein enable rapid display of a
7 variety of types of data files on portable data
8 processing devices with LCD displays without
9 requiring the use of browsers or application

Examples of portable data processing devices which may employ the present system include "palmtop" computers, portable digital assistants (PDAs, including tablet-type PDAs in which the primary user interface comprises a graphical display with which the user interacts directly by means of a stylus device), internet-enabled mobile telephones and other communications devices. This class of data processing devices requires small, low-power processors for portability. Typically, these devices employ advanced RISC-type core processors designed into ASICs (application specific integrated circuits), in order that the electronics package is small and integrated. This type of device also has limited random access memory and typically has no non-volatile data store (e.g. hard disk). Conventional operating system models, such as are employed in standard desktop computing
systems (PCs), require high-powered central processors and large amounts of memory to process digital documents and generate useful output, and are entirely unsuited to this type of data processing device. In particular, conventional systems do not provide for the processing of multiple file formats in an integrated manner. By contrast, the systems described herein employ common processes and pipelines for all file formats, thereby providing a highly integrated document processing system which is extremely efficient in terms of power consumption and usage of system resources.
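By way of illustration only, the common-pipeline idea described above can be sketched as follows. This Python fragment is a hypothetical simplification: the magic numbers, function names and object layout are assumptions introduced for this example, not taken from the specification.

```python
# Hypothetical sketch: format-specific "document agents" translate each
# source format into one shared stream of document objects, so the
# downstream rendering pipeline is written only once.

MAGIC_NUMBERS = {
    b"%PDF": "pdf",
    b"\x89PNG": "png",
    b"PK\x03\x04": "zip-based",  # e.g. archive-packaged office formats
}

def detect_format(data: bytes) -> str:
    """Associate an input bytestream with one of the known formats."""
    for magic, fmt in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return fmt
    return "plain-text"

def dispatch(data: bytes) -> list:
    """Route the bytestream to a format-specific parser, producing a
    format-neutral list of primitive document objects."""
    fmt = detect_format(data)
    # Every agent emits the same primitive object types, so a single
    # downstream engine can render any of them.
    return [{"type": "text", "format": fmt, "content": data[:20]}]

objects = dispatch(b"%PDF-1.4 ...")
```

Because all agents emit the same object vocabulary, adding a new file format touches only the dispatch table and one new agent, which is the efficiency the passage above attributes to the integrated design.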
The system of the invention may be integrated at the BIOS level of portable data processing devices to enable document processing and output with much lower overhead than conventional system models. Alternatively, these systems may be implemented at the lowest system level, just above the transport protocol stack. For example, the system may be incorporated into a network device (card) or system to provide in-line processing of network traffic (e.g. working at the packet level in a TCP/IP system).
The systems herein can be configured to operate with a predetermined set of data file formats and particular output devices; e.g. the visual display unit of the device and/or at least one type of printer.
The systems described herein may also be incorporated into low-cost data processing terminals such as enhanced telephones and "thin" network client terminals (e.g. network terminals with limited local processing and storage resources), and "set-top boxes" for use in interactive/internet-enabled cable TV systems. The systems may also be incorporated into peripheral devices such as hardcopy devices (printers and plotters), display devices (such as digital projectors), networking devices, input devices (cameras, scanners, etc.) and also multi-function peripherals (MFPs). When incorporated into a printer, the system enables the printer to receive raw data files from the host data processing system and to reproduce the content of the original data file correctly, without the need for particular applications or drivers provided by the host system. This avoids or reduces the need to configure a computer system to drive a particular type of printer. The present system directly generates a dot-mapped image of the source document suitable for output by the printer (this is true whether the system is incorporated into the printer itself or into the host system). Similar considerations apply to other hardcopy devices such as plotters.
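By way of illustration only, the dot-mapped output described above can be sketched as follows. This Python fragment is a hypothetical simplification (rectangle objects only, with names invented for this example), not the specification's actual rasteriser.

```python
# Hypothetical sketch: render internal-representation objects directly
# into a row-major bitmap of 0/1 dots suitable for a print head, with
# no host-side printer driver involved.

def rasterize(objects, width, height):
    """Fill the bounding box of each rectangle object with 1-dots."""
    dots = [[0] * width for _ in range(height)]
    for obj in objects:
        x0, y0, x1, y1 = obj["bbox"]  # bounding box in dot coordinates
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                dots[y][x] = 1
    return dots

page = rasterize([{"bbox": (1, 1, 3, 2)}], width=4, height=3)
# page == [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
```

The point of the design is that the same internal objects drive this rasteriser whether it runs inside the printer or on the host, which is why no format-specific driver is needed on either side.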
When incorporated into a display device, such as a projector, the system again enables the device to display the content of the original data file correctly, without the use of applications or drivers on the host system, and without the need for specific configuration of the host system and/or display device. Peripheral devices of these types, when equipped with the present system, may receive and output data files from any source, via any type of data communications network.

Additionally, the systems and methods described herein may be incorporated into in-car systems for providing driver information or entertainment, to facilitate the delivery of information within the vehicle or to a network that communicates beyond the vehicle. Further, it will be understood that the systems described herein can drive devices having multiple output sources to maintain a consistent display using modifications to only the control parameters. Examples include, but are not limited to, a set-top box or in-car system incorporating a visual display and print head, thereby enabling viewing and printing of documents without the need for the source applications and drivers.

Accordingly, it will be understood that the invention is not to be limited to the embodiments disclosed herein, but is to be understood from the following claims, which are to be interpreted as broadly as allowed under the law.


CLAIMS
1. A digital document processing system, comprising:
an application dispatcher for receiving an input bytestream representing source data in one of a plurality of predetermined data formats and for associating the input bytestream with one of said plurality of predetermined data formats;
a document agent for interpreting said input bytestream as a function of said associated predetermined data format and for parsing the input bytestream into a stream of document objects representative of internal representations of primitive structures within the input bytestream; and
a core document engine for converting said document objects into an internal representation data format and for mapping said internal representation data to a location on a display.

2. A digital document processing system according to claim 1, further comprising a shape processor for processing said internal representation data to drive an output device.
3. A digital document processing system as claimed in claim 1 or 2, wherein said source data defines the content and structure of a digital document, and wherein said internal representation data describes said structure in terms of document objects of a plurality of data types and parameters defining properties of specific instances of the document objects, separately from said content.
4. A digital document processing system according to claim 3, wherein the parameters defining properties of specific instances include properties selected from the group consisting of dimensional, temporal, and physical properties.
5. A digital document processing system as claimed in claim 3 or 4, further including a library of object types, said internal representation data being based on the content of said library.
6. A digital document processing system as claimed in any of claims 3 to 5, wherein said core document engine includes a parsing and rendering module adapted to generate an object and parameter based representation of a specific view of at least part of said internal representation data, on the basis of a first control input to said parsing and rendering module.
7. A digital document processing system according to claim 6, wherein said parameter based representation includes parameters selected from the group consisting of fill, path, bounding box and transparency.
8. A digital document processing system according to any of claims 5 to 7, further including a shape processing module adapted to receive said object and parameter based representation of said specific view from said parsing and rendering module and to convert said object and parameter based representation into an output data format suitable for driving a particular output device.
9. A digital document processing system according to claim 8, wherein said shape processing module processes said objects on the basis of a shape defining the shape of the object bounded by the bounding box, the data content of the object and the transparency of the object.
10. A digital document processing system according to claim 8 or 9, wherein said shape processing module processes said objects on the basis of a shape defining the shape of the object bounded by the bounding box representative of a defined area on a display on which an object may be rendered.
11. A digital document processing system according to any preceding claim, wherein the system employs a chrominance/luminance-based colour model to describe colour data.

12. A digital document processing system according to any preceding claim, wherein the system employs a universal text encoding model.
13. A digital document processing system according to claim 12, wherein universal text encoding includes Unicode, shift-mapping and Big-5.

14. A digital document processing system according to any preceding claim, further including a process for compacting an internal representation of a source document by combining document objects having similar attributes.
15. A digital document processing system according to any preceding claim, further including a process for compacting an internal representation of a source document by combining document objects having similar style attributes.
16. A digital document processing system according to any preceding claim, wherein the system is adapted for multiple parallel implementation, for processing source data from one or more data sources and for generating one or more sets of output representation data.
17. A digital document processing system according to any preceding claim, further comprising a graphical user interface for generating internal representations of interactive visual displays to be employed by a user for controlling the digital document processing system.
18. A digital document processing system according to claim 17, comprising a data processing device incorporating a graphical user interface.
19. A digital document processing system according to any preceding claim, having a platform adapted for being embedded into a device selected from the group consisting of a hand-held computer, a mobile telephone, a set-top box, a facsimile machine, a copier, an embedded computer system, a printer, an in-car system and a computer workstation.
20. A digital document processing system according to any preceding claim, having a processor including a core processor system.
21. A digital document processing system according to claim 20, wherein said core processor is a RISC processor.
22. A digital document processing system according to any preceding claim, wherein the document agent includes an export process for exporting data in a selected format.
23. A digital document processing system according to any preceding claim, adapted for operating on a multiple processing system.
24. A method for displaying content, comprising:
receiving a source of data representative of digital content having a structure and data content;
processing the source of data to identify a file format associated therewith;
translating the source of data, as a function of its identified file format, into an internal representation that includes a first data structure for storing information about the structure of the digital content, and a second data structure for storing information about the data content contained in the digital content; and
generating a content file representative of an internal representation of content to be presented to a user, by processing the first data structure to determine a structure for a portion of the content file and by processing the second data structure to determine data content for the respective portion of the content file.
25. A method according to claim 24, wherein receiving a source of data includes receiving a stream of input data from a data source.
26. A method according to claim 25, wherein the data source is selected from the group consisting of a data file, a byte stream generated from a peripheral device, and a byte stream generated from a data file.
27. A method according to claim 25 or 26, wherein processing the source of data includes presenting information about the source of data to a plurality of document agents, each being capable of translating a data source of a known file format into the internal representation.
28. A method according to any of claims 24 to 27, wherein translating the source of data into an internal representation includes processing the source of data to identify data therein, and mapping the identified data to a set of object types representative of types of content that are present in a source of data.

29. A method according to claim 28, wherein mapping includes mapping identified data to a set of object types suitable for translating source data representative of content selected from the group consisting of a digital document, an audio/visual presentation, a music file, an interactive script, a user interface file and an image file.
30. A method according to any of claims 24 to 29, wherein mapping includes mapping the identified data to a set of object types including a bitmap object type, a vector graphic object type, a video type, an animation type, a button type, a script type and a text object type.
31. A method according to any of claims 24 to 30, wherein translating the source of data includes filtering portions of the source data to create a filtered internal representation of the source document.

32. A method according to any of claims 24 to 31, wherein translating the source of data includes altering the first data structure to adjust the structure of the digital content.
33. A method according to any of claims 24 to 32, wherein translating the source of data includes the further act of substituting data content in the second data structure to thereby modify content presented within the internal representation.
34. A method according to any of claims 24 to 33, wherein translating the source of data includes translating the source of data into a set of document objects of known object types, wherein a document object includes a set of parameters that define dimensional, temporal and physical characteristics of the document object.
35. A method according to any of claims 24 to 34, wherein the process is adapted for running on multiple processors.

36. A method according to any of claims 24 to 35, wherein the process provides a text encoding process for encoding in a format selected from the group consisting of Unicode, shift-mapping and Big-5.
37. A method according to any of claims 24 to 36, wherein generating a content data file includes parsing a set of document objects having associated parameters, to define a structure and content for the content data file.
38. A method according to claim 37, further including processing the structure and content of the content data file to create a set of objects that define the content data file and are capable of being rendered on an output device.
39. A method according to claim 37 or 38, wherein processing the document objects includes processing the associated parameters for flowing content into a structure defined by the document object.
40. A method according to claim 38, or claim 39 when dependent on claim 38, wherein the output device includes a display selected from the group consisting of a visual display, an audio speaker, a video player, a television display, a printer, a disc drive, a network, and an embedded display.

41. A system for interacting with content in a digital document, comprising:
a document agent for converting content in the digital document into a set of document objects representative of internal representations of primitive structures;
a core document engine for rendering said document objects to generate a display representative of the digital content;
a user interface for detecting input signals representative of input for modifying the content of the digital document; and
a process for changing the internal representation of the content as a function of the input signals, to modify the display of the digital content.
42. A system according to claim 41, wherein the user interface includes an input device selected from the group consisting of a mouse, a touch pad, a touch screen, a joystick, a remote control and a keypad.

43. A digital document processing system substantially as herein
described with reference to the accompanying drawings.
44. A method for displaying content substantially as herein
described with reference to the accompanying drawings.
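By way of illustration only, the method of claim 24 can be sketched in Python as follows. Every function and field name here is a hypothetical simplification introduced for this example, and the format detection is deliberately trivial; the point is the claimed separation of a structure record from a content record, recombined when the output is generated.

```python
# Hypothetical sketch of claim 24: identify the file format, split the
# source into a structure record and a content record, then recombine
# them into an output "content file".

def identify_format(source: bytes) -> str:
    """Trivial stand-in for format identification."""
    return "markdown" if source.startswith(b"#") else "plain"

def translate(source: bytes):
    """Build the two claimed data structures: one describing structure,
    one holding the data content."""
    fmt = identify_format(source)
    text = source.decode("utf-8")
    if fmt == "markdown":
        structure = {"kind": "heading", "level": 1}    # first data structure
        content = {"text": text.lstrip("# ").strip()}  # second data structure
    else:
        structure = {"kind": "paragraph"}
        content = {"text": text.strip()}
    return structure, content

def generate_content_file(structure, content) -> str:
    """Structure decides the layout; content supplies the data for it."""
    if structure["kind"] == "heading":
        return content["text"].upper()
    return content["text"]

out = generate_content_file(*translate(b"# Hello"))
# out == "HELLO"
```

Keeping structure and content in separate records is what lets later claims (32 and 33) adjust layout or substitute content independently of one another.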




Patent Number 221439
Indian Patent Application Number IN/PCT/2002/1853/CHE
PG Journal Number 37/2008
Publication Date 12-Sep-2008
Grant Date 23-Jun-2008
Date of Filing 12-Nov-2002
Name of Patentee PICSEL (RESEARCH) LIMITED
Applicant Address TITANIUM BUILDING, BRAEHEAD BUSINESS PARK, KING'S INCH ROAD, PAISLEY PA4 8XE,
Inventors:
# Inventor's Name Inventor's Address
1 ANWAR, MAJID TITANIUM BUILDING, BRAEHEAD BUSINESS PARK, KING'S INCH ROAD, GLASGOW G51 4BP,
PCT International Classification Number G06F17/00
PCT International Application Number PCT/GB01/01725
PCT International Filing date 2001-04-17
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 0009129.8 2000-04-14 U.K.
2 09/703,502 2000-10-31 U.S.A.