Title:
Interactive animation of sprites in a video production
Document Type and Number:
United States Patent 7432940

Abstract:
A method of animating a sprite in a video production comprising a plurality of sequential video frames is disclosed. The method comprises the steps of selecting (2004) a feature, with which the sprite is to be juxtaposed, in one of said video frames, applying (2006) a feature tracking process to the video production to thereby output (2008), for a series of said plurality of video frames containing the feature, a temporal-spatial record for the feature across the plurality of video frames, and compositing (2010), with the series of said plurality of video frames, a corresponding series of instantiations of the sprite dependent upon the temporal-spatial record.

Inventors:
Brook, John Charles (Stanhope Gardens, AU)
Reeve, Rupert William Galloway (Redfern, AU)
Qian, Lena (Marsfield, AU)
Ma, Choi Chi Evelene (Ashfield, AU)
Magarey, Julian Frank Andrew (Cremorne, AU)
Lawther, Michael Jan (Burwood, AU)
Kowald, Julie Rae (Newport, AU)

Application Number:
10/268778
Publication Date:
10/07/2008
Filing Date:
10/11/2002
Assignee:
Canon Kabushiki Kaisha (Tokyo, JP)
Primary Class:
Other Classes:
715/726
International Classes:
G09G5/00
Field of Search:
715/719-726, 345/629, 715/716
US Patent References:
Patent No. | Title | Date | Inventor(s) | Class
4580165 | Graphic video overlay system providing stable computer graphics overlayed with video image | April, 1986 | Patton et al. | 358/148
4587577 | Tape position data recording and reproduction method | May, 1986 | Tsunoda | 386/69
4667802 | Video jukebox | May, 1987 | Verduin et al. | 194/217
5065345 | Interactive audiovisual control mechanism | November, 1991 | Knowles et al. | 715/202
5140435 | Video signal frame search apparatus for selection of a desired frame | August, 1992 | Suzuki et al. | 360/72.1
5511153 | Method and apparatus for three-dimensional, textured models from plural video images | April, 1996 | Azarbayejani et al. | 395/119
5590262 | Interactive video interface and method of creation thereof | December, 1996 | Isadore-Barreca | 395/806
5610653 | Method and system for automatically tracking a zoomed video image | March, 1997 | Abecassis | 348/110
5699442 | System for detecting the location of a reflective object within a video field | December, 1997 | Fellinger | 382/103
5801685 | Automatic editing of recorded video elements sychronized with a script text read or displayed | September, 1998 | Miller et al. | 715/500.1
5802361 | Method and system for searching graphic images and videos | September, 1998 | Wang et al. | 382/217
5867166 | Method and system for generating images using Gsprites | February, 1999 | Myhrvold et al. | 345/419
5867584 | Video object tracking method for interactive multimedia applications | February, 1999 | Hu et al. | 382/103
5923365 | Sports event video manipulating system for highlighting movement | July, 1999 | Tamir et al. | 348/169
5943445 | Dynamic sprites for encoding video data | August, 1999 | Dufaux | 382/236
6014183 | Method and apparatus for detecting scene changes in a digital video stream | January, 2000 | Hoang | 348/702
6052492 | System and method for automatically generating an image to represent a video sequence | April, 2000 | Bruckhaus | 382/284
6052508 | User interface for managing track assignment for portable digital moving picture recording and editing system | April, 2000 | Mincy et al. | 386/96
6064393 | Method for measuring the fidelity of warped image layer approximations in a real-time graphics rendering pipeline | May, 2000 | Lengyel et al. | 345/427
6125229 | Visual indexing system | September, 2000 | Dimitrova et al. | 386/69
6185538 | System for editing digital video and audio information | February, 2001 | Schulz | 704/278
6188777 | Method and apparatus for personnel detection and tracking | February, 2001 | Darrell et al. | 382/103
6198833 | Enhanced interactive video with object tracking and hyperlinking | March, 2001 | Rangan et al. | 382/103
6226388 | Method and apparatus for object tracking for automatic controls in video devices | May, 2001 | Qian et al. | 382/103
6233007 | Method and apparatus for tracking position of a ball in real time | May, 2001 | Carlbom et al. | 348/157
6243104 | System and method for integrating a message into streamed content | June, 2001 | Murray | 345/439
6268864 | Linking a video and an animation | July, 2001 | Chen et al. | 345/428
6278466 | Creating animation from a video | August, 2001 | Chen | 345/473
6295367 | System and method for tracking movement of objects in a scene using correspondence graphs | September, 2001 | Crabtree et al. | 382/103
6400378 | Home movie maker | June, 2002 | Snook | 715/716
6414679 | Architecture and methods for generating and displaying three dimensional representations | July, 2002 | Miodonski et al. | 345/420
6418424 | Ergonomic man-machine interface incorporating adaptive pattern recognition based control system | July, 2002 | Hoffberg et al. | 706/21
6430357 | Text data extraction system for interleaved video data streams | August, 2002 | Orr | 386/69
6442538 | Video information retrieval method and apparatus | August, 2002 | Nojima | 707/1
6507410 | Method for non-linear document conversion and printing | January, 2003 | Robertson et al. | 358/1.18
6674955 | Editing device and editing method | January, 2004 | Matsui et al. | 386/52
6678413 | System and method for object identification and behavior characterization using video analysis | January, 2004 | Liang et al. | 382/181
6724915 | Method for tracking a video object in a time-ordered sequence of image frames | April, 2004 | Toklu et al. | 382/103
6738100 | Method for detecting scene changes in a digital video stream | May, 2004 | Hampapur et al. | 348/702
6774908 | System and method for tracking an object in a video and linking information thereto | August, 2004 | Bates et al. | 345/589
6778224 | Adaptive overlay element placement in video | August, 2004 | Dagtas et al. | 348/586
6795567 | Method for efficiently tracking object models in video sequences via dynamic ordering of features | September, 2004 | Cham et al. | 382/103
6813313 | Method and system for high-level structure analysis and event detection in domain specific videos | November, 2004 | Xu et al. | 375/240.08
6813622 | Media storage and retrieval system | November, 2004 | Reber et al. | 707/104.1
6819797 | Method and apparatus for classifying and querying temporal and spatial information in video | November, 2004 | Smith et al. | 382/181
6917692 | Kalman tracking of color objects | July, 2005 | Murching et al. | 382/103
7075591 | Method of constructing information on associate meanings between segments of multimedia stream and method of browsing video using the same | July, 2006 | Jun et al. | 348/700
20010048753 | SEMANTIC VIDEO OBJECT SEGMENTATION AND TRACKING | December, 2001 | Lee et al. | 382/103
20020141615 | Mechanism for tracking colored objects in a video sequence | October, 2002 | Mcveigh et al. | 382/103
20030011713 | Method and system for enhancing a graphic overlay on a video image | January, 2003 | Kastelic | 348/589
20030034997 | Combined editing system and digital moving picture recording system | February, 2003 | McKain et al. | 345/723
20040141635 | Unified system and method for animal behavior characterization from top view using video analysis | July, 2004 | Liang et al. | 382/110
20040141636 | Unified system and method for animal behavior characterization in home cages using video analysis | July, 2004 | Liang et al. | 382/110
20050278618 | Information processing apparatus and method, program, and recording medium | December, 2005 | Ogikubo | 715/513
Foreign References:
Document No. | Date | Title
AUA-1910797 | October, 1997 |
AU200243487 | January, 2002 |
AU200026411 | November, 2002 |
WO/1997/039452 | October, 1997 | MEDIA EDITING SYSTEM WITH IMPROVED EFFECT MANAGEMENT
WO/1998/006098 | February, 1998 | NON-LINEAR EDITING SYSTEM FOR HOME ENTERTAINMENT ENVIRONMENTS
WO/2001/027876 | April, 2001 | VIDEO SUMMARY DESCRIPTION SCHEME AND METHOD AND SYSTEM OF VIDEO SUMMARY DESCRIPTION DATA GENERATION FOR EFFICIENT OVERVIEW AND BROWSING
WO/2001/035056 | May, 2001 | SYSTEM FOR AUTOMATED MULTIMEDIA PRESENTATION UTILIZING PRESENTATION TEMPLATES
WO/2001/082624 | November, 2001 | SYSTEMS, METHODS AND COMPUTER PROGRAM PRODUCTS FOR MANAGING MULTIMEDIA CONTENT
WO/2001/098888 | December, 2001 | CONTEXT-SENSITIVE METHODS AND SYSTEMS FOR DISPLAYING COMMAND SETS
WO/2002/054377 | July, 2002 | MEDIA EDITING AND CREATING INTERFACE
Other References:
G. Ozsoyoglu, et.al., “Electronic Books in Digital Libraries,” IEEE Proceedings, ADL 2000. pp. 1-10, 0-7695-0659-3.
Primary Examiner:
Tung, Kee M.
Assistant Examiner:
Chu, David H.
Attorney, Agent or Firm:
Fitzpatrick, Cella, Harper & Scinto
Claims:
The invention claimed is:

1. An apparatus for extracting frames for printing from a production represented by a production description, said production comprising video frames which include an animation layer, said production having been formed using a means for animating a sprite in the animation layer of said video frames, the apparatus comprising: means for determining printing suitability measures for the frames to be extracted dependent upon meta-data in said production description, said meta-data being associated with an animated sprite in said animation layer of said video production; and means for extracting said frames for printing from the production dependent upon said printing suitability measures; wherein the means for animating the sprite in the video production comprises: means for selecting a sprite and a feature in a video frame of the video production in relation to which the sprite is to be animated; means for applying a feature tracking process to the video production to thereby output a trajectory for the feature; and means for compositing instantiations of the sprite with the video production depending upon the trajectory to thereby form a first animated production; said apparatus further comprising: means for selecting another feature in a video frame of one of the video production and the first animated production to which the sprite is to be animated; means for applying the feature tracking process to said one of the video production and the first animated production to thereby output another trajectory for said other selected feature; and means for compositing instantiations of the sprite with the first animated production depending on the other trajectory to thereby form a second animated production; wherein the animated sprite in the first animated production has an associated animation time, and the other trajectory for the other selected feature has an associated trajectory time; and wherein, if the associated animation time differs from the associated trajectory time for the other trajectory, then the compositing depending on the other trajectory is performed by: means for scaling the other trajectory so that the associated trajectory time conforms to the associated animation time; and means for compositing instantiations of the sprite with the first animated production depending on the scaled other trajectory to thereby form a second animated production.

2. A method of extracting frames for printing from a production represented by a production description, said production comprising video frames which include an animation layer, said production having been formed by a process of animating a sprite in the animation layer of said video frames, the method comprising the steps of: determining printing suitability measures for the frames to be extracted dependent upon meta-data in said production description, said meta-data being associated with an animated sprite in said animation layer of said video production; and extracting said frames for printing from the production dependent upon said printing suitability measures; wherein the animation of the sprite in the video production comprises the steps of: selecting a sprite and a feature in a video frame of the video production in relation to which the sprite is to be animated; applying a feature tracking process to the video production to thereby output a trajectory for the feature; and compositing instantiations of the sprite with the video production depending upon the trajectory to thereby form a first animated production; said method comprising the further steps of: selecting another feature in a video frame of one of the video production and the first animated production to which the sprite is to be animated; applying the feature tracking process to said one of the video production and the first animated production to thereby output another trajectory for said other selected feature; and compositing instantiations of the sprite with the first animated production depending on the other trajectory to thereby form a second animated production; wherein the animated sprite in the first animated production has an associated animation time, and the other trajectory for the other selected feature has an associated trajectory time; and wherein, if the associated animation time differs from the associated trajectory time for the other trajectory, then the compositing depending on the other trajectory comprises the steps of: scaling the other trajectory so that the associated trajectory time conforms to the associated animation time; and compositing instantiations of the sprite with the first animated production depending on the scaled other trajectory to thereby form a second animated production.

3. A method according to claim 2, wherein: the trajectory end points are defined by event bounds comprising a begin frame and an end frame; and the compositing step is performed in relation to a video frame of the video production falling between said begin and said end frames.

4. A method according to claim 2, wherein: the trajectory end points are defined by event bounds comprising a begin frame and an end frame; and the compositing step is performed in relation to a video frame of the video production falling in the neighborhood of at least one of said begin and said end frames.

5. A method according to claim 2, wherein said printing suitability measures are determined dependent upon a production template used to form said production.
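
As an illustration only, the trajectory scaling recited in claims 1 and 2 can be pictured as resampling a trajectory so that it spans the animation time. The sketch below assumes that a trajectory is a list of (x, y) positions, one per frame, and that scaling is done by linear interpolation; the claims do not state how the scaling is performed, and the name `scale_trajectory` is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def scale_trajectory(trajectory: List[Point], animation_frames: int) -> List[Point]:
    """Resample a per-frame trajectory so that it spans `animation_frames` frames.

    Assumption: linear interpolation between neighbouring trajectory points.
    """
    n = len(trajectory)
    if n < 1 or animation_frames < 1:
        raise ValueError("trajectory and animation time must be non-empty")
    if n == 1 or animation_frames == 1:
        # Nothing to interpolate: hold the first position for every frame.
        return [trajectory[0]] * animation_frames

    scaled = []
    for k in range(animation_frames):
        # Position of output frame k on the original trajectory's time axis.
        t = k * (n - 1) / (animation_frames - 1)
        i = min(int(t), n - 2)          # index of the segment start point
        frac = t - i                    # fraction of the way along the segment
        (x0, y0), (x1, y1) = trajectory[i], trajectory[i + 1]
        scaled.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
    return scaled
```

For example, a three-frame trajectory stretched to a five-frame animation time gains one interpolated position between each pair of original positions, while the end points are preserved.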

Description:

TRADEMARKS

This specification may include words which are, or are asserted to be, proprietary names or trademarks. Their inclusion does not imply that they have acquired, for legal purposes, a non-proprietary or generic significance. Furthermore, no judgement is implied concerning their legal status. In cases where it is considered that proprietary rights may attach to a word, this is indicated by a propr superscript, noting that this does not imply a legal judgement concerning the legal status of such words.

TECHNICAL FIELD

The present invention relates generally to user interfaces, and, in particular, to Graphical User Interfaces (GUIs) as applied to video cameras and other recording and capture devices for media items.

BACKGROUND

With the proliferation of digital video camcorders (DVCs), there is a growth in the number of DVC users who wish to edit and process their captured video images, and also to communicate the product of their efforts to others. DVCs can capture video data, audio data, and in some cases, still images as well. These various types of data are referred to in this specification as “media data” and items of data are referred to as “media items” or “media clips” or the like. Technological elements associated with this editing activity include the editing interface, hardware accessories and editing software required, as well as communication hardware and software.

Digital Disc reCording devices (DDCs), i.e. digital video recording devices which utilise magnetic or magneto-optical discs (MODs) as a recording medium, offer even greater flexibility to users wishing to edit, process and communicate their media data. However, greater flexibility typically exacerbates the problems facing enthusiastic yet technically untutored users.

A generic DDC system architecture comprises a number of different functional and/or structural elements. One such element is communication, both between intra-system elements, and between the system and external devices and data. Another element is infrastructure for supporting editing and incorporation of effects. A user wishing to effectively edit and process captured “media data” thus has a relatively complex system to manipulate.

Users who do edit media data typically wish to improve on the internal DDC recording, editing and effect-adding features. However, very few consumers actually edit media data with software or specialised hardware. This derives in part from the fact that typical consumer users of digital video devices find it difficult to interconnect the various elements, both hardware and software, in a system architecture. This hinders the growth of DDC architectures, which inherently offer advantageous editing and processing capabilities. Furthermore, very few users attempt to gain skills in media data capture, editing or post-production. Even those consumers who do attempt to master the technology find that this is not enough, because media data editing and post-production is an art, and the required hardware and/or software is typically expensive.

Many basic editing techniques are time consuming and repetitive. Although software packages can provide assistance in the form of interactive GUIs, the tedium of becoming familiar with the media and making edit decisions remains.

In some instances, a fully edited video may have tens or hundreds of clips, each clip typically having 25 to 30 frames per second. Even for a professional using high-end equipment, the task of editing such a video can take many hours or days. For a consumer video photographer, performance of this task is prohibitive in time, expensive in money terms and demanding in skill terms.

In a typical editing task, once selection of clips has been performed from the raw footage, the clips are placed in sequence. Current tools available for this process include software that provides a linear time-line, or alternatively, hardware such as dual Video Cassette Recorders (VCRs) for sequencing from the original source to another. This again is a time consuming task, involving manually slotting each video clip into place in the sequence. The current dual-VCR or camera-and-VCR solutions are slow and tediously technical for a consumer to control, and should the consumer want to amend any part of the video, the whole process must often be started again. Although some of the aforementioned process can be substituted by more capable hardware and software, the dual-VCR, or camera-and-VCR process is still used by many consumers.

Transitions such as dissolves or cross-fades are often beyond the capability of consumers' equipment unless they can use computer software. The actual implementation of video transitions and effects often places heavy storage and processing demands on editing computer resources, including requiring capture and export format decoding and encoding hardware attached to a consumer's computer. Consumer video photographers typically do not have a basic appreciation of the nature of transitions, or where they should be used. This typically leads to incorrect or excessive use thereof, which constitutes a drain on resources and produces less than pleasing results.

Consumers generally have high expectations of video because of the wide availability of high-quality television programs. Home video production rarely comes close to the quality of professionally-made television programs, and this is evident in the disdain with which the general public holds home videos. It is very difficult for consumers to compete with the quality of professional television programs when producing their home videos. For instance, generating titles and correctly placing them in order to produce an entertaining result requires typographical and animation skills often lacking in consumers. It is also not fully appreciated that unprofessionally made titles often spoil the result of many hours of editing. Specialised software and/or additional title-generation resources are often required, thereby adding to the final cost of the production.

Current methods of sound editing are highly specialised, and the principles governing the process of embellishing a final edited rhythm with beat synchronisation are well beyond the scope of most consumer video makers. The time required to analyse the wave form of a chosen sound track in order to synchronise video cuts is prohibitive, and the cost of equipment is unjustified for most consumers. These techniques are typically unavailable in the dual-VCR editor context.

Video typically contains much content that is rarely if ever used, often being viewed only once. Users typically capture more content than is ultimately of interest to them. Finding and viewing the content that is of interest can be carried out in various ways.

Considering an analog tape deck system, the user must shuttle through the linear tape, log the timecode of a frame sequence of interest, and/or record these segments to another tape. Logging timecode is generally only a practice of professional video editors. The practice generates log sheets, which constitute a record of locations of useful content on the tape. The case of tape-to-digital capture is similar. Here, the user shuttles through the content, marking the timecode via a keyboard and mouse using a computer software application. The interesting/useful segments are then digitised to a hard disk. It is apparent that in both of the above cases, the user makes a duplicate record of desired footage.

Once the content is used in an edited production, further trimming takes place. Should the user want to use the interesting content in another, different production, the analog tape deck system requires the user to carry out the same rewriting-to-tape process. Any content captured to disk requires that the user search through the file system to find the relevant shots. Once again, the final edited production consists of trimmed down, interesting sequences of frames.

A large number of video editing packages are available for Personal Computer users. High-end products are available, these being intended for professional video editing users, and such products have high functionality and high complexity. Low-end packages are also available, these having limited functionality, but typically retaining considerable complexity, intended for video camera enthusiasts or even children. A common need of video Editors (the term “Editor” denoting a person performing the editing function), be they professionals or enthusiastic consumers, is to trim the length of video clips that they wish to include in any current editing project. High-end and low-end video editing products take differing approaches to this clip-trimming task, but both approaches have significant usability failings.

Low-end video editors (the term “editor” denoting a product or a device), such as Apple iMovie propr, typically provide clip-trimming facilities only in the edit-timeline or storyline through the use of some kind of time-unit marker referenced to the apparent length of a clip in the timeline. Alternatively, a numerical start time, and either a clip duration measure or a stop time, is entered into a dialogue box in units of frames, seconds or similar. This user-interface facility does not allow actual concurrent viewing of the clip while trimming in and out points.

High-end video editors typically provide a trimming viewer that combines the ability to play or step a clip at the whim of a user, often using a “scrubber” or slider control, while also allowing setting of in and out trim points at desired locations of the clip. The trim points are often displayed and controlled in a summary bar which represents the original length, or duration of the clip, and the trim markers appearing in this summary bar represent proportional positions of the actual trim points set by the user relative to the original clip duration. The scrubber or slider control also represents a proportional position within the clip length, this time of the viewed frame or heard audio.
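
The proportional relationship between trim points, the scrubber and the summary bar described above amounts to a linear mapping. The following is a minimal sketch of that arithmetic, assuming a summary bar measured in pixels and a clip measured in frames; the function names and the pixel-based bar are illustrative assumptions, not details taken from any particular editor.

```python
def marker_position(frame_index: int, clip_frames: int, bar_width_px: int) -> int:
    """Map a frame index to a pixel offset along the summary bar."""
    if clip_frames <= 0 or bar_width_px <= 0:
        raise ValueError("clip length and bar width must be positive")
    fraction = frame_index / clip_frames      # proportional position in the clip
    return round(fraction * bar_width_px)


def frame_at(pixel: int, clip_frames: int, bar_width_px: int) -> int:
    """Inverse mapping: the frame shown when the scrubber sits at `pixel`."""
    if clip_frames <= 0 or bar_width_px <= 0:
        raise ValueError("clip length and bar width must be positive")
    return round(pixel / bar_width_px * clip_frames)
```

Under these assumptions, an in-point set at frame 75 of a 300-frame clip is drawn a quarter of the way along the bar, whatever the bar's width.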

High-end video editors often provide a trimming window that is disconnected from the information held within the edit timeline. Thus, any clip already imported to a timeline must be dragged, or otherwise imported, into the trimming window where it may be modified by the user. It is the norm that such modifications have no effect on the edit timeline during or after performance of the modifications by the user, until the user explicitly exports the trimmed clip back into the timeline. In this event, the trimmed clip is understood by the timeline portion of the editing application not to be the same clip as was originally imported into the trimmer. This identification of two separate clips adds to the list of workflow and usability problems for a user, even if that user is an expert. Exemplary high-end applications include Apple's Final Cut Pro propr, and Adobe Premiere propr.

The types of usability problems encountered by a user in the above context include the need to replace the original clip (i.e., the clip prior to trimming) in the timeline with the newly trimmed clip. This forces the user to take extra steps to make the replacement. Furthermore, the user is unable to obtain any information from the timeline or the trimmer regarding the effect of the trimming on the final edited result, as represented by the timeline. That is, only the local effect of a trim is available to a user in this context, whereas the global effect of a trim is not available until the user actually commits the trimmed clip back into the timeline. This represents an absence of synchronism between the user's trimming action and the editor's currently held state for the project. Furthermore, the user cannot easily move to another clip within the edit timeline and trim that clip. This limitation impairs the undertaking of related trimming operations between clips and the appreciation of their overall effect on the current project in the timeline. In addition, the edit timeline is often represented as having an arbitrary length, due to a strong interest in providing a fixed resolution representation for every clip and/or frame within the timeline. This often causes a timeline's contents to scroll beyond the boundary of the available window and out of visibility. This is a limitation when multiple clips need to be trimmed that cannot all be visible within the timeline at the same time without scrolling. Furthermore, previewing of the resultant production, to view the results of any one or more trimming operations, is provided in a further, separate viewer window and is unconnected and unsynchronised with any current or recent trimming operation.

Further workflow and usability problems are encountered when automatic editing is employed to generate the edit timeline. Automatic editing has the ability to modify an EDL (often represented graphically by a timeline) based on a number of factors beyond the selected sequence of clips provided as its input. Some of these factors include (i) user metadata such as template selection, where a template contains a characteristic set of editing instructions or operations aimed at producing an expected theme or genre result for the output EDL, and (ii) various scene or content metadata such as user-set highlights, pan-and-zoom metadata, and so on. When a user trims an input clip to an auto-editor, their actions can result in significant changes to the output EDL because of the potential non-linear behaviour of the auto-editing template. For example, if the user trims a clip to a significantly short period, then it might be discarded by the auto-editor altogether. Or, if the user adds a highlight flag to a frame of the clip while in the trimmer (the highlight being a form of user metadata), then the auto-editor may trim the clip automatically around the position of the highlight. With current systems, the user has a poor and delayed appreciation of the effects of changes they might make within the trim window, in regard to the overall result of the auto-edit. This is a disadvantage in regard to workflow and usability for a user of an auto-editor.

A user wishing to add an animated message or sprite to a video clip must have access to a video or still-image compositing tool such as Apple Quicktime propr . Typically such an operation or effect is performed by defining or declaring a sprite layer or animation layer within a streaming compositor, and providing a matte or transparency signal for the sprite to allow it to be overlaid on the desired video content.
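
The role of the matte or transparency signal can be illustrated with the standard alpha blend, in which each output pixel is a weighted mix of the sprite pixel and the video pixel. The sketch below is a generic illustration, not the behaviour of any named compositing tool; it assumes RGB tuples for pixels and a matte value between 0 (video only) and 1 (sprite only), and the function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def overlay_pixel(sprite: Pixel, video: Pixel, matte: float) -> Pixel:
    """Blend one sprite pixel over one video pixel using the matte value."""
    return tuple(round(matte * s + (1.0 - matte) * v) for s, v in zip(sprite, video))


def overlay_sprite(video: List[List[Pixel]], sprite: List[List[Pixel]],
                   matte: List[List[float]], top: int, left: int) -> List[List[Pixel]]:
    """Return a copy of the video frame with the sprite blended in at (top, left)."""
    out = [row[:] for row in video]
    for r, (sprite_row, matte_row) in enumerate(zip(sprite, matte)):
        for c, (s, m) in enumerate(zip(sprite_row, matte_row)):
            rr, cc = top + r, left + c
            # Sprite pixels that fall outside the frame are simply skipped.
            if 0 <= rr < len(out) and 0 <= cc < len(out[0]):
                out[rr][cc] = overlay_pixel(s, out[rr][cc], m)
    return out
```

A matte of 1 replaces the video pixel with the sprite pixel, a matte of 0 leaves the video untouched, and intermediate values give soft sprite edges.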

Users are provided with sprite animation facilities by various current software applications such as Macromedia Flash propr (often these applications utilise a streaming video compositor such as Apple Quicktime propr ). However, the application and motion definition for sprite animations is typically a very manual-intensive process, requiring per-frame sprite application (known as rotoscoping), a steady hand, and an appreciation of object dynamics in a video frame for accurate and pleasing placement of the sprite. Alternatively, very basic automated sprite application capability is provided by some software applications. Such capabilities include definition of a fixed spatial coordinate or a spatial path to which the sprite is “attached”, both of which have no continuous association or reference to a tracked feature to which the user might wish to relate the sprite.

The current consumer-level sprite application solutions understand nothing about the content of any video to which they might be applied. This content-sprite relationship must be provided entirely by the user's frame-by-frame observation of the video content or alternatively, must be completely ignored and some content-unrelated, typically pre-determined, animation track is provided instead.

Per-frame application of a sprite by a user typically involves specification of a spatial location for the sprite on a per-frame basis, with best results being provided where the user accounts for the position of one or more content objects within the frame to which she wishes to associate the sprite in some higher semantic context. Such operations suffer from human error, in which spatial placement can jitter or jump because of the difficulty in creating smooth animations from what is effectively stop-motion photography. The user is, in such cases, being asked to provide movement dynamics and thus requires trajectory-building skills of similar degree to those of animation experts. Even systems that provide auto-smoothing of a user's animation trajectory, or that provide a range of predetermined and possibly adjustable trajectories, do not provide any assistance as to the correct placement of a sprite in any and every frame based on the location of the content objects with which the user desires to associate the sprites. This lack of automatic connection of the sprite's trajectory with the desired associated content object therefore requires the user to check and/or correct the sprite trajectory per-frame, or to accept an inaccurate animation trajectory.

It can be seen that the application of a sprite and the definition or declaration of its animation trajectory suffers from significant limitations.

It is thus apparent that when the user either wishes to perform trimming operations using current video editing applications, or wishes to incorporate sprite animation or feature-associated effects operations in current video composition or editing applications, the user must tolerate severe preparation, contrivance, cost, skill, workflow and usability limitations, and thus suffers reduced efficiency and accuracy as a result.

SUMMARY

It is an object of the present invention to substantially overcome, or at least ameliorate, one or more disadvantages of existing arrangements.

According to a first aspect of the invention, there is provided a method of animating a sprite in a video production comprising a plurality of video frames, said method comprising the steps of:

selecting, in one of said video frames, a feature with which the sprite is to be composited;

applying a feature tracking process to the video production to thereby output, for a series of said plurality of video frames containing the feature, a temporal-spatial record for the feature; and

compositing, dependent upon the temporal-spatial record, to each frame of the series of video frames, a corresponding instantiation of the sprite.
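
By way of non-limiting illustration, the three steps of this first aspect can be sketched in Python as follows. The tracker used here is a deliberately trivial stand-in (it follows the brightest pixel in a small window around the previous position), and the names `track_feature` and `composite_sprite`, the list-of-rows frame representation and the numeric thresholds are all assumptions made for the sketch rather than features of the described arrangements.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]   # greyscale frame as rows of pixel values (assumed)
Point = Tuple[int, int]   # (x, y)

def track_feature(frames: List[Frame], start: Point, radius: int = 2,
                  min_value: int = 200) -> List[Tuple[int, Point]]:
    """Return a temporal-spatial record [(frame_index, (x, y)), ...].

    Stand-in tracker: follows the brightest pixel near the previous position
    and stops when no pixel in the window reaches min_value (feature lost).
    """
    record = []
    x, y = start
    for idx, frame in enumerate(frames):
        best: Optional[Point] = None
        best_val = min_value - 1
        for yy in range(max(0, y - radius), min(len(frame), y + radius + 1)):
            for xx in range(max(0, x - radius), min(len(frame[0]), x + radius + 1)):
                if frame[yy][xx] > best_val:
                    best_val, best = frame[yy][xx], (xx, yy)
        if best is None:
            break                 # feature obscured or absent: end the series
        x, y = best
        record.append((idx, best))
    return record

def composite_sprite(frames: List[Frame], record: List[Tuple[int, Point]],
                     sprite_value: int = -1, offset: Point = (0, -1)) -> List[Frame]:
    """Composite one sprite instantiation per tracked frame, offset from the feature."""
    out = [[row[:] for row in f] for f in frames]   # leave the input frames untouched
    for idx, (x, y) in record:
        sx, sy = x + offset[0], y + offset[1]
        if 0 <= sy < len(out[idx]) and 0 <= sx < len(out[idx][0]):
            out[idx][sy][sx] = sprite_value         # single-pixel "sprite" for brevity
    return out
```

In this sketch the sprite is only composited onto frames that appear in the record, so the animation ends where tracking ends.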

According to another aspect of the invention, there is provided an apparatus for animating a sprite in a video production comprising a plurality of video frames, said apparatus comprising:

means for selecting, in one of said video frames, a feature with which the sprite is to be composited;

means for applying a feature tracking process to the video production to thereby output, for a series of said plurality of video frames containing the feature, a temporal-spatial record for the feature; and

means for compositing, dependent upon the temporal-spatial record, to each frame of the series of video frames, a corresponding instantiation of the sprite.

According to another aspect of the invention, there is provided a method of selecting frames for printing from a production comprising video frames which include animation, the method comprising the steps of:

determining relative suitability measures for the video frames dependent upon at least one of (i) meta-data associated with the video frames, and (ii) a production template used to form the production; and

producing said frames for printing dependent upon said relative suitability measures.
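
As an illustrative sketch only, a relative suitability measure can be modelled as a weighted sum over per-frame meta-data, with the weights imagined as being supplied by the production template. The meta-data keys, the weighting scheme and the function names below are hypothetical and are not taken from the described arrangements.

```python
from typing import Dict, List

def suitability(meta: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of meta-data values; missing keys count as 0."""
    return sum(w * meta.get(key, 0.0) for key, w in weights.items())

def select_frames_for_printing(frame_meta: List[Dict[str, float]],
                               weights: Dict[str, float], count: int) -> List[int]:
    """Return indices of the `count` highest scoring frames, in temporal order."""
    ranked = sorted(range(len(frame_meta)),
                    key=lambda i: suitability(frame_meta[i], weights),
                    reverse=True)
    return sorted(ranked[:count])
```

Returning the chosen indices in temporal order keeps a printed summary in the same sequence as the production.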

According to another aspect of the invention, there is provided an apparatus for selecting frames for printing from a production comprising video frames which include animation, the apparatus comprising:

means for determining relative suitability measures for the video frames dependent upon at least one of (i) meta-data associated with the video frames, and (ii) a production template used to form the production; and

means for producing said frames for printing dependent upon said relative suitability measures.

According to another aspect of the invention, there is provided a method of animating a sprite in a video production, said method comprising the steps of:

selecting a sprite and a feature in a video frame of the video production in relation to which the sprite is to be animated;

applying a feature tracking process to the video production to thereby output a trajectory for the feature; and

compositing instantiations of the sprite with the video production depending upon the trajectory to thereby form a first animated production.

According to another aspect of the invention, there is provided a computer program for directing a processor to execute a procedure for animating a sprite in a video production comprising a plurality of sequential video frames, said program comprising:

code for selecting, in one of said video frames, a feature with which the sprite is to be composited;

code for applying a feature tracking process to the video production to thereby output, for a series of said plurality of video frames containing the feature, a temporal-spatial record for the feature; and

code for compositing, dependent upon the temporal-spatial record, to each frame of the series of video frames, a corresponding instantiation of the sprite.

According to another aspect of the invention, there is provided a computer program for directing a processor to execute a procedure for animating a sprite in a video production comprising a plurality of sequential video frames, said program comprising:

code for selecting a sprite and a feature in a video frame of the video production in relation to which the sprite is to be animated;

code for applying a feature tracking process to the video production to thereby output a trajectory for the feature; and

code for compositing instantiations of the sprite with the video production depending upon the trajectory to thereby form a first animated production.

According to another aspect of the invention, there is provided an apparatus for animating a sprite in a video production comprising a plurality of sequential video frames, said apparatus comprising:

a memory for storing a program; and

a processor for executing the program, said program comprising:

code for selecting, in one of said video frames, a feature with which the sprite is to be composited;

code for applying a feature tracking process to the video production to thereby output, for a series of said plurality of video frames containing the feature, a temporal-spatial record for the feature; and

code for compositing, dependent upon the temporal-spatial record, to each frame of the series of video frames, a corresponding instantiation of the sprite.

According to another aspect of the invention, there is provided a Graphical User Interface (GUI) system for editing a production having a plurality of media clips, said GUI system comprising:

(i) a clip editing process;

(ii) a GUI comprising:

a graphical representation of a selected one of said plurality of media clips, wherein manipulation of said graphical representation enables the clip editing process to be applied to the selected media clip; and

a presentation means configured to present said one media clip and any editing changes made thereto; and

(iii) a production editing process which is applied to said production to thereby form, dependent upon the selected media clip and said any editing changes made thereto, an edited production, wherein the application of the production editing process is synchronously dependent upon the application of the clip editing process.

According to another aspect of the invention, there is provided a Graphical User Interface (GUI) system for editing a production having a plurality of media clips, said GUI system comprising:

(i) a clip editing process;

(ii) a GUI comprising:

a graphical representation of a selected one of said plurality of media clips, wherein manipulation of said graphical representation enables the clip editing process to be applied to the selected media clip; and

a presentation means configured to present said one media clip and any editing changes made thereto; and

(iii) a production editing process which is applied to an EDL of said production to thereby form, dependent upon the selected media clip and said any editing changes made thereto, an edited EDL, wherein the application of the production editing process is synchronously dependent upon the application of the clip editing process.

According to another aspect of the invention, there is provided a method of editing, using a Graphical User Interface (GUI) system, a production having a plurality of media clips, said method comprising the steps of:

selecting one of said plurality of media clips;

manipulating, using a GUI, a graphical representation of said selected media clip to thereby apply a clip editing process to the selected media clip;

presenting said one media clip and any editing changes made thereto using a presentation means; and

applying, synchronously with said application of the clip editing process, a production editing process to said production to thereby form, dependent upon the selected media clip and said any editing changes made thereto, an edited production.

According to another aspect of the invention, there is provided a method of editing, using a Graphical User Interface (GUI) system, a production having a plurality of media clips, said method comprising the steps of:

selecting one of said plurality of media clips;

manipulating, using a GUI, a graphical representation of said selected media clip to thereby apply a clip editing process to the selected media clip;

presenting said one media clip and any editing changes made thereto using a presentation means; and

applying, synchronously with said application of the clip editing process, a production editing process to an EDL of said production to thereby form, dependent upon the selected media clip and said any editing changes made thereto, an edited EDL.
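
One possible reading of the synchronous dependence between the clip editing process and the production editing process is sketched below, purely as an assumption-laden illustration. The EDL is modelled as a list of in/out entries, and a trim applied to a clip updates the matching EDL entries before the trim call returns. The class and function names (`EdlEntry`, `Production`, `ClipEditor`, `bind`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class EdlEntry:
    clip_id: str
    in_frame: int
    out_frame: int   # exclusive

@dataclass
class Production:
    edl: List[EdlEntry] = field(default_factory=list)

    def duration(self) -> int:
        return sum(e.out_frame - e.in_frame for e in self.edl)

class ClipEditor:
    """Applies a trim to one clip and notifies listeners within the same call."""
    def __init__(self) -> None:
        self.listeners: List[Callable[[str, int, int], None]] = []

    def trim(self, clip_id: str, in_frame: int, out_frame: int) -> None:
        if not 0 <= in_frame < out_frame:
            raise ValueError("trim range must satisfy 0 <= in < out")
        for notify in self.listeners:   # synchronous: runs before trim() returns
            notify(clip_id, in_frame, out_frame)

def bind(editor: ClipEditor, production: Production) -> None:
    """Make every clip edit update the matching EDL entries straight away."""
    def apply(clip_id: str, in_frame: int, out_frame: int) -> None:
        for entry in production.edl:
            if entry.clip_id == clip_id:
                entry.in_frame, entry.out_frame = in_frame, out_frame
    editor.listeners.append(apply)
```

With this arrangement the edited EDL is always consistent with the most recent clip edit, so a presentation of the production never lags the presentation of the clip.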

According to another aspect of the invention, there is provided a computer program product including a computer readable medium having recorded thereon a computer program for directing a computer to execute a method for editing, using a Graphical User Interface (GUI) system, a production having a plurality of media clips, said program comprising:

code for selecting one of said plurality of media clips;

code for manipulating, using a GUI, a graphical representation of said selected media clip to thereby apply a clip editing process to the selected media clip;

code for presenting said one media clip and any editing changes made thereto using a presentation means; and

code for applying, synchronously with said application of the clip editing process, a production editing process to an EDL of said production to thereby form, dependent upon the selected media clip and said any editing changes made thereto, an edited EDL.

According to another aspect of the invention, there is provided a computer program for directing a computer to execute a method for editing, using a Graphical User Interface (GUI) system, a production having a plurality of media clips, said program comprising:

code for selecting one of said plurality of media clips;

code for manipulating, using a GUI, a graphical representation of said selected media clip to thereby apply a clip editing process to the selected media clip;

code for presenting said one media clip and any editing changes made thereto using a presentation means; and

code for applying, synchronously with said application of the clip editing process, a production editing process to an EDL of said production to thereby form, dependent upon the selected media clip and said any editing changes made thereto, an edited EDL.

BRIEF DESCRIPTION OF THE DRAWINGS

A number of embodiments of the present invention will now be described with reference to the drawings, in which:

FIG. 1 shows a browser Graphical User Interface (GUI), whereby one or more video clips can be filtered, selected or reviewed;

FIG. 2 shows GUI system controllable processes distributed between a DDC system and a PC;

FIG. 3 shows a block diagram of a GUI system;

FIG. 4 shows a general purpose computer system upon which the GUI system of FIG. 3 can be practiced;

FIG. 5 shows a special purpose DDC computer system upon which the GUI system of FIG. 3 can be practiced;

FIG. 6 shows data flow in the GUI system;

FIGS. 7A and 7B show the GUI for the browser of FIG. 1 and a playlist controller GUI;

FIG. 8 is a process flow-chart for implementing metadata driven GUI controls;

FIG. 9 depicts a playlist controller GUI used for manual editing;

FIG. 10 is a data flow diagram underlying the playlist controller GUI of FIG. 7B;

FIG. 11 shows a playlist controller GUI used for automatic editing;

FIG. 12 depicts a style-dial which facilitates template selection in the GUI system;

FIG. 13 shows a playlist controller GUI with an auto-magnified playlist summary bar;

FIG. 14 shows a process for Feature Tracking and Creation of a Trajectory List and Track Item;

FIG. 15 shows a GUI for User Selection of a Feature for Tracking with Simultaneous Application of a Sprite Animation;

FIG. 16 shows the GUI of FIG. 15 after termination of the Sprite Animation due to Detected Obscuring of the User-Selected Feature;

FIG. 17 shows the GUI of FIG. 15 with a Composited Preview of the Applied Sprite Animation to the Tracked Feature previously selected by the user;

FIG. 18 shows an exemplary Feature Tracking process;

FIG. 19 shows a process for Application of a Track Item and associated Feature Trajectory to a Sprite Animation and Compositing thereof over Media;

FIG. 20 depicts a process for selecting a feature in a video production, and compositing a sprite onto frames in the production in which the feature is present;

FIG. 21 is a process flow-chart for capture and/or import, and storing of media data;

FIG. 22 depicts an attributes structure incorporating a meta-data summary;

FIG. 23 is a process flow-chart showing highlight data being added to video media data;

FIG. 24 shows a Magneto-optical-disk (MOD) directory structure;

FIG. 25 shows a track composition of a time line;

FIG. 26 shows the order of elements in an exemplary media track in the time line of FIG. 25;

FIG. 27 shows time relationships of transition objects and associated media objects;

FIG. 28 depicts a file structure into which a time line can be saved;

FIG. 29 shows Trajectory and Track List Data Structures;

FIG. 30 shows a process flow-chart for parsing a file in accordance with a template;

FIG. 31 depicts a meta-data scheme relating to object detection;

FIG. 32 shows an auto-edit template process using feature tracking metadata;

FIG. 33 shows a video production data description; and

FIG. 34 shows a process for creating a print summary based on a video production.

Glossary of Terms
animation: artificial moving graphic images.
audio: file containing only audio data.
browse filters: technology that employs meta-data structures, eg derived from content analysis algorithms to display a filtered list of video shots.
clip/shot: a single media file of video/audio content recorded by a user from a single capture event.
content: electronically-formatted material suitable for displaying to a user, including video, audio, music, speech, still images, etc.
content analysis: extraction of additional information or higher-level meaning, typically creating or extracting meta-data.
DDC: Digital Disc reCorder.
EDL: a file storing time line data, effects, references to video/audio contents.
keyframe: an image selected from a set of media items to represent the set in a GUI.
imported media: content imported from other sources such as stills, clips, and audio.
media: content includes, but is not limited to video, audio, and still images.
meta-data: information about content; derived directly from a device or a user or from a process.
movie: EDL file referencing video and/or audio, or a rendered or raw file video and audio.
program/production: an EDL file or movie rendered from the EDL file.
‘Promo’: an industry term meaning a short compilation of clips demonstrative of further content in a movie serving as a preview.
provided content: content provided by a source other than the user. Typically being video/audio, graphics, titles, executable scripts, templates, access keys, etc.
raw material/content: original, captured, and unprocessed clips, stills or audio, content provided by the user, typically being recorded or previously recorded onto the DDC media.
still: file containing a single still picture.
Template: collection of rules, processes, content, expert knowledge; template file and related media files used for automatic-editing.
time-line file: file storing time-line data, not including video/audio contents.
transparency: a partially transparent image layer overlying a foreground image.
trim: to define or select a subset of a clip to replace that clip, typically for playback or editing.

DETAILED DESCRIPTION INCLUDING BEST MODE

The present description has been arranged in a number of sections and sub-sections, which are organised in accordance with the following Table of Contents.

TABLE OF CONTENTS
1. OVERVIEW
2. SYSTEM DESCRIPTION
3. GRAPHICAL USER INTERFACE SYSTEM
4. GUI SYSTEM SPECIFICS
4.1 THE BROWSER
4.2 THE PLAYLIST CONTROLLER
4.2.1 MANUAL EDITING PLAYLIST CONTROLLER
4.2.2 PLAYLIST CONTROLLER FUNCTIONALITY
4.2.3 AUTO-EDITING PLAYLIST CONTROLLER
4.3 FEATURE TRACKING & SPRITE ANIMATION
5. DATA FLOW AND STRUCTURE
5.1 CAPTURE AND STORAGE
5.2 META ANALYSIS AND DATA STRUCTURE
5.3 USE OF HIGHLIGHT
5.4 DIRECTORY STRUCTURE
5.5 EDL STRUCTURE
5.6 MEDIA TRACK STRUCTURE
5.7 TRANSITION/OBJECT RELATIONSHIP
5.8 TIME LINE FILE STRUCTURE
5.9 TRAJECTORY, TRACK AND CLIP STRUCTURES
6. PLAYER PROCESS
7. TEMPLATES
7.1 OVERVIEW
7.2 EXAMPLES
7.3 FEATURE TRACKING AND AUTO-EDIT TEMPLATES
7.4 PRINT SUMMARISATION OF A VIDEO PRODUCTION

Where reference is made in any one or more of the accompanying drawings to steps and/or features, which have the same reference numerals, those steps and/or features have for the purposes of this description the same function(s) or operation(s), unless the contrary intention appears.

1. Overview

The underlying context for the present description is a DDC user who wishes to make captured raw video into home movies. Since video consumes high data volumes, the user typically does not keep original video footage on a Personal Computer (PC) hard-drive. Instead, the user maintains the original footage on a Magneto-optic disk (MOD) 512 (see FIG. 5) upon which the footage was originally stored when it was captured on a DDC 204 (see FIG. 2). Thus, when new video footage is captured, or at some later time when review of the video footage is performed, it is necessary to analyse the MOD contents, keeping track of all the shots taken. With this information, the user can build a movie with titles, transitions, effects, and music. Transitions are used to join pieces of video (or audio), and can include simple cuts, dissolves, and fade to colour transitions. Transitions can also affect the viewer's perception of time, through selection of transition type and duration. Effects can include colour adjust (including Black and White (B&W), saturate, de-saturate, and so on), simple wipes, “old film look” (which involves incorporation of B&W, scratch, and dust effects), titles with animated text, as well as sound effects for the audio channel, and so on.

Significant benefits can accrue to the consumer as an outcome of good video editing. Such benefits can include improving on the basic quality of raw material, setting a context to make the content more holistic and comprehensible, providing extra information to aid in understanding and level of viewer interest, removing poor/unwanted content and so on. A typical motivation for those wishing to edit a consumer video is to produce a result that will be interesting or entertaining to themselves and their intended audience. This typically means that the consumer video editor attempts to meet the audience expectation of professional-quality television production, thereby providing a significant advantage by meeting the goals of comprehensibility and interest.

The DDC/PC arrangements shown herein (see FIGS. 2, 4, 5 for example) incorporate what will be referred to as a “GUI system” 300 (see FIG. 3) as well as various GUIs (see FIGS. 1, 7A, 7B, 9, 11-13 and 15-17) supported by the GUI system 300. The GUI system 300 provides both the GUIs by which the user interacts with both the DDC 204 and the data in the MOD 512 (see FIG. 5), and also supports the underlying functionality required to maintain and utilise the MOD library which the user accumulates over time.

When the user has captured and stored new media data on the MOD 512 (see FIG. 5), the GUI system 300 (see FIG. 3) can be used to analyse the content of the disk. This analysis involves generating and storing meta-data for every shot stored on the MOD 512. The meta-data stored on the MOD 512 is both that meta-data provided by the DDC 204 during recording time, and that meta-data obtained thereafter by content analysis.

Using the arrangements described, the user can browse and edit the captured video footage. This can be performed either manually, using a browser 220 (see FIG. 3) and a manual edit feature 228, or the process can be partially or completely automated, by using an auto-editor 240 (see FIG. 3). The auto-editor 240 produces substantially professional quality edited video from raw footage by applying pre-defined templates. The templates supply titles, music and effects in one or more styles depending on the template. Video material which has been edited, or has had effects added thereto, is sometimes referred to as “effected material”. The user need only make minor inputs during the auto-edit process, in order to yield a finished movie. At a minimum, the final result may be similar to that produced by a two-step “browse and manual edit” operation, but with much less effort required on the part of the user. Typically, however, the templates provide much more sophisticated editing and effects than can be achieved with a single pass through a manual editor 228 (see FIG. 3). Additionally, the template incorporates implicit, and sometimes explicit, video and audio production expertise, and thus the template is more than merely a sum of its editing and effects parts.

The GUI system 300 provides two basic capabilities, namely review, and storyboard. The review capability allows video to be reviewed at all stages of the video lifetime. Reviewed material may thus include user-selected sequences of raw shots, trimmed shots, shots placed in an auto-edit template including music and effects, and finally the entire movie or part thereof. The review capability provides GUI controls, which are adapted to whichever particular task is at hand. Most reviews are rendered and played in real time at normal play speed, however, where complex processing is required, such as may be required by some templates, non-realtime or background processing may precede review-playback of the material.

FIG. 1 shows a GUI 100, by means of which one or more video clips can be filtered, selected or reviewed. The GUI 100 takes the form of a basic browser screen and shows a number of keyframes 104, 106 and 108 in a display area 110 under a tab 102 which, in the present instance, is identified by a label “MEDIA”. If a touch-screen 524 is used (see FIG. 5), a pen 526 functions as a user interface mechanism, and one or more of the keyframes 104, 106 and 108 can be selected for a desired play sequence by tapping each keyframe once with the pen 526. After selecting the aforementioned keyframes, a play button 736 on a companion GUI 704 (see FIG. 7A) is pressed, or a double-click action is initiated over the selected keyframe(s). Consequently, the video clips associated with the aforementioned selected keyframes are played in the desired sequence, in a contiguous fashion as a single video presentation.

The GUI 100 (FIG. 1) is used in this example primarily to provide a browsing and selection mechanism in relation to keyframes which are intended to form all, or part of a subsequent, or current editing project which is held within a playlist controller 224 shown in FIG. 2. A “playlist” is an overall timeline representation of the entire production being considered.

The typical operational sequence used by the user is to select a sequence of keyframes that is subsequently imported into the playlist controller 224 of the GUI system 300 (see FIG. 3). This importation step of the keyframe sequence into the playlist controller 224 (see FIG. 2) can be manually initiated, by the user pressing a button (not shown). Alternately, the importation can be completely automatic, whereby the playlist controller 224 dynamically accepts any new selection sequence which is created by the user in the browser GUI 100.

The playlist controller 224 (see FIG. 2) can dynamically accept a selection sequence in a number of different ways. If the playlist is empty, then the new sequence forms the entirety of the new playlist. If, however, the playlist is not empty, then the new sequence is accepted as a replacement of the current playlist. Alternately, if the playlist is not empty then the new sequence can be accepted and appended to the current playlist. It can be seen that the last two options require a modal change to the importation operation, either under user control, or via a process that determines the user's intent. Such a process can be as simple as comparing the lengths of the current playlist sequence and the new selection sequence, and if the former is significantly longer than the latter, then the latter is assumed to be intended for appending to the former. Other importation methods are possible, particularly if the user is allowed greater latitude in specification of his intent, with actions such as drag and drop into, or out of, positions within the length of the playlist.
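
The length-comparison heuristic described above can be sketched as follows, as a minimal illustration only. Lengths are measured here as item counts, and `append_ratio` is a hypothetical threshold standing in for “significantly longer”; neither choice is specified by the description.

```python
from typing import List

def import_selection(playlist: List[str], selection: List[str],
                     append_ratio: float = 3.0) -> List[str]:
    """Return the new playlist after accepting a browser selection.

    append_ratio is an assumed threshold for "significantly longer".
    """
    if not playlist:
        return list(selection)              # empty playlist: selection becomes the playlist
    if len(playlist) >= append_ratio * len(selection):
        return playlist + list(selection)   # long playlist, short selection: append
    return list(selection)                  # otherwise: replace the current playlist
```

A process of this kind infers the user's intent without requiring an explicit mode switch, at the cost of occasionally guessing wrongly when the two sequences are of comparable length.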

Additional to the basic functions of review and storyboard is the tracking and animation facility that can be accessed via a sprites button-menu 2004 in FIG. 12. In that instance, the tracking and animation facility can be operated by the user with reference to a playlist controller viewer 1510 (see FIG. 15).

2. System Description

FIG. 2 depicts one example of a DDC system 200, in which GUI controllable processes 214 are distributed between the DDC 204 and the PC 208. The system 200 supports media data retrieval, meta-data processing and usage, and media data display. Recalling the earlier definition of media data, the term “display” is understood to encompass presentation to the user of the various types of media data. The system utilises meta-data creation, storage and processing to underpin advantageous capabilities offered to a DDC user. The system comprises the DDC 204, which communicates, by means of a connection 206, with the Personal Computer (PC) 208. The DDC 204 can communicate with external devices and data by means of a connection 202. The PC 208 can communicate with external devices and data by means of a connection 210.

The GUI system controllable processes 214 depict the prime mechanisms by which the user interacts with both the DDC 204 and the PC 208. The GUIs which incorporate substantiations of the GUI system controllable processes 214 are presented to the user by a browser process 220 which supports the browser GUI 100 (see FIG. 1) and other GUIs. The playlist controller 224 is one of the GUI system controllable processes 214 which provides a convenient and intuitive mechanism by which the user can access and control the various DDC or software application media selection and editing functions. The user can operate the GUI system controllable processes 214 in a number of modalities, including via the touch screen 524 (see FIG. 5), the pen 526 or a cordless remote control (not shown), these options being depicted by a bidirectional arrow 218 in FIG. 2. The specifics of the GUI system controllable processes 214 are dependent upon the distribution of functionality between the DDC 204 and the PC 208.

One element used by the browser process 220 is a set of browse filters 230 that utilise content analysis and meta-data analysis to filter and display more meaningful video segments for the user's selection. A display function 246 enables the user to view both the unedited, unprocessed video data as captured, and also to view the final edited images, as well as data during intermediate stages of processing.

The playlist controller 224 provides access, as indicated by a dashed line 236, to the automatic editing function 240 which utilises expert knowledge, content and effects in templates, whereby the consumer can obtain a better quality result with very simple control selections. The playlist controller 224 also provides access as depicted by a dashed line 226 to the manual editing function 228 that provides the user with a simple and intuitive set of editing functions and interface controls. The playlist controller 224 also provides a display of its status and media content via the display function 246 as depicted by a dashed arrow 242. The playlist controller 224 also provides access to a Feature Tracking function 232 as depicted by a dashed line 234, and to a Sprite Animation function 244 as depicted by a dashed line 238.

3. Graphical User Interface System

FIG. 3 provides a block diagram representation of the GUI system 300 incorporating the GUI system controllable processes 214 (see FIG. 2) as well as other functional elements. The GUI system 300 can be implemented in hardware, software or a combination thereof. An integrated data management system 322 keeps track of input media data, and of all generated movies. This system 322 allows the user to search and/or browse a DDC database or directory structure for content, such as media data including video, animation, audio, or effects, via the browser module 220. The system 322 also provides access to meta-data and media content for other DDC modules, by acting as a central information resource in the DDC architecture. The data management system 322 is preferably implemented as an object store, however, other data management configurations can be used. Searching is handled with custom-made software modules, or is performed by using an object tree. Alternately, searching can be performed by dynamically creating an index by filtering, or by partially traversing an object tree, or by using a master index.

A DDC interface 306 is responsible for importing and exporting digital video via either or both the DDC external connection 202 (see FIG. 2) and the connection 206 to the PC 208. The DDC interface 306 is also responsible for handling the interface between the DDC's functional elements and the “lens” system 520 of the DDC 204 (see FIG. 5). The interface 306 provides control over the DDC 204 (see FIG. 2), allowing digital video data to be transferred to the PC 208 via the connection 206, and allows digital video to be sent back to the DDC 204 from the PC 208, or from an external source.

When capturing media data, the interface module 306 parses the video and audio data for meta-data. The interface module 306 also captures scene meta-data from the image sensor 520 (see FIG. 5) when recording media data. Scene meta-data can include white balance, focus and focal length information, operating mode information and auxiliary information such as date and time, clip start time, and so on.

The Browse filters 230 provide an extra level of functionality to the data management system 322. Since the data management system 322 is only an object store, it is the browse filters 230 which provide the functionality for more efficient searches. The Browse filters 230 allow the user to sort the contents of the MOD 512 (see FIG. 5) according to certain criteria. Typically the browse filters 230 are used to search for a particular video clip. Alternately, the browse filters 230 can be used in combination with the editing facilities of the GUI system 300. By selecting a specific type of video clip for display (for example “bright” clips only), the number of key frames (eg. 104 in FIG. 1) displayed to the user is greatly reduced. This allows for easier navigation and manipulation of the contents of the MOD 512 (see FIG. 5). It is noted that while the description makes reference to “key frames” as being objects representative of a particular media content item, other indicia such as text labels can also be used to fulfil this purpose.

Meta-data generated during the content analysis process is available for use in filtering video. White balance, aperture speed, and other scene meta-data directly obtainable from the DDC 204 (see FIG. 2) can be used as a basis for filtering video segments into, for example, bright or dark categories for browsing. Timestamps and other non-visual or scene meta-data can also be obtained directly from the DDC 204, and can be used to filter video by criteria such as “most recent”, or by user-selectable date and time options.
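
A browse filter of the kind described can be sketched as a predicate over per-clip meta-data, with several predicates combined to narrow the set of keyframes shown. This is an illustration under stated assumptions: the record keys (`keyframe`, `mean_luma`, `recorded`), the brightness threshold and the function names are hypothetical, and `mean_luma` stands in for whatever brightness measure is derived from the scene meta-data.

```python
from datetime import datetime
from typing import Callable, Dict, List

Clip = Dict[str, object]   # hypothetical per-clip meta-data record

def bright(threshold: float = 0.6) -> Callable[[Clip], bool]:
    """Filter for clips whose assumed brightness measure meets the threshold."""
    return lambda clip: float(clip["mean_luma"]) >= threshold

def recorded_since(cutoff: datetime) -> Callable[[Clip], bool]:
    """Filter for clips time-stamped at or after the cutoff."""
    return lambda clip: clip["recorded"] >= cutoff

def browse(clips: List[Clip], *filters: Callable[[Clip], bool]) -> List[str]:
    """Return the keyframe identifiers of clips passing every filter."""
    return [str(c["keyframe"]) for c in clips if all(f(c) for f in filters)]
```

Calling `browse` with no filters returns every keyframe, which corresponds to the unfiltered browser view.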

A Render (Play) System 310 (see FIG. 3) captures a movie, and renders it. The render system 310 can accept and render a movie in a number of forms. Thus, for example, a raw movie clip can be rendered directly to the display 246 (see FIG. 2). A sequence of raw movie clips, such as might be displayed temporarily in the browser GUI 100 (see FIG. 1) by a user selecting a sequence of clips for playing, can be rendered from a simple EDL which references raw input. Typically, the result from the set of EDL references can be rendered to the display 246 (see FIG. 2). The renderer 310 (see FIG. 3) can also accept more complex EDLs, these often comprising complex editing and effects operations on raw video clips. In this case, the renderer 310 can be instructed to output a movie either to the display 246 (see FIG. 2), or to a file or stream, and can also, or instead, be instructed to output a compiled EDL comprising optimised or specialised instructions, this output EDL differing from the input EDL.

Further, the renderer 310 (see FIG. 3) can be instructed to render and store to file any components which are to be used in the desired final movie that do not already exist in the input raw movie clips, for instance, transitions and effects. This latter function provides a pre-caching capability useful for accelerating a subsequent rendering operation.

The renderer 310 provides services of the aforementioned kinds to other modules, and can operate synchronously, or asynchronously, as requested by another module and as system capabilities permit. The render system 310 can render a movie and store the rendered movie in a file, or alternately, the system 310 can render the movie in real time to the display 246 (see FIG. 2). If a “live to display” render is required, then segments of the movie that can not be rendered live are either rendered offline first, or alternately, the segments are substituted with near-equivalent effects that are more easily rendered.

The playlist controller 224 (see FIG. 3) is an important element of the GUI system 300 , and plays an important role in the video editing process. The playlist controller module 224 allows importation of video clip information from the browser 220 . Control of the playlist controller 224 is available to the user, who is also provided with a display, in the form of a graphical representation, of the playlist controller status (see 722 in FIG. 7A). The playlist controller 224 (see FIG. 3) channels much of the user's interaction with the GUI system 300 through to either the manual edit module 228 , or the auto-edit module 240 .

A Feature Tracking module 232 provides the user with the ability to generate a trajectory based on any selected feature or features within a selected video clip. These trajectories can be saved in the Object/Metadata Store 322 , and referenced to, or by, the clip from which they were generated, for later retrieval and application. A typical application of the feature tracking trajectory is provided in a Sprite Animation module 244 which accepts a trajectory, a video clip and a selection of a sprite animation sequence, and assembles an animation EDL, which is typically composited and rendered by the renderer module 310 .
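The assembly of an animation EDL from a trajectory can be sketched, purely as an illustration, in the following manner. The representation of the trajectory as a mapping from frame number to an (x, y) position, the dictionary layout of each entry, the `offset` parameter and the cycling of the sprite sequence are all assumptions of the sketch, and do not describe the actual data formats of the modules 232 and 244 .

```python
def build_sprite_edl(trajectory, sprite_frames, offset=(0, 0)):
    """Pair each tracked frame with one instantiation of the sprite.

    trajectory    -- assumed mapping {frame_number: (x, y)} of the feature
    sprite_frames -- assumed list of sprite image identifiers, cycled in order
    offset        -- assumed (dx, dy) displacement of the sprite from the feature
    """
    entries = []
    for i, (frame, (x, y)) in enumerate(sorted(trajectory.items())):
        entries.append({
            "frame": frame,                                # video frame to composite onto
            "sprite": sprite_frames[i % len(sprite_frames)],
            "x": x + offset[0],
            "y": y + offset[1],
        })
    return entries


# Hypothetical trajectory for a feature tracked over three frames.
trajectory = {10: (100, 50), 11: (104, 52), 12: (108, 55)}
animation_edl = build_sprite_edl(trajectory, ["bubble_a", "bubble_b"], offset=(0, -20))
```

Each entry in the resulting list tells a compositor which sprite image to place, and where, for one frame in which the feature was tracked.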

The auto-edit function 240 is an important feature in the GUI system 300 , and plays an important role in the video editing process. The playlist controller 224 interacts with the auto-edit module 240 , which allows templates, styles and content to be automatically applied to raw video. The result produced by the auto-editing function is an EDL, which is an object description of a movie, the EDL referencing the content and processes which make up the movie. The EDL information is channelled back to the user in the form of a playlist, via the playlist controller 224 in particular, and the GUI system 300 in general. The auto-edit module 240 hides a large amount of functional complexity from the user, greatly simplifying operation thereof.

As already described, the GUI system 300 (see FIG. 3) supports both (i) automatic post-production editing using the auto-edit function 240 , and (ii) intelligent filtering of digital video, by means of the browser module 220 and the browse filter function 230 . Both these features (the browser 220 and the auto-editor 240 ) of the GUI system 300 operate to a large degree based on meta-data, by identifying clips of video that are relevant to the user. However, although the functionality of these two features ( 220 , 240 ) appears to be similar in some respects, there are important distinctions.

The browse filters 230 allow the contents of the MOD 512 (see FIG. 5) to be filtered and displayed via the browser GUI 100 (see FIG. 1) according to pre-defined categories (such as “bright”) that are defined in the filter specifics. However, unlike the auto-edit function 240 (see FIG. 3), there is no intelligent creation, or manipulation of the raw video to thereby produce a new movie. The output of a browse filter operation is simply a subset of those video clips or other media already residing on the MOD 512 that match the user-specified criteria. Thus the browse filters 230 may be used within the browser 220 (see FIG. 3), independently of the auto-editor 240 , or as a first step in an editing session using the manual editor 228 or auto-editor 240 functions.

The manual editing function 228 provides the user with a flexible, and yet simple and intuitive set of editing functions and interface controls. The manual editor 228 is controlled by the user via the playlist controller 224 and the resultant EDL is displayed via the playlist controller 224 and the GUI system 300 as a playlist.

The outcome of an editing process, whether produced by manual editing, auto-editing or a combination thereof, is captured in a corresponding EDL, or in an equivalent playlist. The EDL can be used to direct production of the edited movies already described, using movie clips rendered according to the EDL's time-line. Alternatively, the EDL can be saved to file and need not include any copy of video and/or audio content, but instead, need only include references to input video clips.
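A minimal sketch of an EDL that holds references rather than media content is given below. The field names, the use of file paths as clip references and the JSON serialisation are assumptions made for illustration; the description does not specify an EDL file format at this point.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class EdlEntry:
    """One time-line entry: a reference to a clip plus trim points."""
    clip_ref: str                 # assumed path or identifier of the input clip
    in_frame: int                 # first frame used (inclusive)
    out_frame: int                # last frame used (inclusive)
    effect: Optional[str] = None  # assumed name of an effect or transition


def edl_duration(entries: List[EdlEntry]) -> int:
    """Total length of the time-line in frames."""
    return sum(e.out_frame - e.in_frame + 1 for e in entries)


def save_edl(entries: List[EdlEntry]) -> str:
    """Serialise the EDL; only references and parameters are written."""
    return json.dumps([asdict(e) for e in entries])


edl = [
    EdlEntry("clips/beach.mov", 0, 99),
    EdlEntry("clips/garden.mov", 25, 74, effect="sepia"),
]
saved = save_edl(edl)
```

Because only references and trim points are stored, the saved text remains small regardless of the size of the referenced clips, and it can be reloaded later for modification.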

A content analysis module 314 (see FIG. 3) generates content information (meta-data) from media data. The Content Analysis module 314 provides services to other components of the GUI system 300 , in particular to elements such as the auto-edit function 240 and the browse filters 230 .

Turning to FIG. 4, in one arrangement the GUI system 300 of FIG. 3 is practiced using a general-purpose computer system 208 , such as that shown in FIG. 4 wherein the GUI system 300 may be implemented as software, such as an application program executing within the computer system 208 . The computer system 208 refers, for example, to the PC 208 in FIG. 2. In particular, the GUI system 300 is effected by instructions in the software that are carried out by the computer. The software may be divided into two separate parts; one part for providing the GUI system 300 ; and another part to manage the user interface between the latter and the user. The software may be stored in a computer readable medium, including the storage devices described below, for example. The software is loaded into the computer from the computer readable medium, and then executed by the computer. A computer readable medium having such software or computer program recorded on it is a computer program product. The use of the computer program product in the computer effects an advantageous apparatus for providing the GUI system 300 .

The computer system 208 comprises a computer module 420 , input devices such as a keyboard 428 and mouse 430 , output devices including a printer 400 and a display device 402 . A Modulator-Demodulator (Modem) transceiver device 408 is used by the computer module 420 for communicating to and from a communications network 410 , for example connectable via a telephone line 406 or other functional medium. The modem 408 can be used to obtain access to the Internet, and other network systems, such as a Local Area Network (LAN) or a Wide Area Network (WAN).

The computer module 420 typically includes at least one processor unit 426 , a memory unit 434 , for example formed from semiconductor random access memory (RAM) and read only memory (ROM), input/output (I/O) interfaces including a video interface 412 , and an I/O interface 432 for the keyboard 428 and the mouse 430 and optionally a joystick (not illustrated), and an interface 414 for the modem 408 . A storage device 422 is provided and typically includes a hard disk drive 416 and a floppy disk drive 418 . A magnetic tape drive (not illustrated) may also be used. A CD-ROM drive 436 is typically provided as a non-volatile source of data. The components 412 - 426 and 432 - 436 of the computer module 420 , typically communicate via an interconnected bus 424 and in a manner which results in a conventional mode of operation of the computer system 208 known to those in the relevant art. Examples of computers on which the described arrangements can be practised include IBM-PC's and compatibles, Sun Sparcstations or alike computer systems evolved therefrom.

Typically, the application program is resident on the hard disk drive 416 and read and controlled in its execution by the processor 426 . Intermediate storage of the program and any data fetched from the network 410 may be accomplished using the semiconductor memory 434 , possibly in concert with the hard disk drive 416 . In some instances, the application program may be supplied to the user encoded on a CD-ROM or floppy disk and read via the corresponding drive 436 or 418 , or alternatively may be read by the user from the network 410 via the modem device 408 . Still further, the software can also be loaded into the computer system 208 from other computer readable medium including magnetic tape, a ROM or integrated circuit, a magneto-optical disk, a radio or infra-red transmission channel between the computer module 420 and another device, a computer readable card such as a PCMCIA card, and the Internet and Intranets including email transmissions and information recorded on websites and the like. The foregoing is merely exemplary of relevant computer readable mediums. Other computer readable media may alternately be used.

Turning to FIG. 5, in another arrangement the GUI system 300 of FIG. 3 is practiced using a DDC implementation 204 wherein the GUI system may be implemented as software, such as an application program executing within the DDC 204 . The DDC 204 refers, for example, to the DDC 204 in FIG. 2. In particular, the GUI system is effected by instructions in the software that are carried out by the DDC. The software may be divided into two separate parts; one part for providing the GUI system; and another part to manage the remaining DDC functions. The software may be stored in a computer readable medium, including the storage devices described below, for example. The software is loaded into the DDC from the computer readable medium, and then executed by the DDC. A computer readable medium having such software or computer program recorded on it is a computer program product. The use of the computer program product in the DDC effects an advantageous apparatus for providing the GUI system.

The DDC 204 comprises a processor module 516 , input devices such as the touch-screen 524 and the pen 526 , output devices including an LCD display device 502 . An I/O interface 510 containing a Modulator-Demodulator (Modem) transceiver device (not shown) is used by the processor module 516 for communicating to and from a communications network 506 , for example connectable via a telephone line 504 or other functional medium. The I/O interface 510 can be used to obtain access to the Internet, and other network systems, such as a Local Area Network (LAN) or a Wide Area Network (WAN).

The processor module 516 typically includes at least one processor unit 522 , a memory unit 530 , for example formed from semiconductor random access memory (RAM) and read only memory (ROM), input/output (I/O) interfaces including an LCD interface 508 , and an I/O interface 528 for the touch screen 524 and pen 526 , and an interface 510 for external communications. The optical sensor 520 is a primary input device for the DDC 204 , which also typically includes an audio input device (not shown). An encoder 532 provides image coding functionality, and a meta-data processor 534 provides specialised meta-data processing. A storage device 518 is provided and typically includes the MOD 512 and a Flash Card memory 514 . The components 508 - 514 , 518 - 522 , and 528 - 534 of the processor module 516 , typically communicate via one or more interconnected busses 536 .

Typically, the GUI system 300 program is resident on one or more of the Flash Card 514 and the MOD 512 , and is read and controlled in its execution by the processor 522 . Intermediate storage of the program and any data fetched from the network 506 may be accomplished using the semiconductor memory 530 . In some instances, the application program may be supplied to the user encoded on the MOD 512 , or alternatively may be read by the user from the network 506 via the I/O 510 . Still further, the software can also be loaded into the DDC 204 from other computer readable medium including a ROM or integrated circuit, a radio or infra-red transmission channel between the processor module 516 and another device, a computer readable card such as a PCMCIA card, and the Internet and Intranets including email transmissions and information recorded on websites and the like. The foregoing is merely exemplary of relevant computer readable mediums. Other computer readable media may alternately be used.

The GUI system 300 of FIG. 3 may alternatively be implemented in dedicated hardware such as one or more integrated circuits performing the functions or sub functions of the GUI system. Such dedicated hardware may include graphic processors, digital signal processors, or one or more microprocessors and associated memories.

4. GUI System Specifics

FIG. 6 shows data flows in the GUI system 300 of FIG. 3, with particular reference to the GUI system controllable processes 214 , and a detailed description is provided in regard to the user interface and underlying data kernel. A function-calling direction is depicted by an arrow 608 , for example. An Engine 610 is a software/hardware component that supports media processing functions such as compositing, playback and information retrieval.

An object, or clip manager function 602 manages clips, noting that a clip has been defined as a single media file that consists of, or emulates, video and/or audio content recorded by a user, commencing from a “record-on” command, and terminating at a “record-off” command. From a “media type” perspective, the clips include (i) Movies, which are files containing video and optionally, sync audio, (ii) Stills, which are files containing a single still picture, and (iii) Audio, which are files containing only audio data. From a “production” perspective, the clips include (iv) raw material, which are clips captured prior to any editing, (v) edited movies, which are movie clips rendered by the engine from an EDL, (vi) EDL files, which are files storing time-line data, and optionally (vii) processing operations. The EDL files (vi) need not, however, include a copy of video and/or audio contents. In this regard it is noted that the EDL file (vi) can be reloaded by the user for modification and playback. The rendered movies (v), however, contain composed video/audio data, but do not have any time-line editing information, which in contrast comprises the major part of EDL files in (vi). Further from a production perspective, the clips also include (viii) auto-editing templates, which can include combinations of clips of any kind, and related media and meta-data, as well as processing or executable files used for automatic-editing. A process for capturing and/or storing media data is described in relation to FIG. 21.

One or more storage areas 604 are created, typically on a hard disk 416 (see FIG. 4), to contain the clip data and associated meta-data, as well as any EDLs. A directory structure used in the MOD 512 (see FIG. 5) in an exemplary DDC system such as is shown in FIG. 2 is described in relation to FIG. 24.

Returning to FIG. 6, the Browse Filters 230 are used to filter the raw material clips presented in the Browser 220 . Filter attributes can include (i) dark, (ii) light, (iii) zoom, (iv) pan, (v) saved, (vi) people, and (vii) highlights (referred to also as “hilites”). From an implementation perspective, many of the browse filter attributes can be defined as bit-fields that can be operated on with Boolean logic operations, in order to define the required meta-data combinations.
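One possible bit-field encoding of the browse filter attributes is sketched below. The particular bit assignments, the `BrowseAttr` name and the `matches` helper are assumptions made for illustration; the description states only that the attributes can be defined as bit-fields and combined with Boolean logic operations.

```python
from enum import IntFlag


class BrowseAttr(IntFlag):
    """Assumed bit assignments for the filter attributes (i) to (vii)."""
    DARK = 1
    LIGHT = 2
    ZOOM = 4
    PAN = 8
    SAVED = 16
    PEOPLE = 32
    HILITE = 64


def matches(clip_flags, required, excluded=BrowseAttr(0)):
    """True if every required bit is set and no excluded bit is set."""
    has_required = (clip_flags & required) == required
    has_excluded = bool(clip_flags & excluded)
    return has_required and not has_excluded


# Hypothetical meta-data for three raw clips.
clips = {
    "beach": BrowseAttr.LIGHT | BrowseAttr.PAN | BrowseAttr.PEOPLE,
    "campfire": BrowseAttr.DARK | BrowseAttr.ZOOM,
    "garden": BrowseAttr.LIGHT | BrowseAttr.ZOOM,
}

# "Light AND zoom" combination.
light_zoom = [n for n, f in clips.items()
              if matches(f, BrowseAttr.LIGHT | BrowseAttr.ZOOM)]

# "Light AND NOT people" combination.
light_no_people = [n for n, f in clips.items()
                   if matches(f, BrowseAttr.LIGHT, excluded=BrowseAttr.PEOPLE)]
```

A single integer per clip is sufficient to hold all seven attributes in this sketch, and a combined filter reduces to one or two bitwise AND operations.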

The EDL 606 is also referred to as an “edit-list” or a “play-list” and, in this text these terms are generally used interchangeably. The latter term is typically used in connection with the playlist controller 224 and particularly when a play-list or EDL is instantiated in the playlist controller 224 . The term “play-list” is often used in this text to describe the temporal properties of an EDL that may be visually represented to a user using a timeline paradigm or the playlist controller summary bar 722 for instance (see FIG. 7A). This usage is a convenience to assist the reader or user to comprehend the presented concepts. The term “play-list” is not restricted to this meaning however. The EDL 606 can be edited by the manual editor function 228 , or by the auto editor function 240 . The EDL 606 can be fed to the engine 610 , and can direct the engine to either play, or alternatively render, a movie file. The EDL 606 can also be saved by either of the editors 228 , 240 by means of the object manager 602 , for later use. The EDL 606 is discussed in terms of a time-line and is described in more detail in relation to FIG. 25.

The playlist controller 224 controls the auto editor 240 and the manual editor 228 . The Feature Tracking module 232 allows user selection via a GUI of one or more features, which it will subsequently track, and its results are typically fed to the Sprite Animation module 244 , which does not need to provide GUI control in the current arrangement.

4.1 The Browser

FIGS. 7A and 7B show exemplary GUIs 704 and 100 for a playlist controller and the browser GUIs respectively. These GUIs enable the user to quickly view the contents of the MOD 512 (see FIG. 5), and display the contents in the screen area 110 in FIG. 7B.

Turning to FIG. 7B, each displayed key-frame 104 represents a video clip, a still image or an audio clip, or a combination of these, which are stored on the MOD 512 (see FIG. 5). Key-frames are selected by the user, who highlights a desired key-frame by tapping it once with the pen 526 (see FIG. 5), assuming that a touch-screen 524 /pen 526 implementation is being used.

A key-frame list, which is a software entity listing all key-frames on the MOD 512 , can be scrolled up and down using a scroll bar 776 which is positioned to the right of the key-frame display 110 in FIG. 7B. The scroll bar 776 can be dynamically (re)sized depending on the number of key-frames to be displayed in the display area 110 .

The media tabs 102 , 766 and 768 are used to view different media types stored on the MOD 512 . Media types which can be stored on the MOD 512 include “MEDIA” (at the tab 102 ), consisting of video footage, graphics, titles and other types of importable media, photos (also referred to as “Still”), which are still images captured using the DDC photo mode, and sound files independent of images, available for use in editing.

In addition, the MOD 512 can store “EFFECTS” (at the tab 766 ), which include transitions (fades, wipes, dissolves and so on) for use in editing, and also special effects, which are effects that can be applied to video clips during editing, such as blurs, colour filters such as sepia, black and white film effects. In addition, the MOD 512 can store “MY MOVIES” (at the tab 768 ), which are video EDLs or rendered videos generated using manual or auto-edit processes 228 , 240 , allowing the user to save and re-access their work at a later date.

The browser GUI display area 110 displays icons or key-frames representing the available media of a selected type on the MOD 512 . The currently selected “Media” tab 102 also determines the particular browse filters displayed, by removing or disabling inappropriate or irrelevant filter buttons and enabling or displaying relevant buttons. The type of media displayed under any media tab that supports combined media types can also affect the particular browse filters displayed. For instance, selecting the My Media tab 102 allows display of video and still keyframes. If video keyframes are present on the MOD 512 , then a browse filter (selected from a set of browse filters 774 ) that applies only to video content, such as ZOOM, will be displayed (ie. enabled). Similarly, if there are any video clips containing a pan then the “PAN” filter will be enabled, allowing user selection thereof, and indicating that this type of filter is relevant to the available media.

FIG. 7B shows the browser GUI 100 , which enables the user to view, sort, and filter the contents of the MOD 512 . The user is able to select from the video and other media types (which are selectable using the media tabs 102 , 766 , 768 ), in order to include the selected media data in either an auto edit, or manual edit session or other activity.

The browser GUI 100 shares many of the features of typical keyframe browsers, but adds the browser-specific browse filter list or media filter list 774 . The browse filter list 774 displays the available browse filters 230 (see FIG. 2). Browse filters are used to modify the key-frame list (which is displayed in the display area 110 ), in order to display only those key-frames that meet certain criteria defined by the selected browse filter. For example, use of a “ZOOM” browse filter will display only those key-frames that contain zoom camera effects.

Once a browse filter is selected, and thus applied, the list of key-frames or other media in the window 110 is updated. The selected media data that is displayed in the display area 110 remains available as various tool buttons are selected by the user. However, by deselecting all browse filters 774 , the effect of any previously chosen filter is reversed and the entire contents of the MOD 512 are displayed. If a different media type than the “ALL” type ( 770 ) is selected, then the Browse filters will change and be enabled or disabled to reflect the new media type. For example, selection of the Audio type 772 allows audio files, which will be displayed in the key frame window 110 , to be filtered by music genre or beat pattern, while Images may be filtered by Light or Dark.

The user can, as noted in the above “browse” example, after applying one or more browse filters, select a subset sequence of the key-frames which are displayed in the display area 110 , and transfer that sequence to the playlist controller GUI 704 (see FIG. 7A).

FIG. 8 is a flow diagram for an exemplary process 800 for controlling meta-data driven GUI controls. The process relates to a GUI which is presently displaying a number of different pieces of media (video, audio, still), and in which it must be determined which controls are to be enabled for the user. In this example the system is considering if the ‘Browse on Pan’ control should be enabled. The process 800 commences with a start step 802 , after which meta-data for a first piece of media is obtained in a step 804 . Thereafter, a testing step 806 tests the meta-data for presence of a ‘Pan’ indication. If such an indication is present, then the process 800 is directed in accordance with a ‘yes’ arrow to a step 812 which enables a ‘Browse on Pan’ control, after which the process terminates in a step 814 . If, on the other hand, a ‘Pan’ indication is not detected in the testing step 806 , then the process 800 is directed from the testing step 806 in accordance with a ‘no’ arrow to a second testing step 808 which tests whether any further media items are available. If such items are available, then the process 800 is directed in accordance with a ‘yes’ arrow to a step 816 which obtains meta-data for the next piece of media, after which the process 800 is directed back to the testing step 806 . If, on the other hand, the testing step 808 determines that no further media items are available, then the process 800 is directed from the testing step 808 to a step 810 which disables the ‘Browse on Pan’ control, after which the process 800 terminates at the step 814 .
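The process 800 can be sketched in code as follows, with the step numbers of FIG. 8 noted in comments. The representation of each media item's meta-data as a dictionary with a `"pan"` key is an assumption of the sketch, as is the treatment of an empty media list, which the described process does not address.

```python
def browse_on_pan_enabled(media_metadata):
    """Decide whether the 'Browse on Pan' control should be enabled.

    media_metadata -- assumed list of dictionaries, one per media item,
                      in which a true "pan" value marks a 'Pan' indication.
    """
    for meta in media_metadata:     # steps 804 and 816: obtain meta-data
        if meta.get("pan", False):  # step 806: test for a 'Pan' indication
            return True             # step 812: enable the control
    # Step 808 found no further media items (or, by assumption, none at all).
    return False                    # step 810: disable the control


displayed = [
    {"name": "still_01", "pan": False},
    {"name": "clip_07"},            # no 'Pan' meta-data recorded
    {"name": "clip_12", "pan": True},
]
enable_control = browse_on_pan_enabled(displayed)
```

As in FIG. 8, the search stops at the first item carrying a 'Pan' indication, so the remaining items need not be examined.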

4.2 The Playlist Controller

Returning to FIG. 7A, the GUI 704 includes a video/still image viewer window 706 and a synchronised audio player device (not shown). The viewer window 706 , together with the audio player, allows multimedia presentation of media items to the user. The GUI 704 has viewing controls including a play/pause button 736 , a fast forward button 734 , a rewind button 738 , a frame step forward button 732 , a frame step reverse button 740 , a clip-index forward button 730 , and a clip-index reverse button 742 . The GUI 704 also has a current clip summary bar 720 , over which a clip scrubber 758 , an in-point marker 756 and an out-point marker 710 slide. An active clip segment marker 762 is also displayed.

A playlist summary bar 722 is displayed, overlaid by proportional clip indicators including 724 and 728 , as well as a playlist scrubber 746 , which is connected to the clip scrubber 758 by a line 764 . The GUI 704 also has a playlist mode box 702 , and a playlist mode indicator 700 , as well as audio channel indicators and controls for a right channel 708 and a left channel 760 . Media components referable by the playlist controller 224 (see FIG. 2) are selectable and manipulable by common file commands such as “drag and drop”, “copy”, “paste”, “delete” and so on.

Returning to FIG. 7A, the clip summary bar 720 and its components operate in a similar way to a conventional trimmer window scrubber, where the length of the clip summary bar 720 represents 100% of the current clip displayed in the trimmer window 706 . The scrubber control 758 can be moved left or right at varying speeds by the user to access discrete frames within the clip at the proportional position within the clip duration according to the scrubber relative position along the length of the clip summary bar 720 . Further, the ability to set at least one pair of “in and out” points is provided, by dragging corresponding “in” or “out” markers from initial locations 752 or 718 respectively, to locations such as are indicated by 756 and 710 respectively. The constantly available display of the playlist summary bar 722 relieves the user of the need to pan the time line in order to evaluate changes to components therein.
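The proportional relationship between the scrubber position and the frame accessed can be sketched as follows. The use of pixel units for the bar, the clamping of out-of-range positions and the function name are assumptions of the sketch.

```python
def scrubber_to_frame(position, bar_length, frame_count):
    """Map a scrubber position along the clip summary bar to a frame index.

    position    -- assumed scrubber offset from the left end of the bar
    bar_length  -- length of the bar, representing 100% of the clip
    frame_count -- number of frames in the current clip
    """
    fraction = min(max(position / bar_length, 0.0), 1.0)  # clamp to the bar
    # The right-hand end of the bar maps to the last frame, not one past it.
    return min(int(fraction * frame_count), frame_count - 1)


first = scrubber_to_frame(0, 200, 100)
middle = scrubber_to_frame(100, 200, 100)
last = scrubber_to_frame(200, 200, 100)
```

In this sketch a bar 200 units long representing a 100-frame clip advances one frame for every two units of scrubber travel.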

The intervening proportional length and clip duration 762 between these markers 756 and 710 is understood to be the portion of the clip that will be subsequently displayed or used. Other portions 754 and 716 of the clip are understood to have been trimmed, and will not be displayed or used subsequently. It is possible to drag multiple in and out points into the clip summary bar 720 to represent and command the multiple segmentation of (non-nested) clip portions for subsequent use.
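A sketch of how multiple, non-nested in and out points might be checked and measured is given below. The representation of each segment as an inclusive (in, out) frame pair and the decision to reject overlapping segments with an error are assumptions of the sketch.

```python
def validate_segments(markers, clip_length):
    """Return the (in, out) pairs sorted by position, or raise ValueError.

    markers     -- assumed list of inclusive (in_frame, out_frame) pairs
    clip_length -- number of frames in the current clip
    """
    segments = sorted(markers)
    previous_out = -1
    for in_frame, out_frame in segments:
        if not (0 <= in_frame <= out_frame < clip_length):
            raise ValueError("segment lies outside the clip or is reversed")
        if in_frame <= previous_out:
            raise ValueError("segments overlap or are nested")
        previous_out = out_frame
    return segments


def trimmed_length(segments):
    """Number of frames that will be subsequently displayed or used."""
    return sum(out_frame - in_frame + 1 for in_frame, out_frame in segments)


segments = validate_segments([(60, 79), (10, 29)], clip_length=100)
active_frames = trimmed_length(segments)
```

Frames lying outside every segment correspond to the trimmed portions of the clip, which are not displayed or used subsequently.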

The viewer play controls, referred to collectively under the reference numeral 726 in FIG. 7A, may be activated by the user to initiate various kinds of playback within the newly trimmed clip, which is represented by the indicator 762 . For instance, if the play button 736 is selected, then playback resumes from the current position of the scrubber control 758 within the active clip region 762 , and continues until the end 710 of the active clip region 762 at which point playback stops. It is usual for the scrubber 758 to allow a user access to trimmed regions outside the current in/out points 756 and 710 , for the purposes of viewing video in such regions, to allow further modification of the existing in or out point, or for adding further in or out points. Therefore, non-linear interactions are typically provided between playback controls 726 . For example, having regard to the play button 736 , the scrubber 758 , and the active clip region 762 , it is noted that if the scrubber 758 is presently outside the active clip region 762 when the play button 736 is pressed, then the scrubber 758 immediately jumps to the first frame of the active clip region 762 , and commences playback until the last frame of the active clip region 762 . It is typical and useful for the scrubber 758 to track the current playback frame by moving proportionately along the active clip region 762 according to the relative time-location of the currently playing video frame.
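The interaction between the play button 736 , the scrubber 758 and the active clip region 762 described above can be sketched as follows. Frame indices are assumed to be integers and the in and out points are assumed to be inclusive; the function name is illustrative.

```python
def play_range(scrubber_frame, in_frame, out_frame):
    """Return the (start, stop) frames played when play is pressed.

    If the scrubber lies outside the active clip region, playback jumps
    to the first frame of that region; otherwise it resumes from the
    current scrubber position. Playback stops at the out point.
    """
    if scrubber_frame < in_frame or scrubber_frame > out_frame:
        start = in_frame        # scrubber was in a trimmed region
    else:
        start = scrubber_frame  # resume inside the active region
    return start, out_frame


inside = play_range(50, 20, 80)
before = play_range(10, 20, 80)
after = play_range(90, 20, 80)
```

Scrubbing into a trimmed region remains possible in this sketch; only the play action is constrained to the active clip region.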

As discussed, the above-described functionality of the current clip summary bar 720 , the associated viewer 706 , and the playback controls 726 is conventionally provided in a trimmer window in currently available editors. Such editors that provide a trimming window functionality, either (i) provide the trimming window functionality independently of and without any synchronised viewing at an associated edit timeline, or (ii) provide trimming window functionality with a synchronised position marker within the timeline such that the position marker within the edit timeline represents the proportionate position of the current clip scrubber 758 , but within the edit timeline production as a whole.

Editors of the first kind (i) disadvantage the user because there is not provided any dynamic or current information about the effect of trimming the current clip on the whole of the edit timeline production or playlist or EDL. Accordingly, the local clip information being modified by the user is not presented to the user in terms of its global effect. In fact, it is usual for such editors not to insert the current trimming information into the edit timeline until the user explicitly, and usually laboriously, performs the insertion by a sequence of GUI operations involving dragging and dropping and often replacing clips, and rendering and/or previewing the timeline result.

Editors of the second kind (ii) provide the user with a global marker positioned according to the relative length 762 of the currently trimmed clip to the overall timeline production length and offset proportionately to the position of the scrubber 758 within the proportionate length of the trimmed clip 762 as the trimmed clip 762 is positioned within the overall timeline length. Thus, if the scrubber 758 is moved, then the global marker will move proportionately within the distance describing the trimmed clip length shown within the timeline production display. If the trimmed clip length is altered by the user by moving either or both of the in or out points in the trim window then the proportionate length of the trimmed clip within the timeline production display will also change, and the position marker might be moved to proportionately match the position of the scrubber 758 within the trimmed clip length.
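The positioning of such a global marker can be sketched as a simple proportion. The representation of the timeline as a list of trimmed clip lengths in frames, and the function name, are assumptions of the sketch.

```python
def global_marker_fraction(trimmed_lengths, clip_index, frames_into_clip):
    """Fraction of the whole timeline at which the global marker sits.

    trimmed_lengths  -- assumed list of trimmed clip lengths, in frames
    clip_index       -- index of the clip currently in the trimming window
    frames_into_clip -- scrubber offset from the in point of that clip
    """
    total = sum(trimmed_lengths)
    clip_start = sum(trimmed_lengths[:clip_index])  # frames before this clip
    return (clip_start + frames_into_clip) / total


# Three clips of 100, 50 and 50 frames; scrubber halfway through the second.
marker = global_marker_fraction([100, 50, 50], 1, 25)

# Trimming the second clip to 30 frames shifts the marker proportionately.
marker_after_trim = global_marker_fraction([100, 30, 50], 1, 15)
```

Altering an in or out point changes both the clip's own length and the total, which is why the marker may need to be repositioned after every trim.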

Editors of type (ii) therefore do provide a current or dynamic update of the global effect of a local clip trimming action performed by the user. However, such editors have the following disadvantages: (a) the edit timeline is not positioned in any alignment or visual proximity to the trimming window, and so little information may be available to the user in one glance as to the global effect of a local change; (b) because of the use of fixed length resolution representing time duration along the timeline 722 , a timeline containing a number of clips can easily reach beyond the boundaries of the current window, requiring scrolling to see the total global effect of a local clip trimming action; (c) the facility to quickly select any random clip within a timeline 722 for trimming within the trimmer is not provided in the convenient fashion that will be described.

In summary the GUI 704 in FIG. 7A is supported by the playlist controller 224 (see FIG. 2), and the GUI 704 incorporates with these the multimedia viewer 706 for viewing video clips and listening to audio selections. The viewer 706 also enables display of images, or other media, and has the current clip summary bar 720 for viewing and/or trimming of a current video or audio clip using a clip editing process and for setting or resetting meta-data flags for the current audio/video clip. The GUI 704 also has the playlist summary bar 722 which is updated according to a production editing process in order to conform the production, in a synchronously dependent manner, to changes made to the current clip. The playlist summary bar 722 indicates the overall (global) production timeline, also known as the playlist, of which the current clip shown in the current clip summary bar 720 is typically a part. The GUI 704 also has player controls 726 for controlling playback, and various forward and reverse access modes to the current clip or the playlist.

The GUI 704 also has one or more meta-data display regions 714 aligned with the current clip summary bar 720 or the playlist summary bar 722. The display region 714 contains zero or more meta-data flags or statuses, such as regions 712 and 748, which identify in and out points for each of two differing user-nominated effecting processes applied over the playlist. The time references associated with the meta-data flags are indicated by the relative position of the flags, referred either to the duration of the current clip, as indicated by the current clip summary bar 720, or to the duration of the playlist, as indicated by the playlist summary bar 722. The summary bar to which a meta-data flag is referenced is typically shown by graphical proximity to that bar, or by another visible association. For instance, the highlight graphic (which is described in more detail below) points to the current clip summary bar 720 to which it is referenced. The GUI 704 also has audio volume controls 708 and 760 and, optionally, the mode selector 702, which may not be present in some arrangements of the GUI 704.
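One way to picture how a meta-data flag's in and out points might be placed relative to its reference summary bar is sketched below. The `MetaDataFlag` class, the `flag_span` function and the assumption that both summary bars share the same pixel width are hypothetical and are introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class MetaDataFlag:
    """Hypothetical record for a meta-data flag with in and out points."""
    name: str
    in_point: float    # seconds from the start of the reference duration
    out_point: float   # seconds from the start of the reference duration
    reference: str     # "clip" (summary bar 720) or "playlist" (summary bar 722)


def flag_span(flag, clip_duration, playlist_duration, bar_width):
    """Return (left, right) pixel positions of a flag within display region 714.

    Assumes the display region is aligned with, and as wide as, the
    summary bar that the flag is referenced to.
    """
    if flag.reference == "clip":
        duration = clip_duration
    elif flag.reference == "playlist":
        duration = playlist_duration
    else:
        raise ValueError("reference must be 'clip' or 'playlist'")
    if duration <= 0:
        raise ValueError("reference duration must be positive")
    # Positions are proportional to the duration of the reference bar.
    left = bar_width * flag.in_point / duration
    right = bar_width * flag.out_point / duration
    return left, right


highlight = MetaDataFlag("highlight", 2.0, 6.0, "clip")
print(flag_span(highlight, clip_duration=8.0, playlist_duration=80.0, bar_width=400))
```

The same in and out times map to a much narrower span when the flag is referenced to the playlist rather than to the current clip, which is why the visible association with one bar or the other matters.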

The described playlist controller 224 and associated GUIs 100 and 704 provide sufficient editing facilities, particularly with reference to the playlist summary bar 722, that a conventional edit timeline is not a requirement for even an advanced user. However, such an edit timeline remains an option if the user wishes to benefit from some of its capabilities, such as embedded keyframes display. Further, the playlist controller 224 and GUIs 100 and 704 provide additional convenience by providing the synchronised and aligned locating of the clip summary bar 720 and the playlist summary bar 722, with the meta-data flags summary 714 and the viewer 706.

4.2.1 Manual Editing Playlist Controller

The example of the GUIs as shown in FIGS. 7A and 7B is specifically directed to manual video or audio editing. The playlist summary bar 722 represents similar data to that shown by a typical edit timeline, the playlist summary bar 722 being based on the EDL output of the manual editor function 228 (see FIG. 2) operating within an application to which the browser and playlist controller GUIs (100, 704) are connected. The mode switch 702 is optional in this configuration, and is shown only to indicate that the playlist controller GUI 704 is in manual edit mode (as depicted by the reference numeral 700), as opposed to an auto-edit mode, which is discussed later.

The typical operation of the playlist controller 224 is as follows. As a starting point, the user selects a clip sequence and transfers it to the playlist controller GUI 704 by one of several equivalent means. This clip sequence is then depicted in the playlist summary bar 722 by alternating dark and light bands 728, 724, where the length of each band represents the proportionate duration of an individual clip compared to the overall sequence duration. The user may choose to view the clip sequence or playlist by operating the playlist scrubber 746 or the play controls 726. At this point the playlist summary bar 722 contains only the input clips and any default transitions or effects that have been preset by the user within the application. The contents of the playlist summary bar 722 are equivalent to those of an edit timeline that merely holds raw input clips prior to any editing operations being undertaken by the user. The user may proceed to modify, edit, or apply effects to the contents of the playlist by using the facilities of the playlist controller 224 as described below.
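The proportional banding of the playlist summary bar described above can be illustrated with a short sketch. The function name, the pixel units and the choice of starting with a dark band are assumptions made for the example only; the description states only that the bands alternate and that band length is proportionate to clip duration.

```python
def summary_bar_bands(clip_durations, bar_width):
    """Return a list of (start_x, width, shade) tuples, one per clip.

    clip_durations - durations of the clips in playlist order (any time unit)
    bar_width      - width of the playlist summary bar in pixels (hypothetical)
    """
    total = sum(clip_durations)
    if total <= 0:
        raise ValueError("the playlist must have a positive total duration")
    bands = []
    start_x = 0.0
    for index, duration in enumerate(clip_durations):
        # Band width is the clip's share of the overall sequence duration.
        width = bar_width * duration / total
        # Assumption: the first band is dark and shades alternate thereafter.
        shade = "dark" if index % 2 == 0 else "light"
        bands.append((start_x, width, shade))
        start_x += width
    return bands


# Three clips of 10 s, 30 s and 20 s shown on a 600 pixel wide bar.
for band in summary_bar_bands([10, 30, 20], 600):
    print(band)
```

Because the bar width is fixed and only the proportions change, a playlist of any length fits within the bar without scrolling, in contrast to the fixed length resolution timeline discussed earlier.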

The user may add effects or transitions to the playlist by selecting and dragging these from the browser GUI 100 into the meta-data display region 714 of the playlist controller GUI 704. Alternatively, the effects and transitions can be dragged directly on to the playlist summary bar 722. This meta-data display region 714 may be extended vertically to include a number of meta-data flags, such as effects or transitions, of interest to the user. The user can add, select or remove such effects by various convenient means, such as clicking within the appropriate region of the meta-data summary 714 and selecting an effect or a meta-data flag from a pop-up box that appears (not shown). The pop-up box (which can take the form of a side-bar) can operate as a switch which reveals alternate meta-data displays, or which may display an expanding and/or collapsing display of additional controls. Alternatively, the user can drag in a flag or effect from a sidebar such as 750 and modify start and end points for the flag or the effect by clicking and dragging. Thus, the user may obtain many of the standard features of a conventional edit timeline within the context of the playlist controller GUI 704 and the associated playlist controller 224.

The playlist controller 224 provides a strong benefit in its synchronised behaviour between the