METHOD AND SOFTWARE FOR GRAPHICAL REPRESENTATION OF QUALITATIVE SEARCH RESULTS
COPYRIGHT NOTICE
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.
BACKGROUND OF THE INVENTION
The invention disclosed herein relates generally to search engines. More particularly, the present invention relates to the use of a graphical user interface to present the result set returned by a search.
Traditional search engines, such as those used to search for content on the World Wide Web (WWW), execute searches by processing user-defined search terms connected by conditional operators. Records returned as part of a result set are listed and ranked based on the frequency of the search terms within the pages. Typically, the result set generated is a long list of web sites, with the desired pages lying several dozen entries within the set. In addition, search engines cannot determine the context intended by the user, causing irrelevant pages that contain the term in an unrelated context to be returned. Furthermore, because search engines base the result of their queries on the presence of search terms within a document, they are highly unsuited for searching image libraries, collections of motion picture clips, or other collections of non-textual information.
Graphical user interfaces for search engines are known in the art that partially solve some of these problems. For example, U.S. Patent No. 5,982,369, assigned to Sony Corp., is entitled "Method for Displaying on a Screen of a Computer System Images Representing Search Results". The Sony patent discloses a system whereby users enter one or more keywords to be searched on. The results are graphically displayed on a continuum with each end bearing a user supplied keyword. Results appearing on either extreme of the continuum are relevant only to the keyword associated with that end (i.e., "or") while results appearing towards the center of the continuum are relevant to both keywords (i.e., "and"). This approach can be extended to two columns, depicted by two perpendicular lines representing four keywords. The results are further laid out in a pyramid fashion with more relevant results appearing towards the top of the pyramid and represented by a larger icon than the less relevant results, which appear as gradually smaller icons further down the pyramid in relation to their relevance.
Another system directed towards the graphical representation of search results is U.S. Patent No. 5,636,350, assigned to Lucent Technologies, Inc., and entitled "Using Symbols Whose Appearance Varies to show Characteristics of a Result of a Query". This patent describes a system for graphically displaying the results of a query on a grid where each dimension maps an attribute of the data representing factual information about the results, e.g., journal name, journal year, title, author, etc. Records of the result set are plotted as symbols in the grid with the symbol's shape, size and color indicating characteristics about the record, such as the number of times the record contains a searched keyword.
Neither of these systems is particularly well suited for use with non-text based content in which keyword searching is less important and meaningful than with text based content. While it is possible to display such non-text content according to its associated factual information, this type of information typically does not help users judge the quality of the content, either objectively or as judged by others. Instead, more meaningful decisions for this type of content are made based on qualitative attributes such as data describing or rating the quality of the content or data derived empirically from actions taken by or reactions of other users of the content. There is thus a need for a graphical search tool that presents qualitative attributes of a query's result set in a graphical manner, allowing users to quickly determine the most desired items in the set.
BRIEF SUMMARY OF THE INVENTION
It is an object of the present invention to provide a system capable of graphically presenting search results.
It is another object of the present invention to provide a system that presents search results based on qualitative values of the content searched.
It is another object of the present invention to provide a convenient manner by which retrieved content may be selected and stored for later viewing.
The above and other objects are achieved by a system that provides users with the ability to retrieve content items from a database using a database management system, with the result set graphically plotted by a graphical user interface (GUI) generator based on the relationship of the qualitative values of each record. Each content item stored in a database has a set of qualitative data associated with it. Users select the qualitative values upon which to generate the graphical display, along with any optional filters, and are presented with a graphical display of the relative ranking of content items based on the selected criteria.
A GUI generator controls the display of the result set on an axis with dimensions that conveniently translate the percentile values associated with each content item returned in the result set. The user may also use the GUI generator to zoom in on the data points displayed within the graph by appropriately selecting the zoom in or zoom out control. Alternatively, a marquee tool may be used to select an area of the graph to zoom in on.
The GUI generator is further integrated with playback and collection tools, which allows a user to select content items returned by the search tool and add them to a collection bin so that they can be viewed at a later date. The collection bin is presented as a set of thumbnail images, with each thumbnail representing an individual content item. Content items contained in the bin can be played by pressing a supplied play button, which will cause the content of the bin to play in order, or by double clicking on a specific thumbnail, which will play content items from the selected point forward. Dragging the thumbnail from the bin onto the playback tool will play a single clip.
Some of the above and other objects of the present invention are also achieved by a method that involves first collecting a set of data items from a network accessible database. Each data item in the set contains a plurality of information that describes the qualitative attributes of a content item. Additional data stored as part of the data item is also used to dynamically generate qualitative attributes for the data item. Representations of the collected data items are arranged on a graphical display whereby the spatial relationships among the data item representations represent relationships among the data items based on two or more of their associated qualitative attributes.
The scope of the set of data items collected from the database can be limited. The search scope can be narrowed by retrieving only those data items that contain or are associated with one or more keywords. Furthermore, each data item is associated with one or more categories that can also be used to narrow the scope of the set of data items returned from the database.
The data item representations presented on the display are plotted on any number of coordinate systems. Exemplary coordinate systems include an x-y coordinate system, an x-y-z coordinate system, or a two- or three-dimensional polar coordinate system. Where an x-y coordinate system is used, the data item representations are plotted on the x-axis according to one of the data item's qualitative attributes and on the y-axis according to a second qualitative attribute. Similarly, when the plot is on an x-y-z axis, the data item representations are plotted on the z-axis according to a third qualitative attribute of the data item. When the result set is displayed, each data item is assigned a color, depending on whether the data item matches the keyword, category, or both, as supplied by the user.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention is illustrated in the figures of the accompanying drawings, which are meant to be exemplary and not limiting, in which like references are intended to refer to like or corresponding parts, and in which:
Fig. 1 is a block diagram of a system for collecting search results and displaying them in a graphical user interface in accordance with one embodiment of the current invention;
Fig. 2 is a screen display of a graphical search tool in accordance with one embodiment of the current invention;
Fig. 3 is another screen display of the graphical search tool showing a rollover window in accordance with one embodiment of the current invention;
Fig. 4 is a flow diagram showing a process for graphically plotting a result set on a grid according to qualitative values in accordance with one embodiment of the current invention;
Fig. 5 is a flow diagram showing a process of calculating popularity of a content item in accordance with one embodiment of the current invention;
Fig. 6 is a flow diagram showing a process of calculating an average user rating in accordance with one embodiment of the current invention;
Fig. 7 is a flow diagram showing a process for allowing users to zoom in on a region of a graphical display in accordance with one embodiment of the current invention;
Fig. 8 is an exemplary screen display of the graphical search tool shown following execution of a zoom-in using the process set forth in Fig. 7;
Fig. 9 is a flow diagram showing a process for updating an average user rating for a content item based upon a rating input by a user in accordance with one embodiment of the current invention;
Figs. 10-11 are exemplary screen displays of one embodiment of the graphical search tool of the current invention developed for use with video clips; and
Fig. 12 is a schema showing a database structure of one embodiment of a content database for use with the system of the current invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
FIG. 1 presents an overview of the components of one embodiment of the present invention. A database 102 is structured to hold information regarding content items, including content identifiers such as the name of the content item, several fields to store qualitative information regarding the item, several fields to hold keyword information, and several fields to store various categories to which the item belongs. The database may also include other miscellaneous data about the item, such as the number of times a content item has been viewed or the number of users who have chosen to rate an item. The table additionally holds an address, such as a URL, which points to the location where the content and associated thumbnail images are stored.
As one skilled in the art will recognize, the database 102 may contain one or more tables and may have any desired structure. One embodiment of a database structured as a relational database is shown in Fig. 12 and described in greater detail below. Alternatively, the database 102 may be an index of content available on multiple distributed computers, such as may be generated by Internet based search engines. A third embodiment involves utilizing an object-oriented database management system whereby each content item is modeled as an object and the database is structured so as to reveal relationships between the objects.
Software 104 facilitates interaction between users and the database 102. A database management system 104A is responsible for interacting with the database 102 and processing queries for records. A quality attribute generator module 104B uses data returned from the database management system 104A to calculate certain qualitative attributes from the miscellaneous data stored in the database 102. Data produced by the attribute generator 104B serves as the coordinates of each record of the result set within a graphical display to be presented to a user. Exemplary processes performed by the quality attribute generator module 104B are described further below. The software components 104 may reside on a personal computer 110 or the workstation upon which the database 102 also resides or other workstation which otherwise can access the database 102. Alternatively, the software may reside on a server that is accessible to various users over a network 112, such as a local area network, intranet, extranet, or the Internet.
The coordinates for each record are passed off to a graphical user interface (GUI) generator module 108, which plots the data onto a graph presented to the user in accordance with processes described in greater detail below. After the user has examined the result set and selected a content item to view, the software 104 uses the content locator, e.g., URL, returned with the content record from the database 102 to access the appropriate storage device 106 and deliver the content to the user. Storage devices 106 that warehouse the content items and associated thumbnail images may be located within the enterprise or accessible over a network 112.
The GUI generator module 108 may reside in a personal computer 110, which could also contain the software 104 and/or database 102. Alternatively, the GUI generator 108 may further reside in remotely located computerized devices. For example, the GUI generator 108 can reside on a local area network 112 from where it can be accessed by a client workstation 116. Since the GUI generator 108 controls the display of the user interface, terminals 118 located throughout the network can be used to access the system. When provided with access to the Internet, the software 104 can graphically deliver search results and coordinates to any number of "smart" devices, such as workstation PCs 116, set-top box devices 122, and wireless devices 124. The locally executing GUI generator module 108 uses the data received to generate and display the graphical interface as described further below. In one embodiment, the GUI generator 108 is an applet that may be downloaded from a server upon which the software 104 resides.
Turning to FIG. 2, an exemplary user interface according to the invention is presented. The primary component of the interface is the graph 206 on which a result set is plotted. Users have the ability to select qualitative attributes 208 to plot a result set on, as well as the axis that each attribute is tied to. The attributes selected by the user appear on the appropriate axis 202 as indicated by the user 208. Controls are also provided for optional filters 210, such as category and keyword, which can be applied by the software 104 to limit the scope of the result set. After a result set is returned, the data items in the result set are represented by symbols, such as dots as shown, and plotted on the graph 206 according to the values of the qualitative attributes that are being mapped. For example, a content item 204 with a value of one for quality two and zero for quality one will be plotted on the graph 206 at (1,0).
Referring to FIG. 3, a rollover window 302 is generated and displayed by the GUI generator when a user places the mouse pointer, or other selection device such as a pen or stylus, over a content item for several seconds or otherwise selects a content item. The rollover window presents data regarding the content item, such as the value of any qualitative attributes that are not currently mapped, the author, the publication date, and any other miscellaneous data regarding the content item that is stored in the database.
FIG. 4 presents a flow diagram outlining the process used to graphically present a result set according to the values of its qualitative attributes. Users select the qualitative attributes that the result set is going to be plotted against 402. In one embodiment, the qualitative values that may be mapped on are rating, extremity, and popularity. Rating is the average value given to the content item by all users who have chosen to rate it. Popularity is the item's popularity relative to all other items in the content database. Extremity is a rating assigned to the content item by the system's administrator. These qualitative categories are exemplary, and not intended to be limiting, and one skilled in the art will realize that any number of qualitative values may be associated with each content item and stored in the database or generated based on empirical data stored in the database. In addition to selecting the qualitative values, the user must indicate (through the use of a series of radio buttons or other interface component that allows a user to make multiple nonexclusive selections) the value that maps to the x-axis and the value that maps to the y-axis. In some embodiments, the qualitative values all fall within the same predetermined range and are mapped on a scale representing that range, e.g., negative five to positive five. This allows for a single basic graph to be used for all graphical interfaces and provides consistency to users.
Users select the categories that will be returned by the search 404. In one embodiment, the software issues a query to the content database and receives a list of the unique categories contained therein, which is used to populate category menus. Alternately, several categories may be hard-coded into the software and updated periodically by the system administrator. The software is designed to manage several hundred thousand content items and the database schema is organized to facilitate the search of these items by category. Each content item in the system's database is also associated with one or more keywords. Providing keyword parameters narrows the scope of the result set to include only those content items that are associated with the entered keyword, or, for text-based content, contain the keyword in their title or description. Users are free to enter any type and number of keywords desired 406. An optional time frame restriction control is provided to add a date restriction to the search.
After the user has set all the desired parameters for the search, the query is executed 408. The GUI generator module 108 sends the query to the database manager 104A, which queries the database 102 and receives a result set comprising content items that match the user selected category, the keywords, or both. The database manager 104A may then make a one-time determination of whether the result set contains records with a keyword match 410. In situations where the keyword supplied by the user fails to result in any matches, the software accesses an integrated thesaurus 412. Related words returned by the thesaurus module are used to replace the keywords entered by the user and the query is rerun 408. Once the result set is returned, the qualitative attribute generator 104B executes a subroutine to determine the qualitative values and associated data for each item in the result set 414, in accordance with a process described further below. Data is placed into a data structure, with each element containing data for an individual record in the result set 416. The data structure is sent to the GUI generator module which traverses it and draws points on the search grid according to the qualitative values the user has mapped to each axis 418. The user is now free to examine the result set that has been visually plotted according to the qualitative values of each content item.
FIG. 5 presents a process of setting the coordinates for each content item in the result set for display on the search grid. The subroutine receives a result set containing the set of content items that match the keywords, categories, or both keywords and categories supplied by the user 502. If the result set is not null 504, a record is retrieved and its data parsed 506. Control is then passed to a subroutine responsible for calculating an item's popularity 508. When that subroutine returns, the data for the current content item is returned to the main routine 510, including all qualitative values and description data returned with the item's record in the result set. The process returns to step 504 and the subroutine is repeated if there are more records in the result set to be processed. If the result set is null, i.e., all the records in the set have been processed and the data for each record's qualitative values passed to the main routine, the subroutine exits 512.
Since the popularity of a content item is an empirically based qualitative attribute which changes each time a content item is retrieved, the popularity is recalculated each time the item is loaded into memory. Referring to FIG. 6, a process of determining the popularity value for positioning within the grid is presented. First, the content item's view count is selected from the record 602. Next, the result set is traversed to determine the view count for the content item with the highest view count (the "high view count") 604. The view count for the current item is then divided by the high view count 606. The result is multiplied by 10 and 5 is subtracted to arrive at a value for the item that falls within the range of -5 and +5 608. The value is returned and the subroutine ends 610.
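The popularity calculation of FIG. 6 can be sketched as follows. This is a minimal illustration in Python rather than the implementation of any particular embodiment; the function name popularity_score and the handling of a result set in which no item has been viewed are assumptions not found in the text.

```python
def popularity_score(view_count, view_counts):
    """Scale an item's view count into the -5..+5 range described for FIG. 6."""
    high_view_count = max(view_counts)      # step 604: highest view count in the result set
    if high_view_count == 0:                # assumption: nothing viewed yet -> lowest score
        return -5.0
    ratio = view_count / high_view_count    # step 606: divide by the high view count
    return ratio * 10 - 5                   # step 608: multiply by 10, subtract 5


# Hypothetical view counts for a three-item result set.
counts = [0, 50, 200]
scores = [popularity_score(c, counts) for c in counts]
```

Under this scheme the most viewed item in the result set always lands at +5 and an item with no views lands at -5.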
The GUI generator also provides zoom functionality to view the data plot in greater detail. Referring to FIG. 7, the user indicates the desire to access zoom functionality of the software 702, e.g., by selecting a zoom tool icon from a GUI toolbar. The software determines if the user is attempting to "click" or "marquee" zoom 704. Click zoom is where a user clicks the mouse at a certain location and the search grid performs a zoom on the point, whereas a marquee zoom is where the search grid zooms in on an area designated by dragging the mouse over a particular area. For a click zoom, when the user clicks on a point in the graph, the coordinates of the mouse position within the graph are used as a center point for the new display 706. Since the graph represents a finite set of data points, the zoom function will not return results outside the original result set. This means that if a user clicks on a point near the edge of the graph, the search display will adjust the selected coordinates to compensate. For a click zoom, the GUI generator determines the horizontal and vertical center point for the zoom-in 708 by applying the following calculations:
The value of D is the pixel dimensions of the square graph, e.g., for a 200x200 graph, D=200. AdjX and AdjY are the adjusted coordinates for the new center point. After the new center point is calculated, the upper and lower bounds of the horizontal and vertical dimensions of the zoom are calculated 710 as follows:
After calculating the values for Xlow, Xhigh, Ylow, and Yhigh, the GUI generator identifies the content items that fall within the zoom range by performing a comparison of the values against the new boundary points. Using this data, the GUI generator regenerates the display 712 with new pixel positions calculated for each data point and the routine will exit 718. Alternately, the formulas above can be calculated by taking the existing percentile values for each data point and translating the user's initial click coordinates into a percentile value.
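One way to realize the edge compensation and boundary comparison described for a click zoom is sketched below in Python. The text does not state a zoom factor, so the zoom_factor parameter, the clamping rule, and the function name click_zoom are all assumptions made for illustration; only D, AdjX, AdjY, Xlow, Xhigh, Ylow, and Yhigh correspond to quantities named in the description.

```python
def click_zoom(points, click_x, click_y, d, zoom_factor=2):
    """Return the zoom window and the data points that fall inside it.

    points:      list of (x, y) pixel positions inside a d-by-d graph
    zoom_factor: assumed magnification; not specified in the description
    """
    half = d / (2 * zoom_factor)            # half the side length of the zoomed region
    # Clamp the clicked point so the window never extends past the graph edges.
    adj_x = min(max(click_x, half), d - half)
    adj_y = min(max(click_y, half), d - half)
    x_low, x_high = adj_x - half, adj_x + half
    y_low, y_high = adj_y - half, adj_y + half
    # Keep only the points inside the new boundaries.
    inside = [(x, y) for x, y in points
              if x_low <= x <= x_high and y_low <= y <= y_high]
    return (x_low, x_high, y_low, y_high), inside


# A click near the corner of a 200x200 graph is shifted inward to (50, 50).
bounds, visible = click_zoom([(10, 10), (150, 150)], click_x=5, click_y=5, d=200)
```

The points that remain would then be given new pixel positions by rescaling the window back to the full D-by-D graph.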
When executing a marquee zoom 704, the user drags the cursor across a section of the screen to zoom in on. Using the coordinates of the marquee, the software will calculate the horizontal and vertical center point for the zoom in 714 according to the calculations of Table 1. The upper and lower bounds for the horizontal and vertical dimensions of the zoom are calculated using the dimensions of the marquee area 716. The software then regenerates the display 712 and the zoom routine exits 718. FIG. 8 presents a zoom performed on the result set displayed in FIGS. 2 and 3 with the mouse click and new center point being in the lower left quadrant of the x-y axis coordinate system.
When viewing a content item, users are presented with an opportunity to provide a rating. FIG. 9 presents the process of determining the new average user rating for a content item. The number of users who have rated a content item and the item's current average rating are retrieved from the database 902 and multiplied together 904. The new rating selected by the user is added to the product 906. The sum is then divided by the number of users who have rated the item plus one 908. The resultant value is loaded into the database to reflect the new average user rating and total number of reviewers 910. For example, assume 500 people have rated a content item with an average rating of 2.2 and the current user is providing a rating of zero. The new average user rating would be determined by the following calculation:
((500 x 2.2) + 0) / 501 = 2.196 (approximately)
The new average user rating of approximately 2.196 is then loaded into the database along with the value 501, to reflect the new total number of reviewers.
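The running-average update of FIG. 9 can be sketched in Python as follows. The function and variable names are illustrative assumptions; the arithmetic follows steps 904 through 908.

```python
def update_average_rating(rating_count, average_rating, new_rating):
    """Fold one new rating into a stored average without storing every rating."""
    total = rating_count * average_rating + new_rating   # steps 904 and 906
    new_count = rating_count + 1
    return total / new_count, new_count                  # step 908


# The worked example from the text: 500 ratings averaging 2.2, new rating of 0.
new_average, new_count = update_average_rating(500, 2.2, 0)
```

Storing only the count and the current average is enough to keep the average exact up to floating-point rounding, which is why the database needs no per-user rating history for this step.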
Referring to FIG. 10, an overview of the major elements of one embodiment of the search tool is presented. Contained within the left panel of the display are controls 1002-1010 through which the user may enter search criteria. The controls allow the user to decide which qualitative values to map the result set on, including the flexibility to choose which axis each quality will map to 1002. Through the use of a pop-up menu or other similar input structure, users can choose one or more categories on which to focus their search 1006. The scope of the search may be further narrowed through the use of keywords 1004 and time frame 1008. A control at the bottom of the panel allows the user to reset the search parameters 1012, which when activated will instruct the GUI generator to clear the search parameters entered into the GUI controls by the user.
A control is provided to execute the currently defined search 1014. When the control is actuated, the keyword and category data provided by the user are passed to the database management system by the GUI generator. The data is used as query parameters in searching the database. After a search is executed, the number of content items contained in the result set returned from the database and the parameters of the search provided by the user are displayed by the GUI generator in the upper right portion of the pane 1010. The center panel of the display contains the graph where the search result set will be plotted 1016 by the GUI generator. In one embodiment, the graph is comprised of an equal number of units in each quadrant with the origin in the center of the panel.
Alternatively, the origin may appear anywhere within the panel, dedicating more of the panel to a particular quadrant of the graph. The currently selected qualitative values on which the result set is mapped are displayed along two sides of the graph 1018.
Content items 1026 returned as the result set of a search are plotted within the graph 1016 according to their associated qualitative values that are being mapped to each axis on the graph provided by the GUI generator. For example, a content item with a rating value of 0, an extremity value of 0, and a popularity value of -5 is drawn at the origin when rating is tied to the x-axis and extremity is tied to the y-axis. Similarly, where rating is tied to the x-axis and popularity is tied to the y-axis, the same content item is drawn at (0, -5). When the mouse is placed upon a content item, a rollover window will appear. The GUI generator will render a window that will display information such as the title, author or producer, the category that the content item belongs to, and the qualitative value or values that are not currently mapped on the grid 1016.
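The translation from qualitative values to a drawing position can be sketched as follows in Python. The sketch assumes a square graph of D pixels with the origin at its center, the -5 to +5 scale mentioned earlier, and a screen coordinate system whose y values grow downward; the function name to_pixel and the dictionary layout are hypothetical.

```python
def to_pixel(x_value, y_value, d=200, low=-5.0, high=5.0):
    """Map a pair of qualitative values onto pixel coordinates in a d-by-d graph."""
    span = high - low
    px = (x_value - low) / span * d
    py = d - (y_value - low) / span * d     # flip: screen y is assumed to grow downward
    return px, py


# The example item from the text: rating 0, extremity 0, popularity -5.
item = {"rating": 0, "extremity": 0, "popularity": -5}
origin = to_pixel(item["rating"], item["extremity"])          # rating on x, extremity on y
bottom_center = to_pixel(item["rating"], item["popularity"])  # rating on x, popularity on y
```

With rating and extremity mapped, the item lands at the center of the graph; with popularity on the y-axis it lands at the bottom edge, directly below the origin.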
The right side of the display is comprised of several tools: the legend 1020, the "more like this" tool 1024, and the bin 1022. The legend 1020 contains the key to help determine the accuracy of the elements of the result set. Each level of accuracy is associated with a distinct color. Several exemplary levels of accuracy are presented in the instant embodiment and include: result elements that match both the specified category and keyword, results that match the specified keyword, results that match the specified category, and related content items. The GUI generator will appropriately color the content item representations to reflect each item's accuracy level as indicated by the legend 1020.
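The legend logic can be sketched in Python as a mapping from match type to color. The four accuracy levels come from the description above; the specific color names and identifiers are placeholders chosen for illustration.

```python
# Hypothetical colors; the description only requires that each level be distinct.
LEGEND_COLORS = {
    "category_and_keyword": "red",
    "keyword": "orange",
    "category": "blue",
    "related": "gray",
}


def accuracy_level(matches_keyword, matches_category):
    """Classify a result by which of the user's filters it matched."""
    if matches_keyword and matches_category:
        return "category_and_keyword"
    if matches_keyword:
        return "keyword"
    if matches_category:
        return "category"
    return "related"                        # e.g., items returned by "more like this"


def item_color(matches_keyword, matches_category):
    """Look up the legend color for a result's accuracy level."""
    return LEGEND_COLORS[accuracy_level(matches_keyword, matches_category)]
```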
Below the legend is the "more like this" tool 1024, which is used to quickly retrieve similar or related content items. Dragging a content item 1026 from the grid 1016 and dropping it on the tool will display a thumbnail image of the content item, retrieved from the database or other storage device storing the content, and use the item's categories and keywords as parameters for the search. Alternatively, the mapped qualitative attributes may be used as the parameters for the search. The GUI generator will pass the parameters to the database management system to be used in formulating the new query. The database management system returns content items that are similar to the selected item. The GUI generator will display the returned data items with the color reserved for related content items.
The last tool in the panel is the bin 1022, which is a storage area for content items that may later be saved in permanent bins. The GUI generator allows users to drag a content item 1026 from the grid 1016 and drop it on a cell within the bin 1022. The GUI generator adds the selected content item to a file or otherwise marks it as added to the bin, and causes the thumbnail image to appear within the indicated cell. Alternately, the thumbnail and associated content item may be sequentially loaded into the next empty cell. The bin has controls that allow the user to save a bin, send a bin, and consecutively play all the content items currently in the bin.
Turning to FIG. 11, an exemplary content viewing tool is presented. Information regarding the content item that is currently being viewed 1102 occupies the majority of the left panel. Information presented includes: duration, publication or release date, the categories that the content item is a member of, a brief synopsis, and its qualitative values (popularity, extremity, and rating). Below this information is a slider that allows a user to submit a user rating for the content item 1104. By setting the slider to the desired rating and activating the submit control, the GUI generator will transmit the value to the attribute generator, which will recompute the new average user rating and return it to the GUI generator for presentation to the user. Below the rating slider 1104 is a control that allows a user to select a personal bin where the current content item can be saved 1106.
The center panel is divided into two sections: one containing a viewer 1108 used to play back the current content item and another used to display one of several personal bins 1116 that may be created by the user. The GUI generator provides functionality that allows the user to select a content item currently held in the bin 1116 and drag and drop it on the viewer 1108. This will cause the GUI generator to begin playback of the content item.
Along the lower edge of the panel are controls 1110 that allow the user to save the bin, send the bin, and search for "more like this". The send bin function will send an electronic mail message to the recipient indicated by the user. The message may contain a series of thumbnail images associated with the content items contained in the bin, along with an invitation to access the system to view the actual content items in their entirety. The "more like this" function will return the user to the search tool depicted in FIG. 10, where the GUI generator will pass the parameters of the selected content item to the database management system. The database management system will execute the query and return the result set to the GUI generator for visual presentation by the GUI generator on the grid 1016.
The rightmost panel contains a quick search tool 1114. By entering keywords and selecting a category, the user will return to the search tool. The GUI generator will pass the parameters off to the database management system, which will execute the query and return the result set to the GUI generator. The GUI generator will use the data contained in the result set returned from the database management system to update the contents of the search results pane 1010 and the grid 1016. Below this is the temporary bin 1112 containing content items retrieved from the grid 1016 and placed in the bin 1022.
Fig. 12 shows a database schema of a relational database structure used to support one embodiment of the present invention. As the schema reveals, the database is comprised of a number of related tables 1200, with each table in turn comprised of fields configured to store data regarding a particular piece or feature of the system. As can be seen, the schema is primarily composed of two major tables, or entities, that hold data regarding users 1202 and content items 1204, in this case video files.
Each record held in the video table 1204 contains data regarding a particular content item's identification number, location (e.g., URL or file system address), title, and producer. The records further contain qualitative attribute data such as average rating and extremity, as well as additional data such as view count and ratings count that, as described above, are used to generate additional empirical, qualitative attributes. Each record in the video table is also associated with one or more records contained in category tables 1206 containing category data that is structured or organized in a hierarchy of parent and child categories.
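The video and category records described above can be pictured with the following Python sketch. It is an in-memory illustration only, not the relational schema of Fig. 12; the class names, field names, and sample values are assumptions chosen to mirror the fields listed in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Category:
    name: str
    parent: Optional["Category"] = None     # parent/child category hierarchy

    def path(self):
        """Return the category names from the top-level ancestor down to this one."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return " > ".join(reversed(names))


@dataclass
class Video:
    video_id: int
    location: str                           # URL or file system address
    title: str
    producer: str
    average_rating: float                   # stored qualitative attribute
    extremity: float                        # stored qualitative attribute
    view_count: int = 0                     # empirical data used to derive popularity
    ratings_count: int = 0                  # empirical data used to update the average rating
    categories: List[Category] = field(default_factory=list)


# Hypothetical sample data.
sports = Category("Sports")
surfing = Category("Surfing", parent=sports)
clip = Video(1, "http://example.com/clips/1", "Big Wave", "Example Producer",
             average_rating=2.2, extremity=4.0, view_count=120,
             ratings_count=500, categories=[surfing])
```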
The other major element of the schema is the user table 1202. These records contain data regarding all system users, including the system username, password, email address, and the speed of the user's Internet connection. The table also contains data concerning the user's identity (first and last name) and city, state, zip code, and country. Each record in the user table is associated with one or more collections or bins 1208. Each bin, in turn, is associated with one or more videos (content items) 1210. This collection of relationships allows each user, through the interface to the system provided by the GUI generator, to create and save a plurality of bins, each containing a plurality of videos.
While the invention has been described and illustrated in connection with preferred embodiments, many variations and modifications as will be evident to those skilled in this art may be made without departing from the spirit and scope of the invention, and the invention is thus not to be limited to the precise details of methodology or construction set forth above as such variations and modifications are intended to be included within the scope of the invention.