US20060116983A1 - System and method for ordering query results - Google Patents
- Publication number
- US20060116983A1 (US 10/999,498)
- Authority
- US
- United States
- Prior art keywords
- data
- query
- data records
- list
- sorting
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
- G06F16/24578—Query processing with adaptation to user needs using ranking
Definitions
- the present invention generally relates to managing query results and, more particularly, to ordering data records contained in a query result obtained in response to execution of a query against a database.
- Databases are computerized information storage and retrieval systems.
- a relational database management system is a computer database management system (DBMS) that uses relational techniques for storing and retrieving data.
- the most prevalent type of database is the relational database, a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways.
- a distributed database is one that can be dispersed or replicated among different points in a network.
- An object-oriented programming database is one that is congruent with the data defined in object classes and subclasses.
- a DBMS can be structured to support a variety of different types of operations for a requesting entity (e.g., an application, the operating system or an end user). Such operations can be configured to retrieve, add, modify and delete information being stored and managed by the DBMS.
- Standard database access methods support these operations using high-level query languages, such as the Structured Query Language (SQL).
- the term “query” denominates a set of commands that cause execution of operations for processing data from a stored database.
- SQL supports four types of query operations, i.e., SELECT, INSERT, UPDATE and DELETE.
- a SELECT operation retrieves data from a database
- an INSERT operation adds new data to a database
- an UPDATE operation modifies data in a database
- a DELETE operation removes data from a database.
- Processing queries and query results can consume significant system resources, particularly processor resources. Furthermore, one difficulty when dealing with large query results, i.e., query results including a large amount of data, is to identify relevant information therefrom.
- query languages generally provide some functionality for ordering query results so that retrieval of relevant information can be simplified.
- an ORDER BY clause can be used to order rows of a given query result presented in a tabular form according to an ascending or descending order of data contained in a user-selected column of the query result.
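As an illustration of the ORDER BY clause described above, the following minimal sketch uses Python's built-in sqlite3 module. The table name `tests` and the columns `patient_id` and `hemoglobin` are hypothetical and chosen only for this example.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tests (patient_id INTEGER, hemoglobin REAL)")
conn.executemany(
    "INSERT INTO tests (patient_id, hemoglobin) VALUES (?, ?)",
    [(1, 13.5), (2, 9.8), (3, 16.1)],
)


def select_ordered(connection, descending=False):
    """Return all rows ordered by the user-selected column (here: hemoglobin)."""
    # The direction comes from a fixed pair of keywords, never from user input.
    direction = "DESC" if descending else "ASC"
    query = (
        "SELECT patient_id, hemoglobin FROM tests "
        f"ORDER BY hemoglobin {direction}"
    )
    return connection.execute(query).fetchall()


ascending_rows = select_ordered(conn)
descending_rows = select_ordered(conn, descending=True)
```

The rows come back in ascending or descending order of the selected column, but the user still has to scan the list to find the records of interest.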
- a given query result can be represented graphically to outline the information conveyed by the query result.
- such techniques still require a significant amount of user interaction to identify the relevant information, especially from large query results. Thus, these techniques are an ineffective means to support users in easily and rapidly identifying relevant information from query results.
- the present invention is generally directed to a method, system and article of manufacture for managing query results and, more particularly, for ordering data records contained in a query result obtained in response to execution of a query against a database.
- One embodiment provides a computer-implemented method of ordering query results comprising, in response to a query issued by a requesting entity: (a) receiving a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, accessing a data source to retrieve information related to the received list of data records; (c) sorting the received list of data records on the basis of the retrieved information; and (d) outputting the sorted list of data records.
- Another embodiment provides a computer-implemented method of ordering query results comprising, in response to a query issued by a requesting entity: (a) receiving a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, determining a value variance for each data record in the list, the value variance of a given data record indicating a relative proximity between a predefined value and a corresponding value of the given data record; (c) sorting the received list of data records on the basis of the determined value variances; and (d) outputting the sorted list of data records.
- Still another embodiment provides a computer-implemented method of ordering query results comprising, in response to a query issued by a requesting entity: (a) receiving a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, identifying a subset of the data records in the list to satisfy a requested value range coverage, the requested value range coverage being defined by a predefined maximum number of data records of the list to be output, each having a corresponding value within a predefined value range; (c) sorting the received list of data records on the basis of the requested value range coverage; and (d) outputting the sorted list of data records.
- Still another embodiment provides a computer-readable medium containing a program which, when executed by a processor, performs operations for ordering query results.
- the operations comprise, in response to a query issued by a requesting entity: (a) receiving a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, accessing a data source to retrieve information related to the received list of data records; (c) sorting the received list of data records on the basis of the retrieved information; and (d) outputting the sorted list of data records.
- Still another embodiment provides a computer-readable medium containing a program which, when executed by a processor, performs operations for ordering query results.
- the operations comprise, in response to a query issued by a requesting entity: (a) receiving a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, determining a value variance for each data record in the list, the value variance of a given data record indicating a relative proximity between a predefined value and a corresponding value of the given data record; (c) sorting the received list of data records on the basis of the determined value variances; and (d) outputting the sorted list of data records.
- Still another embodiment provides a computer-readable medium containing a program which, when executed by a processor, performs operations for ordering query results.
- the operations comprise, in response to a query issued by a requesting entity: (a) receiving a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, identifying a subset of the data records in the list to satisfy a requested value range coverage, the requested value range coverage being defined by a predefined maximum number of data records of the list to be output, each having a corresponding value within a predefined value range; (c) sorting the received list of data records on the basis of the requested value range coverage; and (d) outputting the sorted list of data records.
- Still another embodiment provides a computer system comprising a requesting entity, a data source residing in memory, and a sorting program for ordering query results obtained in response to a query issued by the requesting entity.
- the sorting program is configured to: (a) receive a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, access the data source to retrieve information related to the received list of data records; (c) sort the received list of data records on the basis of the retrieved information; and (d) output the sorted list of data records.
- Still another embodiment provides a computer system comprising a requesting entity and a sorting program for ordering query results obtained in response to a query issued by the requesting entity.
- the sorting program is configured to: (a) receive a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, determine a value variance for each data record in the list, the value variance of a given data record indicating a relative proximity between a predefined value and a corresponding value of the given data record; (c) sort the received list of data records on the basis of the determined value variances; and (d) output the sorted list of data records.
- Yet another embodiment provides a computer system comprising a requesting entity and a sorting program for ordering query results obtained in response to a query issued by the requesting entity.
- the sorting program is configured to: (a) receive a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, identify a subset of the data records in the list to satisfy a requested value range coverage, the requested value range coverage being defined by a predefined maximum number of data records of the list to be output, each having a corresponding value within a predefined value range; (c) sort the received list of data records on the basis of the requested value range coverage; and (d) output the sorted list of data records.
- FIG. 1 is a data processing system illustratively utilized in accordance with the invention.
- FIGS. 2A-B are relational views of software components in one embodiment.
- FIG. 3 is a flow chart illustrating sorting of data records contained in a query result in one embodiment.
- FIG. 4 is a flow chart illustrating sorting of data records contained in a query result in another embodiment.
- FIGS. 5A-B are flow charts illustrating sorting of data records contained in a query result in still another embodiment.
- FIGS. 6A-C are flow charts illustrating sorting of data records contained in a query result in still another embodiment.
- FIGS. 7-8 are relational views of software components for query building support in one embodiment.
- FIGS. 9-10 are flow charts illustrating the operation of a runtime component.
- the present invention is generally directed to a method, system and article of manufacture for managing query results and, more particularly, for sorting data records contained in a query result.
- a user issues a query against a database.
- a list of data records defining a query result is obtained.
- the data records in the received list of data records are ordered according to an initial order.
- the data records are then sorted to provide a re-ordered query result which intelligently conveys information contained in the query result to the user.
- the sorting is performed on the basis of information which is related to the received list of data records. For instance, annotations associated with the data records in the list are retrieved from a suitable data source. For each data record in the list, a total number of associated annotations is counted. The total numbers can then be used as a basis for sorting the data records in the received list. By way of example, data records having the greatest total number of counted annotations can be placed on the top of a corresponding sorted list.
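The annotation-based sorting described above might be sketched as follows. This is only an illustration: the function name, the record identifiers and the `(record_id, annotation_text)` pair format are assumptions and do not come from the description.

```python
from collections import Counter


def sort_by_annotation_count(records, annotations):
    """Sort records so that those with the most annotations come first.

    records: list of record identifiers in their initial order.
    annotations: iterable of (record_id, annotation_text) pairs, standing in
        for whatever the data source returns (an assumption of this sketch).
    """
    known = set(records)
    # Count one annotation per pair, ignoring annotations of unrelated records.
    counts = Counter(rid for rid, _ in annotations if rid in known)
    # sorted() is stable, so records with equal counts keep their initial order.
    return sorted(records, key=lambda rid: counts[rid], reverse=True)
```

For instance, if a record "C" has three annotations, "A" two, "D" one and "B" none, the initial order "ABCD" becomes "CADB".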
- the sorting is performed on the basis of a value variance which is determined for each data record in the list.
- the value variance of a given data record indicates a relative proximity between a predefined value and a corresponding value of the given data record.
- a given query result may include data records having values which are included within a specific value range.
- the value range may include a center value which can be specified as the predefined value.
- the value variance of each of the values from the data records in the list can be determined with respect to the predefined value, i.e., the center value.
- a relative proximity to the predefined value can be identified for each corresponding value of the data records.
- data records having values with a closest relative proximity to the predefined value can be placed on the top of a corresponding sorted list.
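A minimal sketch of sorting by value variance follows. It assumes that the value variance can be modelled as the absolute difference between a record's value and the predefined value; the description only calls it a relative proximity, so this measure and all names below are illustrative.

```python
def center_of_range(lower, upper):
    """Center value of a value range, usable as the predefined value."""
    return (lower + upper) / 2


def value_variance(value, predefined_value):
    """Assumed proximity measure: absolute distance to the predefined value."""
    return abs(value - predefined_value)


def sort_by_value_variance(records, predefined_value, value_of):
    """Sort records so that values closest to predefined_value come first.

    value_of: function that extracts the relevant value from a record.
    Ties keep their initial order because sorted() is stable.
    """
    return sorted(
        records,
        key=lambda record: value_variance(value_of(record), predefined_value),
    )
```

With records given as `(identifier, value)` tuples, a call such as `sort_by_value_variance(records, center_of_range(10.0, 16.0), lambda r: r[1])` places the records nearest the center of the range on top.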
- the sorting is performed on the basis of a requested value range coverage.
- the requested value range coverage is defined by a predefined maximum number of data records of the list to be output according to a requested value distribution, each data record having a corresponding value within a predefined value range.
- a given query result may include a multiplicity of data records having corresponding values. All such corresponding values are spread over a given value range. From the multiplicity of data records, only a portion should be output according to a predefined maximum number in order to define a requested value distribution.
- the requested value distribution can be defined by any possible type of distribution, such as a uniform distribution (also referred to as “flat distribution”) and a normal distribution (also referred to as “bell curve”).
- a flat distribution consists of values that are evenly distributed between upper and lower bounds.
- a bell curve consists of values which are selected such that the frequency of selection is weighted towards a center, or average, value within upper and lower bounds. Accordingly, the predefined maximum number of data records is selected from the multiplicity of data records such that the corresponding values of the selected data records define the requested value distribution. Thus, the one or more selected data records can be placed on the top of a corresponding sorted list.
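One possible way to select records for a requested value distribution is sketched below. The approach (generate target values, then pick the nearest unused record for each target) is an assumption made for illustration, as is the choice of a standard deviation of one sixth of the range for the bell curve; none of the function names come from the description.

```python
from statistics import NormalDist


def flat_targets(lower, upper, count):
    """Evenly spaced target values between the bounds (flat distribution)."""
    if count <= 0:
        return []
    if count == 1:
        return [(lower + upper) / 2]
    step = (upper - lower) / (count - 1)
    return [lower + i * step for i in range(count)]


def bell_targets(lower, upper, count):
    """Target values weighted towards the center of the range (bell curve).

    Assumes upper > lower and a standard deviation of one sixth of the range.
    """
    if count <= 0:
        return []
    dist = NormalDist(mu=(lower + upper) / 2, sigma=(upper - lower) / 6)
    return [dist.inv_cdf((i + 0.5) / count) for i in range(count)]


def sort_by_range_coverage(records, value_of, lower, upper, max_records,
                           targets=flat_targets):
    """Place up to max_records in-range records on top of the list.

    The selected records follow the requested distribution; all remaining
    records follow in their initial order.
    """
    candidates = [i for i, r in enumerate(records)
                  if lower <= value_of(r) <= upper]
    chosen = []
    for target in targets(lower, upper, min(max_records, len(candidates))):
        # Pick the unused record whose value is nearest to this target.
        best = min(candidates, key=lambda i: abs(value_of(records[i]) - target))
        candidates.remove(best)
        chosen.append(best)
    chosen_set = set(chosen)
    rest = [i for i in range(len(records)) if i not in chosen_set]
    return [records[i] for i in chosen + rest]
```

Passing `targets=bell_targets` instead of the default switches the selection from a flat distribution to one concentrated around the center value.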
- the sorting is performed on the basis of suitability scores which are determined with respect to available analysis routines.
- a suitability score is determined for each data record in the list.
- the suitability score for a given data record indicates a relative suitability of the given data record as input to one or more analysis routines.
- data records which are most suitable as input to the one or more analysis routines can be placed on the top of a corresponding sorted list.
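The passage above does not state how a suitability score is computed. The sketch below assumes, purely for illustration, that each analysis routine declares the fields it needs and that a record's score is the fraction of those fields it actually supplies; the routine names and field names are hypothetical.

```python
def suitability_score(record, routines):
    """Fraction of the fields required by the routines that the record supplies.

    record: dict mapping field names to values (None means missing).
    routines: dict mapping a routine name to the list of fields it needs.
    This scoring rule is an assumption of the sketch.
    """
    required = [field for fields in routines.values() for field in fields]
    if not required:
        return 0.0
    present = sum(1 for field in required if record.get(field) is not None)
    return present / len(required)


def sort_by_suitability(records, routines):
    """Sort records so that the most suitable analysis inputs come first."""
    return sorted(
        records,
        key=lambda record: suitability_score(record, routines),
        reverse=True,
    )
```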
- embodiments described herein may refer to re-ordering of specific requested data.
- embodiments may be described with reference to re-ordering of query results obtained in response to execution of queries against databases.
- references to re-ordering of query results are merely for purposes of illustration and not limiting of the invention. More broadly, re-ordering of any suitable data received in a list form in response to a request for the data (whether or not the request be a query, per se) is contemplated.
- One embodiment of the invention is implemented as a program product for use with a computer system such as, for example, computer system 110 shown in FIG. 1 and described below.
- the program(s) of the program product defines functions of the embodiments (including the methods described herein) and can be contained on a variety of signal-bearing media.
- Illustrative signal-bearing media include, but are not limited to: (i) information permanently stored on non-writable storage media (e.g., read-only memory devices within a computer such as CD-ROM disks readable by a CD-ROM drive); (ii) alterable information stored on writable storage media (e.g., floppy disks within a diskette drive or hard-disk drive); or (iii) information conveyed to a computer by a communications medium, such as through a computer or telephone network, including wireless communications. The latter embodiment specifically includes information downloaded from the Internet and other networks.
- Such signal-bearing media when carrying computer-readable instructions that direct the functions of the present invention, represent embodiments of the present invention.
- routines executed to implement the embodiments of the invention may be part of an operating system or a specific application, component, program, module, object, or sequence of instructions.
- the software of the present invention typically is comprised of a multitude of instructions that will be translated by the native computer into a machine-readable format and hence executable instructions.
- programs are comprised of variables and data structures that either reside locally to the program or are found in memory or on storage devices.
- various programs described hereinafter may be identified based upon the application for which they are implemented in a specific embodiment of the invention. However, it should be appreciated that any particular nomenclature that follows is used merely for convenience, and thus the invention should not be limited to use solely in any specific application identified and/or implied by such nomenclature.
- the distributed environment 100 includes computer system 110 and a plurality of networked devices 146.
- the computer system 110 may represent any type of computer, computer system or other programmable electronic device, including a client computer, a server computer, a portable computer, an embedded controller, a PC-based server, a minicomputer, a midrange computer, a mainframe computer, and other computers adapted to support the methods, apparatus, and article of manufacture of the invention.
- the computer system 110 is an eServer computer available from International Business Machines of Armonk, N.Y.
- the computer system 110 comprises a networked system.
- the computer system 110 may also comprise a standalone device.
- FIG. 1 is merely one configuration for a computer system.
- Embodiments of the invention can apply to any comparable configuration, regardless of whether the computer system 110 is a complicated multi-user apparatus, a single-user workstation, or a network appliance that does not have non-volatile storage of its own.
- the embodiments of the present invention may also be practiced in distributed computing environments in which tasks are performed by remote processing devices that are linked through a communications network.
- program modules may be located in both local and remote memory storage devices.
- the computer system 110 and/or one or more of the networked devices 146 may be thin clients which perform little or no processing.
- the computer system 110 could include a number of operators and peripheral systems as shown, for example, by a mass storage interface 137 operably connected to a direct access storage device 138, by a video interface 140 operably connected to a display 142, and by a network interface 144 operably connected to the plurality of networked devices 146.
- the display 142 may be any video output device for outputting viewable information.
- Computer system 110 is shown comprising at least one processor 112, which obtains instructions and data via a bus 114 from a main memory 116.
- the processor 112 could be any processor adapted to support the methods of the invention.
- the main memory 116 is any memory sufficiently large to hold the necessary programs and data structures.
- Main memory 116 could be one or a combination of memory devices, including Random Access Memory, nonvolatile or backup memory, (e.g., programmable or Flash memories, read-only memories, etc.).
- memory 116 may be considered to include memory physically located elsewhere in the computer system 110, for example, any storage capacity used as virtual memory or stored on a mass storage device (e.g., direct access storage device 138) or on another computer coupled to the computer system 110 via bus 114.
- the memory 116 is shown configured with an operating system 118.
- the operating system 118 is the software used for managing the operation of the computer system 110. Examples of the operating system 118 include IBM OS/400®, UNIX, Microsoft Windows®, and the like.
- the memory 116 further includes one or more applications 120 and an abstract model interface 130.
- the applications 120 and the abstract model interface 130 are software products comprising a plurality of instructions that are resident at various times in various memory and storage devices in the computer system 110.
- When read and executed by one or more processors 112 in the computer system 110, the applications 120 and the abstract model interface 130 cause the computer system 110 to perform the steps necessary to execute steps or elements embodying the various aspects of the invention.
- the applications 120 include an application query specification 122, one or more requesting applications 124, each having a sorting program 126, and analysis routines 180.
- the requesting application(s) 124 (and more generally, any requesting entity, including the operating system 118) is configured to issue queries against data 136 in a database 139.
- the database 139 is shown as part of a database management system (DBMS) 154 in storage 138.
- the database 139 is representative of any collection of data regardless of the particular physical representation of the data.
- a physical representation of data defines an organizational schema of the data.
- the database 139 may be organized according to a relational schema (accessible by SQL queries) or according to an XML schema (accessible by XML queries).
- the invention is not limited to a particular schema and contemplates extension to schemas presently unknown.
- the term “schema” generically refers to a particular arrangement of data.
- the queries issued by the requesting application(s) 124 are defined according to the application query specification 122 included with each requesting application 124.
- the queries issued by the requesting application(s) 124 may be predefined (i.e., hard coded as part of the requesting application(s) 124) or may be generated in response to input (e.g., user input).
- the queries (referred to herein as “abstract queries”) can be composed using logical fields defined by the abstract model interface 130.
- a logical field defines an abstract view of data whether as an individual data item or a data structure in the form of, for example, a database table.
- the logical fields used in the abstract queries are defined by a data abstraction model component 132 of the abstract model interface 130.
- a runtime component 134 transforms the abstract queries into concrete queries having a form consistent with the physical representation of the data contained in the database 139.
- the concrete queries can be executed by the runtime component 134 against the database 139. Operation of the runtime component 134 is further described below with reference to FIGS. 7-10.
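The transformation of an abstract query into a concrete query might look roughly like the sketch below. It assumes a relational schema, a single table and one comparison condition; the mapping `DATA_ABSTRACTION_MODEL`, the logical field names and the function name are hypothetical and stand in for the data abstraction model component and the runtime component only by analogy.

```python
# Hypothetical data abstraction model: logical field name -> (table, column).
DATA_ABSTRACTION_MODEL = {
    "Patient ID": ("tests", "patient_id"),
    "Hemoglobin": ("tests", "hemoglobin"),
}

ALLOWED_OPERATORS = {"=", "<", ">", "<=", ">="}


def to_concrete_query(selected_fields, condition_field, operator,
                      model=DATA_ABSTRACTION_MODEL):
    """Build a parameterised SQL string from logical field names.

    Only a single-table query with one condition is handled; joins across
    tables are outside the scope of this sketch.
    """
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    columns = ", ".join(
        f"{model[name][0]}.{model[name][1]}" for name in selected_fields
    )
    tables = sorted({model[name][0] for name in [*selected_fields, condition_field]})
    cond_table, cond_column = model[condition_field]
    # The comparison value is left as a "?" placeholder for the database driver.
    return (
        f"SELECT {columns} FROM {', '.join(tables)} "
        f"WHERE {cond_table}.{cond_column} {operator} ?"
    )
```

The requesting application only names logical fields; the physical table and column names appear solely in the mapping, so a change of schema would require a change of the mapping rather than of the abstract query.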
- a result set is obtained from the data 136 in response to execution of a given query against the database 139.
- the result set defines a query result which is ordered according to an initial order.
- the query result can be re-ordered to simplify retrieval of relevant information therefrom.
- the query result can be re-ordered to facilitate retrieval of relevant information required for subsequent processing of the query result using one or more of the analysis routines 180. Operation and interaction of the requesting application(s) 124 and the analysis routines 180 are further described below with reference to FIGS. 2A-6C.
- sorting program(s) 126 are illustrated as an integral part of the requesting application(s) 124 for purposes of illustration. However, it should be noted that the sorting program(s) 126 can be implemented as separate application(s) which is independent of the requesting application(s) 124. Accordingly, any suitable implementation of the requesting application(s) 124 and the sorting program(s) 126 is broadly contemplated.
- the computing environment includes the requesting application(s) 124 having the sorting program(s) 126, the database 139 having the data 136, the display device 142 and the analysis routines 180 of FIG. 1, as well as a user interface 210.
- the requesting application(s) 124 issues a data request 220 (e.g., a query) against the data 136 in the database 139.
- the data request 220 is created by a user using the user interface 210.
- the data request 220 is executed against the database 139 to obtain a corresponding result set of data from the data 136 (e.g., a query result) using the DBMS 154.
- requested data 230 defining the corresponding result set is identified from the data 136.
- the requested data 230 is ordered according to an initial order and returned to the requesting application(s) 124.
- the requested data 230 is presented as an ordered list of data records.
- the requested data 230 is re-ordered, i.e., the data records in the ordered list are sorted in order to reduce the complexity of retrieving relevant information therefrom. Sorting the data records of a received list of data records according to predefined criteria is described in more detail below with reference to FIGS. 2B-6C.
- the sorted requested data 240 is output to the display device 142 for display. Accordingly, the sorted requested data 240 can be presented on the display device 142 as a sorted list of data records.
- the sorted requested data 240 can subsequently be processed using one or more of the analysis routines 180.
- the sorting program(s) 126 sorts the data records in the ordered list on the basis of related information 258 which is retrieved from a corresponding data source 252.
- the sorting program(s) 126 issues a sort request 222 against the data source 252.
- the sort request 222 is executed against the data source 252 using a DBMS 256 which manages the data source 252.
- the requesting application(s) 124 and the sorting program(s) 126 are implemented as a single, integrated software product resident at the server-side or the client-side. Furthermore, the sorting can be done by a client-side application (e.g., the requesting application(s) 124 having the sorting program(s) 126) and the requested data 230 is received from a server-side database (e.g., the database 139).
- the sorting program(s) 126 can be implemented by a server-side application in which case the sorting can be done on a server machine having the database 139.
- the sorting program(s) 126 and the database 139 can be resident on a common computer system.
- sorting program(s) 126 of FIG. 2A can invoke various functions which are configured for pre- and post-processing of the data request 220 and the requested data 230 of FIG. 2A.
- the requested data 230 is retrieved from the data 136 in the database 139 using the DBMS 154.
- the requested data 230 is presented in a tabular form having a plurality of rows and columns.
- the plurality of rows includes rows “A”, “B”, “C” and “D”, and the plurality of columns includes columns “E”, “F”, “G” and “H”.
- the plurality of rows is shown having an initial order “ABCD” and the plurality of columns is shown having an order “EFGH”.
- Each of the rows “A”, “B”, “C” and “D” represents a data record, so that the requested data 230 defines an ordered list of data records having the initial order “ABCD”.
- the sorting program(s) 126 receives the ordered list of data records (i.e., the requested data 230) as input and sorts the data records “A”, “B”, “C” and “D” of the received list. Illustratively, the sorting program(s) 126 sorts the data records “A”, “B”, “C” and “D” of the requested data 230 such that the sorted data records have the order “CADB” in the sorted requested data 240. After sorting the data records, the sorting program(s) 126 outputs the sorted list of the data records (i.e., the sorted requested data 240) to the display device 142. Exemplary embodiments of operations for sorting data records of a received list of data records are described below with reference to FIGS. 3-6C.
- the re-ordering of the requested data 230 is performed on the basis of information (e.g., related information 258 of FIG. 2A) which is related to the received list of data records “A”, “B”, “C” and “D” defining the requested data 230.
- the sorting program(s) 126 invokes an information determination unit 250 having a sub-query generator 257.
- the information determination unit 250 is represented as a separate unit only by way of example and not for limiting the invention accordingly. In other words, the information determination unit 250 can also be implemented as an integral part of the sorting program(s) 126 or some other suitable system program.
- the information determination unit 250 accesses a data source 252 to determine the related information therefrom.
- the sub-query generator 257 generates the sort request 222 of FIG. 2A which is issued against the data source 252 for retrieving the related information using the DBMS 256 .
- the data source 252 includes annotations 254 associated with the data records “A”, “B”, “C” and “D”.
- annotations are merely one example of information related to data records. Any suitable related information including annotations can be used as a basis for re-ordering the requested data 230 . More generally, any reference to the data records can be used as a basis for the re-ordering. Accordingly, all such suitable types of related information are broadly contemplated.
- annotations themselves can be classified based on an organization type in which the creators of the annotations are working. For example, data records which have been annotated by individuals working in the same technological field of study can be preferred to general annotations.
- annotations can be ranked on the basis of hierarchical positions of the creators of the annotations. For instance, for a researcher who performs a study on a liver disease, annotations made by a chief site specialist are certainly preferred to those of assistants.
- the information determination unit 250 counts a total number of associated annotations for each of the data records of the requested data 230 .
- the counted total numbers are then used as a basis for sorting the data records.
- For the data record “C” a total number of 76 associated annotations is counted.
- a total number of 62 annotations is counted for the data record “A”
- a total number of 43 annotations is counted for the data record “D”
- a total number of 15 annotations is counted for the data record “B”.
- the data record having the greatest total number of annotations is placed on the top of the sorted list of data records defining the sorted requested data 240 .
- the data records “A”, “B”, “C” and “D” are programmatically sorted in the sorted requested data 240 according to the order “CADB”, as illustrated.
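The count-based re-ordering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sort_by_annotation_count` and the dictionary layout are hypothetical, and only the counts (76, 62, 43, 15) come from the example.

```python
# Annotation counts per data record, taken from the example above.
annotation_counts = {"A": 62, "B": 15, "C": 76, "D": 43}


def sort_by_annotation_count(records, counts):
    """Return the records ordered by descending annotation count.

    Records missing from `counts` are treated as having zero annotations
    (an assumption; the text does not cover that case).
    """
    return sorted(records, key=lambda record: counts.get(record, 0), reverse=True)


initial_order = ["A", "B", "C", "D"]
sorted_order = sort_by_annotation_count(initial_order, annotation_counts)
print("".join(sorted_order))  # CADB
```

The record with the greatest count ends up first, which reproduces the order "CADB" of the sorted requested data 240.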
- An exemplary method for re-ordering the requested data 230 on the basis of information which is related to the received list of data records defining the requested data 230 is described below with reference to FIG. 3 .
- a method 300 for re-ordering requested data (e.g., requested data 230 of FIG. 2B ) on the basis of information (e.g., annotations 254 of FIG. 2B ) which is related to the requested data is shown.
- the requested data is obtained in response to execution of a corresponding data request against data in a database (e.g., data 136 of database 139 of FIG. 2A ).
- a suitable requesting entity e.g., requesting application(s) 124 of FIG. 2A
- suitable functionalities of an associated sorting program(s) e.g., sorting program(s) 126 of FIG. 2B .
- the method 300 is explained with respect to a data request being implemented as a query against the data in the database for purposes of illustration.
- the requested data is a result set of data which defines a query result.
- Method 300 starts at step 310 .
- the query is issued by the suitable requesting entity.
- the issued query is executed against the data in the database.
- An exemplary query is shown in Table I below.
- the exemplary query of Table I is described in natural language without reference to a particular query language. Thus, it is understood that any suitable query language, known or unknown, can be used to create the query of Table I.
- the exemplary query shown in Table I includes data selection criteria in lines 001-002.
- the data selection criteria include a result field specification (line 002) which specifies three result fields for which information is to be returned in the query result. Specifically, in line 002 the result fields “ID”, “Name” and “Age” are specified.
- the exemplary query further includes sorting criteria in lines 003-004. The sorting criteria indicate that all data records in the query result should be sorted according to counted numbers of annotations associated with the data records (line 004).
- the exemplary database table “Demographic” includes an “ID”, “Name” and “Age” column.
- the “ID” column contains a unique identifier for each of the data records included with lines 002-004.
- the “Name” column includes names of individuals and the “Age” column contains information about the age of the corresponding individuals.
- the exemplary database table “Annotations” includes a “Node_ID”, “Patient_ID”, “Date” and “Comment” column.
- the “Node_ID” column contains a unique identifier for each of the data records included with lines 002-004.
- the “Patient_ID” column includes patient identifiers according to the “ID” column of the “Demographic” table of Table II above.
- the “Date” column contains indications of dates on which, for example, a corresponding diagnosis has been established for a given patient.
- the “Comment” column contains annotations with respect to the established diagnoses.
- executing the query at step 320 includes generating a data query (e.g., data request 220 of FIG. 2A ) and a sorting query (e.g., sort request 222 of FIG. 2A ) using a suitable sub-query generator.
- the issued query includes: (i) data selection criteria (e.g., data selection criteria in lines 001-002 of Table I) configured to select data records defining the query result from the data in the database, and (ii) sorting criteria (e.g., sorting criteria in lines 003-004 of Table I) configured to specify the information related to the data records of the query result.
- the data query is generated on the basis of the data selection criteria and the sorting query is generated on the basis of the sorting criteria.
- the data query is used to determine the query result in an initial order on the basis of the data selection criteria.
- the sorting query is used to sort the data records in the determined query result on the basis of the sorting criteria.
- the data query shown in Table IV below can be generated from the exemplary query of Table I above using the suitable sub-query generator.
  TABLE IV: DATA QUERY EXAMPLE
  001  FIND
  002  ID, Name, Age
  003  FROM
  004  Demographic
- the exemplary data query shown in Table IV includes data selection criteria in lines 001-002 which correspond to the data selection criteria in lines 001-002 of Table I.
- the exemplary data query further includes a specification of the database table which contains the requested data (lines 003-004), i.e., the “Demographic” table of Table II above.
- the suitable sub-query generator retrieves the table named “Demographic” from the issued query of Table I above.
- the sorting query shown in Table V below can be generated from the exemplary query of Table I above.
  TABLE V: SORTING QUERY EXAMPLE
  001  FIND
  002  Patient_ID, count(comment)
  003  FROM
  004  Annotations
  005  GROUP BY
  006  Patient_ID
- the exemplary sorting query shown in Table V includes data selection criteria in lines 001-002 for selection of the required related information, and a specification of the database table which contains the related information in lines 003-004, i.e., the “Annotations” table of Table III above.
- a corresponding rule may indicate to the suitable sub-query generator (e.g., sub-query generator 257 of FIG. 2B ) that the related information about annotations is contained in the database table named “Annotations”.
- the sorting criteria “SORT BY number of associated annotations” in lines 003-004 of Table I specify that the query result should be sorted with respect to a number of annotations associated with each data record contained in the query result.
- annotations (“comment” in line 002) with respect to patients which are identified by corresponding patient identifiers (“Patient_ID” in line 002) are retrieved. Furthermore, all retrieved annotations are counted for each data record associated with one of the retrieved patient identifiers (“count(comment)” in line 002). Moreover, a ranking of the counted retrieved annotations is established according to the sorting query of Table V by grouping the counted numbers of annotations (lines 005-006) per patient.
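As a rough illustration, the sorting query of Table V can be expressed in SQL and run with Python's built-in `sqlite3` module. The translation from the natural-language form to SQL is an assumption, as are the dates and comment texts in the sample rows; only the `Patient_ID` pattern (two annotations for patient 1, one for patient 2) follows the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Annotations ("
    "Node_ID INTEGER PRIMARY KEY, Patient_ID INTEGER, Date TEXT, Comment TEXT)"
)

# Hypothetical rows: dates and comments are placeholders, not source data.
rows = [
    (1, 1, "2004-01-01", "placeholder comment"),
    (2, 2, "2004-01-02", "placeholder comment"),
    (3, 1, "2004-01-03", "placeholder comment"),
]
conn.executemany("INSERT INTO Annotations VALUES (?, ?, ?, ?)", rows)

# One possible SQL rendering of Table V: count the comments per patient.
sorting_query = (
    "SELECT Patient_ID, COUNT(Comment) FROM Annotations GROUP BY Patient_ID"
)
annotation_counts = dict(conn.execute(sorting_query).fetchall())
conn.close()

print(sorted(annotation_counts.items()))  # [(1, 2), (2, 1)]
```

Patients without any annotation do not appear in the result at all, so a caller has to treat a missing identifier as a count of zero.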
- the query result in an initial order is received.
- the query result is presented in a list form having a plurality of data records.
- receiving the query result in the initial order corresponds to receiving a query result (hereinafter referred to as “data query result”) obtained in response to execution of the data query of Table IV against the exemplary “Demographic” table of Table II.
- TABLE VI: EXEMPLARY DATA QUERY RESULT
  001  ID  Name   Age
  002  3   Renee  24
  003  1   Karl   54
  004  2   Kris   49
- a data source e.g., data source 252 of FIG. 2B
- annotations e.g., one or more of the annotations 254 of FIG. 2B
- a total number of retrieved annotations is counted for each one of the data records.
- steps 340 and 350 can be accomplished by executing the sorting query of Table V against the “Annotations” table of Table III. Accordingly, the exemplary query result (hereinafter referred to as “sorting query result”) shown in Table VII below is received.
  TABLE VII: EXEMPLARY SORTING QUERY RESULT
  001  Patient_ID  Number of Annotations
  002  1           2
  003  2           1
- according to line 002 of Table VII, the patient identifier “1” has two associated annotations (in lines 002 and 004 of Table III). According to line 003, the patient identifier “2” has only one associated annotation (in line 003 of Table III).
- a ranking of the data records in lines 002-004 of the data query result in Table VI above is determined on the basis of the counted total numbers of retrieved associated annotations.
- the data query result of Table VI can be augmented with the exemplary sorting query result of Table VII, in the given example. Accordingly, the augmented query result shown in Table VIII below is obtained.
- the exemplary augmented query result of Table VIII corresponds to the data query result shown in Table VI above, wherein a column containing the counted numbers of annotations according to Table VII has been inserted. Furthermore, as can be seen from the “Number of Annotations” column, the following ranking can be established: (1) the data record of line 003 has the most associated annotations, (2) the data record in line 004 has the second most associated annotations, and (3) the data record in line 002 has no associated annotations at all.
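The augmentation and ranking steps can be sketched as follows. The rows come from Table VI and Table VII; the helper name `augment` and the use of dictionaries are illustrative assumptions, as is treating a missing count as zero.

```python
# Data query result of Table VI, in its initial order.
data_query_result = [
    {"ID": 3, "Name": "Renee", "Age": 24},
    {"ID": 1, "Name": "Karl", "Age": 54},
    {"ID": 2, "Name": "Kris", "Age": 49},
]

# Sorting query result of Table VII: Patient_ID -> number of annotations.
sorting_query_result = {1: 2, 2: 1}


def augment(records, counts):
    """Add a "Number of Annotations" column; a missing count becomes 0."""
    return [
        {**record, "Number of Annotations": counts.get(record["ID"], 0)}
        for record in records
    ]


augmented = augment(data_query_result, sorting_query_result)

# Rank by descending annotation count; sorted() is stable, so records
# with equal counts keep their initial relative order.
ranked = sorted(
    augmented, key=lambda record: record["Number of Annotations"], reverse=True
)
print([record["Name"] for record in ranked])  # ['Karl', 'Kris', 'Renee']
```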
- the above ranking is performed on the basis of the counted numbers of annotations.
- types and/or attributes of the annotations can also be considered when establishing the ranking.
- the annotations can be weighted based on an organization hierarchy or type of the creators of the annotations.
- the annotations can be weighted so that some annotations are weighted relatively more heavily than others. For example, assume that the annotation in line 003 of Table III which is related to “Kris” was made by a chief site specialist while the annotations in lines 002 and 004 of Table III, which are both related to “Karl”, were made by an assistant. In this case, it might be desirable to weight the annotation made by the chief site specialist such that the one annotation associated with “Kris” is considered more important than the two annotations associated with “Karl”.
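A weighted ranking of this kind might look like the sketch below. The weight values (3.0 and 1.0), the `creator_role` field, and the function name are hypothetical; the text only says that the specialist's annotation should count for more than the two annotations made by the assistant.

```python
# Hypothetical weights per creator role; the source gives no concrete numbers.
ROLE_WEIGHTS = {"chief site specialist": 3.0, "assistant": 1.0}

# One entry per annotation, following the example: two annotations by an
# assistant for patient 1 (Karl), one by a chief site specialist for
# patient 2 (Kris).
annotations = [
    {"Patient_ID": 1, "creator_role": "assistant"},
    {"Patient_ID": 2, "creator_role": "chief site specialist"},
    {"Patient_ID": 1, "creator_role": "assistant"},
]


def weighted_scores(annotations, role_weights, default_weight=1.0):
    """Sum the role weight of every annotation per patient."""
    scores = {}
    for annotation in annotations:
        weight = role_weights.get(annotation["creator_role"], default_weight)
        patient = annotation["Patient_ID"]
        scores[patient] = scores.get(patient, 0.0) + weight
    return scores


names = {1: "Karl", 2: "Kris", 3: "Renee"}
scores = weighted_scores(annotations, ROLE_WEIGHTS)
ranking = sorted(names, key=lambda pid: scores.get(pid, 0.0), reverse=True)
print([names[pid] for pid in ranking])  # ['Kris', 'Karl', 'Renee']
```

With these assumed weights the single specialist annotation (score 3.0) outranks the two assistant annotations (score 2.0), so "Kris" moves ahead of "Karl".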
- the data records in the received query result i.e., data query result of Table VI
- the exemplary sorted query result shown in Table IX is obtained.
- the sorted list of data records (e.g., sorted requested data 240 of FIG. 2B ) is output.
- the sorted list is output for display on a display device (e.g., display device 142 in FIG. 2B ).
- the exemplary sorted query result of Table IX is output.
- Method 300 then exits at step 390 .
- the re-ordering of the requested data 230 is performed in another embodiment on the basis of a value variance which is determined for each of the data records of the requested data 230 .
- the value variance of a given data record indicates a relative proximity between a predefined value 262 and a corresponding value of the given data record.
- the sorting program(s) 126 illustratively invokes a value variance determination unit 260 .
- the value variance determination unit 260 is represented as a separate unit only by way of example and not for limiting the invention accordingly.
- the value variance determination unit 260 can also be implemented as an integral part of the sorting program(s) 126 or some other suitable system program.
- the value variance determination unit 260 illustratively includes the predefined value 262 .
- the predefined value 262 can be provided by a user using a suitable user interface (e.g., user interface 210 of FIG. 2A ).
- each one of the data records of the requested data 230 has a particular value of a type which corresponds to an underlying value type of the predefined value 262 .
- each one of the data records may include a particular value related to a hemoglobin test and the predefined value 262 may represent a user-specified value of interest for hemoglobin tests. More specifically, assume that the particular values of the data records are hemoglobin test result values between 12 and 14.
- a user specifies 13.5 as a central or ideal interest value, i.e., the predefined value 262 .
- the data records having particular values which are closest to the central or ideal interest value of 13.5 can be identified from the requested data 230 .
- the value variance determination unit 260 determines the value variance of the particular value of each of the data records of the requested data 230 with respect to the predefined value 262 .
- a relative proximity between the corresponding particular value and the predefined value 262 can be identified.
- the data records having the particular values with the closest relative proximity to the predefined value 262 are programmatically placed on the top of the sorted list of data records defining the sorted requested data 240 .
- An exemplary method for re-ordering the requested data 230 on the basis of a value variance which is determined for each of the data records of the requested data 230 is described below with reference to FIG. 4 .
- a method 400 for re-ordering requested data (e.g., requested data 230 of FIG. 2B ) on the basis of value variances is shown.
- the requested data is obtained in response to execution of a corresponding data request (e.g., data request 220 of FIG. 2A ) against data in a database (e.g., data 136 of database 139 of FIG. 2A ).
- the method 400 is explained by way of example with respect to a query issued against the data in the database in order to obtain a corresponding query result.
- At least part of the steps of the method 400 can be performed by a suitable requesting entity (e.g., requesting application(s) 124 of FIG. 2A ) and suitable functionalities of an associated sorting program(s) (e.g., sorting program(s) 126 of FIG. 2B ).
- Method 400 starts at step 410 .
- the query is issued by a suitable requesting entity (e.g., requesting application(s) 124 of FIG. 2A ).
- the issued query is executed against the data in the database.
- An exemplary query is shown in Table X below.
- the exemplary query of Table X is described in natural language without reference to a particular query language.
- any suitable query language, known or unknown, can be used to create the query of Table X.
- the exemplary query shown in Table X includes data selection criteria in lines 001-002.
- the data selection criteria include a result field specification (line 002) which specifies two result fields for which information is to be returned in the query result. Specifically, in line 002 the result fields “Patient_ID” and “Hemoglobin” are specified.
- the exemplary query further includes sorting criteria in lines 003-004.
- the sorting criteria indicate that all data records in the query result should be sorted with respect to a predefined Hemoglobin value (e.g., predefined value 262 of FIG. 2B ) of “34” (line 004). More specifically, each Hemoglobin test value included with a data record of the query result is compared with the predefined Hemoglobin value to identify a relative proximity thereto.
- the exemplary database table “Tests” includes a “Patient_ID”, “Date” and “Hemoglobin” column.
- the “Patient_ID” column includes patient identifiers according to the “ID” column of the “Demographic” table of Table II above.
- the “Date” column contains exemplary dates at which a corresponding Hemoglobin test has been performed on a given patient.
- the “Hemoglobin” column includes Hemoglobin test values which have been determined at the indicated dates.
- executing the query at step 420 includes identifying the data selection criteria and the sorting criteria from the issued query. Executing the query further includes generating a data query on the basis of the identified data selection criteria and executing the data query against the database.
- the data selection criteria “FIND Patient_ID, Hemoglobin” can be identified from the issued query (lines 001-002 of Table X).
- the data query shown in Table XII below can be generated.
  TABLE XII: DATA QUERY EXAMPLE
  001  FIND
  002  Patient_ID, Hemoglobin
  003  FROM
  004  Tests
- the exemplary data query shown in Table XII includes the data selection criteria of lines 001-002 of Table X.
- the exemplary data query further includes a specification of the database table which contains the requested data (lines 003-004), i.e., the “Tests” table of Table XI above.
- the query result in an initial order is received.
- Receiving the query result in the initial order corresponds to receiving a data query result obtained in response to execution of the data query of Table XII against the exemplary “Tests” table of Table XI.
- the data query result shown in Table XIII below is received in the given example.
- a value variance is determined for each one of the data records contained in the data query result of Table XIII to determine the relative proximities.
- the data query result can be augmented with a column indicating the determined value variances. Accordingly, the augmented query result shown in Table XIV below is obtained.
  TABLE XIV: EXEMPLARY AUGMENTED QUERY RESULT
  001  Patient_ID  Hemoglobin  Value Variance
  002  1           29          5
  003  1           23          11
  004  3           35          1
  005  2           45          11
  006  2           33          1
- the exemplary augmented query result of Table XIV corresponds to the data query result shown in Table XIII above, wherein a column containing the determined value variances has been inserted.
- Each value variance is defined by the absolute difference between the returned Hemoglobin value and the predefined Hemoglobin value.
- the data record in line 005 has a Hemoglobin value of “45”.
- a ranking of the data records is determined on the basis of the determined relative proximities, i.e., the determined value variances.
- the following ranking can be established: (1) the data records of lines 004 and 006 have a value variance of “1” and, thus, the closest relative proximity with respect to the predefined Hemoglobin value, (2) the data record of line 002 has a value variance of “5” and, thus, the second closest relative proximity, and (3) the data records of lines 003 and 005 have a value variance of “11” and, thus, the farthest relative proximity.
- the data records in the data query result of Table XIII are sorted on the basis of the determined ranking. Accordingly, the exemplary sorted query result shown in Table XV below is obtained.
  TABLE XV: EXEMPLARY SORTED QUERY RESULT
  Patient_ID  Hemoglobin
  2           33
  3           35
  1           29
  1           23
  2           45
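The value-variance sort can be sketched as below, using the Hemoglobin values of the example and the predefined value “34”. The tuple layout and function name are illustrative. Breaking ties by the smaller Hemoglobin value is an assumption chosen only because it happens to reproduce the order of Table XV; the text does not say how records with equal variance are ordered.

```python
PREDEFINED_VALUE = 34  # predefined Hemoglobin value from the example query

# (Patient_ID, Hemoglobin) rows in the initial order of the data query result.
data_query_result = [(1, 29), (1, 23), (3, 35), (2, 45), (2, 33)]


def value_variance(value, predefined):
    """Absolute distance between a record's value and the predefined value."""
    return abs(value - predefined)


# Augmented result: each row gains its value variance as a third column.
augmented = [
    (patient_id, hemoglobin, value_variance(hemoglobin, PREDEFINED_VALUE))
    for patient_id, hemoglobin in data_query_result
]

# Smallest variance first; ties are broken by the Hemoglobin value
# (an assumption, see the note above).
sorted_result = sorted(
    data_query_result,
    key=lambda row: (value_variance(row[1], PREDEFINED_VALUE), row[1]),
)
print(sorted_result)  # [(2, 33), (3, 35), (1, 29), (1, 23), (2, 45)]
```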
- the sorted list of data records (e.g., sorted requested data 240 of FIG. 2B ) is output.
- the exemplary sorted query result of Table XV is output.
- Method 400 then exits at step 480 .
- the requested value range coverage is defined by a predefined maximum number 274 “VALUE COUNT” of data records of the requested data 230 to be output.
- Each data record has an associated particular value and the particular value of each of the outputted data records must be included within a predefined value range 272 “VALUE RANGE”.
- the predefined maximum number 274 of data records having associated particular values within the predefined value range 272 is programmatically selected and output.
- the requested data 230 includes data records having particular values for the weight of respective individuals. Assume further that the researcher requires 100 test persons and that the 100 test persons should have weights which are included in a value range of 100 pounds-250 pounds. To this end, the researcher using a suitable user interface (e.g., user interface 210 of FIG. 2A ) defines the predefined maximum number 274 to be 100 and the predefined value range 272 to be 100 pounds-250 pounds. Assume now that the requested data 230 includes 1000 data records having particular values within the predefined value range 272 . Thus, by specifying the requested range coverage to retrieve the 100 test persons, 100 data records would be selected programmatically and output. The 100 data records can be selected arbitrarily to satisfy the requested value range coverage.
- the sorting program(s) 126 illustratively invokes a range coverage determination unit 270 having the predefined value range 272 and the predefined maximum number 274 .
- the range coverage determination unit 270 is represented as a separate unit only by way of example and not for limiting the invention accordingly.
- the range coverage determination unit 270 can also be implemented as an integral part of the sorting program(s) 126 or some other suitable system program.
- an arbitrary selection of the 100 data records in the given example may result in selection of 100 individuals all having an identical weight of 175 pounds, for example.
- the user can use the suitable user interface in one embodiment to specify how many data records having an identical associated particular value should be output at maximum. For instance, the user can specify that not more than five data records associated with individuals having an identical weight should be output. Accordingly, in the given example the 100 selected data records would represent individuals having at least 20 different weights within the predefined value range 272 .
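The per-value cap can be illustrated with a small selection routine. Everything here is a sketch under assumptions: the function name, its parameters, the first-come selection order, and the sample weights are hypothetical, and the limits are scaled down from the 100-record example to keep the output short.

```python
from collections import Counter


def select_with_cap(records, value_of, low, high, max_count, max_per_value):
    """Pick up to `max_count` records whose value lies in [low, high],
    taking at most `max_per_value` records that share the same value.

    Records are taken in input order, which is one arbitrary choice.
    """
    selected = []
    per_value = Counter()
    for record in records:
        if len(selected) >= max_count:
            break
        value = value_of(record)
        if not low <= value <= high:
            continue  # outside the predefined value range
        if per_value[value] >= max_per_value:
            continue  # cap for this identical value already reached
        selected.append(record)
        per_value[value] += 1
    return selected


# Hypothetical weights in pounds; ten identical values of 175 come first.
weights = [175] * 10 + [120, 120, 300, 90, 200]
selected = select_with_cap(
    weights, lambda w: w, low=100, high=250, max_count=8, max_per_value=5
)
print(selected)  # [175, 175, 175, 175, 175, 120, 120, 200]
```

With a maximum of 100 records and a cap of five per identical value, the same routine would return at least 20 different weights whenever it fills all 100 slots, as noted above.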
- the particular values of the outputted data records must define a requested value distribution in the predefined value range 272 .
- the requested value distribution can be defined by any possible type of distribution, such as a flat distribution and a bell curve.
- a flat distribution and a bell curve are described merely by way of example; other distribution types can also be requested, such as an inverted bell curve or a negative exponential distribution. Accordingly, all such distributions are broadly contemplated. For instance, assume that in the given example the researcher requires 100 test persons having weights which are evenly spread out over the value range of 100 pounds-250 pounds, so that the weights of the 100 test persons can be considered as being representative of the complete value range.
- by specifying the requested range coverage to retrieve the 100 test persons such that the weights of the retrieved test persons define a flat distribution over the value range of 100 pounds-250 pounds, the best fit of representative data records would be selected programmatically.
- the range coverage determination unit 270 determines for each of the data records of the requested data 230 whether the particular value of the data record is included within the predefined value range 272 . From all data records having their particular value included within the predefined value range 272 , a total number of data records is selected that is equal to, or at least does not exceed, the predefined maximum number 274 (in this example, 100). The particular values of the selected data records define the requested value distribution.
- the requested value distribution is represented as a histogram having one or more value windows, each having a specified value range defining a granularity of the value window.
- the granularity can be user-specified or system and/or application specific.
- a user can specify a histogram using the suitable user interface. For instance, in the given example the user can specify a histogram representing a bell curve.
- the user may divide the value range of 100 pounds-250 pounds into five different value windows, such as (1) 100 pounds-129 pounds, (2) 130 pounds to 159 pounds, (3) 160 pounds-189 pounds, (4) 190 pounds-219 pounds, and (5) 220 pounds to 250 pounds.
- the user may specify that from the 100 requested test persons (i) 15 persons should have weights within the value windows (1) and (5), respectively, (ii) 40 persons should have weights within the value windows (2) and (4), respectively, and (iii) 90 persons should have weights within the value window (3). Accordingly, the weights of all selected data records would define a bell curve.
- the one or more selected data records can, for instance, be placed on the top of the sorted list defining the sorted requested data 240 .
- only the selected data records can be displayed in the sorted list on the display device 142 , while the remaining data records are hidden to the user.
- An exemplary method for re-ordering the requested data 230 on the basis of a requested value range coverage is described below with reference to FIGS. 5A-B.
- FIG. 5A shows one embodiment of a method 500 for re-ordering requested data (e.g., requested data 230 of FIG. 2B ) on the basis of a requested value range coverage.
- the requested data is obtained in response to execution of a corresponding data request (e.g., data request 220 of FIG. 2A ) against data in a database (e.g., data 136 of database 139 of FIG. 2A ).
- the method 500 is explained by way of example with respect to a query issued against the data in the database in order to obtain a corresponding query result.
- At least part of the steps of the method 500 can be performed by a suitable requesting entity (e.g., requesting application(s) 124 of FIG. 2A ) and suitable functionalities of an associated sorting program(s) (e.g., sorting program(s) 126 of FIG. 2B ).
- Method 500 starts at step 510 .
- the query is issued by a suitable requesting entity (e.g., requesting application 124 of FIG. 2A ).
- the issued query is executed against the data in the database.
- An exemplary query is shown in Table XVI below.
- the exemplary query of Table XVI is described in natural language without reference to a particular query language.
- any suitable query language, known or unknown, can be used to create the query of Table XVI.
- the exemplary query shown in Table XVI includes data selection criteria in lines 001-002.
- the data selection criteria include a result field specification (line 002) which specifies two result fields for which information is to be returned in the query result.
- the result fields “Patient_ID” and “Hemoglobin” are specified.
- the exemplary query further includes sorting criteria in lines 003-006.
- the sorting criteria indicate that all data records in the query result should be sorted with respect to a spread of Hemoglobin values (line 004).
- the range of values which is defined by the Hemoglobin values of the query result constitutes a predefined value range (e.g., predefined value range 272 of FIG. 2B ) for the requested value range coverage.
- the Hemoglobin test values in the “Tests” table of Table XI define the predefined value range [23; 45].
- the predefined value range may also be provided by a user using a suitable user interface (e.g., user interface 210 of FIG. 2A ).
- the sorting criteria further indicate a predefined maximum number (e.g., predefined maximum number 274 of FIG. 2B ) which specifies that only “3” data records should be returned in the query result (line 006).
- executing the query at step 520 includes identifying the data selection criteria and the sorting criteria from the issued query. Executing the query further includes generating a data query on the basis of the identified data selection criteria and executing the data query against the database.
- the data selection criteria “FIND Patient_ID, Hemoglobin” can be identified from the issued query (lines 001-002 of Table XVI).
- the sorting criteria “SORT BY spread of Hemoglobin RETURN 3 data records” can be identified from the issued query (lines 003-006 of Table XVI).
- the data query which can be generated on the basis of the identified data selection criteria corresponds to the data query shown in Table XII above.
- the data query of Table XII is executed against the database table “Tests” illustrated in Table XI to determine the query result in an initial order for the exemplary query of Table XVI.
- receiving the query result in the initial order corresponds to receiving a data query result which corresponds to the data query result shown in Table XIII, as described above with reference to FIG. 4 .
- a subset of the data records of the data query result of Table XIII is selected which satisfies the requested value range coverage.
- the subset of data records should be selected such that the Hemoglobin test values associated with the data records of the subset define a flat distribution over the predefined value range, i.e., such that the Hemoglobin test values are evenly spread over the predefined value range.
- three of the data records which have associated Hemoglobin test values that are evenly spread over the predefined value range [23; 45] are identified from the data query result of Table XIII.
- the data records in the data query result of Table XIII are sorted on the basis of the requested value range coverage.
- the sorting comprises including only the three identified data records with the sorted query result.
- the three identified data records can be placed on the top of the sorted list.
- the three identified data records can be flagged to indicate that only display of these data records is allowed, while all remaining data records should be hidden to the user.
- the data records of lines 003, 005 and 006 of Table XIII are identified. Accordingly, the exemplary sorted query result shown in Table XVII below is obtained.
- the sorted list of data records (e.g., sorted requested data 240 of FIG. 2B ) is output.
- the exemplary sorted query result of Table XVII is output.
- Method 500 then exits at step 570 .
- FIG. 5B shows one embodiment of a method 548 for identifying the subset of data records from the data query result according to step 540 of FIG. 5A.
- the method 548 starts at step 541 , where all data records of the data query result which have an associated value within the predefined value range are determined.
- the associated values of all data records in the data query result of Table XIII are included within the predefined value range [23; 45].
- a requested value distribution is determined for all associated values which are included within the predefined value range. Assume now that in the given example a flat distribution in the predefined value range [23; 45] is requested. Assume further that three value windows are specified for the flat distribution, such as [23;30], [31;38] and [39;45].
- each value group may include one or more data records.
- the Hemoglobin test values 23, 29, 33, 35 and 45 of the data records shown in Table XIII are grouped into three value groups: (i) the values 23 and 29 are grouped into a first value group corresponding to the value window [23;30], (ii) the values 33 and 35 are grouped into a second value group corresponding to the value window [31;38], and (iii) the value 45 is grouped into a third value group corresponding to the value window [39;45].
- one or more data records from at least a portion of the value groups are determined such that a total number of selected data records is equal to, or at least does not exceed, the predefined maximum number, i.e., “3”.
- the one or more data records are selected to be evenly spread over the predefined value range in order to define the requested flat distribution.
- data records for a maximum number of different values of the value distribution are determined, according to one aspect.
- the predefined maximum number of “3” data records is selected from the three different value groups. Accordingly, one data record is selected for each value group.
- because the values “23” and “45” of the first and third value groups are boundary values of the predefined value range [23;45] and, thus, equidistant from a median value of the predefined value range, i.e., “34”, the data records having these values are selected. Furthermore, a data record having an associated value which is in the second value group is selected.
- either of the two data records can be selected programmatically in an arbitrary manner so that the requested flat distribution is satisfied. As was noted above, the data record having the associated value “33” is selected. Processing then continues at step 550 of FIG. 5A .
- the selection can be based on additional selection criteria provided by a user. More specifically, assume that in the described example the Hemoglobin test value “35” has been established for an individual living in Rochester, Minn., and that the Hemoglobin test value “33” has been established for an individual living in Houston, Tex. Assume further that the user specifies that data records for individuals living in Texas should be preferred. Accordingly, the data record having the Hemoglobin test value “33” for an individual living in Houston, Tex., is selected.
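The grouping and selection described for FIG. 5B can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the described embodiment: the function name `select_flat_subset`, the dictionary record layout, the `state` key and the tie-breaking rules are all hypothetical. Only the Hemoglobin values, the three value windows and the two cities' states come from the example above.

```python
def select_flat_subset(records, windows, max_count, prefer=None):
    """Pick at most one record per value window, up to max_count records.

    records: list of dicts with a 'hemoglobin' value (assumed layout)
    windows: ordered list of inclusive (low, high) value windows
    prefer:  optional predicate expressing a user preference
    """
    selected = []
    last = len(windows) - 1
    for i, (low, high) in enumerate(windows):
        if len(selected) >= max_count:
            break
        group = [r for r in records if low <= r["hemoglobin"] <= high]
        if not group:
            continue
        if prefer is not None:
            # Keep only preferred records, unless none of them qualifies
            group = [r for r in group if prefer(r)] or group
        if i == 0:
            # First window: favor the lower boundary of the range
            choice = min(group, key=lambda r: r["hemoglobin"])
        elif i == last:
            # Last window: favor the upper boundary of the range
            choice = max(group, key=lambda r: r["hemoglobin"])
        else:
            # Inner windows: arbitrary choice (first record found)
            choice = group[0]
        selected.append(choice)
    return selected


# Values from the example; 'state' is known only for the values 33 and 35
records = [
    {"hemoglobin": 23, "state": None},
    {"hemoglobin": 29, "state": None},
    {"hemoglobin": 33, "state": "TX"},
    {"hemoglobin": 35, "state": "MN"},
    {"hemoglobin": 45, "state": None},
]
windows = [(23, 30), (31, 38), (39, 45)]

default_pick = select_flat_subset(records, windows, 3)
texas_pick = select_flat_subset(records, windows, 3,
                                prefer=lambda r: r["state"] == "TX")
print([r["hemoglobin"] for r in texas_pick])
```

Passing a different predicate, for example one that prefers `"MN"`, would select the record with the value 35 from the second window instead.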
- the re-ordering of the requested data 230 is performed in still another embodiment on the basis of suitability scores which are determined with respect to the available analysis routines 180 . More specifically, a suitability score is determined for each data record of the requested data 230 .
- the suitability score of a given data record indicates a relative suitability of the given data record as input to one or more of the analysis routines 180 .
- the sorting program(s) 126 illustratively invokes an analysis routine identification unit 280 .
- the analysis routine identification unit 280 is represented as a separate unit only by way of example and not for limiting the invention accordingly.
- the analysis routine identification unit 280 can also be implemented as an integral part of the sorting program(s) 126 or some other suitable system component.
- the analysis routine identification unit 280 identifies one or more analysis routines from the analysis routines 180 which are configured for processing the requested data 230 .
- the analysis routine identification unit 280 then identifies qualifiers, such as row qualifiers and result set qualifiers for the identified analysis routine(s).
- a row qualifier of a given analysis routine indicates a possible input field of the given analysis routine and may specify a preferred input value for the possible input field.
- a result set qualifier of a given analysis routine specifies characteristics which qualify a result set that is suitable as input to the given analysis routine. For instance, a result set qualifier may specify that only a result set having Hemoglobin values for each data record is suitable.
- the result set qualifier of the given analysis routine indicates a preferred range of input values of the given analysis routine.
- the row qualifier(s) and/or result set qualifier(s) of the identified analysis routine(s) can be determined from associated metadata 282 .
- the analysis routine identification unit 280 determines how suitable each one of the data records of the requested data 230 is as input to the identified analysis routine(s). In the case of an identified row qualifier, the analysis routine identification unit 280 determines for a given data record having a particular value whether an underlying type of the particular value of that data record corresponds to an input type of the possible input field of the identified analysis routine(s). Each time a match of the types is encountered, the suitability score of the given data record is modified. Modifying the suitability score includes, by way of example, increasing or decreasing the suitability score.
- the result set qualifier can be transformed into a set of one-time row qualifiers, each of which can be processed similarly to the row qualifier, as described above.
- data records which are most suitable as input to the identified analysis routine(s) can be identified and placed on the top of a corresponding sorted list.
- In FIG. 6A , one embodiment of a method 600 for re-ordering requested data (e.g., requested data 230 of FIG. 2B ) on the basis of suitability scores is shown.
- the suitability scores are determined for data records included with the requested data with respect to analysis routines which are configured to process the data records.
- the requested data is obtained in response to execution of a corresponding data request (e.g., data request 220 of FIG. 2A ) against data in a database (e.g., data 136 of database 139 of FIG. 2A ).
- the method 600 is explained by way of example with respect to a query issued against the data in the database in order to obtain a corresponding query result. At least part of the steps of the method 600 can be performed by a suitable requesting entity (e.g., requesting application(s) 124 of FIG. 2A ) and suitable functionalities of an associated sorting program(s) (e.g., sorting program(s) 126 of FIG. 2B ). Method 600 starts at step 610 .
- the query is issued by a suitable requesting entity (e.g., requesting application 124 of FIG. 2A ).
- the issued query is executed against the data in the database.
- An exemplary query is shown in Table XVIII below.
- the exemplary query of Table XVIII is described in natural language without reference to a particular query language.
- any suitable query language, known or unknown, can be used to create the query of Table XVIII.
- the exemplary query shown in Table XVIII includes data selection criteria in lines 001-002.
- the data selection criteria include a result field specification (line 002) which specifies two result fields for which information is to be returned in the query result.
- the result fields “Patient_ID” and “Hemoglobin” are specified.
- the exemplary query further includes sorting criteria in lines 003-004. The sorting criteria indicate that all data records in the query result should be sorted with respect to available analysis routines (line 004).
- executing the query at step 620 includes identifying the data selection criteria and the sorting criteria from the issued query. Executing the query further includes generating a data query on the basis of the identified data selection criteria and executing the data query against the database.
- the data selection criteria “FIND Patient_ID, Hemoglobin” can be identified from the issued query (lines 001-002 of Table XVIII).
- the sorting criteria “SORT BY available analysis routines” can be identified from the issued query (lines 003-004 of Table XVIII).
- the data query which can be generated on the basis of the identified data selection criteria corresponds to the data query shown in Table XII above.
- the data query of Table XII is executed against the database table “Tests” illustrated in Table XI to determine the query result in an initial order for the exemplary query of Table XVIII.
- receiving the query result in the initial order corresponds to receiving the data query result of Table XIII, as described above with reference to FIG. 4 .
- all analysis routines which are configured to process the data query result are identified from a plurality of available analysis routines (e.g., analysis routines 180 of FIG. 2B ).
- identifying the analysis routines which are configured to process the data query result includes accessing metadata associated with the analysis routines (e.g., metadata 282 of FIG. 2B ).
- the associated metadata may include qualifiers, such as row and result set qualifiers, which specify a type of query result that can be processed by corresponding analysis routines.
- a suitability score is determined for each data record of the data query result.
- the suitability score of a given data record indicates a relative suitability of the given data record as input to the identified analysis routine(s). Exemplary methods for determining suitability scores are described below with reference to FIGS. 6 B-C.
- the data records in the data query result are sorted on the basis of the determined suitability scores.
- the sorted list of data records (e.g., sorted requested data 240 of FIG. 2B ) is output.
- Method 600 then exits at step 680 .
- In FIG. 6B , one embodiment of a method 690 for determining suitability scores for data records of a data query result (e.g., the data query result of Table XIII) according to step 650 of FIG. 6A is shown.
- the method 690 starts at step 651 , where a loop consisting of steps 651 - 653 is entered for each analysis routine that is identified at step 640 of FIG. 6A .
- each row qualifier indicates a possible input field of the given analysis routine.
- each row qualifier may specify a preferred input value for the possible input field.
- each row qualifier may have an associated weight. For instance, a first row qualifier may define that a given data record having a Hemoglobin test value greater than 35 is suitable as input to the given analysis routine.
- the given analysis routine may further have a second row qualifier which defines that a given data record having an Age value greater than 30 is also suitable as input to the given analysis routine.
- the given analysis routine performs better on data records having higher Hemoglobin test values than on data records having higher Age values.
- the first row qualifier may be associated with a higher weight than the second row qualifier. Then, at step 653 the possible input fields, the preferred input values and the associated weights of the identified row qualifiers are identified. When the loop consisting of steps 651 - 653 has been executed for each identified analysis routine, processing continues at step 654 .
- each result field of each data record of the data query result is compared with the identified possible input fields. For all matching fields, the value of the corresponding result field is compared with the preferred input value of the matching possible input field.
- At step 655 , for each data record of the data query result, all matching fields are counted. Optionally, all associated weights are applied to the counted matching fields.
- At step 656 , relative proximities for all matching fields of each data record of the data query result are determined. More specifically, for each result field of a given data record that matches a possible input field defined by one of the identified row qualifiers, a relative proximity between the value of the result field and the preferred input value of the matched possible input field is determined. In one embodiment, all associated weights are applied to the determined relative proximities.
- a difference value between the preferred input value of the possible input field and the values of matching result fields can be determined instead of a relative proximity.
- the suitability scores for all data records of the data query result are determined on the basis of the counted matching fields and/or the determined relative proximities. More specifically, according to one aspect, the suitability score of a given data record can be increased or decreased for each matching field with respect to any of the identified analysis routines. The suitability score may also be increased or decreased on the basis of each determined relative proximity or difference value of the given data record with respect to each identified analysis routine. Furthermore, the increase/decrease may be dependent on the determined relative proximity or difference value. For instance, a greater relative proximity may result in a higher increase/decrease.
- the given analysis routine performs better if the Hemoglobin test value of a given data record and, thus, the corresponding difference value is high. Accordingly, the given analysis routine performs better on the data record having the Hemoglobin test value of 55 and the difference value of 20. Thus, this data record may have a higher increase of its suitability score with respect to the given analysis routine than the data record having the Hemoglobin test value of 49 and the difference value of 14. Processing then continues at step 660 of FIG. 6A .
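The row-qualifier scoring of FIG. 6B can be illustrated with a short sketch. The scoring formula below (one weighted point per matching field plus the weighted positive difference value) is an assumption chosen for illustration; the text above only states that the score may be increased or decreased depending on matches and on the relative proximity or difference value. The names `RowQualifier` and `suitability_score` and the patient identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RowQualifier:
    field: str            # possible input field of an analysis routine
    preferred: float      # preferred input value (here used as a threshold)
    weight: float = 1.0   # optional associated weight


def suitability_score(record, qualifiers):
    """Score one data record against the row qualifiers of one routine."""
    score = 0.0
    for q in qualifiers:
        if q.field not in record:
            continue                      # result field does not match
        score += q.weight                 # count the matching field
        difference = record[q.field] - q.preferred
        if difference > 0:
            # Assumed rule: a larger difference value gives a larger increase
            score += q.weight * difference
    return score


qualifiers = [RowQualifier(field="Hemoglobin", preferred=35)]
records = [
    {"Patient_ID": 1, "Hemoglobin": 49},   # difference value 14
    {"Patient_ID": 2, "Hemoglobin": 55},   # difference value 20
]

# Most suitable records first
ranked = sorted(records,
                key=lambda r: suitability_score(r, qualifiers),
                reverse=True)
print([r["Hemoglobin"] for r in ranked])
```

With these assumptions the record with the Hemoglobin test value of 55 receives the larger increase and is placed ahead of the record with the value of 49, which matches the ordering described above.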
- the determined suitability scores can be expressed by a plurality of score portions, wherein each score portion is related to a different identified analysis routine.
- each score portion can be normalized in order to limit the ability of a single identified analysis routine to push a particular data record to the top of the sorted list of data records.
- all score portions of the plurality of score portions of a given suitability score can be determined and stored separately.
- the sorting can be based on the score portions associated with the user-selected analysis routine(s), as described above.
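One way to picture normalized score portions is the sketch below. It assumes that each analysis routine contributes a raw score per data record and that normalization divides by the largest raw score of that routine; both the normalization rule and the names `normalize_portions`, `rank`, `routine_a`, `routine_b`, `r1` and `r2` are hypothetical.

```python
def normalize_portions(raw):
    """Scale each routine's score portions to the range [0, 1].

    raw: {routine_name: {record_id: raw_score}}
    """
    normalized = {}
    for routine, scores in raw.items():
        top = max(scores.values(), default=0)
        normalized[routine] = {
            rid: (s / top if top > 0 else 0.0) for rid, s in scores.items()
        }
    return normalized


def rank(normalized, routines=None):
    """Order record ids by the sum of the selected routines' portions."""
    if routines is None:
        routines = list(normalized)       # default: all identified routines
    totals = {}
    for routine in routines:
        for rid, portion in normalized[routine].items():
            totals[rid] = totals.get(rid, 0.0) + portion
    return sorted(totals, key=lambda rid: totals[rid], reverse=True)


raw = {
    "routine_a": {"r1": 1000, "r2": 900},   # large raw scale
    "routine_b": {"r1": 1, "r2": 4},        # small raw scale
}
portions = normalize_portions(raw)
print(rank(portions))                  # all routines considered
print(rank(portions, ["routine_a"]))   # only a user-selected routine
```

Because the portions are stored separately per routine, the same stored values can be reused when a user restricts the sorting to selected analysis routines.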
- In FIG. 6C , another embodiment of a method 695 for determining suitability scores for data records of a data query result (e.g., the data query result of Table XIII) according to step 650 of FIG. 6A is shown.
- the method 695 starts at step 658 , where a loop consisting of steps 658 , 659 , 691 , 692 and 693 is entered for each analysis routine that is identified at step 640 of FIG. 6A .
- the loop is entered for a given analysis routine.
- a result set qualifier which is associated with the given analysis routine is identified.
- the result set qualifier indicates a preferred range of input values for a possible input field of the given analysis routine.
- a given result set qualifier may define that a given data record having Hemoglobin test values between 20 and 50 is suitable as input to the given analysis routine.
- the preferred range of input values for the possible input field is determined from the identified result set qualifier.
- the value range [20; 50] is determined.
- a distribution of values spread over the preferred range of input values is determined.
- a number of values of the preferred range of input values is identified such that the identified values are uniformly spread over the preferred range of input values.
- the distribution of values may be determined programmatically to include each integer number in the value range [20; 50].
- a predefined number can be provided for determination of the distribution of values.
- the predefined number can be provided with the issued query (i.e., the exemplary query of Table XVIII).
- the predefined number of values can be identified from the preferred range of values to define the distribution of values. For instance, assume that in the given example the predefined number is “11”. Accordingly, “11” uniformly spread values of the preferred range of values [20; 50] are determined for the distribution of values. By way of example, the values 20, 23, 26, 29, 32, 35, 38, 41, 44, 47, 50 are identified.
- a unique temporary row qualifier is created for each identified value of the distribution of values.
- each unique temporary row qualifier is processed similarly to a row qualifier (as identified at step 652 of FIG. 6B ). However, in one embodiment, if a match is determined for a possible input field and/or preferred input value of a given temporary row qualifier at step 654 , the given temporary row qualifier is deleted.
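The transformation of a result set qualifier into temporary row qualifiers can be sketched as follows. The function names, the `tolerance` parameter used to decide what counts as a match, and the sample record values are assumptions for illustration; the range [20; 50] and the predefined number 11 are taken from the example above.

```python
def spread_values(low, high, count):
    """Return `count` values uniformly spread over the range [low, high]."""
    if count < 2:
        return [low]
    step = (high - low) / (count - 1)
    return [low + i * step for i in range(count)]


def score_with_temporary_qualifiers(values, targets, tolerance):
    """Give one point per record that matches an unused temporary qualifier.

    Each target value stands for one temporary row qualifier; a qualifier
    is deleted as soon as it has matched a record.
    """
    remaining = list(targets)
    scores = []
    for value in values:
        match = next(
            (t for t in remaining if abs(value - t) <= tolerance), None
        )
        if match is None:
            scores.append(0)
        else:
            scores.append(1)
            remaining.remove(match)   # the qualifier is used only once
    return scores


targets = spread_values(20, 50, 11)
# Hypothetical record values; the second 23 finds no unused qualifier
scores = score_with_temporary_qualifiers([23, 23, 29, 45], targets, tolerance=1)
print(targets)
print(scores)
```

Deleting a temporary qualifier after its first match means that several records with nearly identical values cannot all be boosted, which pushes the top of the sorted list toward an even spread over the preferred range.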
- queries issued by a suitable requesting entity can be abstract queries formulated on the basis of a data abstraction model (e.g., data abstraction model 132 of FIG. 1 ).
- An abstract query can be transformed by a suitable runtime component (e.g., runtime component 134 of FIG. 1 ) into a concrete query having a form consistent with the physical representation of data contained in an underlying database (e.g., data 136 in database 139 of FIG. 1 ).
- the concrete queries can be executed by the runtime component against the database.
- An exemplary data abstraction model, creation of abstract queries and operation of an exemplary runtime component are further described below with reference to FIGS. 7-10 .
- the data abstraction model 132 defines logical fields corresponding to physical entities of data in a database (e.g., data 136 in database 139 ), thereby providing a logical representation of the data.
- a specific logical representation having specific logical fields can be provided for each database table.
- all specific logical representations together constitute the data abstraction model 132 .
- the physical entities of the data are arranged in the database according to a physical representation of the data in the database.
- a different single data abstraction model is provided for each separate physical representation 714 , as explained above for the case of a relational database environment.
- a single data abstraction model 132 contains field specifications (with associated access methods) for two or more physical representations 714 .
- the application query specification 122 of FIG. 1 specifies one or more logical fields to compose a resulting query 702 .
- a requesting entity (e.g., the requesting application 124 ) issues the resulting query 702 as defined by an application query specification of the requesting entity.
- the abstract query 702 may include both criteria used for data selection and an explicit specification of result fields to be returned based on the data selection criteria.
- An example of the selection criteria and the result field specification of the abstract query 702 is shown in FIG. 8 .
- the abstract query 702 illustratively includes selection criteria 804 and a result field specification 806 .
- the resulting query 702 is generally referred to herein as an “abstract query” because the query is composed according to abstract (i.e., logical) fields rather than by direct reference to the underlying physical data entities in the database.
- abstract queries may be defined that are independent of the particular underlying physical data representation used.
- the abstract query is transformed into a concrete query consistent with the underlying physical representation of the data using the data abstraction model 132 .
- the data abstraction model 132 exposes information as a set of logical fields that may be used within an abstract query to specify criteria for data selection and specify the form of result data returned from a query operation.
- the logical fields are defined independently of the underlying physical representation being used in the database, thereby allowing abstract queries to be formed that are loosely coupled to the underlying physical representation.
- the data abstraction model 132 comprises a plurality of field specifications 808 1 , 808 2 , 808 3 , 808 4 and 808 5 (five shown by way of example), collectively referred to as the field specifications 808 .
- a field specification is provided for each logical field available for composition of an abstract query.
- Each field specification may contain one or more attributes.
- the field specifications 808 include a logical field name attribute 820 1 , 820 2 , 820 3 , 820 4 , 820 5 (collectively, field name 820 ) and an associated access method attribute 822 1 , 822 2 , 822 3 , 822 4 , 822 5 (collectively, access methods 822 ).
- Each attribute may have a value.
- logical field name attribute 820 1 has the value “FirstName”
- access method attribute 822 1 has the value “Simple”.
- each attribute may include one or more associated abstract properties. Each abstract property describes a characteristic of a data structure and has an associated value.
- a data structure refers to a part of the underlying physical representation that is defined by one or more physical entities of the data corresponding to the logical field.
- an abstract property may represent data location metadata abstractly describing a location of a physical data entity corresponding to the data structure, like a name of a database table or a name of a column in a database table.
- the access method attribute 822 includes data location metadata “Table” and “Column”.
- data location metadata “Table” has the value “contact”
- data location metadata “Column” has the value “f_name”. Accordingly, assuming an underlying relational database schema in the present example, the values of data location metadata “Table” and “Column” point to a table “contact” having a column “f_name”.
- the data abstraction model 132 includes a plurality of category specifications 810 1 and 810 2 (two shown by way of example), collectively referred to as the category specifications.
- a category specification is provided for each logical grouping of two or more logical fields.
- logical fields 808 1-3 and 808 4-5 are part of the category specifications 810 1 and 810 2 , respectively.
- a category specification is also referred to herein simply as a “category”.
- the categories are distinguished according to a category name, e.g., category names 830 1 and 830 2 (collectively, category name(s) 830 ).
- the logical fields 808 1-3 are part of the “Name and Address” category
- logical fields 808 4-5 are part of the “Birth and Age” category.
- the access methods 822 generally associate (i.e., map) the logical field names to data in the database (e.g., database 139 of FIG. 1 ). Any number of access methods is contemplated depending upon the number of different types of logical fields to be supported. In one embodiment, access methods for simple fields, filtered fields and composed fields are provided.
- the field specifications 808 1 , 808 2 and 808 5 exemplify simple field access methods 822 1 , 822 2 , and 822 5 , respectively. Simple fields are mapped directly to a particular entity in the underlying physical representation (e.g., a field mapped to a given database table and column). By way of illustration, as described above, the simple field access method 822 1 shown in FIG.
- the field specification 808 3 exemplifies a filtered field access method 822 3 .
- Filtered fields identify an associated physical entity and provide filters used to define a particular subset of items within the physical representation.
- An example is provided in FIG. 8 in which the filtered field access method 822 3 maps the logical field name 820 3 (“AnyTownLastName”) to a physical entity in a column named “l_name” in a table named “contact” and defines a filter for individuals in the city of “Anytown”.
- Another example of a filtered field is a New York ZIP code field that maps to the physical representation of ZIP codes and restricts the data only to those ZIP codes defined for the state of New York.
- the field specification 808 4 exemplifies a composed field access method 822 4 .
- Composed access methods compute a logical field from one or more physical fields using an expression supplied as part of the access method definition. In this way, information which does not exist in the underlying physical data representation may be computed.
- the composed field access method 822 4 maps the logical field name 820 4 “AgeInDecades” to “AgeInYears/10”.
- Another example is a sales tax field that is composed by multiplying a sales price field by a sales tax rate.
- the formats for any given data type may vary.
- the field specifications 808 include a type attribute which reflects the format of the underlying data.
- the data format of the field specifications 808 is different from the associated underlying physical data, in which case a conversion of the underlying physical data into the format of the logical field is required.
- the field specifications 808 of the data abstraction model 132 shown in FIG. 8 are representative of logical fields mapped to data represented in the relational data representation 714 2 shown in FIG. 7 .
- other instances of the data abstraction model 132 map logical fields to other physical representations, such as XML.
- the abstract query shown in Table XX includes a selection specification (lines 004-008) containing selection criteria and a results specification (lines 009-013).
- a result specification is a list of abstract fields that are to be returned as a result of query execution.
- a result specification in the abstract query may consist of a field name and sort criteria.
- DAM: data abstraction model
- lines 004-008 correspond to the first field specification 808 1 of the DAM 132 shown in FIG. 8 and lines 009-013 correspond to the second field specification 808 2 .
- the method 900 is entered at step 902 when the runtime component receives as input an abstract query (such as the abstract query shown in Table XX).
- the runtime component reads and parses the abstract query and locates individual selection criteria and desired result fields.
- the runtime component enters a loop (comprising steps 906 , 908 , 910 and 912 ) for processing each query selection criteria statement present in the abstract query, thereby building a data selection portion of a concrete query.
- the runtime component uses the field name from a selection criterion of the abstract query to look up the definition of the field in the data abstraction model 132 of FIG. 1 .
- the field definition includes a definition of the access method used to access the physical data associated with the field.
- the runtime component then builds (step 910 ) a concrete query contribution for the logical field being processed.
- a concrete query contribution is a portion of a concrete query that is used to perform data selection based on the current logical field.
- a concrete query is a query represented in languages like SQL and XML Query and is consistent with the data of a given physical data repository (e.g., a relational database or XML repository). Accordingly, the concrete query is used to locate and retrieve data from the physical data repository, represented by the database 139 shown in FIG. 1 . The concrete query contribution generated for the current field is then added to a concrete query statement. The method 900 then returns to step 906 to begin processing for the next field of the abstract query. Accordingly, the process entered at step 906 is iterated for each data selection field in the abstract query, thereby contributing additional content to the eventual query to be performed.
- After building the data selection portion of the concrete query, the runtime component identifies the information to be returned as a result of query execution.
- the abstract query defines a list of logical fields that are to be returned as a result of query execution, referred to herein as a result specification.
- a result specification in the abstract query may consist of a field name and sort criteria. Accordingly, the method 900 enters a loop at step 914 (defined by steps 914 , 916 , 918 and 920 ) to add result field definitions to the concrete query being generated.
- the runtime component looks up a result field name (from the result specification of the abstract query) in the data abstraction model 132 and then retrieves a result field definition from the data abstraction model 132 to identify the physical location of data to be returned for the current logical result field.
- the runtime component then builds (at step 918 ) a concrete query contribution (of the concrete query that identifies physical location of data to be returned) for the logical result field.
- the concrete query contribution is then added to the concrete query statement.
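The two loops of the method 900 (selection criteria first, result fields second) can be sketched as below for a relational target. This is a simplified, hypothetical stand-in: the model maps each logical field directly to a table and column, only simple access methods are handled, and the logical field "LastName", its column, the selection value "Mary" and the function name `to_concrete_query` are assumptions that do not come from Table XX.

```python
# Simplified stand-in for a data abstraction model:
# logical field name -> (table, column). "LastName" is hypothetical.
MODEL = {
    "FirstName": ("contact", "f_name"),
    "LastName": ("contact", "l_name"),
}

ALLOWED_OPERATORS = {"=", "<", ">", "<=", ">=", "<>"}


def to_concrete_query(abstract_query, model=MODEL):
    """Build a parameterized SQL string from a tiny abstract query."""
    where, params, tables = [], [], []

    # First loop: one concrete contribution per selection criterion
    for field, operator, value in abstract_query["selection"]:
        if operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {operator}")
        table, column = model[field]       # look up the field definition
        where.append(f"{table}.{column} {operator} ?")
        params.append(value)
        if table not in tables:
            tables.append(table)

    # Second loop: one concrete contribution per result field
    select = []
    for field in abstract_query["results"]:
        table, column = model[field]
        select.append(f"{table}.{column} AS {field}")
        if table not in tables:
            tables.append(table)

    sql = f"SELECT {', '.join(select)} FROM {', '.join(tables)}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


abstract = {
    "selection": [("FirstName", "=", "Mary")],
    "results": ["FirstName", "LastName"],
}
sql, params = to_concrete_query(abstract)
print(sql)
print(params)
```

Keeping the selection values as bound parameters rather than splicing them into the SQL text is a design choice of this sketch, not something the description prescribes.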
- At step 1002 , the method 1000 queries whether the access method associated with the current logical field is a simple access method. If so, the concrete query contribution is built (step 1004 ) based on physical data location information and processing then continues according to method 900 described above. Otherwise, processing continues to step 1006 to query whether the access method associated with the current logical field is a filtered access method. If so, the concrete query contribution is built (step 1008 ) based on physical data location information for some physical data entity. At step 1010 , the concrete query contribution is extended with additional logic (filter selection) used to subset data associated with the physical data entity. Processing then continues according to method 900 described above.
- At step 1012 , the method 1000 queries whether the access method is a composed access method. If the access method is a composed access method, the physical data location for each sub-field reference in the composed field expression is located and retrieved at step 1014 . At step 1016 , the physical field location information of the composed field expression is substituted for the logical field references of the composed field expression, whereby the concrete query contribution is generated. Processing then continues according to method 900 described above.
- Step 1018 is representative of any other access method types contemplated as embodiments of the present invention. However, it should be understood that embodiments are contemplated in which fewer than all the available access methods are implemented. For example, in a particular embodiment only simple access methods are used. In another embodiment, only simple access methods and filtered access methods are used.
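The dispatch on access method type performed by the method 1000 can be sketched as follows. The field names "FirstName", "AnyTownLastName" and "AgeInDecades", the table "contact", the column "f_name" and the expression "AgeInYears/10" come from the description of FIG. 8; everything else is an assumption made for illustration, including the dictionary layout, the function name `build_contribution`, the column name "l_name", the SQL form of the "Anytown" filter and the physical location of "AgeInYears".

```python
import re

FIELD_SPECS = {
    "FirstName": {"access": "Simple", "table": "contact", "column": "f_name"},
    "AnyTownLastName": {
        "access": "Filtered",
        "table": "contact",
        "column": "l_name",                      # assumed column name
        "filter": "contact.city = 'Anytown'",    # assumed filter syntax
    },
    # Hypothetical physical location for AgeInYears
    "AgeInYears": {"access": "Simple", "table": "contact", "column": "age"},
    "AgeInDecades": {"access": "Composed", "expression": "AgeInYears/10"},
}


def physical_location(spec):
    return f"{spec['table']}.{spec['column']}"


def build_contribution(field_name, specs=FIELD_SPECS):
    """Return (expression, filter) for one logical field."""
    spec = specs[field_name]
    access = spec["access"]
    if access == "Simple":
        # Built from physical data location information only
        return physical_location(spec), None
    if access == "Filtered":
        # Physical location extended with filter selection logic
        return physical_location(spec), spec["filter"]
    if access == "Composed":
        # Substitute physical locations for logical field references
        def substitute(match):
            name = match.group(0)
            if name not in specs:
                return name
            return build_contribution(name, specs)[0]

        expression = re.sub(r"[A-Za-z_]\w*", substitute, spec["expression"])
        return expression, None
    # Placeholder for any other access method type
    raise ValueError(f"unsupported access method: {access}")


for name in ("FirstName", "AnyTownLastName", "AgeInDecades"):
    print(name, "->", build_contribution(name))
```

An embodiment that supports only simple access methods would correspond to keeping just the first branch and rejecting every other type.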
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
- This application is related to commonly owned, co-pending U.S. patent application Ser. No. _entitled “SYSTEM AND METHOD FOR SORTING DATA RECORDS CONTAINED IN A QUERY RESULT”, filed herewith (Attorney Docket No. ROC920040311 US1), and U.S. patent application Ser. No. 10/083,075, filed Feb. 26, 2002, entitled “APPLICATION PORTABILITY AND EXTENSIBILITY THROUGH DATABASE SCHEMA AND QUERY ABSTRACTION”, which are both incorporated herein in their entirety.
- 1. Field of the Invention
- The present invention generally relates to managing query results and, more particularly, to ordering data records contained in a query result obtained in response to execution of a query against a database.
- 2. Description of the Related Art
- Databases are computerized information storage and retrieval systems. A relational database management system is a computer database management system (DBMS) that uses relational techniques for storing and retrieving data. The most prevalent type of database is the relational database, a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways. A distributed database is one that can be dispersed or replicated among different points in a network. An object-oriented programming database is one that is congruent with the data defined in object classes and subclasses.
- Regardless of the particular architecture, a DBMS can be structured to support a variety of different types of operations for a requesting entity (e.g., an application, the operating system or an end user). Such operations can be configured to retrieve, add, modify and delete information being stored and managed by the DBMS. Standard database access methods support these operations using high-level query languages, such as the Structured Query Language (SQL). The term “query” denominates a set of commands that cause execution of operations for processing data from a stored database. For instance, SQL supports four types of query operations, i.e., SELECT, INSERT, UPDATE and DELETE. A SELECT operation retrieves data from a database, an INSERT operation adds new data to a database, an UPDATE operation modifies data in a database and a DELETE operation removes data from a database.
- Processing queries and query results can consume significant system resources, particularly processor resources. Furthermore, one difficulty when dealing with large query results, i.e., query results including a large amount of data, is to identify relevant information therefrom.
- A number of techniques have been employed to deal with this difficulty. For instance, query languages generally provide some functionality for ordering query results so that retrieval of relevant information can be simplified. In SQL, for example, an ORDER BY clause can be used to order rows of a given query result presented in a tabular form according to an ascending or descending order of data contained in a user-selected column of the query result. Furthermore, a given query result can be represented graphically to outline the information conveyed by the query result. However, such techniques still require a significant amount of user interaction to identify the relevant information, especially from large query results. Thus, these techniques are an ineffective means to support users in easily and rapidly identifying relevant information from query results.
- Therefore, there is a need for an efficient technique for presenting query results to users in order to simplify identification of relevant information therefrom.
- The present invention is generally directed to a method, system and article of manufacture for managing query results and, more particularly, for ordering data records contained in a query result obtained in response to execution of a query against a database.
- One embodiment provides a computer-implemented method of ordering query results comprising, in response to a query issued by a requesting entity: (a) receiving a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, accessing a data source to retrieve information related to the received list of data records; (c) sorting the received list of data records on the basis of the retrieved information; and (d) outputting the sorted list of data records.
- Another embodiment provides a computer-implemented method of ordering query results comprising, in response to a query issued by a requesting entity: (a) receiving a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, determining a value variance for each data record in the list, the value variance of a given data record indicating a relative proximity between a predefined value and a corresponding value of the given data record; (c) sorting the received list of data records on the basis of the determined value variances; and (d) outputting the sorted list of data records.
- Still another embodiment provides a computer-implemented method of ordering query results comprising, in response to a query issued by a requesting entity: (a) receiving a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, identifying a subset of the data records in the list to satisfy a requested value range coverage, the requested value range coverage being defined by a predefined maximum number of data records of the list to be output, each having a corresponding value within a predefined value range; (c) sorting the received list of data records on the basis of the requested value range coverage; and (d) outputting the sorted list of data records.
- Still another embodiment provides a computer-readable medium containing a program which, when executed by a processor, performs operations for ordering query results. The operations comprise, in response to a query issued by a requesting entity: (a) receiving a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, accessing a data source to retrieve information related to the received list of data records; (c) sorting the received list of data records on the basis of the retrieved information; and (d) outputting the sorted list of data records.
- Still another embodiment provides a computer-readable medium containing a program which, when executed by a processor, performs operations for ordering query results. The operations comprise, in response to a query issued by a requesting entity: (a) receiving a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, determining a value variance for each data record in the list, the value variance of a given data record indicating a relative proximity between a predefined value and a corresponding value of the given data record; (c) sorting the received list of data records on the basis of the determined value variances; and (d) outputting the sorted list of data records.
- Still another embodiment provides a computer-readable medium containing a program which, when executed by a processor, performs operations for ordering query results. The operations comprise, in response to a query issued by a requesting entity: (a) receiving a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, identifying a subset of the data records in the list to satisfy a requested value range coverage, the requested value range coverage being defined by a predefined maximum number of data records of the list to be output, each having a corresponding value within a predefined value range; (c) sorting the received list of data records on the basis of the requested value range coverage; and (d) outputting the sorted list of data records.
- Still another embodiment provides a computer system comprising a requesting entity, a data source residing in memory, and a sorting program for ordering query results obtained in response to a query issued by the requesting entity. The sorting program is configured to: (a) receive a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, access the data source to retrieve information related to the received list of data records; (c) sort the received list of data records on the basis of the retrieved information; and (d) output the sorted list of data records.
- Still another embodiment provides a computer system comprising a requesting entity and a sorting program for ordering query results obtained in response to a query issued by the requesting entity. The sorting program is configured to: (a) receive a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, determine a value variance for each data record in the list, the value variance of a given data record indicating a relative proximity between a predefined value and a corresponding value of the given data record; (c) sort the received list of data records on the basis of the determined value variances; and (d) output the sorted list of data records.
- Yet another embodiment provides a computer system comprising a requesting entity and a sorting program for ordering query results obtained in response to a query issued by the requesting entity. The sorting program is configured to: (a) receive a list of data records ordered according to an initial order, the list of data records defining a result set for the query; (b) before outputting the result set, identify a subset of the data records in the list to satisfy a requested value range coverage, the requested value range coverage being defined by a predefined maximum number of data records of the list to be output, each having a corresponding value within a predefined value range; (c) sort the received list of data records on the basis of the requested value range coverage; and (d) output the sorted list of data records.
- So that the manner in which the above recited features of the present invention are attained can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to the embodiments thereof which are illustrated in the appended drawings.
- It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.
- FIG. 1 is a data processing system illustratively utilized in accordance with the invention;
- FIGS. 2A-B are relational views of software components in one embodiment;
- FIG. 3 is a flow chart illustrating sorting of data records contained in a query result in one embodiment;
- FIG. 4 is a flow chart illustrating sorting of data records contained in a query result in another embodiment;
- FIGS. 5A-B are flow charts illustrating sorting of data records contained in a query result in still another embodiment;
- FIGS. 6A-C are flow charts illustrating sorting of data records contained in a query result in still another embodiment;
- FIGS. 7-8 are relational views of software components for query building support in one embodiment; and
- FIGS. 9-10 are flow charts illustrating the operation of a runtime component.
- Introduction
- The present invention is generally directed to a method, system and article of manufacture for managing query results and, more particularly, for sorting data records contained in a query result. According to one aspect, a user issues a query against a database. In response to execution of the query, a list of data records defining a query result is obtained. The data records in the received list of data records are ordered according to an initial order. The data records are then sorted to provide a re-ordered query result which intelligently conveys information contained in the query result to the user.
- In one embodiment, the sorting is performed on the basis of information which is related to the received list of data records. For instance, annotations associated with the data records in the list are retrieved from a suitable data source. For each data record in the list, a total number of associated annotations is counted. The total numbers can then be used as a basis for sorting the data records in the received list. By way of example, data records having the greatest total number of counted annotations can be placed on the top of a corresponding sorted list.
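The annotation-count ordering described above can be sketched in Python as follows. This is a minimal illustration, not the claimed implementation: the function name `sort_by_annotation_count`, the `(record_id, annotation_text)` pair layout, and the sample counts are assumptions introduced here for the example.

```python
from collections import Counter


def sort_by_annotation_count(record_ids, annotations):
    """Order record ids by descending number of associated annotations.

    record_ids: record identifiers in their initial order.
    annotations: iterable of (record_id, annotation_text) pairs
                 (a hypothetical layout for the related information).
    """
    counts = Counter(record_id for record_id, _ in annotations)
    # sorted() is stable, so records with equal counts keep their initial order.
    return sorted(record_ids, key=lambda rid: counts[rid], reverse=True)


# Hypothetical related information: 62, 15, 76 and 43 annotations respectively.
sample_counts = {"A": 62, "B": 15, "C": 76, "D": 43}
annotations = [
    (rid, f"note {i}") for rid, n in sample_counts.items() for i in range(n)
]
sorted_ids = sort_by_annotation_count(["A", "B", "C", "D"], annotations)
print("".join(sorted_ids))
```

Records without any annotation receive a count of zero and therefore fall to the end of the list while keeping their initial relative order.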
- In another embodiment, the sorting is performed on the basis of a value variance which is determined for each data record in the list. The value variance of a given data record indicates a relative proximity between a predefined value and a corresponding value of the given data record. For instance, a given query result may include data records having values which are included within a specific value range. The value range may include a center value which can be specified as the predefined value. The value variance of each of the values from the data records in the list with respect to the predefined value (i.e., the center value) can be determined. Accordingly, a relative proximity to the predefined value can be identified for each corresponding value of the data records. Thus, data records having values with a closest relative proximity to the predefined value can be placed on the top of a corresponding sorted list.
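As a rough sketch of this embodiment, the value variance can be modeled as the absolute distance between a record's value and the predefined value. The absolute-difference measure, the function name, the dictionary record layout, and the sample values below are all assumptions made for illustration; the text does not prescribe a particular distance measure.

```python
def sort_by_value_variance(records, value_of, predefined_value):
    """Order records so that values closest to predefined_value come first.

    value_of: callable extracting the numeric value from a record.
    The variance is assumed here to be the absolute difference.
    """
    return sorted(records, key=lambda r: abs(value_of(r) - predefined_value))


# Hypothetical result set with one numeric value per record.
records = [
    {"id": "A", "value": 9.0},
    {"id": "B", "value": 14.2},
    {"id": "C", "value": 12.1},
    {"id": "D", "value": 16.5},
]
center = 12.0  # e.g. the center of the value range of interest
by_variance = sort_by_value_variance(records, lambda r: r["value"], center)
print([r["id"] for r in by_variance])
```

Because the sort is stable, records that are equally distant from the predefined value retain their initial relative order.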
- In still another embodiment, the sorting is performed on the basis of a requested value range coverage. The requested value range coverage is defined by a predefined maximum number of data records of the list to be output according to a requested value distribution, each data record having a corresponding value within a predefined value range. For instance, a given query result may include a multiplicity of data records having corresponding values. All such corresponding values are spread over a given value range. From the multiplicity of data records, only a portion should be output according to a predefined maximum number in order to define a requested value distribution. The requested value distribution can be defined by any possible type of distribution, such as a uniform distribution (also referred to as “flat distribution”) and a normal distribution (also referred to as “bell curve”). A flat distribution consists of values that are evenly distributed between upper and lower bounds. A bell curve consists of values which are selected such that the frequency of selection is weighted towards a center, or average, value within upper and lower bounds. Accordingly, the predefined maximum number of data records is selected from the multiplicity of data records such that the corresponding values of the selected data records define the requested value distribution. Thus, the one or more selected data records can be placed on the top of a corresponding sorted list.
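One possible way to approximate a flat (uniform) value range coverage is sketched below. It is only an illustrative heuristic under stated assumptions: evenly spaced target values are laid across the range, and for each target the nearest not-yet-selected record is chosen. The function name, the greedy nearest-target selection, and the sample data are hypothetical and are not taken from the described embodiments.

```python
def sort_by_flat_coverage(records, value_of, lower, upper, max_records):
    """Place up to max_records records that spread evenly over [lower, upper]
    at the top of the list; all other records follow in their initial order."""
    candidates = [
        i for i, r in enumerate(records) if lower <= value_of(r) <= upper
    ]
    n = min(max_records, len(candidates))
    if n <= 0:
        return list(records)

    # Evenly spaced targets; a single target sits at the middle of the range.
    if n == 1:
        targets = [(lower + upper) / 2]
    else:
        step = (upper - lower) / (n - 1)
        targets = [lower + k * step for k in range(n)]

    selected = []
    for target in targets:
        # Greedy choice: nearest remaining candidate to this target value.
        best = min(candidates, key=lambda i: abs(value_of(records[i]) - target))
        selected.append(best)
        candidates.remove(best)

    chosen = set(selected)
    rest = [i for i in range(len(records)) if i not in chosen]
    return [records[i] for i in selected + rest]


# Hypothetical (id, value) records clustered near the lower bound.
rows = [("A", 10.1), ("B", 10.3), ("C", 15.2), ("D", 10.2), ("E", 19.8), ("F", 12.6)]
covered = sort_by_flat_coverage(rows, lambda r: r[1], 10, 20, 3)
print([r[0] for r in covered])
```

A bell-curve distribution could be approximated in the same way by spacing the targets more densely around the center of the range instead of evenly.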
- In still another embodiment, the sorting is performed on the basis of suitability scores which are determined with respect to available analysis routines. To this end, a suitability score is determined for each data record in the list. The suitability score for a given data record indicates a relative suitability of the given data record as input to one or more analysis routines. Thus, data records which are most suitable as input to the one or more analysis routines can be placed on the top of a corresponding sorted list.
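The following sketch shows one way such a suitability score could be computed. The scoring rule is an assumption made purely for illustration: each analysis routine is modeled as a list of fields it requires, and a record's score is the number of routines whose required fields are all present. The routine names, field names, and record layout are hypothetical.

```python
def suitability_score(record, routines):
    """Count the routines for which the record supplies every required field.

    routines: mapping of routine name -> list of required field names
              (a hypothetical description of the available analysis routines).
    """
    return sum(
        1
        for required in routines.values()
        if all(record.get(field) is not None for field in required)
    )


def sort_by_suitability(records, routines):
    """Order records by descending suitability score (stable for ties)."""
    return sorted(
        records, key=lambda r: suitability_score(r, routines), reverse=True
    )


routines = {
    "trend_analysis": ["hemoglobin", "test_date"],
    "age_correlation": ["hemoglobin", "age"],
}
patients = [
    {"id": "A", "hemoglobin": 13.5},
    {"id": "B", "hemoglobin": 12.1, "test_date": "2004-05-01", "age": 47},
    {"id": "C", "hemoglobin": 15.0, "age": 62},
]
by_suitability = sort_by_suitability(patients, routines)
print([p["id"] for p in by_suitability])
```

Other scoring rules, such as weighting routines differently or scoring partial field coverage, would fit the same sorting step.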
- It is noted that particular embodiments described herein may refer to re-ordering of specific requested data. For example, embodiments may be described with reference to re-ordering of query results obtained in response to execution of queries against databases. However, references to re-ordering of query results are merely for purposes of illustration and not limiting of the invention. More broadly, re-ordering of any suitable data received in a list form in response to a request for the data (whether or not the request be a query, per se) is contemplated.
- In the following, reference is made to embodiments of the invention. However, it should be understood that the invention is not limited to specific described embodiments. Instead, any combination of the following features and elements, whether related to different embodiments or not, is contemplated to implement and practice the invention. Furthermore, in various embodiments the invention provides numerous advantages over the prior art. However, although embodiments of the invention may achieve advantages over other possible solutions and/or over the prior art, whether or not a particular advantage is achieved by a given embodiment is not limiting of the invention. Thus, the following aspects, features, embodiments and advantages are merely illustrative and, unless explicitly present, are not considered elements or limitations of the appended claims.
- One embodiment of the invention is implemented as a program product for use with a computer system such as, for example, computer system 110 shown in FIG. 1 and described below. The program(s) of the program product defines functions of the embodiments (including the methods described herein) and can be contained on a variety of signal-bearing media. Illustrative signal-bearing media include, but are not limited to: (i) information permanently stored on non-writable storage media (e.g., read-only memory devices within a computer such as CD-ROM disks readable by a CD-ROM drive); (ii) alterable information stored on writable storage media (e.g., floppy disks within a diskette drive or hard-disk drive); or (iii) information conveyed to a computer by a communications medium, such as through a computer or telephone network, including wireless communications. The latter embodiment specifically includes information downloaded from the Internet and other networks. Such signal-bearing media, when carrying computer-readable instructions that direct the functions of the present invention, represent embodiments of the present invention.
- In general, the routines executed to implement the embodiments of the invention, may be part of an operating system or a specific application, component, program, module, object, or sequence of instructions. The software of the present invention typically is comprised of a multitude of instructions that will be translated by the native computer into a machine-readable format and hence executable instructions. Also, programs are comprised of variables and data structures that either reside locally to the program or are found in memory or on storage devices. In addition, various programs described hereinafter may be identified based upon the application for which they are implemented in a specific embodiment of the invention. However, it should be appreciated that any particular nomenclature that follows is used merely for convenience, and thus the invention should not be limited to use solely in any specific application identified and/or implied by such nomenclature. Further, it is understood that while reference may be made to particular query languages, including SQL, the invention is not limited to a particular language, standard or version. Accordingly, persons skilled in the art will recognize that the invention is adaptable to other query languages and that the invention is also adaptable to future changes in a particular query language as well as to other query languages presently unknown.
- Referring now to FIG. 1, a computing environment 100 is shown. In general, the distributed environment 100 includes computer system 110 and a plurality of networked devices 146. The computer system 110 may represent any type of computer, computer system or other programmable electronic device, including a client computer, a server computer, a portable computer, an embedded controller, a PC-based server, a minicomputer, a midrange computer, a mainframe computer, and other computers adapted to support the methods, apparatus, and article of manufacture of the invention. In one embodiment, the computer system 110 is an eServer computer available from International Business Machines of Armonk, N.Y.
- Illustratively, the computer system 110 comprises a networked system. However, the computer system 110 may also comprise a standalone device. In any case, it is understood that FIG. 1 is merely one configuration for a computer system. Embodiments of the invention can apply to any comparable configuration, regardless of whether the computer system 110 is a complicated multi-user apparatus, a single-user workstation, or a network appliance that does not have non-volatile storage of its own.
- The embodiments of the present invention may also be practiced in distributed computing environments in which tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices. In this regard, the computer system 110 and/or one or more of the networked devices 146 may be thin clients which perform little or no processing.
- The computer system 110 could include a number of operators and peripheral systems as shown, for example, by a mass storage interface 137 operably connected to a direct access storage device 138, by a video interface 140 operably connected to a display 142, and by a network interface 144 operably connected to the plurality of networked devices 146. The display 142 may be any video output device for outputting viewable information.
- Computer system 110 is shown comprising at least one processor 112, which obtains instructions and data via a bus 114 from a main memory 116. The processor 112 could be any processor adapted to support the methods of the invention. The main memory 116 is any memory sufficiently large to hold the necessary programs and data structures. Main memory 116 could be one or a combination of memory devices, including Random Access Memory, nonvolatile or backup memory, (e.g., programmable or Flash memories, read-only memories, etc.). In addition, memory 116 may be considered to include memory physically located elsewhere in the computer system 110, for example, any storage capacity used as virtual memory or stored on a mass storage device (e.g., direct access storage device 138) or on another computer coupled to the computer system 110 via bus 114.
- The memory 116 is shown configured with an operating system 118. The operating system 118 is the software used for managing the operation of the computer system 110. Examples of the operating system 118 include IBM OS/400®, UNIX, Microsoft Windows®, and the like.
- The memory 116 further includes one or more applications 120 and an abstract model interface 130. The applications 120 and the abstract model interface 130 are software products comprising a plurality of instructions that are resident at various times in various memory and storage devices in the computer system 110. When read and executed by one or more processors 112 in the computer system 110, the applications 120 and the abstract model interface 130 cause the computer system 110 to perform the steps necessary to execute steps or elements embodying the various aspects of the invention.
- Illustratively, the applications 120 include an application query specification 122, one or more requesting applications 124, each having a sorting program 126, and analysis routines 180. The requesting application(s) 124 (and more generally, any requesting entity, including the operating system 118) is configured to issue queries against data 136 in a database 139. Illustratively, the database 139 is shown as part of a database management system (DBMS) 154 in storage 138. The database 139 is representative of any collection of data regardless of the particular physical representation of the data. A physical representation of data defines an organizational schema of the data. By way of illustration, the database 139 may be organized according to a relational schema (accessible by SQL queries) or according to an XML schema (accessible by XML queries). However, the invention is not limited to a particular schema and contemplates extension to schemas presently unknown. As used herein, the term “schema” generically refers to a particular arrangement of data.
- The queries issued by the requesting application(s) 124 are defined according to the application query specification 122 included with each requesting application 124. The queries issued by the requesting application(s) 124 may be predefined (i.e., hard coded as part of the requesting application(s) 124) or may be generated in response to input (e.g., user input). In either case, the queries (referred to herein as “abstract queries”) can be composed using logical fields defined by the abstract model interface 130. A logical field defines an abstract view of data whether as an individual data item or a data structure in the form of, for example, a database table. In particular, the logical fields used in the abstract queries are defined by a data abstraction model component 132 of the abstract model interface 130. A runtime component 134 transforms the abstract queries into concrete queries having a form consistent with the physical representation of the data contained in the database 139. The concrete queries can be executed by the runtime component 134 against the database 139. Operation of the runtime component 134 is further described below with reference to FIGS. 7-10.
- It should be noted that embodiments of the present invention are explained below, by way of example, with reference to abstract queries which are created on the basis of a corresponding data abstraction model. However, other embodiments can be implemented using other types of queries and database representations, such as SQL or XML queries issued against data in databases having an underlying relational or XML data representation. Accordingly, the present invention is not limited to a particular query environment, including abstract queries and data abstraction models, and various different query environments and implementations are broadly contemplated.
- In one embodiment, a result set is obtained from the data 136 in response to execution of a given query against the database 139. The result set defines a query result which is ordered according to an initial order. Using functions which are invoked by the sorting program(s) 126 of the requesting application(s) 124, the query result can be re-ordered to simplify retrieval of relevant information therefrom. Specifically, the query result can be re-ordered to facilitate retrieval of relevant information required for subsequent processing of the query result using one or more of the analysis routines 180. Operation and interaction of the requesting application(s) 124 and the analysis routines 180 are further described below with reference to FIGS. 2A-6C.
- It should be noted that the sorting program(s) 126 are illustrated as an integral part of the requesting application(s) 124 for purposes of illustration. However, it should be noted that the sorting program(s) 126 can be implemented as separate application(s) which is independent of the requesting application(s) 124. Accordingly, any suitable implementation of the requesting application(s) 124 and the sorting program(s) 126 is broadly contemplated.
- Referring now to
FIG. 2A , a block diagram of a computing environment for re-ordering of requested data in one embodiment is shown. Illustratively, the computing environment includes the requesting application(s) 124 having the sorting program(s) 126, thedatabase 139 having thedata 136, thedisplay device 142 and theanalysis routines 180 ofFIG. 1 , as well as auser interface 210. - By way of example, the requesting application(s) 124 issues a data request 220 (e.g., a query) against the
data 136 in thedatabase 139. In one embodiment, thedata request 220 is created by a user using theuser interface 210. The data request 220 is executed against thedatabase 139 to obtain a corresponding result set of data from the data 136 (e.g., a query result) using theDBMS 154. - In response to the
data request 220, requesteddata 230 defining the corresponding result set is identified from thedata 136. The requesteddata 230 is ordered according to an initial order and returned to the requesting application(s) 124. In one embodiment, the requesteddata 230 is presented as an ordered list of data records. - Using functions invoked by the sorting program(s) 126, the requested
data 230 is re-ordered, i.e., the data records in the ordered list are sorted in order to reduce the complexity of retrieving relevant information therefrom. Sorting the data records of a received list of data records according to predefined criteria is described in more detail below with reference toFIGS. 2B-6C . After sorting the data records in the ordered list, the sorted requesteddata 240 is output to thedisplay device 142 for display. Accordingly, the sorted requesteddata 240 can be presented on thedisplay device 142 as a sorted list of data records. The sorted requesteddata 240 can subsequently be processed using one or more of theanalysis routines 180. - In one embodiment, the sorting program(s) 126 sorts the data records in the ordered list on the basis of
related information 258 which is retrieved from a correspondingdata source 252. In order to retrieve therelated information 258, the sorting program(s) 126 issues asort request 222 against thedata source 252. Thesort request 252 is executed against thedata source 252 using aDBMS 256 which manages thedata source 252. - According to one aspect, the requesting application(s) 124 and the sorting program(s) 126 are implemented as a single, integrated software product resident at the server-side or the client-side. Furthermore, the sorting can be done by a client-side application (e.g., the requesting application(s) 124 having the sorting program(s) 126) and the requested
data 230 is received from a server-side database (e.g., the database 139). However, it should be noted that alternative embodiments are contemplated. For instance, the sorting program(s) 126 can be implemented by a server-side application in which case the sorting can be done on a server machine having thedatabase 139. In another embodiment, the sorting program(s) 126 and thedatabase 139 can be resident on a common computer system. - Referring now to
FIG. 2B , exemplary functions which can be called by the sorting program(s) 126 ofFIG. 2A for re-ordering the requesteddata 230 are shown in more detail. Specifically, the sorting program(s) 126 can invoke various functions which are configured for pre- and post-processing of thedata request 220 and the requesteddata 230 ofFIG. 2A . - As was noted above with reference to
FIG. 2A , the requesteddata 230 is retrieved from thedata 136 in thedatabase 139 using theDBMS 154. According to one aspect, the requesteddata 230 is presented in a tabular form having a plurality of rows and columns. Illustratively, the plurality of rows includes rows “A”, “B”, “C” and “D”, and the plurality of columns includes columns “E”, “F”, “G” and “H”. By way of example, the plurality of rows is shown having an initial order “ABCD” and the plurality of columns is shown having an order “EFGH”. Each of the rows “A”, “B”, “C” and “D” represents a data record, so that the requesteddata 230 defines an ordered list of data records having the initial order “ABCD”. - The sorting program(s) 126 receives the ordered list of data records (i.e., the requested data 230) as input and sorts the data records “A”, “B”, “C” and “D” of the received list. Illustratively, the sorting program(s) 126 sorts the data records “A”, “B”, “C” and “D” of the requested
data 230 such that the sorted data records have the order “CADB” in the sorted requesteddata 240. After sorting the data records, the sorting program(s) 126 outputs the sorted list of the data records (i.e., the sorted requested data 240) to thedisplay device 142. Exemplary embodiments of operations for sorting data records of a received list of data records are described below with reference toFIGS. 3-6C . - More specifically, in the embodiment illustrated in FIB. 2B, the re-ordering of the requested
data 230 is performed on the basis of information (e.g.,related information 258 ofFIG. 2A ) which is related to the received list of data records “A”, “B”, “C” and “D” defining the requesteddata 230. In order to determine the related information, the sorting program(s) 126 invokes aninformation determination unit 250 having asub-query generator 257. It should be noted that theinformation determination unit 250 is represented as a separate unit only by way of example and not for limiting the invention accordingly. In other words, theinformation determination unit 250 can also be implemented as an integral part of the sorting program(s) 126 or some other suitable system program. - The
information determination unit 250 accesses a data source 252 to determine the related information therefrom. To this end, the sub-query generator 257 generates the sort request 222 of FIG. 2A which is issued against the data source 252 for retrieving the related information using the DBMS 256. In the illustrated example, the data source 252 includes annotations 254 associated with the data records “A”, “B”, “C” and “D”. However, it should be noted that annotations are merely one example of information related to data records. Any suitable related information including annotations can be used as a basis for re-ordering the requested data 230. More generally, any reference to the data records can be used as a basis for the re-ordering. Accordingly, all such suitable types of related information are broadly contemplated. Furthermore, the annotations themselves can be classified based on an organization type in which the creators of the annotations are working. For example, data records which have been annotated by individuals working in the same technological field of study can be preferred to general annotations. Moreover, the annotations can be ranked on the basis of hierarchical positions of the creators of the annotations. For instance, for a researcher who performs a study on a liver disease, annotations made by a chief site specialist are certainly preferred to those of assistants. - According to one aspect, the
information determination unit 250 counts a total number of associated annotations for each of the data records of the requested data 230. The counted total numbers are then used as a basis for sorting the data records. By way of example, assume that for the data record “C” a total number of 76 associated annotations is counted. Similarly, a total number of 62 annotations is counted for the data record “A”, a total number of 43 annotations is counted for the data record “D”, and a total number of 15 annotations is counted for the data record “B”. Assume further that the data record having the greatest total number of annotations is placed on the top of the sorted list of data records defining the sorted requested data 240. Accordingly, the data records “A”, “B”, “C” and “D” are programmatically sorted in the sorted requested data 240 according to the order “CADB”, as illustrated. An exemplary method for re-ordering the requested data 230 on the basis of information which is related to the received list of data records defining the requested data 230 is described below with reference to FIG. 3. - Referring now to
FIG. 3, one embodiment of a method 300 for re-ordering requested data (e.g., requested data 230 of FIG. 2B) on the basis of information (e.g., annotations 254 of FIG. 2B) which is related to the requested data is shown. The requested data is obtained in response to execution of a corresponding data request against data in a database (e.g., data 136 of database 139 of FIG. 2A). At least part of the steps of the method 300 can be performed by a suitable requesting entity (e.g., requesting application(s) 124 of FIG. 2A) and suitable functionalities of an associated sorting program(s) (e.g., sorting program(s) 126 of FIG. 2B). - By way of example, the
method 300 is explained with respect to a data request being implemented as a query against the data in the database for purposes of illustration. In this case, the requested data is a result set of data which defines a query result. -
Method 300 starts at step 310. At step 320, the query is issued by the suitable requesting entity. The issued query is executed against the data in the database. An exemplary query is shown in Table I below. For simplicity, the exemplary query of Table I is described in natural language without reference to a particular query language. Thus, it is understood that any suitable query language, known or unknown, can be used to create the query of Table I.

TABLE I: QUERY EXAMPLE
001  FIND
002  ID, Name, Age
003  SORT BY
004  number of associated annotations

- Illustratively, the exemplary query shown in Table I includes data selection criteria in lines 001-002. The data selection criteria include a result field specification (line 002) which specifies three result fields for which information is to be returned in the query result. Specifically, in line 002 the result fields “ID”, “Name” and “Age” are specified. The exemplary query further includes sorting criteria in lines 003-004. The sorting criteria indicate that all data records in the query result should be sorted according to counted numbers of annotations associated with the data records (line 004).
- Assume now that data related to the result fields “ID”, “Name” and “Age” of the exemplary query of Table I is included with a database table “Demographic”. An exemplary “Demographic” table is shown in Table II below.
TABLE II: EXEMPLARY DATABASE TABLE “DEMOGRAPHIC”
001  ID  Name   Age
002  3   Renee  24
003  1   Karl   54
004  2   Kris   49

- Illustratively, the exemplary database table “Demographic” includes an “ID”, “Name” and “Age” column. The “ID” column contains a unique identifier for each of the data records in lines 002-004. The “Name” column includes names of individuals and the “Age” column contains information about the age of the corresponding individuals.
- Assume further that the annotations required for the sorting of the query result are included with a database table named “Annotations”. An exemplary “Annotations” table is shown in Table III below.
TABLE III: EXEMPLARY DATABASE TABLE “ANNOTATIONS”
001  Note_ID  Patient_ID  Date    Comment
002  453      1           1/2/04  Karl has three broken toes
003  454      2           1/3/04  Kris has a bad sunburn
004  455      1           1/4/04  Karl has a cut finger

- Illustratively, the exemplary database table “Annotations” includes a “Note_ID”, “Patient_ID”, “Date” and “Comment” column. The “Note_ID” column contains a unique identifier for each of the data records in lines 002-004. The “Patient_ID” column includes patient identifiers according to the “ID” column of the “Demographic” table of Table II above. The “Date” column contains indications of dates on which, for example, a corresponding diagnosis has been established for a given patient. The “Comment” column contains annotations with respect to the established diagnoses.
- In one embodiment, executing the query at
step 320 includes generating a data query (e.g., data request 220 of FIG. 2A) and a sorting query (e.g., sort request 222 of FIG. 2A) using a suitable sub-query generator. More specifically, the issued query includes: (i) data selection criteria (e.g., data selection criteria in lines 001-002 of Table I) configured to select data records defining the query result from the data in the database, and (ii) sorting criteria (e.g., sorting criteria in lines 003-004 of Table I) configured to specify the information related to the data records of the query result. The data query is generated on the basis of the data selection criteria and the sorting query is generated on the basis of the sorting criteria. Accordingly, the data query is used to determine the query result in an initial order on the basis of the data selection criteria. The sorting query is used to sort the data records in the determined query result on the basis of the sorting criteria. In the given example, the data query shown in Table IV below can be generated from the exemplary query of Table I above using the suitable sub-query generator.

TABLE IV: DATA QUERY EXAMPLE
001  FIND
002  ID, Name, Age
003  FROM
004  Demographic

- Illustratively, the exemplary data query shown in Table IV includes data selection criteria in lines 001-002 which correspond to the data selection criteria in lines 001-002 of Table I. The exemplary data query further includes a specification of the database table which contains the requested data (lines 003-004), i.e., the “Demographic” table of Table II above. Assume for simplicity that the suitable sub-query generator retrieves the table named “Demographic” from the issued query of Table I above. Furthermore, the sorting query shown in Table V below can be generated from the exemplary query of Table I above.
TABLE V: SORTING QUERY EXAMPLE
001  FIND
002  Patient_ID, count(comment)
003  FROM
004  Annotations
005  GROUP BY
006  Patient_ID

- Illustratively, the exemplary sorting query shown in Table V includes data selection criteria in lines 001-002 for selection of the required related information, and a specification of the database table which contains the related information in lines 003-004, i.e., the “Annotations” table of Table III above. According to one aspect, a corresponding rule may indicate to the suitable sub-query generator (e.g.,
sub-query generator 257 of FIG. 2B) that the related information about annotations is contained in the database table named “Annotations”. Furthermore, the sorting criteria “SORT BY number of associated annotations” in lines 003-004 of Table I specify that the query result should be sorted with respect to a number of annotations associated with each data record contained in the query result. As the data records in the query result are identified by patient identifiers (“Patient_ID”), annotations (“comment” in line 002) with respect to patients which are identified by corresponding patient identifiers (“Patient_ID” in line 002) are retrieved. Furthermore, all retrieved annotations are counted for each data record associated with one of the retrieved patient identifiers (“count(comment)” in line 002). Moreover, a ranking of the counted retrieved annotations is established according to the sorting query of Table V by grouping the counted numbers of annotations (lines 005-006) per patient. - At
step 330, the query result in an initial order is received. In one embodiment, the query result is presented in a list form having a plurality of data records. In the given example, receiving the query result in the initial order corresponds to receiving a query result (hereinafter referred to as “data query result”) obtained in response to execution of the data query of Table IV against the exemplary “Demographic” table of Table II. Accordingly, the data query result shown in Table VI below is received.

TABLE VI: EXEMPLARY DATA QUERY RESULT
001  ID  Name   Age
002  3   Renee  24
003  1   Karl   54
004  2   Kris   49

- Note that in the given example the exemplary data query result of Table VI corresponds to the database table shown in Table II above.
- At
step 340, a data source (e.g., data source 252 of FIG. 2B) is accessed to retrieve annotations (e.g., one or more of the annotations 254 of FIG. 2B) for at least a portion of the data records contained in lines 002-004 of the data query result of Table VI above. At step 350, a total number of retrieved annotations is counted for each one of the data records. In the given example, steps 340 and 350 can be accomplished by executing the sorting query of Table V against the “Annotations” table of Table III. Accordingly, the exemplary query result (hereinafter referred to as “sorting query result”) shown in Table VII below is received.

TABLE VII: EXEMPLARY SORTING QUERY RESULT
001  Patient_ID  Number of Annotations
002  1           2
003  2           1

- Note that according to line 002 the patient identifier “1” has two associated annotations (in lines 002 and 004 of Table III). According to line 003, the patient identifier “2” has only one associated annotation (in line 003 of Table III).
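The counting performed by the sorting query of Table V can be sketched in Python as follows. This is only an illustration under stated assumptions: the rows are copied from Table III, while the helper name `count_annotations` and the use of `collections.Counter` in place of a DBMS are choices made for this sketch and are not part of the described system.

```python
from collections import Counter

# Rows of the "Annotations" table (Table III):
# (Note_ID, Patient_ID, Date, Comment)
annotations = [
    (453, 1, "1/2/04", "Karl has three broken toes"),
    (454, 2, "1/3/04", "Kris has a bad sunburn"),
    (455, 1, "1/4/04", "Karl has a cut finger"),
]

def count_annotations(rows):
    """Hypothetical helper: count annotations per Patient_ID,
    mirroring count(comment) ... GROUP BY Patient_ID."""
    return Counter(patient_id for _note_id, patient_id, _date, _comment in rows)

sorting_query_result = count_annotations(annotations)
```

Here `sorting_query_result` maps patient identifier 1 to two annotations and patient identifier 2 to one annotation, as in Table VII. A patient without annotations is simply absent from the mapping, and a `Counter` reports 0 when such a patient is looked up.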
- At
step 360, a ranking of the data records in lines 002-004 of the data query result in Table VI above is determined on the basis of the counted total numbers of retrieved associated annotations. To this end, the data query result of Table VI can be augmented with the exemplary sorting query result of Table VII, in the given example. Accordingly, the augmented query result shown in Table VIII below is obtained.

TABLE VIII: EXEMPLARY AUGMENTED QUERY RESULT
001  ID  Name   Age  Number of Annotations
002  3   Renee  24   0
003  1   Karl   54   2
004  2   Kris   49   1

- Note that in the given example the exemplary augmented query result of Table VIII corresponds to the data query result shown in Table VI above, wherein a column containing the counted numbers of annotations according to Table VII has been inserted. Furthermore, as can be seen from the “Number of Annotations” column, the following ranking can be established: (1) the data record of line 003 has the most associated annotations, (2) the data record in line 004 has the second most associated annotations, and (3) the data record in line 002 has no associated annotations at all.
- By way of example, the above ranking is performed on the basis of the counted numbers of annotations. However, as noted above, types and/or attributes of the annotations can also be considered when establishing the ranking. For instance, the annotations can be weighted based on an organization hierarchy or type of the creators of the annotations. Furthermore, the annotations can be weighted so that some annotations are weighted relatively more heavily than others. For example, assume that the annotation in line 003 of Table III which is related to “Kris” was made by a chief site specialist while the annotations in lines 002 and 004 of Table III, which are both related to “Karl”, were made by an assistant. In this case, it might be desirable to weight the annotation made by the chief site specialist such that the one annotation associated with “Kris” is considered more important than the two annotations associated with “Karl”.
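A weighted ranking of this kind might be sketched as follows. The numeric weights in `ROLE_WEIGHT` are hypothetical, since the text gives no values, and the helper name `weighted_scores` is likewise an assumption of this sketch. The creator roles follow the example just described.

```python
# Hypothetical role weights; the description does not specify numeric values.
ROLE_WEIGHT = {"chief site specialist": 3.0, "assistant": 1.0}

# (Patient_ID, creator role) for the three annotations of Table III,
# with roles assigned as in the example above.
annotation_roles = [
    (1, "assistant"),              # line 002, Karl
    (2, "chief site specialist"),  # line 003, Kris
    (1, "assistant"),              # line 004, Karl
]

def weighted_scores(rows, weights):
    """Sum the weight of every annotation per patient identifier."""
    scores = {}
    for patient_id, role in rows:
        scores[patient_id] = scores.get(patient_id, 0.0) + weights[role]
    return scores

scores = weighted_scores(annotation_roles, ROLE_WEIGHT)
```

With these assumed weights, the single annotation for “Kris” scores 3.0 and the two annotations for “Karl” score 2.0 in total, so “Kris” would rank first. Any weight above 2.0 for the chief site specialist yields the same ordering in this example.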
- At
step 370, the data records in the received query result (i.e., data query result of Table VI) are sorted on the basis of the determined ranking. Accordingly, the exemplary sorted query result shown in Table IX below is obtained.

TABLE IX: EXEMPLARY SORTED QUERY RESULT
001  ID  Name   Age
002  1   Karl   54
003  2   Kris   49
004  3   Renee  24

- Note that in the given example the data records in lines 002-004 of Table IX are sorted according to the above described ranking. Accordingly, the data record of line 003 of Table VI is presented on the top of the exemplary sorted query result, i.e., in line 002 of Table IX, as this data record has the greatest counted number of associated annotations.
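Steps 360 and 370 can be sketched together in Python. The record tuples are taken from Table VI and the counts from Table VII. The function name `sort_by_annotation_count` is hypothetical, and treating a missing count as 0 is an assumption that matches the “Renee” row of Table VIII.

```python
# Data query result of Table VI: (ID, Name, Age)
data_query_result = [(3, "Renee", 24), (1, "Karl", 54), (2, "Kris", 49)]

# Sorting query result of Table VII: Patient_ID -> number of annotations
annotation_counts = {1: 2, 2: 1}

def sort_by_annotation_count(records, counts):
    """Order records by descending annotation count.

    Records with no entry in `counts` are treated as having 0 annotations.
    """
    return sorted(records, key=lambda rec: counts.get(rec[0], 0), reverse=True)

sorted_query_result = sort_by_annotation_count(data_query_result, annotation_counts)
```

The result lists “Karl”, then “Kris”, then “Renee”, which is the order of Table IX. Because Python's `sorted` is stable, records with equal counts keep their initial relative order.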
- At
step 380, the sorted list of data records (e.g., sorted requested data 240 of FIG. 2B) is output. For instance, the sorted list is output for display on a display device (e.g., display device 142 in FIG. 2B). In other words, in the given example the exemplary sorted query result of Table IX is output. Method 300 then exits at step 390. - Referring now back to
FIG. 2B, the re-ordering of the requested data 230 is performed in another embodiment on the basis of a value variance which is determined for each of the data records of the requested data 230. The value variance of a given data record indicates a relative proximity between a predefined value 262 and a corresponding value of the given data record. - In order to determine the value variance for each data record of the requested
data 230, the sorting program(s) 126 illustratively invokes a value variance determination unit 260. It should be noted that the value variance determination unit 260 is represented as a separate unit only by way of example and not for limiting the invention accordingly. In other words, the value variance determination unit 260 can also be implemented as an integral part of the sorting program(s) 126 or some other suitable system program. - The value
variance determination unit 260 illustratively includes the predefined value 262. According to one aspect, the predefined value 262 can be provided by a user using a suitable user interface (e.g., user interface 210 of FIG. 2A). In one embodiment, each one of the data records of the requested data 230 has a particular value of a type which corresponds to an underlying value type of the predefined value 262. For instance, each one of the data records may include a particular value related to a hemoglobin test and the predefined value 262 may represent a user-specified value of interest for hemoglobin tests. More specifically, assume that the particular values of the data records are hemoglobin test result values between 12 and 14. Assume further that a user specifies 13.5 as a central or ideal interest value, i.e., the predefined value 262. Thus, the data records having particular values which are closest to the central or ideal interest value of 13.5 can be identified from the requested data 230. - To this end, the value
variance determination unit 260 determines the value variance of the particular value of each of the data records of the requested data 230 with respect to the predefined value 262. Thus, for each one of the data records a relative proximity between the corresponding particular value and the predefined value 262 can be identified. According to one aspect, the data records having the particular values with the closest relative proximity to the predefined value 262 are programmatically placed on the top of the sorted list of data records defining the sorted requested data 240. An exemplary method for re-ordering the requested data 230 on the basis of a value variance which is determined for each of the data records of the requested data 230 is described below with reference to FIG. 4. - Referring now to
FIG. 4, one embodiment of a method 400 for re-ordering requested data (e.g., requested data 230 of FIG. 2B) on the basis of value variances is shown. The requested data is obtained in response to execution of a corresponding data request (e.g., data request 220 of FIG. 2A) against data in a database (e.g., data 136 of database 139 of FIG. 2A). Similarly to the method 300 of FIG. 3, the method 400 is explained by way of example with respect to a query issued against the data in the database in order to obtain a corresponding query result. At least part of the steps of the method 400 can be performed by a suitable requesting entity (e.g., requesting application(s) 124 of FIG. 2A) and suitable functionalities of an associated sorting program(s) (e.g., sorting program(s) 126 of FIG. 2B). Method 400 starts at step 410. - At
step 420, the query is issued by a suitable requesting entity (e.g., requesting application(s) 124 of FIG. 2A). The issued query is executed against the data in the database. An exemplary query is shown in Table X below. For simplicity, the exemplary query of Table X is described in natural language without reference to a particular query language. Thus, it is understood that any suitable query language, known or unknown, can be used to create the query of Table X.

TABLE X: QUERY EXAMPLE
001  FIND
002  Patient_ID, Hemoglobin
003  SORT BY
004  proximity to Hemoglobin = 34

- Illustratively, the exemplary query shown in Table X includes data selection criteria in lines 001-002. The data selection criteria include a result field specification (line 002) which specifies two result fields for which information is to be returned in the query result. Specifically, in line 002 the result fields “Patient_ID” and “Hemoglobin” are specified. The exemplary query further includes sorting criteria in lines 003-004. The sorting criteria indicate that all data records in the query result should be sorted with respect to a predefined Hemoglobin value (e.g.,
predefined value 262 of FIG. 2B) of “34” (line 004). More specifically, each Hemoglobin test value included with a data record of the query result is compared with the predefined Hemoglobin value to identify a relative proximity thereto. - Assume now that information related to the result fields “Patient_ID” and “Hemoglobin” of the exemplary query of Table X is included with a database table “Tests”. An exemplary “Tests” table is shown in Table XI below.
TABLE XI: EXEMPLARY DATABASE TABLE “TESTS”
001  Patient_ID  Date      Hemoglobin
002  1           1/2/04    29
003  1           16/7/04   23
004  3           5/5/04    35
005  2           12/8/04   45
006  2           19/10/04  33

- Illustratively, the exemplary database table “Tests” includes a “Patient_ID”, “Date” and “Hemoglobin” column. By way of example, the “Patient_ID” column includes patient identifiers according to the “ID” column of the “Demographic” table of Table II above. The “Date” column contains exemplary dates on which a corresponding Hemoglobin test has been performed on a given patient. The “Hemoglobin” column includes Hemoglobin test values which have been determined on the indicated dates.
- In one embodiment, executing the query at
step 420 includes identifying the data selection criteria and the sorting criteria from the issued query. Executing the query further includes generating a data query on the basis of the identified data selection criteria and executing the data query against the database. In the given example, the data selection criteria “FIND Patient_ID, Hemoglobin” can be identified from the issued query (lines 001-002 of Table X). Furthermore, the sorting criteria “SORT BY proximity to Hemoglobin=34” can be identified from the issued query (lines 003-004 of Table X). On the basis of the identified data selection criteria, the data query shown in Table XII below can be generated.

TABLE XII: DATA QUERY EXAMPLE
001  FIND
002  Patient_ID, Hemoglobin
003  FROM
004  Tests

- Illustratively, the exemplary data query shown in Table XII includes the data selection criteria of lines 001-002 of Table X. The exemplary data query further includes a specification of the database table which contains the requested data (lines 003-004), i.e., the “Tests” table of Table XI above.
- At
step 430, the query result in an initial order is received. Receiving the query result in the initial order corresponds to receiving a data query result obtained in response to execution of the data query of Table XII against the exemplary “Tests” table of Table XI. The data query result shown in Table XIII below is received in the given example.

TABLE XIII: EXEMPLARY DATA QUERY RESULT
001  Patient_ID  Hemoglobin
002  1           29
003  1           23
004  3           35
005  2           45
006  2           33

- Note that in the given example the exemplary data query result of Table XIII corresponds to the database table shown in Table XI above, where the “Date” column has been removed.
- At
step 440, a value variance is determined for each one of the data records contained in the data query result of Table XIII to determine the relative proximities. Illustratively, the data query result can be augmented with a column indicating the determined value variances. Accordingly, the augmented query result shown in Table XIV below is obtained.

TABLE XIV: EXEMPLARY AUGMENTED QUERY RESULT
001  Patient_ID  Hemoglobin  Value Variance
002  1           29          5
003  1           23          11
004  3           35          1
005  2           45          11
006  2           33          1

- Note that in the given example the exemplary augmented query result of Table XIV corresponds to the data query result shown in Table XIII above, wherein a column containing the determined value variances has been inserted. Each value variance is defined by the absolute difference between the returned Hemoglobin value and the predefined Hemoglobin value. As can be seen from Table XIV, the data record in line 005 has a Hemoglobin value of “45”. Thus, the value variance for this data record can be determined according to one aspect by subtracting the predefined Hemoglobin value of “34” therefrom, i.e., 45−34=11. Likewise, the data record in line 002 has a Hemoglobin value of “29”, which lies below the predefined value, so that its value variance is 34−29=5.
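The value variance computation of step 440 can be sketched as follows. The sketch assumes that the variance is the absolute difference, which is what Table XIV implies, since a Hemoglobin value of 29 is listed with a variance of 5. The names `PREDEFINED_HEMOGLOBIN` and `augment_with_variance` are introduced only for illustration.

```python
# Predefined Hemoglobin value from line 004 of Table X
PREDEFINED_HEMOGLOBIN = 34

# Data query result of Table XIII: (Patient_ID, Hemoglobin)
data_query_result = [(1, 29), (1, 23), (3, 35), (2, 45), (2, 33)]

def augment_with_variance(records, target):
    """Append the value variance |value - target| to every record."""
    return [(patient_id, value, abs(value - target)) for patient_id, value in records]

augmented_query_result = augment_with_variance(data_query_result, PREDEFINED_HEMOGLOBIN)
```

The third element of each tuple reproduces the “Value Variance” column of Table XIV: 5, 11, 1, 11 and 1.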
- At
step 450, a ranking of the data records is determined on the basis of the determined relative proximities, i.e., the determined value variances. As can be seen from the “Value Variance” column of Table XIV, the following ranking can be established: (1) the data records of lines 004 and 006 have a value variance of “1” and, thus, the closest relative proximity with respect to the predefined Hemoglobin value, (2) the data record of line 002 has a value variance of “5” and, thus, the second closest relative proximity, and (3) the data records of lines 003 and 005 have a value variance of “11” and are, thus, the farthest from the predefined Hemoglobin value. - At
step 460, the data records in the data query result of Table XIII are sorted on the basis of the determined ranking. Accordingly, the exemplary sorted query result shown in Table XV below is obtained.

TABLE XV: EXEMPLARY SORTED QUERY RESULT
001  Patient_ID  Hemoglobin
002  2           33
003  3           35
004  1           29
005  1           23
006  2           45

- Note that in the given example the data record of line 006 of the data query result of Table XIII is presented on the top of the exemplary sorted query result, i.e., in line 002 of Table XV, as the Hemoglobin test value of this data record has the closest relative proximity to the predefined Hemoglobin value.
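Steps 450 and 460 can be sketched as a single sort by proximity. The description does not say how records with equal value variance are ordered relative to each other. The sketch below breaks ties by the smaller Hemoglobin value, an assumption chosen only because it reproduces the row order of Table XV. The function name `sort_by_proximity` is hypothetical.

```python
# Data query result of Table XIII: (Patient_ID, Hemoglobin)
data_query_result = [(1, 29), (1, 23), (3, 35), (2, 45), (2, 33)]

def sort_by_proximity(records, target):
    """Sort by |value - target|; ties go to the smaller value (assumed rule)."""
    return sorted(records, key=lambda rec: (abs(rec[1] - target), rec[1]))

sorted_query_result = sort_by_proximity(data_query_result, 34)
```

The result is (2, 33), (3, 35), (1, 29), (1, 23), (2, 45), matching Table XV. A plain stable sort on the variance alone would instead keep (3, 35) ahead of (2, 33), because that record appears earlier in Table XIII.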
- At
step 470, the sorted list of data records (e.g., sorted requested data 240 of FIG. 2B) is output. In the given example, the exemplary sorted query result of Table XV is output. Method 400 then exits at step 480. - Referring now back to
FIG. 2B, the re-ordering of the requested data 230 is performed in still another embodiment on the basis of a requested value range coverage. The requested value range coverage is defined by a predefined maximum number 274 “VALUE COUNT” of data records of the requested data 230 to be output. Each data record has an associated particular value and the particular value of each of the outputted data records must be included within a predefined value range 272 “VALUE RANGE”. Accordingly, in one embodiment the predefined maximum number 274 of data records having associated particular values within the predefined value range 272 is programmatically selected and output. - For instance, assume that a researcher wants to conduct a study on how the effects of alcohol on the human liver depend on the weight of the corresponding test persons. Assume now that the requested
data 230 includes data records having particular values for the weight of respective individuals. Assume further that the researcher requires 100 test persons and that the 100 test persons should have weights which are included in a value range of 100 pounds-250 pounds. To this end, the researcher using a suitable user interface (e.g., user interface 210 of FIG. 2A) defines the predefined maximum number 274 to be 100 and the predefined value range 272 to be 100 pounds-250 pounds. Assume now that the requested data 230 includes 1000 data records having particular values within the predefined value range 272. Thus, by specifying the requested range coverage to retrieve the 100 test persons, 100 data records would be selected programmatically and output. The 100 data records can be selected arbitrarily to satisfy the requested value range coverage. - More specifically, in order to determine a requested value range coverage for the data records of the requested
data 230, the sorting program(s) 126 illustratively invokes a range coverage determination unit 270 having the predefined value range 272 and the predefined maximum number 274. It should be noted that the range coverage determination unit 270 is represented as a separate unit only by way of example and not for limiting the invention accordingly. In other words, the range coverage determination unit 270 can also be implemented as an integral part of the sorting program(s) 126 or some other suitable system program. - It should be noted that an arbitrary selection of the 100 data records in the given example may result in selection of 100 individuals all having an identical weight of 175 pounds, for example. However, as 100 individuals having an identical weight are not considered representative of the predefined value range 272, the user can use the suitable user interface in one embodiment to specify the maximum number of data records having an identical associated particular value that should be output. For instance, the user can specify that not more than five data records associated with individuals having an identical weight should be output. Accordingly, in the given example the 100 selected data records would represent individuals having at least 20 different weights within the predefined value range 272.
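The selection with a cap on identical values can be sketched as follows. The function `select_with_cap` and its parameter names are hypothetical, and the input records are synthetic data made up for this illustration. The sketch takes records in their initial order, which is one possible reading of an arbitrary selection.

```python
from collections import Counter

def select_with_cap(records, low, high, max_count, per_value_cap):
    """Pick up to max_count (record_id, value) pairs with low <= value <= high,
    taking at most per_value_cap records that share an identical value."""
    used = Counter()
    selected = []
    for record_id, value in records:
        if len(selected) == max_count:
            break
        if low <= value <= high and used[value] < per_value_cap:
            used[value] += 1
            selected.append((record_id, value))
    return selected

# Synthetic input (made up for illustration): 50 people weighing 175 pounds,
# followed by one person at each whole-pound weight from 100 to 250.
records = [(i, 175) for i in range(50)] + [(50 + i, 100 + i) for i in range(151)]
selected = select_with_cap(records, 100, 250, max_count=100, per_value_cap=5)
```

With a cap of five, only the first five of the 175-pound records are kept, and the remaining 95 places are filled from other weights. In general, a cap of five on 100 selected records forces at least 20 distinct values, as stated above.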
- In another embodiment, the particular values of the outputted data records must define a requested value distribution in the predefined value range 272. As was noted above, the requested value distribution can be defined by any possible type of distribution, such as a flat distribution and a bell curve. However, it should be noted that a flat distribution and a bell curve are merely described by way of example and that other distribution types can also be requested, such as an inverted bell curve or a negative exponential distribution. Accordingly, all such distributions are broadly contemplated. For instance, assume that in the given example the researcher requires 100 test persons having weights which are evenly spread out over the value range of 100 pounds-250 pounds, so that the weights of the 100 test persons can be considered as being representative of the complete value range. Thus, by specifying the requested range coverage to retrieve the 100 test persons such that the weights of the retrieved test persons define a flat distribution over the value range of 100 pounds-250 pounds, the best fit of representative data records would be selected programmatically.
- In this case, the range
coverage determination unit 270 determines for each of the data records of the requested data 230 whether the particular value of the data record is included within the predefined value range 272. From all data records having their particular value included within the predefined value range 272, a total number of data records is selected that is equal to, or at least does not exceed, the predefined maximum number 274 (in this example, 100). The particular values of the selected data records define the requested value distribution. - In one embodiment, the requested value distribution is represented as a histogram having one or more value windows, each having a specified value range defining a granularity of the value window. The granularity can be user-specified or system and/or application specific. According to one aspect, a user can specify a histogram using the suitable user interface. For instance, in the given example the user can specify a histogram representing a bell curve. By way of example, the user may divide the value range of 100 pounds-250 pounds into five different value windows, such as (1) 100 pounds-129 pounds, (2) 130 pounds-159 pounds, (3) 160 pounds-189 pounds, (4) 190 pounds-219 pounds, and (5) 220 pounds-250 pounds. Furthermore, the user may specify that from the 100 requested test persons (i) 15 persons should have weights within the value windows (1) and (5), respectively, (ii) 40 persons should have weights within the value windows (2) and (4), respectively, and (iii) 90 persons should have weights within the value window (3). Accordingly, the weights of all selected data records would define a bell curve.
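A histogram-driven selection can be sketched as follows. The value windows are the five windows named above. The per-window quotas in the sketch are hypothetical placeholders chosen so that they sum to 100 and form a bell shape, and the candidate records are synthetic. The function name `select_by_histogram` is an assumption, and whole-pound weights are assumed so that every value falls into exactly one window.

```python
def select_by_histogram(records, windows):
    """Select (record_id, value) pairs until each window's quota is filled.

    windows: list of (low, high, quota), with inclusive bounds.
    """
    taken = [0] * len(windows)
    selected = []
    for record_id, value in records:
        for i, (low, high, quota) in enumerate(windows):
            if low <= value <= high:
                if taken[i] < quota:
                    taken[i] += 1
                    selected.append((record_id, value))
                break  # a value belongs to at most one window
    return selected

# Windows from the text; quotas are hypothetical placeholders summing to 100.
windows = [(100, 129, 5), (130, 159, 20), (160, 189, 50),
           (190, 219, 20), (220, 250, 5)]

# Synthetic candidates: 1000 records with whole-pound weights in [100, 250].
candidates = [(i, 100 + (i * 7) % 151) for i in range(1000)]
selected = select_by_histogram(candidates, windows)
```

Because the synthetic candidates cover every window generously, each quota is filled and exactly 100 records are selected. With sparser data a window may stay below its quota, in which case fewer than the predefined maximum number of records are returned.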
- The one or more selected data records can, for instance, be placed on the top of the sorted list defining the sorted requested
data 240. Alternatively, only the selected data records can be displayed in the sorted list on the display device 142, while the remaining data records are hidden from the user. An exemplary method for re-ordering the requested data 230 on the basis of a requested value range coverage is described below with reference to FIGS. 5A-B. - Referring now to
FIG. 5A, one embodiment of a method 500 for re-ordering requested data (e.g., requested data 230 of FIG. 2B) on the basis of a requested value range coverage is shown. The requested data is obtained in response to execution of a corresponding data request (e.g., data request 220 of FIG. 2A) against data in a database (e.g., data 136 of database 139 of FIG. 2A). Similarly to the methods 300 and 400 of FIGS. 3 and 4, the method 500 is explained by way of example with respect to a query issued against the data in the database in order to obtain a corresponding query result. At least part of the steps of the method 500 can be performed by a suitable requesting entity (e.g., requesting application(s) 124 of FIG. 2A) and suitable functionalities of an associated sorting program(s) (e.g., sorting program(s) 126 of FIG. 2B). Method 500 starts at step 510. - At
step 520, the query is issued by a suitable requesting entity (e.g., requesting application 124 of FIG. 2A). The issued query is executed against the data in the database. An exemplary query is shown in Table XVI below. For simplicity, the exemplary query of Table XVI is described in natural language without reference to a particular query language. Thus, it is understood that any suitable query language, known or unknown, can be used to create the query of Table XVI.

TABLE XVI: QUERY EXAMPLE
001  FIND
002  Patient_ID, Hemoglobin
003  SORT BY
004  spread of Hemoglobin
005  RETURN
006  3 data records

- Illustratively, the exemplary query shown in Table XVI includes data selection criteria in lines 001-002. The data selection criteria include a result field specification (line 002) which specifies two result fields for which information is to be returned in the query result. Specifically, in line 002 the result fields “Patient_ID” and “Hemoglobin” are specified. Assume now that information related to the result fields “Patient_ID” and “Hemoglobin” is included with the database table “Tests” illustrated in Table XI above. The exemplary query further includes sorting criteria in lines 003-006. The sorting criteria indicate that all data records in the query result should be sorted with respect to a spread of Hemoglobin values (line 004). In this case, the range of values which is defined by the Hemoglobin values of the query result constitutes a predefined value range (e.g., predefined value range 272 of
FIG. 2B) for the requested value range coverage. Specifically, the Hemoglobin test values in the “Tests” table of Table XI define the predefined value range [23; 45]. However, it should be noted that the predefined value range may also be provided by a user using a suitable user interface (e.g., user interface 210 of FIG. 2A). The sorting criteria further indicate a predefined maximum number (e.g., predefined maximum number 274 of FIG. 2B) which specifies that only “3” data records should be returned in the query result (line 006). - In one embodiment, executing the query at
step 520 includes identifying the data selection criteria and the sorting criteria from the issued query. Executing the query further includes generating a data query on the basis of the identified data selection criteria and executing the data query against the database. In the given example, the data selection criteria “FIND Patient_ID, Hemoglobin” can be identified from the issued query (lines 001-002 of Table XVI). Furthermore, the sorting criteria “SORT BY spread of Hemoglobin RETURN 3 data records” can be identified from the issued query (lines 003-006 of Table XVI). - The data query which can be generated on the basis of the identified data selection criteria corresponds to the data query shown in Table XII above. In other words, in the given example the data query of Table XII is executed against the database table “Tests” illustrated in Table XI to determine the query result in an initial order for the exemplary query of Table XVI.
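The separation of an issued query into data selection criteria and sorting criteria can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the described implementation: the helper names `split_query` and `build_data_query` are hypothetical, the natural-language query is treated as a plain string containing a `SORT BY` keyword, and the generated SQL text is merely an assumed form of a data query against the “Tests” table.

```python
def split_query(query: str) -> tuple:
    """Split an issued query into (data selection criteria, sorting criteria)."""
    # Everything before the SORT BY keyword is treated as data selection criteria.
    head, keyword, tail = query.partition("SORT BY")
    selection = " ".join(head.split())
    sorting = " ".join((keyword + tail).split())
    return selection, sorting


def build_data_query(selection: str, table: str) -> str:
    """Generate a data query from the data selection criteria (assumed SQL form)."""
    fields = selection[len("FIND"):].strip() if selection.startswith("FIND") else selection
    return f"SELECT {fields} FROM {table}"


issued = "FIND Patient_ID, Hemoglobin SORT BY spread of Hemoglobin RETURN 3 data records"
selection, sorting = split_query(issued)
data_query = build_data_query(selection, "Tests")
print(selection)
print(sorting)
print(data_query)
```

In this sketch only the data selection criteria reach the database; the sorting criteria are kept aside for the re-ordering performed after the query result in the initial order has been received.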
- At
step 530, the query result in the initial order is received. In the given example, receiving the query result in the initial order corresponds to receiving the data query result shown in Table XIII, as described above with reference to FIG. 4. - At
step 540, a subset of the data records of the data query result of Table XIII is selected which satisfies the requested value range coverage. Assume now that the subset of data records should be selected such that the Hemoglobin test values associated with the data records of the subset define a flat distribution over the predefined value range, i.e., that the Hemoglobin test values are evenly spread over the predefined value range. In other words, three of the data records which have associated Hemoglobin test values that are evenly spread over the predefined value range [23; 45] are identified from the data query result of Table XIII. An exemplary method for identifying the subset of data records from the data query result is described below with reference to FIG. 5B. - At
step 550, the data records in the data query result of Table XIII are sorted on the basis of the requested value range coverage. According to one aspect, the sorting comprises including only the three identified data records in the sorted query result. Alternatively, the three identified data records can be placed on the top of the sorted list. Furthermore, the three identified data records can be flagged to indicate that only display of these data records is allowed, while all remaining data records should be hidden from the user. By way of example, assume that in the given example only the three identified data records are included in the sorted list. Assume further that the data records of lines 003, 005 and 006 of Table XIII are identified. Accordingly, the exemplary sorted query result shown in Table XVII below is obtained.

TABLE XVII EXEMPLARY SORTED QUERY RESULT
Patient_ID  Hemoglobin
1           23
2           33
2           45

- At
step 560, the sorted list of data records (e.g., sorted requested data 240 of FIG. 2B) is output. In the given example, the exemplary sorted query result of Table XVII is output. Method 500 then exits at step 570. - Referring now to
FIG. 5B, one embodiment of a method 548 for identifying the subset of data records from the data query result according to step 540 of FIG. 5A is shown. The method 548 starts at step 541, where all data records of the data query result which have an associated value within the predefined value range are determined. In the given example, the associated values of all data records in the data query result of Table XIII are included within the predefined value range [23; 45]. - At
step 542, a requested value distribution is determined for all associated values which are included within the predefined value range. Assume now that in the given example a flat distribution in the predefined value range [23; 45] is requested. Assume further that three value windows are specified for the flat distribution, such as [23;30], [31;38] and [39;45]. - At
step 544, all data records of the data query result are grouped into value groups on the basis of the specified value windows. Each value group may include one or more data records. In the given example, the Hemoglobin test values 23, 29, 33, 35 and 45 of the data records shown in Table XIII are grouped into three value groups: (i) the values 23 and 29 are grouped into a first value group corresponding to the value window [23;30], (ii) the values 33 and 35 are grouped into a second value group corresponding to the value window [31;38], and (iii) the value 45 is grouped into a third value group corresponding to the value window [39;45]. - At
step 545, one or more data records from at least a portion of the value groups are determined such that a total number of selected data records is equal to, or at least does not exceed, the predefined maximum number, i.e., “3”. In the given example, the one or more data records are selected to be evenly spread over the predefined value range in order to define the requested flat distribution. Furthermore, data records for a maximum number of different values of the value distribution are determined, according to one aspect. - In the given example, the predefined maximum number of “3” data records is selected from the three different value groups. Accordingly, one data record is selected for each value group. As the values “23” and “45” of the first and third value groups are boundary values of the predefined value range [23;45] and, thus, equidistant from a median value of the predefined value range, i.e., “34”, the data records having these values are selected. Furthermore, a data record having an associated value which is in the second value group is selected. As two data records have associated values in the second value group which are immediately adjacent to the median value, i.e., the data records having the associated values “33” and “35”, either of the two data records can be selected programmatically in an arbitrary manner so that the requested flat distribution is satisfied. As was noted above, the data record having the associated value “33” is selected. Processing then continues at
step 550 of FIG. 5A. - It should be noted that various implementations for selecting the data records so as to satisfy a uniform spread over the value distribution are broadly contemplated. For instance, the selection can be based on additional selection criteria provided by a user. More specifically, assume that in the described example the Hemoglobin test value “35” has been established for an individual living in Rochester, Minn., and that the Hemoglobin test value “33” has been established for an individual living in Houston, Tex. Assume further that the user specifies that data records for individuals living in Texas should be preferred. Accordingly, the data record having the Hemoglobin test value “33” for an individual living in Houston, Tex., is selected.
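The grouping and selection of steps 541-545 can be illustrated with a short Python sketch. It is a hedged example rather than the described implementation: the function name `select_flat_subset` is hypothetical, the record list is assembled from the Patient_ID and Hemoglobin values quoted in this description (the actual order of Table XIII is not assumed), one evenly spaced target value per value window is an assumed way of approximating a flat distribution, and ties are broken in favor of the lower value as one arbitrary choice.

```python
from typing import List, Tuple

Record = Tuple[int, int]  # (Patient_ID, Hemoglobin)


def select_flat_subset(records: List[Record],
                       windows: List[Tuple[int, int]],
                       max_records: int) -> List[Record]:
    """Pick at most one record per value window, close to evenly spaced targets."""
    low, high = windows[0][0], windows[-1][1]
    count = len(windows)
    # Evenly spaced target values over the predefined value range (assumption).
    step = (high - low) / (count - 1) if count > 1 else 0.0
    start = low if count > 1 else (low + high) / 2
    ordered = sorted(records, key=lambda r: r[1])
    selected = []
    for index, (w_low, w_high) in enumerate(windows):
        # Steps 541/544: keep in-range records and group them by value window.
        group = [r for r in ordered if w_low <= r[1] <= w_high]
        if not group:
            continue
        target = start + index * step
        # Step 545: min() keeps the first of equally close records (lower value).
        selected.append(min(group, key=lambda r: abs(r[1] - target)))
    return selected[:max_records]


records = [(1, 23), (1, 29), (2, 33), (3, 35), (2, 45)]
windows = [(23, 30), (31, 38), (39, 45)]
subset = select_flat_subset(records, windows, 3)
print(subset)
```

For the three value windows of the example the targets are 23, 34 and 45, so the sketch selects the records with the values 23, 33 and 45, which agrees with Table XVII. A user-supplied preference, such as the Texas example above, could replace the tie-breaking rule.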
- Referring now back to
FIG. 2B, the re-ordering of the requested data 230 is performed in still another embodiment on the basis of suitability scores which are determined with respect to the available analysis routines 180. More specifically, a suitability score is determined for each data record of the requested data 230. The suitability score of a given data record indicates a relative suitability of the given data record as input to one or more of the analysis routines 180. - In order to determine the suitability scores for the data records of the requested
data 230, the sorting program(s) 126 illustratively invokes an analysis routine identification unit 280. It should be noted that the analysis routine identification unit 280 is represented as a separate unit only by way of example and not by way of limitation. In other words, the analysis routine identification unit 280 can also be implemented as an integral part of the sorting program(s) 126 or some other suitable system component. - The analysis
routine identification unit 280 identifies one or more analysis routines from the analysis routines 180 which are configured for processing the requested data 230. The analysis routine identification unit 280 then identifies qualifiers, such as row qualifiers and result set qualifiers, for the identified analysis routine(s). A row qualifier of a given analysis routine indicates a possible input field of the given analysis routine and may specify a preferred input value for the possible input field. A result set qualifier of a given analysis routine specifies characteristics which qualify a result set that is suitable as input to the given analysis routine. For instance, a result set qualifier may specify that only a result set having Hemoglobin values for each data record is suitable. In one embodiment, the result set qualifier of the given analysis routine indicates a preferred range of input values of the given analysis routine. According to one aspect, the row qualifier(s) and/or result set qualifier(s) of the identified analysis routine(s) can be determined from associated metadata 282. - On the basis of corresponding row and/or result set qualifiers, the analysis
routine identification unit 280 determines how suitable each one of the data records of the requested data 230 is as input to the identified analysis routine(s). In the case of an identified row qualifier, the analysis routine identification unit 280 determines for a given data record having a particular value whether an underlying type of the particular value of that data record corresponds to an input type of the possible input field of the identified analysis routine(s). Each time a match of the types is encountered, the suitability score of the given data record is modified. Modifying the suitability score includes, by way of example, increasing or decreasing the suitability score. In the case of an identified result set qualifier, the result set qualifier can be transformed into a set of one-time row qualifiers, each of which can be processed similarly to the row qualifier, as described above. Thus, data records which are most suitable as input to the identified analysis routine(s) can be identified and placed on the top of a corresponding sorted list. An exemplary method for re-ordering the requested data 230 on the basis of suitability scores which are determined with respect to the available analysis routines 180 is described below with reference to FIGS. 6A-C. - Referring now to
FIG. 6A, one embodiment of a method 600 for re-ordering requested data (e.g., requested data 230 of FIG. 2B) on the basis of suitability scores is shown. The suitability scores are determined for data records included with the requested data with respect to analysis routines which are configured to process the data records. The requested data is obtained in response to execution of a corresponding data request (e.g., data request 220 of FIG. 2A) against data in a database (e.g., data 136 of database 139 of FIG. 2A). Similarly to the methods of FIGS. 3, 4 and 5, the method 600 is explained by way of example with respect to a query issued against the data in the database in order to obtain a corresponding query result. At least part of the steps of the method 600 can be performed by a suitable requesting entity (e.g., requesting application(s) 124 of FIG. 2A) and suitable functionalities of an associated sorting program(s) (e.g., sorting program(s) 126 of FIG. 2B). Method 600 starts at step 610. - At
step 620, the query is issued by a suitable requesting entity (e.g., requesting application 124 of FIG. 2A). The issued query is executed against the data in the database. An exemplary query is shown in Table XVIII below. For simplicity, the exemplary query of Table XVIII is described in natural language without reference to a particular query language. Thus, it is understood that any suitable query language, known or unknown, can be used to create the query of Table XVIII.

TABLE XVIII QUERY EXAMPLE
001 FIND
002 Patient_ID, Hemoglobin
003 SORT BY
004 available analysis routines

- Illustratively, the exemplary query shown in Table XVIII includes data selection criteria in lines 001-002. The data selection criteria include a result field specification (line 002) which specifies two result fields for which information is to be returned in the query result. Specifically, in line 002 the result fields “Patient_ID” and “Hemoglobin” are specified. Assume now that information related to the result fields “Patient_ID” and “Hemoglobin” is included with the database table “Tests” illustrated in Table XI above. The exemplary query further includes sorting criteria in lines 003-004. The sorting criteria indicate that all data records in the query result should be sorted with respect to available analysis routines (line 004).
- In one embodiment, executing the query at
step 620 includes identifying the data selection criteria and the sorting criteria from the issued query. Executing the query further includes generating a data query on the basis of the identified data selection criteria and executing the data query against the database. In the given example, the data selection criteria “FIND Patient_ID, Hemoglobin” can be identified from the issued query (lines 001-002 of Table XVIII). Furthermore, the sorting criteria “SORT BY available analysis routines” can be identified from the issued query (lines 003-004 of Table XVIII). - The data query which can be generated on the basis of the identified data selection criteria corresponds to the data query shown in Table XII above. Thus, in the given example the data query of Table XII is executed against the database table “Tests” illustrated in Table XI to determine the query result in an initial order for the exemplary query of Table XVIII.
- At
step 630, the query result in the initial order is received. In the given example, receiving the query result in the initial order corresponds to receiving the data query result of Table XIII, as described above with reference to FIG. 4. - At
step 640, all analysis routines which are configured to process the data query result are identified from a plurality of available analysis routines (e.g., analysis routines 180 of FIG. 2B). According to one aspect, identifying the analysis routines which are configured to process the data query result includes accessing metadata associated with the analysis routines (e.g., metadata 282 of FIG. 2B). The associated metadata may include qualifiers, such as row and result set qualifiers, which specify a type of query result that can be processed by corresponding analysis routines. - At
step 650, a suitability score is determined for each data record of the data query result. The suitability score of a given data record indicates a relative suitability of the given data record as input to the identified analysis routine(s). Exemplary methods for determining suitability scores are described below with reference to FIGS. 6B-C. - At
step 660, the data records in the data query result are sorted on the basis of the determined suitability scores. At step 670, the sorted list of data records (e.g., sorted requested data 240 of FIG. 2B) is output. Method 600 then exits at step 680. - Referring now to
FIG. 6B, one embodiment of a method 690 for determining suitability scores for data records of a data query result (e.g., the data query result of Table XIII) according to step 650 of FIG. 6A is shown. The method 690 starts at step 651, where a loop consisting of steps 651-653 is entered for each analysis routine that is identified at step 640 of FIG. 6A. - At
step 651, the loop is entered for a given analysis routine. At step 652, all row qualifiers which are associated with the given analysis routine are identified. According to one aspect, each row qualifier indicates a possible input field of the given analysis routine. Furthermore, each row qualifier may specify a preferred input value for the possible input field. Moreover, in one embodiment each row qualifier may have an associated weight. For instance, a first row qualifier may define that a given data record having a Hemoglobin test value greater than 35 is suitable as input to the given analysis routine. The given analysis routine may further have a second row qualifier which defines that a given data record having an Age value greater than 30 is also suitable as input to the given analysis routine. However, assume that the given analysis routine performs better on data records having higher Hemoglobin test values than on data records having higher Age values. In this case, the first row qualifier may be associated with a higher weight than the second row qualifier. Then, at step 653 the possible input fields, the preferred input values and the associated weights of the identified row qualifiers are identified. When the loop consisting of steps 651-653 has been executed for each identified analysis routine, processing continues at step 654. - At
step 654, each result field of each data record of the data query result is compared with the identified possible input fields. For all matching fields, the value of the corresponding result field is compared with the preferred input value of the matching possible input field. - At
step 655, for each data record of the data query result, all matching fields are counted. Optionally, all associated weights are applied to the counted matching fields. - At
step 656, relative proximities for all matching fields of each data record of the data query result are determined. More specifically, for each result field of a given data record that matches a possible input field defined by one of the identified row qualifiers, a relative proximity between the value of the result field and the preferred input value of the matched possible input field is determined. In one embodiment, all associated weights are applied to the determined relative proximities. - According to one aspect, if the preferred input value of a given possible input field is associated with a comparison operator, a difference value between the preferred input value of the possible input field and the values of matching result fields can be determined instead of a relative proximity. For instance, in the example described above the first row qualifier defines that a given data record having a Hemoglobin test value greater than 35 is suitable as input to the given analysis routine. Accordingly, a data record having a Hemoglobin test value of 55 has a difference value of 20 with respect to the predefined Hemoglobin value of 35 (i.e., 55−35=20) and a data record having a Hemoglobin test value of 49 has a difference value of 14 (i.e., 49−35=14).
- At
step 657, the suitability scores for all data records of the data query result are determined on the basis of the counted matching fields and/or the determined relative proximities. More specifically, according to one aspect, the suitability score of a given data record can be increased or decreased for each matching field with respect to any of the identified analysis routines. The suitability score may also be increased or decreased on the basis of each determined relative proximity or difference value of the given data record with respect to each identified analysis routine. Furthermore, the increase/decrease may be dependent on the determined relative proximity or difference value. For instance, a greater relative proximity may result in a higher increase/decrease. More specifically, in the above example it is assumed that the given analysis routine performs better if the Hemoglobin test value of a given data record and, thus, the corresponding difference value is high. Accordingly, the given analysis routine performs better on the data record having the Hemoglobin test value of 55 and the difference value of 20. Thus, this data record may have a higher increase of its suitability score with respect to the given analysis routine than the data record having the Hemoglobin test value of 49 and the difference value of 14. Processing then continues at step 660 of FIG. 6A. - By way of example, assume that at
step 640 of FIG. 6A only a single analysis routine is identified, which is configured to process the data records of the data query result of Table XIII. Assume further that at step 652 the first row qualifier described above is identified for the single analysis routine. As was noted above, the first row qualifier specifies the field “Hemoglobin” as the possible input field and values greater than 35 as the preferred input value. Accordingly, the exemplary sorted query result shown in Table XIX below is obtained.

TABLE XIX EXEMPLARY SORTED QUERY RESULT
001 Patient_ID  Hemoglobin
002 2           45
003 3           35
004 2           33
005 1           29
006 1           23

- Note that in the given example the data record of line 005 of the data query result of Table XIII is presented on the top of the exemplary sorted query result, i.e., in line 002 of Table XIX, as this data record has the greatest Hemoglobin test value.
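The scoring of steps 654-657 and the sorting of step 660 can be sketched in Python for the single row qualifier of this example. The sketch rests on assumptions that are not taken from the description: the names `RowQualifier` and `suitability_score` are hypothetical, and the formula (one weighted point per matching field plus the weighted difference value) is just one possible way to increase or decrease a score.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RowQualifier:
    field: str           # possible input field of the analysis routine
    preferred: float     # preferred input value, read here as "greater than"
    weight: float = 1.0  # optional associated weight


def suitability_score(record: Dict[str, float], qualifiers: List[RowQualifier]) -> float:
    """Score a data record against row qualifiers (assumed scoring formula)."""
    score = 0.0
    for qualifier in qualifiers:
        if qualifier.field not in record:
            continue  # no matching field, score unchanged
        score += qualifier.weight  # counted matching field
        # Difference value relative to the preferred input value.
        score += qualifier.weight * (record[qualifier.field] - qualifier.preferred)
    return score


records = [
    {"Patient_ID": 1, "Hemoglobin": 23},
    {"Patient_ID": 1, "Hemoglobin": 29},
    {"Patient_ID": 2, "Hemoglobin": 33},
    {"Patient_ID": 3, "Hemoglobin": 35},
    {"Patient_ID": 2, "Hemoglobin": 45},
]
qualifiers = [RowQualifier(field="Hemoglobin", preferred=35)]
ranked = sorted(records, key=lambda r: suitability_score(r, qualifiers), reverse=True)
for record in ranked:
    print(record["Patient_ID"], record["Hemoglobin"])
```

Under these assumptions the records come out in the order of Table XIX, because the difference value grows with the Hemoglobin test value. A second qualifier, such as the Age qualifier mentioned above, would simply contribute a further weighted term for records that carry that field.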
- It should be noted that the determined suitability scores can be expressed by a plurality of score portions, wherein each score portion is related to a different identified analysis routine. In one embodiment, each score portion can be normalized in order to limit the ability of a single identified analysis routine to push a particular data record to the top of the sorted list of data records. Furthermore, all score portions of the plurality of score portions of a given suitability score can be determined and stored separately. In this case, the data query result (e.g., the data query result of Table XIII) can be presented to a user. Thus, the user may decide which analysis routine(s) to use. Accordingly, the sorting can be based on the score portions associated with the user-selected analysis routine(s), as described above.
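How separately stored score portions might be normalized and combined can be sketched as follows. This is a hedged illustration only: the description does not say which normalization is used, so min-max scaling to the interval [0, 1] is an assumption, and the routine names and raw score values are made up for the example.

```python
from typing import Dict, List


def normalize(portions: List[float]) -> List[float]:
    """Scale the score portions of one analysis routine to [0, 1] (assumed min-max)."""
    low, high = min(portions), max(portions)
    if high == low:
        return [0.0 for _ in portions]
    return [(value - low) / (high - low) for value in portions]


def combined_scores(score_portions: Dict[str, List[float]],
                    selected_routines: List[str]) -> List[float]:
    """Sum the normalized portions of the user-selected routines per data record."""
    if not selected_routines:
        return []
    normalized = [normalize(score_portions[name]) for name in selected_routines]
    return [sum(column) for column in zip(*normalized)]


# Hypothetical raw score portions for three data records and two routines.
portions = {
    "routine_a": [0.0, 50.0, 100.0],
    "routine_b": [3.0, 2.0, 1.0],
}
both = combined_scores(portions, ["routine_a", "routine_b"])
only_a = combined_scores(portions, ["routine_a"])
print(both)
print(only_a)
```

Without normalization the large raw values of the hypothetical `routine_a` would dominate the ordering; after scaling, both routines contribute equally, and sorting on the portions of `routine_a` alone remains possible once the user has selected it.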
- Referring now to
FIG. 6C, another embodiment of a method 695 for determining suitability scores for data records of a data query result (e.g., the data query result of Table XIII) according to step 650 of FIG. 6A is shown. The method 695 starts at step 658, where a loop consisting of steps 658-693 is entered for each analysis routine that is identified at step 640 of FIG. 6A. - At
step 658, the loop is entered for a given analysis routine. At step 659, a result set qualifier which is associated with the given analysis routine is identified. According to one aspect, the result set qualifier indicates a preferred range of input values for a possible input field of the given analysis routine. For instance, a given result set qualifier may define that a given data record having a Hemoglobin test value between 20 and 50 is suitable as input to the given analysis routine. - At
step 691, the preferred range of input values for the possible input field is determined from the identified result set qualifier. In the given example, the value range [20; 50] is determined. - At
step 692, a distribution of values spread over the preferred range of input values is determined. To this end, a number of values of the preferred range of input values is identified such that the identified values are uniformly spread over the preferred range of input values. For instance, the distribution of values may be determined programmatically to include each integer number in the value range [20; 50]. - In one embodiment, a predefined number can be provided for determination of the distribution of values. By way of example, the predefined number can be provided with the issued query (i.e., the exemplary query of Table XVIII). Thus, the predefined number of values can be identified from the preferred range of values to define the distribution of values. For instance, assume that in the given example the predefined number is “11”. Accordingly, “11” uniformly spread values of the preferred range of values [20; 50] are determined for the distribution of values. By way of example, the values 20, 23, 26, 29, 32, 35, 38, 41, 44, 47, 50 are identified. At
step 693, a unique temporary row qualifier is created for each identified value of the distribution of values. - When the loop consisting of steps 658-693 has been executed for each identified analysis routine, processing continues at
step 654 of FIG. 6B, where each unique temporary row qualifier is processed similarly to a row qualifier (as identified at step 652 of FIG. 6B). However, in one embodiment, if a match is determined for a possible input field and/or preferred input value of a given temporary row qualifier at step 654, the given temporary row qualifier is deleted. - As was noted above, queries issued by a suitable requesting entity (e.g., requesting
application 124 of FIG. 1) can be abstract queries formulated on the basis of a data abstraction model (e.g., data abstraction model 132 of FIG. 1). An abstract query can be transformed by a suitable runtime component (e.g., runtime component 134 of FIG. 1) into a concrete query having a form consistent with the physical representation of data contained in an underlying database (e.g., data 136 in database 139 of FIG. 1). The concrete queries can be executed by the runtime component against the database. An exemplary data abstraction model, creation of abstract queries and operation of an exemplary runtime component are further described below with reference to FIGS. 7-10. - Referring now to
FIG. 7, a relational view illustrating operation and interaction of the requesting application 124 of FIG. 1 and the data abstraction model 132 of FIG. 1 is shown. The data abstraction model 132 defines logical fields corresponding to physical entities of data in a database (e.g., data 136 in database 139), thereby providing a logical representation of the data. In a relational database environment having a multiplicity of database tables, a specific logical representation having specific logical fields can be provided for each database table. In this case, all specific logical representations together constitute the data abstraction model 132. The physical entities of the data are arranged in the database according to a physical representation of the data in the database. By way of illustration, two physical representations are shown, an XML data representation 714 1 and a relational data representation 714 2. However, the physical representation 714 N indicates that any other physical representation, known or unknown, is contemplated. In one embodiment, a different single data abstraction model is provided for each separate physical representation 714, as explained above for the case of a relational database environment. In an alternative embodiment, a single data abstraction model 132 contains field specifications (with associated access methods) for two or more physical representations 714. - Using a logical representation of the data, the
application query specification 122 of FIG. 1 specifies one or more logical fields to compose a resulting query 702. A requesting entity (e.g., the requesting application 124) issues the resulting query 702 as defined by an application query specification of the requesting entity. In one embodiment, the abstract query 702 may include both criteria used for data selection and an explicit specification of result fields to be returned based on the data selection criteria. An example of the selection criteria and the result field specification of the abstract query 702 is shown in FIG. 8. Accordingly, the abstract query 702 illustratively includes selection criteria 804 and a result field specification 806. - The resulting
query 702 is generally referred to herein as an “abstract query” because the query is composed according to abstract (i.e., logical) fields rather than by direct reference to the underlying physical data entities in the database. As a result, abstract queries may be defined that are independent of the particular underlying physical data representation used. For execution, the abstract query is transformed into a concrete query consistent with the underlying physical representation of the data using the data abstraction model 132. - In general, the
data abstraction model 132 exposes information as a set of logical fields that may be used within an abstract query to specify criteria for data selection and specify the form of result data returned from a query operation. The logical fields are defined independently of the underlying physical representation being used in the database, thereby allowing abstract queries to be formed that are loosely coupled to the underlying physical representation. - Referring now to
FIG. 8, a relational view illustrating interaction of the abstract query 702 and the data abstraction model 132 is shown. In one embodiment, the data abstraction model 132 comprises a plurality of field specifications, collectively referred to as the field specifications 808. Specifically, a field specification is provided for each logical field available for composition of an abstract query. Each field specification may contain one or more attributes. Illustratively, the field specifications 808 include a logical field name attribute 820 and an access method attribute 822. For example, the logical field name attribute 820 1 has the value “FirstName” and the access method attribute 822 1 has the value “Simple”. Furthermore, each attribute may include one or more associated abstract properties. Each abstract property describes a characteristic of a data structure and has an associated value. In the context of the invention, a data structure refers to a part of the underlying physical representation that is defined by one or more physical entities of the data corresponding to the logical field. In particular, an abstract property may represent data location metadata abstractly describing a location of a physical data entity corresponding to the data structure, like a name of a database table or a name of a column in a database table. Illustratively, the access method attribute 822 1 includes data location metadata “Table” and “Column”. Furthermore, data location metadata “Table” has the value “contact” and data location metadata “Column” has the value “f_name”. Accordingly, assuming an underlying relational database schema in the present example, the values of data location metadata “Table” and “Column” point to a table “contact” having a column “f_name”. - In one embodiment, groups (i.e. two or more) of logical fields may be part of categories. Accordingly, the
data abstraction model 132 includes a plurality of category specifications 810 1 and 810 2 (two shown by way of example), collectively referred to as the category specifications. In one embodiment, a category specification is provided for each logical grouping of two or more logical fields. For example, the logical fields 808 1-3 are part of the “Name and Address” category and the logical fields 808 4-5 are part of the “Birth and Age” category. - The
access methods 822 generally associate (i.e., map) the logical field names to data in the database (e.g., database 139 of FIG. 1). Any number of access methods is contemplated depending upon the number of different types of logical fields to be supported. In one embodiment, access methods for simple fields, filtered fields and composed fields are provided. Some of the field specifications 808 exemplify simple field access methods. For instance, the simple field access method 822 1 shown in FIG. 8 maps the logical field name 820 1 (“FirstName”) to a column named “f_name” in a table named “contact”. The field specification 808 3 exemplifies a filtered field access method 822 3. Filtered fields identify an associated physical entity and provide filters used to define a particular subset of items within the physical representation. An example is provided in FIG. 8 in which the filtered field access method 822 3 maps the logical field name 820 3 (“AnyTownLastName”) to a physical entity in a column named “l_name” in a table named “contact” and defines a filter for individuals in the city of “Anytown”. Another example of a filtered field is a New York ZIP code field that maps to the physical representation of ZIP codes and restricts the data only to those ZIP codes defined for the state of New York. The field specification 808 4 exemplifies a composed field access method 822 4. Composed access methods compute a logical field from one or more physical fields using an expression supplied as part of the access method definition. In this way, information which does not exist in the underlying physical data representation may be computed. In the example illustrated in FIG. 8 the composed field access method 822 4 maps the logical field name 820 4 “AgeInDecades” to “AgeInYears/10”. Another example is a sales tax field that is composed by multiplying a sales price field by a sales tax rate. - It is contemplated that the formats for any given data type (e.g., dates, decimal numbers, etc.) of the underlying data may vary.
Accordingly, in one embodiment, the
field specifications 808 include a type attribute which reflects the format of the underlying data. However, in another embodiment, the data format of the field specifications 808 is different from the associated underlying physical data, in which case a conversion of the underlying physical data into the format of the logical field is required.
- By way of example, the
field specifications 808 of the data abstraction model 132 shown in FIG. 8 are representative of logical fields mapped to data represented in the relational data representation 714 2 shown in FIG. 7. However, other instances of the data abstraction model 132 map logical fields to other physical representations, such as XML.
- An illustrative abstract query corresponding to the
abstract query 702 shown in FIG. 8 is shown in Table XX below. By way of illustration, the illustrative abstract query is defined using XML. However, any other language may be used to advantage.

TABLE XX ABSTRACT QUERY EXAMPLE
    <?xml version="1.0"?>
    <!--Query string representation: (AgeInYears > "55")-->
    <QueryAbstraction>
      <Selection>
        <Condition internalID="4">
          <Condition field="AgeInYears" operator="GT" value="55" internalID="1"/>
        </Condition>
      </Selection>
      <Results>
        <Field name="FirstName"/>
        <Field name="AnyTownLastName"/>
        <Field name="Street"/>
      </Results>
    </QueryAbstraction>

- Illustratively, the abstract query shown in Table XX includes a selection specification (lines 004-008) containing selection criteria and a results specification (lines 009-013). In one embodiment, a selection criterion consists of a field name (for a logical field), a comparison operator (=, >, <, etc.) and a value expression (what the field is being compared to). In one embodiment, a result specification is a list of abstract fields that are to be returned as a result of query execution. A result specification in the abstract query may consist of a field name and sort criteria.
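By way of further illustration only, the two parts of an abstract query described above may be extracted as in the following minimal Python sketch. The sketch assumes a well-formed variant of the query of Table XX; the function name `parse_abstract_query` and the tuple layout of a selection criterion are hypothetical and are not part of the described embodiments.

```python
import xml.etree.ElementTree as ET

# Well-formed abstract query modeled on Table XX (assumed input for this sketch).
ABSTRACT_QUERY = """<?xml version="1.0"?>
<QueryAbstraction>
  <Selection>
    <Condition field="AgeInYears" operator="GT" value="55" internalID="1"/>
  </Selection>
  <Results>
    <Field name="FirstName"/>
    <Field name="AnyTownLastName"/>
    <Field name="Street"/>
  </Results>
</QueryAbstraction>"""


def parse_abstract_query(xml_text):
    """Return (selection criteria, result field names) of an abstract query.

    Each selection criterion is a (field name, operator, value) tuple.
    Grouping conditions that carry no field attribute are skipped.
    """
    root = ET.fromstring(xml_text)
    criteria = [
        (cond.get("field"), cond.get("operator"), cond.get("value"))
        for cond in root.iter("Condition")
        if cond.get("field") is not None
    ]
    result_fields = [f.get("name") for f in root.findall("./Results/Field")]
    return criteria, result_fields


criteria, result_fields = parse_abstract_query(ABSTRACT_QUERY)
print(criteria)
print(result_fields)
```

The selection criteria and the result fields are returned separately because, as described below, the runtime component processes them in two separate loops.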
- An illustrative data abstraction model (DAM) corresponding to the
data abstraction model 132 shown in FIG. 8 is shown in Table XXI below. By way of illustration, the illustrative Data Abstraction Model is defined using XML. However, any other language may be used to advantage.

TABLE XXI DATA ABSTRACTION MODEL EXAMPLE
    <?xml version="1.0"?>
    <DataAbstraction>
      <Category name="Name and Address">
        <Field queryable="Yes" name="FirstName" displayable="Yes">
          <AccessMethod>
            <Simple columnName="f_name" tableName="contact"></Simple>
          </AccessMethod>
        </Field>
        <Field queryable="Yes" name="LastName" displayable="Yes">
          <AccessMethod>
            <Simple columnName="l_name" tableName="contact"></Simple>
          </AccessMethod>
        </Field>
        <Field queryable="Yes" name="AnyTownLastName" displayable="Yes">
          <AccessMethod>
            <Filter columnName="l_name" tableName="contact" Filter="contact.city=Anytown"></Filter>
          </AccessMethod>
        </Field>
      </Category>
      <Category name="Birth and Age">
        <Field queryable="Yes" name="AgeInDecades" displayable="Yes">
          <AccessMethod>
            <Composed columnName="age" tableName="contact" Expression="columnName/10"></Composed>
          </AccessMethod>
        </Field>
        <Field queryable="Yes" name="AgeInYears" displayable="Yes">
          <AccessMethod>
            <Simple columnName="age" tableName="contact"></Simple>
          </AccessMethod>
        </Field>
      </Category>
    </DataAbstraction>

- By way of example, note that lines 004-008 correspond to the
first field specification 808 1 of the DAM 132 shown in FIG. 8 and lines 009-013 correspond to the second field specification 808 2.
- Referring now to
FIG. 9, an illustrative runtime method 900 exemplifying one embodiment of the operation of the runtime component 134 of FIG. 1 is shown. The method 900 is entered at step 902 when the runtime component receives as input an abstract query (such as the abstract query shown in Table XX). At step 904, the runtime component reads and parses the abstract query and locates individual selection criteria and desired result fields. At step 906, the runtime component enters a loop that processes each selection criterion present in the abstract query. At step 908, the runtime component uses the field name from a selection criterion of the abstract query to look up the definition of the field in the data abstraction model 132 of FIG. 1. As noted above, the field definition includes a definition of the access method used to access the physical data associated with the field. The runtime component then builds (step 910) a concrete query contribution for the logical field being processed. As defined herein, a concrete query contribution is a portion of a concrete query that is used to perform data selection based on the current logical field. A concrete query is a query represented in languages like SQL and XML Query and is consistent with the data of a given physical data repository (e.g., a relational database or XML repository). Accordingly, the concrete query is used to locate and retrieve data from the physical data repository, represented by the database 139 shown in FIG. 1. The concrete query contribution generated for the current field is then added to a concrete query statement. The method 900 then returns to step 906 to begin processing for the next field of the abstract query. Accordingly, the process entered at step 906 is iterated for each data selection field in the abstract query, thereby contributing additional content to the eventual query to be performed.
- After building the data selection portion of the concrete query, the runtime component identifies the information to be returned as a result of query execution.
As described above, in one embodiment, the abstract query defines a list of logical fields that are to be returned as a result of query execution, referred to herein as a result specification. A result specification in the abstract query may consist of a field name and sort criteria. Accordingly, the
method 900 enters a loop at step 914 that processes each result field of the abstract query. At step 916, the runtime component looks up a result field name (from the result specification of the abstract query) in the data abstraction model 132 and then retrieves a result field definition from the data abstraction model 132 to identify the physical location of data to be returned for the current logical result field. The runtime component then builds (at step 918) a concrete query contribution (of the concrete query that identifies the physical location of data to be returned) for the logical result field. At step 920, the concrete query contribution is then added to the concrete query statement. Once each of the result specifications in the abstract query has been processed, the concrete query is executed at step 922.
- One embodiment of a
method 1000 for building a concrete query contribution for a logical field according to steps 910 and 918 of FIG. 9 is now described. At step 1002, the method 1000 queries whether the access method associated with the current logical field is a simple access method. If so, the concrete query contribution is built (step 1004) based on physical data location information and processing then continues according to method 900 described above. Otherwise, processing continues to step 1006 to query whether the access method associated with the current logical field is a filtered access method. If so, the concrete query contribution is built (step 1008) based on physical data location information for some physical data entity. At step 1010, the concrete query contribution is extended with additional logic (filter selection) used to subset data associated with the physical data entity. Processing then continues according to method 900 described above.
- If the access method is not a filtered access method, processing proceeds from
step 1006 to step 1012 where the method 1000 queries whether the access method is a composed access method. If the access method is a composed access method, the physical data location for each sub-field reference in the composed field expression is located and retrieved at step 1014. At step 1016, the physical field location information of the composed field expression is substituted for the logical field references of the composed field expression, whereby the concrete query contribution is generated. Processing then continues according to method 900 described above.
- If the access method is not a composed access method, processing proceeds from
step 1012 to step 1018. Step 1018 is representative of any other access method types contemplated as embodiments of the present invention. However, it should be understood that embodiments are contemplated in which fewer than all the available access methods are implemented. For example, in a particular embodiment only simple access methods are used. In another embodiment, only simple access methods and filtered access methods are used.
- While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.
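By way of further illustration only, the translation described above with reference to methods 900 and 1000 may be sketched in Python as follows. The sketch rests on several assumptions that are not part of the described embodiments: the data abstraction model is held in a hypothetical dictionary named `DAM` modeled on Table XXI, every logical field maps to the single table “contact”, the concrete query language is SQL with `?` parameter placeholders, the operator codes other than “GT” are invented, and the filter of a filtered field is simply appended to the WHERE clause. The function names `build_contribution` and `build_concrete_query` are likewise hypothetical.

```python
import re

TABLE = "contact"  # assumption: all logical fields map to this one table

# Hypothetical in-memory form of the data abstraction model of Table XXI.
DAM = {
    "FirstName": {"method": "simple", "column": "f_name"},
    "LastName": {"method": "simple", "column": "l_name"},
    "AnyTownLastName": {"method": "filtered", "column": "l_name",
                        "filter": "contact.city = 'Anytown'"},
    "AgeInYears": {"method": "simple", "column": "age"},
    "AgeInDecades": {"method": "composed", "expression": "AgeInYears/10"},
}

OPERATORS = {"EQ": "=", "GT": ">", "LT": "<"}  # assumed operator codes


def build_contribution(field_name):
    """Sketch of method 1000: return (SQL expression, extra filters)."""
    spec = DAM[field_name]
    method = spec["method"]
    if method == "simple":        # steps 1002 and 1004
        return f"{TABLE}.{spec['column']}", []
    if method == "filtered":      # steps 1006, 1008 and 1010
        return f"{TABLE}.{spec['column']}", [spec["filter"]]
    if method == "composed":      # steps 1012, 1014 and 1016
        filters = []

        def substitute(match):
            # Replace a logical field reference by its physical location.
            name = match.group(0)
            if name not in DAM:
                return name
            expr, sub_filters = build_contribution(name)
            filters.extend(sub_filters)
            return expr

        expr = re.sub(r"[A-Za-z_]\w*", substitute, spec["expression"])
        return f"({expr})", filters
    raise ValueError(f"unsupported access method: {method}")  # step 1018


def build_concrete_query(criteria, result_fields):
    """Sketch of method 900: return (SQL text, parameter list)."""
    select, where, params = [], [], []
    for field, operator, value in criteria:      # loop entered at step 906
        expr, filters = build_contribution(field)
        where.append(f"{expr} {OPERATORS[operator]} ?")
        params.append(value)
        where.extend(f for f in filters if f not in where)
    for field in result_fields:                  # loop entered at step 914
        expr, filters = build_contribution(field)
        select.append(f"{expr} AS {field}")
        where.extend(f for f in filters if f not in where)
    sql = f"SELECT {', '.join(select)} FROM {TABLE}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


sql, params = build_concrete_query(
    [("AgeInYears", "GT", "55")], ["FirstName", "AnyTownLastName"]
)
print(sql)
print(params)
```

In this sketch the comparison value is passed as a query parameter rather than concatenated into the SQL text, and the “Street” result field of Table XX is omitted because Table XXI defines no such field.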
Claims (71)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/999,498 US20060116983A1 (en) | 2004-11-30 | 2004-11-30 | System and method for ordering query results |
US12/041,743 US8380708B2 (en) | 2004-11-30 | 2008-03-04 | Methods and systems for ordering query results based on annotations |
US12/041,768 US8185525B2 (en) | 2004-11-30 | 2008-03-04 | Ordering query results based on value range filtering |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/999,498 US20060116983A1 (en) | 2004-11-30 | 2004-11-30 | System and method for ordering query results |
Related Child Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/041,768 Division US8185525B2 (en) | 2004-11-30 | 2008-03-04 | Ordering query results based on value range filtering |
US12/041,743 Division US8380708B2 (en) | 2004-11-30 | 2008-03-04 | Methods and systems for ordering query results based on annotations |
Publications (1)
Publication Number | Publication Date |
---|---|
US20060116983A1 (en) | 2006-06-01 |
Family
ID=36568412
Family Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/999,498 Abandoned US20060116983A1 (en) | 2004-11-30 | 2004-11-30 | System and method for ordering query results |
US12/041,768 Expired - Fee Related US8185525B2 (en) | 2004-11-30 | 2008-03-04 | Ordering query results based on value range filtering |
US12/041,743 Active 2026-10-21 US8380708B2 (en) | 2004-11-30 | 2008-03-04 | Methods and systems for ordering query results based on annotations |
Country Status (1)
Country | Link |
---|---|
US (3) | US20060116983A1 (en) |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060242178A1 (en) * | 2005-04-21 | 2006-10-26 | Yahoo! Inc. | Media object metadata association and ranking |
US20060242139A1 (en) * | 2005-04-21 | 2006-10-26 | Yahoo! Inc. | Interestingness ranking of media objects |
US20070276858A1 (en) * | 2006-05-22 | 2007-11-29 | Cushman James B Ii | Method and system for indexing information about entities with respect to hierarchies |
US20080154901A1 (en) * | 2004-11-30 | 2008-06-26 | International Business Machines Corporation | Methods and systems for ordering query results based on annotations |
US20080229221A1 (en) * | 2007-03-14 | 2008-09-18 | Xerox Corporation | Graphical user interface for gathering image evaluation information |
US20090089630A1 (en) * | 2007-09-28 | 2009-04-02 | Initiate Systems, Inc. | Method and system for analysis of a system for matching data records |
US20100318541A1 (en) * | 2009-06-15 | 2010-12-16 | International Business Machines Corporation | Filter Range Bound Paged Search |
US20110010346A1 (en) * | 2007-03-22 | 2011-01-13 | Glenn Goldenberg | Processing related data from information sources |
US8321383B2 (en) | 2006-06-02 | 2012-11-27 | International Business Machines Corporation | System and method for automatic weight generation for probabilistic matching |
US8321393B2 (en) | 2007-03-29 | 2012-11-27 | International Business Machines Corporation | Parsing information in data records and in different languages |
US8356009B2 (en) | 2006-09-15 | 2013-01-15 | International Business Machines Corporation | Implementation defined segments for relational database systems |
US8359339B2 (en) | 2007-02-05 | 2013-01-22 | International Business Machines Corporation | Graphical user interface for configuration of an algorithm for the matching of data records |
US8370355B2 (en) | 2007-03-29 | 2013-02-05 | International Business Machines Corporation | Managing entities within a database |
US8370366B2 (en) | 2006-09-15 | 2013-02-05 | International Business Machines Corporation | Method and system for comparing attributes such as business names |
US8417702B2 (en) | 2007-09-28 | 2013-04-09 | International Business Machines Corporation | Associating data records in multiple languages |
US8423514B2 (en) | 2007-03-29 | 2013-04-16 | International Business Machines Corporation | Service provisioning |
US8429220B2 (en) | 2007-03-29 | 2013-04-23 | International Business Machines Corporation | Data exchange among data sources |
US8589415B2 (en) | 2006-09-15 | 2013-11-19 | International Business Machines Corporation | Method and system for filtering false positives |
US8713434B2 (en) | 2007-09-28 | 2014-04-29 | International Business Machines Corporation | Indexing, relating and managing information about entities |
US9094225B1 (en) * | 2006-12-27 | 2015-07-28 | Google Inc. | Discovery of short-term and emerging trends in computer network traffic |
WO2016183550A1 (en) * | 2015-05-14 | 2016-11-17 | Walleye Software, LLC | Dynamic table index mapping |
US20160371275A1 (en) * | 2015-06-18 | 2016-12-22 | Microsoft Technology Licensing, Llc | Automated database schema annotation |
US10002154B1 (en) | 2017-08-24 | 2018-06-19 | Illumon Llc | Computer data system data source having an update propagation graph with feedback cyclicality |
WO2019062412A1 (en) * | 2017-09-30 | 2019-04-04 | Oppo广东移动通信有限公司 | Method and apparatus for recording application information, storage medium, and electronic device |
CN112819582A (en) * | 2021-02-04 | 2021-05-18 | 苏州达家迎信息技术有限公司 | Order data display method and device, storage medium and electronic equipment |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10144241B2 (en) * | 2016-08-19 | 2018-12-04 | Charles E. Emmott | Separable or opening portions for printable sheet material |
KR101548273B1 (en) * | 2009-04-08 | 2015-08-28 | 삼성전자주식회사 | Apparatus and method for improving web searching speed in portable terminal |
JP2012059219A (en) * | 2010-09-13 | 2012-03-22 | Fuji Xerox Co Ltd | Program and information collection supporting system |
CN102830950B (en) * | 2012-08-03 | 2016-05-04 | 苏州迈科网络安全技术股份有限公司 | The sort method of monitor data and system |
US9244952B2 (en) | 2013-03-17 | 2016-01-26 | Alation, Inc. | Editable and searchable markup pages automatically populated through user query monitoring |
USD760008S1 (en) * | 2014-07-08 | 2016-06-28 | Clover Co., Ltd. | Beverage machine |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030028495A1 (en) * | 2001-08-06 | 2003-02-06 | Pallante Joseph T. | Trusted third party services system and method |
US20040068489A1 (en) * | 2002-10-03 | 2004-04-08 | International Business Machines Corporation | SQL query construction using durable query components |
US6725227B1 (en) * | 1998-10-02 | 2004-04-20 | Nec Corporation | Advanced web bookmark database system |
US20050222989A1 (en) * | 2003-09-30 | 2005-10-06 | Taher Haveliwala | Results based personalization of advertisements in a search engine |
Family Cites Families (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4575798A (en) | 1983-06-03 | 1986-03-11 | International Business Machines Corporation | External sorting using key value distribution and range formation |
JP3151820B2 (en) | 1990-09-19 | 2001-04-03 | 富士通株式会社 | Sorting method based on count classification using relative keys |
US5630121A (en) | 1993-02-02 | 1997-05-13 | International Business Machines Corporation | Archiving and retrieving multimedia objects using structured indexes |
US5619709A (en) * | 1993-09-20 | 1997-04-08 | Hnc, Inc. | System and method of context vector generation and retrieval |
US5734887A (en) | 1995-09-29 | 1998-03-31 | International Business Machines Corporation | Method and apparatus for logical data access to a physical relational database |
US6061677A (en) * | 1997-06-09 | 2000-05-09 | Microsoft Corporation | Database query system and method |
US6012053A (en) * | 1997-06-23 | 2000-01-04 | Lycos, Inc. | Computer system with user-controlled relevance ranking of search results |
US6009422A (en) | 1997-11-26 | 1999-12-28 | International Business Machines Corporation | System and method for query translation/semantic translation using generalized query language |
US6460043B1 (en) | 1998-02-04 | 2002-10-01 | Microsoft Corporation | Method and apparatus for operating on data with a conceptual data manipulation language |
US6553368B2 (en) | 1998-03-03 | 2003-04-22 | Sun Microsystems, Inc. | Network directory access mechanism |
US6233586B1 (en) | 1998-04-01 | 2001-05-15 | International Business Machines Corp. | Federated searching of heterogeneous datastores using a federated query object |
US6108650A (en) * | 1998-08-21 | 2000-08-22 | Myway.Com Corporation | Method and apparatus for an accelerated radius search |
US6457009B1 (en) | 1998-11-09 | 2002-09-24 | Denison W. Bollay | Method of searching multiples internet resident databases using search fields in a generic form |
US6611825B1 (en) * | 1999-06-09 | 2003-08-26 | The Boeing Company | Method and system for text mining using multidimensional subspaces |
CA2743462C (en) * | 1999-07-30 | 2012-10-16 | Basantkumar John Oommen | A method of generating attribute cardinality maps |
US6760720B1 (en) * | 2000-02-25 | 2004-07-06 | Pedestrian Concepts, Inc. | Search-on-the-fly/sort-on-the-fly search engine for searching databases |
US20020062258A1 (en) | 2000-05-18 | 2002-05-23 | Bailey Steven C. | Computer-implemented procurement of items using parametric searching |
US7024425B2 (en) | 2000-09-07 | 2006-04-04 | Oracle International Corporation | Method and apparatus for flexible storage and uniform manipulation of XML data in a relational database system |
US7158989B2 (en) * | 2000-10-27 | 2007-01-02 | Buc International Corporation | Limit engine database management system |
US6601065B1 (en) | 2000-12-21 | 2003-07-29 | Cisco Technology, Inc. | Method and apparatus for accessing a database through a network |
US6980984B1 (en) * | 2001-05-16 | 2005-12-27 | Kanisa, Inc. | Content provider systems and methods using structured data |
US6996558B2 (en) * | 2002-02-26 | 2006-02-07 | International Business Machines Corporation | Application portability and extensibility through database schema and query abstraction |
US20030182278A1 (en) * | 2002-03-25 | 2003-09-25 | Valk Jeffrey W. | Stateless cursor for information management system |
US6994613B2 (en) * | 2002-04-05 | 2006-02-07 | Michael Hacikyan | Grinding apparatus with splash protector and improved fluid delivery system |
US6928431B2 (en) | 2002-04-25 | 2005-08-09 | International Business Machines Corporation | Dynamic end user specific customization of an application's physical data layer through a data repository abstraction layer |
US6954748B2 (en) | 2002-04-25 | 2005-10-11 | International Business Machines Corporation | Remote data access and integration of distributed data sources through data schema and query abstraction |
US7096229B2 (en) | 2002-05-23 | 2006-08-22 | International Business Machines Corporation | Dynamic content generation/regeneration for a database schema abstraction |
US6944613B2 (en) * | 2002-12-13 | 2005-09-13 | Sciquest, Inc. | Method and system for creating a database and searching the database for allowing multiple customized views |
US8271369B2 (en) * | 2003-03-12 | 2012-09-18 | Norman Gilmore | Financial modeling and forecasting system |
US20050027705A1 (en) * | 2003-05-20 | 2005-02-03 | Pasha Sadri | Mapping method and system |
US7734566B2 (en) * | 2004-11-01 | 2010-06-08 | Sap Ag | Information retrieval method with efficient similarity search capability |
US20060116983A1 (en) * | 2004-11-30 | 2006-06-01 | International Business Machines Corporation | System and method for ordering query results |
Cited By (109)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080154902A1 (en) * | 2004-11-30 | 2008-06-26 | International Business Machines Corporation | Methods and systems for ordering query results based on annotations |
US8185525B2 (en) | 2004-11-30 | 2012-05-22 | International Business Machines Corporation | Ordering query results based on value range filtering |
US8380708B2 (en) | 2004-11-30 | 2013-02-19 | International Business Machines Corporation | Methods and systems for ordering query results based on annotations |
US20080154901A1 (en) * | 2004-11-30 | 2008-06-26 | International Business Machines Corporation | Methods and systems for ordering query results based on annotations |
US10216763B2 (en) | 2005-04-21 | 2019-02-26 | Oath Inc. | Interestingness ranking of media objects |
US10210159B2 (en) | 2005-04-21 | 2019-02-19 | Oath Inc. | Media object metadata association and ranking |
US20100057555A1 (en) * | 2005-04-21 | 2010-03-04 | Yahoo! Inc. | Media object metadata association and ranking |
US8732175B2 (en) * | 2005-04-21 | 2014-05-20 | Yahoo! Inc. | Interestingness ranking of media objects |
US20060242139A1 (en) * | 2005-04-21 | 2006-10-26 | Yahoo! Inc. | Interestingness ranking of media objects |
US20060242178A1 (en) * | 2005-04-21 | 2006-10-26 | Yahoo! Inc. | Media object metadata association and ranking |
US7526486B2 (en) * | 2006-05-22 | 2009-04-28 | Initiate Systems, Inc. | Method and system for indexing information about entities with respect to hierarchies |
US8510338B2 (en) * | 2006-05-22 | 2013-08-13 | International Business Machines Corporation | Indexing information about entities with respect to hierarchies |
US20070276858A1 (en) * | 2006-05-22 | 2007-11-29 | Cushman James B Ii | Method and system for indexing information about entities with respect to hierarchies |
US8321383B2 (en) | 2006-06-02 | 2012-11-27 | International Business Machines Corporation | System and method for automatic weight generation for probabilistic matching |
US8332366B2 (en) | 2006-06-02 | 2012-12-11 | International Business Machines Corporation | System and method for automatic weight generation for probabilistic matching |
US8356009B2 (en) | 2006-09-15 | 2013-01-15 | International Business Machines Corporation | Implementation defined segments for relational database systems |
US8589415B2 (en) | 2006-09-15 | 2013-11-19 | International Business Machines Corporation | Method and system for filtering false positives |
US8370366B2 (en) | 2006-09-15 | 2013-02-05 | International Business Machines Corporation | Method and system for comparing attributes such as business names |
US9094225B1 (en) * | 2006-12-27 | 2015-07-28 | Google Inc. | Discovery of short-term and emerging trends in computer network traffic |
US8359339B2 (en) | 2007-02-05 | 2013-01-22 | International Business Machines Corporation | Graphical user interface for configuration of an algorithm for the matching of data records |
US20080229221A1 (en) * | 2007-03-14 | 2008-09-18 | Xerox Corporation | Graphical user interface for gathering image evaluation information |
US7904825B2 (en) * | 2007-03-14 | 2011-03-08 | Xerox Corporation | Graphical user interface for gathering image evaluation information |
US20110010346A1 (en) * | 2007-03-22 | 2011-01-13 | Glenn Goldenberg | Processing related data from information sources |
US8515926B2 (en) | 2007-03-22 | 2013-08-20 | International Business Machines Corporation | Processing related data from information sources |
US8429220B2 (en) | 2007-03-29 | 2013-04-23 | International Business Machines Corporation | Data exchange among data sources |
US8370355B2 (en) | 2007-03-29 | 2013-02-05 | International Business Machines Corporation | Managing entities within a database |
US8321393B2 (en) | 2007-03-29 | 2012-11-27 | International Business Machines Corporation | Parsing information in data records and in different languages |
US8423514B2 (en) | 2007-03-29 | 2013-04-16 | International Business Machines Corporation | Service provisioning |
US9286374B2 (en) | 2007-09-28 | 2016-03-15 | International Business Machines Corporation | Method and system for indexing, relating and managing information about entities |
US8417702B2 (en) | 2007-09-28 | 2013-04-09 | International Business Machines Corporation | Associating data records in multiple languages |
US8713434B2 (en) | 2007-09-28 | 2014-04-29 | International Business Machines Corporation | Indexing, relating and managing information about entities |
US8799282B2 (en) | 2007-09-28 | 2014-08-05 | International Business Machines Corporation | Analysis of a system for matching data records |
US10698755B2 (en) | 2007-09-28 | 2020-06-30 | International Business Machines Corporation | Analysis of a system for matching data records |
US20090089630A1 (en) * | 2007-09-28 | 2009-04-02 | Initiate Systems, Inc. | Method and system for analysis of a system for matching data records |
US9600563B2 (en) | 2007-09-28 | 2017-03-21 | International Business Machines Corporation | Method and system for indexing, relating and managing information about entities |
US20100318541A1 (en) * | 2009-06-15 | 2010-12-16 | International Business Machines Corporation | Filter Range Bound Paged Search |
US20120166455A1 (en) * | 2009-06-15 | 2012-06-28 | International Business Machines Corporation | Filter Range Bound Paged Search |
US8423560B2 (en) * | 2009-06-15 | 2013-04-16 | International Business Machines Corporation | Filter range bound paged search |
US8219565B2 (en) * | 2009-06-15 | 2012-07-10 | International Business Machines Corporation | Filter range bound paged search |
US10002153B2 (en) | 2015-05-14 | 2018-06-19 | Illumon Llc | Remote data object publishing/subscribing system having a multicast key-value protocol |
US10452649B2 (en) | 2015-05-14 | 2019-10-22 | Deephaven Data Labs Llc | Computer data distribution architecture |
US9619210B2 (en) | 2015-05-14 | 2017-04-11 | Walleye Software, LLC | Parsing and compiling data system queries |
US9639570B2 (en) | 2015-05-14 | 2017-05-02 | Walleye Software, LLC | Data store access permission system with interleaved application of deferred access control filters |
US9672238B2 (en) | 2015-05-14 | 2017-06-06 | Walleye Software, LLC | Dynamic filter processing |
US9679006B2 (en) | 2015-05-14 | 2017-06-13 | Walleye Software, LLC | Dynamic join processing using real time merged notification listener |
US9690821B2 (en) | 2015-05-14 | 2017-06-27 | Walleye Software, LLC | Computer data system position-index mapping |
US9710511B2 (en) | 2015-05-14 | 2017-07-18 | Walleye Software, LLC | Dynamic table index mapping |
US9760591B2 (en) | 2015-05-14 | 2017-09-12 | Walleye Software, LLC | Dynamic code loading |
US9805084B2 (en) | 2015-05-14 | 2017-10-31 | Walleye Software, LLC | Computer data system data source refreshing using an update propagation graph |
US9836495B2 (en) | 2015-05-14 | 2017-12-05 | Illumon Llc | Computer assisted completion of hyperlink command segments |
US9836494B2 (en) | 2015-05-14 | 2017-12-05 | Illumon Llc | Importation, presentation, and persistent storage of data |
US9886469B2 (en) | 2015-05-14 | 2018-02-06 | Walleye Software, LLC | System performance logging of complex remote query processor query operations |
US9898496B2 (en) | 2015-05-14 | 2018-02-20 | Illumon Llc | Dynamic code loading |
US9934266B2 (en) | 2015-05-14 | 2018-04-03 | Walleye Software, LLC | Memory-efficient computer system for dynamic updating of join processing |
US11687529B2 (en) | 2015-05-14 | 2023-06-27 | Deephaven Data Labs Llc | Single input graphical user interface control element and method |
US9613109B2 (en) | 2015-05-14 | 2017-04-04 | Walleye Software, LLC | Query task processing based on memory allocation and performance criteria |
US10002155B1 (en) | 2015-05-14 | 2018-06-19 | Illumon Llc | Dynamic code loading |
US10003673B2 (en) | 2015-05-14 | 2018-06-19 | Illumon Llc | Computer data distribution architecture |
US10019138B2 (en) | 2015-05-14 | 2018-07-10 | Illumon Llc | Applying a GUI display effect formula in a hidden column to a section of data |
US10069943B2 (en) | 2015-05-14 | 2018-09-04 | Illumon Llc | Query dispatch and execution architecture |
US10176211B2 (en) | 2015-05-14 | 2019-01-08 | Deephaven Data Labs Llc | Dynamic table index mapping |
US10198465B2 (en) | 2015-05-14 | 2019-02-05 | Deephaven Data Labs Llc | Computer data system current row position query language construct and array processing query language constructs |
US10198466B2 (en) | 2015-05-14 | 2019-02-05 | Deephaven Data Labs Llc | Data store access permission system with interleaved application of deferred access control filters |
US11663208B2 (en) | 2015-05-14 | 2023-05-30 | Deephaven Data Labs Llc | Computer data system current row position query language construct and array processing query language constructs |
US9613018B2 (en) | 2015-05-14 | 2017-04-04 | Walleye Software, LLC | Applying a GUI display effect formula in a hidden column to a section of data |
US10212257B2 (en) | 2015-05-14 | 2019-02-19 | Deephaven Data Labs Llc | Persistent query dispatch and execution architecture |
US11556528B2 (en) | 2015-05-14 | 2023-01-17 | Deephaven Data Labs Llc | Dynamic updating of query result displays |
US11514037B2 (en) | 2015-05-14 | 2022-11-29 | Deephaven Data Labs Llc | Remote data object publishing/subscribing system having a multicast key-value protocol |
US10242040B2 (en) | 2015-05-14 | 2019-03-26 | Deephaven Data Labs Llc | Parsing and compiling data system queries |
US10241960B2 (en) | 2015-05-14 | 2019-03-26 | Deephaven Data Labs Llc | Historical data replay utilizing a computer system |
US10242041B2 (en) | 2015-05-14 | 2019-03-26 | Deephaven Data Labs Llc | Dynamic filter processing |
US11263211B2 (en) | 2015-05-14 | 2022-03-01 | Deephaven Data Labs, LLC | Data partitioning and ordering |
US10346394B2 (en) | 2015-05-14 | 2019-07-09 | Deephaven Data Labs Llc | Importation, presentation, and persistent storage of data |
US10353893B2 (en) | 2015-05-14 | 2019-07-16 | Deephaven Data Labs Llc | Data partitioning and ordering |
US11249994B2 (en) | 2015-05-14 | 2022-02-15 | Deephaven Data Labs Llc | Query task processing based on memory allocation and performance criteria |
US9612959B2 (en) | 2015-05-14 | 2017-04-04 | Walleye Software, LLC | Distributed and optimized garbage collection of remote and exported table handle links to update propagation graph nodes |
US10496639B2 (en) | 2015-05-14 | 2019-12-03 | Deephaven Data Labs Llc | Computer data distribution architecture |
US10540351B2 (en) | 2015-05-14 | 2020-01-21 | Deephaven Data Labs Llc | Query dispatch and execution architecture |
US10552412B2 (en) | 2015-05-14 | 2020-02-04 | Deephaven Data Labs Llc | Query task processing based on memory allocation and performance criteria |
US10565194B2 (en) | 2015-05-14 | 2020-02-18 | Deephaven Data Labs Llc | Computer system for join processing |
US10565206B2 (en) | 2015-05-14 | 2020-02-18 | Deephaven Data Labs Llc | Query task processing based on memory allocation and performance criteria |
US10572474B2 (en) | 2015-05-14 | 2020-02-25 | Deephaven Data Labs Llc | Computer data system data source refreshing using an update propagation graph |
US10621168B2 (en) | 2015-05-14 | 2020-04-14 | Deephaven Data Labs Llc | Dynamic join processing using real time merged notification listener |
US10642829B2 (en) | 2015-05-14 | 2020-05-05 | Deephaven Data Labs Llc | Distributed and optimized garbage collection of exported data objects |
US11238036B2 (en) | 2015-05-14 | 2022-02-01 | Deephaven Data Labs, LLC | System performance logging of complex remote query processor query operations |
US10678787B2 (en) | 2015-05-14 | 2020-06-09 | Deephaven Data Labs Llc | Computer assisted completion of hyperlink command segments |
US10691686B2 (en) | 2015-05-14 | 2020-06-23 | Deephaven Data Labs Llc | Computer data system position-index mapping |
WO2016183550A1 (en) * | 2015-05-14 | 2016-11-17 | Walleye Software, LLC | Dynamic table index mapping |
US11151133B2 (en) | 2015-05-14 | 2021-10-19 | Deephaven Data Labs, LLC | Computer data distribution architecture |
US11023462B2 (en) | 2015-05-14 | 2021-06-01 | Deephaven Data Labs, LLC | Single input graphical user interface control element and method |
US10929394B2 (en) | 2015-05-14 | 2021-02-23 | Deephaven Data Labs Llc | Persistent query dispatch and execution architecture |
US10915526B2 (en) | 2015-05-14 | 2021-02-09 | Deephaven Data Labs Llc | Historical data replay utilizing a computer system |
US10922311B2 (en) | 2015-05-14 | 2021-02-16 | Deephaven Data Labs Llc | Dynamic updating of query result displays |
US10452661B2 (en) * | 2015-06-18 | 2019-10-22 | Microsoft Technology Licensing, Llc | Automated database schema annotation |
US20160371275A1 (en) * | 2015-06-18 | 2016-12-22 | Microsoft Technology Licensing, Llc | Automated database schema annotation |
US11449557B2 (en) | 2017-08-24 | 2022-09-20 | Deephaven Data Labs Llc | Computer data distribution architecture for efficient distribution and synchronization of plotting processing and data |
US10241965B1 (en) | 2017-08-24 | 2019-03-26 | Deephaven Data Labs Llc | Computer data distribution architecture connecting an update propagation graph through multiple remote query processors |
US10783191B1 (en) | 2017-08-24 | 2020-09-22 | Deephaven Data Labs Llc | Computer data distribution architecture for efficient distribution and synchronization of plotting processing and data |
US10657184B2 (en) | 2017-08-24 | 2020-05-19 | Deephaven Data Labs Llc | Computer data system data source having an update propagation graph with feedback cyclicality |
US10866943B1 (en) | 2017-08-24 | 2020-12-15 | Deephaven Data Labs Llc | Keyed row selection |
US11941060B2 (en) | 2017-08-24 | 2024-03-26 | Deephaven Data Labs Llc | Computer data distribution architecture for efficient distribution and synchronization of plotting processing and data |
US10909183B2 (en) | 2017-08-24 | 2021-02-02 | Deephaven Data Labs Llc | Computer data system data source refreshing using an update propagation graph having a merged join listener |
US11126662B2 (en) | 2017-08-24 | 2021-09-21 | Deephaven Data Labs Llc | Computer data distribution architecture connecting an update propagation graph through multiple remote query processors |
US11860948B2 (en) | 2017-08-24 | 2024-01-02 | Deephaven Data Labs Llc | Keyed row selection |
US11574018B2 (en) | 2017-08-24 | 2023-02-07 | Deephaven Data Labs Llc | Computer data distribution architecture connecting an update propagation graph through multiple remote query processing |
US10198469B1 (en) | 2017-08-24 | 2019-02-05 | Deephaven Data Labs Llc | Computer data system data source refreshing using an update propagation graph having a merged join listener |
US10002154B1 (en) | 2017-08-24 | 2018-06-19 | Illumon Llc | Computer data system data source having an update propagation graph with feedback cyclicality |
WO2019062412A1 (en) * | 2017-09-30 | 2019-04-04 | Oppo广东移动通信有限公司 | Method and apparatus for recording application information, storage medium, and electronic device |
CN112819582A (en) * | 2021-02-04 | 2021-05-18 | 苏州达家迎信息技术有限公司 | Order data display method and device, storage medium and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
US8185525B2 (en) | 2012-05-22 |
US8380708B2 (en) | 2013-02-19 |
US20080154902A1 (en) | 2008-06-26 |
US20080154901A1 (en) | 2008-06-26 |
Similar Documents
Publication | Title |
---|---|
US8185525B2 (en) | Ordering query results based on value range filtering |
US8027985B2 (en) | Sorting data records contained in a query result |
US7472116B2 (en) | Method for filtering query results using model entity limitations |
US8027971B2 (en) | Relationship management in a data abstraction model |
US7805465B2 (en) | Metadata management for a data abstraction model |
US7844623B2 (en) | Method to provide management of query output |
US8285739B2 (en) | System and method for identifying qualifying data records from underlying databases |
US7139774B2 (en) | Singleton abstract model correspondence to multiple physical models |
US7752215B2 (en) | System and method for protecting sensitive data |
US8140595B2 (en) | Linked logical fields |
US7584178B2 (en) | Query condition building using predefined query objects |
US20060116999A1 (en) | Sequential stepwise query condition building |
US20080228716A1 (en) | System and method for accessing unstructured data using a structured database query environment |
US20060206468A1 (en) | Rule application management in an abstract database |
US20070143245A1 (en) | System and method for managing presentation of query results |
US20080016049A1 (en) | Natural language support for query results |
US20080040320A1 (en) | Method and system for filtering data |
US20080016047A1 (en) | System and method for creating and populating dynamic, just in time, database tables |
US7814127B2 (en) | Natural language support for database applications |
US7761461B2 (en) | Method and system for relationship building from XML |
US20080189289A1 (en) | Generating logical fields for a data abstraction model |
US20080168043A1 (en) | System and method for optimizing query results in an abstract data environment |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: INTERNATIONAL BUSINESS MICHINES CORPORATION, NEW YORK Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:DETTINGER, RICHARD D.;KOLZ, DANIEL P.;KULACK, FREDERICK A.;SIGNING DATES FROM 20041129 TO 20041130;REEL/FRAME:015628/0846 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
AS | Assignment | Owner name: AIRBNB, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:INTERNATIONAL BUSINESS MACHINES CORPORATION;REEL/FRAME:056427/0193 Effective date: 20210106 |
AS | Assignment | Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW YORK Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE RECEIVING PARTY'S NAME PREVIOUSLY RECORDED AT REEL: 015628 FRAME: 0846. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT;ASSIGNORS:DETTINGER, RICHARD D.;KOLZ, DANIEL P.;KULACK, FREDERICK A.;SIGNING DATES FROM 20041129 TO 20041130;REEL/FRAME:055859/0606 |