CN103942245A - Data extracting method based on metadata - Google Patents
- Publication number
- CN103942245A
- Authority
- CN
- China
- Prior art keywords
- data
- model
- data pick
- metadata
- definition
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a metadata-based data extraction method and belongs to the field of data extraction. In this method, a data extraction model is built on the common metadata model of the business models, and business data is extracted and formulated from the business models. Compared with the prior art, the method starts from the data elements of industry standard specifications: the business models are refined and combed through to produce a metadata model, and business data is mapped to the metadata. The metadata is then classified by business and mapped to the established data extraction model, forming a metadata-based data extraction model. This makes business data extraction flexible, and the method has good value for popularization and application.
Description
Technical field
The present invention relates to the field of data extraction, and specifically to a metadata-based data extraction method.
Background art
Each business line in the health industry has a large number of complex business models, and the corresponding data models have complex table structures and numerous fields.
Most existing data extraction models are designed specifically for each business-line model or for one independent business model. Such designs are not only complex but also adapt poorly to change. When industry standard specifications change, or when forms differ between regions, they require tedious and complicated modifications, create a heavy maintenance workload, and are hard to extend.
Summary of the invention
The technical task of the present invention is to address the above deficiencies of the prior art by providing a metadata-based data extraction method.
This task is achieved as follows: the metadata-based data extraction method is characterized in that a data extraction model is built on the common metadata model of the business model, and business data is extracted and formulated from the business model.
The metadata model is extracted from the business model, and an association between the business model and the metadata model is established.
The data extraction model comprises the model definition, the data extraction item definitions, the update source definition for each data item, and the classified extraction logic. Data extraction items are associated with metadata, and metadata is associated with business data, which establishes a relationship among the three, so that business data is extracted through the metadata-based data extraction model.
Implementing the method involves the data extraction model definition, the data extraction item definition, the data update source definition, and the classified data extraction logic:
The data extraction model definition defines the framework of the extraction model, classifying and aggregating the data to be extracted from different aspects, dimensions, and viewpoints. Each model definition includes the basic attributes (model internal code, name, description), the processing mode of the model, whether a jump flag is needed, and the name of the table that stores detail records;
The data extraction item definition includes the extraction attributes of each item: the processing mode, data type, length, and precision of the item's field value, and the processing type;
The data update source definition defines the data source of each data extraction item, determining when the item's data is updated and from which metadata. It includes the algorithm definition used to compute the source, the metadata identifier, and the data source definition;
The classified data extraction logic comprises cumulative-class extraction, basic-information-class extraction, and update-summary-class extraction.
Compared with the prior art, the method of the present invention starts from the data elements of industry standard specifications. The business models are refined and combed through to produce a metadata model, and business data is mapped to the metadata. The metadata is classified by business and mapped to the established data extraction model, forming a metadata-based data extraction model and achieving flexible business data extraction. The method has the following notable benefits:
(1) The extraction model is built on the underlying metadata model, so changes to the business model do not force extensive modification and maintenance.
(2) The data extraction model corresponds to the metadata, so extraction can be versioned according to the metadata version, which helps manage versions of the extraction model.
(3) The data update source definitions handle the update mechanism of extracted data effectively.
(4) Different types of processing logic handle, in a targeted way, the operations each extraction model performs when business data changes. The models are unified, which eases management and extension.
Brief description of the drawings
Figure 1 is a diagram of the data extraction model in the method of the invention;
Figure 2 is a sample data extraction model in the embodiment;
Figure 3 is a sample data extraction item definition in the embodiment;
Figure 4 is a sample data update source definition in the embodiment;
Figure 5 is a brief class diagram of the data extraction model in the embodiment.
Embodiment
The metadata-based data extraction method of the present invention is described in detail below with reference to the drawings and a specific embodiment.
Embodiment:
The metadata-based data extraction method of the present invention comprises the data extraction model definition, the data extraction item definition, the data update source definition, and the classified data extraction logic.
Each is described further below:
(1) Data extraction model definition
According to business requirements, the framework of the extraction model is defined, and the data to be extracted is classified and aggregated from different aspects, dimensions, and viewpoints. Each model definition includes the basic attributes (model internal code, name, description), the processing mode of the model, whether a jump flag is needed, and the name of the table that stores detail records. Together, these definitions determine how a data extraction model is processed.
Attribute | Description |
---|---|
Processing mode | Cumulative class, basic information class, or update-summary class. |
Jump flag | Defines whether jump fields exist. A jump field requires detail records to be generated for it. |
Detail table name | Configures the table that stores jump-field records. |
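The attributes above can be pictured as a small record type. The following is a minimal Python sketch, not the patent's implementation: the class and field names (`ExtractionModelDefinition`, `has_jump_fields`, `detail_table`), the sample values, and the rule that a model with jump (redirect) fields must name a detail table are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ProcessingMode(Enum):
    """The three processing modes named in the table above."""
    CUMULATIVE = "cumulative"
    BASIC_INFO = "basic_info"
    UPDATE_SUMMARY = "update_summary"


@dataclass(frozen=True)
class ExtractionModelDefinition:
    # Basic attributes: internal code, name, description.
    code: str
    name: str
    description: str
    processing_mode: ProcessingMode
    # Jump flag: whether the model contains jump (redirect) fields.
    has_jump_fields: bool = False
    # Table that stores the detail records generated for jump fields.
    detail_table: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumed consistency rule: jump-field records need somewhere to go.
        if self.has_jump_fields and not self.detail_table:
            raise ValueError("a model with jump fields needs a detail table")


# Hypothetical example model.
visits = ExtractionModelDefinition(
    code="M001",
    name="Outpatient visits",
    description="Visit records aggregated per patient",
    processing_mode=ProcessingMode.CUMULATIVE,
    has_jump_fields=True,
    detail_table="visit_extract_detail",
)
```

Keeping the definition immutable (`frozen=True`) reflects the idea that a model definition is configuration, while the extracted data changes.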
(2) Data extraction item definition
Each data extraction model, or each class of model, consists of a number of data items to be extracted. Each data extraction item includes its extraction attributes: the processing mode, data type, length, and precision of the item's field value, and the processing type. These item definitions determine how the detail items of a data extraction model are extracted.
Attribute | Description |
---|---|
Processing mode | The specific field type (MERGE: merge-type field; JUMP: jump-type field). |
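As an illustration of an item definition, here is a hedged Python sketch. Only the MERGE and JUMP type names come from the text; the class name, the `metadata_id` link, and the sample items are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class FieldHandling(Enum):
    MERGE = "MERGE"  # merge-type field
    JUMP = "JUMP"    # jump-type field, which also needs detail records


@dataclass(frozen=True)
class ExtractionItemDefinition:
    item_code: str
    metadata_id: str         # assumed link from the item to its metadata element
    handling: FieldHandling  # processing mode of the field
    data_type: str
    length: int
    precision: int = 0

    def needs_detail_record(self) -> bool:
        # Jump-type fields require detail records in addition to the value.
        return self.handling is FieldHandling.JUMP


# Hypothetical items of one extraction model.
items: List[ExtractionItemDefinition] = [
    ExtractionItemDefinition("patient_name", "MD_NAME", FieldHandling.MERGE, "string", 50),
    ExtractionItemDefinition("diagnosis", "MD_DIAG", FieldHandling.JUMP, "string", 200),
    ExtractionItemDefinition("total_fee", "MD_FEE", FieldHandling.MERGE, "decimal", 12, 2),
]

jump_items = [item.item_code for item in items if item.needs_detail_record()]
```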
(3) Data update source definition
The data update source mainly defines the data source of each data extraction item, determining when the item's data is updated and from which metadata. It comprises the algorithm definition used to compute the source, the metadata identifier, and the data source definition.
Attribute | Description |
---|---|
Algorithm | The processing algorithm can be invoked dynamically according to the configured processing class, which allows extraction algorithms to be extended dynamically. |
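One way to picture dynamic invocation of a processing algorithm is a name-to-function registry. The text describes dispatch by a configured processing class; the sketch below swaps in a plain registry keyed by algorithm name to stay short. All names (`ALGORITHMS`, `register`, `UpdateSourceDefinition`, the `"sum"` and `"overwrite"` algorithms) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Registry of update algorithms, keyed by the name stored in the definition.
ALGORITHMS: Dict[str, Callable[[Any, Any], Any]] = {}


def register(name: str) -> Callable:
    """Decorator that adds an algorithm to the registry under `name`."""
    def decorator(fn: Callable[[Any, Any], Any]) -> Callable[[Any, Any], Any]:
        ALGORITHMS[name] = fn
        return fn
    return decorator


@register("overwrite")
def overwrite(current: Any, incoming: Any) -> Any:
    # Replace the extracted value with the incoming one.
    return incoming


@register("sum")
def accumulate(current: Any, incoming: Any) -> Any:
    # Add the incoming value to the running total (None counts as 0).
    return (current or 0) + incoming


@dataclass(frozen=True)
class UpdateSourceDefinition:
    item_code: str    # the data extraction item being updated
    metadata_id: str  # metadata identifier the update comes from
    data_source: str  # data source definition, here simply a table name
    algorithm: str    # name of the algorithm to invoke

    def apply(self, current: Any, incoming: Any) -> Any:
        # Look the algorithm up at call time, so new ones can be registered later.
        return ALGORITHMS[self.algorithm](current, incoming)


source = UpdateSourceDefinition("visit_count", "MD_VISIT", "outpatient_visit", "sum")
total = source.apply(source.apply(None, 1), 1)
```

New algorithms are added by registering another function; existing definitions and the calling code stay unchanged, which is the extensibility the table row describes.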
(4) Classified data extraction logic
Based on our analysis and classification of business extraction models, data extraction models are processed as one of three types: cumulative class, basic information class, and update-summary class.
Details are as follows:
A. Cumulative-class data extraction logic:
a) Obtain the data from the business tables that the data extraction model definition is associated with through the metadata.
b) If the data already exists and the business data state is neither delete nor update, abandon processing of the current business table. In all other cases, continue to the next step.
c) When the business data is in the delete or update state, delete the previous extraction records. Processing ends once delete-state business data has been handled; business data not in the delete state continues to the next step.
d) When the business data is in the new or update state, continue analyzing it and obtain the name of the business table that generates the largest number of extraction records.
e) Loop over the data of the table obtained by that table name. Extract multiple records from each table according to the value definitions of the extraction model.
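Steps a) to e) of the cumulative (accumulative-total) class can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the extraction store is an in-memory dict, business tables are lists of row dicts, the state strings `"insert"`, `"update"`, and `"delete"` are invented labels, and values are taken only from the table with the most rows rather than from every table.

```python
from typing import Dict, List


def process_cumulative(
    business_id: str,
    state: str,                      # assumed labels: "insert", "update", "delete"
    tables: Dict[str, List[dict]],   # a) business tables reached via the metadata
    field_map: Dict[str, str],       # extraction item code -> source column
    store: Dict[str, List[dict]],    # extraction records keyed by business id
) -> None:
    # b) Already extracted and neither deleted nor updated: nothing to do.
    if business_id in store and state not in ("delete", "update"):
        return
    # c) Delete or update: drop the previous extraction records.
    if state in ("delete", "update"):
        store.pop(business_id, None)
        if state == "delete":
            return
    if not tables:
        return
    # d) Find the business table that yields the most extraction records.
    driving_table = max(tables, key=lambda name: len(tables[name]))
    # e) Loop over that table and build one extraction record per row.
    store[business_id] = [
        {item: row.get(column) for item, column in field_map.items()}
        for row in tables[driving_table]
    ]


store: Dict[str, List[dict]] = {}
tables = {
    "visit": [{"visit_no": "V1"}],
    "prescription": [{"drug": "A"}, {"drug": "B"}],
}
process_cumulative("B1", "insert", tables, {"drug_name": "drug"}, store)
```

Here the `prescription` table drives the loop because it has the most rows, so the single business record produces two extraction records.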
B. Basic-information-class data extraction logic:
a) Obtain the basic information data already generated, according to the data extraction model definition.
b) If the operation is a delete, directly delete the existing extraction record and finish. Otherwise, continue processing.
c) Extract the fields one by one according to the field content in the data extraction item definitions, obtaining the data of the business fields that correspond to the metadata. Depending on whether the extracted data already exists, perform an update or an insert.
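The basic-information logic amounts to a delete-or-upsert. A minimal Python sketch follows, assuming an in-memory dict as the extraction store and invented state labels; the function name, parameters, and return strings are hypothetical.

```python
from typing import Dict


def process_basic_info(
    key: str,
    state: str,                  # assumed labels: "insert", "update", "delete"
    business_row: dict,
    field_map: Dict[str, str],   # extraction item code -> business column
    store: Dict[str, dict],      # a) basic information records generated so far
) -> str:
    # b) A delete removes the existing extraction record and ends processing.
    if state == "delete":
        store.pop(key, None)
        return "deleted"
    # c) Extract each defined field from the business row, one by one.
    extracted = {item: business_row.get(column) for item, column in field_map.items()}
    # Update when the record already exists, insert otherwise.
    if key in store:
        store[key].update(extracted)
        return "updated"
    store[key] = extracted
    return "inserted"


patients: Dict[str, dict] = {}
mapping = {"name": "patient_name", "birth_date": "dob"}
first = process_basic_info("P1", "insert", {"patient_name": "Li", "dob": "1980-01-01"}, mapping, patients)
second = process_basic_info("P1", "update", {"patient_name": "Li Wei", "dob": "1980-01-01"}, mapping, patients)
```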
C. Update-summary-class data extraction logic:
a) Obtain the update-summary business table data according to the data extraction model definition.
b) Check the business data state. If it is delete or update, delete the detail records that correspond to the business table from the extraction detail table. At that point, processing of delete-state business data is complete. Other types continue with the processing below.
For update-summary-class models, deletion applies only to the data in the detail record table. The data in the update-summary table is not changed, which may leave some dirty data in the update-summary table.
When business data is inserted or updated, the fields used for extraction must be extracted from the business data one by one. For jump-type fields, detail records of the extraction must additionally be generated.
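The update-summary logic, including the stale-data caveat, can be sketched in Python. The sketch assumes a single summary row held as a dict, a list of detail records, and invented state labels; the function and parameter names are hypothetical.

```python
from typing import Dict, List, Tuple


def process_update_summary(
    business_id: str,
    state: str,                        # assumed labels: "insert", "update", "delete"
    row: dict,
    items: Dict[str, Tuple[str, str]], # item code -> (column, "MERGE" or "JUMP")
    summary: dict,                     # the update-summary row (one entity only)
    details: List[dict],               # the extraction detail table
) -> None:
    # b) Delete or update: remove this business record's detail rows.
    if state in ("delete", "update"):
        details[:] = [d for d in details if d["business_id"] != business_id]
        if state == "delete":
            # The summary row is left untouched, so it may now hold stale values.
            return
    # Insert or update: extract each configured field, one by one.
    for item, (column, handling) in items.items():
        value = row.get(column)
        summary[item] = value
        # Jump-type fields additionally get a detail record.
        if handling == "JUMP":
            details.append({"business_id": business_id, "item": item, "value": value})


summary: dict = {}
details: List[dict] = []
items = {"diagnosis": ("diag", "JUMP"), "doctor": ("doctor_name", "MERGE")}
process_update_summary("B1", "insert", {"diag": "flu", "doctor_name": "Wang"}, items, summary, details)
process_update_summary("B1", "delete", {}, items, summary, details)
```

After the delete, `details` is empty but `summary` still holds the values extracted from `B1`, which is the dirty-data risk the text warns about.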
Claims (4)
1. A metadata-based data extraction method, characterized in that: a data extraction model is built on the common metadata model of the business model, and business data is extracted and formulated from the business model.
2. The metadata-based data extraction method according to claim 1, characterized in that: the metadata model is extracted from the business model.
3. The metadata-based data extraction method according to claim 2, characterized in that: the data extraction model comprises the model definition, the data extraction item definitions, the update source definition for each data item, and the classified extraction logic; data extraction items are associated with metadata, and metadata is associated with business data, which establishes a relationship among the three, so that business data is extracted through the metadata-based data extraction model.
4. The metadata-based data extraction method according to claim 3, characterized in that it comprises the data extraction model definition, the data extraction item definition, the data update source definition, and the classified data extraction logic:
The data extraction model definition defines the framework of the extraction model, classifying and aggregating the data to be extracted from different aspects, dimensions, and viewpoints; each model definition includes the basic attributes (model internal code, name, description), the processing mode of the model, whether a jump flag is needed, and the name of the table that stores detail records;
The data extraction item definition includes the extraction attributes of each item: the processing mode, data type, length, and precision of the item's field value, and the processing type;
The data update source definition defines the data source of each data extraction item, determining when the item's data is updated and from which metadata, and includes the algorithm definition used to compute the source, the metadata identifier, and the data source definition;
The classified data extraction logic comprises cumulative-class extraction, basic-information-class extraction, and update-summary-class extraction.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410055786.7A CN103942245A (en) | 2014-02-19 | 2014-02-19 | Data extracting method based on metadata |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410055786.7A CN103942245A (en) | 2014-02-19 | 2014-02-19 | Data extracting method based on metadata |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103942245A true CN103942245A (en) | 2014-07-23 |
Family
ID=51189913
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410055786.7A Pending CN103942245A (en) | 2014-02-19 | 2014-02-19 | Data extracting method based on metadata |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103942245A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050216443A1 (en) * | 2000-07-06 | 2005-09-29 | Streamsage, Inc. | Method and system for indexing and searching timed media information based upon relevance intervals |
CN101364240A (en) * | 2008-10-14 | 2009-02-11 | 杭州华三通信技术有限公司 | Metadata management method and device |
US20110295794A1 (en) * | 2010-05-28 | 2011-12-01 | Oracle International Corporation | System and method for supporting data warehouse metadata extension using an extender |
CN102902750A (en) * | 2012-09-20 | 2013-01-30 | 浪潮齐鲁软件产业有限公司 | Universal data extraction and conversion method |
CN102938731A (en) * | 2012-11-22 | 2013-02-20 | 北京锐易特软件技术有限公司 | Exchange and integration device and method based on proxy cache adaptation model |
Non-Patent Citations (1)
Title |
---|
周茂伟等: "基于元数据的ETL工具设计与实现", 《科学技术与工程》 * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105989162A (en) * | 2015-03-04 | 2016-10-05 | 银联商务有限公司 | Online data extraction method and apparatus |
CN105989162B (en) * | 2015-03-04 | 2020-01-31 | 银联商务有限公司 | online data extraction method and device |
CN104778236A (en) * | 2015-04-02 | 2015-07-15 | 上海烟草集团有限责任公司 | ETL (Extract-Transform-Load) realization method and system based on metadata |
CN106921614A (en) * | 2015-12-24 | 2017-07-04 | 北京国双科技有限公司 | Business data processing method and device |
CN106021294A (en) * | 2016-04-30 | 2016-10-12 | 华南理工大学 | Urban rail transit line net access data interface processing method |
WO2019019621A1 (en) * | 2017-07-25 | 2019-01-31 | 平安科技(深圳)有限公司 | Service processing method, device, server and storage medium |
CN108255953A (en) * | 2017-12-20 | 2018-07-06 | 浪潮软件集团有限公司 | Data processing method and processing device |
CN108280147A (en) * | 2018-01-02 | 2018-07-13 | 浪潮软件集团有限公司 | Data management method and device |
CN110851559A (en) * | 2019-10-14 | 2020-02-28 | 中科曙光南京研究院有限公司 | Automatic data element identification method and identification system |
CN111159191A (en) * | 2019-12-30 | 2020-05-15 | 深圳博沃智慧科技有限公司 | Data processing method, device and interface |
CN111159191B (en) * | 2019-12-30 | 2023-05-09 | 深圳博沃智慧科技有限公司 | Data processing method, device and interface |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103942245A (en) | Data extracting method based on metadata | |
JP6032467B2 (en) | Spatio-temporal data management system, spatio-temporal data management method, and program thereof | |
US20140351285A1 (en) | Platform and method for analyzing electric power system data | |
CN104394118A (en) | User identity identification method and system | |
CN103970853A (en) | Method and device for optimizing search engine | |
JP2007011548A (en) | Data set dividing program, data set dividing device, and data set dividing method | |
CN105224377A (en) | A kind of method by metadata automatic generating software project code file and device | |
CN106126601A (en) | A kind of social security distributed preprocess method of big data and system | |
CN104978324B (en) | Data processing method and device | |
CN105205105A (en) | Data ETL (Extract Transform Load) system based on storm and treatment method based on storm | |
CN103234549B (en) | A kind of differential data generation method for upgrading map | |
CN104965886A (en) | Data dimension processing method | |
CN103903086A (en) | Method and system for developing management information system based on service model driving | |
CN104657387A (en) | Data query method and device | |
CN106970929A (en) | Data lead-in method and device | |
CN111694505B (en) | Data storage management method, device and computer readable storage medium | |
CN106649718B (en) | A kind of big data acquisition and processing method for PDM system | |
CN107526746A (en) | The method and apparatus of management document index | |
CN104484460A (en) | Metadata heat degree statistical method of distributed file system | |
CN105574660A (en) | Supplier evaluation and analysis system | |
CN104462361A (en) | Method and device for matching data in data table | |
CN103678682A (en) | Mass grid data processing and management method based on abstract templates | |
CN112328592A (en) | Data storage method, electronic device and computer readable storage medium | |
CN104462462B (en) | Change the data warehouse modeling method and model building device of frequency based on business | |
CN103425490B (en) | Based on the management method running object data in crm system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20140723 |