CN109960708A - Data processing method, device, electronic equipment and storage medium - Google Patents
- Publication number
- CN109960708A (application number CN201910221391.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- source system
- file
- database
- interface
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides a data processing method, a device, electronic equipment, and a storage medium, and relates to the technical field of data processing. The method comprises: extracting target data from a source system table according to the logon file of the source system database, the connection mode of the source system database, and pre-configured database extraction parameters, where the pre-configured database extraction parameters include an identifier of the source system and an identifier of the source system table; generating, from the target data, an interface data file that meets a first naming requirement; loading the data in that interface data file into a data warehouse; and integrating and calculating that data in the data warehouse to obtain the processed data. Compared with the prior art, this avoids repetitive work in ETL job data processing, so that batch data processing is more efficient and the development and maintenance cost of ETL job data processing is reduced.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a data processing method, a device, electronic equipment, and a storage medium.
Background art
ETL (Extract-Transform-Load) describes the process in which data is extracted from a source system, transformed, and loaded into a target system. It is the foundational work of data platforms and data warehouses. In the big data era, data analysis, data mining, data visualization, and data modeling are all built on top of ETL.
Include in the method for existing ETL whole design, by the logic of data processing with SQL (Structured Query
Language, structured query language) form of sentence is programmed in except database, data platform, for example is programmed in ETL service
On device, using Perl (Practical Extraction and Reporting Language, practical extraction and report language
Speech), the scripting languages such as Python, Shell, call database client to execute the SQL statement, to reach at corresponding data
Manage result.
However, in existing ETL designs, the workload of developing, deploying, and maintaining ETL jobs is directly proportional to the data volume. When the number of tasks is large in particular, the development and maintenance cost of ETL jobs cannot be kept under control.
Summary of the invention
In view of the above deficiencies of the prior art, the purpose of the present invention is to provide a data processing method, a device, electronic equipment, and a storage medium, so as to solve the problem of high development and maintenance cost in existing data processing.
To achieve the above purpose, the embodiments of the present invention adopt the following technical solutions:
In a first aspect, an embodiment of the present invention provides a data processing method, comprising:
extracting target data from a source system table according to the logon file of the source system database, the connection mode of the source system database, and pre-configured database extraction parameters, where the pre-configured database extraction parameters include an identifier of the source system and an identifier of the source system table;
generating, from the target data, an interface data file that meets a first naming requirement;
loading the data in the interface data file that meets the first naming requirement into a data warehouse;
integrating and calculating, in the data warehouse, the data in the interface data file that meets the first naming requirement, to obtain the processed data.
Optionally, the method further includes: taking the data warehouse as the source system, extracting target data from the data warehouse according to the logon file of the data warehouse, the connection mode of the data warehouse, and pre-configured database extraction parameters, where the pre-configured database extraction parameters include an identifier of the data warehouse and an identifier of the data warehouse table;
generating, from the target data, an interface data file that meets the first naming requirement;
loading the data in the interface data file that meets the first naming requirement into a data mart.
Optionally, extracting target data from the source system table according to the logon file of the source system database, the connection mode of the source system database, and the pre-configured database extraction parameters comprises:
according to the logon file of the source system database, the connection mode of the source system database, and the pre-configured database extraction parameters, multiple processes concurrently extracting target data from multiple source system tables, where each process corresponds to one source system table.
Optionally, the pre-configured database extraction parameters further include the data date information of the source system table and an extraction mode identifier, where the extraction mode identifier indicates incremental extraction or full extraction.
Extracting target data from the source system table according to the logon file of the source system database, the connection mode of the source system database, and the pre-configured database extraction parameters then comprises:
according to the logon file of the source system database, the connection mode of the source system database, and the pre-configured database extraction parameters, extracting from the source system table the incremental interface data file or the full interface data file that corresponds to the data date information of the source system table.
Optionally, loading the data in the interface data file that meets the first naming requirement into the data warehouse comprises:
according to the logon file of the data warehouse and the connection mode of the data warehouse, loading the data in the interface data file that meets the first naming requirement into an interface table in the data warehouse that meets a second naming requirement, where the name of the interface table under the second naming requirement includes the pre-configured database extraction parameters.
Optionally, integrating and calculating, in the data warehouse, the data in the interface data file that meets the first naming requirement to obtain the processed data comprises:
according to the logon file of the data warehouse, the connection mode of the data warehouse, and the database name of the data warehouse, invoking a job file written in a preset language;
running the job file to integrate and calculate the data in the interface data file that meets the first naming requirement, to obtain the processed data.
In a second aspect, an embodiment of the present invention further provides a data processing device, comprising an extraction module, a generation module, a loading module, and an obtaining module, in which:
the extraction module is configured to extract target data from the source system table according to the logon file of the source system database, the connection mode of the source system database, and pre-configured database extraction parameters, where the pre-configured database extraction parameters include an identifier of the source system and an identifier of the source system table;
the generation module is configured to generate, from the target data, an interface data file that meets the first naming requirement;
the loading module is configured to load the data in the interface data file that meets the first naming requirement into the data warehouse;
the obtaining module is configured to integrate and calculate, in the data warehouse, the data in the interface data file that meets the first naming requirement, to obtain the processed data.
Optionally, the extraction module is specifically configured so that, according to the logon file of the source system database, the connection mode of the source system database, and the pre-configured database extraction parameters, multiple processes concurrently extract target data from multiple source system tables, where each process corresponds to one source system table.
In a third aspect, an embodiment of the present invention further provides electronic equipment, comprising a processor, a storage medium, and a bus. The storage medium stores machine-readable instructions executable by the processor. When the electronic equipment runs, the processor and the storage medium communicate over the bus, and the processor executes the machine-readable instructions to perform the steps of the data processing method of the first aspect.
In a fourth aspect, an embodiment of the present invention further provides a storage medium on which a computer program is stored. When run by a processor, the computer program performs the steps of the data processing method of the first aspect.
The beneficial effects of the present invention are as follows. The data processing method provided by the present application extracts target data from the source system table according to the logon file of the source system database, the connection mode of the source system database, and the pre-configured database extraction parameters. Based on the logon file of the source system database, a data extraction job that runs in batch can therefore extract target data from source system tables in batches. An interface data file that meets the first naming requirement is generated from the target data, and the data warehouse processes the data of such interface data files in batches: according to the naming scheme of the interface data files, the files that meet the first naming requirement can be loaded in batches into the interface tables of the corresponding target system that meet the second naming requirement. This avoids repetitive work in ETL job data processing, so that batch data processing is more efficient and the development and maintenance cost of ETL job data processing is reduced.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings show only certain embodiments of the present invention and should therefore not be regarded as limiting its scope. Those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 is a flowchart of the data processing method provided by one embodiment of the present application;
Fig. 2 is a flowchart of the data processing method provided by another embodiment of the present application;
Fig. 3 is a structural diagram of the data processing device provided by one embodiment of the present application;
Fig. 4 is a structural diagram of the electronic equipment provided by one embodiment of the present application.
Detailed description of the embodiments
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are evidently only some of the embodiments of the present invention, not all of them.
Before the present application is introduced, the terms used in it are explained as follows.
ETL: the abbreviation of Extract-Transform-Load. It describes the process in which data is extracted from the source side, transformed, and loaded to the destination. There are currently two technical architectures for implementing it, the ELT architecture and the ETL architecture.
ETL architecture: in the ETL architecture, data flows from the source to the ETL tool. The ETL tool is an independent data processing engine that generally performs all data transformation on a separate hardware server and then loads the data into the target data warehouse. The only ways to make the whole ETL process more efficient are to upgrade the configuration of the ETL tool server and to optimize the processing flow.
ELT architecture: in the ELT architecture, the ELT tool is only responsible for providing a graphical interface for designing business rules. The whole data flow runs between the target database and the source database, and the ELT tool coordinates the relevant database systems to execute the relevant applications. Data processing can be executed on the source database side or on the target data warehouse side, depending mainly on the system architecture design and the data attributes.
Source system: also called the source data provider or upstream system. In terms of data flow it sits upstream, and it is the supplier relative to the system the data flows down to.
Target system: also called the data receiver or downstream system. In terms of data flow it sits downstream, and it is the target relative to the upstream system the data comes from.
Data warehouse: a technical term that can be understood as the place dedicated to data integration and data calculation. Its physical carrier is usually a distributed database. The data warehouse of an enterprise is where the data of all source systems is gathered. It is large and comprehensive and is organized by subject, and the data integrated in it is not split by business line. For example, in the data warehouse of a bank, data from source systems such as the loan system and the deposit system is extracted into the warehouse, processed, and summarized into integrated data such as a customer subject, an account subject, and an asset subject.
Data mart: relative to the data warehouse, a data mart serves one specific department or one specific application. It can be pictured as a small data warehouse or as a subset of the data warehouse. For example, the credit risk application of a bank is provided to the risk and compliance department. Its data is processed by subject, such as the customer asset subject and transaction events, and it provides effective risk data support for credit loan products.
Logon file: a file that stores the user name and the encrypted password.
Interface data file: the data file produced when the source data provider periodically extracts data from the source system within a specified time and according to the agreed specification. An interface data file generally has to fix the file encoding (for example UTF-8) and the field separator used in the file (for example ## between fields).
Framework program: a framework is a supporting structure that is opened up and designed to solve a class of common problems, and a framework program is a program, written in some language, that abstracts such common functions. A framework program reduces the development of redundant functions and the number of programs, relies on soft coding as far as possible, and handles most of the basic processing with a small amount of code. The user then only needs to configure some parameters to extract, load, and accumulate data, and a framework program is also easy to port between platforms.
The data processing method of the present application is applicable to data processing under the ETL architecture and also under the ELT architecture. The application environment is introduced first. It includes a source system, a target system, a data source, a data warehouse, and a data mart. When the data source is considered relative to the data warehouse, the data source is the source system and the data warehouse is the target system. When the data warehouse is considered relative to the data mart, the data warehouse is the source system and the data mart is the target system.
The source system is used to extract the data, and the target system is used to receive and load it. The method can be used for processing business data, for example in enterprise-level data warehouses in banking, insurance, securities, and telecommunications. The concrete setup depends on user needs, and the present application places no limit on the application scenario.
Fig. 1 is a flowchart of a data processing method provided by the present application. The method is applicable to the ELT architecture and can also be used with the ETL architecture. The source system in the method completes the data extraction job of ETL, and the target system completes the processing and loading of the ETL data. As shown in Fig. 1, the method includes:
S101: extract target data from the source system table according to the logon file of the source system database, the connection mode of the source system database, and pre-configured database extraction parameters.
The pre-configured database extraction parameters include an identifier of the source system and an identifier of the source system table.
The identifier of the source system can be the abbreviation of the source system, the database type of the source system, and so on. The identifier of the source system table can include the name of the source system table. The present application places no limit on this.
The database of the source system can be Oracle, SQL Server, MySQL, MongoDB, or GBase. The present application does not limit the database type of the source system. Each source system database can select a corresponding logon file according to its own security settings, and corresponding database extraction parameters can be configured for each source system database type, in preparation for the data extraction job on the source system tables. The security settings may include the security level, the security requirements, and so on. The present application places no limit on this.
In one possible embodiment, the logon file stores the user name and password of the source system database. The password can be encrypted, and the logon file can be stored on a system other than the source system. For example, logon files can be stored on the ETL server, so that different source system databases correspond to different logon files. When the source system has to be accessed, the user name and password are obtained from the logon file, with the password decrypted by the corresponding decryption algorithm. Keeping the logon file of the source system database separate from the source system database also makes it harder for a user to obtain the user name and password, which improves the security of the source system database. The present application only describes selecting a logon file according to the security settings of the source system database here. A logon file can likewise be selected according to the security settings of the target system database. The logon file of the target system database can accordingly be stored on a system other than the target system database, such as a dedicated ETL server, which improves the security of the target system database. This is not described further here.
For example, when the source system is a loan system, its database has an access user name. Suppose its logon file is named Source_Loan_Logon. The content of the file records the access user name and the password, where the password can be one encrypted with a preset encryption algorithm. The present application places no limit on this.
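The logon file described above can be sketched as follows. This is only an illustration under stated assumptions: the one-line `username<TAB>token` layout, the function names, and the Base64 stand-in for the "preset encryption algorithm" are all hypothetical, since the text does not specify the file layout or the algorithm. Base64 is an encoding, not encryption, and is used here only so the sketch runs without third-party libraries.

```python
import base64
import tempfile
from pathlib import Path
from typing import Callable, Tuple


def read_logon_file(path: Path, decrypt: Callable[[str], str]) -> Tuple[str, str]:
    """Return (username, password) from a one-line 'username<TAB>token' file.

    The file layout is an assumption made for this sketch.
    """
    username, token = path.read_text(encoding="utf-8").strip().split("\t", 1)
    return username, decrypt(token)


def demo_decrypt(token: str) -> str:
    # Placeholder only: Base64 is NOT encryption. A real deployment would
    # plug in whatever preset algorithm protects the stored password.
    return base64.b64decode(token).decode("utf-8")


# Usage: a throwaway logon file, kept apart from the source system itself.
with tempfile.TemporaryDirectory() as tmp:
    logon = Path(tmp) / "Source_Loan_Logon"
    token = base64.b64encode(b"s3cret").decode("ascii")
    logon.write_text("loan_reader\t" + token, encoding="utf-8")
    credentials = read_logon_file(logon, demo_decrypt)
```

Passing the decryption routine in as a callable keeps the reader independent of the algorithm, which matches the text's point that each database may choose its own security settings.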
S102: generate, from the target data, an interface data file that meets the first naming requirement.
In the present application, an interface data file is the data file produced when the corresponding data extraction module, following the database extraction parameters of the source system database, periodically extracts data from the source system within a specified time.
In an optional embodiment, interface data files are named according to the first naming requirement. This standardizes the file names, so that the data warehouse can select the corresponding interface data files in batches by a file-name screening rule and process them, which improves the efficiency of batch processing of interface data files. The first naming requirement can be: <source system English abbreviation>.<table name>.<date (format YYYYMMDD)>.<batch number>.<retransmission sequence number>.<data provision mode (z for initial, i for incremental, f for full)>.dat. The file encoding is UTF-8, and the field separator is ASCII 0x1B, a value outside the Chinese character set.
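As a minimal sketch of the file format just described (UTF-8 encoding, ASCII 0x1B between fields), the following writes and reads back an interface data file. The one-record-per-line layout, the empty string for missing values, and the function names are assumptions of this sketch, not details given in the text.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List, Sequence

FIELD_SEP = "\x1b"  # ASCII 0x1B (ESC), the field separator named in the text


def write_interface_file(path: Path, rows: Iterable[Sequence[object]]) -> int:
    """Write rows as UTF-8 text, one record per line (assumed layout)."""
    count = 0
    with path.open("w", encoding="utf-8", newline="\n") as fh:
        for row in rows:
            # None becomes an empty field; everything else is stringified.
            fields = ["" if value is None else str(value) for value in row]
            fh.write(FIELD_SEP.join(fields) + "\n")
            count += 1
    return count


def read_interface_file(path: Path) -> List[List[str]]:
    """Read the file back into lists of string fields."""
    with path.open("r", encoding="utf-8", newline="\n") as fh:
        return [line.rstrip("\n").split(FIELD_SEP) for line in fh]


# Usage with a file name that follows the first naming requirement.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "cct.t031001.20151010.000.00.z.dat"
    written = write_interface_file(target, [(1, "Zhang San", 100.5), (2, None, 0)])
    rows_back = read_interface_file(target)
```

A control character such as 0x1B is unlikely to occur inside business text, which is presumably why the text picks it over a printable separator.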
Because a source system may transmit data once a day or in several batches a day, the present application uses the batch number to indicate which transmission of the day the current one is.
Optionally, the parts of the name are separated by an English period ("."), and are defined as follows. The source system English abbreviation is 3 lowercase English letters. The table name is the name of the table from which the data is extracted, in lowercase. The date, in YYYYMMDD format, is the data date: even if the data is delivered late, the data date of the data is used rather than the current system date. If the interface is a monthly or yearly interface, the date uses the MM/YYYY format. The batch number is 3 digits in N+1 form: with multiple batches it increases sequentially from 1, so the first batch is 001, and it is 000 if the system provides only a single batch. Monthly and yearly interfaces support only single-batch transmission. The retransmission sequence number is 2 digits in N+1 form, starting from 0: it is 00 for the initial transmission and increases by 1 each time the data has to be transmitted again. For example, if a data quality problem is found after the initial transmission and the data is retransmitted for the first time, the number is 01. The data provision mode is z for initial, i for incremental, and f for full, in lowercase.
For example, the interface file name cct.t031001.20151010.000.00.z.dat means: the English abbreviation of the source system is cct, the table name is t031001, the data date is October 10, 2015, the transmission is single-batch, and the file holds the initial data of the initial transmission.
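The naming scheme above lends itself to a small build-and-parse helper. The sketch below is an illustration only: the class and function names are hypothetical, the regular expression handles daily (YYYYMMDD) names only, and allowing digits and underscores in the table name is an assumption drawn from the example t031001.

```python
import re
from dataclasses import dataclass

# Daily interface files only; monthly and yearly date formats are not handled.
NAME_RE = re.compile(
    r"^(?P<system>[a-z]{3})\.(?P<table>[a-z0-9_]+)\.(?P<date>\d{8})"
    r"\.(?P<batch>\d{3})\.(?P<resend>\d{2})\.(?P<mode>[zif])\.dat$"
)


@dataclass(frozen=True)
class InterfaceFileName:
    system: str     # 3 lowercase letters, e.g. "cct"
    table: str      # source table name, lowercase
    data_date: str  # YYYYMMDD, the data date (not the system date)
    batch: int = 0  # 000 for single batch, 001.. for multiple batches
    resend: int = 0 # 00 for the initial transmission, +1 per retransmission
    mode: str = "i" # z = initial, i = incremental, f = full

    def render(self) -> str:
        return (f"{self.system}.{self.table}.{self.data_date}."
                f"{self.batch:03d}.{self.resend:02d}.{self.mode}.dat")


def parse_name(name: str) -> InterfaceFileName:
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a valid interface file name: {name!r}")
    return InterfaceFileName(
        match["system"], match["table"], match["date"],
        int(match["batch"]), int(match["resend"]), match["mode"],
    )


parsed = parse_name("cct.t031001.20151010.000.00.z.dat")
retransmitted = InterfaceFileName("cct", "t031001", "20151010", resend=1, mode="z")
```

Rejecting names that do not match gives the loading side a cheap validity check before any file is opened.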
It should be noted that the present application lists only one naming scheme for the first naming requirement here. In a concrete application, the fields of the first naming requirement and the way they are represented can be adjusted to actual needs. The present application places no limit on this.
S103: load the data in the interface data file that meets the first naming requirement into the data warehouse.
The interface data file that meets the first naming requirement is the data file that the target system has to load, and its data can be loaded into the corresponding target system. Here the data warehouse is the target system relative to the source system, so the data in the interface data file that meets the first naming requirement is loaded into the data warehouse.
It should be noted that there can be multiple target systems, and their database types can be the same or different. Different target systems can be assigned according to different requirements, so that the generated interface data files that meet the first naming requirement can be used by multiple target systems and serve different businesses.
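The batch loading in S103 can be sketched as below: files are screened by name, and each is loaded into an interface table whose name is derived from the file name. Everything concrete here is an assumption of the sketch: SQLite stands in for the data warehouse, the `itf_<system>_<table>` pattern stands in for the second naming requirement (which the text does not spell out), and the generic `c0, c1, ...` text columns replace real table definitions.

```python
import fnmatch
import sqlite3
import tempfile
from pathlib import Path
from typing import List

FIELD_SEP = "\x1b"  # ASCII 0x1B field separator, as in the text


def select_files(directory: Path, system: str, data_date: str) -> List[Path]:
    """Screen interface data files in batch by their standardized names."""
    pattern = f"{system}.*.{data_date}.*.dat"
    return sorted(p for p in directory.iterdir() if fnmatch.fnmatch(p.name, pattern))


def load_file(conn: sqlite3.Connection, path: Path) -> str:
    """Load one file into its interface table and return the table name."""
    system, table = path.name.split(".")[:2]
    target = f"itf_{system}_{table}"  # hypothetical second naming rule
    with path.open(encoding="utf-8") as fh:
        rows = [line.rstrip("\n").split(FIELD_SEP) for line in fh]
    if not rows:
        return target
    columns = ", ".join(f"c{i} TEXT" for i in range(len(rows[0])))
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{target}" ({columns})')
    marks = ", ".join("?" for _ in rows[0])
    conn.executemany(f'INSERT INTO "{target}" VALUES ({marks})', rows)
    return target


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "cct.t031001.20151010.000.00.z.dat").write_text(
        "1\x1bAlice\n2\x1bBob\n", encoding="utf-8")
    (folder / "cct.t031002.20151010.000.00.i.dat").write_text(
        "9\x1bopen\n", encoding="utf-8")
    (folder / "cct.t031001.20151011.000.00.i.dat").write_text(
        "3\x1bCarol\n", encoding="utf-8")

    warehouse = sqlite3.connect(":memory:")  # stand-in for the data warehouse
    loaded = {}
    for data_file in select_files(folder, "cct", "20151010"):
        name = load_file(warehouse, data_file)
        loaded[name] = warehouse.execute(
            f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
    warehouse.close()
```

Only the two files dated 20151010 are picked up, which is the practical payoff of standardized names: one pattern selects a whole day's batch for a source system.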
S104: integrate and calculate, in the data warehouse, the data in the interface data file that meets the first naming requirement, to obtain the processed data.
The data warehouse is where data is integrated and calculated. As an example of integration: deposits have corresponding deposit customers, loans have corresponding loan customers, and credit cards have corresponding cardholders, and the data warehouse can integrate this customer information (for example by pooling it together). As an example of calculation: to obtain all the assets a given person holds at a bank, the funds under all of that person's accounts at the bank (deposits, loans, and credit cards) have to be summarized.
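The banking example above can be made concrete with a small sketch. SQLite stands in for the data warehouse, and all table names, column names, and figures are hypothetical. The first statement integrates customers from three interface tables into one customer subject table, and the second summarizes each customer's funds across products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
conn.executescript("""
    CREATE TABLE itf_deposit (customer_id TEXT, balance REAL);
    CREATE TABLE itf_loan    (customer_id TEXT, outstanding REAL);
    CREATE TABLE itf_card    (customer_id TEXT, used_credit REAL);
    INSERT INTO itf_deposit VALUES ('C1', 1000.0), ('C1', 500.0), ('C2', 200.0);
    INSERT INTO itf_loan    VALUES ('C1', 300.0);
    INSERT INTO itf_card    VALUES ('C2', 50.0), ('C3', 75.0);
""")

# Integration: pool deposit customers, loan customers, and cardholders
# into a single customer subject table (UNION removes duplicates).
conn.execute("""
    CREATE TABLE customer AS
    SELECT customer_id FROM itf_deposit
    UNION SELECT customer_id FROM itf_loan
    UNION SELECT customer_id FROM itf_card
""")

# Calculation: summarize each customer's funds across all products.
summary = conn.execute("""
    SELECT c.customer_id,
           (SELECT COALESCE(SUM(balance), 0.0) FROM itf_deposit d
             WHERE d.customer_id = c.customer_id) AS deposits,
           (SELECT COALESCE(SUM(outstanding), 0.0) FROM itf_loan l
             WHERE l.customer_id = c.customer_id) AS loans,
           (SELECT COALESCE(SUM(used_credit), 0.0) FROM itf_card k
             WHERE k.customer_id = c.customer_id) AS card_used
    FROM customer c
    ORDER BY c.customer_id
""").fetchall()
conn.close()
```

In the method itself, statements of this kind would live in the job file that the warehouse step invokes, so that the processing logic runs inside the warehouse rather than on the ETL server.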
In summary, the data processing method provided by the present application obtains the logon file of the source system database, the connection mode of the source system database, and the pre-configured database extraction parameters, and extracts target data from the source system table. Based on the logon file of the source system, a data extraction job that runs in batch can therefore use the configured parameter table to extract data in batches from source systems with the same database type and generate interface data files that meet the first naming requirement. The data warehouse can process the data of these interface data files in batches, loading them into the corresponding target system according to the naming scheme of the interface data files. This avoids repetitive work in ETL job data processing, so that batch data processing is more efficient and the development and maintenance cost of ETL job data processing is reduced.
Fig. 2 is a flowchart of the data processing method provided by another embodiment of the present application. As shown in Fig. 2, the method further includes the following steps:
S105: taking the data warehouse as the source system, extract target data from the data warehouse according to the logon file of the data warehouse, the connection mode of the data warehouse, and pre-configured database extraction parameters.
The pre-configured database extraction parameters include an identifier of the data warehouse and an identifier of the data warehouse table.
The data of the data mart comes from the data warehouse, and in terms of data flow the data mart sits downstream of the data warehouse. Relative to the data mart, the data warehouse is therefore the source system, and relative to the data warehouse, the data mart is the target system.
S106: generate, from the target data, an interface data file that meets the first naming requirement.
S107: load the data in the interface data file that meets the first naming requirement into the data mart.
Optionally, step S101 includes: according to the logon file of the source system database, the connection mode of the source system database, and the pre-configured database extraction parameters, multiple processes concurrently extract target data from multiple source system tables, where each process corresponds to one source system table.
Once the database extraction parameters have been pre-configured, their content shows that, when data is extracted from source systems by parameter configuration, source systems with the same database type can have target data extracted from their source system tables by multiple processes concurrently, with each process handling one source system table. This avoids the prior-art need to develop a separate data extraction task for each of the source system tables of the same source system database. A large amount of repetitive extraction task development is thereby avoided, which improves development efficiency and reduces maintenance cost.
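The concurrent extraction described above can be sketched with Python's standard process pool. This is an illustration under assumptions: `extract_table` is a hypothetical stand-in that fabricates rows instead of querying a database, and a bounded pool is used, so each table is handled by exactly one worker process at a time while the total number of processes stays capped rather than growing to one process per table.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import List, Sequence, Tuple

Task = Tuple[str, str]  # (source system abbreviation, source table name)


def extract_table(task: Task) -> Tuple[str, str, int]:
    """Extract one source system table; runs inside a single worker process.

    A real implementation would connect with the logon file and run the
    extraction; here the rows are fabricated so the sketch is self-contained.
    """
    system, table = task
    rows = [f"{system}.{table}.row{i}" for i in range(3)]
    return system, table, len(rows)


def extract_concurrently(tasks: Sequence[Task], max_workers: int = 4) -> List[Tuple[str, str, int]]:
    """Fan the configured tables out to worker processes, one table per task."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(extract_table, tasks))


if __name__ == "__main__":
    # The task list would normally come from the extraction parameter table.
    tasks = [("cct", f"t03100{i}") for i in range(1, 4)]
    for result in extract_concurrently(tasks):
        print(result)
```

Adding a table then means adding one entry to the task list, not writing a new extraction task.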
For example, in the prior art, when an ETL tool is used to extract data from 300 tables of a source system, an extraction task has to be configured for every table, so 300 tables require 300 configured data extraction tasks.
In the present application, by contrast, the content of the database extraction parameter table shows that, when data is extracted from source systems by means of the parameter table, all data extraction tasks are recorded in the database extraction parameter table. Field values in the table distinguish the database type, the system abbreviation, and so on of each source system, and data is extracted from the source system tables on the basis of the table. Different database types can select different data extraction modules. For example, one or two data extraction programs can be developed for each type of source system database. When the database type of the source system is Oracle, the data extraction program for that type can be set to A, however many such source systems there are. When the database type of the source system is MySQL, the data extraction program for that type can be set to B, however many such source systems there are. When data is extracted from source systems of different database types, the source system tables are thus extracted according to the configured database extraction parameters. By configuring the content of the database extraction parameter table accordingly, the present application can run the extraction tasks of many tables against source system databases of the same type, which avoids a large amount of repetitive work, reduces the workload of batch data extraction tasks, and improves the efficiency of batch processing.
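The parameter-table approach can be sketched as a lookup from database type to extraction program. All names below are hypothetical: `ExtractParam` models one row of the database extraction parameter table, and `extractor_a` and `extractor_b` stand in for the extraction programs A and B of the text, returning a label instead of doing real work.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ExtractParam:
    """One row of the (hypothetical) database extraction parameter table."""
    system: str   # source system abbreviation
    db_type: str  # database type of the source system
    table: str    # source system table to extract


def extractor_a(param: ExtractParam) -> str:
    # Stand-in for extraction program A (Oracle sources).
    return f"A:{param.system}.{param.table}"


def extractor_b(param: ExtractParam) -> str:
    # Stand-in for extraction program B (MySQL sources).
    return f"B:{param.system}.{param.table}"


# One extraction program per database type, however many source systems exist.
EXTRACTORS: Dict[str, Callable[[ExtractParam], str]] = {
    "oracle": extractor_a,
    "mysql": extractor_b,
}


def run_extraction(params: List[ExtractParam]) -> List[str]:
    """Drive every extraction task from the parameter table alone."""
    return [EXTRACTORS[param.db_type](param) for param in params]


PARAM_TABLE = [
    ExtractParam("cct", "oracle", "t031001"),
    ExtractParam("cct", "oracle", "t031002"),
    ExtractParam("lon", "mysql", "loan_contract"),
]
results = run_extraction(PARAM_TABLE)
```

With this shape, the 300-table case from the text becomes 300 rows of configuration served by one or two programs per database type.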
Optionally, the database of pre-configuration extracts parameter further include: the Data Date information of source system table and extraction side
Formula mark, wherein extraction mode identifies instruction increment extraction or full dose extracts.
In one possible implementation, the pre-configured database extraction parameters include: the
identifier of the source system, the identifier of the source system table, the data date information
of the source system table, and the extraction mode identifier, where the extraction mode identifier
indicates full extraction or incremental extraction. These parameters can be configured according to
the actual extraction job. In addition, entries in the database extraction parameter list may be added
or deleted according to actual needs; the present application does not limit the number or categories
of the parameters.
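A parameter list with these four fields could be read from a small configuration text, as in the sketch below. The CSV layout, the column names, the mode codes `INC` and `FULL`, and the names `ExtractParam` and `load_params` are assumptions made for illustration; the application does not specify a storage format.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date
from typing import List

VALID_MODES = {"INC", "FULL"}  # hypothetical codes for incremental / full extraction


@dataclass(frozen=True)
class ExtractParam:
    source_system: str  # identifier of the source system
    source_table: str   # identifier of the source system table
    data_date: date     # data date information of the source system table
    mode: str           # extraction mode identifier


def load_params(text: str) -> List[ExtractParam]:
    """Parse a CSV parameter list and reject unknown extraction modes."""
    params = []
    for row in csv.DictReader(io.StringIO(text)):
        mode = row["mode"].strip().upper()
        if mode not in VALID_MODES:
            raise ValueError(f"unknown extraction mode: {mode!r}")
        params.append(
            ExtractParam(
                source_system=row["source_system"].strip(),
                source_table=row["source_table"].strip(),
                data_date=date.fromisoformat(row["data_date"].strip()),
                mode=mode,
            )
        )
    return params


CONFIG = """source_system,source_table,data_date,mode
CRM,CUSTOMER,2019-03-22,INC
ERP,ORDERS,2019-03-22,FULL
"""
```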
The data extraction mode is either incremental extraction or full extraction. Correspondingly,
extracting target data from the source system table according to the log file of the source system
database, the connection type of the source system database, and the pre-configured database
extraction parameters includes: extracting from the source system table, according to the log file of
the source system database, the connection type of the source system database, and the pre-configured
database extraction parameters, the incremental interface data file or the full interface data file
corresponding to the data date information of the source system table.
Wherein, what increment extraction referred to extraction is changed, newly generated data in preset time;Full dose extraction refers to pumping
What is taken is the last state of all data of preset time point, which is per hour, daily, monthly etc. that the application is not
The duration of preset time is defined, specific preset time is determined according to the definition of source system.
For example, incremental data may be collected daily or monthly, giving a daily increment and a
monthly increment. The daily increment may be a snapshot of the latest state of the data that changed
or was newly generated between 00:00 and 24:00 of each day. The monthly increment may be a snapshot of
the latest state of the data that changed or was newly generated between 00:00 on the 1st of the month
and 24:00 on the last day of the month.
Full data may be collected at a fixed time point each day or at a fixed time point each month, giving
a daily full extract and a monthly full extract. The daily full extract may be a snapshot of the latest
state of all data at 24:00 each day. The monthly full extract may be a snapshot of the latest state of
all data at 24:00 on the last day of each month.
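The daily and monthly time boundaries described above can be computed as in the following sketch. The function names and the `"daily"` / `"monthly"` period labels are hypothetical. Since 24:00 is not a valid clock time in Python, the end of a window is represented as 00:00 of the following day, and each window is treated as the half-open interval [start, end).

```python
from datetime import date, datetime, time, timedelta
from typing import Tuple


def _next_month_start(d: date) -> datetime:
    """Return 00:00 on the 1st of the month after d (24:00 on the last day of d's month)."""
    if d.month == 12:
        return datetime(d.year + 1, 1, 1)
    return datetime(d.year, d.month + 1, 1)


def increment_window(d: date, period: str) -> Tuple[datetime, datetime]:
    """Return [start, end) of the incremental window that contains data date d."""
    if period == "daily":
        start = datetime.combine(d, time.min)  # 00:00 of the data date
        return start, start + timedelta(days=1)
    if period == "monthly":
        return datetime(d.year, d.month, 1), _next_month_start(d)
    raise ValueError(f"unsupported period: {period!r}")


def full_snapshot_point(d: date, period: str) -> datetime:
    """Return the time point whose latest state a full extract captures."""
    return increment_window(d, period)[1]
```

An incremental query would then filter on a change timestamp that falls within the window, while a full extract reads every row as of the snapshot point.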
Therefore, different data extraction modes correspond to different interface data files. For example,
extracting data from a source system table by incremental extraction generates an incremental
interface data file, whereas extracting data from a source system table by full extraction generates a
full interface data file.
Optionally, step S103 includes: according to the log file of the data warehouse and the connection type
of the data warehouse, loading the data in the interface data file that meets the first naming
requirement into an interface table in the data warehouse that meets a second naming requirement,
where the interface table that meets the second naming requirement includes the pre-configured
database extraction parameters.
That is, the data in the interface data file that meets the first naming requirement is loaded into an
interface table in the data warehouse that meets the second naming requirement. Under the second
naming requirement, the interface table loaded into the data warehouse must include the pre-configured
database extraction parameters, which include the identifier of the source business system and the
identifier of the source system table.
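The loading step can be sketched with an in-memory SQLite database standing in for the data warehouse. The naming pattern `ITF_<source system>_<source table>`, the CSV file layout, and the function names are assumptions for illustration only; the application states that the interface table reflects the source system and source table identifiers but does not give a concrete pattern.

```python
import csv
import io
import re
import sqlite3
from typing import Tuple

_IDENT = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def interface_table_name(source_system: str, source_table: str) -> str:
    """Build a hypothetical interface table name from the extraction parameters."""
    for part in (source_system, source_table):
        if not _IDENT.match(part):
            raise ValueError(f"unsafe identifier: {part!r}")
    return f"ITF_{source_system}_{source_table}"


def load_interface_file(
    conn: sqlite3.Connection, source_system: str, source_table: str, file_text: str
) -> Tuple[str, int]:
    """Load a CSV interface data file (header row first) into its interface table."""
    table = interface_table_name(source_system, source_table)
    reader = csv.reader(io.StringIO(file_text))
    header = next(reader)
    for col in header:
        # Column names are interpolated into SQL, so they are validated first.
        if not _IDENT.match(col):
            raise ValueError(f"unsafe column name: {col!r}")
    columns = ", ".join(header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({columns})")
    rows = [row for row in reader if row]  # skip blank lines
    conn.executemany(f"INSERT INTO {table} ({columns}) VALUES ({placeholders})", rows)
    conn.commit()
    return table, len(rows)
```

Because the table name is derived from the configured parameters, the same loader serves every source table without per-table code.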
Optionally, step S104 further includes: calling a job file written in a preset language according to
the log file of the data warehouse, the connection type of the data warehouse, and the database name
in the data warehouse; and running the job file to integrate and calculate the data in the interface
data file that meets the first naming requirement, to obtain the processed data.
It should be noted that the preset language in the present application is SQL. The SQL statements are
not tied to any particular type of database and are called by a framework program. Because the
framework program calls the SQL statements, they can run without any rewriting on every database that
supports SQL. When the data carrier of the data warehouse is replaced (for example, when migrating from
Alibaba's MaxCompute to Hive on Hadoop), only the connection information of the framework program needs
to be changed (for example, the data warehouse connection type, the data warehouse database name, and
the log file of the data warehouse).
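The separation between a framework program and SQL job statements can be sketched as follows. This is a minimal sketch, not the framework of the application: `CONNECTORS`, `run_job`, and the keys of the connection-information dictionary are hypothetical, and SQLite from the Python standard library stands in for the data warehouse in place of MaxCompute or Hive.

```python
import sqlite3
from typing import Callable, Dict, List

# Connection type -> factory. Supporting another warehouse would mean
# registering another factory here; the SQL job itself stays unchanged.
CONNECTORS: Dict[str, Callable[[dict], sqlite3.Connection]] = {
    "sqlite": lambda info: sqlite3.connect(info["database"]),
}


def run_job(info: dict, statements: List[str]) -> list:
    """Run the SQL statements of a job in order and return the last result set."""
    conn = CONNECTORS[info["connection_type"]](info)
    try:
        cur = conn.cursor()
        rows: list = []
        for sql in statements:
            cur.execute(sql)
            # description is None for statements that return no rows.
            rows = cur.fetchall() if cur.description else []
        conn.commit()
        return rows
    finally:
        conn.close()


# A toy job that integrates and aggregates data in a hypothetical interface table.
JOB = [
    "CREATE TABLE ITF_CRM_ORDERS (customer TEXT, amount INTEGER)",
    "INSERT INTO ITF_CRM_ORDERS VALUES ('Ann', 10), ('Ann', 20)",
    "SELECT customer, SUM(amount) FROM ITF_CRM_ORDERS GROUP BY customer",
]
```

Only the `info` dictionary changes when the job is pointed at a different database; the list of SQL statements is passed through untouched, which is the portability property the paragraph above describes.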
By calling the job file written in the preset language according to information such as the log file
of the data warehouse, the connection type of the data warehouse, and the database name of the data
warehouse, and by integrating and calculating in the data warehouse the data in the interface data
file that meets the first naming requirement, the demands on ETL developers' skills with the database
that hosts the data warehouse are reduced, and the cross-platform portability of ETL jobs on the data
warehouse is improved.
Fig. 3 is a schematic structural diagram of a data processing device provided by the present
application. As shown in Fig. 3, the device includes: an extraction module 210, a generation module
220, a loading module 230, and an obtaining module 240.
The extraction module 210 is configured to extract target data from a source system table according
to the log file of the source system database, the connection type of the source system database, and
the pre-configured database extraction parameters, where the pre-configured database extraction
parameters include the identifier of the source system and the identifier of the source system table.
The generation module 220 is configured to generate, according to the target data, an interface data
file that meets the first naming requirement.
The loading module 230 is configured to load the data in the interface data file that meets the first
naming requirement into the data warehouse.
The obtaining module 240 is configured to integrate and calculate, in the data warehouse, the data in
the interface data file that meets the first naming requirement, to obtain the processed data.
The above device is used to perform the method provided by the foregoing embodiments. Its
implementation principle and technical effect are similar and are not repeated here.
Optionally, the extraction module 210 is specifically configured to: according to the log file of the
source system database, the connection type of the source system database, and the pre-configured
database extraction parameters, extract target data from multiple source system tables concurrently
using multiple processes, where each process corresponds to one source system table.
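One way to picture "one process per source system table" is the sketch below, which uses Python's standard `multiprocessing` module. All names are hypothetical, and `extract_table` merely writes a placeholder file where a real implementation would query the source database.

```python
import multiprocessing as mp
from pathlib import Path
from typing import Dict, List


def plan_processes(tables: List[str]) -> Dict[str, str]:
    """Map each process name to the single source system table it will extract."""
    unique = list(dict.fromkeys(tables))  # drop duplicates, keep order
    return {f"extract-{i}": table for i, table in enumerate(unique)}


def extract_table(table: str, out_dir: str) -> None:
    # Placeholder for the real extraction of one source system table.
    Path(out_dir, f"{table}.dat").write_text(f"rows of {table}\n")


def extract_concurrently(tables: List[str], out_dir: str) -> Dict[str, int]:
    """Start one process per table, wait for all of them, and return exit codes."""
    procs = {}
    for name, table in plan_processes(tables).items():
        p = mp.Process(target=extract_table, args=(table, out_dir), name=name)
        p.start()
        procs[table] = p
    exit_codes = {}
    for table, p in procs.items():
        p.join()
        exit_codes[table] = p.exitcode
    return exit_codes
```

When run as a script, `extract_concurrently` should be called under an `if __name__ == "__main__":` guard so that platforms which start child processes by re-importing the main module do not launch processes recursively. An exit code of 0 for a table indicates that its process finished without an uncaught error.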
Fig. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the
present application. As shown in Fig. 4, the device includes: a processor 310, a storage medium 320,
and a bus 330. The storage medium 320 stores machine-readable instructions executable by the processor
310. When the electronic device runs, the processor 310 and the storage medium 320 communicate over
the bus 330, and the processor 310 executes the machine-readable instructions to perform the steps of
the above data processing method.
Optionally, the present invention further provides a storage medium on which a computer program is
stored; when run by a processor, the computer program performs the steps of the above data processing
method.
In the several embodiments provided by the present invention, it should be understood that the
disclosed device and method may be implemented in other ways. For example, the device embodiments
described above are merely illustrative. The division into units is only a division by logical
function, and other divisions are possible in an actual implementation: multiple units or components
may be combined or integrated into another system, or some features may be omitted or not executed. In
addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be
an indirect coupling or communication connection through some interfaces, devices, or units, and may
be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as
units may or may not be physical units; they may be located in one place or distributed over multiple
network units. Some or all of the units may be selected according to actual needs to achieve the
purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into
one processing unit, each unit may exist physically on its own, or two or more units may be integrated
into one unit. The integrated unit may be implemented in the form of hardware, or in the form of
hardware plus software functional units.
An integrated unit implemented in the form of a software functional unit may be stored in a
computer-readable storage medium. The software functional unit is stored in a storage medium and
includes instructions that cause a computer device (which may be a personal computer, a server, a
network device, or the like) or a processor to perform some of the steps of the methods of the
embodiments of the present invention. The aforementioned storage medium includes media that can store
program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random
access memory (RAM), a magnetic disk, or an optical disc.
Claims (10)
1. A data processing method, characterized by comprising:
extracting target data from a source system table according to a log file of a source system database,
a connection type of the source system database, and pre-configured database extraction parameters,
wherein the pre-configured database extraction parameters include an identifier of the source system
and an identifier of the source system table;
generating, according to the target data, an interface data file that meets a first naming requirement;
loading the data in the interface data file that meets the first naming requirement into a data
warehouse; and
integrating and calculating, in the data warehouse, the data in the interface data file that meets the
first naming requirement, to obtain processed data.
2. The method according to claim 1, characterized in that the method further comprises:
taking the data warehouse as a source system, extracting target data from the data warehouse according
to a log file of the data warehouse, a connection type of the data warehouse, and pre-configured
database extraction parameters, wherein the pre-configured database extraction parameters include an
identifier of the data warehouse and an identifier of a data warehouse table;
generating, according to the target data, an interface data file that meets the first naming
requirement; and
loading the data in the interface data file that meets the first naming requirement into a data mart.
3. The method according to claim 1, characterized in that extracting target data from the source system
table according to the log file of the source system database, the connection type of the source
system database, and the pre-configured database extraction parameters comprises:
according to the log file of the source system database, the connection type of the source system
database, and the pre-configured database extraction parameters, extracting target data from multiple
source system tables concurrently using multiple processes, wherein each process corresponds to one
source system table.
4. The method according to claim 1, characterized in that the pre-configured database extraction
parameters further include: data date information of the source system table and an extraction mode
identifier, wherein the extraction mode identifier indicates incremental extraction or full extraction;
and extracting target data from the source system table according to the log file of the source system
database, the connection type of the source system database, and the pre-configured database
extraction parameters comprises:
according to the log file of the source system database, the connection type of the source system
database, and the pre-configured database extraction parameters, extracting from the source system
table the incremental interface data file or the full interface data file corresponding to the data
date information of the source system table.
5. The method according to claim 1, characterized in that loading the data in the interface data file
that meets the first naming requirement into the data warehouse comprises:
according to a log file of the data warehouse and a connection type of the data warehouse, loading the
data in the interface data file that meets the first naming requirement into an interface table in the
data warehouse that meets a second naming requirement, wherein the interface table that meets the
second naming requirement includes the pre-configured database extraction parameters.
6. The method according to claim 1, characterized in that integrating and calculating, in the data
warehouse, the data in the interface data file that meets the first naming requirement, to obtain
processed data, comprises:
calling a job file written in a preset language according to a log file of the data warehouse, a
connection type of the data warehouse, and a database name of the data warehouse; and
running the job file to integrate and calculate the data in the interface data file that meets the
first naming requirement, to obtain the processed data.
7. A data processing device, characterized by comprising: an extraction module, a generation module, a
loading module, and an obtaining module, wherein:
the extraction module is configured to extract target data from a source system table according to a
log file of a source system database, a connection type of the source system database, and
pre-configured database extraction parameters, wherein the pre-configured database extraction
parameters include an identifier of the source system and an identifier of the source system table;
the generation module is configured to generate, according to the target data, an interface data file
that meets a first naming requirement;
the loading module is configured to load the data in the interface data file that meets the first
naming requirement into a data warehouse; and
the obtaining module is configured to integrate and calculate, in the data warehouse, the data in the
interface data file that meets the first naming requirement, to obtain processed data.
8. The device according to claim 7, characterized in that the extraction module is specifically
configured to: according to the log file of the source system database, the connection type of the
source system database, and the pre-configured database extraction parameters, extract target data
from multiple source system tables concurrently using multiple processes, wherein each process
corresponds to one source system table.
9. An electronic device, characterized by comprising: a processor, a storage medium, and a bus, wherein
the storage medium stores machine-readable instructions executable by the processor; when the
electronic device runs, the processor and the storage medium communicate over the bus, and the
processor executes the machine-readable instructions to perform the steps of the data processing
method according to any one of claims 1 to 6.
10. A storage medium, characterized in that a computer program is stored on the storage medium, and the
computer program, when run by a processor, performs the steps of the data processing method according
to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910221391.2A CN109960708A (en) | 2019-03-22 | 2019-03-22 | Data processing method, device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109960708A true CN109960708A (en) | 2019-07-02 |
Family
ID=67024730
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910221391.2A Pending CN109960708A (en) | 2019-03-22 | 2019-03-22 | Data processing method, device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109960708A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190702 |